IPv6 DHCP addresses after power outage

vcunat · January 28, 2020, 3:21pm

I had a complete power outage, but I think I noticed similar problems when just restarting the Omnia (latest 3.11 ATM) without reconnecting/restarting wired clients.

The symptom is that (some) machines lose their “short” IPv6 addresses from DHCPv6, e.g. 2a02:xxxx:xxxx:xxxx::yyy, while they still have their SLAAC ones (same prefix but the LAN part is MAC-based). My use case is reaching them from the internet, but I verified that even those (Linux) machines do not think they have these addresses anymore, and they have to be restarted to obtain them again (a smaller action probably suffices, but they don’t recover by themselves even after days).

Any idea how to address this? Does it also happen for you? I can’t say I understand DHCP*.

xsys · January 29, 2020, 10:57am

I can confirm the same behavior (for Windows10 clients only).
Omnia, TOS 4.0.6 HBT, default config except having PPPoE and multiple LAN networks/interfaces.
LAN config:

config interface 'u1u'
	option proto 'static'
	option ipaddr '10.1.1.1'
	option netmask '255.255.255.0'
	option ifname 'lan2'
	option ip6assign '64'
	option ip6hint 'ff'

config dhcp 'u1u'
	option interface 'u1u'
	option start '100'
	option limit '100'
	option leasetime '1d'
	list dhcp_option '6,10.1.1.1'
	option ra 'server'
	option dhcpv6 'server'
	option ra_management '2'

That means I have RA active and the DHCPv6 mode set to stateful only.

/tmp/hosts/odhcpd:

2a01:510:abcd:efff::55	storj
# lan2 000100012588497f00155d111111 1000155d storj -1 55 128 2a01:510:abcd:efff::55/128
2a01:510:abcd:efff::2f1	centos
# lan2 0004cf4cf4244c1f6f855876e0646f55e46e 2b506735 centos 1580371209 2f1 128 2a01:510:abcd:efff::2f1/128
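
For anyone reading these lease lines: a minimal Python sketch of how the `#` comment line could be split into fields. The field names (`duid`, `iaid`, `expires`, `host_id` and so on) are assumptions inferred from the two sample lines above, not a documented format, and `parse_odhcpd_lease` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_odhcpd_lease(line):
    """Split one '# ...' comment line from /tmp/hosts/odhcpd into named fields.

    Field names are guesses based on the sample lines in this thread.
    """
    parts = line.lstrip("# ").split()
    ifname, duid, iaid, hostname, valid_until, host_id, length = parts[:7]
    expiry = int(valid_until)
    return {
        "ifname": ifname,
        "duid": duid,
        "iaid": iaid,
        "hostname": hostname,
        # -1 shows up on one lease; assumed here to mean "no expiry time"
        "expires": None if expiry < 0 else datetime.fromtimestamp(expiry, tz=timezone.utc),
        # the suffix column is read as hexadecimal (2f1 matches ::2f1)
        "host_id": int(host_id, 16),
        "length": int(length),
        "addresses": parts[7:],
    }


lease = parse_odhcpd_lease(
    "# lan2 0004cf4cf4244c1f6f855876e0646f55e46e 2b506735 centos "
    "1580371209 2f1 128 2a01:510:abcd:efff::2f1/128"
)
print(lease["hostname"], lease["expires"], lease["addresses"])
```

Read this way, the `centos` lease carries an absolute expiry timestamp while the `storj` one does not, which fits the 1d lease time configured above.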

After router reboot:

files in /tmp/ are gone and /tmp/hosts/odhcpd gets empty
Centos machine gets its IPv6 shortly
Windows 10 machines fall back to local fe80:… address for good

Maybe the thing is that during router boot the Internet is not accessible until PPPoE is established, which happens at the end of the startup process. That means the router has no IPv6 prefix to assign to clients when they first ask for one, and some of them give up. The error affects Windows10 clients only, though.

anon50890781 · January 29, 2020, 11:12am

If I’m not mistaken, the OP’s case refers to ULA whilst this one pertains to GUA. ULA is independent of GUA and upstream connectivity.

Since the GUA-related matter appears to be happening with Windows 10 clients but not with other clients, it might be particular to communication between the Windows clients and the router. I am not familiar with how (well) Windows handles stateful DHCPv6 as opposed to SLAAC.

What happens when you disconnect a Windows client for a short while, say 2 minutes, and then reconnect?

vcunat · January 29, 2020, 12:48pm

I clarified the OP (hopefully). For this issue I’m only interested in global IPv6 addresses and not ULAs. SLAAC addresses appeared OK (that seems to be different from @xsys case), some machines only lost those DHCPv6 ones (which don’t reveal MACs). Turris WAN shouldn’t take time to work in my case, so perhaps that’s the difference here. In my case all the machines I checked were Linux with dhcpcd.

I added more precise examples to OP; I’m not so confident about terminology here.

anon50890781 · January 29, 2020, 3:12pm

@xsys this German language site mentions something like

netsh interface ipv6 set int ethernet managed=

managed=enabled - stateful mode, when the client obtains its IPv6 address from a DHCPv6 server.

Not clear though whether applicable to W10 or some previous version.

^[1] IPv6 unter Windows

vcunat · January 29, 2020, 3:45pm

= 1 everywhere for these options.

= 0

When this was broken, I accessed the clients through IPv4 inside the LAN (through SSH to Turris), and I saw that SLAAC addresses were shown in ip address output as usual, though I didn’t try accessing clients through these.

ULAs aren’t useful for my case, as I need direct ssh access from internet to the client machines. SLAAC addresses should work, but so far I preferred to avoid exposing the MACs.

Unfortunately it’s quite inconvenient for me to do restarting experiments in this network.

anon50890781 · January 29, 2020, 4:11pm

How are the clients establishing/generating such a short suffix (:yyy), which is unusual for SLAAC, in the first place? Is it some option provided by dhcpcd?

vcunat · January 29, 2020, 4:31pm

hopefully I haven’t messed up the terminology. The long addresses that seem OK all the time have the EUI-64 suffix – I think those are assigned through SLAAC and don’t depend on DHCPv6 in LAN.

The short suffixes (::yyy) are those that get broken by some restarts. I think those are assigned through (stateful) DHCPv6 (and not SLAAC); I don’t know how exactly, but I believe I haven’t changed any related defaults.
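
To make the difference between the two kinds of suffix concrete, here is a small Python sketch, not anything the router itself runs. It builds the EUI-64 style address a MAC would produce and checks whether a given address carries the `ff:fe` filler in its interface ID. The helper names are hypothetical; the prefix and MAC are taken from examples elsewhere in this thread.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Build the SLAAC-style address that embeds `mac` in a /64 `prefix`."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | int.from_bytes(iid, "big"))


def looks_like_eui64(address):
    """True when the interface ID has the ff:fe filler of an EUI-64 suffix."""
    iid = int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)
    return (iid >> 24) & 0xFFFF == 0xFFFE


slaac = eui64_address("2a01:510:abcd:efff::/64", "00:15:5d:01:0a:15")
print(slaac, looks_like_eui64(slaac))
print(looks_like_eui64("2a01:510:abcd:efff::55"))
```

Everything except the flipped bit and the inserted `ff:fe` is the MAC itself, which is why the long addresses expose it, while a short `::yyy` suffix says nothing about the hardware.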

anon50890781 · January 29, 2020, 4:41pm

It gets a bit confusing now, stateless being mixed up with stateful. Is the ISP providing a hybrid setup - stateful (M flag) and stateless (O flag) ^[2]?

^[2] Home | IETF Community Wiki

anon50890781 · January 30, 2020, 9:45am

@vcunat For the short suffix you are probably leveraging option ip6ifaceid '' in /etc/config/network which strangely is provided by netifd and not as expected (least I did) by odhcpd.

I would now reckon this is some sort of timing (chicken-and-egg) bug between netifd and odhcpd during the node’s boot-up phase.

Since both netifd and odhcpd have undergone code changes in more recent OS branches, the issue may not show up there; if it still does, it is perhaps worth filing a bug report upstream.

A potential workaround, until testing with a contemporary OS branch, might be to implement a routine that restarts odhcpd with a delay or increment its init default START=35 to something like START=90 and see if that helps with the downstream clients.

Just to clarify/correct

Actually, odhcpd provides flags

ra_management   integer   1   RA management mode
                              0: no M-Flag but A-Flag
                              1: both M and A
                              2: M but not A

Hence, if you do not want/need autonomous ND configuration (A flag for SLAAC) for the downstream clients change to option ra_management '2'
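
The table above can be restated as a tiny lookup. This is a Python sketch for illustration only: `client_address_sources` is a hypothetical helper, and it covers only the M and A flags listed in the table, not everything a real client takes into account.

```python
# M = "managed" flag (use stateful DHCPv6), A = "autonomous" flag (use SLAAC),
# exactly as listed for the three ra_management values above.
RA_MANAGEMENT = {
    0: {"M": False, "A": True},
    1: {"M": True, "A": True},
    2: {"M": True, "A": False},
}


def client_address_sources(ra_management):
    """Return which address sources the RA flags advertise to clients."""
    flags = RA_MANAGEMENT[ra_management]
    sources = []
    if flags["A"]:
        sources.append("SLAAC")
    if flags["M"]:
        sources.append("stateful DHCPv6")
    return sources


for mode in sorted(RA_MANAGEMENT):
    print(mode, client_address_sources(mode))
```

With `ra_management '2'` only stateful DHCPv6 remains, which matches the "stateful only" setup described earlier in the thread.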

xsys · January 30, 2020, 9:12pm

Thanks for all the input. I didn’t want to mess up this thread if it was a different issue, but it seems to me it may have the same root cause. I am learning IPv6 as I go, so I apologize for any nonsense.

Here is my further observation:

ULA is pointless for me, I have turned it off completely.
SLAAC is pointless for me, I have turned it off completely.

Please, how do you assign those addresses to clients? By DHCP reservations [Static Leases http://192.168.1.1/cgi-bin/luci/admin/network/dhcp as described below] ?

I don’t think it makes a difference, since in both cases the issue is with the “short” IPv6 addresses provided by DHCPv6. I do not use SLAAC, but that should not affect it.
If I turn on Stateless [details below], the Windows10 [Pro, 1903, all default settings] machines get those “long” addresses as well, such as:

   IPv6 Address. . . . . . . . . . . : 2a01:510:abcd:efff::55
   IPv6 Address. . . . . . . . . . . : 2a01:510:abcd:efff:5932:d895:ea8:7695
   Temporary IPv6 Address. . . . . . : 2a01:510:abcd:efff:5d0f:3800:a5b3:74e5

After router reboot:

2a01:510:abcd:efff::55 is lost
2a01:510:abcd:efff:5932:d895:ea8:7695 is good
Temporary 2a01:510:abcd:efff:5d0f:3800:a5b3:74e5 gets changed to 2a01:510:abcd:efff:5413:5b52:5d54:dc54 for example

Without reboot it does not get the :55 back even after 5 hours
Also ipconfig /renew does not acquire ::55 back
After Windows machine reboot it gets 2a01:510:abcd:efff::55 back again

After LAN cable unplugged for 5 mins and plugged back, it gets the ::55 immediately.

I don’t want SLAAC [stateless, the long addresses, based on RA and MAC]. All I need is the short, i.e. stateful, address provided by DHCPv6.

When I turn the DHCPv6-Service off, the client only gets [from RA] two long IPv6 addresses: a static one [MAC based] and a temporary one [privacy extension].
When I turn the DHCPv6-Service on with DHCPv6-Mode Stateless only, the client gets all three IPv6 addresses as shown above, but the ::55 changes to ::6f4 and is different after every reboot.
When I turn the DHCPv6-Service on with DHCPv6-Mode Stateless+Stateful, the client gets all three IPv6 addresses as shown above, and keeps the ::55, probably forever.
When I turn the DHCPv6-Service on with DHCPv6-Mode Stateful only, the client gets only the ::55 one, and seems to keep it forever. If I want to have the ::55 always reserved for the client, I can go to DHCP [http://192.168.1.1/cgi-bin/luci/admin/network/dhcp] and create a Static Lease for it. That adds the following record to /etc/config/dhcp:

config host
	option name 'test'
	option dns '1'
	option mac '00:15:5D:01:0A:15'
	option duid '0001000125c516e600155d010a15'
	option ip '10.1.1.88'
	option hostid '88'

It will make sure this will always be the same:

   IPv6 Address. . . . . . . . . . . : 2a01:510:abcd:efff::88
   IPv4 Address. . . . . . . . . . . : 10.1.1.88

Another way is to hardcode all addresses manually on the clients. Not recommended though.

My understanding is that the ISP does not have anything to do with it; they only provide the delegated prefix and that’s it. It is the Omnia that provides clients with RA and stateless/stateful DHCPv6 out of that prefix. Right?
The field IPv6 suffix on http://192.168.1.1/cgi-bin/luci/admin/network/network/lan only affects the address of the very own router’s LAN interface, not any clients.

That’s exactly what I have done.

I played with all combinations of ND Proxy and Relay, it did not do anything or it only broke things completely.

I have no experience with DS-LITE, I saw that only in UPC network at my friends’ home, but the cable modem/router provided by UPC did not have any configuration variability, almost everything is hidden to customers.

My own conclusion so far: because everything else works just fine, I have to live with the required Windows restart after every Omnia reboot…

vcunat · January 30, 2020, 9:44pm

I haven’t configured anything related so far; it’s all defaults and just happens. Actually choosing some fixed IPv6 address mappings would be best for my case, especially if it should make the assignment more reliable. (Their point is to run internet-accessible services on them.)

I think I even had a case where two computers would apparently be able to get the same short address, though naturally not both at once. Apart from the obvious mess when using such address, another bad consequence was that the losing machine would then have no short IPv6 and thus reveal its MAC on all outgoing connections to the internet :-/

That’s my assumption as well. (I don’t know DHCP* stuff.)

anon50890781 · January 31, 2020, 12:01pm

Are you getting a /64 delegated prefix from the ISP? According to ^[6] this might explain the generation of randomized short suffixes.

This probably explains the multiple addresses with stateful mode

dhcpv6_na	DHCPv6 stateful addressing hands out IA_NA (Identity Association for Non-temporary Addresses)
dhcpv6_pd	DHCPv6 stateful addressing hands out IA_PD (Identity Association for Prefix Delegation)

On my node it is set with option dhcpv6_pd '0' since its downstream nodes do not need the IA_PD.

^[6] DHCPv6 suffix length · Issue #84 · openwrt/odhcpd · GitHub

xsys · January 31, 2020, 12:47pm

[just my humble understanding here] :

Again, I don’t think the Prefix Delegated from ISP matters here. I am getting /56, but I assign /64 to every LAN interface.
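
For illustration, this is how a /64 per LAN interface can be carved out of a delegated /56 with a hint like the `ip6hint 'ff'` from my config. It is a Python sketch only: the delegated prefix 2a01:510:abcd:ef00::/56 is an assumption reconstructed from the (anonymized) addresses in this thread, and `lan_prefix` is a hypothetical helper, not what netifd actually executes.

```python
import ipaddress


def lan_prefix(delegated, hint, assign=64):
    """Pick the sub-prefix of length `assign` selected by a hexadecimal hint."""
    net = ipaddress.IPv6Network(delegated)
    index = int(hint, 16)
    subnet_bits = assign - net.prefixlen
    if subnet_bits < 0 or not 0 <= index < (1 << subnet_bits):
        raise ValueError("hint does not fit into the delegated prefix")
    base = int(net.network_address) | (index << (128 - assign))
    return ipaddress.IPv6Network((base, assign))


print(lan_prefix("2a01:510:abcd:ef00::/56", "ff"))
```

A /56 leaves 8 bits for subnet numbers, so 256 different /64 networks are available, and the hint only decides which one a given interface gets.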

Yes I agree. If I want to have static, I need to set up Static Lease. It is the same like IPv4 addresses - if we don’t set Static Lease, they get assigned randomly out of the DHCP pool.

in my case always have only one single IPv6, the short one, as I said:

Furthermore, I am not setting option dhcpv6_pd or dhcpv6_na in any way, so it gets all default

anon50890781 · January 31, 2020, 6:20pm

The length of the client’s suffix (IID) is determined by netifd’s option ip6ifaceid '' albeit one would reckon this only being applicable to a router interface. However, changing the permissible values :

random
eui64

changes the client’s suffix (IID) length as well.

The issue with the missing lease after TO power cycle did not reproduce on this TOS6.x instance and thus might be remedied in OS versions more contemporary than TOS3.x|4.x.

A potential workaround, as opposed to dis/re-connecting or rebooting clients might be to restart the TO’s network instead - /etc/init.d/network restart

I recall seeing those weird ::yyy suffixes in TOS3.x and even having discussed them somewhere, but that was quite some time ago and I could not trace that discussion. Since TOS4.x, however, the suffix consists of three numerical digits instead.

xsys · January 31, 2020, 7:46pm

/etc/init.d/network restart did not help

anon50890781 · January 31, 2020, 7:51pm

Well too bad, another avenue - /etc/init.d/odhcpd restart - any luck?

xsys · February 1, 2020, 7:19am

No luck here. What about @vcunat ?
What helps in my case [beside of restart] is to unplug LAN cable. When plugged back, it gets the short IPv6 immediately.
Still needs manual intervention unfortunately.
Everything seems to point the issue to the Windows10 PCs in my case, the router is not to blame.
I could do a packet sniff to see whether Windows10 stops asking for DHCPv6, but I don’t have time for that now; I have two little kids here and they take all my attention.

Skywalker-11 · February 1, 2020, 1:26pm

For IPv6 this should be ipconfig /renew6
ipconfig /renew is only for IPv4

vcunat · February 7, 2020, 4:19pm

I think I’ll be able to do more experiments in about a week and later on.