Problem with ARP (incomplete entry) for printer in LAN

Hello,

I have a weird problem that I have been trying to debug for quite a while.

TL;DR: I cannot print from the laptops to either of the printers once the printer has been idle for a few hours. The core issue is an incomplete ARP entry on the client: ? (192.168.0.5) at (incomplete) on en0 ifscope [ethernet]

My setup consists of a Turris Omnia, multiple laptops, a Raspberry Pi and two printers. The Turris has IP address 192.168.0.1; the printers have IP addresses 192.168.0.5 and 192.168.0.6 and are in the LAN zone. All devices are in the same segment (i.e. no firewall between hosts).

Symptoms:

  1. All DHCP clients receive an IP address according to the DHCP rules (those with a fixed entry as defined, others from the pool);
  2. The client can print without any issue;
  3. After a while (a few hours; I did not track the exact time) the client cannot print. The printer seems stalled;
  4. The client cannot ping the printer;
  5. Looking at the ARP table on the client, I see this entry: ? (192.168.0.5) at (incomplete) on en0 ifscope [ethernet];
  6. After SSHing to the Turris I see a normal ARP entry there: 192.168.0.5 0x1 0x2 0c:84:dc:3e:19:47 * br-lan;
  7. If I ping the printer from the laptop (ping 192.168.0.5) and at the same time ping it from the Turris (again ping 192.168.0.5), the laptop starts receiving ICMP replies right after the first ping from the Turris;
  8. The ARP entry on the laptop looks normal again: ? (192.168.0.5) at c:84:dc:3e:19:47 on en0 ifscope [ethernet];
  9. Printing works again for a few hours.
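
For anyone who wants to catch the moment the entry goes stale, here is a minimal Python sketch that scans `arp -an` style output for unresolved entries. It assumes the BSD/macOS line format quoted in steps 5 and 8; the function name `incomplete_entries` is made up for illustration, and the Linux `arp` output looks different and would need another pattern.

```python
import re

# One line of BSD/macOS-style `arp -an` output, as quoted in the post, e.g.
#   ? (192.168.0.5) at (incomplete) on en0 ifscope [ethernet]
ARP_LINE = re.compile(
    r"^\S+ \((?P<ip>\d+\.\d+\.\d+\.\d+)\) at (?P<mac>\S+) on (?P<iface>\S+)"
)


def incomplete_entries(arp_output):
    """Return the IP addresses whose ARP entry has no resolved MAC address."""
    stalled = []
    for line in arp_output.splitlines():
        match = ARP_LINE.match(line.strip())
        # An unresolved entry shows the literal text "(incomplete)" instead of a MAC.
        if match and match.group("mac") == "(incomplete)":
            stalled.append(match.group("ip"))
    return stalled
```

Feeding it the line from step 5 should report 192.168.0.5, and the line from step 8 should report nothing. The text to scan could come from `subprocess.run(["arp", "-an"], capture_output=True, text=True).stdout`.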

I could set up a cron job to ping the printers every 10 minutes or so, however… I do not like that. I want it solved in a systematic way.

If you have any hint where I can look I would be very grateful.

Thanks.

Also, I am not sure if this is related, but it seems I have an issue with Kresd too. After a while users complain that websites are not loading. The latest case in point is airbnb.com.
From a user's computer (the symptom is the same for all devices on the network):

dig airbnb.com

; <<>> DiG 9.10.6 <<>> airbnb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33080
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;airbnb.com.			IN	A

;; ANSWER SECTION:
airbnb.com.		42	IN	A	34.202.147.136
airbnb.com.		42	IN	A	34.236.9.133

delilah:~ root# ping airbnb.com
PING airbnb.com (34.202.147.136): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2

After I restart kresd on Turris everything works immediately.

I will create a separate topic since I am not sure this is related.

I do not see the resolver playing a part in this, since it is not the one providing DHCP. That is, unless the printers are accessed by domain name rather than by IP.

It seems more like the printers stop announcing their service on the LAN after some time and/or do not wake up for packets coming their way. Do they eventually go into a sleep/snooze/standby mode?

Yes, they will go to sleep (power-saving mode). However, if the problem is on the printer side, shouldn't it be consistent (i.e. an incomplete ARP entry on both the client and the router)?

That would be the case if the router provided ARP resolution for the clients. To my understanding (and I might be wrong) it does not: the router and each client resolve ARP on their own.

https://www.tummy.com/articles/networking-basics-how-arp-works/

ARP stands for Address Resolution Protocol. When you try to ping an IP address on your local network, say 192.168.1.1, your system has to turn the IP address 192.168.1.1 into a MAC address. This involves using ARP to resolve the address, hence its name.

Systems keep an ARP look-up table where they store information about what IP addresses are associated with what MAC addresses. When trying to send a packet to an IP address, the system will first consult this table to see if it already knows the MAC address. If there is a value cached, ARP is not used.

If the IP address is not found in the ARP table, the system will then send a broadcast packet to the network using the ARP protocol to ask “who has 192.168.1.1”. Because it is a broadcast packet, it is sent to a special MAC address that causes all machines on the network to receive it. Any machine with the requested IP address will reply with an ARP packet that says “I am 192.168.1.1”, and this includes the MAC address which can receive packets for that IP.
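
To make the "who has" broadcast concrete, here is a rough Python sketch that builds the bytes of such an ARP request as an Ethernet frame. The field layout follows the standard ARP packet format; the helper names and the sender MAC and IP in the usage note are made-up values for illustration, and the code only builds the frame, it does not send anything.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # every host on the segment receives this
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac):
    """Convert 'c:84:dc:3e:19:47' style text (leading zeros optional) to 6 bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Return an Ethernet frame asking 'who has target_ip? tell sender_ip'."""
    sha = mac_to_bytes(sender_mac)
    eth_header = BROADCAST_MAC + sha + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: 1 = request
        sha,                          # sender MAC
        socket.inet_aton(sender_ip),  # sender IP
        b"\x00" * 6,                  # target MAC: unknown, this is the question
        socket.inet_aton(target_ip),  # target IP
    )
    return eth_header + arp_payload
```

For example, `build_arp_request("02:00:00:00:00:01", "192.168.0.10", "192.168.0.5")` gives a 42-byte frame (14 bytes of Ethernet header plus 28 bytes of ARP). An "(incomplete)" entry means requests like this went out and no reply was seen.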

In your case it seems that either the printers are not receiving the broadcast, or they are not responding to the clients, or the clients are not receiving the response from the printers (power saving mode?).

Perhaps you could narrow it down by changing the power saving interval or turning it off.

Yes, you are right about how ARP works, no doubt about that. :) What I find weird is that the symptom shows up on the client devices but not on the Turris. Also, after a ping from the Turris everything magically works.

I was using a UPC-provided router before (and an OpenWRT-based one even before that) and the printers were working.

You are right; to narrow it down I will disable anything related to sleep on the printers.

Which naturally leads one to suspect the TO (Turris Omnia). Perhaps check the TO’s network interfaces for rx/tx errors/drops.

Zero errors everywhere.

But now I have a timeframe. I “fixed” the printers on the LAN while I was typing the post (3 hours ago).

I tried pinging the printer and got the same problem again:
PING 192.168.0.5 (192.168.0.5): 56 data bytes
Request timeout for icmp_seq 0

Record in ARP table:
? (192.168.0.5) at (incomplete) on en0 ifscope [ethernet]

Now I tried a different approach: I restarted the printer. As soon as it came up, my host received an ICMP echo reply:
Request timeout for icmp_seq 44
64 bytes from 192.168.0.5: icmp_seq=45 ttl=255 time=89.130 ms

To conclude: the problem is somewhere at the network level. It can be fixed either by pinging the dead host from the Turris or by rebooting the host (the host then renews its IP lease from DHCP, which obviously lets the Turris know the device is alive).

I am going to look for the sleep settings on the printers.

Thanks for helping so far!

I found the sleep mode setting on the printer. Sleep is set to 30 minutes. I would like to keep sleep on to save energy.

I will debug it this way:

  1. start pinging at 15:31;
  2. if sleep has something to do with this issue, it should show up after 30 minutes (i.e. no response to ping);
  3. if the device still responds after 30 minutes (after 16:10, for example), I will stop the ping, reboot the printer, wait 35 minutes and then test ping again. The problem should then be present again.
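
The plan above can be automated so that the exact time of the change is logged instead of watched by hand. This is a minimal Python sketch, not a finished tool: `ping_once` and `watch` are made-up names, the interval and round count are arbitrary defaults, and it assumes the system `ping` accepts `-c 1` (true for the macOS and Linux versions I know of).

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Send one ICMP echo with the system ping; True if a reply came back."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def watch(host, probe=ping_once, interval_s=60, rounds=45,
          sleep=time.sleep, now=datetime.now):
    """Probe host `rounds` times; return (timestamp, reachable) for each state change."""
    changes = []
    last_state = None
    for i in range(rounds):
        reachable = probe(host)
        if reachable != last_state:
            # Record only transitions, e.g. reachable -> unreachable.
            changes.append((now(), reachable))
            last_state = reachable
        if i < rounds - 1:
            sleep(interval_s)
    return changes
```

Calling `watch("192.168.0.5")` would probe once a minute for 45 minutes and return the moments the printer changed between answering and not answering. The `probe`, `sleep` and `now` parameters are only there so the loop can be tried out without a real printer.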

Will post about results.

I also forgot to mention one thing that might be important: both printers are connected via WiFi.

Pretty much confirmed: it is the sleep mode. After about 30 minutes the printer stopped responding to ping.

I cannot disable sleep mode (I can only change the value, from 5 minutes to one hour).

To confirm, I will lower the sleep time to 5 minutes and test whether the problem shows up again.

I wonder whether it would make a difference to connect one printer to the router by cable and leave the other one on wireless, assuming also that the printers are not broadcasting their own WiFi network/access point.

And run tcpdump on the router and on the client, say for 10 minutes with the printer sleep interval set to 5 minutes, and then inspect the dumps (ARP frames) with Wireshark. That might provide a clue as to what is happening.
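
If the capture is saved as text rather than opened in Wireshark, a small script can point at the requests that never got an answer. The Python sketch below assumes the usual one-line tcpdump text form for ARP ("ARP, Request who-has X tell Y" and "ARP, Reply X is-at MAC"); that format is an assumption to check against your own dump, and `unanswered_requests` is a made-up name.

```python
import re

# Assumed tcpdump text lines, e.g.
#   ... ARP, Request who-has 192.168.0.5 tell 192.168.0.10, length 28
#   ... ARP, Reply 192.168.0.5 is-at 0c:84:dc:3e:19:47, length 28
REQUEST = re.compile(
    r"ARP, Request who-has (?P<target>\S+)(?: \(\S+\))? tell (?P<asker>[^,\s]+)"
)
REPLY = re.compile(r"ARP, Reply (?P<target>\S+) is-at (?P<mac>[0-9a-fA-F:]+)")


def unanswered_requests(lines):
    """For each target IP, count who-has requests seen since its last reply."""
    pending = {}
    for line in lines:
        request = REQUEST.search(line)
        if request:
            target = request.group("target")
            pending[target] = pending.get(target, 0) + 1
            continue
        reply = REPLY.search(line)
        if reply:
            # A reply clears the outstanding requests for that IP.
            pending.pop(reply.group("target"), None)
    return pending
```

Run over the client-side dump, a growing count for 192.168.0.5 would mean the laptop keeps asking and sees no reply. Comparing it with the router-side dump should show whether the printer answers at all.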

Yup, good approach. Will do. “Bad” news: since the restart the printer has been working without any problem. :) I must wait until the issue shows up again.