Restart services like OpenVPN/LXC on network restart?

lxc

#1

I run a number of network-based services within LXC containers on my Turris Omnia, and I’ve found that they can no longer communicate with the network after the TO’s network is restarted. OpenVPN’s routing is also affected, so it has to be restarted before it works again. Restarting the network entirely is pretty common in Foris now: changing settings (LAN/WiFi/network/VPN etc.) restarts the network for the changes to take effect.

For the services to regain network access, I need to stop and then start the containers, or restart OpenVPN, which is particularly challenging as I need to actively remember to do it. If I’m accessing the router remotely or happen to forget, the containers can be left network-less or without VPN access until I can fix the problem.

Ideally, it feels like this is something that should be built into Turris: gracefully restarting services and LXC containers that depend on the network. Alternatively, said services could be made more resilient to a network restart, which might be more fiddly. That said, if this sort of ability doesn’t exist, is there a location for network-related event/hook scripts so I can at least automate the restarts myself to bring services back online?


#2

What kind of services do you run? I don’t have experience with OpenVPN yet, but I am using a connection-checking script on my Omnia.

It pings 3 websites, and if that fails a few times, it restarts the connection. I am sure you can modify it a bit and run it inside the container to check whether ping works. If it doesn’t, just restart the service instead of the network interface as in my case.

script.sh

#!/bin/sh
# Enter the FQDNs you want to check with ping (space separated)
# The script does nothing if any ping to any FQDN succeeds
FQDN="www.google.com"
FQDN="$FQDN wiki.openwrt.org"
FQDN="$FQDN www.turris.cz"
# Sleep between ping checks of a FQDN (seconds between pings)
SLEEP=3 # Sleep time between each retry
RETRY=5 # Retry each FQDN $RETRY times
SLEEP_MAIN=15 # Main loop sleep time
check_connection()
{
    for NAME in $FQDN; do
        for i in $(seq 1 $RETRY); do
            ping -c 1 "$NAME" > /dev/null 2>&1
            if [ $? -eq 0 ]; then
                return 0
            fi
            sleep $SLEEP
        done
    done
    # If we are here, it means all failed
    return 1
}
while true; do
    check_connection
    if [ $? -ne 0 ]; then
        #command to run if pinging fails
        %YOUR_COMMAND_HERE%
    fi
    sleep $SLEEP_MAIN
done

Take a look. You just need a command that restarts the OpenVPN service. I think restarting the whole container is a bit overkill and not necessary (depending on your services).
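
As a rough idea of what could go in place of %YOUR_COMMAND_HERE%, here is a minimal sketch. The function name recover_services and the RUN and CONTAINERS variables are made up for illustration; it assumes OpenVPN is managed through /etc/init.d/openvpn and that the containers are controlled with lxc-stop / lxc-start on the host. Leave CONTAINERS empty if you only want to restart OpenVPN.

```shell
#!/bin/sh
# Hypothetical recovery helper (names are assumptions, not part of the script above).
# RUN: set to "echo" for a dry run that only prints the commands.
# CONTAINERS: space-separated LXC container names to stop and start again.
RUN="${RUN:-}"
CONTAINERS="${CONTAINERS:-}"

recover_services()
{
    # Restart OpenVPN first so its routes are set up again
    $RUN /etc/init.d/openvpn restart

    # Then stop and start each listed container (skipped if the list is empty)
    for c in $CONTAINERS; do
        $RUN lxc-stop -n "$c"
        $RUN lxc-start -n "$c" -d
    done
}
```

With this defined above the main loop, the placeholder line would simply become recover_services. Running once with RUN=echo shows what would be executed without touching anything.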


#4

Thanks for the suggestions. Connection checking could be a workaround inside the LXC containers, if a reboot inside the container affects the container’s networking on the host in the same way that lxc-stop / lxc-start does. I’ll try this out and see if I can pin down exactly why the network or routing isn’t functional.

On the host side of things, a restart of the Turris Omnia’s network means that there’s only the briefest of moments of lost network connectivity. So in this case, a cron script on the host would have to be permanently pinging a target to notice a momentary loss in network, and that could be fraught with false-positives.

I know I could restart OpenVPN periodically (e.g. every hour or day) and I might do that in the meantime, but it would be ideal to have a non-hacky way of keeping the router’s services working after a network restart.
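
One avenue that avoids polling, assuming Turris OS behaves like stock OpenWrt here: OpenWrt runs the scripts in /etc/hotplug.d/iface/ on interface events, with ACTION (ifup/ifdown) and INTERFACE (the logical name, e.g. lan or wan) set in the environment. A minimal sketch of such a hook follows. The file name, the on_iface_event function, and the WATCH_IFACE and RUN variables are hypothetical, and which interface to watch is a guess that would need checking on the actual router.

```shell
#!/bin/sh
# Hypothetical hook, e.g. saved as /etc/hotplug.d/iface/99-restart-services
# WATCH_IFACE: logical interface to react to (assumption: "lan")
# RUN: set to "echo" for a dry run that only prints the command
WATCH_IFACE="${WATCH_IFACE:-lan}"
RUN="${RUN:-}"

on_iface_event()
{
    # $1 = action (ifup/ifdown), $2 = logical interface name
    [ "$1" = "ifup" ] || return 0
    [ "$2" = "$WATCH_IFACE" ] || return 0

    # The watched interface came back up: restart what depends on it
    $RUN /etc/init.d/openvpn restart
}

# hotplug provides ACTION and INTERFACE; default to empty when run by hand
on_iface_event "${ACTION:-}" "${INTERFACE:-}"
```

The container stop/start commands could be added next to the OpenVPN restart in the same way. Whether a single ifup event is enough to cover every kind of restart Foris triggers is something I have not verified.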


#5

Inside the LXC container you can achieve this by installing monit:

apt-get install monit

Then edit the files under /etc/monit/conf-enabled, creating one file for each service that you want to monitor and restart.

You can then google "monit openvpn example" and adjust it to your needs.

Here is an example monit config for OpenVPN (there are more on the net):

check process vpn-network with pidfile /var/run/vpn-network.pid
  start program = "/etc/init.d/openvpn start vpn-network.com"
  stop program = "/etc/init.d/openvpn stop vpn-network.com"

check host tap0 with address 1.1.1.1
  start program = "/etc/init.d/openvpn start vpn-network.com"
  stop program = "/etc/init.d/openvpn stop vpn-network.com"
  if failed
    icmp type echo count 5 with timeout 15 seconds
  then restart

The great thing is that you can also monitor other daemons and have monit restart them when necessary.

You can also install monit on Turris/OpenWrt itself with opkg install monit, as it is part of the Turris packages. Although I could not find an example monit config for LXC, I suppose that with a bit of googling or experimenting LXC containers could also be monitored and restarted by monit.
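
Until someone has a working monit config for LXC, the same idea can be sketched in plain shell on the host. This is only an illustration: the function name restart_if_unreachable and the PING and RUN variables are made up, the container name and address in the usage line are placeholders, and it assumes that a container which lost its networking stops answering pings from the host.

```shell
#!/bin/sh
# Hypothetical host-side check (names are assumptions).
# PING: ping command to use; RUN: set to "echo" for a dry run.
PING="${PING:-ping}"
RUN="${RUN:-}"

restart_if_unreachable()
{
    # $1 = container name, $2 = container IP address
    name="$1"
    addr="$2"

    # One ping with a 2 second timeout; nothing to do if it answers
    if $PING -c 1 -W 2 "$addr" > /dev/null 2>&1; then
        return 0
    fi

    # No answer: stop and start the container to re-create its networking
    $RUN lxc-stop -n "$name"
    $RUN lxc-start -n "$name" -d
}
```

It would be called as, for example, restart_if_unreachable mycontainer 192.168.1.50 from cron or from a loop like the ping script earlier in the thread. A single missed ping will trigger a restart, so adding a few retries before restarting is probably wise.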

monit has a web interface that can be accessed with a login/password at http://your_lxc_container_ip:2812 by default, or you can check the status from an SSH console with the monit status command.