The updater is very buggy

A: I’m getting this a few times a day:

warning []: Unable to use //tmp//yin2yang.xsl (No such file or directory).
warning []: YANG format data models will not be available via get-schema.

B: Today the updater crashed completely for me:

System log

2016-11-26T19:29:03+01:00 debug updater[2507]: src/lib/journal.c:123 (journal_open): Opening journal
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: DEBUG:src/lib/journal.c:123 (journal_open):Opening journal
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: line not found
2016-11-26T19:29:03+01:00 debug updater[2507]: src/lib/interpreter.c:104 (interpreter_error_result): 
Stack Traceback
===============
(2) Lua function '?' at line 64 of chunk '"logging"]'
	Local variables:
	 err = string: "[string \"transaction\"]:325: No journal to recover"
	 err2string = Lua function '?' (defined at line 45 of chunk "logging"])
	 msg = string: "\
[string \"transaction\"]:325: No journal to recover"
	 (*temporary) = table: 0x1a8c680  {msg:
[string "transaction"]:325: No journal to recover}
(3)  C function 'function: 0x1a8a470'
(4) field C function 'recover'
(5) Lua global 'recover' at line 325 of chunk '"transaction"]'
	Local variables:
	 run_state = table: 0x18508d0  {initialized:true, init:function: 0x1850900, lfile:userdata: 0x189f568, release:function: 0x187dd20 (more...)}
(6) Lua function '?' at line 386 of chunk '"transaction"]'
	Local variables:
	 (*temporary) = Lua function '?' (defined at line 352 of chunk "transaction"])
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: DEBUG:src/lib/interpreter.c:104 (interpreter_error_result):
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: Stack Traceback
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: ===============
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: (2) Lua function '?' at line 64 of chunk '"logging"]'
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	Local variables:
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 err = string: "[string \"transaction\"]:325: No journal to recover"
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 err2string = Lua function '?' (defined at line 45 of chunk "logging"])
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 msg = string: "\
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: [string \"transaction\"]:325: No journal to recover"
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 (*temporary) = table: 0x1a8c680  {msg:
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: [string "transaction"]:325: No journal to recover}
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: (3)  C function 'function: 0x1a8a470'
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: (4) field C function 'recover'
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: (5) Lua global 'recover' at line 325 of chunk '"transaction"]'
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	Local variables:
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 run_state = table: 0x18508d0  {initialized:true, init:function: 0x1850900, lfile:userdata: 0x189f568, release:function: 0x187dd20 (more...)}
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: (6) Lua function '?' at line 386 of chunk '"transaction"]'
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	Local variables:
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 	 (*temporary) = Lua function '?' (defined at line 352 of chunk "transaction"])
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: 
2016-11-26T19:29:03+01:00 crit updater[2507]: src/opkg-trans/main.c:95 (main): 
[string "transaction"]:325: No journal to recover
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: [31;1mDIE[0m:src/opkg-trans/main.c:95 (main):
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: [string "transaction"]:325: No journal to recover
2016-11-26T19:29:03+01:00 emerg sfpswitch.py[1449]: Aborted

/tmp/updater_crash.log

Stack Traceback
===============
(2) Lua function '?' at line 64 of chunk '"logging"]'
        Local variables:
         err = string: "[string \"transaction\"]:325: No journal to recover"
         err2string = Lua function '?' (defined at line 45 of chunk "logging"])
         msg = string: "\
[string \"transaction\"]:325: No journal to recover"
         (*temporary) = table: 0x1a8c680  {msg:
[string "transaction"]:325: No journal to recover}
(3)  C function 'function: 0x1a8a470'
(4) field C function 'recover'
(5) Lua global 'recover' at line 325 of chunk '"transaction"]'
        Local variables:
         run_state = table: 0x18508d0  {initialized:true, init:function: 0x1850900, lfile:userdata: 0x189f568, release:function: 0x187dd20 (more...)}
(6) Lua function '?' at line 386 of chunk '"transaction"]'
        Local variables:
         (*temporary) = Lua function '?' (defined at line 352 of chunk "transaction"])

C: It messes with the opkg locking mechanism, breaking some packages

D: It has huge memory leaks

E: I suspect it is the updater that causes my internet connection to drop a few times a day (it is not a Wi-Fi issue, since I can still connect to the router). I have disabled it now to confirm (it still goes down)

Was it really worth reinventing the wheel? Wouldn’t it be better to simply use opkg for as much as you can and have a tiny script for everything else?

Your problem B is explained here: Updater not working?

As you can see, your problems C and D are already filed on their issue tracker, and were filed by them. What you can do: a) sit tight and wait, or b) try to help them with a fix.

C was created by me and closed as wontfix :confused:

C was created by you, my mistake. But it was fixed in OpenWRT ( https://github.com/openwrt/packages/commit/65bcae09867a735b098dd7375756a56ff54ee72f ), so I think we can expect the fix in Turris OS soon as well.

The problem is that opkg is very stupid and useless as a proper package manager. There was an attempt at a tiny script before (that is why it is called updater.sh), and it was much worse than this updater-ng.

You can find a more detailed explanation in the talk about updater-ng at the latest OpenWRT summit.

Hello

A) is not related to the updater, but to nuci. The message is printed by one of the libraries nuci uses and it is completely harmless (it just says we don’t support some fancy feature of the netconf protocol).

C), well, it was not us who did the locking wrong. Opkg does it a bit wrong, and the script had it 100% wrong. We decided that fixing some third-party script (which is probably not supposed to touch that lock in the first place anyway) is not worth bringing the small bug into our code.
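
To illustrate what is at stake in C: tools that share a package database typically coordinate through an advisory lock on a file, and that only works if every participant takes the lock in the same way. The sketch below is not opkg's or the updater's actual code. It uses Python's `fcntl.flock` on a temporary file, and the helper names `try_lock` and `release` as well as the lock file name are made up for the example.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on `path`.

    Returns the open file object on success (keep it open to hold the
    lock), or None if somebody else already holds the lock.
    """
    f = open(path, "a")
    try:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another open file description holds the lock.
        f.close()
        return None
    return f


def release(f):
    """Drop the lock and close the file."""
    fcntl.flock(f.fileno(), fcntl.LOCK_UN)
    f.close()


if __name__ == "__main__":
    # Hypothetical lock file, created in a fresh temporary directory.
    lock_path = os.path.join(tempfile.mkdtemp(), "example-pkg.lock")

    first = try_lock(lock_path)    # the first tool gets the lock
    second = try_lock(lock_path)   # a second attempt is refused
    print("first holds lock:", first is not None)
    print("second refused:", second is None)

    release(first)
```

A script that ignores the lock, or that removes or recreates the lock file itself, bypasses this protocol completely, which is the kind of mismatch that leads to two tools touching the package state at once.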

D) It’s not a real memory leak. The memory usage is just higher than we expected, and we will look into it. But since the updater doesn’t run all the time (it is started, does its job and terminates), this is not a big problem. It wouldn’t be one even if there were memory leaks, because the memory is returned on termination.

And I can assure you this is not reinventing the wheel. We looked around and found nothing that has the features we need. We also tried the approach of a wrapper around opkg; you can have a look at the master branch of https://gitlab.labs.nic.cz/turris/updater/tree/master, in particular the files updater.sh, updater-worker.sh and updater-utils.sh. Besides being a really complex and unmaintainable monster, it didn’t meet all our needs. So yes, it was worth it.
