Pakon.db filling up tmp

pakon

#1

I installed Pakon just to mess with it, but found it a bit pointless at the moment: a list of all current sessions seems to be all it will show. When the database filled up all available space in /tmp, I uninstalled it by unchecking its box in the updater.

Or so I thought.

It seems that the package stays installed regardless of the state of the checkbox, and the database, /tmp/lib/pakon.db, fills all available space every few days, halting all services. That is frustrating. Why are packages installed and running even when they are not checked?


#2

Hello,
the most likely reason I can see is that you have Device Detection installed. That relies on Pakon and its database as well. Does that explain it?

It’s still weird that it fills up all the space in /tmp within a few days, though; that didn’t really happen in our tests. Maybe if you have some really high traffic on your network… but in our tests the Pakon database was always safely within limits.
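To check whether Device Detection (or something else) pulled Pakon in, something like the following should work; the exact package names are from memory, so adjust the pattern as needed:

# List installed packages related to Pakon / Device Detection
# (package names may differ between Turris OS versions).
opkg list-installed | grep -iE 'pakon|dev-detect'

# If your opkg supports it, show what depends on the pakon package.
opkg whatdepends pakon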


#3

OK, that explains why it’s still installed. Since the last time I rebooted, /tmp/lib/pakon.db has sat at 0 B; I’m not sure why.

I usually run some torrents, which generate a lot of connections in the Pakon lists. Nothing else unusual that I can think of.


#4

Yeah, I just hit the same issue. I’m also using Transmission, which seems to generate a lot of data; /tmp is not very big, so it causes trouble for me as well. Is there a way to move the database somewhere else?


#5

I believe the main reason Pakon keeps the db in /tmp is that /tmp is essentially a ramdisk, while the other storage on Turris devices is flash. Pakon writes frequently enough that it would lead to early flash failure, so doing the bulk of the writes in RAM prevents this.

Even if you had another storage device mounted (e.g. a USB drive), it’s non-trivial to get Pakon to write there. Looking at the Pakon scripts, /var/lib/pakon.db is hardcoded in many places, including the cron job that archives the Pakon db to permanent storage every few hours.
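You can see the hardcoding for yourself with something like the following; the directories are guessed from the commands quoted later in this thread:

# List files that hardcode the live database path
# (script and cron locations are an assumption).
grep -rl '/var/lib/pakon.db' /usr/libexec/pakon-light /etc/cron* 2>/dev/null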


#6

I found the same thing. Writing to RAM makes sense, but it doesn’t make sense that it fills all the available space. At the very least, it would be nice if there were an option to ignore certain processes or something similar.


#7

The pakon db in RAM doesn’t grow forever. There is a cron job that backs up the pakon data from RAM to persistent storage once every 8 hours, and every 24 hours it moves the last day’s worth of data from RAM to persistent storage. If you want to use Pakon long term and have enough network activity that you are consistently filling up /tmp, increasing the frequency of that cron job might help.
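As a rough sketch of what that could look like (the cron file location is a guess and may differ between Turris OS versions):

# Find where the archive job is scheduled.
grep -r 'archive.py' /etc/cron* /etc/crontabs 2>/dev/null

# Example schedule line running the job every 2 hours instead of every 8;
# drop the "root" user field if your crond (e.g. busybox) does not use one.
0 */2 * * * root /usr/bin/python3 /usr/libexec/pakon-light/archive.py && /usr/libexec/pakon-light/backup_sqlite.sh /var/lib/pakon.db /srv/pakon/pakon.db.xz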

Any other suggestions @mpetracek?


#8

Well, yeah, it might be solved this way. But I believe that /tmp has half the size of RAM, so at least 512 MB or 1 GB, depending on the model. OK, there are other things in /tmp, but still, that means you manage to produce ~500 MB worth of traffic records per 24 hours. You would really have to keep the persistent storage on an external drive; it’s a problem to store that much not only in RAM but also in the permanent flash memory. I’m also not sure how an sqlite database would perform at that scale…
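A quick way to check how close you are to the limit (paths as used elsewhere in this thread; on Turris /var is a symlink into /tmp, so /var/lib/pakon.db and /tmp/lib/pakon.db are the same file):

df -h /tmp
ls -lh /var/lib/pakon.db /srv/pakon/pakon-archive.db 2>/dev/null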

The whole idea of keeping the database in RAM and moving it to permanent storage once in a while was to reduce the amount of writes to flash memory; we didn’t really expect the database to grow to hundreds of megabytes per day.

Personally, I’ve never seen the database grow to that size. We did some stress testing here and were able to get to the lower tens of megabytes per day. But the truth is that heavy P2P traffic may generate a lot of connections, and we didn’t really focus on that in our tests.

I’ll try to look into this sometime soon too and think about some solution.


#9

Hi.

I’m hitting this issue semi-regularly. I’m not using torrents or any other P2P connections. I checked the cron job that is supposed to do the cleanup, but I doubt it actually works. When I try to run the command manually, it ends with a traceback:

root@turris:~# /usr/bin/python3 /usr/libexec/pakon-light/archive.py && /usr/libexec/pakon-light/backup_sqlite.sh /var/lib/pakon.db /srv/pakon/pakon.db.xz
INFO:root:moved 24688 flows from live to archive
Traceback (most recent call last):
  File "/usr/libexec/pakon-light/archive.py", line 132, in <module>
    c.execute('DELETE FROM live.traffic WHERE start < ? AND flow_id IS NULL', (start,))
sqlite3.OperationalError: database is locked

@mpetracek is this expected? :slight_smile: Is there anything I can provide to help with debugging when this happens again?

Thanks!


#10

Hello @nneeoo,
no, this is not expected. That might explain a part of this issue.

I’ll look into that problem and try to create a fix ASAP. I can’t understand why the database would be locked for so long… sqlite3 should wait 5 seconds before raising this exception, and I don’t understand why that isn’t enough…

Can you maybe try ps | grep pakon when you see this again? I wonder whether there is a hanging process holding a lock on that database.
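Another quick probe, if you don’t mind (this assumes the sqlite3 command-line tool is installed, as used later in this thread): try to take a write lock with a longer busy timeout and see whether it ever clears:

# Attempt a write lock with a 30 s busy timeout (the Python scripts default
# to 5 s); if this also reports "database is locked", something is holding
# the lock for a long time.
printf '.timeout 30000\nBEGIN IMMEDIATE; ROLLBACK;\n' | sqlite3 /var/lib/pakon.db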

I can at least raise the timeout, and we’ll see whether that is enough…

Thanks for your report; that’s finally a clue as to what might be wrong!


#11

OK, I increased the lock timeout at least. That might fix the problem.

Not sure when we will make a new release, but you can try replacing the archive script yourself:
curl https://gitlab.labs.nic.cz/turris/pakon-light/raw/252df926888d27a6ccd468ed6c03dd47e84599c1/archive.py > /usr/libexec/pakon-light/archive.py
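It may be worth keeping a copy of the original script first, so it is easy to restore:

cp /usr/libexec/pakon-light/archive.py /usr/libexec/pakon-light/archive.py.orig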

If you decide to try it, let me know if the new version works :slight_smile:

Thanks!


#12

I tried it out, but it still fails with the same error:

root@turris:~# /usr/bin/python3 /usr/libexec/pakon-light/archive.py && /usr/libexec/pakon-light/backup_sqlite.sh /var/lib/pakon.db /srv/pakon/pakon.db.xz
INFO:root:moved 232506 flows from live to archive
Traceback (most recent call last):
  File "/usr/libexec/pakon-light/archive.py", line 131, in <module>
    c.execute('DELETE FROM live.traffic WHERE start < ? AND flow_id IS NULL', (start,))
sqlite3.OperationalError: database is locked

Output of ps | grep pakon shows:

root@turris:~# ps | grep pakon
 2293 root     14692 S    python3 /usr/libexec/pakon-light/pakon-monitor.py
 2307 root     10268 S    python3 /usr/libexec/pakon-light/pakon-handler.py
 2344 root     78940 S    {Suricata-Main} /usr/bin/suricata -c /etc/suricata-pakon/suricata.yaml --pidfile /var/run/suricata/suricata.pid --af-packet=br-lan
12470 root      1116 S    grep pakon

My pakon.db grew to 350 MB after ~2 days. Let me know if you need any more information from me to track this down. Thanks!


#13

Running into the same problem.


#14

Hello,
thanks for the test. That’s really strange. Can you check whether the script now waits longer before throwing this exception? That was the purpose of the changes I made, hoping they would fix it…

Could you please run

time /usr/bin/python3 /usr/libexec/pakon-light/archive.py

I’m interested in the running time of the script…

Thank you. I can’t replicate this problem on my testing routers…


#15

Hi,

I’m having the same issue of /tmp/lib/pakon.db filling up the /tmp ramdisk, and I’m available to help troubleshoot it; feel free to ping me if any other information would be useful.

Looking at the router right now, I see that archive.py was launched by cron 6 hours ago and still hasn’t finished (!).

root@turris:~# awk -v ticks="100" 'NR==1 { now=$1; next } END { print strftime("%c", systime() -
(now-($20/ticks))) }' /proc/uptime RS=')' /proc/30412/stat
Fri Nov  9 02:05:01 2018
root@turris:~# date
Fri Nov  9 08:49:54 CET 2018
root@turris:~# ps | grep 30412
12093 root      1116 S    grep 30412
30412 root     40400 R    /usr/bin/python3 /usr/libexec/pakon-light/archive.py

Debug logging was enabled in the script, and here’s what I see using strace:

root@turris:~# strace -f -p 30412 -s 1000 -r 2>&1 | head -15
Process 30412 attached
     0.000000 clock_gettime(CLOCK_REALTIME, {1541750154, 299292050}) = 0
     0.005554 getpid()                  = 30412
     0.000347 write(2, "DEBUG:root:trying:\n", 19) = 19
     0.000277 clock_gettime(CLOCK_REALTIME, {1541750154, 300289038}) = 0
     0.000300 getpid()                  = 30412
     0.000197 write(2, "DEBUG:root:(340605, 1541149898.1838369, 1541149898.1838369, 0, 'xx:xx:xx:xx:xx:xx', 'XX.XX.XX.XX', 55368, 'XX.XX.XX.XX', 80, 'TCP', '?', 206, 74, None)\n", 152) = 152
     0.002478 clock_gettime(CLOCK_REALTIME, {1541750154, 303263080}) = 0
     0.000224 getpid()                  = 30412
     0.000255 write(2, "DEBUG:root:merging with:\n", 25) = 25
     0.000227 clock_gettime(CLOCK_REALTIME, {1541750154, 303967008}) = 0
     0.000272 getpid()                  = 30412
     0.000172 write(2, "DEBUG:root:(340672, 1541149923.1871469, 1541149923.1871469, 0, 'xx:xx:xx:xx:xx:xx', 'XX.XX.XX.XX', 36108, 'XX.XX.XX.XX', 80, 'TCP', '?', 206, 74, None)\n", 152) = 152
     0.000392 clock_gettime(CLOCK_REALTIME, {1541750154, 304799585}) = 0
     0.000168 getpid()                  = 30412
root@turris:~#

The current size of the Pakon archive database looks pretty big:

root@turris:~# du -sh /srv/pakon/pakon-archive.db
490.9M  /srv/pakon/pakon-archive.db
root@turris:~# du -sh /tmp/lib/pakon.db 
1.2M    /tmp/lib/pakon.db
root@turris:~# 

Hope this helps!


#16

Thanks for the report!

This looks like the archiving takes too long when the database grows too much. I’m looking into it now and thinking about how to improve it; hopefully I’ll find a solution and we will release it soon.


#17

If you still haven’t deleted the database, could you please try sqlite3 /srv/pakon/pakon-archive.db "select details,count(*) from traffic group by details". I’m just wondering how many records are in the database…


#18

Yeah, sure, it is pretty big.

root@turris:~# sqlite3 /srv/pakon/pakon-archive.db "select details,count(*) from traffic group by details"
0|4309608
1|1411
3|410
root@turris:~# 

And by the way, here’s the number of records from a copy of /tmp/lib/pakon.db taken after the disk filled up.

root@turris:~# sqlite3 /srv/pakon/pakon.db "select count(*) from traffic;"
8553273

#19

How can I reset/delete all Pakon history and the devices remembered by Device Detection? I tried deleting the whole pakon folder (/srv/pakon/) and then rebooting the router, but it still remembers all the history data.
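The closest I have to a full reset is this rough sketch; the init script names are a guess based on the ps output earlier in this thread, the paths are the ones mentioned above, and I haven’t verified that it clears everything:

# Stop Pakon so nothing re-creates or restores the databases in the meantime
# (init script names are a guess; check /etc/init.d/ for the exact ones).
/etc/init.d/pakon-monitor stop
/etc/init.d/pakon-handler stop

# Remove the live database in RAM and the persistent copies.
rm -f /var/lib/pakon.db /srv/pakon/pakon-archive.db /srv/pakon/pakon.db.xz

# Start Pakon again with empty databases.
/etc/init.d/pakon-monitor start
/etc/init.d/pakon-handler start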


#20

Same problem here: /tmp/lib/pakon.db is filling up /tmp and crashing Foris… Is there a way to disable this?

pakon.db is mostly filled with connections to hubiC or Amazon S3 from my Synology. I don’t know why it generates so much noise.
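What I’m considering as a stop-gap is disabling the Pakon services entirely; the init script names below are a guess based on the processes listed earlier in this thread, so check /etc/init.d/ first:

# Stop the Pakon daemons and keep them from starting at boot
# (service names are an assumption).
for svc in pakon-monitor pakon-handler; do
    /etc/init.d/$svc stop
    /etc/init.d/$svc disable
done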