eMMC - life span, health monitoring

Reading the forum, it looks like an increasing number of users are facing issues with the eMMC, and it is not clear whether any or all of these cases are caused by having run LXC or NextCloud on the eMMC instead of another storage medium.

Hence, a few questions

  1. what is the life span expectancy of the eMMC in the TO, i.e. how many write/read operations?
  2. is there any tool available in the TOS repo for monitoring the health/wear level of the eMMC, e.g. something similar to common NAND SSD monitoring tools?
  3. are apps other than LXC stressing the eMMC with extensive write operations, e.g. data collection (Pakon) or Sentinel?
  4. if the eMMC is worn out (EOL), can it be de-soldered and replaced with another eMMC unit, or does it require a new mainboard altogether?

Very interesting questions. Following…

To be honest, we have so many connectors and options on the TO, even SFP, which is hardly used by anyone; I don't understand why we do not have a card reader for this, so anyone could just replace the card or put in an even bigger one, like 128 GB.

I would also need a tool like iotop, which is unfortunately not available on the TO. Does anyone know a similar tool or way to check disk usage (eMMC and NAS HDD)?
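In the meantime, a rough picture of cumulative writes per block device can be read straight from the kernel's /proc/diskstats; a minimal sketch (the device names mmcblk0 and sda are assumptions, adjust to your setup):

# cumulative writes since boot; field 10 of /proc/diskstats is sectors written (1 sector = 512 bytes)
awk '$3 == "mmcblk0" || $3 == "sda" { printf "%s: %.1f MB written\n", $3, $10 * 512 / 1048576 }' /proc/diskstats

Running it twice with a delay in between gives an approximate write rate; iostat (from sysstat) would show the same, if it is packaged.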

There are two USB ports.

With the current u-boot you are not able to boot from USB :/

Booting from the flash surely doesn’t require too many writes, even with some updates… but perhaps it’s not suitable for those who have worn it out already.

Thanks for the input, which though seems to be sliding off topic. Could we please stay on the eMMC and the questions posted, and not veer off to u-boot and USB?

Whilst the TO data sheet does not specify the brand/type of the eMMC, I came across a hint that it might be an SK hynix eMMC; is that correct?


After some searching in the public domain, it seems that producers of eMMC NAND chips are shy of providing such information. The related data sheets state all kinds of information but nothing related to life span expectancy, which seems to depend on the wear levelling algorithm/method deployed in the controller.


S.M.A.R.T. is apparently not available for eMMC.

mmc-utils is available in the TOS repo, though the version (2016-06-28) seems to be lagging behind the upstream version (2018-03-27): https://openwrt.org/packages/pkgdata/mmc-utils.
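To verify which build actually ends up on the router, opkg can report the installed and offered versions (assuming mmc-utils is installed from the TOS feed):

opkg info mmc-utils          # installed version and description
opkg list | grep mmc-utils   # version currently offered by the feed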

Getting info about the eMMC with mmc extcsd read /dev/mmcblk0p1 fails though with:

ioctl: Operation not permitted
Could not read EXT_CSD from /dev/mmcblk0p1

The same happens with mmc status get /dev/mmcblk0p1:

ioctl: Operation not permitted
Could not read response to SEND_STATUS from /dev/mmcblk0p1

Whilst offering some eMMC control/tuning, it does not provide a health status. Micron though has patched mmc-utils to implement the HEALTH STATUS command.

It would probably be useful to have something like that for the TO, ideally integrated with Foris.


Aside from LXC, I would reckon NextCloud and NAS stress the eMMC unduly, notwithstanding the Writing to mmcblk0p1 concerns.

But what about Pakon and other apps like the DNS resolvers?


Can you try all these commands with only /dev/mmcblk0 (the whole MMC chip)?


The flash memory in the TO is (to my knowledge) an SK hynix eMMC 4.5, for which the manufacturer specifies, for a density of 4 GB, 2.4 TB total bytes written before EOL.

The eMMC on its own does not track write cycles like standard drives do, and it does not have a mechanism like S.M.A.R.T. You can use mmc extcsd read /dev/mmcblk0 from the package mmc-utils to take a peek into the amount of used reserved blocks. That is the field EXT_CSD_DEVICE_LIFE_TIME_EST. The value is in ranges of ten percent, so 0x01 means 0%-10% of blocks used, 0x02 means 10%-20%, and so on. When the amount of reserved blocks used is close to 100%, the NAND is pretty much EOL.
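A minimal sketch of filtering just the wear-related fields out of the rather long extcsd dump, assuming an mmc-utils build that already reports them (TOS 4.0+ as noted below):

# life time estimation (0x01 = 0-10% of reserved blocks used, 0x02 = 10-20%, ...) and pre-EOL status
mmc extcsd read /dev/mmcblk0 | grep -i -E "LIFE_TIME_EST|PRE_EOL"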

You should know that this is not linear. This is a secondary statistic and tells you nothing about complete wear. Common wear is an avalanche effect: nothing happens for a long time and then all blocks start failing at once, meaning the percentage of used reserved blocks might start rising very quickly.

Note that you need TOS 4.0+ to see the required field. The version of mmc-utils in TOS 3.x is not capable of reading it.

We also have in the pipeline a tool called healthcheck that is intended as a monitoring tool for potential problems on the router.

None that we know about. There might be, and a user can easily create one just by misconfiguring some standard application. Nonetheless, all applications we know about that write to the FS by design are pointed to /srv, and the storage plugin can be used to mount /srv to external storage.
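A quick way to double-check whether /srv is still backed by the eMMC or already sits on external storage (output will of course differ per setup):

df -h /srv          # shows which block device currently backs /srv
mount | grep srv    # shows the mount entry if /srv is a separate mount point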

That is pretty much impossible. It is a 153-ball FBGA. I am not saying that it can’t be done, and you can find companies and individuals that are able to do it, but it is almost never worth it. Unless you know someone who can do this kind of repair cheaply, it is not worth it. And because you have to heat up the board to desolder and solder the chip, you are also risking some other malfunction.


mmc extcsd read /dev/mmcblk0 is showing

Extended CSD rev 1.8 (MMC 5.1)


:+1: Will that still be in the 3.x trunk or only in 4.x?


How about schnapps export, which seems pretty write-intensive? And also schnapps rollback and frequent medkit installations (when testing RC or 4.x or testing tweaked settings)?


That works, thanks for the pointer.

@cynerd

mmc extcsd read /dev/mmcblk0 reveals

Bad Block Management mode [SEC_BAD_BLK_MGMNT]: 0x00

Does it mean that Bad Block Management is turned off (assuming it would read 0x01 if turned on)? And if so, would turning it on be beneficial to the health management of the eMMC?

4.x only. Simply because I am the one who is going to be implementing it, and before that I have the migration from 3.x in the pipeline. :wink:

Depends on where you are exporting it. In general it creates a single tar file of approximately 100M or so (depending on what you have installed in your system). If you want to protect the flash then you should export it to /tmp instead of /root. This is up to you.
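For example, since /tmp is a RAM-backed tmpfs on the router, exporting there keeps the ~100M tarball off the flash entirely. A sketch, assuming the usual schnapps export &lt;snapshot number&gt; &lt;target directory&gt; form (check the schnapps usage output for the exact syntax on your version):

# export snapshot 1 to the RAM-backed /tmp instead of the eMMC-backed /root
schnapps export 1 /tmp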

This is just an inode update. No huge write happening there.

This is once again only a write of cca 100M. BTRFS pretty much sets itself up with minimal overhead; it only sets up its header. So every medkit is around 100M written, and against the 2.4 TB figure that means you can do around 24 thousand of them before you reach EOL.

As I wrote: to my knowledge. I went by some old datasheets. The production memory is probably newer, as the reported MMC 5.1 revision suggests. I am not the one handling the hardware stuff. The number is just to give you an approximation of what you can expect.

It is turned off because this is not handled by hardware but by the FS driver of the OS. At least that is what I think (not 100% sure).


Perhaps it is worth investigating and clarifying, and if beneficial for the eMMC’s health, turning it on. Thus far my reading on the subject indicates that BBM is not supported in BTRFS:

https://btrfs.wiki.kernel.org/index.php/Project_ideas#Bad_block_tracking

Currently btrfs doesn’t keep track of bad blocks, disk blocks that are very likely to lose data written to them.

https://www.spinics.net/lists/linux-btrfs/msg83238.html

BTRFS makes the perfectly reasonable assumption that you’re not trying to use known bad hardware. It’s not alone in this respect either, pretty much every Linux filesystem makes the exact same assumption (and almost all non-Linux ones too), because it really is a perfectly reasonable assumption. The only exception is ext[234], but they only support it statically (you can set the bad block list at mkfs time, but not afterwards, and they don’t update it at runtime), and it’s a holdover from earlier filesystems which originated at a time when storage was sufficiently expensive and unreliable that you kept using disks until they were essentially completely dead.

https://www.oracle.com/technetwork/articles/servers-storage-admin/advanced-btrfs-1734952.html

Btrfs can initiate a check of the entire file system by triggering a file system scrub job that is performed in the background. The scrub job scans the entire file system for integrity and automatically attempts to report and repair any bad blocks it finds along the way. The file system only checks and repairs the portions of disks that are in use—this is much faster than scanning all the disks in a logical volume or storage pool.
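For reference, kicking off such a check on a mounted BTRFS filesystem uses standard btrfs-progs commands (nothing TO-specific here):

btrfs scrub start /    # start a background scrub of the filesystem mounted at /
btrfs scrub status /   # check progress and any checksum errors found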


Found this interesting read https://www.datalight.com/solutions/technologies/bad-block-management

(I should have written FS driver stack…)

I don’t think it is, and I suspect that you don’t want to play with it. I haven’t looked into it, but I suspect that it handles mapping reserved sectors to failed ones. This is, I think, handled by software in Linux; I suspect that the block layer or the mmc driver handles it (maybe in cooperation).

I think this is there to allow usage with microcontrollers. Having reserved-block mapping handled in hardware eases integration into microcontrollers with limited program memory and speed.

This is something different. They are comparing it to the ext bad-blocks exclusion list, which is not the same thing as mapping reserved blocks to bad blocks; it is bad-block avoidance on the FS level. The idea is to map which blocks are bad (the badblocks command) and avoid using them in the FS. In the ext FSs that was introduced to support storage media with manufacturing faults; the intended target media were diskettes. Those don't have any logic in them to manage failed blocks, so the FS was doing it instead.

The difference with current flash memories is that they now have such logic and also have reserved blocks. It is common that some blocks are faulty from fabrication; that is why you can get new flash memories with something like 0-30% of the reserved blocks already used. So the problem they are talking about is just not there with new storage media; it is hidden and mapped over with reserved blocks.

The only situation where you might want to use it is when you have no more reserved blocks. In that situation you could map the bad blocks and avoid using them. The problem is the avalanche effect I already noted. Modern FSs rotate writes over the whole storage so as not to wear out a specific location, which means that when one block goes, the rest are close behind. I suspect that on modern storage avoiding bad blocks would give you just a few hours of run time before another one fails and the FS is corrupted, and even dynamic avoidance would give the storage only a few more weeks of life. I think it is not worth it, and implementing it for BTRFS would be a waste of resources. It is a solution to an issue that is no longer valid; you simply should not keep using storage that is at the end of its life.
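For illustration of the old ext-style static approach described above (set once at mkfs time, destructive, and largely pointless on modern flash), the classic invocation looks something like this; /dev/sdX is just a placeholder:

# scan for bad blocks, then have mke2fs exclude them when creating the filesystem
badblocks -sv /dev/sdX > /tmp/badblocks.txt
mke2fs -l /tmp/badblocks.txt /dev/sdX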
