PDA

View Full Version : [Mut@nt] HD51 Crash at end of recording, when end margin starts



Mickkie
05-02-20, 00:08
I've noticed a few times so far following an update to 5.3.015 the following malfunction.

I start watching a recorded programme while the recording is still taking place. When the recorded programme reaches its end time and the tail end margin timer starts, the Mut@nt crashes. The GUI restarts and continues recording (I have set the end margin to 5 minutes). After that event I end up with two recordings present, the main programme and the a separate tail end 5 minute recording. When it restarts following the crash it reverts to any settings I had when I first switched on the Mut@nt, rather than the settings at the time it crashed (e.g. selected channel, volume level).

I've enabled the logs and will keep an eye on this tomorrow, but I am wondering if anyone else has observed the same thing? This did not happen with any version prior to 5.3.015.

Sicilian
07-02-20, 08:08
Not experienced this issue on any of my receivers.

Mickkie
07-02-20, 10:50
OK, it happened again this morn. I couldn't find anything immediately obvious in the logs, but will post them if you think they could be useful. Nevertheless, I had a quick look in dmesg and it seems the ext4 fs on the root partition was corrupted at some point:

[Warning! /dev/mmcblk0p3 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/mmcblk0p3 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(393217--397344) +(401409--401667) +(524289--524303) +(524305--524319) +(524321--528160) +(655361--655371) +(655377--655387) +(655393--658208) +(663553--663811) +786433 +(786436--786
437) +(786440--786696) +(787209--787464)
Fix? no
Free blocks count wrong for group #48 (4064, counted=8192).
Fix? no
Free blocks count wrong for group #49 (7933, counted=8192).
Fix? no
Free blocks count wrong for group #64 (4064, counted=7934).
Fix? no
Free blocks count wrong for group #80 (4064, counted=6902).
Fix? no
Free blocks count wrong for group #81 (7933, counted=8192).
Fix? no
Free blocks count wrong for group #96 (7160, counted=7676).
Fix? no
Free blocks count wrong (480867, counted=500920).
Fix? no
Free inodes count wrong (191182, counted=191180).
Fix? no
Padding at end of block bitmap is not set. Fix? no

/dev/mmcblk0p3: ********** WARNING: Filesystem still has errors **********
/dev/mmcblk0p3: 13618/204800 files (0.7% non-contiguous), 338333/819200 blocks

What is the recommended way of repairing this fs?

twol
07-02-20, 11:11
OK, it happened again this morn. I couldn't find anything immediately obvious in the logs, but will post them if you think they could be useful. Nevertheless, I had a quick look in dmesg and it seems the ext4 fs on the root partition was corrupted at some point:

[Warning! /dev/mmcblk0p3 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/mmcblk0p3 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(393217--397344) +(401409--401667) +(524289--524303) +(524305--524319) +(524321--528160) +(655361--655371) +(655377--655387) +(655393--658208) +(663553--663811) +786433 +(786436--786
437) +(786440--786696) +(787209--787464)
Fix? no
Free blocks count wrong for group #48 (4064, counted=8192).
Fix? no
Free blocks count wrong for group #49 (7933, counted=8192).
Fix? no
Free blocks count wrong for group #64 (4064, counted=7934).
Fix? no
Free blocks count wrong for group #80 (4064, counted=6902).
Fix? no
Free blocks count wrong for group #81 (7933, counted=8192).
Fix? no
Free blocks count wrong for group #96 (7160, counted=7676).
Fix? no
Free blocks count wrong (480867, counted=500920).
Fix? no
Free inodes count wrong (191182, counted=191180).
Fix? no
Padding at end of block bitmap is not set. Fix? no

/dev/mmcblk0p3: ********** WARNING: Filesystem still has errors **********
/dev/mmcblk0p3: 13618/204800 files (0.7% non-contiguous), 338333/819200 blocks

What is the recommended way of repairing this fs?

I seem to remember this is a standard error because of the way that the 1st Emmc partition is built (and truncated) - so not your issue and nothing you can do about it.
You need to supply crash logs

Mickkie
07-02-20, 13:01
I've mounted the / partition ro and run a fsck.ext4 on it, fixing whatever errors it came up with. I rebooted, reflashed the latest openvix image and we'll see what this gives.

I attach the logs of the crash from this morning. Timestamp 57430.552 shows where I suppose the box woke up to start recording '20200207 0827 - BBC NEWS HD - Worklife'.
Then at 57433.153 the recording starts. The crash happened towards the end of this 30 minutes recording, as I was playing back a previous recording of Newsnight. Not sure when the crash happened exactly.

Log pasted here: REMOVED LINK

ccs
07-02-20, 13:04
That link has a trojan according to my virus guard - you should post them as attachments via advanced options.

Sicilian
07-02-20, 13:07
I've removed the link, as above post the crash log.

Mickkie
07-02-20, 13:31
Apologies, I don't know why it says there's a trojan, it's just a pastebin service and the log is plain text. Regardless, please see attached log (I hope it's not too long).

ccs
07-02-20, 14:25
.... that's a debug log, there doesn't appear to be a crash, just the power button being pressed.

Mickkie
07-02-20, 15:17
Oops! Sorry ... here is the crash log.59669

ccs
07-02-20, 15:28
Looks like the root partition (from your earlier post #3) is still (partially?) read-only
< 62538.824> File "/usr/lib/enigma2/python/RecordTimer.py", line 1125, in saveTimer
< 62538.824> IOError: [Errno 30] Read-only file system: '/etc/enigma2/timers.xml.writing'

twol
07-02-20, 17:08
You have a multiboot system, so assume you have backed up your settings use ImageManager to flash latest image into another slot and try again!!

Mickkie
07-02-20, 17:58
Hmm ... the plot thickens.

At the time I had not yet run fsck on the / fs. For whatever reason the crash took place, inc. the state of the fs, it seems it was remounted as ro when it errored out.

This is what it looks like presently:

root@mutant51:~# mount
/dev/mmcblk0p3 on / type ext4 (rw,relatime,data=ordered)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=190260k,nr_inodes=47565,mode=755 )
tmpfs on /media type tmpfs (rw,relatime,size=64k)
/dev/mmcblk0p1 on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,io charset=iso8859-1,shortname=mixed,errors=remount-ro)
tmpfs on /var/volatile type tmpfs (rw,relatime)
/dev/sda1 on /media/hdd type ext4 (rw,relatime,data=ordered)
/dev/sdb1 on /media/usb type vfat (rw,relatime,fmask=0000,dmask=0000,allow_utime=002 2,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
/etc/auto.network on /media/autofs type autofs (rw,relatime,fd=6,pgrp=2100,timeout=5,minproto=5,m axproto=5,indirect)

However, the file it is looking for above is no where to be found?

root@mutant51:~# ls -la /etc/enigma2/timers.xml.writing
ls: /etc/enigma2/timers.xml.writing: No such file or directory
root@mutant51:~# ls -la /etc/enigma2/timers.xml
-rw-r--r-- 1 root root 32952 Feb 7 14:05 /etc/enigma2/timers.xml

twol
07-02-20, 18:09
I would give up looking and just either reflash the slot or preferably flash another slot vis IMageManager

Mickkie
08-02-20, 10:09
Thanks guys,

Since I've fsck'ed the partition and reflashed the image, the Mut@nt seems to be OK again. I hope the problem was an one off rather than a eMMC hardware failure. I guess time will tell.

Mickkie
09-02-20, 09:52
Thanks guys,

Since I've fsck'ed the partition and reflashed the image, the Mut@nt seems to be OK again. I hope the problem was an one off rather than a eMMC hardware failure. I guess time will tell.

Sadly there was another crash last night, but looking at dmesg after a clean reboot there are no fs errors. What else might it be causing this? Power supply gone wrong?

ccs
09-02-20, 11:28
…. maybe the crash log will give us a clue?

Mickkie
09-02-20, 12:20
…. maybe the crash log will give us a clue?

Thanks ccs, I have enabled logs and rebooted. Will keep an eye out for subsequent crashes and report back.

twol
09-02-20, 12:57
@Mickkie - if its a crash, then the crash log is there - probably in flash if you never set up Debug logs.

Mickkie
09-02-20, 13:38
Ohh! I wasn't sure if a log would be created unless I had enabled the capture of debug logs first. This is interesting: there were no crash logs present, when I looked immediately after the 'crash'. Am I misinterpreting the phenomenon as a crash? This is what happened last night.

I was watching a film from a live transmission, no recording taking place. Suddenly the TV screen went dark and for no longer than 2-3 seconds the VIX spinner showed up at the top left corner of the screen, before the picture returned. This is a much shorter duration than when rebooting the STB or restarting the GUI. If this is not a crash, what could it be?

At previous instances when this occurred while a recording was taking place, the recording stopped and restarted a few seconds later, resulting in two recorded files stored on the disk.

twol
09-02-20, 14:02
@Mickkie - if you get a crash you will see a load of stuff on the screen describing the crash, the system will then reboot enigma and restart.
What you are seeing is an activity starting that consumes the processor (hence the ViX spinner) - maybe a debug log will show what is happening.

Mickkie
09-02-20, 14:21
Thanks twol, from what you're saying I'm exhausting the resources of the box and it barfs while this is happening. I retraced my steps, thinking what I might have changed in the config to cause this problem. From what I recall I changed the EPG refresh settings, asking it to automatically save the EPG. I disabled it now as shown in the attached screenshot. Could someone please confirm if these are the default settings, or what I could/should change to minimise any adverse impact on the box?

birdman
09-02-20, 14:32
For whatever reason the crash took place, inc. the state of the fs, it seems it was remounted as ro when it errored out.That's what Linux does when certain errors occur. To stop things getting any worse.

birdman
09-02-20, 14:36
Thanks twol, from what you're saying I'm exhausting the resources of the box and it barfs while this is happening.From the logs you appear to be using /epg.dat to store EPG data.


< 57428.683> [eEPGCache] /media/hdd/epg.dat not found, try /epg.dat

That could well run you out of space.

Mickkie
09-02-20, 17:09
Thanks birdman, I've gone through some changes in the settings moving the epg.dat location from /media/usb/ to /media/hdd/ and back when among other checks I had unmounted the /media/usb device to fsck its vfat filesystem. I have reset the epg.dat to be stored on /media/usb/ instead of the spinning disk. I assume this is OK for an 8G USB disk, or is the recommended storage space for EPG the spinning disk? The current epg.dat size following a reboot looks OK to me:
root@mutant51:~# ls -la /media/usb/epg.dat
-rwxrwxrwx 1 root root 40 Feb 9 15:45 /media/usb/epg.dat

I've also set the debug logs to be captured on the USB and I'll see if disabling the autosave for the EPG has resolved my problem.

ccs
09-02-20, 17:15
epg.dat is only read once at boot time, and written once when shutting down (and more often if you think you need to), so I'd put it on the hdd.

epg.dat should end up much bigger than the 40bytes you're seeing.

Mickkie
09-02-20, 17:37
Thank you css, for this useful pointer. I have moved the EPG cache to the hard drive. I wonder if this is what caused all these problems in the first place.

ccs
09-02-20, 17:45
This is my (terrestrial only) epg.dat


root@et10000:/media/hdd# ls -l epg.dat
-rw-r--r-- 1 root root 2931039 Feb 9 16:41 epg.dat
root@et10000:/media/hdd#

birdman
10-02-20, 01:02
I have reset the epg.dat to be stored on /media/usb/ instead of the spinning disk. I assume this is OK for an 8G USB disk, or is the recommended storage space for EPG the spinning disk? I have mine on a USB stick. Since I have UK Freeview the EPG data is updated at various times, so I have things set to updated this file from time to time and I don't want to needlessly spin up the disk.

Joe_90
10-02-20, 14:21
@birdman - I think the EPG data is held in RAM during normal running, so it shouldn't be spinning the HDD up. Only when going into standby/deep standby should the system be transferring the EPG cache to disk.

ccs
10-02-20, 14:25
@birdman - I think the EPG data is held in RAM during normal running, so it shouldn't be spinning the HDD up. Only when going into standby/deep standby should the system be transferring the EPG cache to disk.
That's how I see it, and it only needs saving more often if you suffer power cuts or crashes (unlikely).
I reckon my HDD hardly ever needs to spin up to save the epg, but that's the way I use the box - it's either recording or I'm watching a recording.
And if the OP uses timeshift, the disc never spins down anyway.

birdman
10-02-20, 19:21
@birdman - I think the EPG data is held in RAM during normal running, so it shouldn't be spinning the HDD up.
Menu -> Setup -> EPG -> Settings -> Automatic save (On)
Save every (in hours)

So you can save it regularly if you wish.

Mickkie
14-02-20, 17:32
Menu -> Setup -> EPG -> Settings -> Automatic save (On)
Save every (in hours)

So you can save it regularly if you wish.

Thank you all. I have left my EPG cache on the spinning disk, because the USB seems to take up more resources to save the cache and when combined with storing debug logs (on the USB) it caused a black screen and spinner to show up intermittently. Since I use timeshift there is no latency observed because the disk is spinning anyway at the time. I have also disabled the EPG cache auto save. The Mut@nt is running off a UPS to deal with power cuts and brown outs we suffer from in our area, so regular saving wouldn't offer much to me.

I've now flashed the latest firmware, 5.3.016. With multiboot the Mut@nt is now using a different partition. There are no filesystem errors reported, or anything untoward. For a few days now I thought the problem I reported was caused by me switching the EPG cache to USB and all was sorted out.

Nevertheless, the same problem manifested again earlier today. :(

I paused BBC1 HD, while I went to get a coffee. Rewound to the beginning of the BBC programme upon my return, played/paused a couple of times, then left it paused. After about 20 minutes of so, the TV suddenly started playing again all on its own and all the timeshift buffer was gone! I tried to rewind unsuccessfully. Then I looked at the list of recordings and noticed a recording had started (Ian King Live, timestamp 18232.722 in the attached log). I also saw two entries for this recording - one contained the full recording, the other empty:
-rw-r--r-- 1 root root 0 Feb 14 13:26 20200214 1327 - Sky News - Ian King Live.ts
-rw-r--r-- 1 root root 258 Feb 14 14:24 20200214 1327 - Sky News - Ian King Live.ts.meta
-rw-r--r-- 1 root root 213 Feb 14 13:26 20200214 1327 - Sky News - Ian King Live_001.eit
-rw-r--r-- 1 root root 411808736 Feb 14 14:05 20200214 1327 - Sky News - Ian King Live_001.ts
-rw-r--r-- 1 root root 27824 Feb 14 14:05 20200214 1327 - Sky News - Ian King Live_001.ts.ap
-rw-r--r-- 1 root root 36 Feb 14 14:24 20200214 1327 - Sky News - Ian King Live_001.ts.cuts
-rw-r--r-- 1 root root 274 Feb 14 14:24 20200214 1327 - Sky News - Ian King Live_001.ts.meta
-rw-r--r-- 1 root root 967408 Feb 14 14:05 20200214 1327 - Sky News - Ian King Live_001.ts.sc

There was no crash log, but I captured a debug log which I attach in this post. I can see a problem from timestamp 18274.201, but I don't know what caused this.

birdman
15-02-20, 01:16
There was no crash log, but I captured a debug log which I attach in this post. I can see a problem from timestamp 18274.201, but I don't know what caused this.The lack of a crash log would appear to be odd.

The debug log you've posted show enigma2 starting up at ~13:26:50.
Given the timestamp of your files it looks like a recording was started (13:26 20200214 1327 - Sky News - Ian King Live.ts) but never written to. Then enigma2 restarted (unknown reason) and on starting found a pending recording (the same one) so started it again (the 001 series of files) and that worked.
Also odd is that the original meta file appears to be being updated, even though the associated recording isn't.

The errors seem to be associated with timeshift, rather than the Sky News recording.

If you have the previous debug log to the one you've posted it might be more interesting, as that may show why enigma2 restarted.

Mickkie
15-02-20, 11:36
Yes, there is more info in the previous log along with a backtrace - attached. I tried to play both 'Ian King Live' recordings to confirm the first was empty, this could have altered the meta files?

Anyway, I had another crash during the night, when the box wakes up to record a weather report, which resulted in this unexpected message and made me suspect the NTP setting:

< 6714.077> [Task] job Components.Task.Job name=Cleaning Trashes #tasks=1 completed with [] in None
< 6714.079> [InputHotplug] connectionLost? [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
]< 6714.079>
< 6714.083> [AutoBouquetsMaker] next wakeup due 0
< 6714.083> set wakeup time to 2020/02/15 18:22
< 6714.083> recordTimerWakeupAuto True
< 6714.123> [AutoTimer] No changes in configuration, won't parse
< 6714.130> [EPGRefresh] Stopping Timer
< 6714.131> [XMLTVImport] autostart (1) occured at 1581730200.49
< 6714.131> [XMLTVImport] Stop
< 6714.131> [ABM-Scheduler][Scheduleautostart] reason(1), session None
< 6714.131> [ABM-Scheduler][Scheduleautostart] Stop
< 6714.134> [eDVBDB] ---- saving lame channel db
< 6714.136> [eDVBDB] saved 10 channels and 178 services!
< 6714.145> [eDVBResourceManager] release cached channel (timer timeout)
< 6714.145> [eDVBLocalTimerHandler] remove channel 0xf385e0
< 6714.145> [eEPGCache] remove channel 0xf385e0
< 6714.145> [eDVBResourceManager] stop release channel timer
< 6714.146> [Avahi] avahi_timeout_new
< 6714.147> [Avahi] avahi_timeout_free
< 6714.148> [Avahi] avahi_watch_free
< 6714.148> [Avahi] avahi_timeout_update
< 6714.148> [Avahi] avahi_timeout_free
< 6714.154> [eEPGCache] store epg to realpath '/media/hdd/epg.dat'
< 6714.193> [eEPGCache] 28417 events written to /media/hdd/epg.dat
< 6714.375> [eventData] EPG Cache is corrupt (eventData::~eventData), you should restart Enigma!
< 6714.413> [eventData] EPG Cache is corrupt (eventData::~eventData), you should restart Enigma!
[snip ...]

< 6714.478> [eventData] EPG Cache is corrupt (eventData::~eventData), you should restart Enigma!
< 6714.478> [eventData] EPG Cache is corrupt (eventData::~eventData), you should restart Enigma!
< 6714.498> [eDVBLocalTimeHandler] set RTC to previous valid time
< 6714.498> [eDVBLocalTimerHandler] set RTC Time
< 6714.498> [eInit] - (41) eServiceFactoryDVD
< 6714.498> [eInit] - (41) eServiceFactoryWebTS
< 6714.498> [eInit] - (41) eServiceFactoryTS
< 6714.498> [eInit] - (41) eServiceFactoryHDMI
< 6714.498> [eInit] - (41) eServiceFactoryM2TS
< 6714.498> [eInit] - (41) eServiceFactoryMP3
< 6714.498> [eInit] - (41) eServiceFactoryFS
< 6714.498> [eInit] - (41) eServiceFactoryDVB
< 6714.498> [eInit] - (41) Encoders
< 6714.498> [eInit] - (41) Stream server
< 6714.498> [eInit] - (40) eServiceCenter
< 6714.498> [eInit] - (35) CI Slots
< 6714.498> [eInit] - (35) CA handler
< 6714.498> [eInit] - (30) eActionMap
< 6714.498> [eInit] - (22) Hdmi CEC driver
< 6714.499> [eInit] - (21) input device driver
< 6714.617> [eInit] - (20) DVB-CI UI
< 6714.617> [eInit] - (20) UHF Modulator
< 6714.617> [eInit] - (20) RC Input layer
< 6714.617> [eInit] - (20) misc options
< 6714.617> [eInit] - (20) AVSwitch Driver
< 6714.617> [eInit] - (15) eWindowStyleManager
< 6714.617> [eInit] - (10) gRC
< 6714.617> [gRC] waiting for gRC thread shutdown
< 6714.617> [gRC] thread has finished
< 6714.617> [eInit] - (9) GFBDC
< 6714.621> [eInit] - (9) gLCD
< 6714.621> [eInit] - (9) Font Render Class
< 6714.622> [eInit] - (8) graphics acceleration manager
< 6714.622> [eInit] - (5) Tuxtxt
< 6714.622> [eInit] - (1) Background File Eraser
< 6714.622> [eInit] reached rl -1
[MAIN] executing main
TuxTxt cache cleared
cleaning up
TuxTxt cache cleared
When I checked this morning after seeing the above log I discovered there was no /media/hdd/epg.dat file around. Could a non-responsive NTP server cause these problems? A recording timer kicks in and starts a recording, then the time is changed as it starts and this causes the crash? Anyway, I've changed it to use Transponder time for now to see if this fixes it.

birdman
16-02-20, 12:50
Yes, there is more info in the previous log along with a backtrace - attached. I tried to play both 'Ian King Live' recordings to confirm the first was empty, this could have altered the meta files?Yes, it would. It will update the meta file with the size of the recording.
Thanks.

birdman
16-02-20, 13:08
Anyway, I had another crash during the night, when the box wakes up to record a weather report, which resulted in this unexpected message and made me suspect the NTP setting:Not sure why.
Any time it checks the time is sets the RTC (Real Time Clock), which enables the box to know what time it is as soon as it wakes up.

All of the problems relate the the EPG cache. I have no idea what could cause it to become corrupt so often.


When I checked this morning after seeing the above log I discovered there was no /media/hdd/epg.dat file around.
Once it is seen to be corrupt (which happens on writing - it's the in-memory data which is corrupt) it deletes the file (as it has no confidence in what it would write to it as the data is corrupt...).

ccs
16-02-20, 13:16
Not absolutely sure if the OP is only using freeview, but there have been a number of re-tune events in the last 10 days.

Maybe this could be the issue with the epg?


https://www.freeview.co.uk/corporate/platform-management/700mhz-clearance/clearance-events-2020

birdman
17-02-20, 03:32
Not absolutely sure if the OP is only using freeview, but there have been a number of re-tune events in the last 10 days.

Maybe this could be the issue with the epg?I doubt it (but I never run timeshift).

My transmitter had a Mux change on Wednesday. This just resulted in a 1-line change to lamedb (the frequency for that Mux - all of the network ids were unchanged).
Starting the box up with the "wrong" lamedb didn't cause any problems. Just as well, otherwise re-tuning might be an issue.

Mickkie
17-02-20, 12:53
Thanks for pointing the Freeview transmitter changes to me. I'm tuned into the Crystal Palace transmitter and have no satellite tuners. I don't recall this being a problem in the past, but I rescanned all channels this morning and ran the autobouquet scan too. Let's see if this makes any odds.

I wasn't capturing and looking at debug logs on a regular basis to know if the EPG cache was corrupted as often in the past, but it seems to be a regular occurrence in the last few days. I temporarily set it to autosave every 8 hours to see if it would stop the crashes, but it doesn't. The crashes happened whether I store the EPG on the spinning disk or on a USB stick.

Last night there was a crash-fest. More about it below, but here's a list of some things I have tried to see if they stop these 'crashes':

1. Change theme.
2. Change the time NTP servers, also change source from NTP to Transponder.
3. Store the EPG cache on the spinning disk, instead of USB stick.
4. Go back a couple of versions on the firmware.
5. Install the latest firmware in a different slot.
6. Ran fsck on the USB, hard disk and mmc partitions after unmounting them.

None of the above efforts worked to eliminate crashes. The crashes coincide around the start or end of recordings, when the timer margin kicks in. They will always happen if I'm recording more than one programme at at time, but may also happen as a recording starts while I'm watching of have paused something in timeshift.

For example last night I started recording Endeavour on ITV HD (20:00-22:00) and Top Gear on BBC TWO HD (20:00-21:00), plus The Pale Horse on BBC ONE HD (21:00-22:00).

At around 20:32 we started playing back Endeavour (with subtitles on) from the list of recordings.

At 20:57 as the Top Gear was finishing and the Pale Horse recording started, the now familiar crash took place. The TV screen went black for a second or two, all recordings stopped and then restarted creating additional files. Here's the Top Gear and Pale Horse files we ended up with - I have not played back any of them yet to avoid altering their meta files:

root@mutant51:~# ls -ltcr /media/hdd/movie/
[snip ...]
-rw-r--r-- 1 root root 166 Feb 16 19:56 20200216 1957 - BBC TWO HD - Top Gear.eit
-rw-r--r-- 1 root root 1473712 Feb 16 20:58 20200216 1957 - BBC TWO HD - Top Gear.ts.sc
-rw-r--r-- 1 root root 2427191296 Feb 16 20:58 20200216 1957 - BBC TWO HD - Top Gear.ts
-rw-r--r-- 1 root root 154128 Feb 16 21:04 20200216 1957 - BBC TWO HD - Top Gear_001.ts.sc
-rw-r--r-- 1 root root 24 Feb 16 21:05 20200216 1957 - BBC TWO HD - Top Gear_001.ts.cuts
-rw-r--r-- 1 root root 6736 Feb 16 21:05 20200216 1957 - BBC TWO HD - Top Gear_001.ts.ap
-rw-r--r-- 1 root root 294517228 Feb 16 21:05 20200216 1957 - BBC TWO HD - Top Gear_001.ts


root@mutant51:/media/hdd/movie/Drama# ls -ltcr
-rw-r--r-- 1 root root 236 Feb 16 20:56 20200216 2057 - BBC ONE HD - The Pale Horse.eit
-rw-r--r-- 1 root root 33680 Feb 16 20:58 20200216 2057 - BBC ONE HD - The Pale Horse.ts.sc
-rw-r--r-- 1 root root 33304576 Feb 16 20:58 20200216 2057 - BBC ONE HD - The Pale Horse.ts
-rw-r--r-- 1 root root 298 Feb 16 20:59 20200216 2057 - BBC ONE HD - The Pale Horse.ts.meta
-rw-r--r-- 1 root root 1480080 Feb 16 22:00 20200216 2057 - BBC ONE HD - The Pale Horse_001.ts.sc
-rw-r--r-- 1 root root 1438834688 Feb 16 22:00 20200216 2057 - BBC ONE HD - The Pale Horse_001.ts
-rw-r--r-- 1 root root 302 Feb 16 22:00 20200216 2057 - BBC ONE HD - The Pale Horse_001.ts.meta
-rw-r--r-- 1 root root 110496 Feb 16 22:05 20200216 2057 - BBC ONE HD - The Pale Horse_001_002.ts.sc
-rw-r--r-- 1 root root 24 Feb 16 22:05 20200216 2057 - BBC ONE HD - The Pale Horse_001_002.ts.cuts
-rw-r--r-- 1 root root 4592 Feb 16 22:05 20200216 2057 - BBC ONE HD - The Pale Horse_001_002.ts.ap
-rw-r--r-- 1 root root 213626468 Feb 16 22:05 20200216 2057 - BBC ONE HD - The Pale Horse_001_002.ts
-rw-r--r-- 1 root root 300 Feb 16 22:16 20200216 2057 - BBC ONE HD - The Pale Horse_001_002.ts.meta

We had to watch the Endeavour recordings over 3 different files due to the interruptions in the recording.

I've looked through the logs, a new hobby I developed recently and which I'm increasingly spending more time doing than using the Mut@nt. LOL!

Every time there is a crash this is what is captured at the end of the debug file (interestingly as a rule no crash files are generated):

< 3913.531> [eSubtitleWidget] disp width 1920, disp height 1080
< 3913.531> [eSubtitleWidget] add 0 906 1920 58
< 3913.531> [eSubtitleWidget] disp width 1920, disp height 1080
< 3916.033> [eEPGCache] start caching events(1581886700)
< 3916.033> [eDVBSectionReader] DMX_SET_FILTER pid=211
< 3916.033> [eDVBSectionReader] DMX_SET_FILTER pid=561
< 3916.033> [eDVBSectionReader] DMX_SET_FILTER pid=18
< 3916.034> [eDVBSectionReader] DMX_SET_FILTER pid=18
< 3916.034> [eDVBSectionReader] DMX_SET_FILTER pid=18
< 3916.034> [huffman] read.. '/usr/share/enigma2/otv_eeee0000_233a_4084.dict'
< 3916.034> [eEPGCache] abort non avail OpenTV EIT reading
< 3916.187> [eSubtitleWidget] setPage
< 3916.187> [eSubtitleWidget] add 0 848 1920 58
< 3916.187> [eSubtitleWidget] disp width 1920, disp height 1080
< 3916.187> [eSubtitleWidget] add 0 906 1920 58
< 3916.187> [eSubtitleWidget] disp width 1920, disp height 1080
< 3917.741> [eSubtitleWidget] setPage
< 3917.741> [eSubtitleWidget] add 0 906 1920 58
< 3917.741> [eSubtitleWidget] disp width 1920, disp height 1080
< 3919.957> [eSubtitleWidget] setPage
< 3919.957> [eSubtitleWidget] add 0 848 1920 58
< 3919.957> [eSubtitleWidget] disp width 1920, disp height 1080
< 3919.957> [eSubtitleWidget] add 0 906 1920 58
< 3919.957> [eSubtitleWidget] disp width 1920, disp height 1080
< 3920.278> Backtrace:
< 3920.279> /usr/bin/enigma2(_Z17handleFatalSignaliP9siginfo_tPv) [0x1844A4]
< 3920.279> /lib/libc.so.6(__default_rt_sa_restorer) [0xB6D1DE20]
< 3920.280> /usr/bin/enigma2(_ZN9eventDataD2Ev) [0x14533C]
< 3920.282> /usr/bin/enigma2(_ZN9eEPGCache11sectionReadEPKhiPNS_12chann el_dataE) [0x14A54C]
< 3920.283> /usr/bin/enigma2(_ZN9eEPGCache12channel_data8readDataEPKhi) [0x14AD30]
< 3920.283> /usr/bin/enigma2(n/a) [0x1688C0]
< 3920.283> /usr/bin/enigma2(_ZN17eDVBSectionReader4dataEi) [0x159728]
< 3920.284> /usr/bin/enigma2(n/a) [0x1753E8]
< 3920.284> /usr/bin/enigma2(n/a) [0x17C07C]
< 3920.284> /usr/bin/enigma2(_ZN9eMainloop15processOneEventElPP7_object 9ePyObject) [0x182134]
< 3920.284> /usr/bin/enigma2(_ZN9eMainloop7iterateEjPP7_object9ePyObjec t) [0x182290]
< 3920.284> /usr/bin/enigma2(_ZN9eMainloop7runLoopEv) [0x182380]
< 3920.284> /usr/bin/enigma2(_ZN9eEPGCache6threadEv) [0x14A024]
< 3920.284> /usr/bin/enigma2(_ZN7eThread7wrapperEPv) [0x174078]
< 3920.284> -------FATAL SIGNAL (11)

Backtraces are identical for each crash. I attach the debug log from the events leading up to last night's first crash. You'll notice towards the end at timestamp 3916.033, there was an EPG caching attempt, but I don't know if this caused the problem.

I have also looked at the OS logs but couldn't see anything suspicious in there. The OS itself is not crashing, or rebooting.

I have not yet opened the box to measure voltages in case the PSU is on its way out, but I'm thinking if it were a RAM or power supply issue, wouldn't the OS be complaining about it too?

twol
17-02-20, 13:15
So check your disks (including flash) for epg.dat.
Just check in menu/setup where epg is being saved, make sure its what you want)
Telnet into the box and enter init 4(to stop box)(space between init and 4 or later 6)
Using filezilla delete any and all epg.dat‘s!
On telnet enter init 6 (to reboot)

I guess you have said somewhere but how are you creating the EPG?

ccs
17-02-20, 13:22
Searching "huffman" from the debug logs comes up with....

https://www.world-of-satellite.com/showthread.php?62250-Box-crashes-when-using-EPG&p=492352&viewfull=1#post492352

Mickkie
17-02-20, 14:02
Searching "huffman" from the debug logs comes up with....

https://www.world-of-satellite.com/showthread.php?62250-Box-crashes-when-using-EPG&p=492352&viewfull=1#post492352Interesting find! Does this mean it is a bug?

Anyhow, I followed twol's advice:

Following a rescan of channels and autobouquets, I rebooted the box. Then looked around for epg.dat files:

root@mutant51:~# find / -iname epg.dat
root@mutant51:~#

I manually saved the EPG cache, using Setup/EPG/Load-save-delete and looked again:

root@mutant51:~# find / -iname epg.dat
/media/hdd/epg.dat
root@mutant51:~# ls -l /media/hdd/epg.dat
-rw-r--r-- 1 root root 3727829 Feb 17 12:42 /media/hdd/epg.dat
A screenshot of my current EPG settings is attached. I don't know what EPG types I ought to have enabled for Freeview. I noticed in the logs OpenTV EPG could not be found, so I disabled it. I also disabled maintaining old EPG data. I recall enabling this setting recently and wonder if it this was when all these crashes started taking place ... hmm :-/

ccs
17-02-20, 14:07
These are my freeview settings....

I've got OpenTV set to yes, not sure why, but it doesn't do (me) any harm.

twol
17-02-20, 14:09
Whats in Menu/setup/ViX/Mount Manager and also in menu/information/devices??

Mickkie
17-02-20, 14:38
Whats in Menu/setup/ViX/Mount Manager and also in menu/information/devices??

Please see attached screenshots.

Looking at ccs' screenshot (thank you) I also removed the EPG autorefresh which I had enabled recently. No crashes so far. :-)

ccs
17-02-20, 14:52
I wonder if there are any clues in the timers menu.

If you highlight a completed timer which appeared to have issues, and press info, it may tell you what it didn't like.

birdman
18-02-20, 02:31
Searching "huffman" from the debug logs comes up with....That's related to OpenTV, which you don't want/need to have enabled for UK Freeview.
(I get the same message - it's not an issue).

What options do you have enabled for EPG gathering? It should only be getting data from EIT.

Mickkie
19-02-20, 12:13
What options do you have enabled for EPG gathering? It should only be getting data from EIT.
Thank you all for your help with this problem. I should have retraced my steps as soon as I noticed things going wrong and would have kept this thread shorter!

I had made some changes in the EPG Settings recently. One of these changes was to enable "Automatic refresh" every hour, thinking this ought to be a good thing, because it would catch any changes to programmes and times and adjust timers for recordings accordingly. For some reason, the automatic refresh ended up corrupting the EPG, deleting the cache file on /media/hdd/epg.dat, then crashing Enigma if I was doing anything more than watching live TV with the box. Capturing debug logs while trying to bottom out the cause did not help, as this was consuming additional resources. I think this is a bug of some sort, which can be replicated on this box.

With css and birdman's kind input I've now disabled automatic refresh for EPG and left only EIT as a source for EPG data. The box is now back to its normal stable behaviour.

Thanks again! :)

ccs
19-02-20, 14:02
Anyone know what "Automatic refresh" is supposed to do, I've always wondered.

birdman
20-02-20, 02:06
Anyone know what "Automatic refresh" is supposed to do, I've always wondered.Updates epg.dat on disk from the in-memory EPG data.
So when you get EPG data from EIT (which updates "continuously") it is useful to update the data in case enigma2 crashes(on an orderly shutdown it is updated anyway).

I presume it is also useful for those who download their EPG data, so that it saves the last downloaded data (but that may happen by some other means - not using downloaded EPG data I've never looked).

ccs
20-02-20, 11:14
Updates epg.dat on disk from the in-memory EPG data.
So when you get EPG data from EIT (which updates "continuously") it is useful to update the data in case enigma2 crashes(on an orderly shutdown it is updated anyway).

I presume it is also useful for those who download their EPG data, so that it saves the last downloaded data (but that may happen by some other means - not using downloaded EPG data I've never looked).
I thought that was automatic save, with the next option save every(in hours).

Automatic refresh almost sounds a bit like the opposite.

birdman
20-02-20, 13:22
Automatic refresh almost sounds a bit like the opposite.Sorry. You are correct.
This does schedule an automatic refresh from epg.dat (and, if that happens to be /epg.dat, deletes it).
Presumably the epgcache.load() method is how downloaded EPG data gets put into use? Not sure why you'd want to schedule reloads of it, though.