PDA

View Full Version : [VU+ Solo2] Exception No such file or directory: '/tmp/was_rectimer_wakeup'



asktrevis
06-08-17, 07:34
Hello, today a popup with an exception appeared:

Text No such file or directory: '/tmp/was_rectimer_wakeup'


Log attached.

Regards
asktrevis

bbbuk
06-08-17, 22:23
Looks like an unhandled error:

Fehler: [Failure instance: Traceback: <type 'exceptions.OSError'>: [Errno 2] No such file or directory: '/tmp/was_rectimer_wakeup'
I'm no expert but RecordTimer.py uses:

if os.path.exists("/tmp/was_rectimer_wakeup") but that command only returns true if path exists - ie "/tmp/" rather than actual file as it then proceeds to try to open it but "/tmp" will always return true as path will always be there (I believe). Wouldn't it be better to see if file itself exists before trying to open it?

Similar thing in mytest.py as well. Not checked in anything else.

birdman
07-08-17, 01:34
I'm no expert but RecordTimer.py uses:

if os.path.exists("/tmp/was_rectimer_wakeup") but that command only returns true if path exists - ie "/tmp/" rather than actual file....What makes you believe that?
A path in Linux (and Unix) terms is a path to a file-system object - any object. So:

[parent]: ls /var/log/boot.log*
/var/log/boot.log
[parent]: python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print os.path.exists("/var/log/boot.log")
True
>>> print os.path.exists("/var/log/boot.logx")
False
>>>
[parent]:
i.e. it's a perfectly valid test for "is there such a file". (Well, almost - it would also return True if there were a directory of that name, but that's not the issue here).

birdman
07-08-17, 02:19
Well, the code in question is:
if os.path.exists("/tmp/was_rectimer_wakeup") and not wasRecTimerWakeup:
wasRecTimerWakeup = int(open("/tmp/was_rectimer_wakeup", "r").read()) and True or False
os.remove("/tmp/was_rectimer_wakeup")and the only way I can see the the second and/or third lines can fail if the first line passes is if two threads are running through here at the same time.

Looking at the log a recording is starting up:

< 230.587> [eDVBServiceRecord] Recording to /media/hdd/movie/20170806 0725 - KiKA HD - Matzes Monster.ts...
< 230.588> [eDVBServiceRecord] start recording...
The odd thing is that just before that we have:

< 230.172> [RecordTimer] activating state 2
< 230.176> [RecordTimer] activating state 2That is switching to StateRunning, which is when that file check takes place. But it is being done twice (I've checked with my own logs- it should only get reported once).

So for some (unknown) reason you had two threads running through this code at the same time, and a race condition ensued. Both happened to pass the test but then when one of them deleted the file the other one reported that it wasn't there as an error.

The issue is why were two timers being moved to StateRunning at the same time?

asktrevis
07-08-17, 18:36
Ok. I suspect that I've started enigma2 twice by command Line that day. Sorry for that.

Thanks.

ccs
13-11-18, 20:33
Saw one of these tonight, skipping thru' the graphical epg day by day at the time, looking for a programme on bbc4.

birdman
14-11-18, 02:35
Saw one of these tonight, skipping thru' the graphical epg day by day at the time, looking for a programme on bbc4.If you look at that log you'll see that there was an error in the ubifs file-system at start-up, which is odd , and an error.
Given that /tmp/was_rectimer_wakeup is in memory, it's not directly related - but something on your box's file-system would appear to not be correct.

birdman
14-11-18, 02:46
It is possible to re-write the code so that it won't crash.


if not wasRecTimerWakeup:
try:
wasRecTimerWakeup = int(open("/tmp/was_rectimer_wakeup", "r").read()) and True or False
os.remove("/tmp/was_rectimer_wakeup")
except:
pass

ccs
14-11-18, 10:09
If you look at that log you'll see that there was an error in the ubifs file-system at start-up, which is odd , and an error.
Given that /tmp/was_rectimer_wakeup is in memory, it's not directly related - but something on your box's file-system would appear to not be correct.
I know, I mentioned it ages ago, and it doesn't cause any issues (that I've noticed).

Not much I can do about it I guess?

birdman
14-11-18, 20:16
Not much I can do about it I guess?You could reflash with a recent image.
It sounds (to me...so guessing) as though the nvram has developed a new bad spot?

ccs
14-11-18, 20:26
You could reflash with a recent image.
It sounds (to me...so guessing) as though the nvram has developed a new bad spot?
I couch flashed 5.2.011 on the day it was released, could a usb flash be more effective, there have always been 4 bad "spots" reported when flashing since the day I got the box?

twol
14-11-18, 20:40
If it hasn‘t changed (from 4) I wouldn’t think thats the issue.
I trust ofgwrite (online flash) more than the one-off‘firmware produced by the receivers manufacturer

ccs
14-11-18, 20:44
I've just created an image backup, and there were no errors, although there was one at boot time (a different inode to the crash log yesterday).

ccs
05-05-19, 18:42
I had another one of these today.


< 229.017> Traceback (most recent call last):
< 229.017> File "/usr/lib/enigma2/python/timer.py", line 239, in calcNextActivation
< 229.017> File "/usr/lib/enigma2/python/timer.py", line 314, in processActivation
< 229.018> File "/usr/lib/enigma2/python/RecordTimer.py", line 957, in doActivate
< 229.018> File "/usr/lib/enigma2/python/RecordTimer.py", line 542, in activate
< 229.018> OSError: [Errno 2] No such file or directory: '/tmp/was_rectimer_wakeup'
< 229.018> [ePyObject] (CallObject(<bound method RecordTimer.calcNextActivation of <RecordTimer.RecordTimer instance at 0x7253ba58>>,()) failed)

The crash log looks clean, the only subtle difference in the last 9 months is that the timer (correctly) woke up around 6 hours earler than usual.

birdman
06-05-19, 02:47
I had another one of these today.

Do you have the full debug log?
If so, were there two "[RecordTimer] activating state 2" entries in quick succession for a timer (see #4 (https://www.world-of-satellite.com/showthread.php?57526-Exception-No-such-file-or-directory-tmp-was_rectimer_wakeup&p=455398&viewfull=1#post455398)).

ccs
06-05-19, 10:18
Yes....


< 228.648> [eDVBFrontend0] opening frontend
< 228.649> [eDVBFrontend1] opening frontend
< 228.652> [eDVBFrontend2] opening frontend
< 228.687> [TimerSanityCheck] conflict not found!
< 228.690> [RecordTimer] activating state 2
< 228.692> [RecordTimer] activating state 2
< 228.694> [eDVBServiceRecord] Recording to /media/hdd/movie/20190505 1228 - ITV4 - Live Cycling - Tour de___.ts...
< 228.696> [eDVBServiceRecord] start recording...
< 228.696> [eDVBServiceRecord] RECORD: have 1 video stream(s) (07d1), and 2 audio stream(s) (07d2, 07d3) (07ef), and the pcr pid is 07d1, and the text pid is ffffffff
< 228.697> [eDVBServiceRecord] ADD PID: 0000
< 228.697> [eDVBServiceRecord] ADD PID: 07d0


It was the Tour de Yorkshire, I set up an autotimer, forgot about "all days", so eventually deleted it and created the Sunday timer as a one off timer.
I also edited it to add an extra hour to the end time. No sign of 2 timers.

ccs
06-05-19, 11:31
My crash log in #6 has 2 activating state 2's as well.

birdman
06-05-19, 12:00
Weird....one thing that both logs in this thread (one crash - that happened so quickly most of the log is there, and one debug) is that the AutoTimer thread was running when the recording started up (indeed, in the debug log it was the AutoTimer thread that crashed).

So I'd suggest that I put a mutex on the check for this file so that only one thread can ever run it at any given time. I did this somewhere else in OpenVix a few months back, so I just need to find that code and clone it.

EDIT: And, possibly, work out/look at how the AutoTimer thread can run through the StateRunning section of RecordTimer.

ccs
06-05-19, 13:49
I've got the autotimer startup delay set to 3 minutes, which corresponds roughly with what the logs are saying.

I'm not sure what the default is, but the idea is for autotimer to catch changes to schedules asap, assuming they've reached the epg.

It looks like timers, waking up from deep standby, start up the recording at around the same time, 3 minutes after booting up.

birdman
06-05-19, 17:24
I've got the autotimer startup delay set to 3 minutes, which corresponds roughly with what the logs are saying.That's not the issue. The problem is that something is causing the RecordTimer code to be running in two threads at the same time. This is a problem if both run the StateRunning section.
I can't see why the AutoTimer code should do that: however, since it doesn't seem to cause a problem apart from this then it's possible that it should be running it.
In which case there needs to be a mutex around the "check file exists and delete it if there" code.

birdman
07-05-19, 02:12
I think I see the problem....well, at least the hint of a possible one.

When the AutoTimer code wants to add a new timer it calls:
recordHandler.record(newEntry)

where recordHandler is RecordTimer.
record() in RecordTimer.py calls addTimerEntry() (in timer.py), which calls calcNextActivation() (timer.py) which calls processActivation() (timer.py).
This is, in fact, precisely how the Traceback in the original log goes, but I've only just seen the significance (which I worked out by walking through the code - I've only just noticed that this agrees with the traceback...)

The comment in processActivation() says:
# we keep on processing the first entry until it goes into the future.

and that's the problem.
It does this by running the first timer on the list (they're sorted by next activation time) until the time of the first one is in the future.
BUT, if the first timer has just been fired off in another thread it may well not have yet had time to update its next activation time, so the AutoTimer thread fires it off for a second time - which is what we see. This will be a relatively rare event, which is why we haven't seen it many times.


My first suspicion is that the AutoTimer code (thread) shouldn't be trying to run processActivation() at all! In fact no sub-/background-processes should...

ccs
07-05-19, 09:39
… autotimer had just found a new entry in the last log @< 228.689>, I'd chopped out all the garbage in the file beforehand.

birdman
08-05-19, 19:06
From another thread....

And, in the good tradition of not rushing into fixes, I'm on to version 3.

The standard timers (handled by doActivate in timer.py) remove themselves from the timer_list as the first line of code in doActivate. So they won't be called again if we switch to another thread.

However, RecordTimers have their own doActivate and this doesn't remove them until after the internal activate is called. And that activate can feature file-system access, which is what triggers python to switch to the next thread that wants to run (specifically, it can check for /tmp/was_rectimer_wakeup).
So if I just move the code to remove the timer from timer_list to be at the start of activate all may be OK (provided no activate code expects to find the current timer on the timer list)

And now V4 as on checking through activate() in RecordTimer.py there are actually several bits of code that assume the current timer is still in the list (getNextZapTime() and, probably, saveTimer() for a start). So I can't just remove the timer from the list at the beginning of the code.
However, all we're trying to do is to stop the same timer being run twice (as a result of swapping the running thread) in the loop in processActivation(). So all we need to do is to mark each timer as it is run (and unmark it afterwards) and not run it if it is marked.

So the loop becomes:


# we keep on processing the first entry until it goes into the future.
while True:
timer_list = [ tmr for tmr in self.timer_list if (not tmr.disabled and not getattr(tmr, "currentlyActivated", False)) ]
if timer_list and timer_list[0].getNextActivation() < t:
timer_list[0].currentlyActivated = True
self.doActivate(timer_list[0])
del timer_list[0].currentlyActivated
else:
breakwhich will always stop it running any timer twice.

I'm running with this at the moment to check there are no oddities.

I can't check it against the original problem, as it's not easy (almost impossible) to set-up the required conditions. But the logic is (now) correct.

birdman
09-05-19, 19:31
PR submitted:

https://github.com/OpenViX/enigma2/pull/420

ccs
09-05-19, 19:55
Could this explain why, on vary rare occasions, my box wakes up from deep standby for a timer recording, but doesn't drop back to standby, because /tmp/was_rectimer_wakeup has gone missing?

birdman
10-05-19, 00:16
Could this explain why, on vary rare occasions, my box wakes up from deep standby for a timer recording, but doesn't drop back to standby, because /tmp/was_rectimer_wakeup has gone missing?No.
wasRecTimerWakeup, which is the internal flag set from that file's contents (starts as False), must end up set before the file is deleted. That's true even if two process attempt to do this (this bug); the first will set it, the second will crash but leave the variable unchanged.

ccs
01-09-20, 19:11
Oldish thread, but another crash.....

ccs
02-09-20, 11:06
Only possible clues are that 2 recordings (different mux's) both started at 12:58 from deep standby. 5 hours earlier than my normal routine, when just 1 timer usually starts from deep standby.

birdman
02-09-20, 17:14
Only possible clues are that 2 recordings (different mux's) both started at 12:58 from deep standby.That's what the previous fix was meant to handle. Clearly it doesn't.
I can look at it next week...

birdman
08-09-20, 12:23
It's possible that the code I added created a copy of each of the timer entries, so would never se the other thread set the flag (as that would only happen in its copy).
I'll have to track down what python actually does for this code construct to confirm that....

EDIT: but this is not the case. The new list still refers to the original objects.

So I can't see why it doesn't work.
It would, however, be a simple task to add a lock around the was_rectimer_wakeup check. RecordTimer.py already has such code around writing a new timer entry (as setting a manual timer and AutoTimer setting a timer can occur simultaneously).

birdman
08-09-20, 13:31
OK. I (have to) suppose that what is actually happening is that we have two different timers that pass through this code and so marking the first as activated still lets the second thread activate the second one.

So I'll add a lock around the file test....done. I'll let it run for a while before posting a PR.

birdman
09-09-20, 18:39
https://github.com/OpenViX/enigma2/pull/576

ccs
09-09-20, 18:58
.... thanks, I've installed it and setup tests for the next few days.