Autotimers and Description Uniqueness

**adm** · 10-09-20, 17:10

Originally Posted by birdman

It's a fix for something - although the actual fix should be more like:

Code:

 retValue = extdesc1 == extdesc2

with relevant handling of any Null values.
However, that something is only a small subset of cases - it's not going to have any effect on most of the cases in this thread.

What a fix should do is prevent a bypassed test 3 from altering the result of test 2. Currently test 2 (the short description similarity check) could indicate that the two strings are similar and on its own would set retValue = True. However, when there is no extended description, and so no further test, retValue immediately gets set to False.

This doesn't "cure" the small number of problem EPG description cases in this thread, but it does give consistent results when the third test is user requested but cannot be performed because there is no additional data on which to base a comparison.
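
A minimal sketch of the behaviour described above, assuming the similarity tests are `difflib.SequenceMatcher` ratios compared against a fixed threshold. The function name, its parameters and the 0.8 threshold are hypothetical and not taken from autotimer.py; skipping test 3 when either extended description is empty is also an assumption. The only point is that a bypassed test 3 leaves the result of test 2 untouched.

```python
from difflib import SequenceMatcher


def descriptions_match(shortdesc1, shortdesc2, extdesc1, extdesc2,
                       check_extended=True, threshold=0.8):
    """Hypothetical helper: True when two events look like duplicates."""
    # Test 2: short description similarity (needs both strings).
    if not (shortdesc1 and shortdesc2):
        return False
    ret_value = SequenceMatcher(None, shortdesc1, shortdesc2).ratio() > threshold

    # Test 3: only run when requested AND both extended descriptions exist.
    # If it cannot run, ret_value simply keeps the result of test 2.
    if ret_value and check_extended and extdesc1 and extdesc2:
        ret_value = SequenceMatcher(None, extdesc1, extdesc2).ratio() > threshold
    return ret_value
```

With this shape, two events with matching short descriptions and empty extended descriptions are still reported as a match, rather than being flipped to "no match" by a test that never ran.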

**adm** · 10-09-20, 18:02

Originally Posted by Old Codger

I thought the following might be of some interest.

Having installed OpenPLi a few weeks ago, I decided to have a look at its version of Autotimer.

The 'Edit Autotimer' screen has several extra options not present in OpenViX -
1. Description - short equal extended for match (default 'no')
2. Do not skip match when not description (default 'no')
3. Percentage ratio for duplicate matches (from '50%' to '100%' in 10 percent steps, default '80%')

If you hadn’t participated in this thread, examined the code and added some extra debug comments to the autotimer.py code, would you have understood what those options really were and the implications of selecting any one of them? As for tailoring the ratio number to detect differences in descriptions, your first guess didn’t work, and that’s after you and others in this thread have carefully analysed the data. What chance have 99.999% of other users got of selecting the “correct” value when they only want to spend 30 seconds setting a timer to record all episodes and ignore repeats? Would they understand that they may need to set 100% for the <1% of problem program descriptions, BUT that this same setting could have unforeseen consequences for many other program descriptions that work perfectly in the current version of OpenViX? I guess many people setting an autotimer just press record in the EPG, get only the two options of timer or autotimer, and on selecting the latter see no other options (the previously configured defaults are used).

Surely an ongoing aim would be to make the user interface less geeky and more user friendly? I'm sure at times even some of the expert features are not understood by the experts.

In the case of this thread the main problem has been the use of the words unique and uniqueness in the user interface and users questioning why when they can see something is unique, or not, the timers have been set incorrectly as a result.

**adm** · 10-09-20, 20:50

While there is a bug in the part of the autotimer code discussed in the previous pages of this thread (if the final test is user requested but doesn’t take place because the extended description data is not populated, the result of the previous valid test is changed), there is another question...

I’ve been looking at multiple test cases and at the populated data in the short and extended descriptions.

In the majority of cases the data in the extended description appears to be a direct copy of that in the short description.

Example 1:

name1 ='Aerial Ireland'
name2 ='Tech 24'

shortdesc1='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

shortdesc2='This magazine series features the latest innovations in science and technology.'

extdesc1 ='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

extdesc2 ='This magazine series features the latest innovations in science and technology.'

There are examples where the above rule does not hold:

i) the extended description is empty (example 2)
ii) the short description is different to the extended description (example 3)
iii) both the short and extended descriptions are missing (example 4)
iv) the short description is not populated but the extended description is (example 5)

What is populating the short and long descriptions, and should one be a copy of the other? Could some of the checking code previously discussed in this thread be producing the wrong results because the description strings are being incorrectly populated before the tests are even made?
If for instance in previous builds the short and extended data were identical the bug in the similarity checking code wouldn’t have been seen.

Example 2:

name1 ='Aerial Ireland'
name2 ='Malcolm in the Middle'

shortdesc1='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

shortdesc2='When Malcolm becomes editor of the high school literary magazine, the principal instructs him to censor a well-written story. (S5, ep9) [S]'

extdesc1 ='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

extdesc2 =''

Example 3:

name1 ='Aerial Ireland'
name2 ='Click'

shortdesc1='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

shortdesc2='(S2020E0)'

extdesc1 ='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

extdesc2 ='Click looks at how people with disabilities are dealing with lockdowns due to the coronavirus pandemic, speaking to Microsoft's head of accessibility among others.'

Example 4:

name1 ='Aerial Ireland'
name2 ='Scrubs'

shortdesc1='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

shortdesc2=''

extdesc1 ='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

extdesc2 =''

Example 5:

name1 ='Aerial Ireland'
name2 ='Rick and Morty [adult swim]'

shortdesc1='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

shortdesc2=''

extdesc1 ='A sky-high tour over Ireland reveals its towering cliffs, rolling hills, ancient ruins, and rich history.'

extdesc2 ='Raising Gazorpazorp: Morty fathers an alien baby after convincing Rick to buy him a sexy robot. Rick and Summer visit the robot's planet but get trapped in another dimension. (S1 Ep7/11)'
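
The five cases in the examples above can be told apart mechanically before any similarity test is run. This is only an illustrative sketch: the function name and the labels are made up here, and nothing like it is known to exist in autotimer.py.

```python
def classify_descriptions(shortdesc, extdesc):
    """Hypothetical helper: label how one event's descriptions are populated."""
    if shortdesc and extdesc:
        # Example 1 (extended is a copy) versus example 3 (they differ).
        return "copy" if shortdesc == extdesc else "different"
    if shortdesc:
        return "extended empty"   # example 2
    if extdesc:
        return "short empty"      # example 5
    return "both empty"           # example 4


# The second event from examples 3 and 5 in this post:
print(classify_descriptions("(S2020E0)", "Click looks at how people ..."))
print(classify_descriptions("", "Raising Gazorpazorp: Morty fathers ..."))
```

Logging such a label alongside the existing debug output might help show how often each case really occurs in a given EPG source.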

**birdman** · 10-09-20, 22:02

Originally Posted by BrianTheTechieSnail

First you say perfection is impossible so give up.
Now you say it's not perfect - so it's not worth bothering with.

In the context of this thread those two statements are so similar that they are identical.

**BrianTheTechieSnail** · 11-09-20, 02:27

Originally Posted by birdman

In the context of this thread those two statements are so similar that they are identical.

Point taken.
At least I think so, it seems to me that there are many ways to interpret that statement.

**BrianTheTechieSnail** · 12-09-20, 02:07

I'm testing the attached modified AutoTimer.py on my box at the moment.
I'm hopeful that it fixes the Two in Clover problem and a few other possible problems too.
But of course it may have nasty mistakes in it or introduce other problems I haven't thought of.
If anyone wants to try it I suggest at least looking through it for anything you completely disagree with before trying it.

It looks for more exact matches in descriptions (it should only ignore punctuation and spacing changes, everything else must match 100% exactly).
This might mean you get extra recordings if you're checking for "unique" descriptions but I thought that was better than not getting any.
Titles must match exactly too, except that it removes any "New:" before comparing titles, so it will match titles either with or without "new:" (it doesn't affect manual EPG Search for titles).
Yes I do know that's highly English language specific, maybe even highly UK specific.

**adm** · 12-09-20, 10:37

Originally Posted by BrianTheTechieSnail

I'm testing the attached modified AutoTimer.py on my box at the moment.
I'm hopeful that it fixes the Two in Clover problem and a few other possible problems too.
But of course it may have nasty mistakes in it or introduce other problems I haven't thought of.
If anyone wants to try it I suggest at least looking through it for anything you completely disagree with before trying it.

It looks for more exact matches in descriptions (it should only ignore punctuation and spacing changes, everything else must match 100% exactly).
This might mean you get extra recordings if you're checking for "unique" descriptions but I thought that was better than not getting any.
Titles must match exactly too, except that it removes any "New:" before comparing titles, so it will match titles either with or without "new:" (it doesn't affect manual EPG Search for titles).
Yes I do know that's highly English language specific, maybe even highly UK specific.

You appear to be performing tests based on some observations I made in my last post. I questioned whether what I was seeing prior to the similarity testing was what was expected, such as duplication of the short description and/or a missing short description with a populated extended description. Is there another bug to be found before attempting to compensate elsewhere?

You may be making assumptions about the format of the EPG data based only on UK over the air collection, and these may not be valid if the EPG data is collected in a different way – from the Internet. Other sources of EPG may have more data.

Perhaps it's better to establish whether missing description data is a bug that needs to be fixed and/or whether duplication is expected in all cases when data is obtained in a specific way[1].

The danger of using one example of a problem EPG description, as you say, is that you can miss the bigger picture. In the UK identical programs/repeats may have a slightly different EPG description. For instance the broadcaster may include a marker for a subtitle or signed for the deaf on one showing but not the next. Other non-UK broadcasters may add other such data on certain showings.

You have already indicated that different platforms (for instance Freeview and Freesat) may have different EPGs for the same programs and many people have both terrestrial and satellite tuners and may have recorded a series on one service and for further broadcast on another service, and want the checking to consider already recorded episodes. I personally haven’t seen too much of a difference between the bulk wording between the UK services but have seen it in the Series/Episode part. Does [S1, Ep02] = S1, Ep2 or does [S1, Ep02] = S1, Ep 2/8 or does [HD] = Also in HD?

Giving any preference to equality checking rather than similarity checking may result in many more unwanted repeats than with the current code.

I’m not sure how easy it would be to check that your new code was creating better results than the existing code. First you would need a large database of EPG data (perhaps excluding the example of the problem EPG description described in this thread) and run it through the existing similarity checking. Then perform the same test with the same data through your new code. If the results are very similar you may conclude that you have not broken anything (for UK-based EPG data that is obtained over the air). If there are differences you need to establish why, because in general the existing code does work in 99+% of cases (at least for me, where my settings may differ from other users).
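
The comparison described above could be sketched roughly as follows. Everything here is a stand-in: `similarity_check` assumes the existing test is a `difflib.SequenceMatcher` ratio with an assumed 0.8 threshold, `equality_check` is a deliberately crude placeholder for the modified code, and the pairs would really come from a large dump of EPG descriptions.

```python
from difflib import SequenceMatcher


def similarity_check(desc1, desc2, threshold=0.8):
    """Stand-in for the existing check (threshold is an assumption)."""
    return SequenceMatcher(None, desc1, desc2).ratio() > threshold


def equality_check(desc1, desc2):
    """Stand-in for a stricter replacement check."""
    return desc1 == desc2


def compare_checkers(pairs, old_check, new_check):
    """Return every pair on which the two duplicate checks disagree."""
    disagreements = []
    for desc1, desc2 in pairs:
        old_result = old_check(desc1, desc2)
        new_result = new_check(desc1, desc2)
        if old_result != new_result:
            disagreements.append((desc1, desc2, old_result, new_result))
    return disagreements


pairs = [
    ("Episode one. S1, ep1/7", "Episode one. S1, ep1/7"),
    ("Episode one. S1, ep1/7", "Episode one. S1, ep2/7"),
]
for row in compare_checkers(pairs, similarity_check, equality_check):
    print(row)
```

Only the disagreements need examining by hand, which keeps the job manageable even with a large data set.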

I don’t know the answers, so some questions...

Is equality checking in the revised code case dependent? (Is new the same as NEW?)

You indicate some of the testing is UK dependent (the test for New), but what if the EPG string also contains “foreign” characters such as those with umlauts, or similar characters in other languages? Would your new code break under these circumstances?

If you replace punctuation and spaces etc. with, say, an underscore, and two identically worded descriptions with the same series/episode ended up differing in length by 1 character because one had an additional space, would your equality testing fail?

[1]
In the grid EPG view, occasionally when scrolling through the programs a certain program will have no description whereas those adjacent to it will have one. However, on leaving the EPG view and going immediately back in, the program with the missing description now has it, possibly because on entering the EPG for the first time it wasn’t read correctly and the second time it was a refreshed read. [Wild Speculation] Maybe the first time around the information for that program was being updated over the air and so was “busy” and not accessible for both reading and writing. Could the reason for the missing data in the EPG view be the same as what is being seen with the missing description data prior to the similarity checking?

**ccs** · 12-09-20, 10:57

A question: does BBC1 HD have two different "epg" entries if you have both freeview and satellite versions of the channel?

My guess would be yes.

epgcache.cpp would make interesting bedtime reading....

Code:

https://github.com/OpenViX/enigma2/blob/08fdccecfcae6c2913904cf1b3caf72749d6a37f/lib/dvb/epgcache.cpp

**adm** · 12-09-20, 13:42

Originally Posted by ccs

A question: does BBC1 HD have two different "epg" entries if you have both freeview and satellite versions of the channel?

My guess would be yes.

Without comparing the two services for very many entries I cannot say, but a quick check of a few seems to show the same EPG data. However, there may be regional variations, as in my area BBC1 HD is a part-time channel. Often the equivalent of the potter's wheel interlude is shown with a note to tune to BBC SD to see the local programming. Some of the information markers at the end of the EPG description may be (or are) different on the two services: series/episode with or without brackets, indicators for signed, audio description etc.

**ccs** · 12-09-20, 13:49

Originally Posted by adm

Often the equivalent of the potters wheel interlude is shown with a note to tune to BBC SD to see the local news.

I get that on freeview BBC1 HD, what does satellite do?

Edit:

epg search (yellow button) on my HD bouquet finds all SD broadcasts as well.
The only exception is the local SD news broadcast (and one or two others), which you'd expect, as the HD entry has a different description.

**adm** · 12-09-20, 14:01

Originally Posted by ccs

I get that on freeview BBC1 HD, what does satellite do?

Possibly different depending on which region you select.

On terrestrial you are limited to what your local transmitter transmits so, in general, limited to your local services.
On satellite all regions are available to select so I could set a different region to where I'm actually living.
For instance I could be living t'up north and TV via the aerial only gives me the choice of northern regional programs but I may originate from down 'Sarf' and whilst living in the North via satellite I could select a southern region to watch southern regional programming.

**spanner123** · 12-09-20, 14:02

Originally Posted by ccs

I get that on freeview BBC1 HD, what does satellite do?

Edit:

epg search (yellow button) on my HD bouquet finds all SD broadcasts as well. The only exception is the local SD broadcast, which you'd expect, as the HD entry has a different description.

Satellite does the same.

**BrianTheTechieSnail** · 12-09-20, 16:14

Originally Posted by adm

You appear to be performing tests based on some observations I made in my last post. I questioned whether what I was seeing prior to the similarity testing was what was expected, such as duplication of the short description and/or a missing short description with a populated extended description. Is there another bug to be found before attempting to compensate elsewhere?

I don't know. Could be, could be not.

Originally Posted by adm

You may be making assumptions about the format of the EPG data based only on UK over the air collection, and these may not be valid if the EPG data is collected in a different way – from the Internet. Other sources of EPG may have more data.

Yes True.

Originally Posted by adm

Perhaps it's better to establish whether missing description data is a bug that needs to be fixed and/or whether duplication is expected in all cases when data is obtained in a specific way[1].

Maybe yes.

Originally Posted by adm

The danger of using one example of a problem EPG description, as you say, is that you can miss the bigger picture. In the UK identical programs/repeats may have a slightly different EPG description. For instance the broadcaster may include a marker for a subtitle or signed for the deaf on one showing but not the next. Other non-UK broadcasters may add other such data on certain showings.

Yes. Previously I've had some success ignoring the late-night repeats with sign language that the BBC shows, by making the autotimer not record anything after, say, midnight.

Originally Posted by adm

You have already indicated that different platforms (for instance Freeview and Freesat) may have different EPGs for the same programs and many people have both terrestrial and satellite tuners and may have recorded a series on one service and for further broadcast on another service, and want the checking to consider already recorded episodes. I personally haven’t seen too much of a difference between the bulk wording between the UK services but have seen it in the Series/Episode part. Does [S1, Ep02] = S1, Ep2 or does [S1, Ep02] = S1, Ep 2/8 or does [HD] = Also in HD?

Yes I haven't dealt with that possibility at all.
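
One possible way to deal with it, sketched under the assumption that the markers look like the ones quoted in this thread (S1, Ep02 / S1, Ep 2/8 / S5, ep9): pull out the series and episode numbers and compare those rather than the raw text. The regular expression and the function name are illustrative only and will certainly miss other broadcasters' formats, for example the (S2020E0) style seen earlier.

```python
import re

# Assumed marker shape: "S<number>", optional comma, "Ep<number>", any case.
_EPISODE = re.compile(r"\bS\s*(\d+)\s*,?\s*Ep\s*(\d+)", re.IGNORECASE)


def episode_key(description):
    """Return (series, episode) as integers, or None if no marker is found."""
    match = _EPISODE.search(description)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


for text in ("[S1, Ep02]", "S1, Ep2", "S1, Ep 2/8", "[HD]"):
    print(text, "->", episode_key(text))
```

The first three strings give the same key, so they would compare as the same episode; a description with no recognisable marker gives None and would have to fall back to the text comparison.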

Originally Posted by adm

Giving any preference to equality checking rather than similarity checking may result in many more unwanted repeats than with the current code.

I just followed my personal preference of making the highest priority be not missing the chance to record an unseen episode because it had almost the same description as another. I see unwanted duplicated recordings as something that's mostly easily fixed at any time using the delete button.

Originally Posted by adm

I’m not sure how easy it would be to check that your new code was creating better results than the existing code. First you would need a large database of EPG data (perhaps excluding the example of the problem EPG description described in this thread) and run it through the existing similarity checking. Then perform the same test with the same data through your new code. If the results are very similar you may conclude that you have not broken anything (for UK-based EPG data that is obtained over the air). If there are differences you need to establish why, because in general the existing code does work in 99+% of cases (at least for me, where my settings may differ from other users).

I'm not expecting it to replace the original code in the official images.
It's up to anyone who's frustrated with the original code to try mine and decide which they like best themselves.

Originally Posted by adm

I don’t know the answers, so some questions...
Is equality checking in the revised code case dependent? (Is new the same as NEW?)

Yes, new:, New:, NEW: and so on at the beginning are all removed before comparison, as are any spaces after the colon.
Yes, I've made all string comparisons case independent.

Originally Posted by adm

You indicate some of the testing is UK dependent (the test for New), but what if the EPG string also contains “foreign” characters such as those with umlauts, or similar characters in other languages? Would your new code break under these circumstances?

Good question. I don't know.
I would hope that the Python language handles them in a way that means I don't have to worry about them.

Originally Posted by adm

If you replace punctuation and spaces etc. with, say, an underscore, and two identically worded descriptions with the same series/episode ended up differing in length by 1 character because one had an additional space, would your equality testing fail?

I change any group of one or more spaces or punctuation characters into a single underscore.
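
A rough sketch of that kind of normalisation, not the attached code itself. It assumes Python 3, where `str` is Unicode and `\W` treats accented letters such as ü as word characters; on a Python 2 box with byte strings the behaviour would differ. The function names are made up, and stripping the leading/trailing underscore is an extra step added here so that a trailing full stop does not matter.

```python
import re

_NEW_PREFIX = re.compile(r"^\s*new:\s*", re.IGNORECASE)
_SEPARATORS = re.compile(r"[\W_]+")  # any run of spaces/punctuation


def normalise(text):
    """Collapse separator runs to one underscore and ignore case."""
    return _SEPARATORS.sub("_", text).strip("_").casefold()


def normalise_title(title):
    """As normalise(), but first drop a leading 'New:' in any case."""
    return normalise(_NEW_PREFIX.sub("", title))


print(normalise("S1, Ep 2") == normalise("S1,   Ep 2"))       # extra spaces
print(normalise_title("NEW: Click") == normalise_title("click"))
print(normalise("Für Elise!"))
```

Because a whole run of separators becomes one underscore, an additional space does not change the result. A space that appears where there was no separator at all (Ep2 against Ep 2) still produces different strings.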

Originally Posted by adm

[1]
In the grid EPG view, occasionally when scrolling through the programs a certain program will have no description whereas those adjacent to it will have one. However, on leaving the EPG view and going immediately back in, the program with the missing description now has it, possibly because on entering the EPG for the first time it wasn’t read correctly and the second time it was a refreshed read. [Wild Speculation] Maybe the first time around the information for that program was being updated over the air and so was “busy” and not accessible for both reading and writing. Could the reason for the missing data in the EPG view be the same as what is being seen with the missing description data prior to the similarity checking?

There's some kind of caching of old data going on. I'm hoping it only affects the EPG display, but yes, I guess ideally it would need to be tested.

**ccs** · 12-09-20, 16:31

In telnet, strings -n 12 /media/hdd/epg.dat | sort | less gives you some indication of the various programme descriptions you might come across.

Assuming that's where you store it.

(Very simplistic, I know.)

**ccs** · 12-09-20, 17:25

Daft example (Two in Clover, mentioned earlier), but the description (wrongly) changes between series 1 and series 2...

Code:

root@et10000:~# strings -n 12 /media/hdd/epg.dat | grep 'Sid James'
1970s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S2, ep1/6.
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep4/7 (B/W).
1970s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S2, ep4/6.#p
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep5/7 (B/W).J^
1970s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S2, ep5/6.,
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep2/7 (B/W).F
1970s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S2, ep2/6.2
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep3/7 (B/W).Z[r
1970s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S2, ep3/6.
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep7/7 (B/W).
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep1/7 (B/W).
1960s comedy series with Sid James and Victor Spinetti as two city-dwellers who have moved to the countryside - expecting farming life to be simpler. S1, ep6/7 (B/W).~
root@et10000:~#
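
To see why descriptions like these trip up a percentage-based test, here is a small sketch that scores three of the strings above against each other. It assumes the comparison is a plain `difflib.SequenceMatcher` ratio; the exact call and threshold used by autotimer.py may differ.

```python
from difflib import SequenceMatcher

# Text shared by every line of the output above.
BASE = ("comedy series with Sid James and Victor Spinetti as two city-dwellers "
        "who have moved to the countryside - expecting farming life to be simpler. ")

s2_ep1 = "1970s " + BASE + "S2, ep1/6."
s2_ep2 = "1970s " + BASE + "S2, ep2/6."
s1_ep4 = "1960s " + BASE + "S1, ep4/7 (B/W)."


def ratio(a, b):
    """Similarity of two strings as a value between 0 and 1."""
    return SequenceMatcher(None, a, b).ratio()


print("S2 ep1 vs S2 ep2:", ratio(s2_ep1, s2_ep2))
print("S2 ep1 vs S1 ep4:", ratio(s2_ep1, s1_ep4))
```

Two different episodes differ by a single character in a 160-character string, so their ratio is very close to 1 and any threshold below 100% would treat them as the same programme.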
