
Thread: Autotimers and Description Uniqueness

  1. #91
    ccs's Avatar
    Title
    ViX Beta Tester
    Join Date
    Sep 2014
    Posts
    5,836
    Thanks
    554
    Thanked 1,277 Times in 1,089 Posts
    Quote Originally Posted by spanner123 View Post
    That all makes sense Birdman but why is it Sky boxes or Humax boxes etc never get it wrong?
    Maybe because they use series CRID data.

    Some progress on enigma2/CRID's achieved in Australia a while ago ....

    Code:
    http://beyonwiz.com.au/forum/viewtopic.php?f=54&t=10773
    Last edited by ccs; 10-09-20 at 11:08.

  2. The Following User Says Thank You to ccs For This Useful Post:

    spanner123 (10-09-20)

  3. #92
    ccs's Avatar
    Title
    ViX Beta Tester
    Join Date
    Sep 2014
    Posts
    5,836
    Thanks
    554
    Thanked 1,277 Times in 1,089 Posts
    .... this link looks quite interesting....

    Code:
    https://www.beyonwiz.com.au/forum/viewtopic.php?p=183023#p170895
    @IanSav is probably the best bet for comments.

  4. #93
    BrianTheTechieSnail
    Quote Originally Posted by birdman View Post
    You do seem to be missing the real point.
    It's very simple for a human to read two similar texts and decide whether they are, in fact, the same - we've evolved over thousands of years to recognize patterns.
    There is no way that simple computer code can achieve the same discrimination.
    No matter what comparison code you put in place it will never come up with the "correct result" every time.
    So live with that limitation and look for some other test to help you.
    Oh come on. I never said my answer would be perfect.


  5. #94
    Joe_90's Avatar
    Title
    Moderator
    Join Date
    Mar 2014
    Location
    Wicklow, Ireland
    Posts
    4,126
    Thanks
    1,280
    Thanked 1,126 Times in 888 Posts
    I contributed to a few threads discussing CRIDs several years ago and actually raised a request here on ViX to see if CRID handling could be incorporated into the AutoTimer.py code, but that request thread went nowhere at the time. The beyonwiz team in Australia (as per @ccs's post) have had some success in using this CRID data, but I think the big issue is the lack of consistency in how the CRID information is structured or encoded (as @ccs points out, IanSav and prl have worked on this in the Australian environment). What might work for Freesat/Freeview might need workarounds or kludges for SKY or other broadcasters. I used the "Series Link" feature on SKY and on Humax Freesat receivers and found them excellent. I originally thought the AutoTimer mechanism on enigma to be a little crude in its operation but, over the years, I've adapted to its foibles and have found that it works for me 99% of the time. I rarely miss episodes I want to record and it generally works when a new series starts after being off the air for months (providing the broadcaster doesn't move it to a completely new channel or time slot). Worst case scenario is that I occasionally have multiple recordings of the same episode.
    The CRID mechanism would eliminate the ambiguity of trying to match on title or description but I imagine it would need a complete overhaul of the program logic to keep tables of series and programme CRID info to determine if particular episodes need to be recorded or have already been recorded.
    Last edited by Joe_90; 10-09-20 at 11:52. Reason: beyonwiz reference
    GB Quad Plus, Mut@nt HD51, AX HD61, 80cm dish and Supreme Dark motor. Sony STR-DN 1060, Sony UHP-H1 Bluray, Odroid N2+ (CoreElec), Monitor Audio Bronze 5.1 speakers

  6. #95
    ccs's Avatar
    Title
    ViX Beta Tester
    Join Date
    Sep 2014
    Posts
    5,836
    Thanks
    554
    Thanked 1,277 Times in 1,089 Posts
    ... I'm sure there would be room in the *.ts.meta files to store an extra word or two of crid details.

    If it's blank/missing, use the existing system, if it's not, bingo.

  7. #96
    adm's Avatar
    Title
    Forum Supporter
    Donated Member
    Join Date
    Sep 2014
    Location
    Southend on Sea, UK
    Posts
    1,658
    Thanks
    65
    Thanked 658 Times in 514 Posts
    Quote Originally Posted by fat-tony View Post
    I contributed to a few threads discussing CRIDs several years ago and actually raised a request here on Vix to see if CRID handling could be incorporated into the AutoTimer.py code but that request thread went nowhere at the time. The beyonwiz team in Australia (as per @ccs post) have had some success in using this CRID data, but I think the big issue is the lack of consistency of how the CRID information is structured or encoded. (as @ccs points out, IanSav (and prl) have worked on this in the Australian environment). What might work for Freesat/Freeview might need workarounds or kludges for SKY or other broadcasters.
    This is part of the problem, in that any changes have to work for all broadcasters irrespective of where in the world they may be. Even in the UK the main channels may have a good record with CRID data, but there have in the past also been many instances on the "lesser" channels where strict transmitting of the correct CRID has been a bit lax.

    I originally thought the AutoTimer mechanism on enigma to be a little crude in its operation but, over the years I've adapted to its foibles and have found that it works for me 99% of the time. I rarely miss episodes I want to record and it generally works when a new series starts after being off the air for months (providing the broadcaster doesn't move it to a completely new channel or time slot).
    I've also found Autotimers to be reliable 99+% of the time and if anything "goes wrong" it tends to record too much rather than missing recordings. I can live with the occasional 2 or 3 copies of the repeat. I tend not to set limited time slots, so when a program does move time it tends to be captured three months down the line.

    Note: all my autotimers are set to check in the title and short description only.
    Xtrend ET10K, 2 x satellite tuners 28.2 (Sky FTA), 2 x hybrid (UK Freeview), Zgemma H9S (satellite)

  8. The Following User Says Thank You to adm For This Useful Post:

    Joe_90 (10-09-20)

  9. #97
    BrianTheTechieSnail
    Quote Originally Posted by BrianTheTechieSnail View Post
    The comparisons of the titles and descriptions seem to be done in function checkSimilarity which starts on line 838 of the file AutoTimer.py.
    It uses a function SequenceMatcher from difflib, which you can find descriptions of on the web such as https://towardsdatascience.com/sequencematcher-in-python-6b1e6f3915fc

    I don't think it's really the right function in this application because, for instance, a single character difference in the middle of a description counts as a huge difference while a single character difference near the beginning or end counts only as a small difference. Thus the change from (S01:E04) to (S01:E05) right at the end of a description is seen as something to ignore.
    Okay I've done some tests and THIS IS WRONG.
    It does not seem to see differences in the middle as more important than differences near the beginning and end.
    The descriptions of the SequenceMatcher function I found seem over simple and use over simple examples so I didn't understand exactly what it does (and I still don't).
    SORRY.

    SequenceMatcher probably is a good choice except that numbers need to be given more importance, and I have an idea for that which I will try soon.
    Other people are, as always, free to ignore what I write.
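    A quick way to reproduce the test described above is sketched here. It uses the same SequenceMatcher call shape as checkSimilarity (spaces treated as junk); the description strings are made up for illustration, and the helper name `similarity` is not from AutoTimer.py.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Same call shape as in checkSimilarity: a space is treated as junk.
    return SequenceMatcher(lambda x: x == " ", a, b).ratio()


# Made-up descriptions: one character changed at the end, one in the middle.
base = "Comedy about two friends who swap city life for a farm (S01:E04)"
changed_end = "Comedy about two friends who swap city life for a farm (S01:E05)"
changed_mid = "Comedy about two friends who swop city life for a farm (S01:E04)"

ratio_end = similarity(base, changed_end)
ratio_mid = similarity(base, changed_mid)

# ratio() is 2*M/T, where M is the number of matched characters and T the
# combined length, so one changed character costs the same wherever it sits.
print(ratio_end, ratio_mid)
```

    For these strings both ratios come out the same and well above the 0.7 threshold, which is why a change from E04 to E05 on its own is treated as "no difference".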

  10. #98
    adm's Avatar
    Title
    Forum Supporter
    Donated Member
    Join Date
    Sep 2014
    Location
    Southend on Sea, UK
    Posts
    1,658
    Thanks
    65
    Thanked 658 Times in 514 Posts
    Quote Originally Posted by BrianTheTechieSnail View Post
    SequenceMatcher probably is a good choice except that numbers need to be given more importance, and I have an idea for that which I will try soon.
    Other people are, as always, free to ignore what I write.

    But don't forget there may be many numbers in the description that don't relate to the series or episode. I saw one description the other day it said something like "....post war britain between 1943 and 1952........" Perhaps just making numbers more important in the current tests is not the way to go.

    Also don't forget that any solution for one problem cannot create another and break something the autotimer does well. Fixing a problem in less than 1% of descriptions cannot cause problems with, say, 3% of other descriptions.
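    One way to respect that concern, sketched here purely as an illustration, is to look only for an explicit series/episode marker such as the "(S01:E04)" form mentioned earlier in the thread, and to leave every other number alone. The pattern and the function names are assumptions, not anything in AutoTimer.py, and broadcasters may well use other formats.

```python
import re

# Hypothetical pattern: matches markers like "S01:E04", "S1E4" or "S01 E04".
EPISODE_MARKER = re.compile(r"\bS(\d{1,2})[:\s]?E(\d{1,3})\b")


def episode_marker(description):
    """Return (series, episode) if an explicit marker is present, else None."""
    m = EPISODE_MARKER.search(description or "")
    return (int(m.group(1)), int(m.group(2))) if m else None


def markers_differ(desc1, desc2):
    """True only when both descriptions carry a marker and the markers differ."""
    m1, m2 = episode_marker(desc1), episode_marker(desc2)
    return m1 is not None and m2 is not None and m1 != m2
```

    Years such as 1943 and 1952 never match the pattern, so they would not influence the result; descriptions without a marker would simply fall back to the existing comparison.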
    Xtrend ET10K, 2 x satellite tuners 28.2 (Sky FTA), 2 x hybrid (UK Freeview), Zgemma H9S (satellite)

  11. #99
    BrianTheTechieSnail
    Quote Originally Posted by adm View Post
    But don't forget there may be many numbers in the description that don't relate to the series or episode. I saw one description the other day it said something like "....post war britain between 1943 and 1952........" Perhaps just making numbers more important in the current tests is not the way to go.

    Also don't forget that any solution for one problem cannot create another and break something the autotimer does well. Fixing a problem in less than 1% of descriptions cannot cause problems with, say, 3% of other descriptions.
    Okay, let's confine ourselves to fixing the big logical error you described:
    Quote Originally Posted by adm View Post
    There are 3 separate tests working on 3 discrete bits of EPG.
    I) the title data
    ii) the short description data
    iii) the extended description data

    Test 1: only the title data is compared

    Test 2: only the short description data is compared, but only if:
    i) test 1 produced a match
    ii) the menu option “title and short description” has been selected

    Test 3: only the extended description data is compared but only if:
    i) test 2 produced a match
    ii) the menu option “title and all descriptions” has been selected

    Test 2 is falling over on these problem EPG entries because it is identifying all programs with a generic description, where only the last few characters change, as similar enough to be identical.

    Test 3 is falling over because there is no extended description data to check, but when checking this non-existent data it indicates a difference every time. Garbage in = garbage out - or more correctly garbage in = the same result out every time. This is overriding the result of test 2. Note: there may be extended descriptions from broadcasters in other countries, and maybe if the EPG information is obtained over the net.
    The code is clearly not supposed to do this, there is a test for it, but it's screwed up. Maybe the code I posted before and then lost confidence in is the fix:
    Code:
    	def checkSimilarity(self, timer, name1, name2, shortdesc1, shortdesc2, extdesc1, extdesc2, force=False):
    		foundTitle = False
    		foundShort = False
    		retValue = False
    		if name1 and name2:
    			foundTitle = ( 0.8 < SequenceMatcher(lambda x: x == " ",name1, name2).ratio() )
    		# NOTE: only check extended & short if title is a partial match
    		if foundTitle:
    			if timer.searchForDuplicateDescription > 0 or force:
    				if shortdesc1 and shortdesc2:
    					# If the similarity ratio is higher than 0.7 it is a very close match
    					foundShort = ( 0.7 < SequenceMatcher(lambda x: x == " ",shortdesc1, shortdesc2).ratio() )
    					if foundShort:
    						if timer.searchForDuplicateDescription == 2:
    							if extdesc1 and extdesc2:
    								# Some channels indicate replays in the extended descriptions
    								# If the similarity ratio is higher than 0.7 it is a very close match
    								retValue = ( 0.7 < SequenceMatcher(lambda x: x == " ",extdesc1, extdesc2).ratio() )
    							else:			# Brian was here
    								retValue = True	# Brian was here
    						else:
    							retValue = True
    			else:
    				retValue = True
    		return retValue
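
    For anyone who wants to try the logic away from the box, here is the same cascade flattened into a standalone function with early returns. It is a sketch, not the plugin code: `is_duplicate` and `_similar` are made-up names, and `mode` stands in for timer.searchForDuplicateDescription, which I am assuming means 0 = title only, 1 = title and short description, 2 = title and all descriptions.

```python
from difflib import SequenceMatcher


def _similar(a, b, threshold):
    # Same comparison as in the posted code: spaces are treated as junk.
    return SequenceMatcher(lambda x: x == " ", a, b).ratio() > threshold


def is_duplicate(mode, name1, name2, short1, short2, ext1, ext2, force=False):
    # Test 1: titles must both exist and be a close match.
    if not (name1 and name2 and _similar(name1, name2, 0.8)):
        return False
    # Title-only mode (and no forced description check): the title decides.
    if mode <= 0 and not force:
        return True
    # Test 2: short descriptions must both exist and be a close match.
    if not (short1 and short2 and _similar(short1, short2, 0.7)):
        return False
    if mode != 2:
        return True
    # Test 3: if either extended description is missing, keep the result of
    # test 2 instead of reporting a difference every time.
    if not (ext1 and ext2):
        return True
    return _similar(ext1, ext2, 0.7)


# Made-up EPG data with no extended description at all.
print(is_duplicate(2, "Some Show", "Some Show",
                   "Two friends buy a farm.", "Two friends buy a farm.", "", ""))
```

    With the two "Brian was here" lines in place, the example above counts as a duplicate even in "title and all descriptions" mode, which is what test 2 already concluded.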

  12. #100
    birdman's Avatar
    Title
    Moderator
    Join Date
    Sep 2014
    Location
    Hitchin, UK
    Posts
    7,829
    Thanks
    239
    Thanked 1,664 Times in 1,311 Posts
    Quote Originally Posted by ccs View Post
    ... I'm sure there would be room in the *.ts.meta files to store an extra word or two of crid details
    Not where it's needed. You really want to know you've already recorded something even after you've deleted the timer for and the recording of it.
    A sqlite database of all recorded CRIDs would be the thing to use.
    With a configurable "forget after" time, so that any record older than this would be pruned.
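    A minimal sketch of what such a store could look like, using Python's built-in sqlite3 module. The table name, column names and function names are all hypothetical, and how the CRID would actually be obtained from the EPG is left out.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def open_store(path=":memory:"):
    """Open (or create) the store; ':memory:' is handy for experiments."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS recorded_crids ("
        "crid TEXT PRIMARY KEY, recorded_at REAL NOT NULL)"
    )
    return db


def remember(db, crid, now=None):
    """Note that a programme CRID has been recorded."""
    now = time.time() if now is None else now
    db.execute(
        "INSERT OR REPLACE INTO recorded_crids (crid, recorded_at) VALUES (?, ?)",
        (crid, now),
    )
    db.commit()


def already_recorded(db, crid):
    """True if the CRID is still remembered, even if the timer and file are gone."""
    row = db.execute(
        "SELECT 1 FROM recorded_crids WHERE crid = ?", (crid,)
    ).fetchone()
    return row is not None


def prune(db, forget_after_days, now=None):
    """Delete entries older than the 'forget after' period; return how many."""
    now = time.time() if now is None else now
    cutoff = now - forget_after_days * SECONDS_PER_DAY
    cur = db.execute("DELETE FROM recorded_crids WHERE recorded_at < ?", (cutoff,))
    db.commit()
    return cur.rowcount
```

    The point of keeping this separate from the timers and the *.ts.meta files is that the lookup still works after both have been deleted.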
    MiracleBox Prem Twin HD - 2@DVB-T2 + Xtrend et8000 - 5(incl. 2 different USBs)@DVB-T2[terrestrial - UK Freeview HD, Sandy Heath] - LAN/USB-stick/HDD

  13. #101
    BrianTheTechieSnail
    Quote Originally Posted by birdman View Post
    Not where it's needed. You really want to know you've already recorded something even after you've deleted the timer for and the recording of it.
    A sqlite database of all recorded CRIDs would be the thing to use.
    With a configurable "forget after" time, so that any record older than this would be pruned.
    That would be hard to edit if you accidentally deleted one of a series you were trying to collect.

  14. #102
    birdman's Avatar
    Title
    Moderator
    Join Date
    Sep 2014
    Location
    Hitchin, UK
    Posts
    7,829
    Thanks
    239
    Thanked 1,664 Times in 1,311 Posts
    Quote Originally Posted by BrianTheTechieSnail View Post
    The code is clearly not supposed to do this, there is a test for it, but it's screwed up. Maybe the code I posted before and then lost confidence in is the fix:
    It's a fix for something - although the actual fix should be more like:
    Code:
     retValue = extdesc1 == extdesc2 
    with relevant handling of any Null values.
    However, that something is only a small subset of cases - it's not going to have any effect on most of the cases in this thread.
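    One possible reading of "relevant handling of any Null values", sketched as a standalone helper (the function name is made up): treat a missing extended description and an empty one as the same thing, so that two absent descriptions compare equal.

```python
def ext_descriptions_match(extdesc1, extdesc2):
    # None and "" are both treated as "no extended description".
    return (extdesc1 or "") == (extdesc2 or "")
```

    With this reading, two events with no extended description match, and an event with one never matches an event without.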
    MiracleBox Prem Twin HD - 2@DVB-T2 + Xtrend et8000 - 5(incl. 2 different USBs)@DVB-T2[terrestrial - UK Freeview HD, Sandy Heath] - LAN/USB-stick/HDD

  15. #103
    ccs's Avatar
    Title
    ViX Beta Tester
    Join Date
    Sep 2014
    Posts
    5,836
    Thanks
    554
    Thanked 1,277 Times in 1,089 Posts
    Quote Originally Posted by birdman View Post
    Not where it's needed. You really want to know you've already recorded something even after you've deleted the timer for and the recording of it.
    A sqlite database of all recorded CRIDs would be the thing to use.
    With a configurable "forget after" time, so that any record older than this would be pruned.
    OK, but I was rambling on earlier in this thread suggesting that remembering recordings/timers somewhere after they have been deleted wasn't such a bad idea.

  16. #104
    BrianTheTechieSnail
    Quote Originally Posted by birdman View Post
    It's a fix for something - although the actual fix should be more like:
    Code:
     retValue = extdesc1 == extdesc2 
    with relevant handling of any Null values.
    However, that something is only a small subset of cases - it's not going to have any effect on most of the cases in this thread.
    First you say perfection is impossible so give up.
    Now you say it's not perfect - so it's not worth bothering with.

  17. #105

    Title
    Junior Member
    Join Date
    Jul 2020
    Posts
    13
    Thanks
    2
    Thanked 3 Times in 3 Posts
    I thought the following might be of some interest.

    Having installed OpenPLi a few weeks ago, I decided to have a look at its version of Autotimer.

    The 'Edit Autotimer' screen has several extra options not present in OpenViX -
    1. Description - short equal extended for match (default 'no')
    2. Do not skip match when not description (default 'no')
    3. Percentage ratio for duplicate matches (from '50%' to '100%' in 10 percent steps, default '80%')

    The 3rd sounded like it might be of use, so I did a test on 'Two in Clover' with this set to '100%' (and, as normal, no timespan specified). The result was much as I hoped - all unique episodes found, no duplicates. Changing to '90%' naturally didn't work, returning just one hit.

    The 100% setting obviously wouldn't be of any help where, for example, one episode of a set of duplicates was prefixed 'NEW: ', or one had the 'sign language' indicator [SL]. Incidentally, this sort of thing was discussed at length in the 4-year-old topic referred to in my first post.

    Also, this version of Autotimer doesn't have the 'all descriptions' problem.



    I wonder if the following suggestions for the OpenViX version of 'Autotimer' might be worthy of consideration by the various experts here?
    Please feel free to ignore this if you think it's nonsense (which it most likely is).

    1. Add an extra option to the 'Edit Autotimer' screen 'Expert match' or similar, default 'no'.
    If set to 'yes', a set of extra options would appear, for example 'Ignore any leading "NEW: " ', 'Ignore any "[SL]"', 'Percentage match', and anything else deemed useful.

    2. Alternatively, change the matching algorithm to ignore spaces, punctuation (comma, period, brackets, dashes, colons, etc) and superfluous information such as 'NEW: ', '[S]', '[AD]', '[SL]' etc (assuming of course that the matching algorithm can actually be altered). This would hopefully reduce the possibility of mismatches a bit.
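
    To show roughly what suggestion 2 might look like, here is a small sketch that strips the leading 'NEW: ', the '[S]', '[AD]' and '[SL]' flags, spaces and punctuation before comparing, with the match percentage as a parameter. The function names and the list of flags are assumptions for illustration only; this is not code from either the OpenViX or the OpenPLi plugin.

```python
import re
from difflib import SequenceMatcher

LEADING_NEW = re.compile(r"^\s*NEW:\s*", re.IGNORECASE)
FLAGS = re.compile(r"\[(?:S|AD|SL)\]")
NON_WORD = re.compile(r"[\W_]+")  # spaces and punctuation


def normalise(text):
    """Drop 'NEW: ', the bracketed flags, spaces and punctuation; lower-case."""
    text = LEADING_NEW.sub("", text or "")
    text = FLAGS.sub(" ", text)
    return NON_WORD.sub("", text.lower())


def descriptions_match(desc1, desc2, min_ratio=0.8):
    """Compare normalised descriptions against a configurable ratio."""
    ratio = SequenceMatcher(None, normalise(desc1), normalise(desc2)).ratio()
    return ratio >= min_ratio


# Made-up descriptions: same episode, different decoration.
print(descriptions_match("NEW: The barn, restored. [SL]",
                         "The barn restored [AD]", min_ratio=1.0))
```

    After normalising, a 100% setting would still treat the 'NEW: ' and '[SL]' variants as the same episode, while descriptions that differ only in the episode number would still be kept apart at 100%.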
