New Python script to scrape KingOfSat for populating CCcam.channelinfo & oscam.srv



slain
28-09-12, 00:15
Just thought I'd announce the development of a small tool I've been working on before I release it next week.

For the past few days, on and off, I've been knocking up a small Python script to scrape KingOfSat for all the data required for a CCcam.channelinfo file, with the aim of using it to automatically compile full and complete files on demand. You can easily tailor which packages are scraped with a simple configuration file, setting your preferred CAIDs and provider IDs as you please, or simply use the full config file provided to generate an all-singing, all-dancing CCcam.channelinfo. At the moment it scrapes without fault, but it needs some tidying and an option to build a new config file based on the packages currently listed on KoS. CAIDs and provider IDs aren't carried on KoS, so they need to be filled in manually once (unless you use the config I'll be providing with it).

This will initially be released as a standard Python script, and depends on BeautifulSoup and Mechanize. Soon after, I'll be compiling it into both a PyInstaller package and a standalone Windows executable to ease its use. The intention isn't really that it be used on-receiver, as that would generate a fair bit of additional load for KoS and would also be impolite; it's more so that people can automate the creation of their own CCcam.channelinfo files for distribution/cron wget.

If there's any functionality any of you would like to see from such a tool, here would be the place to put forward your ideas. :)

Cheers.

Rob van der Does
28-09-12, 07:40
I'm looking forward to your tool. I have been struggling for quite some time to try to keep the files up-to-date.

Still, I think a plugin with auto-refresh functionality wouldn't be too bad?

slain
28-09-12, 08:26
It would be nice to have an updater as a plugin, I agree, but I think the more polite way to do it would be to generate the channelinfo file for the available packages once per day on a server, and have the plugin pull it from there. I'm not sure KoS would appreciate thousands of additional bot hits per day. ;)

I'll look at making the plugin as flexible as possible anyway, and leave how it's leveraged to the user.
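
To illustrate the once-per-day idea, here is a minimal sketch of a freshness check a server-side job could run before scraping again. It is an illustration only, not part of the script: the function name `needs_refresh` and the 24-hour threshold are assumptions.

```python
import os
import time

ONE_DAY = 24 * 60 * 60  # seconds; assumed refresh interval


def needs_refresh(path, max_age=ONE_DAY, now=None):
    """Return True if the cached file is missing or older than max_age seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # no cached copy yet, so one has to be generated
    return (now - mtime) > max_age


if __name__ == "__main__":
    if needs_refresh("CCcam.channelinfo"):
        print("Cached file is stale or missing: regenerate it")
    else:
        print("Cached file is fresh: serve it without touching KoS")
```

Clients would then fetch the generated file from that server, so KoS sees at most one scrape per day no matter how many receivers update.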

slain
01-10-12, 21:10
I'd planned to have something released over the weekend, but a failed hard-drive, restoring backups and frantic attempts at some data recovery put an end to that. Fortunately I didn't lose my work on this so far! I'll have something for you all to play with soon.

slain
03-10-12, 22:07
Ok, so here is the first release. You'll need Python 2.7, plus the BeautifulSoup and Mechanize libraries. Looking at the supplied config file you should be able to get the gist of what you need to do with it. :) I should have another release out in the next few days or so, with some new features and better error checking.

Huevos
04-10-12, 12:03
Slain, I know you are in a *NIX environment but I'm hoping you might be able to give me an idea where I am going wrong with my Windows setup. I've installed BeautifulSoup but Python can't find it. I've attached a command line printout of the install and the error.

slain
04-10-12, 13:24
I'm running the following versions of both modules:

BeautifulSoup - 3.2.0
Mechanize - 0.2.5

I see you're using BS4, which has been re-factored a great deal and isn't too compatible with BS3. Could you try removing BS4 and try BS3 instead?
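
For anyone hitting the same "module not found" error, a quick check along these lines may help. It is only a sketch: the helper name `first_importable` is made up for illustration. It relies on the fact that BeautifulSoup 3 installs as the module `BeautifulSoup` while BeautifulSoup 4 installs as `bs4`, and it prints the directories Python searches when neither is found.

```python
import importlib
import sys


def first_importable(names):
    """Return (name, module) for the first importable module, else (None, None)."""
    for name in names:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    return None, None


if __name__ == "__main__":
    # BS3 is imported as "BeautifulSoup", BS4 as "bs4".
    name, module = first_importable(["BeautifulSoup", "bs4"])
    if module is None:
        print("No BeautifulSoup found. Python searches these directories:")
        for entry in sys.path:
            print("  " + entry)
    else:
        print("Using %s from %s" % (name, getattr(module, "__file__", "?")))
```

If the library was installed somewhere that is not in the printed list, Python will not see it, whichever version it is.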

slain
04-10-12, 14:03
Just to add to my last post, would you guys feel that an "enc" field in the config would be beneficial, for adding the type of encryption to the output, i.e.:

090F:000000:1F9A "Viasat - TV 4 Sweden [NDS3]"

Currently this isn't in there but it'd be relatively simple to add it. I'd just need to add it based on whether the Encryption field is listed as "Clear" or not on KOS. It might not even be worth distinguishing between them, based on the fact that the clear channels wouldn't be hitting CCcam anyway. It'd only make the lines more accurate in a factual sense. :)
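
As a rough sketch of how such a line could be assembled (not the script's actual code; the function name `channelinfo_line` and its parameters are assumptions), the encryption tag is appended only when one is given and it is not "Clear":

```python
def channelinfo_line(caid, provid, sid, name, enc=None):
    """Build one line in the CAID:PROVID:SID "name [enc]" layout shown above."""
    label = name
    # Only tag the channel when an encryption type other than "Clear" is known.
    if enc and enc.strip().lower() != "clear":
        label = "%s [%s]" % (name, enc.strip())
    return '%s:%s:%s "%s"' % (caid.upper(), provid.upper(), sid.upper(), label)


if __name__ == "__main__":
    print(channelinfo_line("090F", "000000", "1F9A", "Viasat - TV 4 Sweden", "NDS3"))
```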

Huevos
04-10-12, 15:46
I'll swap to BS3 instead, but what directory should it be installed in as I think this is the problem.

slain
04-10-12, 16:44
I really don't have much of a clue when it comes to Windows mate, but for what it's worth it did look to be installing in the correct place to me. Anyone else able to help Huevos?

Just to add, I have a default kos_scrape.conf file here that I scraped; it just needs the correct CAIDs and provider IDs added for each package. The default config generation will be built into the next release of the script, which will likely be tonight.

Basically, once someone posts back a completed config with the correct CAIDs and provider IDs, we'll have a tool that can generate complete and up-to-date CCcam.channelinfo files instantly. :)

Huevos
04-10-12, 21:14
It installed into the same directory as setup.py. I don't think that can be correct. It finds mechanize ok and I installed that the same way. I'll have to have a look where it installed that to and move BS3 to the same parent directory. Anyway I'll have another go tomorrow when I'm feeling a bit more awake.

slain
05-10-12, 09:16
No worries. Sorry I can't be of any more help on the Windows side of things. If you get really stuck you could use a virtual machine within VirtualBox for Ubuntu 12.04, which should make things a lot easier.

Just to add to earlier posts, I'll be adding an update to support multiple CAIDs for a package some time this weekend. Basically it'll mean that every channel gets added x times, x being the number of CAIDs available for the package. The default config generation routine will be included in that release.
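
The "every channel gets added x times" idea can be sketched as a simple nested loop. This is an illustration under assumptions: the function name `lines_for_package`, the card values, the service IDs and the channel names are all made up for the example.

```python
def lines_for_package(cards, channels):
    """Yield one channelinfo line per (channel, card) pair."""
    for sid, name in channels:
        for caid, provid in cards:
            yield '%s:%s:%s "%s"' % (caid, provid, sid, name)


if __name__ == "__main__":
    cards = [("0963", "000000"), ("FFFF", "012345")]        # hypothetical cards
    channels = [("1001", "Example One"), ("1002", "Example Two")]  # hypothetical channels
    for line in lines_for_package(cards, channels):
        print(line)
```

With two cards and two channels this produces four lines, one per combination.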

slain
05-10-12, 17:23
Ok, I've got the multiple card functionality working now, and clear channels aren't processed. The new format for the config is as follows:


[slydigital]
name = Sly Digital
url = http://en.kingofsat.net/pack-slydigital.php
cards = 0963:000000

If Sly were to come out with a new card with a CAID of FFFF and provider of 012345, the config for Sly would look like:


[slydigital]
name = Sly Digital
url = http://en.kingofsat.net/pack-slydigital.php
cards = 0963:000000,FFFF:012345

So yeah, it's a bit more useful now. As it currently stands, the tool does everything you'd want it to, probably bar listing encryption in the channelinfo lines (no big deal so far, really).
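
For anyone wanting to read a config in this format from their own tooling, here is a minimal sketch using Python 3's `configparser` (the script itself targets Python 2.7, so its real parsing code may well differ). The helper names `parse_cards` and `load_packages` are assumptions.

```python
import configparser

SAMPLE = """\
[slydigital]
name = Sly Digital
url = http://en.kingofsat.net/pack-slydigital.php
cards = 0963:000000,FFFF:012345
"""


def parse_cards(value):
    """Turn '0963:000000,FFFF:012345' into [('0963', '000000'), ('FFFF', '012345')]."""
    cards = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        caid, provid = item.split(":")
        cards.append((caid.strip().upper(), provid.strip().upper()))
    return cards


def load_packages(text):
    """Parse the config text into {section: {'name', 'url', 'cards'}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    packages = {}
    for section in parser.sections():
        packages[section] = {
            "name": parser.get(section, "name"),
            "url": parser.get(section, "url"),
            "cards": parse_cards(parser.get(section, "cards")),
        }
    return packages


if __name__ == "__main__":
    print(load_packages(SAMPLE))
```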

slain
09-10-12, 21:36
Ok, I've finished the second version of the script now. New features include:

* Multiple CAID/providers supported
* Ability to create a default config file ready to edit as you please
* Better output to the terminal (tells you exactly what it's doing)
* Ability to disable a package in the config file, i.e. for unneeded packages
* FTA channels are no longer processed (pointless and incorrect)

That's all I can think of right now. Basically I think everything needed is there to create a pretty rockin' CCcam.channelinfo with very little effort. As I've said before, once someone creates a complete kos_scrape.conf file with the correct card data in it, it'll generate a full file in a minute or two. I started off a kos_scrape.conf in the attached tarball, but haven't had time to complete it so far.
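
Two of the features above, disabling a package and skipping FTA channels, could look roughly like this. It is a sketch only: the `enabled` option name is hypothetical (the real key used in kos_scrape.conf is not shown in this thread), as are the function names and the `encryption` dictionary key. The "Clear" value comes from the Encryption field on KoS mentioned earlier.

```python
import configparser

SAMPLE = """\
[slydigital]
name = Sly Digital
enabled = yes

[otherpack]
name = Other Pack
enabled = no
"""


def enabled_packages(text):
    """Return section names whose hypothetical 'enabled' option is not switched off."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [s for s in parser.sections()
            if parser.getboolean(s, "enabled", fallback=True)]


def encrypted_only(channels):
    """Drop free-to-air channels, i.e. rows whose encryption is listed as 'Clear'."""
    return [c for c in channels if c["encryption"].strip().lower() != "clear"]


if __name__ == "__main__":
    print(enabled_packages(SAMPLE))
    print(encrypted_only([
        {"name": "Example FTA", "encryption": "Clear"},
        {"name": "Example Pay", "encryption": "NDS3"},
    ]))
```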

Please do try your best to give me as much feedback as possible. I'll help wherever possible with folk using Linux, and hopefully someone will come along who can assist with the Windows side of things!

Cheers,

slain

BadBoy
10-10-12, 08:02
Is it possible to have channel numbers?

slain
10-10-12, 08:10
That information isn't carried by KingOfSat, unfortunately; and besides, it wouldn't be of much use, to be honest. All CCcam.channelinfo is really used for is the telnet and HTTP interfaces to CCcam, where channel numbers aren't relevant.

slain
22-10-12, 14:02
New release due soon, with oscam.srvid output. :) In testing now.

Ev0
22-10-12, 15:08
Was just about to post to say it would be good if you could add the option to port to oscam for those users.

Huevos
02-11-12, 14:20
I did get it sorted in the end. Just a silly Windows path problem. Looking forward to the update.

slain
02-11-12, 16:02
Been busy with work and sorting out *yet another* hard drive failure. Word to the wise: avoid Seagate.

With any luck I'll get a minute to upload the update this evening.

Huevos
02-11-12, 18:16
Any new CAIDs in the config file? Am I right in thinking if the package is marked with 0000 as the CAID the package will not be scraped?

slain
23-11-12, 14:34
Unfortunately, I've not had the chance to add many new CAIDs to the supplied config file. :( Hell, I've not even had much chance to upload the latest version yet. I'll at least get off my arse and sort out uploading V3 of the software in the next few minutes!

slain
23-11-12, 14:38
As promised, the latest version is attached below, now with oscam.srvid support. Could a mod please update the main thread to include OScam support in the subject? Thanks. :)

[Attachment 21127]
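
For comparison with the CCcam.channelinfo lines earlier in the thread, here is a rough sketch of an oscam.srvid entry. It assumes the commonly documented layout `CAID[,CAID]:SID|provider|name|type|description`; check this against the OScam documentation for your build before relying on it. The function name `srvid_line` and the sample values are hypothetical, and this is not taken from the attached script.

```python
def srvid_line(caids, sid, provider, name, kind="", description=""):
    """Build one entry in the assumed CAID[,CAID]:SID|provider|name|type|description layout."""
    key = "%s:%s" % (",".join(c.upper() for c in caids), sid.upper())
    return "|".join([key, provider, name, kind, description])


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(srvid_line(["0963", "FFFF"], "1001", "Sly Digital", "Example One"))
```

Unlike CCcam.channelinfo, this layout lists all CAIDs for a service on a single line and carries no provider ID in the key.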

Larry-G
23-11-12, 15:44
Will do that for you now, buddy.

tahirm
08-02-13, 17:28
Good :cool:

MarsArtis
08-02-13, 22:17
What would be the advantage of using CCcam.channelinfo? I have never used it.

Stanman
10-02-13, 18:53
For watching TV no benefit whatsoever.

MarsArtis
10-02-13, 19:57
And so what would be the advantage of using it?

slain
10-02-13, 20:56
You find out what your peers are requesting. More useful in OScam to be honest, as you can see what should be cached.

Stanman
10-02-13, 21:36
Best to leave this topic here ;)

MarsArtis
10-02-13, 22:09
OK :thumbsup: