Python Grabber scripts |
Oct 29 2009, 03:13
Post
#26
|
|
|
Group: Members Posts: 80 Joined: 4-October 04 Member No.: 17477 |
It looks easy to find the lyrics response, but the lyrics are presented in Flash, so impossible with python grabber. Sorry I'm from Canada, but I can read/understand Japanese.
It's too bad about the above site. Actually, I found an alternative site that uses the same back-end for lyrics but doesn't seem to use Flash. Here it is: http://music.goo.ne.jp/lyric/index.html
Halfway down the page you'll see a search box labelled: 検索 - 歌詞情報
Same concept as the other site: if you select 曲名 you can search by song title (アーティスト名 is for searching by artist name).
The button you need to click to search is labelled: 歌詞検索
Hope you can do something with this one! Thanks again |
|
|
|
Oct 29 2009, 03:19
Post
#27
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
I'll try this one; it should be OK (although I don't know whether the characters will mess something up, but we'll see).
Do you know how many lyrics they have? This post has been edited by 2E7AH: Oct 29 2009, 03:20 |
|
|
|
Oct 29 2009, 03:36
Post
#28
|
|
|
Group: Members Posts: 80 Joined: 4-October 04 Member No.: 17477 |
QUOTE I'll try this one, it should be OK (although don't know if characters will mess something but will see) Do you know how many lyrics they have?
Thanks! Well, according to uta-net, they have 84,000 songs in the database. I have tons of obscure Japanese songs, for example stuff from the 1960s, and I was able to find every single one in the database. Thanks again! |
|
|
|
Oct 31 2009, 06:12
Post
#29
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
Benji99, here is the script:
[attachment=5458:goo.rar]
I've tested it by tagging some files whose title/artist are present on the site, and it worked. If you find any problems, post them here.
What I've learned:
- Japanese glyphs have many encodings
- some sites don't like Python
- even more about Unicode |
|
|
|
Nov 10 2009, 09:31
Post
#30
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
Once again, AMG scripts.
Now GENRE, STYLE, MOOD and THEME can be assigned at once with: [attachment=5477:AMG_Release.rar]
and a new AMG review script with custom user-agent report, looser artist matching and an option to print some info to the console: [attachment=5478:AMG_Review.rar]
Here is an example for AMG_Release:
1. Select a custom tag in the python grabber settings.
2. Run the script and update the files.
3. Select Properties > Tools > Automatically fill values, with source: Other and your custom tag; pattern: Genres: %genre% \\ Styles: %style% \\ Moods: %mood% \\ Themes: %theme%
4. Then remove the AMG tag, Ctrl-click to select the newly added tags (GENRE, STYLE, MOOD and THEME), select "Split values", then OK.
If the files already have GENRE and STYLE tags that we don't want to update, then we enter a pattern such as: %tmp% \\ Moods: %mood% \\ Themes: %theme%
so that GENRE and STYLE remain untouched.
As a reminder, all AMG scripts rely mostly on a correct release (%album%) name.
And do comment about problems; I'm rewriting these scripts as I run into inconsistencies.
This post has been edited by 2E7AH: Nov 10 2009, 09:32 |
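To make steps 3 and 4 more concrete, here is a rough Python sketch of what the pattern and the "Split values" action do together. It assumes the combined tag written by the script has the same shape as the pattern (label, colon, values separated by semicolons, sections separated by two backslashes); the function name split_amg_tag and the sample genre, style and theme values are made up for illustration, and foobar2000 itself does this through the Properties dialog, not through Python.

```python
def split_amg_tag(value):
    """Split a combined tag such as 'Genres: A \\\\ Styles: B; C' into lists.

    Assumption: sections are separated by two backslashes and values
    inside a section by semicolons, as in the pattern from the post.
    """
    fields = {}
    for part in value.split("\\\\"):          # "\\\\" is two literal backslashes
        label, _, items = part.partition(":")  # text before/after the first colon
        key = label.strip().lower()
        if not key:
            continue
        fields[key] = [item.strip() for item in items.split(";") if item.strip()]
    return fields


# Hypothetical sample value; only the mood names come from this thread.
sample = (r"Genres: Rock \\ Styles: Hard Rock; Blues-Rock "
          r"\\ Moods: Uncompromising; Fiery \\ Themes: Road Trip")
fields = split_amg_tag(sample)
print(fields["styles"])
print(fields["moods"])
```

Because the label is stripped before use, the same sketch also accepts a pattern written without a space after the colon.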
|
|
|
Nov 10 2009, 11:08
Post
#31
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
Download this AMG release script:
[attachment=5480:AMG_Release.rar]
The problem with the previous one is described here: http://www.hydrogenaudio.org/forums/index....st&p=666954
Now use this pattern: Genres:%genre% \\ Styles:%style% \\ Moods:%mood% \\ Themes:%theme%
This post has been edited by 2E7AH: Nov 10 2009, 11:27 |
|
|
|
Nov 10 2009, 15:17
Post
#32
|
|
![]() Group: Members Posts: 162 Joined: 5-November 04 From: W Hartford, CT - USA Member No.: 17991 |
Thank you! This is great!
QUOTE Download this AMG release script: [attachment=5480:AMG_Release.rar] Problem with previous here: http://www.hydrogenaudio.org/forums/index....st&p=666954 Now use this pattern: Genres:%genre% \\ Styles:%style% \\ Moods:%mood% \\ Themes:%theme%
--------------------
GO WHALE!!!
[url="http://www.toddberman.com"]My Website[/url] |
|
|
|
Nov 10 2009, 16:22
Post
#33
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
Enjoy
I haven't forgotten about the composer/performer conversation; I'll post that soon.
Here is a masstagger script for cleaning the %amg% tag (Canar's version). Just run it after the script (if %genre% and %style% should be preserved, delete the first two actions from the masstagger script):
[attachment=5484:AMG_release_MTS.rar]
This post has been edited by 2E7AH: Nov 10 2009, 16:37 |
|
|
|
Nov 10 2009, 18:10
Post
#34
|
|
![]() Group: Members Posts: 162 Joined: 5-November 04 From: W Hartford, CT - USA Member No.: 17991 |
QUOTE Enjoy I didn't forgot about composer/performer conversation, I'll post that soon
Thank you very much... I will wait on tagging any Various Artists albums until that one comes.
QUOTE Here is masstagger script for cleaning the %amg% tag (Canar's version): just run it after the script (if %genre% and %style% should be preserved delete first two action from masstagger script): [attachment=5484:AMG_release_MTS.rar]
I was just in the process of trying to figure out how to use Masstagger to do this... the timing on this script is perfect!
By the way, I have been using the Python scripts on a few albums this morning and they are working great! |
|
|
|
Nov 11 2009, 07:29
Post
#35
|
|
|
Group: Members Posts: 80 Joined: 4-October 04 Member No.: 17477 |
QUOTE Benji99, here is the script: [attachment=5458:goo.rar] I've tested it by tagging some files with title/artist present on site, and it worked. If you find any problems post What I've learned? - Japanese glyphs have many encodings - some sites don't like python - even more about Unicode
Huge thanks for this script!! It works really well, except for a couple of small bugs, if you have some free time...
1st bug: Certain track titles make the script crash.
CODE
foo_grabber_python: Traceback (most recent call last):
  File "I:\Program Files\foobar2000\pygrabber\scripts\goo.py", line 63, in Query
    raw_title = handle.Format('[%title%]').decode("utf8").encode("euc_jp")
UnicodeEncodeError: 'euc_jp' codec can't encode character u'\uff5e' in position 13: illegal multibyte sequence
This seemingly happens when a track has the '~' character in the title. A couple of examples:
Track title: HIGH G.K LOW ~ハジケロ~ Artist: GreeeeN
Track title: 手紙 ~君たちへ~ Artist: GreeeeN
Although this one works:
Track title: 島唄~ウチナーグチ・ヴァージョン~ Artist: THE BOOM
2nd bug: The script seems to have trouble finding tracks when a large number of tracks share the same name. For example:
Track title: YOU Artist: サザンオールスターズ
Track title: 海 Artist: サザンオールスターズ
I think I know how this 2nd bug can be fixed. I found out that the site has a more advanced search function: http://music.goo.ne.jp/lyric/db.php
There you can enter both the artist (アーティスト名) and the track title (曲名). If you can modify the script to use that page instead, it would make it really accurate!
Huge thanks again!
Sebastien |
|
|
|
Nov 11 2009, 08:05
Post
#36
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
QUOTE 1st bug: Certain track titles make the script crash.
CODE
foo_grabber_python: Traceback (most recent call last):
  File "I:\Program Files\foobar2000\pygrabber\scripts\goo.py", line 63, in Query
    raw_title = handle.Format('[%title%]').decode("utf8").encode("euc_jp")
UnicodeEncodeError: 'euc_jp' codec can't encode character u'\uff5e' in position 13: illegal multibyte sequence
This seemingly happens when a track has the '~' character in the title
Is that happening only with that character? It can be easily fixed if so. That character is the fullwidth tilde "~" (u'\uff5e'), not the ordinary tilde "~".
QUOTE 2nd bug, The script seems to have trouble finding tracks where there's a large amount of tracks with the same name
Yeah, I would expect that, because the script only tries to find a match on the first results page, and there can be more pages for some common titles. I'll check your suggestion and try to make the script better |
|
|
|
Nov 21 2009, 03:09
Post
#37
|
|
|
Group: Developer Posts: 486 Joined: 8-June 07 From: Chengdu Member No.: 44175 |
@2E7AH:
I think replacing u'\uff5e' is a workaround:
CODE
s = handle.Format('[%title%]').decode("utf8")
# swap the fullwidth tilde (U+FF5E) for the wave dash (U+301C) before encoding
raw_title = s.replace(u'\uff5e', u'\u301c').encode("euc_jp")
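The workaround can be tried outside foobar2000. The sketch below is written for Python 3 (the grabber scripts in this thread are Python 2), and the helper name to_euc_jp is made up for illustration. It shows that encoding a title containing U+FF5E directly raises UnicodeEncodeError, while swapping in U+301C first lets the title round-trip through EUC-JP.

```python
FULLWIDTH_TILDE = "\uff5e"  # the character named in the traceback
WAVE_DASH = "\u301c"        # the replacement used in the workaround


def to_euc_jp(title):
    """Encode a title as EUC-JP, replacing U+FF5E with U+301C first."""
    return title.replace(FULLWIDTH_TILDE, WAVE_DASH).encode("euc_jp")


title = "手紙 \uff5e君たちへ\uff5e"  # one of the failing titles from the bug report

try:
    title.encode("euc_jp")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

encoded = to_euc_jp(title)
print("direct encode worked:", direct_ok)
print("round trip:", encoded.decode("euc_jp"))
```

Note that the replacement changes the character sent to the site, so whether the search still matches depends on how the site stores its titles.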
|
|
|
Nov 21 2009, 04:31
Post
#38
|
|
|
Group: Members Posts: 80 Joined: 4-October 04 Member No.: 17477 |
QUOTE Is that happening only with that character? It can be easily fixed if so. That character is fullwidth tilde "~" not ordinar tilde "~".
Oops, forgot to respond to this: whenever it crashes, that character is always in the track title. Thanks
Btw, as far as making a more complete AMG script goes: I wrote The Godfather scripts for this years ago, and there are a few inconsistencies with the site. For example, the way it displays the performer and composer changes sometimes; in particular, it handles Various Artists albums and albums where a few tracks are collaborations with a 2nd performer differently.
If you can read Delphi and are interested in the logic I used to code around it, drop me a PM with your email and I'll send them to you. I've been wanting to update it in Python, but I found Python really hard to read/understand... |
|
|
|
Jan 8 2010, 15:54
Post
#39
|
|
|
Group: Members Posts: 252 Joined: 25-September 08 Member No.: 58627 |
2E7AH, i'm trying to use your AMG script, but have trouble with the "split values" step. the values don't seem to be splitting. when i set up a Filter to show %mood%, for example, the entries are not separate, and i get things like "Uncompromising; Fiery; Literate; Cerebral; Brooding" all on one line.
what am i doing wrong? |
|
|
|
Jan 8 2010, 16:46
Post
#40
|
|
![]() Group: Members Posts: 162 Joined: 5-November 04 From: W Hartford, CT - USA Member No.: 17991 |
QUOTE 2E7AH, i'm trying to use your AMG script, but have trouble with the "split values" step. the values don't seem to be splitting. when i set up a Filter to show %mood%, for example, the entries are not separate, and i get things like "Uncompromising; Fiery; Literate; Cerebral; Brooding" all on one line. what am i doing wrong?
I think you need to make sure that "MOOD" is listed as a multivalue field in Preferences/Advanced/Display/Properties Dialog |
|
|
|
Mar 15 2010, 12:29
Post
#41
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
I have a quick question regarding the python discogs genre/style grabber scripts.
How do you know you've gone over the 5000 limit? Does the lookup just fail? edit : And the AMG script gives me reviews in the AMG tags Example : QUOTE New horizons in historic jazz reissuing were revealed in 2005 when Jazz Oracle came out with a double-CD compendium of recordings made for about a dozen different labels between October 1924 and February 1933 in Vienna, Paris, and Berlin, all involving bandleader Lud Gluskin (1898-1989). Andreas Schmauder, apparently one of the world's leading Gluskin authorities, was asked to paw through literally hundreds of 78 rpm platters to designate the 48 titles included in this package, which is loaded with precious photographs and fascinating information. Gluskin first appears as a drummer with Paul Gason and His Versatile Orchestra. "Ain't She Sweet?" is performed by the Playboys, a Detroit-based band that would soon morph into an expanded and more versatile orchestra under Gluskin's direction. Subsequent billings list the perpetually evolving group as Lud Gluskin and His Versatile Juniors, Lud Gluskin et Son Jazz Orchestre Lud Gluskin, "Lud" Gluskin Ambassadonians, Lud Gluskin and his Ambassadors Orchestra, Jazz-Orchester Lud Gluskin, and finally Lud Gluskin et son Orchestre, which is the name they appeared under most often when serenading patrons at the Casino de Paris. The sound of the band often brings to mind great old-time jazz heroes like the Original Memphis Five, Red Nichols, Miff Mole, Bix Beiderbecke, Frankie Trumbauer, Frank Teschmacher and Jean Goldkette, whose arrangements were in fact used by Gluskin from time to time. Material ranges from hot novelty dance music and traditional pop tunes to substantial jazz numbers like "Tiger Rag," "Milenberg Joys," "Clarinet Marmalade," W.C. Handy's "St. Louis Blues," Fats Waller's "Whiteman Stomp," and Fud Livingston's "Feelin' No Pain." Jazz Oracle continues to astonish and delight all who are fascinated with obscure jazz records from the early 20th century. 
This installment is particularly rewarding. AMG Review by arwulf arwulf
This was in one of my files' AMG tag!
edit again : You had another script for looking up tags from Last.FM. Using this would cause inconsistency with the discogs way of tagging, which allows a stricter set of styles and genres. Would it be possible somehow to tag using the last.fm method only if the style/genre appears in the discogs list of styles and genres, for example if such a list was saved as a .txt file?
This post has been edited by tore: Mar 15 2010, 12:56 |
|
|
|
Mar 15 2010, 13:15
Post
#42
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
QUOTE Would it be possible somehow to tag using the last.fm method only if the style/genre appears in the discogs list of styles and genre, for example if such a list was saved as a .txt file?
A script like that exists for the Picard tagger, and it seems reasonable because last.fm tags are a mess. I don't use the last.fm script and I'm not interested in making it happen, but the script is there, so maybe you can extend it a bit |
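For anyone who wants to extend the last.fm script along these lines, the filtering step itself is small. The following is only a sketch in Python 3: the function names, the whitelist entries and the sample last.fm tags are all invented for illustration, and it does not use the real Discogs list.

```python
def load_allowed(lines):
    """Map lower-cased whitelist entries to their preferred spelling."""
    return {line.strip().lower(): line.strip() for line in lines if line.strip()}


def filter_tags(candidates, allowed):
    """Keep only whitelisted tags, in order, without duplicates."""
    kept = []
    for tag in candidates:
        canonical = allowed.get(tag.strip().lower())
        if canonical and canonical not in kept:
            kept.append(canonical)
    return kept


# Made-up whitelist and made-up last.fm tags, for illustration only.
allowed = load_allowed(["Electronic", "Rock", "Psychedelic Rock", "Ambient"])
lastfm_tags = ["psychedelic rock", "seen live", "rock", "favourites", "Rock"]
print("; ".join(filter_tags(lastfm_tags, allowed)))
```

With a list saved as a text file, one style per line, the whitelist could be built by passing the open file to load_allowed, since iterating a file yields its lines.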
|
|
|
Mar 15 2010, 15:05
Post
#43
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
By the way, I did find this little piece of information posted on the discogs forums about a month ago:
QUOTE (teo) It's in our plans to remove the 5000 per day limit. That should happen within the next 2 months.
Source! That should greatly enhance the usefulness of your discogs scripts |
|
|
|
Mar 15 2010, 20:11
Post
#44
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
I've now tried playing around with your last.fm script to see if I can tailor it to fetch other sorts of last.fm info. However, my lack of knowledge of Python is a hindrance! I've changed the script slightly so that it fetches similar artists.
Here's an example of the information the script can get: http://ws.audioscrobbler.com/2.0/?method=a...ac220b7b2e0a026
I've set a limit of 3 artists. However, I have a hard time getting all three artists - I can only fetch one, just like you only wanted the top tag. So my question is, what do I have to do with this piece of code so that I can fetch all three similar artists?
CODE
child = doc.getElementsByTagName("artist")[0]
toptag = child.getElementsByTagName("name")[0]
lyric = toptag.childNodes[0].data.encode('utf_8').capitalize()
I see that changing the value between the brackets in the "("artist")[0]" snippet changes the information retrieved to the next artist. However, I can't figure out how to get them all in one go! |
|
|
|
Mar 15 2010, 20:28
Post
#45
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
look at the other last.fm script, i.e.
CODE
toptags = child.getElementsByTagName("tag")
tags=[]
for i in toptags:
    tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( ))
lyric=str(tags).strip('[]').replace(',', ';').replace('\'','')
try something like that |
|
|
|
Mar 15 2010, 20:38
Post
#46
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
QUOTE look at the other last.fm script, i.e. CODE toptags = child.getElementsByTagName("tag") tags=[] for i in toptags: tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( )) lyric=str(tags).strip('[]').replace(',', ';').replace('\'','') try something like that
Thanks for the suggestion
CODE
            artist = handle.Format("[%artist%]")
            title = handle.Format("3")
            try:
                string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
                doc = minidom.parseString(string)
                toptags = child.getElementsByTagName("tag")
                tags=[]
                for i in toptags:
                    tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( ))
                lyric=str(tags).strip('[]').replace(',', ';').replace('\'','')
                result.append(lyric)
            except Exception, e:
                traceback.print_exc(file=sys.stdout)
                result.append('')
                continue
        return result
if __name__ == "__main__":
    LyricProviderInstance = LastFm_TopTag()
|
|
|
Mar 15 2010, 20:48
Post
#47
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
I just pasted that part from the other last.fm script; it wasn't intended for literal use, just as an example.
Look at your XML response: there isn't a "tag" node anywhere, so use what you need, and play with it a little.
[edit] to help you a bit:
- where is the "child" variable that "toptags" is built from, and why search it for a "tag" node (which, as said, doesn't exist in the XML response)?
- why isn't "doc" used?
- give those variables names that are meaningful to you
- you can set title explicitly; there's no need for "handle.Format" if you don't need a tag's value from the file
This post has been edited by 2E7AH: Mar 15 2010, 21:08 |
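The hints above amount to looping over the nodes that are really in the response. As a minimal sketch in Python 3, assuming the response contains one "artist" element per similar artist with a "name" child (which is what the snippets in this thread rely on), the names can be collected like this. SAMPLE_RESPONSE is a hand-written, simplified stand-in and not a real last.fm reply, and similar_artist_names is a made-up helper name.

```python
from xml.dom import minidom

# Simplified, hand-written stand-in for the web service response.
SAMPLE_RESPONSE = """<lfm status="ok">
  <similarartists artist="Example">
    <artist><name>Hidria Spacefolk</name></artist>
    <artist><name>Gong</name></artist>
    <artist><name>Kingston Wall</name></artist>
  </similarartists>
</lfm>"""


def similar_artist_names(xml_text):
    """Return the text of the first <name> inside every <artist> element."""
    doc = minidom.parseString(xml_text)
    names = []
    for artist in doc.getElementsByTagName("artist"):
        name_nodes = artist.getElementsByTagName("name")
        # Skip artists with a missing or empty <name> instead of failing.
        if name_nodes and name_nodes[0].firstChild is not None:
            names.append(name_nodes[0].firstChild.data.strip())
    return names


print(similar_artist_names(SAMPLE_RESPONSE))
```

Looping this way avoids hard-coding indexes, so it also copes with responses that contain fewer artists than the requested limit.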
|
|
|
Mar 15 2010, 21:38
Post
#48
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
QUOTE I just pasted that part from that other last.fm script, it wasn't intended for literal use, but as example Look at your XML response: there isn't "tag" node anywhere, so use what you need, and play a little [edit] to help you a bit: where is your "child" variable that you are calling with "toptags" and pasting to look for "tag" node (which doesn't exist BTW in XML response as said)? why "doc" isn't called? name those variables meaningful to you you can explicitly set title, no need for "handle.Format" if you don't need info for some tag to be provided
Actually, after your last post, I got somewhat confused and thought I'd try and break it down to something I could understand. I've never looked at Python code before today, so be aware that much of the code - especially what you pasted - is completely foreign to me. This is something rough I managed to do on my own:
CODE
string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
doc = minidom.parseString(string)
child = doc.getElementsByTagName("artist")[0]
toptag1 = child.getElementsByTagName("name")[0]
child = doc.getElementsByTagName("artist")[1]
toptag2 = child.getElementsByTagName("name")[0]
child = doc.getElementsByTagName("artist")[2]
toptag3 = child.getElementsByTagName("name")[0]
lyric1 = toptag1.childNodes[0].data.encode('utf_8').capitalize()
lyric2 = toptag2.childNodes[0].data.encode('utf_8').capitalize()
lyric3 = toptag3.childNodes[0].data.encode('utf_8').capitalize()
result.append(lyric1)
This gives me 3 variables - lyric1, lyric2, lyric3. They are the names of the artists I'm trying to fetch. I can change the variable appended in the last line to select a different band/artist. However, if I try to append several, for example by doing this:
CODE
result.append(lyric1)
result.append(lyric2)
result.append(lyric3)
...it finds the tags, but applying them causes foobar to crash completely!
Sorry for abandoning your example. Your added hints seem helpful, so I will look it over again. However, if you want to help me out with what I need to do to get lyric1, 2 and 3 appended to the result variable, I'll be happy. |
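A possible explanation for the crash, offered as a guess and not a confirmed diagnosis: in the scripts posted in this thread, Query appends exactly one string to result for each handle (even the except branch appends an empty string), so three appends per handle would make result longer than the list of handles. Under that assumption, the names need to be combined into one string per track before appending. The sketch below is Python 3 and build_results is a made-up helper name.

```python
def build_results(names_per_track, separator="; "):
    """Return one joined string per track, so the output list has the
    same length as the input list (one entry per handle)."""
    results = []
    for names in names_per_track:
        # A track with no names still gets an entry (an empty string).
        results.append(separator.join(names))
    return results


tracks = [["Hidria Spacefolk", "Gong", "Kingston Wall"], []]
results = build_results(tracks)
print(results)
print(len(results) == len(tracks))
```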
|
|
|
Mar 15 2010, 21:51
Post
#49
|
|
|
Group: Validating Posts: 2424 Joined: 21-May 08 Member No.: 53675 |
hey, why ride motorcycle when I have my bike
leave it for tomorrow BTW %lastfm_similar_artist% is already provided by biography view component for nowplaying artist This post has been edited by 2E7AH: Mar 15 2010, 21:59 |
|
|
|
Mar 15 2010, 22:34
Post
#50
|
|
![]() Group: Members Posts: 41 Joined: 6-February 10 Member No.: 77932 |
QUOTE hey, why ride motorcycle when I have my bike leave it for tomorrow BTW %lastfm_similar_artist% is already provided by biography view component for nowplaying artist
Don't worry about it! I made something that works, although I'm sure the code will make any regular Python programmer wince. To summarize, this (horrible, but working) modification of 2E7AH's Last.FM genre-tagging script will fetch the 3 most similar artists from last.fm and write them to a tag, e.g. "Hidria Spacefolk; Gong; Kingston Wall".
CODE
import sys        # used by the except block below
import traceback  # used by the except block below
import urllib
from xml.dom import minidom
from encodings import utf_8
from grabber import LyricProviderBase
class LastFm_TopTag(LyricProviderBase):
    def GetName(self):
        return "LastFm Similarity"
    def GetVersion(self):
        return "0.1"
    def GetURL(self):
        return "http://ws.audioscrobbler.com/"
    def Query(self, handles, status, abort):
        result = []
        api_key = 'b25b959554ed76058ac220b7b2e0a026'
        for handle in handles:
            status.Advance()
            if abort.Aborting():
                return result
            artist = handle.Format("[%artist%]")
            title = handle.Format("3")
            try:
                string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
                doc = minidom.parseString(string)
                child = doc.getElementsByTagName("artist")[0]
                toptag1 = child.getElementsByTagName("name")[0]
                child = doc.getElementsByTagName("artist")[1]
                toptag2 = child.getElementsByTagName("name")[0]
                child = doc.getElementsByTagName("artist")[2]
                toptag3 = child.getElementsByTagName("name")[0]
                lyric = toptag1.childNodes[0].data.encode('utf_8').capitalize() + ('; ') + toptag2.childNodes[0].data.encode('utf_8').capitalize() + ('; ') + toptag3.childNodes[0].data.encode('utf_8').capitalize()
                result.append(lyric)
            except Exception, e:
                traceback.print_exc(file=sys.stdout)
                result.append('')
                continue
        return result
if __name__ == "__main__":
    LyricProviderInstance = LastFm_TopTag()
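One detail of the script that may be worth tightening is the request URL, which is built by replacing spaces with "+" only; an artist name containing a character such as "&" would then change the query string. A hedged sketch of an alternative, in Python 3 (in Python 2 the equivalent function is urllib.urlencode): the endpoint, method name and parameter names are the ones used in the script above, while similar_artists_url, the placeholder API key and the sample artist are made up for illustration.

```python
from urllib.parse import urlencode

API_ROOT = "http://ws.audioscrobbler.com/2.0/"


def similar_artists_url(artist, api_key, limit=3):
    """Build the artist.getsimilar request URL with escaped parameters."""
    query = urlencode({
        "method": "artist.getsimilar",
        "artist": artist,   # urlencode escapes spaces, '&', non-ASCII text
        "limit": limit,
        "api_key": api_key,
    })
    return API_ROOT + "?" + query


print(similar_artists_url("Simon & Garfunkel", "YOUR_API_KEY"))
```

Unlike the posted script, this sketch does not lower-case the artist name; whether that matters to the web service is not something this thread settles.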
|
|
|