foo_chacon, A thingie to convert metadata charset.
Yirkha
post Oct 5 2008, 16:08
Post #1





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



I needed to fix broken tags of a bunch of files yesterday, so I've made myself this component to do that efficiently and I thought perhaps someone else might find it useful as well, so here it is.

The offered functionality is essentially similar to what the "Override charset" option in foo_infobox did, though it's accessed directly from the context menu and works for any number of tracks at once.

It can be generally used to fix ID3v1 tags or cue sheets saved in a codepage different from that of your system. (And no, nobody wants to hear about those infelicitous files from shabby sources ;)

foo_chacon-0.0.2.zip (54 kB, v0.0.2, 2009/02/14)

This post has been edited by Yirkha: Feb 14 2009, 03:53


--------------------
Full-quoting makes you scroll past the same junk over and over.
Borisz
post Dec 10 2008, 20:59
Post #2





Group: Members
Posts: 371
Joined: 27-September 03
Member No.: 9041



I am surprised that there are no replies to this, maybe because up till now there was no reason to switch from foo_infobox.

Thanks for this plugin. It performs a seldom-needed function, but when it's needed, it is immeasurably helpful. It does exactly the thing I kept infobox in foobar for, but it does it so much better.

This post has been edited by Borisz: Dec 10 2008, 20:59


--------------------
http://evilboris.sonic-cult.net/346/
Sega Saturn, Shiro!
nevets1219
post Dec 11 2008, 11:01
Post #3





Group: Members
Posts: 18
Joined: 10-October 05
Member No.: 25022



Just wanted to say thanks for such a great utility. This alone definitely made the switch from v0.8.3 all that much easier :)

Might I ask you to explain more about the "convert to local codepage first" feature?

Also, might I suggest that a filter be created so that it's easier to select all Chinese codepages or all Japanese codepages.
Yirkha
post Dec 11 2008, 11:57
Post #4





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



Oh noes, people have found this and started asking questions...


The "Convert to local codepage first" feature is necessary because foobar2000 reads files with an unspecified character set as being in the local system codepage. The result is then converted to UTF-8, like everything else in fb2k. Because we want to re-read the tags in another charset, they usually first have to be converted back from UTF-8 to that local system codepage and then reparsed from whatever charset you have selected into UTF-8 again. The checkbox enables the first part of this process.

Note that this is also why this component is inherently unsafe - there is no guarantee that the conversion "CP_target read as CP_system => CP_UTF8 => CP_system" is fully equivalent. The proper way would be to read the tags from various file formats directly, not using the standard input modules. But it seems to work quite well so far, so let's hope this won't be needed.
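
The round trip described above, and the reason it is not always safe, can be sketched in Python. This is only an illustration under assumptions: Python's codecs stand in for the Windows conversion routines (which may treat unassigned bytes differently), the function names are made up for the example, and the setup is a tag written in CP1251 on a system whose codepage is CP1252.

```python
def read_as_system(raw: bytes, system_cp: str) -> str:
    # What the player sees: tag bytes decoded with the system codepage.
    # errors="replace" stands in for a converter that substitutes a
    # placeholder for bytes it cannot map.
    return raw.decode(system_cp, errors="replace")


def reparse(text: str, system_cp: str, target_cp: str) -> str:
    # Step 1 (the checkbox): convert the text back to system-codepage bytes.
    raw = text.encode(system_cp, errors="replace")
    # Step 2: decode those bytes with the codepage chosen by the user.
    return raw.decode(target_cp, errors="replace")


# Assumed setup: the tag was written in CP1251, the system codepage is CP1252.
original = "Привет"
garbled = read_as_system(original.encode("cp1251"), "cp1252")
repaired = reparse(garbled, "cp1252", "cp1251")
print(garbled, "->", repaired)

# The unsafe case: byte 0x81 is "Ѓ" in CP1251 but unassigned in Python's cp1252,
# so it is already lost by the time the text can be converted back.
lossy = reparse(read_as_system("Ѓ".encode("cp1251"), "cp1252"), "cp1252", "cp1251")
print(lossy == "Ѓ")
```

The first string survives the round trip because every byte it uses has a character in both codepages. The second does not, which is the kind of loss the note above warns about.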


Regarding the filter -
Yes, something like that could be added and it would need some additional configuration and/or custom-drawn groups. Though when I used Chacon, I usually chose one particular charset and processed many subsequent files with it easily, because the setting was remembered. When something different came, even mindlessly skimming through the whole list was not so much hassle. I tend to leave it as simple and stupid as it is, thank you.


nevets1219
post Dec 11 2008, 22:22
Post #5





Group: Members
Posts: 18
Joined: 10-October 05
Member No.: 25022



Thanks for the info on the "convert to local codepage first" feature.

Regarding the filter feature, yea it's a bit more situational and probably wouldn't save all that much effort.
2E7AH
post Jan 23 2009, 22:34
Post #6





Group: Members
Posts: 1811
Joined: 21-May 08
Member No.: 53675



would it be possible to extend the component with custom character remapping?

then we could easily transform latin to cyrillic or something similar
Yirkha
post Feb 2 2009, 10:22
Post #7





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



That would be possible. This component currently just uses Windows routines to convert between different encodings, but adding a custom converter wouldn't be hard and it would provide additional flexibility.

However I'm thinking about the way to store such remapping tables. To stay within the scope of "character set remapping", it would need to allow mapping arbitrary binary sequences to Unicode codepoints. Because it's not possible to use two different charsets in one user-editable text file, the data must be formatted for instance in hex - and I'd use the same format as iconv for great compatibility. For example, mapping A/B/C to a/b/c:
CODE
0x41 0x61
0x42 0x62
0x43 0x63
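
A table in the format shown above could be loaded along these lines. This is a rough Python sketch, not the component's code: the function names are hypothetical, each line is assumed to hold one source byte followed by one Unicode codepoint (both in hex), `#` comments are an added assumption, and bytes missing from the table are passed through as the codepoint with the same value.

```python
def parse_table(text: str) -> dict[int, int]:
    """Parse lines of the form '0x41 0x61' into a byte -> codepoint map."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        src, dst = line.split()[:2]
        table[int(src, 16)] = int(dst, 16)
    return table


def remap(raw: bytes, table: dict[int, int]) -> str:
    """Map each byte through the table; unmapped bytes keep their value."""
    return "".join(chr(table.get(b, b)) for b in raw)


table = parse_table("0x41 0x61\n0x42 0x62\n0x43 0x63\n")
print(remap(b"ABCD", table))
```

Multi-byte source sequences, which the post also mentions, would need a longest-match lookup rather than this per-byte one.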

But when you speak about transliteration, I'm not sure if that format would be as suitable for it. Some kind of list of replacements, both already in Unicode, seems much better for such usage to me. And then I'm not sure if it has much to do with character set remapping...
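
For the transliteration case, a list of Unicode-to-Unicode replacements might look like this in Python. The mapping is a made-up three-entry example, not a real transliteration scheme, and the function name is hypothetical.

```python
def transliterate(text: str, replacements: dict[str, str]) -> str:
    # Try longer keys first so that multi-character entries such as "sh"
    # win over their single-character prefixes. Keys must be non-empty.
    keys = sorted(replacements, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(replacements[key])
                i += len(key)
                break
        else:
            # No entry matches here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


latin_to_cyrillic = {"sh": "ш", "a": "а", "s": "с"}
print(transliterate("sasha", latin_to_cyrillic))
```

Both sides of each entry are ordinary Unicode strings, so no codepage is involved at all, which is why this is a different problem from character set remapping.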


2E7AH
post Feb 2 2009, 18:41
Post #8





Group: Members
Posts: 1811
Joined: 21-May 08
Member No.: 53675



ok, probably using $replace() is the easy way

i was thinking about simple remappings, in the same code page, and you think more globally
i wouldn't have anything to suggest because the subject is beyond me
Yirkha
post Feb 14 2009, 03:59
Post #9





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



v0.0.2 is up, features one simple addition: it is possible to copy text from selected fields in the preview pane using context menu or keyboard shortcut Ctrl+C. Helps when you don't have a clue how the tags should really look - you can for instance paste them to Google and see if it yields plausible results.


deviantus
post Feb 17 2009, 13:01
Post #10





Group: Members
Posts: 1
Joined: 24-November 08
Member No.: 63084



What about UTF-16 support?
2E7AH
post Feb 17 2009, 13:40
Post #11





Group: Members
Posts: 1811
Joined: 21-May 08
Member No.: 53675



QUOTE (deviantus @ Feb 17 2009, 13:01)
What about UTF-16 support?

Preferences → Advanced → Tagging → MP3 → ID3v2 writer compatibility mode

[edit] that is for writing, foobar has no problem with reading UTF-16

This post has been edited by 2E7AH: Feb 17 2009, 13:57
Yirkha
post Feb 17 2009, 15:55
Post #12





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



If your tags are stored in UTF-16, but read as UTF-8 or other charset, this component can't help you. It doesn't access the tags directly and such texts would get truncated or otherwise mangled before they even get there. (And that's an inherent limitation of how it works, not limited to UTF-16.)
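
A small Python sketch of why such text is already gone before the component sees it. It assumes the reader treats the tag as a NUL-terminated single-byte string, which is just one plausible way for the truncation to happen; the actual input code in fb2k may behave differently. The function name is hypothetical.

```python
def read_as_c_string(raw: bytes, codepage: str) -> str:
    # Stop at the first NUL byte, then decode what is left with the codepage.
    return raw.split(b"\x00", 1)[0].decode(codepage, errors="replace")


stored = "Abc".encode("utf-16-le")  # every second byte is 0x00 for ASCII text
seen = read_as_c_string(stored, "cp1252")
print(repr(seen))
```

Only the first character survives here, so no amount of re-decoding the resulting text can bring the rest back.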


neothe0ne
post Apr 26 2009, 22:40
Post #13





Group: Members
Posts: 134
Joined: 25-September 05
Member No.: 24684



I was going to say at one point that Acropolis's masstagger addons component added this functionality to the masstagger in foobar (so there were components doing this before), but with v1.8, around foobar version 0.9.6.x, Peter blocked third parties from attaching to it (doh). It's not exactly the same, but if it's possible (I don't know much about codepages), it would be nice to have a function specifically for converting Traditional Chinese to Simplified and vice versa, since Acropolis's now-deprecated plugin could do that. I'd understand if you aren't able or willing to, though.
Yirkha
post Apr 26 2009, 23:46
Post #14





Group: FB2K Moderator
Posts: 1810
Joined: 30-November 07
Member No.: 49158



That's a bit more related to what 2E7AH suggested, and again not so much about character conversion. I might add another interface for this kind of conversion or for custom transliterations; basically it's not a bad idea.


2E7AH
post Jul 6 2009, 07:46
Post #15





Group: Members
Posts: 1811
Joined: 21-May 08
Member No.: 53675



Yirkha, can you look here:

I tested one track by converting its tags to Latin-1 (ISO 8859-1) with Mp3tag, then using foo_chacon to convert them correctly in foobar, but without success. I tried with "Convert to local codepage first" checked and unchecked, with the same result. My system codepage is CP1251.
It worked OK in the past, but I don't know whether I was converting from this codepage then.

