foo_chacon, A thingie to convert metadata charset. |
Oct 5 2008, 16:08 | Post #1
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
I needed to fix broken tags of a bunch of files yesterday, so I've made myself this component to do that efficiently and I thought perhaps someone else might find it useful as well, so here it is.
The offered functionality is essentially similar to what the "Override charset" option in foo_infobox did, though it's accessed directly from the context menu and works for any number of tracks at once. It can generally be used to fix ID3v1 tags or cue sheets saved in a codepage different from that of your system. (And no, nobody wants to hear about those infelicitous files from shabby sources ;)
foo_chacon-0.0.2.zip (54 kB, v0.0.2, 2009/02/14)
This post has been edited by Yirkha: Feb 14 2009, 03:53
--------------------
Full-quoting makes you scroll past the same junk over and over.

Dec 10 2008, 20:59 | Post #2
Group: Members | Posts: 371 | Joined: 27-September 03 | Member No.: 9041
I am surprised that there are no replies to this; maybe that's because until now there was no reason to switch from foo_infobox.
Thanks for this plugin. It performs a seldom-needed function, but when it's needed, it is immeasurably helpful. It does exactly the thing I kept infobox in foobar for, but does it so much better.
This post has been edited by Borisz: Dec 10 2008, 20:59
--------------------
http://evilboris.sonic-cult.net/346/
Sega Saturn, Shiro!

Dec 11 2008, 11:01 | Post #3
Group: Members | Posts: 18 | Joined: 10-October 05 | Member No.: 25022
Just wanted to say thanks for such a great utility. This alone definitely made the switch from v0.8.3 that much easier.
Might I ask you to explain the "convert to local codepage first" feature in more detail? Also, might I suggest adding a filter so that it's easier to select all Chinese codepages or all Japanese codepages?

Dec 11 2008, 11:57 | Post #4
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
Oh noes, people have found this and started asking questions...
The "Convert to local codepage first" feature is necessary because foobar2000 reads files with an unspecified character set as if they were in the local system codepage. The result is then converted to UTF-8, like everything else in fb2k. Because we want to re-read the tags in another charset, it's usually necessary to first convert them back from UTF-8 to that local system codepage, and then reparse the result from whatever charset you have selected into UTF-8 again. The checkbox enables the first part of this process.
Note that this is also why this component is inherently unsafe: there is no guarantee that the conversion "CP_target read as CP_system => CP_UTF8 => CP_system" is lossless, i.e. that it gives back the original bytes. The proper way would be to read the tags from the various file formats directly, without going through the standard input modules. But it seems to work quite well so far, so let's hope this won't be needed.
Regarding the filter: yes, something like that could be added, and it would need some additional configuration and/or custom-drawn groups. Though when I used Chacon, I usually chose one particular charset and processed many subsequent files with it easily, because the setting was remembered. When something different came up, even mindlessly skimming through the whole list was not much of a hassle. I'd rather leave it as simple and stupid as it is, thank you.
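The round trip described in the post above can be sketched in a few lines of Python. This is only an illustration, not the component's code: `SYSTEM_CP`, `TARGET_CP`, `misread` and `fix` are made-up names, and Python's codecs stand in for the Windows conversion routines, which may treat undefined bytes differently.

```python
# Illustration only. SYSTEM_CP and TARGET_CP are assumed example values:
# the local system codepage and the charset the tag was really saved in.
SYSTEM_CP = "cp1252"
TARGET_CP = "shift_jis"


def misread(raw: bytes) -> str:
    """What the player ends up with: raw tag bytes decoded as SYSTEM_CP."""
    return raw.decode(SYSTEM_CP, errors="replace")


def fix(displayed: str) -> str:
    """Step 1: convert the text back to SYSTEM_CP bytes
    (the "Convert to local codepage first" part).
    Step 2: reparse those bytes as TARGET_CP."""
    raw = displayed.encode(SYSTEM_CP, errors="replace")
    return raw.decode(TARGET_CP, errors="replace")


# Usage: a tag whose bytes are all defined in SYSTEM_CP survives the trip,
#   fix(misread("テスト".encode(TARGET_CP)))
# while a tag containing a byte that SYSTEM_CP cannot represent (0x81 in
# Python's cp1252 codec) comes back damaged, which is the "inherently
# unsafe" case mentioned above.
```

With these assumed codepages, step 1 only works as long as every original byte maps to a character in the system codepage and back again; anything replaced on the way in cannot be restored on the way out.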

Dec 11 2008, 22:22 | Post #5
Group: Members | Posts: 18 | Joined: 10-October 05 | Member No.: 25022
Thanks for the info on the "convert to local codepage first" feature.
Regarding the filter feature, yeah, it's a bit more situational and probably wouldn't save all that much effort.

Jan 23 2009, 22:34 | Post #6
Group: Members | Posts: 1811 | Joined: 21-May 08 | Member No.: 53675
Would it be possible to extend the component with custom character remapping?
Then we could easily transform Latin to Cyrillic or something similar.

Feb 2 2009, 10:22 | Post #7
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
That would be possible. This component currently simply uses Windows routines to convert between different encodings, but adding a custom converter wouldn't be hard and it would provide additional flexibility.
However, I'm thinking about how to store such remapping tables. To stay within the scope of "character set remapping", it would need to allow mapping arbitrary binary sequences to Unicode codepoints. Because it's not possible to use two different charsets in one user-editable text file, the data must be formatted, for instance, in hex, and I'd use the same format as iconv for compatibility. For example, mapping A/B/C to a/b/c:
CODE
0x41 0x61
0x42 0x62
0x43 0x63
But when you speak about transliteration, I'm not sure that format would be as suitable. Some kind of list of replacements, with both sides already in Unicode, seems much better to me for such usage. And then I'm not sure it has much to do with character set remapping...
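To make the table idea above concrete, here is a minimal Python sketch of how such a hex table could be read and applied. It assumes one whitespace-separated pair per line, as in the A/B/C example, and does not claim to cover the full iconv table syntax; `parse_table` and `remap` are hypothetical names, and the `#` comment handling and the U+FFFD fallback are my own additions.

```python
def parse_table(text: str) -> dict:
    """Parse lines of '0x<source bytes> 0x<Unicode codepoint>' into a
    bytes -> character mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # '#' comments are an assumption
        if not line:
            continue
        src, dst = line.split()[:2]
        # "0x41" -> b"A"; longer sequences such as "0x8140" also work
        table[bytes.fromhex(src[2:])] = chr(int(dst, 16))
    return table


def remap(data: bytes, table: dict) -> str:
    """Convert bytes using the table, trying the longest source sequence
    first; bytes without a mapping become U+FFFD."""
    longest = max(map(len, table), default=1)
    out = []
    i = 0
    while i < len(data):
        for n in range(min(longest, len(data) - i), 0, -1):
            ch = table.get(data[i:i + n])
            if ch is not None:
                out.append(ch)
                i += n
                break
        else:
            out.append("\ufffd")
            i += 1
    return "".join(out)


TABLE = parse_table("0x41 0x61\n0x42 0x62\n0x43 0x63")
```

Here `remap(b"ABC", TABLE)` walks the input byte by byte and looks each sequence up in the parsed table. Keying the table on byte sequences rather than single bytes is what lets one format describe multi-byte charsets as well.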

Feb 2 2009, 18:41 | Post #8
Group: Members | Posts: 1811 | Joined: 21-May 08 | Member No.: 53675
OK, probably using $replace() is the easy way.
I was thinking about simple remappings within the same code page, and you are thinking more globally. I wouldn't have anything to suggest, because the subject is beyond me.

Feb 14 2009, 03:59 | Post #9
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
v0.0.2 is up, features one simple addition: it is possible to copy text from selected fields in the preview pane using context menu or keyboard shortcut Ctrl+C. Helps when you don't have a clue how the tags should really look - you can for instance paste them to Google and see if it yields plausible results.

Feb 17 2009, 13:01 | Post #10
Group: Members | Posts: 1 | Joined: 24-November 08 | Member No.: 63084
What about UTF-16 support?

Feb 17 2009, 13:40 | Post #11
Group: Members | Posts: 1811 | Joined: 21-May 08 | Member No.: 53675

Feb 17 2009, 15:55 | Post #12
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
If your tags are stored in UTF-16, but read as UTF-8 or other charset, this component can't help you. It doesn't access the tags directly and such texts would get truncated or otherwise mangled before they even get there. (And that's an inherent limitation of how it works, not limited to UTF-16.)
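A small Python illustration of the truncation case: UTF-16 text in the ASCII range has a zero byte after every character, so a reader that treats the field as a NUL-terminated single-byte string stops after the first one. The post does not say how exactly the text gets cut, so the NUL-terminated reading and the name `read_as_c_string` are assumptions made for this sketch.

```python
def read_as_c_string(raw: bytes, codepage: str) -> str:
    """Assumed reader behaviour: cut at the first NUL byte, then decode
    the remainder with a single-byte codepage."""
    return raw.split(b"\x00", 1)[0].decode(codepage, errors="replace")


utf16_tag = "Abc".encode("utf-16-le")  # b"A\x00b\x00c\x00"
seen_by_component = read_as_c_string(utf16_tag, "cp1252")
```

Under that assumption `seen_by_component` holds only the first letter, and no later charset conversion can bring back the bytes that were dropped.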

Apr 26 2009, 22:40 | Post #13
Group: Members | Posts: 134 | Joined: 25-September 05 | Member No.: 24684
I meant to mention earlier that Acropolis's masstagger addons component added this functionality to the masstagger in foobar (so there were components doing this before), but with v1.8, around foobar version 0.9.6.x, Peter blocked third parties from attaching to it (doh). It's not exactly the same thing, but if it's possible (I don't know much about codepages), it would be nice to have a function specifically for converting Traditional Chinese to Simplified and vice versa, since Acropolis's now deprecated plugin could do that. I'd understand if you aren't able or willing to, though.

Apr 26 2009, 23:46 | Post #14
Group: FB2K Moderator | Posts: 1810 | Joined: 30-November 07 | Member No.: 49158
That's more closely related to what 2E7AH suggested, and again not so much about character set conversion. I might add another interface for this kind of conversion or for custom transliterations; basically, it's not a bad idea.

Jul 6 2009, 07:46 | Post #15
Group: Members | Posts: 1811 | Joined: 21-May 08 | Member No.: 53675
Yirkha, can you look here:
I tested one track by converting its tags to Latin-1 (ISO 8859-1) with Mp3tag, then using foo_chacon to convert them correctly in foobar, but without success. I tried with "Convert to local codepage first" both checked and unchecked, with the same result. I'm on CP1251. It worked OK in the past, but I don't know whether I was converting from this codepage.