Full Version: Managing music collection
Hydrogenaudio Forums > Hydrogenaudio Forum > General Audio
exec
Hello everyone!

For a long time I have planned to rip my CD collection. Now I really want to start this project using EAC (0.99pb1) and FLAC (1.2.0). So first of all I started to think about how all this could be managed/organized. There are many ways this can be done, so it would be greatly appreciated if some experts could give some comments/suggestions.


The purpose of all this would be to have a system with which as much information from the CDs as possible could be saved.
So to store information, there are initially 2 possibilities:
1. In tags of the flac files.
2. In another format, separated from the music files.


Tags: FLAC uses Vorbis comments, so I can save anything which can be expressed as a key-value pair. But exactly this "anything" might be a problem. Just think about the DATE tag: I can store a value "1999" in it, but also "07/06/1999". So if you don't watch out, you'll soon run into integrity problems.
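One way to guard against the DATE problem is to check every value before it is written to a tag. This is only a minimal sketch: it assumes you settle on ISO 8601 style values (YYYY, YYYY-MM or YYYY-MM-DD), and both the name `normalize_date` and the day-first reading of slash dates are assumptions of this example, not part of any FLAC or Vorbis specification.

```python
import re
from datetime import date

# Accepted canonical forms: YYYY, YYYY-MM, YYYY-MM-DD
ISO_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")
# Slash-separated input such as 07/06/1999, read here as DD/MM/YYYY (assumption)
SLASH_DATE = re.compile(r"^(\d{1,2})/(\d{1,2})/(\d{4})$")


def normalize_date(value: str) -> str:
    """Return the value in canonical form or raise ValueError."""
    value = value.strip()
    m = ISO_DATE.match(value)
    if m:
        year, month, day = m.groups()
        # Validate month/day by building a real date; missing parts default to 1
        date(int(year), int(month or 1), int(day or 1))
        return value
    m = SLASH_DATE.match(value)
    if m:
        day, month, year = (int(g) for g in m.groups())
        return date(year, month, day).isoformat()
    raise ValueError(f"unrecognised DATE value: {value!r}")
```

Run over a whole library before tagging, a check like this would flag mixed formats early, whichever reading of ambiguous dates you pick.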
Another point would be the use of other tags which are not in widespread use (e.g. pressing numbers of CDs). First of all, such info would be stored in every file of an album (-> redundancy). On the other hand, how could such info be used? You would have to use some kind of software which can handle such tags. I don't want to use such software (see below).

So, bottom line for tags: I'll use them, but only to store the most essential information. So let's say, all info which is included while ripping the CD with EAC/FLAC (I mean the placeholders used in EAC, e.g. %a).


Other formats: Well, "formats" might not be the exact term. The idea behind it would be to separate the music files from the (meta)data in some way.

First of all, I don't want to use special software for managing my music collection. Why? If I did, I would always be dependent on this software, and every piece of software has its restrictions. E.g. when migrating to another OS where this software doesn't work - then what? In the worst case, using other software with other restrictions. So don't think about MediaMonkey or something like this.

My second thought was some kind of database system (MS Access, SQL Express). OK, there I'm restricted by Microsoft in a way, but the argument against it was a different one. You have your music (the files) in one location and the info about it in another. In my opinion this would be too large a gap.

So, not using any special software at all brings us to some kind of file-based system. The first thing which came to my mind in this case was XML. There are a few obvious advantages:
  • You could store much info in XML files which would be located in the directories of the ripped CDs. So files and metadata are separated, but still in one place.
  • You can avoid integrity problems easily using DTDs or XML Schema.
  • You could use XSLT transformations to bring this information into any other format (e.g. HTML output of all your albums).
  • Also database-like queries should be possible using XPath/XQuery.
  • And last, but not least: you would be independent of special software/OS.
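To make the per-directory XML idea concrete, here is a rough sketch using only Python's standard library. The element and attribute names (`album`, `release`, `track`, `catalog` and so on) are invented for illustration; no existing schema is implied. Note that `xml.etree.ElementTree` supports only a limited XPath subset and does not validate against a DTD or schema or run XSLT, so those points of the list would need a third-party library.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one <album> element per directory, one <track> per file
album = ET.Element("album", {"artist": "Example Artist", "title": "Example Album"})
ET.SubElement(album, "release", {"date": "1999-06-07", "catalog": "XYZ-123"})
for number, (title, filename) in enumerate(
    [("First Song", "01 - First Song.flac"), ("Second Song", "02 - Second Song.flac")],
    start=1,
):
    track = ET.SubElement(album, "track", {"number": str(number), "file": filename})
    ET.SubElement(track, "title").text = title

# Serialise to text; in practice this would be written next to the FLAC files
xml_text = ET.tostring(album, encoding="unicode")

# Read it back and run a database-like query with ElementTree's XPath subset
parsed = ET.fromstring(xml_text)
second = parsed.find("./track[@number='2']/title")
all_files = [t.get("file") for t in parsed.findall("./track")]
print(second.text)
print(all_files)
```

The album-level data (release date, catalog number) is stored once per directory instead of once per file, which is the redundancy argument made above.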

Then I could also imagine having some foobar plugin supporting this system... then you could manage all this with the same software you already use for listening.


To sum it up: tagging only as far as this can be realized with EAC/FLAC. All other information would be stored in XML files (one per album/directory).

So, what do you think? Any suggestions or comments?
Thanks in advance!
Lyx
QUOTE(exec @ Jul 29 2007, 14:16) *

• And last, but not least: you would be independent of special software/OS.

Right, it would be *separate* from software/OS... AND not be "cross-application". Your metadata will be separate from the music files, even in cases where you don't want it to be separate. As you noticed already, you would need to implement this "metadata-format" into every existing music software and hardware. In other words: your solution would be incompatible with and unsupported by the entire existing digital music architecture.

As such, this approach may be interesting from a theoretical POV (or not), but it certainly isn't efficient from a practical POV.

- Lyx

P.S.: Even from a theoretical POV, the idea for the most part does not make much sense, semantically. Info like "date" etc. is directly associated with the music itself - if you copy a music file to another computer, then the date information still applies and doesn't change. Thus, this kind of info is semantically a PROPERTY of the music. It belongs to the music itself. From a semantic POV, there is no use in separating it. This contrasts with info which is not just about the music, but which instead describes relationships between the music and other things. One example would be playback statistics... playback statistics have location, device and listener as context - if you give the music to some other person, this info may not make sense anymore. Thus, it is NOT part of the music itself. In such cases, an external DB - or generally separating this data from the music files - may make sense.

- Lyx
exec
@Lyx: OK, in general, it isn't a good idea to separate something which belongs together. So in your opinion, it would be better to store all available information in the tags, right? But then again, there's no standard. Just look at the DATE tag, which is commonly used: what does it mean? The date of the recording, the date of the original release (e.g. on vinyl) or the date of the actual release as CD (remastered or something)? If I want to store all this info in tags, then I need 3 of them (maybe DATE, ORIGINAL DATE and RECORDING DATE) and I have to invent some tags and define their semantics on my own.

I've found this list, where vorbis tags and their use are recommended. Are there other lists available I haven't found yet?

So let's imagine I've defined my own tags. Perhaps I want to change some tag names one day. Then I have to edit all the files in my library. OK, maybe this can be done by tools like foobar, but this doesn't change the fact that all files will be altered. Imho this would be a little odd.

On the other hand, what happens to "my tags" if I convert the FLAC files to another format? E.g. MP3 with ID3v1 (in case of a DAP only supporting ID3v1), then some information will be lost for the MP3 files.
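The ID3v1 case can be illustrated directly, because ID3v1 is a fixed 128-byte block with room for only title, artist, album, a four-character year, a comment and a one-byte genre. The sketch below packs a tag dictionary into that layout and lists what does not fit. The input keys `ORIGINALDATE` and `CATALOGNUMBER` are hypothetical custom tags, and using 255 as the "no genre" byte is an assumption of this example.

```python
import struct

# ID3v1 is a fixed 128-byte block: "TAG", title, artist, album, year, comment, genre
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")
ID3V1_KEYS = {"TITLE", "ARTIST", "ALBUM", "DATE", "COMMENT"}


def to_id3v1(tags: dict) -> bytes:
    """Pack the few fields ID3v1 knows about; every other key is dropped."""
    def field(key: str, size: int) -> bytes:
        # Latin-1 with replacement, truncated to the fixed field width
        return tags.get(key, "").encode("latin-1", "replace")[:size]

    return ID3V1.pack(
        b"TAG",
        field("TITLE", 30),
        field("ARTIST", 30),
        field("ALBUM", 30),
        field("DATE", 4),      # only a four-character year fits
        field("COMMENT", 30),
        255,                   # genre byte; 255 assumed here to mean "none"
    )


flac_tags = {
    "TITLE": "First Song",
    "ARTIST": "Example Artist",
    "ALBUM": "Example Album",
    "DATE": "1999-06-07",
    "ORIGINALDATE": "1972",        # hypothetical custom tag
    "CATALOGNUMBER": "XYZ-123",    # hypothetical custom tag
}
block = to_id3v1(flac_tags)
lost = sorted(set(flac_tags) - ID3V1_KEYS)
print(len(block), lost)
```

Everything in `lost` simply has no place in the target format, and even DATE is cut down to its year.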


In both cases it would be much easier to have the info separated in another format which can be altered to a new structure (new tags, changed tag names) or even converted to any other format without the need to touch all my files.


OK, maybe this is a really theoretical POV with very little practical relevance. I just want to have a "system" which is generic enough to deal with all future eventualities. I just don't want to rip all my CDs now to realize in half a year that I need to change something and have to re-tag all my files or something like that.
hlloyge
Well, this is then called careful planning. I think that, regarding the DATE tag, it should be release date, not recording date. All other things you can place in comments.
I don't really know what you want to achieve - you have enough tags to tag your files as you want. If you need some database of your music collection, look at MySQL - it is available for Windows and Linux, and I guess it has its incarnation for Macs. But I am unsure what you will achieve with this, as I don't know of any program that will read that database, be connected in some way to your files, and work on all OSes.
plnelson
QUOTE(exec @ Jul 29 2007, 08:16) *

To sum it up: tagging only as far as this can be realized with EAC/FLAC. All other information would be stored in XML files (one per album/directory).

So, what do you think? Any suggestions or comments?
Thanks in advance!


How do you plan to access or use this information?

That's what should drive your decision. My wife and I are serious music collectors with a huge CD collection that we're currently ripping to a networked hard drive, and thence to a whole-house Sonos system, as well as to an 80G iPod. So we have to be cognizant of how our information architecture (which is basically what you're describing) will work with different playback devices.

Our IA scheme is driven by the need to easily search and find the music we're looking for, even if we have 3 different versions of the Shostakovich Piano Quintet or 6 different versions of Orange Blossom Special, including two different ones by Bill Monroe. Also, how well will your scheme accommodate musical forms that have multiple levels in their hierarchy? For example the native MP3 tags have no built-in hierarchy level between "Album" and "Song". So how would you search for, say, a symphony?

We use the MP3 tagging scheme but we define for ourselves what some of the tags mean. For example we populate the "genre" tag with our own genres because the default ones (e.g., "classical") are too general and vague. Also, for some genres we redefine "Album" to mean "opus", but we keep the traditional meaning for other genres, and we do this systematically. That's the answer to your "date" question - it means whatever YOU want it to mean but you just have to be systematic about it.
exec
Well, I mainly use 2 sources:
1. DAPs. They can only show a very limited number of tags (artist, album, song title and often the filename). So putting all info in tags is more or less useless for DAP-use. And so are XML files, because most DAPs can't even recognize them.
2. PC for home use: here I can display most of the tags (foobar), so at first thought the XML files are just redundant. My music collection is pretty small right now (~150 CDs), so at the moment I know what CDs I own and where to look for a particular song, etc. But what happens in a few years? Then it may be easy to lose the overview. Then I have to search for songs. The next thought would be a database system, which I don't want to use (see above). Next step: XML - imho it's the format with the most future possibilities:
- I could query this information in a db-ish way.
- I could transform it to any other format/structure/etc.
- Even if a new lossless format is released that doesn't support my custom tags, I could pull all this info out of the XML files and use it for the new format.
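The "pull the info out of the XML files" step could look roughly like this. It is a sketch under assumptions: the XML element names and the target tag names are made up, and the output is just a list of KEY=value lines that a tagging tool would then have to import. The point is that renaming a tag means editing the mapping, not the stored data.

```python
import xml.etree.ElementTree as ET

# Hypothetical album file using custom element names
ALBUM_XML = """
<album>
  <artist>Example Artist</artist>
  <title>Example Album</title>
  <original-date>1972</original-date>
  <pressing>XYZ-123</pressing>
</album>
"""

# Mapping from XML element name to tag name; edit this, not the audio files
TAG_NAMES = {
    "artist": "ARTIST",
    "title": "ALBUM",
    "original-date": "ORIGINALDATE",
    # "pressing" is deliberately unmapped, e.g. for a format without custom tags
}


def xml_to_tag_lines(xml_text: str, names: dict) -> list:
    """Return KEY=value lines for every mapped child element."""
    root = ET.fromstring(xml_text)
    return [
        f"{names[child.tag]}={child.text.strip()}"
        for child in root
        if child.tag in names and child.text
    ]


tag_lines = xml_to_tag_lines(ALBUM_XML, TAG_NAMES)
print(tag_lines)
```

A second mapping for another target format can sit next to the first one without touching the XML.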

So I don't care that this system will not be adopted by any other person/player/DAPs - it's just for my own information structure.
After thinking about it and testing a little I think I would use a good tagging system and my XML files. It shouldn't be a big deal using something like Mp3tag (just thinking of the export feature).
plnelson
QUOTE(exec @ Aug 8 2007, 09:31) *

Well, I mainly use 2 sources:
1. DAPs. They can only show a very limited number of tags (artist, album, song title and often the filename). So putting all info in tags is more or less useless for DAP-use. And so are XML files, because most DAPs can't even recognize them.

. . .

So I don't care that this system will not be adopted by any other person/player/DAPs - it's just for my own information structure.
After thinking about it and testing a little I think I would use a good tagging system and my XML files. It shouldn't be a big deal using something like Mp3tag (just thinking of the export feature).

We've ripped about 1300 CDs plus maybe a thousand other tracks that we've bought online or gotten from other sources.

An XML scheme is fine as far as it goes but how will you use it with a DAP? What you are likely to find is that as your music collection grows past 150 CDs you will not be able to rely on your memory to find or organize your music and you will get tired of booting up a computer and sitting in front of it just to search through your collection. We use lots of tags that our 80G iPod doesn't support but we make sure that the ones it DOES support, like song, album, composer, etc, are based on an IA that makes it easy to search and organize such a huge selection.

Also N.B. that an iTunes library already is XML-based and contains virtually every tag anyone could want, plus it's easily hackable.
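As a rough illustration of "easily hackable": the iTunes library XML is an Apple property list, so Python's standard `plistlib` can read it. The sketch below builds a tiny stand-in document instead of opening a real library file; the track IDs and the selection of keys are assumptions of this example, and a real export carries many more fields.

```python
import plistlib

# Tiny stand-in for an iTunes library export; the real file has many more keys
sample = plistlib.dumps({
    "Tracks": {
        "101": {"Name": "Allegro vivace", "Artist": "Example Trio",
                "Album": "Dvorak P3 Op90", "Composer": "Dvorak, Antonin"},
        "102": {"Name": "Everything in Its Right Place", "Artist": "Radiohead",
                "Album": "Kid A", "Composer": "Yorke, Thom"},
    }
})

# Parse the XML property list and group track names by composer
library = plistlib.loads(sample)
by_composer = {}
for track in library["Tracks"].values():
    by_composer.setdefault(track.get("Composer", "?"), []).append(track["Name"])
print(by_composer)
```

With a real library you would read the bytes from the exported file and pass them to `plistlib.loads` in the same way.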

This weekend we're installing a whole-house Sonos system and we'll see how well it adapts to our schema - this is the big test: can one IA work well with two different playback devices?

As I said, it all comes down to information architecture. The wrong IA can sink you. There are people out there with 15,000 "song" collections and tens of thousands of dollars of speakers and hard drives and DACs and home-theater amplifiers and whatnot who go nuts because it's too hard to find their songs or play them in the right order or even remember what they have because they didn't give enough thought upfront to their information architecture.
kjen
QUOTE(plnelson @ Aug 6 2007, 18:33) *

... Also, how well will your scheme accommodate musical forms that have multiple levels in their hierarchy? For example the native MP3 tags have no built-in hierarchy level between "Album" and "Song". So how would you search for, say, a symphony?

We use the MP3 tagging scheme but we define for ourselves what some of the tags mean. For example we populate the "genre" tag with our own genres because the default ones (e.g., "classical") are too general and vague. Also, for some genres we redefine "Album" to mean "opus", but we keep the traditional meaning for other genres, and we do this systematically. That's the answer to your "date" question - it means whatever YOU want it to mean but you just have to be systematic about it.


Would you consider posting your scheme? I agree with you entirely about the standard MP3 tagging scheme and would be very interested in how you worked around it!

exec
QUOTE(kjen @ Aug 8 2007, 17:56) *

QUOTE(plnelson @ Aug 6 2007, 18:33) *

... Also, how well will your scheme accommodate musical forms that have multiple levels in their hierarchy? For example the native MP3 tags have no built-in hierarchy level between "Album" and "Song". So how would you search for, say, a symphony?

We use the MP3 tagging scheme but we define for ourselves what some of the tags mean. For example we populate the "genre" tag with our own genres because the default ones (e.g., "classical") are too general and vague. Also, for some genres we redefine "Album" to mean "opus", but we keep the traditional meaning for other genres, and we do this systematically. That's the answer to your "date" question - it means whatever YOU want it to mean but you just have to be systematic about it.


Would you consider posting your scheme? I agree with you entirely about the standard MP3 tagging scheme and would be very interested in how you worked around it!


x2
That'll be great!
Preuss
I like this idea a lot.

Because you will always be able to write a program that fills in all the tags in MP3, FLAC or WavPack files from the XML. Something like FLAC <-> XML <-> MySQL.

Now you will be able to update your XML with a text editor, and later you could use your (self-made) program to update the metadata in all your music files. But then you need another program to synchronize between the XML and your MySQL database, to be able to search in it.
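The XML-to-database half of that chain could be sketched like this. To keep the example self-contained it uses SQLite from Python's standard library as a stand-in for MySQL, and the `album` table and the XML attribute names are invented for illustration. The XML stays the master copy; the database is only a searchable index that can be rebuilt at any time.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical per-album documents, as they might be read from each directory
ALBUMS = [
    '<album artist="Example Artist" title="First Album" date="1999"/>',
    '<album artist="Example Artist" title="Second Album" date="2003"/>',
    '<album artist="Another Artist" title="Debut" date="1987"/>',
]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE album (artist TEXT, title TEXT, date TEXT, "
    "PRIMARY KEY (artist, title))"
)


def sync(xml_documents):
    """Insert or update one row per album document; XML stays the master copy."""
    for text in xml_documents:
        a = ET.fromstring(text).attrib
        db.execute(
            "INSERT OR REPLACE INTO album (artist, title, date) VALUES (?, ?, ?)",
            (a["artist"], a["title"], a["date"]),
        )
    db.commit()


sync(ALBUMS)
sync(ALBUMS)  # running it twice must not create duplicates
rows = db.execute(
    "SELECT title FROM album WHERE artist = ? ORDER BY date", ("Example Artist",)
).fetchall()
print(rows)
```

Because the sync is idempotent, it can simply be rerun after every edit to the XML files.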

Cool idea.
plnelson
QUOTE(kjen @ Aug 8 2007, 11:56) *

QUOTE(plnelson @ Aug 6 2007, 18:33) *

... Also, how well will your scheme accommodate musical forms that have multiple levels in their hierarchy? For example the native MP3 tags have no built-in hierarchy level between "Album" and "Song". So how would you search for, say, a symphony?

We use the MP3 tagging scheme but we define for ourselves what some of the tags mean. For example we populate the "genre" tag with our own genres because the default ones (e.g., "classical") are too general and vague. Also, for some genres we redefine "Album" to mean "opus", but we keep the traditional meaning for other genres, and we do this systematically. That's the answer to your "date" question - it means whatever YOU want it to mean but you just have to be systematic about it.


Would you consider posting your scheme? I agree with you entirely about the standard MP3 tagging scheme and would be very interested in how you worked around it!


What else do you want to know about it?

1. We overwrite the online DB genres with our own. For example we find "classical" too broad, so we have genres called chamber, piano solo, symphonic, etc. Other people prefer period genres like baroque, late-romantic, etc. Pick ones that are meaningful to you.

2. Composer means composer, last name first, always spelled the same - same for all genres. For example Radiohead's "Everything in its Right Place" gets "Yorke, Thom" whether it's performed by Radiohead, or classical pianist Christopher O'Riley or jazz pianist Brad Mehldau. (BTW, it's amazing how many artists in how many different genres like to do Radiohead music!)

3. Artist means performing artist or ensemble name

4. Album depends on genre. For all classical genres it refers to the piece (i.e., the opus or the work). For other genres it refers to the record or CD album on which that recording first appeared. For example, if I have a song from the Who album Quadrophenia, even if I got that song off of some compilation, as long as it's the same song (i.e., not remixed or re-recorded) I'll put Quadrophenia in the Album.

5. The track for classical is the movement, e.g., "allegro vivace" or whatever; for everything else it's the song. All classical tracks are preceded by a shorthand description of the piece so we can tell at a glance on our iPod Nano or other players with a really short display what's playing, for example Dvorak's "Dumky" piano trio is "Dvorak P3 Op90". The Trout Quintet is "Schubert P5 Op114"; a string quartet would be an S4, etc. An iPod Nano can fit about 24 characters across and so can the Sonos controller, so we have to use shorthand.
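The shorthand from point 5 is regular enough to generate. This is only a sketch of the idea: the function names are made up, the code table contains just the three ensembles mentioned above, and cutting the result at 24 characters is an assumption about how one might deal with the display width, not a description of how these tags were actually written.

```python
DISPLAY_WIDTH = 24  # characters that fit on the small displays mentioned above

# Ensemble codes from the post: P3 = piano trio, P5 = piano quintet, S4 = string quartet
ENSEMBLE_CODES = {
    "piano trio": "P3",
    "piano quintet": "P5",
    "string quartet": "S4",
}


def work_prefix(composer: str, ensemble: str, opus: int) -> str:
    """Build the shorthand, e.g. 'Dvorak P3 Op90'."""
    return f"{composer} {ENSEMBLE_CODES[ensemble]} Op{opus}"


def track_title(prefix: str, movement: str, width: int = DISPLAY_WIDTH) -> str:
    """Prefix the movement name and cut the result to the display width."""
    return f"{prefix} {movement}"[:width]


dumky = work_prefix("Dvorak", "piano trio", 90)
trout = work_prefix("Schubert", "piano quintet", 114)
print(dumky, "|", trout, "|", track_title(dumky, "allegro vivace"))
```

Generating the prefix from structured fields keeps the spelling identical across all movements of a work, which is what makes the scheme searchable.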

So far this system has been working great for us.

kjen
QUOTE(plnelson @ Aug 9 2007, 14:29) *



So far this system has been working great for us.


Thanks very much, that is very helpful. I'm about to rerip my collection, more systematically than last time. I think the big leap is separating the tagging (e.g. by composer / composition) from the physical media - the CD. Last time my player (Archos) didn't support tags well enough for that to be worthwhile; the new one does.

daybrain
Check this simple tool if you want to quickly find your music based only on the path and filename:

http://www.geocities.com/supertyp77/index.html

(it's freeware / open source)

plnelson
QUOTE(kjen @ Aug 9 2007, 20:09) *
Thanks very much, that is very helpful. I'm about to rerip my collection, more systematically than last time. I think the big leap is separating the tagging (e.g. by composer / composition) from the physical media - the CD. Last time my player (Archos) didn't support tags well enough for that to be worthwhile; the new one does.


All the players are very frustrating. MP3 and other formats all have lots of tags that no external player supports. I mentioned the tags I did because they are the ones I can see on my iPod and Sonos systems but I'd love to use other tags to see recording dates, lyrics, personnel (especially in jazz and classical I want to know who's playing what).

Sonos is maddening because they are a premium-priced product with a huge display on their remote but they waste so much of their screen real estate on album art that they actually have about the same text size limits as my iPod Nano!! Plus they don't scroll so the Nano can actually display more content!!

The other problem is that the MP3 tagging scheme has no supported provision to link to outside documents, so if I had the liner notes, lyrics, reviews, history of how a piece was written or recorded in, say, an HTML, XML, or .pdf file there's no way to directly access it from the MP3 file. There's nothing to stop anyone from putting it in whatever they think is an appropriate tag, but that would just be their little secret because no software or player would be able to utilize it.

The MP3 tagging scheme was designed in 1996 (ID3V1) and 1997 (ID3V1.1) and it's mind-boggling that anything designed so recently would not include a built-in provision for linking to external documents. The ID3V2 (1998 and beyond) standard does include such a provision but ID3V2 has so many flavors and versions and frame structures that support for it is inconsistent. It's also poorly designed from a software engineering standpoint with different frames having different structures. The other problem we've already noted with ID3 is that it has lousy support for classical music.

Op-Ed : In general I'm a big fan of open-source projects. But to me, ID3 represents a total failure of the open-source paradigm because it's resulted in such a chaotic, poorly-designed mess of competing, incompatible standards that most products only utilize a tiny subset of tags and serious music lovers, regardless of their preferred genres, have to roll their own homebrew solutions and workarounds. To some extent this reflects poor decisions initially by Eric Kemp and Michael Mutschler in the ID3 V1 generation because most of the big problems are the result of people trying to fix the original standard.





kjen
QUOTE(plnelson @ Aug 10 2007, 12:16) *


...

Op-Ed : In general I'm a big fan of open-source projects. But to me, ID3 represents a total failure of the open-source paradigm because it's resulted in such a chaotic, poorly-designed mess of competing, incompatible standards that most products only utilize a tiny subset of tags and serious music lovers, regardless of their preferred genres, have to roll their own homebrew solutions and workarounds. To some extent this reflects poor decisions initially by Eric Kemp and Michael Mutschler in the ID3 V1 generation because most of the big problems are the result of people trying to fix the original standard.


I entirely agree, when I looked at how ID3v1 was put together I couldn't believe it was so limited. However I'd go further back (and to a corporate level) for the two major failures: (i) not putting any data on the CD when the format was defined, despite all the capacity available, that's when the recording companies abdicated; (ii) when MS defined WAV as the base uncompressed format they couldn't be bothered when adapting the existing IFF standards to define blocks for tagging the files. Once those bad decisions played out confusion was inevitable.