Full Version: Successor for CDDB/FreeDB
Fandango
Hi,

I'm concerned about the current state of FreeDB and CDDB. First of all, the info that can be stored there is rather limited. Second, even if there were a way to add more specific info for a CD, most people wouldn't use it; some people don't even fill in the year info. Also, many redundant entries are created.

Currently freedb/cddb are missing features including (but not limited to) the following:
  • a mechanism to distinguish remastered albums, or albums which otherwise have the exact same TOC. Making replaygain info a standard part of submitted album info would help.
  • distinction between original release date (the year when the first release came out) and reissue/remaster/repackaging/re* release date
  • catalog numbers
  • ...
The new system should be based on the principle of automation, which the founders of cddb most likely didn't think of when they started. But today a lot of music applications are able to either search or submit cddb data, and some of them can even make perfect audio CD copies and know how to calculate replaygain correction values. The technology is all there, but it's not used for making a "correct" CD database.

Does anyone know if there's already a project like this in existence, or does anyone have objections or more ideas?

I think freedb/cddb is very useful: it saves a lot of typing when creating tags. But it's sometimes a real bummer to get your original release labeled as a remastered CD, or to discover mistyped tracks weeks after you've used freedb/cddb.
Garf
I had some similar thoughts a while ago, and cobbled together http://www.foosic.org. I wasn't thinking of a database in quite the way that you seem to be, though. For one, things like ReplayGain data are easy to calculate automatically and are encoder-dependent, so they don't belong in a global DB.

But yeah, I think we can agree the current systems have unfortunate limitations.

Do you know about http://musicbrainz.org ?
Fandango
Musicbrainz rings a bell. I always thought of them as a group of people who maintain a set of tagging guidelines ("too bloated", I thought back then). But I didn't know that they had a database, or how advanced they already are with it all, using song fingerprints and so on... looks promising. But it makes me wonder why I have never heard of any multimedia software that supports Musicbrainz. They have an SDK and their concept seems to be very advanced, so what's the hold-up?
Synthetic Soul
MP3tag can query MusicBrainz it seems.

I think any system is always going to be let down by the fact that there are a lot of inbreds in the world. As anal as you may be, there will be one hundred people who don't know how to use the shift key, and think that "thx" is proper English.

You can build the best database in the world, but someday you have to let the general public at it. And the general public are a mixed bunch.

Edit: spelling. See?
jaybeee
QUOTE(Synthetic Soul @ Mar 10 2006, 09:28 PM)
MP3tag can query MusicBrainz it seems.
*

Mp3Tag can also query Discogs, and imo that is the best electronic music database. However, it mainly covers electronic music (dance etc.) and may not be available in a format that is as easy to use as freedb, cddb & musicbrainz. But that's just me guessing tbh. I was really just posting to inform you guys of that website/database.
Eli
I think any nextgen DB will have to include some sort of acoustic fingerprint. Unfortunately MB is moving away from their acoustic fingerprint because the one they were using was closed source and they don't control it. In addition, it has proven not to be a nearly robust enough algorithm, as it has a lot of collisions (matching fingerprints for completely different songs).

Other things that would be nice in a music DB:
-featuring artists separate from song titles
-per song genre
-artist genre
-song lyrics
-cover art
.
.
.
Eli
QUOTE(Synthetic Soul @ Mar 10 2006, 04:28 PM)
MP3tag can query MusicBrainz it seems.

Edit: spelling.  See?
*



Have you tried? I did, and I can't see anything in MP3tag that has anything to do with MB. I know MB lists it on their site as an app that uses their db, but I don't think it does.
Fandango
There's a user-contributed "web source" file (.src) available for Musicbrainz in the MP3Tag development forums. MP3Tag can use virtually any online database for getting tag info.
Cartman_Sr
I'm actually considering just typing in the CD information manually in EAC. Using freedb just doesn't save much time, I agree.

One solution could be to just erase the entire freedb, but keep the infrastructure of it. Then start all over, requiring submitters to be registered with a valid email address. That way, users who submit data all have to agree on a set of conventions, like the various artists naming convention, etc. And people who send in bogus entries can be prohibited from submitting.
Societal Eclipse
I've sent corrections to FreeDB before for albums that were not correctly entered. I'm not sure if that just creates a duplicate or someone actually looks at it and fixes the problem.
Florian
QUOTE(Eli @ Mar 11 2006, 01:59 AM)
QUOTE(Synthetic Soul @ Mar 10 2006, 04:28 PM)
MP3tag can query MusicBrainz it seems.
*



Have you tried? I did, and I can't see anything in MP3tag that has anything to do with MB. I know MB lists it on their site as an app that uses their db, but I don't think it does.
*


Mp3tag doesn't use the MusicBrainz TRM identification but can query information from the MusicBrainz web site. Just go to the Web Sources Archive and download the updated MusicBrainz web source.

After copying the file to %APPDATA%\Mp3tag\data\sources, you can use it via the drop-down beside the web icon (see the screenshot attached)
spoon
Any effort put into an updated database should go into freedb (there is no reason to create a new second database to work alongside the old one).

The reasons for this are:

  • everyone knows about freedb
  • they have the infrastructure in place.

However, cover art is a big no-no: it is not copyright infringement to hold the text of what an album contains, but it is to hold the actual cover art image.

As a next step, freedb just needs a better Disc ID and to get rid of the rubbish old 8 categories (what is the point?). Right now freedb can contain any extra text info you could wish for (this is how we added year and any genre to freedb), so define a few standards. Sadly the current maintainers of freedb are not down and out script people, so adding these needs new people.
Eli
QUOTE(Ganymed @ Mar 11 2006, 03:06 AM)

After copying the file to %APPDATA%\Mp3tag\data\sources, you can use it via the drop-down beside the web icon (see the screenshot attached)
*



Hmm, I have added them to the sources folder but they don't show up for me. Is there a way they need to be activated or something?
Florian
QUOTE(Eli @ Mar 11 2006, 03:11 PM)
QUOTE(Ganymed @ Mar 11 2006, 03:06 AM)

After copying the file to %APPDATA%\Mp3tag\data\sources, you can use it via the drop-down beside the web icon (see the screenshot attached)
*



Hmm, I have added them to the sources folder but they don't show up for me. Is there a way they need to be activated or something?
*


There is a difference between %APPDATA% which is C:\Documents and Settings\<username>\Application Data\... and C:\Program Files\... I guess you've copied it to the latter.
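
For anyone unsure where %APPDATA% actually points, a minimal Python sketch can print the resolved folder. The helper name `mp3tag_sources_dir` is made up for illustration and is not part of Mp3tag; only the `Mp3tag\data\sources` suffix comes from the instructions quoted above.

```python
import ntpath
import os


def mp3tag_sources_dir(env=os.environ):
    """Return the per-user Mp3tag web-sources folder, or None if APPDATA is unset.

    mp3tag_sources_dir is a hypothetical helper, not part of Mp3tag.
    """
    appdata = env.get("APPDATA")
    if not appdata:
        return None  # not on Windows, or the variable is missing
    # ntpath joins with backslashes regardless of the platform running this.
    return ntpath.join(appdata, "Mp3tag", "data", "sources")


if __name__ == "__main__":
    print(mp3tag_sources_dir())
```

If the printed path does not start with C:\Program Files, that confirms the .src file belongs in the per-user folder rather than next to the program itself.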

Edit: we have a nice support forum too, so it would be great if you can ask your questions there (or via PM) because this is getting a little bit off-topic.
Fandango
I agree with spoon that dropping freedb would be too radical. It's well established, and everyone who knows that there is something like a CD database on the net one can query also knows about freedb.

Making a new database project from scratch that is incompatible with freedb/cddb, like Musicbrainz, has one big problem: no one knows about it. Even if there's a big team of good developers and supporters behind it, there's also the much bigger group of common users and applications who only support freedb, if at all, and won't budge to support a new system.

Initiating a freedb2 (somewhat backwards compatible with old freedb queries) would be a nice thing: setting it up in parallel to the old one and, of course, from scratch. The lack of fingerprinting techniques like TRM isn't a big drawback IMHO. I think TOC info plus replaygain is sufficient in most cases, especially for distinguishing between remastered and original releases, which I consider to be a big problem. The problems that result from the fact that replaygain (RP) correction values differ depending on whether the source material was encoded or is a direct copy from an audio CD could be minimized by adding the requirement to also include the encoder type when submitting RP info. This, plus a statistical database feature to identify false or out-of-line submissions, would make fingerprinting (which isn't flawless anyway) somewhat redundant.
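
The statistical check described here could be sketched roughly as follows. This is only an illustration under assumptions: each submission is reduced to one album-gain value in dB for a given TOC and encoder type, values within `tolerance_db` of each other are taken to be the same mastering, and a group with fewer than `min_votes` submissions is treated as out of line. The function name, the thresholds, and the sample numbers are all hypothetical.

```python
def group_masterings(gains_db, tolerance_db=0.3, min_votes=2):
    """Split album-gain submissions for one TOC into likely masterings.

    Returns (masterings, suspects): lists of groups of gain values.
    Hypothetical sketch; the thresholds are illustrative assumptions.
    """
    groups = []
    for gain in sorted(gains_db):
        # Extend the current group while the gap to the previous value is small.
        if groups and gain - groups[-1][-1] <= tolerance_db:
            groups[-1].append(gain)
        else:
            groups.append([gain])
    masterings = [g for g in groups if len(g) >= min_votes]
    suspects = [g for g in groups if len(g) < min_votes]
    return masterings, suspects


# Made-up submissions: an original pressing around -3 dB,
# a louder remaster around -9 dB, and one stray value.
submissions = [-3.1, -2.9, -3.0, -9.2, -9.0, -9.1, 1.5]
masterings, suspects = group_masterings(submissions)
print(len(masterings), "likely masterings; suspect values:", suspects)
```

A real service would need something more robust than fixed thresholds, but even this much shows how identical TOCs could be split into separate releases while stray submissions get held back for review.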

There's probably a lot on the wishlist for a better freedb, like cover art and user comments and reviews... the former is very questionable and will definitely get the project into legal trouble, because cover art is copyrighted. Adding reviews could also be problematic when users simply copy and paste reviews from major (or even minor/private) music websites.

In my opinion the most important things such a database must provide are correct info, no redundant entries for the same issue of a CD, and some ability to identify CDs with identical TOCs but different audio content. So if I insert the 1987 version of an album, I want the database to tell me that it is the 1987 version from the US label and not the 1992 re-issue by a European distributor. In cases where the TOC and audio are 100% the same, it should leave me the choice to pick the issue that I think matches my CD. Moreover, when I query a single with only 2-3 tracks I don't want a list of at least 10 different artists and their singles which all happen to have the same single CD layout. Another big "I want" on my wishlist would be the ability to combine several CDs into a multi-CD release. freedb is full of entries with inconsistent naming attempts for double- or multi-CD albums, like appending (Disc 1), (Disk 1), (CD 1/2) to the album field.
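
To illustrate the multi-CD point, here is a small Python sketch that pulls such free-form suffixes out of the album field and turns them into structured disc numbers. The pattern only covers the variants listed above plus a few obvious relatives; the function name and the sample titles are invented for the example.

```python
import re

# Matches trailing markers such as "(Disc 1)", "(Disk 2)", "(CD 1/2)", "[cd2]".
_DISC_SUFFIX = re.compile(
    r"\s*[\(\[]\s*(?:disc|disk|cd)\s*(\d+)(?:\s*(?:/|of)\s*(\d+))?\s*[\)\]]\s*$",
    re.IGNORECASE,
)


def split_disc_suffix(album):
    """Return (title, disc_number, disc_total); numbers are None if absent."""
    match = _DISC_SUFFIX.search(album)
    if not match:
        return album.strip(), None, None
    title = album[: match.start()].strip()
    number = int(match.group(1))
    total = int(match.group(2)) if match.group(2) else None
    return title, number, total


for raw in ["Example Album (Disc 1)", "Example Album (Disk 1)", "Example Album (CD 1/2)"]:
    print(split_disc_suffix(raw))
```

With the disc number stored separately, all three spellings end up under the same album title, which is what grouping discs into one release would need.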

Editing entries must be made much easier than it is now, too. A wiki-like approach for adding and correcting information would be nice.
Cartman_Sr
I agree with you on most points, except that I don't think RP values should be stored. Basically, I think the data that is stored for each disc (artist, year, etc.) is sufficient as it is. Definitely do not get rid of freedb. I still think what it needs is an "Extreme Makeover", where the information gets removed and the maintainers implement some sort of fingerprinting/statistical database to keep out people who do not know how to read and can't type properly. Let them access the DB; they just can't submit entries. Keep the infrastructure, ditch the illiterate submitters. Obviously it would have to be down for a while, but that is no big deal, IMO.
Eli
What about when you try to fix the tags of your flac or mp3 files and you don't want to have to put the disc in the drive to get the TOC info? Yes, freedb can do a pretty good job with the number of files and their lengths, etc., but what if you didn't rip the whole album? There is no reason that AF (acoustic fingerprinting) should not be used, and IMHO it should be the basis of any future db.
Fandango
Some sort of identification besides the TOC is definitely needed, because there's NOT only ONE version of an album. Some albums have seen up to half a dozen re-issues or more, both with renewed audio and sometimes a slightly different TOC. Just look at how many different "disc IDs" you get for some albums, all with the same or a similar artist/album; that's somewhat fishy. I'd like to have an extra entry with the record label's official catalog number instead. For a music fan it does matter what issue of an album he has in his hands, especially with all these destructive remastering jobs being done in the past 10 years.
Never_Again
QUOTE(Synthetic Soul @ Mar 10 2006, 05:28 PM)
I think any system is always going to be let down by the fact that there are a lot of inbreds in the world. As anal as you may be, there will be one hundred people who don't know how to use the shift key, and think that "thx" is proper English.

You can build the best database in the world, but someday you have to let the general public at it.  And the general public are a mixed bunch.
*

I was about to reply that your view of the world is too cynical, and then I see this in another recent thread here on HA:

QUOTE(<independently-minded member of general public> @ Mar 5 2006, 11:12 AM)
QUOTE
and then submit that bad data to the AR database.


That is AR's problem, even if AR is the final word in offset detection. They should have thought about it at design time. As a user I need my options and settings open and should be able to do whatever I want to do with them, however wrong or whatever it may be.
*



Eli
There may be a new option on the horizon:
http://www.hydrogenaudio.org/forums/index....showtopic=42381
foosion
Another issue with accessing information using encoded tracks instead of the original CD arises because of data tracks. freedb requires the complete TOC including data tracks, so any query using encoded tracks is bound to fail as the querying software generally lacks knowledge about the presence or size of a data track that might have been on the CD.
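
A rough Python sketch of why that happens: the classic freedb/CDDB disc ID is built from the track start times, the total playing time and the track count, so a TOC guessed from file lengths gives a different ID once a data track is involved. The disc ID formula follows the commonly documented CDDB algorithm; `toc_from_lengths`, the track lengths and the simplified layout (no gaps, and no extra session gap before the data track) are assumptions made for the example.

```python
FRAMES_PER_SECOND = 75
LEAD_IN_FRAMES = 150  # standard 2-second offset before the first track


def digit_sum(n):
    return sum(int(d) for d in str(n))


def cddb_disc_id(offsets, leadout):
    """Classic 32-bit CDDB/freedb disc ID from frame offsets."""
    checksum = sum(digit_sum(o // FRAMES_PER_SECOND) for o in offsets)
    total_seconds = leadout // FRAMES_PER_SECOND - offsets[0] // FRAMES_PER_SECOND
    return ((checksum % 255) << 24) | (total_seconds << 8) | len(offsets)


def toc_from_lengths(track_seconds):
    """Guess a TOC from track lengths, assuming the tracks sit back to back."""
    offsets, position = [], LEAD_IN_FRAMES
    for seconds in track_seconds:
        offsets.append(position)
        position += seconds * FRAMES_PER_SECOND
    return offsets, position  # position is now the guessed lead-out


# Made-up disc: three audio tracks followed by a 60-second data track.
audio = [200, 180, 240]
guess_offsets, guess_leadout = toc_from_lengths(audio)
real_offsets, real_leadout = toc_from_lengths(audio + [60])

print("from files: %08x" % cddb_disc_id(guess_offsets, guess_leadout))
print("real disc:  %08x" % cddb_disc_id(real_offsets, real_leadout))
```

The two IDs differ in all three components (checksum, length and track count), so a lookup built only from the encoded audio files cannot hit the entry that was submitted from the actual disc.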
dummptyhummpty
QUOTE(Fandango @ Mar 10 2006, 12:41 PM)
Hi,

I'm concerned about the current state of FreeDB and CDDB. First of all, the info that can be stored there is rather limited. Second, even if there were a way to add more specific info for a CD, most people wouldn't use it; some people don't even fill in the year info. Also, many redundant entries are created.

Currently freedb/cddb are missing features including (but not limited to) the following:

  • a mechanism to distinguish remastered albums, or albums which otherwise have the exact same TOC. Making replaygain info a standard part of submitted album info would help.
  • distinction between original release date (the year when the first release came out) and reissue/remaster/repackaging/re* release date
  • catalog numbers
  • ...
The new system should be based on the principle of automation, which the founders of cddb most likely didn't think of when they started. But today a lot of music applications are able to either search or submit cddb data, and some of them can even make perfect audio CD copies and know how to calculate replaygain correction values. The technology is all there, but it's not used for making a "correct" CD database.

Does anyone know if there's already a project like this in existence, or does anyone have objections or more ideas?

I think freedb/cddb is very useful: it saves a lot of typing when creating tags. But it's sometimes a real bummer to get your original release labeled as a remastered CD, or to discover mistyped tracks weeks after you've used freedb/cddb.
*



You bring up some good ideas. I don't know too much about freedb, but there should be some sort of standard for the applications that allow data to be added to the database. For example, you should have to specify (maybe in a drop-down list or something) 1 or 2 discs, and then the program would format that in a consistent way so you don't have the (Disc 1)/CD 1/2 problem.
tomars
One thing Gracenote has over freedb is Unicode. If I look up 8 Teeth To Eat You with iTunes (which uses Gracenote) I get the original Japanese; if I look it up with freedb I get only the translation (or whatever "Bura Bura Bushi" is).
spoon
freedb supports UTF-8 Unicode.
tomars
Oh right, I got my info from here: http://en.wikipedia.org/wiki/Freedb
Probably outdated or incorrect
I retract my previous statement :>
MC Escher
The advantage that Musicbrainz has over Freedb is that its database isn't such a mess. Musicbrainz can import CDs from Freedb, but when I add a CD to Musicbrainz I usually just search on Google and Amazon for it, because that often leaves me with better and more information. In Freedb I have to search through 20 almost duplicate albums first. Apart from classical music, Musicbrainz has a very accurate and heavily moderated database.
SebastianG
I'm not sure if this has already been mentioned: the Musicbrainz disc IDs are SHA-1 hashes (160 bit) of the CD TOC, instead of the 32-bit checksums generated by the crappy CDDB algorithm.

So, a lot fewer collisions are to be expected.
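
A short Python sketch of the difference, using two made-up TOCs that differ only in where track 2 starts. The 32-bit formula is the commonly documented CDDB one. The SHA-1 variant hashes every offset in the style of a MusicBrainz disc ID, but its exact field layout here is written from memory and should be treated as an assumption rather than a verified spec.

```python
import base64
import hashlib


def cddb_disc_id(offsets, leadout):
    """Classic 32-bit CDDB/freedb disc ID (frame offsets, 75 frames/second)."""
    checksum = sum(sum(int(d) for d in str(o // 75)) for o in offsets)
    seconds = leadout // 75 - offsets[0] // 75
    return ((checksum % 255) << 24) | (seconds << 8) | len(offsets)


def sha1_disc_id(offsets, leadout, first_track=1):
    """SHA-1 over the whole TOC, in the style of a MusicBrainz disc ID.

    The field layout below is an assumption from memory, not a verified spec.
    """
    last_track = first_track + len(offsets) - 1
    slots = [leadout] + list(offsets) + [0] * (99 - len(offsets))
    text = "%02X%02X" % (first_track, last_track)
    text += "".join("%08X" % frame for frame in slots)
    digest = hashlib.sha1(text.encode("ascii")).digest()
    # Base64 with '+', '/', '=' swapped for '.', '_', '-'.
    encoded = base64.b64encode(digest, altchars=b"._").decode("ascii")
    return encoded.replace("=", "-")


# Two made-up discs: same length and track count, different track 2 start.
toc_a = ([150, 15000, 30000], 45000)
toc_b = ([150, 8250, 30000], 45000)

print("%08x" % cddb_disc_id(*toc_a), "%08x" % cddb_disc_id(*toc_b))
print(sha1_disc_id(*toc_a), sha1_disc_id(*toc_b))
```

Because the 32-bit ID only keeps a digit sum of the start times, the two discs get the same value, while the SHA-1 based IDs differ as soon as a single offset does.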

Sebi
LANjackal
One word: Allmusic.com (AMG). I don't know what the licensing terms for accessing it are, but as far as my experience goes it's the best resource (content and feature-wise) for music metadata I have ever encountered. WMP 10 queries it for album art/tag/rename purposes, don't know if any other programs do - I think Musicmatch does, but I wouldn't swear about it.

I know there's an obsession with free and open stuff around these parts, but that's just my 2 cents.

On the other hand, I find FreeDB's implementation via EAC to be pretty good. I get very few collisions, and those I do get are easily sorted out (usually obvious from the genre tag). I use it only to name CD images, not for actual tagging. The latter job is done by data fetched from AMG as above - beats anything else I've used.
crimsontide
All I need from my CDDB is an album name and artist; artist and track name per track; year of recording; and year of pressing. I don't want or need replaygain info, even if it is based on the raw WAV. That's a local setting for local playback. We'll have no trouble heeeerrrreeeeeee.

Strangely enough CD-Text has been available for years and has never really been used - I've always thought that was a gross misuse of a digital format.

I've had numerous CD writers which claim to write it, and numerous CD players which claim to read it, and yet never once has this info popped up, either from my own burns or from professional CDs which claim CD-Text. And I tried burning from 10 different CD writers when I worked as a tech for a video editing company!! No joy... I must be dumb. If they'd only used this part of the Red Book format then we wouldn't need any of this rubbish in the first place.

650 MB of data and they can't spare 10 KB for track info...?




Andavari
This is a good thread!
QUOTE(Societal Eclipse @ Mar 11 2006, 12:42 AM)
I've sent corrections to FreeDB before for albums that were not correctly entered.  I'm not sure if that just creates a duplicate or someone actually looks at it and fixes the problem.
*


The same here. Although I don't see how an individual could actually go and make corrections: how would they know which submission was valid or invalid? Typos would be an obvious fix, though, and a necessary one, which has left me longing for an automated spell-checking and correction system. Then again, perhaps our CD ripping programs should have the ability to plug into a spell checker of some sort.

I've seen shortened track names, as if someone on DOS had submitted the information, and unfortunately a myriad of duplicates, which always force me to try two or more to see which is the best or most accurate. In some cases, if one is lucky, the only difference is the genre (it seems some people are too picky over something as minuscule as Metal or Heavy Metal, Rock or Hard Rock). Most of the duplicates I've gotten, however, exist because of incorrect data or typos that someone has thankfully corrected, which always has me double-checking the listed track names, etc., against the CD booklet.

One good thing about these online databases, which made me appreciate them that much more, was when I looked up a CD yesterday whose track names I hadn't known for over ten years, due to losing the original CD booklet.

QUOTE(crimsontide @ Mar 14 2006, 07:20 AM)
Strangely enough CD-Text has been available for years and has never really been used - I've always thought that was a gross misuse of a digital format.
*


Yeah, I got a surprise yesterday when I popped in a CD from the late 1990s that had CD-Text. Unfortunately it didn't have complete information such as the year, but nonetheless it was cool, as it's the only CD in my collection that has CD-Text, well, other than the CDs I've burrrn'd.
spoon
Most Sony-released CDs have CD-Text (normally you have to enable the option to read CD-Text in the ripper first; at least you do in mine).

CD-Text is more limiting than freedb in the information present.

>hence how would they know which submission was valid or invalid,

They don't; the last submission overwrites the existing one.
RobH
QUOTE
The latter job is done by data fetched from AMG as above - beats anything else I've used.


LANjackal, how do you "fetch" data from AMG?

Robert