Best lossless codec for the time capsule |
Sep 7 2006, 22:45
Post
#1
|
|
|
Group: Members Posts: 11 Joined: 3-July 02 Member No.: 2457 |
I'd like your thoughts on which lossless codec to use for archiving episodes of a weekly radio show I did from 1995 - 2003. The shows were usually 3 hours long, and usually recorded onto VHS hi-fi, two shows per tape. Some shows were recorded from the FM broadcast, and some were recorded straight from the board, with limited success (signal strength from the board sometimes overloaded the VHS audio inputs). Later shows went smoothly from the board to minidisc which, although compressed, sounded better since the overloading problem was addressed. The minidisc shows were recorded to the computer and are archived that way. So my lossless codec archiving project deals primarily with the earlier shows (over 200 of them) on VHS from radio or direct line.
Uncompressed wavs of each show are usually around 2 GB for the three hours. The plan is to clean up the noise and any dead air, cut into neat 1 hour parts, and make mp3s (whole different forum, I know). But I want to keep the untouched audio I sample right from the VHS, like a photographic negative, as well as the cleaned up version stored separately, for possible future use. This is what I'd like to convert to a lossless codec.
So here's the part I probably could have just skipped straight to: which lossless codec do you think would be the most future-proof? In other words, if my great-great grandkids want to mess around with the audio from G-paw's old college radio show, which format would make that most likely? I don't really need any fancy features for data or volume, etc. Just one big encoded file per show would do, just in case there's ever some WAY better method of noise reduction, or I want to pull out and really tweak a live band performance or a comedy bit. It'd be nice to fit 3 or 4 shows on a 4.7GB single layer DVD data disc, and fit the whole history of the show in one big CD notebook.
I'm leaning toward FLAC or WMA Lossless. FLAC since it seems the most widely used and is open source, and WMA because Microsoft will eventually rule the universe and isn't going anywhere. I realize I could transcode at some point in the future if need be, but I and my non-existent future kids would rather not be bothered with it. Ideas? |
|
|
|
Sep 7 2006, 23:22
Post
#2
|
|
Group: Members (Donating) Posts: 1675 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
MLP. Buy a cheap DVD-A mastering software package and encode your DVD at 16/44. The question isn't if your kids can read the discs, it's how much trouble they're likely to go through.
|
|
|
|
Sep 7 2006, 23:29
Post
#3
|
|
|
Group: Members Posts: 1279 Joined: 13-August 03 Member No.: 8353 |
FLAC, definitely. Because of the future-proof copyright license and the patents it uses or avoids, and because it is already widely used in commercial products.
But on a more general scale, I'd give up any hope that you won't have to "update" a historic archive. Digital libraries have already accepted the fact that archiving in the digital age does not mean "conserving" the medium as well as possible, but keeping the archive constantly up to date. Using FLAC and DVD-Rs might be a good idea for now, but in 10 years those formats will be outdated and you will have to replace them; the same goes for DVD-A of course.
EDIT: Also don't forget that DVDs are a very crappy medium for long-term storage. They tend to "go bad" in only a few years. So error correction software is a must, too. PAR2 for example. With the help of the correction data you can restore the files on damaged or failing DVDs easily. Don't underestimate the risk that is involved with DVD-R! You'll probably break out in tears when you find out after maybe 5 years that your precious recordings are lost because of some tiny errors due to inevitable chemical decay of the DVD-Rs.
This post has been edited by Fandango: Sep 7 2006, 23:35 |
|
|
|
Sep 7 2006, 23:51
Post
#4
|
|
|
Group: Members Posts: 187 Joined: 24-March 06 Member No.: 28803 |
Quote:
FLAC, definitely. Because of the future-proof copyright license and the patents it uses or avoids, and because it is already widely used in commercial products. But on a more general scale, I'd give up any hope that you won't have to "update" a historic archive. Digital libraries have already accepted the fact that archiving in the digital age does not mean "conserving" the medium as well as possible, but keeping the archive constantly up to date. Using FLAC and DVD-Rs might be a good idea for now, but in 10 years those formats will be outdated and you will have to replace them; the same goes for DVD-A of course.

In the far future, all the patents will have expired anyway, so "patent-free" doesn't mean much. Furthermore, by far the most widely used lossless format of the past 15 years is WAV, and considering that data storage costs will be negligible I'd say WAV. We're talking about 1500 hours of music, 900 GB in WAV, 200 DVD-Rs (fit nicely in one of them 200-disc binders). When HD-DVD/Blu-Ray becomes cheap in a year or two, copy to that (20 discs). When UltraSuperDuperHoloDisc comes, you'll probably be down to 1 disc. Or 1 fingernail sized memory chip. Which your dog will eat someday.
This post has been edited by eofor: Sep 7 2006, 23:52 |
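The storage figures in this post can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes CD-quality PCM (16-bit, 44.1 kHz, stereo), a 4.7 GB single-layer DVD-R counted in decimal gigabytes, and perfect packing of files onto discs. The function names `wav_bytes` and `discs_needed` are made up for the example.

```python
import math

# Assumed format: CD-quality PCM (16-bit, 44.1 kHz, stereo).
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2       # 16 bits
CHANNELS = 2
DVD_R_BYTES = 4.7e9        # single-layer DVD-R, decimal gigabytes


def wav_bytes(hours):
    """Approximate size of the raw PCM data, ignoring the small WAV header."""
    return hours * 3600 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def discs_needed(hours, disc_bytes=DVD_R_BYTES):
    """Disc count if files packed perfectly (an optimistic lower bound)."""
    return math.ceil(wav_bytes(hours) / disc_bytes)


if __name__ == "__main__":
    print(f"3-hour show: {wav_bytes(3) / 1e9:.2f} GB")
    print(f"1500 hours:  {wav_bytes(1500) / 1e9:.0f} GB on {discs_needed(1500)} DVD-Rs")
```

Under those assumptions a 3-hour show comes to about 1.9 GB, which matches the "around 2 GB" reported in the first post, and 1500 hours comes to a bit over 950 GB, or roughly 200 discs.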
|
|
|
Sep 7 2006, 23:54
Post
#5
|
|
Group: Members (Donating) Posts: 1675 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
Yeah, WAV has the particularly nice advantage of requiring absolutely no codec support whatsoever, if all you need is the raw data. If you understand arrays of two's complement integers, you can read WAV files.
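As a rough illustration of that point, here is a minimal sketch using only Python's standard `wave` and `struct` modules. The function name `read_pcm16` is invented for the example, and it assumes plain 16-bit PCM. It also loads the whole file into memory, so it is meant for a short clip rather than a 2 GB show.

```python
import struct
import wave


def read_pcm16(path):
    """Return (sample_rate, channels, interleaved samples) from a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # "<h" means little-endian signed 16-bit, i.e. two's complement integers.
    count = len(raw) // 2
    samples = list(struct.unpack(f"<{count}h", raw))
    return rate, channels, samples
```

For stereo files the samples come back interleaved (left, right, left, right, ...), so `samples[0::2]` and `samples[1::2]` give the two channels.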
|
|
|
|
Sep 8 2006, 01:09
Post
#6
|
|
|
Group: Members (Donating) Posts: 611 Joined: 31-May 06 Member No.: 31326 |
Quote:
EDIT: Also don't forget that DVDs are a very crappy medium for long-term storage. They tend to "go bad" in only a few years. So error correction software is a must, too. PAR2 for example. With the help of the correction data you can restore the files on damaged or failing DVDs easily. Don't underestimate the risk that is involved with DVD-R! You'll probably break out in tears when you find out after maybe 5 years that your precious recordings are lost because of some tiny errors due to inevitable chemical decay of the DVD-Rs.

If you must go for optical storage, it would be worth investing some time getting to know dvdisaster to add data checking and recovery to your archives: http://dvdisaster.sourceforge.net/
As it appears very slick now (esp. for a sourceforge.net hosted project!) I suspect that as we move from DVD-R to the next level (Blu-ray/HD-DVD) and beyond, this project will keep up.
Onto the original question:
1. For recording and processing: WAV, just WAV.
2. For archival of the WAV data: FLAC, as it encodes/decodes quickly and you can always convert to and from WAV as necessary.
You might be able to get all the shows onto 25-30 DVD-Rs without dvdisaster error correction, or 30-39 with it, depending on how much correction data you add. I'd recommend multiple copies using multiple DVD-R vendors (pick the top two, ensuring they are sourced from different factories or use different formulations) and/or keeping the FLACs or DVD ISOs stored on a RAID-1 (or higher) array or RAID NAS. Regular spot checks and multiple storage locations as well, if it's important.
-brendan
--------------------
Hacking CD Robots & Autoloaders: http://hyperdiscs.pbwiki.com/
|
|
|
|
Sep 8 2006, 01:23
Post
#7
|
|
Group: Members Posts: 142 Joined: 16-August 05 From: Portland, Oregon Member No.: 23924 |
I agree that FLAC or WAV will be good choices, with a slight edge to WAV due to an utter lack of ambiguity (no codecs, etc.).
While Microsoft will be around for a long time, it is unclear that any of their "standards" will really stick. They are evolving from a company that (sort of) invents things into one that maintains a vast and cumbersome infrastructure - in other words, they are becoming the IBM of old, with a far greater obligation to "keep things working" than to innovate. I honestly don't expect WMA to be significant in 5 more years as this path is followed. Microsoft will have to use others' standards instead of creating their own. In a small way, that is why MP3 is still many times more important than WMA right now. |
|
|
|
Sep 8 2006, 01:28
Post
#8
|
|
|
Group: Members Posts: 1279 Joined: 13-August 03 Member No.: 8353 |
Quote:
In the far future, all the patents will have expired anyway, so "patent-free" doesn't mean much.

In any case he'll have to "re-organise" his archive every once in a while anyway (when the media develop errors or when he upgrades his hardware), so patent issues that leave a format unreadable by then-current software aren't that big of a problem. I think "keeping track of what's going on" and always remembering "that there's this stack of old discs in the basement I have to keep an eye on" is inevitable.
@bhoar: That's an interesting project! They even make use of C1/C2 info and seem to be very ambitious and close-to-the-hardware. While PAR2 has a more basic approach to error correction (and is mainly used by warez dudez on usenet btw), dvdisaster is fortunately designed especially for the insecure (but cheap and handy) optical media... Hm, maybe it will get more widespread use over time. PAR2 is too complicated for widespread use IMHO. But dvdisaster seems to be newbie friendly.
EDIT: Haha, I misread eofor's post... of course now he also makes more sense to me.
This post has been edited by Fandango: Sep 8 2006, 01:32 |
|
|
|
Sep 8 2006, 01:48
Post
#9
|
|
|
FLAC Developer Group: Developer Posts: 1487 Joined: 27-February 02 Member No.: 1408 |
there is no good way to detect errors in uncompressed WAV audio data.
though not as easy to parse as WAV, FLAC is still low complexity, well documented, and the code is everywhere. (none of that is true for wma lossless, you're dependent on microsoft.) and FLAC will save space. you could add an awful lot of error correction and still use less space than plain WAV. Josh |
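FLAC carries its own checksums, which plain WAV lacks. For anyone who stays with WAV, one common workaround is a sidecar checksum file stored next to the audio. The sketch below is only an illustration, using Python's standard `hashlib` and `pathlib`; the helper names and the `checksums.sha256` file name are made up. It can only detect damage, not repair it, so it complements PAR2 or dvdisaster rather than replacing them.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so multi-gigabyte WAVs fit in memory."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(folder, manifest_name="checksums.sha256"):
    """Record one 'digest  filename' line per .wav file in the folder."""
    folder = Path(folder)
    lines = [f"{file_digest(p)}  {p.name}" for p in sorted(folder.glob("*.wav"))]
    manifest = folder / manifest_name
    manifest.write_text("\n".join(lines) + "\n")
    return manifest


def verify_manifest(manifest):
    """Return the names of files that are missing or no longer match."""
    manifest = Path(manifest)
    bad = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        digest, name = line.split("  ", 1)
        target = manifest.parent / name
        if not target.is_file() or file_digest(target) != digest:
            bad.append(name)
    return bad
```

Running `verify_manifest` during a periodic spot check gives a list of files to restore from another copy; an empty list means every recorded file still matches.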
|
|
|
Sep 8 2006, 05:12
Post
#10
|
|
|
Group: Members Posts: 11 Joined: 3-July 02 Member No.: 2457 |
Thanks to all of you! It's nice to have people who are deep into a subject dish out good advice. After thinking on it some more, I think I will just go with the good ol' uncompressed wav files. Many of the later shows are already stored as such on a whole bunch of CD-R's. Also, the more I think about the practicality of digitizing and archiving the at least 350 shows in question, the more I realize I don't need to add another step to the process (encoding to FLAC, etc.). Even using uncompressed wavs gives me 2 full shows per DVD-R, plus the lower bitrate mp3s that I will make in order to create a podcast (which is actually how we distributed the show to several affiliates, even before anyone called it podcasting, from 1999 - 2003). This project will likely take 3 - 4 years if I do a couple of shows a week, so it's gotta be relatively easy.
I appreciate you guys bringing up the longevity of discs, since I hadn't really mulled that over to my satisfaction. After some searching I've determined that there are a million opinions on what discs to use and a billion opinions on how long they'll last. I found a NIST study done in '04 which pointed out that there are differences in stability among brands, but of course they didn't mention brands. I've been using Ridata DVD-R's and Verbatim CD-R's for some time now, after coasters from other brands. Is there a forum devoted to media longevity and quality? I think Fandango is right, I'll just have to remember as technology changes to go spot check the discs, and migrate it all to better media. Eventually, ALL the shows, all my everything probably, will be on some solid state doohickey using holographic recording methods. I'll keep the mp3s I make on a mirrored RAID I use for my music and photo collection, and one day copy all the wavs to some form of RAID as well. Now I just have to go develop a workflow with timed overnight digitizing to wav, minor editing, noise reduction with noiseprints made from any dead air I can find, encoding to mp3 (another forum), and making available to friends and family and anyone curious as a podcast. I like a challenge. |
|
|
|
Sep 8 2006, 06:03
Post
#11
|
|
|
Group: Members (Donating) Posts: 611 Joined: 31-May 06 Member No.: 31326 |
Quote:
... 1. I think I will just go with the good ol' uncompressed wav files ...
2. This project will likely take 3 - 4 years if I do a couple of shows a week, so it's gotta be relatively easy. ...
3. Is there a forum devoted to media longevity and quality? ...
4. Now I just have to go develop a workflow with timed overnight digitizing to wav, minor editing, noise reduction with noiseprints made from any dead air I can find, encoding to mp3 (another forum), and making available to friends and family and anyone curious as a podcast. I like a challenge.

1. If you convert to WAVs, you may want to make it WAV files and CUE files. That way, should you need to convert them to other formats later, you'll have a pretty standard input format for transcoding.
2. I'm in a similar situation: I have a shelf on my wall with approximately 100 DATs with live shows/festivals I've recorded over the years. I do have audio-extracting PC DAT drives. I just don't have a good workflow planned out. Perhaps I'll design one that works well with your "couple shows a week" approach.
3. CDFreaks is the place for medium (DVD/CD disc) quality and longevity discussions: http://club.cdfreaks.com/ ... a warning, it's apparently a hobby unto itself...
4. Do let us know (via posting or wiki) when you've got your workflow designed, I'd love to hear about it.
-brendan
--------------------
Hacking CD Robots & Autoloaders: http://hyperdiscs.pbwiki.com/
|
|
|
|
Sep 8 2006, 06:40
Post
#12
|
|
Group: Members (Donating) Posts: 3453 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
I don't want to add an answer (Flac & WavPack would be mine) but a question: is 16/44.1 the best way to store audio from analog tapes, considering that (future & improved) post-processing is the main purpose of that backup? Wouldn't 20, 24 or even 32 bits offer a better digital material for any processing task?
|
|
|
|
Sep 8 2006, 07:32
Post
#13
|
|
|
WinABX developer Group: Developer Posts: 1572 Joined: 1-October 01 Member No.: 137 |
No, because with 16 bits you are already wasting the last few bits encoding the noise floor of the studio, of the FM demodulated signal or of the analog tape.
For postprocessing the first step is always converting the source material to 32 bit floating point or similar, so the source bitdepth doesn't matter as long as all the useful information is there. And with 16 bit it is, in this case. |
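To show why the source bit depth stops mattering once the data is in floating point, here is a small sketch, assuming plain Python with the standard `array` module (the function names are invented for the example). Every 16-bit value maps exactly onto a 32-bit float, so nothing is lost by storing 16-bit and converting at processing time.

```python
from array import array


def int16_to_float(samples):
    """Map 16-bit integers in [-32768, 32767] to 32-bit floats in [-1.0, 1.0)."""
    return array("f", (s / 32768.0 for s in samples))


def float_to_int16(samples):
    """Round back to 16-bit with clipping (no dither in this sketch)."""
    return [max(-32768, min(32767, round(x * 32768.0))) for x in samples]
```

A round trip through `int16_to_float` and `float_to_int16` returns the original integers unchanged, because each 16-bit value divided by 32768 fits within a float's 24-bit mantissa. A real workflow would usually dither when going back down to 16 bits after processing, which this sketch leaves out.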
|
|
|
Sep 8 2006, 07:45
Post
#14
|
|
|
Group: Super Moderator Posts: 4795 Joined: 1-April 04 Member No.: 13167 |
If much of the content is monaural you will get considerable space savings going with FLAC.
|
|
|
|
Sep 8 2006, 20:36
Post
#15
|
|
Group: Members (Donating) Posts: 3453 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
Quote:
For postprocessing the first step is always converting the source material to 32 bit floating point or similar, so the source bitdepth doesn't matter as long as all the useful information is there. And with 16 bit it is, in this case.

Thank you for this tip. I didn't imagine that upsampling was useful, even for processing purposes. |
|
|
|
Sep 9 2006, 13:59
Post
#16
|
|
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
For conservation of archives, I would advise you not to apply ANY kind of processing, not even noise reduction.
You should keep the data compressed with a lossless encoder, encoded at the highest resolution/sampling rate available.
Why not apply noise reduction? You do not know how processing abilities will evolve in the future. By applying noise reduction now, you might lose quality compared to what could be achieved in 10 years.
You should also use a codec with publicly available specifications, as it is a bad idea to rely on a specific company still supporting a given codec in 20 years.
Regarding the medium, my advice would be to use DVD, with either PAR2 data or processed through dvdisaster. Every 2 years, you should also scan your DVDs and re-burn them if necessary. It is unlikely that they will deteriorate enough in 2 years to not be fully recoverable if you used some kind of parity protection, but I would not trust current consumer media over a longer time period (more than 5 years).
edit: those are the usual archiving processes in French national institutions (IRCAM and INA)
This post has been edited by Gabriel: Sep 9 2006, 14:01 |
|
|
|
Sep 9 2006, 14:38
Post
#17
|
|
|
Group: Members Posts: 304 Joined: 7-February 05 From: Local Cluster Member No.: 19647 |
not to be totally weird, here.. but what about just buying a cheap, reasonably fast pc, putting in 3 or 4 320gb seagate discs that cost around $90 each and an audio card with digital out, and just leaving that pc offline as a whole 95% of the time? granted, it'll be a tad more expensive than buying DVD-Rs, but the data shouldn't deteriorate, the pc shouldn't break when it's never on, and the digital out should allow you to output it even if you leave the pc offline for the next 10-15 years..
no trouble with outdated codecs that you can't run on your current platform, since you're still running the old one.. if money isn't an issue it would be my choice, anyway.. |
|
|
|
Sep 9 2006, 23:56
Post
#18
|
|
|
Group: Members Posts: 11 Joined: 3-July 02 Member No.: 2457 |
Compared to some other a/v related forums I visit, this one actually has people with information and reasonable opinions! Don't ever take that for granted.
bhoar: Sadly, I know nothing about CUE files, or using "wiki" as a verb. I have some research to do. I'll post regarding my workflow once I develop it. Or maybe I'll figure out how to wiki. Getting off topic... Thanks for the cdfreaks.com link. They are indeed freaks. After a good bit of reading, it seems Taiyo Yuden is a favorite blank DVD (and CD), but I can find no SCIENTIFIC study to back it up, just lots of anecdotal evidence in the form of DVD Info error graphs and the like. People like Verbatim too. I have so far had no problems with RiDATA (Ritek) bought from newegg.com over the past couple of years. I did find one bit of science in the form of a NIST study done on topic, showing that one brand in particular did better than the others, but of course they released no brand names, and that was 2004. And I agree with those discussing the 16 bit / 44.1 kHz sample rate. It's enough. If you heard the tapes I am dealing with here, you'd agree even more. We're talkin' college station with a stereo 100w transmitter that sometimes sounded quite good, and sometimes was distorted mono. With the limited frequency response, dynamic range, and S/N ratio, I'm all set with my JVC S-VHS HR-S9911U, Panasonic SA-HE200 receiver, and Audigy 2 soundcard. I'll be keeping the untouched wav's AND the edited ones on DVD-R, since DVD-R's are cheap. I'll put mp3 versions on my 160GB mirrored RAID. If the DVDs will last even a modest 5 years, I'll copy all of it to a terabyte RAID, or whatever my budget and the state of technology allows. Maybe then I'll move it to Blu-Rays for offline backup for a safe deposit box if I'm really feeling paranoid. I think I'll FLAC (sounds good as a verb) the untouched wavs. Portions of the show, like the band interview and some older band performances, were mono (except for the background static), so FLAC may get me down to 50% of the original overall. That'd be 4 shows per DVD-R. I like the idea of PAR files a lot. 
Any recommendations for where to go to learn how? |
|
|
|
Sep 10 2006, 00:33
Post
#19
|
|
Group: Members Posts: 913 Joined: 10-January 05 Member No.: 18979 |
I hope you don't end up putting all your eggs in one basket. RAID is not really a proper archive solution (Yes, there is higher fault tolerance, but it's more about constant availability (uptime of servers), or spanning multiple drives (capacity)). I'd rather store critical data on hard drives in a couple safe deposit boxes in different locations than RAID them.
|
|
|
|
Sep 11 2006, 17:46
Post
#20
|
|
|
Group: Members Posts: 187 Joined: 24-March 06 Member No.: 28803 |
Quote:
I hope you don't end up putting all your eggs in one basket. RAID is not really a proper archive solution (Yes, there is higher fault tolerance, but it's more about constant availability (uptime of servers), or spanning multiple drives (capacity)). I'd rather store critical data on hard drives in a couple safe deposit boxes in different locations than RAID them.

Also, one fire or flood will wipe out the data, however well it's mirrored. So: at least two physical locations for your archive. |
|
|
|
Sep 12 2006, 07:00
Post
#21
|
|
|
Group: Members Posts: 11 Joined: 3-July 02 Member No.: 2457 |
Thanks for the concern. I promise a RAID is, and will only ever be, for convenience on the network. There will always be an offline backup. However, I do at least need to not be lazy about things and put an offline backup somewhere else besides the same house with the RAID. I'll stick one over at dear ol' Mom's house, and maybe when I can afford it, put a hard drive copy in a safe deposit box. Then maybe launch one into space... but then you have micrometeorites... and radiation...
|
|
|
|