Backup on CD-R & Error recovery data
Hydrogenaudio Forums > CD-R and Audio Hardware > CD Hardware/Software
tigre
Hi there!

I've read most threads on similar topics - There's RAR and PAR. Both have advantages and disadvantages. I'd like to know if there's something else that suits my needs better. Let me explain:

I want to save some of my compressed audio files to CD-Rs and delete them from my HDD. I'm aware that CD-Rs can die within a few years, so I want to add some kind of error recovery data (ERD).

Option 1: Put all files into an (uncompressed) RAR archive, adding as much recovery info as possible (limit: space on CD-R)
Pros:
+ Random errors can be corrected; the more ERD is added, the more errors are tolerated
+ Filename length is not a problem (not an issue for me).
Cons:
- Accessibility: Only a few players can play back files out of RAR archives; standalone players can't at all. This point is the most important for me, and it will probably make me stay away from RAR.
- Security: I don't know of any tests showing how good RAR's error recovery abilities really are (and how near they are to a theoretical limit - I guess one exists), or whether they're any good for dying CDs (information loss starts from the end of the archive / the files that were recorded last).
- Maybe I'm wrong, but I think there are two possibilities: 1. everything is recovered correctly, 2. everything's lost.

Option 2: Adding as many PAR files as fit on the CD-R
Pros:
+ Audio files are accessible directly
+ Security (= I have control/knowledge of what happens): I know how many files can get corrupted before something is lost. Even if some files can't be recovered anymore, a lot of others remain accessible.
Cons:
- Long filenames are cut - never mind.
- Correct me if I'm wrong, please: the way I understand it, if a PAR file contains some (even only very few) random errors, it can't be used anymore. So it seems the PAR solution can be screwed completely by far fewer random errors (distributed over all files) than the RAR solution.

I hope this gives an impression of what I want. My questions:

1. If some of my assumptions are wrong, please correct me.
2. Is there another way of using error recovery data that combines the pros of both options and avoids the cons?
3. Do you think it's a good idea to create a set of PAR files (maybe filling half of the available space), split them into very small files (maybe a few kByte each, so that random errors won't be that bad anymore) and create another set of PAR files for the small split files?
4. Do PAR files contain a built-in integrity check or is some extra CRC/MD5 checking recommendable?

Thanks.

Cheers tigre
WaldoMonster
PAR can recover as many files as there are PAR files.
It can recover damaged and even deleted or lost files.
A par file is a few % larger than the largest recoverable file in the same directory.

An example:

user posted image

In this example only music files are added to the PAR set.
There are 2 PAR files, .p01 and .p02, which means you can completely delete 2 files and still recover them. The size of each PAR file is a little bit larger than "15 - Woman in chains Ft. Roland Orzabal.mpc"
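
To make the "delete a file and get it back" idea concrete, here is a minimal sketch in Python. It is an assumption-laden simplification: it uses plain XOR parity, which can rebuild only one missing file, whereas PAR clients use Reed-Solomon coding so that N PAR files cover N missing files. The function names are made up for this illustration.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of all blocks, zero-padded to the longest one."""
    size = max(len(b) for b in blocks)
    padded = [b.ljust(size, b"\x00") for b in blocks]
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*padded))


def recover_missing(surviving, parity, original_length):
    """Rebuild the single missing block from the survivors and the parity block."""
    rebuilt = xor_parity(list(surviving) + [parity])
    return rebuilt[:original_length]


# Stand-ins for three audio files (hypothetical contents).
files = [b"track one", b"track two, longer", b"track 3"]
parity = xor_parity(files)  # as long as the longest file

# Pretend files[1] was deleted, then rebuild it from the rest plus parity.
restored = recover_missing([files[0], files[2]], parity, len(files[1]))
print(restored == files[1])
```

Note that the parity block comes out as long as the longest input, which matches the observation that a PAR file is about the size of the largest file in the set.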


EDIT:

FSRaid is a good free Windows-based program.
http://www.fluidstudios.com/fsraid.html

My experience is that the DOS version is not 100% compatible with FSRaid or the other way around. The problem happens with strange characters.
bodhy
Hi, I have a doubt...
As far as I understood...

Suppose you have 10 files (MP3 or whatever) and 1 PAR file.
If 2 of your 10 files get corrupted, you won't be able to repair them because you have only one PAR file?

But if you have RAR with RR 5%, you can recover up to 5% random errors in the RAR file, no?

If this is the case, IMO RAR is the better method...
If you have a corrupted medium, it's very possible that you'll have more than 1 file damaged...
or not?

Best regards.
b:.
tigre
Thanks for your input so far. I've searched and read a lot since I started the thread. Here's what I found out:

PAR 1.xx works the way I suspected:
- Once one of the files (source files or PAR files) is changed, it becomes completely useless and recovery has to be done with the remaining files. If there aren't enough unchanged files left, everything is screwed (apart from the source files that are unchanged).
- The PAR files are a little bit longer than the longest source file, so it's good if all source files have the same size.

PAR 2.xx is improved in many points, but there's no stable software available so far that uses this "standard".

In the SourceForge PAR forums I found several recommendations to split big files before creating PAR files (and the statement that with PAR 2.xx this won't be necessary anymore).

I still haven't found anything else besides RAR and PAR.

The solution I'm thinking of so far is this:

1. All tracks of an album are split to 100 kB files using Easy MD5 Creator.
2. All small files (.log, .md5 ...) are moved to an archive file, e.g. 7-zip format.
3. "PAR 1st layer": For each folder containing the split tracks of an album (+small files archive) a set of PAR files is created; their number needs to be calculated in a way that there'll be still some space left on CD-R.
4. "PAR 2nd layer": All PAR files (one independent set for each folder/album) are put in one folder and another set of PAR files (2nd layer) is created.
5. The unchanged folders (before step 1) + a folder containing all PAR files and the small files archives are burned to disk. The PAR folder should be recorded at the end of the session (or in a 2nd session).
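
Steps 1 and 3 depend on the split being exactly reproducible, since the same chunks have to be regenerated at recovery time. A rough Python sketch of such a split follows. It is not how Easy MD5 Creator works internally; the chunk naming scheme, the function names and the reading of "100 kB" as 100 KiB are all assumptions made for this example.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 100 * 1024  # assumption: "100 kB" taken as 100 KiB


def split_file(source, out_dir, chunk_size=CHUNK_SIZE):
    """Split source into numbered chunk files; return [(chunk_path, md5_hex), ...]."""
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    with source.open("rb") as handle:
        index = 0
        while True:
            data = handle.read(chunk_size)
            if not data:
                break
            # Hypothetical naming scheme: original name plus a 4-digit index.
            chunk_path = out_dir / f"{source.name}.{index:04d}"
            chunk_path.write_bytes(data)
            results.append((chunk_path, hashlib.md5(data).hexdigest()))
            index += 1
    return results


def join_chunks(chunk_paths, target):
    """Concatenate chunk files in the given order back into one file."""
    with Path(target).open("wb") as out:
        for path in chunk_paths:
            out.write(Path(path).read_bytes())
```

Because the chunk boundaries depend only on the chunk size, running split_file again on an extracted (possibly damaged) track yields chunks that line up with the ones the PAR set was built from.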

Recovery:
1. Some software like CD Data Rescue is needed to get as much data as possible.
2. Hopefully all 1st layer PAR files can be recovered using 2nd layer
3. The extracted audio files of course need to be split the same way as before to perform 1st layer recovery.


BTW:
QUOTE(bodhy @ Apr 12 2003 - 11:30 AM)
But if you have RAR with RR 5%, you can recover up to 5% random errors in the RAR file, no?

I'd say it's rather like this: With RR 5% there's a number (or percentage) of random errors that can be corrected for sure, and a number (or percentage) that makes correction impossible for sure (something below 100% :P ). In the range between these numbers it's statistics. Something like "With RR 5% the probability that 5% random errors can be corrected is 20%" (This example is just a guess!)
floyd
The big problem with using PAR without RAR first is that if you have one big song (Pink Floyd - Echoes from Meddle, for example), then every PAR needs to be at least as big as this song. This wastes tons of space, especially with lossless stuff (imagine every PAR = ~70 MB, not very useful).

I can't say if your multi-layer PAR system will work, but here is my system:

Start with 400-450 MB of audio data. Split to 5 MB, uncompressed RAR chunks, no recovery record. Fill the remaining space on the disk (250-300 MB) with 5 MB PAR chunks. With about 50 PAR chunks and around 100 RAR chunks, the odds of the disk developing bad sectors beyond the threshold of the PARs are very, very small (especially with good storage).
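
To put a rough number on "very, very small", here is a back-of-the-envelope sketch in Python. It assumes every chunk (RAR or PAR) goes bad independently with the same probability, which is a strong assumption: real disc decay tends to cluster (e.g. towards the outer edge), so treat the result as optimistic. The function name and the failure probabilities are made up for illustration.

```python
from math import comb


def prob_unrecoverable(total_chunks, parity_chunks, p_bad):
    """Probability that more than parity_chunks of total_chunks are damaged,
    assuming each chunk fails independently with probability p_bad."""
    return sum(
        comb(total_chunks, k) * p_bad**k * (1 - p_bad) ** (total_chunks - k)
        for k in range(parity_chunks + 1, total_chunks + 1)
    )


# Layout from above: about 100 RAR chunks plus about 50 PAR chunks.
for p_bad in (0.05, 0.10, 0.20):
    print(p_bad, prob_unrecoverable(150, 50, p_bad))
```

The set is lost only when more chunks are damaged than there are PAR chunks, which is why the sum starts at parity_chunks + 1.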

If you wanted to be more paranoid you could split to even smaller chunks only limited by your sanity. Personally I'm not totally confident that PAR works ok with 1000s of 100k chunks, but it probably does.
de Mon
How do I create .PAR files?
madah
For personal backups (stuff that I've created myself etc) I compress with WinRAR and split to 20 MB archives (with recovery record), then create as many .par-files as necessary to fill up the entire cd (usually 700 MB).
If you are really paranoid then this should be the best method!

PAR v2 offers many improvements over PAR v1. If I understand correctly, you could create a 64 KB .par2 file and it could recover 64 KB of data from any of the other files.

For example: if file1.mp3 has 60 KB of bad data (like bad sectors) and file2.mp3 has 4 KB of bad data, then both of them could be recovered using this 64 KB .par2 file!
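
A small Python sketch of that bookkeeping, under my understanding that PAR v2 repairs whole blocks and needs one recovery block per damaged block. The 4 KB block size, the function names and the damage offsets are assumptions for illustration, and header overhead inside the .par2 file is ignored.

```python
def damaged_blocks(damage_ranges, block_size):
    """Return the set of (file, block_index) pairs touched by the damaged byte ranges.

    damage_ranges: iterable of (file_name, start_offset, length) tuples.
    """
    blocks = set()
    for name, start, length in damage_ranges:
        if length <= 0:
            continue
        first = start // block_size
        last = (start + length - 1) // block_size
        blocks.update((name, i) for i in range(first, last + 1))
    return blocks


def is_recoverable(damage_ranges, block_size, recovery_blocks):
    """True if there are at least as many recovery blocks as damaged blocks."""
    return len(damaged_blocks(damage_ranges, block_size)) <= recovery_blocks


KB = 1024
block_size = 4 * KB                      # hypothetical block size
recovery_blocks = 64 * KB // block_size  # a 64 KB recovery file, ignoring headers

damage = [("file1.mp3", 0, 60 * KB), ("file2.mp3", 8 * KB, 4 * KB)]
print(is_recoverable(damage, block_size, recovery_blocks))
```

One consequence of counting blocks rather than bytes: damage that straddles a block boundary costs two recovery blocks, so 64 KB of recovery data covers 64 KB of damage only when the bad regions happen to line up with block boundaries.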

QUOTE
4. Do PAR files contain a built-in integrity check or is some extra CRC/MD5 checking recommendable?

PAR v1.x stores an MD5 of the first 16 KB of data (I believe it's for fast file searching), an MD5 of the entire file, the file size (as a 64-bit integer) and the filenames in Unicode. It also has an MD5 of itself (the .par), so I believe it's the best integrity format available!
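
For anyone who wants the same kind of check without a PAR client, the fields described above can be computed with Python's hashlib. This is only a sketch of the idea, not the PAR file format itself; the function names and the tuple layout are my own.

```python
import hashlib

HEAD_SIZE = 16 * 1024  # first 16 KB, as described above


def file_fingerprint(path, buffer_size=1024 * 1024):
    """Return (md5 of first 16 KB, md5 of whole file, size in bytes)."""
    full = hashlib.md5()
    size = 0
    with open(path, "rb") as handle:
        head_bytes = handle.read(HEAD_SIZE)
        head = hashlib.md5(head_bytes).hexdigest()
        full.update(head_bytes)
        size += len(head_bytes)
        while True:
            data = handle.read(buffer_size)
            if not data:
                break
            full.update(data)
            size += len(data)
    return head, full.hexdigest(), size


def is_unchanged(path, expected):
    """Compare a file against a fingerprint stored when the disc was burned."""
    return file_fingerprint(path) == expected
```

The short hash is cheap to compute and is enough to identify a file quickly; only the full-file hash can confirm that nothing after the first 16 KB has changed.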

QUOTE
My experience is that the DOS version is not 100% compatible with FSRaid or the other way around. The problem happens with strange characters.

This is because filenames are stored in Unicode, and the DOS version certainly cannot support Unicode. Most Windows clients that operate on PAR files don't support Unicode either.
I've posted this bug to the PAR-forums here.

QUOTE
Personally I'm not totally confident that PAR works ok with 1000s of 100k chunks, but it probably does.

PAR v1 has a limit of max 255 files; with PAR v2 the limit is 65535. Here is some explanation of why these limits exist.
floyd
hmm, PAR2 is interesting.. What programs are out there supporting it?
tigre
QUOTE(de Mon @ Apr 12 2003 - 02:47 PM)
How do I create .PAR files?
I use SmartPAR. It's listed and linked among other clients on Parchive homepage:
http://parchive.sourceforge.net/
PAR 2 specs can be found there too.

@floyd: No programs supporting PAR 2 so far. From Parchive homepage - announcements:
QUOTE
09.03.2002 - Draft Implementation Spec of PAR v2.0 Available!

Big news! PAR is moving ahead with some ideas spawned from the weaknesses of v1.0. Available in the project documents is a reference implementation spec for developers of clients to test out. Included in the improvements is the shift to a 'packetized' binary file format, recovery down to the article level, and room inside the spec for customized extensions.
Parity Volume Set Spec. 2.0 [2002-09-12] (Pre-Reference Implementation)

Latest announcement available, but > 1/2 year old...

@madah: Thanks for the valuable information you provided. :)

Does anyone know or can point to details about the way files are stored on data CD-Rs? Can there be really random errors or does data become corrupt in patterns (e.g. whole unreadable/bad "blocks" - sectors?) due to error detection/correction layers used?
theduke
QUOTE(tigre @ Apr 13 2003 - 10:58 AM)
Does anyone know or can point to details about the way files are stored on data CD-Rs? Can there be really random errors or does data become corrupt in patterns (e.g. whole unreadable/bad "blocks" - sectors?) due to error detection/correction layers used?

One of the methods used to provide security is to scramble the data before recording, so that you don't have continuous data written in a sequence. Therefore the probability is much higher that you get random errors than burst errors. If a block is corrupt, I think you can still recover some info out of it, it's only not equal to the original data.
floyd
Don't data CDs either read the exact data or not at all (if damaged or corrupt)? That was my understanding.
tigre
At the Windows level this is true AFAIK. But if you use software that doesn't care about these limitations (e.g. CD Data Rescue - it reads bad positions at reduced speed several times and tries to get identical results for statistical verification, similar to EAC), you can "extract" everything you want - errors are possible that way.
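
The "read several times and compare" idea can be sketched in Python like this. It is not the actual algorithm of CD Data Rescue or EAC, just an illustration of majority voting over repeated reads; read_sector is a hypothetical callable standing in for whatever low-level read the software performs, and the attempt counts are arbitrary.

```python
from collections import Counter


def read_with_voting(read_sector, sector, attempts=5, min_agreement=3):
    """Read a sector several times; return the most common result if it
    appears at least min_agreement times, else None."""
    results = []
    for _ in range(attempts):
        data = read_sector(sector)  # hypothetical callable: bytes, or None on failure
        if data is not None:
            results.append(data)
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= min_agreement else None
```

A result that wins the vote is still only probably correct, which is why an independent check (MD5, PAR) on the extracted file remains useful.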
_io_
Just a quick question.
If you're storing your files in RAR archives, then why use PAR files and not WinRAR's recovery volume feature, which from what I can see does the exact same thing without the need for an extra program?

_io_
WaldoMonster
The first PAR 2 command line is available here:

http://sourceforge.net/project/showfiles.p...lease_id=150791