QUOTE (DVDdoug @ Feb 6 2008, 21:06)

QUOTE (knutinh @ Feb 5 2008, 01:11)

QUOTE
Now in order to completely automate this process, I think you would need three copies of the record (and some special software). Wherever two (or 3) samples agree, this "data" is good. Where one sample is different from the other two, the data is bad.
I don't think so. Correlated noise/degradation cannot be removed this way in any case. Non-correlated noise can be reduced by simply taking the average of 2, 3 (or generally N) records. By adding N such tracks, you get an SNR of N*signal/(sqrt(N)*noise), i.e. sqrt(N) times the SNR of a single track. By taking the average of two records, _given that all noise is non-correlated_, we get an improvement of sqrt(2) in SNR, or 3 dB.
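As a rough sanity check of the sqrt(N) figure, here is a minimal Python sketch. It assumes a sine test signal with independent Gaussian noise added to each copy; the function names and parameters are made up for illustration and say nothing about real vinyl transfers.

```python
import math
import random

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def snr_gain_db(n_copies, n_samples=20000, noise_sigma=0.1, seed=0):
    """Noise reduction (in dB) from averaging n_copies noisy copies of one signal.

    Assumption: the noise in each copy is independent (non-correlated) Gaussian.
    """
    rng = random.Random(seed)
    signal = [math.sin(2 * math.pi * 5 * i / n_samples) for i in range(n_samples)]
    copies = [[s + rng.gauss(0.0, noise_sigma) for s in signal]
              for _ in range(n_copies)]
    averaged = [sum(column) / n_copies for column in zip(*copies)]
    noise_single = rms([c - s for c, s in zip(copies[0], signal)])
    noise_avg = rms([a - s for a, s in zip(averaged, signal)])
    return 20 * math.log10(noise_single / noise_avg)
```

Calling `snr_gain_db(2)` should come out close to 3 dB and `snr_gain_db(4)` close to 6 dB, as long as the noise really is non-correlated.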
Now, there might be other (possibly better) methods of comparing N records, median being mentioned.
-k
NOTE - I'm not saying this will work with real vinyl transfers in the "real world".
But sure, you can find and remove the noise (error) with only 3 records/files, if the noise is infrequent enough that it occurs on only one recording at any point in time... You don't have to do any averaging... You just "throw out" the bad data. A digital audio file is just a series of integers. Look at the following 3 series of numbers and you can easily see the mismatched data, and it's equally easy to make a new, corrected series:
File1   File2   File3   Corrected
1000    1000    1000    1000
1024    1024    1023    1024
1111    1111    1111    1111
1230    1234    1234    1234
1005    1003    1003    1003
1997    1997    1897    1997
1500    1500    1500    1500
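The table can be reproduced with a per-sample median across the three files. This is only a sketch: it assumes the three lists are already perfectly sample-aligned, and the helper name `vote` is made up for illustration.

```python
from statistics import median

def vote(*files):
    """Per-sample median across an odd number of aligned sample lists."""
    return [median(samples) for samples in zip(*files)]

file1 = [1000, 1024, 1111, 1230, 1005, 1997, 1500]
file2 = [1000, 1024, 1111, 1234, 1003, 1997, 1500]
file3 = [1000, 1023, 1111, 1234, 1003, 1897, 1500]

# With three inputs the median is the value that at least two files agree on,
# provided at most one file is wrong at any given sample.
corrected = vote(file1, file2, file3)
print(corrected)
```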
I am well aware of the idea that you suggest, but you said that 3 records were necessary. As my post shows, this is wrong: improvement can (in our theoretical sandbox at least) be achieved with only 2 tracks, or any number above that.
As I also said, the plain averaging that I suggested can conceivably be improved, especially if one has knowledge of the error mechanisms in the medium. I mentioned the median, and the table you drew is an example of a median (or mode) operation. Of course, the median is different from the mean only for 3 inputs or more.
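A quick illustration of that last point, using one row of the table (a sketch only; the variable names are mine):

```python
from statistics import mean, median

pair = [1230, 1234]          # two renditions of one sample
triple = [1230, 1234, 1234]  # three renditions of the same sample

# With two inputs the median is just the midpoint, i.e. the mean.
two_same = median(pair) == mean(pair)
# With three inputs the median picks the agreeing value; the mean does not.
three_differ = median(triple) != mean(triple)
print(two_same, three_differ)
```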
As has been suggested by others, discussing techniques for comparing N subsample-aligned renditions of a record is a "luxury issue". The real problem is how to do continuous subsample alignment; separating jitter/fluctuations from noise involves a difficult trade-off.
I think that an intuitive approach would be something like:
1. Calculating the cross-correlation between blocks of track B and track A.
2. Upsampling track A by 8x or 128x; call the result A'.
3. Using the peak of the cross-correlation as a starting point for every sample in blocks of B, and finding the smallest subsample offsets into A' that minimize the difference (i.e. the +/- offset where the error changes polarity).
4. Now you have a sample-by-sample offset vector that explains ALL differences as time-variation. Do a frequency analysis and decide what is "true" time-variation and what is noise. The threshold would be device/media-dependent.
5. Now you have a crude separation of time-variation and noise, so eliminating the noise becomes possible.
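A very crude Python sketch of the steps above, with several simplifications that are my assumptions rather than part of the proposal: offsets are estimated per block instead of per sample, steps 1 to 3 are collapsed into a brute-force search for the offset with the smallest squared difference, linear interpolation stands in for proper band-limited upsampling, and a moving average stands in for the frequency analysis in step 4. All names and defaults are hypothetical.

```python
def interp(a, t):
    """Linearly interpolate sequence a at fractional index t (clamped to the ends)."""
    t = min(max(t, 0.0), len(a) - 1.0)
    i = min(int(t), len(a) - 2)
    frac = t - i
    return a[i] * (1.0 - frac) + a[i + 1] * frac

def block_offsets(a, b, block=256, max_lag=4, upsample=8):
    """For each block of b, find the offset into a (in samples, with a
    resolution of 1/upsample) that minimises the squared difference."""
    offsets = []
    for start in range(0, len(b) - block + 1, block):
        best_err, best_off = None, 0.0
        for k in range(-max_lag * upsample, max_lag * upsample + 1):
            off = k / upsample
            err = sum((b[start + n] - interp(a, start + n + off)) ** 2
                      for n in range(block))
            if best_err is None or err < best_err:
                best_err, best_off = err, off
        offsets.append(best_off)
    return offsets

def smooth(values, width=5):
    """Moving average: keeps slow drift, suppresses fast changes."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out
```

The smoothed offsets would be treated as the "true" time-variation. Whatever difference remains after resampling A by those offsets is then treated as noise, which can be handled by averaging or a median as discussed earlier.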
Low-frequency time alignment should be very accurate (perhaps well into subsample accuracy). High-frequency time alignment should not be that accurate, since that would model noise as time-variation (not something we want). Therefore, I am suggesting accurate time-tracking with suppression of fast changes.
-k