I have followed up on the transient idea, and the code base is now several times the size of the first attempt. The general strategy is to locate distinctive positions in both samples and then use heuristics to synchronize the two.
In a first iteration I have used transient steepness as the distinguishing attribute. The algorithm compiles a list of the 100 largest absolute differences between consecutive samples. This list is then used in a second step to calculate an offset.
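A minimal sketch of how such an offset step could look, assuming a simple voting heuristic: every pair of (original position, target position) votes for the offset between them, and the offset with the most votes wins. The class name OffsetVote and the method estimateOffset are made up for illustration and are not part of the code further down.

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetVote {

    /**
     * Returns the offset (target position minus original position) that
     * occurs most often among all pairs of positions. On a tie, the offset
     * that reached the winning count first is returned.
     */
    public static int estimateOffset(int[] originalPos, int[] targetPos) {
        Map<Integer, Integer> votes = new HashMap<>();
        int bestOffset = 0;
        int bestVotes = 0;
        for (int o : originalPos) {
            for (int t : targetPos) {
                int offset = t - o;
                // merge() returns the updated vote count for this offset
                int v = votes.merge(offset, 1, Integer::sum);
                if (v > bestVotes) {
                    bestVotes = v;
                    bestOffset = offset;
                }
            }
        }
        return bestOffset;
    }
}
```

With 100 positions per list this is only 10,000 pairs, so the brute-force loop is cheap. It can only work if a fair share of the transients show up in both lists.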
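Again, it works on test samples but not on real-world data. Comparing the tec_sqam_40a_bwf_tcm6-12548.wav sample with its LAME encoding shows that LAME completely flattens all transients, or at least spreads them over several frames. In the output below, the original's steepest deltas have magnitudes of roughly 8,500 to 11,700, while the target's reach only about 2,000 and lie in a completely different part of the file.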
These are the results:
CODE
Original's transients:
Delta: -8464 Pos: 58130
Delta: -8484 Pos: 118488
Delta: -8514 Pos: 118470
Delta: 8519 Pos: 58102
Delta: -8650 Pos: 58100
Delta: -8662 Pos: 135541
Delta: 8806 Pos: 118466
Delta: 8977 Pos: 118433
Delta: 9136 Pos: 173506
Delta: 9336 Pos: 118471
Delta: 9343 Pos: 58129
Delta: -9686 Pos: 118434
Delta: 9899 Pos: 58081
Delta: 9978 Pos: 58111
Delta: -10326 Pos: 118483
Delta: -10337 Pos: 135534
Delta: -10370 Pos: 58082
Delta: 10405 Pos: 58078
Delta: 11096 Pos: 118482
Delta: 11745 Pos: 58090
Target's transients:
Delta: 1975 Pos: 343414
Delta: 1978 Pos: 343539
Delta: 1978 Pos: 345801
Delta: 1979 Pos: 324576
Delta: 1981 Pos: 336255
Delta: 1982 Pos: 325832
Delta: 1984 Pos: 343916
Delta: 1990 Pos: 333744
Delta: 1994 Pos: 332739
Delta: 1997 Pos: 331358
Delta: 1997 Pos: 333870
Delta: 1997 Pos: 343288
Delta: 2000 Pos: 331734
Delta: 2004 Pos: 327339
Delta: 2006 Pos: 332614
Delta: 2016 Pos: 331735
Delta: 2017 Pos: 325706
Delta: 2019 Pos: 332488
Delta: 2034 Pos: 345424
Delta: 2035 Pos: 331609
This is getting a little tiring all by myself. Please share any ideas you have on how it can be improved.
This is the code of the fingerprinting class:
CODE
import java.util.SortedSet;
import java.util.TreeSet;

/**
 * Collects the steepest sample-to-sample deltas ("transients") of two
 * audio streams so that their positions can be compared afterwards.
 *
 * @author rpp3po
 */
public class Fingerprinter {

    /** Number of transients kept per stream. */
    private static final int transientCount = 100;

    private final SortedSet<TransientPos> originalTransients;
    private final SortedSet<TransientPos> targetTransients;
    // Previous sample of each stream; the very first delta is taken against 0.
    private int lastOriginalSample = 0;
    private int lastTargetSample = 0;

    public Fingerprinter() {
        originalTransients = new TreeSet<TransientPos>();
        targetTransients = new TreeSet<TransientPos>();
        init(originalTransients);
        init(targetTransients);
    }

    public void addOriginalSample(int sample, int position) {
        TransientPos tp = new TransientPos(sample - lastOriginalSample, position);
        addDelta(originalTransients, tp);
        lastOriginalSample = sample;
    }

    public void addTargetSample(int sample, int position) {
        TransientPos tp = new TransientPos(sample - lastTargetSample, position);
        addDelta(targetTransients, tp);
        lastTargetSample = sample;
    }

    /** Swaps tp in for the weakest stored entry if tp is steeper. */
    private static void addDelta(SortedSet<TransientPos> set, TransientPos tp) {
        // Only evict the weakest entry if tp was really added, so that the
        // set never shrinks when an equal entry is already present.
        if (tp.compareTo(set.first()) > 0 && set.add(tp)) {
            set.remove(set.first());
        }
    }

    public void dumpTransients() {
        System.out.println("Original's transients:");
        for (TransientPos tp : originalTransients) {
            System.out.println("Delta: " + tp.delta + "\t" + "Pos: " + tp.position);
        }
        System.out.println("Target's transients:");
        for (TransientPos tp : targetTransients) {
            System.out.println("Delta: " + tp.delta + "\t" + "Pos: " + tp.position);
        }
    }

    /** Seeds the set with small placeholder entries that real deltas displace. */
    private static void init(SortedSet<TransientPos> set) {
        for (int i = 0; i < transientCount; i++) {
            set.add(new TransientPos(i, i));
        }
    }

    static class TransientPos implements Comparable<TransientPos> {
        private final int delta;
        private final int position;

        public TransientPos(int delta, int position) {
            this.delta = delta;
            this.position = position;
        }

        /** Orders by absolute delta first, then by position. */
        @Override
        public int compareTo(TransientPos other) {
            // Integer.compare avoids the overflow a plain subtraction can cause.
            int i = Integer.compare(Math.abs(this.delta), Math.abs(other.delta));
            return i != 0 ? i : Integer.compare(this.position, other.position);
        }
    }
}
The two WAV files are fed sample by sample into the addOriginalSample and addTargetSample methods. After both files have been read, the Fingerprinter instance holds two sets of transient positions for further analysis.
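For illustration, here is a rough sketch of how a file could be fed in sample by sample. It assumes signed 16-bit little-endian PCM WAV input and uses only the first channel. The class SampleFeeder and the SampleSink interface are hypothetical helpers, not part of the code above; a method reference such as fingerprinter::addOriginalSample would fit the sink.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class SampleFeeder {

    /** Hypothetical callback with the same shape as the add..Sample methods. */
    public interface SampleSink {
        void accept(int sample, int position);
    }

    /** Decodes one signed 16-bit little-endian sample starting at buf[offset]. */
    public static int decodeSample(byte[] buf, int offset) {
        // The high byte keeps its sign; the low byte is masked to stay unsigned.
        return (buf[offset + 1] << 8) | (buf[offset] & 0xFF);
    }

    /** Feeds the first channel of every frame to the sink; returns the frame count. */
    public static int feed(File wav, SampleSink sink)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            // Assumption: 16-bit PCM, so a frame is 2 bytes per channel.
            int frameSize = in.getFormat().getFrameSize();
            byte[] buf = new byte[frameSize * 4096];
            int position = 0;
            int n;
            // AudioInputStream.read(byte[]) returns whole frames only.
            while ((n = in.read(buf)) > 0) {
                for (int off = 0; off + frameSize <= n; off += frameSize) {
                    sink.accept(decodeSample(buf, off), position++);
                }
            }
            return position;
        }
    }
}
```

A call would then look like SampleFeeder.feed(new File("original.wav"), fingerprinter::addOriginalSample), where the file name is a placeholder.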