Let's say we have two computers on the Internet. Both of them have a lot of music on them. I want to compare the collections and get a list of the music that is identical on both.
1. Assume that the files are called something different on each computer
2. Assume that the files have been tagged/replaygained differently on each computer
With only assumption 1, I could just run md5sum on the files on both computers and compare the values. That would be rather fast.
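A minimal sketch of that whole-file comparison, assuming each computer can build its own index and only the digests need to be exchanged. The function names (`file_md5`, `index_collection`, `common_files`) are made up for illustration:

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file's full contents, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_collection(root):
    """Map digest -> list of paths for every file below root."""
    index = {}
    for p in Path(root).rglob("*"):
        if p.is_file():
            index.setdefault(file_md5(p), []).append(str(p))
    return index


def common_files(index_a, index_b):
    """Return the digests present in both indexes, with the paths on each side."""
    shared = index_a.keys() & index_b.keys()
    return {d: (index_a[d], index_b[d]) for d in shared}
```

Because the index is keyed on content rather than on file name, differently named copies of the same file still match.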
But assumption 2 requires that I leave the tags/replaygain info out of the comparison. This could be done by:
- copying the file
- stripping the tags
- setting replaygain info to zero
- calculating the md5sum
- deleting the file
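The copy and delete steps might be avoidable by hashing only the bytes between the tags. Below is a rough sketch under some strong assumptions: the files are MP3s, the only tags are an ID3v2 block at the start and an ID3v1 block at the end, and replaygain values live in those tags. Other tag formats (for example APEv2) are not handled, and if a tool rewrote the gain inside the audio frames themselves, the hashes would still differ. `audio_md5` is a hypothetical name.

```python
import hashlib


def syncsafe_to_int(b):
    """Decode the 4-byte syncsafe size field (7 significant bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def audio_md5(path):
    """MD5 of an MP3 file's contents, skipping ID3v2 and ID3v1 tags if present."""
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file into memory; fine for a sketch

    start, end = 0, len(data)

    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4-byte syncsafe size.
    # The size excludes the 10-byte header (and the optional 10-byte footer).
    if len(data) >= 10 and data[:3] == b"ID3":
        start = 10 + syncsafe_to_int(data[6:10])
        if data[5] & 0x10:  # footer-present flag (ID3v2.4)
            start += 10

    # ID3v1 tag: the last 128 bytes of the file, beginning with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return hashlib.md5(data[start:end]).hexdigest()
```

Two files whose audio bytes are identical but whose tags differ should then produce the same digest, without any temporary copy being written.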
This would indeed be slow, so I would need a lookup table mapping md5sum/file => md5sum/audio_data, so that the stripping only has to be done once per file.
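One way such a lookup table could look, as a sketch only: a dictionary persisted as JSON, keyed on the md5sum of the whole file, with the audio-only digest as the value. The names (`load_table`, `save_table`, `cached_audio_md5`) and the JSON storage are assumptions for illustration; `compute_audio_md5` stands for whatever slow routine strips the tags and hashes the audio data.

```python
import hashlib
import json
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the whole file (tags included)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def load_table(table_path):
    """Load the lookup table from disk, or start empty if it does not exist."""
    if os.path.exists(table_path):
        with open(table_path) as f:
            return json.load(f)
    return {}


def save_table(table, table_path):
    """Write the lookup table back to disk as JSON."""
    with open(table_path, "w") as f:
        json.dump(table, f)


def cached_audio_md5(path, table, compute_audio_md5):
    """Return the audio-only digest, running the slow routine only on a cache miss."""
    key = file_md5(path)
    if key not in table:
        table[key] = compute_audio_md5(path)
    return table[key]
```

Since the key is the digest of the complete file, retagging a file changes its key and forces a recomputation, while unchanged files are answered from the table.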
Any ideas/comments?