Raspberry Pi often produces smaller FLACs than PC; why?

I have a couple of Raspberry Pis and I noticed an interesting effect.

I compared encoding the same wav file on a PC and on the Raspberry Pi (ARM).
About 95% of the time the Raspberry Pi produced the smaller file. I decoded both flac files back to wav and compared their md5sums, and they were the same.
I'm guessing the ARM build of flac encodes a tighter file.

Wolf

Reply #1
It depends on which FLAC compression setting is being used: -0 (bigger file) to -8 (smaller file).

Reply #2
Quote
It depends on which compression setting in FLAC is being used, -0 to -8.


My bad, both systems used the -8 option for the smallest possible file.

Wolf

Reply #3
The displayed file size may vary with the disk format or its parameters (cluster size, for example), which has little to do with how many bytes of data are in the file. Copy the files from one device to the other and compare them side by side.

Reply #4
To get the actual size in bytes on Linux:

Code:
$ stat -c '%s' file


On OS X (and the BSDs, whose stat takes the same options):
Code:
$ stat -f '%z' file


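There's also "du -b" (GNU du), which prints the "apparent size" in bytes rather than the disk usage. For a regular file that is the same byte count that stat reports. Plain "du" without "-b" reports allocated disk space instead, and that is the figure that depends on the filesystem.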
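
For a check that does not depend on which stat you have, the same distinction can be sketched in Python. This is only an illustration: byte_size and allocated_bytes are made-up helper names, and the 512-byte block unit is an assumption that holds on Linux and OS X but is not guaranteed everywhere.

```python
import os

def byte_size(path):
    """Length of the file's contents in bytes (what stat -c '%s' and wc -c report)."""
    return os.stat(path).st_size

def allocated_bytes(path):
    """Disk space allocated to the file, or None where st_blocks is unavailable.

    Assumes st_blocks counts 512-byte units, as on Linux and OS X.
    """
    blocks = getattr(os.stat(path), "st_blocks", None)
    return None if blocks is None else blocks * 512
```

Comparing byte_size for the two flac files takes the filesystem out of the picture; allocated_bytes is the number that can differ between disks for identical files.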

Reply #5
Alternatively you can use:
Code:
$ wc -c file


Reply #6
Quote
Decoded both versions of flac back to wav files, and compared md5sums and they were the same.

If the encoders are not seriously flawed, the two wav files must obviously be bit-identical. Did you also checksum the two flac files (assuming tags and everything else are equal between them)?

Reply #7
Quote
Did you also checksum the two flac files (assuming tags and everything else are equal between them)?

While this may no longer be true, different processors could yield different flac files when using the same command-line arguments.

Reply #8
https://xiph.org/flac/faq.html#tools__different_sizes

Quote
Why doesn't the same file compressed on different machines with the same options yield the same FLAC file?

It's not supposed to, and neither does it mean either encoding was bad. There are many variations between different machines or even different builds of flac on the same machine that can lead to small differences in the FLAC file, even if they have the exact same final size. This is normal.
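
The same situation is easy to reproduce with any lossless compressor. The sketch below uses Python's zlib purely as a stand-in for FLAC (an assumption for illustration; it says nothing about how flac works internally): two encodes of the same data differ byte for byte, yet both decode to identical data.

```python
import hashlib
import zlib

# Stand-in for raw audio data; any byte string works here.
pcm = bytes(range(256)) * 64

# Two "encoders" with different settings, standing in for two flac builds.
encoded_a = zlib.compress(pcm, 1)
encoded_b = zlib.compress(pcm, 9)

def md5(data):
    """Hex MD5 digest of a byte string, like md5sum on a file."""
    return hashlib.md5(data).hexdigest()

# The compressed streams are not the same bytes...
print("encoded identical:", md5(encoded_a) == md5(encoded_b))
# ...but both decode to exactly the original data.
print("decoded identical:",
      md5(zlib.decompress(encoded_a)) == md5(zlib.decompress(encoded_b)))
```

So matching md5sums on the decoded wav files and differing md5sums on the flac files are both what one would expect.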

Reply #9
Quote
If the encoders are not seriously flawed, the two wav files must obviously be bit-identical. Did you also checksum the two flac files (assuming tags and everything else are equal between them)?

No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.

Reply #10
Quote
If the encoders are not seriously flawed, the two wav files must obviously be bit-identical. Did you also checksum the two flac files (assuming tags and everything else are equal between them)?

Quote
No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.


These matter for lossy, but should not for lossless unless the format is very very broken.


Reply #12
Now, that's interesting!

I'm fine with different filesystems using or showing different space for the same raw data, but intuitively I took for granted that a lossless algorithm is univocal, deterministic and implementation-independent.

Reply #13
I guess this is a consequence of using floating-point based analysis and prediction, resulting in slight differences in Rice parameters or something ... but I don't know for sure.
Yeah, it's not intuitive without looking at the FLAC code doing fp math.
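
A tiny illustration of why floating-point analysis can tip a decision either way. This is plain Python, not FLAC code, and THRESHOLD is a made-up value standing in for whatever cutoff an encoder might compare against:

```python
# Floating-point addition is not associative: grouping changes the last bits.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)  # False

# A hypothetical encoder decision driven by that value, with an
# illustrative threshold; the two results land on opposite sides of it.
THRESHOLD = 0.6
print(a > THRESHOLD, b > THRESHOLD)  # True False
```

Two builds that merely add the same numbers in a different order can therefore end up picking different integer parameters.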

Reply #14
Quote
I guess this is a consequence of using floating-point based analysis and prediction, resulting in slight differences in Rice parameters or something ... but I don't know for sure.
Yeah, it's not intuitive without looking at the FLAC code doing fp math.

That is the issue. It was explained by Josh starting here.

Reply #15
It's also pretty common that the assembly version of an algorithm is implemented slightly or even significantly differently than the C version. If you want to compare between platforms, disabling assembly so that each device runs the same code often gets rid of some or even all of the difference. Of course it also makes it much slower.
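
As a rough sketch of how that can happen (an assumption about the general pattern, not a description of flac's actual assembly routines): a vectorized routine typically keeps several partial sums and combines them at the end, so it adds the same numbers in a different order than a plain loop. The function names and the input values below are invented for the example.

```python
def scalar_sum(values):
    """Add left to right, as a straightforward C loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

def lane_sum(values, lanes=2):
    """Accumulate in `lanes` interleaved partial sums, then combine them,
    roughly the way a vectorized routine might."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return scalar_sum(partial)

# Contrived values chosen so the rounding difference is large enough to see.
values = [1e16, 1.0, -1e16, 1.0]
print(scalar_sum(values))  # 1.0
print(lane_sum(values))    # 2.0
```

With real audio data the discrepancy would be in the last few bits rather than this dramatic, but it only has to be large enough to flip one rounding.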

Reply #16
Quote
I'm fine with different filesystems using or showing different space for the same raw data, but intuitively I took for granted that a lossless algorithm is univocal, deterministic and implementation-independent.

Maybe this is an easy explanation: FLAC uses 'unreliable' (but easy) floating-point math to do the modeling and approximation, but that model is stored and reconstructed with integer math only. The residual is integer math only too. The FP-math stuff is just to point the 'real' encoder in the right direction.

You can compile FLAC using integer-math only, but the files it creates are much larger.
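
A toy version of that split, as a sketch only: the predictor order, the shift, the sample values and all function names are assumptions made up for the example, and this is not FLAC's bitstream or its real analysis. The floating-point model is quantized to integers once; everything after that is integer math, so any model decodes losslessly and only the residuals differ.

```python
def quantize(coefs, shift):
    """Turn floating-point predictor coefficients into integers (scaled by 2**shift)."""
    return [round(c * (1 << shift)) for c in coefs]

def predict(history, qcoefs, shift):
    """Integer-only prediction from the most recent samples (newest first)."""
    return sum(q * s for q, s in zip(qcoefs, history)) >> shift

def encode(samples, qcoefs, shift):
    """Keep the first samples verbatim, then store prediction errors."""
    order = len(qcoefs)
    warmup = samples[:order]
    residual = [
        samples[n] - predict(samples[n - order:n][::-1], qcoefs, shift)
        for n in range(order, len(samples))
    ]
    return warmup, residual

def decode(warmup, residual, qcoefs, shift):
    """Rebuild the samples with the same integer predictor."""
    order = len(qcoefs)
    out = list(warmup)
    for r in residual:
        out.append(r + predict(out[-order:][::-1], qcoefs, shift))
    return out

samples = [0, 100, 195, 280, 350, 400, 430, 440]
shift = 5
# Two slightly different floating-point models, e.g. from two platforms.
q_a = quantize([1.90, -0.95], shift)
q_b = quantize([1.88, -0.95], shift)

for q in (q_a, q_b):
    warmup, residual = encode(samples, q, shift)
    assert decode(warmup, residual, q, shift) == samples  # always lossless
    # Sum of absolute residuals is used here as a crude proxy for coded size.
    print(q, residual, sum(abs(r) for r in residual))
```

Both models round-trip exactly; they just leave different residuals to be coded, which is where a size difference would come from.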

Reply #17
Quote
No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.

Quote
These matter for lossy, but should not for lossless unless the format is very very broken.

I don't understand your point. Of course, the audio after decoding should be identical regardless of the platform you encoded the file on. This thread, however, is about the file size of the FLAC.

Again, I am a layman when it comes to codecs. But I don't see why different binaries must yield FLACs of identical file size in every case. I don't think it's implausible that a given compression code will yield slightly different compression ratios depending on compiler flags, instruction set architecture, and similar factors.

Reply #18
For me this raises an interesting theoretical question:

Does this mean there might be an "ideal" platform which would yield consistently better compression?
(Possibly where FP math is done with more precision)

Again this is more of a curiosity. At the practical level, my guess is the answer would be irrelevant.



Reply #20
Quote
Does this mean there might be an "ideal" platform which would yield consistently better compression?
(Possibly where FP math is done with more precision)

Probably not. The cases where the details of the FP math (precision, order of operations, rounding direction) would make a difference in an integer result are going to be cases where the "true" result is very close to the decision threshold. The two choices are likely to be equally efficient near that threshold. If a difference in FP math did make a consistent improvement, then that would indicate an area of possible optimization.
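
To make the "equally efficient near the threshold" point concrete, here is a sketch using a plain Rice code with a zigzag mapping for signed values. That is a simplification chosen for the example: FLAC's real parameter search is more involved, and the residual values are invented.

```python
def zigzag(r):
    """Map a signed residual to a non-negative integer: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * r if r >= 0 else -2 * r - 1

def rice_bits(residuals, k):
    """Total bits for a plain Rice code with parameter k:
    unary quotient, one stop bit, and k remainder bits per value."""
    return sum((zigzag(r) >> k) + 1 + k for r in residuals)

residuals = [2, 3, -3, -4, 2, 3, -3, -4]
for k in range(5):
    print(k, rice_bits(residuals, k))
```

For these residuals k = 2 and k = 3 both cost 32 bits, so an estimate that lands on either side of the boundary produces a file of the same size. With other data the two neighbours differ by a few bits in one direction or the other, which fits the observation that the Pi's files were smaller often but not always.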