Raspberry Pi often produces smaller FLACs than PC; why?, [was “Mini Flac”, “Raberry Pi - Rasbain”; TOS #6]
Werewolf6851
post Oct 15 2013, 08:02
Post #1





Group: Members
Posts: 10
Joined: 27-September 09
Member No.: 73521



I have a couple of Raspberry Pis and I noticed an interesting effect.

I compared encoding the same WAV file on a PC vs. the Raspberry Pi (ARM).
About 95% of the time, the Raspberry Pi made a smaller file. I decoded both FLAC versions back to WAV and compared their md5sums, and they were the same.
I'm guessing the ARM version of flac encodes a tighter file.

Wolf

This post has been edited by db1989: Oct 16 2013, 20:21
eahm
post Oct 15 2013, 08:08
Post #2





Group: Members
Posts: 882
Joined: 11-February 12
Member No.: 97076



It depends on which FLAC compression setting is being used, from -0 (bigger file) to -8 (smaller file).

This post has been edited by eahm: Oct 15 2013, 08:10
Werewolf6851
post Oct 15 2013, 08:10
Post #3





Group: Members
Posts: 10
Joined: 27-September 09
Member No.: 73521



QUOTE (eahm @ Oct 15 2013, 09:08) *
It depends on which FLAC compression setting is being used, -0 to -8.


My bad, both systems used the -8 option for the smallest possible file.

Wolf
AndyH-ha
post Oct 15 2013, 09:44
Post #4





Group: Members
Posts: 2191
Joined: 31-August 05
Member No.: 24222



Displayed file size may vary because of different disk formats or format parameters, having less to do with how many bytes of data are in the file. Copy files from one device to the other and compare them side by side.
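
To make the distinction concrete, here is a minimal Python sketch that reports both the byte count of a file and the space the filesystem has allocated for it. The helper name `size_report` is made up for this example, and it assumes a platform where `os.stat` exposes `st_blocks` in 512-byte units (typical on Linux); where that field is missing, the allocated size is reported as `None`.

```python
import os


def size_report(path, block_unit=512):
    """Return the byte count and the allocated space for a file.

    'bytes' is the number of data bytes in the file and is what should be
    compared between machines. 'allocated_bytes' depends on the filesystem
    and its block size, so it can differ for identical data.
    """
    info = os.stat(path)
    # st_blocks is not available on every platform, hence the fallback.
    blocks = getattr(info, "st_blocks", None)
    allocated = None if blocks is None else blocks * block_unit
    return {"bytes": info.st_size, "allocated_bytes": allocated}
```

Only the `bytes` value is meaningful when comparing the output of two encoders.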
skamp
post Oct 15 2013, 10:30
Post #5





Group: Developer
Posts: 1343
Joined: 4-May 04
From: France
Member No.: 13875



To get the actual size in bytes on Linux:

CODE
$ stat -c '%s' file


On OS X (and possibly UNIX in general):
CODE
$ stat -f '%z' file


There's also "du -b", but that prints the "apparent size", which "may be larger due to holes in sparse files, internal fragmentation, indirect blocks, and the like".


--------------------
caudec.net
nu774
post Oct 15 2013, 13:08
Post #6





Group: Developer
Posts: 476
Joined: 22-November 10
From: Japan
Member No.: 85902



Alternatively you can use:
CODE
$ wc -c file

Nessuno
post Oct 15 2013, 15:01
Post #7





Group: Members
Posts: 422
Joined: 16-December 10
From: Palermo
Member No.: 86562



QUOTE (Werewolf6851 @ Oct 15 2013, 09:02) *
Decoded both versions of flac back to wav file, and compared md5sums and they were the same.

If the encoders are not seriously flawed, the two WAV files must obviously be bit-identical. Did you also checksum the two FLAC files (assuming tags and everything else are equal between them)?
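
One way to run that check is sketched below in Python, using only the standard library. The function names `file_fingerprint` and `compare_files` are invented for this example and are not part of any FLAC tool.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 16):
    """Return (size_in_bytes, md5_hex) for the file at 'path'."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large audio files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def compare_files(path_a, path_b):
    """Report the size difference in bytes and whether the contents match."""
    size_a, md5_a = file_fingerprint(path_a)
    size_b, md5_b = file_fingerprint(path_b)
    return {
        "size_diff": size_a - size_b,
        "identical": size_a == size_b and md5_a == md5_b,
    }
```

Calling `compare_files("pc.flac", "pi.flac")` (hypothetical file names) on the two encodes, and again on the two decoded WAV files, separates the two questions: whether the compressed streams differ, and whether the decoded audio differs.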


--------------------
... I live by long distance.
greynol
post Oct 15 2013, 16:24
Post #8





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (Nessuno @ Oct 15 2013, 07:01) *
Did you also checksum the two FLAC files (assuming tags and everything else are equal between them)?

While this may no longer be true, different processors could yield different FLAC files when using the same command-line arguments.

This post has been edited by greynol: Oct 15 2013, 16:46


--------------------
Your eyes cannot hear.
lvqcl
post Oct 15 2013, 16:44
Post #9





Group: Developer
Posts: 3207
Joined: 2-December 07
Member No.: 49183



https://xiph.org/flac/faq.html#tools__different_sizes

QUOTE
Why doesn't the same file compressed on different machines with the same options yield the same FLAC file?

It's not supposed to, and neither does it mean either encoding was bad. There are many variations between different machines or even different builds of flac on the same machine that can lead to small differences in the FLAC file, even if they have the exact same final size. This is normal.
aztec_mystic
post Oct 15 2013, 21:05
Post #10





Group: Members
Posts: 93
Joined: 28-March 13
Member No.: 107425



QUOTE (Nessuno @ Oct 15 2013, 16:01) *
If the encoders are not seriously flawed, the two WAV files must obviously be bit-identical. Did you also checksum the two FLAC files (assuming tags and everything else are equal between them)?

No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.
saratoga
post Oct 15 2013, 23:44
Post #11





Group: Members
Posts: 4715
Joined: 2-September 02
Member No.: 3264



QUOTE (aztec_mystic @ Oct 15 2013, 16:05) *
QUOTE (Nessuno @ Oct 15 2013, 16:01) *
If the encoders are not seriously flawed, the two WAV files must obviously be bit-identical. Did you also checksum the two FLAC files (assuming tags and everything else are equal between them)?

No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.


These matter for lossy, but should not for lossless unless the format is very very broken.
greynol
post Oct 16 2013, 00:02
Post #12





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



...but flac isn't broken. ;)

http://www.hydrogenaudio.org/forums/index....st&p=847309


--------------------
Your eyes cannot hear.
Nessuno
post Oct 16 2013, 07:02
Post #13





Group: Members
Posts: 422
Joined: 16-December 10
From: Palermo
Member No.: 86562



Now, that's interesting!

I'm fine with different filesystems using or showing different space for the same raw data, but intuitively I took for granted that a lossless algorithm is univocal, deterministic and implementation-independent.


--------------------
... I live by long distance.
nu774
post Oct 16 2013, 16:55
Post #14





Group: Developer
Posts: 476
Joined: 22-November 10
From: Japan
Member No.: 85902



I guess this is a consequence of using floating-point-based analysis and prediction, resulting in slight differences in Rice parameters or something ... but I don't know for sure.
Yeah, it's not intuitive without looking at the FLAC code doing FP math.
bryant
post Oct 16 2013, 17:31
Post #15


WavPack Developer


Group: Developer (Donating)
Posts: 1287
Joined: 3-January 02
From: San Francisco CA
Member No.: 900



QUOTE (nu774 @ Oct 16 2013, 08:55) *
I guess this is a consequence of using floating-point-based analysis and prediction, resulting in slight differences in Rice parameters or something ... but I don't know for sure.
Yeah, it's not intuitive without looking at the FLAC code doing FP math.

That is the issue. It was explained by Josh starting here.
saratoga
post Oct 16 2013, 17:32
Post #16





Group: Members
Posts: 4715
Joined: 2-September 02
Member No.: 3264



It's also pretty common that the assembly version of an algorithm is implemented slightly or even significantly differently from the C version. If you want to compare between platforms, disabling assembly so that each device runs the same code often gets rid of some or even all of the difference. Of course, it also makes encoding much slower.
ktf
post Oct 16 2013, 18:08
Post #17





Group: Members
Posts: 303
Joined: 22-March 09
Member No.: 68263



QUOTE (Nessuno @ Oct 16 2013, 08:02) *
I'm fine with different filesystems using or showing different space for the same raw data, but intuitively I took for granted that a lossless algorithm is univocal, deterministic and implementation-independent.

Maybe this is an easy explanation: FLAC uses 'unreliable' (but easy) floating-point math to do the modeling and approximation, but that model is stored and reconstructed with integer math only. The residual is integer math only too. The FP math is just there to point the 'real' encoder in the right direction.

You can compile FLAC using integer-math only, but the files it creates are much larger.
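
The split between float analysis and integer coding can be illustrated with a toy first-order predictor in Python. This is a simplified sketch, not FLAC's actual LPC code: `estimate_coefficient`, `quantize`, `encode`, `decode` and `SHIFT` are hypothetical names, and real encoders use higher-order predictors and entropy-code the residual.

```python
SHIFT = 8  # fixed-point precision of the stored coefficient (assumed value)


def estimate_coefficient(samples):
    """Floating-point analysis: lag-1 autocorrelation divided by lag-0."""
    r0 = sum(float(s) * s for s in samples)
    r1 = sum(float(a) * b for a, b in zip(samples, samples[1:]))
    return r1 / r0 if r0 else 0.0


def quantize(coefficient):
    """Turn the float estimate into the integer that is actually stored."""
    return round(coefficient * (1 << SHIFT))


def encode(samples, coef_q):
    """Integer-only prediction; expects a non-empty list of integers."""
    residuals = [samples[0]]  # first sample is stored verbatim
    for prev, cur in zip(samples, samples[1:]):
        residuals.append(cur - ((coef_q * prev) >> SHIFT))
    return residuals


def decode(residuals, coef_q):
    """Exact inverse of encode(), using the same integer arithmetic."""
    samples = [residuals[0]]
    for res in residuals[1:]:
        samples.append(res + ((coef_q * samples[-1]) >> SHIFT))
    return samples
```

If one machine quantizes the coefficient to `q` and another to `q + 1`, both streams decode back to the same samples, because the decoder repeats the encoder's integer arithmetic exactly. Only the residual values, and therefore the compressed size, can differ.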


--------------------
Music: sounds arranged such that they construct feelings.
aztec_mystic
post Oct 17 2013, 08:05
Post #18





Group: Members
Posts: 93
Joined: 28-March 13
Member No.: 107425



QUOTE (saratoga @ Oct 16 2013, 00:44) *
QUOTE (aztec_mystic @ Oct 15 2013, 16:05) *
No, I don't think so. The compiler, the instruction set architecture (which may work with slightly different precision arithmetic) and such may lead to small differences.

These matter for lossy, but should not for lossless unless the format is very very broken.

I don't understand your point. Of course, the audio after decoding should be identical regardless of the platform you encoded the file on. This thread, however, is about the file size of the FLAC.

Again, I am a layman when it comes to codecs. But it is not clear to me why different binaries must yield FLACs of identical file size in every case. I find it plausible that a given compression code will yield slightly different compression ratios depending on compiler flags, instruction set architecture, and similar factors.

This post has been edited by aztec_mystic: Oct 17 2013, 08:06
Makaki
post Oct 17 2013, 18:41
Post #19





Group: Members
Posts: 52
Joined: 20-May 13
Member No.: 108227



For me this raises an interesting theoretical question:

Does this mean there might be an "ideal" platform which would yield consistently better compression?
(Possibly where FP math is done with more precision)

Again this is more of a curiosity. At the practical level, my guess is the answer would be irrelevant.

skamp
post Oct 17 2013, 18:57
Post #20





Group: Developer
Posts: 1343
Joined: 4-May 04
From: France
Member No.: 13875



Possibly relevant: deterministic building process


--------------------
caudec.net
bryant
post Oct 18 2013, 06:16
Post #21


WavPack Developer


Group: Developer (Donating)
Posts: 1287
Joined: 3-January 02
From: San Francisco CA
Member No.: 900



QUOTE (Makaki @ Oct 17 2013, 10:41) *
Does this mean there might be an "ideal" platform which would yield consistently better compression?
(Possibly where FP math is done with more precision)

Probably not. The cases where the details of the FP math (precision, order of operations, rounding direction) would make a difference in an integer result are going to be cases where the "true" result is very close to the decision threshold. The two choices are likely to be equally efficient near that threshold. If a difference in FP math did make a consistent improvement, then that would indicate an area of possible optimization.
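
A small Python sketch of this argument follows. It is a toy model under stated assumptions: `pick_parameter` is a made-up heuristic (not FLAC's parameter search), `rice_bits` counts bits for a plain Rice code, and the residual list is invented so that its mean sits exactly on a threshold.

```python
def rice_bits(values, k):
    """Bits needed to Rice-code non-negative integers with parameter k:
    a unary quotient (v >> k), one stop bit, and k remainder bits each."""
    return sum((v >> k) + 1 + k for v in values)


def pick_parameter(mean_estimate):
    """Hypothetical heuristic: the largest k with 2**k <= mean_estimate."""
    k = 0
    while mean_estimate >= 2 ** (k + 1):
        k += 1
    return k


# Order of operations alone can change a floating-point result.
forward = (0.1 + 0.2) + 0.3
backward = 0.1 + (0.2 + 0.3)
print(forward == backward)

# Two estimates one unit in the last place apart, straddling the threshold 8.
estimate_a = 8.0
estimate_b = 8.0 - 2.0 ** -50
k_a = pick_parameter(estimate_a)
k_b = pick_parameter(estimate_b)

# Invented residual magnitudes whose mean is 8.
residuals = [3, 7, 8, 9, 12, 5, 10, 6, 11, 9]
print(k_a, rice_bits(residuals, k_a))
print(k_b, rice_bits(residuals, k_b))
```

The two estimates select different parameters, yet the bit counts for this data come out practically the same, which is the situation described above: a different decision with essentially the same cost.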
