iTunes Encodes vs. Amazon (Spectrograms), Spectrograms.
Engelsstaub
post May 14 2012, 02:39
Post #1





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



A spectral comparison of an iTunes file and its counterpart from Amazon:

Fear Factory - Recharger (Single) iTunes/m4a




Fear Factory - Recharger (Single) Amazon/mp3



I understand a spectrogram tells us pretty close to nothing about how a music file actually sounds to the listener. I still challenge anyone to find results that are reversed. I never have. (I get "credits," like many others, for Amazon mp3s. I download the iTunes versions as an iTunes Match subscriber.)

At this bitrate I personally doubt listeners will discern between the two. Does this discount the validity of saying one encode is of a "higher quality" or that it is "better" in that it retains more of the source? These graphs can still tell us "what is missing" and what the respective encoders saw fit to lose in lossy encoding.

Thoughts...

This post has been edited by Engelsstaub: May 14 2012, 02:44


--------------------
The Loudness War is over. Now it's a hopeless occupation.
Gentoo64
post May 14 2012, 03:01
Post #2





Group: Members
Posts: 3
Joined: 14-May 12
Member No.: 99809



I think anyone would prefer a higher bitrate purely for peace of mind, whether you think you can hear a difference or not.
On my computer I only listen to FLAC. I can't hear a difference between that and a high-bitrate MP3, afaik, but it's good to know that you're not losing any quality.

For a portable device, if you're going to be out and about, I guess it wouldn't matter as much. I personally don't use a portable player, but if I did I wouldn't be too fussy. Sitting at home listening, though, I would want the best-specced music so I know I'm not missing out on anything.

But if the music is more full in the spectrogram and closer to the original piece than another version, then it is higher quality.

This post has been edited by Gentoo64: May 14 2012, 03:02
onkl
post May 14 2012, 04:10
Post #3





Group: Members
Posts: 125
Joined: 27-February 09
From: Germany
Member No.: 67444



First of all, if you can't hear tones above 20 kHz, then this would be "useless" information to begin with. Second, removing this and using the bitrate purely for the "useful" information seems to be more efficient.

I don't know how AAC works in this regard, but I guess the resulting file could be slightly smaller with the high frequencies removed, so a lowpass would be desirable for maximum efficiency. For MP3 this could potentially improve the quality when close to the maximum frame size (320 kbit/s + bit reservoir). IIRC someone posted an ABX proving this in a previous discussion about worst-case scenarios. (Here it is; it was about maximizing frame sizes to improve quality. Reducing the actual information that has to fit into a single frame could lead to the same improvement.)

This post has been edited by onkl: May 14 2012, 04:21
saratoga
post May 14 2012, 04:12
Post #4





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Engelsstaub @ May 13 2012, 21:39) *
Does this discount the validity of saying one encode is of a "higher quality" or that it is "better" in that it retains more of the source?


Yes, it's still a meaningless test.

QUOTE (Engelsstaub @ May 13 2012, 21:39) *
These graphs can still tell us "what is missing"


No. All they show you is what the lowpass filter used to preprocess the PCM was set to.

QUOTE (Engelsstaub @ May 13 2012, 21:39) *
Thoughts...


Don't we have a FAQ entry for this kind of thing? I feel like we must; it's asked constantly.

Edit: Yes, took me a minute to find it though. See the entry here and the links:

http://wiki.hydrogenaudio.org/index.php?ti...dio_Compression

This post has been edited by saratoga: May 14 2012, 04:18
greynol
post May 14 2012, 04:48
Post #5





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (Gentoo64 @ May 13 2012, 19:01) *
But if the music is more full in the spectrogram and closer to the original piece than another version, then it is higher quality.

One only needs to play with the blade encoder in order to dismiss the idea that a full spectrum means higher quality as rubbish and nonsense. In terms of "closer to the original" you need to define it, at which point the best you can possibly do is a truism which isn't very interesting.

This post has been edited by greynol: May 14 2012, 22:23


--------------------
Your eyes cannot hear.
Engelsstaub
post May 14 2012, 05:01
Post #6





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



QUOTE (saratoga @ May 13 2012, 22:12) *
QUOTE (Engelsstaub @ May 13 2012, 21:39) *
Thoughts...


Don't we have a FAQ entry for this kind of thing? I feel like we must; it's asked constantly.

Edit: Yes, took me a minute to find it though. See the entry here and the links:

http://wiki.hydrogenaudio.org/index.php?ti...dio_Compression


If you don't want to participate in a discussion, feel free not to.


--------------------
The Loudness War is over. Now it's a hopeless occupation.
Engelsstaub
post May 14 2012, 05:06
Post #7





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



QUOTE (onkl @ May 13 2012, 22:10) *
First of all, if you can't hear tones above 20 kHz, then this would be "useless" information to begin with.


That's a good point, Onkl. I'm no expert, but I'm seeing frequencies below this being cut off. The average bitrate for the Amazon encode is seemingly higher.

FWIW I've seen similar results from CBR MP3s @320 Kbps from Zune Marketplace.



--------------------
The Loudness War is over. Now it's a hopeless occupation.
eahm
post May 14 2012, 05:19
Post #8





Group: Members
Posts: 886
Joined: 11-February 12
Member No.: 97076



QUOTE (onkl @ May 13 2012, 20:10) *
First of all, if you can't hear tones above 20 kHz

OT: Actually, both my brothers can hear up to 23 kHz. I don't know how, but they can.
saratoga
post May 14 2012, 05:21
Post #9





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Engelsstaub @ May 14 2012, 00:01) *
QUOTE (saratoga @ May 13 2012, 22:12) *
QUOTE (Engelsstaub @ May 13 2012, 21:39) *
Thoughts...


Don't we have a FAQ entry for this kind of thing? I feel like we must; it's asked constantly.

Edit: Yes, took me a minute to find it though. See the entry here and the links:

http://wiki.hydrogenaudio.org/index.php?ti...dio_Compression


If you don't want to participate in a discussion, feel free not to.


You should begin by reading that link. Then if you still have questions, ask here.

But no sense explaining something here that's already explained better elsewhere. :)

Engelsstaub
post May 14 2012, 05:43
Post #10





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



Okay, Saratoga. I said: "These graphs can still tell us "what is missing" and you said: "No. (Emphasis mine.) All they show you is what the lowpass filter used to preprocess the PCM was set to."

I'm seeing that one encode is missing nearly everything above 19 kHz. That sure seems to demonstrate that the spectrogram is showing me something that is missing.

(...and I never said I was "hearing" it.) What exactly do I need to read that refutes this?

Even if I drill down to this link HA I'm beginning with reading this: "Lossy compression is a form of compression that significantly reduce multimedia file size by throwing away information imperceptible to humans." If this is true then I believe Amazon's encoder is taking some extreme liberties with what it "thinks" is imperceptible to humans. eahm's brothers aside, it doesn't take some extraordinary ear to hear frequencies that were discarded in the mp3.

Edit: improper wording.

This post has been edited by Engelsstaub: May 14 2012, 05:46


--------------------
The Loudness War is over. Now it's a hopeless occupation.
saratoga
post May 14 2012, 06:00
Post #11





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Engelsstaub @ May 14 2012, 00:43) *
Okay, Saratoga. I said: "These graphs can still tell us "what is missing" and you said: "No. (Emphasis mine.) All they show you is what the lowpass filter used to preprocess the PCM was set to."

I'm seeing that one encode is missing nearly everything above 19 kHz. That sure seems to demonstrate that the spectrogram is showing me something that is missing.


If they could tell you what was missing, then you would probably conclude your AAC file is lossless. Not so: much of what is missing is not shown.

Instead, what they show is what the lowpass filter setting was that was used to preprocess the signal before it was encoded.

QUOTE (Engelsstaub @ May 14 2012, 00:43) *
Even if I drill down to this link HA I'm beginning with reading this: "Lossy compression is a form of compression that significantly reduce multimedia file size by throwing away information imperceptible to humans."


Keep reading past the first link :)

QUOTE (Engelsstaub @ May 14 2012, 00:43) *
If this is true then I believe Amazon's encoder is taking some extreme liberties with what it "thinks" is imperceptible to humans. eahm's brothers aside, it doesn't take some extraordinary ear to hear frequencies that were discarded in the mp3.


Hearing music 40 dB below peak at 19 kHz would in fact be quite extraordinary! I think even hearing a pure tone there would be extremely impressive, and certainly beyond my hearing.
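
For a sense of scale, the 40 dB figure can be turned into a linear ratio. This is only a small arithmetic sketch using the standard decibel definitions; the function names are made up for illustration and are not from any tool mentioned in the thread.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)


# Content sitting 40 dB below the peak level of the track.
amplitude = db_to_amplitude_ratio(-40.0)
power = db_to_power_ratio(-40.0)
print(f"amplitude ratio: {amplitude:.4f}")
print(f"power ratio:     {power:.6f}")
```

In other words, content 40 dB below peak has about one hundredth of the peak amplitude, and that is before considering how insensitive hearing is at 19 kHz.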
Canar
post May 14 2012, 06:30
Post #12





Group: Super Moderator
Posts: 3327
Joined: 26-July 02
From: princegeorge.ca
Member No.: 2796



QUOTE (Engelsstaub @ May 13 2012, 21:43) *
eahm's brothers aside, it doesn't take some extraordinary ear to hear frequencies that were discarded in the mp3.
According to the Terms of Service, to continue this discussion, you must provide scientific evidence that you do, indeed, hear said difference if you wish to claim that you can. It may seem counterintuitive that something you can see is not necessarily something you can hear, but that is pure illusion.

saratoga's position mirrors that of mine as a moderator.


--------------------
∑:<
onkl
post May 14 2012, 06:40
Post #13





Group: Members
Posts: 125
Joined: 27-February 09
From: Germany
Member No.: 67444



QUOTE (Engelsstaub @ May 14 2012, 06:06) *
QUOTE (onkl @ May 13 2012, 22:10) *
First of all, if you can't hear tones above 20 kHz, then this would be "useless" information to begin with.


That's a good point, Onkl. I'm no expert, but I'm seeing frequencies below this being cut off. The average bitrate for the Amazon encode is seemingly higher.

The difference in bitrate is due to completely different formats/encoders. If you were to encode the same file with everything but the lowpass being identical, you would see a decrease in bitrate OR an increase in quality for hard-to-encode passages hitting the frame limit OR, in case of exceptional hearing capabilities, you might notice the absence of higher frequencies.

Try this test and then redo it with music playing in the background. You might be surprised how hard it is to hear anything above 16 kHz.
saratoga
post May 14 2012, 07:29
Post #14





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



Someone should add a feature to LAME that subtracts the input and lowpass PCM samples, and then uses the difference to guide the quantizer such that it adds noise with vaguely similar power spectrum to bands above the lowpass cut off (but of course chooses the signal such that it is compressible with little bitrate overhead).

Would be very useful for companies like amazon that sell MP3s to people :D
Engelsstaub
post May 14 2012, 07:51
Post #15





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



Canar: I never claimed anything regarding my personal hearing. In fact if you read the entire thread I state more than once that I'm not personally claiming to hear any difference in these files at this bitrate. I said that which Amazon's encoder saw fit to throw out was within the range of human hearing. I see things below that 20 kHz getting completely thrown out. If I worded something poorly then please try to see it in the context of that which I've stated again and again.

I'm basically trying to figure out how to interpret spectrograms. If those frequencies are still there and my program is "lying" to me, then I'd really like to know.

Saratoga: is there any particular reason I have to read everything on HA and you can't just give me a direct link to something that tells me why your answer would be no to "these graphs can still tell us what is missing?" You want me to "keep reading past that link." Read what? What am I missing here that you can't be bothered to reiterate?

Just asking simple questions. If you don't want to discuss it then don't. Give me a specific link to what you'd like me to consider or let someone else discuss it with me.

Onkl: I see what you're saying. I know I can't hear the stuff that was cut off in the mp3 file. My point is only that it is gone, and I am just wondering if that was more significant than whatever lossy reduction the iTunes encoder performed. Or maybe: "which file theoretically would better represent the source to all humans?"


--------------------
The Loudness War is over. Now it's a hopeless occupation.
Engelsstaub
post May 14 2012, 07:56
Post #16





Group: Members
Posts: 545
Joined: 16-February 10
Member No.: 78200



QUOTE (saratoga @ May 14 2012, 01:29) *
Someone should add a feature to LAME that subtracts the input and lowpass PCM samples, and then uses the difference to guide the quantizer such that it adds noise with vaguely similar power spectrum to bands above the lowpass cut off (but of course chooses the signal such that it is compressible with little bitrate overhead).

Would be very useful for companies like amazon that sell MP3s to people :D


Oh, now you're funny too. I can get that this discussion is too far beneath you because you're a dev or whatever. Why do you still bother to dignify people with less knowledge than you about certain subjects?

Just close the thread if you want, Canar. There's just way too many flaming dickheads on here to have a constructive conversation at my level. I should have known better.


--------------------
The Loudness War is over. Now it's a hopeless occupation.
saratoga
post May 14 2012, 08:09
Post #17





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (Engelsstaub @ May 14 2012, 02:56) *
Oh, now you're funny too. I can get that this discussion is too far beneath you because you're a dev or whatever. Why do you still bother to dignify people with less knowledge than you about certain subjects?


I'll try this one more time. The answer to this question:

QUOTE (Engelsstaub @ May 14 2012, 02:56) *
I'm basically trying to figure out how to interpret spectrograms.


Is this:

QUOTE (saratoga @ May 13 2012, 23:12) *
All they show you is what the lowpass filter used to preprocess the PCM was set to.


or this:

QUOTE
Nothing can be said about quality or audible difference by looking at pictures


You should look at those graphs and be able to read the low pass settings used to preprocess the PCM. Also, maybe guess the sampling rate, but that should be obvious I think. There is nothing else to interpret. Nothing else is contained in those pictures.
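
Reading the lowpass setting off a spectrum can be sketched in a few lines. This is a rough illustration on synthetic noise, not on the files from the first post: the brickwall filter, the -60 dB threshold, and both function names are assumptions made for the example, and it requires NumPy.

```python
import numpy as np


def brickwall_lowpass(samples, sample_rate, cutoff_hz):
    """Zero every FFT bin above cutoff_hz (an idealised lowpass, for illustration)."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))


def estimate_cutoff_hz(samples, sample_rate, threshold_db=-60.0):
    """Return the highest frequency whose level is within threshold_db of the peak bin."""
    windowed = samples * np.hanning(len(samples))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    # Small offset avoids log10(0) for bins that are exactly empty.
    level_db = 20.0 * np.log10(magnitude / magnitude.max() + 1e-12)
    present_bins = np.nonzero(level_db > threshold_db)[0]
    return float(freqs[present_bins[-1]])


rng = np.random.default_rng(0)
sample_rate = 44100                          # assumed CD-style sample rate
noise = rng.standard_normal(sample_rate)     # one second of white noise
filtered = brickwall_lowpass(noise, sample_rate, cutoff_hz=19000.0)

print(f"full-band noise: content up to ~{estimate_cutoff_hz(noise, sample_rate):.0f} Hz")
print(f"lowpassed noise: content up to ~{estimate_cutoff_hz(filtered, sample_rate):.0f} Hz")
```

The estimate recovers the filter setting and nothing more. It says nothing about what the encoder did to the content below the cutoff, which is the point being made above.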

QUOTE (Engelsstaub @ May 14 2012, 02:56) *
There's just way too many flaming dickheads on here to have a constructive conversation at my level. I should have known better.


Don't insult me because you didn't get the answer you wanted. I only offered you an honest answer. If you're not interested in learning new things, you should not be asking questions.
db1989
post May 14 2012, 08:49
Post #18





Group: Super Moderator
Posts: 5171
Joined: 23-June 06
Member No.: 32180



QUOTE (Engelsstaub @ May 14 2012, 07:51) *
Canar: I never claimed anything regarding my personal hearing. In fact if you read the entire thread I state more than once that I'm not personally claiming to hear any difference in these files at this bitrate. I said that which Amazon's encoder saw fit to throw out was within the range of human hearing. I see things below that 20 kHz getting completely thrown out.
OK, so the lowpass is at 19 kHz rather than 20 kHz. What now? The point is still that most people will not hear any such frequencies at anything near a ‘normal’ SPL, even as a pure tone—and especially not as part of complex music with content that runs the gamut of all lower frequencies.

QUOTE
I'm basically trying to figure out how to interpret spectrograms. If those frequencies are still there and my program is "lying" to me, then I'd really like to know.
No, they’re not there. Why would a program lie? But again: Chances are you and almost every other human in the world cannot hear them, so removing them might actually improve perceptual quality (or, at higher bitrates, the theoretical probability thereof, since both lowpassed and non-lowpassed encodes might be transparent and thus indistinguishable) by sparing bits for lower and more probably audible frequencies, rather than reducing it as you appear to be inferring from a simplistic visual summary.

QUOTE
Onkl: I see what you're saying. I know I can't hear the stuff that was cut off in the mp3 file. My point is only that it is gone, and I am just wondering if that was more significant than whatever lossy reduction the iTunes encoder performed. Or maybe: "which file theoretically would better represent the source to all humans?"
Whilst I cannot presume to speak for “all humans”, perceptual technologies such as lossy audio compression are not aimed at that amorphous group. They’re aimed at most humans. Otherwise, nothing would ever advance due to encoders being too nervous to excise anything for fear of that one mythical golden-eared sage.

This post has been edited by db1989: May 14 2012, 08:50
Porcus
post May 14 2012, 09:17
Post #19





Group: Members
Posts: 1788
Joined: 30-November 06
Member No.: 38207



Is this low-pass thing an AAC vs MP3 difference, rather than 'iTunes' vs 'Amazon'?


--------------------
One day in the Year of the Fox came a time remembered well
LithosZA
post May 14 2012, 09:41
Post #20





Group: Members
Posts: 182
Joined: 26-February 11
Member No.: 88525



Encoding stuff above 20 kHz is dumb, especially for lossy encodings. You are wasting good bits that are better used for stuff you can hear instead.
The lossy encoding will sound better if you do a lowpass at 20 kHz or slightly higher/lower.

For lossless I would say keep whatever frequencies you want even if you can't hear them for peace of mind.
db1989
post May 14 2012, 09:50
Post #21





Group: Super Moderator
Posts: 5171
Joined: 23-June 06
Member No.: 32180



QUOTE (LithosZA @ May 14 2012, 09:41) *
For lossless I would say keep whatever frequencies you want even if you can't hear them for peace of mind.
I’d say “Well, of course!”, but then there are tools like lossyWAV et al., which some users like to apply before ‘lossless’ compression; lowpassing beforehand doesn’t seem so different. I would do neither, but that’s me being a literalist when it comes to lossless-ness. ;)
onkl
post May 14 2012, 10:11
Post #22





Group: Members
Posts: 125
Joined: 27-February 09
From: Germany
Member No.: 67444



QUOTE (LithosZA @ May 14 2012, 10:41) *
The lossy encoding will sound better if you do a lowpass at 20 kHz or slightly higher/lower.

Well, that would be too bold a generalization of a rather theoretical worst-case scenario. I'm not even sure these frame size limitations apply to AAC as well. But it would certainly lower the required bitrate and thus increase efficiency.

QUOTE (db1989 @ May 14 2012, 09:49) *
that one mythical golden-eared sage.

:D
pdq
post May 14 2012, 11:54
Post #23





Group: Members
Posts: 3309
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



All right, let me try my hand at an analogy.

Two patrons walk into a restaurant. Patron A is known to be a very generous tipper, while patron B is just an average tipper. They both sit down and order a meal, but both patrons have exactly the same amount to spend.

Patron A orders a hamburger and a coke, leaving enough to cover a very generous tip. Patron B, however, orders a steak and a glass of wine.

After they leave, someone wonders who had the higher quality meal. Looking only at the size of the tip, he concludes that patron A must have had the better meal.

The bottom line is, you can choose any metric you like for judging quality, but if you choose the wrong metric then you may not get the answer you want.
Porcus
post May 14 2012, 11:58
Post #24





Group: Members
Posts: 1788
Joined: 30-November 06
Member No.: 38207



QUOTE (LithosZA @ May 14 2012, 10:41) *
Encoding stuff above 20 kHz is dumb, especially for lossy encodings. You are wasting good bits that are better used for stuff you can hear instead.


Just because there are high frequencies, it does not mean that they are the same as in the original. At least theoretically, it could simply be that one of the basis vectors decodes to a signal with high-frequency content. It would not be faithful in the sense of being precisely the same high-frequency content as the original signal, but at least those listeners (if any) who can hear that high will get a comparable tonal balance. Or whatever. As long as it isn't too loud, it probably isn't worse than silence, and it wouldn't cost any bits. (An overly simple example: if you choose a square wave as basis rather than a sine wave, you will automatically get high-frequency content.)
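
The square-wave remark can be checked numerically. The sketch below only illustrates that one remark and says nothing about what MP3 or AAC encoders actually do; the 48 kHz sample rate, the 1 kHz tone, and the helper name are assumptions chosen for the example, and it requires NumPy.

```python
import numpy as np

SAMPLE_RATE = 48000      # assumed sample rate for the illustration
FUNDAMENTAL_HZ = 1000    # assumed test tone


def harmonic_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz, read from one FFT bin."""
    spectrum = np.fft.rfft(signal)
    bin_index = int(round(freq_hz * len(signal) / sample_rate))
    return 2.0 * np.abs(spectrum[bin_index]) / len(signal)


# One second of signal; the half-sample offset keeps samples off the zero crossings.
t = (np.arange(SAMPLE_RATE) + 0.5) / SAMPLE_RATE
sine = np.sin(2.0 * np.pi * FUNDAMENTAL_HZ * t)
square = np.sign(sine)

for harmonic in (1, 2, 3, 5):
    freq = harmonic * FUNDAMENTAL_HZ
    print(
        f"{freq:5d} Hz  sine: {harmonic_amplitude(sine, SAMPLE_RATE, freq):.3f}"
        f"  square: {harmonic_amplitude(square, SAMPLE_RATE, freq):.3f}"
    )
```

The sine has energy only at its fundamental, while the square wave of the same pitch carries odd harmonics at 3 kHz, 5 kHz and beyond, so a single square-shaped basis element brings high-frequency content along with it.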


Old days' organ makers employed a few tricks up in the treble, for e.g. 'reed' type sound. To make the sound somewhat brighter, you would use a pipe for the overtones. Of course in the far treble end, you would have to make a pipe for a hardly-audible frequency. IIRC, there are quite some where they would save on the number of 'overtone' pipes at the extreme of the range. They would strictly speaking be disharmonic, but on the other hand, leaving them out would yield a sudden discontinuity in the tonal balance, and (by the organ maker's ear) that was more important – the disharmony was hardly anything the listeners would care about.

This post has been edited by Porcus: May 14 2012, 12:00


--------------------
One day in the Year of the Fox came a time remembered well
2Bdecided
post May 14 2012, 11:58
Post #25


ReplayGain developer


Group: Developer
Posts: 4955
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Engelsstaub @ May 14 2012, 07:51) *
I'm basically trying to figure out how to interpret spectrograms.
One has a ~19 kHz low pass filter, the other doesn't.

You can't read much else useful from these pictures...

The temporal resolution of those pictures is about half a second (1 pixel = 400ms). The temporal resolution of the human ear is about 5ms at best due to temporal masking, and 0.011ms at best when detecting interaural time delays. So most of the interesting, audible and relevant details are lost in those pictures.

The amplitude resolution of those pictures is something like 10dB - at least my eyes struggle to see a change any smaller than about 10dB on the colour scale. Whereas human ears can, at best, detect a 1dB change. So again, most audible changes are basically invisible in those pictures.

A to-the-nearest-10dB snapshot of the spectrum, taken only twice each second - it still tells you a lot of course, but it also tells you nothing useful when comparing two almost identical files. In fact, it can perfectly hide many extremely audible differences. While clearly showing you some inaudible ones!

Cheers,
David.
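
The time-resolution argument in the post above comes down to simple arithmetic, sketched here. The 400 ms per pixel and 5 ms figures are taken from the post; the 4096-sample window and 44.1 kHz sample rate are assumptions for illustration, since the thread does not say how the spectrograms were generated.

```python
def stft_resolution(window_size, sample_rate):
    """Return (time span in ms, frequency bin width in Hz) of one analysis window."""
    time_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return time_ms, bin_hz


PIXEL_MS = 400.0   # width of one spectrogram pixel, as stated in the post
EAR_MS = 5.0       # best-case temporal resolution of the ear, as stated in the post

# How many ear-resolvable time slots are averaged into a single pixel column.
slots_per_pixel = PIXEL_MS / EAR_MS
print(f"one pixel spans {slots_per_pixel:.0f} intervals of {EAR_MS:.0f} ms")

# Hypothetical analysis settings: a longer window sharpens frequency but blurs time.
for window_size in (512, 4096, 32768):
    time_ms, bin_hz = stft_resolution(window_size, sample_rate=44100)
    print(f"window {window_size:6d}: {time_ms:7.1f} ms per frame, {bin_hz:6.2f} Hz per bin")
```

Whatever window is chosen, finer frequency detail costs time detail, and squeezing a whole track into an image a few hundred pixels wide discards most of the time detail regardless.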
