Ideas on how to test if video quality affects perceived audio quality?, [was “Scientific method to use (College Dissertation Help!)”]
MikeyTSH
post Jan 14 2013, 00:59
Post #1
Not sure if this is the right forum to post this in, if not, feel free to move the topic and apologies mods!

Anyhoo, I'm at university in England doing my final-year dissertation. I want to see whether visual stimuli affect the perception of audio quality. The plan is to play test subjects WAV, MP3 and AAC files, with the latter two encoded at 320, 256, 128 and 96 kbps. After ABX testing the MP3 against the AAC at each bitrate, I then want to throw video in at 1080p, 720p, 480p and 240p. Test subjects will do an ABX test on the audio, but I want to see whether a corresponding/conflicting video quality affects their choices.

An idea I had was to have the audio quality drop throughout the video too, but I'm having real trouble working out the best way to test this. I've already made a test "scaling" file, simply putting cross-fades between each file to make the transition as seamless as possible, but my big question is really what testing method to use here. I can hardly ask "which do you think sounds better?" because the audio is going to degrade in quality.

What I really want is to ask them to hit the pause button when they believe the audio quality has changed, but I'm worried about two things: 1) will this cause pre-emptive pressing of the button, and 2) if I can get around that, what kind of testing method would I use? The idea for this test is to have two video files: one remaining at 1080p, and another that degrades in step with the audio quality, to test the visual-stimuli hypothesis. But here again I hit a brick wall: do I ask a) "when did the audio quality change?" or b) "do you think the audio quality changed?", in addition to the question "which one sounded better?" My lecturer suggested that having both MP3 and AAC can cement my hypothesis in BSc territory, as I can hypothesize directly based on how the codecs technically work.

Any and all input is greatly appreciated. When I look on the internet for answers, I end up going, for example, "oh, null testing, what's that?", and then 20 links later I'm down a rabbit hole that has me scratching my head and, unfortunately, procrastinating.

Thanks again!
Mike

Dynamic
post Jan 14 2013, 23:05
Post #2
First thing is that headphones are more likely to reveal artifacts than speakers. Artifact training may also help.

Try some ABX tests yourself. This is best done by comparing the lossless original (an experimental control) to an encoded file. ABC/HR is also useful for comparing multiple codecs against the lossless control (or various 'anchors').
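For scoring the ABX results, a one-tailed binomial test tells you how likely a given score would be by pure guessing. A minimal Python sketch (the function names here are my own, not from foobar2000 or any particular tool):

```python
import math
import random

def abx_assignments(n_trials, seed=None):
    """Randomly decide, for each trial, whether X is the A or the B file."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def abx_p_value(correct, trials):
    """One-tailed binomial p-value: the chance of scoring at least
    `correct` out of `trials` by guessing (p = 0.5 per trial)."""
    return sum(math.comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 correct out of 16 trials is a commonly used passing score:
print(round(abx_p_value(12, 16), 4))  # 0.0384
```

Anything below about p = 0.05 is conventionally taken as evidence the listener really heard a difference.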

Well-encoded MP3 (e.g. LAME VBR) is remarkably good down to 128 kbps (stereo) and often needs careful attention, and potentially artifact training, even at 96 kbps (stereo), where it falls down more dramatically. I guess you'd use CBR, which isn't quite as good.

Decent AAC is likely to be pretty darned good at 96 kbps (stereo), but FAAC, for example, is considered a fairly poor encoder, so it depends on the encoder. It might be hard to show a distinct effect without going lower in bitrate. Above about 160 or 192 kbps, both AAC and MP3 are likely to be extremely hard to distinguish from lossless.

I can't see a gradual change in quality being viable. Wouldn't it be better to compare pairs of videos, e.g.

1080p AAC256 vs 720p AAC256
1080p AAC096 vs 720p AAC256
1080p AAC096 vs 720p AAC096
1080p AAC256 vs 720p AAC096

You can then separate out true audio differences from video-induced differences.
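That full crossing of conditions and comparison pairs can be enumerated programmatically; a small sketch, with the level names just placeholders for whichever resolutions and bitrates end up being tested:

```python
from itertools import product

# Hypothetical factor levels; swap in whatever resolutions and bitrates you test.
video_levels = ["1080p", "720p"]
audio_levels = ["AAC256", "AAC096"]

# Every (video, audio) condition, fully crossed:
conditions = list(product(video_levels, audio_levels))

# Every ordered pair of distinct conditions a participant could compare:
pairs = [(a, b) for a in conditions for b in conditions if a != b]

print(len(conditions), len(pairs))  # 4 12
```

With all four cells of the 2x2 design filled, an analysis can attribute preference differences to the audio factor, the video factor, or their interaction.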
MikeyTSH
post Jan 15 2013, 01:15
Post #3
I think I'll drop the variable one. As far as encoding goes, I'm using LAME for MP3 and iTunes' encoder for AAC, since I want to keep it as close to consumer-style encoding as possible. I'll have to check what kind of encoding Amazon MP3 uses and get as close as I can to that.

Thanks so much for the response, it's all incredibly helpful - as I've said, I find myself going down a rabbit hole that seems endless!
MikeyTSH
post Jan 15 2013, 13:34
Post #4
Sorry for the double post, but I just thought I'd ask: do you know of anywhere I could get a program that does ABX with video support? I have foobar2000 with the component that allows ABX testing.

That said, would I just get them to watch the video but tell them to focus on the audio? (I'm wondering whether I should tell them to listen for which sounds worse, and how to phrase the question to them.) I want to do an audio-only test first to rule out any possible issues. Also, what should I take into account regarding the listening medium? I was thinking of using headphones, but what volume would be best? I want to stay as level as I can with the Fletcher-Munson curves...

db1989
post Jan 15 2013, 14:33
Post #5
QUOTE (MikeyTSH @ Jan 15 2013, 00:15)
I'll have to check and see what kind of encoding Amazon MP3 use and get as close as I can to that.
Amazon seems to be quite inconsistent in its use of various versions and configurations of LAME, at least last I heard. There has been a discussion or two here about this in the past; I don’t have links to hand, but you should be able to find them via a search, and they may be useful.
mzil
post Jan 15 2013, 17:13
Post #6
OP, I think you'll get a lot out of this, plus the references at the end should give you hours of further info:
http://gamepipe.usc.edu/~zyda/pubs/Presence9.6.pdf
MikeyTSH
post Feb 3 2013, 21:50
Post #7
A quick question: if testing through headphones, which model should I use? And would I need to document how those particular cans affect the sound, and if so, how would I present that?

Thanks!
LithosZA
post Feb 4 2013, 12:10
Post #8
QUOTE
First thing is that headphones are more likely to reveal artifacts than speakers. Artifact training may also help

+1

QUOTE
A quick question: if testing through headphones, which model should I use? And would I need to document how those particular cans affect the sound, and if so, how would I present that?


Most headphones should reveal low-bitrate flaws quite easily, I think. Something like the Sennheiser HD 280 PRO or Sony MDR-7506 should work fine.

I think the best way would be to create an ABC/HR tool that randomly muxes the encoded audio tracks with random video tracks.
Maybe it's something I can try to write in my spare time...
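The muxing part of such a tool doesn't need much code if ffmpeg is available. A sketch of the random pairing and the ffmpeg command construction (the function names and file names are mine, purely illustrative):

```python
import random
import subprocess

def random_pairing(video_files, audio_files, seed=None):
    """Pick one video track and one encoded audio track at random."""
    rng = random.Random(seed)
    return rng.choice(video_files), rng.choice(audio_files)

def build_mux_cmd(video, audio, out_path):
    """ffmpeg command that muxes the chosen video and audio streams,
    stream-copying both so no re-encoding alters the test stimuli."""
    return ["ffmpeg", "-y", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0", "-c", "copy", out_path]

# Record which pair was actually presented, then run ffmpeg on it:
video, audio = random_pairing(["clip_1080p.mp4"], ["clip_96k.m4a"])
# subprocess.run(build_mux_cmd(video, audio, "trial_01.mp4"), check=True)
```

Logging the `(video, audio)` pair per trial is what lets you score the responses afterwards.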

If we are talking about stereo here, then some of the bitrates listed are already way higher than what most consider transparent.
I would rather do the following:
- Lossless reference
- 48 kbit/s HE-AACv2
- 64 kbit/s HE-AACv2
- 96 kbit/s LC-AAC
- 192 kbit/s LC-AAC
DonP
post Feb 4 2013, 13:44
Post #9
A recent issue of Scientific American has an article on cross-sensory perception.
Related to your experiment: people's perception of how fresh potato chips were was influenced by listening to recordings of fresh vs. soggy chips being crunched while they ate their own samples.

Another finding I found fascinating: subjects (who had no prior training in lip reading) watched a few minutes of silent video of someone talking, then listened to a noisy recording of someone talking. They were better able to understand the noisy speech if the speaker was the same person they had seen in the silent video.

The online article linked below is mostly supplementary video; the text is much shorter than in the magazine:
http://www.scientificamerican.com/article....-senses-at-time
