Full Version: Algorithm to use in a vocal RETAINER??
Joe Bloggs
OK, after some experimentation it is obvious to me that throwing away everything except the voice in the centre is much more difficult than throwing out the voice in the centre.

It was simple, really (why things don't work, that is)
To remove vocals, you have L-R in the left channel and R-L in the right channel (where -L and -R are used to denote inverted waveforms of the same channel)

If you subtract that from the original recording, you get
left channel L-(L-R) = R
right channel R-(R-L) = L

which is just the original recording with the left and right channels swapped

If you try downmixing the vocal-removed track to mono, you get
L-R+R-L = nothing!
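
A minimal sketch of the arithmetic above, assuming NumPy and made-up test signals (the names `vocal`, `left_inst` and `right_inst` are hypothetical, not from any real recording). It only checks the algebra; it is not a usable vocal extractor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
vocal = rng.standard_normal(n)       # centre-panned: identical in both channels
left_inst = rng.standard_normal(n)   # hypothetical instrument panned hard left
right_inst = rng.standard_normal(n)  # hypothetical instrument panned hard right

L = vocal + left_inst
R = vocal + right_inst

# Vocal-removed track: L-R in the left channel, R-L in the right channel.
removed_left = L - R
removed_right = R - L

# Subtracting the vocal-removed track from the original only swaps the channels.
out_left = L - removed_left    # equals R
out_right = R - removed_right  # equals L

# Downmixing the vocal-removed track to mono cancels completely.
mono = removed_left + removed_right
```

The vocal never comes back on its own: `out_left` and `out_right` still contain everything, and `mono` is silence.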

It seems that pop3smtp23 saw through the problem beforehand and started talking about frequency domain analysis, just as the professor did...

I guess I'm just being impatient--I'm meeting the professor tomorrow and I guess he will tell me all about how he intends to go about doing it--but I would like to hear some ideas from people here, if it's not too off-topic...

Of course, if you are trying to pick one voice out of many voices, you can't use the 'record the noise' trick...
pop3smtp23
I personally think picking one voice out of many without knowing its characteristics will be a very hard project, and it won't work without some additional AI that takes the spoken language into account. Example: if you are hearing a foreign language, it's hard to tell which person is talking; even in your mother tongue it gets really hard to tell how many people are talking when a lot of people are discussing. And your brain tries to improve the result by paying attention to voice characteristics. As a hint, you could use some voice characteristics, since people normally speak within a particular frequency band. So you could build a histogram as a first step, take the maxima, and then try bandpassing. This should work for "some persons talking". Just my few thoughts while coding some computer graphics...
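
One rough way to read the histogram-then-bandpass idea, assuming NumPy: treat the magnitude spectrum as the "histogram", take its single largest peak, and keep only a band around it. The function name `bandpass_around_peak` and the `half_width_hz` parameter are made up for illustration, and a brick-wall FFT mask is a stand-in for a proper bandpass filter.

```python
import numpy as np

def bandpass_around_peak(signal, sample_rate, half_width_hz=200.0):
    """Find the strongest frequency and keep only a band around it."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # The magnitude spectrum plays the role of the histogram; take its maximum.
    peak_hz = freqs[np.argmax(np.abs(spectrum))]

    # Crude bandpass: zero every bin outside the band around the peak.
    keep = np.abs(freqs - peak_hz) <= half_width_hz
    filtered = np.fft.irfft(spectrum * keep, n=len(signal))
    return peak_hz, filtered
```

For several talkers you would presumably need several maxima and overlapping bands, which is where this simple version stops being enough.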
daniel
You are doing (voice removal):
m/s: ch1=l+r, ch2=l-r
remove mid(ch1)
m/s: ch1=0+(l-r), ch2=0-(l-r)
get: ch1=l-r, ch2=r-l

If you do it the other way (remove side):
m/s: ch1=l+r+0, ch2=l+r-0
get: ch1=l+r, ch2=l+r
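
The mid/side steps above can be sketched like this, assuming NumPy. The helper names `to_mid_side` and `from_mid_side` are hypothetical, and the division by 2 on decode is an addition so that an untouched round trip returns the original levels (the lines above leave that factor out).

```python
import numpy as np

def to_mid_side(left, right):
    """Encode L/R as mid (sum) and side (difference)."""
    return left + right, left - right

def from_mid_side(mid, side):
    """Decode mid/side back to L/R; the /2 undoes the gain of the encode step."""
    return (mid + side) / 2.0, (mid - side) / 2.0

def remove_mid(left, right):
    """Voice removal: zero the mid, leaving (l-r)/2 and (r-l)/2."""
    _, side = to_mid_side(left, right)
    return from_mid_side(np.zeros_like(side), side)

def remove_side(left, right):
    """Side removal: zero the side, leaving the same mono mix in both channels."""
    mid, _ = to_mid_side(left, right)
    return from_mid_side(mid, np.zeros_like(mid))
```

With the side zeroed, both output channels are identical, which is the "downmix to mono" result.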

edit: i'm confused. need to sleep.
Joe Bloggs
Hm, would it help any if you are directly facing the person you want to listen to while the other speakers are off to the side? My idea, as I said, is to phase-shift and amplify the L/R channels so that the person you want to listen to SOUNDS like he's facing you even if he isn't, then do some processing to remove the voices to the side.

Wait... just saw daniel's post--can you explain what you are saying? ???

Well, looks like he's saying that the best you can do for 'side removal' is downmix to mono...

Anyway, I was saying... how about I compare the spectral analysis of the two channels and, for every time/frequency point, take the lower value of L and R? This ought to attenuate only the sounds off to the side! Is there some way for me to try it out on songs now? What editor would allow me to do this?
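
A rough sketch of the "take the lower of L and R at every time/frequency point" idea, assuming NumPy and a hand-rolled short-time FFT. The function name `keep_centre`, the frame size, the Hann window with 50% overlap, and the choice to borrow the phase from L+R are all assumptions made for illustration; the post only specifies the minimum of the two magnitudes.

```python
import numpy as np

def keep_centre(left, right, frame=1024):
    """Per-bin minimum of the L and R magnitudes, resynthesised to mono."""
    hop = frame // 2
    # Periodic Hann window: overlapped copies sum to 1 at 50% overlap.
    window = np.hanning(frame + 1)[:-1]
    out = np.zeros(len(left))

    for start in range(0, len(left) - frame + 1, hop):
        sl = slice(start, start + frame)
        spec_l = np.fft.rfft(left[sl] * window)
        spec_r = np.fft.rfft(right[sl] * window)

        # Lower magnitude of the two channels at each frequency bin.
        mag = np.minimum(np.abs(spec_l), np.abs(spec_r))
        # Assumption: reuse the phase of the mono sum L+R.
        phase = np.angle(spec_l + spec_r)

        # Overlap-add the modified frame back into the output.
        out[sl] += np.fft.irfft(mag * np.exp(1j * phase), n=frame)
    return out
```

Sound that is identical in both channels passes through unchanged (apart from the first and last half frame), and sound present in only one channel is removed entirely. Anything panned in between is only partly attenuated, so expect artefacts on real songs.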
pop3smtp23
If you could decide which person you are listening to, you could do everything. But extracting that person from the audio signal is the real problem; just taking L-R will only work in situations where the voice is the only thing in the centre and everything else is panned L/R.
daniel
What I'm saying is that you can keep:
a: left channel
b: right channel
c: mid channel
d: side channel

That's already 4 channels.
If your person is talking a little bit to the right of the centre, you mix the mid with a little right. The result should ALWAYS be mono (speech is mono panned). If you want to extract one person's voice, the first step is what I described. The second is spectral etc. etc.
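
A small sketch of "mix the mid with a little right", assuming NumPy. The function name `steer_to_mono` and the `right_amount` weight are hypothetical; the post does not say how much of the right channel to add, so the linear crossfade here is only one possible choice.

```python
import numpy as np

def steer_to_mono(left, right, right_amount=0.0):
    """Mono mix that leans from the mid channel towards the right channel.

    right_amount = 0.0 gives the plain mid, 1.0 gives the right channel only.
    """
    mid = 0.5 * (left + right)
    return (1.0 - right_amount) * mid + right_amount * right
```

For a talker slightly right of centre, something like `steer_to_mono(left, right, 0.2)` would be a starting point to tune by ear.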