Full Version: Using neural networks in Audio compression
Primius
Wikipedia article on neural networks:
Artificial neural networks

My idea:

Using neural networks in lossless compression:

You could use neural nets for the prediction stage in lossless coders.
The neural network would be "trained" by the encoder to make accurate predictions for a given music sample, and would be transmitted to the decoder along with some correction data.

Encoder:

You could "feed" the neural net small excerpts from your music sample
and compare its predictions to the real values in the music sample.
If they are nearly equal, you "reward" the network. If they differ too much, you "punish" the network.
Like a dog learning a trick you teach it, the net will "learn" to predict more accurately.
By "learn", I mean that the net's configuration* changes a bit on each punishment,
to avoid making the same "mistakes" again.
There are many types of learning algorithms for neural networks; the Wiki article describes the best-known ones.

The learning procedure stops when the performance of the network is good enough, or when the encoding time reaches a limit set by the user.
A small neural network couldn't predict the whole "song" accurately from just a few chunks, so correction
data is still needed. If the net has been trained well, the correction data will be small.

The encoder then writes the neural network and the correction data to the compressed file.

*the weights of the inputs of each neuron in the network

Decoder:

The decoder gets the neural network and the correction data from the file.
The trained net then predicts each next sample from the previous ones.
The original data is reconstructed from the predictions and the correction data.
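
The encoder and decoder steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested codec: the network size (`ORDER`, `HIDDEN`), the scaling constant `SCALE`, the training loop and all function names are invented for the example, and plain gradient descent stands in for the "reward/punish" learning. The "correction data" is simply the difference between each real sample and the rounded prediction. A real codec would also need bit-exact arithmetic (for example fixed-point) so that encoder and decoder predictions agree on every machine.

```python
import math
import random

ORDER = 4      # past samples fed to the net (assumed value)
HIDDEN = 3     # hidden neurons (assumed value)
SCALE = 128.0  # scales samples into roughly [-1, 1] (assumed value)

def init_net(seed=0):
    """Return small random weights: (hidden layer rows, output weights)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(ORDER)] for _ in range(HIDDEN)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
    return w1, w2

def forward(net, history):
    """Run the net on the last ORDER samples; return inputs, hidden, output."""
    w1, w2 = net
    x = [s / SCALE for s in history]
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = sum(w * hi for w, hi in zip(w2, h))
    return x, h, y

def predict(net, history):
    """Integer prediction, so encoder and decoder use the same value."""
    return int(round(forward(net, history)[2] * SCALE))

def train(net, samples, epochs=50, lr=0.05):
    """Plain stochastic gradient descent on the squared prediction error."""
    w1, w2 = net
    for _ in range(epochs):
        for n in range(ORDER, len(samples)):
            x, h, y = forward(net, samples[n - ORDER:n])
            err = y - samples[n] / SCALE
            for j in range(HIDDEN):
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)  # uses old w2[j]
                w2[j] -= lr * err * h[j]
                for i in range(ORDER):
                    w1[j][i] -= lr * grad_pre * x[i]
    return net

def encode(net, samples):
    """Return the first ORDER samples verbatim plus the correction data."""
    warmup = list(samples[:ORDER])
    residuals = [samples[n] - predict(net, samples[n - ORDER:n])
                 for n in range(ORDER, len(samples))]
    return warmup, residuals

def decode(net, warmup, residuals):
    """Rebuild the samples from the net's predictions and the corrections."""
    out = list(warmup)
    for r in residuals:
        out.append(predict(net, out[-ORDER:]) + r)
    return out

# Synthetic stand-in for a music excerpt (assumed test signal).
samples = [int(round(100 * math.sin(0.1 * n))) for n in range(200)]
net = train(init_net(), samples)
warmup, residuals = encode(net, samples)
restored = decode(net, warmup, residuals)
print(restored == samples, max(abs(r) for r in residuals))
```

The round trip is lossless by construction, whatever the quality of the training; training only affects how small the residuals are, and therefore how well they would compress.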

Advantages (compared to today's* predictors):

- Flexibility:
There is no limit on how the net predicts.
It could be very decision-like** or smooth like a linear predictor,
depending on which method of prediction gives the best results.

- Compression rate:
I expect the rates to be very good, since every compressed file would have its own
near-perfectly fitting "prediction algorithm".

*I mean FLAC, Monkey's Audio and other well-known lossless codecs.
**I mean Boolean operations and if-conditions.

Disadvantages:

- It needs a lot of computation time, because the "learning" of the neural network is a kind of trial and error.
- It's not easy to find a well-working "architecture" for the neural network
(how many neurons? how many layers? which learning algorithm? ...)

What do you think?

Due to my lack of coding skills I can't run an experiment to back up my theory.

Since English isn't my native language, I expect you to ask if something isn't clear,
or slap me if I don't make any sense :)

Greetings,
Primius
legg
You might want to survey the literature on the subject:
http://citeseer.csail.mit.edu/cs?q=neural+...+Documents&cs=1

I haven't worked on neural networks; I just know the basics. But a few questions arise that anyone proposing them as a technique should answer:
- How good is the approximation given by a neural network when compared to a polynomial or LPC one?
- How many parameters do you need?
- What's the distribution of the residual?

Neural networks are obviously slower than LPC and polynomial prediction. To make them worthwhile, you'll have to show that the ratios obtained are worth the long wait (training the NN could take days or even weeks, AFAIK).
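
One hedged way to start on the first and third questions is to compute the residual of each candidate predictor on the same audio and compare the empirical entropy of the residuals. The sketch below does this only for the second-order fixed polynomial predictor, 2*s[n-1] - s[n-2]; the function names and the synthetic sine signal are assumptions made for the example. The entropy figure is a zeroth-order estimate of bits per sample and ignores any side information such as transmitted weights.

```python
import math
from collections import Counter

def poly2_residuals(samples):
    """Residual of the second-order polynomial predictor 2*s[n-1] - s[n-2]."""
    return [samples[n] - (2 * samples[n - 1] - samples[n - 2])
            for n in range(2, len(samples))]

def empirical_entropy_bits(values):
    """Zeroth-order entropy of the value histogram, in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Synthetic stand-in for real audio (assumed test signal).
samples = [int(round(1000 * math.sin(0.05 * n))) for n in range(2000)]
residuals = poly2_residuals(samples)

print("raw samples :", empirical_entropy_bits(samples))
print("residuals   :", empirical_entropy_bits(residuals))
print("histogram   :", sorted(Counter(residuals).items()))
```

Running the same measurement on the residual of an NN predictor, and adding the cost of its weights, would give a first like-for-like comparison.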
HotshotGG
I am not too sure how neural networks actually work, although I did see a research paper that someone wrote that described using a neural network in place of an actual psychoacoustic model for audio compression. I think it was in the IEEE Xplore database, although I can't remember. CiteSeer is good, and it is one of the few databases you don't need a subscription to access. ;)
Enig123
IIRC, Monkey's Audio applied some kind of NN technique.
kwwong
Neural networks are a form of non-linear adaptive filtering. I was very involved in linear adaptive filtering, but unfortunately I do not know much about non-linear adaptive filtering. :(
Primius
QUOTE
- How many parameters do you need?

The encoder has to deliver the weights of the inputs of each neuron.
If you quantize each weight with 16 bits and you have,
let's say, 4000 neurons (each with about 2000 inputs), this would result in 128,000,000 bits (16,000,000 bytes, or 15,625 KB);
roughly 15 MB would be needed in this case.
Note that the "parameters" won't be "refreshed" every xxxx samples like in other prediction algorithms.
They will be used for the whole file.
I don't know if 4000 neurons would be enough or already overkill.
The same question applies to the 16 bits.
I think NNs have great potential, but obviously many tests need to be done on this subject.
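
The arithmetic above can be checked with a few lines. The function name is made up for the example, and the comparison with uncompressed CD audio (44100 Hz, 2 channels, 16 bits) is an added assumption to give the number some scale: the weights alone come to 16,000,000 bytes, about 15.26 MB in units of 1024 * 1024 bytes, or roughly 91 seconds of raw CD audio.

```python
def weights_size_bytes(neurons, inputs_per_neuron, bits_per_weight):
    """Size of all quantized input weights, ignoring biases and file overhead."""
    return neurons * inputs_per_neuron * bits_per_weight // 8

size = weights_size_bytes(4000, 2000, 16)
print(size)               # bytes
print(size / 1024)        # in units of 1024 bytes
print(size / 1024 ** 2)   # in units of 1024 * 1024 bytes

# Assumed comparison: uncompressed CD audio, 44100 Hz, 2 channels, 16 bits.
cd_bytes_per_second = 44100 * 2 * 16 // 8
print(size / cd_bytes_per_second)   # seconds of raw CD audio of the same size
```

So for a typical song the network would have to be far smaller than this example for the weights not to dominate the file.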
QUOTE
I did see a research paper that someone wrote that described using a neural network in place of an actual psychoacoustic model for audio compression

Using NNs as a psychoacoustic model is another interesting idea.
Imagine you "train" an encoder instead of "tuning" it, just by rating the work of the psychoacoustic model
until it works well enough.
QUOTE
IIRC, Monkey's Audio applied some kind of NN technique.

I read the Monkey's Audio theory page and couldn't find anything similar to a neural network; maybe I overlooked something...

It looks like I'm a bit late with my idea.
I just saw a paper about using NNs in image compression that is 10 years old. :)
kwwong
QUOTE(Primius @ Apr 23 2006, 10:22 AM)

It looks like I'm a bit late with my idea.
I just saw a paper about using NNs in image compression that is 10 years old. :)


Hey, even though it is 10 years old, that doesn't mean there aren't any new advances or breakthroughs in this area. :D
SebastianG
Many seem to try to make use of "hip technologies" like neural networks or wavelets for all sorts of things, as if these technologies were some kind of ultimate tool for everything. They are not. Linear prediction models the spectral shape of a portion of audio data quite well. However, the spectral shape usually changes over time, and it might be worth trying to predict the spectral shape via some other predictors (but simple ones, please -- like differential coding of reflection coefficients or something).

my 2 cents

Sebi
pest
QUOTE

Many seem to try to make use of "hip technologies" like neural networks or wavelets for all sorts of things, as if these technologies were some kind of ultimate tool for everything. They are not.


Integer wavelets, for example, aren't useful for exploiting correlations; instead
they provide a simple way to split the data into different types of signals.
The entropy coding improves, as my tests with an adaptive filterbank showed.
SebastianG
Sorry, I didn't mean to discredit the usefulness of wavelets for purposes like compression. (They seem to work well for image coding)

Sebi
Primius
QUOTE(SebastianG @ Apr 24 2006, 12:50 PM)

Many seem to try to make use of "hip technologies" like neural networks or wavelets for all sorts of things, as if these technologies were some kind of ultimate tool for everything. They are not. Linear prediction models the spectral shape of a portion of audio data quite well. However, the spectral shape usually changes over time, and it might be worth trying to predict the spectral shape via some other predictors (but simple ones, please -- like differential coding of reflection coefficients or something).

my 2 cents

Sebi


I agree with you that neural networks aren't an ultimate tool for everything, but I think neural networks and
prediction algorithms in general have similar properties.

Quote from the Wikipedia article:
"They can be used to model complex relationships between inputs and outputs or to find patterns in data."

Prediction algorithms do the same job when predicting the next sample.
They use their "knowledge" about the relationships between the inputs (known samples) and the output (predicted sample) to make a prediction.

The difference between NNs and prediction algorithms is what kind of "knowledge" they have.
The "knowledge" of prediction algorithms is hardcoded.
An NN will figure out the relationship between samples by itself (during the training phase in the encoder).
Since its "knowledge" is somewhat "unlimited", it could find patterns and laws in a simple piece of music that no prediction-based coder has thought of, resulting in more accurate predictions.

I don't want to claim that NNs are the best predictors, but the similarities between NNs and popular prediction algorithms indicate that NNs might be useful for prediction.
SebastianG
heheh :)
You asked for opinions, right?
I wouldn't try going into this direction ... even with enough time on my hands.

Sebi
pest
QUOTE

The difference between NNs and prediction algorithms is what kind of "knowledge" they have.
The "knowledge" of prediction algorithms is hardcoded.
An NN will figure out the relationship between samples by itself (during the training phase in the encoder).


This is not true. Even the simplest linear predictor can easily be made to "know" what kind of input it processes.
This is called adaptive ;)
The main advantage of an NN is that it can cover different relations than linear predictors can. For example,
you can make a linear predictor that adjusts its weighting step size according to the error... whoops,
you enter the world of nonlinearity...
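
As a rough illustration of the adaptive linear case, here is a sketch of a backward-adaptive predictor using the normalized LMS (NLMS) update. All names and parameter values (`order`, `mu`, `eps`) are assumptions for the example. Note that the step size here is normalized by the energy of the recent samples, not by the error, so the error-dependent variant mentioned above is not what is shown. Because the weights are updated only from already-decoded samples, the decoder can repeat the same updates and no coefficients need to be stored.

```python
import math

def _nlms(stream, decoding, order, mu, eps):
    """Shared loop: the same adaptation runs in the encoder and the decoder."""
    weights = [0.0] * order
    history = [0] * order          # most recent sample first
    out = []
    for value in stream:
        pred = int(round(sum(w * h for w, h in zip(weights, history))))
        if decoding:
            sample, err = pred + value, value   # value is a residual
        else:
            sample, err = value, value - pred   # value is a sample
        out.append(sample if decoding else err)
        energy = sum(h * h for h in history) + eps
        step = mu * err / energy                # normalized step size
        weights = [w + step * h for w, h in zip(weights, history)]
        history = [sample] + history[:-1]
    return out

def nlms_encode(samples, order=8, mu=0.5, eps=1e-6):
    """Return the prediction residuals for a list of integer samples."""
    return _nlms(samples, False, order, mu, eps)

def nlms_decode(residuals, order=8, mu=0.5, eps=1e-6):
    """Rebuild the samples from residuals produced by nlms_encode."""
    return _nlms(residuals, True, order, mu, eps)

# Synthetic stand-in for real audio (assumed test signal).
samples = [int(round(1000 * math.sin(0.05 * n))) for n in range(500)]
residuals = nlms_encode(samples)
print(nlms_decode(residuals) == samples)
```

As with any floating-point sketch, a real codec would use integer arithmetic so that encoder and decoder stay bit-exact across machines.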
kwwong
QUOTE(pest @ Apr 24 2006, 01:15 PM)

This is not true. Even the simplest linear predictor can easily be made to "know" what kind of input it processes.
This is called adaptive ;)
The main advantage of an NN is that it can cover different relations than linear predictors can. For example,
you can make a linear predictor that adjusts its weighting step size according to the error... whoops,
you enter the world of nonlinearity...


That was what I meant. An NN is a NON-linear adaptive filtering method, which unfortunately I know nothing about. :P