Distributed, multithreading LAME?

Having spotted multithreading LAME and distributed LAME, I wonder if anyone has stuck the two together? I was first investigating whether a dual-core system would be more suitable for transcoding. Then I was pointed to LAME MT, and now I’ve just ordered a dual-processor, dual-core system. Obviously the purpose is near-real-time, on-the-fly transcoding.

Thinking about this a bit more, it’s occurred to me that after the above system turns up, I’ll have roughly 25 GHz worth of P4-equivalent power available around my home. You can probably see where I’m going with this. Is there conceivably a way to combine multi-threaded and distributed encoding of MP3s from a WAV source, called from one location?

Reply #1
Side note: I do not think that the current LAME MT is suitable for encoding, only for experimenting.

Reply #2
Quote
side note: I do not think that the current Lame-MT is suitable for encoding, only for experimenting.


I haven't really done a whole lot of detailed listening tests with it yet... this is all just going out on a limb. What's wrong with it specifically?

Reply #3
Quote
Obviously the purpose is near-real-time, on-the-fly transcoding.


Quote
I:\>lame -V 2 file.wav
LAME 3.97 (beta 1, Sep 12 2005) 32bits (http://www.mp3dev.org/)
CPU features: MMX (ASM used), SSE (ASM used), SSE2
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding I:\file.wav
      to I:\file.wav.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.3x) qval=3
    Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
  7089/7089  (100%)|    0:39/    0:39|    0:39/    0:39|   4.7149x|    0:00



I probably don't understand what you mean, but I thought real-time encoding was achieved a long time ago with LAME.

Reply #4
Quote
What's wrong with it specifically?

No bit reservoir

Reply #5
I wonder if a distributed encoder is possible across different processors, because I've seen on doom9.org that Xvid encodes have minor differences on Intel and AMD processors. I don't know exactly why this is, or whether LAME encodes on different processors produce minor differences, though.

Reply #6
Quote
I wonder if a distributed encoder is possible across different processors, because I've seen on doom9.org that Xvid encodes have minor differences on Intel and AMD processors. I don't know exactly why this is, or whether LAME encodes on different processors produce minor differences, though.

That's very interesting. My understanding was that results from different processors should be identical, but maybe one of the LAME devs can inform us.

The challenge with distributing MP3 encoding over multiple CPUs is that, if you include a bit reservoir, the frames of the MP3 are not independent. This means that if you have sounds A and B, then MP3(A|B) will be different from MP3(A)|MP3(B) (where | is concatenation). You can't simply split a sound up and encode the parts separately.
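
To make the dependence concrete, here is a toy model in Python. It is not how LAME or the MP3 format actually allocates bits: the function name, the per-frame budget, the reservoir cap, and the "demand" numbers are all made up for illustration. The only point is that carrying unspent bits from one frame to the next makes the result depend on what came before.

```python
def allocate_bits(demands, frame_budget=100, reservoir_cap=300):
    """Toy model: bits granted to each frame, starting with an empty reservoir.

    All numbers are hypothetical; real MP3 bit allocation is far more involved.
    """
    reservoir = 0
    granted = []
    for demand in demands:
        available = frame_budget + reservoir
        used = min(demand, available)
        # Unspent bits are saved for later frames, up to the cap.
        reservoir = min(reservoir_cap, available - used)
        granted.append(used)
    return granted


sound_a = [40, 40]    # two easy frames that leave bits unspent
sound_b = [180, 100]  # a demanding frame followed by an average one

whole = allocate_bits(sound_a + sound_b)                 # MP3(A|B)
split = allocate_bits(sound_a) + allocate_bits(sound_b)  # MP3(A)|MP3(B)

print(whole)  # [40, 40, 180, 100]
print(split)  # [40, 40, 100, 100]
```

In the split case, B starts with an empty reservoir, so its demanding first frame cannot borrow the bits that A left over.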

However, as mentioned above, considering that LAME can encode at over 10x realtime on most modern CPUs, is there any point to this? Sure, if you are encoding a lot of files, distributing them one file per CPU would come with a benefit, but distributing the encoding of a single file over multiple processors probably wouldn't do anything useful.

Reply #7
It's one of these:

http://www.doom9.org/codec-comparisons.htm

I don't remember which one though.


Edit: I remember now. The problem was with a reference build of an application versus one optimized for Intel, I believe. So it wasn't exactly the same program; it was an optimized build vs. a generic one.

Reply #8
The only mentions I could find were that they got different results from an AMD-optimised DLL and an Intel-optimised DLL of XviD. This is not a surprise, and has to do with the way floating-point math works. For example, if you have A=C*B+C*D, the optimiser could turn that into A=C*(D+B). The two are algebraically the same, but on floating-point hardware they are unlikely to come up with precisely the same answer.

This makes floating-point code optimised for various processors likely to give subtly different answers.
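
A quick way to see this for yourself is the Python sketch below. It evaluates both forms of the expression on random double-precision inputs and counts how often they disagree. The function name, the seed, and the input range are arbitrary choices for the illustration; it says nothing about what any particular XviD or LAME build actually does.

```python
import random


def count_mismatches(trials=10_000, seed=0):
    """Compare C*B + C*D with C*(D+B) on random doubles in [0, 1]."""
    rng = random.Random(seed)
    mismatches = 0
    worst = 0.0
    for _ in range(trials):
        b, c, d = (rng.uniform(0.0, 1.0) for _ in range(3))
        expanded = c * b + c * d   # two multiplies, one add
        factored = c * (d + b)     # one add, one multiply
        if expanded != factored:
            mismatches += 1
            worst = max(worst, abs(expanded - factored))
    return mismatches, worst


mismatches, worst = count_mismatches()
print(f"{mismatches} of 10000 random inputs differ; largest gap {worst:.3g}")
```

The gaps are tiny (rounding in the last bit or so), but they are enough to make two builds produce files that are not bit-identical.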

Reply #9
Yeah, that's right.  D'oh!

Reply #10
DOH!

Yes, of course, I misphrased myself. I meant on-the-fly as in transcoding while transferring, at as near 'USB line speed' as possible. If LAME MT works in a dual-threaded, dual-processor set-up, then I've worked out that I can expect transcode times of about 5 seconds per track.

Quote
Quote
Obviously the purpose is near-real-time, on-the-fly transcoding.


Quote
I:\>lame -V 2 file.wav
LAME 3.97 (beta 1, Sep 12 2005) 32bits (http://www.mp3dev.org/)
CPU features: MMX (ASM used), SSE (ASM used), SSE2
Using polyphase lowpass filter, transition band: 18671 Hz - 19205 Hz
Encoding I:\file.wav
      to I:\file.wav.mp3
Encoding as 44.1 kHz VBR(q=2) j-stereo MPEG-1 Layer III (ca. 7.3x) qval=3
    Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
  7089/7089  (100%)|    0:39/    0:39|    0:39/    0:39|   4.7149x|    0:00



I probably don't understand what you mean, but I thought real-time encoding was achieved a long time ago with LAME.

Reply #11
Quote
Quote
What's wrong with it specifically?

No bit reservoir



What does a bit reservoir do?

Reply #12
What do you want to achieve with this system? It would be fairly simple to build a system which caches the data until a song is complete, then encodes it with a single CPU. This design will have the same total throughput as one with multithreaded LAME (probably better, because of thread overhead) but four times the latency.

If it's a bulk data transfer application that you are developing, then latency is not a problem (you can wait twenty seconds for a song). A better idea of what you are trying to achieve would be really helpful.
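
As a back-of-the-envelope check of the throughput and latency claim, here is a small Python sketch. It assumes four CPUs, twenty seconds to encode one song on one CPU, and perfect scaling with no thread overhead for the multithreaded case; the function names and the batch of eight songs are made up for the example.

```python
import math


def file_per_cpu(n_songs, encode_seconds, n_cpus):
    """Each CPU encodes a whole song by itself."""
    latency = encode_seconds
    total = math.ceil(n_songs / n_cpus) * encode_seconds
    return latency, total


def ideal_multithreaded(n_songs, encode_seconds, n_cpus):
    """All CPUs share one song at a time (perfect scaling assumed)."""
    latency = encode_seconds / n_cpus
    total = n_songs * latency
    return latency, total


print(file_per_cpu(8, 20, 4))         # (20, 40)
print(ideal_multithreaded(8, 20, 4))  # (5.0, 40.0)
```

Both designs finish the batch in the same 40 seconds; only the wait for an individual song differs (20 seconds against 5), and any real thread overhead would tip the total in favour of one file per CPU.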

Reply #13
Quote
What do you want to achieve with this system? It would be fairly simple to build a system which caches the data until a song is complete, then encodes it with a single CPU. This design will have the same total throughput as one with multithreaded LAME (probably better, because of thread overhead) but four times the latency.

If it's a bulk data transfer application that you are developing, then latency is not a problem (you can wait twenty seconds for a song). A better idea of what you are trying to achieve would be really helpful.



Very simple. You know the Audiomorph feature in Red Chair's products or j.River Media Center's transfer transcode feature? An on-the-fly transcode from a library as you're transferring to a player? I'm trying to achieve that from FLAC to MP3.

Reply #14
Quote
Quote
What's wrong with it specifically?

No bit reservoir


Eventually (as in LAME 4.0+), will presets be developed that can function without the bit reservoir? Or is it important enough to be required for best results?

From what I've heard, it's not all that useful for high-bitrate VBR, but I'm personally not too familiar with it.

Also, why is support in the encoder so important? It seems like a Perl script could do the same thing with current versions of LAME right now with 1000x less work and still use the bit reservoir.

Reply #15
Quote
Very simple. You know the Audiomorph feature in Red Chair's products or j.River Media Center's transfer transcode feature? An on-the-fly transcode from a library as you're transferring to a player? I'm trying to achieve that from FLAC to MP3.

This wouldn't be difficult at all to develop without encoder support. Basically, you need a script that waits until the FLAC files have been transferred, then waits until a CPU is free and starts the encoder. Once an encode is finished, delete the FLAC and transfer the MP3 over the outgoing cable.

As Mike Giacomelli said, you can do this in 50 lines of Perl. This approach will not only support the bit reservoir and be faster than a single multithreaded LAME, but also be easily extensible to other encoders.
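
For what it's worth, the core of such a script might look something like the sketch below. It is written in Python rather than Perl, and it is only a rough outline under stated assumptions: the `flac` and `lame` command-line tools are on the PATH, the function names are invented here, and `-V 2` is just the setting from the log earlier in the thread. Waiting for the transfer, deleting the FLAC, and copying the MP3 to the player are left out.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_commands(flac_path):
    """Return the decoder command, encoder command and output path for one file."""
    mp3_path = Path(flac_path).with_suffix(".mp3")
    # flac decodes to WAV on stdout; lame reads WAV from stdin ("-").
    decode = ["flac", "--decode", "--stdout", "--silent", str(flac_path)]
    encode = ["lame", "-V", "2", "-", str(mp3_path)]
    return decode, encode, mp3_path


def transcode(flac_path):
    """Pipe flac into lame for one file; each file gets a normal, whole-file encode."""
    decode, encode, mp3_path = build_commands(flac_path)
    with subprocess.Popen(decode, stdout=subprocess.PIPE) as decoder:
        subprocess.run(encode, stdin=decoder.stdout, check=True)
    return mp3_path


def transcode_all(flac_paths, worker=transcode, jobs=None):
    """Run at most one encoder per CPU; the next file starts when a slot frees up."""
    jobs = jobs or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(worker, flac_paths))
```

Calling `transcode_all(["01.flac", "02.flac"])` would encode the two files side by side on a multi-CPU machine. Because every file goes through an ordinary single-threaded LAME run, the bit reservoir works as usual, and swapping in another encoder only means changing the `encode` command.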