caudec: a multiprocess audio converter for Linux and OS X. Leverages multi-core CPUs with lots of RAM.

Feb 15 2012, 19:40, Post #1
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I'd like to introduce a little program I wrote.
Caudec is a BASH script that transcodes audio files from one format (codec) to another. It leverages multi-core CPUs with lots of RAM by copying input files to a tmpfs mount, and running multiple processes concurrently (one per file and per codec).
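A minimal Python sketch of the idea described above: copy input files one at a time into a RAM-backed directory while several encode jobs run concurrently. This is not caudec's actual code (caudec is a Bash script); `transcode_all`, `encode`, `workers` and `ram_dir` are hypothetical names chosen for illustration, and the encoder is whatever function the caller passes in.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def transcode_all(inputs, encode, workers=os.cpu_count() or 1, ram_dir=None):
    """Stage inputs sequentially in a temp directory and encode them concurrently.

    `encode(path)` is a caller-supplied (hypothetical) function that processes
    one staged file and returns a result, e.g. by running an external encoder.
    """
    # ram_dir is assumed to point at a tmpfs mount such as /dev/shm on Linux;
    # when it is None, tempfile falls back to the default temp directory.
    with tempfile.TemporaryDirectory(dir=ram_dir) as staging:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = []
            for index, src in enumerate(inputs):
                # Sequential copy: only this loop ever reads the source drive.
                staged = os.path.join(
                    staging, f"{index:04d}_{os.path.basename(src)}")
                shutil.copyfile(src, staged)
                # The encode job is queued as soon as its file is staged.
                futures.append(pool.submit(encode, staged))
            # Collect results in input order before the staging dir is removed.
            return [future.result() for future in futures]
```

Threads are enough here if `encode` spends its time waiting on an external encoder process. Note that this sketch stages every file as fast as the drive allows and does not cap how much RAM the staged copies use; a real tool would need to handle that.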
Tested under Arch Linux. Download here. I tried my best to make sure that it'll run properly outside of my own environment, but I might have missed a couple of things. Please use the issues tracker to report any bugs. Feedback is most welcome!
--------------------
caudec -c lossyFLAC -q S *.flac

Feb 15 2012, 22:45, Post #2
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I just released version 1.1.0, which adds support for Musepack.

Feb 16 2012, 00:34, Post #3
Group: Members, Posts: 149, Joined: 20-September 11, Member No.: 93842
Excuse my ignorance, but does TAK actually work under Linux?
This post has been edited by Dario: Feb 16 2012, 00:34

Feb 16 2012, 00:38, Post #4
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
The encoder/decoder (Takc.exe) works with Wine. Linux users can use it for archiving, while transcoding to some other codec (e.g. lossy) for listening purposes. Caudec supports TAK encoding and decoding if the user has installed both Wine and TAK.
This post has been edited by skamp: Feb 16 2012, 00:38

Feb 17 2012, 08:21, Post #5
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
It just occurred to me that I left out one of caudec's main selling points: it's fast. It sounds obvious to me, but maybe it isn't so much. I was never a sales person. It might also not be obvious that it works best on somewhat large sets of files (e.g. a whole album with one or two CDs, one file per track).
Encoding ABBA's 2CD The Definitive Collection (148 minutes, 37 tracks) from WAV to FLAC --best, with one process, on a Core i7 @ 2.2 GHz: 46x real time.
Same as above, with 8 processes: 186x.
Just for kicks, FLAC -5 (default setting) with 8 processes encodes at 569x, TAK -p2 at 743x.
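To put those "x real time" figures into wall-clock terms, here is a small Python calculation using only the numbers quoted in the post. The function name `encode_seconds` is made up for illustration.

```python
def encode_seconds(audio_minutes, speed_factor):
    """Wall-clock seconds needed to encode the audio at N x real time."""
    return audio_minutes * 60 / speed_factor


ALBUM_MINUTES = 148  # from the post: 2 CDs, 37 tracks

for label, factor in [("FLAC --best, 1 process", 46),
                      ("FLAC --best, 8 processes", 186),
                      ("FLAC -5, 8 processes", 569),
                      ("TAK -p2, 8 processes", 743)]:
    print(f"{label}: {encode_seconds(ALBUM_MINUTES, factor):.1f} s")

# Ratio of the two --best figures: what the 8 processes bought.
print(f"speed-up from 8 processes: {186 / 46:.2f}x")
```

For the --best case that works out to roughly 193 seconds with one process against about 48 seconds with eight, a speed-up of around 4x.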

Feb 23 2012, 01:35, Post #6
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I just released version 1.3.0 of caudec, that
Upgrading is highly recommended, if only for the bug fix. Please report any issues using the issues tracker.
This post has been edited by skamp: Feb 23 2012, 01:36

Jun 2 2012, 00:31, Post #7
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
Hi skamp. I tried your caudec script and it is definitely very fast. I tested it by transcoding an album of FLACs to Ogg Vorbis (-q 7), and it shaved maybe 40% off the time taken by oggenc, by ffmpeg>wav>oggenc, or by straight ffmpeg -i $file -acodec libvorbis etc. As far as I can tell, all of the speed benefit comes from parallel processing: I checked this by processing a single file, and in that case caudec is in fact slower than oggenc or a more typical bash script.
So I'm wondering what the point is of creating the tmpfs and doing so much copying. Is it just to facilitate dropping files in and out of a queue? I can't see any need to create a memory-consuming structure for machines with large amounts of RAM, because transcoding is almost all CPU. I like your script's speed, but I wonder if the same thing couldn't be achieved more simply by using job control to get bash running parallel encoder processes. Or maybe I missed something important?

Jun 2 2012, 00:40, Post #8
Group: Super Moderator, Posts: 3268, Joined: 26-July 02, From: princegeorge.ca, Member No.: 2796
While encoding is a parallel task, reading from a drive is intrinsically sequential. You can't double read speed by reading 2 files at once. In fact, you're likely to harm read speed. By queuing disk operations and running encoding purely in RAM, caudec cuts out the parallel read bottlenecks and runs the process as fast as possible.
-------------------- (atrix|(fb2k->e-mu 0404 usb|audio 8 dj))->hd280|jvc ha-fx35-b

Jun 2 2012, 08:08, Post #9
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
I can see the logic, but disk read speeds are very high these days. How can there be a bottleneck when reading 6 or 8 or 10 lossless files of maybe between 20MB and 50MB each, which are going to take a few seconds to decode and encode anyway? Surely that doesn't present any kind of challenge with modern hardware?
I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux, so I was keen to try to make up the difference, and Skamp's script prompted me to go back to my bash scripts and add some parallel processing. My scripts are simpler stuff: essentially a decode + dump metadata function, an encoder function and a metadata writer function. By letting the core functions of the script run in parallel/background processes (number of cores + 1) I can achieve about the same improvements. For example, the directory I transcoded earlier, flacs to oggs:

my original bash script:
real 3m3.301s
user 3m8.952s
sys 0m3.496s

caudec:
real 1m47.993s
user 3m11.467s
sys 0m4.126s

my bash script with some parallel processes/backgrounding:
real 1m52.904s
user 3m10.826s
sys 0m3.877s

But I only have a 4 year old dual core AMD Athlon64 desktop, a 5 year old Core Duo (32-bit only) and a similar vintage Core 2 Duo... no experience of i7 here, so I can't personally scale my tests up to 4 cores and 8 threads. Has anyone with modern hardware (quad core, multi GB RAM, SATA III etc) actually measured the difference, and if so, found it to be substantial? At the moment I can see Skamp's caudec page, which compares single thread processing (and I assume conventional read from HDD) with parallel processing from tmpfs. Obviously the parallelism makes a huge difference, and perhaps that accounts for all or almost all of the difference, so what is missing is some data showing that the tmpfs is solving a problem or adding a benefit.
edited for typos.
This post has been edited by Takla: Jun 2 2012, 08:10
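The three `time` results quoted in post #9 can be compared with a few lines of Python. This is only a reading aid built from the figures in the post; `to_seconds` is a made-up helper, and the trailing "s" on the caudec sys value is assumed.

```python
import re


def to_seconds(stamp):
    """Convert a `time` value such as '3m3.301s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (real, user, sys) as quoted in the post; the final "s" of the caudec
# sys value is an assumption.
runs = {
    "original bash script": ("3m3.301s", "3m8.952s", "0m3.496s"),
    "caudec": ("1m47.993s", "3m11.467s", "0m4.126s"),
    "parallel bash script": ("1m52.904s", "3m10.826s", "0m3.877s"),
}

baseline = to_seconds(runs["original bash script"][0])
for name, (real, user, sys_) in runs.items():
    wall = to_seconds(real)
    cpu = to_seconds(user) + to_seconds(sys_)
    print(f"{name}: {wall:.1f} s wall, "
          f"{1 - wall / baseline:.0%} less than baseline, "
          f"{cpu / wall:.2f} cores busy on average")
```

On these numbers caudec takes about 41% less wall time than the original script and the parallel bash script about 38% less, while total CPU time (user + sys) is nearly the same in all three runs. That is consistent with the gain on this dual-core machine coming mostly from keeping both cores busy.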

Jun 2 2012, 10:49, Post #10
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
What Canar said. Hard drives don't like concurrent access, and you actually lose read speed (more than proportionally) as you increase the number of concurrent accesses. My laptop hard drive tops out at maybe 70 MB/s on a single access, but it's not like it gives me 17.5 MB/s per file when I'm accessing 4 files at once, it gives me less than that. Same thing with my USB3 HDD where my backup resides. I tested it a while ago so I don't have the exact figures anymore, but my observation was that single-access, sequential reading was needed.
I have a quad-core i7 with 8 threads and 8 GiB of RAM, so my objective was to get the highest transcoding speeds possible while leveraging the gear at my disposal. Copying input files to a tmpfs sequentially while transcoding them concurrently proved to be the most efficient way. The speed gains range from slight to significant, depending on the gear, the configuration (number of processes, etc…) and the set of files you're transcoding. E.g. reading 8 files at once can slow my hard drive to a crawl.
This post has been edited by skamp: Jun 2 2012, 10:53

Jun 2 2012, 10:57, Post #11
Group: Developer, Posts: 3035, Joined: 2-December 07, Member No.: 49183
somewhat related: http://www.hydrogenaudio.org/forums/index....showtopic=94783

Jun 2 2012, 11:22, Post #12
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I dug up an old version (before 1.0) that didn't copy input files to a tmpfs. Here are the results when transcoding FLACs from my hard drive to Ogg Vorbis, with 8 processes, on a 2 CD album with 37 files (same external encoders):
That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless. Obviously I dropped filesystem caches before each run.
This post has been edited by skamp: Jun 2 2012, 11:29

Jun 2 2012, 11:48, Post #13
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
Thanks for the info. If I ever get an i7 I'll be keen to transcode this way. I've been trying out different numbers of parallel processes and I've found that on my Athlon 64 I get maximum transcode speed by allowing 5 parallel processes instead of 3, and this now performs at least as quickly as the tmpfs method (time difference is <1%), though it's all snail paced compared to your i7 figures; where you get 124x I get 26x (all on the same disk)
This post has been edited by Takla: Jun 2 2012, 11:49

Jun 2 2012, 12:06, Post #14
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I'm guessing your hard drive is less of a bottleneck with your configuration (CPU speed, number of concurrent reads on the HDD) than with mine.
Incidentally, the tmpfs method provides no speed gain when I'm transcoding FLACs located on my SSD. In that case, the storage medium is no longer the bottleneck. Unfortunately, my SSD is nowhere near large enough to hold my entire FLAC library, so I still have to deal with my slowish HDD.

Jun 2 2012, 12:08, Post #15
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
I got my Core 2 Duo 1.6 GHz running 64-bit Debian Stable headless with 512MB RAM to hit the heady heights of 33x. It's a champagne moment. Tomorrow I buy the (parallel) stripes, body kit and chrome exhaust.

Jun 2 2012, 12:41, Post #16
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
"I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux so I was keen to try to make up the difference"
That's the reason I added support for Windows binaries with Wine. There are instructions on how to install and use those with caudec. lvqcl's Ogg Vorbis AoTuV ICC build might be of interest to you.

Jun 2 2012, 13:20, Post #17
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
I saw the info on wine and win binaries in your docs/examples and it struck a chord because I'd previously noticed a big discrepancy between the speed of oggenc in XP (with foobar as frontend) and oggenc in Debian 32-bit. But as I don't make a habit of watching the text scroll by I can live with my newly parallelised scripts doing 26x or 33x (finally quicker than AoTuV in XP on my hardware). I'll stick with native binaries so I can run the same scripts across different free OS and architectures and not have to care if wine is installed/working/worth the effort.

Jun 4 2012, 00:04, Post #18
Group: Members, Posts: 169, Joined: 14-November 09, Member No.: 74931
btw I booted my XP install to see what foobar2000 and oggenc were doing and discovered that the apparent gulf in encoder performance between oggenc in Debian and oggenc in XP was simply due to foobar2000 running two oggenc processes in parallel (XP version of oggenc being aoTuVb6.03 from rarewares). Once both cores are maxed out oggenc performs a little faster (very little: <1%, probably has more to do with OS services than the binary) in Debian 32-bit than in XP SP3 32-bit though the difference is very slight (if you measured it using a button-press stopwatch you'd never know there was any difference). Anyway if I happen again on an application which apparently performs hugely better or worse on a different OS I'll take a closer look before assuming something is either very wrong or inexplicably excellent....
This post has been edited by Takla: Jun 4 2012, 00:06

Jun 4 2012, 11:48, Post #19
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
"That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless."
The benefit gets more obvious as CPU time decreases (the HDD becomes more of a bottleneck). Here's a case where the difference becomes "dramatic": encoding WAVs to FLAC (-q 5, FLAC's default compression level).
That's an 84% speed increase.

Jun 4 2012, 20:14, Post #20
Group: Members, Posts: 230, Joined: 21-February 05, Member No.: 20022
I am glad more Linux stuff is being done since I use Linux on my laptop and I learn new things all the time. Regards.

Jun 5 2012, 12:49, Post #21
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I was curious, so I implemented a switch for disabling the preloading of input files to RAM, for cases where the underlying medium is a fast SSD, ramdisk or whatever. I ran a few tests with light to intensive CPU tasks, and the speed gains were negligible. Since inappropriate / uneducated use of that switch could easily cause terrible performance, I've decided to revert the change and not include it in a future release (not until everyone has terabyte SSDs, at least).
This post has been edited by skamp: Jun 5 2012, 12:56

Jun 27 2012, 09:11, Post #22
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I released version 1.4.0, with many changes (pretty much as many commits as all of the other versions combined):
Upgrading is strongly recommended. Please use the tracker to report any bugs.

Jul 10 2012, 12:22, Post #23
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
Latest version (1.4.3) brings support for Opus and ALAC encoding, among other improvements and fixes. See changes.

Jul 10 2012, 22:08, Post #24
Group: Members, Posts: 25, Joined: 2-April 10, Member No.: 79529
Excellent, thank you skamp. Going to test it on Debian 6.

Jul 30 2012, 11:16, Post #25
Group: Members, Posts: 1147, Joined: 4-May 04, From: France, Member No.: 13875
I released caudec 1.5.0. Changes:
Thanks to Garf for his help on the RG scanner.
This post has been edited by skamp: Jul 30 2012, 11:22