FLACCL: CUDA-enabled FLAC encoder by Gregory S. Chudov (prev. FlaCuda), Formerly "lossless codecs and CUDA"
Jul 13 2008, 05:34
Post #1
Group: Members Posts: 176 Joined: 20-January 03 From: A Tropical Isle Member No.: 4640
With my recent purchase of a 9000-series nVidia graphics card, I started wondering: has anyone investigated whether nVidia's CUDA could be useful for lossless compression? I'm not even remotely close to being a programmer, so I haven't a clue how the code works, but it seems like CUDA is valuable for encoding and decoding. I know nVidia is already holding a contest to speed up LAME (which ends in about 2 weeks), so perhaps it could be used to speed up lossless compressors too? The fastest modes of several codecs are already blazing fast, approaching the limits of hard drives, but perhaps the high-compression modes could be sped up through CUDA. Maybe, if the speed-up is large enough, developers could even implement more ways to gain compression while still maintaining good encoding rates. It would be pretty cool if compression levels like La's best could be done at 50x or something.
Anyway, my curiosity is large, so just thought I'd ask. :)
Sep 15 2009, 07:05
Post #2
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282
IMHO audio or video encoding will not help nVidia survive for long: if the only purpose of buying a GPU becomes accelerating encoding, then you'd better buy a higher-end CPU. Being manufactured on a smaller process than GPUs, CPUs will always have the advantage in brute encoding force vs. power consumption & heat.
As for a multithreaded flac encoder, AFAIK there is none. I think I recall reading about some very experimental proof-of-concept code on some mailing list, but nothing serious. Maybe we should start a donation to buy a quad core for Josh; it cannot be more useless than buying a PC for Klemm, after all.
--------------------
CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4
Sep 15 2009, 12:43
Post #3
Group: Members Posts: 913 Joined: 22-October 01 From: the Netherlands Member No.: 335
QUOTE
As for a multithreaded flac encoder, AFAIK there is none, ..

The simplest way to use multithreading with any encoder is to run multiple encoders simultaneously (foobar2000 can do that). The number of usable threads depends on when the hard disk becomes the bottleneck.
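For anyone who wants to try the "several encoders at once" idea outside foobar2000, here is a minimal Python sketch. It is only an illustration, not how foobar2000 does it: `encode_all` and `flac_cmd` are made-up names, and `flac_cmd` assumes the reference `flac` command-line encoder is installed and on the PATH.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def encode_all(files, build_cmd, workers=None):
    """Run one encoder process per file, at most `workers` at a time.

    `build_cmd` turns an input path into an argument list for subprocess.
    Returns a dict mapping each input file to its process exit code.
    """
    workers = workers or os.cpu_count() or 1

    def run_one(path):
        # The thread only waits; the child process does the actual encoding.
        return path, subprocess.run(build_cmd(path)).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, files))


def flac_cmd(path):
    # Assumption: the reference `flac` encoder is available on PATH.
    return ["flac", "--best", "--silent", path]
```

Calling `encode_all(list_of_wavs, flac_cmd)` starts one process per core by default; lowering `workers` is the knob to turn once the disk, not the CPU, becomes the limit.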
Sep 16 2009, 09:11
Post #4
Group: Members Posts: 2296 Joined: 18-May 03 From: Denmark Member No.: 6695
QUOTE
As for a multithreaded flac encoder, AFAIK there is none, ..
The simplest way to use multithreading with any encoder is to run multiple encoders simultaneously (foobar2000 can do that). The number of usable threads depends on when the hard disk becomes the bottleneck.

Now we just need a way to simultaneously run CUDA and CPU encoders
--------------------
Can't wait for a HD-AAC encoder :P
Sep 17 2009, 08:51
Post #5
Group: Members Posts: 128 Joined: 9-August 06 Member No.: 33830
QUOTE
Now we just need a way to simultaneously run CUDA and CPU encoders

Sounds fun, though I'm afraid we'd bump into a strong bottleneck because of disk head positioning. NCQ in AHCI mode should help a lot with more threads, but it didn't when I tested it a while ago. Physically different source/target drives can alleviate this bottleneck quite a bit. Fast SSDs are worth a try too.
This CUDA encoder can be a different solution: a single instance of it is faster than the reference encoder running on one core of my CPU (converting one file at a time is the least disk-bottlenecked way to do it). A natively multithreaded CPU-based encoder (working on segments of one single track) is another option.
Sep 19 2009, 09:38
Post #6
Group: Members Posts: 1061 Joined: 4-May 04 From: France Member No.: 13875
QUOTE
Sounds fun, though I'm afraid we'd bump into a strong bottleneck because of disk head positioning

Ideally you would run multiple instances of a single-threaded encoder (one track per CPU core) and one instance of the CUDA encoder per GPU at the same time - it's just a matter of making sure that all instances are kept busy. When the number of remaining tracks gets lower than the number of available cores, you prioritize the GPU instance (since it's faster than a single-threaded encoder on a single CPU core), but also run (if available) a multi-threaded encoder; one MT encoder over two cores is likely to be slower than two instances of an ST encoder over the same number of cores (see the Lancer builds of the Ogg Vorbis encoder). In other words, an MT encoder is particularly useful for keeping CPU cores busy when the workload dries up.
In short, the priorities go like this (if you have a multi-core CPU, that is):
ST * n CPU cores > GPU > MT
As for the I/O bottlenecks, that's when a large enough RAMdisk comes in very handy. Even just 1GiB is often enough for encoding a whole album (WAV + FLAC or FLAC + Ogg Vorbis or whatever on the RAMdisk).
I already use all available CPU cores when I encode my rips to FLAC or any other codec (one track per core); what I could really use, even before an MT FLAC encoder comes up, is a simple, command-line, multi-threaded Replay Gain utility. As I've said in the past, computing RG values on an album now takes longer than encoding it in the first place (because the former uses only one core while the latter uses all 4 cores on my quadcore CPU).
--------------------
Save my friend from going homeless: http://outpost.fr/url/308w
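The "ST * n CPU cores > GPU > MT" rule above could be sketched roughly like this in Python. This is just one possible reading of the rule, not an existing tool: `plan_batch` and the job tuples are hypothetical, and the sketch only decides who gets which track without launching any encoder.

```python
def plan_batch(tracks, cpu_cores, gpus=1):
    """Assign the next batch of tracks to encoder instances.

    Returns a list of (track, kind, cores) tuples, where kind is "ST"
    (single-threaded CPU), "GPU" (CUDA encoder) or "MT" (multi-threaded CPU).
    """
    tracks = list(tracks)
    if len(tracks) >= cpu_cores + gpus:
        # Plenty of work: one ST instance per core plus one instance per GPU.
        jobs = [(t, "ST", 1) for t in tracks[:cpu_cores]]
        jobs += [(t, "GPU", 0) for t in tracks[cpu_cores:cpu_cores + gpus]]
        return jobs

    # Workload is drying up: feed the GPU(s) first, then spread all CPU
    # cores over whatever tracks are left, using MT where a track gets >1 core.
    jobs = [(t, "GPU", 0) for t in tracks[:gpus]]
    cpu_tracks = tracks[gpus:]
    if cpu_tracks:
        base, extra = divmod(cpu_cores, len(cpu_tracks))
        for i, t in enumerate(cpu_tracks):
            cores = base + (1 if i < extra else 0)
            jobs.append((t, "MT" if cores > 1 else "ST", cores))
    return jobs
```

With 4 cores and one GPU, six queued tracks give four ST jobs and one GPU job; with only two tracks left, one goes to the GPU and the other gets an MT encoder across all four cores.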
Sep 19 2009, 13:04
Post #7
Group: Members Posts: 128 Joined: 9-August 06 Member No.: 33830
QUOTE
As for the I/O bottlenecks, that's when a large enough RAMdisk comes in very handy. Even just 1GiB is often enough for encoding a whole album (WAV + FLAC or FLAC + Ogg Vorbis or whatever on the RAMdisk).

You're absolutely right, I don't know how I could forget about RAMdisks. I used them all the time when 8MiB felt like plenty of RAM, but somehow I never thought about them since we have multiple GiBs at our disposal... talk about contradictions...
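As a rough sanity check of the 1GiB figure, here is a back-of-the-envelope calculation in Python. It is only an estimate: `ramdisk_mib` is a made-up helper, the 0.6 FLAC-to-WAV size ratio is an assumption (real ratios vary a lot with the music), and file headers are ignored.

```python
def ramdisk_mib(minutes, flac_ratio=0.6, sample_rate=44100,
                channels=2, bytes_per_sample=2):
    """Estimate the MiB needed to hold an album as WAV plus its FLAC encode.

    Defaults describe CD audio (44.1 kHz, stereo, 16-bit); flac_ratio is the
    assumed FLAC size relative to the WAV size.
    """
    wav_bytes = minutes * 60 * sample_rate * channels * bytes_per_sample
    total_bytes = wav_bytes * (1 + flac_ratio)
    return total_bytes / 2**20
```

Under these assumptions a 50-minute album needs about 807 MiB and fits, while a full 74-minute CD needs about 1195 MiB and does not, which matches "often enough" rather than "always enough".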
gib FLACCL: CUDA-enabled FLAC encoder by Gregory S. Chudov (prev. FlaCuda) Jul 13 2008, 05:34
Martel I apologize for being completely incorrect. Jul 13 2008, 09:29
Garf QUOTE (Martel @ Jul 13 2008, 10:29) If I... Jul 13 2008, 10:00
gib Ah, I see now. Thanks very much for the response,... Jul 14 2008, 03:52
Gregory S. Chudov Here is good news. An alfa version of flac encode... Sep 10 2009, 03:27
Dr_Colossus Sounds awesome, care to elaborate on the performan... Sep 10 2009, 05:58
Gregory S. Chudov Less impressive than i hoped to, but this is only ... Sep 10 2009, 06:05
Grunpfnul No love for ati? *sniff* Sep 10 2009, 09:24
Gregory S. Chudov There is love, but there's no implementation ^... Sep 10 2009, 09:49
Case I ran some tests with my Core i7 940 (stock speed)... Sep 10 2009, 16:30
Gregory S. Chudov Thank you! Sep 10 2009, 20:34
Ron Jones I'm anxious to see how this would perform on t... Sep 10 2009, 20:44
thundat00th QUOTE (Grunpfnul @ Sep 10 2009, 03:24) No... Sep 10 2009, 21:20
hlloyge Here are my test results: Klaus Shultze - Dreams ... Sep 10 2009, 22:42
Gregory S. Chudov What was the file size for flac -6? We should comp... Sep 10 2009, 23:33
Wombat Not to shabby. Tried it on a C2D@3600+GTX260 Drea... Sep 11 2009, 00:33
GHammer This is on a 9500 GT FlaCuda Filename : Clocks.w... Sep 11 2009, 02:21
Gregory S. Chudov QUOTE (GHammer @ Sep 11 2009, 05:21) Flac... Sep 11 2009, 03:24
Lucho GPU audio encoding will be useful when OpenCL get ... Sep 11 2009, 08:28
hlloyge Here I am again, this time, more detailed: Flac 1... Sep 11 2009, 19:52
Justin Ruggles QUOTE (hlloyge @ Sep 11 2009, 14:52) Is t... Sep 12 2009, 02:45
gib Hey, wow. This topic of mine was bumped, and with... Sep 12 2009, 06:10
PatchWorKs Well, I believe that even a small gain is always w... Sep 12 2009, 09:40
hlloyge Again: C2D8200, Geforce 9600GT album.wav to flac ... Sep 12 2009, 10:32
Gregory S. Chudov QUOTE (PatchWorKs @ Sep 12 2009, 12:40) I... Sep 12 2009, 11:33
Dologan Wow, nice work Gregory! Just wondering... how... Sep 12 2009, 11:40
Gregory S. Chudov Current GPUs do integer computations quite alright... Sep 12 2009, 11:52
Garf QUOTE (Gregory S. Chudov @ Sep 12 2009, 12... Sep 12 2009, 21:35
Dologan Hmm, does the encoder do pipe encoding (i.e. for p... Sep 12 2009, 12:15
Wombat Some questions regarding Flaccuda. Back when flake... Sep 12 2009, 13:41
Maurits How hard would it be to convert this CUDA version ... Sep 12 2009, 14:48
Gregory S. Chudov QUOTE (Dologan @ Sep 12 2009, 15:15) Hmm,... Sep 12 2009, 15:07
Maurits QUOTE (Gregory S. Chudov @ Sep 12 2009, 15... Sep 12 2009, 15:57
Dologan QUOTE (Gregory S. Chudov @ Sep 12 2009, 15... Sep 12 2009, 17:40
Wombat Thanks for explaining it. Really a nice work you h... Sep 12 2009, 15:39
guruboolez My results on a old Core2Duo E6300 and small Nvidi... Sep 12 2009, 20:09
arri Just finished my tests: image.wav: flac 1.2.1 ... Sep 12 2009, 22:16
Wombat QUOTE (arri @ Sep 12 2009, 22:16) Just fi... Sep 12 2009, 22:51
arri QUOTE (Wombat @ Sep 12 2009, 23:51) Afaik... Sep 12 2009, 23:40
Wombat QUOTE (arri @ Sep 12 2009, 23:40) QUOTE (... Sep 12 2009, 23:53
Case I made a more thorough comparison with the new ver... Sep 13 2009, 10:00
hlloyge It sure isn't that compile, as they (at least ... Sep 13 2009, 14:36
alvaro84 I've done a quick test, how a 2+ year old full... Sep 13 2009, 16:19
Gregory S. Chudov Thank you for detailed test results. Looking at th... Sep 13 2009, 16:47
alvaro84 QUOTE (Gregory S. Chudov @ Sep 13 2009, 16... Sep 13 2009, 19:33
Gregory S. Chudov QUOTE (alvaro84 @ Sep 13 2009, 22:33) Is ... Sep 13 2009, 19:42
Case Seems to me like other modes got a speed boost too... Sep 13 2009, 21:39
Gregory S. Chudov Phew. I think i finally squeezed everything i coul... Sep 14 2009, 20:25
Case Impressive. Sep 14 2009, 21:18
alvaro84 QUOTE (Case @ Sep 14 2009, 21:18) Impress... Sep 15 2009, 04:57
Gregory S. Chudov Thank you. Sep 14 2009, 21:26
sauvage78 So far I wasn't much interested in the small c... Sep 15 2009, 05:24
hlloyge I wouldn't kill nVidia just yet. AFAIK, as of ... Sep 15 2009, 06:50
Case Just converted my entire FLAC -8 library to FlaCud... Sep 15 2009, 20:34
Wombat Seems like i found a strange behaviour. If you hav... Sep 16 2009, 08:45
[JAZ] QUOTE (Wombat @ Sep 16 2009, 09:45) Seems... Sep 16 2009, 18:36
lvqcl FlaCuda_0.4 with "-8" switch, original t... Sep 16 2009, 19:09
Wombat QUOTE (lvqcl @ Sep 16 2009, 20:09) FlaCud... Sep 16 2009, 19:14
Gregory S. Chudov Added lossyWav support. It shouldn't make any ... Sep 17 2009, 23:12
Wombat Thanks for your fast work on that!! Works ... Sep 17 2009, 23:30
Gregory S. Chudov Is there anybody here who knows the math behind Ch... Sep 18 2009, 19:05
Gregory S. Chudov I must add, that when computations are done in dou... Sep 18 2009, 19:54
gib I've gotten flacuda to work with the old but s... Sep 19 2009, 22:39
Wombat QUOTE (gib @ Sep 19 2009, 23:39) I've... Sep 19 2009, 23:10
gib Wombat, thanks for the suggestion of using Multi F... Sep 20 2009, 00:41
Gregory S. Chudov Here is version that is a tiny bit faster, i hope.... Sep 25 2009, 23:45
Wombat I really can't tell if your FlaCuda became any fast... Sep 26 2009, 01:58
Case This is getting ridiculous. New FlaCuda 0.6 is fas... Sep 26 2009, 11:19
alvaro84 I second, it's ridiculous Now FlaCuda 0.6 at ... Sep 26 2009, 12:54
hlloyge CUDA: CODE D:\temp_2>CUETools.FlaCuda... Sep 26 2009, 13:31
guruboolez I'm far from ridiculous performances with my l... Sep 26 2009, 18:57
Gregory S. Chudov QUOTE (guruboolez @ Sep 26 2009, 21:57) C... Sep 26 2009, 19:23
alvaro84 QUOTE (Gregory S. Chudov @ Sep 26 2009, 19... Sep 27 2009, 06:35
Dologan Holy crap! Nice work, Greg! FlaCuda -4 on... Sep 26 2009, 18:58
guruboolez Thank you for the explanation, Gregory. I Hope it... Sep 26 2009, 19:36
odyssey It crashes on my Geforce 9300 that's supposed ... Sep 26 2009, 20:15
Gregory S. Chudov Previous versions too? Sep 26 2009, 21:20
odyssey FlaCUDA03-05: CODE Error : Exception of type ... Sep 27 2009, 01:37
Wombat One small question about FlaCudas default blocksiz... Oct 1 2009, 01:01
Gregory S. Chudov Smaller blocks make it slower, but have their adva... Oct 1 2009, 07:17
johnsonlam Sorry for breaking in ... Did someone planning fo... Oct 1 2009, 15:40
odyssey QUOTE (johnsonlam @ Oct 1 2009, 16:40) Di... Oct 1 2009, 15:48
johnsonlam QUOTE (odyssey @ Oct 1 2009, 22:48) QUOTE... Oct 2 2009, 08:49
hlloyge That was exactly what I was thinking - because it ... Oct 1 2009, 20:08
glebe johnsonlam, see screenshot Oct 2 2009, 17:42
johnsonlam QUOTE (glebe @ Oct 3 2009, 00:42) johnson... Oct 5 2009, 17:27
Rotareneg Sempron 3400+ (Socket 754) o/c'ed to 2500 MHz,... Oct 4 2009, 01:14
Ron Jones Can't wait to see how Fermi does with FLACuda.... Oct 4 2009, 02:08
VeaaC QUOTE (Ron Jones @ Oct 4 2009, 03:08) Can... Oct 5 2009, 02:56
Garf QUOTE (VeaaC @ Oct 5 2009, 03:56) QUOTE (... Oct 5 2009, 20:06
probedb QUOTE (Garf @ Oct 5 2009, 20:06) Or OpenC... Oct 5 2009, 20:44
alvaro84 QUOTE (probedb @ Oct 5 2009, 20:44) QUOTE... Oct 7 2009, 12:59
Maurits QUOTE (probedb @ Oct 5 2009, 20:44) QUOTE... Oct 8 2009, 09:44
Maurits QUOTE AMD Leads Industry as First Chip Supplier to... Oct 19 2009, 12:17