FLACCL: CUDA-enabled FLAC encoder by Gregory S. Chudov (prev. FlaCuda), Formerly "lossless codecs and CUDA"
hlloyge
post Sep 26 2009, 13:31
Post #76





Group: Members
Posts: 689
Joined: 10-January 06
From: Zagreb
Member No.: 27018



CUDA:

CODE
D:\temp_2>CUETools.FlaCuda.exe -8 "Coldplay - Left Right Left Right Left.wav"
CUETools.FlaCuda, Copyright (C) 2009 Gregory S. Chudov.
This is free software under the GNU GPLv3+ license; There is NO WARRANTY, to
the extent permitted by law. <http://www.gnu.org/licenses/> for details.
Filename  : Coldplay - Left Right Left Right Left.wav
File Info : 44100kHz; 2 channel; 16 bit; 00:39:54.7470000
Results   : 157,73x; 285525875 bytes in 00:00:15.1828684 seconds;


FLAC took one minute and 12 seconds!

CODE
D:\temp_2>flac -8 "Coldplay - Left Right Left Right Left.wav"
flac 1.2.1, Copyright (C) 2000,2001,2002,2003,2004,2005,2006,2007  Josh Coalson
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.
Coldplay - Left Right Left Right Left.wav: wrote 285951666 bytes, ratio=0,677


Mighty Impressive! I think I will use FlaCuda for FLAC encoding.
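The figures in the two console logs above can be cross-checked with a few lines of Python. This is only a sketch: the 72-second FLAC time is the hand-reported figure from the post, the raw PCM size is estimated from the track length (WAV header ignored), and all names are illustrative.

```python
def to_seconds(hms: str) -> float:
    """Convert an 'hh:mm:ss.fff' string, as printed by FlaCuda, to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

audio_s = to_seconds("00:39:54.7470000")    # track length from the File Info line
flacuda_s = to_seconds("00:00:15.1828684")  # encode time from the Results line
flac_s = 72.0                               # "one minute and 12 seconds", hand-timed

# Encoding speed expressed as multiples of real time.
flacuda_speed = audio_s / flacuda_s
flac_speed = audio_s / flac_s

# Assumption: raw PCM size estimated from the duration (44.1 kHz, 2 channels, 16 bit).
pcm_bytes = audio_s * 44100 * 2 * 2
flacuda_ratio = 285525875 / pcm_bytes
flac_ratio = 285951666 / pcm_bytes

print(f"FlaCuda -8: {flacuda_speed:.2f}x real time, ratio {flacuda_ratio:.3f}")
print(f"flac -8:    {flac_speed:.2f}x real time, ratio {flac_ratio:.3f}")
print(f"FlaCuda is about {flac_s / flacuda_s:.1f}x faster than flac here")
```

The computed speed agrees with the "157,73x" that FlaCuda prints, and the estimated flac ratio agrees with the reported 0,677.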
guruboolez
post Sep 26 2009, 18:57
Post #77





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



I'm far from ridiculous performances with my little fanless 9400GT - so further improvements are still welcome :)

CODE
Level  v0.4 (x)  v0.6 (x)  Change
-0       102.29    145.79    +43%
-2        91.93    127.71    +39%
-4        62.25     54.83    -12%
-6        42.82     47.34    +11%
-8        26.29     36.49    +39%
-10       11.75     15.22    +30%


Speed is clearly better except at compression level -4, where flacuda 0.4 is faster but compresses slightly less (see the full table below).
The file is a 54-minute full album in .wav (PCM) format. That's really impressive. Congratulations!

Could someone tell me if it's normal that CUETools.FlaCuda.exe reaches a 50% load on my Core2Duo E6300/9400GT whatever the compression level I choose? Shouldn't the CPU stay idle during the encoding process? I'm using fb2k, so I checked the foobar2000 options to see whether I did something wrong (like an active DSP…), but apparently that isn't the case. Is it the same with a stronger GPU?


APPENDIX: full table

CODE

flacuda version 0.4
Device  Level  Speed    Size (KB)
GPU     -0     102.29x  265.028
GPU     -2      91.93x  263.071
GPU     -4      62.25x  262.059
GPU     -6      42.82x  261.771
GPU     -8      26.29x  261.579
GPU     -10     11.75x  261.254
GPU     -11      7.75x  261.137

flacuda version 0.6
Device  Level  Speed    Size (KB)
GPU     -0     145.79x  265.543
GPU     -1     131.13x  263.687
GPU     -2     127.71x  262.563
GPU     -3     126.77x  262.335
GPU     -4      54.83x  261.881
GPU     -6      47.34x  261.712
GPU     -8      36.49x  261.578
GPU     -10     15.22x  261.253
GPU     -11      9.92x  261.137

flac.exe version 1.2.1
Device  Level  Speed    Size (KB)
CPU     -0     122.84x  275.077
CPU     -5      74.47x  263.170
CPU     -8      26.88x  262.408
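For anyone who wants to redo the version comparison from the full table, here is a minimal Python sketch. The speeds are copied from the two FlaCuda tables above; the dictionary and function names are illustrative, and only levels measured with both versions are compared.

```python
# Speeds (multiples of real time) copied from the two FlaCuda tables above.
speed_v04 = {0: 102.29, 2: 91.93, 4: 62.25, 6: 42.82, 8: 26.29, 10: 11.75, 11: 7.75}
speed_v06 = {0: 145.79, 1: 131.13, 2: 127.71, 3: 126.77, 4: 54.83,
             6: 47.34, 8: 36.49, 10: 15.22, 11: 9.92}

def percent_change(old: float, new: float) -> int:
    """Relative speed change from old to new, rounded to a whole percent."""
    return round((new / old - 1) * 100)

# Only levels measured with both versions can be compared.
changes = {level: percent_change(speed_v04[level], speed_v06[level])
           for level in sorted(speed_v04.keys() & speed_v06.keys())}

for level, change in changes.items():
    print(f"-{level:<3} {change:+d}%")
```

Level -4 is the only one where version 0.6 comes out slower; every other shared level is faster.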
Dologan
post Sep 26 2009, 18:58
Post #78





Group: Members (Donating)
Posts: 478
Joined: 22-November 01
From: United Kingdom
Member No.: 519



Holy crap! Nice work, Greg!

FlaCuda -4 on my 8800GT is now pretty much as fast as FLAC -6 on 3 of my Q6600 cores, with slightly better compression. The strange slowdown I was getting when using multiple converter threads no longer seems to be an issue; in fact, it speeds up from one thread up to three, after which it seems to slow down slightly.
Gregory S. Chudo...
post Sep 26 2009, 19:23
Post #79





Group: Developer
Posts: 683
Joined: 2-October 08
From: Ottawa
Member No.: 59035



QUOTE (guruboolez @ Sep 26 2009, 21:57) *
Could someone tell me if it's normal that CUETools.FlaCuda.exe reaches a 50% load on my Core2Duo E6300/9400GT whatever the compression level I choose? Shouldn't the CPU stay idle during the encoding process? I'm using fb2k, so I checked the foobar2000 options to see whether I did something wrong (like an active DSP…), but apparently that isn't the case. Is it the same with a stronger GPU?

Yep... Maybe NVIDIA will fix this at some point, but currently the function call that waits for the GPU to finish its work wastes 100% of one CPU core, obviously just spinning in a loop and constantly checking whether the GPU is ready. There are options in CUDA which control the waiting mode, but the one which is supposed to make a process sleep and wait for results doesn't seem to work on Windows Vista. I suppose it's only implemented on Linux, where the CUDA driver is more advanced.
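The difference between the two waiting modes described above can be illustrated with a small Python analogy. This is not CUDA code and says nothing about how the driver is actually implemented: a `threading.Timer` stands in for the GPU finishing its work, and the function names are made up for illustration.

```python
import threading
import time

def cpu_seconds_while_waiting(wait, delay=0.2):
    """Return the CPU seconds this process burns while waiting `delay` seconds."""
    done = threading.Event()
    timer = threading.Timer(delay, done.set)  # stands in for the GPU finishing
    start = time.process_time()
    timer.start()
    wait(done)
    used = time.process_time() - start
    timer.join()
    return used

def spin_wait(done):
    # Busy-wait: poll the flag in a tight loop, keeping one core busy.
    while not done.is_set():
        pass

def blocking_wait(done):
    # Blocking wait: the thread sleeps until it is signalled.
    done.wait()

spin_cpu = cpu_seconds_while_waiting(spin_wait)
block_cpu = cpu_seconds_while_waiting(blocking_wait)
print(f"spin wait:     {spin_cpu:.3f} s of CPU time")
print(f"blocking wait: {block_cpu:.3f} s of CPU time")
```

Both waits take the same wall-clock time, but the polling loop uses CPU time for the whole wait while the blocking wait uses almost none. On a dual-core machine, one fully busy core shows up as roughly 50% total load.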

This post has been edited by Gregory S. Chudov: Sep 26 2009, 19:25


--------------------
CUETools 2.1.4
guruboolez
post Sep 26 2009, 19:36
Post #80





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



Thank you for the explanation, Gregory. I hope it'll be fixed by Nvidia soon.

This post has been edited by guruboolez: Sep 26 2009, 21:00
odyssey
post Sep 26 2009, 20:15
Post #81





Group: Members
Posts: 2296
Joined: 18-May 03
From: Denmark
Member No.: 6695



It crashes on my GeForce 9300 that's supposed to support CUDA :'(

Works fine on my Quadro 2700M though.


--------------------
Can't wait for a HD-AAC encoder :P
Gregory S. Chudo...
post Sep 26 2009, 21:20
Post #82





Group: Developer
Posts: 683
Joined: 2-October 08
From: Ottawa
Member No.: 59035



Previous versions too?


--------------------
CUETools 2.1.4
odyssey
post Sep 27 2009, 01:37
Post #83





Group: Members
Posts: 2296
Joined: 18-May 03
From: Denmark
Member No.: 6695



FlaCUDA03-05:
CODE
Error     : Exception of type 'GASS.CUDA.CUDAException' was thrown.


FlaCUDA06 shows the source file's info, then crashes in Windows and finally prints this in the console:
CODE
Unhandled Exception: ErrorNotInitialized


Can you recommend anything stable I can use to test/verify that CUDA is working correctly on my GPU?


--------------------
Can't wait for a HD-AAC encoder :P
alvaro84
post Sep 27 2009, 06:35
Post #84





Group: Members
Posts: 128
Joined: 9-August 06
Member No.: 33830



QUOTE (Gregory S. Chudov @ Sep 26 2009, 19:23) *
Yep... Maybe NVIDIA will fix this at some point, but currently the function call that waits for the GPU to finish its work wastes 100% of one CPU core, obviously just spinning in a loop and constantly checking whether the GPU is ready. There are options in CUDA which control the waiting mode, but the one which is supposed to make a process sleep and wait for results doesn't seem to work on Windows Vista. I suppose it's only implemented on Linux, where the CUDA driver is more advanced.


Now that you mention it... I had a look at the processes, and foobar eats up ~20% and FlaCuda ~38% while converting.
It would be interesting to compare the energy consumption of the CPU and GPU implementations, but I don't have the proper instruments now...
Wombat
post Oct 1 2009, 01:01
Post #85





Group: Members
Posts: 950
Joined: 7-October 01
Member No.: 235



One small question about FlaCuda's default blocksize. If I remember right, flac used a blocksize of 4608 at -8 and changed to 4096 because of small average advantages in compression. My own limited tests with FlaCuda also show a tiny advantage with the blocksize at 4096 on a selection of mixed kinds of music.
Gregory S. Chudo...
post Oct 1 2009, 07:17
Post #86





Group: Developer
Posts: 683
Joined: 2-October 08
From: Ottawa
Member No.: 59035



Smaller blocks make it slower, but have their advantages. I will try to reduce the performance penalty for smaller blocks if I can.


--------------------
CUETools 2.1.4
johnsonlam
post Oct 1 2009, 15:40
Post #87





Group: Members
Posts: 226
Joined: 12-January 03
From: Kowloon, Hong Kong
Member No.: 4533



Sorry for breaking in ...

Is anyone planning a foobar2000 FLAC-CUDA plugin?
It would make conversion a lot easier.


--------------------
Hong Kong - International Joke Center (after 1997-06-30)
odyssey
post Oct 1 2009, 15:48
Post #88





Group: Members
Posts: 2296
Joined: 18-May 03
From: Denmark
Member No.: 6695



QUOTE (johnsonlam @ Oct 1 2009, 16:40) *
Is anyone planning a foobar2000 FLAC-CUDA plugin?

No need for a plugin. foobar2000 relies on command-line encoders, and you can set up any command-line encoder you wish, including FlaCUDA. However, if you have a multicore CPU, you might need to set the Thread Count to 1 under Advanced, since this encoder is not CPU-dependent. Note that this affects all encoders (including CPU-dependent ones).

Would be nice if this setting was user-definable for each encoder.


--------------------
Can't wait for a HD-AAC encoder :P
hlloyge
post Oct 1 2009, 20:08
Post #89





Group: Members
Posts: 689
Joined: 10-January 06
From: Zagreb
Member No.: 27018



That was exactly what I was thinking - because it is needed, too, when using iTunes AAC encoder through foobar2000.
johnsonlam
post Oct 2 2009, 08:49
Post #90





Group: Members
Posts: 226
Joined: 12-January 03
From: Kowloon, Hong Kong
Member No.: 4533



QUOTE (odyssey @ Oct 1 2009, 22:48) *
QUOTE (johnsonlam @ Oct 1 2009, 16:40) *
Is anyone planning a foobar2000 FLAC-CUDA plugin?

No need for a plugin. foobar2000 relies on command-line encoders, and you can set up any command-line encoder you wish, including FlaCUDA. However, if you have a multicore CPU, you might need to set the Thread Count to 1 under Advanced, since this encoder is not CPU-dependent. Note that this affects all encoders (including CPU-dependent ones).

Would be nice if this setting was user-definable for each encoder.


Thanks.

Any example of setting FlacCUDA in Foobar2000?
I'm not good at command line settings.


--------------------
Hong Kong - International Joke Center (after 1997-06-30)
glebe
post Oct 2 2009, 17:42
Post #91





Group: Members
Posts: 20
Joined: 17-February 09
Member No.: 67079



johnsonlam, see screenshot
Rotareneg
post Oct 4 2009, 01:14
Post #92





Group: Members
Posts: 187
Joined: 18-March 05
From: Wichita, KS
Member No.: 20701



Sempron 3400+ (Socket 754) o/c'ed to 2500 MHz, GeForce GTX 260 Core 216 at standard frequencies:

Album is Paul Simon's Graceland, converted to a single 456,539,708 byte WAV.

Flac 1.2.1b (RareWare's ICL compile) -8 : 01:06.812, 250,990,457 bytes.

FlaCuda06 -8 : 00:26.015, 250,421,454 bytes.

2.57x faster, sounds good to me! :)
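The 2.57x figure follows directly from the two timings above; a short Python sketch of the arithmetic, with illustrative names, also shows how small the size difference is:

```python
def to_seconds(mmss: str) -> float:
    """Convert an 'mm:ss.fff' timing string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)

wav_bytes = 456_539_708
flac_time, flac_bytes = to_seconds("01:06.812"), 250_990_457
cuda_time, cuda_bytes = to_seconds("00:26.015"), 250_421_454

speedup = flac_time / cuda_time        # how many times faster FlaCuda finished
saved_bytes = flac_bytes - cuda_bytes  # how much smaller the FlaCuda file is

print(f"speedup: {speedup:.2f}x")
print(f"FlaCuda output is {saved_bytes:,} bytes smaller "
      f"({saved_bytes / flac_bytes:.2%} of the flac file)")
print(f"ratio vs WAV: flac {flac_bytes / wav_bytes:.4f}, "
      f"FlaCuda {cuda_bytes / wav_bytes:.4f}")
```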
Ron Jones
post Oct 4 2009, 02:08
Post #93





Group: Members
Posts: 412
Joined: 9-August 07
From: Los Angeles
Member No.: 46048



Can't wait to see how Fermi does with FLACuda. Very exciting stuff to think about, that's for certain.
VeaaC
post Oct 5 2009, 02:56
Post #94





Group: Members
Posts: 52
Joined: 10-December 06
Member No.: 38550



QUOTE (Ron Jones @ Oct 4 2009, 03:08) *
Can't wait to see how Fermi does with FLACuda. Very exciting stuff to think about, that's for certain.


A DirectX11 version would be better as it could run on ATI hardware too. 8)
johnsonlam
post Oct 5 2009, 17:27
Post #95





Group: Members
Posts: 226
Joined: 12-January 03
From: Kowloon, Hong Kong
Member No.: 4533



QUOTE (glebe @ Oct 3 2009, 00:42) *
johnsonlam, see screenshot


Thank you very much for your help!



--------------------
Hong Kong - International Joke Center (after 1997-06-30)
Garf
post Oct 5 2009, 20:06
Post #96


Server Admin


Group: Admin
Posts: 4853
Joined: 24-September 01
Member No.: 13



QUOTE (VeaaC @ Oct 5 2009, 03:56) *
QUOTE (Ron Jones @ Oct 4 2009, 03:08) *
Can't wait to see how Fermi does with FLACuda. Very exciting stuff to think about, that's for certain.


A DirectX11 version would be better as it could run on ATI hardware too. 8)


Or OpenCL...
probedb
post Oct 5 2009, 20:44
Post #97





Group: Members
Posts: 1122
Joined: 6-September 04
Member No.: 16817



QUOTE (Garf @ Oct 5 2009, 20:06) *
Or OpenCL...


Haven't Nvidia just released their OpenCL drivers? Wish AMD would do the same!
alvaro84
post Oct 7 2009, 12:59
Post #98





Group: Members
Posts: 128
Joined: 9-August 06
Member No.: 33830



QUOTE (probedb @ Oct 5 2009, 20:44) *
QUOTE (Garf @ Oct 5 2009, 20:06) *
Or OpenCL...


Haven't Nvidia just released their OpenCL drivers? Wish AMD would do the same!


That would be the best - there's no hope for DX11 compute shader for XP (ok, I know, I know, in a few years no one will still be using XP...)
AMD/CUDA is the same huh.gif
Maurits
post Oct 8 2009, 09:44
Post #99





Group: Members
Posts: 353
Joined: 30-September 05
From: London, Europe
Member No.: 24805



QUOTE (probedb @ Oct 5 2009, 20:44) *
QUOTE (Garf @ Oct 5 2009, 20:06) *
Or OpenCL...


Haven't Nvidia just released their OpenCL drivers? Wish AMD would do the same!


Apparently AMD/ATI is not sitting on its hands: ATI Jumps on OpenCL Bandwagon by Releasing OpenCL Drivers.

OpenCL would be a lot better than DirectX support as it means it won't be limited to just Windows. Apparently Windows 7 comes standard with OpenCL compatible NVIDIA drivers which is promising...

A more in depth look at OpenCL: OpenCL: To GPGPU and Beyond
AMD: An Introduction to OpenCL

This post has been edited by Maurits: Oct 8 2009, 10:14
Maurits
post Oct 19 2009, 12:17
Post #100





Group: Members
Posts: 353
Joined: 30-September 05
From: London, Europe
Member No.: 24805



QUOTE
AMD Leads Industry as First Chip Supplier to Offer OpenCL™ Development Kit that Supports GPUs and x86 CPUs

ATI Stream SDK v2.0 beta with support for OpenCL 1.0 allows developers to program complete AMD platforms for breakthrough performance on processing-intensive applications

October 19, 2009 12:01 AM Eastern Daylight Time

SUNNYVALE, Calif.--(EON: Enhanced Online News)--AMD (NYSE: AMD) announced availability of a key piece of its strategy to help improve the end-user's compute experience by leveraging the combined power of AMD graphics processors (GPUs) and AMD multi-core x86 processors through software. With the beta release of its ATI Stream Software Development Kit (SDK) v2.0, featuring OpenCL 1.0 support, AMD provides a free set of tools software developers can use to create applications that are accelerated by AMD GPUs and AMD multi-core x86 CPUs working together. The ATI Stream SDK v2.0 is certified compliant with OpenCL 1.0 by the Khronos Working Group.

* The ATI Stream SDK v2.0 beta is available for download today and can be accessed here.


Good news...