
Two lame questions on Vorbis

Please excuse me if this is a FAQ, but:

1) Is it possible for an Ogg container to hold two streams of significantly different size? E.g. a FLAC version and an 80 kbps portable-player version in a single file, while still allowing fast extraction of the former. The size difference between these streams can be as large as 10x; wouldn't that introduce severe syncing errors? I tried to read RFC 3533, but couldn't figure it out.

    And if the answer is yes, could you name the right tools for the job?

2) I haven't studied the libvorbis sources thoroughly yet, but I assume the encoder spends a significant amount of time on some form of spectral analysis (FFT/MDCT?), which is usually implemented in floating point.

    Modern GPUs with a programmable fragment pipeline offer roughly tenfold floating-point performance compared to general-purpose x86 CPUs, even with all SIMD units in use. Their programmability is flexible enough that some researchers describe them as stream processors. The hardware is already widespread (even the ps2.0-compatible cards), API support is stable, and there is a plethora of projects on GPU-assisted computation. BrookGPU is a "black box" library that uses OpenGL as a backend and can be used to add GPU-side computation to an existing C application. And FFT/MDCT are, AFAIK, highly parallelizable; the FFT at least, with its butterfly operation.
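To illustrate why the FFT maps well onto parallel hardware, here is a minimal radix-2 FFT sketch (not from libvorbis, just a textbook formulation): within each stage, every butterfly reads and writes a disjoint pair of elements, so all butterflies of a stage could in principle run concurrently, e.g. one per GPU fragment.

```python
import cmath

def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT. Within each stage,
    every butterfly touches a disjoint pair of elements, so all
    butterflies of one stage are independent and parallelizable."""
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    # Bit-reversal permutation of the input order.
    width = n.bit_length() - 1
    out = [x[int(format(i, f'0{width}b')[::-1], 2)] for i in range(n)]
    span = 1
    while span < n:
        w_step = cmath.exp(-1j * cmath.pi / span)
        for start in range(0, n, 2 * span):       # independent blocks
            w = 1.0
            for k in range(start, start + span):  # independent butterflies
                a, b = out[k], out[k + span] * w
                out[k], out[k + span] = a + b, a - b
                w *= w_step
        span *= 2
    return out
```

Each of the log2(n) stages performs n/2 such butterflies; it is the per-stage independence that the stream-processing model exploits.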

    So, is there any chance of seeing a GPU-assisted Vorbis encoder? At least reimplementing the FFT/MDCT with BrookGPU seems feasible in both work time and code complexity. Would it give any significant gain over SSE-tuned encoders? (I'm not sure how much time is spent in VQ/psychoacoustics/Rice encoding etc.) Maybe such projects already exist and I just don't know about them?

  Please excuse me for my terrible English... 

Two lame questions on Vorbis

Reply #1
1) It would work just fine to have two streams of very different sizes in an Ogg file. This is what generally happens for movies, and OGM (an Ogg derivative) handles it perfectly. Assuming the FLAC frames cover the same amount of time as the Vorbis frames, you would have 1 FLAC frame, 1 Vorbis frame, 1 FLAC frame, etc., no matter how many bytes each takes up. If the FLAC frames are twice as long (time-wise) as the Vorbis ones, you'd just get 2 Vorbis frames between each FLAC frame. In the end, though, all the timecodes line up.
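The mechanics of this follow from the RFC 3533 page layout: each page carries a 32-bit serial number identifying its logical stream plus its own granule position (timestamp), so streams of wildly different bitrates interleave without sync problems. A minimal sketch of walking the pages of a physical bitstream (single-segment pages only, no lacing continuation handled):

```python
import struct

def ogg_pages(data):
    """Walk the Ogg pages of a physical bitstream (RFC 3533 layout)
    and yield (serial_number, granule_position, payload_size).
    Pages of different logical streams are told apart solely by
    their serial number; each page timestamps its own stream via
    the granule position, so sizes never have to match up."""
    pos = 0
    while pos + 27 <= len(data):
        assert data[pos:pos + 4] == b'OggS', "lost page sync"
        # Offsets per RFC 3533: granule at 6 (8 bytes LE),
        # serial at 14 (4 bytes LE), segment count at 26.
        granule, serial = struct.unpack_from('<qI', data, pos + 6)
        n_segs = data[pos + 26]
        payload = sum(data[pos + 27:pos + 27 + n_segs])
        yield serial, granule, payload
        pos += 27 + n_segs + payload
```

A demuxer wanting only the FLAC stream just skips pages whose serial number belongs to the other stream, which is why extraction stays fast even when the streams differ 10x in size.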

1.5) I don't know what can do this, though. Maybe something that writes OGMs could produce a compliant Ogg file?

2) That would be REALLY awesome. One problem might be that a lot of the ATI cards can only support 24-bit floating point, which may not be enough precision. I remember there was an audio processing program which used the GPU, but it would only run on nVidia's cards. (This is changing, though: any PS3.0 card can do 32-bit floating point stuff, and I think ATI's new x1800 and the like are PS3.0)
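To get a feel for what that precision gap means, here is a small sketch that rounds a value to a given number of explicit mantissa bits (an idealization: round-to-nearest, exponent range ignored). IEEE 754 single precision carries 23 explicit mantissa bits; a 24-bit GPU format with a 16-bit mantissa quantizes roughly 128x more coarsely.

```python
import math

def round_mantissa(x, bits):
    """Round x to `bits` explicit mantissa bits, mimicking the
    precision of a reduced GPU float format (assumption:
    round-to-nearest, no exponent-range limits)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)      # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << (bits + 1)   # explicit bits plus the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

x = 1.0 + 1e-5
err_fp32 = abs(round_mantissa(x, 23) - x)   # ~23-bit mantissa (fp32)
err_gpu24 = abs(round_mantissa(x, 16) - x)  # ~16-bit mantissa (24-bit GPU float)
```

Whether that extra rounding noise would be audible after an MDCT and quantization is a separate question, but it shows why 24-bit floats made people nervous about audio work.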

3) Your English is more or less perfect.
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2