
FLAC going wireless

Hello everyone!

Thanks for checking out my topic. I'm hoping someone out there can confirm whether this is possible, or let me know what is.

I currently work with wireless audio products and have been thinking of using FLAC as the audio transport when streaming audio/music from point A to point B. I know there are many devices out there using Wi-Fi or their own proprietary wireless systems and codecs that can handle all this, but I was wondering if anyone has solid answers and actual numbers regarding FLAC's performance under my current constraints.

When actually streaming FLAC from point A to point B, how much bandwidth is required for a minimum of CD quality (44.1 kHz, stereo, 16-bit)?
From other reading I have found two unconfirmed sources saying it was around 1 MBps; can anyone confirm this?

Also, how much processing power is necessary for encoding/decoding, again at minimum CD quality? I know encoding requires more computing power. The DSP I am currently working with can squeeze out about 64 MIPS; I'm wondering if this is ample or if I need to look into a more powerful part.

What is the actual audio latency of FLAC, after encoding/decoding, for an audio stream when watching movies? For example, connecting point A (the transmitter) to a movie-playing device and streaming via FLAC to point B (the receiver), how far off will the audio be from the on-screen action? Many say 30-40 ms is about the most latency you can add before the audio noticeably falls out of sync with the video.

Thanks again for reading this all the way through; your input will be much appreciated!!

-Nicki:)

FLAC going wireless

Reply #1
Well, 1 MBps may be a typical value, it may even be a little high for a typical value, but it is not the maximum. With a lossless codec there is no guarantee that a file will compress at all (white noise will have little or no compression).

Given that, and the difficulties in implementing compression and decompression on-the-fly, plus problems with delay, etc., I would recommend you rethink what you are doing and just transmit uncompressed wav data, which is only 1.4 MBps.

FLAC going wireless

Reply #2
Quote
I currently work with wireless audio products and have been thinking of using FLAC as the audio transport when streaming audio/music from point A to point B. I know there are many devices out there using Wi-Fi or their own proprietary wireless systems and codecs that can handle all this, but I was wondering if anyone has solid answers and actual numbers regarding FLAC's performance under my current constraints.

When actually streaming FLAC from point A to point B, how much bandwidth is required for a minimum of CD quality (44.1 kHz, stereo, 16-bit)?
From other reading I have found two unconfirmed sources saying it was around 1 MBps; can anyone confirm this?


I take it you are referring to a setup similar to the SlimServer? Is there any reason you need to use that much bandwidth in the product you're designing? Wouldn't it be easier to transcode from lossless to a transparent lossy codec on the fly? (That might actually require more computing power, but the bandwidth constraints would be lower.) To stream FLAC files and reduce the overhead, it would be easier to place them in an Ogg container, hence Ogg FLAC. I am not entirely sure about the bandwidth requirements; it really depends on what type of network you will be streaming this media over. To give you some idea: plain Ethernet just won't be efficient, but Fast Ethernet might be. Josh probably knows the specifics.

Quote
Also, how much processing power is necessary for encoding/decoding, again at minimum CD quality? I know encoding requires more computing power. The DSP I am currently working with can squeeze out about 64 MIPS; I'm wondering if this is ample or if I need to look into a more powerful part.


You're designing your own audio codec for this project? Why don't you just stick with an open-source option? You could use FLAC or Ogg Vorbis; the SDKs for both are portable. And yes, generally, encoding requires more computing power.
budding I.T professional

FLAC going wireless

Reply #3
.... 1 MBps .... 1.4 MBps.
Surely this should be Mbps rather than MBps?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

FLAC going wireless

Reply #4
The 1 Mbps is probably typical of the loudest of today's music releases, representing the average bitrate of whole tracks. The peak bitrate may be somewhat higher, perhaps as much as 1.41 Mbps + overhead when FLAC transmits verbatim blocks, uncompressed.

I guess you have some idea of the streaming bandwidth you can rely on.

I guess you have some idea of the memory you can devote to a buffer, and whether you have any real-time constraints (which it sounds like you do, since you're talking about latency).

From example source material of the type you want (e.g. movie soundtracks, music CDs etc) you can compress using FLAC, say (perhaps inside an OGG container for better bitstream transport) at the expected FLAC compression setting. From a large, varied selection, you can get an idea of how often you'd experience buffer under-runs and thus the typical proportion of time you'd need to drop below true lossless CD quality, if at all.

These occasions must include large amplitude data approaching full-scale, so the signal to quantization noise ratio would be very high at these times.

You could consider falling back to truncating, rounding or better still dithering your source to a lower bit-depth for those occasions (e.g. 14-bits or 12-bits), reducing a very high signal to noise ratio to a pretty high SNR. FLAC can then use the wasted bits feature (as exploited by lossyWAV) to reduce the bitrate required (in fact, you control the encoder, so you can simply tell it how many bits must be encoded).
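A minimal sketch of that bit-depth reduction, in Python (illustrative only: it rounds rather than dithers, and assumes 16-bit signed integer samples). Zeroing the low-order bits is exactly what lets FLAC's wasted-bits detection skip coding them:

```python
# Illustrative sketch (not production dithering): round 16-bit samples to a
# lower effective bit depth by zeroing the low-order bits, so a FLAC encoder
# can detect them as "wasted bits" and avoid coding them.
def reduce_depth(sample, target_bits, total_bits=16):
    shift = total_bits - target_bits          # low bits to discard
    step = 1 << shift                         # quantization step size
    lo = -(1 << (total_bits - 1))             # -32768 for 16-bit input
    hi = (1 << (total_bits - 1)) - step       # largest representable step multiple
    q = (sample + step // 2) // step * step   # round to nearest multiple of step
    return max(lo, min(hi, q))                # clamp to the valid sample range

print(reduce_depth(12345, 12))                # low 4 bits cleared -> 12352
```

A real implementation would add TPDF dither before the rounding step to decorrelate the quantization noise, as the post suggests.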

Another option of falling away from true lossless rather gracefully may be to base your system on Wavpack, falling back to wavpack lossy or using wavpack lossy full time, with a relatively high specified bitrate target. If the specified bitrate is enough to be lossless, I believe that it will be, but otherwise quality will gradually diminish during loud and difficult-to-compress sections. If you specify a bitrate below about 500 kbps, you'll rarely reach lossless except in quiet sections or highly tonal music, but should retain transparency in your lossy encoding almost always. There is a range of compression modes, from fast to very slow, so you can optimize for your processor. Like FLAC, latency/delay ought to be tiny.

There are other options, such as CELT (currently in beta), a lossy transform codec with typically under 10 ms of delay, though this might be a little too intensive for your processor; that's pure guesswork on my part.
Dynamic – the artist formerly known as DickD

FLAC going wireless

Reply #5
MIPS figures are vague, but there are lots of devices with 74 MHz ARM chips running libFLAC with headroom to spare, so I think you'll be OK for decoding.

As for latency: if you are transmitting audio and video separately and they each have fixed delays (FLAC's can be fixed), you can just add delay to whichever stream is ahead to achieve good sync, right? The delays should be small enough that a small buffer will even things out.

Coding delay (not quite the same as the latency you're describing) in FLAC is determined by the block size, which is just an encoder setting. It can be tuned as low as you want, but there is a sweet spot for compression: if the block size gets too big or too small, compression starts to suffer.
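To put rough numbers on that, the per-block buffering delay is simply blocksize divided by sample rate. A quick sketch in Python (assuming 44.1 kHz CD-rate audio; 4096 is FLAC's default block size):

```python
# Per-block coding delay in FLAC: the encoder must buffer a full block of
# samples before it can emit anything, so delay = blocksize / sample_rate.
def block_delay_ms(blocksize, sample_rate_hz=44100):
    return 1000.0 * blocksize / sample_rate_hz

for bs in (512, 1152, 4096):                  # 4096 is the FLAC default
    print(bs, "samples ->", round(block_delay_ms(bs), 1), "ms")
```

So even the default block size contributes well under 100 ms of coding delay, before network buffering is counted.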

FLAC going wireless

Reply #6
Quote
When actually streaming FLAC from point A to point B, how much bandwidth is required for a minimum of CD quality (44.1 kHz, stereo, 16-bit)?
From other reading I have found two unconfirmed sources saying it was around 1 MBps; can anyone confirm this?

Also, how much processing power is necessary for encoding/decoding, again at minimum CD quality? I know encoding requires more computing power. The DSP I am currently working with can squeeze out about 64 MIPS; I'm wondering if this is ample or if I need to look into a more powerful part.

I assume you're considering using Bluetooth and concerned about total bandwidth since even with Bluetooth EDR, you're limited to 2Mbps or 3Mbps. If you're using WiFi, then you've got enough bandwidth for full raw uncompressed audio and you'd have no latency issues at all. And since you mention having 64MIPS, I'm guessing you're using the Kalimba DSP on a CSR BlueCore5-MM part?

Anyway, here are some numbers.

Full, uncompressed 16-bit, 44.1 kHz stereo would require 1.4112 Mbps for data, not counting packet overhead and retransmissions for packet loss. The math is pretty simple: 16 × 44100 × 2.

FLAC is said to compress files to 30-70% of their original size, with an average around 58%. So that would be about 818 kbps for data, again not counting packet overhead, which is not insignificant. And of course that's the average; as others have pointed out, some *bad* sources could require much more, up to the full 1.4 Mbps.
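The arithmetic behind those figures, sketched in Python (the 58% is the typical compression ratio quoted above, not a guaranteed bound):

```python
# Raw PCM payload bitrate: bits_per_sample * sample_rate * channels,
# excluding all packet overhead and retransmissions.
def raw_bitrate_bps(bits=16, rate_hz=44100, channels=2):
    return bits * rate_hz * channels

raw = raw_bitrate_bps()        # 1411200 bps = 1.4112 Mbps for CD audio
avg_flac = raw * 0.58          # assumed typical FLAC ratio -> ~818 kbps
print(raw, int(avg_flac))
```

For link budgeting you would plan around the raw figure as the worst case, since a lossless codec can always fall back to (near-)uncompressed output.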

I'm not positive whether Bluetooth EDR would be able to handle that worst-case scenario, but it's possible. It depends on how much overhead is really there and how reliable the signal is. A rule of thumb I've heard is that you only get half of the rated bandwidth as usable throughput, but that could be wrong.

Quote
As for latency: if you are transmitting audio and video separately and they each have fixed delays (FLAC's can be fixed), you can just add delay to whichever stream is ahead to achieve good sync, right? The delays should be small enough that a small buffer will even things out.

I assume he's talking about a situation where only the audio is transmitted wirelessly, say from an iPod or a TV, where the user sees the video in "realtime" and hears the audio slightly delayed. Thus the latency of the compression and transmission is important.

-Snappy

FLAC going wireless

Reply #7
Quote
You could consider falling back to truncating, rounding or better still dithering your source to a lower bit-depth for those occasions (e.g. 14-bits or 12-bits), reducing a very high signal to noise ratio to a pretty high SNR. FLAC can then use the wasted bits feature (as exploited by lossyWAV) to reduce the bitrate required (in fact, you control the encoder, so you can simply tell it how many bits must be encoded).

Another option of falling away from true lossless rather gracefully may be to base your system on Wavpack, falling back to wavpack lossy or using wavpack lossy full time, with a relatively high specified bitrate target. If the specified bitrate is enough to be lossless, I believe that it will be, but otherwise quality will gradually diminish during loud and difficult-to-compress sections. If you specify a bitrate below about 500 kbps, you'll rarely reach lossless except in quiet sections or highly tonal music, but should retain transparency in your lossy encoding almost always. There is a range of compression modes, from fast to very slow, so you can optimize for your processor. Like FLAC, latency/delay ought to be tiny.

How to degrade gracefully is a really interesting question and one that seems to be missing from a lot of the lossless algorithms out there. For reading and writing files, it's not usually an issue. But for any realtime transmission system, it becomes important. I'm guessing that systems like SqueezeCenter using WiFi just use a buffer and assume that WiFi has enough bandwidth to retransmit in time to recover from any temporary disruption. For something like Bluetooth with much less bandwidth, you also use packet retransmission to recover, but this doesn't always work even with the default SBC and would presumably be much worse with a higher bandwidth lossless compression.

So, besides retransmitting, how do you degrade so the listener doesn't just hear the audio drop out?

It would seem that degrading to a psychoacoustic codec such as MP3 or AAC would be the best experience from a user perspective, since you'd be losing the pieces they wouldn't notice. But since most, if not all, lossless compression algorithms are not psychoacoustic to start with, this would be hard for them to do. I guess you could implement two completely different algorithms, but that might just be too complex. And MP3 and AAC require much more processing power anyway, so they aren't really options for portable devices.

Dynamic suggested dropping the bit depth from 16 bits to 14 or 12. You could also consider sampling at a lower rate, say 32 kHz instead of 44.1 kHz. You could also use WavPack or another algorithm with lossy capabilities. So which of these options is best? Which would be least noticeable to the listener? Which is easiest to implement?

I've heard some people say that 32 kHz is totally adequate for personal headphones. I could see that being true if you've got average-quality 30 mm drivers. How important is a full 16 bits vs. 14 or 12? How well does WavPack degrade at lower bitrates? Is it better than just cutting the bit depth or sample rate of FLAC? What about switching to ADPCM, which is pretty simple and compresses reasonably well at 4:1?

I imagine that changing the sample rate dynamically would be a bit of a pain to implement since you've probably buffered up the samples already before encoding. But perhaps that's a non-issue.

One thing I like about WavPack is the hybrid mode where you have a lossy stream and a correction stream. If you've got enough bandwidth for both streams, then you can reconstruct the lossless audio from the combined streams. But if you run out of bandwidth, then only retransmit the lossy stream. If you find the bandwidth suddenly comes back, then retransmit the rest of the correction stream. That *might* be the best case scenario since there's the potential to recover the complete lossless audio before the buffer runs out but have the safety of focusing on a reasonable lossy stream as a higher priority. Depending on how the bandwidth degrades, it might be a better listener experience and still be fairly easy to implement.
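A hypothetical sketch of that prioritization, in Python (this is not WavPack's actual API; the byte sizes and the `schedule` function are invented for illustration). Per block, the lossy stream is always sent, and the matching correction stream is sent only when the remaining bandwidth budget allows, so playback degrades to lossy instead of dropping out:

```python
# Hypothetical scheduler for a hybrid lossy + correction stream.
# blocks: list of (lossy_bytes, correction_bytes) sizes per audio block.
# budget: total bytes we can transmit for this window.
def schedule(blocks, budget):
    decisions = []
    for lossy, corr in blocks:
        if budget >= lossy + corr:
            decisions.append("lossless")   # both parts fit: full reconstruction
            budget -= lossy + corr
        else:
            decisions.append("lossy")      # correction part dropped this block
            budget -= lossy                # lossy part always goes out
    return decisions

print(schedule([(100, 60), (100, 60), (100, 60)], 380))
```

A real system would also handle late-arriving correction data (re-sending it while the buffer still holds the lossy block), which is the "bandwidth suddenly comes back" case described above.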

Of course the current implementations of WavPack that I can see don't do this automatically. Anyone out there implemented something like this? Anyone want to?

On a related note, at low bitrates such as 384 kbps, which is better: ADPCM, SBC, or WavPack? I want to know, if I use something like WavPack and it has to degrade to that low bandwidth, will it sound much worse than if I had just switched to some other algorithm? Or will it hold its own?

-Snappy

FLAC going wireless

Reply #8
Quote
If you're using WiFi, then you've got enough bandwidth for full raw uncompressed audio and you'd have no latency issues at all.


It's not that easy. I'd assume that compressing/decompressing isn't much of a problem, but the network transport is. To avoid constant stuttering you need some buffering on the client side, which results in delay. I doubt there would be no audible delay with most standard solutions if you're playing live.

VoIP is one area where this is critical. A quick search showed that 150 to 250 ms seems to be typical.

For normal video playback it's probably easiest to use a player where you can adjust the audio/video delay. But the only way to be sure is to try it.

FLAC going wireless

Reply #9
CELT has a much reduced delay, I believe.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

 

FLAC going wireless

Reply #10
Quote
CELT has a much reduced delay, I believe.

Yes, see link at bottom of my previous post for more info.

If you're really tight on bandwidth and need low delay and high quality, CELT is a quality-oriented transform codec (rather than a speech-oriented but musically poor codec like CELP or Speex) that can do pretty well for stereo quality at 128 kbps (not as well as optimally tuned but high-latency MP3, AAC, Vorbis or Musepack, but pretty well), and it can be run at higher bitrates for better transparency.

Quote
On a related note, at low bitrates such as 384 kbps, which is better: ADPCM, SBC, or WavPack? I want to know, if I use something like WavPack and it has to degrade to that low bandwidth, will it sound much worse than if I had just switched to some other algorithm? Or will it hold its own?

I have no knowledge of SBC other than what it does and that hi-fi Bluetooth headphones can sound very good. I know it's a subband codec, like MP2 or Musepack but with lower complexity, fewer subbands, and probably less frequency analysis done in the encoder. I know nothing of its latency, quality, or target bitrates.

ADPCM at 4 bits (353 kbps for a CD source) is surprisingly good, but the veil of noise is usually detectable when I've tried it, if I recall correctly.

WavPack lossy at around 350 kbps is typically considered transparent for all but the worst of its problem-sample set, and even then only non-transparent at very high volume in a quiet environment, if I recall correctly. It doesn't increase latency much, if at all, compared to the lossless mode. If you want low latency, you might need to specify shorter WavPack encoding blocks (e.g. a block size of 512 samples is 11.6 ms of buffering alone, before overall latency; the default of 4096 samples is 92.9 ms of block duration to be buffered).

So I'm very tempted to say that WavPack is your best bet of those three at 384 kbps, though sub-band coding (even with only 4 bands, or however many SBC uses) may allow sensible exploitation of masking, or even just ATH curves, to reduce the quantization in certain bands rather than all of them.

The WavPack PDF (164 kB) that David Bryant wrote for a book on audio coding describes how it works very nicely. Like most lossless coders, a predictor makes a first guess at the next sample value based on the previous few samples. The error (residual) of this prediction compared to the actual value must then be transmitted, and the statistics show that moderately low residual values are likely and high values are very unlikely, allowing a probabilistic entropy-coding scheme to encode the precise error as a bit sequence shorter than 16 bits most of the time, and just occasionally as one longer than 16 bits.

The kind of Golomb encoding scheme chosen for Wavpack's lossless mode is just slightly suboptimal (compared to, say FLAC's Rice coding) to allow the lossy & hybrid (lossy+correction) modes to be implemented gracefully. [edit: no longer true since Wavpack 4.0 which is superior to plain Rice coding - see David Bryant's post immediately below this]. Basically, the sequence of bits that encodes the prediction error can be cut short by a few bits (the remaining bits of the compressed code going to the correction file [edit: if you choose to create one]) and will still be close to the accurate value that you get from the full bit-sequence, producing a small remaining error (which is part of a "noise signal" that's added to the lossless original music). The decoder only has access to the slightly inaccurate sample value, so Wavpack lossy has to use the last few potentially-inaccurate values to predict the next sample value and then transmit its residual as accurately as is possible with the length of bit sequence allowed by the bitrate. If the allowed length of bit sequence is at least as big as the sequence required to accurately encode the value, then that sample value is stored losslessly.
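A toy illustration of the predict-then-code idea in Python (using plain Rice coding with a fixed parameter `k` and a trivial order-1 predictor; as noted above, WavPack's actual Golomb variant differs precisely so that codes can be cut short for the lossy mode, and real encoders choose `k` adaptively):

```python
# Rice-code one signed residual: zigzag-map to unsigned, then emit a unary
# quotient followed by a k-bit binary remainder.
def rice_encode(residual, k=2):
    u = 2 * residual if residual >= 0 else -2 * residual - 1  # zigzag map
    q, r = u >> k, u & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(k))

# Encode a sample stream with an order-1 predictor (guess = previous sample);
# small residuals produce short codes, large ones grow via the unary part.
def encode(samples, k=2):
    bits, prev = "", 0
    for s in samples:
        bits += rice_encode(s - prev, k)
        prev = s
    return bits

print(encode([3, 4, 4, 2]))   # residuals 3, 1, 0, -2 -> short bit strings
```

Cutting such a code short by a few trailing remainder bits (and shipping those bits in a correction stream) is the mechanism the paragraph above describes for WavPack's lossy and hybrid modes.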

So you can see that wavpack lossy mode can gracefully progress between completely lossless (if the signal complexity isn't too great for the bitrate allowed) to a gradually increasing veil of noise (which tends to be roughly in proportion to the loudness of the music at any one time, so tends to be masked quite consistently until the bitrate really gets quite low). For CD audio, the bitrate can go as low as 192 kbps (~2.2 bits per sample - the non-negotiable minimum, see Wavpack PDF) and 384 kbps is considered by the developer to be essentially transparent. Wavpack's method is flexible enough to produce surprisingly good quality with a few bits per sample. It can even be a viable option for speech-type signals with low sampling rates like 8000 Hz. 3 or 4 bits per sample with mono content at 8000 Hz would be 24 or 32 kbps and could be directly compared to ADPCM with 3 or 4 bits per sample. You can specify the target bits per sample instead of the target bitrate in Wavpack lossy producing identical results at the equivalent settings, and this would be a pretty fair way to compare it to ADPCM at any sampling rate you like. [edit: see below, David Bryant has measured the added noise to demonstrate Wavpack's margin of superiority to IMA ADPCM]

[edited to correct info about Golomb codes based on next post]
Dynamic – the artist formerly known as DickD

FLAC going wireless

Reply #11
Thanks Dynamic! Your detailed description saved me a lot of time... 

There are just two things I’d like to add. The first is that you mention that the Golomb coding scheme I use is just slightly sub-optimal (compared to pure Rice) to allow the lossy and hybrid modes. This was true back in the version 3.97 days when I had completely different coding schemes for the lossless and lossy modes. However, one of the things I came up with for version 4.0 was a variation of the Golomb scheme that gives better compression than pure Rice and works equally well for all modes. This simplified the implementation; the only disadvantage is that it is slightly more complicated than the old lossless coding scheme which is why even the fastest lossless mode of version 4.0 is not as fast decoding as the equivalent 3.97 mode.

The other thing is that I have done direct comparisons between WavPack’s lossy mode and IMA ADPCM and measured about 10-12 dB less added quantization noise in WavPack. Or, put another way, WavPack achieves with 2.5 bits per sample what ADPCM achieves with 4 bits per sample. Another comparison with ADPCM and the use of WavPack in another low-latency, high-quality application is described in this paper:

http://www.ibr.cs.tu-bs.de/users/kurtisi/p...Pack_icme08.pdf

Cheers,
David