Converting to Opus: 44.1 kHz resampled to 48 kHz
gurkburk
post Sep 17 2012, 18:40
Post #1





Group: Members
Posts: 1
Joined: 17-September 12
Member No.: 103219



So I have a bunch of regular 44.1 kHz FLAC files that I'd like to convert to Opus so I have something to listen to when I run. That's not very hard to set up following this guide:
http://www.saunalahti.fi/~cse/Opus/

What does concern me is that opusenc.exe seems to default to a sampling rate of 48000 Hz when you feed it a raw stream (the '-' option, as seen in the guide, seems to do that). It's easy to see that the .opus files do indeed get converted/upsampled to 48000 Hz using this method.
So my question is: should I override this behavior by adding --raw-rate 44100 to the encoder?
Does it matter at all that the files get upsampled? Could it introduce noise or other unwanted side effects?

Thanks :)

(Yes, I know that there aren't many Android players available yet for Opus playback, but the question still remains.) :)
Also, I think this post belongs in this forum, at least it seemed like a good fit, but if I'm wrong please move it :\


This post has been edited by gurkburk: Sep 17 2012, 18:41
[JAZ]
post Sep 17 2012, 19:03
Post #2





Group: Members
Posts: 1710
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



Opus does not natively support 44.1 kHz. The idea is that since 48 kHz can store a 44.1 kHz signal, this simplifies the algorithm. It then just stores the original format and size so that a decoder can decode back to the same values, although a decoder is allowed to decode at another sample rate.

You might need the --raw-rate 44100 switch to tell it that the input is at 44.1 kHz so that it resamples.
Otherwise, I believe it might encode the data directly as if it were a 48 kHz signal, so playback would sound wrong (I haven't tried it, and I don't know how the command-line encoder behaves in this specific case).

You can either resample in the program feeding the stream (like foobar2000) or let Opus resample. I don't know much about the internal resampler's quality, but I know it's not bad.
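
To put numbers on the 44.1 kHz to 48 kHz conversion, here is a small sketch of the arithmetic only. It does no actual resampling, and the helper name `resampled_length` is made up for illustration.

```python
from fractions import Fraction

SRC_RATE = 44100   # CD-style input rate
DST_RATE = 48000   # rate Opus works at for this input

# Exact conversion ratio; Fraction reduces it to lowest terms.
ratio = Fraction(DST_RATE, SRC_RATE)

def resampled_length(n_samples: int) -> Fraction:
    """Length at 48 kHz that has the same duration as n_samples at 44.1 kHz."""
    return n_samples * ratio

print(ratio)                     # 160/147
print(resampled_length(44100))   # one second of input is 48000 output samples

# Upsampling cannot lose bandwidth: the target Nyquist limit is the higher one.
print(SRC_RATE / 2, DST_RATE / 2)
```

Every 147 input samples map onto 160 output samples, and everything below 22.05 kHz in the source still fits below the 24 kHz limit of the 48 kHz stream.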
LithosZA
post Sep 17 2012, 19:07
Post #3





Group: Members
Posts: 181
Joined: 26-February 11
Member No.: 88525



QUOTE
So my question is: should I override this behavior by adding --raw-rate 44100 to the encoder?

If it is a raw PCM stream then it doesn't contain any header information, so the encoder won't know what it is working with and you have to supply the correct values with the raw switches.
If it isn't raw, like a WAV file, then you do not use any of those options, because the sampling rate is already known from the header.

Every non-48 kHz input rate will be converted to 48 kHz internally, so the output will always be at 48 kHz.
If you don't trust the internal resampler then you could always use something like SoX.
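
The difference between raw PCM and a WAV file can be seen with Python's standard `wave` module. This is only a sketch of the container idea and has nothing to do with how opusenc parses its input: the same bytes are written once as bare samples and once with a header, and only the second form carries its own sample rate.

```python
import io
import wave

RATE = 44100

# One second of 16-bit mono silence as raw PCM: just samples, no metadata.
raw_pcm = b"\x00\x00" * RATE

# Wrap the same samples in an in-memory WAV container.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)       # bytes per sample
    w.setframerate(RATE)
    w.writeframes(raw_pcm)

# Reading the WAV back recovers the format from the header.
buf.seek(0)
with wave.open(buf, "rb") as r:
    header_rate = r.getframerate()
    header_channels = r.getnchannels()
    payload = r.readframes(r.getnframes())

print(header_rate, header_channels)
print(payload == raw_pcm)   # the audio data itself is identical
```

Nothing in `raw_pcm` says whether it is 44.1 kHz or 48 kHz, which is why a raw stream needs the rate supplied on the command line.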
Kohlrabi
post Sep 17 2012, 19:08
Post #4





Group: Super Moderator
Posts: 953
Joined: 12-March 05
From: Kiel, Germany
Member No.: 20561



QUOTE (gurkburk @ Sep 17 2012, 19:40) *
That's not very hard to set up following this guide:
http://www.saunalahti.fi/~cse/Opus/
This guide is somewhat obsolete; you don't need to set up a custom encoder profile for Opus anymore.

QUOTE (gurkburk @ Sep 17 2012, 19:40) *
What does concern me is that opusenc.exe seems to default to a sampling rate of 48000 Hz when you feed it a raw stream (the '-' option, as seen in the guide, seems to do that). It's easy to see that the .opus files do indeed get converted/upsampled to 48000 Hz using this method.
So my question is: should I override this behavior by adding --raw-rate 44100 to the encoder?
Opus does not support a 44.1 kHz sample rate, so setting the input sample rate does not change the fact that the stream has to be resampled to 48 kHz.

QUOTE (gurkburk @ Sep 17 2012, 19:40) *
Does it matter at all that the files get upsampled? Could it introduce noise or other unwanted side effects?
Highly unlikely, though you can use any other foobar2000 DSP resampler to resample to 48kHz before conversion.

This post has been edited by Kohlrabi: Sep 17 2012, 19:09


--------------------
Audiophiles live in constant fear of jitter.
lvqcl
post Sep 17 2012, 19:14
Post #5





Group: Developer
Posts: 3213
Joined: 2-December 07
Member No.: 49183



QUOTE (gurkburk @ Sep 17 2012, 21:40) *
when you feed it a raw stream (the '-' option, as seen in the guide, seems to do that)

No, it doesn't. foobar2000 doesn't send just raw data to encoders, and the Opus encoder does know the sample rate of the input file.
benski
post Sep 17 2012, 19:17
Post #6


Winamp Developer


Group: Developer
Posts: 669
Joined: 17-July 05
From: Ashburn, VA
Member No.: 23375



How does the internal resampling affect gapless playback (and does Opus even store enough metadata to allow for gapless playback)?
Kohlrabi
post Sep 17 2012, 19:28
Post #7





Group: Super Moderator
Posts: 953
Joined: 12-March 05
From: Kiel, Germany
Member No.: 20561



QUOTE (benski @ Sep 17 2012, 20:17) *
How does the internal resampling affect gapless playback (and does Opus even store enough metadata to allow for gapless playback)?
Opus is intrinsically gapless.


benski
post Sep 17 2012, 20:22
Post #8


Winamp Developer


Group: Developer
Posts: 669
Joined: 17-July 05
From: Ashburn, VA
Member No.: 23375



QUOTE (Kohlrabi @ Sep 17 2012, 14:28) *
QUOTE (benski @ Sep 17 2012, 20:17) *
How does the internal resampling affect gapless playback (and does Opus even store enough metadata to allow for gapless playback)?
Opus is intrinsically gapless.


Does the internal resampler add any padding at the beginning or the end of the stream, then? Some resampling implementations will do this because they don't compensate for the group delay or because they let the filter decay out at the end.
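
For anyone unfamiliar with the padding issue being asked about, here is a toy sketch. It uses a made-up symmetric 5-tap kernel and plain convolution, not the resampler Opus actually uses, to show where the extra samples come from and how a filter can compensate for them.

```python
def convolve(signal, taps):
    """Full FIR convolution; output has len(signal) + len(taps) - 1 samples."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out

# Hypothetical symmetric low-pass kernel; real resamplers use far longer ones.
TAPS = [1 / 9, 2 / 9, 3 / 9, 2 / 9, 1 / 9]

# A symmetric FIR with N taps delays the signal by (N - 1) / 2 samples.
GROUP_DELAY = (len(TAPS) - 1) // 2

impulse = [1.0, 0.0, 0.0, 0.0]
filtered = convolve(impulse, TAPS)

# The impulse went in at index 0 but its peak comes out later.
peak_index = filtered.index(max(filtered))

# Compensation: skip the delay at the head and cut the decaying tail,
# so the result has the same length and alignment as the input.
aligned = filtered[GROUP_DELAY:GROUP_DELAY + len(impulse)]

print(len(impulse), len(filtered), peak_index, len(aligned))
```

An implementation that returns `filtered` unchanged would add samples at both ends of every track, which is exactly what breaks gapless playback.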
Brand
post Sep 17 2012, 20:23
Post #9





Group: Members
Posts: 312
Joined: 27-November 09
Member No.: 75355



QUOTE (Kohlrabi @ Sep 17 2012, 20:28) *
Opus is intrinsically gapless.

It still produces a click/glitch with some problematic samples.
(So does every other lossy codec I tried, except for Vorbis.)
Kohlrabi
post Sep 17 2012, 20:26
Post #10





Group: Super Moderator
Posts: 953
Joined: 12-March 05
From: Kiel, Germany
Member No.: 20561



Apparently I might be wrong. It was deeply rooted at the back of my brain, but I cannot find any reference through Google right now. My usual test samples showed no problems at all, but that hardly proves anything.

This post has been edited by Kohlrabi: Sep 17 2012, 20:27


eahm
post Sep 17 2012, 21:12
Post #11





Group: Members
Posts: 884
Joined: 11-February 12
Member No.: 97076



QUOTE (gurkburk @ Sep 17 2012, 10:40) *
So I have a bunch of regular 44.1 kHz FLAC files that I'd like to convert to Opus so I have something to listen to when I run. That's not very hard to set up following this guide:
http://www.saunalahti.fi/~cse/Opus/

Not to steal the fun of creating a new encoder option, but the new version of foobar2000 already has Opus in its settings :)
yourlord
post Sep 17 2012, 21:27
Post #12





Group: Members
Posts: 172
Joined: 1-March 11
Member No.: 88621



I believe I read in another thread that Opus can't truly do gapless, though it can get pretty close to it.

I think I've settled on continuing to use Vorbis for my lossy file encodes because it's the best lossy codec I've found for handling gapless, and gapless matters to me.

I want to see Opus succeed in the interactive space and in streaming. If I were currently working on software or a project that involved either, I would use Opus, hands down.
Dynamic
post Sep 19 2012, 01:15
Post #13





Group: Members
Posts: 793
Joined: 17-September 06
Member No.: 35307



To the OP's question, the resampler in Opus is very good and will be transparent and time-aligned by the time it's decoded (its delay is specified by the encoder as part of the pre-skip delay and stripped out by the decoder), so you need not worry. In a lossless encoder, perceptual transparency is all that matters.

Also regarding gaplessness, all CD-sourced material is an integer number of CD frames in length, each lasting 1/75th of a second. Fortunately this is an integer number of samples at both 48000 Hz (640 samples) and 44100 Hz (588 samples), so the accurate length can be precisely specified at either sampling rate and it works in Opus regardless. 8)
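
The CD-frame figures can be checked with a few lines of exact arithmetic. This is a sketch, and the function name is made up for illustration.

```python
from fractions import Fraction

CD_FRAMES_PER_SECOND = 75

def samples_per_cd_frame(rate_hz: int) -> Fraction:
    """Exact number of samples in one 1/75 s CD frame at the given rate."""
    return Fraction(rate_hz, CD_FRAMES_PER_SECOND)

for rate in (44100, 48000):
    per_frame = samples_per_cd_frame(rate)
    # denominator == 1 means the frame is a whole number of samples
    print(rate, per_frame, per_frame.denominator == 1)
```

Both rates are multiples of 75, so a track that is a whole number of CD frames long is also a whole number of samples long at 44.1 kHz and at 48 kHz.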

An important addition to gapless playback for some sources is glitchless playback. Inherently, image+CUE files are glitchless but it seems nearly everyone desires track-per-file, few players support CUEsheets and many mainstream rippers are both insecure and don't support image+CUE.

Glitchless means that there should be no discontinuities in important aspects of the audio that could be perceived as a sound not present in the original music. Typically there might be level discontinuities that result in a broad band pop or click, or there might be slope discontinuities for example, with similar effect.
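
As a rough illustration of a level discontinuity at a track join, the sketch below compares the step across the join with the largest step inside the tracks. The threshold `factor` and all function names are arbitrary choices for this example, and this is in no way a perceptual test.

```python
import math

def boundary_jump(track_a, track_b):
    """Size of the step from the last sample of A to the first sample of B."""
    return abs(track_b[0] - track_a[-1])

def max_step(samples):
    """Largest sample-to-sample step inside one track."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

def looks_glitchy(track_a, track_b, factor=4.0):
    """Flag the join if the step across it dwarfs the steps within the tracks."""
    typical = max(max_step(track_a), max_step(track_b))
    return boundary_jump(track_a, track_b) > factor * typical

RATE = 48000
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(2000)]

# A continuous tone split in two: the join is as smooth as the rest.
clean_a, clean_b = tone[:1000], tone[1000:]

# Same split, but the second half has its polarity flipped,
# which creates a sudden level jump at the join.
broken_b = [-x for x in clean_b]

print(looks_glitchy(clean_a, clean_b))
print(looks_glitchy(clean_a, broken_b))
```

A check like this only catches crude level jumps. Slope discontinuities and codec artifacts at the join still need listening tests.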

Glitchlessness appears, from posts above, to be acceptable in Ogg Vorbis with a suitably gapless player. It's also no problem in MP3s encoded as an image plus CUE then split into tracks with pcutmp3 (which uses the LAME accurate length and offset to create sample-accurate cuts and uses MP3 frames each side of the cut point in both halves of the split to maintain continuity, but only works with a player that supports LAME's gapless fix of course)

Opus uses asymptotically convergent prediction and specifies that a pre-skip is included in the header, which is non-audio that must be decoded silently before the decoder produces any sound in order to sufficiently converge the predictors. This should be at least 80ms, and ideally a bit longer to eliminate glitches in the worst cases. In the case of individual tracks it's likely that an encoder will use silence, or potentially even try something 'clever' like the start of the track played backwards to train the predictors before the audio begins. As the header includes the pre-skip offset before audio should commence, it's entirely possible to feed the encoder with enough of the tail end of the previous track, to make it effectively glitchless.

I'd expect there's a bit of inertia to overcome to rewrite ripping software to actually provide the prior-track samples and offset to the start of desired track to the encoder in some way.

Perhaps the easiest way to implement glitchless Opus ripping is for the encoder to support Image+CUE input but to be told to output in file-per-track mode, whereby it can automatically pick the right amount of pre-roll and automatically include data from the end of the previous track. People who care about glitchless, gapless playback would be happy to use a ripper in Image+CUE mode, passing that to the Opus encoder. Otherwise or additionally, it will be necessary to implement a method to pass prior track audio samples plus the offset to the start of audio, or to specify a switch that requires a fixed amount of additional audio (e.g. 15 CD frames = 15/75s = 0.2 s) offset which the encoder will use to train the predictors in the decoder to match very closely what they would have been if the two tracks had been encoded as a single piece of audio.

About 0.2 s of unheard audio in a typical 180 second track will not appreciably increase the effective bitrate over a collection of audio tracks.
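
The overhead estimate works out as follows (a sketch of the arithmetic only, using the 15-frame and 180-second figures from this post):

```python
PREROLL_CD_FRAMES = 15
CD_FRAMES_PER_SECOND = 75
TRACK_SECONDS = 180

# 15 CD frames of extra, unheard audio per track.
preroll_seconds = PREROLL_CD_FRAMES / CD_FRAMES_PER_SECOND

# Fraction of extra audio that has to be encoded for each track.
overhead = preroll_seconds / TRACK_SECONDS

print(f"{preroll_seconds} s of pre-roll adds {overhead:.2%} to a {TRACK_SECONDS} s track")
```

That is roughly a tenth of a percent of extra encoded audio per track.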
NullC
post Sep 19 2012, 05:50
Post #14





Group: Developer
Posts: 200
Joined: 8-July 03
Member No.: 7653



QUOTE (Dynamic @ Sep 18 2012, 17:15) *
Also regarding gaplessness, all CD-sourced material is an integer number of CD frames in length, each lasting 1/75th of a second. Fortunately this is an integer number of samples at both 48000 Hz (640 samples) and 44100 Hz (588 samples), so the accurate length can be precisely specified at either sampling rate and it works in Opus regardless. 8)
Gah. Opus gives sample-accurate lengths for all sample rates <=48000. The (cool indeed) convenience of CD frame lengths isn't needed.

QUOTE
Glitchless means that there should be no discontinuities in important aspects of the audio that could be perceived as a sound not present in the original music. Typically there might be level discontinuities that result in a broad band pop or click, or there might be slope discontinuities for example, with similar effect.
Opusenc should achieve that already on inputs that are themselves glitchless, so long as the bitrate is high enough for the first and last frames that the distortion from lossiness doesn't break it.

QUOTE
Glitchlessness appears, from posts above, to be acceptable in Ogg Vorbis with a suitably gapless player.

And Opus is in the same boat as Vorbis WRT this.

QUOTE
Opus uses asymptotically convergent prediction and specifies that a pre-skip is included in the header, which is non-audio that must be decoded silently before the decoder produces any sound in order to sufficiently converge the predictors. This should be at least 80ms, and ideally a bit longer to eliminate glitches in the worst cases. In the case of individual tracks it's likely that an encoder will use silence, or potentially even try something 'clever' like the start of the track played backwards to train the predictors before the audio begins.
No, a newly initialized encoder and decoder are instantly converged by definition. The 80 ms advice applies to seeking. The pre-skip exists for three main reasons: to allow that convergence in streams that have been captured out of a longer-running stream (analogous to seeking), to allow sample-accurate trimming of existing encodes (same as the last reason, but a different use case), and to hide the non-constant encoder+decoder latency (the decoder's is constant, the encoder's depends on how it's set up) to give sample-accurate lengths.
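
A minimal sketch of how a player could apply pre-skip to get a sample-accurate result. It assumes pre-skip and the end position are both counted in 48 kHz samples, as the Ogg Opus mapping specifies. The sample values and the names `trim_decoded` and `final_granule` are made up for illustration, and no real decoder is involved.

```python
OPUS_RATE = 48000

def ms_to_samples(ms: float, rate: int = OPUS_RATE) -> int:
    """Convert a duration in milliseconds to a sample count."""
    return round(ms * rate / 1000)

def trim_decoded(decoded, pre_skip, final_granule):
    """Drop pre_skip samples at the head and anything past the end position."""
    return decoded[pre_skip:final_granule]

# Hypothetical stream cut out of a longer one: 80 ms of lead-in that is
# decoded but never played, 10 real samples, then 5 samples of padding.
pre_skip = ms_to_samples(80)
decoded = [0] * pre_skip + list(range(1, 11)) + [0] * 5
final_granule = pre_skip + 10

audio = trim_decoded(decoded, pre_skip, final_granule)
print(pre_skip, len(audio))
```

The listener only ever hears `audio`. The lead-in and the trailing padding exist purely so the decoder state and the length come out right.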

QUOTE
I'd expect there's a bit of inertia to overcome to rewrite ripping software to actually provide the prior-track samples and offset to the start of desired track to the encoder in some way.

The original data may help improve the quality of the first and last frames. But I've not yet found an example where the existing LPC extrapolation (what Vorbis does) for 'margin' filling is inadequate. Right now opusenc only LPC-extrapolates the end, not the beginning, and it really surprises me that this is enough, but so far it has been.

This post has been edited by NullC: Sep 19 2012, 05:52
Dynamic
post Sep 19 2012, 09:00
Post #15





Group: Members
Posts: 793
Joined: 17-September 06
Member No.: 35307



Thanks again for the corrections, NullC. You know your stuff. I'll rein myself in! I guess that means that if anyone has Opus (or Vorbis) files that don't play back gaplessly and glitchlessly, they should provide examples with ABX logs against the join in the lossless originals.

The sample-accurate lengths for CD-sourced material resampled to 48 kHz do mean, however, that it doesn't matter in the slightest whether the decoder plays back at 48 kHz or resamples back to the original 44.1 kHz: it's still precise.
Brand
post Sep 19 2012, 09:55
Post #16





Group: Members
Posts: 312
Joined: 27-November 09
Member No.: 75355



QUOTE (Dynamic @ Sep 19 2012, 10:00) *
I guess that means that if anyone has Opus (or Vorbis) files that don't play back gaplessly and glitchlessly, they should provide examples with ABX logs against the join in the lossless originals.

I linked to an example a few posts above. You should easily hear a glitch with the transition from m1 to m2 when encoded to Opus (or MP3/AAC). Well, it helps if you listen with headphones and raise the volume.
I also posted about this same problem in the other Opus thread and I uploaded the recordings of the transition. There was an Opus build that reduced the loudness of the glitch, but it didn't eliminate it.
I haven't heard much from others, but I'd be very surprised if I'm the only one who can hear this.

EDIT2: The only time I don't get a glitch is when I decode the opus files to 44.1k wav and then play them in foobar.

This post has been edited by Brand: Sep 19 2012, 10:32
NullC
post Sep 19 2012, 14:19
Post #17





Group: Developer
Posts: 200
Joined: 8-July 03
Member No.: 7653



QUOTE (Brand @ Sep 19 2012, 01:55) *
EDIT2: The only time I don't get a glitch is when I decode the opus files to 44.1k wav and then play them in foobar.
If you can't reproduce the problem with opusdec + opusenc then it may be something about foobar2000's behavior.
Brand
post Sep 19 2012, 15:25
Post #18





Group: Members
Posts: 312
Joined: 27-November 09
Member No.: 75355



I can reproduce with opusenc+opusdec. In fact opusdec by default gives me 44.1k files which don't glitch in Foobar. But if I force --rate 48000 I get the glitch again.

I also tried to unite the WAV files from opusdec with Wavosaur and I get the glitch with both 44.1k and 48k, which is a bit confusing (it means Wavosaur and Foobar are not consistent with how they handle the 44.1k files in this case).


EDIT: tried with two other audio editors and they behave like Foobar: no glitch with 44.1k files, glitch with 48k ones (Audacity makes the glitch very obvious, for example)

EDIT2: correction: there's a glitch with 44.1k as well, it's just much quieter. But anyway, 48k is the important one, since it's Opus' 'native' SR.

This post has been edited by Brand: Sep 19 2012, 15:41
pdq
post Sep 19 2012, 15:26
Post #19





Group: Members
Posts: 3305
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



QUOTE (Dynamic @ Sep 18 2012, 20:15) *
In a lossless encoder, perceptual transparency is all that matters.

You meant to say lossy encoder. In a lossless encoder, bit accuracy is all that matters.
Case
post Sep 19 2012, 16:10
Post #20





Group: Developer (Donating)
Posts: 2137
Joined: 19-October 01
From: Finland
Member No.: 322



I'm usually very sensitive to glitches, but I don't seem to hear the glitch in "new build small glitch.wav". It's very audible in "older build louder glitch.wav". Images from Audition's Spectral Frequency Display: old, new.
punkrockdude
post Oct 8 2012, 13:25
Post #21





Group: Members
Posts: 243
Joined: 21-February 05
Member No.: 20022



Hehe. Gurkburk, you can't be anything other than Swedish with that nickname.
C.R.Helmrich
post Jan 20 2013, 12:45
Post #22





Group: Developer
Posts: 682
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



I have a similar question to the OP's. In my case the input file has a 32 kHz sampling rate, and I want it to be encoded at high bitrates (i.e. CELT-only). There are two ways I can do this:

  • Encode the 32-kHz file as-is. This gives me a file decoded at 32 kHz, or decoded at 48 kHz if I specify "--rate 48000" in the decoder command-line.
  • Upsample the input to 48 kHz externally, then encode it. This gives me a 48-kHz output file which contains some artificial content between 16 and 20 kHz.

Which of the two approaches is more efficient? Intuitively I would choose the first approach, since that one only encodes content actually present in the input, but maybe CELT is more efficient on 48-kHz than on 32-kHz input (and the 16-20-kHz spectral range consumes almost no bits)?

What do the experts say?

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
nu774
post Jan 20 2013, 16:35
Post #23





Group: Developer
Posts: 477
Joined: 22-November 10
From: Japan
Member No.: 85902



I'm not an "expert", but from what I can see in opusenc.c (and audio-in.c) of opus-tools, it resamples 32 kHz input to 48 kHz (with the Speex resampler) before sending it to the encoder.

CODE
  if(rate>24000)coding_rate=48000;
  else if(rate>16000)coding_rate=24000;
  else if(rate>12000)coding_rate=16000;
  else if(rate>8000)coding_rate=12000;
  else coding_rate=8000;

CODE
  if(rate!=coding_rate)setup_resample(&inopt,coding_rate==48000?(complexity+1)/2:5,coding_rate);

CODE
  st=opus_multistream_encoder_create(coding_rate, chan, header.nb_streams, header.nb_coupled,
     mapping, frame_size<480/(48000/coding_rate)?OPUS_APPLICATION_RESTRICTED_LOWDELAY:OPUS_APPLICATION_AUDIO, &ret);
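
For readers who find the C excerpts hard to follow, here is a Python restatement of the first two. It is only a sketch that mirrors the quoted lines. The function names are made up, and the rest of opusenc.c is not reproduced.

```python
def coding_rate_for(rate: int) -> int:
    """Rate the encoder is created with, following the ladder in the excerpt."""
    if rate > 24000:
        return 48000
    if rate > 16000:
        return 24000
    if rate > 12000:
        return 16000
    if rate > 8000:
        return 12000
    return 8000

def needs_resample(rate: int) -> bool:
    """The excerpt only sets up the resampler when the two rates differ."""
    return rate != coding_rate_for(rate)

def resampler_quality(coding_rate: int, complexity: int) -> int:
    """Quality argument in the excerpt; // matches C integer division here."""
    return (complexity + 1) // 2 if coding_rate == 48000 else 5

for rate in (8000, 16000, 24000, 32000, 44100, 48000):
    target = coding_rate_for(rate)
    print(rate, target, needs_resample(rate))
```

So both 32 kHz and 44.1 kHz inputs are resampled to 48 kHz, while inputs already at one of the five listed rates are passed through unchanged.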

nu774
post Jan 20 2013, 16:47
Post #24





Group: Developer
Posts: 477
Joined: 22-November 10
From: Japan
Member No.: 85902



That might not have been enough explanation...
"rate" in the first part is the actual input sampling frequency, and it gets resampled to "coding_rate".
C.R.Helmrich
post Jan 20 2013, 22:35
Post #25





Group: Developer
Posts: 682
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



Yes, I expected Opus to resample to 48 kHz internally. The question was more: if it does, will the CELT coder know that the input was upsampled from 32 kHz by the Speex resampler, and limit its encoding bandwidth to 16 kHz (the bandwidth of the original input file), or is the 20-kHz bandwidth of high-bitrate CELT hard-coded regardless of the input sampling rate?

Chris

