Full Version: not possible to encode a 3.5GB wave file
Hydrogenaudio Forums > Lossy Audio Compression > Ogg Vorbis > Ogg Vorbis - Tech
Tinobee
hi folks,

meanwhile i'm trying to find a solution to encode a 3.5 GB (live recording) wave file to either ogg or aac. i tried all possible frontends and also the command line itself, but it doesn't want to work. the black window pops up for a millisecond and closes again telling me "done". a message to the aotuv developer brought out that the reason must lie with the frontends. but why does no frontend seem able to handle such a big file, and why does the command line "oggenc.exe -q3 file.wav" not work either?

i had the idea to split the file and join it afterwards, but there are no ogg or aac joiners available out there.

does anybody in here know about an input file size limit concerning ogg and aac as well as all other codecs?
Synthetic Soul
I believe the problem is that the wave is larger than 2GB.

I wonder whether foobar may handle it.

I understand Audacity will handle wave files over 2GB. It also works with OGG files I believe.
CRYON
QUOTE(Synthetic Soul @ Nov 30 2005, 01:04 PM)
I believe the problem is that the wave is larger than 2GB.

I wonder whether foobar may handle it.

I understand Audacity will handle wave files over 2GB.  It also works with OGG files I believe.
*



the problem may lie in software support for files larger than 2GB. the codec is no problem, since only small chunks of data are passed while encoding

software commonly uses 32-bit file sizes, not 64-bit
this could be the problem
kritip
Try piping the file via stdin to the compressor; it may work then, as it won't have to handle the entire large file. Not sure if they accept stdin as an input though? Maybe some more knowledgeable person will know.

Kristian
CRYON
QUOTE(Tinobee @ Nov 30 2005, 12:39 PM)
hi folks,

meanwhile i'm trying to find a solution to encode a 3.5 GB (live recording) wave file to either ogg or aac. i tried all possible frontends and also the command line itself, but it doesn't want to work. the black window pops up for a millisecond and closes again telling me "done". a message to the aotuv developer brought out that the reason must lie with the frontends. but why does no frontend seem able to handle such a big file, and why does the command line "oggenc.exe -q3 file.wav" not work either?

i had the idea to split the file and join it afterwards, but there are no ogg or aac joiners available out there.

does anybody in here know about an input file size limit concerning ogg and aac as well as all other codecs?
*



i found this on the web
http://www.winappslist.com/multimedia/file_joiners.htm
software for joining
try this, maybe it helps you
http://www.multimedia-downloads.com/download/setup_mj.exe
more info at
http://www.multimedia-downloads.com/
kjoonlee
Um, the limit *should* be at 4 gigs, not 2 gigs.

http://www.hydrogenaudio.org/forums/index....ndpost&p=282931
CRYON
QUOTE(kjoonlee @ Nov 30 2005, 02:30 PM)
Um, the limit *should* be at 4 gigs, not 2 gigs.

http://www.hydrogenaudio.org/forums/index....ndpost&p=282931
*



if you use the standard functions for reading files (C, C++) you are using signed int vars, which are limited to 2GB
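
The 2GB versus 4GB distinction can be checked with a little arithmetic. This is only a sketch in Python, not code from any of the encoders discussed here; the 3.5 GiB size is a hypothetical stand-in for the file in question, and `as_signed_32` is a helper invented for the illustration.

```python
import struct

INT32_MAX = 2**31 - 1    # largest signed 32-bit value: 2 GiB minus one byte
UINT32_MAX = 2**32 - 1   # largest unsigned 32-bit value: 4 GiB minus one byte

def as_signed_32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", n & 0xFFFFFFFF))[0]

# Hypothetical size for illustration: a 3.5 GiB file.
size = int(3.5 * 2**30)

print(size <= UINT32_MAX)   # True: fits an unsigned 32-bit size field
print(size <= INT32_MAX)    # False: does not fit a signed 32-bit variable
print(as_signed_32(size))   # the negative value a signed variable would hold
```

A program that stores the size in a signed 32-bit variable sees a negative length, which would be one plausible reason for an encoder to report "done" immediately.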
Synthetic Soul
Exactly. The thread quoted is entitled "The 2GB Wav-limit!?", and in it cliveb explains that many applications unfortunately make this mistake.

It was this thread that actually made me suggest that the problem was the size of the file. I can't see what else it might be, unless the BPS are too high for the encoder?
VEG
Try this:
CODE
oggenc -q3 -o outfile.ogg - < infile.wav
mixminus1
Try reading it as raw data - this has worked in the past with LAME and 2GB+ WAV files that were recorded correctly but had improperly-written WAV headers. Command line in oggenc should be something like this:

oggenc -r -q4 infile.wav -o outfile.ogg

This assumes a 2-channel 16-bit/44.1 kHz file.
Tinobee
this
QUOTE
oggenc -r -q4 infile.wav -o outfile.ogg
works. :-)

but i noticed that the encoder swaps the channels: in the original file the left channel is louder, in the ogg file the right channel is louder.

also i would like to know the difference between this command line and the frontends. is it the "-r" that makes it work? what does it mean?

hope it's not a problem to ask if this -r would work with the aac and mpc encoders, as i am still trying to find out which encoder produces the smallest file, which is really important when working with such big files. and as i got no frontend to handle it i guess i need to run the command line. i want to avoid starting a new thread in each encoder section, so i hope it's ok to give me your command lines right here....

thanks for your support.

greets, tinobee
Synthetic Soul
QUOTE(oggenc.exe)
-r, --raw            Raw mode. Input files are read directly as PCM data

-r is unlikely to work in other encoders - each has its own syntax.

mppenc doesn't seem to have a switch - it looks like you need to pass a .raw to work in raw mode.

Edit: I think kritip and VEG's suggestions of using STDIN make the most sense.
Tinobee
i'm really thankful for your suggestions. problem is i dont know what this
QUOTE
I think kritip and VEG's suggestions of using STDIN makes most sense
means as i am not that technically experienced to handle with such words

i would even like to try this
QUOTE
it looks like you need to pass a .raw to work in raw mode
if i knew its meaning and how to practice.

chance to gimme some explanation about this? would it be helpful telling you the software i am working with so that you can tell me how to get your suggestions done?

thanks!

tinobee
Synthetic Soul
I am currently testing VEG's command line of:

oggenc -q3 -o outfile.ogg - < infile.wav

... on a 3.36GB wave file.

Basically, this command line says "Use OGGENC.EXE to encode to OGG at quality level 3. The destination is outfile.ogg, and (and here's the interesting bit) the input should be taken from the STDIN stream (the "-" bit)." The "< infile.wav" pipes "infile.wav" to the STDIN of OGGENC.EXE.

Using this method you can pipe the wave to any encoder that supports STDIN input.

LAME:

LAME -V2 --vbr-new - outfile.mp3 < infile.wav

MPPENC:

MPPENC - outfile.mpc < infile.wav

As I say, I'm testing myself at the moment - OGGENC is 33% in with almost 20 minutes to go...

Edit: I believe Case's NAAC (Nero AAC encoder) supports STDIN. I don't have Nero here at work so I can't test. Basically you would be looking at a command line like:

NAAC - outfile.aac < infile.wav
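
For anyone driving an encoder from a script rather than from the command prompt, the same redirection can be sketched in Python. This is an illustration only: `encode_via_stdin` is a made-up helper name, and the oggenc flags in the comment are simply the ones quoted in this thread.

```python
import subprocess
import sys

def encode_via_stdin(command, wav_path):
    """Run `command` with the file at wav_path connected to its stdin.

    Equivalent in spirit to `command < wav_path` on the command line:
    the child process reads a stream and never opens the file by name.
    """
    with open(wav_path, "rb") as wav:
        return subprocess.run(command, stdin=wav, check=False)

# Usage, assuming oggenc is on the PATH:
#   encode_via_stdin(["oggenc", "-q3", "-o", "outfile.ogg", "-"], "infile.wav")
```

The returned object carries the encoder's exit code in `returncode`, so a script can tell whether the encode failed.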
mixminus1
QUOTE(Synthetic Soul @ Nov 30 2005, 08:56 AM)
I am currently testing VEG's command line of:

oggenc -q3 -o outfile.ogg - < infile.wav

... on a 3.36GB wave file.

Basically, this command line says "Use OGGENC.EXE to encode to OGG at quality level 3. The destination is outfile.ogg, and (and here's the interesting bit) the input should be taken from the STDIN stream (the "-" bit)." The "< infile.wav" pipes "infile.wav" to the STDIN of OGGENC.EXE.

Using this method you can pipe the wave to any encoder that supports STDIN input.
*



Cool - didn't know this worked in MS-DOS/Windows command line...
Tinobee
would like to raise my technical knowledge by adding some understanding 'bout
QUOTE
STDIN
and
QUOTE
STDOUT


this is the missing puzzle piece for understanding your experiment a bit better.....

tinobee
Synthetic Soul
Hooray, it finished!

CODE
Done encoding file "outfile.ogg"

       File length:  340m 55.0s
       Elapsed time: 30m 59.0s
       Rate:         11.0036
       Average bitrate: 112.0 kb/s

(AMD Athlon XP 3200+ 2.19GHz 1GB RAM XP SP 2)

Note: this is also why foobar is an option. foobar seems fine with the 3.36GB, and will pipe data to encoders where possible.

You could use foobar 0.9 with the Musepack Converter profile which comes as standard.


Edit: I'm not the best qualified to explain STDIN and STDOUT.

Basically input and output to these encoders is via a stream. A file is a type of stream. STDIN and STDOUT are a type of stream also.

Think of an encoder as a man working in a cubicle. This man paints smarties. It's not a good job but it pays the rent. Normally the man will be given a container of smarties. He will paint the smarties, put them in another container, and pass it on. However, there is a conveyor belt that runs into his cubicle and ends there. Sometimes smarties come in one by one on the conveyor belt - he paints them as quickly as they arrive and then puts them in a container. When the conveyor belt stops bringing in smarties he passes the container on. There is another conveyor belt, that starts in his cubicle. Sometimes he is passed a load of smarties in a container and told to put them on this conveyor belt. He doesn't know where the conveyor belt goes and has never thought to ask. This schmo paints smarties for a living and probably beats his wife. Anyway, so he takes the smarties out of the container, paints them, and puts them onto the outgoing conveyor belt as fast as he can go. When the container is empty that's the job done.

Container: a file
Smartie: audio data
Conveyor in: STDIN
Conveyor out: STDOUT

Passing the wave file to an encoder using STDIN is the same as taking all the smarties (audio data) out of your container (wave file) and passing them to the encoder one by one on the conveyor belt (stdin) instead. As it is, a 3.56GB container is too heavy for the man to pick up. He is feeble and made ill by toxic smartie paint fumes.

Really bad analogy, but I kinda got caught up in the moment. :) I'm sure someone else can explain it in one sentence. I'm too busy wondering how much a guy who paints smarties would earn.

Edit: Also note that STDOUT of one application can be passed to STDIN of another. In this way you can decode from a lossless file and pass the audio data to another encoder. In this instance the conveyor belt which begins in the man's cubicle (stdout) actually leads straight into another man's cubicle (his stdin). These two men have never spoken but have worked alongside each other for eleven years.

Also, an application can receive data from STDIN and output it to STDOUT (conveyor in directly to conveyor out).
dimzon
QUOTE(mixminus1 @ Nov 30 2005, 06:57 PM)
oggenc -r -q4 infile.wav -o outfile.ogg
*


This is bad because the encoder will treat the WAV header as samples too. There may be some distortion at the start, and maybe an invalid channel order if sizeof(WAV_HEADER) % (BytesPerSample*ChannelCount) is not zero.
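
A small Python sketch of the channel-order effect, assuming 16-bit stereo PCM (4 bytes per frame). The header sizes 44 and 46 are hypothetical values chosen for illustration, and the "header" here is just zero bytes; no real WAV file is parsed.

```python
import struct

def interleave(left, right):
    """Pack 16-bit stereo samples as L R L R ... little-endian PCM."""
    data = bytearray()
    for l, r in zip(left, right):
        data += struct.pack("<hh", l, r)
    return bytes(data)

def read_raw_stereo(blob):
    """Interpret every byte of blob as 16-bit stereo PCM, header included."""
    usable = len(blob) - len(blob) % 4          # whole 4-byte frames only
    values = struct.unpack("<%dh" % (usable // 2), blob[:usable])
    return list(values[0::2]), list(values[1::2])   # (left, right)

pcm = interleave([1000] * 8, [10] * 8)   # left channel is the loud one

for header_len in (44, 46):              # hypothetical header sizes
    left, right = read_raw_stereo(bytes(header_len) + pcm)
    # header length, remainder mod frame size, last left and right sample
    print(header_len, header_len % 4, left[-1], right[-1])
```

With a 44-byte header the remainder is 0 and the loud samples stay on the left; with 46 bytes the remainder is 2 and they land on the right, which matches the swapped channels reported above. Either way the header bytes themselves are decoded as a few junk samples at the start.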
Tinobee
*LOL* quite a good comparison. this is how i like it!

QUOTE
You could use foobar 0.9 with the Musepack Converter profile which comes as standard.
i am currently trying to do all encodings with either foobar or eac (if the foobar settings are not understandable to me).

my last try to encode this big wave file to ogg and mpc failed at nearly 60%. ogg broke because of some clipping. mpc didn't tell me a specific reason, just something like "foo_clienc - failed to communicate" if i remember right.

the only encoder that finished it was aac. but i would like to see if the 570mb file can be made smaller with some other codec and settings.

dunno how to get ogg and mpc to finish the file by ignoring some clipping.....

do you?

QUOTE
and maybe invalid channel order
this is exactly what happened to me. look above. what better solution do you suggest?

Tinobee
Synthetic Soul
No. I just tried with foobar and LAME/MPPENC and I too got an error message after a period of time. I don't know why foobar gives up part way through.

Try the MPPENC command line I quoted above and see if that works. OGGENC definitely works on the command line because I've tested it. I don't know whether MPPENC or LAME will - i.e.: whether it is foobar falling down or whether these apps just can't deal with that amount of data. I suspect it's the apps - in which case you may be restricted to OGG or AAC... dunno.
Otto42
Might be more long winded than you were looking for, but...

To understand stdin and stdout, you need to understand the concept of streams and what a command line program is actually doing.

A stream is one of the more basic structures in computing. It's basically just a sort of structure for a program to stuff some data. The program takes some data, shoves it into a stream, and forgets about it. Something else will deal with that data. Every stream has two ends to it, one for input, and one for output.

Streams are often used to communicate with hardware. A device driver for, say, a soundcard, creates a stream to receive audio data. A program shoves data into that stream, the driver gets this data and makes the soundcard play the sound.

Sometimes streams can be used as a way for programs to communicate. This is called a "pipe". One program shoves data into the pipe, another receives the data from other end of the pipe.

Now, stdin, stdout, and stderr are three special types of stream. Every program gets them automatically from the operating system (more or less).
-stdin is input from the operating system. Usually the other end of this stream is hooked to input from the keyboard.
-stdout is where a program can put its "output". The output from a program is then read by the operating system, which deals with it in whatever way the operating system sees fit. In the case of a command line program, that data gets printed to the screen.
-stderr is another output stream, reserved for errors. In the case of a command line program, stderr is usually also printed on the screen. This varies a lot.

The point is that these are standard streams most often used by command line programs. They get input from somewhere (usually a keyboard) and they output something somewhere (usually the screen). They don't care where the input comes from or where the output goes, because the operating system handles that for them. So a program doesn't need to read the keyboard directly every time he wants a character input, the program can just read stdin and the operating system handles interaction with the keyboard.

The nice thing about these streams is that they can be redirected. The operating system controls one end of them, after all. So if I want to, say, send a file to a program just as if I had typed it, I can redirect that file to the program, which reads it from stdin. The program doesn't know any difference. Or maybe I want to redirect the output of a program to a file. No problem, the operating system is handling where the stuff stdout is going, it can put it in a file just as easily as it can put it to the screen. Maybe I want to see the output on the screen, but log errors to a file? No problem, redirect stderr to a file and leave stdout printing to the screen as normal.

Or maybe I want to be really clever and take the output of one program and use it as the input for another program... Again, that's called a pipe, and the operating system can handle that too. The programs don't need to know they've been redirected.

How to do it:
1. Send a file to a program as stdin:
program.exe < input.txt

2. Send the output of a program to a file:
program.exe > output.txt

3. Combine the above:
program.exe < input.txt > output.txt

4. Only log errors to a file:
program.exe 2> errors.txt

5. Append output from a program to a file (instead of overwriting the file):
program.exe >> output.txt

6. Pipe the output of program1 into the input of program2:
program1.exe | program2.exe

And so on. Simple as that, really.

The reason this bypasses that 2 gig limitation above is that when you tell oggenc or lame or whatever to accept input from stdin (instead of reading a file), then you're not using those programs' file routines anymore. They're just reading data from stdin, and the operating system is reading the file and sending it to them on that stream. So their file size limit bug never gets triggered.

There's a lot of use for this sort of thing, if the programs are designed correctly. Most unix programs use stdin and stdout and piping stuff from one to another is normal practice. Makes the command line very powerful, to have programs that do one simple thing only and then be able to combine them in various ways.
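
To make the stream idea concrete, here is a minimal Python sketch of what a program on the receiving end of stdin does: it reads chunk after chunk until the stream ends and never needs to know the total size. `copy_stream` is a name invented for this illustration; a real filter would pass `sys.stdin.buffer` and `sys.stdout.buffer` instead of the in-memory streams used here.

```python
import io

def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks; return the byte count."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:           # empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total

# In-memory stand-ins for stdin and stdout.
source = io.BytesIO(b"\x00" * 200_000)
sink = io.BytesIO()
print(copy_stream(source, sink))   # 200000
```

Because only one chunk is in memory at a time and no file offset is ever computed, the length of the input does not matter to a loop like this.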
Tinobee
ok, thanks so far for your great explanations!

i tried the commandlines mentioned above and got following results:

QUOTE
oggenc -q5 - Omen.ogg - < Omen.wav  --->  Error: Multiple Files specified when using stdin


QUOTE
oggenc -q5 -o Omen.ogg - < Omen.wav  --->  Skipping Chunk Of Type "FACT" Length 4, Opening with wav module: WAV file reader , encoding to standard input to "Omen.ogg" at quality 5,00  --->  done after 1 sec  --->  file length 0, elapsed time 0, rate 0, average bitrate -730,8 kbp/s


QUOTE
mppenc Omen.mpc < Omen.wav  --->  shows lots of status values but doesnt run --> only a 3KB file appears
don't know how to interpret this.

do you?

tinobee
dimzon
try
CODE
oggenc -Q -q 5 - -o "omen.ogg" < omen.wav

or
CODE
oggenc -Q -q5.00 - -o "omen.ogg" < omen.wav
Synthetic Soul
Correct command lines are:

OGGENC -q5 -o Omen.ogg - < Omen.wav

MPPENC - Omen.mpc < Omen.wav

I don't know what the deal is with the second OGGENC command you used, which is syntactically correct. I don't understand enough about the WAVE file RIFF chunks but it seems like you have a non-standard WAVE... somehow.

As I say, yesterday I successfully passed a 3.36GB file to OGGENC with no problem. This morning I have tested MPPENC with no problem, and LAME is running right now. I don't know what the problem is with foobar and these applications.

How did you create this WAVE - i.e.: in what application?

I can't help thinking that it may be worth trying to "tidy up" the WAVE.

One method, off the top of my head, would be to encode it to FLAC. FLAC removes all chunks except "fmt" and "data" - so this is one method (I'm sure it's not the best) of cleaning up your WAVE header. I believe that you can pass a FLAC to OGGENC directly.

There have to be easier ways, but until someone with more knowledge comes up with a suggestion that is how I would try it.

It's possible that you could just pull the file into Audacity and resave it and that would do the job. Without knowing it's difficult to say though - i.e.: if it still doesn't work how do you know if Audacity actually made any difference? Maybe you could search for tools to clean a WAVE header, or convert WAVE to RAW.

While I was writing this, LAME finished successfully.

Edit: Thinking about it I think Audacity will recreate the WAVE header - so that is an option. Also, as I said right at the start (I think!) Audacity will export directly to Ogg as well.

Edit 2: I tested Audacity to Ogg with my large wave and it worked fine.
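
To see which chunks a WAVE actually contains before and after "tidying", a short Python sketch along these lines could list them. It assumes a seekable stream and well-formed chunk sizes, which may not hold for a file whose header was written incorrectly; `list_riff_chunks` and the tiny in-memory file are made up for the illustration.

```python
import io
import struct

def list_riff_chunks(stream):
    """Return [(chunk_id, size), ...] for the top-level chunks of a WAVE."""
    riff, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    chunks = []
    while True:
        header = stream.read(8)
        if len(header) < 8:     # ran out of data: no more chunks
            break
        chunk_id, size = struct.unpack("<4sI", header)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        stream.seek(size + (size & 1), io.SEEK_CUR)   # skip payload and pad byte
    return chunks

def chunk(cid, payload):
    """Build one chunk: 4-byte id, 32-bit little-endian size, payload."""
    return cid + struct.pack("<I", len(payload)) + payload

# Tiny synthetic WAVE with an extra "fact" chunk (payloads are zero bytes).
body = b"WAVE" + chunk(b"fmt ", bytes(16)) + chunk(b"fact", bytes(4)) + chunk(b"data", bytes(8))
wave = b"RIFF" + struct.pack("<I", len(body)) + body
print(list_riff_chunks(io.BytesIO(wave)))
```

For a file on disk, open it with `open(path, "rb")` and pass the file object. Note that the size fields are 32 bits wide, so what they report for a file of several gigabytes is itself worth checking.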
kjoonlee
As for foobar2000 giving up on long streams, you might want to try making foobar2000 pass 16bit samples instead of 24bit samples.
kjoonlee
QUOTE(CRYON @ Nov 30 2005, 10:36 PM)
QUOTE(kjoonlee @ Nov 30 2005, 02:30 PM)
Um, the limit *should* be at 4 gigs, not 2 gigs.

http://www.hydrogenaudio.org/forums/index....ndpost&p=282931
*



if you use the standard functions for reading files (C, C++) you are using signed int vars, which are limited to 2GB
*


Mkay, but notice how I said "should," meaning that the limit should be at 4 gigs in ideal cases.
Synthetic Soul
QUOTE(kjoonlee @ Dec 1 2005, 10:23 AM)
As for foobar2000 giving up on long streams, you might want to try making foobar2000 pass 16bit samples instead of 24bit samples.
That failed with MPPENC (my only test) as well ("Error writing to file (Generic Error)").

It did get further (254MB over 170MB).

I just tried again and the second failed at 254MB as well. I bit compared the two files and they are the same.

536870900 samples @ 44100Hz
File size: 267 043 216 bytes

NB: The file should be 428 MB (449,212,244 bytes). That's what I got when using MPPENC.EXE directly.

I'm testing again now with 24BPS, and then I'll try 32BPS (if MPPENC can handle it).
Synthetic Soul
As suspected MPPENC can't handle 32BPS. The 24BPS failed at 170MB again.

Therefore:

foobar with MPPENC (failed files):

CODE
BPS   Samples      Filesize
===========================================
16    536870900    254MB, 267 043 216 bytes

24    357913933    170MB, 178 563 048 bytes


Complete file created piping wave to MPPENC:

CODE
BPS   Samples      Filesize
===========================================
16    902096896    428MB, 449 212 244 bytes


Edit: OK, after a little calculation that's around 8,590,000,000 bits for each - which is 1024 * 1024 * 1024 bytes, or 1GB.
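
The numbers can be rechecked with a few lines of Python. One assumption goes beyond what is stated in the posts: that the sample counts are per channel and the stream is stereo. That assumption is at least consistent with the 3.36GB size and 340m 55s length reported earlier in the thread. `pcm_bytes` is a helper written for this sketch.

```python
RATE = 44100                      # Hz, from the post above
CHANNELS = 2                      # assumption: stereo

def pcm_bytes(frames, bits_per_sample, channels=CHANNELS):
    """Bytes of uncompressed PCM for the given number of sample frames."""
    return frames * channels * bits_per_sample // 8

full = pcm_bytes(902096896, 16)       # the complete file
fail16 = pcm_bytes(536870900, 16)     # foobar failure point, 16 BPS
fail24 = pcm_bytes(357913933, 24)     # foobar failure point, 24 BPS

print(full, round(full / 2**30, 2))   # size in bytes and in GiB
print(divmod(902096896 // RATE, 60))  # duration as (minutes, seconds)
print(fail16, 2**31 - fail16)         # bytes passed, distance from 2**31
print(fail24, 2**31 - fail24)
```

Counted for a single channel, each failure point is the 1GB figure above (about 2**33 bits). Counted for both channels under the stereo assumption, both failures fall within a few dozen bytes of 2**31 bytes of PCM, which would point at the same 2GB signed 32-bit limit discussed earlier in the thread.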
Tinobee
the idea to compress to flac and decompress back to wav worked fine. all encoders did their job now!

thanks very much for your help! :)

one little thing would be of interest to me.

can anybody please write down such a stdin command line for the faac encoder? i tried it myself using aacenc -help but can't get this puzzle complete. all i get are syntax errors...

tinobee
Synthetic Soul
In its simplest form:

CODE
FAAC -o Omen.aac - < Omen.wav


Tinobee
mhm........this command line for FAAC doesn't seem to provide any possibility to set a preset to raise the kbps. even the --long-help shows only an option for an abr setting.

for psytel aac this command line doesn't work, and neither did mine.

do you know how to create one for psytel including quality settings like -normal or -streaming?

and could you possibly post your faac command line including a quality setting? i can't really find anything in the help.....

greetings, tinobee
Synthetic Soul
I don't actually use FAAC. I don't actually use AAC. If I did I would use iTunesEncode I think.

Before you ask, iTunesEncode won't take STDIN as input.

To answer your question I downloaded FAAC from Rarewares, and ran it using FAAC --help and FAAC --long-help.

While doing that I noticed the -q switch:

CODE
 -q <quality>  Set default variable bitrate (VBR) quantizer quality in percent.

               (default: 100, averages at approx. 120 kbps VBR for a normal
               stereo input file with 16 bit and 44.1 kHz sample rate; max.
               value 500, min. 10).

I think you are talking about the -b switch:

CODE
 -b <bitrate>  Set average bitrate (ABR) to approximately <bitrate> kbps.
               (max. value 152 kbps/stereo with a 16 kHz cutoff, can be raised
               with a higher -c setting).


Why not try:

CODE
FAAC -q 200 -o Omen.aac - < Omen.wav


Edit: using -q 200 resulted in a ~215kbps file, instead of a ~134kbps file created using the default of 100, for my test file.
Tinobee
cool! this works!

thanks a lot, guys! :-)

tinobee

(... will i ever get this commandline thing in my head?...)
Otto42
QUOTE(Synthetic Soul @ Dec 3 2005, 03:36 AM)
Before you ask, iTunesEncode won't take STDIN as input.
*


True. Limitation of iTunes. And while iTunes does not have a 2 gig limit, as such, there is definitely some kind of limitation there. I've had large files work and I've had them fail and I've not been able to tell why in many cases. I believe the limitation may be inherent to the WAV format itself.

In one case, I was encoding an audiobook. Several hours long. The encode worked, but many of the hours at the end were chopped off. Never did figure out why. Instead, I used LAME to do a fast encode to a 320 kbps MP3, which I then used as input to iTunes. This worked, and the quality difference from transcoding was really unnoticeable since I was encoding to a quite low bitrate AAC to begin with (64kbps, I think).