A call for better parsability of command-line codec output

Quote
Unix tradition strongly encourages writing programs that read and write simple, textual, stream-oriented, device-independent formats. Under classic Unix, as many programs as possible are written as simple filters, which take a simple text stream on input and process it into another simple text stream on output.

Despite popular mythology, this practice is favored not because Unix programmers hate graphical user interfaces. It's because if you don't write programs that accept and emit simple text streams, it's much more difficult to hook the programs together.


The Art of Unix Programming, Eric S. Raymond

caudec is a command-line program (or BASH script, if you insist on pedantry) that exists only because of the many programs that it uses: well engineered codecs and sound processors developed by talented people with lots of knowledge that I don't have, that produce highly convenient and sonically near-perfect audio formats, for efficient playback and recording. As every other user, I'm very grateful for those.

One aspect that seems to have been largely overlooked, however (with the exception of FLAC and SoX), is the possibility that such programs might be re-used by other programs, such as caudec. Maybe the UNIX philosophy cited above isn't as prevalent as it once was.

I am starting this thread because I have run into some difficulties parsing the output of those programs. It is evident that, with the exception of FLAC (the codec, and metaflac) and SoX, which provide many parameters designed to output information for third-party programs to parse, most codecs were designed to produce human-readable output that is not only more or less difficult (hackish) to parse, but also subject to change from one version to another. I'm talking about straightforward information like a version number ("some_program --version"), or attributes of encoded files, like the number of channels, sampling rate, bit depth, duration, etc.

As it is now, I'm trying my best to correctly parse needed information, but I'm worried that my best efforts will be rendered void sometime in the future, as a new version of any given codec will be released. I really wish that every codec would use the same syntax for the version string for instance, and that they would provide specific parameters (like "metaflac --show-sample-rate") devoid of excess (programmatically irrelevant) information. FLAC and SoX, in this regard, do it perfectly, and I wish that every codec developer would follow the fine example that they set.

I hope this doesn't come off as a bitchy rant from a lazy script-kiddie. I just think that many developers simply didn't anticipate that people like me would make heavy use of their programs, to produce a third-party tool with the purpose of improving (performance-wise and usability-wise) end-user experience. Moreover, it seems to me that standardizing their output and adding simple parameters for printing file attributes, wouldn't require that much work.

That said, I hate to admit that caudec itself produces human-readable output that would be rather difficult and inconvenient to parse. I have that issue in the back of my mind, even though I have no earthly idea how another program would make use of caudec. Suggestions about that are more than welcome.

Reply #1
This issue exists only due to the fact that the Unix/bash approach is text-centric instead of being data/object centric, like e.g. Windows PowerShell (which would mandate that all the programs you used return proper PSH objects... )

I doubt you can unite all developers under one banner to standardize their text-outputs, but I wish you luck.  Maybe you also reached the limit of what a simple bash-script can do, and you need to dive into the respective APIs.

Reply #2
What you say is true, Kohlrabi, but unfortunately only half the truth: even a data-oriented shell needs standardized data for its programs to interact properly. I agree it is the better of the two ways though. Text is an important feature for the user, but sooo annoying to handle correctly. Different encodings and whatnot make it a PITA to set up chains.

Reply #3
If I were really concerned about it while doing bash scripting, I'd use an XML initialization file with nodes reflecting the tool version and the information necessary for processing, as I guess every system has an XML processor. That would allow easy and reliable parameter initialization for a bash script.
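
A minimal sketch of what such an initialization file could look like and how it might be read. It is written in Python rather than bash for brevity, and the file layout and every element and attribute name in it (`tools`, `tool`, `version_switch`, `sample_rate_switch`) are hypothetical, not something any of the tools discussed here defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical initialization file: one <tool> node per external program,
# recording the version the script was written for and the switches it needs.
CONFIG = """
<tools>
  <tool name="flac" version="1.3.0">
    <version_switch>--version</version_switch>
  </tool>
  <tool name="metaflac" version="1.3.0">
    <sample_rate_switch>--show-sample-rate</sample_rate_switch>
  </tool>
</tools>
"""

def load_tools(xml_text):
    """Return {tool name: {"version": ..., child tag: child text, ...}}."""
    tools = {}
    for node in ET.fromstring(xml_text).findall("tool"):
        entry = {"version": node.get("version")}
        for child in node:
            entry[child.tag] = (child.text or "").strip()
        tools[node.get("name")] = entry
    return tools

tools = load_tools(CONFIG)
print(tools["metaflac"]["sample_rate_switch"])
```

The file only records what the script expects from each tool; it does not make the tools' own output any easier to parse.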

Reply #4
Open source projects develop features step by step. If you ask the respective codecs developers and support your request with reasonable arguments, I believe they will add parameters for outputting script-parsable information to their code.

Or as has been suggested, advance your project from a mere script to a program (e.g. python?) calling their APIs. The command line codecs are often just wrappers of library APIs. Often the APIs already have python hooks available.

Reply #5
That's not a bad suggestion, however I feel that there is a big difference between the effort needed to learn and rewrite an entire program in another language (caudec, Python) and the effort needed to implement very simple parameters that will work with any program, written in any language. I'll also note that there are other BASH scripts than caudec (which are more popular, to my dismay).

Edit: are there even Python bindings for every codec out there? I'm talking stuff like TAK and LossyWAV. Also, my (rather large) efforts will only improve one program (my own, caudec), whereas the (rather low) efforts of others would improve every other program out there, present and future. Perhaps the notion that caudec is written in BASH gives the wrong impression about its complexity, and the thousands of hours that were put into it to make it easy to use, convenient and performant at the same time.

Reply #6
Quote
That's not a bad suggestion, however I feel that there is a big difference between the effort needed to learn and rewrite an entire program in another language (caudec, Python) and the effort needed to implement very simple parameters that will work with any program, written in any language. I'll also note that there are other BASH scripts than caudec (which are more popular, to my dismay).


OK, why not try asking the developers to add the simple-format outputs? I am afraid that coming up with some general format of text output first and making others comply will not work.

 
Quote
Edit: are there even Python bindings for every codec out there? I'm talking stuff like TAK and LossyWAV.


I do not know. If not, they can always be added, to the benefit of all.

Quote
Also, my (rather large) efforts will only improve one program (my own, caudec), whereas the (rather low) efforts of others would improve every other program out there, present and future. Perhaps the notion that caudec is written in BASH gives the wrong impression about its complexity, and the thousands of hours that were put into it to make it easy to use, convenient and performant at the same time.


Kudos to your effort. I have made a few rather complicated bash scripts too and I know what a pain it is to develop reliably in bash. Yet I understand that if I ever have to extend the scripts significantly, I will most likely rewrite them in something less clunky. Things like hash arrays, string processing, handling subprocesses (subthreads) correctly, all that is really ugly to work with in bash scripts. The only question is when such a decision happens: small hacks to an existing functioning script are always easier than a complete rewrite.

Reply #7
Quote
I have made a few rather complicated bash scripts too and I know what a pain it is to develop reliably in bash.


95% of my difficulties aren't related to BASH, they are directly related to having to deal with a dozen different programs (or more, I didn't count) that all have their own behavior, shortcomings and quirks, that sometimes even change from one version to another (off the top of my head, Opus and eyeD3). Not to mention tools with the same names on GNU/Linux and OS X that have different parameters and different outputs. That is where the real pain is at. As far as BASH is concerned, I have long come up with workarounds that work perfectly well, consistently.

Reply #8
Quote
As it is now, I'm trying my best to correctly parse needed information, but I'm worried that my best efforts will be rendered void sometime in the future, as a new version of any given codec will be released. I really wish that every codec would use the same syntax for the version string for instance, and that they would provide specific parameters (like "metaflac --show-sample-rate") devoid of excess (programmatically irrelevant) information. FLAC and SoX, in this regard, do it perfectly, and I wish that every codec developer would follow the fine example that they set.

Just use dedicated software to analyse files, like ffprobe, and parse its output.
That way you only code a parser once, and the format is the same for all codecs. FFprobe supports different output formats: JSON, XML, and INI-like.
FFmpeg can also decode and encode everything with a uniform syntax.
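
A rough sketch of that approach in Python, using ffprobe's JSON output. The `SAMPLE` report below is an abridged, hand-written stand-in for real ffprobe output, and which fields are present can vary by codec and ffprobe version, so treat the field access as an assumption to check against your own files.

```python
import json
import subprocess

def probe(path):
    """Run ffprobe on `path` and return its JSON report as text."""
    cmd = ["ffprobe", "-v", "error", "-print_format", "json",
           "-show_streams", path]
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout

def audio_info(report):
    """Return (sample rate, channel count) of the first audio stream."""
    for stream in json.loads(report)["streams"]:
        if stream.get("codec_type") == "audio":
            return int(stream["sample_rate"]), int(stream["channels"])
    raise ValueError("no audio stream found")

# Abridged, hand-written sample of the kind of report ffprobe produces.
SAMPLE = ('{"streams": [{"codec_type": "audio", '
          '"sample_rate": "44100", "channels": 2}]}')
print(audio_info(SAMPLE))
```

With ffprobe installed, `audio_info(probe("some_file.flac"))` would be the intended call; the file name is only a placeholder.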

Quote
That's not a bad suggestion, however I feel that there is a big difference between the effort needed to learn and rewrite an entire program in another language (caudec, Python) and the effort needed to implement very simple parameters that will work with any program, written in any language. I'll also note that there are other BASH scripts than caudec (which are more popular, to my dismay).

Edit: are there even Python bindings for every codec out there? I'm talking stuff like TAK and LossyWAV. Also, my (rather large) efforts will only improve one program (my own, caudec), whereas the (rather low) efforts of others would improve every other program out there, present and future. Perhaps the notion that caudec is written in BASH gives the wrong impression about its complexity, and the thousands of hours that were put into it to make it easy to use, convenient and performant at the same time.

Like you I have developed scripts similar to Caudec in the past to transcode my library (from anything to wavpack, and from wavpack to vorbis/mp3/flac), apply dsp to remove silence, optimize for headphones listening, auto download covers, etc. First I did it in Bash then the complexity made the code very hard to maintain, so I went to Python.
Now I have about 2k lines of Python, and I still change some stuff from time to time, but the level of satisfaction and reliability I get is even superior to what I had when I was using foobar2000 on Windows.
Building commands with about 8 chained processes with pipes can be challenging, but with the great Python standard library, using 100% of my cpu cores is literally a matter of 3 lines of code.
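
For readers wondering what those few lines might look like, here is one possible sketch with `concurrent.futures` from the standard library. It is not the poster's actual code: `run_all` and `fake_encode` are hypothetical names, and `fake_encode` only stands in for a real encoder call.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def run_all(job, items, workers=None):
    """Run job(item) for every item, several at a time, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        return list(pool.map(job, items))

def fake_encode(path):
    # Stand-in for a real job such as subprocess.run(["flac", path], check=True).
    # Threads are enough there, because the heavy work runs in child processes.
    return path.rsplit(".", 1)[0] + ".flac"

print(run_all(fake_encode, ["a.wav", "b.wav", "c.wav"]))
```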

Reply #9
@skamp: If you have proposals regarding the LAME frontend, just make a wishlist.

Reply #10
Robert: actually I don't have much of a problem with LAME specifically, mostly because it's an output codec for me (as opposed to lossless codecs, which I take as input). Its extensive tagging facilities are actually very convenient.

The only (moderately) offending part, as far as I'm concerned, is the output of "lame --version". It outputs a whole lot of text that has nothing to do with the version number, and which would be better suited for an "--about" parameter (though I can easily just use the first line). Mostly, the first line somewhat deviates from the usual (perhaps unspoken) standard, which is "PROGRAM_NAME VERSION_NUMBER" and nothing else. LAME additionally outputs the architecture (32 or 64 bits) as well as LAME's website URL. I have no way of knowing if that will change in an upcoming version, since it's obviously not designed to be parsed. Moreover, and surely this doesn't concern you directly, halb27's version outputs yet another completely different string.

So my wishlist is just one item:
  • Make "lame --version" output something like: "LAME 3.99.5"


Thanks for asking (seriously).

Edit: maybe I should mention that I use this to set the TSSE frame (Software/Hardware and settings used for encoding) to something slightly more useful than what LAME stores in there by default: LAME's version number and the encoding parameters (VBR preset or CBR/ABR nominal bitrates).

Edit 2: for future reference, if you're wondering, just look at the output of flac, metaflac and soxi (with the available command line parameters) to see what I mean.

Reply #11
Quote
This issue exists only due to the fact that the Unix/bash approach is text-centric instead of being data/object centric, like e.g. Windows PowerShell (which would mandate that all the programs you used return proper PSH objects... )

There is nothing wrong with plain text output, when it's designed to be parsed. Especially for really simple data like the version number and file attributes like sampling rate, etc…

Quote
I doubt you can unite all developers under one banner to standardize their text-outputs, but I wish you luck.

Yes, that is painfully obvious to me, but what can I say, I'd rather ask and bitch about it, than just bitch about it, if you get my meaning.

Quote
Maybe you also reached the limit of what a simple bash-script can do, and you need to dive into the respective APIs.

Not really, no. I have reached the limits of command line programs that weren't designed to be parsed by other programs. That's what it is, really. The contrast between the simplicity of required changes in other programs (output simple, consistent strings), and the complexity of learning another programming language AND creating bindings to discrete libraries when they don't exist already (and some of which are Windows only), is… significant, to say the least. Edit: and it's not like I'm asking for anything new. UNIX philosophy is older than I am.

If anyone wants to call me lazy after the thousands of hours that I've spent developing caudec for no money, very little feedback and mostly no recognition, they better show that they are not, themselves.

Reply #12
Quote
I doubt you can unite all developers under one banner to standardize their text-outputs


A decent start would be to unite each developer under his own banner, rather than waving a new one for every release.

Reply #13
Quote
If anyone wants to call me lazy after the thousands of hours that I've spent developing caudec for no money, very little feedback and mostly no recognition, they better show that they are not, themselves.

I doubt anyone would do that, and certainly I don’t see anywhere that anyone was. All development that furthers usability, interoperability, and so on is appreciated! Your requests here are good, and hopefully they’ll meet with some success.

Reply #14
I guess it wouldn't hurt if I was specific about what's needed.

Version number (program --version): what's pretty much standard, is to output a single line to STDOUT, in the form: "name version". E.g: "FLAC 1.3.0". Please do not output to STDERR, as requesting the version string is a deliberate action and not an error. Also, please refrain from adding any other information, that would be better suited for a "--help" or "--about" or "--license" switch.
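
To illustrate why the stream matters, here is a small Python sketch. A caller that only captures STDOUT gets nothing from a program that prints its version to STDERR, so it has to check both. The two "programs" below are stand-in Python one-liners so that the example runs without any codec installed; `version_line` is a hypothetical helper name.

```python
import subprocess
import sys

def version_line(cmd):
    """Return (first line, stream name) of a program's version output."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    for name, text in (("stdout", result.stdout), ("stderr", result.stderr)):
        lines = text.splitlines()
        if lines:
            return lines[0], name
    raise ValueError("no version output")

# Stand-in commands: one prints to STDOUT, the other to STDERR.
well_behaved = [sys.executable, "-c", "print('FLAC 1.3.0')"]
on_stderr = [sys.executable, "-c",
             "import sys; print('FLAC 1.3.0', file=sys.stderr)"]
print(version_line(well_behaved))
print(version_line(on_stderr))
```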

Information about the audio characteristics of a compressed file: having a "--info" switch that outputs a human readable summary is good, but for people, not programs. Ideally, every codec program would have discrete switches for each audio characteristic, that would output nothing more than a number, to STDOUT. For instance: "metaflac --show-sample-rate", which outputs e.g. "44100".
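
As a sketch of how little code a caller needs when such switches exist, here is the metaflac case in Python. The switches are metaflac's own; the function names and the dictionary keys are made up for this example.

```python
import subprocess

# metaflac switches that each print one bare value.
SWITCHES = {
    "sample_rate": "--show-sample-rate",
    "bit_depth": "--show-bps",
    "channels": "--show-channels",
    "total_samples": "--show-total-samples",
}

def parse_value(output):
    """Turn the output of a single-value switch into an int."""
    return int(output.strip())

def flac_info(path):
    """Query each attribute of a FLAC file with its own metaflac call."""
    info = {}
    for key, switch in SWITCHES.items():
        result = subprocess.run(["metaflac", switch, path],
                                capture_output=True, text=True, check=True)
        info[key] = parse_value(result.stdout)
    return info

print(parse_value("44100\n"))
```

No regular expressions and no knowledge of the surrounding text are needed, which is the whole point of the request.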

List of useful audio characteristics:
  • sampling rate
  • bit depth
  • number of channels
  • total number of samples


With those, I can reliably determine the compression ratio, the size of the decoded WAV file, its precise duration (which I will be able to compute, use and display in any form I need), and whether or not I need to resample or downmix / upmix to stereo, when requested.
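
The arithmetic behind that can be sketched as follows. This assumes that "total number of samples" counts samples per channel, that the 44-byte canonical WAV header applies, and that the ratio is expressed as compressed size over raw PCM size; the compressed file size in the example is a made-up figure.

```python
def derived(sample_rate, bit_depth, channels, total_samples, compressed_bytes):
    """Compute duration, decoded WAV size and compression ratio."""
    bytes_per_sample = (bit_depth + 7) // 8          # 16 -> 2, 24 -> 3
    pcm_bytes = total_samples * channels * bytes_per_sample
    return {
        "duration_s": total_samples / sample_rate,
        # 44 bytes is the canonical minimal WAV header (an assumption:
        # extra chunks or extensible headers make real files larger).
        "wav_bytes": pcm_bytes + 44,
        "ratio": compressed_bytes / pcm_bytes,
    }

# The compressed size (12,000,000 bytes) is a made-up figure.
print(derived(44100, 16, 2, 4658794, 12_000_000))
```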

Alternatively, instead of discrete switches, a program could provide a single switch that would output those characteristics in a simple, uncluttered, parsable way, like all four values on a single line, separated with spaces, or one value per line. All that would matter is that the order (and thus the significance) of those values be clearly explained in the program's help, and consistent across versions. E.g. "44100 16 2 4658794".
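
A sketch of a parser for that single-line variant, assuming the order is sampling rate, bit depth, channels, total samples (as in the list earlier in this post) and that values are separated by exactly one space. `AudioInfo` and `parse_info_line` are hypothetical names.

```python
from typing import NamedTuple

class AudioInfo(NamedTuple):
    sample_rate: int
    bit_depth: int
    channels: int
    total_samples: int

def parse_info_line(line):
    """Parse 'rate depth channels samples' separated by single spaces."""
    fields = line.strip().split(" ")
    if len(fields) != 4:
        raise ValueError(f"expected 4 values, got {len(fields)}")
    return AudioInfo(*(int(field) for field in fields))

print(parse_info_line("44100 16 2 4658794"))
```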

Another value of interest is the internal MD5 hash, if available. Metaflac has a simple and convenient switch for that: --show-md5sum.

Edit: and those aren't caudec problems. That's just the way UNIX programs are usually designed in order to provide output that's usable by other programs in unforeseen use cases. I'll be looking into implementing a machine-readable output for caudec itself, just in case someone, some day, might have a use for it.

Reply #15
Quote
Version number (program --version): what's pretty much standard, is to output a single line to STDOUT, in the form: "name version".

I don't think it's standard on *nix at all that the version flag outputs a single line. e.g., "gcc --version" outputs five lines, "vim --version" 27 lines on my ubuntu precise box. So it doesn't surprise me that, say, "lame --version" outputs 20 lines.

Reply #16
Perhaps not so standard. But those outputs are horrible, and none of it has anything to do with the user requesting a very specific piece of information. I have no idea why they do that.

Reply #17
From the GNU Coding Standards:
Quote
The standard --version option should direct the program to print information about its name, version, origin and legal status, all on standard output, and then exit successfully. Other options and arguments should be ignored once this is seen, and the program should not perform its normal function.

The first line is meant to be easy for a program to parse; the version number proper starts after the last space. In addition, it contains the canonical name for this program, in this format:
GNU Emacs 19.30

So, it's not a single line requirement, but the first has to follow some rules to be easily parsable.
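
The rule quoted above (first line only, version after the last space) is easy to apply. A minimal Python sketch, with `parse_gnu_version` as a hypothetical helper name and the second line of the first sample invented as filler:

```python
def parse_gnu_version(output):
    """Split the first line of --version output into (name, version)."""
    first_line = output.splitlines()[0]
    name, _, version = first_line.rpartition(" ")
    return name, version

print(parse_gnu_version("GNU Emacs 19.30\n(further lines are ignored)\n"))
print(parse_gnu_version("FLAC 1.3.0"))
```

Because the split happens at the last space, program names that themselves contain spaces, such as "GNU Emacs", come through intact.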

Reply #18
Quote
So, it's not a single line requirement, but the first has to follow some rules to be easily parsable.


Good enough!

Reply #19
I just started working on it. So there will be a TAK 2.3.0 Beta 2 release.

Quote
Alternatively, instead of discrete switches, a program could provide a single switch that would output those characteristics in a simple, uncluttered, parsable way, like all four values on a single line, separated with spaces, or one value per line. All that would matter is that the order (and thus the significance) of those values be clearly explained in the program's help, and consistent across versions. E.g. "44100 16 2 4658794".

That's the way I want to do it. Some questions:

1. Are fixed width data columns with left (text)/right (numbers) alignment ok? They will always be separated by at least one space.
2. Is there a practical limit for the length of the line?

Reply #20
Quote
I just started working on it. So there will be a TAK 2.3.0 Beta 2 release.


Nice, thanks. It's a point of pride for me to support TAK. I already make heavy use of Takc.exe (and not just for straight up TAK encoding, but also for lossyTAK).

Quote
1. Are fixed width data columns with left (text)/right (numbers) alignment ok? They will always be separated by at least one space.
2. Is there a practical limit for the length of the line?


1. Please do not format the output with variable numbers of spaces or tabs. Whatever you will name the new switch, it will be meant to be parsed by a program, not to be read by a human, with pretty column alignments. Just separate each value with a single space; the simpler the syntax, the better. Is there any value in particular that I didn't think of, that would contain a space, and thus require a different syntax?

Edit: also, don't pad numbers with leading zeros or anything. Those numbers are meant to be used in calculations, not to be printed as is. If I (or some other program) wanted to display those values, I could easily feed them to printf() myself. The idea is really to get raw data for processing; any cosmetic changes in the output should be left to the third-party program.

2. Not really, no.

Thanks again for your efforts!

Reply #21
Another thing: make sure to use base units when outputting numbers, but don't print the units themselves. Anything that is not the actual raw data, is something that will need to be parsed out, so it will be an annoyance more than anything else.

E.g. "44100" instead of "44100Hz", "44.1" or "44.1kHz".
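
To show what printed units cost the caller, here is a sketch of the kind of tolerant parser they force, compared with a plain `int()` on raw data. The accepted spellings are only the ones from the example above; `parse_rate_human` is a hypothetical name.

```python
import re

def parse_rate_human(text):
    """Guess a sample rate in Hz from strings like '44.1kHz' or '44100Hz'."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(k?)(?:Hz)?\s*", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized sample rate: {text!r}")
    value = float(match.group(1))
    return round(value * 1000) if match.group(2) else round(value)

for text in ("44100Hz", "44.1kHz", "44100"):
    print(text, "->", parse_rate_human(text))

# A bare "44.1" stays ambiguous: this parser reads it as 44 Hz.
```

With raw base-unit output, all of the above collapses to `int(text)`.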

And again, the "raw data" switch is not meant to replace a "--info" switch or whatever, that outputs human-readable information. Both have their purpose.

Reply #22
Quote
Nice, thanks. It's a point of pride for me to support TAK. I already make heavy use of Takc.exe (and not just for straight up TAK encoding, but also for lossyTAK).

I am glad to support your work! And I really like to have at least one new feature for this release. Speed improvements are nice, but I felt the "What's new" list could have been a bit longer...

Quote
1. Please do not format the output with variable numbers of spaces or tabs.

I will implement both modes: unformatted and formatted (fixed width columns). The latter is at least helpful for my own testing.

Quote
Is there any value in particular that I didn't think of, that would contain a space, and thus require a different syntax?

I have encountered none so far. Possibly the program name? If one writes "Monkey's Audio" instead of "Mac".

But we need a way to indicate missing values. Could this be a simple space? If single spaces are being used to separate values, 2 in a row would be unambiguous.

Or should we better use the ";" as delimiter? Then ";;" would indicate a missing value.

Reply #23
Quote
Possibly the program name? If one writes "Monkey's Audio" instead of "Mac".


You'll print the program name with a --version switch, and I guess the GNU style cited by Robert is fine (on STDOUT, first line, version number comes after the last space).

Quote
But we need a way to indicate missing values. Could this be a simple space? If single spaces are being used to separate values, 2 in a row would be unambiguous.

Or should we better use the ";" as delimiter? Then ";;" would indicate a missing value.


';' (semicolon) sounds fine, but then you should probably print one systematically after each value (present or missing), including the last one. E.g. "one;two;three;four;five;" (no missing value) and "one;;three;four;;" (values 'two' and 'five' are missing). That would be very parsable.
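
A minimal sketch of a parser for that terminated form, using the two example strings above. `parse_terminated` is a hypothetical name, and returning `None` for a missing value is just one possible convention.

```python
def parse_terminated(line, terminator=";"):
    """Parse 'a;b;c;' where every value, even the last, ends with ';'.

    Missing values come back as None.
    """
    line = line.rstrip("\n")
    if not line.endswith(terminator):
        raise ValueError("line is not terminated by " + repr(terminator))
    return [field or None for field in line[:-1].split(terminator)]

print(parse_terminated("one;two;three;four;five;"))
print(parse_terminated("one;;three;four;;"))
```

The trailing terminator is what makes a missing last value distinguishable from a truncated line.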

Also, I don't expect that every codec will use the same parameter name, or even the same output format. The only thing that matters is that they provide parsable data, in a straightforward format (whatever it may be) that is documented, and consistent across versions.

Reply #24
Currently my output looks like this:

Formatted:

Code: [Select]
0  44100 16  2    48038601 cad70b28439a92814bc4184c37fa11bc 2.3.0     2 0   4096  58.75 1
0  44100 16  2    48038601                                  2.2.0     2 0e  4096  58.30 1
0  44100 16  2    48038601                                  2.2.0     2 0m  4096  58.22 1
0  44100 16  2    48038601                                  2.2.0     2 0   4096  58.75 1

Data columns from left to right, universal data first:

Sample type: 0 = PCM / 1 would be Float
Sample rate
Sample bits
Channel count
Size in samples per channel
MD5
Encoder version

Then codec specific data:

Codec
Preset
Frame size in samples
Compression ratio
Wave file meta data present: 1/0 = yes/no
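
A short Python sketch of why the formatted variant is awkward for scripts: splitting on runs of whitespace silently shifts the columns when the MD5 is missing. The two rows are copied from the formatted listing above; `naive_fields` is a hypothetical name.

```python
# Two rows from the formatted listing above; the second has no MD5.
ROWS = [
    "0  44100 16  2    48038601 cad70b28439a92814bc4184c37fa11bc 2.3.0     2 0   4096  58.75 1",
    "0  44100 16  2    48038601                                  2.2.0     2 0e  4096  58.30 1",
]

def naive_fields(row):
    """Split on runs of whitespace, as a quick shell-style parser would."""
    return row.split()

for row in ROWS:
    fields = naive_fields(row)
    print(len(fields), "fields; sixth field:", fields[5])
```

In the second row the sixth field is the encoder version rather than the MD5, so a caller would have to rely on fixed column offsets instead.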

And now unformatted:

Code: [Select]
0;44100;16;2;48038601;cad70b28439a92814bc4184c37fa11bc;2.3.0;2;0;4096;58.75;1
0;44100;16;2;48038601;;2.2.0;2;0e;4096;58.30;1
0;44100;16;2;48038601;;2.2.0;2;0m;4096;58.22;1
0;44100;16;2;48038601;;2.2.0;2;0;4096;58.75;1
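
A sketch of how a script might consume the unformatted rows, in Python. The field names are invented here from the column descriptions above and are not part of TAK; the codec and preset columns are kept as opaque strings, since their value sets are not described in this thread.

```python
FIELDS = [
    "sample_type", "sample_rate", "sample_bits", "channels",
    "samples_per_channel", "md5", "encoder_version",
    "codec", "preset", "frame_size", "compression_ratio", "wave_metadata",
]
INT_FIELDS = {"sample_type", "sample_rate", "sample_bits", "channels",
              "samples_per_channel", "frame_size", "wave_metadata"}

def parse_row(row):
    """Map one semicolon-separated row to a dict; empty values become None."""
    values = row.strip().split(";")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    record = {}
    for name, value in zip(FIELDS, values):
        if value == "":
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(value)
        elif name == "compression_ratio":
            record[name] = float(value)
        else:
            record[name] = value
    return record

row = parse_row("0;44100;16;2;48038601;;2.2.0;2;0e;4096;58.30;1")
print(row["md5"], row["preset"], row["samples_per_channel"] / row["sample_rate"])
```

The last printed value is the duration in seconds, computed from the raw sample count and sample rate.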