Full Version: Of presets, front ends, and making life easier for newbies/g
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - General
mpconnelly
I think that all the discussion about the proper presets (see separate thread) misses the needs of the newbie and/or general user. I started this as a separate topic because it diverges significantly from the previous thread:

First, how do we expect someone to first obtain LAME and then how do we expect them to obtain an updated copy?

Right now, it is a very difficult task: lame.org is a redirect to another website which has a lot of information but certainly isn't user friendly. Of course, to get the actual executable you have to go to yet another: Dmitry/Smpman. And, let's assume that most people want a GUI oriented batch encoder like WinLame (RazorLame)--found on yet another site. Meanwhile, I am not aware of any major audio program that has adopted the Lame encoder/decoder even though I believe it would save them money by reducing licensing fees (e.g. MS Windows Player, Real, Winamp, MusicMatch, Creative, Rio, etc.). Bottom line: unless you are an enthusiast who is determined to find and use Lame, you likely won't.

Compare, for example, http://www.monkeysaudio.com. It's very easy to see what's the latest version: stable, beta, etc, comparisons to competitors, etc. In other words, it's very easy for a novice to adopt monkeysaudio--not least because what's promoted is the GUI as opposed to command line executable. Granted, few people on the street have ever heard of monkeysaudio and fewer still can afford the required storage space--but that isn't the fault of the website. Also note that I am not pushing solely a Windows GUI--there could be a GUI for each major OS (or even better--multiple platforms for the same GUI).

Second, so now someone has gone to all the trouble to find the proper encoder, the issue that has been discussed extensively on this board is what setting to use. In a perfect world, I think that eliminating the presets in favor of a simplified command line structure would be great. But, let's face it, most users want to know what the right setting is for their needs and be done with it... And, here I think the preset discussion misses the point: what's needed far more than a revamp of the presets on the Lame command line executable is a revamp of the RazorLame interface. I am thinking about the way that you can use a wizard on, say, WinZip and suggesting that we apply something similar to WinLame/RazorLame. The wizard would run the first time that WinLame is used and remain accessible afterwards from a consolidated options setting. It permits the user to define what is most important to him (size v. quality v. compatibility or, most likely, some hybrid). Based on that input, the wizard selects the best setting (behind the curtain, the wizard chooses from one of the presets such as those summarized by mp3fan). Similarly, for those using the LAME dll as part of another program, there should be a link on that program's preferences dialog box to the LAME website which has an online version of this wizard that walks the user through the generation of either a preset "code" which the user enters in the dialog box or, even better, a downloadable, XML-based config file that is imported into the program utilizing the LAME dll (compare financial programs' import of OFX statements). Bottom line, I believe that this will address the needs of 99.xxxx% of the potential users.
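
A minimal sketch of how such a wizard might turn the user's priorities into a preset choice. The preset labels ("portable", "casual", "audiophile", plus an invented "compatible") and the scoring thresholds are assumptions made up for illustration; they are not actual LAME switches or anything WinLame/RazorLame implements.

```python
from dataclasses import dataclass

# Hypothetical preset labels; placeholders only, not verified LAME switches.
PRESETS = {
    "portable": "small files, lower quality",
    "casual": "balanced size and quality",
    "audiophile": "best quality, largest files",
    "compatible": "favors playback on older players",
}


@dataclass
class Priorities:
    """Wizard answers, each on a 0 (don't care) to 10 (critical) scale."""
    size: int
    quality: int
    compatibility: int


def choose_preset(p: Priorities) -> str:
    """Pick a preset label from the user's stated priorities."""
    # Compatibility wins only when it is both dominant and rated high.
    if p.compatibility >= max(p.size, p.quality) and p.compatibility >= 7:
        return "compatible"
    # A clear lean toward quality or size picks the matching extreme.
    if p.quality - p.size >= 3:
        return "audiophile"
    if p.size - p.quality >= 3:
        return "portable"
    # Otherwise fall back to the middle-of-the-road setting.
    return "casual"
```

The user only ever sees the three questions; the preset name stays behind the curtain and could be swapped out without changing the wizard's interface.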

Related to that idea, on the Lame website there should be, in the user section, clips which the user can download that clearly demonstrate the audible differences between these settings. For example, most users may not want portable compression but may not appreciate the distinction between "casual" and "audiophile" settings. Listening to these clips would assist them in determining whether casual v. audiophile or any permutation thereof is worth the trade-off in storage space.

Separately, I really think that there should be an easy to understand explanation in the developer section of the site that explains exactly how these presets work. It's not sufficient to say, download the code from CVS and you will see what settings I used. Even if a developer/expert does that, he still probably won't understand the rationale for why the combination of configuration settings was used by r3mix or Dibrom. The number of people that would debate those settings is admittedly few--but it's that type of healthy debate that permits continual improvements or at least provides a "reality check" to preset developers.

As an offshoot of that, for the enthusiasts, perhaps anyone could "publish" as a download from their website, their own XML config settings. This will permit easy utilization of those settings not found in the wizard. And, if there is some agreed reporting process (i.e. these settings are downloaded frequently enough and voted up or down by users), then these settings plus a voting process associated with the clips described earlier may influence the presets that the wizard chooses from.

Finally, if someone _really_ wants to simplify life for users, they would copy the consolidated GUI for ripping/encoding/tagging used by MS/Real/MusicMatch but where these emphasize multiple encoders (or their proprietary encoder) and real-time ripping/encoding (at least on a per-CD basis), I would emphasize batch rip (as many CDs as you have disk space), then batch encode, then batch file name/ID3 cleanup--all from a simple user interface. Optionally, there could be batch RMS normalization. In this manner, a user could fill up his hard drive with as many wav files as possible and go to sleep. In the morning, he would have compressed mp3 files. And no proprietary "media library" system would be required a la MS/Real/MusicMatch.
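
The batch encode step of that overnight workflow could be sketched roughly as below. It only builds the command lines and assumes a `lame` executable on the PATH that is invoked as `lame [options] infile outfile`; the `build_batch` helper and its parameters are hypothetical names for this illustration.

```python
from pathlib import PurePath
from typing import List, Sequence


def build_batch(wav_files: Sequence[str], options: Sequence[str] = ()) -> List[List[str]]:
    """Return one encoder command per WAV file, in sorted order."""
    commands = []
    for wav in sorted(wav_files):
        # Keep the file name, swap only the extension.
        mp3 = str(PurePath(wav).with_suffix(".mp3"))
        commands.append(["lame", *options, wav, mp3])
    return commands
```

Each returned list could then be handed to `subprocess.run` one after another, so the queue works through every ripped WAV file unattended.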
Dibrom
Some of this I agree with.. especially the simplifications parts and the gui coupling, etc. I've started looking into a unique solution for much of this since I've recently become interested in possibly overhauling the LAME frontend. I'll post more details as they become available..
Volcano
Wow! That would make a perfect world... I agree with you on almost all points... only... who wants to do all this (the programming work, I mean - it's not easy to just make a program that has a fancy MMJB-like interface AND does all the expert jobs, but then still is as easy to use as, say, Audiograbber)?

I proposed a way of solving the LAME DLL problem in the "LAME presets" thread. Please tell me what you'd think of that :)

Anyway, at least the multiple-encoder issue (which honestly bugs the hell out of me, because it's a pest having to fire up so many different programs for different encoders) is going to be solved once the Hydrogen Audio Tools are completed.

And the problem that there is no real centralized website where you can download the software AND get the necessary information in plain English, is also going to be solved in some months. I'm working on it :)

BTW, I don't think that the command lines behind the HQ presets should be given away, for the exact reasons Dibrom keeps on pointing out. Just have a look on the newbie forums (like Audiograbber) - the people there come up with self-invented command lines that are total shit (after freely admitting to being new to MP3), and even after being told three times to rely on the professionals' presets, they won't listen to you. If those people found out about the sophisticated command lines behind the "good" presets... you know what would happen.

CU

Dominic
mpconnelly
Volcano, you asked who wants to do all the work... my belief is that not only is the work done frequently but the community keeps reinventing the wheel.

Look how many rippers, how many tag editors, how many encoder-specific GUIs, etc., exist. A quick search of download.com and some of the mp3 sites shows that there are many dozens of them. Many of them incorporate multiple functions to limited success. Take EAC for example: that is gradually including normalization and encoding (in real-time as opposed to batch mode). I am not going to knock the developer given that he releases his work for free--but he, like many others, is reinventing the wheel by building redundant functionality. Far better, from my perspective, for EAC to be modularized and for the author to concentrate on its two weakest points: crashes on testing the capabilities of drives and crashes on certain damaged CDs. By limiting the focus of the ripping module, you would make it easier for other developers to assist development.

Addressing the modularity: I think it would be beneficial to have one cross-platform superstructure for audio file batch conversion into which ripping, encoding, normalizing, file name editing, and ID3 tag editing modules can "plug in". The superstructure then serves as a scheduler or scripting engine to handle inputs and outputs from the plugins, passing from one to the next as work is completed. EAC could morph into a ripping module. Lame could easily be incorporated as an encoding module. EncSpot could morph into an audio file properties/analysis module. The functionality of the RenameFiles utility could be included as part of a file name editing module. ID3-TagIT or something similar could morph into the ID3 tag editing module.
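
A rough sketch of what such a plug-in superstructure might look like, assuming each module is simply a function that takes a job and returns the updated job. The `Superstructure` class and the stand-in `rip`, `encode` and `rename` modules are invented for illustration; real modules would wrap tools such as EAC or Lame.

```python
from typing import Callable, Dict, List

# A job is a plain dict that each module reads and extends.
Job = Dict[str, str]
Module = Callable[[Job], Job]


class Superstructure:
    """Runs plugged-in modules in order, passing each job from one to the next."""

    def __init__(self) -> None:
        self.modules: List[Module] = []

    def plug_in(self, module: Module) -> None:
        self.modules.append(module)

    def run(self, jobs: List[Job]) -> List[Job]:
        results = []
        for job in jobs:
            for module in self.modules:
                # Hand each module a copy so the caller's dict is left untouched.
                job = module(dict(job))
            results.append(job)
        return results


# Stand-in modules (hypothetical): they only compute file names.
def rip(job: Job) -> Job:
    job["wav"] = f"track{job['track']}.wav"
    return job


def encode(job: Job) -> Job:
    job["mp3"] = job["wav"].replace(".wav", ".mp3")
    return job


def rename(job: Job) -> Job:
    job["mp3"] = f"{job['artist']} - {job['mp3']}"
    return job
```

Because the superstructure only knows the job-in, job-out interface, a proprietary encoder could be slotted in behind a wrapper with the same signature.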

Winamp3 is a reasonably good example of this plugin superstructure for playback, as contrasted with audio content conversion.

Hopefully, everything would be GPL/BSD licensed. But if it is modularized enough, you could still use proprietary modules. For example, you could reencode your existing Dibrom-preset, Lame encoded music collection in MS Windows Media format in lower fidelity and smaller file sizes for playback on your portable. Someone would simply need to create a wrapper around the Windows file encoder that interfaces with the superstructure.

In theory, you could go a step or two further and include playback and device synchronization but these are done well by other players and not part of the creation process. So unless Winamp3 is suddenly open-sourced or unless it becomes possible to put a wrapper around all of Winamp3 and turn *it* into a plugin.... we should stick with an audio file batch conversion superstructure....

For those that are concerned about preset names, etc., I would point out that the person/group that creates the superstructure defines the interfaces and, perhaps, module configuration formats. You will then likely find that the modules (or at least their wrappers) conform to the superstructure rather than the other way around.

Dibrom and r3mix have leveraged the existing community's knowledge. Each could release their own distribution of this superstructure including their preferred modules and configuration settings in the same way that there are many Linux distributions built off the same kernel.

Volcano: I haven't thought much about the dll because I don't use it; RazorLame uses the exe. What I have intended to stress in this and my previous message is that I don't think the name of the preset is of paramount importance. The total distribution with a particular emphasis on the ease of adoption IS important (i.e. ultimately, the names of the presets don't matter because the user concentrates instead on what his target is--in terms of file size, quality, and compatibility).

Volcano: I honestly think it is a mistake to not describe what Dibrom has done and why with regard to the settings. If people are going to tweak then they will do that regardless of whether Dibrom posts the settings behind his presets. He doesn't have to support his presets, much less people who have deviated therefrom. But, people like me just want to understand better why, for example, his settings sound good--in the same way that I read an article in Scientific American or Dr. Dobb's Journal, I want to learn. That makes me and every other reader more sophisticated users and better able to offer suggestions for improvement.
Dibrom
QUOTE
Originally posted by mpconnelly
Addressing the modularity: I think it would be beneficial to have one cross-platform superstructure for audio file batch conversion into which ripping, encoding, normalizing, file name editing, and ID3 tag editing modules can "plug in". The superstructure then serves as a scheduler or scripting engine to handle inputs and outputs from the plugins, passing from one to the next as work is completed. EAC could morph into a ripping module. Lame could easily be incorporated as an encoding module. EncSpot could morph into an audio file properties/analysis module. The functionality of the RenameFiles utility could be included as part of a file name editing module. ID3-TagIT or something similar could morph into the ID3 tag editing module.


This kind of thing, while not exactly the same in implementation, is what sphoid and I are going to attempt to provide with the Hydrogen Audio Tools.

QUOTE
Dibrom and r3mix have leveraged the existing community's knowledge. Each could release their own distribution of this superstructure including their preferred modules and configuration settings in the same way that there are many Linux distributions built off the same kernel.


I'm not quite sure I'd see the point in this. It seems contradictory to what you were just suggesting. You should only need to provide 1 thing that works and that can be used in different ways.. not 1 thing but implemented 50 different ways.. that leads to fragmentation which will eventually come back to haunt you. This won't make things any easier in the end.. it will just shift the confusion from one aspect to another. Now you have to worry about which distribution is best, instead of which programs included are best or since there would be different distributions, even that would still be a problem.. in effect making it twice as bad.

QUOTE
Volcano: I honestly think it is a mistake to not describe what Dibrom has done and why with regard to the settings.


First of all, anyone wanting to have this information can get it. I'm not keeping any sorts of secrets, I'm just choosing not to try and explain all of the experimental options all over again. I've done that plenty already... and in the end many people still misunderstand what everything is supposed to be doing. I will probably address this more in the future in the form of a well written FAQ but I'd rather not just throw all the information out there without some kind of "guide".. and I don't have time to continue to answer the same questions on the forum over and over.

QUOTE
If people are going to tweak then they will do that regardless of whether Dibrom posts the settings behind his presets. He doesn't have to support his presets, much less people who have deviated therefrom. But, people like me just want to understand better why, for example, his settings sound good--in the same way that I read an article in Scientific American or Dr. Dobb's Journal, I want to learn. That makes me and every other reader more sophisticated users and better able to offer suggestions for improvement.


There is a point here.. but the problem is that much of what I tweak should be internal things. Half of the command lines in use are experimental and shouldn't even be switchable via the command line. Allowing access to all of this was a mistake, and now there's not much that can be done about it because the cat is let out of the bag so to speak.

For example.. think about what MPC or Vorbis would be like if the most important internal variables were modifiable via the command line with much of it actually being redundant and harmful even, and then on top of that being mentioned in all of the help (but not adequately explained) just begging for people to modify even if they don't understand it. It'd be a complete mess..

I've seen enough people use -Z in the wrong manner, or --ns-sfb21 in the wrong manner.. or --athtype 3 or many other countless options in the wrong manner because they just blatantly cut and paste options from some of these long lines they've seen.

There's not much of a good solution to this problem.. you can't really keep the information from those who shouldn't be messing with it without also denying access to some of the people who know what they are doing. However, with a bit of knowledge people can easily go look in CVS and see what is going on. So that kind of pushes out the people in between those levels unfortunately.. but I think those willing to learn enough will be driven to read the source code, which is actually a better thing in the end anyway because then the chance of them getting involved in the project themselves is more likely.
mpconnelly
QUOTE
Originally posted by Dibrom

I'm not quite sure I'd see the point in this.  It seems contradictory to what you were just suggesting.  You should only need to provide 1 thing that works and that can be used in different ways.. not 1 thing but implemented 50 different ways.. that leads to fragmentation which will eventually come back to haunt you.  This won't make things any easier in the end.. it will just shift the confusion from one aspect to another.  Now you have to worry about which distribution is best, instead of which programs included are best or since there would be different distributions, even that would still be a problem.. in effect making it twice as bad.


In a perfect world, would it be great if we could all agree on one superstructure and the best plug-in for each category of functionality? Sure.

But the rough and tumble world of open source development is different. At the risk of stepping on feet here, if we can't agree on a naming structure for presets or which preset is "high quality" v. something else, we certainly won't all agree on what should go into a distribution. People will always have their preferences.

That is why I believe the Linux distribution model may be initially confusing to some but ultimately makes a lot of sense--from all the competing distributions, certain winners emerge. Think Linux and you immediately think RedHat, maybe Caldera or Mandrake or Suse. And there are many others (although the number of distributions is thinning out). RedHat clearly sets the benchmark against which others are measured.

Bottom line: nothing stops anyone from creating their own distribution, leveraging all the previous development. But, ensuring widespread adoption of your distribution is a different matter. Inevitably, I foresee at most maybe 2-3 major distributions (I will shy away from names this time). To the extent that someone creates their own distribution that incorporates a valued improvement, this improvement will likely be adopted in the "mainstream" distributions.

Further, the bulk of developers' energies will be invested in the most frequently used plugins, assuming those are also open source. This competitive but integrated environment is far, far better IMO than the thousands of combinations of independent tools used today--none of which work as a cohesive whole.

QUOTE
There is a point here.. but the problem is that much of what I tweak should be internal things. Half of the command lines in use are experimental and shouldn't even be switchable via the command line. Allowing access to all of this was a mistake, and now there's not much that can be done about it because the cat is let out of the bag so to speak....


Well, I am an advocate of an XML configuration file with the ability to modify each setting by inserting/altering the value of the XML key. That way, someone can modify Lame at its innermost level without needing to recompile it or to distribute their own version including their "presets." From my perspective that simply separates the settings from the executable.
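
To illustrate the idea, here is a small sketch of reading such a file and overriding one value. The element names, attribute names and setting keys are all invented for this example; no such XML format exists for Lame as far as this thread indicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical config format: every element, attribute and key name is made up.
EXAMPLE_CONFIG = """
<encoder name="lame">
  <setting key="mode" value="vbr"/>
  <setting key="quality" value="2"/>
  <setting key="lowpass" value="19000"/>
</encoder>
"""


def load_settings(xml_text: str) -> dict:
    """Parse the config and return a key -> value mapping of all settings."""
    root = ET.fromstring(xml_text.strip())
    return {s.get("key"): s.get("value") for s in root.findall("setting")}


def override(settings: dict, key: str, value: str) -> dict:
    """Return a copy of the settings with one key inserted or altered."""
    updated = dict(settings)
    updated[key] = value
    return updated
```

The settings live entirely outside the executable, so a tweaked file can be passed around without anyone recompiling or redistributing the encoder.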

But ultimately, as a user it matters less to me since I don't have the time to test my own improvements and just use one of your settings.

Having read your position here and elsewhere, I am not sure why you object to empowering people unless it is simply the chore of defending why you made your previous design decisions.

Ultimately I think there are more important issues to address; I simply thought it beneficial to have a summary document that explains why you did what you did. Clearly, if I read every article ever posted, I would get the gist of it since many of the settings are the result of collaboration on this board and others. But, a summary document/FAQ would save interested individuals a lot of time. Forgive me if such already exists but I am still learning your site having spent the past year+ at r3mix.net.

On a separate note: I did read some of the Hydrogen postings after I posted my initial message. I see many commonalities (particularly in the superstructure) and wish you well. I think the primary difference is more focus: mine on batch audio file conversion including ripping and not including playback at this point. Also some of the listed features such as skinning or chat seemed unimportant to me.
Dibrom
QUOTE
Originally posted by mpconnelly
But the rough and tumble world of open source development is different. At the risk of stepping on feet here, if we can't agree on a naming structure for presets or which preset is "high quality" v. something else, we certainly won't all agree on what should go into a distribution. People will always have their preferences.


The naming scheme of presets is a relatively new debate, I think it would be one which would be solved easily in time. MPC's scheme works great, as does PsyTel's.. I don't really see people complaining there. The thing is that it just needs to be done.

Also, I don't think there is much disagreement about what is high quality, certainly not from those actually performing listening tests. The disagreement is what people think is relevant and what isn't. Roel thinks that dm-preset standard is way overkill and irrelevant, but I doubt he thinks it is low quality. I consider --r3mix to be relatively "high quality" but not at the level of dm-preset standard. I think Roel would also agree with this on a technical level, but again he would be quick to add that he thinks the difference is irrelevant.

QUOTE
That is why I believe the Linux distribution model may be initially confusing to some but ultimately makes a lot of sense--from all the competing distributions, certain winners emerge. Think Linux and you immediately think RedHat, maybe Caldera or Mandrake or Suse. And there are many others (although the number of distributions is thinning out). RedHat clearly sets the benchmark against which others are measured.


If the goal is simplicity and consolidation, then this approach fails. One of the reasons Linux is having difficulty penetrating the desktop market to a significant degree is because there are so many different distributions. There isn't 1 agreed upon, focused standard. Sure.. Redhat has kind of become the "benchmark" but there's also Suse, Debian, Slackware, and countless other popular distros out there.

For that matter there is a significant number of BSD users who choose their particular platform specifically because they much prefer the more consolidated and focused approach. Admittedly, it certainly is easier to work with.

QUOTE
Bottom line: nothing stops anyone from creating their own distribution, leveraging all the previous development. But, ensuring widespread adoption of your distribution is a different matter. Inevitably, I foresee at most maybe 2-3 major distributions (I will shy away from names this time). To the extent that someone creates their own distribution that incorporates a valued improvement, this improvement will likely be adopted in the "mainstream" distributions.


There is nothing stopping this, but I certainly believe it is a bad thing to encourage it. You can get much more done if you work together on 1 thing than spending time "reinventing the wheel" to quote your own words.

At any rate, I can almost guarantee you that there is not going to be any sort of "super structure" which many groups work on which will be distributed in this manner. The Hydrogen Audio Tools is going to be something similar to what you originally described, but there are not going to be multiple distributions of it. It will be open source, so if someone wants to create their own version of it they can.. but it's not going to be Hydrogen Audio Tools anymore and it isn't going to be supported by us.

QUOTE
Well, I am an advocate of an XML configuration file with the ability to modify each setting by inserting/altering the value of the XML key. That way, someone can modify Lame at its innermost level without needing to recompile it or to distribute their own version including their "presets." From my perspective that simply separates the settings from the executable.


I don't see an advantage here. All you do is just hide these long convoluted command lines in an XML file. People can still change them and they will.. so you aren't really solving anything, just pushing the problem off somewhere else. In fact, if I understand correctly you are even suggesting that more internal stuff be given access to in that manner.

I don't see any point in trying to centralize all of these other projects and creating all these wizards and this and that to provide simplicity, but then working against that by allowing such conflicts to arise (all these different distros) and allowing an even lower level of control over internal values for these programs. This will make things even worse than they are now in my opinion.

QUOTE
Having read your position here and elsewhere, I am not sure why you object to empowering people unless it is simply the chore of defending why you made your previous design decisions.


People should be empowered with the ability to create high quality audio and to use that in many different ways. What they shouldn't be empowered with is the ability to change internal encoder modes when they have no knowledge of what they are doing. I'm not trying to be elitist.. I'm trying to be practical. Most programs that are considered successful are that way because they are simple and efficient. Exposing developer level options to the average user and encouraging their use has the opposite effect.

What I am for is something that is cross platform, easy to use, and "just works". That is what is needed. If you do it right then you don't need all of this extra stuff IMO.

In a way I have been inspired by something like MPC. When you initially look at the encoder, coming from using LAME, you are a bit shocked by its lack of options. But the thing is that it "just works". You don't need 50 switches to get good quality.. it does the right thing and it does it really damn well.. and it's simple enough for anyone to use.

QUOTE
On a separate note: I did read some of the Hydrogen postings after I posted my initial message. I see many commonalities (particularly in the superstructure) and wish you well. I think the primary difference is more focus: mine on batch audio file conversion including ripping and not including playback at this point. Also some of the listed features such as skinning or chat seemed unimportant to me.


Mass control of digital audio is basically the goal behind the Hydrogen Audio Tools. That comes before all other "details". The primary form of all of that will be in mass encoding, decoding, management, processing, possibly sharing, and whatever else is logical. Ripping is also going to be looked at (via native interfaces, which in fact almost all of the modules will probably use in some form). Playback is going to be just a side effect of decoding. The chat stuff is likely going to be a side effect of possible audio sharing stuff. Skinning (or more correctly, theming) is just a detail of it all... it's not supposed to be a big focus and it's already completely implemented.
deej_1977
A little word of warning: if you put up binaries of Lame you might attract the attention of Thomson/Fraunhofer. Don't get me wrong, I'm all about helping the newbies (heck I got into a fist fight with r3mix over it ;)) but be careful! The more attention Lame gets from the general public, the more it's going to be exposed to the big corporations and their money hungry lawyers...