Topic: A HA.org Sample Database? (Read 25645 times)

A HA.org Sample Database?

Thread split from here
-------------------------------------------------------------
Task #1 that I want/must/should/can hand over is the XMMS plugin.
An important second task is maintaining test samples.

All lossy encoders would benefit from such a web page.
AAC, AC3, Lame, Musepack, Ogg Vorbis, MP2, dts, ...

Currently there is a very old test sample page for gpsycho:

http://www.mp3dev.org/mp3/gpsycho/quality.html

I think such a page is very important. Otherwise developers must
waste time hunting for test samples in web forums, where they appear
and disappear very quickly. Most test samples can no longer be downloaded
2 or 3 days later.

This is a mess and annoying (at least for people like Frank Klemm).

The other problem is that once you have downloaded these samples,
you have a mess on your hard disks. Thousands of files
with names like goldc.pac, Hex.pac, jo3.ape, dr4.pac, t1.pac, track07.pac,
lalaw.pac, 11.wav, lust.pac, hvmh3.shn, beo.pac.
You don't know the source of these files and you no longer know
what encoder problems they show.

I have 2.5 GB (4 full CDs) of files with such names. I downloaded
them at some point from somewhere for some purpose, but after 6, 12 or
24 months I no longer know much about these files.

I can back up these 4 CDs and give them to CHJW, because I can't upload 2.5 GB
of data.

This job must be done by someone with good ears (he/she must make
some checks to keep spam out of this list), broadband
internet access and a lot of web space, and he/she should be able to send
the test samples on request via snail mail (5..10 €).

Code: [Select]
Listening Test Samples Page
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For Downloads you need a special HydrogenAudio account, which you can get for free.
There is a download limit of 5 MByte per day (except for active developers).
For further information see [download rights].

Samples   [1-50]  [51-100]  [101-150]  [151-186]

------------------------------------------------------------------------------------

This page:
 [1]   [2]   [3] ...
                          ...  [50]

------------------------------------------------------------------------------------

Test Sample #0001

Reported by: Anonymous
Reported on: Mar 20 2003

Source: CD
- Album: Der kleine Hobbit (1980)
   EAN/UPC: 9-783895-841675
   Catalog: 167-6
   LC: ?
   Copyright: 1980 WDR Köln /1986 Hörverlag Stuttgart
- Artist: -
- Media: CD 4
- Track: 2
- Offset: 0:00.00
- Title: Folge 8
- Sample rate: 44100 Hz, 2x16 bit
- Length: 5.4 sec

Contents
 This is a radio play. The master is very noisy: there is a lot of background noise,
 which needs a lot of bits.

Typical errors for encoders:
- Lame: You hear coloration of the background noise and sibilant distortion in the s-o transition of the first word.
- Musepack: ...
- FAAC: ...
- Nero AAC: ...

[Download (FLAC file) 0001_Hobbit.flac]

------------------------------------------------------------------------------------

Test Sample #0002

Reported by: Anonymous
Reported on: Mar 22 2003

Source: CD

- Album: Midnite Vultures (1999)
    EAN/UPC: 6-06949-05272-0
    Catalog: 4905272
    LC: LC 07266
    Copyright: 1999 Geffen Records Inc.
- Artist: Beck
- Media: CD 1
- Track: 4
- Title: Get Real Paid
- Sample rate: 44100 Hz, 2x16 bit
- Offset: 3:51.74 (near the end)
- Length: 8.2 sec

Contents
 The signal is extremely transient, with attacks spaced around 16 ms apart.

Typical errors for encoders:
- Lame: ...
- Musepack: ...
- Nero AAC: ...

[Download (FLAC file) 0002_Beck.flac]

------------------------------------------------------------------------------------

...

------------------------------------------------------------------------------------

This page:
 [1]   [2]   [3] ...
                          ...  [50]

------------------------------------------------------------------------------------

Samples   [1-50]  [51-100]  [101-150]  [151-186]
--  Frank Klemm

Reply #1
Quote
An important second task is maintaining test samples.

What about using the HA wiki for a test sample page? And offering the test samples by bit torrent?

Reply #2
Quote
Quote
An important second task is maintaining test samples.

What about using the HA wiki for a test sample page? And offering the test samples by bit torrent?

Hmm, I don't think bittorrent will work very well on files that are only downloaded now and then.

I can host with unlimited bandwidth on audiocoding.com if you want, but it's only 100 MB webspace, I suppose that will be filled quite quickly.

Menno

Reply #3
IMO it's a good idea to start such a samples page, and not only for MPC. It could be valuable for the whole community. matroska has lately got support from a nice guy whose nick is atomic; he has a server connected to a 100 Mbps line, and we have 15 GB of space on it. I will ask him whether he could imagine doing this for MPC ....

Reply #4
Bittorrent also sucks for all us college students who can only dl things at 1kbps because uploads are firewalled. Also, what about already existing sites such as http://www.ff123.net/samples.html ??

Reply #5
Quote
Quote
Quote
An important second task is maintaining test samples.

What about using the HA wiki for a test sample page? And offering the test samples by bit torrent?

Hmm, I don't think bittorrent will work very well on files that are only downloaded now and then.

I can host with unlimited bandwidth on audiocoding.com if you want, but it's only 100 MB webspace, I suppose that will be filled quite quickly.

Menno

I expect data in the range of ~1 GByte and download volumes around 20...30 GByte/month.
The base pages (HTML) can be on HA; the PCM files can be split across multiple hosts.

Something must be done to prevent people who are not interested in testing from
downloading the files only "to have the files".

I would propose requiring registration before download becomes possible.
Active developers have unlimited access; other people have a download limit per day.

When I remove the binary stuff from www.uni-jena.de, I will also have 90 MByte for audio files.
--  Frank Klemm
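
A minimal sketch of the per-day download limit proposed in the post above, assuming the 5 MByte figure from the page mock-up in the first post. The `Account` class, the `may_download` function and the in-memory usage table are hypothetical names chosen for illustration; a real site would keep this state in its user database.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumption: 5 MByte per day, taken from the mock-up page in the first post.
DAILY_LIMIT_BYTES = 5 * 1000 * 1000


@dataclass
class Account:
    name: str
    is_developer: bool = False
    # Bytes already downloaded, keyed by calendar day.
    usage: dict = field(default_factory=dict)


def may_download(account, size_bytes, today):
    """Return True if the download is allowed.

    Unregistered visitors (account is None) get nothing, active developers
    are never limited, and everyone else is charged against a daily quota.
    """
    if account is None:
        return False
    if account.is_developer:
        return True
    used = account.usage.get(today, 0)
    if used + size_bytes > DAILY_LIMIT_BYTES:
        return False
    account.usage[today] = used + size_bytes
    return True


if __name__ == "__main__":
    tester = Account("tester")
    day = date(2003, 3, 20)
    print(may_download(tester, 3_000_000, day))  # first file fits the quota
    print(may_download(tester, 3_000_000, day))  # second file would exceed it
```

Keying the usage table by calendar day means the quota resets on its own the next day, without a cleanup job.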

Reply #6
The wiki can still be used for the page to make it easy for ppl to add comments and new samples.
But I agree that bit torrent is not a very good solution for many small files...
I have no idea if HA can host these by itself or external hosting would be needed.

Reply #7
Quote
Bittorrent also sucks for all us college students who can only dl things at 1kbps because uploads are firewalled. Also, what about already existing sites such as http://www.ff123.net/samples.html ??

Three remarks:

The files should not have a well-known extension. A lot of proxies forbid access to files
with the extensions ".WAV" and ".MP3". I know at least one proxy which does that.
.AIFF, .PAC, .FLAC, .APE are allowed.

Some people can only access via port 21 (ftp) and 80 (http).

I also know some URLs where sample files can be found. But the problem is that only
10...20% of the useful sample files can be found at such stable long-term URLs.
Most samples are only temporarily available, or the URL can only be found by reading
all web forums.
--  Frank Klemm

Reply #8
I could host about 700 MB of the test-samples on anytag.de, but I've only ~ 20-30 GB traffic left per month.

~ Florian

Reply #9
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

Reply #10
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all distributed files ...
- ... and tests their availability from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly check the accuracy of your file
--  Frank Klemm
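
The last point of the list above (published md5 hash plus file length) can be sketched as follows. This is only an illustration using Python's standard `hashlib`; the function names `file_fingerprint` and `matches_published` are hypothetical, and the expected values would come from whatever the sample page publishes.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=65536):
    """Return (length_in_bytes, md5_hex_digest) for the file at path."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large sample files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()


def matches_published(path, expected_length, expected_md5):
    """Compare a local file against the published length and md5 hash.

    The length is checked first because it is cheap and already catches
    truncated downloads.
    """
    if os.path.getsize(path) != expected_length:
        return False
    return file_fingerprint(path)[1] == expected_md5.lower()
```

The same two functions would also serve a maintainer who re-downloads every mirror copy every two weeks and compares it against the local master copy.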

Reply #11
Quote
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all distributed files ...
- ... and tests their availability from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly check the accuracy of your file

I can take on the task of checking/maintaining the files.
I suggest we use FLAC and md5 (md5 can just be uploaded to the wiki) as Klemm suggested.
Frank Klemm: If you would make an example of a test file at the wiki that would be very nice:
http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage <-- here under downloads I put a TestSample page you and everybody else can add to.

Reply #12
Quote
Quote
Quote
I think I can host 200MB or so of stuff, maybe a bit more. I have 10GB of total bandwidth for my hosting plan, and my sites have yet to exceed even 400MB of bandwidth although they are becoming busier so might need 1GB for them in the next few months. It'd be at donovansmith.us or lowcostaudio.org, you can choose the domain name, or I can get you a subdomain at donovansmith.us. If this can be of any use for either samples and/or web pages feel free to let me know. I appreciate your work on MPC very much, Mr. Klemm

A possible solution would be:

- Base pages are on HA
- Files are spread over multiple hosts
- Maintainer has a local copy of all distributed files ...
- ... and tests their availability from time to time (every 2 weeks)
- files which become dead are placed on HA.org
- maybe the test can be done automatically
- files should be compressed losslessly
- md5 hash and file length of the files should be available, so you can quickly check the accuracy of your file

I can take on the task of checking/maintaining the files.
I suggest we use FLAC and md5 (md5 can just be uploaded to the wiki) as Klemm suggested.
Frank Klemm: If you would make an example of a test file at the wiki that would be very nice:
http://doc.hydrogenaudio.org/wikis/hydrogenaudio/FrontPage <-- here under downloads I put a TestSample page you and everybody else can add to.

Task #3 is the WinAMP 2/3/5 plugin:

- Does somebody know a valid e-mail address for Case? mobiili.net bounces.
- Is Case still interested in WinAMP plugin development?
- Who is interested in developing and maintaining the WinAMP plugin?

Case has the latest source of the WinAMP plugin; I gave the code away
more than 1 1/2 years ago.
--  Frank Klemm

Reply #13
Case's homepage indicates: cse@sci.fi

Reply #14
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #15
1. I think several hosts should share the load, across several countries, to make
it easier for those who have slower connections to certain continents or pay-free
access within their ISP/country. Some redundancy will be needed, meaning that
the samples will have to be hosted by at least 2 web sites, to avoid the
problem of one host going down and lack of availability.

2. I can offer 3 GB of space with unlimited bandwidth on a very fast European site,
but files will have to be uploaded there through me, and not directly by those who
submit the samples. I don't think I have the time to do this if many samples are
submitted. I think atomic should be contacted, as Chris suggested. A site or
several sites with access for certain chosen devs/volunteers is indeed the
appropriate way to do it.

3. BitTorrent can still be used as a back-up system for those who are not limited to
ports 80/21 or are not behind a firewall. The entire collection of samples can be
divided into 15-20 chunks based on either artist/track name or based on the type
of problem (transients in one package, for example) and those 15-20 packages
can be torrented through several dedicated members of the community who can
keep their BitTorrent client open. This should take quite a lot of the load off the
main hosts, and it would allow us to have more sources for the collection, if the
dedicated web sites are down, for whatever reason.

4. AIFF, for example, should not be accepted. Only lossless formats that can be
decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).
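
A rough sketch of the redundancy rule from point 1 (every sample on at least 2 web sites), assuming each volunteer host only advertises its free space in MB. The function name `assign_mirrors`, the greedy strategy and the host names in the usage example are illustrative assumptions, not something proposed in the thread.

```python
def assign_mirrors(samples, hosts, copies=2):
    """Place each sample on `copies` distinct hosts, largest files first.

    samples: dict of file name -> size in MB
    hosts:   dict of host name -> free space in MB
    Returns a dict of file name -> list of host names.
    Raises ValueError if a sample cannot be placed `copies` times.
    """
    free = dict(hosts)  # work on a copy, leave the caller's dict untouched
    placement = {}
    for name, size in sorted(samples.items(), key=lambda kv: -kv[1]):
        # Prefer the hosts with the most remaining space.
        candidates = sorted((h for h in free if free[h] >= size),
                            key=lambda h: -free[h])
        if len(candidates) < copies:
            raise ValueError(f"not enough space for {copies} copies of {name}")
        chosen = candidates[:copies]
        for h in chosen:
            free[h] -= size
        placement[name] = chosen
    return placement


if __name__ == "__main__":
    # Hypothetical host names and sizes, for illustration only.
    hosts = {"host-a": 100, "host-b": 700, "host-c": 200}
    samples = {"0001_Hobbit.flac": 0.5, "0002_Beck.flac": 0.9}
    print(assign_mirrors(samples, hosts))
```

Placing the largest files first is a common greedy heuristic; it does not guarantee an optimal packing, but it fails loudly when the offered space is not enough for two copies.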
And if Warhol's a genius, what am I? A speck of lint on the ***** of an alien

Reply #16
Quote
4. AIFF, for example, should not be accepted. Only lossless formats that can be
decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).

I think to keep it simple they should all be in one format.  I would suggest FLAC as it seems to be what is most commonly used here at HA.org.
gentoo ~amd64 + layman | ncmpcpp/mpd | wavpack + vorbis + lame

Reply #17
I agree with all the points Seed mentioned and would like to add further considerations:

1. There should be a fixed filenaming scheme. Right now most people, who regularly post samples, use their own schemes or rather meaningless names like track1.flac. I'd suggest artistwithoutspaces-songnospaces.sample20sec.flac. No spaces are used for compatibility and ease of use reasons.

2. Contributors should be able to add/remove their own mirrors, so even people without lots of webspace (many ISPs include 5-10MB for a personal homepage) could help spread the load.

3. Metadata shouldn't be a requirement, but a recommendation. TrackGain could be useful too, since it gives an indication about the nature of the signal.
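
The naming scheme from point 1 could be generated mechanically, for example like this. The helper name `sample_filename`, the lower-casing and the rule of dropping everything except ASCII letters and digits are assumptions made for this sketch; the post only asks for names without spaces.

```python
import re


def sample_filename(artist, title, seconds, ext="flac"):
    """Build a name of the form artist-song.sampleNNsec.ext without spaces."""
    def squash(text):
        # Assumption: lower-case and keep only ASCII letters and digits.
        return re.sub(r"[^a-z0-9]", "", text.lower())
    return f"{squash(artist)}-{squash(title)}.sample{round(seconds)}sec.{ext}"


if __name__ == "__main__":
    # Sample #0002 from the mock-up page in the first post.
    print(sample_filename("Beck", "Get Real Paid", 8.2))
    # prints "beck-getrealpaid.sample8sec.flac"
```

Note that this simple rule also drops accented letters (an "ö" would vanish), so a real implementation would probably want a transliteration step first.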

dev0
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #18
I can provide small hosting - about 20 megs, and about 1 Gig monthly limit for the samples, if that is of any use...

edit: I am not aware of any good way to limit the bandwidth btw. (without root rights that is)
PANIC: CPU 1: Cache Error (unrecoverable - dcache data) Eframe = 0x90000000208cf3b8
NOTICE - cpu 0 didn't dump TLB, may be hung

Reply #19
Quote
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.

Another problem with using the wiki, IMO, is that it would probably be hard to hack a download limiter into it (Klemm's idea of allowing registered users downloading 5 samples per day, Developers download as much as they want, non-registered users download nothing...)

Quote
4. AIFF, for example, should not be accepted. Only lossless formats that can be decoded on all OSs. Shorten = yes. FLAC = yes. AIFF = no (a waste of space).


Erm... AIFF is not even a lossless compression format; AIFF is Apple's audio container, pretty much as WAV is Microsoft's audio container. It can hold PCM, ADPCM, MP3...

And AIFF can be "decoded" on all mainstream OSs. On Windows: Winamp, nearly every audio editor, QuickTime, (foobar?)... on Mac: QuickTime, iTunes... and on Linux, anything using libsndfile. Converting it to Wav is a snap as well.

Reply #20
Still FLAC should be used, since we should all be able to agree on the fact that lossless compression is better than no compression.
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #21
I meant that all collected samples must be compressed, and this is why AIFF should
not be accepted. I've been using this file format for 13 years, so I know very well
what it is, kthx.

dev0's suggestion is perfect. FLAC is the one file format everyone can deal with and
it offers decent compression ratios.
And if Warhol's a genius, what am I? A speck of lint on the ***** of an alien

Reply #22
Quote
I'm not sure if the wiki would be optimal for such a database. Maybe a specialized solution (anyone up for coding something in PHP or Perl?), which would allow adding/editing/deleting/searching samples/mirrors/comments using a fixed layout and interface.

I'm trying to make as much use of the wiki as possible. With a good template/example I think the wiki would work out fine.
IMO a new site would be overshooting the mark a bit. I don't think it is gonna be that big and used that much. So that much hosting is probably not needed either IMO.

Quote
1. I think several hosts should share the load, across several countries, to make
it easier for those who have slower connections to certain continents or pay-free
access within their ISP/country. Some redundancy will be needed, meaning that
the samples will have to be hosted by at least 2 web sites, to avoid the
problem of one host going down and lack of availability.

Wouldn't it be enough if one person downloads all files and checks the links every 2 weeks or so?

Quote
1. There should be a fixed filenaming scheme. Right now most people, who regularly post samples, use their own schemes or rather meaningless names like track1.flac. I'd suggest artistwithoutspaces-songnospaces.sample20sec.flac. No spaces are used for compatibility and ease of use reasons.

I think this is problematic since a lot of the problem samples are old and from unknown source: fatboy, castanets etc.
We should encourage a useful filenaming scheme and tagging though.
A description of the problem can be saved in a tag also.


Quote
Another problem with using the wiki, IMO, is that it would probably be hard to hack a download limiter into it (Klemm's idea of allowing registered users downloading 5 samples per day, Developers download as much as they want, non-registered users download nothing...)


Do you think BW would actually be a problem? If the files are spread around several hosts I find it hard to believe that the load would be very high.
--------

I suggest we do the following:
  • Someone makes a template/example at the wiki that includes:
    • Description of the problem.
    • Info about the source (cd, length, (style?)).
    • Download location.
    • MD5-sum or MD5-file location.
  • I make sure to have a copy of the files.
  • I suggest we use only FLAC so that you only need 1 decoder that is different from what you normally use. And FLAC seems to be a good choice.
  • Filename should contain: album, artist, track name.
  • Info about who is willing to host and how to contact them (email etc.)
What do you think of this?
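
As one possible reading of the template proposed above, here is a small sketch that renders a sample entry as plain text which could be pasted into a wiki page. The `SampleEntry` fields mirror the bullet points; the class name, the `render_entry` function, the output layout, the example URL and the placeholder MD5 value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SampleEntry:
    problem: str          # description of the problem
    source: str           # info about the source (CD, track, ...)
    length_sec: float     # length of the excerpt
    download_urls: list   # one or more mirror locations
    md5: str              # MD5 sum of the FLAC file


def render_entry(number, entry):
    """Return the entry as a block of plain text, one field per line."""
    lines = [
        f"Test Sample #{number:04d}",
        f"Problem: {entry.problem}",
        f"Source: {entry.source}",
        f"Length: {entry.length_sec} sec",
    ]
    lines += [f"Download: {url}" for url in entry.download_urls]
    lines.append(f"MD5: {entry.md5}")
    return "\n".join(lines)


if __name__ == "__main__":
    entry = SampleEntry(
        problem="Signal is extremely transient.",
        source="Beck, Midnite Vultures (1999), track 4",
        length_sec=8.2,
        download_urls=["http://example.org/0002_Beck.flac"],  # placeholder URL
        md5="0" * 32,  # placeholder, not a real checksum
    )
    print(render_entry(2, entry))
```

A fixed set of fields like this would also make it easier to move the entries into a dedicated database later, if the wiki turns out to be too limited.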

Reply #23
Quote
Do you think BW would actually be a problem? If the files are spread around several hosts I find it hard to believe that the load would be very high.

I expect there will be bandwidth spikes whenever a developer calls for testing. (considering people will actually respond to the call this time). That could kill several of the small hosts.

Reply #24
Why not keep it simple?
Everyone could access the pages with information on each sample.
Only developers could download samples directly.
Everyone else could download, via torrents, a package that is updated each week or each month.

For those behind proxies, if they're not developers, maybe some people will host the full package as mirrors.
It's a 'Jump to Conclusions Mat'. You see, you have this mat, with different CONCLUSIONS written on it that you could JUMP TO.