Full Version: BakaTime, an application for timing processes
Hydrogenaudio Forums > Misc. > Off-Topic
Synthetic Soul
QUOTE(Liisachan @ May 25 2006, 12:10) *
I already made an accurate timing tool which will correct the time cost needed for timing itself. (I mean, "Timing" itself needs some CPU time)
If you have something that you feel will time such tests accurately I would be interested to see it, if you are willing to release it to the public.

NB: I am currently using Timer, the reason for which can be found in the Yalac thread.

I would have thought that the process of timing would be consistent, and therefore negligible and/or irrelevant when comparing times with other encoders. I would be interested to see confirmation or repudiation of this though.
Liisachan
QUOTE(Synthetic Soul @ May 25 2006, 13:00) *

I would have thought that the process of timing would be consistent, and therefore negligible and/or irrelevant when comparing times with other encoders. I would be interested to see confirmation or repudiation of this though.


Well, in practice, that's true. Technically, tho, let's say if the overhead was 10ms (constant), and foo cost 50ms observed and bar cost 500ms observed, then, according to your theory, you would say foo is 10 times faster than bar. However, true costs are 40ms vs 490ms, so actually foo is more than 12 times faster than bar. "10 times" would be underestimating. Such a situation might happen.
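
The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the reasoning; `speed_ratio` is a name made up for illustration, and the 10 ms overhead is the hypothetical figure from the example, not a measurement.

```python
def speed_ratio(slow_s, fast_s, overhead_s=0.0):
    """How many times faster the fast run is, after removing a constant overhead."""
    return (slow_s - overhead_s) / (fast_s - overhead_s)

observed = speed_ratio(0.500, 0.050)           # overhead ignored
corrected = speed_ratio(0.500, 0.050, 0.010)   # hypothetical 10 ms overhead removed
print(f"observed: {observed:.2f}x, corrected: {corrected:.2f}x")
```

The shorter the run, the more a constant overhead distorts the ratio, which is why it matters most when comparing a very fast tool with a very slow one.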

Anyway, what I did was a real simple trick. So simple I even named it BakaTime (silly timer). Just look at this output and you can easily guess what is being done:

CODE

BakaTime -x wavpack -b384x "E:\My Music\Sample.wav" "E:\My Music\Output1.wv"

BakaTime for Windows 2000/XP v0.3
-----------------------------------
    Timing: { wavpack -b384x "E:\My Music\Sample.wav" "E:\My Music\Output1.wv" }



WAVPACK  Hybrid Lossless Audio Compressor  Win32 Version 4.31  2005-12-10
Copyright (c) 1998 - 2005 Conifer Software.  All Rights Reserved.

created E:\My Music\Output1.wv in 2.73 secs (lossy, 407 kbps)


[ BakaTime Overhead Self-Test ]
#0 : 0.024801 sec
#1 : 0.026226 sec
#2 : 0.025181 sec
#3 : 0.025422 sec
#4 : 0.025154 sec
#5 : 0.025412 sec
#6 : 0.025461 sec
#7 : 0.025159 sec
#8 : 0.025036 sec
#9 : 0.025442 sec
Ave: 0.025329 sec

[ BakaTime Report ]
Cmd   =wavpack -b384x "E:\My Music\Sample.wav" "E:\My Music\Output1.wv"
Start =    787611395122
End   =    796995790950
Freq  =      3391560000 / sec
Elapse=        2.766985 sec
Corr  =       -0.025329 sec
Cost  =        2.741656 sec
Accur =        0.001425 sec
Cost  =        2.742    sec


It's a simple trick like children's play, but better than nothing I guess :)
From this, you can tell:
(1) Overhead is not really constant but varies by about +- 1 ms, so the result is only reliable down to a millisecond or so. Microseconds would be pointless even if the timing tool is calling a nanosecond-accuracy counter internally. It's meaningful to know the number of significant digits.
(2) Overhead is not really small: 25 ms in this case. However, the estimated overhead is 25 +- 1 ms, so we can safely try to correct for it.

Again, this is rather like a child's toy, just a better-than-nothing thing. Still, there's nothing wrong in trying to measure things as accurately as possible, right?
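
A minimal sketch of how such an overhead self-test could work, in Python. This is a guess at the idea, not BakaTime's actual code (which is not shown in this thread): time a do-nothing command several times, average the result, and subtract it from the observed time. The helper names, and the choice of an empty Python program as the do-nothing command, are assumptions.

```python
import statistics
import subprocess
import sys
import time

def time_command(args):
    """Wall-clock seconds taken to launch `args` and wait for it to exit."""
    start = time.perf_counter()
    subprocess.run(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

def measure_overhead(noop_args, runs=10):
    """Time a do-nothing command several times; return (mean, max - min)."""
    samples = [time_command(noop_args) for _ in range(runs)]
    return statistics.mean(samples), max(samples) - min(samples)

# Assumption: an interpreter running an empty program stands in for "do nothing".
NOOP = [sys.executable, "-c", "pass"]
WORK = [sys.executable, "-c", "sum(range(10**6))"]

overhead, spread = measure_overhead(NOOP, runs=5)
elapsed = time_command(WORK)
print(f"Elapse= {elapsed:.6f} sec")
print(f"Corr  = {-overhead:.6f} sec")
print(f"Cost  = {elapsed - overhead:.6f} sec")
print(f"Accur = {spread:.6f} sec")
```

What counts as overhead depends on the do-nothing command chosen. Here interpreter start-up is subtracted along with process creation, which may or may not be what you want.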
Synthetic Soul
Thanks for sharing.

I am assuming that "Accur" is the difference between the slowest and fastest self-test? Is it used at all when reporting the final "Cost"?

Your point regarding the comparison of figures is valid. If you consider the difference between encoding using FLAC -0 and OptimFROG's best compression settings the difference is a lot more than 10x. I suppose I was only thinking of like encoders encoding whole tracks or albums... *blush*

If you have read the Yalac thread you may have seen the discussion regarding CPU time vs CPU+IO time. I assume your timer is recording CPU+IO (i.e.: the whole process) time. I would be interested to hear your thoughts on the subject.

As you say, no harm in trying to be as accurate as possible.
Liisachan
QUOTE(Synthetic Soul @ May 25 2006, 15:35) *

Thanks for sharing.

I am assuming that "Accur" is the difference between the slowest and fastest self-test? Is it used at all when reporting the final "Cost"?


Estimated Accur(acy) of the Overhead cost is used to estimate the final accuracy. If observed overhead is +- 10ms, the final accuracy is roughly +- 10ms. In the above example, the Accuracy is like +-1ms, but if it was like +-10ms, the final result would be shown as Cost = 2.74 sec, not 2.742, because the trailing 2 would not be reliable in that case.
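
The figures in the report above can be reproduced from its raw values. In this Python sketch, Accur is taken as the difference between the slowest and fastest self-test (which matches the printed 0.001425), and the number of decimals is derived from the order of magnitude of Accur. That rounding rule is an assumption that happens to reproduce the report; BakaTime's real rule is not shown in the thread.

```python
import math
import statistics

# Values copied from the BakaTime output above.
SELF_TEST = [0.024801, 0.026226, 0.025181, 0.025422, 0.025154,
             0.025412, 0.025461, 0.025159, 0.025036, 0.025442]
START, END, FREQ = 787611395122, 796995790950, 3391560000

def format_cost(cost, accur):
    """Keep only the decimals that the accuracy estimate supports (assumed rule)."""
    decimals = max(0, -math.floor(math.log10(accur)))
    return f"{cost:.{decimals}f}"

elapsed = (END - START) / FREQ             # counter ticks -> seconds
corr = statistics.mean(SELF_TEST)          # average self-test overhead
accur = max(SELF_TEST) - min(SELF_TEST)    # spread of the self-test
cost = elapsed - corr

print(f"Elapse= {elapsed:.6f} sec")
print(f"Corr  = {-corr:.6f} sec")
print(f"Cost  = {cost:.6f} sec")
print(f"Accur = {accur:.6f} sec")
print(f"Cost  = {format_cost(cost, accur)} sec")
```

With a spread of about 10 ms instead, `format_cost(cost, 0.012)` keeps two decimals and gives 2.74.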

QUOTE

If you have read the Yalac thread you may have seen the discussion regarding CPU time vs CPU+IO time. I assume your timer is recording CPU+IO (i.e.: the whole process) time. I would be interested to hear your thoughts on the subject.


I'm no expert, just a user. But generally, IO cost should be included. For instance, assume you'd like to record a radio program losslessly using OptimFROG: would that be possible? The answer would be YES if and only if the encoder keeps up with real time once both CPU and IO are counted, i.e. the CPU+IO time is no longer than the length of the program. So for such a purpose, IO cost means much. A good example is huffyuv, a lossless video codec. huffyuv fast mode is faster CPU-wise, but the compression is bad, which means you'd need a really fast HD IO speed, aka bandwidth (because the resulting file is huge). If your HD is not very fast, huffyuv best ("slowest") mode is paradoxically faster. I think you can imagine the situation.
Of course it is ideal to defrag the HD before each test, and keep the temperature the same if possible.
However, IO cost is a hardware thing, essentially random, so in a purely theoretical comparison, measuring only the software cost might be a good idea (more reproducible). Simply put, for a practical purpose you cannot ignore the IO cost, but when testing for development, a more accurate comparison would be possible and easier by ignoring IO cost. That's especially true when handling huge (lossless) files, like when testing YALAC.
Thomas Becker was talking about 0.1% difference, and in such subtle discussions, the IO cost is a pest. If you have to include the IO cost, you really have to defrag every time and have to keep the HD temperature constant. I would feel that rather annoying or "time-consuming" ;) Just my 2 cents.
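
The huffyuv example above can be put into a toy model. This Python sketch assumes that CPU work and disk writes overlap perfectly, so the slower of the two sets the pace; every number in it is made up for illustration and none comes from a real measurement of huffyuv or any disk.

```python
def effective_rate(cpu_mb_s, disk_mb_s, output_ratio):
    """Input MB/s the encoder can sustain when CPU work and disk writes overlap.

    output_ratio is compressed size / original size, so the disk has to absorb
    output_ratio MB for every MB of input.
    """
    return min(cpu_mb_s, disk_mb_s / output_ratio)

DISK_MB_S = 25.0   # hypothetical slow hard disk

# Hypothetical modes: "fast" needs less CPU but writes more data than "best".
fast = effective_rate(cpu_mb_s=80.0, disk_mb_s=DISK_MB_S, output_ratio=0.6)
best = effective_rate(cpu_mb_s=50.0, disk_mb_s=DISK_MB_S, output_ratio=0.4)
print(f"fast mode: {fast:.1f} MB/s, best mode: {best:.1f} MB/s")
```

With these invented figures the fast mode is limited by the disk and the best mode by the CPU, so the nominally slower mode finishes first. Recording in real time is possible when the effective rate is at least the rate at which the source produces data.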
Synthetic Soul
QUOTE(Liisachan @ May 25 2006, 17:30) *
In the above example, the Accuracy is like +-1ms, but if it was like +-10ms, the final result would be shown as Cost = 2.74 sec, not 2.742, because the trailing 2 would not be reliable in that case.
But would 2.740 be any more accurate? If you need to report a time (the purpose of running a test) ignoring the variance isn't really helping. Or would you use the lack of a third decimal place as a sign to repeat the test?

QUOTE(Liisachan @ May 25 2006, 17:30) *
Thomas Becker was talking about 0.1% difference, and in such subtle discussions, the IO cost is a pest. If you have to include the IO cost, you really have to defrag every time and have to keep the HD temperature constant. I would feel that rather annoying or "time-consuming" ;) Just my 2 cents.
Indeed, IO is a variable factor, and trying to unify the time across runs is an impossible task.

I agree that both CPU-only and CPU+IO times are useful.


I'm sorry to have dragged this off topic, but your talk of timing is very relevant to me at the moment. If you are in agreement I will split these posts to a new thread... possibly titled "BakaTime, an application for timing processes"?
Liisachan
QUOTE(Synthetic Soul @ May 25 2006, 19:29) *

QUOTE(Liisachan @ May 25 2006, 17:30) *
In the above example, the Accuracy is like +-1ms, but if it was like +-10ms, the final result would be shown as Cost = 2.74 sec, not 2.742, because the trailing 2 would not be reliable in that case.
But would 2.740 be any more accurate? If you need to report a time (the purpose of running a test) ignoring the variance isn't really helping.


When you say a value, you are supposed to say 2 things: the value itself and its precision.
In that sense 2.74 and 2.740 are different, even tho they are equal in math.
If you say 2.74, that means something like 2.74 ± 0.01 seconds, i.e. the number of significant digits in the fractional part is 2.
If you say 2.740, that means like 2.740 ± 0.001 seconds, i.e. the number of significant digits in the fractional part is 3, and the last significant digit happens to be zero.
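
For anyone logging many timings, the distinction can be kept in the data itself. A small Python sketch using the standard `decimal` module, which compares 2.74 and 2.740 as equal numbers but remembers how many digits were written:

```python
from decimal import Decimal

a = Decimal("2.74")
b = Decimal("2.740")

print(a == b)            # equal as numbers
print(str(a), str(b))    # but each keeps its written precision
# Number of digits after the decimal point carried by each value
print(-a.as_tuple().exponent, -b.as_tuple().exponent)
```

A plain float would lose this, since 2.74 and 2.740 are stored identically.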

QUOTE

I'm sorry to have dragged this off topic, but your talk of timing is very relevant to me at the moment. If you are in agreement I will split these posts to a new thread... possibly titled "BakaTime, an application for timing processes"?
Well, sure, this is getting too off-topic, obviously. And even tho my tool itself is trivial and of little value, it'd be great if this topic would inspire deeper, more meaningful and more productive discussions, like good tips for more reasonable time-cost estimation. I didn't imagine my poor knowledge would help, as I'm just a user and no expert of this area, but if it does help a little, I'm glad.

ps: A doom9 mod would have just split the thread already without even asking the normal users; I'm so used to that, I felt you are too nice :)
Synthetic Soul
QUOTE(Liisachan @ May 26 2006, 19:33) *
When you say a value, you are supposed to say 2 things: the value itself and its precision.
OK, then I would just suggest that this is documented, if and when you release BakaTime. Otherwise I think people will assume it is simply 2.740.

QUOTE(Liisachan @ May 26 2006, 19:33) *
Well, sure, this is getting too off-topic, obviously. And even tho my tool itself is trivial and of little value, it'd be great if this topic would inspire deeper, more meaningful and more productive discussions, like good tips for more reasonable time-cost estimation. I didn't imagine my poor knowledge would help, as I'm just a user and no expert of this area, but if it does help a little, I'm glad.
Exactly, I'm just trying to provide feedback to you and maybe get some other people thinking a little too.

I think the idea of trying to discount the actual timing process is justified, but I would also say that I am very interested in Timer's CPU-only time, as the IO variation is just too much of a rogue influence. It does need to be considered at times, but it is also useful to be able to discount it.
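
To illustrate the difference between the two kinds of time, here is a small Python sketch. It measures the current process only, using `time.perf_counter` for wall-clock time and `time.process_time` for CPU time; how Timer obtains its figures for a child process is not described in this thread, so this shows the concept, not Timer's method. The sleep is a stand-in for waiting on a disk.

```python
import time

def timed(fn):
    """Run fn() and return (wall_seconds, cpu_seconds) for this process."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    fn()
    return time.perf_counter() - wall0, time.process_time() - cpu0

def wait_like_io():
    time.sleep(0.2)                       # waiting: wall clock advances, CPU is idle

def compute():
    sum(i * i for i in range(300_000))    # busy: CPU time and wall time both advance

io_wall, io_cpu = timed(wait_like_io)
cpu_wall, cpu_cpu = timed(compute)
print(f"waiting:   wall {io_wall:.3f} s, cpu {io_cpu:.3f} s")
print(f"computing: wall {cpu_wall:.3f} s, cpu {cpu_cpu:.3f} s")
```

The waiting case shows why CPU-only time is more repeatable: time spent blocked on the disk does not appear in it at all.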
Liisachan
QUOTE(Synthetic Soul @ May 27 2006, 09:56) *

QUOTE(Liisachan @ May 26 2006, 19:33) *
When you say a value, you are supposed to say 2 things: the value itself and its precision.
OK, then I would just suggest that this is documented, if and when you release BakaTime. Otherwise I think people will assume it is simply 2.740.


Um, do you really think so? Does anyone have any comments about this? I don't really think so, as this is a well-established convention:

From Wikipedia http://en.wikipedia.org/wiki/Significant_digits
QUOTE
In order to express the degree of precision to which a value was measured, decimal numerals are used. When using significant figures rules, it should be assumed that the last significant digit of every measurement was estimated. Using the previous example, if the observer read the amount of liquid in the cylinder to be exactly at the 12 ml mark, the observer would write the value as 12.0 ml, which would indicate that the tenths place was the precision obtained, and the 0 was estimated. If the cylinder were marked off to every tenth of a ml, the observer would write the value as 12.00 ml.


QUOTE(Synthetic Soul @ May 27 2006, 09:56) *
I think the idea of trying to discount the actual timing process is justified, but I would also say that I am very interested in Timer's CPU-only time, as the IO variation is just too much of a rogue influence.


If you are talking about YALAC specifically, maybe something like this MPlayer option would work for the purpose:
mplayer -vo null -nosound -benchmark
...or generally something like > NUL... which would eliminate writing time software-wise for benchmark.
As for input... HD reading access time... hmm, maybe we could use a huge RAM like 1GB~4GB and first copy the file into memory? Might be a silly idea tho :D


Synthetic Soul
QUOTE(Liisachan @ May 27 2006, 11:35) *
Um, do you really think so?
Yes. Not all users who perform timings have a good mathematical background. My main point is that I, for example, use Timer to record hundreds of timings in one test. It is unrealistic for me to highlight those timings where 2DP, and those where 3DP, are used. If you are documenting a few processes then highlighting the lack of a third DP may be realistic. Don't get me wrong, I don't see this as a major deal; just feeding back/discussing.

QUOTE(Liisachan @ May 27 2006, 11:35) *
If you are talking about YALAC specifically
No, I'm not. IO will be an issue for any process where the time reading or writing to disk is a significant proportion of the process. In my timings for FLAC -0, YALAC Fast, WavPack -f, etc. the IO time accounts for 20% of the total process time.
Liisachan
QUOTE(Synthetic Soul @ May 27 2006, 18:17) *

QUOTE(Liisachan @ May 27 2006, 11:35) *
Um, do you really think so?
Yes. Not all users who perform timings have a good mathematical background.
Ok, that's true. The "timing-itself" lag could be 10~50ms (0.01~0.05 sec), so if you want 0.01 sec accuracy, and if you want to be really strict, the correction has some valid points too anyway.

QUOTE
IO will be an issue for any process where the time reading or writing to disk is a significant proportion of the process. In my timings for FLAC -0, YALAC Fast, WavPack -f, etc. the IO time accounts for 20% of the total process time.
I guess that "20%" part depends on your device, but yes...that IS disturbingly huge.
have you already tried something like this, to kill the HD writing lag?

flac in.wav -f -o NUL

Do you think using the NUL device here is a good/valid approach?

As for the reading delay, I'm clueless. Just hypothetically, if stdin/out was supported (like lame's "-"), we could use a simple pipeline trick ("BakaInput") which would first load the file into RAM, and then feed it to stdio (should be much faster than reading directly from HD). The only tricky part would be, to make sure for Windows not to use a swap. But the fact is, not every encoder supports STDIO, right? I'm gonna have to think about it more...
Is there a good tool that creates a virtual drive on RAM?
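
A rough sketch of the "BakaInput" idea in Python, under the stated assumption that the encoder can read from stdin. The file is read into memory before the clock starts, then fed to the child through a pipe while the output is discarded. `time_from_ram` is a made-up helper, and the stand-in "encoder" is just a child process that consumes stdin; a real encoder command line would replace it. Nothing here prevents the OS from swapping the buffer out.

```python
import os
import subprocess
import sys
import tempfile
import time

def time_from_ram(args, input_path):
    """Load input_path into memory, then time `args` reading it from stdin."""
    with open(input_path, "rb") as f:
        data = f.read()                   # disk read happens before the clock starts
    start = time.perf_counter()
    subprocess.run(args, input=data, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start, len(data)

# Stand-in "encoder": a child process that just consumes stdin.
CONSUMER = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as f:
        f.write(os.urandom(1 << 20))      # 1 MiB of dummy input
    seconds, size = time_from_ram(CONSUMER, sample)

print(f"fed {size} bytes from RAM in {seconds:.6f} sec")
```

The measured time still includes process start-up and the pipe transfer, so an overhead correction would apply here as well.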

Another thing: a "fast" compressor might be fast partly because it uses good io technique, and it is possible that when we compare several tools naively, we are actually not purely comparing the encoding/decoding algos, but are "tricked" by IO hacks. What I mean here is not the HD r/w speed but how well the author of the app optimized the IO part speed-wise. If the app can be a memory eater, it can be faster by using a huge buffer, or maybe buffering the file with one thread while compressing it with another.

So... in theory, using a lot of relatively small samples might be a good idea here. If the file is huge, we will be confused by IO in 2 ways: (1) the physical r/w speed and (2) the tool's i/o algo.
A tool might be created so that it uses least memory (suitable for HW support), and if so, the IO buffer should be very small in size which makes it look slower than another memory-eating app. What is the fair comparison of compression speed? is a rather complicated question to answer... Perhaps we should also check the memory usage, not only speed and compression ratio.

Synthetic Soul
QUOTE(Liisachan @ May 28 2006, 03:12) *
I guess that "20%" part depends on your device, but yes...that IS disturbingly huge.
Indeed. It will depend on your device, which means it is variable between users, and it is a large proportion of the process time. My machine is an Athlon XP 2400; 512MB RAM; Win 2K SP4; 7200 PATA hard drive. A bit outdated nowadays, but not the worst machine by any means.

QUOTE(Liisachan @ May 28 2006, 03:12) *
have you already tried something like this, to kill the HD writing lag?

flac in.wav -f -o NUL

Do you think using the NUL device here is a good/valid approach?
I haven't tried that. I would be interested to test this type of approach against Timer's Process Time (CPU only). It's very difficult analysing these things though as no run is the same. Still, we would be able to see if they produced results that appeared to be the same. I would expect that your approach would be valid, yes; but again, I'm by no means an expert. Would outputting to NUL work with any encoder/application?

QUOTE(Liisachan @ May 28 2006, 03:12) *
As for the reading delay, I'm clueless. Just hypothetically, if stdin/out was supported (like lame's "-"), we could use a simple pipeline trick ("BakaInput") which would first load the file into RAM, and then feed it to stdio (should be much faster than reading directly from HD). The only tricky part would be, to make sure for Windows not to use a swap. But the fact is, not every encoder supports STDIO, right? I'm gonna have to think about it more...
Is there a good tool that creates a virtual drive on RAM?
That sounds interesting, but as you say I guess you can't rely on the swap file not coming into play. A timer with a built-in RAM disk sounds pretty cool though. I have seen mention of "RAMDisk XP" in the MP3 repacker thread, but that's the first I've heard of a RAM disk.

QUOTE(Liisachan @ May 28 2006, 03:12) *
Another thing: a "fast" compressor might be fast partly because it uses good io technique, and it is possible that when we compare several tools naively, we are actually not purely comparing the encoding/decoding algos, but are "tricked" by IO hacks. What I mean here is not the HD r/w speed but how well the author of the app optimized the IO part speed-wise. If the app can be a memory eater, it can be faster by using a huge buffer, or maybe buffering the file with one thread while compressing it with another.

So... in theory, using a lot of relatively small samples might be a good idea here. If the file is huge, we will be confused by IO in 2 ways: (1) the physical r/w speed and (2) the tool's i/o algo.
A tool might be created so that it uses least memory (suitable for HW support), and if so, the IO buffer should be very small in size which makes it look slower than another memory-eating app. What is the fair comparison of compression speed? is a rather complicated question to answer... Perhaps we should also check the memory usage, not only speed and compression ratio.
Some interesting points. It sounds like you should be speaking to Josef Pohm! Thomas Becker briefly discussed encoder and IO frame sizes in this post in the Yalac thread.
pepoluan
Hope you don't mind my 150 IDR (that's 0.02 USD ;) )

QUOTE(Synthetic Soul @ May 29 2006, 13:52) *
QUOTE(Liisachan @ May 28 2006, 03:12) *
have you already tried something like this, to kill the HD writing lag?

flac in.wav -f -o NUL

Do you think using the NUL device here is a good/valid approach?
I haven't tried that. I would be interested to test this type of approach against Timer's Process Time (CPU only). It's very difficult analysing these things though as no run is the same. Still, we would be able to see if they produced results that appeared to be the same. I would expect that your approach would be valid, yes; but again, I'm by no means an expert. Would outputting to NUL work with any encoder/application?
Works, i.e. flac doesn't puke at outputting to NUL. However, remember that Windows' I/O routine is still called. Of course this overhead will be minimal, as Windows is smart enough to merely ignore the data but return a success code. But if you really want to be accurate, you have to find a way to measure this overhead, and subtract it from the result.
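
One possible way to get a feel for that overhead, sketched in Python: write the same amount of data to the null device and time it. `os.devnull` resolves to NUL on Windows and /dev/null elsewhere. The function name, the 16 MiB total and the 64 KiB chunk size are arbitrary choices for illustration; an encoder's real write pattern may differ.

```python
import os
import time

def time_null_writes(total_bytes, chunk_size=64 * 1024):
    """Write total_bytes to the null device in chunks; return (seconds, bytes written)."""
    chunk = b"\0" * chunk_size
    written = 0
    start = time.perf_counter()
    # buffering=0 so that every write() is passed straight to the OS
    with open(os.devnull, "wb", buffering=0) as sink:
        while written < total_bytes:
            written += sink.write(chunk)
    return time.perf_counter() - start, written

seconds, written = time_null_writes(16 * 1024 * 1024)
print(f"wrote {written} bytes to {os.devnull} in {seconds:.6f} sec")
```

If the result is far below the spread of the timer's own self-test, subtracting it would not change the reported figure.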

QUOTE(Synthetic Soul @ May 29 2006, 13:52) *
QUOTE(Liisachan @ May 28 2006, 03:12) *
As for the reading delay, I'm clueless. Just hypothetically, if stdin/out was supported (like lame's "-"), we could use a simple pipeline trick ("BakaInput") which would first load the file into RAM, and then feed it to stdio (should be much faster than reading directly from HD). The only tricky part would be, to make sure for Windows not to use a swap. But the fact is, not every encoder supports STDIO, right? I'm gonna have to think about it more...
Is there a good tool that creates a virtual drive on RAM?
That sounds interesting, but as you say I guess you can't rely on the swap file not coming into play. A timer with a built-in RAM disk sounds pretty cool though. I have seen mention of "RAMDisk XP" in the MP3 repacker thread, but that's the first I've heard of a RAM disk.
The result will be a lot slower than the NUL trick above. There is significant overhead in updating directory entries and allocation tables, especially if the RAM disk is configured to use small blocks, as a disk flush happens on block boundaries. Further, the NUL device, IIRC, is not cached, thus removing another source of overhead.

If your RAM is big enough for a RAM disk to contain the files, there's a chance that Windows' file cache will be big enough to not flush, thus performance may be similar with or without a RAM disk. You can tune Windows' file cache size and behavior (e.g. laziness of flushing) with CachemanXP.

Oh, and Windows also sometimes swaps for no reason other than that a DLL has expired (i.e. no process has used it for some time). The problem is when the DLL is part of the System Executive. Later on, a process will ask for that DLL, and it must be reloaded from the hard disk. (This behavior is tunable; CachemanXP can configure Windows to 'lock' the System Executive in RAM. Not recommended for < 512 MB RAM).

QUOTE(Synthetic Soul @ May 29 2006, 13:52) *
QUOTE(Liisachan @ May 28 2006, 03:12) *
So... in theory, using a lot of relatively small samples might be a good idea here. If the file is huge, we will be confused by IO in 2 ways: (1) the physical r/w speed and (2) the tool's i/o algo.
A tool might be created so that it uses least memory (suitable for HW support), and if so, the IO buffer should be very small in size which makes it look slower than another memory-eating app. What is the fair comparison of compression speed? is a rather complicated question to answer... Perhaps we should also check the memory usage, not only speed and compression ratio.
Some interesting points. It sounds like you should be speaking to Josef Pohm! Thomas Becker briefly discussed encoder and IO frame sizes in this post in the Yalac thread.
Remember that small samples mean the overhead can be significant.
Liisachan
@pepoluan
Thanks for the clarification.
I'm guessing the real pest is HD writing; HD reading cost is relatively uniform because the same file is always similarly fragmented. Basically.
So if outputting to NUL is an OK method, that's really nice.

As for inputting, I'm thinking about this.
Basically, the timer app would make a tmp file, copy input.wav into it, and then run the encoder with -input tmp.wav -output NUL.
The point is, you can use FILE_ATTRIBUTE_TEMPORARY when calling CreateFile,
and then, "A file is being used for temporary storage. File systems avoid writing data back to mass storage if sufficient cache memory is available, because an application deletes a temporary file after a handle is closed. In that case, the system can entirely avoid writing the data. Otherwise, the data is written after the handle is closed."

Sounds like a hackable feature for our purpose, no?

Edit--if "the data is written after the handle is closed" ideally we'd like not to close the handle on purpose even after Copy is done, but I guess then tmp.wav can't be opened from the Encoder... needs some testing.
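
A hedged sketch of the temporary-copy step in Python. It relies on `os.O_SHORT_LIVED`, a flag Python exposes only on Windows; my understanding is that the C runtime turns it into FILE_ATTRIBUTE_TEMPORARY when it calls CreateFile, but treat that mapping as an assumption to verify. On other systems the flag is simply skipped, so the code still runs as an ordinary copy. The function name and file names are made up for illustration.

```python
import os
import shutil
import tempfile

def copy_to_short_lived(src_path, dst_path):
    """Copy src_path to dst_path, hinting that dst_path is temporary where possible."""
    flags = (os.O_WRONLY | os.O_CREAT | os.O_TRUNC
             | getattr(os, "O_BINARY", 0)         # Windows only, 0 elsewhere
             | getattr(os, "O_SHORT_LIVED", 0))   # Windows only, 0 elsewhere
    fd = os.open(dst_path, flags, 0o600)
    with os.fdopen(fd, "wb") as dst, open(src_path, "rb") as src:
        shutil.copyfileobj(src, dst)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.wav")
    dst = os.path.join(tmp, "tmp.wav")
    with open(src, "wb") as f:
        f.write(b"RIFF" + bytes(1024))            # dummy bytes, not a valid WAV
    copy_to_short_lived(src, dst)
    with open(src, "rb") as a, open(dst, "rb") as b:
        same = a.read() == b.read()

print("copy matches source:", same)
```

Note that this sketch closes the handle once the copy is done, which is exactly the case the edit above worries about, so whether the data then stays in cache would still need testing.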