Full Version: reFLAC.py
Sunfall
I recently wrote a Python script for re-encoding a FLAC collection. It has only been tested on my own computer, so it may eat your data, blah blah blah. Make sure you test it on a small (copied!) subset of your files before you let it loose on your entire collection.

That said, it tries to be as safe as possible when it comes to replacing files. Following Josh's advice in posts here, it does a flac -t before anything else, and will not attempt to re-encode the file if that test fails. If anything bad happens during re-encoding, it will print an 'ERROR:' and not replace the file either. It does its work in a temporary file named 'QQQreFLACQQQ.flac', so it's easy to find if something strange happens. (It should also handle kill and Ctrl-C correctly, cleaning up before aborting.)
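For anyone who wants to see the shape of that flow before reading the real script, here is a minimal sketch. It is not the actual reFLAC.py code: the function names, the return values, the FLAC path and the "only switch if smaller" check are assumptions for illustration, and it is written for a current Python 3 rather than 2.4. The `run` argument is there only so the flow can be tried without a flac binary.

```python
import os
import subprocess

FLAC = "/usr/bin/flac"           # assumed location; the real script takes this as a parameter
TEMP_NAME = "QQQreFLACQQQ.flac"  # temp file name mentioned in the post


def run_quiet(args):
    """Run a command, discard its output, and return its exit code."""
    return subprocess.call(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def reflac_one(path, run=run_quiet):
    """Re-encode one file; return 'skipped', 'error', 'kept' or 'switched'."""
    temp = os.path.join(os.path.dirname(os.path.abspath(path)), TEMP_NAME)
    # Step 1: check that the existing file decodes cleanly before touching it.
    if run([FLAC, "-s", "-t", path]) != 0:
        return "skipped"
    try:
        # Step 2: encode into the temp file; the original is untouched so far.
        if run([FLAC, "-s", "--best", "-f", "-o", temp, path]) != 0:
            print("ERROR: re-encode failed for %s" % path)
            return "error"
        # Step 3: only switch when the new file is actually smaller (assumed policy).
        if os.path.getsize(temp) >= os.path.getsize(path):
            return "kept"
        os.replace(temp, path)  # rename within the same directory
        return "switched"
    finally:
        # Runs on errors and on Ctrl-C (KeyboardInterrupt) alike, so no temp
        # file is left behind. A plain kill would need a signal handler as well.
        if os.path.exists(temp):
            os.remove(temp)
```

Keeping the temp file in the same directory as the original means the final step is a rename on one filesystem, so the original is either fully replaced or left alone.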

This has only been tested on Linux, but the only hardcoded path is the location of the FLAC executable, and that can be changed via a command-line parameter. It /should/ work on Windows (with a Python interpreter, obviously), but YMMV. It does use libraries that are new in Python 2.4, so older Pythons won't work.

You can run reFLAC.py -h to get a list of all of the parameters. It keeps a state file (default reFLAC.state) that lets it pick up where an aborted run left off, which is nice for letting it run overnight but killing it when you're ripping new CDs, etc. reFLAC.state is written to your current directory by default, so either always run the script from the same place or change the state file's location via the command line. The .state file is itself relocatable, though, as it stores absolute paths for files.
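A rough sketch of how a resumable state file like this can work is below. The real on-disk format of reFLAC.state is not described here, so the JSON list, the function names and the scratch-file suffix are all assumptions for illustration (and this is modern Python, not 2.4).

```python
import json
import os

STATE_FILE = "reFLAC.state"  # default name from the post; relative to the current directory


def load_state(state_path=STATE_FILE):
    """Return the set of absolute paths already processed (empty on a first run)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as f:
        return set(json.load(f))


def save_state(done, state_path=STATE_FILE):
    """Write the state to a scratch file, then rename it over the old state."""
    scratch = state_path + ".tmp"  # hypothetical scratch name
    with open(scratch, "w") as f:
        json.dump(sorted(done), f)
    os.replace(scratch, state_path)


def pending(files, done):
    """Yield absolute paths of files that are not yet recorded as done."""
    for name in files:
        full = os.path.abspath(name)
        if full not in done:
            yield full
```

Because every entry is an absolute path, the state file can be moved anywhere and still refer to the same audio files, which matches the "relocatable" remark above.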

I've been running this across my entire collection for a day or so now (and there are many, many days left for it to run before it's done). I already fixed a bug due to my using rather talkative options to FLAC, which were filling up the stdout pipe.
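For the curious, this is the general class of bug: if a child process writes more to a pipe than the pipe buffer holds and the parent only waits for it to exit, both sides block forever. The post does not say how the script was fixed, so the sketch below is just one common way to avoid the problem, with a made-up helper name and a stand-in "chatty" command in place of flac.

```python
import subprocess
import sys


def run_capturing(args):
    """Run a command and collect its output without risking a full pipe."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() keeps reading until the child exits, so the child never
    # blocks on a full pipe the way it can with a bare proc.wait().
    output, _ = proc.communicate()
    return proc.returncode, output


# Stand-in for a talkative encoder: prints far more than a pipe buffer holds.
chatty = [sys.executable, "-c", "print('x' * 1000000)"]
code, out = run_capturing(chatty)
print(code, len(out))
```

If the output is not needed at all, sending it to `subprocess.DEVNULL` instead of a pipe sidesteps the issue entirely.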

BIG NOTE: Those of you in locales where FLAC 1.1.3 has a bug should change REFLAC_PARAMETERS to add the proper tukey() calls. The same variable is the place to look for anyone who wants more compression than --best provides.

Once again, this script is provided with no warranty, may destroy data, and so on. That said, feel free to use it! It may ease some people's recompression process.

reFLAC.py

EDIT NOTE: This is a new version of reFLAC.py. The only code change is the addition of a REFLAC_PARAMETERS variable at the top that makes it easier to change the parameters that get passed to flac, so those of you who want to do super-duper-compression can do it without digging into the guts of the code. There are also a lot more comments scattered throughout the code and describing the variables, for those of you who like to poke at stuff before you blindly run it. (Something I highly recommend, by the way.)
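As a sketch of the idea (not the script's actual code), a parameter list at the top of the file can feed straight into the command line. The default value shown, the FLAC path and the `build_command` helper are hypothetical; only the variable name REFLAC_PARAMETERS and the --best baseline come from this thread.

```python
# Hypothetical contents: the post only says the variable exists and that
# --best is the baseline.
REFLAC_PARAMETERS = ["--best"]
FLAC = "/usr/bin/flac"  # assumed path


def build_command(source, target, extra=REFLAC_PARAMETERS):
    """Assemble the encoder command line from the tunable parameter list."""
    return [FLAC] + list(extra) + ["-f", "-o", target, source]


print(build_command("in.flac", "QQQreFLACQQQ.flac"))
```

Extra encoder options then only need to be added in one place, for example `build_command(src, dst, extra=["--best", "-p"])`.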

This has been running across my collection all weekend and has saved me a good gig and a half so far, with no problems.
Sunfall
If you were holding off for a complete run on at least one system, mine finished this morning with these lines:

CODE
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Settle Down - Zwan [Mary Star of the Sea - Track 02].flac] Re-encoding ...
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Settle Down - Zwan [Mary Star of the Sea - Track 02].flac] 38858628 -> 38455421; switching.
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Yeah! - Zwan [Mary Star of the Sea - Track 11].flac] Considering ...
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Yeah! - Zwan [Mary Star of the Sea - Track 11].flac] Testing ...
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Yeah! - Zwan [Mary Star of the Sea - Track 11].flac] Re-encoding ...
reFLAC: [/mnt/media/audio-archive/Z/Zwan/Mary Star of the Sea/Yeah! - Zwan [Mary Star of the Sea - Track 11].flac] 21649013 -> 21421429; switching.
reFLAC: Writing state.
reFLAC: Shrank 557100087942 bytes to 551910401229 bytes.
reFLAC: Writing state.
reFLAC: Shrank 557100087942 bytes to 551910401229 bytes.


So, yay, I saved ~5G of space. I've since kicked off a run of "flac -st" across all of the files, Just In Case, but I'm not anticipating any errors. And I've rebuilt my XMMS playlist, which is telling me that everything's (as expected) the same length.
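A verification pass like that can be scripted in a few lines. This is only a sketch under assumptions: the FLAC path and the helper names are made up, and the `run` argument exists so the walk can be tried without a flac binary.

```python
import os
import subprocess

FLAC = "/usr/bin/flac"  # assumed path


def find_flacs(root):
    """Yield every .flac file under root, in a stable order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # makes os.walk descend in sorted order
        for name in sorted(filenames):
            if name.lower().endswith(".flac"):
                yield os.path.join(dirpath, name)


def verify_tree(root, run=subprocess.call):
    """Run 'flac -st' on each file and return the paths that failed."""
    return [p for p in find_flacs(root) if run([FLAC, "-st", p]) != 0]
```

Calling `verify_tree("/mnt/media/audio-archive")` would return an empty list if every file decodes cleanly.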

I pronounce reFLAC.py field-tested and Sunfall-approved. (It still might eat your data, though.)
egd
Not sure if you're still around, but if you are, any chance you've since turned this into a multi-threaded app to take advantage of dual- and quad-core CPUs?
patmcg
Hey, I have a similar script, although it is not set up for FLAC->FLAC yet. It wouldn't be hard to enable. The good thing is that all the multithreading part is done. Let me know if you are interested, and I'll see if I can find some time to update it.
Preuss
QUOTE (patmcg @ Sep 12 2009, 10:08) *
Hey, I have a similar script, although it is not set up for FLAC->FLAC yet. It wouldn't be hard to enable. The good thing is that all the multithreading part is done. Let me know if you are interested, and I'll see if I can find some time to update it.


Is this also written in Python?
patmcg
yes
egd
QUOTE (patmcg @ Sep 12 2009, 19:08) *
Hey, I have a similar script, although it is not set up for FLAC->FLAC yet. It wouldn't be hard to enable. The good thing is that all the multithreading part is done. Let me know if you are interested, and I'll see if I can find some time to update it.

If you've the time and inclination I'd love to see a multi-threaded version. Thx.
Preuss
I would also be interested in seeing a multithreaded app that takes advantage of my dual-core CPU.
Sunfall
QUOTE (egd @ Sep 12 2009, 00:07) *
Not sure if you're still around, but if you are, any chance you've since turned this into a multi-threaded app to take advantage of dual- and quad-core CPUs?


No, I haven't. And that's actually intentional; the problem with running more than one encode at a time is that the files end up fragmented on the disk. (And don't let the people who tell you that Linux files don't fragment fool you. They really do.) It'd be pretty trivial to turn this app multithreaded, but for disk consistency/safety reasons I'd rather not.

The other problem is that if you have a power outage mid-"flight" you have the potential for losing even more files.

Yes, I'm that paranoid.