Every so often I notice that my iTunes library has a lot of duplicate items and I need to get rid of things. It’s mostly a task for finding an audio file that has the same name as another audio file but with a space and digit added right before the extension. Song.mp3 and Song 1.mp3 are examples of this problem. They’ll be in the same directory; duplicates in different directories are a different matter and I’ve done that too by comparing file digests. That’s not this problem.
Here’s what I did. It’s not pretty because I didn’t care to make it so. It’s not exciting because I wasn’t researching anything or exploring an idea. It’s just getting rid of files. I solved the problem and moved on.
I develop these little tools incrementally. Can I get all the files? Can I find the files that have the name without the space and digit. I program a step and check the result. Then I add the next step. The process drives the procedural structure.
And during this process I forked then completely rewrote a File::Find module. You don’t need my version for this though. The one zef install File::Find
gives you should be fine.
A few quick notes:
- I really like the
#`( ... )
comments - An array interpolated into a pattern is an alternation of the elements
.IO
objects know how to make or tear apart paths.
#!/Applications/Rakudo/bin/perl6 use File::Find:auth<bdfoy>; # https://github.com/briandfoy/perl6-file-find #`( iTunes loves screwing up files. I get a file imported multiple times instead of realizing it's already in there. some song.mp3 some song 1.mp3 Let's find these pairs where the names differ by the addition of a space and a single digit (although I've had problems like this with as high as 4. ) my $dir = '/Users/brian/Dropbox/iTunes/Music'; my $target = '/Users/brian/BackupMusic'; #`( I actually used this module to count extensions and these are the one I want to focus on. An array in a regex is an alternation. ) my @extensions = <mp3 m4a="" m4p="">; my $sequence := find( dir => $dir, name => / \h '1' '.' @extensions $ /, ); my $count = 0; my $dry-run = True; # try it before we move any files for $sequence -> $file { #`( other = the file of the same name without a numbered copy if that file doesn't exist we don't have a problem ) my $other = $file.subst: / \h '1' '.' (@extensions) $ /, ".$0"; my $exists = $other.IO.e; next unless $exists; $count++; put '-' x 50; put "file: $file"; put "other: $other ($exists)"; #`( We need the part after the starting directory because we'll add that to the new target directory. We might have to make a subdir tree ) my $rel = $file.IO.relative: $dir; my $new = $rel.IO.absolute: $target; my $new-dir = $new.IO.parent.IO; $new-dir.mkdir unless $new-dir.e; $other.IO.rename( $new ) unless $dry-run; } say "Found $count files";
Here’s the program I used to survey the file extensions I used to populate @extensions
:
use File::Find:auth<BDFOY>; # https://github.com/briandfoy/perl6-file-find use PrettyDump; my $list := find( dir => $*HOME.child( 'Dropbox/iTunes/Music' ) ); my %extensions; for $list -> $item { next if $item.d; %extensions{ $item.extension }++ } pd %extensions;
And here’s what I had done the last time I had this problem. It’s Perl 5 interpreting the results of an external find
. I think last time I ended up throwing an unlink
in there at some point:
use v5.14; open my $fh, 'find . -name "* [1234].m[p4][ap3]" |'; while( <$fh> ) { chomp; my $other = s/\s+(\d+)(?=\.[^.]+\z)//r; next unless -e $other; print "$_\n\t$other\n"; } close $fh or die "Error in Find!"
I’ll probably lose my Perl 6 program, forget I wrote this, and recreate this in nine months. I might even post about it again.
As stated here one should not use $*SPEC directly.
You’re right.
$*SPEC
is deprecated in 6.d and is scheduled for removal in 6.e. I already have bad habits in Perl 6!