This is an advance chapter of Learning Perl 6 by Randal L. Schwartz and brian d foy. It is incomplete and may contain merely notes or an outline for future work. It may also contain sections of their writing from other sources, although these are noted where possible.
This work is copyrighted under a contract between O'Reilly Media and the authors, and you cannot repost it or distribute it without permission.
control structures: if, unless
for
pointy operator, ->
scalars
arrays
say, print, print
invocant
topicalize
So far we've shown you how to read input from the user through standard input and send information to standard output. Now we'll show you the more general form that lets us create files and read from them again.
A filehandle is our connection to the information inside a file. It's separate from the file and the filename, and we have to make the connection in our program before we can use it. The filehandle itself is just another sort of scalar variable.
Perl 6 already comes with special filehandles that we can use immediately without any setup on our part. Table XXX shows the
filehandles that Perl 6 creates automatically. We divide these filehandles into two groups: those that are readable, which we get data
from, and those that are writeable, which we send data to. Either way, the * in the variable name
reminds us that these are global variables:
Table XXX: Global filehandles
Readable
    $*IN        standard input
    $*DATA      virtual filehandle for DATA section
    $*ARGS      command line files
Writeable
    $*OUT       standard output
    $*ERR       standard error
    $*ARGVOUT   output for in place editing
Previously, we omitted the filehandle variable with say and print because both of those use $*OUT by default. We can use $*OUT explicitly to do the same thing. We place a colon after the filehandle name to note that it's the
invocant of the operation:
say $*OUT: "XXX";
That might look a little bit better in the object notation:
$*OUT.say( "XXX" );
When we want to send the output somewhere besides standard output, we use a different filehandle to connect us to that
destination. We could send the data to standard error instead by using the $*ERR filehandle:
say $*ERR: "XXX";
If you're running this from a terminal, it probably looks like the output is going to the same place because it is. Unless we do something special, a terminal merges both standard output and standard error and displays them together. From the command line we can redirect standard output to a file, and the standard error output still shows up on the screen:
% script.p6 > output.txt
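To see that the two streams really are separate, here is a minimal shell sketch. The emit function is a hypothetical stand-in for our script (it is not part of Perl 6 or of anything above), and the file names are made up for the illustration. It sends one line to each stream, and we redirect each stream to its own file:

```shell
#!/bin/sh
# Hypothetical stand-in for a script that writes to both streams.
emit() {
    echo "to standard output"
    echo "to standard error" >&2
}

# Work in a scratch directory so we don't clobber real files.
workdir=$(mktemp -d)

# > redirects standard output, 2> redirects standard error.
emit > "$workdir/output.txt" 2> "$workdir/errors.txt"

out=$(cat "$workdir/output.txt")
err=$(cat "$workdir/errors.txt")
rm -rf "$workdir"

echo "output.txt held: $out"
echo "errors.txt held: $err"
```

Each file ends up with only the line sent to its stream, even though both lines would have appeared together on the terminal.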
When we want to send data to a file directly, we have to create a filehandle that we can use as a connection to that file. The
open built-in does that for us. We tell it the name of the file that we want to open and how we'd
like to open it. Table XXX shows the different modes that we can use to open a file.
my $fh = open $filename, $mode;
We want to open a file for writing so we can send our output to it, so we'll use the :w mode.
That's another sort of adverb like we discussed in Chapter XXX.
my $output = open $destination, :w;
If a file with that name doesn't exist, open creates it for us. If a file with the same name
already exists, open still creates the filehandle, but it truncates the file, meaning that anything
that was already in the file is now gone, even if we don't use the filehandle (so don't try this on any file that has data you want
to keep).
If everything works and we're able to create the filehandle, we can use it immediately as the invocant:
say $output: "XXX";
When we're done with the filehandle, we get rid of it with close. That ensures that the last of
the data makes it into the file before we break the connection:
close $output;
XXX is there autoclose anymore?
Table XXX: Modes for opening files
:r open the file for reading, the default mode
:w open the file for writing, first truncating the file. Any previous content disappears
:a open the file for appending, adding new output to the end of the file
? read/write
It's possible that we might not be able to create a filehandle. Perhaps we don't have the right permission to create a new file,
or the file exists but is already in use. Before we use a filehandle we just created, we need to ensure that we actually created a
filehandle. If the open doesn't succeed, it returns undef instead
of a filehandle.
We can use the defined built-in to let us know if the open
worked. If $output is defined, we're okay, otherwise we don't have a filehandle. If we weren't
able to create the filehandle, we probably don't want to go on with our program. The die built-in
stops our program and sends a message to standard error. We can use the $! variable, which holds
the error message from the last system failure:
my $output = open $destination, :w;
unless defined $output {
    die "Could not open file $destination: $!";
}
We can also write this as a single statement using the err keyword from Chapter XXX. Perl 6
only continues through the err to the die if the result of the
open is undefined, which is the only way the open signals that it
failed.
my $output = open $destination, :w
    err die "Could not open file $destination: $!";
say $output: "XXX";
close $output;
The :w mode truncated the file if it already existed, but we don't always want that. If we want
to add data to the end of the file, we use the :a mode to open the file for appending.
my $append = open $destination, :a
    err die "Could not append to $destination: $!";
The file doesn't have to exist beforehand, though, since open will create it for us if it needs
to.
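The shell's > and >> redirections make the same distinction between truncating and appending, so we can try out the behavior without risking a real file. This is only a sketch by analogy, not Perl 6 code, and the log.txt name is made up for the illustration:

```shell
#!/bin/sh
workdir=$(mktemp -d)
file="$workdir/log.txt"

# > creates the file if it doesn't exist yet (like :w).
echo "first" > "$file"

# >> adds to the end of the existing file (like :a).
echo "second" >> "$file"
appended=$(cat "$file")

# > on an existing file truncates it first, so the old lines are gone.
echo "third" > "$file"
truncated=$(cat "$file")

rm -rf "$workdir"

echo "after appending:"
echo "$appended"
echo "after truncating:"
echo "$truncated"
```

After the append the file holds both lines. After the second truncating write only the newest line survives.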
To read from a file, we create a filehandle just as we did before, but we use the :r mode, checking the result to ensure the
open succeeded.
my $input = open $source, :r
    err die "Could not open file $source: $!";
We can then read a line from the file as we did with standard input. We use the = sign to get
the next line:
my $first_line = =$input;
In list context, the = returns all of the lines from the filehandle, although we normally don't
want to do this:
my @array = =$input;
We don't have to store it in an array to use the filehandle in list context, though. We could use the filehandle like an object. In that case we don't use the iterator notation:
say $input.reverse.join('');
More likely, we'll want to read it line by line. We can use the for control structure from
Chapter XXX with our filehandle iterator as the source, automatically topicalizing each line to save ourselves a bit of typing. As
before, reading a line of input automatically removes the newline on the end, so we have to use say to add the newline again when we want to output it.
for =$fh {
    say;
}
We can also use the pointy operator to put the next line into a variable of our choosing.
for =$fh -> $line {
    say $line;
}
Now we're able to copy one file to another. We open the source file for reading and the destination file for writing, checking the result each time to ensure that it worked. Once we have both filehandles, we read from the input and write it to the output:
my $input = open $source, :r
    err die "Could not open file $source for reading: $!";
my $output = open $destination, :w
    err die "Could not open file $destination for writing: $!";
for =$input -> $line {
    say $output: $line;
}
That's not very useful in itself, but it's the basic idea when we want to process a file line-by-line. We can add some simple processing. Just for fun, let's try an experiment. Some XXXpeople say that we can still understand words as long as the first and last letters are where they should be, even if the insides are scrambled.
for =$input -> $line {
    $line =~ s/\b([a-z])([a-z]+)([a-z])\b/$1 ~ ~ $2/;
    XXX; # do some processing
    say $output: $line;  # output the line only after processing it
}
Does it work? Can you still read the text?
The $*ARGS filehandle is a bit different from what we've shown you so far, where each
filehandle represented a connection to a single file. $*ARGS represents a connection to all
the files we specify on the command line. We can read from those files in sequence; when we finish reading the first file, the
$*ARGS filehandle automatically closes that file and opens the next file without us having to do
anything. Here's a program that simply prints all of the lines from the $*ARGS filehandle:
#!/usr/local/bin/pugs
for =$*ARGS {
    .say
};
If we don't specify any files on the command line, =$*ARGS reads from standard input
instead.
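The standard cat utility follows the same convention, which makes it a handy way to see the behavior from the shell. This is a sketch by analogy rather than Perl 6 code, and the file names are made up for the illustration:

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'a1\na2\n' > "$workdir/first.txt"
printf 'b1\n'     > "$workdir/second.txt"

# With file arguments, cat reads each file in turn as one stream.
from_files=$(cat "$workdir/first.txt" "$workdir/second.txt")

# With no arguments, cat reads standard input instead.
from_stdin=$(printf 'typed\n' | cat)

rm -rf "$workdir"

echo "$from_files"
echo "$from_stdin"
```

The lines of the first file come out before the lines of the second, with no sign of where one file ends and the next begins.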
XXX: I'm making this up hoping it comes true.XXX With the $*ARG variable, we can even tell which
file we're working on:
for =$*ARGS {
    say "$*ARG: $_";
}
Inside our script, we can have a virtual file that we can read with the $=DATA filehandle. The
= twigil reminds us that this variable is specific to this particular file[1].
We start this virtual DATA file with =begin DATA and
terminate it with =end. These special sequences have to show up at the beginning of a line,
where Perl is expecting a new statement.
#!/usr/local/bin/pugs
=begin DATA
1
2
3
4
5
=end
for =$=DATA {
    .say
}
Although this may seem a little strange, it turns out to be very useful. We don't have to put this virtual file in the middle of our script, and we don't even have to think of the file as a script. We like to think of such a file as a data file that just happens to have a script on top. Perhaps we have a long list of numbers in a file that we need to process:
XXXX
First, we mark the beginning of the data with =begin DATA, then make room for our script above
that. Since there is nothing after the data, we don't need the =end to terminate it. This can be
especially handy for quick-n-dirty or one-off scripts.
#!/usr/local/bin/pugs
# SCRIPT GOES HERE
=begin DATA
XXX
Next, we add the code we need to process the data. Let's filter the data to output only the odd numbers (that is, those for which
n % 2 is true). The for topicalizes each line from $=DATA.
#!/usr/local/bin/pugs
for =$=DATA {
    say if $_ % 2
}
=begin DATA
XXX
If we're going to put all the data at the end, we can also use the =begin END sequence. This
one doesn't need an =end END because it's, well, the end, and Perl isn't going to do anything else
with the stuff afterwards. We can still get it from the $=DATA filehandle:
#!/usr/local/bin/pugs
for =$=DATA {
    say if $_ % 2
}
=begin END
XXX
for =$handle :prompt('$ ') { say $_ + 1 }
In Perl 5, filehandles were a separate data type. In the Perl 4 and early Perl 5 days, a filehandle was a bareword, and the file mode was noted symbolically as part of the file name:
open( FILE, ">> $file" ) || die "Could not open $file: $!";
Later in Perl 5, filehandles could be scalars, as long as we started with a scalar that did not have a value, and we could also separate the file mode from the file name:
open my($fh), ">>", $file or die "Could not open $file: $!";
Perl 6 took this one step further by removing the bareword filehandles, which caused a lot of problems, and by making open return the filehandle instead of forcing us to create a variable for it to fill in. Since open returns undef on failure, we specifically use the new err short circuit operator to test for definedness instead of simply looking for false values:
my $fh = open $file, :a err die "Could not open $file: $!";
The default filehandles have also changed. They have new names in Perl 6, are scalar variables, and have a twigil to note their
scope. The Perl 5 __DATA__ and __END__ tokens that marked the section for use by DATA are now the pod =begin DATA.
Perl 5      Perl 6
STDOUT      $*OUT
STDERR      $*ERR
DATA        $=DATA
STDIN       $*IN
ARGV        $*ARGS
ARGVOUT     $*ARGVOUT
Perl 5's line-input, or diamond, operator <> is gone (mostly). Instead, we use the = in front of the filehandle name to create an iterator. When we read from a filehandle, Perl 6
automatically chomps the line ending too:
for =$=DATA { ... }
Curiously, the diamond operator is still around, but only as a sentimental stand-in for the old ARGV. We still read from it by
putting the = in front of it to make it an iterator:
for =<> { ... }
Writing to filehandles is a bit different in Perl 6 because there's a slight twist on the old Perl 5 notation. In Perl 5, print (and friends) was actually an indirect method call on the filehandle object, and that's why there's no comma after the filehandle:
print STDOUT "Hello Perl 5!";
In Perl 6, we mark the invocant (that's the filehandle in this case) with a trailing colon to denote that the first argument is the filehandle.
say $*OUT: "Hello Perl 6!";
Also, since the Perl 6 filehandle is also an object, we can avoid that syntax by calling methods on the filehandle instead:
$*OUT.say( "Hello Perl 6!" );
Synopsis 16: http://perlcabal.org/syn/S16.html