This is an advance chapter of Learning Perl 6 by Randal L. Schwartz and brian d foy. It is incomplete and may contain merely notes or an outline for future work. It may also contain sections of their writing from other sources, although these are noted where possible.

This work is copyrighted under a contract between O'Reilly Media and the authors, and you cannot repost it or distribute it without permission.

Prerequisites

control structures: if, unless

for

pointy operator, ->

scalars

arrays

say, print

invocant

topicalize

Filehandles

So far we've shown you how to read input from the user through standard input and send information to standard output. Now we'll show you the more general form that lets us create files and read from them again.

A filehandle is our connection to the information inside a file. It's separate from the file or the filename, and we have to make the connection in our program to be able to use it. That filehandle is another sort of scalar variable.

Global filehandles

Perl 6 already comes with special filehandles that we can use immediately without any setup on our part. Table XXX shows the filehandles that Perl 6 creates automatically. We divide these filehandles into two groups: those that are readable, which we get data from, and those that are writeable, which we send data to. Either way, the * in the variable name reminds us that these are global variables:

Table XXX: Global filehandles


Readable
$*IN		standard input
$*DATA		virtual filehandle for DATA section
$*ARGS		command line files

Writeable
$*OUT		standard output
$*ERR		standard error
$*ARGVOUT	output for in place editing

Previously, we omitted the filehandle variable with say and print because both of those use $*OUT by default. We can use $*OUT explicitly to do the same thing. We place a colon after the filehandle name to note that it's the invocant of the operation:


say $*OUT: "XXX";

That might look a little bit better in the object notation:


$*OUT.say( "XXX" );
 

When we want to send the output somewhere besides standard output, we use a different filehandle to connect us to that destination. We could send the data to standard error instead by using the $*ERR filehandle:


say $*ERR: "XXX";

If you're running this from a terminal, it probably looks like the output is going to the same place because it is. Unless we do something special, a terminal merges both standard output and standard error and displays them together. From the command line we can redirect standard output to a file, and the standard error output still shows up on the screen:


% script.p6 > output.txt
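To see the two streams separately, we can redirect each one to its own file. This is a minimal sketch for a POSIX shell, not something from the Perl 6 side: the report function is a hypothetical stand-in for script.p6 that writes one line to each stream, and the file names are placeholders.

```shell
# Hypothetical stand-in for script.p6: one line to standard output,
# one line to standard error.
report() {
    echo "to standard output"
    echo "to standard error" >&2
}

# > redirects standard output, 2> redirects standard error
report > output.txt 2> errors.txt

cat output.txt    # the line sent to standard output
cat errors.txt    # the line sent to standard error
```

With only the > redirection, the second line would still show up on the screen, which is the behavior described above.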

Writing to a file

When we want to send data to a file directly, we have to create a filehandle that we can use as a connection to that file. The open built-in does that for us. We tell it the name of the file that we want to open and how we'd like to open it. Table XXX shows the different modes that we can use to open a file.


my $fh = open $filename, $mode;

We want to open a file for writing so we can send our output to it, so we'll use the :w mode. That's another sort of adverb like we discussed in Chapter XXX.


my $output = open $destination, :w;

If a file with that name doesn't exist, open creates it for us. If a file with the same name already exists, open still creates the filehandle, but it truncates the file, meaning that anything that was already in the file is now gone, even if we don't use the filehandle (so don't try this on any file that has data you want to keep).

If everything works and we're able to create the filehandle, we can use it immediately as the invocant:


say $output: "XXX";

When we're done with the filehandle, we get rid of it with close. That ensures that the last of the data makes it into the file before we break the connection:


close $output;

XXX is there autoclose anymore?

Table XXX: Modes for opening files


:r		open the file for reading, the default mode

:w		open the file for writing, first truncating the file. Any previous content disappears

:a		open the file for appending, adding new output to the end of the file

? 		read/write

Handling errors

It's possible that we might not be able to create a filehandle. Perhaps we don't have the right permission to create a new file, or the file exists but is already in use. Before we use a filehandle we just created, we need to ensure that we actually created a filehandle. If the open doesn't succeed, it returns undef instead of a filehandle.

We can use the defined built-in to let us know if the open worked. If $output is defined, we're okay; otherwise, we don't have a filehandle. If we weren't able to create the filehandle, we probably don't want to go on with our program. The die built-in stops our program and sends a message to standard error. We can use the $! variable, which holds the error message from the last system failure:


my $output = open $destination, :w;

unless defined $output {
    die "Could not open file $destination: $!";
    }

We can also write this as a single statement using the err keyword from Chapter XXX. Perl 6 only continues through the err to the die if the result of the open is undefined, which is the only way the open signals that it failed.


my $output = open $destination, :w
    err die "Could not open file $destination: $!";

say $output: "XXX";

close $output;

Appending to a file

The :w mode truncated the file if it already existed, but we don't always want that. If we want to add data to the end of the file, we use the :a mode to open the file for appending.


my $append = open $destination, :a
    err die "Could not append to $destination: $!";
    

The file doesn't have to exist beforehand, though, since open will create it for us if it needs to.
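The shell's redirection operators make the same distinction, which may help if they're already familiar. This sketch is only an analogy to the :w and :a modes, not Perl 6 code, and the file names are made up for the illustration.

```shell
# Start with a file that already has one line in it (hypothetical name)
echo "first" > log.txt

# >> appends, like the :a mode: the old line survives
echo "second" >> log.txt
cp log.txt appended.txt    # keep a copy of the appended state

# > truncates first, like the :w mode: the old lines are gone
echo "third" > log.txt
```

After this runs, appended.txt holds both of the first two lines, while log.txt holds only the last one.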

Reading from filehandles

To read from a file, we create a filehandle just as we did before, but we use the :r mode, checking the result to ensure the open succeeded.


my $input = open $source, :r
    err die "Could not open file $source: $!";

We can then read a line from the file as we did with standard input. We use the = sign to get the next line:


my $first_line = =$input;

In list context, the = returns all of the lines from the filehandle, although we normally don't want to do this:


my @array = =$input;

We don't have to store it in an array to use the filehandle in list context, though. We could use the filehandle like an object. In that case we don't use the iterator notation:


say $input.reverse.join('');

More likely, we'll want to read it line by line. We can use the for control structure from Chapter XXX with our filehandle iterator as the source, automatically topicalizing each line to save ourselves a bit of typing. As before, reading a line of input automatically removes the newline on the end, so we have to use say to add the newline again when we want to output it.


for =$fh {
    say;
    }
    

We can also use the pointy operator to put the next line into a variable of our choosing.


for =$fh -> $line {
    say $line;
    }
    

Now we're able to copy one file to another. We open the source file for reading and the destination file for writing, checking the result each time to ensure that it worked. Once we have both filehandles, we read from the input and write it to the output:


my $input = open $source, :r
    err die "Could not open file $source for reading: $!";

my $output = open $destination, :w
    err die "Could not open file $destination for writing: $!";

for =$input -> $line {
    say $output: $line;
    }
    

That's not very useful in itself, but it's the basic idea when we want to process a file line-by-line. We can add some simple processing. Just for fun, let's try an experiment. Some XXXpeople say that we can still understand words as long as the first and last letters are where they should be, even if the insides are scrambled.


for =$input -> $line {
    say $output: $line;
    $line =~ s/\b([a-z])([a-z]+)([a-z])\b/$1 ~  ~ $2/
    XXX; # do some processing
    }

Does it work? Can you still read the text?

Reading from the command line arguments

The $*ARGS filehandle is a bit different from what we've shown you so far, where each filehandle represented a connection to a single file. $*ARGS represents a connection to all the files we specify on the command line. We can read from those files in sequence; when we finish reading the first file, the $*ARGS filehandle automatically closes that file and opens the next file without us having to do anything. Here's a program to simply print all of the lines from the $*ARGS filehandle:


#!/usr/local/bin/pugs

for =$*ARGS {
    .say
    };

We

If we don't specify any files on the command line, =$*ARGS reads from standard input instead.
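That's the same behavior as the Unix cat command, so we can sketch it in the shell without any Perl 6 at all. The file names here are made up for the illustration, and cat is only standing in for the program above.

```shell
# Two small files to read (hypothetical names)
printf 'one\ntwo\n' > first.txt
printf 'three\n' > second.txt

# With file arguments, cat reads each file in order, as $*ARGS does
cat first.txt second.txt > from_files.txt

# With no file arguments, cat reads standard input instead
printf 'four\n' | cat > from_stdin.txt
```

The first result holds the lines of both files in command-line order; the second holds only what arrived on standard input.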

XXX: I'm making this up hoping it comes true.XXX With the $*ARG variable, we can even tell which file we're working on:


for =$*ARGS {
    say "$*ARG: $_";
    }

The DATA filehandle

Inside our script, we can have a virtual file that we can read with the $=DATA filehandle. The = twigil reminds us that this variable is specific to this particular file[1].

We start this virtual DATA file with the =begin DATA and terminate it with =end. These special sequences have to show up at the beginning of the line and when Perl is expecting a new statement.


#!/usr/local/bin/pugs

=begin DATA
1
2
3
4
5
=end

for =$=DATA {
    .say
    }

Although this may seem a little strange, it turns out to be very useful. We don't have to put this virtual file in the middle of our script, and we don't even have to think of the file as a script. We like to consider such files as data files that just happen to have a script on top. Perhaps we have a long list of numbers in a file that we need to process:


XXXX

First, we mark the beginning of the data with =begin DATA, then make room for our script above that. Since there is nothing after the data, we don't need the =end to terminate it. This can be especially handy for quick-n-dirty or one-off scripts.


#!/usr/local/bin/pugs

# SCRIPT GOES HERE

=begin DATA
XXX

Next, we add the code we need to process the data. Let's filter the data to output only the odd numbers (so, those for which n % 2 is true). The for topicalizes each line from $=DATA.


#!/usr/local/bin/pugs

for =$=DATA {
    say if $_ % 2
    }

=begin DATA
XXX

If we're going to put all the data at the end, we can also use the =begin END sequence. This one doesn't need an =end END because it's, well, the end, and Perl isn't going to do anything else with the stuff afterwards. We can still get it from the $=DATA filehandle:


#!/usr/local/bin/pugs

for =$=DATA {
    say if $_ % 2
    }

=begin END
XXX

Prompts

for =$handle :prompt('$ ') { say $_ + 1 }

Changing the default filehandle

Differences to Perl 5

In Perl 5, filehandles were a separate data type. In the Perl 4 and early Perl 5 days, a filehandle was a bareword, and the file mode was noted symbolically as part of the file name:


open( FILE, ">> $file" ) || die "Could not open $file: $!";

Later in Perl 5, filehandles could be scalars, as long as we started with a scalar that did not have a value, and we could also separate the file mode from the file name:


open my($fh), ">>", $file or die "Could not open $file: $!";

Perl 6 took this one step further by removing the bareword filehandles, which caused a lot of problems, and by making open return the filehandle instead of forcing us to create a variable for it to fill in. Since open returns undef on failure, we specifically use the new err short-circuit operator to test for definedness instead of simply looking for false values:


my $fh = open $file, :a err die "Could not open $file: $!";

The default filehandles have also changed. They have new names in Perl 6, are scalar variables, and have a twigil to note their scope. The Perl 5 __DATA__ and __END__ tokens that marked the section for use by DATA are now the pod =begin DATA.


Perl 5			Perl 6

STDOUT			$*OUT
STDERR			$*ERR
DATA			$=DATA
STDIN			$*IN
ARGV			$*ARGS
ARGVOUT			$*ARGVOUT

Perl 5's line-input, or diamond, operator <> is gone (mostly). Instead, we use the = in front of the filehandle name to create an iterator. When we read from a filehandle, Perl 6 automatically chomps the line ending too:


for =$=DATA { ... }

Curiously, the diamond operator is still around, but only as a sentimental stand-in for the old ARGV. We still read from it by putting the = in front of it to make it an iterator:


for =<> { ... } 

Writing to filehandles is a bit different in Perl 6 because there's a slight twist on the old Perl 5 notation. In Perl 5, the print (and friends) was actually an indirect method call on the filehandle object, and that's why there's no comma after the filehandle:


print STDOUT "Hello Perl 5!";

In Perl 6, we mark the invocant (that's the filehandle in this case) with a trailing colon to denote that the first argument is the filehandle.


say $*OUT: "Hello Perl 6!";

Since the Perl 6 filehandle is also an object, we can avoid that syntax by calling methods on the filehandle instead:


$*OUT.say( "Hello Perl 6!" );

Further Reading

Synopsis 16: http://perlcabal.org/syn/S16.html

Exercises

  1. 1.
  2. 2.
  3. 3.
  4. 4.