Input from Standard Input
chomp($line = <STDIN>); # read the next line and chomp it
Since the line-input operator will return undef when you reach end-of-file, this is handy for dropping out of loops:
while (defined($line = <STDIN>)) {
    print "I saw $line";
}
A shortcut for the loop above:
while (<STDIN>) {
    print "I saw $_";
}
Now, before we go any further, we must be very clear about something: this shortcut works only if you write it just as we did. If you put a line-input operator anywhere else (in particular, as a statement all on its own), it won’t read a line into $_ by default. It works only if there’s nothing but the line-input operator in the conditional of a while loop. If you put anything else into the conditional expression, this shortcut won’t apply.
Evaluating the line-input operator in a list context gives you all of the (remaining) lines of input as a list—each element of the list is one line:
foreach (<STDIN>) {
    print "I saw $_";
}
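The difference between the two contexts can be seen in a minimal sketch. It assumes an in-memory filehandle (the hypothetical name INPUT, opened on a string) standing in for STDIN, so that it runs without keyboard input:

```perl
use strict;
use warnings;

# Assumption: INPUT is an in-memory filehandle standing in for STDIN.
my $text = "fred\nbarney\nbetty\n";
open INPUT, '<', \$text or die "Cannot open in-memory input: $!";

my $first = <INPUT>;    # scalar context: reads exactly one line
my @rest  = <INPUT>;    # list context: reads every remaining line
close INPUT;

chomp($first);
chomp(@rest);

print "first line: $first\n";          # first line: fred
print "remaining lines: @rest\n";      # remaining lines: barney betty
```

Note that the list-context read consumes all remaining input before the next statement runs, which is why the while form is usually preferable for large inputs.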
Input from the Diamond Operator
Another way to read input is with the diamond operator: <>. This is useful for making programs that work like standard Unix utilities, with respect to the invocation arguments.
$ ./my_program fred barney betty
That command means to run the command my_program (which will be found in the current directory), and that the program should process file fred, followed by file barney, followed by file betty.
If you give no invocation arguments, the program should process the standard input stream. Or, as a special case, if you give just a hyphen as one of the arguments, that means standard input as well. So, if the invocation arguments had been fred - betty, that would have meant that the program should process file fred, followed by the standard input stream, followed by file betty.
The diamond operator is actually a special kind of line-input operator. But instead of getting the input from the keyboard, it comes from the user’s choice of input:
while (defined($line = <>)) {
    chomp($line);
    print "It was $line that I saw!\n";
}
In fact, since this is just a special kind of line-input operator, we may use the same shortcut we saw earlier to read the input into $_ by default:
while (<>) {
    chomp;
    print "It was $_ that I saw!\n";
}
This works like the loop above, but with less typing. And you may have noticed that we’re using the default for chomp; without an argument, chomp will work on $_. Every little bit of saved typing helps!
Since the diamond operator is generally being used to process all of the input, it’s typically a mistake to use it in more than one place in your program.
The Invocation Arguments
Technically, the diamond operator isn’t literally looking at the invocation arguments; it works from the @ARGV array. This special array is preset by the Perl interpreter as the list of the invocation arguments. In other words, it is just like any other array (except for its funny, all-caps name), but when your program starts, @ARGV is already stuffed full of the list of invocation arguments.
The diamond operator looks in @ARGV to determine what filenames it should use. If it finds an empty list, it uses the standard input stream; otherwise it uses the list of files that it finds. This means that after your program starts and before you start using the diamond, you’ve got a chance to tinker with @ARGV. For example, here we can process three specific files, regardless of what the user chose on the command line:
@ARGV = qw# larry moe curly #; # force these three files to be read
while (<>) {
    chomp;
    print "It was $_ that I saw in some stooge-like file!\n";
}
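To try this without having files named larry, moe, and curly lying around, a sketch along the following lines may help. It assumes the core File::Temp module to create a scratch directory; the file contents and the @seen array are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory holds three small sample files.
my $dir = tempdir(CLEANUP => 1);

@ARGV = ();    # ignore whatever the user gave on the command line
foreach my $name (qw( larry moe curly )) {
    my $path = "$dir/$name";
    open OUT, '>', $path or die "Cannot create $path: $!";
    print OUT "a line from $name\n";
    close OUT;
    push @ARGV, $path;    # queue the file for the diamond operator
}

my @seen;
while (<>) {    # reads each file named in @ARGV, in order
    chomp;
    push @seen, $_;
    print "It was $_ that I saw!\n";
}
```

The diamond operator removes each name from @ARGV as it opens the file, so the array is empty once the loop finishes.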
Output to Standard Output
The print operator takes a list of values and sends each item (as a string, of course) to standard output in turn, one after another. It doesn’t add any extra characters before, after, or in between the items; if you want spaces between items and a newline at the end, you have to say so:
$name = "Larry Wall";
print "Hello there, $name, did you know that 3+4 is ", 3+4, "?\n";
Of course, that means that there’s a difference between printing an array and interpolating an array:
print @array; # print a list of items
print "@array"; # print a string (containing an interpolated array)
That first print statement will print a list of items, one after another, with no spaces in between. The second one will print exactly one item, which is the string you get by interpolating @array into the empty string; that is, it prints the contents of @array, separated by spaces. So, if @array holds qw/ fred barney betty /, the first one prints fredbarneybetty, while the second prints fred barney betty, separated by spaces.
But before you decide to always use the second form, imagine that @array is a list of unchomped lines of input. That is, imagine that each of its strings has a trailing newline character. Now, the first print statement prints fred, barney, and betty on three separate lines. But the second one prints this:
fred
 barney
 betty
Perl is interpolating an array, so it puts spaces between the elements. So, we get the first element of the array (fred and a newline character), then a space, then the next element of the array (barney and a newline character), then a space, then the last element of the array (betty and a newline character). The result is that the lines seem to have become indented, except for the first one.
Generally, if your strings contain newlines, you simply want to print them, after all:
print @array;
But if they don’t contain newlines, you’ll generally want to add one at the end:
print "@array\n";
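The four cases can be compared side by side. This is a small sketch; the @lines array is a hypothetical stand-in for unchomped input:

```perl
use strict;
use warnings;

my @array = qw( fred barney betty );
my @lines = ("fred\n", "barney\n", "betty\n");    # as if read, unchomped

print @array;        # fredbarneybetty (no separators, no newline)
print "\n";
print "@array\n";    # fred barney betty

print @lines;        # three lines, none indented
print "@lines";      # second and third lines start with a space
```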
Since print is looking for a list of strings to print, its arguments are evaluated in list context. Since the diamond operator (as a special kind of line-input operator) will return a list of lines in a list context, these can work well together:
print <>; # source code for 'cat'
print sort <>; # source code for 'sort'
What might not be obvious is that print has optional parentheses, which can sometimes cause confusion. Remember the rule that parentheses in Perl may always be omitted, except when doing so would change the meaning of a statement.
print (2+3)*4; # Oops! will print 5
When Perl sees this line of code, it prints 5, just as you asked. Then it takes the return value from print, which is 1, and multiplies that times 4. It then throws away the product, wondering why you didn’t tell it to do something else with it.
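If the intent was to print 20, a few spellings avoid the trap. This is a sketch of common workarounds, not the only possibilities; the $product variable is introduced just for illustration:

```perl
use strict;
use warnings;

# Wrap the whole expression in print's own parentheses.
print( (2+3)*4 );
print "\n";             # the two statements together print 20 and a newline

# A leading unary plus tells Perl the parentheses are not print's argument list.
print +(2+3)*4, "\n";   # 20

# Or compute first, then print.
my $product = (2+3)*4;
print "$product\n";     # 20
```

With warnings enabled, Perl will usually flag the original form, which is one more reason to turn warnings on.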
Formatted Output with printf
The printf operator takes a format string followed by a list of things to print. The format string is a fill-in-the-blanks template showing the desired form of the output:
my $user = "Larry";
my $days_to_die = 10;
printf "Hello, %s; your password expires in %d days!\n", $user, $days_to_die;
The format string holds a number of so-called conversions; each conversion begins with a percent sign (%) and ends with a letter. (As we’ll see in a moment, there may be significant extra characters between these two symbols.) There should be the same number of items in the following list as there are conversions; if these don’t match up, it won’t work correctly.
To print a number in what’s generally a good way, use %g, which automatically chooses floating-point, integer, or even exponential notation as needed:
printf "%g %g %g\n", 5/2, 51/17, 51 ** 17; # 2.5 3 1.0683e+29
The %d format means a decimal integer, truncated as needed:
printf "in %d days!\n", 17.85; # in 17 days!
Note that this is truncated, not rounded.
In Perl, printf is most often used for columnar data, since most formats accept a field width. If the data won’t fit, the field will generally be expanded as needed:
printf "%6d\n", 42; # output like ````42 (the ` symbol stands for a space)
printf "%2d\n", 2e3 + 1.95; # 2001
The %s conversion means a string, so it effectively interpolates the given value as a string, but with a given field width:
printf "%10s\n", "wilma"; # looks like `````wilma
A negative field width is left-justified (in any of these conversions):
printf "%-15s\n", "flintstone"; # looks like flintstone`````
The %f conversion (floating-point) rounds off its output as needed, and even lets you request a certain number of digits after the decimal point:
printf "%12f\n", 6 * 7 + 2/3; # looks like ```42.666667
printf "%12.3f\n", 6 * 7 + 2/3; # looks like ``````42.667
printf "%12.0f\n", 6 * 7 + 2/3; # looks like ``````````43
To print a real percent sign, use %%, which is special in that it uses no element from the list:
printf "Monthly interest rate: %.2f%%\n", 5.25/12; # the value looks like "0.44%"
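Putting several conversions together gives the columnar output mentioned above. The following is a sketch with made-up names and amounts; it uses sprintf, which accepts the same format strings as printf but returns the string instead of printing it:

```perl
use strict;
use warnings;

# Hypothetical data: parallel arrays of names and amounts.
my @names   = qw( wilma dino pebbles );
my @amounts = (3.5, 12, 0.256);

my @rows;
foreach my $i (0 .. $#names) {
    # %-10s: left-justified name in 10 columns
    # %8.2f: right-justified amount in 8 columns, 2 digits after the point
    push @rows, sprintf "%-10s|%8.2f\n", $names[$i], $amounts[$i];
}
print @rows;
# wilma     |    3.50
# dino      |   12.00
# pebbles   |    0.26
```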
Arrays and printf
Generally, you won’t use an array as an argument to printf. That’s because an array may hold any number of items, and a given format string will work with only a certain fixed number of items: if there are three conversions in the format, there must be exactly three items.
But there’s no reason you can’t whip up a format string on the fly, since it may be any expression. This can be tricky to get right, though, so it may be handy (especially when debugging) to store the format into a variable:
my @items = qw( wilma dino pebbles );
printf "The items are:\n".("%10s\n" x @items), @items;
Note that here we have @items being used once in a scalar context, to get its length, and once in a list context, to get its contents. Context is important.
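Storing the format in a variable first, as suggested above, might look like this. The $format name and the debugging print are illustrative:

```perl
use strict;
use warnings;

my @items = qw( wilma dino pebbles );

# @items in scalar context gives the count, so the conversion repeats 3 times.
my $format = "The items are:\n" . ("%10s\n" x @items);

print "the format is >>$format<<\n";    # debugging aid; remove when done

# @items in list context supplies one value per conversion.
printf $format, @items;
```

Printing the format between visible markers makes it easy to spot a missing newline or a miscounted conversion before blaming the data.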
Filehandles
Perl already uses six special filehandle names for its own purposes: STDIN, STDOUT, STDERR, DATA, ARGV, and ARGVOUT. Although you may choose any filehandle name you’d like, you shouldn’t choose one of those six unless you intend to use that one’s special properties.
Opening a Filehandle
When you need other filehandles, use the open operator to tell Perl to ask the operating system to open the connection between your program and the outside world. Here are some examples:
open CONFIG, "dino";
open CONFIG, "<dino";
open BEDROCK, ">fred";
open LOG, ">>logfile";
The first one opens a filehandle called CONFIG to a file called dino. That is, the (existing) file dino will be opened and whatever it holds will come into our program through the filehandle named CONFIG. This is similar to the way that data from a file could come in through STDIN if the command line had a shell redirection like <dino. In fact, the second example uses exactly that sequence. The second does the same as the first, but the less-than sign explicitly says “use this filename for input,” even though that’s the default.
Although you don’t have to use the less-than sign to open a file for input, we include that because, as you can see in the third example, a greater-than sign means to create a new file for output. This opens the filehandle BEDROCK for output to the new file fred. Just as when the greater-than sign is used in shell redirection, we’re sending the output to a new file called fred. If there’s already a file of that name, we’re asking to wipe it out and replace it with this new one.
The fourth example shows how two greater-than signs may be used (again, as the shell does) to open a file for appending. That is, if the file already exists, we will add new data at the end. If it doesn’t exist, it will be created in much the same way as if we had used just one greater-than sign.
You can use any scalar expression in place of the filename specifier, although typically you’ll want to be explicit about the direction specification:
my $selected_output = "my_output";
open LOG, "> $selected_output";
Note the space after the greater-than. Perl ignores this, but it keeps unexpected things from happening if $selected_output were ">passwd" for example (which would make an append instead of a write).
In modern versions of Perl (starting with Perl 5.6), you can use a “three-argument” open:
open CONFIG, "<", "dino";
open BEDROCK, ">", $file_name;
open LOG, ">>", &logfile_name();
The advantage here is that Perl never confuses the mode (the second argument) with some part of the filename (the third argument), which has nice advantages for security.
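The three modes can be exercised in sequence with the three-argument form. This sketch assumes the core File::Temp module for a scratch directory, so the file fred here is a throwaway:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: work in a temporary directory that is removed on exit.
my $dir       = tempdir(CLEANUP => 1);
my $file_name = "$dir/fred";

# ">" creates the file (or wipes out an existing one).
open BEDROCK, ">", $file_name or die "Cannot create $file_name: $!";
print BEDROCK "first line\n";
close BEDROCK;

# ">>" adds to the end of the existing file.
open BEDROCK, ">>", $file_name or die "Cannot append to $file_name: $!";
print BEDROCK "second line\n";
close BEDROCK;

# "<" reads the file back.
open BEDROCK, "<", $file_name or die "Cannot read $file_name: $!";
my @lines = <BEDROCK>;
close BEDROCK;

print "The file holds ", scalar(@lines), " lines:\n", @lines;
```

Because the mode is a separate argument, a filename that happens to begin with > or < is treated as a plain name rather than as a direction.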
Bad Filehandles
If you try to read from a bad filehandle (that is, a filehandle that isn’t properly open), you’ll see an immediate end-of-file. (With the I/O methods we’ll see in this chapter, end-of-file will be indicated by undef in a scalar context or an empty list in a list context.) If you try to write to a bad filehandle, the data is silently discarded. You can check the return value of open before using it.
my $success = open LOG, ">>logfile"; # capture the return value
if ( ! $success) {
    # The open failed
    ...
}
Closing a Filehandle
When you are finished with a filehandle, you may close it with the close operator like this:
close BEDROCK;
Closing a filehandle tells Perl to inform the operating system that we’re all done with the given data stream, so any last output data should be written to disk in case someone is waiting for it. Perl will automatically close a filehandle if you reopen it (that is, if you reuse the filehandle name in a new open) or if you exit the program.
Fatal Errors with die
The die function prints out the message you give it (to the standard error stream, where such messages should go) and makes sure that your program exits with a nonzero exit status.
We could rewrite the previous example like this:
if (! open LOG, ">>logfile") {
    die "Cannot create logfile: $!";
}
If the open fails, die will terminate the program and tell you that it cannot create the logfile. But what’s that $! in the message? That’s the human-readable complaint from the system. In general, when the system refuses to do something we’ve requested (like opening a file), $! will give you a reason (perhaps “permission denied” or “file not found,” in this case). This is the string that you may have obtained with perror in C or a similar language. This human-readable complaint message is available in Perl’s special variable $!. It’s a good idea to include $! in the message when it could help the user to figure out what he or she did wrong. But if you use die to indicate an error that is not the failure of a system request, don’t include $!, since it will generally hold an unrelated message left over from something Perl did internally. It will hold a useful value only immediately after a failed system request. A successful request won’t leave anything useful there.
There’s one more thing that die will do for you: it will automatically append the Perl program name and line number to the end of the message, so you can easily identify which die in your program is responsible for the untimely exit.
If you don’t want the line number and file revealed, make sure that the dying words have a newline on the end. That is, another way you could use die is with a trailing newline on the message:
if (@ARGV < 2) {
    die "Not enough arguments\n";
}
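The effect of the trailing newline can be observed without ending the program. The sketch below assumes eval, which is not covered in this section: it traps the die and leaves the message in $@. The variable names are illustrative:

```perl
use strict;
use warnings;

# No trailing newline: die appends the program name and line number.
eval { die "Not enough arguments" };
my $with_location = $@;

# Trailing newline: the message is left exactly as written.
eval { die "Not enough arguments\n" };
my $without_location = $@;

print "Without a newline: $with_location";
print "With a newline:    $without_location";
```

The first message ends with text of the form "at PROGRAM line N." and the second does not.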
Warning Messages with warn
Just as die can indicate a fatal error that acts like one of Perl's built-in errors (like dividing by zero), you can use the warn function to cause a warning that acts like one of Perl's built-in warnings (like using an undef value as if it were defined, when warnings are enabled).
The warn function works just like die does, except for that last step: it doesn’t actually quit the program.
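A short sketch shows warn reporting a problem while the program carries on. It assumes the core File::Temp module for a scratch directory; the filenames and the $opened counter are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: one file exists in a scratch directory and one does not.
my $dir     = tempdir(CLEANUP => 1);
my $present = "$dir/dino";
my $missing = "$dir/no_such_file";

open CONFIG, '>', $present or die "Cannot create $present: $!";
print CONFIG "some configuration\n";
close CONFIG;

my $opened = 0;
foreach my $file ($missing, $present) {
    if (! open CONFIG, '<', $file) {
        warn "Cannot open $file: $!\n";    # goes to STDERR; execution continues
    } else {
        $opened++;
        close CONFIG;
    }
}

print "Opened $opened file(s); the program kept running.\n";
```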
Using Filehandles
Once a filehandle is open for reading, you can read lines from it just like you can read from standard input with STDIN. So, for example, to read lines from the Unix password file:
if (! open PASSWD, "/etc/passwd") {
    die "How did you get logged in? ($!)";
}
while (<PASSWD>) {
    chomp;
    ...
}
A filehandle open for writing or appending may be used with print or printf, appearing immediately after the keyword but before the list of arguments:
print LOG "Captain's log, stardate 3.14159\n"; # output goes to LOG
printf STDERR "%d percent complete.\n", $done/$total * 100;
Did you notice that there’s no comma between the filehandle and the items to be printed? This looks especially weird if you use parentheses. Either of these forms is correct:
printf (STDERR "%d percent complete.\n", $done/$total * 100);
printf STDERR ("%d percent complete.\n", $done/$total * 100);
Changing the Default Output Filehandle
By default, if you don’t give a filehandle to print (or to printf, as everything we say here about one applies equally well to the other), the output will go to STDOUT. But that default may be changed with the select operator. Here we’ll send some output lines to BEDROCK:
select BEDROCK;
print "I hope Mr. Slate doesn't find out about this.\n";
print "Wilma!\n";
Once you’ve selected a filehandle as the default for output, it will stay that way. But it’s usually a bad idea to confuse the rest of the program, so you should generally set it back to STDOUT when you’re done. Also by default, the output to each filehandle is buffered. Setting the special $| variable to 1 will set the currently selected filehandle (that is, the one selected at the time that the variable is modified) to always flush the buffer after each output operation. So if you wanted to be sure that the logfile gets its entries at once, in case you might be reading the log to monitor progress of your long-running program, you could use something like this:
select LOG;
$| = 1; # don't keep LOG entries sitting in the buffer
select STDOUT;
# ... time passes
print LOG "This gets written to the LOG at once!\n";
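One way to check that the entry really reaches the file before LOG is closed is sketched below. It assumes the core File::Temp module for a scratch logfile, and a second filehandle (the hypothetical name CHECK) to read the file back. It also uses the fact that select returns the previously selected filehandle, so the old default can be restored without assuming it was STDOUT:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: the logfile lives in a throwaway directory.
my $dir     = tempdir(CLEANUP => 1);
my $logfile = "$dir/logfile";

open LOG, ">>", $logfile or die "Cannot append to $logfile: $!";

my $previous = select LOG;    # select returns the handle that was the default
$| = 1;                       # unbuffer LOG, the currently selected handle
select $previous;             # put the old default back

print LOG "This gets written to the LOG at once!\n";

# LOG is still open; read the file through a second handle.
open CHECK, "<", $logfile or die "Cannot read $logfile: $!";
my @entries = <CHECK>;
close CHECK;

print "Entries on disk before close: ", scalar(@entries), "\n";
close LOG;
```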
Reopening a Standard Filehandle
Here is an example that reopens STDERR so that error messages go to a file:
# Send errors to my private error log
if (! open STDERR, ">>/home/barney/.error_log") {
    die "Can't open error log for append: $!";
}
After reopening STDERR, any error messages from Perl will go into the new file. But what happens if the die is executed—where will that message go, if the new file couldn’t be opened to accept the messages?
The answer is that if one of the three system filehandles (STDIN, STDOUT, or STDERR) fails to reopen, Perl kindly restores the original one. That is, Perl closes the original one (of those three) only when it sees that opening the new connection is successful.
Output with say
Perl 5.10 borrows the say built-in from the ongoing development of Perl 6 (which may have borrowed its say from Pascal’s writeln). It’s the same as print, although it adds a newline to the end. These forms all output the same thing:
use 5.010;
print "Hello!\n";
print "Hello!", "\n";
say "Hello!";
To just print a variable’s value followed by a newline, you don’t need to create an extra string or print a list. You just say the variable. This is especially handy in the common case of simply wanting to put a newline after whatever you want to output:
use 5.010;
my $name = 'Fred';
print "$name\n";
print $name, "\n";
say $name;
To interpolate an array, you still need to quote it though. It’s the quoting that puts the spaces between the elements:
use 5.010;
my @array = qw(a b c d);
say @array; # "abcd\n"
say "@array"; # "a b c d\n"
print @array; # "abcd"
print "@array"; # "a b c d"
Just like with print, you can specify a filehandle with say:
use 5.010;
say BEDROCK "Hello!";