As other languages do, Perl has the ability to make subroutines, which are user-defined functions. The subroutine name comes from a separate namespace, so Perl won’t be confused if you have a subroutine called &fred and a scalar called $fred in the same program, although there’s no reason to do that under normal circumstances.
Defining a Subroutine
To define your own subroutine, use the keyword sub, the name of the subroutine (without the ampersand), then the indented block of code (in curly braces) that makes up the body of the subroutine, something like this:
sub marine {
    $n += 1;  # Global variable $n
    print "Hello, sailor number $n!\n";
}
You don’t normally need any kind of forward declaration. Subroutine definitions are global; without some powerful trickiness, there are no private subroutines. If you have two subroutine definitions with the same name, the later one overwrites the earlier one. That’s generally considered bad form, or the sign of a confused maintenance programmer.
Invoking a Subroutine
Invoke a subroutine from within any expression by using the subroutine name (with the ampersand):
&marine; # says Hello, sailor number 1!
&marine; # says Hello, sailor number 2!
Return Values
The subroutine is always invoked as part of an expression, even if the result of the expression isn’t being used. When we invoked &marine earlier, we were calculating the value of the expression containing the invocation, but then throwing away the result.
All Perl subroutines have a return value—there’s no distinction between those that return values and those that don’t. Since all Perl subroutines can be called in a way that needs a return value, it’d be a bit wasteful to have to declare special syntax to “return” a particular value for the majority of the cases. So Larry made it simple. As Perl is chugging along in a subroutine, it is calculating values as part of its series of actions. Whatever calculation is last performed in a subroutine is automatically also the return value.
For example, let’s define this subroutine:
sub sum_of_fred_and_barney {
    print "Hey, you called the sum_of_fred_and_barney subroutine!\n";
    $fred + $barney;  # That's the return value
}
The last expression evaluated in the body of this subroutine is the sum of $fred and $barney, so the sum of $fred and $barney will be the return value. Here’s that in action:
$fred = 3;
$barney = 4;
$wilma = &sum_of_fred_and_barney; # $wilma gets 7
print "\$wilma is $wilma.\n";
$betty = 3 * &sum_of_fred_and_barney; # $betty gets 21
print "\$betty is $betty.\n";
But suppose you then added another line to the end of the subroutine, like this:
sub sum_of_fred_and_barney {
    print "Hey, you called the sum_of_fred_and_barney subroutine!\n";
    $fred + $barney;  # That's not really the return value!
    print "Hey, I'm returning a value now!\n";  # Oops!
}
In this example, the last expression evaluated is not the addition; it’s the print statement. Its return value will normally be 1, meaning “printing was successful”, but that’s not the return value you actually wanted. So be careful when adding additional code to a subroutine, since the last expression evaluated will be the return value.
“The last expression evaluated” really means the last expression evaluated, rather than the last line of text. For example, this subroutine returns the larger value of $fred or $barney:
sub larger_of_fred_or_barney {
    if ($fred > $barney) {
        $fred;
    } else {
        $barney;
    }
}
Arguments
To pass an argument list to the subroutine, simply place the list expression, in parentheses, after the subroutine invocation, like this:
$n = &max(10, 15); # This sub call has two parameters
The list is passed to the subroutine; that is, it’s made available for the subroutine to use however it needs to. Of course, you have to store this list somewhere, so Perl automatically stores the parameter list (another name for the argument list) in the special array variable named @_ for the duration of the subroutine. The subroutine can access this variable to determine both the number of arguments and the value of those arguments.
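This means that the first subroutine parameter is stored in $_[0], the second one is stored in $_[1], and so on. But (and here’s an important note) these variables have nothing whatsoever to do with the $_ variable, any more than $dino[3] (an element of the @dino array) has to do with $dino (a completely distinct scalar variable). So you could write &max along the lines of &larger_of_fred_or_barney, using $_[0] and $_[1] in place of $fred and $barney: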
sub max {  # Compare this to &larger_of_fred_or_barney
    if ($_[0] > $_[1]) {
        $_[0];
    } else {
        $_[1];
    }
}
Well, as we said, you could do that. But it’s pretty ugly with all of those subscripts, and hard to read, write, check, and debug, too.
There’s another problem with this subroutine. The name &max is nice and short, but it doesn’t remind us that this subroutine works properly only if called with exactly two parameters:
$n = &max(10, 15, 27); # Oops!
Excess parameters are ignored: since the subroutine never looks at $_[2], Perl doesn’t care whether there’s something in there or not. And insufficient parameters are also ignored: you simply get undef if you look beyond the end of the @_ array, as with any other array.
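Because Perl itself never complains about the argument count, a subroutine that needs exactly two parameters has to check for itself. The following is a minimal sketch of one way to do that, not something from the text above: the name max_of_two and the choice to die on a bad count are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical variant of &max that refuses anything but exactly two arguments.
sub max_of_two {
    # An array compared as a number yields its element count.
    if (@_ != 2) {
        die "max_of_two expects exactly 2 arguments, got " . scalar(@_) . "\n";
    }
    if ($_[0] > $_[1]) {
        $_[0];
    } else {
        $_[1];
    }
}

print &max_of_two(10, 15), "\n";    # prints 15

# eval traps the die so the program can report the bad call and carry on.
my $result = eval { &max_of_two(10, 15, 27) };
print "Call rejected: $@" unless defined $result;
```

Whether a wrong count should be fatal, a warning, or silently tolerated is a design choice; the point is only that scalar(@_) gives the subroutine the information it needs to decide.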
The @_ variable is private to the subroutine;[1] if there’s a global value in @_, it is saved away before the subroutine is invoked and restored to its previous value upon return from the subroutine. This also means that a subroutine can pass arguments to another subroutine without fear of losing its own @_ variable: the nested subroutine invocation gets its own @_ in the same way. Even if the subroutine calls itself recursively, each invocation gets a new @_, so @_ is always the parameter list for the current subroutine invocation.
[1] Unless there’s an ampersand in front of the name for the invocation, and no parentheses (or arguments) afterward, in which case the @_ array is inherited from the caller’s context. That’s generally a bad idea, but is occasionally useful.
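A small sketch may make both the rule and the footnote concrete. The subroutine names here (sum_list, count_args, with_parens, without_parens) are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Recursion: each invocation works with its own @_.
sub sum_list {
    return 0 unless @_;
    my $rest = &sum_list(@_[1 .. $#_]);   # the nested call gets a fresh @_
    return $_[0] + $rest;                 # this call's @_ is still intact
}

sub count_args { return scalar(@_) }

# With parentheses, the callee gets a new (here empty) @_.
sub with_parens    { return &count_args() }

# The footnote case: ampersand and no parentheses, so the callee
# sees the caller's @_.
sub without_parens { return &count_args }

print &sum_list(1, 2, 3), "\n";           # prints 6
print &with_parens(1, 2, 3), "\n";        # prints 0
print &without_parens(1, 2, 3), "\n";     # prints 3
```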
Private Variables in Subroutines
By default, all variables in Perl are global variables; that is, they are accessible from every part of the program. But you can create private variables called lexical variables at any time with the my operator:
sub max {
    my($m, $n);       # new, private variables for this block
    ($m, $n) = @_;    # give names to the parameters
    if ($m > $n) {
        $m
    } else {
        $n
    }
}
These variables are private (or scoped) to the enclosing block; any other $m or $n is totally unaffected by these two. And that goes the other way, too: no other code can access or modify these private variables, by accident or design. It’s also worth pointing out that, inside the if’s blocks, there’s no semicolon needed after the return value expression. Although Perl allows you to omit the last semicolon in a block, in practice you omit it only when the code is so simple that you can write the block in a single line.
That my operator can also be applied to a list of variables enclosed in parentheses, so it’s customary to combine those first two statements in the subroutine:
my($m, $n) = @_; # Name the subroutine parameters
Variable-Length Parameter Lists
In real-world Perl code, subroutines are often given parameter lists of arbitrary length. That’s because of Perl’s “no unnecessary limits” philosophy that you’ve already seen. Of course, this is unlike many traditional programming languages, which require every subroutine to be strictly typed (that is, to permit only a certain, predefined number of parameters of predefined types).
A Better &max Routine
So let’s rewrite &max to allow for any number of arguments:
sub max {
    my($max_so_far) = shift @_;      # the first one is the largest yet seen
    foreach (@_) {                   # look at the remaining arguments
        if ($_ > $max_so_far) {      # could this one be bigger yet?
            $max_so_far = $_;
        }
    }
    $max_so_far;
}
$maximum = &max(3, 5, 10, 4, 6);
Empty Parameter Lists
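What happens if &max is called with an empty parameter list? The shift on the empty @_ gives undef to $max_so_far, and the foreach loop has nothing to iterate over, so its body runs zero times. So in short order, Perl returns the value of $max_so_far (undef) as the return value of the subroutine. In some sense, that’s the right answer because there is no largest value in an empty list.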
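A caller that might pass an empty list can test the result with defined. This sketch repeats the &max routine from above so that it runs on its own; the variable name @numbers and the messages are illustrative assumptions.

```perl
use strict;
use warnings;

sub max {
    my($max_so_far) = shift @_;      # undef when @_ is empty
    foreach (@_) {                   # runs zero times for an empty list
        if ($_ > $max_so_far) {
            $max_so_far = $_;
        }
    }
    $max_so_far;
}

my @numbers = ();                    # nothing to compare
my $maximum = &max(@numbers);

if (defined $maximum) {
    print "The maximum is $maximum.\n";
} else {
    print "The list was empty, so there is no maximum.\n";
}
```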
Notes on Lexical (my) Variables
Those lexical variables can actually be used in any block, not merely in a subroutine’s block. For example, they can be used in the block of an if, while, or foreach:
foreach (1..10) {
    my($square) = $_ * $_;  # private variable in this loop
    print "$_ squared is $square.\n";
}
The variable $square is private to the enclosing block; in this case, that’s the block of the foreach loop. If there’s no enclosing block, the variable is private to the entire source file. For now, your programs aren’t going to use more than one source file, so this isn’t an issue. But the important concept is that the scope of a lexical variable’s name is limited to the smallest enclosing block or file.
Note also that the my operator doesn’t change the context of an assignment:
my($num) = @_; # list context, same as ($num) = @_;
my $num = @_; # scalar context, same as $num = @_;
In the first one, $num gets the first parameter, as a list-context assignment; in the second, it gets the number of parameters, in a scalar context. Either line of code could be what the programmer wanted; you can’t tell from that one line alone, and so Perl can’t warn you if you use the wrong one.
So long as we’re discussing using my() with parentheses, it’s worth remembering that without the parentheses, my only declares a single lexical variable:
my $fred, $barney; # WRONG! Fails to declare $barney
my($fred, $barney); # declares both
Of course, you can use my to create new, private arrays as well:
my @phone_number;
Any new variable will start out empty: undef for scalars, or the empty list for arrays.
The use strict Pragma
A pragma is a hint to a compiler, telling it something about the code. In this case, the use strict pragma tells Perl’s internal compiler that it should enforce some good programming rules for the rest of this block or source file.
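One of the rules that use strict enforces is that variables must be declared before use, which catches misspelled names at compile time. The sketch below assumes two illustrative variable names, $bamm_bamm and its misspelling $bammbamm, and compiles the faulty line inside a string eval only so that the script can report the error rather than fail to start.

```perl
use strict;
use warnings;

my $bamm_bamm = 3;          # declared with my, so strict is satisfied
$bamm_bamm += 1;
print "$bamm_bamm\n";       # prints 4

# $bammbamm (no underscore) was never declared. Under strict, Perl
# refuses to compile the code, and eval reports that through $@.
my $ok = eval q{ use strict; $bammbamm += 1; 1 };
print "strict refused to compile it: $@" unless $ok;
```

In an ordinary program the misspelling would simply stop compilation with a message naming the undeclared variable, which is usually what you want.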
The return Operator
The return operator immediately returns a value from a subroutine:
my @names = qw/ fred barney betty dino wilma pebbles bamm-bamm /;
my $result = &which_element_is("dino", @names);
sub which_element_is {
    my($what, @array) = @_;
    foreach (0..$#array) {  # indices of @array's elements
        if ($what eq $array[$_]) {
            return $_;      # return early once found
        }
    }
    -1;                     # element not found (return is optional here)
}
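To see both exits from the subroutine, the early return and the fall-through value, it helps to search for one name that is present and one that is not. This sketch repeats the routine so that it runs on its own; the name gazoo and the printed messages are assumptions added for the demonstration.

```perl
use strict;
use warnings;

sub which_element_is {
    my($what, @array) = @_;
    foreach (0..$#array) {
        return $_ if $what eq $array[$_];   # leave as soon as there is a match
    }
    return -1;                              # reached only when nothing matched
}

my @names = qw/ fred barney betty dino wilma pebbles bamm-bamm /;

foreach my $who (qw/ dino gazoo /) {
    my $index = &which_element_is($who, @names);
    if ($index == -1) {
        print "$who is not in the list\n";
    } else {
        print "$who is element $index\n";
    }
}
# prints "dino is element 3", then "gazoo is not in the list"
```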
Omitting the Ampersand
As promised, now we’ll tell you the rule for when a subroutine call can omit the ampersand. If the compiler sees the subroutine definition before invocation, or if Perl can tell from the syntax that it’s a subroutine call, the subroutine can be called without an ampersand, just like a built-in function. (But there’s a catch hidden in that rule, as you’ll see in a moment.)
This means that if Perl can see that it’s a subroutine call without the ampersand, from the syntax alone, that’s generally fine. That is, if you’ve got the parameter list in parentheses, it’s got to be a function call:
my @cards = shuffle(@deck_of_cards); # No & necessary on &shuffle
Or if Perl’s internal compiler has already seen the subroutine definition, that’s generally okay, too; in that case, you can even omit the parentheses around the argument list:
sub division {
    $_[0] / $_[1];  # Divide first param by second
}
my $quotient = division 355, 113; # Uses &division
That’s not the catch, though. The catch is this: if the subroutine has the same name as a Perl built-in, you must use the ampersand to call it. With an ampersand, you’re sure to call the subroutine; without it, you can get the subroutine only if there’s no built-in with the same name:
sub chomp {
    print "Munch, munch!\n";
}
&chomp; # That ampersand is not optional!
Nonscalar Return Values
A scalar isn’t the only kind of return value a subroutine may have. If you call your subroutine in a list context, it can return a list of values.
The least you can return is nothing at all. A return with no arguments will return undef in a scalar context or an empty list in a list context. This can be useful for an error return from a subroutine, signaling to the caller that a more meaningful return value is unavailable.
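Here is a minimal sketch of both ideas: a subroutine that returns a list, and a bare return used as the error case. The subroutine range_between and its rule that the low end must not exceed the high end are hypothetical, invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the integers from $low to $high,
# or nothing at all when the range is invalid.
sub range_between {
    my($low, $high) = @_;
    return if $low > $high;          # bare return: empty list or undef
    my @range = ($low .. $high);
    return @range;
}

my @ok      = &range_between(3, 6);  # list context: (3, 4, 5, 6)
my @empty   = &range_between(6, 3);  # list context: empty list
my $nothing = &range_between(6, 3);  # scalar context: undef

print "Valid range: @ok\n";          # prints Valid range: 3 4 5 6
print "Invalid range gave ", scalar(@empty), " elements\n";   # prints 0
print "Scalar result is undefined\n" unless defined $nothing;
```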
Persistent, Private Variables
With my we were able to make variables private to a subroutine, although each time we called the subroutine we had to define them again. With state, we can still have private variables scoped to the subroutine but Perl will keep their values between calls.
use 5.010;
sub marine {
    state $n = 0;  # private, persistent variable $n
    $n += 1;
    print "Hello, sailor number $n!\n";
}
There’s a slight restriction on arrays and hashes as state variables, though. We can’t initialize them in list context as of Perl 5.10:
state @array = qw(a b c); # Error!
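One possible way around that restriction is to keep the list behind a reference, because a reference is a scalar and a state scalar can be initialized. The sketch below assumes a hypothetical subroutine running_total that remembers every number it has seen.

```perl
use strict;
use warnings;
use 5.010;

sub running_total {
    state $history = [];    # a scalar holding an array reference
    state $sum     = 0;
    foreach my $number (@_) {
        push @$history, $number;    # dereference to reach the array
        $sum += $number;
    }
    say "Seen (@$history), total $sum";
    return $sum;
}

running_total(5, 6);    # Seen (5 6), total 11
running_total(1..3);    # Seen (5 6 1 2 3), total 17
```

Both $history and $sum keep their values between calls, so the second call picks up where the first one left off.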