User-Defined Subroutines





                    CHAPTER 14: USER-DEFINED SUBROUTINES


   User-Defined Subroutines

     - sub sub_name
       {
         statement(s);     # Subroutine body
       }

     - Subroutine names have their own namespace

     - Subroutine definitions can be placed anywhere in the Perl
       program, even inside other subroutine definitions

     - BUT all subroutine definitions are global

     - By default, all variable references in the subroutine body are 
       references to global variables

     - Ex.

         sub print_msg
         {
           print ("The print_msg subroutine has been invoked\n");
         }


   Invoking A User-Defined Subroutine

     - Prefix the subroutine name with an &

     - Ex.

         &print_msg;

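     - Note that a subroutine cannot be invoked inside either a single- 
       or double-quoted string

     - A subroutine can also be invoked as:

         do subroutine_name ();

       The parentheses are required in this form.  The & form is
       preferred; the do form is obsolete, and recent Perl 5 releases
       reject it as a syntax error.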
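
     - A minimal sketch of definition and invocation.  The names bump,
       $count, $literal and $joined are made up for illustration and
       do not come from the notes above:

```perl
# Subroutine definitions are global, so a call may appear before
# the definition in the file.
$count = 0;
&bump;                          # no argument list
&bump ();                       # explicit empty argument list

sub bump
{
  ++$count;                     # $count is a global variable
}

# A call is not expanded inside a string; the text is kept as is.
$literal = "count is &bump";

# Call the subroutine outside the string and concatenate instead.
$joined = "count is " . &bump;

print ("$literal\n");           # prints: count is &bump
print ("$joined\n");            # prints: count is 3
```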


   Subroutine Arguments

     - Arguments are passed to a subroutine by following the subroutine 
       invocation with an argument list in parentheses

     - The subroutine definition has NO formal parameters

     - The arguments are available to the subroutine in the special
       array, @_

     - The @_ array is local to the subroutine

     - The elements of @_ are actually REFERENCES to (aliases for) the 
       scalar arguments in the argument list.  Any change to an element 
       of @_ CHANGES the corresponding variable!

     - Ex.

         sub print_msg2
         {
           print ("Here is your string: $_[0]\n");
           print ("Here is your number: $_[1]\n");
         }

         &print_msg2 ("Test message", 5);
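
     - A short sketch of two points from this section: the number of
       arguments is simply the size of @_, and assigning to an element
       of @_ changes the caller's variable.  The subroutine and
       variable names here are hypothetical:

```perl
sub count_args
{
  scalar (@_);                  # number of arguments received
}

sub double_it
{
  $_[0] *= 2;                   # changes the caller's variable
}

$how_many = &count_args ("Test message", 5);
$n = 21;
&double_it ($n);

print ("count_args saw $how_many arguments\n");   # 2 arguments
print ("n is now $n\n");                          # n is now 42
```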


   Array Arguments

     - If an array is used as a subroutine argument, Perl replaces the 
       array variable name with its elements to produce the actual 
       argument list

     - Consider:

         @x = (5, 6, 7);
         &sub1 (@x);               # @_ is (5, 6, 7)

         Any changes made to the elements of @_ in the subroutine
         change the corresponding elements of @x (assuming you did not
         force call-by-value).
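
     - As an illustration (a sketch only; describe, zero_first and @y
       are invented names), two arrays passed together arrive as one
       flat list, and the elements of @_ still refer to the elements
       of the original array:

```perl
sub describe
{
  "got " . scalar (@_) . " arguments: @_";
}

sub zero_first
{
  $_[0] = 0;                    # $_[0] is the first element of @x
}

@x = (5, 6, 7);
@y = (8, 9);

$summary = &describe (@x, @y);  # both arrays merge into one list
&zero_first (@x);

print ("$summary\n");           # got 5 arguments: 5 6 7 8 9
print ("x is now: @x\n");       # x is now: 0 6 7
```

       Because the lists are merged, the subroutine cannot tell where
       @x ends and @y begins.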


   Subroutine Arguments Are Really References!!

     - Yes, they REALLY are references!  They may not look it, but 
       they are!

     - Consider:

         #!/usr/bin/perl

         sub refs
         {
           print ("Arguments are initially: @_\n");
           $_[0] = 7;
           $_[1] = 8;
           $_[2] = 9;
           $_[3] = 10;
           print ("Arguments are finally : @_\n");
         }

         $a = 1;
         $b = "Joe";
         @c = ($a, "Bob", $b);
         $x = 2;
         print ("Before the subroutine, array c is: @c\n");
         print ("Before the subroutine, scalar x is: $x\n");
         &refs (@c, 3*$x);
         print ("After the subroutine, array c is: @c\n");
         print ("After the subroutine, scalar x is: $x\n");
         
         The output of this program is:

           Before the subroutine, array c is: 1 Bob Joe
           Before the subroutine, scalar x is: 2
           Arguments are initially: 1 Bob Joe 6
           Arguments are finally : 7 8 9 10
           After the subroutine, array c is: 7 8 9
           After the subroutine, scalar x is: 2

         Note that x is unchanged!  The subroutine had a reference
         to a location which held the value of the expression
         3*$x and not a reference to $x itself.
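
     - One consequence worth sketching, assuming Perl 5 (the block
       form of eval used here is a Perl 5 feature): if an argument is
       a literal constant, there is no variable behind $_[0], and
       assigning to it is a runtime error.  The names set_first and
       $error are hypothetical:

```perl
sub set_first
{
  $_[0] = 7;                    # needs a modifiable argument
}

$x = 2;
&set_first ($x);                # fine: $x becomes 7

eval { &set_first (5); };       # 5 is a constant, not a variable
$error = $@;                    # holds the error message, if any

print ("x is $x\n");
print ("error: $error");
```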


   Return Value

     - The return value of a subroutine is the value of the last
       expression evaluated within the body of the subroutine

     - The return value can be a scalar or a list

     - Ex.

         sub add
         {
           $sum = 0;
           foreach $n (@_)
           {
             $sum += $n;
           }
           $sum;                   # Required!
         }

         $total = &add (3, 8, 12, 14);

     - Note that if we did not include the line with just "$sum;", the 
       last statement executed in the subroutine would be the foreach 
       loop, and the return value would not be the sum (Perl leaves the 
       value of a loop unspecified; in practice it is usually null)
    
     - Also note that $sum is a global variable!  If $sum did not exist
       before this subroutine was called, it will spring into existence
       as a global variable, when add() is invoked!  
       (See Local Variables below.)
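
     - A sketch of a list return value.  The subroutine name
       first_and_last is invented for this example:

```perl
sub first_and_last
{
  ($_[0], $_[$#_]);             # last expression: a two-element list
}

($first, $last) = &first_and_last (3, 8, 12, 14);
print ("first is $first, last is $last\n");
```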
    

   Return Function

     - You can make an explicit call to the return function to return 
       from a subroutine

     - return LIST

     - Returns from a subroutine with the value specified

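     - In Perl 4, the preferred method is to use the value of the last 
       evaluated expression as the return value of the subroutine, 
       because use of the return function is slower.  In Perl 5 the 
       difference is negligible, and an explicit return is often 
       clearer.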

     - Typical use:

         return $sum;
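
     - The return function is most useful for leaving a subroutine
       early.  A minimal sketch (first_negative is a hypothetical
       name, and returning 0 when nothing is found is just a
       convention chosen for this example):

```perl
sub first_negative
{
  foreach $n (@_)
  {
    return $n if ($n < 0);      # leave the subroutine at once
  }
  0;                            # reached only if nothing was negative
}

$found = &first_negative (3, -4, 5, -6);
$none  = &first_negative (1, 2);
print ("found $found, none $none\n");   # found -4, none 0
```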


   Local Variables

     - By default, all variable references in the subroutine body are 
       references to global variables

     - In fact, by default, all Perl variables are global

     - Local variables are created using the local function
       (Perl 5 also provides my, which creates lexically scoped
       variables)


   Local Function

     - Declares the listed variables to be local to the enclosing block 
       or subroutine

     - local (LIST)

     - Scoping is dynamic: the subroutine that invokes local, and any 
       subroutines it calls, see the local variable rather than the 
       global variable of the same name

     - Since the result of the local function can serve as an lvalue, 
       the local variables can be initialized

     - If no initialization is done, all scalars are initialized to the 
       undefined value (which behaves as the null string or 0) and all 
       arrays are initialized to the empty list

     - Typical use:

         local ($sum);
         local ($sum) = 0;

     - Ex.

         sub add
         {
           local ($sum) = 0;       # $sum is now a local variable
           foreach $n (@_)
           {
             $sum += $n;
           }
           $sum;                   # Required!
         }

         $total = &add (3, 8, 12, 14);

     - Note that the local function is a runtime command and creates
       an instance of the local variable on the stack each time it is
       executed.  For this reason, all your invocations of local should
       be at the beginning of your subroutine, not inside a loop.
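
     - Dynamic scoping can be seen in a small sketch.  The names
       $level, @seen, show_level and outer are made up for this
       example:

```perl
$level = "global";

sub show_level
{
  push (@seen, $level);         # sees whichever $level is current
}

sub outer
{
  local ($level) = "local";     # visible to subroutines called here
  &show_level;
}

&outer;                         # show_level sees the local $level
&show_level;                    # show_level sees the global $level
print ("show_level saw: @seen\n");   # show_level saw: local global
```

       The global $level is restored automatically when outer exits.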


   Call-By-Reference And Call-By-Value

     - Since you can assign values to the @_ array, Perl subroutines
       are inherently call-by-reference

     - If you assign the arguments to a set of local variables and then 
       only reference these local variables (and not @_), you essentially 
       have call-by-value

     - It is strongly recommended that call-by-value be used!
       (See Variable Suicide below.)
       (Also see Array Arguments above.)

     - Ex.

         sub sub1
         {
           local ($scalar1, $scalar2, @list1) = @_;
           # Rest of subroutine
         }

     - Giving these local variables meaningful names also makes
       your code more readable
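
     - A sketch of call-by-value in practice.  The subroutine trimmed
       and the variables $raw and $clean are hypothetical:

```perl
sub trimmed
{
  local ($text) = @_;           # copy the argument first
  $text =~ s/^\s+//;            # edits affect only the copy
  $text =~ s/\s+$//;
  $text;
}

$raw   = "  hello  ";
$clean = &trimmed ($raw);
print ("raw is <$raw>, clean is <$clean>\n");
```

       Had the substitutions been applied to $_[0] instead, $raw
       itself would have been changed.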


   Variable Suicide

     - Variable suicide is a side effect of call-by-reference using the 
       global @_ array and the dynamic scoping provided by the local 
       function

     - Consider:

         #!/usr/local/bin/perl

         $x = 17;                  # Call this the main $x
         print ("In main: x is $x\n");
         &sub1 ($x);
         print ("In main: x is $x\n");

         sub sub1
         {
           local ($x);             # This is the local $x
           local ($y) = @_;        # And this is the local $y
           print ("In sub1: x is $x\n");
           print ("In sub1: y is $y\n");
         }

       The results of this program under Perl 4 are:

         In main: x is 17
         In sub1: x is
         In sub1: y is 
         In main: x is 17

     - You'll notice that the local $y variable did NOT get the proper 
       value of 17 from the argument array, @_.  In fact, somehow, we 
       clobbered $_[0]!

     - What happened is this: $_[0] is a reference to the main $x used 
       in the invocation of sub1.  But the "local($x)" statement causes 
       all further references to $x to be the LOCAL $x variable.  When 
       we access $_[0], we see a reference to $x and we get the LOCAL 
       $x which has an undef value.  Once we exit the subroutine, 
       references to $x again refer to the main $x.  (Perl 5 implements 
       local differently, and there $y should receive 17.)

     - To prevent this variable suicide (or is it really murder???) we 
       need to:

         a) use local variable names which are unique to the entire
            program OR
         b) copy all arguments from @_ into local variables before
            any other action is done in the subroutine (thus,
            essentially using call-by-value!)
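
     - Remedy (b) applied to the example above, as a sketch.  Only the
       order of the two local statements has changed; $got is a name
       added for illustration:

```perl
$x = 17;

sub sub1
{
  local ($y) = @_;              # copy the arguments FIRST
  local ($x);                   # only then localize other names
  $y;                           # return the copied argument
}

$got = &sub1 ($x);
print ("sub1 received $got; main x is $x\n");
```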


   Associative Array Arguments

     - What about an associative array as an argument?  Consider:

         %y = ("bob", 1, "joe", 2);
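         &sub1 (%y);               # @_ holds "bob", 1, "joe", 2
                                   #   (in no guaranteed order)

         Ok, looks like Perl expanded the associative array into its
         key-value pairs.  BUT, in this case, the subroutine cannot
         add, delete, or rename keys of the original associative
         array, %y, by changing @_!  (In Perl 4, changes to @_ leave
         %y entirely unchanged; Perl 5 aliases the values, though not
         the keys.)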
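
     - Inside the subroutine, the flattened list can be turned back
       into an associative array, but the result is only a copy.  A
       sketch (count_keys, %copy and $still are hypothetical names):

```perl
sub count_keys
{
  local (%copy) = @_;           # rebuild an associative array
  $copy{"new"} = 0;             # changes only the copy
  scalar (keys (%copy));
}

%y = ("bob", 1, "joe", 2);
$n = &count_keys (%y);
$still = scalar (keys (%y));
print ("copy had $n keys, y still has $still\n");
```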

     - What if you want to truly pass an associative array as an
       argument and make changes that affect the global associative
       array?  For that matter, what if you want to pass an array and 
       make changes to the array itself (for example, with the push/pop 
       functions) and not just to the elements of the array?  The 
       solution?  A type glob!


   Type Glob

     - If you prefix the name of a Perl variable with an '*', you can 
       refer to ALL the Perl objects (scalars, arrays, associative 
       arrays, filehandles, subroutines) which have that name

     - Called a type glob (since the * represents a $ or @ or %)

     - When a type glob is assigned to another type glob, object 
       aliasing is set up.

     - It is recommended that type globs be used as lvalues only inside 
       local().  Otherwise, you may clobber an existing variable!

     - Ex:

         @x = (5, 6, 7);
         &sub1 (*x); 
                                   
         sub sub1
         {
           local (*array) = @_;

           # Now any references to @array in the subroutine
           #   are really references to the global array @x.
           # Be careful!  All objects with name x have been
           #   aliased.  References to $array inside the subroutine
           #   will reference a global $x; same with %array
           #   and %x.
           # But now any array operation (for example, push/pop)
           #   can be done on @array and the changes made to the
           #   global @x.
     
         }
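
     - A runnable sketch of the same technique for both an array and
       an associative array.  The names append_item, add_person, %ages
       and *table are invented for this example:

```perl
@x = (5, 6, 7);
%ages = ("bob", 1, "joe", 2);

sub append_item
{
  local (*array) = @_;          # @array is now an alias for @x
  push (@array, $_[1]);         # grows the caller's array
}

sub add_person
{
  local (*table) = @_;          # %table is now an alias for %ages
  $table{$_[1]} = $_[2];        # adds a key to the caller's array
}

&append_item (*x, 8);
&add_person (*ages, "sue", 3);

print ("x is now: @x\n");                 # x is now: 5 6 7 8
print ("sue is $ages{'sue'}\n");          # sue is 3
```

       The list assignment to local (*array) takes only the first
       element of @_, so the remaining arguments are read directly
       from $_[1] and $_[2].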




Bob Tarr
University of Maryland, Baltimore County
tarr@umbc.edu