Perl: A Quick Tour

Presented by William Huston
to STNYLUG  Mar 25 2003

email: bhuston@vegdot.org
phone: 607-724-1755


    1. Things to keep in mind:

    2. Data Types

      1. Scalar
        1. Text String, Integer, or Real number
        2. "reference"
        3. $_ is the default scalar
        4. New scalars spring into existence undefined (undef), which behaves as 0 in numeric context and as the null string in string context.

      2. Array / List 
        • a grouping of scalars, w/numeric index
        • an array is a named list, (item1, item2, ...) is an unnamed list
        • @month is the entire month list
        • @month=("Jan", "Feb", "Mar");
        • @month= qw/  Jan Feb Mar  /;
        • push @month, "Apr";        # month array now has 4 elements
        • $month[10] is the 11th element of @month (if defined, 'undef' otherwise)
        • $#month returns the index of the last element of @month (-1 if the array is empty)
        • @month[1,2,3,4] is the same as @month[1..4]     (array slices)
        • @_ is the default array
        • Arrays spring up as null lists until used.
        • You probably don't want to test defined(@some_array); use if (@some_array) to test for a non-empty array
        • Array constructors:
          • @some_array = ( "this", "that", 49, 12.3, "the other")
          • $some_array[2]  returns 49
          • $arrayref = [ "this", "that", 49, 12.3, "the other" ]
          • $arrayref = [ qw/ this that 49 12.3 other / ]     (qw splits on white space, so it cannot hold "the other" as one item)
          • $arrayref->[0]  returns "this"
        • Real array functions:  pop/push, shift/unshift, splice
        • List functions: grep, join, map, qw/STRING/, reverse, sort, unpack
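
The array notes above can be tried out with a short script. This is only a minimal sketch: the variable names ($count, $last, @slice, $third, $size) are made up for illustration and are not part of the original talk.

```perl
use strict;
use warnings;

my @month = qw/ Jan Feb Mar /;
push @month, "Apr";                  # @month now has 4 elements

my $count = scalar @month;           # an array in scalar context gives its length
my $last  = $#month;                 # index of the last element
my @slice = @month[1..2];            # array slice: ("Feb", "Mar")

# anonymous array, reached through a reference
my $arrayref = [ "this", "that", 49, 12.3, "the other" ];
my $third    = $arrayref->[2];       # 49
my $size     = scalar @$arrayref;    # 5

print "count=$count last=$last slice=@slice third=$third size=$size\n";
```

Note that $#month is always one less than the element count, because indexes start at 0.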

      3. Hash (aka, Associative Array)
        • unordered collection of scalars, indexed by string keys
        • $color{"red"} = "#ff0000";
        • $color{"green"} = "#00ff00";
        • $color{"blue"} = "#0000ff";
        • @colors = sort keys %color;   # keys come back in no particular order, so sort them
        • print $colors[1];   # prints "green" (sorted: blue, green, red)
        • There is no default hash
        • use exists() to test whether a key is present, use defined() to test whether its value is defined
        • Functions for hashes: delete, each, exists, keys, values
        • you probably don't want to test defined(%some_hash); use if (%some_hash) to test for a non-empty hash
        • Hash Constructors
          • %hash = ( "key1", "value 1", "key2", "value 2", ...);
          • $hashref = { key1 => "value1", "key2" => "value2", ...};
        • A hash key that is a simple identifier is quoted automatically:
          $color{"red"} may be written $color{red}
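
The difference between exists() and defined() is easiest to see with a key whose value is undef. The sketch below is illustrative only: the "none" and "pink" keys and the variable names are assumptions, not from the talk.

```perl
use strict;
use warnings;

my %color = ( red => "#ff0000", green => "#00ff00", blue => "#0000ff" );
$color{none} = undef;                                # key exists, value is undefined

my $has_none     = exists  $color{none} ? 1 : 0;     # 1: the key is there
my $none_defined = defined $color{none} ? 1 : 0;     # 0: but its value is undef
my $has_pink     = exists  $color{pink} ? 1 : 0;     # 0: no such key

delete $color{none};                                 # remove the key entirely

# keys() returns the keys in no particular order, so sort for stable output
my @report;
for my $name (sort keys %color) {
    push @report, "$name=$color{$name}";
}
print join(", ", @report), "\n";
```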

      4. Filehandle
        • Filehandles are bare identifiers (no leading type identifier like $ @ or %)
        • Built in filehandles STDIN, STDOUT, STDERR
        • # method 1:
          open F, "/etc/passwd" or die "can't open /etc/passwd: $!";
          while ($line=<F>) {
             print $line;
          }
          close F;

          # method 2:
          open F, "/etc/passwd" or die "can't open /etc/passwd: $!";
          while (<F>) {     # assigns to default scalar
             print;                # prints default scalar
          }
          close F;
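
A third variant, sketched here as an alternative to the two methods above, keeps the filehandle in a lexical scalar and uses the three-argument form of open. To stay self-contained it reads from an in-memory string (this needs Perl 5.8 or later); the sample passwd lines are invented for the example.

```perl
use strict;
use warnings;

# Invented sample data standing in for a real file such as /etc/passwd.
my $data = "root:x:0:0:Super User:/root:/bin/sh\n"
         . "guest:x:500:500:Guest:/home/guest:/bin/sh\n";

# A reference to a scalar in place of a file name opens an in-memory file.
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";

my @users;
while (my $line = <$fh>) {
    chomp $line;
    my @field = split /:/, $line;
    push @users, $field[0];          # first field is the userid
}
close $fh;

print "users: @users\n";
```

With a real file, the third argument would simply be the file name.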


      5. Complex data structures (multi-dimensional hashes and arrays) are implemented via references to anonymous things.
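
As a rough illustration of item 5, here is a hash of arrays and an array of hashes built from anonymous array and hash constructors. All names and values (%tools, @people and their contents) are hypothetical.

```perl
use strict;
use warnings;

# hash of arrays: each value is a reference to an anonymous array
my %tools = (
    editors => [ "vi", "emacs" ],
    shells  => [ "bash", "tcsh", "zsh" ],
);
push @{ $tools{shells} }, "ksh";                 # dereference, then push
my $second_shell = $tools{shells}[1];            # "tcsh"
my $shell_count  = scalar @{ $tools{shells} };   # 4

# array of hashes: each element is a reference to an anonymous hash
my @people = (
    { name => "Ann", uid => 501 },
    { name => "Bob", uid => 502 },
);
my $bob_uid = $people[1]{uid};                   # 502

print "$second_shell $shell_count $bob_uid\n";
```

The arrow between adjacent subscripts is optional, so $tools{shells}[1] means the same as $tools{shells}->[1].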
      6. Typeglob (symbol table entry)
        • Useful for alias and some other obscure reasons
      7. Objects

      8. Scoping and Namespaces
        •  global
          • global is really "package global" (packages are used for extensible modules and also for object-oriented features)
          • This is the default scope
          • Default package is "main::"
        • local $a 
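          • saves the old value of a package global and gives it a temporary value until the enclosing block exits; subroutines called from that block see the temporary value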
          • useful in rare cases (special variables, local filehandles, safe typeglob aliases)
        • my $a 
          • "lexical scoping", normally only visible within the enclosing block.
          • This is probably what you want to use as a local variable.
        • Rule of thumb, "Always use my, never use local", but read "Seven Useful Uses of Local" by Mark Jason Dominus:  http://perl.plover.com/local.html
        • More info: http://perl.plover.com/FAQs/Namespaces.html
        • $foo (scalar), @foo (array), %foo (hash), &foo (subroutine), foo (filehandle), foo (print format) are all distinct; each kind of thing has its own namespace.
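
The difference between local and my shows up when another subroutine is called. The following sketch uses made-up names ($level, show, with_local, with_my) to illustrate it.

```perl
use strict;
use warnings;

our $level = "global";           # package global, i.e. $main::level

sub show { return $level }       # always reads the package global

sub with_local {
    local $level = "local";      # temporary value, visible to called subs
    return show();
}

sub with_my {
    my $level = "my";            # a new lexical; show() cannot see it
    return show();
}

my @seen = ( with_local(), with_my(), show() );
print "@seen\n";                 # prints: local global global
```

After with_local() returns, the old value of $level is restored automatically, which is why the last call to show() reports "global" again.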

    3. Syntax

      1. Perl is free-form (white space doesn't matter)
      2. Comments are from # to end of line
      3. Boolean True and False
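        • False is the undefined value, the null string "", the number 0 (however it is written: 0, 0.00, 0E10, 0x00), and the string "0". An empty list is also false.
        • True is every other scalar value. Note that strings such as "0.0", "00", "0E0" and "\000" are true: only the exact string "0" counts as zero.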
        • # Note: this is the first use of $a :
          if (defined $a) { print "defined\n" } else { print "not defined\n" }  
          if ($a) { print "true\n" } else { print "false\n" }
          # prints not defined, false

          $a=0;
          if (defined $a) { print "defined\n"} else {print "not defined\n"}
          if ($a) { print "true\n" } else {print "false\n" }
          # prints defined false

          $a=1;
          if (defined $a) { print "defined\n"} else {print "not defined\n"}
          if ($a) { print "true\n"} else {print "false\n"}
          # prints defined true
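
The string cases are the ones that usually surprise people, so here is a small sketch that tabulates a few of them. The list of test values and the %verdict hash are chosen purely for illustration.

```perl
use strict;
use warnings;

# label => value pairs to test in boolean context
my @tests = (
    [ 'undef', undef ],
    [ '0',     0     ],
    [ '0.00',  0.00  ],
    [ '""',    ""    ],
    [ '"0"',   "0"   ],
    [ '"0.0"', "0.0" ],
    [ '"00"',  "00"  ],
    [ '"0E0"', "0E0" ],
    [ '" "',   " "   ],
);

my %verdict;
for my $t (@tests) {
    my ($label, $value) = @$t;
    $verdict{$label} = $value ? "true" : "false";
    printf "%-6s is %s\n", $label, $verdict{$label};
}
```

The first five values come out false and the last four come out true.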

           
      4. Declarations can occur anywhere, but do not affect program flow
      5. A statement is an expression evaluated for its side effects.
      6. A block is a collection of statements enclosed by braces {}
      7. Every statement must end in a semi-colon, unless it is the last statement in a block.
      8. Any simple statement may be optionally followed by a  single conditional modifier:
        • statement if EXPR;
        • statement unless EXPR;
        • statement while EXPR;   # EXPR evaluated before statement
        • statement until EXPR;    # EXPR evaluated before statement
        • statement foreach EXPR;  # iterator: $_ set for statement
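
A quick sketch of the modifiers listed above, one per line. The variables @out and $n are invented for the example.

```perl
use strict;
use warnings;

my @out;
my $n = 3;

push @out, "odd"      if $n % 2;          # runs because 3 % 2 is true
push @out, "not big"  unless $n > 10;     # runs because the test is false
push @out, $n--       while $n > 0;       # pushes 3, 2, 1
push @out, $_ * 2     foreach 1 .. 3;     # pushes 2, 4, 6

print "@out\n";                           # prints: odd not big 3 2 1 2 4 6
```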
      9. Control flow 
        • if (EXPR) BLOCK        
        • if (EXPR) BLOCK else BLOCK
        • if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
        • LABEL while (EXPR) BLOCK
        • LABEL while (EXPR) BLOCK continue BLOCK
        • LABEL for (EXPR; EXPR; EXPR) BLOCK
        • LABEL foreach VAR (LIST) BLOCK
        • LABEL BLOCK continue BLOCK
        • Labels, else, elsif and continue blocks are all optional.
      10. Loop control: (each optionally takes a label for nested loops)
        • last  : exit loop. (like break in C). 
        • next : next iteration of loop (like next in BASIC or continue in C)
        • redo : restart loop block without re-evaluating conditional or continue block
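
Labels matter once loops are nested. In this sketch (the ROW label and the loop bounds are arbitrary choices for illustration), next and last both act on the outer loop from inside the inner one.

```perl
use strict;
use warnings;

my @found;
ROW: for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        next ROW if $col > $row;        # skip the rest of this row
        last ROW if $row * $col > 4;    # leave both loops at once
        push @found, "$row,$col";
    }
}
print "@found\n";                       # prints: 1,1 2,1 2,2 3,1
```

Without the label, next and last would apply only to the innermost loop.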
      11. No switch statement (but can easily be emulated)
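
One common way to emulate a switch is a labelled, single-pass for loop that sets $_ and uses last to leave once a case matches. The classify() function below is a hypothetical example of that idiom, not code from the talk.

```perl
use strict;
use warnings;

sub classify {
    my ($arg) = @_;
    my $kind;
    SWITCH: for ($arg) {                 # aliases $_ to $arg for one pass
        /^\d+$/     and do { $kind = "number"; last SWITCH };
        /^[a-z]+$/i and do { $kind = "word";   last SWITCH };
        $kind = "other";                 # default case
    }
    return $kind;
}

print join(" ", map { classify($_) } "42", "perl", "a-1"), "\n";
# prints: number word other
```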
      12. Yes, Perl has a GOTO (but this is not recommended)
      13. eval BLOCK, with die "error text" can be used for exception handling. (Similar to try/throw)
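
A minimal sketch of the eval/die pattern follows. The safe_divide() function is invented for the example; the error thrown by die arrives in the special variable $@.

```perl
use strict;
use warnings;

sub safe_divide {
    my ($x, $y) = @_;
    die "division by zero\n" if $y == 0;     # "throw"
    return $x / $y;
}

my $result = eval { safe_divide(10, 0) };    # "try": returns undef on failure
my $error  = $@;                             # copy $@ before anything resets it
if ($error) {                                # "catch"
    chomp $error;
    print "caught: $error\n";
}

my $ok = eval { safe_divide(10, 4) };        # succeeds, so $@ is now empty
print "ok: $ok\n";
```

Ending the die message with a newline keeps Perl from appending the file name and line number to it.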
      14. POD (Plain Old Documentation): Embed documentation within programs.

    4. Operators

    5. Functions, keywords, named operators

      1. Functions for SCALARs or strings
                    chomp, chop, chr, crypt, hex, index, lc, lcfirst, length, oct, ord, pack, q/STRING/, qq/STRING/, reverse,
                    rindex, sprintf, substr, tr///, uc, ucfirst, y///       
      2. Regular expressions and pattern matching
                    m//, pos, quotemeta, s///, split, study, qr//       
      3. Numeric functions
                    abs, atan2, cos, exp, hex, int, log, oct, rand, sin, sqrt, srand       
      4. Functions for real @ARRAYs
                    pop, push, shift, splice, unshift       
      5. Functions for list data
                    grep, join, map, qw/STRING/, reverse, sort, unpack       
      6. Functions for real %HASHes
                    delete, each, exists, keys, values      
      7. Input and output functions
                    binmode, close, closedir, dbmclose, dbmopen, die, eof, fileno, flock, format, getc, print, printf, read,
                    readdir, rewinddir, seek, seekdir, select, syscall, sysread, sysseek, syswrite, tell, telldir, truncate,
                    warn, write       
      8. Functions for fixed length data or records
                    pack, read, syscall, sysread, syswrite, unpack, vec       
      9. Functions for filehandles, files, or directories
                    -X, chdir, chmod, chown, chroot, fcntl, glob, ioctl, link, lstat, mkdir, open, opendir, readlink, rename,
                    rmdir, stat, symlink, umask, unlink, utime       
      10. Keywords related to the control flow of your perl program
                    caller, continue, die, do, dump, eval, exit, goto, last, next, redo, return, sub, wantarray
      11. Keywords related to scoping
                    caller, import, local, my, package, use
      12. Miscellaneous functions
                    defined, dump, eval, formline, local, my, reset, scalar, undef, wantarray
      13. Functions for processes and process groups
                    alarm, exec, fork, getpgrp, getppid, getpriority, kill, pipe, qx/STRING/, setpgrp, setpriority, sleep,
                    system, times, wait, waitpid
      14. Keywords related to perl modules
                    do, import, no, package, require, use
      15. Keywords related to classes and object-orientedness
                    bless, dbmclose, dbmopen, package, ref, tie, tied, untie, use
      16. Low-level socket functions
                    accept, bind, connect, getpeername, getsockname, getsockopt, listen, recv, send, setsockopt, shutdown,
                    socket, socketpair
      17. System V interprocess communication functions
                    msgctl, msgget, msgrcv, msgsnd, semctl, semget, semop, shmctl, shmget, shmread, shmwrite
      18. Fetching user and group info
                    endgrent, endhostent, endnetent, endpwent, getgrent, getgrgid, getgrnam, getlogin, getpwent, getpwnam,
                    getpwuid, setgrent, setpwent
      19. Fetching network info
                    endprotoent, endservent, gethostbyaddr, gethostbyname, gethostent, getnetbyaddr, getnetbyname, getnetent,
                    getprotobyname, getprotobynumber, getprotoent, getservbyname, getservbyport, getservent, sethostent,
                    setnetent, setprotoent, setservent
      20. Time-related functions
                    gmtime, localtime, time, times
      21. Special mention:
        • eval EXPRESSION allows you to build strings containing perl code which are compiled on the fly (similar to LISP and Unix shell)
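
As a small illustration of eval EXPRESSION, the sketch below builds the source of an anonymous subroutine as a string and compiles it at run time. The names $op, $code and $func are made up; never pass untrusted input to a string eval, since it runs as ordinary Perl code.

```perl
use strict;
use warnings;

my $op   = "*";                                # imagine this came from a menu choice
my $code = "sub { \$_[0] $op \$_[1] }";        # text of the code to compile

my $func = eval $code or die "compile failed: $@";
my $product = $func->(6, 7);                   # 42

print "$code gives $product\n";
```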
    6. Subroutines
    7. Regular Expressions (briefly!)
    8. NOTE: Perl 6 is coming!
More info:


Some Real Perl Programs:


#!/usr/local/bin/perl

# trivial hex file dumper

sub print_saved {
  print "\n$saved\n";
  $saved="       ";
}

while (<>) {
   print "Dump for $ARGV" unless $count;
   foreach $c (split //, $_) {
      if ($count % 32 == 0) {            # start of a new 32-byte row
         print_saved;
         printf "%06x ", $count;
      } elsif ($count % 2 == 0) {        # space between byte pairs
         print " ";
         $saved .= " ";
      }
      printf "%02x", ord($c);
      $saved .= ($c =~ /[ -~]/) ? "$c " : "**" ;
      $count ++;
   }
   if (eof) {
      print_saved;
      print "\n";
      $count=0;
   }
}




#!/usr/bin/perl

# "count" read items from stdin (each line is considered an item), then report how many of each we find:

while (<>) {
   $foo{$_}++;
}

for $key (sort {$foo{$a} <=> $foo{$b}} keys %foo) {
   print "$foo{$key} $key";
}



One-liners from the command line:
Prints userid and actual names from /etc/passwd:

perl -ne '@a=split /:/; print "userid: $a[0], Name : $a[4]\n"'   /etc/passwd






#!/usr/bin/perl

# multi_tail
# just like tail -f, but for multiple files

use FileHandle;  # so we can have an array of filehandles

if (!@ARGV) {   # no arguments at all (testing $ARGV[0] would reject a file named "0")
  print "\n\nUsage: $0 file1 [file2 ... filen]\ntail -f on all files\n\n";
  exit;
}

for $i (0..$#ARGV) {
   # build array of filehandles, open files for read, seek to the end.
  $fh = new FileHandle;
  if ($fh->open("< $ARGV[$i]")) {
    push @fh, $fh;
    $filename{$fh}=$ARGV[$i];
    seek($fh, 0, 2);
  } else {
    warn "can't open  $ARGV[$i] for read: $!";
  }
}

if (!@fh) {
  print "$0: No valid filenames!\n";
  exit;
}

while (1) {
  foreach $fh (@fh) {
     while (<$fh>) {
       print "$filename{$fh}: $_";
     }
     seek($fh, 0, 1);
  }
  sleep 1;
}