Perl Lesson #2

In this little script, we open the Apache access_log file and read it one line at a time. Each line is split on the space (" ") into an array, and we print out the first element, which is the IP address of the machine connecting to our website.

Here is the format for the log file:


69.47.145.197 - - [15/Jan/2006:22:35:12 -0500] "GET / HTTP/1.1" 301 354
69.47.145.197 - - [15/Jan/2006:22:35:39 -0500] "GET /content/images/oa1.jpg HTTP/1.1" 301 376
69.47.145.197 - - [15/Jan/2006:22:35:40 -0500] "GET / HTTP/1.1" 301 354
202.7.166.167 - - [16/Jan/2006:00:02:41 -0500] "GET /content/images/oa1.jpg HTTP/1.0" 301 376
202.7.166.167 - - [16/Jan/2006:00:06:23 -0500] "GET /content/images/oa1.jpg HTTP/1.0" 301 376
66.249.64.14 - - [16/Jan/2006:00:14:08 -0500] "GET /robots.txt HTTP/1.0" 301 364

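PERL:
  #!/usr/bin/perl
  # Open the log for reading; stop with a message if that fails.
  open (FD, "/path/to/your/access_log") or die "Cannot open access_log: $!";
  # Each pass reads one line into $_. Filehandle names are case-sensitive.
  while (<FD>)
  {
          # Split the current line ($_) on single spaces.
          @array = split (/ /);
          # Note: this prints the first field once for every field on the line.
          foreach $i (@array)
          {
                  print $array[0] . "\n";
          }
  }
  close (FD);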

In this example we use the special variable "$_". The while loop reads one line per iteration and stores it in $_, and split operates on $_ when it is not given a string, so we never have to name the line ourselves. If we wanted to split the lines on other characters we could do something like this:
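To make the role of "$_" a bit more concrete, here is a rough sketch of the same loop with the variable written out by name. The helper first_fields and the in-memory sample are my own assumptions for illustration; they are not part of the script above, and the sample lines are copied from the log excerpt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first space-separated field of each line,
# spelling out the variable that the lesson's loop leaves implicit.
sub first_fields {
    my ($fh) = @_;
    my @ips;
    while (defined(my $line = <$fh>)) {   # like while (<$fh>), but the line lands in $line
        chomp $line;
        my @fields = split / /, $line;    # like split (/ /), which would use $_
        push @ips, $fields[0];
    }
    return @ips;
}

# Two lines from the log excerpt, read through an in-memory filehandle
# so the sketch runs without a real access_log.
my $sample = <<'END';
69.47.145.197 - - [15/Jan/2006:22:35:12 -0500] "GET / HTTP/1.1" 301 354
66.249.64.14 - - [16/Jan/2006:00:14:08 -0500] "GET /robots.txt HTTP/1.0" 301 364
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print "$_\n" for first_fields($fh);
close $fh;
```

Naming the variable costs a little typing, but it can be easier to follow once a loop grows beyond a few lines.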

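@array = split (/[-: ]/);

This would split it on the dash "-", colon ":" and the space " ". Inside the square brackets every character is taken literally, so no commas or quotes are needed between them.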
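One thing to watch for when splitting on several characters: delimiters that sit next to each other (like the " - - " in the log) leave empty fields behind. The sketch below runs the first line of the log excerpt through both a single-character split and a variant with "+" that treats a run of delimiters as one. The variable names are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First line of the log excerpt above.
my $line = '69.47.145.197 - - [15/Jan/2006:22:35:12 -0500] "GET / HTTP/1.1" 301 354';

# Split at every single dash, colon, or space. Adjacent delimiters
# (such as " - - ") produce empty fields in between.
my @fields = split /[-: ]/, $line;

# Split on runs of those characters instead, so no empty fields appear.
my @collapsed = split /[-: ]+/, $line;

printf "single-character split: %d fields\n", scalar @fields;
printf "run-of-characters split: %d fields\n", scalar @collapsed;
print "[$_]\n" for @collapsed;
```

On this line the first form gives 16 fields, five of them empty, while the second gives 11, so the index of the field you want depends on which form you use.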

If we wanted to print every field of each line, one field per output line, we could simply replace:

print $array[0] . "\n";

With:

print $i . "\n";
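When deciding which field to pull out, it can help to print each one next to its index. This is only a sketch against a single sample line from the log excerpt, using the same space split as the script; the counter $n is my own addition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First line of the log excerpt above.
my $line = '69.47.145.197 - - [15/Jan/2006:22:35:12 -0500] "GET / HTTP/1.1" 301 354';

# Same split as the lesson: one field per single space.
my @array = split / /, $line;

# Print each field with its position in the array.
for my $n (0 .. $#array) {
    print "$n: $array[$n]\n";
}
```

With the space split this line has 10 fields; the IP address is at index 0 and the status code (301) at index 8.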


5 thoughts on “Perl Lesson #2”

  1. It looks like you’re coming from a PHP background (based on seeing how others who do that do things in Perl). It’s pretty odd to use the ‘.’ (concatenation) operator when you just want to interpolate a variable.

    print $array[0] . "\n";

    is usually written:

    print "$array[0]\n";

    And you’re doing something odd in the foreach loop. (You’ll start to abbreviate that to ‘for’ if you continue with Perl — the two are synonymous even though there are two forms of “what to loop over”.) Why are you doing:

    for $i (@array) { print $array[0] . "\n"; }

    That’ll give you as many copies of $array[0] as there are parts in the array.

    You can accomplish everything you’re doing in this file with:

    perl -lanwe 'print $F[0]' /path/to/your/access_log

    For text processing things, it’s often better to use <> (with no explicit file-opening/-closing) than <FD>.

    Just some tips for someone who might come across this later, as I did.

  2. Oh, and really, text-processing that’s this straightforward is usually better done with ‘cut’:

    cut -f1 -d' ' /path/to/access_log

    (But, if you want to do something more complicated, it’s nice to have the scaffolding that you have:)


    #!/usr/bin/perl
    while (<>) {
        my @array = split;
        # do something with @array here
        print "$array[0]\n";
    }

  3. @Benzi & Stefhen,

    Thanks for the tips… These little snips are notes to myself from my adventures trying to teach myself Perl, so having your pointers really helps. I tend to like using the UNIX tools like cut, sed and awk better as well because I’m much more familiar with them.
