Interfacing to the Operating System in Perl

Beginning Perl, Second Edition

Written by James Lee

Published by Apress

Chapter 10



Perl is a popular language among system administrators and programmers who work with files and directories because it has many built-in functions for system administration tasks. These tasks include creating directories, renaming files, creating links, and executing programs in the operating system.

In this chapter we will look at several functions that make working with files and directories easy. We will also look at two ways of executing operating system commands or other applications: system() and backquotes.

The %ENV Hash

When a Perl program starts executing, it inherits from the shell all of the shell’s exported environment variables. If you are curious about what environment variables are defined in your shell, try this command in Unix:

$ env

Depending on what shell you are using, you might need to execute

$ printenv

In Windows try

c:\> set

All of the environment variables that the Perl program inherits are stored in the special hash %ENV. Here are a few possible examples:

$ENV{HOME}
$ENV{PATH}
$ENV{USER}

These environment variables can be assigned. If you want to change the path for the current execution of the program, simply assign to $ENV{PATH} (note that this will not change the path for the shell that is invoking this program).

$ENV{PATH} = '/bin:/usr/bin:/usr/local/bin';
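
To see that an assignment to %ENV reaches programs started by the script, here is a minimal sketch. The variable name GREETING and the file name env-child.pl are made up for illustration, and the list form of a piped open() is assumed to be available (it is on Unix).

```perl
#!/usr/bin/perl
# env-child.pl (hypothetical example name)
use strict;
use warnings;

# GREETING is a made-up variable used only for illustration.
$ENV{GREETING} = 'hello from the parent';

# Run a child perl ($^X is the path of the running interpreter) and
# read back what it sees in its own copy of the environment.
open(my $child, '-|', $^X, '-e', 'print $ENV{GREETING}')
    or die "Couldn't start child process: $!";
my $seen = <$child>;
close $child;

print "child saw: $seen\n";
```

The child inherits the changed environment, while the shell that launched the script keeps its own values.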

The following program, whereis.pl, is an example of reading from %ENV. It implements a simple version of the whereis command, a useful Unix program that reports the location of a program within the directories listed in the PATH environment variable. Here is the code:

#!/usr/bin/perl -w
# whereis.pl

use strict;

my $prog = shift @ARGV;
die "usage: perl whereis.pl <file>" unless defined $prog;

my $found = 0;

foreach my $dir (split /:/, $ENV{PATH}) {
    if (-x "$dir/$prog") {
        print "$dir/$prog\n";
        $found = 1;
        last;
    }
}

print "$prog not found in PATH\n" unless $found;

First, we grab the command line argument and place it in $prog . This argument is the program that we are trying to locate. If the argument is not provided, we complain:

my $prog = shift @ARGV;
die "usage: perl whereis.pl <file>" unless defined $prog;

Then we see the following:

my $found = 0;

foreach my $dir (split /:/, $ENV{PATH}) {
    if (-x "$dir/$prog") {
        print "$dir/$prog\n";
        $found = 1;
        last;
    }
}

First, we assume we won’t find the program, so we assign $found the value 0, or false. We’ll check this variable at the end of the program and print a message if necessary. The foreach loop then iterates over each directory listed in $ENV{PATH}, a colon-separated list of directories.

For each of these directories, we test to see if the program we are looking for is an executable file in that directory:

if (-x "$dir/$prog") {

If so, we print the full path of the file, set $found to true since we found the program, and then leave the loop with last.

Finally, if we did not find the program, the program says so:

print "$prog not found in PATH\n" unless $found;

Executing this code produces the following:

$ perl whereis.pl sort
/usr/bin/sort
$ perl whereis.pl noprogram
noprogram not found in PATH
$
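
Splitting on a colon ties whereis.pl to Unix-style paths. As a hedged sketch of one alternative, the core File::Spec module can supply the PATH directories and join path components for the current platform. The name whereis-all.pl and the helper find_in_path() are hypothetical, and unlike whereis.pl this variant reports every match rather than stopping at the first.

```perl
#!/usr/bin/perl
# whereis-all.pl (hypothetical variant of whereis.pl)
use strict;
use warnings;
use File::Spec;

# Return every executable regular file called $prog found in the PATH.
sub find_in_path {
    my ($prog) = @_;
    my @hits;
    foreach my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $prog);
        push @hits, $candidate if -f $candidate and -x _;
    }
    return @hits;
}

my $prog  = @ARGV ? $ARGV[0] : 'perl';   # the default is only for demonstration
my @found = find_in_path($prog);

if (@found) {
    print "$_\n" foreach @found;
} else {
    print "$prog not found in PATH\n";
}
```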

Working with Files and Directories

Perl provides several mechanisms to work with files and directories. In this section we will explore the concept of file globbing, directory streams, and several built-in functions that allow us to perform operating system actions. First, file globbing.

File Globbing

Those of us who are Unix users know that this command lists all the files in the current directory that end with the .pl extension:  

$ ls *.pl

A similar command in Windows would be

c:\> dir *.pl

The part of these commands that indicates which files we want to list is *.pl. This is known as a file glob: it globs, or collects together, all the filenames that end in .pl. Those filenames are then listed.

We can perform a similar action in Perl by taking the glob pattern and, as when reading from a filehandle, wrapping it in angle brackets:

<*.pl>

There are two ways of reading from a file glob: in scalar context or in list context. In scalar context, it returns the next filename that ends in .pl:

$nextperlfilename = <*.pl>;

In list context, it returns all the filenames that end in .pl:

@alltheperlfilenames = <*.pl>;

Like using the ls or dir commands, we can indicate more than one pattern to glob. These patterns can be absolute or relative paths. For instance, this example globs all the filenames in the current directory that end in .pl and all the filenames that end in .dat :

<*.pl *.dat>

This example lists all the .c and .h files in specific directories:

</usr/src/*.c /usr/include/*.h>

Like reading from a filehandle, if we read from a glob within a while loop, and the glob is not explicitly assigned to a variable, it is assigned to $_ by default:

while (<*.dat>) {
    print "found a data file: $_\n";
}
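
The angle-bracket form is shorthand for Perl's built-in glob() function, which takes the same pattern as a string. The following is a small sketch that runs in a scratch directory; the file names are invented for illustration.

```perl
#!/usr/bin/perl
# glob-function.pl (hypothetical example name)
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so the example doesn't depend on
# whatever files happen to be in the current directory.
my $start   = getcwd();
my $scratch = tempdir(CLEANUP => 1);
chdir $scratch or die "Can't change directory to $scratch: $!";

# Create a few empty files; the names are made up for illustration.
foreach my $name (qw(one.pl two.pl data.dat notes.txt)) {
    open(my $fh, '>', $name) or die "Can't create $name: $!";
    close $fh;
}

# glob() takes the same pattern string that goes between < and >.
my @perl_files = glob('*.pl');
my @pl_and_dat = glob('*.pl *.dat');

chdir $start or die "Can't return to $start: $!";

print "Perl files: @perl_files\n";
print "Perl and data files: @pl_and_dat\n";
```

Calling glob() by name can be easier to read when the pattern is held in a variable.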

This program lists the contents of the current directory and uses file tests to examine each file:

#!/usr/bin/perl
# directory-glob.pl

use strict;

print "Contents of the current directory:\n";
foreach (<*>) {
    next if $_ eq "." or $_ eq "..";
    print $_, " " x (30 - length($_));
    print "d" if -d $_;
    print "r" if -r _;
    print "w" if -w _;
    print "x" if -x _;
    print "o" if -o _;
    print "\t";
    print -s _ if -r _ and -f _;
    print "\n";
}

The first thing we see after a friendly print() is

foreach (<*>) {

This loops over each filename returned by <*>, which matches the files in the current directory. Each filename is placed in $_ in turn. Then we check to see if it is either . or .., special directories in DOS and Unix that refer to the current and parent directories respectively. We skip these in our program:

  next if $_ eq "." or $_ eq "..";

We then print out the name of each file, followed by some spaces. The length of the filename plus the number of spaces will always add up to 30, so we have nicely arranged columns.

  print $_, " " x (30 - length($_));

First we test to see if the file is a directory using the file tests we saw in Chapter 8:

  print "d" if -d $_;

Then we test to see if the file is readable, writable, executable, and owned by us:

  print "r" if -r _ ;
  print "w" if -w _;
  print "x" if -x _;
  print "o" if -o _;

No, this isn’t a typo: we do mean _ and not $_ here. Just as $_ is the default value for some operations, such as print() , _ is the default filehandle for file tests. It actually refers to the last file explicitly tested. Since we tested $_ previously, we can use _ for as long as we’re referring to the same file.


Note  When Perl does a file test, it actually looks up all the data at once—ownership, readability, writability, and so on; this is called a stat of the file. _ tells Perl not to do another stat, but to use the data from the previous one. As such, it’s more efficient than stat-ing the file each time.


Finally, we print out the file’s size—this is only possible if we can read the file, and only useful if it is a regular file:

  print -s _ if -r _ and -f _;

Executing this code produces the following:

$ perl directory-glob.pl
Contents of the current directory:

a.dat                         rwo     20
addsizes.pl                   rwo     242
b.dat                         rwo     20
backquote.pl                  rwo     297
dir1                          drwxo
directory-dir.pl              rwo     460
directory-glob.pl             rwo     371
links.pl                      rwo     316
os.pl                         rwo     1049
system.pl                     rwo     132
whereis.pl                    rwo     334
$

The number at the end of the line is the size of the file in bytes; as for the letters, “d” shows that this is a directory, “r” stands for readable, “w” for writable, “x” for executable, and “o” shows that the user that is running the program is the owner.
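
The idea of testing a file once and then reusing the cached stat through _ can be packaged in a small helper. This is only a sketch: flags_for() is a hypothetical name, and it returns the flag letters as a string instead of printing them.

```perl
use strict;
use warnings;

# Build a flag string such as "drwxo" for one path, stat-ing it only once.
sub flags_for {
    my ($path) = @_;
    return '' unless -e $path;      # the one real stat; fills the _ cache
    my $flags = '';
    $flags .= 'd' if -d _;
    $flags .= 'r' if -r _;
    $flags .= 'w' if -w _;
    $flags .= 'x' if -x _;
    $flags .= 'o' if -o _;
    return $flags;
}

print ". has flags: ", flags_for('.'), "\n";
```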

Reading Directories

Directories can be treated kind of like files—we can open them and read from them. Instead of using open() and a filehandle, which are used with files, we use opendir() and a directory handle:  

opendir DH, "." or die "Couldn't open the current directory: $!";

To read each file in the directory, we use readdir() on the directory handle.

Previously, we saw directory-glob.pl , a program to perform file tests on files that we obtained from a glob. In the spirit of TMTOWTDI, let’s do the same action using a directory handle instead of a file glob:

#!/usr/bin/perl
# directory-dir.pl

use strict;

print "Contents of the current directory:\n";
opendir DH, "." or die "Couldn't open the current directory: $!";
while ($_ = readdir(DH)) {
    next if $_ eq "." or $_ eq "..";
    print $_, " " x (30 - length($_));
    print "d" if -d $_;
    print "r" if -r _;
    print "w" if -w _;
    print "x" if -x _;
    print "o" if -o _;
    print "\t";
    print -s _ if -r _ and -f _;
    print "\n";
}
closedir DH;

The only changes from the previous program are these two lines:

opendir DH, "." or die "Couldn't open the current directory: $!";
while ($_ = readdir(DH)) {

and this line:

closedir DH;

The current directory, . , is opened. Then we read from the directory with readdir() , and as long as we have a filename, we perform the same tests as before. After we are all finished with the files, we close the directory handle. This program produces the same result as directory-glob.pl :

$ perl directory-dir.pl
Contents of the current directory:
a.dat                         rwo      20
addsizes.pl                   rwo      242
b.dat                         rwo      20
backquote.pl                  rwo      297
dir1                          drwxo
directory-dir.pl              rwo      460
directory-glob.pl             rwo      371
links.pl                      rwo      316
os.pl                         rwo      1049
system.pl                     rwo      132
whereis.pl                    rwo      334
$
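
One caveat with a loop of the form while ($_ = readdir(DH)) is that a file literally named 0 is a false value and would end the loop early. The sketch below shows one way around that, using a lexical directory handle and an explicit defined() test; list_dir() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the sorted names in a directory, leaving out . and ..
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir: $!";
    my @names;
    # defined() keeps the loop going even for a file named "0",
    # which would otherwise look false and end the loop early.
    while (defined(my $name = readdir($dh))) {
        next if $name eq '.' or $name eq '..';
        push @names, $name;
    }
    closedir $dh;
    my @sorted = sort @names;
    return @sorted;
}

print "$_\n" foreach list_dir('.');
```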

Functions to Work with Files and Directories

Perl provides many built-in functions to perform operating system actions on files and directories. Let’s look at a few of them.  

The chdir() Function

To change directories within a Perl script, use the chdir() function. Its syntax is

chdir(directory)

This function attempts to change directories to the directory passed as its argument (defaulting to $ENV{HOME}). If it successfully changes directories, it returns true; otherwise it returns false.


Note  chdir() changes the working directory in the script. This has no effect on the shell in which the script is invoked—when the script exits the user will be in whatever directory they were in when they executed the program.


The fact that this function returns true on success or false on failure can be very helpful to us. We should always check the return value and respond appropriately if the directory change failed. For instance, this code attempts to change directory and die() s if we couldn’t make the change:

chdir '/usr/local/src' or die "Can't change directory to /usr/local/src: $!";

Recall that $! is a variable that contains the error string of whatever just went wrong.
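
A script that changes directory temporarily often wants to return to where it started. Here is a minimal sketch of that pattern using getcwd() from the core Cwd module; the scratch directory and the nonexistent subdirectory name are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start  = getcwd();                 # remember where we began
my $target = tempdir(CLEANUP => 1);    # scratch directory for the demo

chdir $target or die "Can't change directory to $target: $!";
my $inside = getcwd();
print "now in $inside\n";

# A failed chdir returns false and leaves the working directory alone.
my $ok = chdir "$target/no-such-subdirectory";
print "second chdir failed as expected: $!\n" unless $ok;

chdir $start or die "Can't return to $start: $!";
```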

The unlink() Function

The unlink() function deletes files from disk. Its syntax is

unlink(list_of_files)

This function removes the files from disk. It returns the number of files successfully deleted, so the result is false only when nothing was removed. This function acts like the Unix rm command and the Windows del command. Here is an example:

unlink 'file1.txt', 'file2.txt' or warn "Can't remove files: $!";
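
Because unlink() reports how many files it deleted, comparing that count with the number requested catches a partial failure. The sketch below assumes made-up file names in a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my @files = map { "$dir/file$_.txt" } 1 .. 2;   # made-up file names

foreach my $file (@files) {
    open(my $fh, '>', $file) or die "Can't create $file: $!";
    close $fh;
}

# Ask for three deletions, one of which names a file that doesn't exist.
my @wanted  = (@files, "$dir/missing.txt");
my $removed = unlink @wanted;

warn "Only removed $removed of ", scalar(@wanted), " files\n"
    if $removed != @wanted;
```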

The rename() Function

The rename() function renames one file to a new name. Its syntax is

rename(old_file_name, new_file_name)

This function renames the old file to the new name. It returns true if successful, false if not. This function acts like the Unix mv command and the Windows ren command. Here is an example:

rename 'old.txt', 'new.txt' or warn "Can't rename file: $!";

Note that you can also move a file with this function (like the mv command in Unix and move command in Windows):

rename 'olddir/old.txt', 'newdir/new.txt' or warn "Can't move file: $!";
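
rename() usually cannot move a file from one filesystem to another. One commonly used alternative is the move() function from the core File::Copy module, which falls back to copying and deleting when a plain rename is not possible. A small sketch, with scratch directories standing in for the old and new directories:

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Two scratch directories stand in for the old and new directories.
my $olddir = tempdir(CLEANUP => 1);
my $newdir = tempdir(CLEANUP => 1);

open(my $fh, '>', "$olddir/old.txt") or die "Can't create file: $!";
print $fh "some data\n";
close $fh;

# move() tries a rename first and copies then deletes if that fails.
move("$olddir/old.txt", "$newdir/new.txt") or die "Can't move file: $!";

print "moved ok\n" if -e "$newdir/new.txt" and not -e "$olddir/old.txt";
```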

The link(), symlink(), and readlink() Functions

These functions allow us to work with hard and soft links. These functions are Unix-centric—they don’t function the same in the Windows world, so it is suggested you avoid using them there.

The link() function creates a hard link. Its syntax is

link(file_to_link_to, link_name)

The symlink() function creates a symbolic link. Its syntax is

symlink(file_to_link_to, sym_link_name)

To find out what file a symbolic link points to, use the readlink() function:

readlink(sym_link_name)

Here is an example of creating a soft link in Perl and finding out the name of the file to which it links:

#!/usr/bin/perl -w
# links.pl

use strict;

my $filetolink = 'links.pl';
my $linkname   = 'linktolinks.pl';

symlink($filetolink, $linkname) or die "link creation failed: $!";

print "link created ok!\n";

my $readlinkresult = readlink($linkname);
print "$linkname is a sym link to $readlinkresult\n";

Here is an example of executing this code. Note that the link doesn’t exist before the code is executed:

$ ls -l link*
-rw-r--r--   1 jdoe  users  349 22 Apr 14:05 links.pl
$ perl links.pl
link created ok!
linktolinks.pl is a sym link to links.pl
$ ls -l link*
-rw-r--r--   1 jdoe  users  349 22 Apr 14:05 links.pl
lrwxr-xr-x   1 jdoe  users    8 22 Apr 14:06 linktolinks.pl -> links.pl
$
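
On a platform that does not implement symbolic links, calling symlink() raises an exception, so a portable script can probe for support inside eval first. The following is a hedged sketch of that check; the link is created in a scratch directory, and the target name links.pl is reused purely for illustration (a symbolic link may point at a name that does not exist).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# symlink() raises an exception on platforms that don't implement it,
# so probe for it inside eval before relying on it.
my $can_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

my $result = '';
if ($can_symlink) {
    my $dir  = tempdir(CLEANUP => 1);
    my $link = "$dir/linktolinks.pl";
    if (symlink('links.pl', $link)) {
        $result = readlink($link);
        print "$link is a sym link to $result\n";
    } else {
        warn "link creation failed: $!\n";
    }
} else {
    print "no symbolic links on this platform\n";
}
```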

The mkdir() and rmdir() Functions

The mkdir() function makes a directory. Its syntax is

mkdir(directory_name, mode)

This function creates directory_name. It returns true on success, false on failure. The mode, or permissions, is applied to the directory (possibly modified by the umask). Note that the mode should be written as an octal number by preceding it with a 0, since Unix interprets the numeric representation of the mode as an octal value.

Here is an example of mkdir() that creates the directory newdir in the current directory with the permissions of 751 (in the Unix world, this looks like rwxr-x--x):

mkdir('newdir', 0751) or die $!;

As usual, we are handling the failure of this function—in this case we are die() ing.

The rmdir() removes an empty directory. It returns true on success, false on failure. Its syntax is

rmdir(directory_name)
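
To illustrate how the umask modifies the requested mode, here is a sketch that assumes a Unix system. It sets the umask to 0022, asks for mode 0777, and reads back what the directory actually received; with those values the result should be 0755.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $parent = tempdir(CLEANUP => 1);
my $newdir = "$parent/newdir";

my $old_umask = umask 0022;            # mask off group and other write bits
mkdir($newdir, 0777) or die "failed to make directory $newdir: $!";
umask $old_umask;                      # put the previous umask back

# stat's third field holds the mode; keep only the permission bits.
my $mode = (stat $newdir)[2] & 0777;
printf "requested 0777, got %04o\n", $mode;

rmdir($newdir) or die "failed to remove directory $newdir: $!";
```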

The chmod() Function

Speaking of permissions, the chmod() function changes the mode, or permissions, on a file or directory. Its syntax is

chmod(mode, list_of_files_or_directories)

Note that the mode comes first. Again, the mode should be written as an octal number since that is how Unix interprets it. The function returns the number of files successfully changed. This changes the mode of the file resume.txt to be readable and writable only by the owner of the file, die()ing if the chmod fails:

chmod(0600, 'resume.txt') or die $!;
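
Perl's chmod() takes the mode as its first argument, followed by the files to change. As a short sketch of why the leading 0 matters, the example below applies 0600 to a temporary file (standing in for resume.txt) and prints the resulting mode in octal.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);   # stands in for resume.txt
close $fh;

# Mode first, then the list of files; the return value is the
# number of files whose mode was changed.
my $changed = chmod 0600, $file;
die "chmod failed: $!" unless $changed == 1;

# A leading 0 matters: 600 (decimal) is not the same number as 0600 (octal).
printf "mode is now %04o\n", (stat $file)[2] & 07777;
printf "0600 octal is %d decimal\n", 0600;
```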

An Example

Here is an example program using a bunch of these functions. The comments describe what is going on:

#!/usr/bin/perl -w
# os.pl

use strict;

# first prompt the user for a directory name and attempt
# to create the directory in the current directory
print "please enter a directory name: ";
chomp(my $dir = <STDIN>);

mkdir $dir, 0777 or die "failed to make directory $dir: $!\n";
print "made the directory $dir ok!\n";

# so far so good - now, change directory into the
# directory
chdir $dir or die "failed to change into $dir: $!\n";
print "changed into $dir ok!\n";

# ok, now move the file ../a.dat into this new directory
# giving it a new name
print "enter new file name: ";
chomp(my $newname = <STDIN>);
rename "../a.dat", $newname or die "rename failed: $!\n";
print "file moved successfully!\n";

# list the contents of the directory
# using a directory handle
print "contents of the new directory:\n";
opendir DH, '.' or die "opendir failed: $!";
my $filename;
while ($filename = readdir(DH)) {
    print "    $filename\n";
}
closedir DH;

# that's it, say goodbye
print "we are all done… goodbye!\n";

Here is what happens when it is executed on a Unix system:

$ perl os.pl
please enter a directory name: newdir
made the directory newdir ok!
changed into newdir ok!
enter new file name: new.dat
file moved successfully!
contents of the new directory:
    .
    ..
    new.dat
we are all done… goodbye!
$

Executing External Programs

There are times when we want our Perl program to execute external programs such as another Perl script, shell commands (like ls and dir ), or other programs or applications.  

There are several ways to execute other programs from within a Perl script. We have already seen one way: opening pipes with the open() function discussed in Chapter 8. In this chapter we will discuss two other ways: the system() function and backquotes.

The system() Function

The system() function takes an argument and executes that argument as if it were entered into a shell. If the command produces any standard output, system() allows it to go to standard output. Its syntax is

system(command)

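It returns the error status of command. In Unix and Windows, the error status is a way for a program to report back to whoever invoked it, informing the calling program or shell whether or not the program executed correctly. By convention, when all is well, the error status is 0. If there was a problem, the program returns a nonzero value (such as 1 or 255). Strictly speaking, the value Perl hands back is the status word from the underlying wait call, so a program that exits with status 1 makes system() return 256; shifting the value right by eight bits recovers the program's own exit code.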


THINK TWICE BEFORE YOU USE SYSTEM()

The system() function can perform all sorts of operating system commands such as making directories, copying files, moving files, etc. For instance, in Unix we could execute

system "rm a.dat";  # delete the file a.dat in Unix


instead of

unlink 'a.dat';


There are two main reasons not to use the system() function instead of the unlink() function to remove a file. First, passing "rm a.dat" to the system() function as shown previously works fine in Unix, but not in Windows (in Windows we would use the del command). Therefore, in many cases, the system() function is not portable between operating systems, while the unlink() function is portable.

Second, the unlink() function is named unlink() because it calls the low-level operating system library function named unlink(), immediately removing the file. The system() function, on the other hand, creates a shell. The shell is a big program that must start up, reading various configuration files. The shell is then passed the argument to the system() function as if a user typed it into the shell. The shell then parses the string, determines that the user wants to remove a file, and calls the low-level operating system function named unlink(). So, you can call the unlink() function yourself using the Perl function named unlink(), or you can start up a big program that does a lot of work before finally calling the low-level operating system unlink() function.

A shell is also created when using one of these two methods of executing an external program: backquotes, and opening pipes with open(), so keep this in mind when deciding between built-in functions such as unlink() and rename() and using another Perl mechanism to perform operating system actions.

Another important note: the program system.pl displays the current date using the system() function:

my $error_status = system 'date';

This starts a separate process just to run date, which is an inefficient way of determining the date on the machine. A better way is to use the localtime() function in scalar context:

print scalar(localtime), "\n";

A rule of thumb is this: most actions that you want to take in Perl are implemented in the language in a way that does not require launching a shell. Mentioning every feature of Perl is not the intent of this book, so we will not discuss all the different ways of doing the same thing.[1] But a little bit of searching on your part may uncover an efficient, cool way of taking action in Perl without going out to the shell, so get in the habit of looking deeper into this language when you are trying to do something new.[2]


1. Remember TMTOWTDI? Divining how many is left as an exercise to the reader.

2. www.perl.com, www.perlmonks.org, www.google.com, and perldoc are our friends.

Here is an example program that executes the date command—its job is to print to standard output the date in a readable format. The return from system() is stored in a variable and then printed to standard output.

#!/usr/bin/perl -w
# system.pl

use strict;

my $error_status = system 'date';

print "system() returned: $error_status\n";

Executing this program might produce the following:

$ perl system.pl
Fri Apr 23 07:17:31 CDT 2004
system() returned: 0
$
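
The value that system() returns packs more than the exit code into one number. The sketch below shows one conventional way to take it apart; it runs a child perl that exits with status 3 (an arbitrary value chosen for the demonstration) and passes a list to system() so that the program is run directly.

```perl
use strict;
use warnings;

# Run a child perl that exits with status 3. Passing a list to system()
# runs the program directly, without a shell in between.
my $raw = system($^X, '-e', 'exit 3');

if ($raw == -1) {
    print "failed to start: $!\n";
} elsif ($raw & 127) {
    printf "child died from signal %d\n", $raw & 127;
} else {
    printf "child exited with status %d\n", $raw >> 8;
}
```

The same decoding applies to the special variable $? after backquotes or a closed pipe.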

Backquotes

The system() function prints the output of its argument to the screen. Sometimes, however, we want to capture the output and bring it into our program. The backquotes allow us to do just that. Here is the syntax:

`command`

That is the backquote (aka the grave character), not the single quote character.

The backquotes execute the operating system command, capturing and returning its standard output, if any. The error status is available in the special variable $? . The backquotes can be read in either scalar context or in list context:

$output = `$command`;
@output = `$command`;

In scalar context, the entire output including newline characters is returned as a single string (here stored in $output ). In list context, the entire output is returned as a list, newlines included; each line of output is a single element in the list (here stored in @output ).
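
The difference between the two contexts can be seen with a command whose output is known in advance. In this sketch the command is a child perl that prints the numbers 1 to 3, one per line; it assumes the path in $^X contains no spaces.

```perl
use strict;
use warnings;

# $^X is the path to the running perl; the child prints 1, 2 and 3,
# each on its own line. (Assumes that path contains no spaces.)
my $command = qq{$^X -le "print for 1..3"};

my $output = `$command`;    # scalar context: one string
my @output = `$command`;    # list context: one element per line

print "scalar context returned ", length($output), " characters\n";
print "list context returned ", scalar(@output), " lines\n";
print "exit status: ", $? >> 8, "\n";
```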

Here is an example that executes the program directory-dir.pl that we discussed previously in this chapter and adds up all the sizes of the files:

#!/usr/bin/perl -w
# addsizes.pl

use strict;

my @result = `perl directory-dir.pl`;
my $size = 0;

foreach (@result) {
    if (/^.{30}[drwox]*\t(\d+)$/) {
        $size += $1;
    }
}

print "The total size of all files: $size\n";

First, we execute the script directory-dir.pl and capture the output of the backquotes in list context. This means that @result will be an array and each element is an individual line of output from the script:

my @result = `perl directory-dir.pl`;

Then, the size is initialized to 0:

my $size = 0;

Now it is time to examine the output:

foreach (@result) {
    if (/^.{30}[drwox]*\t(\d+)$/) {
        $size += $1;
    }
}

The foreach loops through each line of output. If the line matches the pattern that includes a size (that is, the \d+), then we use the parentheses to extract the size into $1. The size is added to $size.

Executing this program produces the following:

$ perl addsizes.pl
The total size of all files: 3241
$
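
In keeping with the earlier advice about preferring built-in functions, the same total can be computed without running another program at all. This is a sketch of that alternative, not a replacement for addsizes.pl; total_size() is a hypothetical helper that counts only readable regular files, as directory-dir.pl does.

```perl
use strict;
use warnings;

# Add up the sizes of the readable regular files in a directory.
sub total_size {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir: $!";
    my $total = 0;
    while (defined(my $name = readdir($dh))) {
        my $path = "$dir/$name";
        # -f does the stat; -r and -s then reuse it through _
        $total += (-s _) || 0 if -f $path and -r _;
    }
    closedir $dh;
    return $total;
}

print "The total size of all files: ", total_size('.'), "\n";
```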

There’s More

There are many other ways that Perl interfaces to the operating system; we’ve only covered the basics here. There are dozens of built-in functions available for all sorts of system administration tasks (see perldoc perlfunc for a list). Other operating system tasks that Perl can perform include creating child processes (with fork()), sending signals to processes (with kill()), performing low-level file I/O (with sysread() and syswrite()), reading password information (with getpwent() and others), and many more.

Summary

In this chapter, we have discussed several ways of performing operating system actions from within a Perl script. These include file globs, executing built-in functions such as mkdir() and rename() , and executing operating system commands with system() and backquotes.  

Exercises

  1. Write a program that takes two arguments: a directory and an integer. Change into the directory that is the first argument and list all the files that have a size greater than or equal to the second argument. First use a glob and then use a directory handle.
  2. Automate a task that you perform on a regular basis.

 
