Basic Input and Output – I/O
Input refers to getting information into your program.
Output refers to getting information out of your program.
I/O is how computer programs talk to the rest of the “world”
For example, you may want to:
• Open a file and read its contents
• Write your results to a file
• Ask the user of the program to supply information
Every Perl script starts with three connections to the outside world
One of these is for input to your program, and two are for output from your program
Input – By default, Perl has a connection set up for taking information entered from the keyboard.
This connection is referred to as STDIN
Output – 1) By default, Perl has a connection set up for writing data out to your terminal (screen).
This connection is referred to as STDOUT
2) By default, Perl has a connection set up for writing diagnostic messages (warnings, etc.) to your terminal.
This connection is referred to as STDERR
You can change the default locations for STDIN, STDOUT and STDERR
Reading data from STDIN
aka: how to type in information and get your program to listen
To read a line of data into your program from the keyboard, use the angle bracket operator <> on STDIN:
$line = <STDIN>;
<STDIN> reads one line of input from the keyboard and this line is “fed into” the variable $line.
In computer speak: <STDIN> reads a line of input from standard input and returns it as its result. This result is then assigned to the scalar variable $line.
Try this:
#!/usr/bin/perl -w
print "Please enter your name: ";
$name = <STDIN>;
print "Hello $name. Glad to meet you.";
When you run the script, does the greeting look like what you wanted?
Newlines and STDIN
When you entered your name, you pressed the Return key
Perl takes both your name and the newline produced by the Return key into the script.
Thus, in this version of the script, $name contains your name followed by a return, also known as a newline.
So your greeting probably looked something like:
Hello Bob
. Glad to meet you.
Also, Perl was not told to print a newline at the end of your greeting, so the greeting ran straight into the prompt that followed it.
Sorting out the newlines
Now try:
#!/usr/bin/perl -w
print "Please enter your name: ";
$name = <STDIN>;
chomp ($name);
print "Hello $name. Glad to meet you.\n";
Save and run this script.
Chomp and \n
#!/usr/bin/perl -w
print "Please enter your name: ";
$name = <STDIN>;   # at this point $name still has the newline at the end
chomp ($name);     # chomp removes the final newline from the variable
print "Hello $name. Glad to meet you.\n";
# \n adds a newline to the end of your greeting
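The effect of chomp can be checked without typing anything. This is a minimal sketch in which the string "Bob\n" stands in for what <STDIN> would return; the sub name greeting is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the greeting from a raw input line, as <STDIN> would return it.
# (greeting is a hypothetical helper, not part of the original script.)
sub greeting {
    my ($raw) = @_;
    chomp(my $name = $raw);    # remove one trailing newline, if there is one
    return "Hello $name. Glad to meet you.\n";
}

my $line = "Bob\n";            # simulated keyboard input (assumption)
print length($line), "\n";     # 4: three letters plus the newline
print greeting($line);         # Hello Bob. Glad to meet you.
print greeting("Bob");         # chomp is harmless when there is no newline
```

chomp removes only a trailing newline, so calling it on a string that has none leaves the string unchanged.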
Newlines and STDIN
\n – this means newline
There are other useful characters like this. For example:
\t – tab
\b – backspace
\a – alarm (bell)
print "Hello $name. Glad to meet you.\n";
print "Hello $name. Glad to meet you.\n\n";
print "Hello $name. Glad to meet you.\n\n\n";
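Each of these escapes stands for a single character inside a double-quoted string. A small sketch, using a made-up string, that counts them:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "one\ttwo\n\n";             # made-up string: a tab and two newlines

print $text;                           # prints: one<tab>two, then a blank line

my $newlines = ($text =~ tr/\n//);     # tr/// with nothing to replace just counts
print "characters: ", length($text), "\n";   # characters: 9
print "newlines: $newlines\n";               # newlines: 2
```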
STDOUT – getting stuff to the screen
This statement:
print "Hello $name. Glad to meet you.\n";
resulted in a greeting being printed to your screen.
But you never explicitly told Perl to send it there.
Recall: By default, Perl has a connection set up for writing data out to your terminal (screen) - STDOUT
The print command sends information to STDOUT by default.
And by default, STDOUT is connected to your terminal.
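The filehandle can also be named explicitly in a print statement. The sketch below sends two lines to STDOUT and one to STDERR; the messages are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

print "This goes to STDOUT by default.\n";
print STDOUT "This goes to STDOUT because we asked for it.\n";
print STDERR "This diagnostic message goes to STDERR.\n";
```

If the script is saved under a hypothetical name such as handles.pl and run as ./handles.pl > out.txt, the first two lines should land in out.txt while the third still appears on the terminal, because the shell's > redirects only STDOUT.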
Reading from and writing to files
STDIN, STDOUT and STDERR are known as Filehandles.
Filehandles are connections.
You can set up connections to locations other than the keyboard or terminal by setting up your own Filehandles.
To do this, you need to tell Perl a few things:
• what you want a connection to, e.g. the name of a file
• what you want to do, e.g. write to a file, read from a file, or both
• what you want to call the connection when you refer to it in your program, i.e. the name of the Filehandle
To open a connection to read from a file:
open(FILECON, "/home/btiwari/myfile.txt");
• what do you want a connection to: /home/btiwari/myfile.txt
• what do you want to do: read from the file
• what do you want to call the connection when you refer to it in your program: FILECON
By convention, Filehandle names are written in capital letters.
To open a connection to read from a file:
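open(FILECON, "/home/btiwari/myfile.txt");
To open a connection to create and write to a file (an existing file of that name is emptied first):
open(FILECON, ">/home/btiwari/anotherfile.txt");
To open a connection to append to a file:
open(FILECON, ">>/home/btiwari/anotherfile.txt");
To open a connection to an existing file you wish to read from and write to:
open(FILECON, "+</home/btiwari/myfile.txt");
Note: +> also opens a file for reading and writing, but it empties the file first, so there would be nothing left to read.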
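The write, append and read modes can be tried safely in a scratch directory. This is a minimal sketch, assuming the core module File::Temp is available; the file name and line contents are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);     # scratch directory, deleted when the script exits
my $file = "$dir/anotherfile.txt";    # hypothetical file used only for this demo

# ">" creates the file (or empties an existing one) and writes to it
open(FILECON, ">$file") or die "I can't create $file: $!";
print FILECON "first line\n";
close(FILECON);

# ">>" adds to the end of the file without touching what is already there
open(FILECON, ">>$file") or die "I can't append to $file: $!";
print FILECON "second line\n";
close(FILECON);

# no prefix means read
open(FILECON, $file) or die "I can't read $file: $!";
my @lines = <FILECON>;                # in list context, <> returns every line
close(FILECON);

print scalar(@lines), " lines read\n";   # 2 lines read
print @lines;
```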
Overview of accessing files:
• Open the file to read from
• Open a file to write to
• (Carry out processing)
• Close the file you are writing to
• Close the file you are reading from
#!/usr/bin/perl -w
open(FROMFILE, "/home/btiwari/infile.txt");
open(TOFILE, ">/home/btiwari/outfile.txt");
# Process Process Process
close(TOFILE);
close(FROMFILE);
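As one possible way to fill in the processing step, the sketch below copies a file line by line and upper-cases each line on the way. The scratch files, their contents, and the choice of upper-casing are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);   # scratch directory, deleted at exit
my $infile  = "$dir/infile.txt";       # hypothetical input file
my $outfile = "$dir/outfile.txt";      # hypothetical output file

# Create a small input file so the example can run on its own
open(SETUP, ">$infile") or die "I can't create $infile: $!";
print SETUP "acgt\nttga\n";
close(SETUP);

open(FROMFILE, $infile)     or die "I can't open $infile: $!";
open(TOFILE,   ">$outfile") or die "I can't open $outfile: $!";
while (my $line = <FROMFILE>) {
    print TOFILE uc($line);            # the processing step: upper-case each line
}
close(TOFILE);
close(FROMFILE);

# Read the result back to check it
open(CHECK, $outfile) or die "I can't open $outfile: $!";
my @result = <CHECK>;
close(CHECK);
print @result;                         # prints ACGT and TTGA on separate lines
```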
An example to try (seqlength.pl):
#!/usr/bin/perl -w
use strict;
my $count = 0; # declare variable $count
open(SEQ, "x52524.tfa");
open(OUTFILE, ">outfile.txt");
while (<SEQ>) {
    if (/^>/) {
        print OUTFILE $_;
    }
    else {
        chomp $_; # get rid of the newline
        $count += length($_);
    }
}
print OUTFILE "The length of the above sequence is $count bases.\n";
close(OUTFILE);
close(SEQ);
Now the niceties – explaining seqlength.pl
#!/usr/bin/perl -w
use strict;
my $count;
open(SEQ, "x52524.tfa") or die "I can't open your file: $!";
open(OUTFILE, ">outfile.txt") or die "I can't open your file: $!";
while (<SEQ>) {
    if (/^>/) {
        print OUTFILE; # where is $_
    }
    else {
        chomp; # where is $_
        $count += length(); # where is $_
    }
}
print OUTFILE "The length of the above sequence is $count bases.\n";
close(OUTFILE);
close(SEQ);
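The answer to "where is $_" is that print, chomp and length all work on the default variable $_ when they are given no argument, and while (<SEQ>) puts each line into $_. The sketch below shows the same behaviour with a foreach loop over made-up lines, so it runs without a sequence file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up lines standing in for the contents of a sequence file
my @lines = (">demo made-up title line\n", "ACGT\n", "GG\n");

my $count = 0;
foreach (@lines) {             # each element is placed in $_ in turn
    if (/^>/) {                # the match is tested against $_
        print;                 # same as: print $_;
    }
    else {
        chomp;                 # same as: chomp($_);
        $count += length();    # same as: length($_)
    }
}
print "$count\n";              # 6
```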
Remember this?
Reading data from STDIN
#!/usr/bin/perl -w
print "Please enter your name: ";
$name = <STDIN>;
print "Hello $name. Glad to meet you.";
Why not ask the user for what files they want to process?
More niceties – ask the user for the filenames (seqlength2.pl)
#!/usr/bin/perl -w
use strict;
my $count;
print "\nWhich sequence do you want to process: ";
chomp (my $infile = <STDIN>);
print "\nWhat file should I write the results to: ";
chomp (my $outfile = <STDIN>);
open(SEQ, $infile) or die "I can't open $infile: $!";
open(OUTFILE, ">$outfile") or die "I can't open $outfile: $!";
while (<SEQ>) {
    if (/^>/) {
        print OUTFILE;
    } else {
        chomp;
        $count += length();
    }
}
print OUTFILE "The length of the above sequence is $count bases.\n";
close(OUTFILE);
close(SEQ);
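One caution when the filename comes from the user: with the two-argument form of open, a name that starts with a character such as > is treated as part of the mode. Perl also has a three-argument form of open that keeps the mode separate from the name. The sketch below uses that form with a lexical filehandle ($seq), which is a different style from the capital-letter Filehandles used so far; the sub name count_bases and the test data are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count the sequence characters in a file, skipping title lines.
# (count_bases is a hypothetical helper written for this example.)
sub count_bases {
    my ($infile) = @_;
    my $count = 0;
    # Three-argument open: the mode '<' is given separately from the name
    open(my $seq, '<', $infile) or die "I can't open $infile: $!";
    while (<$seq>) {
        next if /^>/;          # skip title lines
        chomp;
        $count += length();
    }
    close($seq);
    return $count;
}

# Made-up test data written to a temporary file
my ($fh, $tmpname) = tempfile(UNLINK => 1);
print $fh ">demo made-up title line\nACGT\nGG\n";
close($fh);

print count_bases($tmpname), "\n";   # 6
```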
Even more niceties – the magic of <>
Up until now, you have seen <FILEHANDLE>
But <> without an explicit filehandle is “magical”:
It reads from each file listed on the command line as if it were one single large file.
If no files are given on the command line, it reads from STDIN.
Try typing:
./seqlength3.pl x52524.tfa
#!/usr/bin/perl -w
use strict;
my $count;
print "\nWhat file should I write the results to: ";
chomp (my $outfile = <STDIN>);
open(OUTFILE, ">$outfile") or die "I can't open $outfile: $!";
while (<>) {
    if (/^>/) {
        print OUTFILE;
    } else {
        chomp;
        $count += length();
    }
}
print OUTFILE "The length of the above sequence is $count bases.\n";
close(OUTFILE);
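The files named on the command line are available to the script in the array @ARGV, and that is where <> looks. The sketch below fills @ARGV by hand with two scratch files, a stand-in for typing their names on the command line, to show <> reading them one after the other. The file names and contents are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);                    # scratch directory
my @files = ("$dir/first.txt", "$dir/second.txt");    # hypothetical file names

# Write one line into each file
my $n = 0;
foreach my $file (@files) {
    $n++;
    open(OUT, ">$file") or die "I can't create $file: $!";
    print OUT "line from file $n\n";
    close(OUT);
}

@ARGV = @files;    # pretend both names were typed on the command line

my @seen;
while (<>) {       # reads first.txt, then carries straight on into second.txt
    chomp;
    push @seen, $_;
}
print join(", ", @seen), "\n";   # line from file 1, line from file 2
```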
The magic of <> continued
<> reads from each file listed on the command line as if it were one single large file.
Try typing:
./seqlength4.pl x52524.tfa m83172.tfa
Look at the sequence files and the result file.
Open a new terminal and read the script seqlength4.pl.
Can you see why it does what it does?
How did it store and print out only the sequence names without the rest of the title line?
Why does the script now count the total number of bases in both sequences?
More on operators
You have already seen some of the many operators available in Perl
= used when assigning variables
+ used for addition
- used for subtraction
* used for multiplication
<> the filehandle operator
There are many others.
How does Perl know what to do first?
2+3*4; # this evaluates to 14: just like normal math, Perl multiplies before it adds
(2+3)*4; # this evaluates to 20: just like normal math, Perl attends to things in brackets first
But what about other operators like:
>> && || or and not ! split
There is lots of information about the precedence of different operators in Perl books.
There is a man page covering the topic in depth.
Try typing:
man perlop
Or see:
http://www.perl.com/doc/manual/html/pod/perlop.html
A short note on || and or, and && and and
|| means or, so why use “or”?
&& means and, so why use “and”?
1) More readable
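2) Rules of Precedence!
open SEQ, $infile || die "I can't open $infile: $!"; #bzzzt. Wrong!
open SEQ, $infile or die "I can't open $infile: $!"; #Yay!
|| binds more tightly than the comma, so the first line is read as open SEQ, ($infile || die ...). die would only run if $infile itself were false, never when open fails.
or has very low precedence, so in the second line it tests the result of the whole open.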
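The difference can be seen by trying both forms on a file that does not exist. In this sketch the path is hypothetical and assumed to be missing, and eval is used only to catch die so that the script can report what happened instead of stopping.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $missing = "/no/such/file.txt";   # hypothetical path, assumed not to exist

my $died_with_pipes = 0;
my $died_with_or    = 0;

# Read as: open SEQ, ($missing || die ...). $missing is a non-empty string,
# so it counts as true and die is never reached, even though open fails.
eval { open SEQ, $missing || die "I can't open $missing: $!"; 1 }
    or $died_with_pipes = 1;

# Read as: (open SEQ, $missing) or die ... so die runs when open fails.
eval { open SEQ, $missing or die "I can't open $missing: $!"; 1 }
    or $died_with_or = 1;

print "with ||: ", ($died_with_pipes ? "died" : "failure went unnoticed"), "\n";
print "with or: ", ($died_with_or    ? "died" : "failure went unnoticed"), "\n";
```

If you prefer ||, putting brackets around the arguments, as in open(SEQ, $infile) || die ..., also makes the test apply to the result of open.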