UNIX Shell Script

Contents

Introduction to Shell
Comments
Variables
Expressions in variables
Other forms of input to shell variables or commands in a script
Shell commands and control structures
Quoting
Functions
Command line arguments
Command substitution
Script Examples
How do you execute (run) a shell script?
Debugging
Shell Tips & Tricks
Reference


Shell Programming

Even though there are various graphical interfaces available for Linux, the shell is still a very useful tool. The shell is not just a collection of commands but a genuinely good programming language. You can automate a lot of tasks with it; the shell is very good for system administration tasks; you can very quickly try out whether your ideas work, which makes it very useful for simple prototyping; and it is very useful for small utilities that perform some relatively simple task where efficiency matters less than ease of configuration, maintenance and portability.

So let's see now how it works:

Creating a script

There are a lot of different shells available for Linux, but usually the bash (Bourne Again Shell) is used for shell programming, as it is available for free and is easy to use. So all the scripts we will write in this article use bash (but will most of the time also run with its older sister, the Bourne shell).

For writing our shell programs we can use any kind of text editor, e.g. nedit, kedit, emacs, vi, etc., as with other programming languages.

The program must start with the following line (it must be the first line in the file):

#!/bin/sh

The #! characters tell the system that the first argument that follows on the line is the program to be used to execute this file. In this case /bin/sh is the shell we use.

When you have written your script and saved it you have to make it executable to be able to use it.

To make a script executable type chmod +x filename

Then you can start your script by typing: ./filename

Comments

Comments in shell programming start with # and continue to the end of the line. We strongly recommend using comments. If you comment your scripts and then don't use a certain script for some time, you will still know immediately what it is doing and how it works.
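As a tiny sketch (the script body here is just an example):

```shell
#!/bin/sh
# greet -- print a short greeting.
# Everything from # to the end of the line is ignored by the shell.
echo "hello"   # a comment may also follow a command
```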


Variables

As in other programming languages you can't live without variables. In shell programming all variables have the datatype string and you do not need to declare them. To assign a value to a variable you write:

varname=value

To get the value back you just put a dollar sign in front of the variable:

#!/bin/sh
# assign a value:
a="hello world"
# now print the content of "a":
echo "A is:"
echo $a

Type these lines into your text editor and save the file, e.g. as first. Then make the script executable by typing chmod +x first in the shell, and start it by typing ./first. The script will just print:

A is:

hello world

Sometimes it is possible to confuse variable names with the rest of the text:

num=2

echo "this is the $numnd"

This will not print "this is the 2nd" but "this is the " because the shell searches for a variable called numnd which has no value. To tell the shell that we mean the variable num we have to use curly braces:

num=2

echo "this is the ${num}nd"

This prints what you want: this is the 2nd

There are a number of variables that are always automatically set. We will discuss them further down when we use them the first time.

If you need to handle mathematical expressions then you need to use programs such as expr (see table below).
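For example, a small sketch with expr (remember the blanks around the operator; the variable names are just examples):

```shell
#!/bin/sh
a=17
b=5
# expr prints the result of the calculation; backticks capture it:
sum=`expr $a + $b`
echo "sum is $sum"
```

This prints "sum is 22".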

Besides the normal shell variables that are only valid within the shell program, there are also environment variables. A variable preceded by the keyword export is an environment variable. We will not talk about them here any further since they are normally only used in login scripts.

Expressions involving variables

Expressions can be used in assigning values to new variables; as substitutions in command lines in the script; and in flow-of-control statements: if, foreach, while, and switch.


Expressions are always enclosed in a pair of parentheses ( and ). Note that you should put blanks (white space) around the parentheses and the operators within them to ensure proper parsing.

Arithmetic expressions

Standard arithmetic operators (+ - * /) can be used to operate on integer variable values. To save the results in another variable, however, you should use the @ command rather than the set command for the new variable. For example, you can increment a sum with the value of an argument like this:

@ sum = 0    # initialize
@ sum = ( $sum + $1 )

Note that there must be a space between the @ command and the name of the variable that it is setting.

The C-shell cannot do floating point (decimal) arithmetic, for example, 1.1 * 2.3.

However, you can invoke the bc calculator program from within a shell script to perform decimal arithmetic. Probably the simplest way to do that is to define an alias, called for example, "MATH", that performs decimal arithmetic via the bc program on the variables or constants passed to the alias. For example, you can define this alias in your script

# Set MATH alias - takes an arithmetic assignment statement
# as argument, e.g., newvar = var1 + var2
# Separate all items and operators in the expression with blanks
alias MATH 'set \!:1 = `echo "\!:3-$" | bc -l`'

This says that the word "MATH" will be replaced by a set command which will set the variable given as the first word following the alias equal to the result of a math calculation specified by the remainder of the words following the alias, starting with the third word (skips the = sign), which is piped to the bc program to perform.

For example, you can use this MATH alias in a shell script to multiply two floating point values like this:

set width = 20.8
set height = 5.4
MATH area = $width * $height
echo $area

Note to experts: normally, you would quote the asterisk character in a command line like the one shown above, so the shell does not try to interpret it as the filename wildcard matching character. But the alias expansion is done first, before filename matching, and as a result, the asterisk becomes protected by the quotes in the alias.


Logical expressions

These are usually used in if statements for conditional execution. Logical expressions always evaluate to either the value 1 if true, or the value 0 if false. The result of a logical expression can also be assigned to another variable with the @ operator.

These expressions test whether a variable equals another, or a constant value, or is greater than or less than another value. There are some special advanced tests, but the basic ones are indicated by the following operators, with a variable or constant on either side. To get a variable's value, you must, of course, use the $ operator. Character constants should be enclosed in a pair of quote characters. Use either single (') or double (") quotes, as long as you use the same type on both ends of the constant. If the variable being tested may contain blanks or other special shell characters as part of its value, then it must also be enclosed in quotes, but only double quotes can be used for variables, e.g., "$a".

( $a < $b )

Tests if the numeric value of variable a is less than the numeric value of variable b. You will get an error message if either variable does not contain a numeric string.

( $a <= 5 )

Tests if numeric value of variable "a" is less than or equal to numeric value 5. You will get an error message if the variable a does not contain a numeric string.

( $a > $b )

"Greater than" works the same as "less than". There is also a "greater than or equal" test, using the operator >=. Again, you will get an error message if either variable does not contain a numeric string.

( $a == 5 )

Tests if the numeric value of variable a is exactly equal to numeric value 5. Again, you will get an error message if the variable a does not contain a numeric string.

( "$c" == "yes" )

Tests if the string value of variable c is exactly equal to the character string constant yes.

Logical expressions can be modified by the "NOT" (!) operator or combined with the "OR" (||) or "AND" (&&) operators. Examples:

( ! "$c" == "yes" )

The logical NOT operator -- the exclamation point -- can be put at the front of any expression (inside the parentheses) to reverse the result of the expression. That is, whenever the expression would normally give a "true" result, it now gives a "false" result, and vice-versa. In this example, the overall expression is true whenever the value of the variable c is not equal to the constant string "yes".

( $a > 1 && $a < 5 )

Tests if the value of variable a is greater than numeric value 1 and also less than numeric value 5. Both cases have to be true for the expression to evaluate "true".

( $c == "yes" || $c == "YES" )

Tests if value of variable c is either the character string constant yes or the string constant YES (remember that upper and lower case are not the same). If it is equal to either one, then the expression evaluates "true".


Built-in logical tests for file existence

The C-shell has built in tests for existence or characteristics of a file. These tests are made with a special form of a logical expression in which a variable substitution or constant string is preceded by a one letter operator introduced with the minus sign. For example, the expression

( -f "junk" )

evaluates to "true" if there is an existing plain file (not a directory or special file) in the current directory with the name junk. Similarly, if the first shell script argument was the string subdir, then the expression

( -d $1 )

evaluates to "true" if subdir is the name of an existing directory (not a plain file) in the current directory.

The complete list of these operators, and what they test for, is:

-e  tests if a file exists (it can be any type of file)
-f  tests for an existing plain file
-d  tests for an existing directory
-z  tests for an existing plain file of zero size (that is, an empty file)
-r  tests if you have read access to a file
-w  tests if you have write access to a file
-x  tests if you have execute access to a file
-o  tests if you own a file

Pattern matching expressions

A more advanced feature allows you to test if a variable value matches a pattern instead of just whether it is an exact match to a character string. Read about this in the C-shell reference manual.

Other forms of input to shell variables or commands in a script

You can execute a command from the shell script and have it read the next few lines in the script as the command's standard input. You do this by redirecting the standard input with the special symbol << followed by a "marker" string. The shell reads the lines from the script following this command until it encounters a line that contains only the marker string. All the lines it has read (not counting the marker string line) are then fed to the command as its standard input.


Here is an example of a fragment of code from a shell script using this feature:

cat > newfile << EOF
This line from the script will be read by cat.
So will this line.
The next line has the 'EOF' marker string that terminates the cat input.
EOF

You can read a line of input from the standard input of the shell script (usually the terminal, but could be redirected from a file) and set it as the value of a variable with this syntax:

set variable=$<

This can be used to "prompt" for information from the user. First use the echo command to print a question; then read the answer typed by the user at the terminal with the syntax above.

Shell commands and control structures

There are three categories of commands which can be used in shell scripts:

1) UNIX commands:

Although a shell script can make use of any unix command, here are a number of commands which are used more often than others. These commands can generally be described as commands for file and text manipulation.

Command syntax          Purpose

echo "some text"        write some text on your screen
ls                      list files
wc -l file              count lines in file
wc -w file              count words in file
wc -c file              count characters in file
cp sourcefile destfile  copy sourcefile to destfile
mv oldname newname      rename or move a file
rm file                 delete a file
grep 'pattern' file     search for strings in a file
                        Example: grep 'searchstring' file.txt
cut -b colnum file      get data out of fixed-width columns of text
                        Example: get character positions 5 to 9:
                        cut -b5-9 file.txt
                        Do not confuse this command with "cat",
                        which is something totally different


file somefile           describe what type of file somefile is
read var                prompt the user for input and write it into a
                        variable (var)
sort file.txt           sort lines in file.txt
uniq                    remove duplicate lines; used in combination with
                        sort since uniq removes only duplicated
                        consecutive lines
                        Example: sort file.txt | uniq
expr                    do math in the shell
                        Example: add 2 and 3:
                        expr 2 "+" 3

find                    search for files
                        Example: search by name:
                        find . -name filename -print
                        This command has many different possibilities and
                        options; it is unfortunately too much to explain
                        it all in this article

tee                     write data to stdout (your screen) and to a file
                        Normally used like this:
                        somecommand | tee outfile
                        It writes the output of somecommand to the screen
                        and to the file outfile

basename file           return just the file name of a given name and
                        strip the directory path
                        Example: basename /bin/tux returns just tux
dirname file            return just the directory name of a given name
                        and strip the actual file name
                        Example: dirname /bin/tux returns just /bin

head file               print some lines from the beginning of a file
tail file               print some lines from the end of a file

sed                     sed is basically a find and replace program. It
                        reads text from standard input (e.g. from a pipe)
                        and writes the result to stdout (normally the
                        screen). The search pattern is a regular
                        expression (see references). This search pattern
                        should not be confused with shell wildcard syntax.

To replace the string linuxfocus with LinuxFocus in a text file use:

cat text.file | sed 's/linuxfocus/LinuxFocus/' > newtext.file

This replaces the first occurrence of the string linuxfocus in each line with LinuxFocus. If there are lines where linuxfocus appears several times and you want to replace all of them, use:

cat text.file | sed 's/linuxfocus/LinuxFocus/g' > newtext.file

awk                     Most of the time awk is used to extract fields
                        from a text line.


The default field separator is space. To specify a different one use the option -F.

cat file.txt | awk -F, '{print $1 "," $3 }'

Here we use the comma (,) as field separator and print the first and third ($1 and $3) columns. If file.txt has lines like:

Adam Bor, 34, India
Kerry Miller, 22, USA

then this will produce:

Adam Bor, India
Kerry Miller, USA

There is much more you can do with awk but this is a very common use.
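A quick sketch of this common use, with the default (whitespace) field separator; the sample text is just an example:

```shell
#!/bin/sh
# print the second field of the line; fields are split on whitespace
echo "alpha beta gamma" | awk '{print $2}'
```

This prints "beta".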

2) Concepts: Pipes, redirection and backtick:

They are not really commands but they are very important concepts.

Pipes : (|) send the output (stdout) of one program to the input (stdin) of another program.

grep "hello" file.txt | wc -l

finds the lines with the string hello in file.txt and then counts them. The output of the grep command is used as input for the wc command. You can concatenate as many commands as you like in that way (within reasonable limits).

Redirection: writes the output of a command to a file or appends data to a file.

>   writes output to a file and overwrites the old file in case it exists
>>  appends data to a file (or creates a new one if it doesn't exist already,
    but it never overwrites anything)
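A short sketch of both operators (/tmp/redir_demo.txt is just an example file name):

```shell
#!/bin/sh
echo "first line" > /tmp/redir_demo.txt    # creates or overwrites the file
echo "second line" >> /tmp/redir_demo.txt  # appends a second line
wc -l < /tmp/redir_demo.txt                # the file now contains 2 lines
rm /tmp/redir_demo.txt                     # clean up
```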

Backtick :

The output of a command can be used as command line arguments for another command (not as stdin as above; command line arguments are any strings that you specify behind the command, such as file names and options). You can also use it to assign the output of a command to a variable.

The command

find . -mtime -1 -type f -print

finds all files that have been modified within the last 24 hours (-mtime -2 would be 48 hours). If you want to pack all these files into a tar archive (file.tar) the syntax for tar would be:

tar cvf file.tar infile1 infile2 ...

Instead of typing it all in you can combine the two commands (find and tar) using backticks. Tar will then pack all the files that find has printed:

#!/bin/sh
# The ticks are backticks (`) not normal quotes ('):
tar -zcvf lastmod.tar.gz `find . -mtime -1 -type f -print`


3) Control structures :

The "if" statement tests if the condition is true (exit status is 0, success). If it is the "then" part gets executed:

if ....; then
  ....
elif ....; then
  ....
else
  ....
fi

Most of the time a very special command called test is used inside if-statements. It can be used to compare strings or test if a file exists, is readable etc...

The "test" command is written as square brackets " [ ] ". Note that space is significant here: Make sure that you always have space around the brackets. Examples:

[ -f "somefile" ]  : Test if somefile is a file.
[ -x "/bin/ls" ]   : Test if /bin/ls exists and is executable.
[ -n "$var" ]      : Test if the variable $var contains something.
[ "$a" = "$b" ]    : Test if the variables $a and $b are equal.

Run the command "man test" and you get a long list of all kinds of test operators for comparisons and files.

Using this in a shell script is straightforward:

#!/bin/sh
if [ "$SHELL" = "/bin/bash" ]; then
  echo "your login shell is the bash (bourne again shell)"
else
  echo "your login shell is not bash but $SHELL"
fi

The variable $SHELL contains the name of the login shell, and this is what we are testing here by comparing it against the string "/bin/bash".
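The test command also has numeric comparison operators such as -eq, -lt and -gt; a minimal sketch (the variable is just an example):

```shell
#!/bin/sh
count=3
# -lt means "less than" (numeric comparison)
if [ "$count" -lt 5 ]; then
  echo "count is less than 5"
fi
```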

4) Shortcut operators:

People familiar with C will welcome the following expression:

[ -f "/etc/shadow" ] && echo "This computer uses shadow passwords"

The && can be used as a short if-statement. The right side gets executed if the left is true. You can read this as AND. Thus the example is: "The file /etc/shadow exists AND the command echo is executed". The OR operator (||) is available as well. Here is an example:

#!/bin/sh
mailfolder=/var/spool/mail/james
[ -r "$mailfolder" ] || { echo "Can not read $mailfolder" ; exit 1; }
echo "$mailfolder has mail from:"
grep "^From " $mailfolder

The script tests first if it can read a given mailfolder. If yes, it prints the "From" lines in the folder. If it cannot read the file $mailfolder, the OR operator takes effect. In plain English you read this code as "mailfolder readable or exit program". The problem here is that you must have exactly one command behind the OR, but we need two:

- print an error message
- exit the program

To handle them as one command we can group them together in an anonymous function using curly braces. Functions in general are explained further down.

You can do everything without the ANDs and ORs using just if-statements but sometimes the shortcuts AND and OR are just more convenient.
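As another small sketch of both shortcuts (assuming /tmp exists, as it does on practically every Linux system):

```shell
#!/bin/sh
# AND: the right side runs only if the left side succeeded:
[ -d /tmp ] && echo "/tmp is a directory"
# OR: the right side runs only if the left side failed:
[ -d /no/such/dir ] || echo "/no/such/dir is missing"
```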

The case statement can be used to match (using shell wildcards such as * and ?) a given string against a number of possibilities.

case ... in
  ...) do something here ;;
esac

Let's look at an example. The command file can test what kind of file type a given file is:

file lf.gz

returns:

lf.gz: gzip compressed data, deflated, original filename, last modified: Mon Aug 27 23:09:18 2001, os: Unix

We use this now to write a script called smartzip that can uncompress bzip2, gzip and zip compressed files automatically:

#!/bin/sh
ftype=`file "$1"`
case "$ftype" in
  "$1: Zip archive"*)      unzip "$1" ;;
  "$1: gzip compressed"*)  gunzip "$1" ;;
  "$1: bzip2 compressed"*) bunzip2 "$1" ;;
  *) echo "File $1 can not be uncompressed with smartzip"; exit 1 ;;
esac

Here you notice that we use a new special variable called $1. This variable contains the first argument given to a program. Say we run

smartzip articles.zip

then $1 will contain the string articles.zip

The select statement is a bash specific extension and is very good for interactive use. The user can select a choice from a list of different values:

select var in ... ; do
  break
done

.... now $var can be used ....

Here is an example:

#!/bin/sh
echo "What is your favourite OS?"
select var in "Linux" "Gnu Hurd" "Free BSD" "Other"; do
  break
done
echo "You have selected $var"

Here is what the script does:


1) Linux
2) Gnu Hurd
3) Free BSD
4) Other
#? 1

You have selected Linux

In the shell you have the following loop statements available:

while ...; do
  ....
done

The while-loop will run while the expression that we test for is true. The keyword "break" can be used to leave the loop at any point in time. With the keyword "continue" the loop continues with the next iteration and skips the rest of the loop body.
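A small sketch that combines while, test and expr to count from 1 to 3:

```shell
#!/bin/sh
i=1
while [ "$i" -le 3 ]; do
  echo "i is $i"
  i=`expr $i + 1`   # the shell has no integer type, so we use expr
done
```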

The for-loop takes a list of strings (strings separated by space) and assigns them to a variable:

for var in ....; do
  ....
done

The following will e.g. print the letters A to C on the screen:

#!/bin/sh
for var in A B C ; do
  echo "var is $var"
done

A more useful example script, called showrpm, prints a summary of the content of a number of RPM-packages:

#!/bin/sh
# list a content summary of a number of RPM packages
# USAGE: showrpm rpmfile1 rpmfile2 ...
# EXAMPLE: showrpm /cdrom/RedHat/RPMS/*.rpm
for rpmpackage in $*; do
  if [ -r "$rpmpackage" ]; then
    echo "=============== $rpmpackage =============="
    rpm -qi -p $rpmpackage
  else
    echo "ERROR: cannot read file $rpmpackage"
  fi
done

Above you can see the next special variable, $* which contains all the command line arguments. If you run

showrpm openssh.rpm w3m.rpm webgrep.rpm

then $* contains the 3 strings openssh.rpm, w3m.rpm and webgrep.rpm.
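You can experiment with $* and the related special variable $# (the number of arguments) directly; set -- fills the positional parameters, so the file names below are just examples:

```shell
#!/bin/sh
set -- openssh.rpm w3m.rpm webgrep.rpm   # pretend these came from the command line
echo "number of arguments: $#"
for f in $*; do
  echo "argument: $f"
done
```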

The GNU bash knows until-loops as well but generally while and for loops are sufficient.

5) Flow-of-control statements

if and foreach are the basic flow-of-control statements. There are more advanced ones named switch and while which are similar to the statements of the same name in the C language.


if

The if command allows you to conditionally execute a command or set of commands depending upon whether some expression is true. There are two forms.

if ( logical_expression ) command ...

This form will execute command (which can have a long list of arguments) if the logical_expression is true. This expression can be one of the logical or file testing expressions described above. For example, you can test if a file whose name is stored in the shell's built-in variable $1 (first argument to the shell script) exists as a plain file, and if so, make a backup copy of it with:

if ( -f $1 ) cp $1 $1.bak

This simple cp command will not work if given a directory to copy, which is why there is the test for a "plain" file.

if ( logical_expression ) then
    block of commands - any number of lines to be executed
    if logical_expression is "true" (or has non-zero value)
else
    another block of commands - any number of lines to be executed
    if logical_expression is "false" (or has value 0)
endif

This form allows you to execute more than one command if the expression is true. The then keyword must follow the logical_expression test on the same line, and the endif keyword must be on a line by itself to end the entire if command.

The else statement is optional. If you use this, the else keyword must be on a line by itself. The following lines up to the endif are executed if the expression was false. The "blocks of commands" may in turn contain additional nested if commands. Just be sure that each if has a matching endif statement enclosed in the same block.

foreach

This statement allows you to execute a loop, like a do statement in Fortran or a for statement in C.


foreach name (wordlist)
    block of statements to be executed
end

foreach and end are keywords. The end statement must be on a line by itself.

name is the name of a variable that you create. This is the "loop index".

(wordlist) is a list of "words", meaning any character strings separated by blanks. The parentheses are required. The "word list" can be a list of actual constant values, or the results of variable substitution, arithmetic expressions, or command substitution. When the foreach statement is encountered in a shell script, the wordlist is evaluated (necessary variable and command substitutions and expressions are done) and stored. Then the new variable name is set equal to the first word in the list and the block of statements is executed. When the end statement is reached, the script goes back to the foreach line and sets the variable name equal to the next word in the list and executes the block of statements again. This is done over and over until all words in the wordlist have been used up.

This type of looping statement is good for repetitively executing the same commands for a set of arguments. For example, you could check all the arguments given to the shell to see if they are plain files, and if so, make backup copies, using these statements in a shell script:

foreach file ( $* )
    if (-f $file) cp $file $file.bak
end

The foreach command can also be used interactively from the terminal. In this form, you will get question mark prompts (?) to enter your commands. When done, type end after one of these prompts. For example, you could type the following commands interactively at your terminal to make backup copies of all Fortran programs in your current directory:

foreach file ( *.f )
? if (-f $file) cp $file $file.bak
? end

Quoting

Before passing any arguments to a program the shell tries to expand wildcards and variables. To expand means that the wildcard (e.g. *) is replaced by the appropriate file names or that a variable is replaced by its value. To change this behaviour you can use quotes: Let's say we have a number of files in the current directory. Two of them are jpg-files, mail.jpg and tux.jpg.

#!/bin/sh
echo *.jpg

This will print "mail.jpg tux.jpg".


#!/bin/sh
echo "*.jpg"
echo '*.jpg'

This will print "*.jpg" twice.

Single quotes are the strictest. They prevent even variable expansion. Double quotes prevent wildcard expansion but allow variable expansion:

#!/bin/sh
echo $SHELL
echo "$SHELL"
echo '$SHELL'

This will print:

/bin/bash
/bin/bash
$SHELL

Finally there is the possibility to take the special meaning of any single character away by preceding it with a backslash:

echo \*.jpg
echo \$SHELL

This will print:

*.jpg
$SHELL

Here documents

Here documents are a nice way to send several lines of text to a command. It is quite useful to write a help text in a script without having to put echo in front of each line. A "Here document" starts with << followed by some string that must also appear at the end of the here document. Here is an example script, called ren, that renames multiple files and uses a here document for its help text:

#!/bin/sh
# we have less than 3 arguments. Print the help text:
if [ $# -lt 3 ] ; then
cat <<HELP
ren -- renames a number of files using sed regular expressions
USAGE: ren 'regexp' 'replacement' files...
EXAMPLE: rename all *.HTM files in *.html:
  ren 'HTM$' 'html' *.HTM
HELP
exit 0
fi
OLD="$1"
NEW="$2"
# The shift command removes one argument from the list of
# command line arguments.
shift
shift
# $* contains now all the files:
for file in $*; do
  if [ -f "$file" ] ; then
    newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
    if [ -f "$newfile" ]; then
      echo "ERROR: $newfile exists already"
    else
      echo "renaming $file to $newfile ..."
      mv "$file" "$newfile"
    fi
  fi
done

This is the most complex script so far. Let's discuss it a little bit. The first if-statement tests if we have provided at least 3 command line parameters. (The special variable $# contains the number of arguments.) If not, the help text is sent to the command cat which in turn sends it to the screen. After printing the help text we exit the program. If there are 3 or more arguments we assign the first argument to the variable OLD and the second to the variable NEW. Next we shift the command line parameters twice to get the third argument into the first position of $*. With $* we enter the for loop. Each of the arguments in $* is now assigned one by one to the variable $file. Here we first test that the file really exists and then we construct the new file name by using find and replace with sed. The backticks are used to assign the result to the variable newfile. Now we have all we need: The old file name and the new one. This is then used with the command mv to rename the files.

Functions

As soon as you have a more complex program you will find that you use the same code in several places and also find it helpful to give it some structure. A function looks like this:

functionname()
{
  # inside the body $1 is the first argument given to the function,
  # $2 the second, ...
  body
}

You need to "declare" functions at the beginning of the script before you use them. Here is a script called xtitlebar which you can use to change the name of a terminal window. If you have several of them open it is easier to find them. The script sends an escape sequence which is interpreted by the terminal and causes it to change the name in the titlebar. The script uses a function called help. As you can see the function is defined once and then used twice:

#!/bin/sh
# vim: set sw=4 ts=4 et:
help()
{
cat <<HELP
xtitlebar -- change the name of an xterm, gnome-terminal or kde konsole
USAGE: xtitlebar [-h] "string_for_titlebar"
OPTIONS: -h help text
EXAMPLE: xtitlebar "cvs"
HELP
exit 0
}

# in case of error or if -h is given we call the function help:
[ -z "$1" ] && help
[ "$1" = "-h" ] && help

# send the escape sequence to change the xterm titlebar:
echo -e "\033]0;$1\007"
#

It's a good habit to always have extensive help inside the scripts. This makes it possible for others (and you) to use and understand the script.

Command line arguments

We have seen that $* and $1, $2 ... $9 contain the arguments that the user specified on the command line (the strings written behind the program name). So far we had only very few or rather simple command line syntaxes (a couple of mandatory arguments and the option -h for help). But soon you will discover that you need some kind of parser for more complex programs where you define your own options. The convention is that all optional parameters are preceded by a minus sign and must come before any other arguments (such as file names).

There are many possibilities to implement a parser. The following while loop combined with a case statement is a very good solution for a generic parser:

#!/bin/sh
help()
{
cat <<HELP
This is a generic command line parser demo.
USAGE EXAMPLE: cmdparser -l hello -f -- -somefile1 somefile2
HELP
exit 0
}

while [ -n "$1" ]; do
  case $1 in
    -h) help;shift 1;;     # function help is called
    -f) opt_f=1;shift 1;;  # variable opt_f is set
    -l) opt_l=$2;shift 2;; # -l takes an argument -> shift by 2
    --) shift;break;;      # end of options
    -*) echo "error: no such option $1. -h for help";exit 1;;
    *) break;;
  esac
done

echo "opt_f is $opt_f"
echo "opt_l is $opt_l"
echo "first arg is $1"
echo "2nd arg is $2"

Try it out! You can run it e.g with:

cmdparser -l hello -f -- -somefile1 somefile2

It produces


opt_f is 1
opt_l is hello
first arg is -somefile1
2nd arg is somefile2

How does it work? Basically it loops through all arguments and matches them against the case statement. If it finds a matching one it sets a variable and shifts the command line by one. The unix convention is that options (things starting with a minus) must come first. You may indicate that this is the end of the options by writing two minus signs (--). You need this e.g. with grep to search for a string starting with a minus sign:

Search for -xx- in file f.txt:

grep -- -xx- f.txt

Our option parser can handle the -- too as you can see in the listing above.

Command substitution

The result of any command, meaning whatever it would write to standard output, can be "captured" by the shell and used to set the value of a variable or as part or all of the arguments to another command. This is different from sending the output through a pipe to be the input of another command. Here, the output of one command becomes the arguments of another command, not the input file. This is called "command substitution".

Capturing the output of a command is requested by enclosing that entire command with its arguments in a set of matching backwards quote marks (`) (also called "accent" mark or "grave" mark). This is not the same character as the apostrophe or single quote, which slants forward (').

In captured output, all newline characters (end-of-line markers) are changed to blanks, so the lines in the captured output are joined into one long string of words. Empirical tests on pangea show that the total length of this captured string can be at least 50,000 bytes. Other Unix systems may have smaller limits. All should allow at least 4,000 bytes in command substitution captured strings.

The shell stores captured output in a temporary area of memory. Once you have captured the output of a command, of course you want to do something with it. You can assign the output to a variable, or use it as part or all of the arguments to another command.

Examples:

• If a file contains a list of names, you can assign the contents of that file to a variable by capturing the output of a cat command that lists the file to standard output. Here is the kind of line you might use for that in a shell script:

set names = `cat file`

• You could use that captured list of names from the cat command directly as the argument list to another command, such as finger:

finger `cat file`

This example could also be typed as an interactive command.

• You could tell the mv command to move all the files whose names are listed in the file list to the new directory newdir with this command:

mv `cat list` newdir

Again, this example could be typed as an interactive command.
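As an aside (not part of the original examples), POSIX shells also accept the $( ) form of command substitution, which is easier to read than backquotes and nests without escaping:

```shell
# Backquote form and the equivalent $( ) form:
today=`date`
today=$(date)
# Nesting needs no escaping with $( ):
count=$(echo $(date) | wc -w)
echo "$today"
echo "$count"
```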

Script Examples

1) A general purpose skeleton

Now we have discussed almost all components that you need to write a script. All good scripts should have help, and you can just as well include our generic option parser even if the script has only one option. Therefore it is a good idea to have a dummy script, called framework.sh, which you can use as a framework for other scripts. If you want to write a new script you just make a copy:

cp framework.sh myscript
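Such a framework.sh might look roughly like this (a sketch only; the name and layout are merely a suggestion, mirroring the generic parser above):

```shell
#!/bin/sh
# framework.sh -- skeleton to copy when starting a new script
help()
{
    cat <<HELP
framework.sh -- describe your script here
USAGE: framework.sh [-h] args
OPTIONS: -h help text
HELP
    exit 0
}

while [ -n "$1" ]; do
    case $1 in
        -h) help;shift 1;;
        --) shift;break;;
        -*) echo "error: no such option $1. -h for help"; exit 1;;
        *)  break;;
    esac
done

# --- insert the actual functionality below this line ---
args="$*"
echo "arguments: $args"
```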

and then insert the actual functionality into "myscript". Let's now look at two more examples:

2) A binary to decimal number converter

The script b2d converts a binary number (e.g. 1101) into its decimal equivalent. It is an example that shows that you can do simple mathematics with expr:

#!/bin/sh
# vim: set sw=4 ts=4 et:

help()
{
    cat <<HELP
b2d -- convert binary to decimal

USAGE: b2d [-h] binarynum

OPTIONS: -h help text

EXAMPLE: b2d 111010
will return 58
HELP
    exit 0
}

error()
{
    # print an error and exit
    echo "$1"
    exit 1
}

lastchar()
{
    # return the last character of a string in $rval
    if [ -z "$1" ]; then
        # empty string
        rval=""
        return
    fi
    # wc puts some space behind the output, this is why we need sed:
    numofchar=`echo -n "$1" | wc -c | sed 's/ //g'`
    # now cut out the last char
    rval=`echo -n "$1" | cut -b $numofchar`
}

chop()
{
    # remove the last character in string and return it in $rval
    if [ -z "$1" ]; then
        # empty string
        rval=""
        return
    fi
    # wc puts some space behind the output, this is why we need sed:
    numofchar=`echo -n "$1" | wc -c | sed 's/ //g'`
    if [ "$numofchar" = "1" ]; then
        # only one char in string
        rval=""
        return
    fi
    numofcharminus1=`expr $numofchar "-" 1`
    # now cut all but the last char:
    rval=`echo -n "$1" | cut -b 1-${numofcharminus1}`
}

while [ -n "$1" ]; do
    case $1 in
        -h) help;shift 1;;   # function help is called
        --) shift;break;;    # end of options
        -*) error "error: no such option $1. -h for help";;
        *)  break;;
    esac
done

# The main program
sum=0
weight=1

# one arg must be given:
[ -z "$1" ] && help
binnum="$1"
binnumorig="$1"

while [ -n "$binnum" ]; do
    lastchar "$binnum"
    if [ "$rval" = "1" ]; then
        sum=`expr "$weight" "+" "$sum"`
    fi
    # remove the last position in $binnum
    chop "$binnum"
    binnum="$rval"
    weight=`expr "$weight" "*" 2`
done

echo "binary $binnumorig is decimal $sum"

The algorithm used in this script takes the decimal weight (1,2,4,8,16,...) of each digit, starting from the rightmost digit, and adds it to the sum if the digit is a 1. Thus "10" is: 0 * 1 + 1 * 2 = 2

To get the digits from the string we use the function lastchar. This uses wc -c to count the number of characters in the string and then cut to cut out the last character. The chop function has the same logic but removes the last character, that is, it cuts out everything from the beginning up to the character before the last one.
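The weighting can also be seen in a small standalone sketch (the digits of binary 1101 are listed by hand, right to left, purely to illustrate the algorithm):

```shell
# Sum the weights of the 1-digits of binary 1101 (digits right to left):
sum=0
weight=1
for digit in 1 0 1 1; do
    if [ "$digit" = "1" ]; then
        sum=`expr $sum + $weight`
    fi
    weight=`expr $weight \* 2`
done
echo "binary 1101 is decimal $sum"   # 1 + 4 + 8 = 13
```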

A file rotation program

Perhaps you are one of those who save all outgoing mail to a file. After a couple of months this file becomes rather big and it makes the access slow if you load it into your mail program. The following script rotatefile can help you. It renames the mail folder, let's call it outmail, to outmail.1; if there was already an outmail.1 then it becomes outmail.2, etc...

#!/bin/sh
# vim: set sw=4 ts=4 et:

ver="0.1"

help()
{
    cat <<HELP
rotatefile -- rotate the file name

USAGE: rotatefile [-h] filename

OPTIONS: -h help text

EXAMPLE: rotatefile out
This will e.g. rename out.2 to out.3, out.1 to out.2, out to out.1
and create an empty out-file
The max number is 10

version $ver
HELP
    exit 0
}

error()
{
    echo "$1"
    exit 1
}

while [ -n "$1" ]; do
    case $1 in
        -h) help;shift 1;;
        --) break;;
        -*) echo "error: no such option $1. -h for help";exit 1;;
        *)  break;;
    esac
done

# input check:
if [ -z "$1" ] ; then
    error "ERROR: you must specify a file, use -h for help"
fi
filen="$1"

# rename any .1 , .2 etc file:
for n in 9 8 7 6 5 4 3 2 1; do
    if [ -f "$filen.$n" ]; then
        p=`expr $n + 1`
        echo "mv $filen.$n $filen.$p"
        mv $filen.$n $filen.$p
    fi
done

# rename the original file:
if [ -f "$filen" ]; then
    echo "mv $filen $filen.1"
    mv $filen $filen.1
fi

echo touch $filen
touch $filen

How does the program work? After checking that the user provided a filename we go into a for loop counting from 9 to 1. File 9 is now renamed to 10, file 8 to 9 and so on. After the loop we rename the original file to 1 and create an empty file with the name of the original file.
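The renaming loop can be tried out standalone on dummy files in a scratch directory (a sketch; mktemp is assumed to be available):

```shell
# Demonstrate the rotation on dummy files in a temporary directory:
dir=`mktemp -d`
cd "$dir" || exit 1
filen=outmail
touch $filen $filen.1 $filen.2
for n in 9 8 7 6 5 4 3 2 1; do
    if [ -f "$filen.$n" ]; then
        p=`expr $n + 1`
        mv $filen.$n $filen.$p
    fi
done
mv $filen $filen.1
touch $filen
ls   # now: outmail outmail.1 outmail.2 outmail.3
```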

How do you execute (run) a shell script?

There are two methods you can use to execute a shell script.

First, you can give the script file name as an argument to an instance of the shell program, that is, type a command like:

csh filename

csh is the name of the C-shell program itself. This command starts up a new C-shell process that executes the commands in the script filename and then terminates.


Second, you can give the name of the shell script itself as a command, just like any other program on Unix. First, you have to let the Unix kernel recognize that this is a shell script by doing the following two steps. Then you can simply type the shell's filename as a command name to execute it. The kernel will automatically start up a new C-shell process to execute the commands in the script.

Note that your login shell has to be able to find the shell script file when you type its name as a command. The login shell only looks in a set of specific directories, called its path, to find files that contain programs. On pangea, the default path includes your current working directory, so you can run a shell script in the current directory simply by typing its name. Otherwise, type the absolute pathname of the script (for example, /home/sysop/farrell/programs/addup), or add the directory where the script lives to your standard path. Whenever you add a directory to your standard path, you must run the rehash built-in C-shell command to tell your login shell to rebuild its list of programs using your new path definition.

To make your shell script file executable as a program, do these steps: Put the following line as the first line in your script:

#! /bin/csh -f

This is a special comment telling the kernel that you want this script to be executed by the C-shell (there is an alternate shell named simply sh). The -f option helps the command to start up faster by skipping the initial read of the .cshrc file.

Use chmod to set execute permission for your file. For instance, if you want anyone to be able to execute the script file, use

chmod ugo+x filename

Debugging

The simplest debugging help is of course the command echo. You can use it to print specific variables around the place where you suspect the mistake. This is probably what most shell programmers use 80% of the time to track down a mistake. The advantage of a shell script is that it does not require any re-compilation, and inserting an "echo" statement is done very quickly.

The shell has a real debug mode as well. If there is a mistake in your script "strangescript" then you can debug it like this:

sh -x strangescript

This will execute the script and show all the statements that get executed with the variables and wildcards already expanded.
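A quick sketch of what such a trace looks like (the temporary path is arbitrary):

```shell
# Write a tiny script and run it under sh -x to see the trace:
cat > /tmp/trace_demo.sh <<'EOF'
#!/bin/sh
name=world
echo "hello $name"
EOF
sh -x /tmp/trace_demo.sh
# The trace (printed to stderr) looks roughly like:
#   + name=world
#   + echo hello world
#   hello world
```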

The shell also has a mode to check for syntax errors without actually executing the program. To use this, run:

sh -n strangescript

If this returns nothing then your program is free of syntax errors.

Shell Tips & Tricks

Changing Directories

Always check the error status of cd, to avoid running commands in the wrong directory. Alternatively, use fully qualified paths to obviate the need for cd.

#!/bin/sh

cd $nosuchdir || exit 1
rsync elsewhere:/foo .

Without || exit 1 to abort the script should cd fail, the subsequent rsync command could move files to the wrong location or fill up the wrong partition.

Redirection

Redirection can take place (almost) anywhere, not just at the end.

$ echo a b >c
$ echo >c a b
$ >c echo a b

Placing the filename at the beginning allows easier editing of the search term at the end of the command.

$ </var/log/messages grep foo
$ </var/log/messages grep bar
$ </var/log/messages grep user1

xargs

I use xargs(1) frequently to convert output from something (file, or another program) to arguments to another command. For instance, to commit only modified files in a cvs sandbox where there may be conflicted, new, or other troublesome files mixed in, use the following.

$ cvs up | perl -ne 'print if s/M //' | xargs cvs ci

Depending on the editor, one can use the concept above to open certain files for editing, for example, files in a cvs sandbox that have conflicts.

$ cvs up | perl -ne 'print if s/C //' | xargs vi

ex/vi: Vi's standard input and output must be a terminal

$ cvs up | perl -ne 'print if s/C //' | xargs emacs
emacs: standard input is not a tty

$ cvs up | perl -ne 'print if s/C //' | xargs bbedit

The bbedit utility is part of BBEdit for Mac OS X, and avoids terminal issues by sending the files to the BBEdit application. Using emacs in server/client mode may avoid this problem for emacs. The alternative is to use backticks to make the files available as arguments to the program, instead of feeding them in through xargs.

$ vi `cvs up | perl -ne 'print if s/C //'`

xargs can be chained with other programs. For instance, one may want to find perl scripts containing the text While and do something with them.


$ find . -name '*.pl' \
  | xargs fgrep -l While \
  | xargs perl -i -ple 's/While/while/g'

xargs will fail or do the wrong thing if passed filenames contain spaces. This is common on filesystems that traditionally have allowed spaces in filenames (Mac OS), or where file trees have been uploaded to Unix from other platforms. If using find/xargs pairs, the spaces-in-filenames problems can be avoided as follows.

$ find . -type f -print0 | xargs -0 echo
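A small sketch makes the difference visible (mktemp and an xargs with the common -0 extension are assumed):

```shell
# Create a file whose name contains a space:
dir=`mktemp -d`
touch "$dir/two words"
# Plain xargs splits the name into "two" and "words", so ls fails:
find "$dir" -type f | xargs ls 2>/dev/null || true
# With -print0/-0 the name survives intact:
find "$dir" -type f -print0 | xargs -0 ls
```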

Stubborn Files

Dealing with files that have odd characters in their names can often be a chore on Unix, as one cannot type in the names in question. One could use a graphical file manager tool, but I find those cumbersome, ill suited to dealing with large numbers of files, and usually not installed on server systems.

To simply delete the bad filenames, there are a few options.

Qualify the path.

Files that begin with a hyphen (such as a file named -rf) will trigger option processing, as the shell is very stupid. This can be avoided by either disabling option processing, or prefixing a directory name to the file path.

$ ls
-rf
$ rm -rf
$ ls
-rf
$ rm *
$ ls
-rf
$ rm -- -rf
$ ls
$ touch ./-rf
$ ls
-rf
$ rm ./-rf
$ ls
$

The -- argument only works on systems whose getopt(3) library supports the syntax. On other systems, or for portability, the qualified path option must be used.

Find inode and delete by that.

Each file on a Unix filesystem has an inode number associated with it; knowing the inode number of the bad file allows us to search for and delete it.

$ ls -i *
615383 foo
$ find . -inum 615383 -exec rm {} \;

Large numbers of files.

If there are large numbers of files with wacky characters in their filenames, something more powerful than the shell is usually required to filter out the files in question. For instance, to list the inode numbers of files in the current directory with non-printable characters in their names, use perl.

$ ls -i | perl -nle 'print if /[[:^print:]]/' \
  | while read inum name; do echo $inum; done

For situations where the mangled filenames are in deep directory trees, or where the mangling is consistent (uploaded filenames from a DNA sequencer come to mind), use File::Find and write a standalone script.

Directories with huge numbers of files will cause rm * to fail, as the wildcard expands to a list the shell cannot cope with. To delete all the files, remove the parent directory. If only deleting by a pattern, use readdir to loop over each file in turn, and apply a match to each filename.

$ rm -rf /the/bad/dir
$ perl -le 'opendir D, shift or die "$!"' \
  -e 'while (readdir D) { unlink if -f and m/\.doc/ }' /the/bad/dir

Looping without for or while

echo foo bar | sed 's/ /\
/g' | xargs -n 1 echo
ls

Fun with SSH

Commands can be run over ssh(1), though how the shell handles more complex commands can cause problems.

client$ ssh example.org hostname

server.example.org

client$ ssh example.org sleep 3 && hostname
client.example.org

The && is handled by the local shell, not the remote server. Quoting can fix the problem.

client$ ssh example.org 'sleep 3 && hostname'

server.example.org

Clobbering a File

There are several ways one can empty the contents of an existing file without removing and touch(1)-ing the file in question. Using echo(1) is not portable, as some systems do not support the -n flag, such as Digital UNIX without CMD_ENV=bsd set. The use of the shell null operator : is a clever way, and saves typing.

$ cat /dev/null >file
$ echo -n >file
$ : >file

Grepping for processes

grep can match itself. To avoid this problem, the regular expression can be altered so grep does not match itself, which is easier than appending | grep -v grep to a command.

$ ps wwo pid,command | grep 'ss[h]'

The regular expression ss[h] cannot match the literal string ss[h] that appears in the process listing, but will match any process name containing ssh. Another option: use commands such as pgrep(1).

Pass input to command

Some utilities are controlled via command interfaces. Full control of command interfaces may require the expect utility, or Expect. Simple needs can be met by


printing commands on standard input, then parsing the output. For example, the Mac OS X scutil can be queried for information:

SERVICE_GUID=`cat <<EOF | scutil | awk '/PrimaryService/{print $3}'
open
get State:/Network/Global/IPv4
d.show
EOF`
echo $SERVICE_GUID

Filenames with spaces

If parsing a list of filenames from a file, spaces in filenames may cause shell interpolation problems.

To workaround, use xargs -0, and convert the file to a null delimited list:

$ touch "foo bar"
$ echo foo bar > test
$ < test tr '\n' '\0' | xargs -0 file
foo bar: empty

Argument Processing

Modern getopt(3) implementations support -- to stop option processing, allowing commands such as fgrep -- -search to work.
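For instance (a sketch using grep -F, the modern spelling of fgrep, and a throwaway file created only for illustration):

```shell
# Create a file containing a line that starts with a dash:
printf '%s\n' -search plain > /tmp/dashdemo.txt
# Without -- the pattern would be taken as an option; with -- it works:
grep -F -- -search /tmp/dashdemo.txt
```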


Reference:

1. http://user.it.uu.se/~matkin/documents/shell/
2. http://www.injunea.demon.co.uk/pages/page212.htm
3. http://www.washington.edu/computing/unix/shell.html
4. http://www.freeos.com/guides/lsst/ch02sec01.html
5. http://www.fmrib.ox.ac.uk/fslcourse/lectures/scripting/s_0060.htm
