
Linux command line. An introduction to the Linux command line for genomics. Susan Fairley


Academic year: 2021


(1)

Linux command line

An introduction to the Linux command line for genomics

Susan Fairley

(2)

Aims

• Introduce the command line

• Provide an awareness of basic functionality

• Illustrate with some examples

• Provide some information on how to find out more

(3)

What we will not achieve

• Immediate proficiency in the command line

– As with learning a language, it takes time and use

• A comprehensive survey of the command line

– There is a vast array of commands; this session can only cover a small fraction

(4)

Format

• Series of short talks followed by exercises

• Should not need to listen and type at the same time

• Suggested reading listed at end

(5)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands

• Shells and shell scripts

(6)

Introduction to Linux

• What is Linux and what is an operating system?

• Unix and Linux – what’s the difference?

• Why consider Unix/Linux?

• The command prompt

(7)

Linux is an operating system

• Operating systems enable applications and users to make use of computer hardware

User

Applications

Operating system

Hardware

(8)

Operating systems

• Operating systems act as resource managers for the machine on which they are installed

• They wrap, and provide access to, hardware functionality

• The OS kernel controls the hardware

• Access to kernel services is provided to higher level applications and system utilities via system calls

(9)

Operating systems and shells

• Applications and system utilities can be started via a shell or GUI

• A shell is a textual command line interface

• A variety of shells, with slightly different features, exist

• Examples of shells include bash, the Bourne shell (sh), csh and tcsh

• Using the shell can provide useful functionality

(10)

Unix and Linux

From http://www.doc.ic.ac.uk/~wjk/UnixIntro/Lecture1.html

(11)

Unix and Linux

• There are many variations of these systems

• They have some differences but many similarities

• Examples of Unix and Unix-like systems include Sun Solaris, GNU/Linux and Mac OS X

• Popular Linux distributions (packaging a Linux kernel with system utilities, GUI and applications) include Red Hat and Debian, among others

(12)

Why Linux?

• Linux systems are commonplace in bioinformatics

• Large variety of software is developed by academic groups for these platforms

• Free and open source software

(13)

The command prompt

• The command line (shell) and GUI enable interaction with applications and system utilities

• A command prompt, where commands are entered at the command line, is accessed via software that provides a terminal window

(14)

The command prompt

(15)

The command prompt

• A terminal window can be opened when logged in to a machine

• Also options to open terminals on remote machines – ssh, telnet and PuTTY

• Today, we will use PuTTY to connect from the classroom Windows machines to a Linux machine

(16)

Commands

• The command prompt

• 1) The command

• 2) Options

• 3) What the command is to run on

• Example: prompt$ command -option thing_to_run_on

(17)

Commands

• White space

• Quotes

• Special characters

• We’ll return to some of these topics later

• Typically best to avoid white space in file names

(18)

Important points

• No need to be afraid but…

• Type with care

– You will NOT be asked if you really mean it

• Some commands are powerful and can remove many files at once

• Sometimes a command will run and run and run because something is wrong

• Can use Ctrl-C to kill processes in most cases

• If in doubt, ask!

(19)

In Exercise 1

• Open PuTTY

• Connect to a remote machine

• Copy material to the remote machine

(20)

Exercise 1

(21)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands

• Shells and shell scripts

(22)

Navigating the filesystem

• Where am I?

• What is here (and permissions)?

• Moving around

• Searching

(23)

Navigating the filesystem

• In Exercise 1 we used some commands

• ls listed the contents of the directory

• tar unpackaged linux_course.tar.gz

• These commands followed the pattern we described earlier

• command -option thing_to_run_on

• The command is given along with any necessary additional information

(24)

Navigating the filesystem

• Now we are going to look at commands related to navigating the filesystem

• At any point in time, the command prompt is somewhere within the filesystem

• The filesystem is similar to the directory structure you will be familiar with in graphical interfaces, where you navigate by clicking on folders and documents

(25)

Navigating the filesystem

(26)

Navigating the filesystem

(27)

Where am I?

• The current directory is also called the working directory

• pwd – print working directory

• Gives the path from the top of the file system (or root) to the current directory

• Root can be written as /

[s08sf2@login1(maxwell) ~]$ pwd
/users/s08sf2

(28)

What is here?

• We’ve already used ls

• ls – lists the contents of the current directory

• We used the simple form of ls

• Options can also be specified, including -l or combinations of options such as -lh

• -l gives the long version of output and -h converts file sizes to “human-readable” form

(29)

What is here?

[s08sf2@login1(maxwell) spades_Dec14]$ pwd
/users/s08sf2/ken_forbes/spades_Dec14

[s08sf2@login1(maxwell) spades_Dec14]$ ls
mv.sh process_quast.pl quast_summary.txt
notes.txt quast.sh spades_listeria_Dec14.sh

(30)

What is here?

[s08sf2@login1(maxwell) spades_Dec14]$ ls -l
total 136
-rw-r--r-- 1 s08sf2 clsm 50028 Dec 10 16:53 mv.sh
-rw-r--r-- 1 s08sf2 clsm 45948 Dec 12 14:55 notes.txt
-rw-r--r-- 1 s08sf2 clsm 1932 Dec 10 16:51 process_quast.pl
-rw-r--r-- 1 s08sf2 clsm 634 Dec 10 16:13 quast.sh
-rw-r--r-- 1 s08sf2 clsm 23025 Dec 10 16:53 quast_summary.txt
-rw-r--r-- 1 s08sf2 clsm 977 Dec 10 11:19 spades_listeria_Dec14.sh

[s08sf2@login1(maxwell) spades_Dec14]$ ls -lh
total 136K
-rw-r--r-- 1 s08sf2 clsm 49K Dec 10 16:53 mv.sh
-rw-r--r-- 1 s08sf2 clsm 45K Dec 12 14:55 notes.txt
-rw-r--r-- 1 s08sf2 clsm 1.9K Dec 10 16:51 process_quast.pl
-rw-r--r-- 1 s08sf2 clsm 634 Dec 10 16:13 quast.sh
-rw-r--r-- 1 s08sf2 clsm 23K Dec 10 16:53 quast_summary.txt
-rw-r--r-- 1 s08sf2 clsm 977 Dec 10 11:19 spades_listeria_Dec14.sh

(31)

Permissions

• First character is file type: - for file, d for directory

• Then groups of three characters describing permissions for the user (u), group (g) and others (o)

• Each set of three characters is read, write and execute

• r = read, w = write, x=execute, -=no permission

• Here, we have a file (not a directory) where the owner can read and write, the group and all users can read and nobody can execute the file

-rw-r--r--

File type: -   Owner: rw-   Group: r--   All users: r--

(32)

Permissions

• Linux cares about permissions

• Permissions (including who the owner of a file is) can, in some cases, get carried over when moving or copying files

• Permissions can be changed using chmod

(33)

chmod

• chmod has various ways in which it can be used

• We’ll look at one where a number is supplied for each of user, group and other

• How do we know what number to supply for each category?

(34)

chmod

--- 0

--x 1

-w- 2

-wx 3

r-- 4

r-x 5

rw- 6

rwx 7

(35)

chmod

• chmod 600 private_file.txt

• chmod 777 everything_file.txt

• chmod 644 my_rw_otherwise_read.txt
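Each digit is the sum of r = 4, w = 2 and x = 1 for one category. The following is a minimal sketch of that arithmetic in practice; the temporary directory and the file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
touch "$workdir/private_file.txt" "$workdir/shared_file.txt"

# 6 = 4 (read) + 2 (write); 0 = no permissions
chmod 600 "$workdir/private_file.txt"   # rw- --- ---
# 6 for the owner, 4 (read only) for group and others
chmod 644 "$workdir/shared_file.txt"    # rw- r-- r--

# The first ten characters of ls -l output are the file type and permissions
private_mode=$(ls -l "$workdir/private_file.txt" | cut -c1-10)
shared_mode=$(ls -l "$workdir/shared_file.txt" | cut -c1-10)
echo "$private_mode"
echo "$shared_mode"

rm -r "$workdir"
```

Checking the result with ls -l after each chmod is a cheap way to confirm the numbers mean what you intended.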

(36)

Moving around

• We’ve said the command prompt is in a directory

• How do we change that directory?

• cd – change directory

• cd directions/to/where/we/want/to/go

(37)

Moving around

• Start in our home directory

• Can return there using ~

• cd ~

• There are some other directories we can refer to easily

• . is the directory we are in

• .. is the parent directory of our current location (the level above where we are)

(38)

Moving around

• We can move to a directory by specifying it and its location relative to root or our current location

• cd ../linux_course/text_files

• cd /users/s08sf2/linux_course/text_files

• cd linux_course/text_files

• NB: once you have started typing the path, you can press tab to autocomplete

• NB: you can use the up arrow to get the previous command and then edit it
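A short sketch of absolute and relative paths in use. The directory tree is created in a temporary location purely for illustration, so the paths differ from those on the course machine.

```shell
# Build a small directory tree to move around in (mkdir -p creates all levels)
top=$(mktemp -d)
mkdir -p "$top/linux_course/text_files"

cd "$top/linux_course/text_files"    # absolute path, starting from root (/)
cd ..                                # .. is the parent: now in linux_course
after_up=$(basename "$(pwd)")
cd text_files                        # relative path, starting from the current directory
after_down=$(basename "$(pwd)")
echo "$after_up then $after_down"

cd /                                 # leave the tree before deleting it
rm -r "$top"
```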

(39)

Exercise 2

• We’ve looked at how you establish what is in a directory and how to move about

• In Exercise 2, we’ll use the contents of linux_course to try out some of this material using pwd, ls and cd

• We’ll also try out chmod

(40)

Exercise 2

(41)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands

• Shells and shell scripts

(42)

Basic commands

• We’ve now used a few commands, including some with options and have the basic skills to navigate through the file system

• Now, we’ll look at some additional commands, enabling you to make directories, move files, copy files, remove files, view files and find them

(43)

man

• man – manual

• You can use the man command by supplying the name of a command you want to see the manual entry for, e.g. man ls

• The manual entry provides information on the command and its usage

• Information can also be found online

(44)

mkdir

• mkdir – make directory

• This creates the specified directory as a subdirectory of the current directory

• Multiple levels can be created at once using the -p option

• mkdir my_dir

• mkdir -p my_dir/new_dir/another_new_dir
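As a small sketch, run in a temporary directory so that it is safe to try anywhere:

```shell
work=$(mktemp -d)
cd "$work"

mkdir my_dir                               # one new directory
mkdir -p my_dir/new_dir/another_new_dir    # new_dir and another_new_dir in one step

# Listing the middle level shows the directory created inside it
deepest=$(ls my_dir/new_dir)
echo "$deepest"

cd /
rm -r "$work"
```

Without -p, the second mkdir would fail because my_dir/new_dir does not exist yet.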

(45)

cp

• cp – copy file

• cp existing.txt copy.txt

• cp existing.txt ../different_location/copy.txt

• We can also use the -r option to recursively copy a directory and all of its contents

• cp -r dir copy_of_dir

(46)

mv

• mv – move

• Moves instead of copying

• Can be used to rename something in the same location

• mv old_name.txt new_name.txt

• mv old.txt ../new_location/new.txt

• Can be applied to files and directories
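The difference between cp and mv can be seen in a short sketch. All file and directory names here are illustrative, and everything happens inside a temporary directory.

```shell
work=$(mktemp -d)
cd "$work"
echo "some text" > existing.txt

cp existing.txt copy.txt      # existing.txt is still there afterwards
mv copy.txt renamed.txt       # copy.txt is gone; renamed.txt takes its place

mkdir dir
cp existing.txt dir/          # copy a file into a directory, keeping its name
cp -r dir copy_of_dir         # -r copies the directory and its contents

listing=$(ls)
echo "$listing"

cd /
rm -r "$work"
```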

(47)

rm

• Type with care

• rm – remove (this means delete, and it will NOT move it to trash)

• rm file_to_remove.txt

• Need -r option to remove a directory because you must also remove any contents

• rm -r directory_to_remove
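A sketch of both forms, confined to a temporary directory so that nothing of value can be deleted. The file data.txt and the directory inner are made up for the example.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p directory_to_remove/inner
touch file_to_remove.txt directory_to_remove/inner/data.txt

rm file_to_remove.txt
rm -r directory_to_remove     # removes the directory and everything inside it

remaining=$(ls)               # empty: nothing is left in the work directory
echo "left over: '$remaining'"

cd /
rm -r "$work"
```

If you want a safety net, rm -i asks for confirmation before each removal.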

(48)

less

• less can be used to view files

• When viewing a file, press q to quit

• With less, you can search the file by typing / followed by the pattern to search for

• less shakespeare/romeo_and_juliet.txt

• Also head, tail and more

(49)

wc

• wc – word count

• By default, reports the number of lines, words and bytes in a file

• Has options that can be used, for example, to count only the number of lines in a file

• wc -l file.txt
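A minimal sketch using a two-line file created for the purpose. The < redirection, which is not covered in these slides, feeds the file to wc on standard input so that only the number is printed.

```shell
work=$(mktemp -d)
printf 'to be or not to be\nthat is the question\n' > "$work/file.txt"

wc "$work/file.txt"                       # lines, words, bytes, then the file name
line_count=$(wc -l < "$work/file.txt")    # -l: lines only
word_count=$(wc -w < "$work/file.txt")    # -w: words only
echo "lines: $line_count, words: $word_count"

rm -r "$work"
```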

(50)

sort

• sort – does what it says

• By default, sorts lexicographically

• Can sort numerically and can output only unique lines

• sort file.txt

• sort -u file.txt
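The difference between lexicographic and numeric sorting is easiest to see with numbers of different lengths. This sketch assumes a small made-up file, numbers.txt.

```shell
work=$(mktemp -d)
printf '10\n9\n100\n9\n' > "$work/numbers.txt"

lexical=$(sort "$work/numbers.txt")         # compares character by character
numeric=$(sort -n "$work/numbers.txt")      # -n compares by numeric value
unique=$(sort -n -u "$work/numbers.txt")    # -u drops repeated lines

echo "$lexical"
echo "$numeric"
echo "$unique"

rm -r "$work"
```

Lexicographically, 100 sorts before 9 because the character 1 comes before 9.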

(51)

grep

• grep – global regular expression print

• grep options pattern files

• grep Juliet romeo_and_juliet.txt

• grep -r tide shakespeare

• Can supply patterns or regular expressions, which describe what to look for

• NB: we don’t have time to discuss regular expressions today
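A sketch of these searches on tiny stand-in files. The file contents are invented for the example and are not the course texts; the -c and -l options are additions not shown on the slide.

```shell
work=$(mktemp -d)
mkdir "$work/shakespeare"
printf 'Enter Romeo\nO Romeo, Romeo\nEnter Juliet\n' > "$work/shakespeare/romeo_and_juliet.txt"
printf 'There is a tide in the affairs of men\n' > "$work/shakespeare/julius_caesar.txt"

# Print the lines that contain the pattern
juliet_lines=$(grep Juliet "$work/shakespeare/romeo_and_juliet.txt")
# -c counts matching lines instead of printing them
romeo_count=$(grep -c Romeo "$work/shakespeare/romeo_and_juliet.txt")
# -r searches every file under a directory; -l prints only the matching file names
tide_files=$(grep -r -l tide "$work/shakespeare")

echo "$juliet_lines"
echo "$romeo_count"
echo "$tide_files"

rm -r "$work"
```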

(52)

find

• find – search for things

• This command has many options

• find options path expression

• Path says where to look and expression what to look for

• find . -name 'going*' (quote the pattern so that the shell does not expand * itself)
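A sketch of find searching below the current directory, with made-up file names. The pattern is quoted so that find, not the shell, interprets the *. Unquoted, the shell would here expand going* to going_home.txt before find runs, and the file in notes would be missed.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p notes/old
touch going_home.txt notes/going_out.txt notes/old/staying_in.txt

# Search the current directory and everything below it;
# piping to sort only makes the order of results predictable
found=$(find . -name 'going*' | sort)
echo "$found"

# -type d restricts the results to directories
dirs=$(find . -type d -name 'o*')
echo "$dirs"

cd /
rm -r "$work"
```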

(53)

Exercise 3

• In Exercise 3 we’ll try out some of the commands we’ve just looked at

(54)

Exercise 3

(55)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands for genomics

• Shells and shell scripts

(56)

Linking commands and directing output

• So far, any output from our commands has been printed in our terminal

• However, we can redirect output to files or pipe it to another command

• | pipes output from one command to another

• > writes output to a file

• >> appends output to a file

(57)

stdout and stderr

• In most cases, output is written to standard output (stdout)

• Error messages are typically written to standard error (stderr)

• Both are, by default, written to the terminal

• We’ll look at redirecting stdout but stderr can also be redirected
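For reference, a sketch of keeping the two streams apart. The 2> operator is standard shell syntax but is not covered further in these slides; the file names are illustrative.

```shell
work=$(mktemp -d)
cd "$work"

# no_such_file does not exist, so ls writes nothing to stdout and an
# error message to stderr; > captures the former and 2> the latter
ls no_such_file > out.txt 2> err.txt || echo "ls reported a problem"

out_size=$(wc -c < out.txt)   # bytes captured from stdout
err_size=$(wc -c < err.txt)   # bytes captured from stderr
echo "stdout bytes: $out_size, stderr bytes: $err_size"

cd /
rm -r "$work"
```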

(58)

Examples

• ls > dir_contents.txt

• ls | sort > sorted_dir_contents.txt
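A sketch showing >, >> and | together, in a temporary directory containing two made-up files.

```shell
work=$(mktemp -d)
cd "$work"
touch a.txt b.txt

ls > dir_contents.txt                        # > creates the file, or overwrites it
echo "end of listing" >> dir_contents.txt    # >> adds to the end instead
contents=$(cat dir_contents.txt)
echo "$contents"

entry_count=$(ls | wc -l)                    # | sends the output of ls to wc as input
echo "$entry_count"

cd /
rm -r "$work"
```

Note that dir_contents.txt appears in its own listing, because the shell creates the file before ls runs.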

(59)

Exercise 4

• In Exercise 4 we’ll use the output of one command as input for another by piping

• We’ll also try redirecting output from standard out (stdout) to a file

(60)

Exercise 4

(61)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands

• Shells and shell scripts

(62)

Additional commands

• Many and varied commands can be used (including when handling genomic data)

• These are a few arbitrary examples

(63)

Grep for FASTA headers

• FASTA files have headers for each sequence

• Headers start with the character >

• grep '>' proteins.fa

• Note the use of quotes ('') around >, so that the shell treats it as a pattern to search for rather than as output redirection

(64)

Identify unique FASTA headers

• grep '>' proteins.fa

• grep '>' proteins.fa | wc -l

• grep '>' proteins.fa | sort -u

• grep '>' proteins.fa | sort -u | wc -l
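A sketch of this pipeline on a tiny made-up FASTA file in which one record is duplicated. The sequences and identifiers are invented.

```shell
work=$(mktemp -d)
# Three records, of which seq1 appears twice
printf '>seq1\nMKT\n>seq2\nMLV\n>seq1\nMKT\n' > "$work/proteins.fa"

# All header lines, then the number of them
grep '>' "$work/proteins.fa"
total_headers=$(grep '>' "$work/proteins.fa" | wc -l)

# sort -u keeps one copy of each header before counting
unique_headers=$(grep '>' "$work/proteins.fa" | sort -u | wc -l)

echo "total: $total_headers, unique: $unique_headers"

rm -r "$work"
```

A difference between the two counts indicates duplicated headers.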

(65)

Retain part of FASTA header

• grep '>' protein_2.fa

• >sp|P09922|MX1_MOUSE Interferon-induced GTP-binding protein Mx1 OS=Mus musculus GN=Mx1 PE=1 SV=1

• sed -e 's/>\(\S*\).*/\1/'

• -e expression: introduces the editing command for sed to run

• s/substitute this/for this/

• \(\) capture the contents of the brackets (using \ to escape) and reuse the contents using \1

• \S non-whitespace (a GNU sed extension), * match zero or more times
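A sketch applying the substitution to the header shown above. Because \S is specific to GNU sed, a second form using [^ ]* (any run of characters other than a space) is included as a portable alternative; that variant is an addition, not part of the slide.

```shell
header='>sp|P09922|MX1_MOUSE Interferon-induced GTP-binding protein Mx1 OS=Mus musculus GN=Mx1 PE=1 SV=1'

# As on the slide: capture the non-whitespace run after >, drop the rest
id_gnu=$(printf '%s\n' "$header" | sed -e 's/>\(\S*\).*/\1/')

# Portable variant: [^ ]* matches everything up to the first space
id_portable=$(printf '%s\n' "$header" | sed -e 's/>\([^ ]*\).*/\1/')

echo "$id_gnu"
echo "$id_portable"
```

With GNU sed, both forms print sp|P09922|MX1_MOUSE.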

(66)

Compare sorted lists

• comm – compares sorted lists

• Output has three columns; options -1, -2 and -3 each suppress one of them

• -1 suppresses lines unique to file1, -2 suppresses lines unique to file2, -3 suppresses lines that appear in both files

• Also diff

• diff -y --suppress-common-lines file1 file2
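comm prints three columns (lines only in file1, lines only in file2, lines in both), and each option hides one column, so pairs of options select what remains. A sketch with two small sorted lists whose contents are made up:

```shell
work=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$work/file1"
printf 'banana\ncherry\ndate\n' > "$work/file2"

only_in_file1=$(comm -23 "$work/file1" "$work/file2")   # hide columns 2 and 3
only_in_file2=$(comm -13 "$work/file1" "$work/file2")   # hide columns 1 and 3
in_both=$(comm -12 "$work/file1" "$work/file2")         # hide columns 1 and 2

echo "only in file1: $only_in_file1"
echo "only in file2: $only_in_file2"
echo "in both:"
echo "$in_both"

rm -r "$work"
```

Both inputs must already be sorted; unsorted lists can be passed through sort first.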

(67)

Exercise 5

• In Exercise 5 we’ll work with some FASTA files

• We’ll try some of the examples that we’ve discussed

(68)

Exercise 5

(69)

Overview

• Introduction to Linux

• Navigating the filesystem

• Basic commands

• Linking commands and directing output

• Additional commands

• Shells and shell scripts

(70)

Shells and shell scripts

• We briefly discussed shells earlier in the session

• There are different shells that differ slightly in how they operate

• Often, you can identify the shell you are using by typing: echo $SHELL

• $SHELL is a variable

(71)

Scripts

• Scripts let us put together a sequence of commands that can then be run

• We can run a script by typing: source script.sh

• source runs the commands in your current shell environment

• Alternatively, you can make the shell file (.sh) executable and run it directly, for example ./script.sh
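The practical difference between the two approaches is where the commands run. In this sketch, the script name hello.sh and the variable greeting are made up; the script sets a variable, which survives only when the script is sourced.

```shell
work=$(mktemp -d)
cd "$work"
unset greeting

# Write a small script that sets a variable and prints it
printf '#!/bin/bash\ngreeting="Hello World"\necho "$greeting"\n' > hello.sh

chmod u+x hello.sh                 # give the owner execute permission
./hello.sh                         # runs in a new bash process
after_run=${greeting:-unset}       # the variable did not reach our shell

. ./hello.sh                       # . is the portable spelling of bash's source
after_source=${greeting:-unset}    # now the variable is set in this shell

echo "after running: $after_run"
echo "after sourcing: $after_source"

cd /
rm -r "$work"
```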

(72)

Scripts

#!/bin/bash

echo Hello World

(73)

Scripts

#!/bin/bash

echo Hello World

echo Goodbye World

(74)

Exercise 6

• In this exercise, we’ll look at scripts that run commands we’ve already discussed

• We’ll also review checking permissions to see if files are executable

(75)

Exercise 6

(76)

More information

• http://www.doc.ic.ac.uk/~wjk/UnixIntro/

• http://www.ee.surrey.ac.uk/Teaching/Unix/

• Also many books and online resources

(77)

Feedback

• Please complete and return the feedback form before you leave

(78)

Acknowledgements

• Naveed Khan

• Tony Travis

• Eduardo Alves

• Mel McCann
