File Compression and Decompression in Linux


TAR

TAR stands for "Tape ARchive".

TAR can be used to make one big file filled with many smaller files. The resulting file can be easily transported across directories, machines or networks.

Use the tar command to create an archive (also known as a tarball) or to extract files from one. A tarball, or archive, is a single file that contains many individual files.

A ".tar" file is not compressed; it is an uncompressed collection of files within a single file. A ".tar.gz" or ".tgz" file is a collection of files that has been compressed. ".tgz" is the same thing as ".tar.gz".

Familiar Options in tar Command

c, x, v, f and z are commonly used tar options.

• c = Create a new tar file.

• x = Extract the contents of an archive (.tar file).

• v = Verbose output, showing each file as it is archived or extracted.

• f = Use the archive file name given as the next argument.

• z = Pass the archive through gzip (compress when creating, decompress when extracting).

POINTS TO NOTE :

• TAR creates one "tar file" out of several files and directories.

• The .tgz extension is a shorthand version of the .tar.gz extension.

• It is important to know that TAR by itself doesn't compress files or directories.

• A TAR file takes up roughly the same amount of space as all the individual files together (slightly more, because of header and padding blocks).
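The options above can be combined in one short round trip. The following is a minimal sketch, assuming a scratch directory from mktemp; the file names a.txt, b.txt and demo.tar are made up for illustration. It also uses t (list the archive) and -C (change to a directory first), which are not in the list above.

```shell
# Scratch area with a source directory and an empty output directory
work=$(mktemp -d)
mkdir "$work/src" "$work/out"
printf 'alpha\n' > "$work/src/a.txt"
printf 'beta\n'  > "$work/src/b.txt"

# c = create, v = verbose, f = the archive name follows
tar -cvf "$work/demo.tar" -C "$work/src" a.txt b.txt

# t = list the members without extracting them
listing=$(tar -tf "$work/demo.tar")

# x = extract, here into a different directory
tar -xvf "$work/demo.tar" -C "$work/out"
restored=$(cat "$work/out/a.txt")

echo "members: $listing"
echo "a.txt after extraction: $restored"
rm -rf "$work"
```

Using -C keeps the member names relative (a.txt rather than a long temporary path), which makes the archive easier to unpack elsewhere.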


The z option compresses or decompresses an archive through the gzip program (.tar.gz extension).

The j option compresses or decompresses an archive through the bzip2 program (.tar.bz2 extension). Note that this is a lowercase j; an uppercase J selects xz instead.
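As a rough illustration of the two options, the sketch below builds the same directory into a .tar.gz and, if bzip2 is available, a .tar.bz2, then lists both. The directory and file names are hypothetical, and the bzip2 step is skipped on systems where the tool is missing.

```shell
work=$(mktemp -d)
mkdir "$work/data"
seq 1 2000 > "$work/data/notes.txt"   # sample content

# z = filter the archive through gzip
tar -czf "$work/data.tar.gz" -C "$work" data
gz_list=$(tar -tzf "$work/data.tar.gz")

# j = filter the archive through bzip2 (only if bzip2 is installed)
bz_list=""
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf "$work/data.tar.bz2" -C "$work" data
    bz_list=$(tar -tjf "$work/data.tar.bz2")
fi

echo "gzip archive holds:"
echo "$gz_list"
rm -rf "$work"
```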

Creating a Simple tar File

As noted above, tar combines multiple files and/or directories into a single file.

$ tar -cf file.tar input_file1 input_file2 input_file3

$ tar -cf backup.tar f1 f2 f3

Let's check all the files in the directory

# ls -l
-rw-r--r-- 1 root root 74471 Jan 18 09:45 f1
-rw-r--r-- 1 root root 102019 Jan 18 09:45 f2
-rw-r--r-- 1 root root 10737 Jan 18 09:45 f3
-rw-r--r-- 1 root root 47199 Jan 18 09:45 f4

Creating a Simple tar File Using Multiple Files

# tar -cf backup.tar f1 f2 f3 f4

# ls -l
-rw-r--r-- 1 root root 245760 Jan 18 10:31 backup.tar
-rw-r--r-- 1 root root 74471 Jan 18 09:45 f1
-rw-r--r-- 1 root root 102019 Jan 18 09:45 f2
-rw-r--r-- 1 root root 10737 Jan 18 09:45 f3
-rw-r--r-- 1 root root 47199 Jan 18 09:45 f4

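Check the output above: backup.tar (245760 bytes) is no smaller than the four files combined (74471 + 102019 + 10737 + 47199 = 234426 bytes). TAR doesn't compress files or directories.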
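The same comparison can be reproduced with placeholder files. This is a sketch only: the files are filled with zero bytes and merely share the sizes from the listing, and the exact archive size depends on the tar implementation and its block padding.

```shell
work=$(mktemp -d)

# Placeholder files with the same sizes as f1..f4 in the listing
head -c 74471  /dev/zero > "$work/f1"
head -c 102019 /dev/zero > "$work/f2"
head -c 10737  /dev/zero > "$work/f3"
head -c 47199  /dev/zero > "$work/f4"

tar -cf "$work/backup.tar" -C "$work" f1 f2 f3 f4

# $(( ... )) strips any padding that wc may print around the number
total=$(( $(cat "$work/f1" "$work/f2" "$work/f3" "$work/f4" | wc -c) ))
archive=$(( $(wc -c < "$work/backup.tar") ))

echo "files combined: $total bytes"
echo "backup.tar:     $archive bytes"
rm -rf "$work"
```

The archive comes out at least as large as the inputs because tar adds a header block per file and pads the data to fixed-size blocks.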

Let's compress (zip) the backup.tar file

I am going to compress backup.tar into a new gzip-compressed tar file, newback.tar.gz.

# tar -zcvf newback.tar.gz backup.tar

backup.tar

Let's check the file sizes (.tar vs .tar.gz)

# ls -l
-rw-r--r-- 1 root root 245760 Jan 18 10:31 backup.tar
-rw-r--r-- 1 root root 74471 Jan 18 09:45 f1
-rw-r--r-- 1 root root 102019 Jan 18 09:45 f2
-rw-r--r-- 1 root root 10737 Jan 18 09:45 f3
-rw-r--r-- 1 root root 47199 Jan 18 09:45 f4
-rw-r--r-- 1 root root 65217 Jan 18 10:44 newback.tar.gz

We can see the difference in size between backup.tar (245760 bytes) and newback.tar.gz (65217 bytes). By default, tar doesn't compress files.
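Note that the command shown earlier wraps backup.tar inside a second archive, so extracting newback.tar.gz gives back backup.tar rather than f1 to f4. Two common alternatives are sketched below, with made-up sample files: running gzip on the existing tarball, or archiving and compressing the original files in one step.

```shell
work=$(mktemp -d)
seq 1 5000    > "$work/f1"   # sample content
seq 5001 9000 > "$work/f2"
tar -cf "$work/backup.tar" -C "$work" f1 f2

# Alternative 1: gzip the existing tarball.
# -c writes to standard output, so backup.tar itself is kept.
gzip -c "$work/backup.tar" > "$work/backup.tar.gz"

# Alternative 2: archive and compress the files in one step
tar -czf "$work/direct.tar.gz" -C "$work" f1 f2

tar_size=$(( $(wc -c < "$work/backup.tar") ))
gz_size=$(( $(wc -c < "$work/backup.tar.gz") ))
direct_list=$(tar -tzf "$work/direct.tar.gz")

echo "backup.tar:    $tar_size bytes"
echo "backup.tar.gz: $gz_size bytes"
echo "direct.tar.gz holds: $direct_list"
rm -rf "$work"
```

Either way, a single extraction with the z option returns the original files directly.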

Difference between ".tgz" and ".tar.gz"

As noted above, .tgz and .tar.gz both signify a tar file compressed with gzip.

# tar -zcvf newback1.tgz backup.tar

# ls -ld n*

-rw-r--r-- 1 root root 65217 Jan 18 11:43 newback1.tgz
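To illustrate that the extension is only a naming convention, the sketch below creates a .tar.gz, copies it to a .tgz name, and lists both with the same options. The sample file is hypothetical.

```shell
work=$(mktemp -d)
printf 'hello\n' > "$work/f1"

tar -czf "$work/newback.tar.gz" -C "$work" f1
# Same bytes under the shorter extension
cp "$work/newback.tar.gz" "$work/newback1.tgz"

list_targz=$(tar -tzf "$work/newback.tar.gz")
list_tgz=$(tar -tzf "$work/newback1.tgz")

echo ".tar.gz holds: $list_targz"
echo ".tgz holds:    $list_tgz"
rm -rf "$work"
```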


How to Unzip (Uncompress) a tar File (newback.tar.gz)

Files with the extension .tar.gz or .tgz are tar files compressed with gzip. To unzip such a file, for example newback.tar.gz, use the z option directly together with x:

# tar -xzvf newback.tar.gz

backup.tar

POINTS TO NOTE :

A .tar.gz file is a compressed tar archive. The .gz file extension is produced by the gzip program, which reduces the size of the tar file. gzip and gunzip are software applications used for file compression and decompression.

gzip is short for GNU zip; the program is a free software replacement for the compress program used in early UNIX systems.

The simplest way to decompress .tgz / .tar.gz files is:

$ tar -zxvf filename.tgz

or

$ tar -zxvf filename.tar.gz

Examples of Compressing (Zipping) and Uncompressing (Unzipping) a tar File

$ tar -xvf example.tar

Extract the contents of example.tar and display the files as they are extracted.

$ tar -cf backup.tar /home/rose/test_dir

Create a TAR file named backup.tar from the contents of /home/rose/test_dir

$ tar -zxvf example.tgz

Gunzip (uncompress) example.tgz and then extract the contents.

$ tar -tvf example.tar

List the contents of example.tar without extracting them.


How to Make a .tar File of a Whole Directory

To create a tar archive of an entire directory, including all its files and sub-directories, we can use the following options:

To create : $ tar -cvf whatever.tar /home/oracle/scripts/

To extract : $ tar -xvf whatever.tar

We can use any name in place of whatever.tar, but should keep the .tar extension.
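A small sketch of the same idea on a throwaway directory tree is shown below. The directory layout and script names are invented for the example, and -C is used so that the archive stores the relative name scripts/ rather than an absolute path.

```shell
work=$(mktemp -d)
mkdir -p "$work/scripts/sub" "$work/restore"
printf 'echo one\n' > "$work/scripts/one.sh"
printf 'echo two\n' > "$work/scripts/sub/two.sh"

# Archive the whole directory, sub-directories included
tar -cf "$work/whatever.tar" -C "$work" scripts

# Extract it somewhere else
tar -xf "$work/whatever.tar" -C "$work/restore"
restored=$(cat "$work/restore/scripts/sub/two.sh")

echo "restored sub/two.sh contains: $restored"
rm -rf "$work"
```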

GZIP and GUNZIP

gzip and gunzip: Files With .gz Extensions

gzip and gunzip are GNU file compression and decompression utilities.

gzip = compression, gunzip = decompression.

Files that have been compressed by gzip have a .gz extension. Sometimes we see a .tgz extension, which indicates a TAR file that has been compressed by gzip. The .tgz extension is a shorthand version of the .tar.gz extension.

To check whether the tools are installed on a Linux box:

$ which <tool_name>

$ which tar

Outputs the location where the tar tool is installed, like: /bin/tar

$ which gzip

Outputs the location where the gzip tool is installed, like: /bin/gzip

$ which gunzip

Outputs the location where the gunzip tool is installed, like: /bin/gunzip

If a tool is not installed, which prints no path; some versions print a message saying that the command was not found in the PATH.
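For scripts, the shell built-in command -v is a common alternative to which, since its exit status says whether a tool was found. The following is a minimal sketch; have_tool is a hypothetical helper name, not a standard command.

```shell
# have_tool NAME: succeeds if NAME can be found by the shell
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in tar gzip gunzip bzip2; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not installed"
    fi
done
```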


File Compression and Decompression

To Compress a file [gzip]

$ gzip <file_name>

$ gzip f1.txt

By default, gzip deletes the original filename.txt file (f1.txt).

The result of the command is filename.txt.gz (f1.txt.gz).

-rw-r--r-- 1 root root 102019 Jan 19 00:07 f1.txt

# gzip f1.txt

# ls -l
-rw-r--r-- 1 root root 34189 Jan 19 00:07 f1.txt.gz

To Uncompress a file [gunzip]

$ gunzip <file_name>

$ gunzip f1.txt.gz

By default, gunzip deletes the filename.txt.gz file (f1.txt.gz).

The result of the command is filename.txt (f1.txt).

-rw-r--r-- 1 root root 34189 Jan 19 00:07 f1.txt.gz

# gunzip f1.txt.gz

# ls -l

-rw-r--r-- 1 root root 102019 Jan 19 00:07 f1.txt
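The replace-in-place behaviour described above can be observed with a throwaway file. This sketch uses a generated f1.txt in a scratch directory; it also shows one way to keep the original, by sending the compressed data to standard output with -c (copy.txt.gz is a made-up name).

```shell
work=$(mktemp -d)
seq 1 3000 > "$work/f1.txt"   # sample content

gzip "$work/f1.txt"           # f1.txt is replaced by f1.txt.gz
after_gzip=$(ls "$work")

gunzip "$work/f1.txt.gz"      # f1.txt.gz is replaced by f1.txt
after_gunzip=$(ls "$work")

# Keep the original: -c writes the compressed data to standard output
gzip -c "$work/f1.txt" > "$work/copy.txt.gz"
after_copy=$(ls "$work")

echo "after gzip:    $after_gzip"
echo "after gunzip:  $after_gunzip"
echo "after gzip -c:" $after_copy
rm -rf "$work"
```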

bzip2 and bunzip2

bzip2 and bunzip2 are file compression and decompression utilities. They are newer than gzip and gunzip; bzip2 usually produces somewhat smaller files than gzip but takes longer to do so. Files compressed by bzip2 have a .bz2 extension.


Tar vs Zip

• "zip" is a file format used for both data compression and archiving.

• The "zip" file format has built-in compression.

• A zip file contains multiple files that have been compressed.

• zip combines both archiving and compression in one program.

• "tar" doesn't do compression.

• "tar" only makes a single file out of multiple files.

• The "tar" format is purely an archive format with no compression.

• "tar" was designed to collect multiple files into one archive, originally for storage on tape.

• "tar" files are uncompressed; they are usually compressed afterwards with "gzip", "bzip2" or "xz", which are purely compression formats.

• Compressing archives is so common that tar has the -z switch to apply gzip directly and the -j switch for bzip2 compression.
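The split between archiving and compression can be made visible by running the two steps separately. The sketch below, using an invented docs directory, pipes an uncompressed tar stream into gzip and compares the result with the one-step -z form; "-f -" tells tar to write the archive to standard output.

```shell
work=$(mktemp -d)
mkdir "$work/docs"
seq 1 2000 > "$work/docs/numbers.txt"   # sample content

# Two steps: tar archives to standard output, gzip compresses the stream
tar -cf - -C "$work" docs | gzip > "$work/piped.tar.gz"

# One step: tar calls gzip itself via -z
tar -czf "$work/direct.tar.gz" -C "$work" docs

piped_list=$(tar -tzf "$work/piped.tar.gz")
direct_list=$(tar -tzf "$work/direct.tar.gz")

echo "piped archive holds:"
echo "$piped_list"
rm -rf "$work"
```

Both archives list the same members, which is the sense in which -z is a convenience on top of plain tar plus gzip.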

A single file that contains multiple files is called a tarball or an archive.

Tools in Linux to Compress and Decompress data.

Compression    Uncompression       Extension
zip            unzip               .zip
gzip           gunzip, gzip -d     .gz
tar -cf        tar -xf             .tar
pack           unpack              .z
bzip2          bunzip2             .bz2

Question Task

How do you extract a single directory from an archive?

How do you create a tar file of an entire directory and its sub-directories?

How do you see a list of the files stored in a tarball or an archive?
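One possible approach to the first and last questions is sketched here, under the assumption of a small made-up project directory: t lists the members, and naming a directory after the archive restricts extraction to that directory.

```shell
work=$(mktemp -d)
mkdir -p "$work/project/src" "$work/project/docs" "$work/out"
printf 'code\n'  > "$work/project/src/main.c"
printf 'notes\n' > "$work/project/docs/readme.txt"
tar -cf "$work/project.tar" -C "$work" project

# List the files stored in the archive
tar -tf "$work/project.tar"

# Extract only the project/docs directory
tar -xf "$work/project.tar" -C "$work/out" project/docs
extracted=$(cd "$work/out" && find . -type f)

echo "extracted: $extracted"
rm -rf "$work"
```

The member name given on the command line has to match the path as it appears in the listing, so it is worth running the t form first.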
