INTRODUCTION TO INFORMATION PROCESSES AND DATA
DIFFERENT TYPES OF MEDIA
Media is the plural of medium; in the context of information, media refers to something in the middle that is used to transmit a message of some sort. This is what the press does; it transmits news, a form of information, using television, radio or print media. The term ‘multimedia’ is used to refer to information that combines text, sound, graphics and/or video. For example, the World Wide Web makes extensive use of multimedia; the types of media used are chosen to best communicate the intended information.
In this section we consider different types of media commonly used by information systems, namely:
• text,
• numbers,
• image,
• audio and
• video.
These media provide a method for representing data and communicating information.
Each media type conveys different information and is used to represent different types of data, yet computers represent all types of media in binary. Binary is a number system, just like the familiar decimal system, except rather than ten digits it uses only two, namely 0 and 1. Computers ultimately represent all the different types of media as a sequence of 0s and 1s. It is the way this data is organised that makes it meaningful and therefore able to be transformed into information.
Consider the following:
This book is primarily composed of text, hence the name textbook. In reality it uses other media, together with text, to communicate information. During your IPT studies your teacher uses this textbook together with other media to teach IPT.
GROUP TASK Discussion
Identify different types of media used during your IPT classes. For each type discuss advantages and disadvantages compared to straight text.
Text
The text media type is used to represent characters. These characters can be printable, such as letters of the alphabet, or non-printable, such as carriage returns or tabs. A sequence of characters is used to represent words, paragraphs or complete books; however, text can also be used for many other purposes. For example, phone numbers are usually represented as text: the sequence in which the digits appear is vital, yet each digit has the same meaning wherever it appears.
What makes data a candidate for the text media type? Any data that is composed of a string of characters where the order of the characters is important but each character, when considered in isolation, has a constant meaning regardless of this order. For example, the string of characters ‘The cat sat on the mat.’ is composed of 23 characters. The meaning is derived from the order in which these characters appear, yet each occurrence of, say, the letter ‘a’ has the same meaning. In contrast, consider the number 2320: the first occurrence of the character ‘2’ means 2 thousand and the second means 2 tens. Numbers are therefore not good candidates for the text media type.
There are numerous methods for representing text digitally; all these methods code each unique character into a number. The two most commonly used methods are ASCII (American Standard Code for Information Interchange), pronounced as-kee, and EBCDIC (Extended Binary Coded Decimal Interchange Code), pronounced ebb-see-dik. IBM mainframe and mid-range computers, together with devices that communicate with these machines, use EBCDIC. The ASCII system of coding text is used more widely and has become the standard for representing text digitally.
Standard ASCII represents each character using a decimal number in the range 0 to 127. This range is used as each character can then be represented in binary using just seven bits (binary digits). The table in Fig 2.9 shows the standard ASCII character set together with the decimal code for each of these characters. We can see in this table that the decimal number 65 represents ‘A’; 65 in decimal is equivalent to the seven bit binary number 1000001. The text ‘The cat sat on the mat.’ would likewise be represented in ASCII as 84 104 101 32 99 97 116 32 115 97 116 32 111 110 32 116 104 101 32 109 97 116 46 and in binary as a sequence of 23 seven bit binary numbers. Notice that in ASCII the letters of the alphabet are arranged in order, as are the digits; this greatly simplifies the sorting of text into alphabetical order. Also, the non-printable characters occupy the decimal values from 0 to 31, together with 127 (delete).
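The coding described above can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original text; it assumes Python's built-in ord(), which returns the same values as standard ASCII for these characters, and the variable names are chosen purely for illustration.

```python
# Encode a piece of text as standard ASCII: one decimal code and one
# seven bit binary pattern for each character.
text = "The cat sat on the mat."

# ord() gives the character code; for these characters it is the ASCII code.
codes = [ord(ch) for ch in text]

# format(code, "07b") writes a code in binary, padded with zeros to seven bits.
bits = [format(code, "07b") for code in codes]

print(len(text))                  # number of characters in the text
print(codes)                      # the decimal ASCII codes
print(bits)                       # the seven bit patterns
print(format(ord("A"), "07b"))    # 1000001
```

Because the codes for the letters are in alphabetical order, comparing two codes is enough to decide which letter comes first, which is why sorting ASCII text is simple.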
[Fig 2.9: table of the standard ASCII character set, listing each character (Char) with its decimal code (Dec).]
Consider the following:
The ASCII table in Fig 2.9 shows the decimal code for each character, but in reality computers represent these numbers using binary. Binary is the base 2 number system whereas decimal uses a base of 10. The decimal number 465 means 4 hundreds, 6 tens and 5 ones. Hundred, ten and one are all powers of ten, namely 10², 10¹ and 10⁰, so 465 = (4 × 10²) + (6 × 10¹) + (5 × 10⁰). In binary, rather than powers of ten we use powers of two, hence the binary number 1101 in decimal really means (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰) = 8 + 4 + 0 + 1 = 13. As computers generally work on groups of 8 bits, called a byte, it would be common to see the binary number 1101 written as 00001101. This is similar to writing 465 as 00000465; any leading zeros can be ignored.
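The place-value expansion above works the same way in any base. The following sketch, with a hypothetical function name to_decimal, multiplies each digit by the base raised to the power of its position and adds the results.

```python
def to_decimal(digits, base):
    """Expand a string of digits as a sum of digit x base**position."""
    total = 0
    # Position 0 is the rightmost digit, so work through the digits in reverse.
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total

print(to_decimal("465", 10))        # 465
print(to_decimal("1101", 2))        # 13
print(to_decimal("00001101", 2))    # 13, leading zeros make no difference
```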
Numbers
The number media type is used to represent integers (whole numbers), real numbers (decimals), currency and even dates and times. In fact any quantity that can be expressed on a numerical scale can be represented using numbers; ask yourself, is it possible to place a single example of this data on an ordered continuous line and is it possible and desirable to perform mathematical operations with this data? If the answer to these questions is yes then this data is a prime candidate to be represented as a number. Numbers have magnitude, that is, the concept of size is built into all numbers; for example, ‘15 is bigger than 10 but smaller than 20’ describes the magnitude of 15. The digits that make up numbers have different meanings dependent on their position relative to other digits in the number.
These attributes are not present in the other types of media. For example, neither images nor text have magnitude; to say that a photograph of a bird is greater than one of a building, or that this sentence is greater than the last, is meaningless.
Ultimately all data stored and processed by digital computers is represented as numbers. Computers, at their most basic level, process binary numbers by adding and comparing them; consequently all media types must be represented and processed as binary numbers.
GROUP TASK Activity
The following 8 bit binary numbers are used to represent a portion of text using standard ASCII. What does it say? Once you work it out you have my permission to call out your answer!
01001001 00100000 01101100 01101111 01110110 01100101 00100000 01001001 01010000 01010100 00100001
GROUP TASK Discussion
Is the sequence of binary numbers in the above activity data or information? Discuss.
Fig 2.10
Data suitable for use by the number media type. Examples shown: 456; -345; 16.0004440550066; -0.002; $65.45; $5,000,000; 11/07/2003; 4:44:47 PM; 11-July-03.
Computers are finite devices; they cannot represent or calculate every possible number, so there is a limit to the accuracy with which they represent and calculate numbers. As a consequence the manner in which they represent numbers is a compromise between space, speed and accuracy. As different information systems and their processes require different types of numbers and different levels of accuracy, various methods of representing numbers are in common usage.
For example, if we are counting the number of cars that pass by a given point then our data is positive whole numbers; we have no need to store decimal fractions. If we are calculating the average of a set of numbers then the fractional part of the answer is significant and a real number representation method is required.
Let us briefly consider the storage requirements, range, strengths and limitations of commonly used methods for representing integers, real numbers, currency and dates/times:
• Integers
Commonly integers are represented using the two’s complement system. This system codes the sign of each number in such a way that binary calculations need not consider the sign of the numbers. Each integer is commonly represented using either 16 bits or 32 bits; the range for 16 bit integers is from –32768 to 32767 and for 32 bit integers from –2147483648 to 2147483647. Whole number calculations within these ranges are perfectly accurate; however, results that fall outside the range cannot be represented. Any calculations resulting in fractional answers cannot be stored as integers without loss of the fractional part. For example, the result of a simple division such as 2 divided by 4 cannot be stored, as it is not a whole number.
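The ranges quoted above follow directly from the number of bits used. The sketch below is an illustration only; the function names are hypothetical, and because Python's own integers have no fixed size the 16 bit limit is imposed by the code itself.

```python
def to_twos_complement(value, bits=16):
    """Return the two's complement bit pattern for value using the given number of bits."""
    low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    # Negative values wrap around to the upper half of the unsigned range.
    return format(value % (2 ** bits), f"0{bits}b")

def from_twos_complement(pattern):
    """Convert a two's complement bit pattern back to an integer."""
    value = int(pattern, 2)
    # A leading 1 marks a negative number.
    return value - 2 ** len(pattern) if pattern[0] == "1" else value

print(to_twos_complement(32767))     # 0111111111111111
print(to_twos_complement(-32768))    # 1000000000000000
print(to_twos_complement(-1))        # 1111111111111111
print(2 // 4)                        # 0, the fractional part of 0.5 is lost
```

Asking for to_twos_complement(32768) raises an error, mirroring the fact that a 16 bit integer cannot hold results outside its range.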
• Real numbers
Real numbers are commonly represented using a system known as ‘floating-point’. Floating-point numbers are represented using a technique similar to scientific notation. For example, 1234.5678 is written in scientific notation as 1.2345678 × 10³; 1.2345678 is called the mantissa and the 3 is known as the exponent. The position of the decimal point changes (or floats) depending on the value of the exponent.
There are two common standards: single precision floating-point, which represents each number using 32 bits, and double precision floating-point, which uses 64 bits. Single precision has an approximate range of –3.4 × 10³⁸ to 3.4 × 10³⁸ and double precision has an approximate range of –1.8 × 10³⁰⁸ to 1.8 × 10³⁰⁸. Be aware that not all numbers within these ranges can be represented precisely; even simple fractions, such as ⅓, have no exact floating-point equivalent. Single precision representations are accurate to around 7 significant figures and double precision to around 15 significant figures, therefore in single precision ⅓ is represented as approximately 0.3333333 and in double precision as approximately 0.333333333333333. Be aware that repetitive calculations can multiply inaccuracies significantly. Floating-point calculations are more processor intensive than integer calculations; consequently most CPU designs include a dedicated floating-point unit (FPU).
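The limits of floating-point accuracy are easy to observe. The sketch below assumes Python, whose ordinary float type is double precision; the helper as_single is a hypothetical name that uses the standard struct module to round a value to single precision.

```python
import struct

def as_single(x):
    """Round x to single precision (32 bits) and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

third = 1 / 3                        # stored in double precision (64 bits)
print(f"{third:.20f}")               # digits beyond about the 16th are not threes
print(f"{as_single(third):.20f}")    # digits beyond about the 7th are not threes

# Repeated calculations accumulate small errors: adding 0.1 ten times
# does not give exactly 1.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)                  # False
print(abs(total - 1.0))              # a very small, but non-zero, error
```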
Fig 2.11
Integers, real numbers and floating-point. (Number-line diagram: the set of integers is marked from -4 to 4; the set of real numbers is a continuous line; floating-point represents a subset of the real numbers, for example 0.3333332, 0.3333333 and 0.3333334.)
GROUP TASK Investigation
Investigate the accuracy of calculations performed by a spreadsheet with which you are familiar. What type of representation do you think is being used for numbers?
• Currency
Financial calculations require very precise results but within a relatively restricted range. For most currency calculations accuracy must be perfect up to two decimal places. To achieve these requirements a system similar to integer representation is used but with the decimal point moved four places to the left; essentially integers are scaled by a factor of 10000. This results in a representation that is accurate to the required two decimal places. Commonly each data item is represented using 64 bits (8 bytes), resulting in an effective range of –922,337,203,685,477.5808 to 922,337,203,685,477.5807. Every decimal number with up to four decimal places can be represented precisely within this range.
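The scaled integer idea can be sketched as follows. This is an illustration under stated assumptions, not the code of any particular product: the names SCALE, to_scaled and format_scaled are hypothetical, and Python's decimal module is used only to read the amount without rounding error.

```python
from decimal import Decimal

SCALE = 10_000  # four decimal places, as described in the text

def to_scaled(amount_text):
    """Convert a decimal string such as '65.45' into a scaled integer."""
    return int(Decimal(amount_text) * SCALE)

def format_scaled(scaled):
    """Write a scaled integer back out with four decimal places."""
    sign = "-" if scaled < 0 else ""
    whole, fraction = divmod(abs(scaled), SCALE)
    return f"{sign}{whole:,}.{fraction:04d}"

print(to_scaled("65.45"))                 # 654500
print(format_scaled(to_scaled("65.45")))  # 65.4500

# Sums of scaled integers are exact, unlike the floating-point equivalent.
print(to_scaled("0.10") + to_scaled("0.20") == to_scaled("0.30"))  # True

# The limits of a 64 bit two's complement integer give the quoted range.
print(format_scaled(2 ** 63 - 1))         # 922,337,203,685,477.5807
print(format_scaled(-(2 ** 63)))          # -922,337,203,685,477.5808
```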
• Dates/Times
Many older systems coded dates and times using separate numbers for the day, month, year and time; it is now common for a single date and time to be represented as a double precision floating-point number. For example, 37816.25 converts to 6am on 14/7/2003: the whole number part is the number of days that have elapsed since 30/12/1899 and the fractional part is the fraction of the day that has elapsed. The method of representation is identical to the double precision floating-point system; it is the meaning given to the whole and fractional parts that organises the number as a date and time. The analysing process transforms these numbers into dates and times that we humans understand.
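The conversion in the example above can be checked with a short sketch. It assumes Python's standard datetime module; the names EPOCH, serial_to_datetime and datetime_to_serial are hypothetical and simply follow the day-zero convention given in the text.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1899, 12, 30)  # day zero, as described in the text

def serial_to_datetime(serial):
    """Whole part = days since the epoch, fractional part = fraction of a day."""
    return EPOCH + timedelta(days=serial)

def datetime_to_serial(moment):
    """Convert a date and time back into a single number of days."""
    return (moment - EPOCH) / timedelta(days=1)

print(serial_to_datetime(37816.25))                   # 2003-07-14 06:00:00
print(datetime_to_serial(datetime(2003, 7, 14, 6)))   # 37816.25
```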
Images
The image media type is used to represent data that will be displayed as visual information. Under this definition all information displayed on monitors and printed as hardcopy is represented as images. This is true: all monitors and printers are used to display image media. However, text and numbers are organised into image data only in preparation for display. Photographs and other types of graphical data are designed specifically for display; this is their main purpose. In these cases the method of representing the image is chosen to best suit the types of processing required. For example, the representation used when editing a photograph to be included in a commercial publication is different to that used when drawing a border around some text in a word processor. There are essentially two different techniques for representing images, bitmap and vector; let us consider each of these in turn.
• Bitmap
Bitmap images represent each element or dot in the picture separately. These dots are called pixels (short for picture elements) and each pixel can be a different colour and is represented as a binary number. The number of colours present in an image has a large impact on the overall size of the binary representation. For example, a black and white image requires only a single bit for each pixel, 1 meaning black and 0 meaning white. For 256 colours, 8 bits are required for each pixel so the image would require 8 times the storage of a similarly sized black and white bitmap image. Most colour images can have up to 16 million different colours, where each pixel is represented using 24 bits. The number of bits per pixel is often referred to as the image’s colour depth; the higher the colour depth, the more colours it includes and the larger the storage requirements for the image will be.
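The link between colour depth and the number of available colours is a power of two: each extra bit doubles the number of distinct values a pixel can hold. A minimal sketch, with the hypothetical function name colours:

```python
def colours(bits_per_pixel):
    """Number of distinct colours that one pixel can represent."""
    return 2 ** bits_per_pixel

for depth in (1, 8, 24):
    print(depth, "bits per pixel:", f"{colours(depth):,}", "colours")
```

A depth of 24 bits gives 16,777,216 colours, which is the ‘16 million’ quoted above.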
GROUP TASK Activity
Using a spreadsheet, enter various numbers and then format them as dates and times. Verify if the system used is the same as the one outlined above.
The other important parameter in regard to bitmap images is resolution. The resolution is the number of pixels the image contains and is usually expressed in terms of width by height. The image of the Alfa Romeo in Fig 2.12 has a resolution of 505 pixels by 391 pixels. When the image is enlarged each pixel is merely made larger, as shown by the jagged-looking grille inset at the top right of the photo. When using bitmap images it is vital to consider the likely display device to be used to determine the resolution required.
Bitmap images are often compressed to reduce their size prior to storage or transmission. Many different bitmap image file formats are available; some reduce the size of the image file without altering the image (lossless compression) whilst others alter the image data as part of the compression process (lossy compression). For example, the Alfa Romeo image in Fig 2.12 takes up 578 kilobytes when stored as a standard uncompressed Windows BMP file and only 28.4 kilobytes when stored using lossy compression as a JPEG file.
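The uncompressed figure can be estimated from the resolution and colour depth. The sketch below assumes the image is stored at 24 bits per pixel, which the text does not state but which matches the quoted size; it ignores the small file header and any row padding, and the function name raw_size_bytes is hypothetical.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Approximate storage for an uncompressed bitmap, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8

size = raw_size_bytes(505, 391, 24)   # assumed colour depth of 24 bits
print(size)                           # 592365 bytes
print(round(size / 1024))             # 578 kilobytes
print(round(578 / 28.4, 1))           # the JPEG is roughly 20 times smaller
```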
• Vector
Vector images represent each portion of the image mathematically. That is, the data used to generate the image is a mathematical description of each shape that makes up the final image. Each shape within a vector image is a separate object that can be altered without affecting other objects. For example, a single line within a vector image can be selected and its size, colour, position or any other property altered independently of the rest of the image. The body of the cat in Fig 2.13, for instance, has been drawn using a single filled line whose attributes can be altered independently from the rest of the image.
The total size of the data required to represent a vector image is, in most cases, less than that for an equivalent bitmap image; however, the processing needed to transform this data into a visual image is far greater. In fact all vector images must be transformed into bitmaps before they can be displayed on a monitor, printer or any other output device. Vector images can be resized to any required resolution without loss of clarity and without increasing the size of the data used to represent the image. Vector graphics are generally unsuitable for representing photographic images, as the detail required is difficult to reproduce mathematically.
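To illustrate the transformation from vector to bitmap, the following sketch describes a single circle mathematically and then converts it into bitmaps at two different resolutions. It is a simplified illustration, not how any particular drawing program works; the names circle and rasterise are hypothetical.

```python
# A vector description: one circle, given by its centre and radius on a
# canvas that runs from 0 to 1 in each direction.
circle = {"cx": 0.5, "cy": 0.5, "r": 0.4}

def rasterise(shape, size):
    """Turn the circle description into a size x size bitmap of 0s and 1s."""
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            # Test the centre of each pixel against the equation of the circle.
            px, py = (x + 0.5) / size, (y + 0.5) / size
            inside = (px - shape["cx"]) ** 2 + (py - shape["cy"]) ** 2 <= shape["r"] ** 2
            row.append(1 if inside else 0)
        rows.append(row)
    return rows

# The same three numbers produce a bitmap at any resolution.
for size in (8, 16):
    for row in rasterise(circle, size):
        print("".join("#" if pixel else "." for pixel in row))
    print()
```

The description of the circle never grows, whereas the bitmap produced from it grows with the square of the resolution.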
Fig 2.12
The resolution of bitmap images should be appropriate to the display device.
GROUP TASK Investigation
Load a photograph into a photo editor such as MS-Paint. Save this image using different formats and colour depths. Observe and document the differences in terms of storage size and clarity of the resulting images.
Fig 2.13
Vector images are represented as separate editable shapes.
Audio
The audio media type is used to represent sounds; this includes music, speech, sound effects or even a simple ‘beep’. All sounds are transmitted through the air as compression waves. Vibrations cause the molecules in the air to compress and then decompress; this compression is passed on to further molecules and so the wave travels through the air. Our ear is able to detect these waves and our brain transforms them into what we recognise as sound. The sound waves are the data and what we recognise as sound is the information.
All waves have two essential components, frequency and amplitude. Frequency is measured in hertz (Hz) and is the number of times per second that a complete wavelength occurs. Sound waves are made up of sine waves, where a wavelength is the length of a single complete waveform, that is, a half cycle of high pressure followed by a half cycle of low pressure. In terms of sound, frequency is what determines the pitch that we hear: higher frequencies result in higher pitched sounds and, conversely, lower frequencies result in lower pitched sounds. The human ear is able to discern frequencies in the range 20 to 20,000 Hz; for example, middle C has a frequency of around 262 Hz.
Amplitude determines the volume or level of the sound; very low amplitude waves cannot be heard whereas very high amplitude waves can damage hearing. Amplitude is commonly measured in decibels (dB). Decibels have no absolute value; rather they must be referenced to some starting point. For example, when used to express the pressure levels of sound waves on the human ear, 0 decibels is usually defined to be the threshold of hearing, that is, only sounds above 0 decibels can be heard. Sounds above 120 decibels are likely to cause pain.
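The idea of a reference point can be made concrete with a short calculation. The sketch below is not from the original text: it assumes the usual definition of sound pressure level, 20 × log₁₀(p ÷ p₀), with the reference pressure p₀ taken as 20 micropascals, the value conventionally used for the threshold of hearing in air. The names P0 and sound_pressure_level are hypothetical.

```python
import math

P0 = 20e-6  # assumed reference pressure in pascals (20 micropascals)

def sound_pressure_level(pressure_pa):
    """Express a sound pressure in decibels relative to the reference P0."""
    return 20 * math.log10(pressure_pa / P0)

print(sound_pressure_level(20e-6))   # 0.0, the threshold of hearing
print(sound_pressure_level(0.02))    # about 60
print(sound_pressure_level(20.0))    # about 120, likely to cause pain
```

Because the scale is logarithmic, each increase of 20 decibels corresponds to a tenfold increase in sound pressure.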
Let us now consider how audio or sound data can be represented in binary. There are two methods commonly used: the first is to sample the actual sound at precise intervals of time and the second is to describe the sound in terms of the properties of each individual note. Sampling is used when a real sound wave is converted into digital form, whereas descriptions of individual notes are generally used for computer generated sound, particularly musical compositions.
• Sampling
The level, or instantaneous amplitude, of the signal is recorded at precise time intervals and each sample is stored as a binary number. This results in a large number of points that can be joined to approximate the shape of the original sound wave. There are two parameters that affect the accuracy and quality of audio samples: the number of samples per second and the number of bits used to represent each of these samples. For example, stereo music stored on compact discs contains 44100 samples for each second of audio for each of the left and right channels, and each of these samples is 16 bits long. This means that an audio track that is 5 minutes long requires storage of 44100 samples × 300 secs × 16 bits per sample × 2 channels = 423,360,000 bits; this equates to approximately 50MB of storage. A normal audio
Fig 2.14
Sound is transmitted by compression and decompression of molecules. (Diagram labels: amplitude, wavelength, high pressure, low pressure, molecules in air.)