
OBJECTIVES In this chapter we provide an introduction to computing and engineering problem solving, including:

1.4 Data Representation and Storage

Recall that the design of the Analytical Engine’s storage unit consisted of an indefinite number of columns of discs, each disc inscribed with the 10 decimal digits (0–9). By arranging the discs of one column, a decimal number could be stored.

A modern digital computer follows a similar design except information is represented in base 2, or binary, rather than base 10. The base two number system has only two binary digits, 0 and 1, so a binary digit can be represented by one bit in a digital computer. The value of the bit at any given time can be either 0 or 1. In hardware terms, the bit is said to be off, or low, if it has a value of 0 and on, or high, if it has a value of 1.

Binary numbers can be stored in memory as a sequence of bits, called a word. The rightmost bit of a word represents the one’s position, the next bit represents the two’s position, the next bit represents the four’s position, and so on up to the leftmost bit, which represents the 2ⁿ⁻¹ position, where n is the number of bits. Each additional bit in a word increases the word size by one bit and doubles the range of values that can be represented. The number of words available in memory is referred to as the memory space, or address space.

Figure 1.4 provides a diagram of memory that has an address space of 8 (2³) and a word size of 16 (2⁴). The binary value stored at address 000 is 0000101011011101₂ = 2781₁₀. It is important to note that the word size determines the range of values that can be stored in one word of memory and the address space determines the number of words that can be stored.

The theoretical Analytical Engine has a word size defined by the number of discs on a column and a memory space defined by the number of columns.

Figure 1.4 Memory Diagram: Address Space = 8, Word Size = 16

The C++ programming language has built-in data types for representing integer values, floating point values, characters, and boolean values. Each built-in data type has a predetermined size measured in bytes, where a byte is a sequence of 8 bits. Type declaration statements are required to define identifiers and allocate memory. When an identifier is defined, a data type is specified and the required number of bytes is allocated for storage.

For example, the type declaration statement:

int iValue = 2781;

defines an identifier, iValue, that references the first byte of an integer value stored in memory.

The initial value stored in memory is the binary representation of 2781₁₀. A memory diagram is given below:

iValue ⇒ 00000000 00000000 00001010 11011101
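The memory diagram for iValue can be reproduced in code. The following is a minimal sketch, assuming a 32-bit integer; the function name bitPattern is hypothetical and chosen only for illustration. It uses std::bitset to obtain the bits and inserts a space after each byte.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: return the 32-bit pattern of value,
// most significant bit first, with a space between bytes.
std::string bitPattern(std::int32_t value) {
    // Reinterpret the value as unsigned so that bitset receives the raw bits.
    std::string bits =
        std::bitset<32>(static_cast<std::uint32_t>(value)).to_string();
    std::string grouped;
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (i > 0 && i % 8 == 0) {
            grouped += ' ';  // byte boundary
        }
        grouped += bits[i];
    }
    return grouped;
}
```

Calling bitPattern(2781) returns the same four bytes shown in the diagram. Note that the sketch shows the value of the integer, not the order in which a particular machine places the bytes in memory.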

Before the development of compilers and high-level programming languages, programmers were required to load each memory location with instructions and data in binary form.

Today, we have software (compilers, linkers, and loaders) that performs these tasks for us. In Chapter 2 we begin our discussion of C++ statements and data types, but first we will provide a brief discussion of number systems and low-level data representation to allow for a better understanding of the operations that are performed when a program is executed.

Number Systems

The base 10 number system has 10 decimal digits (0–9), and each digit in a decimal number multiplies a power of 10. Most of us have an easy time counting in base 10 and comprehending base 10 numbers. When we read the number 245₁₀, for example, we read it as two hundred forty-five. We understand that the 2 is in the hundreds position (10²), the 4 is in the tens position (10¹), and the 5 is in the ones position (10⁰). If we expand the number as follows:

2 ∗ 10² + 4 ∗ 10¹ + 5 ∗ 10⁰

we can add the terms 200 + 40 + 5 to arrive at a value of 245₁₀. In the following sections, we will discuss three number systems, binary, octal, and hexadecimal, that are useful when studying digital computer systems. We will also develop algorithms for converting numbers to different bases.

Binary Numbers. Digital computers represent data in binary form. The base two number system has two binary digits, 0 and 1, and each digit multiplies a power of 2. If we examine the binary number 0000101011011101₂, stored in memory location 000 in Figure 1.4, and we want to determine the equivalent decimal number, we can expand the number as follows, noting that the rightmost binary digit multiplies 2⁰ and the leftmost digit multiplies 2¹⁵.

0000101011011101₂ =
0 ∗ 2¹⁵ + 0 ∗ 2¹⁴ + 0 ∗ 2¹³ + 0 ∗ 2¹² + 1 ∗ 2¹¹ + 0 ∗ 2¹⁰ + 1 ∗ 2⁹ + 0 ∗ 2⁸ + 1 ∗ 2⁷ + 1 ∗ 2⁶ + 0 ∗ 2⁵ + 1 ∗ 2⁴ + 1 ∗ 2³ + 1 ∗ 2² + 0 ∗ 2¹ + 1 ∗ 2⁰
= 0 + 0 + 0 + 0 + 2048 + 0 + 512 + 0 + 128 + 64 + 0 + 16 + 8 + 4 + 0 + 1
= 2781₁₀

As we will see in the following sections, the above conversion algorithm can be applied to numbers represented in any base to obtain the equivalent base 10 value.
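The expansion above can be sketched as a short C++ function. This is one possible implementation, not a library routine: the names digitValue and toDecimal are hypothetical, and the sketch assumes bases from 2 to 16 with the letters A–F standing for the digit values 10 through 15.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: numeric value of one digit character ('0'-'9', 'A'-'F').
int digitValue(char c) {
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    throw std::invalid_argument("not a digit");
}

// Hypothetical helper: multiply each digit by its power of the base and sum the terms.
unsigned long toDecimal(const std::string& digits, int base) {
    unsigned long value = 0;
    unsigned long weight = 1;  // base^0 for the rightmost digit
    for (std::size_t i = digits.size(); i-- > 0; ) {
        int d = digitValue(digits[i]);
        if (d >= base) {
            throw std::invalid_argument("digit too large for base");
        }
        value += static_cast<unsigned long>(d) * weight;
        weight *= static_cast<unsigned long>(base);  // next power of the base
    }
    return value;
}
```

For example, toDecimal("0000101011011101", 2) evaluates the same sum as the expansion above and returns 2781. The sketch does not guard against overflow for very long digit strings.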

18 Chapter 1 Introduction to Computing and Engineering Problem Solving

Suppose that we have a decimal number that we wish to represent in base 2. The equivalent binary number will be a sequence of binary digits. To generate these digits, we will use a conversion algorithm that repeatedly divides the decimal number by 2, and records the remainder of each division as a successive digit of the equivalent binary number.

We illustrate the use of the algorithm in the example below. Note that the remainder of the first division is recorded as the least significant digit (LSD) of the equivalent binary number, the remainder of the last division is recorded as the most significant digit (MSD), and we divide until the quotient becomes zero.

Example

The above conversion algorithm can be used to convert from base 10 to any other base.

The base of the other number system defines the divisor, as we will see in the following sections.
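The repeated-division algorithm can be sketched in C++ as follows. The function name fromDecimal is hypothetical, and the sketch assumes a non-negative integer and a base from 2 to 16. Each remainder is appended as it is produced (LSD first), so the digits are reversed at the end to place the MSD on the left.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a non-negative base 10 integer
// to a digit string in the given base (2 to 16).
std::string fromDecimal(unsigned long n, unsigned base) {
    if (base < 2 || base > 16) {
        throw std::invalid_argument("base must be between 2 and 16");
    }
    const std::string symbols = "0123456789ABCDEF";
    if (n == 0) {
        return "0";
    }
    std::string digits;
    while (n > 0) {
        digits += symbols[n % base];  // remainder is the next digit, LSD first
        n /= base;                    // quotient feeds the next division
    }
    std::reverse(digits.begin(), digits.end());  // MSD belongs on the left
    return digits;
}
```

For example, fromDecimal(2781, 2) returns "101011011101", which matches the word stored at address 000 in Figure 1.4 once the leading zeros are dropped.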

Octal Numbers. The octal, or base 8, number system has 8 octal digits (0–7). Octal numbers are useful for understanding 8-bit character codes, and for setting permissions on files and directories on a Linux/UNIX platform. For example, if we have a file that we want to be readable, writable, and executable by everyone, we can use the chmod command to set the permissions on the file as follows:

chmod 777 aFile

Permissions for aFile will be set to 7₈, or 111₂, for the owner, the group, and the world. If we next do a long listing,

ls -l

we will see something like the following:

-rwxrwxrwx 1 jeaninei jeaninei 258 Jun 26 12:27 aFile

The above indicates that (r)ead, (w)rite, and e(x)ecute permission is granted to the owner, the group, and the world. If we wish to grant only read status to the group and the world, we can change the permissions with the following command:

chmod 744 aFile

A long listing will now display the following:

-rwxr--r-- 1 jeaninei jeaninei 258 Jun 26 12:27 aFile

indicating that the owner has rwx permission and the group and the world have read permission, but no write permission and no execute permission. The number 4₈ = 100₂; thus a 1 grants permission and a 0 denies permission.
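The mapping from octal digits to permission letters can be sketched in C++. The function name permissionString is hypothetical; it is not part of any system library, and it handles only the three octal digits for owner, group, and world. Each digit is split into its three bits, and a 1 bit is shown as the corresponding letter while a 0 bit is shown as a dash.

```cpp
#include <string>

// Hypothetical helper: build the nine-character rwx string
// for a three-digit octal mode such as 0744.
std::string permissionString(unsigned mode) {
    const char letters[3] = {'r', 'w', 'x'};
    std::string result;
    for (int shift = 6; shift >= 0; shift -= 3) {  // owner, group, world
        unsigned digit = (mode >> shift) & 7u;     // isolate one octal digit
        for (int bit = 2; bit >= 0; --bit) {       // 4 = read, 2 = write, 1 = execute
            result += ((digit >> bit) & 1u) ? letters[2 - bit] : '-';
        }
    }
    return result;
}
```

In C++ an integer literal with a leading 0 is octal, so permissionString(0744) returns "rwxr--r--" and permissionString(0777) returns "rwxrwxrwx".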

Each digit in an octal number multiplies a power of 8. To determine the equivalent decimal value of an octal number, we can apply the same conversion algorithm used to determine the decimal value of a binary number. Note that the base of the number being converted to decimal, 8 in this case, defines the base of the exponential terms.

Example

217₈ = ?₁₀

2 ∗ 8² + 1 ∗ 8¹ + 7 ∗ 8⁰
= 2 ∗ 64 + 1 ∗ 8 + 7 ∗ 1
= 128 + 8 + 7
= 143₁₀

To illustrate the algorithm for converting from base 10 to any other base, we will convert 143₁₀ back to base 8. We apply the algorithm using 8 as our divisor.

143₁₀ = ?₈

8 | 143
8 | 17   R7
8 | 2    R1
    0    R2

143₁₀ = 217₈

We have thus far illustrated an algorithm for converting from base 10 to any other base and an algorithm for converting from any base to base 10, but now assume that we want to convert a base 8 number to base 2. One approach would be to first convert the base 8 number to base 10, and then convert the equivalent base 10 number to base 2. This approach is valid, but since 8 is a power of 2, 8 = 2³, and each of the eight octal digits can be represented in binary using three binary digits, we can use this relationship to quickly convert from binary to octal and octal to binary, as illustrated in the next two examples. Table 1.2 lists the binary representation of the 8 octal digits for your convenience.

TABLE 1.2 Binary Representation of Octal Digits

Octal Digit    Binary Representation
0              000
1              001
2              010
3              011
4              100
5              101
6              110
7              111

Example

143₈ = ?₂

1   4   3
↓   ↓   ↓
001 100 011

143₈ = 001100011₂

Example

000101011011101₂ = ?₈

000 101 011 011 101
 ↓   ↓   ↓   ↓   ↓
 0   5   3   3   5

000101011011101₂ = 5335₈
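The two examples above can be sketched in C++. The function names octalToBinary and binaryToOctal are hypothetical. The first replaces each octal digit with its three-bit pattern from Table 1.2; the second pads the binary string on the left to a multiple of three bits and replaces each group of three bits with one octal digit. The sketch assumes the binary string contains only the characters 0 and 1.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: replace each octal digit with its three-bit pattern (Table 1.2).
std::string octalToBinary(const std::string& octal) {
    const char* patterns[8] = {"000", "001", "010", "011",
                               "100", "101", "110", "111"};
    std::string binary;
    for (char c : octal) {
        if (c < '0' || c > '7') {
            throw std::invalid_argument("not an octal digit");
        }
        binary += patterns[c - '0'];
    }
    return binary;
}

// Hypothetical helper: group bits in threes from the right,
// then replace each group with one octal digit.
std::string binaryToOctal(std::string binary) {
    while (binary.size() % 3 != 0) {
        binary.insert(binary.begin(), '0');  // pad on the left
    }
    std::string octal;
    for (std::size_t i = 0; i < binary.size(); i += 3) {
        int digit = (binary[i] - '0') * 4
                  + (binary[i + 1] - '0') * 2
                  + (binary[i + 2] - '0');
        octal += static_cast<char>('0' + digit);
    }
    return octal;
}
```

This sketch keeps leading zeros, so binaryToOctal("000101011011101") returns "05335", one digit for each group of three bits.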

Practice!

Convert each of the following base 10 numbers to base 2.

1. 921₁₀ = ?₂

2. 8₁₀ = ?₂

3. 100₁₀ = ?₂

Convert each of the following base 8 numbers to base 10.

4. 100₈ = ?₁₀

5. 247₈ = ?₁₀

6. 16₈ = ?₁₀

Convert each of the following numbers to the requested base.

7. 100₂ = ?₈

8. 3716₈ = ?₂

9. 110100111₂ = ?₁₀

10. 221₆=?8

Hexadecimal Numbers. The hexadecimal, or base 16, number system has 16 hexadecimal digits (0–9, A–F), and each digit in a hexadecimal number multiplies a power of 16. The decimal values of the hexadecimal digits represented by the letters A through F are 10, 11, 12, 13, 14, and 15, respectively.

The number 2FB₁₆ is an example of a hexadecimal number. Applying the conversion algorithm for converting from any base to base 10, we can determine the equivalent decimal number, as illustrated below.

2FB₁₆ = ?₁₀

2 ∗ 16² + F ∗ 16¹ + B ∗ 16⁰
= 2 ∗ 256 + 15 ∗ 16 + 11 ∗ 1
= 512 + 240 + 11
= 763₁₀

To illustrate the algorithm for converting from base 10 to any other base, we will convert 763₁₀ back to base 16. The remainder of each division is a value between 0 and F.

763₁₀ = ?₁₆

Division     Remainder   Hex Digit
16 | 763
16 | 47      R11         B
16 | 2       R15         F
     0       R2          2

763₁₀ = 2FB₁₆

Sixteen is a power of 2, 16 = 2⁴, and each of the hexadecimal digits can be represented in binary using four binary digits. This relationship can be used to quickly convert from binary to hexadecimal and hexadecimal to binary, as illustrated in the next two examples. Table 1.3 lists the binary representation of the hexadecimal digits for your convenience.

Example

7BF₁₆ = ?₂

7    B    F
↓    ↓    ↓
0111 1011 1111

7BF₁₆ = 011110111111₂

Example

101000001101₂ = ?₁₆

1010 0000 1101
 ↓    ↓    ↓
 A    0    D

101000001101₂ = A0D₁₆
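The same grouping idea can be sketched in C++ for hexadecimal, using groups of four bits. The function names hexToBinary and binaryToHex are hypothetical. This sketch relies on the standard library calls std::stoi (with an explicit base) and std::bitset rather than a lookup table, and it assumes the binary string contains only the characters 0 and 1.

```cpp
#include <bitset>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: replace each hexadecimal digit with its four-bit pattern.
std::string hexToBinary(const std::string& hex) {
    std::string binary;
    for (char c : hex) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) {
            throw std::invalid_argument("not a hexadecimal digit");
        }
        int value = std::stoi(std::string(1, c), nullptr, 16);  // 0 to 15
        binary += std::bitset<4>(value).to_string();
    }
    return binary;
}

// Hypothetical helper: group bits in fours from the right,
// then replace each group with one hexadecimal digit.
std::string binaryToHex(std::string binary) {
    const std::string symbols = "0123456789ABCDEF";
    while (binary.size() % 4 != 0) {
        binary.insert(binary.begin(), '0');  // pad on the left
    }
    std::string hex;
    for (std::size_t i = 0; i < binary.size(); i += 4) {
        hex += symbols[std::stoi(binary.substr(i, 4), nullptr, 2)];
    }
    return hex;
}
```

With these definitions, hexToBinary("7BF") returns "011110111111" and binaryToHex("101000001101") returns "A0D", matching the two examples above.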


TABLE 1.3 Binary Representation of Hexadecimal Digits

Hexadecimal Digit    Binary Representation
0                    0000
1                    0001
2                    0010
3                    0011
4                    0100
5                    0101
6                    0110
7                    0111
8                    1000
9                    1001
A                    1010
B                    1011
C                    1100
D                    1101
E                    1110
F                    1111

Practice!

Convert each of the following base 10 numbers to base 16.

1. 921₁₀ = ?₁₆

2. 8₁₀ = ?₁₆

3. 100₁₀ = ?₁₆

Convert each of the following base 16 numbers to base 10.

4. 1C0₁₆ = ?₁₀

5. 29E₁₆ = ?₁₀

6. 16₁₆ = ?₁₀

Convert each of the following numbers to the requested base.

7. 10010011₂ = ?₁₆

8. 3A1B₁₆ = ?₂

9. 110100111₂ = ?₁₀

10. 261₈ = ?₁₆


Data Types and Storage

When data is represented in memory, it is represented as a sequence of bits. The sequence of bits may represent an instruction, a numeric value, a character, a portion of an image or digital signal, or some other type of data. If we look at the bit sequence 01000110₂, for example, it has a decimal value of 70. It is also the American National Standards Institute (ANSI) character code for the character F.
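A short C++ sketch shows that the number 70 and the character F are stored as the same bit sequence. The function name bitsOf is hypothetical, and the sketch assumes a platform whose character set is ASCII-compatible, which is typical but not guaranteed by the language.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: return the eight-bit pattern stored for a one-byte value.
std::string bitsOf(unsigned char value) {
    return std::bitset<8>(value).to_string();
}
```

On such a platform, bitsOf(70) and bitsOf('F') both return "01000110". The bits are identical; only the data type tells the program whether to treat them as a number or as a character.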

Data representation is becoming an increasingly important and interesting field in engineering, math, and computer science. The amount of data that can be generated and processed is increasing as computers become more powerful, and the use of computers in data intensive applications such as bio-computing, communications, and signal processing presents new challenges for defining and representing data. In this section, we will discuss the digital representation of two basic data types, integer and floating point.

Integer Data Type. In the previous section, we used an algorithm to convert from base 10 to base 2. This algorithm works for any base 10 integer, so in theory we can represent, exactly, all base 10 integers in base 2. In practice, we may be limited by the word size of our computing system. Integer data is often stored in memory using 4 bytes, or 32 bits. The left most bit is reserved for the sign, leaving 31 bits for the magnitude of the number.

Consider the following example that uses a word size of 8 for simplicity. The largest signed integer that can be represented in 8 bits is 2⁷ − 1, or 127₁₀, as illustrated below.

0 1 1 1 1 1 1 1

The representation of data in a digital computer affects the efficiency of arithmetic and logic operations. Many computer systems store positive signed integers as illustrated above, and store negative signed integers in their 2’s complement form. Storing negative integers in their 2’s complement form allows for efficient execution of arithmetic operations, without checking the sign bit. First, we illustrate how to form the 2’s complement of a negative number, then we will illustrate how storing negative numbers in their 2’s complement form can simplify arithmetic operations.

The 2’s complement of a binary number is formed by negating all of the bits and adding one. Negating a bit means switching the value, or state, of a bit from 1 to 0, or from 0 to 1. Negating all the bits of a binary number forms the 1’s complement of the number. Adding one to the 1’s complement results in the 2’s complement of the number. Using a word size of 8, the 2’s complement of −127₁₀ is computed in the following example.

Example

Compute the 2’s complement representation for the value −127₁₀.

To form the 2’s complement of a negative integer, we begin with the binary representation of the unsigned value.

127₁₀ = 01111111₂

Next, we negate the bits to form the 1’s complement.

10000000₂

Finally, we add 1₁₀ = 00000001₂ to form the 2’s complement.

10000001₂

Thus, the 2’s complement representation for −127₁₀ is 10000001₂.

Notice what happens when we add 127₁₀ to the 2’s complement of −127₁₀.

  01111111₂   127₁₀
+ 10000001₂   2’s complement of −127₁₀
= 00000000₂   result of addition is 0

The carry out of the leftmost bit does not fit in the 8-bit word and is discarded.

The 2’s complement form for representing signed integers has the property that adding a positive integer, n, to the 2’s complement of n results in zero for all n. Another important property of 2’s complement representation is that there is a unique representation for binary 0.
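The steps for forming a 2’s complement, and the property that a value and its 2’s complement sum to zero, can be sketched in C++ with 8-bit words. The function names twosComplement and addWords are hypothetical. The casts to std::uint8_t keep only the low 8 bits, which models discarding the carry out of the leftmost bit.

```cpp
#include <cstdint>

// Hypothetical helper: negate all bits (1's complement), then add one,
// keeping only 8 bits.
std::uint8_t twosComplement(std::uint8_t value) {
    std::uint8_t onesComplement = static_cast<std::uint8_t>(~value);
    return static_cast<std::uint8_t>(onesComplement + 1);
}

// Hypothetical helper: add two 8-bit words; any carry out of the
// leftmost bit is discarded by the cast.
std::uint8_t addWords(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```

For example, twosComplement(127) yields the bit pattern 10000001 (0x81), and addWords(127, twosComplement(127)) yields 0. Applying twosComplement to 0 gives 0 again, which is why there is only one representation of zero.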

When performing addition on signed integers represented in 2’s complement form, the result, if negative, will be in its 2’s complement form, as shown in the following examples.

Example

Addition of 2 binary numbers, positive result.

  11110110   −10₁₀ represented in 2’s complement
+ 00001101   13₁₀
= 00000011   3₁₀

Example

Addition of 2 binary numbers, negative result.

  00001010   10₁₀
+ 11110011   −13₁₀ represented in 2’s complement
= 11111101   −3₁₀ represented in 2’s complement

Practice!

Find the 2’s complement of the following integers:

1. 11001111₂

2. −192₁₀

3. −45₈

Floating Point Data Type. A floating point number, or real number, such as 12.25₁₀ includes a decimal point. The digits to the left of the decimal point form the integral part of the number and the digits to the right of the decimal point form the fractional part of the number.

The fractional part of a decimal number can be converted to binary by repeatedly multiplying the fractional part by 2 and recording the carry bits until the fractional part becomes zero. This algorithm is illustrated in the following example.

Example

Convert 12.25₁₀ to binary.

First, the integral part, 12₁₀, is converted to binary:

2 | 12
2 | 6    R0
2 | 3    R0
2 | 1    R1
    0    R1

12₁₀ = 1100₂

Next, we convert the fractional part, .25₁₀, to binary by repeatedly multiplying the fractional part by 2 and recording the carry bits:

.25 ∗ 2 = 0.5   C0
.5 ∗ 2 = 1.0    C1

.25₁₀ = .01₂

The integral and fractional parts are now combined to form the equivalent binary value. Thus, the value 12.25₁₀ = 1100.01₂.

To convert the floating point binary number 1100.01₂ back to decimal, the binary digits to the right of the decimal point are multiplied by negative powers of 2, as shown below.

1100.01₂ =
1 ∗ 2³ + 1 ∗ 2² + 0 ∗ 2¹ + 0 ∗ 2⁰ + 0 ∗ 2⁻¹ + 1 ∗ 2⁻²
= 1 ∗ 8 + 1 ∗ 4 + 0 ∗ 2 + 0 ∗ 1 + 0 ∗ (1/2) + 1 ∗ (1/4)
= 8 + 4 + 0 + 0 + 0 + 0.25
= 12.25

1100.01₂ = 12.25₁₀

In the above example, we found an exact binary representation for the value 12.25₁₀. Unfortunately, many decimal floating point values have only an approximate binary representation, as illustrated in the next example.

Example

Convert 12.6₁₀ to binary.

The integral portion, 12₁₀, has been shown to equal 1100₂. Converting the fractional portion, we repeatedly multiply by 2 and record the carry bits.

.6 ∗ 2 = 1.2   C1
.2 ∗ 2 = 0.4   C0
.4 ∗ 2 = 0.8   C0
.8 ∗ 2 = 1.6   C1
.6 ∗ 2 = 1.2   C1
.2 ∗ 2 = 0.4   C0
.4 ∗ 2 = 0.8   C0
.8 ∗ 2 = 1.6   C1

We can see that we are repeating the bit pattern 1001 and will never arrive at a terminating value of zero. Thus, 0.6₁₀ corresponds to the repeating binary number 0.100110011001...₂, which must be cut off at some point to be stored. Keeping 8 bits of precision, we have:

12.6₁₀ ≈ 1100.10011001₂
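The repeated-multiplication algorithm for the fractional part can be sketched in C++. The function name fractionToBinary is hypothetical. To keep the arithmetic exact, this sketch represents the decimal fraction as an integer numerator and denominator (for example, 6 and 10 for 0.6) instead of using a floating point variable, and it stops either when the fraction becomes zero or after a chosen number of bits.

```cpp
#include <string>

// Hypothetical helper: convert the fraction numerator/denominator (less than 1)
// to binary digits by repeated doubling. Stops when the fraction reaches
// zero or after maxBits carry bits have been recorded.
std::string fractionToBinary(unsigned long numerator,
                             unsigned long denominator,
                             int maxBits) {
    std::string bits;
    while (numerator != 0 && static_cast<int>(bits.size()) < maxBits) {
        numerator *= 2;                  // multiply the fractional part by 2
        if (numerator >= denominator) {  // a carry into the ones position
            bits += '1';
            numerator -= denominator;    // keep only the fractional part
        } else {
            bits += '0';
        }
    }
    return bits;
}
```

With this sketch, fractionToBinary(25, 100, 8) returns "01" because the fraction terminates, while fractionToBinary(6, 10, 8) returns "10011001", the first 8 bits of a pattern that never terminates.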

It is important to know that the binary representation of a floating point decimal value is often an approximation, not an equality. This affects the way we use and test floating point values in programs, and it also affects the accuracy of numerical calculations.
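One common consequence is that floating point values are usually compared using a tolerance rather than the == operator. The following is a minimal sketch of that idea, not a general-purpose solution: the function name nearlyEqual and the default tolerance of 1e-9 are assumptions chosen for illustration, and a suitable tolerance depends on the magnitude of the values being compared.

```cpp
#include <cmath>

// Hypothetical helper: true when a and b differ by no more than tolerance.
// The default tolerance is an arbitrary choice for this sketch.
bool nearlyEqual(double a, double b, double tolerance = 1e-9) {
    return std::fabs(a - b) <= tolerance;
}
```

On systems that use IEEE 754 double precision, the test 0.1 + 0.2 == 0.3 typically evaluates to false because none of the three values has an exact binary representation, whereas nearlyEqual(0.1 + 0.2, 0.3) evaluates to true.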

In document Engineering Problem Solving with C++ -Delores M. Etter, Jeanine A. Ingber- 3rd ED (Page 34-44)