MODULE II
Data types - Specification of data types, implementation of elementary data types, Declarations, type checking and type conversion, Assignment and Initialisation. Structured data types - Specification of data structure types, Implementation of data structure types, Declarations and type checking for data structures.
2.1 DATA TYPES
A data type is a class of data objects together with a set of operations for creating and manipulating them. A program deals with particular data objects, such as an array A, the integer variable X, or the file F; a programming language necessarily deals more commonly with data types, such as the class of arrays, integers, or files, and the operations provided for manipulating arrays, integers, or files.
Every language has a set of primitive data types that are built into the language. In addition, a language may provide facilities that allow the programmer to define new data types.
The basic elements of a specification of a data type are as follows:
1. The attributes that distinguish data objects of that type,
2. The values that the data objects of that type may have, and
3. The operations that define the possible manipulations of data objects of that type.
The following are the basic elements of the implementation of a data type:
1. The storage representation that is used to represent the data objects of the data type in the storage of the computer during program execution, and
2. The manner in which the operations defined for the data type are represented in terms of algorithms or procedures that manipulate the chosen storage representation of the data object.
SPECIFICATION OF ELEMENTARY DATA TYPES
An elementary data object contains a single data value. A class of such data objects over which various operations are defined is termed an elementary data type. Its specification includes:
• Attributes
• Values
• Operations
Attributes
Attributes distinguish data objects of a given type. The data type and name are invariant during the lifetime of the object.
Approaches:
• stored in a descriptor and used during program execution
• used only to determine the storage representation, not used explicitly during execution
Values
• The data type determines the values that a data object of that type may have.
• Specification: usually an ordered set, i.e. it has a least and a greatest value.
Operations
Operations define the possible manipulations of data objects of that type.
• Primitive - specified as part of the language definition
• Programmer-defined (as subprograms, or class methods)
Operation signature
Specifies the domain and the range:
• the number, order and data types of the arguments in the domain,
• the number, order and data type of the resulting range.
Mathematical notation for the specification:
op name: arg type x arg type x … x arg type → result type
The action is specified in the operation implementation. The signature alone does not capture everything an operation may do:
• Implicit arguments, e.g. use of global variables
• Implicit results - the operation may modify its arguments
• Self-modification - usually through change of local data between calls, e.g. random number generators change the seed.
IMPLEMENTATION OF ELEMENTARY DATA TYPES
The implementation of an elementary data type includes:
• Storage representation
• Implementation of operations
Storage representation
Influenced by the hardware. Described in terms of:
• Size of the memory block required
• Layout of attributes and data values within the block
Implementation of operations
• Hardware operation: direct implementation, e.g. integer addition
• Subprogram/function, e.g. square root operation
• In-line code: instead of using a subprogram, the code is copied into the program at the point where the subprogram would have been invoked
DECLARATIONS
• Information about the name and type of data objects needed during program execution.
Explicit – programmer defined
Implicit – system defined
Examples of implicit declarations:
FORTRAN - the first letter in the name of the variable determines the type
Perl - the variable is declared by assigning a value
    $abc = 'a string'    $abc is a string variable
    $abc = 7             $abc is an integer variable
Declaration of Operations
Prototypes of the functions or subroutines that are programmer-defined.
Examples:
declaration: float Sub(int, float)
signature: Sub: int x float → float
Purpose of Declaration
• Choice of storage representation
• Storage management
• Polymorphic operations
• Static type checking
TYPE CHECKING AND TYPE CONVERSION
Type checking: checking that each operation executed by a program receives the proper number of arguments of the proper data types.
Static type checking is done at compilation.
Dynamic type checking is done at run-time.
• Strong typing: all type errors can be statically checked
• Type inference: implicit data types, used if the interpretation is unambiguous.
Type Conversion and Coercion:
Coercion: implicit type conversion, performed by the system.
Explicit conversion: routines to change from one data type to another.
Pascal: the function round converts a real type into integer
C: cast, e.g. (int)X for float X converts the value of X to type integer
Coercion:
Two opposite approaches
• No (or almost no) coercions; any type mismatch is considered an error: Pascal, Ada
• Coercions are the rule. Only if no conversion is possible is an error reported.
ASSIGNMENT AND INITIALIZATION
Assignment: the basic operation for changing the binding of a value to a data object.
The assignment operation can be defined using the concepts of L-value and R-value.
L-value: the location (address) of a data object.
R-value: the contents (value) stored at that location.
Value, by itself, generally means R-value.
Example:
A = A + B;
• Pick up contents of location A: R-value of A
• Add contents of location B: R-value of B
• Store result into address A: L-value of A
Initialization:
Uninitialized data object - a data object has been created, but no value is assigned, i.e. only allocation of a block of storage has been performed.
Initialization can be done in two ways: implicit and explicit initialization.
STRUCTURED DATA TYPES
A data structure is a data object that contains other data objects as its elements or components.
Specifications
Number of components
    Fixed size - arrays
    Variable size – stacks, lists. Pointers are used to link components.
Type of each component
    Homogeneous – all components are the same type
    Heterogeneous – components are of different types
Maximum number of components
Organization of the components
    simple linear sequence
    multidimensional structures:
        separate types (Fortran)
        vector of vectors (C++)
Operations on data structures
Component selection operations
    Sequential
    Random
Insertion/deletion of components
Whole-data structure operations
    Creation/destruction of data structures
Implementation of data structure types
Storage representation
Includes:
a. storage for the components
b. optional descriptor - to contain some or all of the attributes
Sequential representation: the data structure is stored in a single contiguous block of storage that includes both descriptor and components. Used for fixed-size structures and homogeneous structures (arrays, character strings).
Linked representation: the data structure is stored in several noncontiguous blocks of storage, linked together through pointers. Used for variable-size structures (trees, lists).
Stacks, queues, and lists can be represented in either way. Linked representation is more flexible and ensures true variable size; however, it has to be software simulated.
Implementation of operations on data structures
Component selection in sequential representation: base address plus offset calculation. Add the component size to the current location to move to the next component.
Component selection in linked representation: move from address location to address location, following the chain of pointers.
Storage management
Access paths to a structured data object - to ensure access to the object for its processing. Created using a name or a pointer.
Two central problems:
Garbage – the data object is bound but the access path is destroyed. Memory cannot be unbound.
Dangling references – the data object is destroyed, but the access path still exists.
Declarations and type checking for data structures
What is to be checked:
Existence of a selected component
Type of a selected component
Vectors and arrays
A vector - one dimensional array
A matrix - two dimensional array
Multidimensional arrays
A slice - a substructure in an array that is also an array, e.g. a column in a matrix.
Implementation of array operations:
• Access - can be implemented efficiently if the length of the components of the array is known at compilation time. The address of each selected element can be computed using an arithmetic expression.
• Whole array operations, e.g. copying an array - may require much memory.
Associative arrays
Instead of using an integer index, elements are selected by a key value that is a part of the element. Usually the elements are sorted by the key and binary search is performed to find an element.
Records
A record is a data structure composed of a fixed number of components of different types. The components may be heterogeneous, and they are named with symbolic names.
Specification of attributes of a record:
Number of components
Data type of each component
Selector used to name each component.
Implementation:
Storage: single sequential block of memory where the components are stored sequentially.
Selection: provided the type of each component is known, the location can be computed at translation time.
Note on efficiency of storage representation:
For some data types storage must begin on specific memory boundaries (required by the hardware organization). For example, integers must be allocated at word boundaries (e.g. addresses that are multiples of 4). When the structure of a record is designed, this fact has to be taken into consideration. Otherwise the actual memory needed might be more than the sum of the length of each component in the record. Here is an example:
    struct employee
    { char Division;
      int IdNumber; };
The first variable occupies one byte only. The next three bytes will remain unused and then the second variable will be allocated to a word boundary.
Careless design may result in doubling the memory requirements.
Other structured data objects
Records and arrays with structured components: a record may have a component that is an array, an array may be built out of components that are records.
Lists and sets: lists are usually considered to represent an ordered sequence of elements, sets - to represent an unordered collection of elements.
Executable data objects
In most languages, programs and data objects are separate structures (Ada, C, C++). Other languages, however, do not distinguish between programs and data - e.g. PROLOG. Data structures are considered to be a special type of program statements and all are treated in the same way.