
The last step is to load the data into the schema, often called populating the database.
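As a minimal sketch of what populating can look like, assume a hypothetical `parcels` table standing in for one feature class. The table name, field names, and sample rows below are illustrative assumptions, not part of the Greenvalley design, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: one table standing in for a feature class.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels ("
    " parcel_id INTEGER PRIMARY KEY,"
    " land_use TEXT NOT NULL,"
    " area_ha REAL CHECK (area_ha > 0))"
)

# Illustrative rows; in practice these would be read from source files.
rows = [(1, "residential", 0.8), (2, "park", 12.5), (3, "commercial", 2.1)]

# Populate the database inside a single transaction so a failed row
# rolls back the whole load.
with conn:
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)

loaded = conn.execute("SELECT COUNT(*) FROM parcels").fetchone()[0]
print(loaded)
```

Loading inside one transaction means the constraints defined in the schema are checked for every row before any of the data become permanent.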

1.4 Summary

Data modeling is a fundamental concern in GIS databases. It deals with data classes, information categories, evidence corroboration, knowledge building, and wisdom perspectives. We defined these terms to give a clearer sense of their differences and relatedness, which helps with the data modeling effort. The differences and similarities do not come easily; you have to work with the concepts before they become second nature. The purpose of elucidating these five levels of knowing is to give readers the perspective that GIS is not just about data and databases, but extends through higher levels of knowing. Understanding those terms sets the stage for understanding data models and database models.

Data modeling is a process of creating database designs. Data models and database models are both used to create and implement database designs. We can differentiate data models and database models in terms of the level of abstraction in a data modeling language. A database design process creates several levels of database descriptions, some oriented to human communication and others to computer-based computation. The terms conceptual, logical, and physical have been used to differentiate levels of data modeling abstraction. A data model is the foundational framework that underlies the expression of a database model, i.e., we use data models to design database models. A database model is a particular design of a database, i.e., the design has some real-world substantive focus, not just an abstract expression of data constructs. A conceptual data model organizes and communicates the meaning of data categories in terms of object (entity) classes, attributes, and (potential) relationships. Logical data models (e.g., object, relational, or object-relational) are the underlying formal frames for database management system software. A physical data model expresses physical storage units and includes capabilities to specify performance enhancements such as indexing mechanisms that sort data records. Each of those data models can have a corresponding database model for a particular set of information categories.
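The three levels can be illustrated with a small sketch. It assumes a hypothetical stream and watershed example and uses SQLite as a stand-in relational system; the class, table, and index names are illustrative assumptions, not part of any ArcGIS data model.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual level: an entity class described by its attributes and
# the classes it is (potentially) related to. No storage detail yet.
@dataclass
class EntityClass:
    name: str
    attributes: list
    related_to: list

stream = EntityClass("Stream", ["stream_id", "name", "length_km"], ["Watershed"])

# Logical level: the same categories expressed in the relational model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watershed (watershed_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE stream ("
    " stream_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " length_km REAL,"
    " watershed_id INTEGER REFERENCES watershed(watershed_id))"
)

# Physical level: a performance enhancement that does not change the
# meaning of the data, here an index on the foreign key column.
conn.execute("CREATE INDEX idx_stream_watershed ON stream (watershed_id)")

# List the index names SQLite now keeps for the stream table.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('stream')")]
print(index_names)
```

The conceptual description could be handed to a person, the table definitions to a database management system, and the index only matters to how that system stores and retrieves records.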

There are three special aspects to data models: constructs, operations (that establish relationships), and integrity/validity constraints (rules). As the first aspect, spatial data constructs in geospatial data models are composed of geospatial object classes, which some people also call data construct types. The second major aspect concerns operations, i.e., relationships among constructs. Operations are a way of deriving relationships. The third major aspect is the set of rule types that constrain operations on data elements. A validity rule maintains the valid character of data: no data should be stored in a database that does not conform to the construct type being manipulated at the time. Differences among data models dictate differences in the data constructs used to store data, in the operations for retrieving and storing those data, and in the validity constraints used to ensure a robust database.
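A minimal sketch of the three aspects, under the assumption of two hypothetical construct types (`Point` and `Line`) and two illustrative operations (`length` and `touches`); none of these names come from a particular GIS product.

```python
from dataclasses import dataclass
from math import hypot

# Constructs: the object classes the data model offers.
@dataclass(frozen=True)
class Point:
    x: float
    y: float

@dataclass(frozen=True)
class Line:
    vertices: tuple  # ordered Point objects

    def __post_init__(self):
        # Validity rules: reject data that do not conform to the construct type.
        if len(self.vertices) < 2:
            raise ValueError("a line needs at least two vertices")
        if not all(isinstance(v, Point) for v in self.vertices):
            raise TypeError("line vertices must be Point constructs")

# Operations: derive values and relationships from constructs.
def length(line):
    """Sum of the straight segment lengths between consecutive vertices."""
    pairs = zip(line.vertices, line.vertices[1:])
    return sum(hypot(b.x - a.x, b.y - a.y) for a, b in pairs)

def touches(line, point):
    """Derive a relationship: is the point one of the line's endpoints?"""
    return point in (line.vertices[0], line.vertices[-1])

road = Line((Point(0.0, 0.0), Point(3.0, 4.0)))
print(length(road), touches(road, Point(3.0, 4.0)))
```

Placing the validity rules in the constructor means an invalid line can never be created, which mirrors the idea that nonconforming data should not be stored at all.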

ArcGIS software includes a large (but still not complete) set of data models: the raster or image/grid data model, the triangulated irregular network (TIN) data model, the shapefile data model, the coverage data model, and the geodatabase data model. The TIN and the grid are often used to represent continuous surfaces. The shapefile, coverage, and geodatabase data models are used for storing points, lines, and areas that represent mostly discrete features.
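The contrast between a continuous surface and discrete features can be sketched with plain Python structures. This is not how ArcGIS stores either kind of data; the dictionary layout, the function names, and all values below are assumptions made for illustration.

```python
# Continuous surface as a grid: every cell carries a value (elevations are made up).
grid = {
    "origin": (0.0, 0.0),    # x, y of the lower-left corner
    "cell_size": 10.0,       # cell width and height in map units
    "values": [              # values[row][col]; row 0 is the southernmost row
        [100.0, 102.0, 105.0],
        [101.0, 104.0, 108.0],
    ],
}

def value_at(grid, x, y):
    """Return the surface value of the cell containing (x, y)."""
    ox, oy = grid["origin"]
    col = int((x - ox) // grid["cell_size"])
    row = int((y - oy) // grid["cell_size"])
    n_rows, n_cols = len(grid["values"]), len(grid["values"][0])
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise ValueError("location is outside the grid")
    return grid["values"][row][col]

# Discrete features: each feature pairs a geometry with attributes.
features = [
    {"geometry": ("point", (12.0, 7.0)), "attrs": {"kind": "well"}},
    {"geometry": ("line", ((0.0, 0.0), (25.0, 15.0))), "attrs": {"kind": "road"}},
]

def features_of_kind(features, kind):
    """Select features by an attribute value."""
    return [f for f in features if f["attrs"]["kind"] == kind]

elevation = value_at(grid, 25.0, 12.0)
wells = features_of_kind(features, "well")
print(elevation, len(wells))
```

The grid answers a question at any location inside its extent, while the feature list only has something to say where a feature exists.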

We use data models to create database models. Database models are the outcomes of a database design process. We introduced a geodatabase database design process as a data modeling process consisting of nine steps spread across the three levels of data models: conceptual, logical, and physical. The conceptual design process that forms a conceptual database model consists of four steps: 1) identifying the information products or the research question to be addressed, 2) identifying the key thematic layers and feature classes, 3) detailing all feature class(es), and 4) grouping representations into datasets. The logical design process that forms a logical database model consists of two steps: 1) defining attribute database structure and behavior for feature classes and 2) defining spatial properties of datasets. The physical design process that forms a physical database model consists of three steps: 1) data field specification, 2) implementation of the schema, and 3) populating the database. The outcome of that process was an extended Greenvalley database design and database.

1.6 Review Questions

1. Differentiate among data, information and evidence.

2. Why is it important to differentiate evidence from knowledge?

3. Why is it useful to understand the difference between a data model and a database model when choosing a software system versus choosing the data categories to develop an application?

4. Why do we have three levels of database abstraction: conceptual, logical, and physical models?

5. What are the three components of every conceptual, logical and physical data model?

6. What is the difference between an image and grid data model?

7. Why did ESRI develop the geodatabase data model?

8. What is a general process for undertaking database design?

9. Why is a concerns hierarchy important to database design?

1.7 Glossary

class – a generic term for a data category composed by bundling observations of like kinds; for example a feature class in ArcGIS

data – raw observations for characterizing past, present, future, or imaginary topic (reality)

data model – the collection of constructs, operations, and constraints that form the basis of a data management system; a concept directed at software design but useful in characterizing the capabilities of a database management system. A data model can be specified at conceptual, logical, and physical levels.

database design – process composed of specifying conceptual, logical, and physical schemas as the major steps in formulating a database model.

database model – a schema and data dictionary associated with the outcomes of a particular database design process.

data record – a collection of data fields.

data structure – a way of organizing data; a concept similar to abstract data type.

data type format – the specification of a data field in terms of ways of storing data, for example as a floating point number, integer, text (character) string, blob, etc.

data type, abstract – the specification of a class using data fields in a conceptual manner at the level of a conceptual data model.

evidence – information assembled to encourage a common interpretation of a collection of observations.

information – data situated in a context that takes on meaning for use

knowledge – evidence brought together that reinforces (corroborates) an interpretation of data, information, and/or evidence and has withstood challenges about its validity.

rules, for a data model – statements about the way to test the believability of data; for example, validity rules establish the correctness of data stored in a database.

schema – description of data categories plus data fields that characterize features (entities) about
