EEOS 381 - Spatial Databases and GIS Applications

Academic year: 2021


(1)

Lecture 3

GIS Data Models and Data Formats

EEOS 381 - Spatial Databases and GIS Applications

(2)

Overview

– GIS Data Models
– Common GIS Data Formats

(3)

Overview

Key points:

– It is important to understand what model to use, based on the application
– The model determines what specific format you use
– The format may determine what types of analysis you perform

(4)

Data Model

General definition:

– Abstraction or representation of objects and processes in the real world, incorporating properties relevant to the application at hand

(5)

GIS Data Model

Definition:

– Digital representation of geographic objects (spatial data) in GIS software
  • includes relationships between and attributes of objects
  • doesn’t include all of reality

(6)

GIS Data Models

The role of a data model in GIS

(7)

GIS Data Models

Levels of abstraction:

– Reality: Real-world phenomena, e.g. wells, streets, lakes
– Conceptual Model: Decide which objects are applicable, what relationships exist among them, what processes they participate in
– Logical Model: List objects, with names, descriptions, behavior, interaction, location, what GIS will do

(8)

GIS Data Models

Example implementation:

– Reality: Wells, dry cleaners, streets
– Conceptual Model: Ask “How does pollution from dry cleaners and major roads affect public water supplies (wells and reservoirs)?”
– Logical Model: Use ArcGIS to compare wells (points), reservoirs (polygons), dry cleaners (points) and streets (lines), with buffer and proximity operations; focus on wells with 100+ gallons per minute yield and major roads, in eastern Mass.
– Physical Model: BUFFER shapefile WELLS_PT, join to GPM table on YIELD field; determine how many dry cleaners are within 1 mile of large wells, and the proximity of reservoirs and wells to major roads; store in Oracle-based ArcSDE geodatabase
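The buffer-and-count step of the physical model can be sketched without any GIS software. This is only an illustration in plain Python: the sample coordinates, the well and dry cleaner IDs, and the assumption that coordinates are in feet are all hypothetical, and a real workflow would use the ArcGIS BUFFER and join tools named above.

```python
import math

MILE_FT = 5280.0  # assumption: coordinates are in feet

# Hypothetical sample data: (id, x, y, yield in gallons per minute)
wells = [("W1", 0.0, 0.0, 150), ("W2", 20000.0, 0.0, 40), ("W3", 0.0, 9000.0, 300)]
# Hypothetical dry cleaner locations: (id, x, y)
dry_cleaners = [("D1", 3000.0, 4000.0), ("D2", 0.0, 6000.0), ("D3", 19000.0, 500.0)]

def cleaners_near_large_wells(wells, cleaners, min_gpm=100, radius=MILE_FT):
    """Count dry cleaners within `radius` of each well yielding at least `min_gpm`."""
    counts = {}
    for well_id, wx, wy, gpm in wells:
        if gpm < min_gpm:
            continue  # keep only the large wells
        counts[well_id] = sum(
            1 for _, cx, cy in cleaners if math.hypot(cx - wx, cy - wy) <= radius
        )
    return counts

result = cleaners_near_large_wells(wells, dry_cleaners)
```

With this sample data, `result` holds one dry cleaner for W1 and one for W3; W2 is skipped because its yield is below 100 GPM.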

(9)

GIS Data Models – 2 Conceptual Views

Discrete objects

– World is empty except where occupied by objects with well-defined locations and/or boundaries
  • e.g. wells, streets, lakes

Fields

– Measurements may be made at any location over a continuous surface
  • e.g. elevation,

(10)

GIS Data Models

(11)

GIS Data Models

Raster is a data model

– space is divided into array (rows and columns) of cells
– each cell (pixel, or picture element) in a layer is the same size and has a homogeneous value
  • cell size refers to resolution (10m, 1 foot, etc.)
– usually associated with field view
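A minimal sketch of the raster idea: a layer is an array of equally sized cells, and a map coordinate is turned into a row and column by dividing by the cell size. The origin, cell size, and cell values here are made up for illustration.

```python
CELL_SIZE = 10.0                     # resolution in map units (e.g. 10 m)
ORIGIN_X, ORIGIN_Y = 500.0, 1000.0   # hypothetical upper-left corner of the grid

# One value per cell; rows run top to bottom, columns left to right.
grid = [
    [12, 12, 15],
    [11, 13, 16],
    [10, 14, 18],
]

def cell_at(x, y):
    """Return (row, col) of the cell containing map coordinate (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)  # rows count downward from the top
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("coordinate falls outside the raster")
    return row, col

def value_at(x, y):
    """Return the single, homogeneous value of the cell containing (x, y)."""
    row, col = cell_at(x, y)
    return grid[row][col]
```

Every location inside a cell returns the same value, which is why cell size sets the resolution of the layer.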

(12)

GIS Data Models

Raster - examples

(13)

GIS Data Models

Raster

– Cells may belong to zones (groups of cells with same values, usually representing the same feature)
– Can include ‘NODATA’ - null values (out of range of dataset or no information available for that cell)
– Some image formats can include attributes (value attribute table)
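To show how zones and NODATA cells interact in analysis, here is a small zonal-mean sketch. It assumes a separate zone layer aligned cell-for-cell with a value layer, and a NODATA sentinel of -9999; both are illustrative choices, not something the slides prescribe.

```python
NODATA = -9999  # assumed sentinel; real raster formats define their own

# Zone layer: cells with the same value belong to the same zone.
zones = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
# Value layer aligned with the zone layer; some cells carry no information.
values = [
    [4.0, 6.0, 10.0],
    [NODATA, 12.0, 14.0],
    [1.0, NODATA, NODATA],
]

def zonal_mean(zones, values, nodata=NODATA):
    """Average the value layer per zone, skipping NODATA cells."""
    totals, counts = {}, {}
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            if value == nodata:
                continue
            totals[zone] = totals.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: totals[zone] / counts[zone] for zone in totals}

means = zonal_mean(zones, values)
```

Skipping NODATA cells keeps the sentinel value from distorting the statistics.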

(14)

GIS Data Models

Raster

– Advantages:
  • A simple data structure: a matrix of cells with values, representing a coordinate, sometimes linked to an attribute table.
  • A powerful format for intense statistical and spatial analysis; perform overlays with complex data faster than with vector data.
    – “Spatial Analyst” extension in ArcGIS
  • The ability to represent continuous surfaces and perform surface analysis.
  • The ability to uniformly store points, lines, polygons, and surfaces.

(15)

GIS Data Models

Raster

– Disadvantages:
  • Inherent spatial inaccuracies due to the cell-based feature representation, especially if low resolution.

(16)

GIS Data Models

Vector is a data model

– points - single coordinate values

– lines (arcs) - strings of connected points

– polygons (areas) - enclosed lines

– usually associated with discrete object view
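The three vector primitives can be sketched as plain coordinate lists. The coordinates are invented, and the helper names `line_length` and `polygon_area` are illustrative, not part of any GIS package.

```python
import math

point = (2.0, 3.0)                                          # single coordinate pair
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                # string of connected points
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # enclosing ring, closure implied

def line_length(coords):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def polygon_area(ring):
    """Shoelace formula; the last vertex connects back to the first."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Because the coordinates are stored exactly, length and area come straight from the geometry with no cell-size approximation.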

(17)

GIS Data Models

Vector – the basics

– POINT - location with a set of coordinates (0-D)
– LINE – connected string of points (1-D)
  • 2 line segments (a direct line between two points) shown here
– POLYGON – area defined by a line (2-D)

(18)

GIS Data Models

Vector (other

[Figure: vector object types, each labeled with its definition]

– topological junction, or endpoint of line
– direct connection between two nodes
– sequence of line segments
– directed sequence of nonintersecting line segments with nodes
– an area defined by an outer ring without inner rings
– sequence of any line segments with closure
– curve string
– an area defined by an outer ring with inner rings
– a link between two nodes, with one direction designated

(19)

GIS Data Models

Vector

– Advantages:
  • Precise values
  • Efficient storage
  • Topological relationships
  • High-quality cartographic output
  • Useful for a variety of spatial analyses

(20)

GIS Data Models

Vector

– Disadvantages:
  • Poor for storing continuous surfaces (e.g. elevation models)
  • Overlay operations can be time-consuming and computer intensive

(21)

GIS Data Models

Vector

– Simple vs. Topologic features:
  • Simple - a.k.a. “spaghetti model” - no inherent connectivity relationships
  • Topologic - simple features with defined spatial relationships

[Figure: the same line work shown both ways. Spaghetti: 4 linear features. Topologic: 14 linear features (lines) and 13 nodes.]

(22)

GIS Data Models

Spaghetti Data Model

– No details of logical relationships between objects

• The line shared by two adjacent polygons is stored separately (twice) in the computer

• Spatial relationships are only implied

– Efficient for cartographic display but not data storage

– At first, GIS used vector data and cartographic spaghetti structures
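The double storage of shared boundaries is easy to see in a small sketch. The two squares and the helper `edges` are hypothetical; each polygon is stored as an independent ring, spaghetti style, and the only way to find the shared line is to compare coordinates.

```python
from collections import Counter

# Two adjacent unit squares stored as independent rings.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

def edges(ring):
    """Yield each ring edge with its endpoints sorted, so direction is ignored."""
    for a, b in zip(ring, ring[1:] + ring[:1]):
        yield tuple(sorted((a, b)))

# Count how many times each edge appears across all polygons.
edge_counts = Counter(e for ring in polygons.values() for e in edges(ring))
duplicated = [e for e, n in edge_counts.items() if n > 1]
```

Here `duplicated` contains the single edge from (1, 0) to (1, 1): the shared boundary is stored once in A and again in B, and the adjacency is only implied by matching coordinates.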

(23)

GIS Data Models

Topology

– Connectivity: chains are connected at which nodes?
– Direction: defined by a “from-node” and a “to-node” of a chain

Example analysis: Modeling flow through the connecting lines in a network
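Connectivity and direction can be sketched as a table of chains, each with a from-node and a to-node. The chain and node IDs are hypothetical; following the direction of each chain gives a simple flow trace.

```python
from collections import defaultdict

# Hypothetical chains: id -> (from-node, to-node)
chains = {
    "c1": ("n1", "n2"),
    "c2": ("n2", "n3"),
    "c3": ("n2", "n4"),
    "c4": ("n4", "n5"),
}

# Connectivity: which chains leave each node?
outgoing = defaultdict(list)
for chain_id, (from_node, to_node) in chains.items():
    outgoing[from_node].append((chain_id, to_node))

def downstream_nodes(start):
    """Follow chain direction from `start` and return every reachable node."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, to_node in outgoing[node]:
            if to_node not in seen:
                seen.add(to_node)
                stack.append(to_node)
    return seen
```

For example, `downstream_nodes("n2")` returns n3, n4 and n5, while `downstream_nodes("n5")` returns an empty set because no chain leaves that node.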

(24)

GIS Data Models

Topology

– Adjacency: which polygons are on the left and which are on the right side of a chain?

Example analysis: Identifying adjacent features; Combining adjacent polygons with similar

(25)

GIS Data Models

Topology

– Inclusion: simple spatial objects (node, chain, smaller polygon) are within a polygon

Example analysis: Overlaying geographic

(26)

GIS Data Models

Network

– Type of topologic vector data model (see pgs 218-219 in book)
– Models flow of goods and services (e.g. routes of roads, rivers, utility lines)
  • Radial - flow in one direction (e.g. upstream, downstream)
  • Looped - intersections allowed, choices for flow allowed

“Network Analyst” extension in ArcGIS contains tools for this type

(27)

GIS Data Models

Regions

– Type of topologic vector data model
– Groups of polygons in coverages
– “Multi-part” polygons

(28)

GIS Data Models

Routes

– Composite line features
  • Created from sections (whole or partial arc)
  • Contain “M” values (measures along route)
  • Ex.: All the arc segments in ALL_ROADS that make up Interstate 90, treated as one feature in MAJOR_ROUTES

(29)

GIS Data Models

Linear Referencing System (LRS)

– Uses a relative position along an already existing linear feature, without explicit x,y coordinates. Location is given as a position, or measure, along it (distance, or percent along).
  • Have “base layer” of lines, plus a series of related “event tables”
    – Address, Speed Limit, Route Number tables, etc…
  • Highways/city streets (MassDOT), railroads, rivers, and pipelines, water and sewer networks
  • Dynamic segmentation / “flat file”
– See pages 219-221 in

(30)

GIS Data Models

Linear Referencing System (LRS)

[Figure: 1 “base” arc (ID = 1) with measures from 0 to 100; the speed limit changes from 55 mph to 45 mph at 30 mi., and the number of lanes changes from 3 lanes to 2 lanes. The same road stored as a “flat file” needs 3 arcs (ID = 1, 2, 3).]

Base arcs feature class attribute table: ID

Flat file arcs

ID  SPEEDLIMIT  NUMLANES
1   55          3
…

LRS Tables

SPEEDLIMIT Table

ID  F_MEAS  T_MEAS  SPEEDLIMIT
1   0       30      55
1   30      100     45

NUMLANES Table

ID  F_MEAS  T_MEAS  NUMLANES
1   0       60      3
1   60      100     2
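Dynamic segmentation can be sketched as an overlay of event tables on one base arc. The rows below use the F_MEAS and T_MEAS values from the SPEEDLIMIT and NUMLANES tables on this slide; the function names `value_at` and `overlay` are illustrative only, not an ArcGIS API.

```python
# Event tables: (route ID, F_MEAS, T_MEAS, value)
speedlimit = [(1, 0, 30, 55), (1, 30, 100, 45)]
numlanes = [(1, 0, 60, 3), (1, 60, 100, 2)]

def value_at(events, route_id, measure):
    """Look up the event value covering `measure` on `route_id`."""
    for rid, f_meas, t_meas, value in events:
        if rid == route_id and f_meas <= measure < t_meas:
            return value
    return None

def overlay(route_id, *tables):
    """Split the route at every event boundary and attach each table's value."""
    breaks = sorted(
        {m for table in tables for rid, f, t, _ in table if rid == route_id for m in (f, t)}
    )
    return [
        (f, t) + tuple(value_at(table, route_id, f) for table in tables)
        for f, t in zip(breaks, breaks[1:])
    ]

segments = overlay(1, speedlimit, numlanes)
```

The overlay yields three segments (0 to 30, 30 to 60, 60 to 100), which matches the 3 “flat file” arcs: the base geometry is stored once, and the segmented view is derived from the tables when needed.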

(31)

GIS Data Models

TIN (Triangular Irregular Network)

– Topologic data model for surfaces (e.g. elevation) made up of connected triangles (faces)

– Triangle nodes have X,Y,Z values

– Triangles may be sized differently, based on original data density
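Since each triangle node carries X, Y, Z values, the elevation at any point inside a face can be estimated by linear interpolation across the triangle. This sketch uses barycentric weights; the sample face is hypothetical, and it illustrates the idea rather than how ArcGIS implements it.

```python
def interpolate_z(triangle, x, y):
    """Linearly interpolate Z at (x, y) inside a face given as three (X, Y, Z) nodes."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights: how strongly each node influences the point.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        raise ValueError("point lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical face: three nodes with X, Y, Z values.
face = [(0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0)]
z = interpolate_z(face, 2.0, 3.0)
```

For this face the surface is the plane Z = 100 + X + 2Y, so the interpolated value at (2, 3) is 108 up to floating-point rounding.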

(32)

GIS Data Models

TIN

As viewed in ArcScene

(33)

GIS Data Models

Terrain Dataset

– A multiresolution, TIN-based surface built from measurements stored as features in a geodatabase. Terrains are typically made from LiDAR, sonar, and photogrammetric sources. They reside in the geodatabase, inside feature datasets with the features used to construct them.

(34)

GIS Data Models

Annotation

– text labels (vector features)
– fixed position, size, orientation
  • anno does NOT reposition as you pan and zoom
– N/A for shapefiles (only in GDB and coverages)

(35)

GIS Data Models

Object-Relational

– Everything stored in database tables
  • attributes, geometry in RDBMS
– Defined relationships between objects
– Can store topology
– Can design with CASE (Computer-Aided Software Engineering) tools (like MS Visio) to produce UML (Unified Modeling Language) diagrams (see pages 221-226 in textbook)
– Download models from esri.com for various industries

(36)

GIS Data Models

Object-Relational UML Diagram

[Figure: an example of a CASE tool (Microsoft Visio); the UML model]

(37)

GIS Data Models

Object-Relational Diagram

A water-facility object model

(38)

Definition

Format - The pattern into which data (coordinates, attributes, indexes, spatial reference, etc.) is systematically arranged for use on a computer. A file format is the specific design of how information is organized in the file. (All GIS data is a file on disk at the most basic level.)

– For example, ArcInfo has specific, proprietary formats used to store coverages. DLG, DEM, and TIGER are geographic datasets with different file formats. ESRI has also developed Shapefiles and Geodatabases.

(39)

GIS Data Formats

Common raster formats:

– GeoTIFF, TIFF, BIL, BIP
– MrSID (.SID), JPG, JPEG 2000
– GRID, DEM
– ERDAS IMAGINE (.IMG)
– Intergraph - CIT, COT
– ER Mapper
– ADRG
– NITF - National Imagery Transmission Format
– Geodatabase “raster datasets”

(40)

GIS Data Formats

Raster - file components:

– Image file (.tif, .sid, ...)
– Header (“world”) file (.tfw, .sdw, …):

  1.000000000000000         Cell size (x-scale)
  0.000000000000000         Rotation term
  0.000000000000000         Rotation term
  -1.000000000000000        Cell size (y-scale, negative)
  237000.500000000000000    x-coordinate of center of upper left pixel
  897999.500000000000000    y-coordinate of center of upper left pixel

– Auxiliary file (.aux) - stores spatial reference
– Reduced raster resolution (.rrd or .ovr) – stores pyramid levels
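A world file is just six numbers, so reading one takes only a few lines. The sketch below parses the six values shown on this slide and converts a pixel position to map coordinates using the standard world file equations; the function names are illustrative.

```python
WORLD_FILE_TEXT = """\
1.000000000000000
0.000000000000000
0.000000000000000
-1.000000000000000
237000.500000000000000
897999.500000000000000
"""

def parse_world_file(text):
    """Return the six coefficients in file order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f

def pixel_to_map(col, row, coeffs):
    """Map coordinate of the center of pixel (col, row)."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c   # A: x-scale, B: rotation, C: x of upper-left center
    y = d * col + e * row + f   # D: rotation, E: negative y-scale, F: y of upper-left center
    return x, y

coeffs = parse_world_file(WORLD_FILE_TEXT)
```

With these values, `pixel_to_map(0, 0, coeffs)` returns (237000.5, 897999.5), and moving 10 columns right and 20 rows down gives (237010.5, 897979.5). The negative y-scale is what makes rows count downward while map y increases upward.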

(41)

GIS Data Formats

Common vector formats:

– Shapefile, Coverage, Geodatabase “feature classes”
– DXF, DWG - CAD-based
– MapInfo - MIF
– DLG
– TIGER, VPF
– ASCII, DBF
– SDTS - Spatial Data Transfer Standard
– SDC - Smart Data Compression

(42)

Definitions

A feature is a point, line, or polygon in a dataset that represents a real-world object.

A feature class is a collection of features, categorized by the type of geometry used to define the feature (e.g., how the coordinates are stored, as a point, line, or polygon)

– “polygon feature class”, “arc feature class”, “point feature class”, etc.
– Should represent similar objects

(43)

Common ArcGIS Formats

– Coverage: file-based data model; vector
– Shapefile: file-based data model; vector
– Geodatabase (“geographic database”): DBMS-based data model (aka Object data model); vector & raster
  • Personal, File
  • Spatial Database Engine (SDE)

(44)

GIS Data Formats - Shapefile

Developed by ESRI (ArcView 2)
Stored on disk in folders
Consists of a set of files

– .shp – spatial geometry (always present)
– .shx – spatial geometry index (always present)
– .dbf – dBASE file (feature attributes) (always present)
– optional others (.prj, .sbn, .sbx, .ain, .aih, .aig, …)
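Because a shapefile is a set of files, a quick check that the three required components sit side by side can catch incomplete copies. This is a small sketch using only the Python standard library; `missing_components` is a hypothetical helper name, and the check is a plain file-name test that does not read the files.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")  # the components that are always present

def missing_components(shp_path):
    """Return the required shapefile extensions that are absent next to `shp_path`."""
    base = Path(shp_path).with_suffix("")
    return [ext for ext in REQUIRED if not base.with_suffix(ext).exists()]
```

For example, `missing_components("data/WELLS_PT.shp")` returns an empty list when WELLS_PT.shp, WELLS_PT.shx and WELLS_PT.dbf all exist in the same folder (the path is only an example).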

(45)

GIS Data Formats - Shapefile

Simpler than coverages - useful for mapmaking and some kinds of analysis.
Fast display (especially when local)
Single feature class (geometry) per shapefile

– Point (points and multipoints) or
– Line (simple lines and multipart polylines) or
– Polygon (simple and multipart)

No topology or annotation
10-character max. field names (dbf limitation)
May be edited in ArcGIS and ArcView GIS 2.x+
Open format (specs available); may be

(46)

GIS Data Formats - Coverage

Developed by ESRI, c.1981
Traditional (Arc/Info) format for complex geoprocessing, high-quality geographic data, and sophisticated spatial analysis.
Stores features and attributes for thematically associated data
Can explicitly store topology (features stored only once) - use BUILD or CLEAN commands (vs. “spaghetti data model”)

(47)

GIS Data Formats - Coverage

Stored on disk as a directory (folder) of files, with more files in associated ‘info’ directory
Attributes in INFO format (tables)
Coverage folder stored in a “workspace” - a special name for a folder with a coverage (or Grid or TIN)

[Figure: a workspace folder containing coverages]

(48)

GIS Data Formats - Coverage

Multiple feature classes can be grouped and stored in one coverage

– Primary (label point, arc, polygon, node)
– Secondary (tics, links, annotation)
– Compound (routes/sections, regions; built from primary features) – like “multi-part features”

Edit in ArcInfo Workstation only
Polygons can’t have “holes” (because of the “universal polygon”, i.e. the background)
You cannot have points and polygons in the

(49)

GIS Data Formats - Coverage

[Figure: coverage attribute tables: point attribute table, arc attribute table, route attribute table, polygon attribute table, node attribute table]

(50)

GIS Data Formats - Coverage

Explicit topology

– Connectivity (arc-node topology) - arcs connect to each other at nodes

(51)

GIS Data Formats - Coverage

Explicit topology

– Area Definition (polygon-arc topology) - arcs that connect to surround an area define a polygon

(52)

GIS Data Formats - Coverage

Explicit topology

– Contiguity (adjacency) - arcs have direction and left and right sides

(53)

GIS Data Formats - Coverage

Coverage attribute tables have “Sacred Items”

– Point/Polygon: AREA, PERIMETER, <COVER>#, <COVER>-ID
– Arc: <COVER>#, <COVER>-ID, FNODE#, TNODE#, LPOLY#, RPOLY#, LENGTH

Topology between feature classes managed with sacred items

– Ex.: <cover># in .PAT (polygon attribute table) relates to LPOLY# and RPOLY# in .AAT (arc attribute table)
– <cover># = 1 in polygon coverages’ “universal polygon” (hidden in ArcGIS Desktop)
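The way LPOLY# and RPOLY# carry adjacency can be sketched with a tiny, made-up arc attribute table. The rows are hypothetical and describe two polygons (2 and 3) that share one arc; polygon 1 is the universal polygon.

```python
# Hypothetical .AAT rows: (arc #, FNODE#, TNODE#, LPOLY#, RPOLY#)
aat = [
    (1, 1, 2, 1, 2),   # outer boundary of polygon 2, universal polygon on the left
    (2, 2, 1, 1, 3),   # outer boundary of polygon 3, universal polygon on the left
    (3, 1, 2, 2, 3),   # arc shared by polygons 2 and 3, stored only once
]
UNIVERSAL = 1  # <cover># = 1 is the universal (background) polygon

def adjacent_polygons(aat):
    """Return pairs of polygons that lie on opposite sides of the same arc."""
    pairs = set()
    for _, _, _, lpoly, rpoly in aat:
        if UNIVERSAL not in (lpoly, rpoly) and lpoly != rpoly:
            pairs.add(tuple(sorted((lpoly, rpoly))))
    return pairs

neighbors = adjacent_polygons(aat)
```

Here `neighbors` is the single pair (2, 3), found from the table alone with no coordinate comparison, which is the practical benefit of storing topology explicitly.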

(54)

Data Format Conversion

Workflow may dictate that data need to be in another format
In ArcMap, right-click the layer in the Table of Contents, choose Data > Export Data, and select the format

(55)

Data Format Conversion

(56)

Data Format Conversion

Use ArcToolbox Conversion Tools
ArcInfo license and installation of ArcInfo Workstation required for Coverage conversion tools

(57)

Distribution

Process of moving data from one location to another
Copy/paste in ArcCatalog if source and destination are both accessible, otherwise:

– Coverage – export to “Arc/Info Export File” (a.k.a. “interchange file”) in ArcToolbox
  • ASCII file with .e00 extension
  • User then “Imports” file with ArcToolbox (ArcInfo)
– Shapefile – send all components or use WinZip, PKZIP, StuffIt, etc., to send all in one file
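Bundling a shapefile's components into one archive can also be scripted. This sketch uses Python's standard `zipfile` module in place of the tools named above; `zip_shapefile` is a hypothetical helper, and it assumes every component shares the base name of the .shp file.

```python
import zipfile
from pathlib import Path

def zip_shapefile(shp_path, zip_path):
    """Bundle every file sharing the shapefile's base name into one zip archive."""
    shp = Path(shp_path)
    # Collect the components first (.shp, .shx, .dbf, .prj, ...), skipping old archives.
    parts = sorted(
        p for p in shp.parent.glob(shp.stem + ".*")
        if p.is_file() and p.suffix.lower() != ".zip"
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            archive.write(part, arcname=part.name)  # store without folder paths
    return [p.name for p in parts]
```

The function returns the names it packed, so the sender can confirm that .shp, .shx and .dbf all made it into the archive.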
