• No results found

Using SAS ACCESS to retrieve and store data in relational database management systems

N/A
N/A
Protected

Academic year: 2021

Share "Using SAS ACCESS to retrieve and store data in relational database management systems"

Copied!
21
0
0

Loading.... (view fulltext now)

Full text

(1)

Using SAS ACCESS to retrieve and store

data in relational database management

systems

Daniel Ricard

Department of Biology Dalhousie University

(2)

Background Examples

Disclaimer

SAS is the only proprietary software that I use I use SAS to fit models, and for nothing else I use SAS for Unix running on a variety of Unices

My computing toolkit

Linux OS (Ubuntu and CentOS) PostgreSQL with postGIS unixODBC

The R software environment for statistics and graphics The Generic Mapping Tools (GMT)

The usual suspects (Emacs, Perl, shell scripts)

(3)

Background Examples

What I do

Fisheries research

Models of fish population dynamics to determine past, present and future abundance and biomass.

The catch rate is used to estimate abundance and biomass.

Counting fish in the ocean is like counting trees in the forest, except that you can’t see them and they move.

(4)

Background Examples

What I do

Fisheries research

Models of fish population dynamics to determine past, present and future abundance and biomass.

The catch rate is used to estimate abundance and biomass.

Counting fish in the ocean is like counting trees in the forest, except that you can’t see them and they move.

(5)

Background Examples

What I do

Fisheries research

Models of fish population dynamics to determine past, present and future abundance and biomass.

The catch rate is used to estimate abundance and biomass.

Counting fish in the ocean is like counting trees in the forest, except that you can’t see them and they move.

(6)
(7)
(8)

Background Examples

Relational Database Management System (RDBMS)

What?

Structured Query Language (SQL) Create tables and load data Access data through SQL queries Oracle, DB2, MS Access, PostgreSQL, mySQL, ...

Why?

Indexing of data and optimisation of queries to perform retrievals

Provides a “neutral” layer that can be used by most analytical software through database bindings

OpenGIS ”Simple Features Specification for SQL” - to define geospatial data and perform queries with spatial operators

(9)

Background Examples

Relational Database Management System (RDBMS)

What?

Structured Query Language (SQL) Create tables and load data Access data through SQL queries Oracle, DB2, MS Access, PostgreSQL, mySQL, ...

Why?

Indexing of data and optimisation of queries to perform retrievals

Provides a “neutral” layer that can be used by most analytical software through database bindings

OpenGIS ”Simple Features Specification for SQL” - to define geospatial data and perform queries with spatial operators

(10)

Background Examples

SAS ACCESS

Database connectivity to Oracle, DB2 and ODBC PROC SQL with pass-through facility

PostGIS runs on PostgreSQL

No support for PostgreSQL or mySQL

However, Open Source RDBMS all come with ODBC drivers

SAS/ACCESS(R) 9.1.3 for Relational Databases: Reference, Second Edition

(11)

Background Examples

ODBC

Open Database Connectivity (ODBC) provides a standard software API method for using database

management systems

(12)

Background Examples

Getting data from a RDBMS into SAS

Assuming that you have an ODBC Data Source Name (DSN) called “DSNentry”, the following code will allow you to define the RDBMS as a SAS libref.

libname myRDBMSlib odbc USER=’username’

DATASRC= ’DSNentry’ SCHEMA=’public’;

(13)

Background Examples

Getting data from a RDBMS into SAS (cont.)

Assuming that you have an ODBC DSN called “DSNentry”, the following code using PROC SQL pass-through facility will allow you to get data directly from the RDBMS.

proc sql;

connect to odbc (user=username datasrc=DSNentry); create table mydata as

select * from connection to odbc (

SQL QUERY GOES HERE );

disconnect from odbc; quit;

(14)

Background Examples

Moving data out of SAS and into a RDBMS

libname gfsDB odbc USER=’postgres’ DATASRC= ’gfsDB’ SCHEMA=’public’; PROC GENMOD data=mydata;

ods output ParameterEstimates = myestimates; ...

RUN;

PROC SQL;

CREATE TABLE gfsDB.isoest AS

SELECT * FROM myestimates; QUIT;

(15)

Background Examples

Pitfalls or why I had to call the SAS helpline

PROC SQL;

CREATE TABLE gfsDB.isoest(dbtype= (Label=’varchar(30)’ Estimate=’numeric(12,5)’ StdErr=’numeric(12,5)’ LowerCL=’numeric(12,5)’ UpperCL=’numeric(12,5)’ ChiSq=’numeric(12,5)’ ProbChiSq=’numeric(12,5)’) ) AS

SELECT * FROM myestimates; QUIT;

(16)

Background Examples

Doing the same in R

Using the RODBC package, the same can be accomplished in R.

# R code

require(RODBC)

chan <- odbcConnect(dsn="gfsDB", uid = "postgres", case=’postgresql’,believeNRows=FALSE) qu <- paste("SQL QUERY GOES HERE ")

mydata <- sqlQuery(chan, qu, errors= TRUE)

(17)

Summary

RDBMS are unmatched for storing and retrieving data, spreadsheets can only go so far

SAS, like all other statistical software, provides tools to access a variety of RDBMS and to invoke SQL queries to obtain data Once a query is developed, it can be used from a variety of software, hence facilitating collaborations with other scientists RDBMS can be configured to be accessible on the Internet, hence a powerful tool for sharing data

(18)

Summary

RDBMS are unmatched for storing and retrieving data, spreadsheets can only go so far

SAS, like all other statistical software, provides tools to access a variety of RDBMS and to invoke SQL queries to obtain data Once a query is developed, it can be used from a variety of software, hence facilitating collaborations with other scientists RDBMS can be configured to be accessible on the Internet, hence a powerful tool for sharing data

(19)

Summary

RDBMS are unmatched for storing and retrieving data, spreadsheets can only go so far

SAS, like all other statistical software, provides tools to access a variety of RDBMS and to invoke SQL queries to obtain data Once a query is developed, it can be used from a variety of software, hence facilitating collaborations with other scientists RDBMS can be configured to be accessible on the Internet, hence a powerful tool for sharing data

(20)

Summary

RDBMS are unmatched for storing and retrieving data, spreadsheets can only go so far

SAS, like all other statistical software, provides tools to access a variety of RDBMS and to invoke SQL queries to obtain data Once a query is developed, it can be used from a variety of software, hence facilitating collaborations with other scientists RDBMS can be configured to be accessible on the Internet, hence a powerful tool for sharing data

(21)

References

Related documents

This paper deals with the use of Geographical Information Systems (GIS), Relational Database Management Systems (RDBMS) and Document Management Systems (DMS) in the

We have identified query, the possible transformation of the result and delivery of the resulting data as key services that a Grid data service should support. These three

From the summary of the tools for data migration between various types of RDBMS we can conclude that tools enable migration of database tables and their data. The

Using an approach called query franchising , Big Data SQL provides unified query across Oracle database, Hadoop, and NoSQL datastores in a fashion which maximizes performance

of machine learning algorithms, Weka [9], which can only be used on data sets that can fit into main memory, and extended it to be able to use a database as backend.. In the

The Hadoop system has provided the features that the end users did not find in the RDBMS, like the high capability of processing large amounts of semi-structured and unstructured

SAS software was used to read the entered data files, check for potential errors, apply corrections to data sets, and combine batches into analytic data sets.. The data

SQL Server 2005 provides the following XQuery-based capabilities on the XML data type: query transformation, atomic value extraction, existence check, node-to-row mapping