Application of Grid-Enabled Technologies
for Solving Optimization Problems in Data
Driven Reservoir Systems
M. Parashar, H. Klie, U. Catalyurek, T.
Kurc, V. Matossian, J. Saltz, M.F.
UT Austin Univ. Chicago OSU UMD Rutgers
ITR Collaborators
• University of Chicago (CS): Stevens, Papka
• University of Maryland (CS): Sussman
• Ohio State (CS): Saltz, Kurc, Catalyurek
• Rutgers (ECE): Parashar
• MIT (Engineering): Haines
• UT Austin: Wheeler, Dawson, Peszynska, Klie, Bangerth (Computational and Applied Math); Sen, Stoffa, Seifoullaev (UTIG); Torres-Verdin (CPGE)
The Instrumented Oil Field
Detect and track changes in data during production
Invert data for reservoir properties
Detect and track reservoir changes
Assimilate data & reservoir properties into
the evolving reservoir model
Use simulation and optimization to guide future production,
future data acquisition strategy
Assumptions:
Production of oil and gas will take advantage of
permanently installed geophysical sensors and down hole
instrumentation that will monitor the reservoir’s state as
fluids are extracted.
Knowledge of the reservoir’s state during production will
result in better engineering decisions to modify production
techniques that optimize goals while maintaining safe
operating conditions in environmentally complex and
Optimize
• Economic revenue
• Environmental hazard
• …
Based on the present subsurface knowledge and numerical model
Improve numerical model
Flowchart boxes:
• Plan optimal data acquisition
• Acquire remote sensing data
• Improve knowledge of subsurface to reduce uncertainty
• Update knowledge of model
• Management decision
• START
Figure labels: Dynamic Decision System; Dynamic Data-Driven Assimilation; Data assimilation; Subsurface characterization; Experimental design; Autonomic Grid Middleware; Grid Data Management and Processing Middleware; Data Driven Model Optimization
DDDSF Requires Multi-petabyte Virtual Data
Archive
Ohio Supercomputing Center Mass Storage Testbed
Storage architecture (diagram labels):
• MetaData Servers
• Core Storage Pool (35/50 TB) with SAN.FS
• SAN Volume Controller (4 servers)
• FAStT900 (4)
• Backup Storage: 3584 tape library; 640 cartridges @ 200 GB for a total of 128 TB; 4 drives, max drive data rate is 35 MB/s
• Cisco Directors 9509
• Scratch / Archive Storage Pool (310/420 TB): FAStT600 Turbo (20)
• LinTel boxes (PvFS / Active Disk Archive) (20)
• Link throughputs shown: 772 MB/s, 890 MB/s, 384 MB/s, 10 GB/s
• 50 TB of performance storage
– home directories, project storage space, and long-term frequently accessed files.
• 420 TB of performance/capacity storage
– Active Disk Cache: compute jobs that require directly connected storage
– Parallel file systems and scratch space
– Large temporary holding area
• 128 TB tape library
– Backups and long-term "offline" storage
IBM’s Storage Tank technology combined with TFN connections will allow large data sets to be seamlessly moved throughout the state with increased redundancy and seamless delivery.
A new generation of IPARS
Optimizing oil production on
the Grid
Figure labels: Objective function; Steering; Monitoring; Data management/assimilation; Dynamic data; Static data; Visualization; Collaboration; Clients
Optimization with a Known Oil Reservoir Model
• f: Objective function
• α: Control variables in feasibility set A
• c: Model data
Interplay between Data Acquisition, Data
Assimilation and Optimization
• Optimization seeks best production strategy
• Control variables α parameterize production and data acquisition strategy
• Good choice of α optimizes production and improves model certainty
• Model c as stochastic
• E: expectation for PDF of c
• A posteriori PDF computed to describe current subsurface knowledge
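One way to read the stochastic formulation above is as a sample-average problem: draw realizations of c from its PDF, average the objective over them, and pick the control α with the best average. The sketch below is only an illustration under stated assumptions; `toy_objective`, the Gaussian PDF for c, and the small finite feasibility set are hypothetical and do not come from the slides.

```python
import random

def toy_objective(alpha, c):
    # Hypothetical stand-in for f(alpha, c): best when the control matches the model datum.
    return -(alpha - c) ** 2

def expected_objective(alpha, samples):
    # Sample-average estimate of E[f(alpha, c)] over draws of c.
    return sum(toy_objective(alpha, c) for c in samples) / len(samples)

def best_control(feasible_set, samples):
    # Exhaustive search over a small, finite feasibility set A.
    return max(feasible_set, key=lambda a: expected_objective(a, samples))

rng = random.Random(0)
samples = [rng.gauss(2.0, 0.5) for _ in range(2000)]  # assumed PDF of c
feasible_set = [0.0, 1.0, 2.0, 3.0, 4.0]              # assumed feasibility set A
alpha_star = best_control(feasible_set, samples)
```

In a real setting each objective evaluation would be a reservoir simulation, so the number of samples and candidate controls would be far more constrained than in this toy.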
Parallel/Grid Computing Tools
The Multiblock Adaptive Computational Engine (MACE) for solving heterogeneous domain applications
Adaptive grid blocks
Automatic and transparent scheduling, load balancing
Distributed Shared Objects: distributed dynamic arrays
Datacutter/STORM: Middleware for On-Demand Data Product Generation for Large Archival Scientific Datasets in a Grid Environment
Exploration and analysis of scientific datasets in distributed and
heterogeneous environments
Represents components of a data-intensive application as a set of
filters
Data virtualization for heterogeneous collections of data formats,
storage systems
Discover: Grid Computational Collaboratory enabling seamless and secure access to and interactions between users, applications, services, data and resources
P2P Grid Middleware: services, autonomic composition, secure access
Scalability of IPARS and
geomechanical coupling
Domain 76800 by 76800 by 1059 feet
513 by 513 by 45 mesh points
282 nodes of dual-processor Dell PowerEdge 1750 3.06 GHz computers
interconnected by Myrinet 2000 with a point-to-point bandwidth of
2 Gb/sec. Each node has 2 GB of memory.
Plot: parallel efficiency (axis range 0.6 to 1.2) versus number of processors (64, 128, 192, 256)
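The slide does not say how parallel efficiency was computed. An axis that extends above 1 suggests efficiency relative to the smallest run rather than to a single processor, but that is an assumption. A minimal sketch of that relative definition, with hypothetical timings in the usage lines rather than measured IPARS results:

```python
def relative_parallel_efficiency(base_procs, base_time, procs, time):
    # Efficiency of a run on `procs` processors relative to a baseline run.
    # A value of 1.0 means ideal scaling from the baseline; values above 1.0
    # indicate superlinear scaling (for example from better cache use).
    return (base_procs * base_time) / (procs * time)

# Hypothetical wall-clock times, for illustration only.
ideal = relative_parallel_efficiency(64, 100.0, 128, 50.0)
superlinear = relative_parallel_efficiency(64, 100.0, 256, 20.0)
```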
Data Middleware Services
• Filter-stream based distributed execution middleware (DataCutter, STORM)
• Grid based data virtualization, data management, query, on demand data product generation (STORM, Active ProxyG, Mako)
• Distributed metadata management (Mobius Global Model Exchange)
– Track metadata associated with workflows, input image datasets, checkpointed intermediate
Data Middleware Services and Very Large
Scale Distributed Data Applications
Processing Remotely-Sensed Data
NOAA Tiros-N w/ AVHRR sensor
AVHRR Level 1 Data
• As the TIROS-N satellite orbits, the Advanced Very High Resolution Radiometer (AVHRR) sensor scans perpendicular to the satellite’s track.
• At regular intervals along a scan line, measurements are gathered to form an instantaneous field of view (IFOV).
• Scan lines are aggregated into Level 1 data sets.
A single file of Global Area Coverage (GAC) data represents:
• ~one full earth orbit
• ~110 minutes
• ~40 megabytes
• ~15,000 scan lines
One scan line is 409 IFOVs
Satellite Data Processing
• Managing Oilfields, Contaminant Transport
• Digital Pathology
• Derivation of macroscopic materials properties from MD simulations
• DCE-MRI Analysis
DataCutter
9/11/2002 DataCutter 19
Combined Data/Task Parallelism
Diagram: filter copies R0 to R2, Ra0 to Ra2, E0 to EN and M placed on hosts host1 to host5 across Cluster 1, Cluster 2 and Cluster 3
Flow control between components
Schedulers place filters on grid
processors (scheduler API)
Parallel stream based communication
Data aggregation implemented as a
component
Filters placed near data sources
NPACkage, NMI
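The filter-stream idea above (filters connected by streams, with aggregation as its own component) can be sketched with Python generators. This is not the DataCutter API; the filter names, the threshold, and the sample chunks are all hypothetical, and everything runs in one process rather than on grid hosts.

```python
def read_filter(chunks):
    # Source filter: emits raw data chunks onto the stream.
    for chunk in chunks:
        yield chunk

def extract_filter(stream, threshold):
    # Intermediate filter: keeps only values above a threshold, chunk by chunk.
    for chunk in stream:
        yield [v for v in chunk if v > threshold]

def merge_filter(stream):
    # Aggregation filter: combines all partial results into one summary.
    total, count = 0.0, 0
    for chunk in stream:
        total += sum(chunk)
        count += len(chunk)
    return total, count

# Hypothetical data chunks; each chunk flows through the pipeline in turn.
chunks = [[0.2, 0.9], [0.8, 0.1], [0.75]]
total, count = merge_filter(extract_filter(read_filter(chunks), 0.7))
```

Because each filter only sees the stream it is handed, a scheduler could in principle place the reading filter near the data and the aggregation filter elsewhere, which is the placement freedom the slide describes.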
Scientific and engineering applications require interactive exploration and
analysis of datasets.
Applications developers generally prefer storing data in files
Support high level queries on multi-dimensional distributed datasets
Many possible data abstractions, query interfaces
Grid virtualized object relational database or XML database
Grid virtualized objects with user defined methods invoked to access and process data
A virtual relational table view
Large distributed scientific datasets
Data Service
Data Virtualization
Our Approach
• Automatic data virtualization
– Friendly front-end: support a basic SQL Select query with a virtual relational table view or a virtual XML database view
– A lightweight layer on top of datasets
• STORM runtime middleware: STORM carries out query execution, query planning
– Compiler front end customizes runtime support: automatic customization and configuration of runtime query support middleware
Compiler Customization – support for Select query
Query template:
SELECT < Data Elements >
FROM < Dataset Name >
WHERE < Expression > AND Filter( < Data Element > );
Example:
SELECT *
FROM IPARS
WHERE REL in (0,6,26,27) AND TIME>1000 AND TIME<1100 AND SOIL>0.7
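To show what the example Select query returns on a virtual relational table view, the sketch below runs it against a tiny in-memory SQLite table. SQLite stands in for the virtualization layer purely for illustration; it is not STORM. The three-column schema and the sample rows are hypothetical, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

def run_example_query(rows):
    # Build a throwaway table that mimics a virtual relational view of IPARS output.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE IPARS (REL INTEGER, TIME INTEGER, SOIL REAL)")
    conn.executemany("INSERT INTO IPARS VALUES (?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT * FROM IPARS "
        "WHERE REL IN (0,6,26,27) AND TIME>1000 AND TIME<1100 AND SOIL>0.7 "
        "ORDER BY REL, TIME"
    )
    result = cursor.fetchall()
    conn.close()
    return result

# Hypothetical rows: (realization, time step, oil saturation).
sample_rows = [
    (0, 1050, 0.8),    # matches every predicate
    (1, 1050, 0.9),    # realization not in the requested set
    (6, 1000, 0.9),    # TIME is not strictly greater than 1000
    (26, 1099, 0.71),  # matches every predicate
    (27, 1100, 0.9),   # TIME is not strictly less than 1100
    (6, 1050, 0.5),    # oil saturation too low
]
selected = run_example_query(sample_rows)
```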
Analysis of Oil Reservoir Simulation Data
Prototype Implementation
• Evaluate geologic uncertainty and production strategies
simultaneously
– Multiple realizations of multiple geostatistical models
– Multiple production strategies (number, location of wells)
• Dataset Size = ~5TB
– 500 simulations, selected from several Geostatistics models and well patterns
– Each simulation is ~10GB
• 2,000 time steps, 65K grid elements, 8 scalars + 3 vectors = 17 variables
• Stored at
– SDSC: HPSS and 30TB Storage Area Network System
– UMD: 9TB disks on 50 nodes: PIII-650, 128MB, Switched Ethernet
– OSU: 7.2TB disks on 24 nodes: PIII-900, 512MB, Switched Ethernet
• Data Analysis
– Economic model assessment
– Bypassed oil regions
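The dataset sizes quoted above can be sanity-checked with simple arithmetic. The sketch assumes 4-byte single-precision values, reads "65K" as 65,536 grid elements, and counts each vector as three components (which gives the stated 17 variables); the storage format itself is not given on the slide.

```python
TIME_STEPS = 2_000
GRID_ELEMENTS = 65_536   # assumption: "65K" read as 2**16
VARIABLES = 8 + 3 * 3    # 8 scalars + 3 vectors of 3 components = 17
BYTES_PER_VALUE = 4      # assumption: single-precision floats
SIMULATIONS = 500

bytes_per_simulation = TIME_STEPS * GRID_ELEMENTS * VARIABLES * BYTES_PER_VALUE
total_bytes = bytes_per_simulation * SIMULATIONS

gb_per_simulation = bytes_per_simulation / 1e9  # decimal gigabytes
tb_total = total_bytes / 1e12                   # decimal terabytes
```

Under these assumptions each simulation comes to roughly 8.9 GB and the full set to roughly 4.5 TB, in line with the quoted ~10GB and ~5TB.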
Seismic Data Analysis – STORM: On Demand Processing of 1.5 TB Seismic Dataset
Figure labels: Survey #; Line #; Array #; Sp (or CDP) # & source position; Receiver group # & receiver group position; Component #; Traces
Figure labels: Data Archive & Sensors (data archives; sensors; non-traditional data sources); Users (laptop, PDA, computer; scientist); Resources (CPUs, storage, instruments, ...); Applications & Services; Discovery Points; P2P Grid Middleware; DISCOVER Portals
DISCOVER: A Grid Computational Collaboratory enabling seamless and secure access to and interactions between users, applications, services, data and resources
P2P Grid Middleware (PAWN, DISCOVER-COG)
Peer services (discovery, routing, message publication, notification, event), context-aware access control, p2p deductive engines.
Autonomic and Interactive Components (DIOS, AUTOMATE)
Components encapsulate sensors, actuators, policies and rules. Distributed control network connects sensors, actuators and interaction agents.
P2P deductive shell, control network, rules and policies enable autonomic composition, configuration, interaction, protection, optimization and adaptation.
Collaborative Portals
Pervasive (secure) access, monitoring, interaction and control
Autonomic Oil Well Placement (UT-CSM, UT-IG)
• Optimization services:
– VFSA (Very Fast Simulated Annealing)
– SPSA (Simultaneous Perturbation Stochastic Approximation)
• IPARS delivers
– fast-forward model (guess->objective function value)
– post-processing
• Formulate a parameter space
– well position and pressure (y,z,P)
• Formulate an objective function:
Autonomic Oil Reservoir Optimization using
Decentralized Services
Components of the AORO Application
• IPARS: Integrated Parallel Accurate Reservoir Simulator
– Parallel reservoir simulation framework
• IPARS Factory
– Configures instances of IPARS simulations
– Deploys them on resources on the Grid
– Manages their execution
• VFSA/SPSA Optimization Services
– Optimize the placement of wells and the inputs (pressure, temperature) to IPARS simulations
• Economic Modeling Service
– Uses IPARS simulation outputs and current market parameters (oil prices, costs, etc.) to compute estimated revenues for a particular reservoir configuration
• DISCOVER Computational Collaboratory
– Interaction & Collaboration
– Distributed Interactive Object Substrate (DIOS)
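For readers unfamiliar with VFSA, the following is a minimal sketch of the general technique, not of the optimization service used in this work. It uses the usual VFSA ingredients: a temperature that decays as exp(-decay * k^(1/D)), a heavy-tailed step scaled by that temperature, and Metropolis acceptance. The quadratic `toy_cost`, the bounds, and all parameter defaults are hypothetical; in the application the cost would come from an IPARS run plus the economic model (for example, negative revenue).

```python
import math
import random

def vfsa_minimize(cost, lower, upper, x0, t0=1.0, decay=1.0, iters=2000, seed=0):
    # Very Fast Simulated Annealing sketch over a box-bounded parameter space.
    rng = random.Random(seed)
    dim = len(x0)
    x, fx = list(x0), cost(x0)
    best, fbest = list(x), fx
    for k in range(1, iters + 1):
        temp = t0 * math.exp(-decay * k ** (1.0 / dim))  # cooling schedule
        cand = []
        for i in range(dim):
            while True:  # redraw until the candidate coordinate is inside the bounds
                u = rng.random()
                step = math.copysign(1.0, u - 0.5) * temp * (
                    (1.0 + 1.0 / temp) ** abs(2.0 * u - 1.0) - 1.0
                )
                xi = x[i] + step * (upper[i] - lower[i])
                if lower[i] <= xi <= upper[i]:
                    break
            cand.append(xi)
        fc = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse points.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
    return best, fbest

def toy_cost(p):
    # Hypothetical smooth cost with its minimum at well position (4, 7).
    return (p[0] - 4.0) ** 2 + (p[1] - 7.0) ** 2

start = [1.0, 1.0]
best_point, best_cost = vfsa_minimize(toy_cost, [0.0, 0.0], [10.0, 10.0], start)
```

This sketch uses one temperature for both step generation and acceptance, which is a simplification; separate schedules per parameter are a common refinement.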
Autonomic Oil Well Placement (SPSA)
Permeability field showing the positioning of current wells. The symbols “*” and “+” indicate injection and producer wells, respectively.
Search space response surface:
Expected revenue - f(p) for all possible well locations p. White marks indicate optimal well locations found by SPSA for 7 different starting points of the algorithm.
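A minimal sketch of SPSA with several starting points, in the spirit of the multi-start search described above. It is not the implementation used here. The toy loss (negative of a hypothetical revenue surface), the gain parameters, and the seven starting points are illustrative assumptions. Each iteration needs only two loss evaluations regardless of dimension, which is the appeal when every evaluation is a reservoir simulation.

```python
import random

def spsa_minimize(loss, x0, iters=500, a=0.2, c=0.1, big_a=50.0,
                  alpha=0.602, gamma=0.101, seed=0):
    # Simultaneous Perturbation Stochastic Approximation sketch (unconstrained).
    rng = random.Random(seed)
    x = list(x0)
    for k in range(iters):
        a_k = a / (k + 1 + big_a) ** alpha   # step-size gain
        c_k = c / (k + 1) ** gamma           # perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in x]  # random +/-1 directions
        x_plus = [xi + c_k * d for xi, d in zip(x, delta)]
        x_minus = [xi - c_k * d for xi, d in zip(x, delta)]
        diff = loss(x_plus) - loss(x_minus)  # two evaluations per iteration
        # Gradient estimate for coordinate i is diff / (2 * c_k * delta_i).
        x = [xi - a_k * diff / (2.0 * c_k * d) for xi, d in zip(x, delta)]
    return x

def toy_loss(p):
    # Hypothetical negative-revenue surface with its minimum at (4, 7).
    return (p[0] - 4.0) ** 2 + (p[1] - 7.0) ** 2

# Seven hypothetical starting well locations.
starts = [[1.0, 1.0], [9.0, 1.0], [1.0, 9.0], [9.0, 9.0],
          [5.0, 2.0], [2.0, 5.0], [8.0, 6.0]]
finals = [spsa_minimize(toy_loss, s, seed=i) for i, s in enumerate(starts)]
best_final = min(finals, key=toy_loss)
```

On a response surface with several local optima, different starting points can end in different places, so keeping all of the final points (not only the best) is often informative.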
The Future
• Scaling up:
– High resolution IPARS simulations
– Multi-petabyte distributed archives of model data
– Exploitation of OSC and TeraGrid resources (large TeraGrid allocation approved)
– Large scale demonstration of Discover/STORM/DataCutter integration
• Experimental testbeds
– EPA/INEEL collaboration: live sensor data from superfund site
– NSF Center for Subsurface Sensing and Imaging Systems
– Data from industrial affiliates
• New numerical methods
– Next generation accurate, multi-scale coupled chemical, fluid, geomechanical and geophysical simulator