Conventional seismic-well tie workflows depend on the availability of sonic and density logs at every well location. In addition, the seismic-well tie typically involves a subjective and labor-intensive workflow that depends on the interpreter’s experience and intuition, making the process non-repeatable and challenging to validate in the presence of multiple wells. Using data matching techniques, such as LSIM and predictive painting, I provide an approach for integrating available well log data and seismic data. As demonstrated by computational examples in this thesis, the proposed approach allows us to predict missing well log data and accurately compute seismic-well ties. It also provides a method to validate the consistency of multiple well ties. Furthermore, the seismic-well tie using LSIM is iterative, with the velocity smoothly updated after each iteration based on the shifts estimated from the LSIM scan. This iterative approach ensures that unrealistic velocity updates are not introduced into the seismic-well tie. The proposed approach is convenient for blind well tests and for predicting well log properties away from well locations. The final velocity and density models from the Teapot Dome dataset include well log data that had previously been excluded due to the lack of overlapping sonic and density logs in each well.
We carried out time-lapse analysis in a producing Niger Delta X-field, first by investigating the response and sensitivity of rock properties/attributes to lithology and pore fill in the 3-D crossplot domain and by Gassmann’s fluid substitution modeling. Furthermore, 4-D seismic data were inverted into acoustic impedance volumes through a model-based inversion scheme. This served as input into a multi-attribute neural network algorithm for the extraction of rock attribute volumes based on the results of the petrophysical log analysis. Subsequently, horizon slices of rock properties/attributes were extracted from the inverted seismic data and analyzed. In this way, we mapped hydrocarbon-depleted wells in the field and identified probable by-passed hydrocarbon zones. Thus, the integration of well and time-lapse (4-D) seismic data in reservoir studies has remarkably improved information on the reservoir's economic potential and enhanced the hydrocarbon recovery factor.
data-intensive distributed applications, namely enterprise information integration, collaborating web services, ontology-based agent communication, web catalogue integration, and schema-based P2P database systems. A plethora of algorithms and techniques has been researched in schema matching and integration for data interoperability, and numerous surveys have summarized this research in the past. The need to extend the previous surveys arises from the rapid growth and dynamic nature of these data-intensive applications. Indeed, evolving large-scale distributed information systems are further pushing schema matching research to utilize processing power not available in the past, directly increasing the industry's investment in the matching domain. This article reviews the latest application domains in which schema matching is being utilized. The paper gives detailed insight into the desiderata for schema matching and integration in large-scale scenarios. Another panorama covered by this survey is the shift from manual to automatic schema matching. Finally, the paper presents the state of the art in large-scale schema matching, classifying the tools and prototypes according to their input, output, execution strategies, and algorithms.
It is most often straightforward to reach level 1 if robust methods are used and the preconditions of coherence are met. This level actually measures the matching noise, which can depend both on the amount of sampling and non-sampling error in the source data sets and on the effectiveness of the chosen matching method. The second and third levels can be checked through simulation studies, the use of auxiliary information, or more complex techniques that properly reflect the uncertainty of the estimates. Current studies on uncertainty analysis and multiple imputation techniques focus on the sensitivity of parameter estimates (e.g., the correlation coefficient) to different prior assumptions. The fourth level will usually not be attained unless the common variables determine the variables to be imputed through an exact functional relationship. In any case, since the true values of the variables are unknown, only simulation studies will allow an assessment of whether this condition is satisfied.
Hydrocarbon reservoir beds have been delineated using direct hydrocarbon indicators on seismic sections as well as well log data in the X field, Onshore Niger Delta. The research methodology involved horizon interpretation to produce a subsurface structure map. Geophysical well log signatures were employed to identify hydrocarbon-bearing sands. The well-to-seismic tie revealed that the reservoir tied directly with a hydrocarbon indicator (bright spot) on the seismic sections. The major structure responsible for the hydrocarbon entrapment is an anticline, whose crest occurs at 3450 metres on the depth structure map.
The case of non-match means that information on structure existence comes from only one of the three data sets. Most of these cells are observed in northwestern Bulgaria, due to the presence of gravity lineaments reflecting the block borders of the Moesian platform and its coupling with the West Balkan. The lack of earthquakes and observed faults confirms the present tectonic stability of those structures. The case southwest of the Burgas region is similar: owing to the number of effusive and intrusive magmatic bodies embedded in the upper crustal section, many corresponding gravity lineaments are delineated, while there is no information about earthquakes and active faults.
Acknowledgements. This work is dedicated to the memory of Andrés Pérez-Estaún, brilliant scientist, colleague, and friend. The authors sincerely thank Ian Ferguson and an anonymous reviewer for their useful comments on the manuscript. Xènia Ogaya is currently supported in the Dublin Institute for Advanced Studies by a Science Foundation Ireland grant IRECCSEM (SFI grant 12/IP/1313). Juan Alcalde is funded by NERC grant NE/M007251/1, on interpretational uncertainty. Juanjo Ledo, Pilar Queralt and Alex Marcuello thank the Ministerio de Economía y Competitividad and EU Feder Funds through grant CGL2014-54118-C2-1-R. Funding for this project has been partially provided by the Spanish Ministry of Industry, Tourism and Trade, through the CIUDEN-CSIC-Inst. Jaume Almera agreement (ALM-09-027: Characterization, Development and Validation of Seismic Techniques applied to CO2 Geological Storage Sites), the CIUDEN-Fundació Bosch i Gimpera agreement (ALM-09-009: Development and Adaptation of Electromagnetic Techniques: Characterisation of Storage Sites) and the project PIERCO2 (Progress In Electromagnetic Research for CO2 geological reservoirs, CGL2009-07604). The CIUDEN project is co-financed by the European Union through the Technological Development Plant of
We first examine the pre-match effort generally needed by the individual prototypes. Depending on the reuse capabilities, manual effort is necessary for the specification of synonyms (COMA/COMA++, CUPID, LSD) and domain constraints (LSD, GLUE, IMAP). Furthermore, the machine-learning-based prototypes (LSD, GLUE, IMAP) depend on the effort to train the individual learners. These efforts are not needed by other systems that do not utilize auxiliary information or learning techniques (SF, CLIO, SEMINT, PROMPT). While a default configuration is desirable, the flexibility to customize the match operation is crucial to deal with heterogeneous domains and match problems. Composite prototypes typically allow the selection of different matchers (COMA/COMA++, LSD, GLUE) and a strategy for their combination (COMA/COMA++), while hybrid approaches are more limited in this respect and only allow relevant weights and thresholds to be set (CUPID, CLIO). For SF, different filters can be chosen for match candidate selection. With a pre-specified configuration, match prototypes typically execute their matchers in a single pass. COMA++ also supports an iterative execution of matchers for successive refinement. On the other hand, several prototypes, such as SF, DIKE, QOM, OLA, and PROMPTDIFF, apply fix-point computation to automatically iterate matcher execution.
The investigation of wells/boreholes for specific parameters of interest, using various instruments and techniques depending on the well/borehole environment, is known as geophysical well logging or borehole geophysics. Subsurface geologic investigation with wireline geophysical well logs has progressed over the years and has become a standard of operation in petroleum exploration. With the integration of exploration results from gravity, magnetic, and seismic geophysical prospecting methods, favorable geological conditions for hydrocarbon accumulation may be identified. Exploratory wells are then drilled into the prospective structure to evaluate the prospect. This is called formation evaluation: the process of using information obtained from the borehole to determine the physical and chemical properties of subsurface rocks and their fluid content along the borehole (Figure 1.1). It involves the analysis and interpretation of well log data, drill-stem tests, cores, drill cuttings, etc. Petrophysics is a term used to express the physical and chemical properties of rocks related to pore and fluid distributions, particularly as they pertain to the detection and evaluation of hydrocarbon-bearing layers (Archie, 1950). Petrophysics pertains to the science of measuring rock properties and establishing the relationships between them; it is related to petrology much as geophysics is related to geology. Petrophysics is an important tool in hydrocarbon exploration, and its use in hydrocarbon prospecting involves well drilling and formation evaluation. The measurements are displayed as a set of continuous curves called a log, from which hydrocarbon reservoirs can be identified and reservoir parameters such as porosity, water saturation, hydrocarbon saturation, and reservoir thickness can be estimated. These parameters help in the estimation of hydrocarbon in place.
The paper shows that geodata integration is an adjustment problem. The application of adjustment techniques leads to a significant improvement in geometrical accuracy with comparably small effort. Only via adjustment techniques does it become possible to integrate measurement types which already exist and are of high economic value. Adjustment models need information about identical objects. It is shown that identities can be generated automatically by efficient matching algorithms, and that sophisticated matching algorithms can be efficiently based on the theory of mathematical statistics. However, data integration can also be seen as an ongoing process in which distance-dependent correlations play an important role.
We study the well log and seismic responses of intensively fractured portions of deep intrusive/metamorphic rocks in southern Tuscany (Italy), which constitute the main drilling targets of the geothermal exploration in the Larderello–Travale area. In particular, the target we consider is located near the contact between a deep Pliocene granitic intrusion and the overlying Palaeozoic metamorphic basement. Sonic, density and borehole image logs are analysed together with post-stack reflection attributes and reflection amplitude versus source-to-receiver azimuth (AVAZ) responses. It turns out that the intense fracturing in the contact zone causes significant decreases in the density and P-wave velocity, and that fracture planes exhibit very high dips and a common preferential direction. The fractured zone found by the well coincides with peculiar alignments of high-amplitude signals in the 3-D seismic stack volume, which are particularly visible on the reflection strength and instantaneous phase time slices. The normal incidence synthetic seismogram based on the log data matches the observed stack trace nearest to the well and confirms that the high-amplitude reflection occurs at the fractured zone. We then consider the pre-stack domain to study the same reflections on bin gathers that are close to the well and coincident with the anomalies in the 3-D volume. In particular, we perform AVAZ analysis to detect possible anisotropic features in the reflected amplitudes due to the preferential orientation of the fractures, and we study the effect of crack density on the seismic responses and on velocity and density values. To this end, we build simplified models where a level with vertical fractures is encased in tight isotropic rocks. Notwithstanding the suboptimal quality of the seismic data, we estimate the overall matching between the borehole information and the seismic response as fair.
In particular, the azimuthal amplitude variation of the reflections from the studied fractured zone has a sinusoidal trend that is quite consistent with the fracture planes’ orientation as indicated by the image logs. Moreover, the comparison between the actual AVAZ response and the AVAZ responses of synthetic seismograms generated on models with different crack densities suggests that it may be feasible to estimate crack density values from the azimuthal amplitude variation of the observed reflections, within the resolution of the seismic data.
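For illustration, a sinusoidal azimuthal trend of this kind can be recovered by linear least squares. The sketch below uses entirely synthetic amplitudes (the numbers are invented, not the Larderello–Travale data) and the common HTI-style parameterisation A(φ) = a + b·cos 2(φ − φ0), linearised via cos 2φ and sin 2φ terms:

```python
import numpy as np

# Synthetic AVAZ response: amplitude varies sinusoidally with source-receiver
# azimuth phi as A(phi) = a + b*cos(2*(phi - phi0)) for a vertically fractured
# (HTI-like) medium. All values here are illustrative, not field data.
rng = np.random.default_rng(0)
phi = np.deg2rad(np.arange(0, 180, 15))          # acquisition azimuths
phi0_true = np.deg2rad(60.0)                     # fracture-normal azimuth
amp = 1.0 + 0.3 * np.cos(2 * (phi - phi0_true)) + 0.02 * rng.standard_normal(phi.size)

# Linearise: A = a + p*cos(2*phi) + q*sin(2*phi), then solve by least squares.
G = np.column_stack([np.ones_like(phi), np.cos(2 * phi), np.sin(2 * phi)])
a, p, q = np.linalg.lstsq(G, amp, rcond=None)[0]

phi0_est = 0.5 * np.arctan2(q, p)                # recovered symmetry azimuth
b_est = np.hypot(p, q)                           # anisotropic amplitude term
print(f"estimated azimuth: {np.rad2deg(phi0_est):.1f} deg, gradient: {b_est:.3f}")
```

The fitted gradient b is the quantity that, in the study above, is compared against synthetic responses for different crack densities.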
The materials used for this study include 3-D seismic data and a suite of wireline logs consisting of sonic, density, gamma ray, resistivity, and porosity logs. The workflow adopted for this work is shown in Figure 2. Acoustic impedance provides a better understanding of the reservoir due to its relationship with various petrophysical parameters such as porosity, lithology, and fluid content. Prior to the seismic inversion, crossplot analysis was carried out in the well domain to establish the relationship between acoustic impedance and porosity, water saturation (Sw), and gamma ray reading. Model-based inversion was carried out by integrating seismic and well
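As background to the impedance relationship used in such workflows: acoustic impedance is the product of bulk density and P-wave velocity, with velocity obtained from the sonic slowness log. A minimal sketch with made-up log samples (not data from this study):

```python
import numpy as np

# Acoustic impedance from sonic and density logs (synthetic illustrative values).
# Sonic DT is slowness in us/ft; velocity Vp = 1e6 / DT (ft/s), converted to m/s.
dt_us_ft = np.array([90.0, 85.0, 110.0, 70.0])       # sonic log samples
rhob = np.array([2.30, 2.35, 2.10, 2.55])            # bulk density, g/cc

vp_ft_s = 1e6 / dt_us_ft
vp_m_s = vp_ft_s * 0.3048
ai = rhob * 1000.0 * vp_m_s                          # impedance, kg/(m^2 s)

for z, (v, imp) in enumerate(zip(vp_m_s, ai)):
    print(f"sample {z}: Vp = {v:7.1f} m/s, AI = {imp:.3e}")
```

Crossplotting such an impedance curve against porosity or gamma ray values is what establishes the well-domain relationships mentioned above.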
Data mining, the extraction of hidden predictive information from large databases, is a powerful technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Cloud computing denotes the trend toward Internet services that depend on clouds of servers to handle tasks. Data mining in cloud computing is the process of extracting structured information from unstructured or semi-structured web data sources. It allows organizations to centralize the management of software and data storage, with the assurance of efficient, reliable, and secure services for their users. This research proposal introduces the concept of data mining association rules for a cloud environment.
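Association rules are characterised by their support and confidence. A minimal, self-contained sketch on a toy transaction set (the items and transactions are invented purely for illustration):

```python
from itertools import combinations

# Minimal association-rule computation over a toy transaction set:
# support(X) = fraction of transactions containing X;
# confidence(X -> Y) = support(X u Y) / support(X).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
n = len(transactions)

def support(itemset):
    return sum(itemset <= t for t in transactions) / n

items = sorted(set().union(*transactions))
for a, b in combinations(items, 2):
    for lhs, rhs in (({a}, {b}), ({b}, {a})):
        s = support(lhs | rhs)
        conf = s / support(lhs)
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={conf:.2f}")
```

In a cloud setting, the support counts would be computed in parallel over partitions of the transaction data; the rule definitions themselves are unchanged.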
Generally, healthcare organizations across the world store healthcare data in electronic format. Healthcare data contain all the information regarding patients as well as the parties involved in the healthcare industry, and the volume of such data is increasing very rapidly. This continuous growth makes electronic healthcare data increasingly complex, and with traditional methods it is very difficult to extract meaningful information from it. Thanks to advances in statistics, mathematics, and other disciplines, it is now possible to extract meaningful patterns from such data. Data mining is beneficial in situations where large collections of healthcare data are available.
According to , web mining is the integration of information extracted from web data taken from documents, server logs, web content, hyperlinks, and usage logs of web sites. Web usage mining is the application of data mining techniques to discover usage patterns from web data, and it is one of the mechanisms used to personalize web pages. It is applied at various levels, such as the server, client, and proxy levels, and draws on data generated as users interact with the browser, the HTTP protocol, and status codes. In the present situation, the number of online users grows day by day and the volume of data per user increases rapidly. There are various kinds of forms on the web; when a user clicks the submit button, all these activities are saved on the server side. These activities are recorded and maintained by the server in what is known as a web log or log file. These files are rich in information: they contain entries such as the IP address of the system making the request, hit or miss, the server location and name of the requested file, the HTTP status code, and the file size.
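The log-file fields listed above can be extracted mechanically before any mining step. A minimal sketch for parsing one combined-log-format entry (the log line and regular expression are illustrative examples, not taken from the cited work):

```python
import re

# Parse one Apache/NGINX combined-log-format entry into the fields mentioned
# above: client IP, requested file, HTTP status code, and response size.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

line = '192.168.0.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
m = LOG_RE.match(line)
entry = m.groupdict()
print(entry["ip"], entry["path"], entry["status"], entry["size"])
```

Usage-mining pipelines typically run such a parser over every line of the log file, then group the resulting records into user sessions.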
In this thesis we have presented a novel model for latent factor regression and variance batch effect adjustment, and have shown how to jointly adjust the data, reduce dimension, and obtain sparse covariance estimates. We outlined three different prior configurations for the loadings: flat, Normal-spike-and-slab (Normal-SS), and a new type of non-local priors (NLPs), i.e. Normal-spike-MOM-slab (MOM-SS). We discussed Laplace-tailed extensions, but deeper analyses remain as future work. To our knowledge, this is the first time NLPs have been implemented in the factor analysis context. We gave deterministic optimisations for our model and provided novel EM algorithms to obtain closed-form posterior modes. We showed that the use of sparse models increases the quality of parameter estimation, even in the absence of batches. MOM-SS priors proved to be appealing, improving the estimation of factor cardinality and encouraging parsimony and selective shrinkage.
In practice, PAM is embedded in statistical analysis systems such as SAS, R, and S-PLUS to deal with applications involving large datasets, e.g., CLARA (Clustering Large Applications). By applying PAM to multiple sampled subsets of a dataset, CLARA can handle larger data sets than PAM. However, the efficiency of CLARA depends on the sample size, and a locally optimal clustering of the samples may not be the global optimum of the whole data set. Ng and Han abstract the medoid search in PAM and CLARA as the search for k subgraphs in a graph of n points and, based on this understanding, propose a PAM-like clustering algorithm called CLARANS (Clustering Large Applications based upon Randomized Search). While PAM searches the whole graph and CLARA searches some random sub-graphs, CLARANS randomly samples a set and selects k medoids by hill-climbing among neighboring sub-graphs: it selects the neighboring objects of the current medoids as candidates for new medoids, and it samples subsets multiple times to verify the medoids and avoid bad samples. Obviously, this repeated sampling for medoid verification is time consuming, which prevents CLARANS from clustering very large datasets in an acceptable time.
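The swap-based search shared by PAM and its descendants can be sketched compactly. The following is a minimal PAM-style k-medoids implementation on 1-D points (illustrative only; real PAM evaluates swap costs more efficiently, CLARA runs PAM on samples, and CLARANS examines only a randomized subset of neighboring medoid sets, as described above):

```python
import random

# Minimal PAM-style k-medoids on 1-D points: repeatedly swap a medoid with a
# non-medoid point whenever the swap lowers the total distance cost.
def total_cost(points, medoids):
    return sum(min(abs(p - m) for m in medoids) for p in points)

def k_medoids(points, k, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(points, k)
    improved = True
    while improved:
        improved = False
        for m in list(medoids):
            for p in points:
                if p in medoids:
                    continue
                candidate = [p if x == m else x for x in medoids]
                if total_cost(points, candidate) < total_cost(points, medoids):
                    medoids = candidate
                    improved = True
    return sorted(medoids)

points = [1.0, 1.2, 0.8, 10.0, 10.5, 9.8]
print(k_medoids(points, 2))
```

Each swap evaluation costs O(n), and the full swap neighborhood has k(n − k) members; this is precisely the cost that CLARA and CLARANS reduce by sampling.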
To synchronize updates, data must migrate from site to site on demand and avoid the use of local copies which could become out of date. ScaleOut GeoServer implements data migration and read/write access by transparently incorporating it into the IMDG’s existing distributed locking mechanism, which has been extended to span multiple sites. The IMDG automatically migrates ownership of data from a remote site when it is locked for reading by the application. This ensures that updates are always performed locally and at exactly one site at a time. The application does not have to manually restage data across sites nor provide its own mechanism for global data synchronization.
In a broad range of application areas, data is being collected at unprecedented scale. Decisions that previously were based on guesswork, or on painstakingly constructed models of reality, can now be made based on the data itself. Such Big Data analysis now drives nearly every aspect of our modern society, including mobile services, retail, manufacturing, financial services, life sciences, and physical sciences. Scientific research has been revolutionized by Big Data. The Sloan Digital Sky Survey has today become a central resource for astronomers the world over. The field of astronomy is being transformed from one where taking pictures of the sky was a large part of an astronomer’s job to one where the pictures are all in a database already and the astronomer’s task is to find interesting objects and phenomena in the database. In the biological sciences, there is now a well-established tradition of depositing scientific data into a public repository, and also of creating public databases for use by other scientists. In
Sandeep Kumar Mohapatra, Anamika Upadhyay and Channabasava Gola introduced a prediction model for rainfall in Bangalore city using a linear regression technique. The objective of this work is to predict the rainfall of Bangalore city on the basis of dependent attributes using data mining techniques. One hundred years of rainfall data provided by the meteorological department, ranging from 1901 to 2002, were analyzed using the linear regression data mining technique, and on the basis of these data a prediction model was created. The performance of this model was further improved with an ensemble technique using K-fold validation. Two sampling techniques are used in this work: a fixed 80-20 sampling technique, in which 80% of the data is used as the training set and 20% for validating the model, and the K-fold validation sampling technique. In this linear regression model, seven data parameters were used: rainfall, maximum temperature, precipitation wet/dry frequency, mean temperature, relative humidity, total cloud amount, and wind speed.
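The two validation schemes described above can be sketched on synthetic data. The features, coefficients, and sample size below are invented for illustration; this is not the Bangalore rainfall dataset:

```python
import numpy as np

# Sketch of an 80/20 hold-out split and 5-fold cross-validation for an
# ordinary least-squares regression model (all data synthetic).
rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))                       # e.g. temperature, humidity, wind
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

def fit_predict(X_tr, y_tr, X_te):
    A = np.column_stack([np.ones(len(X_tr)), X_tr])   # add intercept column
    coef = np.linalg.lstsq(A, y_tr, rcond=None)[0]
    return np.column_stack([np.ones(len(X_te)), X_te]) @ coef

# 80/20 fixed hold-out split
split = int(0.8 * n)
mse_holdout = np.mean((fit_predict(X[:split], y[:split], X[split:]) - y[split:]) ** 2)

# 5-fold cross-validation: each fold is held out once
folds = np.array_split(np.arange(n), 5)
mses = []
for f in folds:
    mask = np.ones(n, bool)
    mask[f] = False
    mses.append(np.mean((fit_predict(X[mask], y[mask], X[f]) - y[f]) ** 2))
print(f"hold-out MSE: {mse_holdout:.4f}, 5-fold mean MSE: {np.mean(mses):.4f}")
```

The K-fold estimate averages the test error over all folds, which is why it is typically a more stable measure of model quality than a single fixed split.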