Research on Aggregation Service-oriented Spatial Subdivision Data
Storage Scheduling Model
1Du Gen-yuan, 2Qiu Ying-yu, 3Miao Fang
1International School of Education, Xuchang University, China, [email protected]
2College of Computer Science and Technology, Xuchang University, China
3Key Lab of Earth Exploration and Information Techniques of Education Ministry of China, China
Abstract
As spatial data applications grow in depth and breadth, spatial data faces problems of access speed and application efficiency in organization, management, storage and update. Global subdivision organization theory is a spatial data organization method based on global grid partition; it provides a seamless, open, hierarchical spatial data management framework and has unique advantages in expressing and managing spatial information. To address these problems, and based on global subdivision organization theory combined with the client-oriented aggregation service of the G/S mode, this paper proposes a spatial subdivision data storage scheduling service model. It presents the model architecture and data access process and designs the address coding structure and resolution process, forming an effective mechanism for organizing and managing spatial subdivision data that supports on-demand integration and fast scheduling with distributed data storage and client-side information aggregation. Finally, a prototype system was implemented and verified; the work has theoretical significance and application value.
Keywords: Spatial data, Global subdivision, G/S mode, Subdivision facet, Storage scheduling
1. Introduction
Geographic information is essentially integrative: every object and every element carries spatial information, and spatial location is the only notable identifier linking pieces of information. Owing to its high real-time performance, wide coverage, and rich and objective information, remote sensing data is already widely used in aerospace, military reconnaissance, disaster prediction, dynamic monitoring, resource exploration, land planning and utilization, and many other military and civilian fields. With the development of sensors, remote sensing platforms, data communications and related technologies, the amount of spatial data acquired through remote sensing is expanding rapidly, resulting in the phenomenon that "the spatial data production and transmission capacity is far greater than the spatial data analysis capability". At the same time, application areas place ever higher demands on the timeliness, accuracy and reliability of remote sensing images, so processing speed has become the bottleneck for the rapid application of remote sensing images. This puts higher requirements on the storage, organization and services of spatial data based on remote sensing images. Currently, research focuses and problems lie mainly in the following aspects: data organization and storage, storage resource allocation and scheduling, the combination of earth data and earth models, fast data update, data
2. Relevant theories research
2.1. Earth subdivision organization theory
Global spatial data organization refers to the orderly arrangement of global spatial data according to certain rules, including the spatial index structure and, built on it, the systems for segmentation, cataloguing, storage, and the corresponding coding, expression and scheduling. Traditional map-based methods for expressing, organizing, managing and publishing spatial information cannot meet the demands of global spatial data management [2]. At the same time, the traditional framing storage mode of vector data is not conducive to the unified expression, management and application of global spatial data. Therefore, establishing a new, global, multi-scale fused spatial index mechanism and an open, seamless, hierarchical spatial data management framework, and expressing and organizing all kinds of spatial data on this framework, have become urgent problems in practical applications [3].
Earth subdivision organization theory uses a facet grid partition method to divide the surface of the earth into seamless, hierarchical grid cells. Each cell has a globally unique code, which establishes a multilevel index system for global spatial information and addresses problems in the organization and management of spatial information. Spherical surface subdivision is a multi-level, multi-scale data organization method based on global grid division [4-7]. It directly determines the storage and index methods of discrete grid data and affects data scheduling efficiency. Spherical grid models can be categorized into those based on the geographic coordinate system and those based on spherical polyhedra. Typical subdivision models are the QTM (Quaternary Triangular Mesh) model, based on spherical triangulation of the regular octahedron; the EARPIH (sphere triangle quad tree model based on EARP Icosahedron’s projection) model [8-10], based on spherical triangulation of the icosahedron; and the SIMG (Spatial Information Multi-Grid) model [11-13], based on a latitude and longitude difference mesh. Subdivision models generally use a quadtree structure and mesh coding to organize subdivided facets, thereby relating the subdivisions at different levels.
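To make the idea of quadtree mesh coding concrete, the following is a minimal sketch, assuming a simple recursive split of the latitude/longitude rectangle into four quadrants per level. It is not the QTM, EARPIH or SIMG scheme; the function name `quadtree_code` and the digit convention are illustrative assumptions.

```python
def quadtree_code(lat: float, lon: float, level: int) -> str:
    """Return a base-4 cell code with one digit per subdivision level.

    Assumption: the whole lat/lon rectangle is the level-0 cell, and each
    level splits the current cell into four equal quadrants.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    if level < 0:
        raise ValueError("level must be non-negative")

    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(level):
        mid_lat = (south + north) / 2.0
        mid_lon = (west + east) / 2.0
        digit = 0
        # Value 2 marks the northern half, value 1 marks the eastern half.
        if lat >= mid_lat:
            digit |= 2
            south = mid_lat
        else:
            north = mid_lat
        if lon >= mid_lon:
            digit |= 1
            west = mid_lon
        else:
            east = mid_lon
        digits.append(str(digit))
    return "".join(digits)


# The code of a coarser level is a prefix of the code of a finer level.
coarse = quadtree_code(34.02, 113.85, 3)
fine = quadtree_code(34.02, 113.85, 6)
print(coarse, fine, fine.startswith(coarse))
```

Because each level appends one digit, the number of digits gives the level, and cells that share a prefix lie inside the same coarser cell.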
2.2. Spatial data clustering technology
A web service is a software architecture that is accessed over the Internet and used by other applications. A spatial data web service provides spatial data and geographic functions through the network. Users access spatial data and functions through the network and integrate them into their own systems and applications without developing additional GIS tools or data [14-15]. Shao and Li [16] proposed a service-oriented spatial information sharing framework, analyzed its theoretical model and technological characteristics, and implemented a prototype platform; Li De-ren [17] proposed a new spatial information service model based on Digital Measurable Images (DMI).
Figure 1. The Basic Framework of G/S mode
3. Subdivision data storage scheduling protocol prototype system
3.1. Protocol prototype system
The spatial subdivision data storage scheduling protocol prototype system (Geospatial Information Protocols, GeoIP) serves the storage of the spatial information subdivision organization system. It provides addressing, routing and transmission control for spatial subdivision slice data between multiple physical storage systems on high-bandwidth links and the earth spatial information partition organization system. The model describes the overall storage, scheduling and management rules of the spatial information subdivision surface data service; it is the protocol system used for identification, location and access between physical storage and logical applications; and it lays a foundation for flexible storage, scheduling and distribution of spatial data under unified cataloguing.
Following spatial information subdivision organization theory, and referring to the way the TCP/IP protocols divide hierarchies, define functions and determine the basic architecture, the protocol cluster is divided into five layers: the subdivision data application layer, the subdivision data access service layer, the subdivision data logical organization layer, the subdivision data representation layer and the subdivision data storage layer. The storage layer and representation layer embody storage scheduling, the logical organization layer embodies data scheduling, and the access service layer embodies service scheduling [18]. In the network, every node has the same layers; each layer contains the necessary protocols; each layer is transparent to the other layers; the same layer has the same function on different nodes; adjacent layers of the same node communicate through an interface; each layer uses the services provided by the lower layers and serves the upper layers; and peer layers of different nodes communicate according to the protocols.
3.2. Prototype system architecture
Figure 2 shows the spatial partition data network service system architecture, which is divided into five layers: the subdivision data application layer, the subdivision data access service layer, the subdivision data logical organization layer, the subdivision data representation layer and the subdivision data physical storage layer.
The application layer builds on the lower-layer protocols to solve practical problems in each domain, namely how to extend increasingly rich spatial information applications and how to handle the details of specific application programs.
The logical organization layer addresses the large data volume and relatively centralized processing requirements of subdivision data, studies the logical organization mechanism suited to the characteristics of the partition storage cluster, and emphasizes subdivision data storage and rapid response for hot data. Its major functions and protocols include the rapid response mechanism for hot data, massive data management, the parallel loading protocol Geo-DPP (Geo-Digital Parallel Processing), dynamic indexing mechanism Geo-ARP
Figure 2. The Spatial Partition Data Network Service Architecture
The physical storage layer solves the specific physical storage organization problem of subdivision data. Based on the partition data coding mode and representation method, and according to the features of the specific file system and physical storage device, it stores the subdivision data and attribute data in the corresponding storage units and objects and establishes the physical storage system of subdivision data. It mainly involves the physical access protocol, physical storage protocol, transmission control protocol and resource scheduling protocol.
3.3. Spatial data access flow under protocol support
During spatial data access, data is processed from the top layer down as follows. First, the geographic coordinate information is transformed into a unified subdivision surface code through the regional analysis protocol; a single patch may be parsed into a set of finer mesh subdivision surfaces. Second, the corresponding storage node is indexed and queried according to the subdivision surface code. Third, attribute addressing is performed according to the attribute information (spatial location, resolution, scale), retrieving the partition data with the specified attributes for the mesh patch set. Finally, the data is located through the routing and transmission protocols of the data storage layer.
The data flow from bottom to top is the reverse process. Peer layers of different nodes communicate transparently according to the protocols; the two sides communicate only between peer layers and cannot communicate between non-peer layers.
Figure 3. Spatial Data Access Flow under Protocol Support
4. Spatial data storage scheduling model research
4.1. Scheduling model
The spatial subdivision data storage scheduling service model describes the mapping between the storage service unit groups of earth spatial partition data and the distributed spatial data storage. After geographic spatial data enters the subdivision system, it is first preprocessed. The system then determines the level of the data in the subdivision model and partitions the spatial data: it determines the set of mesh patches covered by the data from the latitude and longitude coordinates of the upper-left and lower-right corners of the preprocessed data and from the mesh patch size, codes each subdivision surface according to the surface patch address code of the subdivision organization theory, and organizes the data by subdivision code. The result is spatial data organized by subdivision surface, as shown in Figure 4.
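As a rough illustration of the patch-set step described above (not a reproduction of Figure 4), the following sketch lists the grid cells that a data extent touches at a given level. It assumes a plain equal-angle latitude/longitude grid with 2^level rows and columns, rows counted from the north; the function name `patches_for_extent` and this grid layout are assumptions, not the paper's subdivision model.

```python
def patches_for_extent(ul_lat, ul_lon, lr_lat, lr_lon, level):
    """Return (row, col) indices of the grid cells touched by an extent.

    The extent is given by its upper-left (ul) and lower-right (lr)
    corners in degrees. Assumption: level k has 2**k rows and columns.
    """
    if ul_lat < lr_lat or ul_lon > lr_lon:
        raise ValueError("upper-left corner must be north-west of lower-right")
    n = 2 ** level
    cell_h = 180.0 / n   # patch height in degrees of latitude
    cell_w = 360.0 / n   # patch width in degrees of longitude

    def row_of(lat):
        # Clamp so that lat == -90 falls into the last row.
        return min(int((90.0 - lat) / cell_h), n - 1)

    def col_of(lon):
        # Clamp so that lon == 180 falls into the last column.
        return min(int((lon + 180.0) / cell_w), n - 1)

    rows = range(row_of(ul_lat), row_of(lr_lat) + 1)
    cols = range(col_of(ul_lon), col_of(lr_lon) + 1)
    return [(r, c) for r in rows for c in cols]


# An extent straddling a cell boundary at level 2 touches four patches.
print(patches_for_extent(50.0, -10.0, 40.0, 10.0, 2))
# -> [(0, 1), (0, 2), (1, 1), (1, 2)]
```

In this sketch a corner that lies exactly on a cell boundary is assigned to the cell to its south or east, so such an extent may include one extra row or column of patches.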
The global spatial information subdivision coding model is based on the global spatial information subdivision model; it codes global spatial information by combining the address code of the subdivision surface with data attribute information. The subdivision level determines how information passes between the codes of subdivision surfaces: the address code of the next level can be derived from the subdivision surface code of the upper level, and the address code of a next-level subdivision surface contains the address code of its upper-level surface. A subdivision surface code consists of an address code and an attribute code.
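The relationship between levels can be sketched as follows, assuming the address code is a string of base-4 digits (one per level) and the attribute code is an opaque string. The names `SurfaceCode`, `child_addresses`, `parent_address` and `is_ancestor`, as well as the `address-attribute` text layout, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def child_addresses(address: str) -> list:
    """Derive the four next-level address codes from an upper-level code."""
    return [address + digit for digit in "0123"]


def parent_address(address: str) -> str:
    """Return the upper-level address code by dropping the last digit."""
    if not address:
        raise ValueError("the top-level cell has no parent")
    return address[:-1]


def is_ancestor(upper: str, lower: str) -> bool:
    """True if `lower` contains `upper` as a proper prefix."""
    return len(upper) < len(lower) and lower.startswith(upper)


@dataclass(frozen=True)
class SurfaceCode:
    """A subdivision surface code: address code plus attribute code."""
    address: str     # base-4 digits, one per subdivision level (assumed)
    attribute: str   # opaque attribute code, e.g. a data type tag (assumed)

    def encode(self) -> str:
        # Hypothetical text layout; the paper does not specify one.
        return f"{self.address}-{self.attribute}"


print(child_addresses("03"))           # ['030', '031', '032', '033']
print(is_ancestor("03", "0312"))       # True
print(SurfaceCode("0312", "img").encode())
```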
The spatial subdivision data storage scheduling service model is a protocol system for identification and location between physical storage entities and logical applications, and for accessing earth spatial information partition surface data. Within it, the GeoIP address generation algorithm establishes the mapping from the logical address of a partition surface to a physical memory address based on the partition surface code, and then generates the GeoIP address that identifies the storage location of the partition surface. Finally, the GeoIP address resolution algorithm obtains the host address, and the physical address corresponding to the geographic feature region is obtained from the address mapping table.
4.2. Address code structure
The physical storage entities for spatial information are organized and managed through GeoIP cards. A GeoIP card is a network interface card with data processing capability, a hardware device that combines network and storage medium. A complete geographic subdivision data scheduling protocol address code has m+n bits: it is composed of the subdivision code (m) and the host address code (n), and it identifies the storage location of a subdivision surface. The host address code (n) maps the geographic subdivision data scheduling protocol address one-to-one to the physical address of a GeoIP card through the Address Mapping Table (AMT), as shown in Figure 5 (a).
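The m+n structure can be illustrated with a small sketch, assuming the address is held as an integer with the subdivision code in the high m bits and the host address code in the low n bits, and assuming the AMT is a simple dictionary from host address code to a card's physical address. The function names and the card identifiers below are hypothetical.

```python
def pack_address(subdivision_code: int, host_code: int, m: int, n: int) -> int:
    """Combine an m-bit subdivision code and an n-bit host code."""
    if not 0 <= subdivision_code < (1 << m):
        raise ValueError("subdivision code does not fit in m bits")
    if not 0 <= host_code < (1 << n):
        raise ValueError("host code does not fit in n bits")
    # Assumed layout: subdivision code in the high bits, host code in the low bits.
    return (subdivision_code << n) | host_code


def unpack_address(address: int, m: int, n: int) -> tuple:
    """Split an (m+n)-bit address into (subdivision_code, host_code)."""
    if not 0 <= address < (1 << (m + n)):
        raise ValueError("address does not fit in m+n bits")
    return address >> n, address & ((1 << n) - 1)


def resolve_card(address: int, m: int, n: int, amt: dict) -> str:
    """Look up the card physical address for an address via the AMT."""
    _, host_code = unpack_address(address, m, n)
    return amt[host_code]


# Hypothetical AMT with one entry per host address code (n = 2).
amt = {0: "card-00", 1: "card-01", 2: "card-02", 3: "card-03"}
addr = pack_address(0b1011, 0b01, m=4, n=2)
print(addr, unpack_address(addr, 4, 2), resolve_card(addr, 4, 2, amt))
# -> 45 (11, 1) card-01
```

Keeping the AMT as a separate lookup means a card can be replaced by updating one table entry, without changing any address codes.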
Based on the earth subdivision model, the subdivision code (m) combines the address information of subdivision surfaces with the attribute information of spatial entities. The geographic subdivision data scheduling protocol system parses the subdivision code part of a GeoIP address, obtains the logical address index of the subdivision surfaces to be accessed, and then accesses the corresponding physical storage entity in the GeoIP card file system.
Subdivision surface data is organized through geographic feature areas and the regional storage units they contain. A regional storage unit is located through a single GeoIP card and includes a number of physical storage entities; a geographic feature area consists of several regional storage units and is located according to wide-area geographic features.
[Figure 5 (b) content: a GeoIP address code consists of the subdivision coding (m bits) and the host address code (n bits); address mapping resolves the host address code to one of the regional storage units 1 to 2^n, each with a physical address, subdivision code analysis, and a file system holding data D1, D2, …, Dk.]
(b) Address Coding Structure 2
Figure 5. Address Coding Structure Diagram
In Figure 5, the host address code is divided into two code segments of j and n-j bits, where 0<j<n. The n-j segment is used for addressing geographic feature areas and can distinguish 2^(n-j) geographic feature areas; the j segment is used for locating regional storage units within a geographic feature area, and each geographic feature area can contain 2^j regional storage units.
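The two-segment split can be checked with a short sketch. It assumes the n-j segment occupies the high bits of the host address code and the j segment the low bits; the paper does not state the bit order, and the function names are illustrative.

```python
def split_host_code(host_code: int, n: int, j: int) -> tuple:
    """Split an n-bit host code into (feature_area, storage_unit)."""
    if not 0 < j < n:
        raise ValueError("j must satisfy 0 < j < n")
    if not 0 <= host_code < (1 << n):
        raise ValueError("host code does not fit in n bits")
    feature_area = host_code >> j                # high n-j bits (assumed order)
    storage_unit = host_code & ((1 << j) - 1)    # low j bits (assumed order)
    return feature_area, storage_unit


def capacity(n: int, j: int) -> tuple:
    """Return (number of feature areas, storage units per area)."""
    if not 0 < j < n:
        raise ValueError("j must satisfy 0 < j < n")
    return 1 << (n - j), 1 << j


# With n = 5 and j = 2 there are 2^3 = 8 areas of 2^2 = 4 units each.
print(capacity(5, 2))                    # (8, 4)
print(split_host_code(0b10110, 5, 2))    # (5, 2)
```

The product of the two capacities is always 2^n, so the choice of j only changes how the same set of host codes is grouped into areas.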
4.3. Addressing process
At present, spatial data applications are undergoing a shift from industry use to public use. Industry has introduced a number of virtual digital earth systems for the seamless integration, presentation and analysis of large-scale, even global, multi-scale and multi-type massive spatial data. The project group used the WorldWind Java SDK as the basic development kit, adopted Eclipse RCP as the interface and application container, and independently developed a digital earth platform based on the protocol system. The platform has already provided data support for projects such as digital tourism, digital campus, and virtual/digital moon data sharing services. Figure 7 shows the structure and function framework of the digital earth platform prototype system; Figure 8 (a) and (b) show example applications.
Figure 7. The Digital Earth Platform Prototype System Framework
(a) (b)
Figure 8. Digital Disaster Reduction and Digital Tourism Exhibition
6. Conclusion
At present, spatial data faces problems of organizational efficiency and rapid application, such as slow query and access and slow integration and application. To address these problems, and based on earth subdivision organization theory and the client-oriented aggregation service of the G/S mode, we proposed an earth subdivision data storage scheduling service model for the field of spatial information, described its system structure and data access process, and designed its address coding structure and address resolution process. Together these form an effective mechanism for organizing and managing spatial subdivision data, with on-demand integration and fast scheduling, that follows the principle of “data distributed storage, client node information aggregation”. The above ideas and methods were partly verified by a prototype test. The resulting fast data access and application, easy storage and update, and adaptability to large data volumes have theoretical significance and application value.
7. References
[1] CHENG Chengqi, Li Xuefeng, GUAN Li, “Study on System Architecture of Subdivision Storage Cluster for Global Spatial Data”, Acta Scientiarum Naturalium Universitatis Pekinensis, vol. 47, no. 1, pp.103-108, 2011.
[2] Goodchild M. F., “Discrete global grids for digital earth”, Proceedings of 1st International Conference on Discrete Global Grids, Santa Barbara, California, USA, 2000.
[3] GUAN Li, CHENG Cheng-qi, LV Xue-feng, “Study on the Organization Model for Vector Data Based on Global Subdivision Grid”, Geography and Geo-Information Science, vol. 25, no. 3, pp.23-27, 2009.
[4] SONG Shu-hua, CHENG Cheng-qi, GUAN Li, “Analysis on Global Geodata Partitioning Models”, Geography and Geo-Information Science, vol. 24, no. 4, pp.11-15, 2008.
[5] CHENG Cheng-qi, GUO Hui, “Research on Image Information Representation Based on Partition Data Model”, Bulletin of Surveying and Mapping, no. 10, pp.12-14, 2009.
[6] CHENG Cheng-qi, SONG Shu-hua, WAN Yuan-wei, “Preliminary Studies on Geospatial Information Code Model Based on Global Subdivision Model”, Geography and Geo-Information Science, vol. 25, no. 4, pp.8-11, 2009.
[7] Gang Liu, Pingqian Wang, Zhenwen He, Bingyin Tang, “Self-Adaptive Replacement and Pre-dispatch Algorithm Considering Spatial Relationship Constraint for 3D Spatial Data”, AISS: Advances in Information Sciences and Service Sciences, vol. 4, no. 3, pp.287-295, 2012.
[8] Junbo Xu, Huiqiang Wang, Guangsheng Feng, Lin Sun, “A Temporal and Spatial Data Compression Algorithm Based on Confidence Interval for Wireless Sensor Networks”, AISS: Advances in Information Sciences and Service Sciences, vol. 4, no. 1, pp.54-61, 2012.
[9] YUAN Wen, ZHUANG Da-fang, YUAN Wu, “Some essential questions in remote sensing science and technology”, Journal of Remote Sensing, vol. 13, no. 1, pp.103-111, 2009.
[10] LI De-ren, “On Generalized and Specialized Spatial Information Grid”, Journal of Remote Sensing, vol. 9, no. 5, pp.513-520, 2005.
[11] LI De-ren, SHAO Zhen-feng, ZHU Xin-yan, “Spatial Information Multi-grid and Its Typical Application”, Geomatics and Information Science of Wuhan University, no. 11, pp.945-950, 2004.
[12] Li D. R., “From Digital Map to Spatial Information Multi-Grid”, Proceedings of IEEE International Geoscience and Remote Sensing Symposium, pp.9, 2004.
[13] Fekete G, L Treinish, “Sphere quadtrees: A new data structure to support the visualization of spherically distributed data”, SPIE, Extracting Meaning from Complex Data: Processing, Display, Interaction, no. 1259, pp.242-253, 1990.
[14] Yuan Wen, Cheng Cheng-qi, Ma Ai-nai, Guan Xiao-jing, “L curve for spherical triangle region quadtrees”, Science in China: Series E, vol. 47, no. 3, pp.265-280, 2004.
[15] Bo Jingfang, Yang Zongxi, “The Constructing of Mineral Exploration Data Management System based on Spatial Database”, AISS: Advances in Information Sciences and Service Sciences, vol. 4,