Graphical Probability Model and Heritage Tourism Routine Design


Fengbao Ma

Beijing Institute of Fashion Technology Beijing, China

ysymfb@bift.edu.cn

Sicong Ma

Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences

Changchun, China 84184911@qq.com

Qinyun Liu* Bath Spa University / Collaborative Innovation Centre of eTourism, Beijing Union University

England/China qinyun.liu15@bathspa.ac.uk

Abstract—Tourism and heritage research are increasingly connected with computer science methods and techniques in order to detect potential knowledge combinations and improve operating efficiency. In the heritage development process, designing and providing a valuable and novel routine for visitors is essential. This research proposes a method for heritage tourism routine design that uses a graphical probability model and graph database theory. The graphical probability model is applied to define the weights of vertexes and edges. The graph database is used to connect vertexes with edges and to establish a model of the target heritage area (the potential tourism area). Probability calculation is applied to obtain the weight of each connection. Eventually, the graphical probability model can be used to supply the most valuable routines for visitors to select from when they arrive in an unfamiliar tourism area. Creative computing methods are used to achieve knowledge combination and the integration of interdisciplinary methods, which are the cardinal parts of applying the graphical model in the heritage area.

Keywords—Graphical probability model, graphic database, cultural heritage, creative computing, knowledge combination

I. INTRODUCTION

A heritage tourism routine is important for both heritage protection organisations and tourism organisations. Visitors usually arrive at a destination with two questions. First, which attractions should be visited to make the trip more enjoyable? Second, how should the route for each day of the schedule be designed, that is, in which order should the attractions be visited? Answering these questions requires consideration of multiple parameters and constraints. However, visitors currently formulate the schedule all by themselves, and preferences differ between visitors. An amateurish tourism plan for a heritage place risks harming the heritage place and degrading the tourism experience.

Along with the development of the internet and computer science techniques, interdisciplinary methods can be applied to provide feasible tourism schedules for visitors while protecting the heritage place at the same time. Graphical models and graph databases can be used to carry out the required calculations.

In this research, a graph database is combined with the graphical probability model and applied to the design of tourism routines between different entities, which are the visiting points in heritage places. This paper has five sections: introduction, related work and background, graphical model for tourism routine design, experiment, and conclusion.

II. RELATED WORK AND BACKGROUND

A. Graph database and graphical probability model

A graph database (GDB) can be applied to data storage and data calculation. A GDB is a database that uses a graph structure to process semantic queries [1]. The graph structure uses vertexes, edges, and properties to represent entities, data, information and knowledge. The foundation of the database is graph theory, an important theory in discrete mathematics. Data, information and knowledge are stored as nodes, and edges represent the relationships between nodes [2]. The relationships link the stored data directly, so in most situations querying and searching can be completed in one step. Relationships between data are given priority in a graph database, which is an advantage when establishing a highly connected database [2].

At present, relational databases are more popular than graph databases. A graph database is a kind of non-relational database. The graph model in a graph database can be used to list the relationships attached to different nodes [3], which other kinds of non-relational database rarely support. Graph databases differ in their storage mechanisms: some store data in tables [3], whereas others use a key-value or document store. The graphical probability model includes two kinds of models, the directed graphical model and the undirected graphical model [4].
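The one-step lookup of related data can be illustrated with a minimal in-memory sketch in Python. The class `TinyGraphStore`, the vertex names and the relation labels are hypothetical and stand in for a real graph database engine, which would add persistence, indexing and a query language.

```python
from collections import defaultdict


class TinyGraphStore:
    """Illustrative in-memory property graph; not a real GDB engine."""

    def __init__(self):
        self.properties = {}                 # vertex -> dict of properties
        self.adjacency = defaultdict(dict)   # vertex -> {neighbour: relation}

    def add_vertex(self, name, **props):
        self.properties[name] = props

    def add_edge(self, source, target, relation):
        # Relationships are stored next to the vertex they start from.
        self.adjacency[source][target] = relation

    def neighbours(self, name):
        """One-step lookup of everything directly related to `name`."""
        return dict(self.adjacency[name])


# Hypothetical vertices and relations, chosen only for illustration.
store = TinyGraphStore()
store.add_vertex("gate", kind="POI")
store.add_vertex("palace", kind="POI")
store.add_vertex("wall", kind="POI")
store.add_edge("gate", "palace", "leads_to")
store.add_edge("gate", "wall", "part_of")

print(store.neighbours("gate"))
```

Because the relationships are stored alongside each vertex, finding everything related to a vertex is a single dictionary access rather than a join.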

Starting with the directed graphical model (DGM), it is also known as a Bayesian network (BN) or belief network. $G(V, E)$ is defined as a graph model with a number of random variables (the vertices of the graph):

$$V = \{v_1, v_2, \ldots, v_n\}$$

The edges of the graph can be represented by:

$$E = \{e_1, e_2, \ldots, e_m\}$$

A vertex is a node of the graph model, and an edge is the connection between two nodes. Nodes and edges together form a complete graph model with different probabilities. Take Figure 2.1 as an example. Two nodes can be related through a third node in four modes: cause-result connection, result-cause connection, common-cause connection and common-result connection. As shown in Figure 2.1, there are four relationships between nodes KP1 and KP2, with KP3 as the condition:

Figure 2.1 Four Types of Three Nodes Graph Modes

In Figure 2.1 (1), entity KP1 and entity KP2 form a cause-result connection, which means that entity 1 can be used to infer entity 3 and then entity 2. In Figure 2.1 (2), entity 1 and entity 2 form a result-cause connection, that is, entity 2 can be used to infer entity 1 [5]. These two simple relations can be described by mathematical symbols:

$$KP1 \perp KP2 \mid KP3$$

The formula is true when entity 3 is given.

Similarly, in Figure 2.1 (3) and (4), entity 3 is the common cause and the common result of entity 1 and entity 2 respectively. The common-cause relationship can be depicted by the same mathematical symbols:

$$KP1 \perp KP2 \mid KP3$$

When entity 3 is given, the formula is true; without entity 3, entity 1 and entity 2 are not independent under a common cause. For a common result the situation is reversed: entity 1 and entity 2 are independent on their own, but are in general no longer independent once entity 3 is given. In a predefined graph $G(V, E)$, the probability formula of the nodes (vertexes) can be defined as follows. Posit that the set of variables is $V = [v_1, v_2, \ldots, v_n]$; then

$$P(V) = \prod_{i=1}^{n} P(v_i \mid v_{\pi_i}) \tag{1}$$

In formula (1), $P(V)$ represents the joint probability of all vertexes in the target graph, $v_i$ represents the target vertex in the graph, and $v_{\pi_i}$ represents the set of all parent vertexes of the target vertex $v_i$ [6].

An edge of the graph can be written as $e_{ij}$, which represents the edge between vertex $v_i$ and vertex $v_j$. The probability of the edge is:

$$P(e_{ij}) = P(v_j \mid v_i) \tag{2}$$

The edge connecting two vertexes represents the probability that the child vertex can be reasoned from the parent vertex. The model (G, V) is an applied Bayesian network of a directed graph in the database [7].
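As an illustration of formulas (1) and (2), the following minimal Python sketch evaluates the joint probability of a three-node chain KP1 -> KP3 -> KP2 as a product of conditional probabilities. The binary variables, the table `cpt` and every probability value in it are assumptions made for this example and are not taken from the experiment.

```python
from itertools import product

# Hypothetical chain KP1 -> KP3 -> KP2; each variable is 0 or 1.
parents = {"KP1": [], "KP3": ["KP1"], "KP2": ["KP3"]}

# cpt[node][parent_values] = P(node = 1 | parents = parent_values)
# All numbers are invented for illustration.
cpt = {
    "KP1": {(): 0.6},
    "KP3": {(0,): 0.2, (1,): 0.7},
    "KP2": {(0,): 0.1, (1,): 0.8},
}


def conditional(node, value, assignment):
    """P(node = value | values of its parents), read from the table."""
    key = tuple(assignment[p] for p in parents[node])
    p_one = cpt[node][key]
    return p_one if value == 1 else 1.0 - p_one


def joint(assignment):
    """Product over all vertices of P(v_i | parents of v_i)."""
    result = 1.0
    for node, value in assignment.items():
        result *= conditional(node, value, assignment)
    return result


all_ones = {"KP1": 1, "KP3": 1, "KP2": 1}
print(joint(all_ones))  # 0.6 * 0.7 * 0.8

# The joint probabilities of all eight assignments sum to one.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
print(total)
```

Each factor `conditional(child, 1, ...)` with the parent set to 1 corresponds to the edge probability of formula (2) for that parent-child pair.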

The undirected graphical model is also named a Markov random field (MRF) or Markov network (MN); the critical difference from the directed case is the partition function [8]. For an undirected graph $G(V, E)$ with random variables $V = [v_1, v_2, \ldots, v_n]$, the probability of vertex $v_k$ is:

$$P(v_k) = P(v_k \mid N(k)) \tag{3}$$

$N(k)$ represents the set of all vertexes that are connected with the target vertex $v_k$. (G, V) is a Markov random field. To calculate the probability of each vertex, a term from the Hammersley-Clifford theorem must first be defined: the clique.

The principles used to solve directed graph probability problems cannot be applied, because an undirected graph does not provide a topological ordering of the variables. The decomposition of the joint probability of an undirected graph is instead based on its fully connected subgraphs. Each such subgraph, named a clique, is the processing unit in the probability calculation, as can be seen in Figure 2.2:

Figure 2.2 Undirected Graph Cliques

In this five-vertex undirected graph, four two-vertex subsets are the subgraphs (cliques) of the parent graph. An entity can be connected to others to form a subgraph and can also be connected to entities from other subgraphs. This method is used to generate the multilayer database in this research, which enriches the relationships between cliques in the database. In the human brain, creativity is usually generated through complicated calculations and unexpected connections between different entities. The database structure is inspired by this human thinking process.

Based on the Hammersley-Clifford theorem, the joint probability of the vertexes in the undirected graph is [8]:

$$P(V) = \frac{1}{Z} \prod_{c \in C} \psi_c(V_c) \tag{4}$$

In formula (4), $C$ represents the set of maximal cliques in the undirected graph, and $\psi_c(V_c)$ represents the potential function of clique $c$. $Z$, the partition function, is the most considerable difference compared with the directed graph probability formula. $Z$ can be presented as [8]:

$$Z = \sum_{V} \prod_{c \in C} \psi_c(V_c) \tag{5}$$

The entire graph database system is a mixed graph model that includes both undirected and directed graphs [9]. Based on the theories used to calculate the probabilities of vertexes and edges in directed and undirected graphs, the multilayer graph database can be established with specific probability values.
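The role of the partition function can be illustrated with a small brute-force sketch in Python. It assumes a hypothetical three-vertex chain a - b - c with binary variables and an invented pairwise potential; none of these values come from the database described in this paper.

```python
from itertools import product

# Hypothetical undirected chain a - b - c; its maximal cliques are the two edges.
variables = ["a", "b", "c"]
cliques = [("a", "b"), ("b", "c")]


def potential(x, y):
    """Assumed clique potential: neighbouring vertices prefer equal values."""
    return 2.0 if x == y else 1.0


def unnormalised(assignment):
    """Product of the clique potentials for one full assignment."""
    score = 1.0
    for u, v in cliques:
        score *= potential(assignment[u], assignment[v])
    return score


def partition_function():
    """Z: sum of the unnormalised scores over every possible assignment."""
    return sum(
        unnormalised(dict(zip(variables, values)))
        for values in product([0, 1], repeat=len(variables))
    )


def probability(assignment):
    """Joint probability: unnormalised score divided by Z."""
    return unnormalised(assignment) / partition_function()


print(partition_function())
print(probability({"a": 0, "b": 0, "c": 0}))
```

Enumerating every assignment is only feasible for very small graphs; the sketch is meant to show why Z is needed for normalisation, not how it would be computed at scale.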

B. Creative Computing

The existing creative computing methods include knowledge combination (from two or more disciplines), enhancing human creativity (not just developing computational creativity), encouraging tacit knowledge and its transfer to procedural knowledge, crossing abstract levels, and integrating divergent thinking [12].

Knowledge combination is applied to detect methods of creativity generation, especially the computer-based mixing of knowledge and models from different disciplines [10]. The knowledge of other disciplines is connected with computer science knowledge to form mixed tacit knowledge [10]. Knowledge is then moved from tacit knowledge to procedural knowledge through programming: the theoretical model is transformed into a computer operation model, and an artificial creativity system is realised.

Crossing levels of abstraction is the basis of a new taxonomy of creativity. Because the traditional classification stems from creativity in human society, a new classification method should be established to improve the possibility of machines detecting creativity [10].

The creative guidance method is based on convergent thinking, divergent thinking and Boden's three creative methods. It weaves innovation theory and artificial intelligence algorithms together to realise the combination of multiple disciplines [12].

On the basis of the knowledge combination method, the graphical probability model and the graph database are used to realise heritage routine planning and design. Knowledge from computer science, mathematics, tourism and heritage is combined and applied together.

C. Tourism Routine Design Research

In research on tourism route planning, most early work took the orienteering problem (OP) as the basic problem and solved the tourism route planning problem through different variants [13]. This kind of work focuses on accurately modelling user constraints, the opening times of scenic spots, the travel mode and other factors, and finally obtains one or more accurate path planning results that meet the user constraints [14]. In real life, however, such a model can rarely plan a complete travel routine. On the one hand, tourism is a dynamic process with many uncertain factors; on the other hand, when a tourist attraction covers a large geographical area, it can no longer be modelled as a single point [15]. For a scenic river, for instance, the starting and ending points of the visit may be far apart. With the rapid development of the Internet, user-generated content related to daily life is also increasing rapidly. In the field of tourism, various forms of spatiotemporal trajectory data have been formed [16]. The large amount of travel experiences, photos and other data shared by users constitutes a tourism database, which is of great significance for tourism data analysis and information mining [16]. How to use these data reasonably for tourism route planning has been a hot issue in recent years. The advantage of this kind of work is that it can quickly obtain feasible solutions in line with the actual situation and help users plan their travel; the difficulty lies in the rational use of multi-source data to accurately mine users' historical behaviour trajectories [17].

Based on the problems currently encountered by users in tourism planning, the tourism routine planning problem has emerged. Much research work focuses on solving this kind of problem with heuristic methods.

The personal tourism planning problem is more complex than the travelling salesman problem. Generally speaking, when a tourist is interested in multiple points of interest (POIs), the question is how to choose an appropriate travel route according to the tourist's constraints and level of interest in each POI. Although there is a lot of tourism-related information on the Internet, planning remains a challenging task for tourists who are not familiar with the city, especially with regard to the requirements of each scenic spot, the visiting time, the opening time and the travel distance between scenic spots. The key to tourism route planning is to choose as many POIs as possible that match the tourist's preferences while meeting the tourist's time and cost constraints, so as to maximise satisfaction.
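The selection problem described above can be illustrated with a brute-force Python sketch for a handful of POIs. The POI names, interest scores, visiting times, travel times and the time budget are all invented for illustration, and exhaustive search is used only because the example is tiny; it is not the method proposed in this paper.

```python
from itertools import permutations

# Hypothetical data: interest score and visiting time (hours) per POI,
# and symmetric travel times (hours) between pairs of POIs.
score = {"A": 5, "B": 3, "C": 4}
visit = {"A": 1.0, "B": 0.5, "C": 1.5}
travel = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 1.0,
    frozenset(("B", "C")): 0.5,
}


def route_time(route):
    """Total visiting time plus travel time between consecutive POIs."""
    total = sum(visit[p] for p in route)
    total += sum(travel[frozenset(pair)] for pair in zip(route, route[1:]))
    return total


def best_route(budget):
    """Try every ordered subset of POIs and keep the best feasible one."""
    best, best_score = (), 0
    for size in range(1, len(score) + 1):
        for route in permutations(score, size):
            if route_time(route) <= budget:
                total_score = sum(score[p] for p in route)
                if total_score > best_score:
                    best, best_score = route, total_score
    return best, best_score


print(best_route(3.0))
print(best_route(4.0))
```

With a three-hour budget only two POIs fit, whereas a four-hour budget admits all three when they are visited in an order that keeps the travel time low. The number of candidate routes grows factorially with the number of POIs, which is why heuristic methods are used in practice.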

In addition, as the target tourist area of this study is a heritage site, protection is another factor to be considered.

III. GRAPHICAL MODEL FOR TOURISM ROUTINE DESIGN

A. Establishing Graph Database for Tourism Places

In this study, the tourism destination is taken as the processing unit. Every tourist attraction is stored in the database as a marked point. To express the relationships between different tourist destinations, the marked points are organised in the database with a graph data structure. The database is essential in the whole process, especially in knowledge combination detection.

There are three layers in the graph database: the entity layer, the application layer, and a hidden layer that represents the connections between the entity layer and the application layer.

… undirected graph to depict relationships between tourism entities.

Figure 3.2 Second Layer of Graph database: Application Graph

The second layer of the tourism database, the application graph, is used to detect applications in the tourism area. When the database is applied to tourism and heritage subjects to detect a tourism routine for visitors, modules are used to describe combined cliques of entities. For example, in a heritage tourism area, places are divided into different cliques that are then connected. Different cliques may be related or unrelated. Modules are represented by vertexes, and connections between modules are described by edges. As not all modules have logical relationships, this layer is also a mixture of directed and undirected graphs.

The last layer is the hidden layer, which describes relationships between entities that belong to distinct modules. As can be seen in Figure 3.3, entities in three modules are connected. Modules can be defined as cliques, as each module includes a sub-graph of entities.

To define and quantify each logical reasoning edge, the probabilistic graphical model is used to complete the calculation. Two models describe the probability of each node and edge, namely Bayesian probability and Markov probability. The Bayesian probability formula is used to obtain the probability of nodes and edges in a directed graph in the database layer. Computing the probability of vertices and edges plays an active role in extracting useful entities and generating creativity.

Figure 3.3 Hidden Layer of Graph Database: Entity Graph and Application Graph Connections

B. Model Training and Calculation

Starting from the ratio of entity connection, the relationship between different entities is represented with graph theory from discrete mathematics. Graphs can be divided into directed graphs (digraphs) and undirected graphs. Here, a directed graph is used to represent knowledge reasoning relationships, and an undirected graph is used to represent other relationships between entities. A graph can be presented as $G(V, E, \gamma)$, where $V = \{v_1, v_2, \ldots, v_n\}$ represents the vertexes of the graph, $E = \{e_1, e_2, \ldots, e_m\}$ represents the edges of the graph, and $\gamma$ represents the function that connects set $E$ and set $V$. The degree of a vertex in $V$ is the number of edges related to it. When the vertex is an entity, its degree is the number of connections to other entities.

Based on the definition of degree, the degrees in this entity graph with entities A, B and C are:

$$\deg(A) = \deg(B) = \deg(C) = 2, \qquad \text{total degree} = \sum_{v \in V} \deg(v) = 6$$

Therefore, the ratio of this entity connection is:

$$r(A, B, C) = \sum_{v \in V} \deg(v) = 6 \tag{6}$$

In a directed graph, each vertex has an in-degree ($\deg^{-}$) and an out-degree ($\deg^{+}$):

$$\deg^{+}(A) = 1, \quad \deg^{-}(A) = 1$$

$$\deg^{+}(B) = 2, \quad \deg^{-}(B) = 1$$

$$\deg^{+}(C) = 2, \quad \deg^{-}(C) = 3$$

$$\sum_{v \in V} \deg(v) = 5$$

The out-degrees and the in-degrees each sum to 5. The degree of this directed entity graph is 5, which is the ratio of entities combination.

$$r = \sum_{v_i \in V} \deg(v_i) = \deg(v_1) + \deg(v_2) + \cdots + \deg(v_n) \tag{7}$$
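The degree counts above can be reproduced with a short Python sketch. The triangle edge list matches the three-entity example with total degree 6. The arc list for the in-degree and out-degree case is only one hypothetical arrangement (it includes a self-loop on C) that yields the stated values; the actual arcs of the example graph are an assumption here.

```python
from collections import Counter


def total_degree(edges):
    """Degree of each vertex and their sum for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree.values()), dict(degree)


def in_out_degrees(arcs):
    """Out-degree and in-degree of each vertex for a list of directed arcs."""
    out_degree, in_degree = Counter(), Counter()
    for source, target in arcs:
        out_degree[source] += 1
        in_degree[target] += 1
    return dict(out_degree), dict(in_degree)


# Three entities connected pairwise: every vertex has degree 2.
triangle = [("A", "B"), ("B", "C"), ("A", "C")]
print(total_degree(triangle))

# Hypothetical arcs that reproduce the in-degrees and out-degrees in the text.
arcs = [("A", "C"), ("B", "C"), ("B", "A"), ("C", "B"), ("C", "C")]
out_degree, in_degree = in_out_degrees(arcs)
print(out_degree, in_degree, sum(out_degree.values()))
```

Every arc contributes exactly one out-degree and one in-degree, so both sums equal the number of arcs.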

Every pair of linked entities in an n-vertex simple graph meets the conditions shown in Figure 3.4.

Figure 3.4 Simple Graphs of Entities Combination

Based on these inherent rules, the ratio of entities combination conforms to an arithmetic progression. For a sequence of rank n, the general term of an arithmetic progression is

$$a_n = a_1 + (n - 1)d$$

Therefore, the n-rank ratio of entities combination is:

$$r = 6 + (n - 1) \times 4 = 4n + 2 \quad (n \geq 3)$$

In this formula, n represents the quantity of entities (vertexes in the entity graph). Thus, the general formula expresses the relationship between the ratio of entity combination and the entity quantity $\mathit{kp}$:

$$r = 4\,\mathit{kp} + 2 \quad (\mathit{kp} \geq 3) \tag{8}$$

When the number of entities is 2 (kp = 2), the vertex number is 2, the edge number of the entity graph is 1, and the entity combination ratio is 1. This situation describes an interdisciplinary combination of entities.

When the number of entities is 1 (kp = 1), the number of vertices is 1. If the entity has an autocorrelation feature (a self-loop), the number of edges is 1 and the entity combination ratio is 2; if the entity is independent of itself, the edge number of the entity graph is 0 and the entity combination ratio is 0.

This general formula assumes by default that all entities have the same weight of influence on each other and the same weight of influence on creativity.

When entities differ in their weights of influence on each other and on creativity, the knowledge graph is transformed into a weighted undirected graph that shows the logical reasoning among entities. An undirected graph with these two kinds of weights can be used to calculate the intelligence level of the target combined knowledge. For example, three entities in an undirected graph are connected as shown in Figure 3.5:

Figure 3.5 Weighted Undirected Entity Graph

Weights on entities and weights between entities have different meanings. $w_1$, $w_2$ and $w_3$ express the degree to which entities $kp_1$, $kp_2$ and $kp_3$ influence the creativity of the entire entity graph. $w_{ij}$ represents the degree of influence from entity $kp_i$ to entity $kp_j$, which is the logical reasoning relationship between the entities.

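The entity related ratio of entity $kp_1$ is:

$$I_{kp_1} = [(w_{12} + w_{13})\deg^{+}(kp_1) + (w_{21} + w_{31})\deg^{-}(kp_1)] \cdot w_1 = 2 \cdot w_1 \cdot (w_{12} + w_{13} + w_{21} + w_{31})$$

The entity related ratio of entity $kp_2$ is:

$$I_{kp_2} = [(w_{21} + w_{23})\deg^{+}(kp_2) + (w_{12} + w_{32})\deg^{-}(kp_2)] \cdot w_2 = 2 \cdot w_2 \cdot (w_{21} + w_{23} + w_{12} + w_{32})$$

The entity related ratio of entity $kp_3$ is:

$$I_{kp_3} = [(w_{31} + w_{32})\deg^{+}(kp_3) + (w_{13} + w_{23})\deg^{-}(kp_3)] \cdot w_3 = 2 \cdot w_3 \cdot (w_{31} + w_{32} + w_{13} + w_{23})$$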
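One way to read the three expressions above is sketched below in Python: the weights of the outgoing connections are multiplied by the out-degree, the weights of the incoming connections by the in-degree, and the sum is scaled by the entity's own creativity weight. This reading, the function name `related_ratio`, the dictionaries and all numeric weights are assumptions made for illustration.

```python
def related_ratio(entity, node_weight, edge_weight):
    """Entity related ratio under the reading described in the lead-in."""
    out_weights = [w for (src, _), w in edge_weight.items() if src == entity]
    in_weights = [w for (_, dst), w in edge_weight.items() if dst == entity]
    out_degree, in_degree = len(out_weights), len(in_weights)
    weighted = sum(out_weights) * out_degree + sum(in_weights) * in_degree
    return weighted * node_weight[entity]


# Hypothetical creativity weights of the three entities.
node_weight = {"kp1": 2.0, "kp2": 3.0, "kp3": 1.5}

# Hypothetical influence weights; key (i, j) means "from entity i to entity j".
edge_weight = {
    ("kp1", "kp2"): 1.0, ("kp1", "kp3"): 2.0,
    ("kp2", "kp1"): 3.0, ("kp2", "kp3"): 1.0,
    ("kp3", "kp1"): 4.0, ("kp3", "kp2"): 2.0,
}

ratios = {kp: related_ratio(kp, node_weight, edge_weight) for kp in node_weight}
print(ratios)
print(sum(ratios.values()))
```

For kp1 the sketch gives (1 + 2) * 2 + (3 + 4) * 2 = 20, scaled by the creativity weight 2, which equals 2 * 2 * (1 + 2 + 3 + 4) = 40 as in the closed form on the right-hand side.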

$$I_{kp_i} = \deg(kp_i)\,(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) \tag{9}$$

The formula for calculating the total related ratio of an undirected entity graph is:

$$I = \sum_{kp_i \in V} [\deg(kp_1)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \deg(kp_2)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \cdots + \deg(kp_n)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot})] \tag{10}$$

$w_{ij}$ represents the influential weight of the logical connection between entity $kp_i$ and entity $kp_j$, with $w_{ij} \in (1, 10)$. Several steps are required to calculate $w_{ij}$:

If $kp_j$ can be reasoned by $kp_1, kp_2, \ldots, kp_i, \ldots, kp_n$, then $w_{ij} = 10 \ast \ldots$;

If $w_{ij} < 1$, then $w_{ij} = 1$;

$w_i$ represents the influential weight of entity $kp_i$ on creativity, with $w_i \in (1, 10)$.

If $kp_i$ can reason the entities $kp_1, kp_2, \ldots, kp_n$, then the weight $w_i$ can be:

$$w_i = \ln(n) \quad (w_i \in (1, 10))$$

As the number of entities could be enormous, the weight range (1, 10) limits the number of entities to $n \in (1, 22026)$, since $e^{10} \approx 22026$. If n entities are reasoned by entity $kp_i$, then $kp_i$ has been applied to generate n entity connections. The more entities that have been connected by $kp_i$, the lower the possibility that creativity can be achieved by using $kp_i$ as a component.
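The logarithmic weight rule can be sketched in Python as follows. Clamping values that fall outside the range to its bounds is an assumption of this sketch; the text only states the range (1, 10) and the rule that the weight is the natural logarithm of n. The function name `creativity_weight` is likewise hypothetical.

```python
import math


def creativity_weight(n):
    """Weight of an entity that reasons n other entities: ln(n), kept in [1, 10]."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: values outside the stated range are clamped to its bounds.
    return min(10.0, max(1.0, math.log(n)))


# The upper bound on n follows from ln(n) <= 10, i.e. n <= e^10.
max_entities = int(math.exp(10))
print(max_entities)

for n in (1, 3, 100, max_entities):
    print(n, round(creativity_weight(n), 3))
```

The logarithm compresses a very wide range of connection counts into a narrow weight range, which is why roughly twenty-two thousand connections are needed before the weight reaches its upper bound.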

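Therefore, based on the formulas for the ratio of entities combination, the ratio of related entities can be quantified as:

$$r = 4\,\mathit{kp} + 2 \quad (\mathit{kp} \geq 3)$$

$$I = \sum_{kp_i \in V} [\deg(kp_1)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \deg(kp_2)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \cdots + \deg(kp_n)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot})] \tag{11}$$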

C. Routine Design

When applied to the heritage tourism subject, the entities of the graph database and the graphical probability model represent heritage places in the entire heritage area. In a target heritage place there are points of interest (POIs) that visitors can tour. Such POIs can be defined as entities in the graph database and graphical model illustrated above. The weights of the connections between different POIs can be determined from their historical relationships. Following the calculation of the ratio of entity connections explained above, POI connections can be calculated with

$$I = \sum_{POI_i \in V} [\deg(POI_1)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \deg(POI_2)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot}) + \cdots + \deg(POI_n)(w_{\cdot} + w_{\cdot})(w_{\cdot} + w_{\cdot})] \tag{12}$$

The weights between different POIs can be determined from historical knowledge about the heritage place and from the preferences of visitors. POI entities are connected with different preferences based on the historical traits of the target POI.

IV. EXPERIMENT

To explain the graphical model for tourism routine design, the Hailongtun cultural heritage site is used as an example. To design a fitting tourism routine for visitors, the POIs of Hailongtun should be placed in the database with a graphical model. Figure 4.1 shows the structure of the Hailongtun cultural heritage tourism area, including several POIs. The figure shows the research routine in the Hailongtun site, the label of each POI is in Chinese, and the routes between different POIs are shown as well. The aim of the graphical model is to design a fitting tourism routine in the Hailongtun heritage place for visitors. Because Hailongtun is an undeveloped tourism area, no tourism data are available for analysing touring routines; the model is therefore intended to provide new touring routines for new visitors once the area has been developed. The POIs in the Hailongtun area, which are represented by entities, are connected by paths, and the paths are represented by weights for the calculation of tourism routine values.

Figure 4.1 Hailongtun Culture Heritage Tourism Area Routines

Based on this structure figure, a symbolic structure can be built by using entities with the graphical model. In Hailongtun there are 11 POIs, which are represented by entities. The connections between entities carry the weight of each path, which is the essential element for routine value calculation and routine design.

… the routine value of routine B is $I_B = 1+1+2+1+5+7+6+5+7+6 = 41$. On the historical level, routine A is better than routine B for visitors who want to comprehend the history of the Hailongtun area. For other aspects, routines can be evaluated in the same way: the values in the target aspect are confirmed through research, and the most valuable routine is provided to visitors.
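The routine value is a plain sum of path weights, which the following Python sketch makes explicit. The POI labels P1 to P11 and the assignment of weights to particular paths are placeholders; only the ten weights of routine B and their total come from the text.

```python
# The ten path weights of routine B, in the order given in the text.
routine_b_weights = [1, 1, 2, 1, 5, 7, 6, 5, 7, 6]

# Placeholder labels for the 11 POIs; consecutive POIs are joined by one path.
routine_b = [f"P{i}" for i in range(1, 12)]

# Hypothetical mapping from each path (pair of consecutive POIs) to its weight.
path_weight = {
    (routine_b[i], routine_b[i + 1]): weight
    for i, weight in enumerate(routine_b_weights)
}


def routine_value(route, weights):
    """Sum of the weights of the paths between consecutive POIs on the route."""
    return sum(weights[(a, b)] for a, b in zip(route, route[1:]))


print(routine_value(routine_b, path_weight))
```

A route through 11 POIs uses 10 paths, which matches the ten summands in the value of routine B.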

V. CONCLUSION

In this research, heritage tourism routine design is achieved by using a graph database and a graphical model. Computer science techniques and methods are applied in the tourism area. A graph database with weights between different entities is established for the application. The graphical probability model is used to calculate the weights and the total value of each potential tourism routine. The final decision is based on the value that the graphical model assigns to each tourism routine.

REFERENCES

[1] R. Angles, "A Comparison of Current Graph Database Models", 2012 IEEE 28th International Conference on Data Engineering Workshops, Arlington, VA, USA, pp. 171-177, IEEE, 2012.

[2] J. Miller, "Graph Database Applications and Concepts with Neo4j", Proceedings of the Southern Association for Information Systems Conference, Atlanta, GA, USA, vol. 2324, no. 36, 2013.

[3] R. Angles and G. Claudio, "Survey of Graph Database Models", ACM Computing Surveys (CSUR), ACM Publishing, New York, USA, vol. 40, no. 1, p. 1, 2008.

[4] M. Gyssens, P. Jan, V. Jan and V. Dirk, "A Graph-Oriented Object Database Model", IEEE Transactions on Knowledge & Data Engineering, IEEE, US, pp. 572-586, 1994.

[5] B. Iordanov, "HyperGraphDB: A Generalized Graph Database", International Conference on Web-Age Information Management, Springer, Berlin, Heidelberg, pp. 25-36, 2010.

[6] B. Bordoloi and K. Bichitra, "Designing Graph Database Models from Existing Relational Databases", International Journal of Computer Applications, Citeseer, US, no. 1, 2013.

[7] M. Beal, G. Zoubin and E. Carl, "The Infinite Hidden Markov Model", Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation Inc., US, pp. 577-584, 2002.

[8] Q. Zhang and A. Saleem, "Finite-State Markov Model for Rayleigh Fading Channels", IEEE Transactions on Communications, IEEE, New York, US, no. 11, pp. 1688-1692, 1999.

[9] M. Yuan and Y. Lin, "Model Selection and Estimation in the Gaussian Graphical Model", Biometrika, Oxford University, Oxford, UK, no. 1, pp. 19-35, 2007.

[10] H. Yang and A. Hugill, "The Creative Turn: New Challenges for Computing", International Journal of Creative Computing, IEEE, London, UK, vol. 1, no. 1, pp. 4-19, 2013.

[11] Q. Liu, L. Zou, H. Che, H. Wang, Y. Ji and H. Yang, "A Creative Computing Based Inspiration Assistant to Poem Generation", 14th International Symposium on Pervasive Systems, Algorithms and Networks and 11th International Conference on Frontier of Computer Science and Technology and 3rd International Symposium of Creative Computing (ISPAN-FCST-ISCC), IEEE, Exeter, UK, pp. 469-476, 2017.

[12] Q. Liu, L. Zou, S. Ma, and H. Yang, "Intelligence to Artificial Creativity", International Journal of Performability Engineering, Beijing Magtech Co. Ltd., Chengdu, China, vol. 15, no. 2, 2019.

[13] D. Dredge and J. M. Jenkins, Tourism Planning and Policy, John Wiley & Sons, Milton, 2007.

[14] M. Hall, "Tourism Planning: Policies, Processes and Relationships", Tourism Planning, Xi’an International Studies University, China, vol. 22, no. 5, pp. 573-574, 1999.

[15] K. Andriotis and G. Konstantinos, "Tourism Planning and Development in Crete: Recent Tourism Policies and Their Efficacy", Journal of Sustainable Tourism, Taylor and Francis, New York, US, vol. 9, no. 4, pp. 298-316, 2010.

[16] X. Zheng, V. Magnini and D. Fesenmaier, "Information Technology and Consumer Behavior in Travel and Tourism: Insights from Travel Planning Using the Internet", Journal of Retailing and Consumer Services, Elsevier, New York, US, vol. 22, pp. 244-249, 2010.