A Data-driven Approach to Internet-based Business Collaboration

Academic year: 2021

Peng Zhang

Institute of Computing Technology, Chinese Academy of Sciences,

Graduate University of Chinese Academy of Sciences Beijing, China

Guiling Wang, Chen Liu

Institute of Computing Technology, Chinese Academy of Sciences

Beijing, China

Abstract: Traditional business process modeling approaches focus on activities and treat data flows as an afterthought. In cross-organizational collaboration on the Internet, data flow is therefore still far from optimal, causing resource contention and unnecessary data exchanges. This paper proposes a data-driven approach to Internet-based business collaboration that treats data as the centerpiece, models it as data objects, and associates activities with those data objects. Because the data objects hold information structures pertinent to the global context and support a logical data flow, both resource contention and data exchanges can be alleviated. A case study shows that the approach works efficiently.

Keywords—business collaboration, data flow, data object, contention resolution

I. INTRODUCTION

In the 1970s and 1980s, the data-driven approach dominated information system design: it focused on data modeling and overlooked business process modeling. As the need for better insight into, understanding of, and efficiency in business operations grew, system developers in the 1990s increasingly adopted the process-driven approach. That approach, however, is generally activity-centric, and its data flow is still far from optimal, causing resource contention and unnecessary data exchanges. It is therefore ill-suited to data-intensive cross-organizational applications[1].

With the emergence of Cloud Computing, the Internet has the potential to transform a large part of the IT industry and to shape the way information systems are designed[2]. It can provide a unified data model that captures the information structures pertinent to the global context and thereby facilitates Internet-based business collaboration, which is especially important in emergency management.

In emergency management, the emergency response system is an integrated, cross-organizational event handling system. It supports drawing up a contingency plan for event handling and carrying out that plan while end-users modify it incrementally through ad-hoc decision-making. Current emergency plans usually exist as multi-page printed documents and resemble business processes, so Sell and Braun[3] have proposed using a workflow management system to model, implement, and manage emergency plans. However, the business processes involved in emergency event handling are distributed and autonomous, and their internal data change dynamically and are exchanged frequently. Without a unified data model, the chief dispatcher in the emergency center can hardly obtain the global context to consult when making ad-hoc decisions. As a result, resource contention can arise among different business processes, where a resource can be seen as an object that includes a set of data. Moreover, if each activity has to fetch data from a remote location at runtime, the frequent data exchanges lengthen the response time.

Because policies and social contexts vary, different areas use different business processes to handle the same event, but the key data objects (resources) related to the event are stable[4]. For example, gas, electricity, and water are key factors in every city, and the business processes are only the means of changing their values, such as from an abnormal value to a normal value. The lifecycles of the key data objects can therefore serve as common references for different business processes with the same purpose. We call this characteristic “data object-driven”, or “data-driven” for short, and propose a data-driven approach to Internet-based business collaboration. In this approach, the data objects hold information structures pertinent to the global context and support a logical data flow, so that data exchanges and resource contention can be alleviated.

II. BUSINESS PROCESS MODELING

How to search, extract, store, and display resource information is an important aspect of emergency management. As a rule, the resource information comprises the following databases:

• Social economy, climate and geography database;

• Public jeopardy factor database;

• Serious criminal database;

• Special emergency event database such as poisoning, epidemic, earthquake and so on;

• Emergency resource database such as policemen, ambulance, firemen and other recovery resource;


To obtain a real-time overall picture of the key data elements from these distributed, autonomous sources on the Internet in each step of a flow, this paper proposes the data-driven, Internet-based business process modeling shown in Figure 1.

[Figure 1 components: data object modeling; business activity modeling; Workflow Engine; Resource Community. Legend: state transition, business activity, information model, lifecycle.]

Figure 1. The data-driven Internet-based business process modeling

The modeling is divided into two steps: data object modeling, followed by business activity modeling. First, the domain expert analyzes the key data objects related to the case and models the data objects according to the analysis result. Each data object includes an information model and a lifecycle: the information model is extracted from distributed, autonomous sources on the Internet, and the lifecycle is composed of accepted states. Then the business analyzer establishes the relationship between data objects and business processes through business activity modeling. Finally, the end-user models the business processes by composing business activities.

A. Data Object Modeling

The nested relational model, also called NF2 (non-first normal form), was first proposed in 1977. It allows relations to have relation-valued attributes and is one of the most widely adopted data models for representing semi-structured web data. It has been used successfully in web data extraction applications and has been implemented directly in some modern DBMSes, such as Oracle. The reason is that the nested relational model is simple, intuitive, and expressive enough to represent the semi-structured data commonly found in Web pages[6]. We therefore adopt the nested table, which is based on the nested relational model, to represent the information model of a data object.

Here, the information model is composed of two elements: entity and attribute. An attribute is a name with a specific data type and its semantics. In VINCA[5], the attribute type can be chosen from text, textlink, img, imglink, video, and videolink. To facilitate operation and representation, we use the conventional hypertext media types as attribute types. An entity represents a resource and can include many attributes. In the nested table, there are only 1:n relationships between entities, in which all attributes of the n-side entity are added to the attributes of the 1-side entity. The formal definitions are given in [6].

Figure 2. The nested table representation of information model

In the modeling process, the domain expert first specifies all entities related to the data object. Second, the domain expert determines the relationships between entities. If there is a 1:n relationship between two entities, the expert drags the n-side entity into the 1-side entity; the final result is a nested table that includes all entities. Third, the domain expert analyzes all attributes of the entities and drags the attribute symbol to represent them. Fourth, the domain expert specifies the data sources that provide data for the attributes, such as the fields of an online database table or an Internet-based data service. Here we consider only online databases.

Take the fire squadron entity as an example. Its relationships with other entities, such as firemen, fire engines, and fire hydrants, are 1:n, so the domain expert drags these entities into the fire squadron entity, and the attributes of the other entities become attributes of the fire squadron. The resulting nested table is shown in Figure 2. To link the data sources, we obtain the key of each 1:n relationship, such as the name of the fire squadron, which is the key for firemen. We then traverse all nested entities according to the key value and merge the related tuples, such as the fire squadron tuple and the firemen tuples. Finally, we present the merged tuple to the end-user.
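The key-based merge of a 1-side tuple with its n-side tuples can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the class NestedTableMerge, the methods nest and tuple, and the attribute names (name, squadron, fireman, engines) are all hypothetical, and in-memory maps stand in for the online database tables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: nest the tuples of an n-side entity into the
// tuples of the 1-side entity by matching on a shared key attribute.
final class NestedTableMerge {

    // parents:    tuples of the 1-side entity (e.g. fire squadrons)
    // children:   tuples of the n-side entity (e.g. firemen)
    // parentKey:  key attribute in the 1-side tuples
    // childKey:   attribute in the n-side tuples that refers to that key
    // nestedName: attribute under which the matching child tuples are nested
    static List<Map<String, Object>> nest(List<Map<String, Object>> parents,
                                          List<Map<String, Object>> children,
                                          String parentKey, String childKey,
                                          String nestedName) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> parent : parents) {
            // Copy the parent so the source tuples stay unchanged.
            Map<String, Object> merged = new LinkedHashMap<>(parent);
            Object key = parent.get(parentKey);
            List<Map<String, Object>> matches = new ArrayList<>();
            for (Map<String, Object> child : children) {
                if (key != null && key.equals(child.get(childKey))) {
                    matches.add(child);
                }
            }
            merged.put(nestedName, matches);
            result.add(merged);
        }
        return result;
    }

    // Small helper that builds a tuple from alternating names and values.
    static Map<String, Object> tuple(Object... namesAndValues) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            t.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> squadrons = new ArrayList<>();
        squadrons.add(tuple("name", "Squadron A", "engines", 3));

        List<Map<String, Object>> firemen = new ArrayList<>();
        firemen.add(tuple("squadron", "Squadron A", "fireman", "Li"));
        firemen.add(tuple("squadron", "Squadron B", "fireman", "Zhao"));

        // Only the fireman whose key matches "Squadron A" is nested.
        System.out.println(nest(squadrons, firemen, "name", "squadron", "firemen"));
    }
}
```

The nested loop keeps the sketch short; with large tables one would index the n-side tuples by key first.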

After that, the domain expert still needs to specify the lifecycle using a UML state chart[7]. A complete lifecycle must include a start state and end states, and all states are connected by transitions. If a transition has multiple choices, a fork can be used. Each state of the lifecycle is based on accepted domain knowledge, so the lifecycle can serve as a process template with which the business analyzer establishes the relationship between data objects and business processes. Figure 3 shows the lifecycles of three data objects involved in the tank leak case. We now give the related formal definitions, which are needed to define the workflow net supporting nested data objects and to find resource contention formally.

To begin with, we assume the existence of the following pairwise disjoint, countably infinite sets: Tp of primitive types, C of (data object) schema names, A of attribute names, STATES of data object states, and IDc of (data object) identifiers for each schema C ∈ C.

A type is an element of the union T = Tp ∪ C. The domain of each type t in T, denoted DOM(t), is defined as follows:


• if t ∈ Tp is a primitive type, the domain DOM(t) is some known set of values (integers, strings, etc.);

Figure 3. The lifecycle representation

Def 1. A data object schema is a six-tuple (C, A, δ, Q, s, F), where C ∈ C is a schema name, A ⊆ A is a finite set of attributes, δ: A → T is a mapping that assigns a type to each attribute, Q ⊆ STATES is a finite set of states, s ∈ Q is the initial state, and F ⊆ STATES is the set of final states. A data object of schema (C, A, δ, Q, s, F) is a triple (o, u, q), where o ∈ IDc is an identifier, u is a partial mapping that assigns each attribute A in A an element of its domain DOM(δ(A)), and q ∈ Q is the current state. A data object is initial if q = s and u is undefined for every attribute, and final if q ∈ F.
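To make Def 1 concrete, the following sketch encodes a schema and a data object together with the initial and final tests. It is an illustration only: the class and method names, the FireSquadron schema, and its attribute and state names are assumptions made for this example, and type checking of values against δ is omitted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of Def 1: a schema (C, A, δ, Q, s, F)
// and a data object (o, u, q) of that schema.
final class DataObjectSketch {

    static final class Schema {
        final String name;                // C
        final Map<String, String> types;  // δ; its key set plays the role of A
        final Set<String> states;         // Q
        final String initial;             // s
        final Set<String> finals;         // F

        Schema(String name, Map<String, String> types, Set<String> states,
               String initial, Set<String> finals) {
            if (!states.contains(initial)) {
                throw new IllegalArgumentException("initial state must be in Q");
            }
            this.name = name;
            this.types = types;
            this.states = states;
            this.initial = initial;
            this.finals = finals;
        }
    }

    static final class DataObject {
        final Schema schema;
        final String id;                                     // o
        final Map<String, Object> values = new HashMap<>();  // u, a partial mapping
        String state;                                        // q

        DataObject(Schema schema, String id) {
            this.schema = schema;
            this.id = id;
            this.state = schema.initial;
        }

        // Define u on one attribute; attributes outside A are rejected.
        void set(String attribute, Object value) {
            if (!schema.types.containsKey(attribute)) {
                throw new IllegalArgumentException("unknown attribute: " + attribute);
            }
            values.put(attribute, value);
        }

        // Change the current state; states outside Q are rejected.
        void moveTo(String next) {
            if (!schema.states.contains(next)) {
                throw new IllegalArgumentException("unknown state: " + next);
            }
            state = next;
        }

        // Initial: q = s and u is undefined for every attribute.
        boolean isInitial() {
            return state.equals(schema.initial) && values.isEmpty();
        }

        // Final: q ∈ F.
        boolean isFinal() {
            return schema.finals.contains(state);
        }
    }

    // Illustrative schema; attribute and state names are invented here.
    static Schema fireSquadronSchema() {
        Map<String, String> types = new HashMap<>();
        types.put("name", "text");
        types.put("engines", "text");
        Set<String> states = new HashSet<>(Arrays.asList("standby", "dispatched", "returned"));
        Set<String> finals = new HashSet<>(Arrays.asList("returned"));
        return new Schema("FireSquadron", types, states, "standby", finals);
    }

    public static void main(String[] args) {
        DataObject squadron = new DataObject(fireSquadronSchema(), "fs-1");
        System.out.println("initial: " + squadron.isInitial());
        squadron.set("name", "Squadron A");
        squadron.moveTo("returned");
        System.out.println("final: " + squadron.isFinal());
    }
}
```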

Def 2. A data object schema set Γ is a minimal finite set of data object schemas with distinct names such that every data object schema referenced in Γ is also included in Γ.

Def 3. Let Γ be a data object schema set. An instance of Γ is a mapping I that assigns to each data object schema C in Γ a finite, valid, and complete set of data objects.

B. Business Activity Modeling

With reference to the data object schema set Γ, the business analyzer relates business activities to data objects according to the capability information of each business activity, which comprises input variables, output variables, preconditions, and conditional effects. In our context, the data objects constitute a “real world” that captures the information structures pertinent to the global context; their values and states are continuously changed by business activities while the business process runs. We now proceed to model business activities.

Business activities are essentially existing software modules that act on data objects and serve as the components from which business processes are assembled. We assume the existence of pairwise disjoint, countably infinite sets of variables for the data object schemas in C. A variable of type C ∈ C may hold an identifier in IDc.

Def 4. The set of (typed) terms over a schema Γ includes the following.

• Variables of a data object schema C in Γ, and

• x.A, where x is a term of some data object schema C (in Γ) and A an attribute in C. (Note that x.A has the same type as A.)

We now define the notions of “atoms” and “conditions”, which are used to specify the preconditions and conditional effect.

Def 5. An atom over a schema Γ is one of the following:

• t1 = t2, where t1, t2 are terms of data object schema C in Γ,

• DEFINED(t,A), where t is a term of data object schema C and A is an attribute in C,

• s(t) (a state atom), where t is a term of data object schema C and s is a state of C.

Def 6. A well-formed formula over a schema Γ is defined as follows:

• An atom is a well-formed formula;

• If A is a well-formed formula, then ¬A is also a well-formed formula;

• If A and B are well-formed formulas, then A∧B and A∨B are also well-formed formulas.

Only expressions obtained by finitely many applications of the three rules above are well-formed formulas.
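The inductive structure of atoms and well-formed formulas maps naturally onto composable predicates. The sketch below is a simplified illustration, not part of the paper's formalism: the names Conditions, Obj, Formula, same, defined, and inState are hypothetical, terms are restricted to plain variables (attribute terms of the form x.A are left out), and every variable used in a formula is assumed to be bound.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: atoms and well-formed formulas as predicates
// over a binding of variables to data objects.
final class Conditions {

    // Minimal stand-in for a data object (o, u, q).
    static final class Obj {
        final String id;                                     // o
        final Map<String, Object> values = new HashMap<>();  // u
        String state;                                        // q

        Obj(String id, String state) {
            this.id = id;
            this.state = state;
        }
    }

    // A formula is evaluated against a binding variable -> data object.
    interface Formula {
        boolean eval(Map<String, Obj> binding);
    }

    // Atom t1 = t2, restricted here to variables: same identifier.
    static Formula same(String x, String y) {
        return b -> b.get(x).id.equals(b.get(y).id);
    }

    // Atom DEFINED(t, A): attribute A has a value in the object bound to x.
    static Formula defined(String x, String attribute) {
        return b -> b.get(x).values.containsKey(attribute);
    }

    // State atom s(t): the object bound to x is in state s.
    static Formula inState(String state, String x) {
        return b -> state.equals(b.get(x).state);
    }

    // Connectives of Def 6.
    static Formula not(Formula a) {
        return b -> !a.eval(b);
    }

    static Formula and(Formula a, Formula c) {
        return b -> a.eval(b) && c.eval(b);
    }

    static Formula or(Formula a, Formula c) {
        return b -> a.eval(b) || c.eval(b);
    }

    public static void main(String[] args) {
        Obj squadron = new Obj("fs-1", "standby");
        squadron.values.put("engines", 3);

        Map<String, Obj> binding = new HashMap<>();
        binding.put("x", squadron);

        // A precondition: x is on standby and its engine count is defined.
        Formula precondition = and(inState("standby", "x"), defined("x", "engines"));
        System.out.println(precondition.eval(binding));
    }
}
```

A precondition built this way can be evaluated before an activity runs, in the same role that the precondition P plays in a business activity.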

We can now describe the business activities and their semantics. We assume the existence of a disjoint infinite set S of business activity names.

Def 7. A business activity over a schema Γ is a tuple (n, Vi, Vo, P, E), where n ∈ S is a business activity name, Vi and Vo are finite sets of input and output variables of data object schemas in Γ, P is a well-formed formula over these variables that represents the precondition, and E is a conditional effect. The effect updates the attributes and states of data objects. Each update operation is bound to an Event-Condition-Action rule, in which the Event is the end event of the activity, the Condition judges whether the business activity succeeded or failed, and the Action is described in SQL and updates the databases in real time.

Figure 4 shows the major steps of business process modeling for the tank leak case. The end-user composes business activities to model the business process. The basic modeling elements are the data object, business activity, repository, and connector. A repository is a waiting shelf or buffer for data objects, and the data object provides a unified data model that supports the logical data flow. A business activity can push a data object into a repository and pull it out of the repository. A connector connects the port of an activity either to the port of another activity (Activity-Activity) or to the port of a repository (Activity-Repository); ports are the points to which connectors attach. Activity-Activity connectors carry data objects or simple messages, whereas Activity-Repository connectors carry data objects when an activity pushes a data object into a repository.

While the business process is running, the chief dispatcher attends to the nested data objects regardless of the physical data flow. Within the same organization, such as the fire fighting organization, the 119 data object is sharable, so there is no need to exchange data.


After the business process model is deployed into the workflow engine, the engine starts to execute it. In each step of a flow, a real-time overall picture of the key data elements from distributed sources on the Internet is presented to the chief dispatcher to support decision-making.

Petri nets are well suited for modeling and verifying concurrent systems, and they have proven to be a successful formalism for workflow systems[16]. We therefore transform the business process model into a workflow net based on colored Petri nets[8] in order to find resource contention formally. Each place is associated with a type of nested data object, and each token represents a nested data object. In addition, following the complex data transition labels of [13], a nested data object can be nested and un-nested; for example, a record can be un-nested into a 120 data object and a 119 data object. We now give our workflow net:

Def 8. The workflow net supporting nested data objects is a tuple WFN-net = (Γ, P, T, F, C, K, G, E, B, I), where

• Γ is a finite set of data object schemas, also called the color set, and I is an instance mapping of Γ;

• P is the place set; there is exactly one place with no incoming arcs, called the initial place, and exactly one place with no outgoing arcs, called the final place;

• T is the transition set, such that P ∩ T = ∅ and ∀t ∈ T: •t ≠ ∅ ∧ t• ≠ ∅;

• F is the set of directed edges, such that F ⊆ (P × T) ∪ (T × P);

• C is the color set mapping, C: P → Γ;

• K is the capacity mapping, K: P → N ∪ {w}, which gives the token capacity of each place p (w denotes infinite capacity);

• G is the guard function of transitions, G: T → expressions, such that ∀t ∈ T: Type(G(t)) = bool ∧ Type(Var(G(t))) ⊆ Γ; the expression is the precondition of the business activity;

• E is the arc function, E: F → expressions, such that ∀f ∈ F: Type(E(f)) = C(p)_MS ∧ Type(Var(E(f))) ⊆ Γ, where p is the place incident to f and the subscript MS denotes multisets;

• B is the body function of transitions, which is the effect of the business activity.

The workflow net supporting nested data objects can model resource contention in advance and reduce the number of contentions through a contention resolution strategy, which is invoked when the guard function of a transition evaluates to false.
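How a guard decides between firing a transition and handing over to contention resolution can be sketched as follows. This is a deliberately reduced illustration, not the workflow net of Def 8: it assumes one input place and one output place per transition, treats tokens as attribute maps, and omits the arc and body functions. The names GuardedNet, Place, Outcome, and fire, the place 119deployed, and the guard threshold are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: firing one guarded transition between two places.
final class GuardedNet {

    // Stands in for the infinite capacity w.
    static final int UNBOUNDED = Integer.MAX_VALUE;

    static final class Place {
        final String name;
        final String color;   // name of the data object schema, C(p)
        final int capacity;   // K(p)
        final Deque<Map<String, Object>> tokens = new ArrayDeque<>();

        Place(String name, String color, int capacity) {
            this.name = name;
            this.color = color;
            this.capacity = capacity;
        }

        boolean hasRoom() {
            return tokens.size() < capacity;
        }
    }

    enum Outcome { FIRED, NOT_ENABLED, GUARD_FALSE }

    // Moves the first token from 'in' to 'out' if the guard holds.
    // GUARD_FALSE is the point where contention resolution would start.
    static Outcome fire(Place in, Place out, Predicate<Map<String, Object>> guard) {
        Map<String, Object> token = in.tokens.peekFirst();
        if (token == null || !out.hasRoom()) {
            return Outcome.NOT_ENABLED;
        }
        if (!guard.test(token)) {
            return Outcome.GUARD_FALSE;
        }
        out.tokens.addLast(in.tokens.pollFirst());
        return Outcome.FIRED;
    }

    public static void main(String[] args) {
        Place suspending = new Place("119suspending", "119", UNBOUNDED);
        Place deployed = new Place("119deployed", "119", UNBOUNDED);

        Map<String, Object> token = new HashMap<>();
        token.put("squadrons", 1);
        suspending.tokens.addLast(token);

        // Illustrative guard for Deploy2: at least two squadrons are needed.
        Predicate<Map<String, Object>> deploy2Guard =
                t -> ((Integer) t.get("squadrons")) >= 2;

        System.out.println(fire(suspending, deployed, deploy2Guard));
    }
}
```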

Let us examine the tank leak case. One day, a tank truck is traveling on highway X when the tank suddenly begins to leak because of a breach caused by a collision. Fortunately, the truck is fitted with a detector, so the leak event is detected and transmitted to the emergency center. Suppose each detector has a unique identifier, so that the emergency center can locate the event source by that identifier. In the Joint Action step, the leaking tank suddenly triggers an explosion. The control centers report the event, and the chief dispatcher in the emergency center receives it and notifies the fire fighting control center to dispatch two more fire squadrons as reinforcements. Suppose that, as soon as the control center receives the instruction, the responsible leader prepares to schedule two fire squadrons to depart. Lacking global fire squadron information, the chief dispatcher in the emergency center can hardly know that one of the fire squadrons has already dispatched all of its fire engines to other tasks, so resource contention arises among different tasks. With the help of our workflow net, however, the chief dispatcher can find the resource contention in advance and resolve it interactively. For example, when the transition Reinforce2 is triggered, a nested data object of type 119 is added to the place 119suspending according to the arc function E; the nested data object includes attributes such as the number of fire squadrons. If their values do not satisfy the guard function of the transition Deploy2, the contention resolution strategy is invoked.

Algorithm 1. Contention Resolution

Input: Token do, WFN wf, int m // here we take the number of fire engines as an example; the Token is considered as a nested data object

Output: Boolean

// get all tokens with the same type
Set set = wf.Γ.getTokens(do.Type);
int t = 0;
for (int i = 0; i < set.size(); i++) {
    do’ = set.get(i); // fetch the token
    // notify all the tokens: please release the resource
    sendNotification(do’, attribute);
    // receive all resource release feedbacks
    xml = getNotification();
    t += xml.parse(); // return the number of available resources
}
if (t > m) {
    System.out.println("Please allocate");
    return true;
}
else return false;

Whether the nested data objects in other business processes release their resources depends on their criticality, which changes dynamically as the emergency progresses. If the algorithm returns false, the system notifies the chief dispatcher of the failure, rolls back to the place of the Depart2 output port, and cancels all transitions; otherwise, it notifies the chief dispatcher to allocate the available resources and execute the transition Deploy2. After that, the merged nested data object in the place of the Depart2 output port has three fire squadrons, and the fire fighting begins. In this way, the contention resolution strategy alleviates the resource contention.
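For readers who want to experiment with Algorithm 1, a runnable approximation is sketched below. It rests on several assumptions: the notification round trip (sendNotification, getNotification, and the XML parsing) is collapsed into a single hypothetical callback, ResourceHolder.releasable, the names ContentionResolution and resolve are invented here, and the comparison t > m is taken from Algorithm 1 as printed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of Algorithm 1: ask every token of the same type
// how many resources it can release and compare the total with m.
final class ContentionResolution {

    // Stands in for one nested data object held by another business process.
    // The callback replaces the notification round trip of Algorithm 1 and
    // returns the number of resources that the holder is willing to release.
    interface ResourceHolder {
        int releasable(String attribute);
    }

    // Returns true if the released resources exceed m, as in Algorithm 1.
    static boolean resolve(List<ResourceHolder> holders, String attribute, int m) {
        int available = 0;
        for (ResourceHolder holder : holders) {
            available += holder.releasable(attribute);
        }
        return available > m;
    }

    public static void main(String[] args) {
        List<ResourceHolder> others = Arrays.<ResourceHolder>asList(
                attribute -> 2,   // a less critical process releases two engines
                attribute -> 0);  // a critical process releases nothing

        boolean ok = resolve(others, "fireEngines", 1);
        System.out.println(ok ? "Please allocate" : "failure");
    }
}
```

In a real deployment each holder would decide how much to release from its current criticality, which is why the callback receives the attribute name rather than a fixed count.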

III. EXPERIMENT

To verify that nested data objects supporting a logical data flow are effective, we ran an experiment with ten business processes, eight of which target hospital transfer and two of which target fire fighting. The business processes are ordered by their number of data exchanges and tested one by one during the test runs. Figure 5 shows how nested data objects reduce the number of data exchanges. Take the tank leak case as an example. If we model Internet-based business collaboration with a traditional workflow notation such as BPMN, each activity has to fetch data from a remote location, so the data flow involves frequent data exchanges. In our workflow net, the number of data exchanges is reduced because the logical data flow exchanges only nested data objects each time.

[Figure 5 plot: x-axis, test run (1 to 10); y-axis, number of data exchanges (0 to 100); series, WFN and BPMN.]

Figure 5. The experiment of data exchanges

Figure 5 shows no distinct difference until the third run. After that, the number of data exchanges grows roughly quadratically with BPMN but only linearly with WFN. Two reasons explain this effect. First, the nested data objects act as a data cache close to the flow, which eliminates some unnecessary data exchanges; for example, within the same organization the nested data objects are sharable, so there is no need to exchange data. Second, the nested data objects encapsulate physically distributed data, so a nested data object needs to be exchanged only once instead of the physically distributed data being exchanged many times. Since each data exchange takes time, reducing the number of data exchanges also reduces the runtime.

The benefit is easy to see from the experiment. We conclude that, as Cloud Computing develops, the efficiency problem caused by the super-linear growth of data exchanges in Internet-based business collaboration can be alleviated.

IV. DISCUSSIONS

The data-driven approach concerns the dependencies between data used by activities and derives control flows based on such dependencies. The related research groups are as follows:


Cohn[9] argues that the associated requirements, business rules, and business intelligence are based on conceptual meta-models that are only loosely connected to the existing BPM base model. This disparity adds substantial conceptual complexity to models of business operations and processes, so he proposes a data-centric approach to modeling business processes. Building on artifact-centric business processes, Liu[12] develops a computational model, based on Petri nets, for business artifact-centered operational models to enable formal analysis and verification, but does not give a complete formal definition.

Van der Aalst[10] argues that workflow management systems are too restrictive and have problems dealing with change. To address this, he proposes the Case Handling paradigm, in which the knowledge worker in charge of a particular case actively decides how the goal of that case is reached, and the role of the case handling system is to assist rather than guide the worker in doing so. Case handling is not only process-driven but also data-driven. It can avoid context tunneling by providing all available information.

Both van Hee[14] and Wang[15] propose resource-constrained workflow nets and prove a number of properties, in particular that the number of available resources never exceeds the number of initially given resources. However, the resources are modeled only as metrics, and the workflow net has no data model to capture the information pertinent to the global context.

The development of complex products requires the coordination of thousands of processes. In most cases, the different processes have to be composed and coordinated manually. Manually modeling and adapting large process structures is error-prone and can cause delays or deadlocks that block the execution of the whole process structure. To address this, Reichert[11] provides a new approach to the automated creation and data-driven adaptation of process structures at runtime, which reduces the modeling effort for large process structures and ensures correct coordination of processes.

Although the above groups have proposed data-driven or data-centric business process modeling, they do not explicitly consider data flow optimization. Our work, in contrast, explicitly considers data flow optimization and proposes a nested data object model to support the logical data flow.

To sum up, our data-driven approach to Internet-based business collaboration has three advantages:

• The nested data objects hold information structures pertinent to the global context to support the logical data flow.

• The nested data objects reduce unnecessary data exchanges.

• The workflow net supporting nested data objects is based on the colored Petri net, so it can formally find resource contention in advance and alleviate it.

As Internet-based business collaborations multiply, the optimization of data flow becomes a key point, and our approach provides a useful reference in this context.

ACKNOWLEDGMENT

The work was partially supported by the National Science Foundation of China under Grant No. 60903048, the Major State Basic Research Development Program of China (973 Program) under Grant No. 2007CB310805, the National Science Foundation of Beijing under Grant No. 4092046, the Mountain Tai scholarship program, and the Co-building Program of Beijing Municipal Education Commission.

REFERENCES

[1] D. Cohn and R. Hull. Business artifacts: A data-centric approach to modeling business operations and processes. IEEE Data Eng. Bull., 32(3):3–9, 2009.

[2] M. Armbrust, A. Fox, R. Griffith, et al. Above the Clouds: A Berkeley View of Cloud Computing. Technical Report No. UCB/EECS-2009-28, University of California at Berkeley, USA, Feb. 10, 2009.

[3] Christian Sell, Iris Braun. Using a Workflow Management System to Manage Emergency Plans. ISCRAM, 2009.

[4] Tian Chao, David Cohn, Adrian Flatgard, et al. Artifact-Based Transformation of IBM Global Financing. BPM 2009, 2009:261-277

[5] Yanbo Han, Jing Wang, Peng Zhang. Business-oriented service modeling: A case study, Simulation Modeling Practice and Theory, 2009, 17:1413~1429.

[6] Guiling Wang, Shaohua Yang, Yanbo Han. Mashroom: end-user mashup programming using nested tables. WWW 2009, 2009: 861-870

[7] Eriksson H E, Penker M. Business modeling with UML. Wiley New York. 2000

[8] Ratzer V.A., Wells L., Lassen M.H., et al.. CPN Tools for Editing, Simulating, and Analysing Coloured Petri Nets. In: ICATPN 2003. LNCS, vol. 2679, pp. 450–462.

[9] Cohn D, Hull R. Business Artifacts: A Data-driven Approach to Modeling Business Operations and Processes. IEEE Computer Society Technical Committee on Data Engineering 2009. 2009.

[10] Van der Aalst W, Weske M, Grünbauer D. Case handling: a new paradigm for business process support. Data & Knowledge Engineering, 2005, 53(2):129-162

[11] Muller D, Reichert M, Herbst J. A new paradigm for the enactment and dynamic adaptation of data-driven process structures. CAiSE 2008, 2008:48-63

[12] Rong Liu, Kamal Bhattacharya, Frederick Y. Wu. Modeling Business Contexture and Behavior Using Business Artifacts. CAiSE 2007, 2007: 324-339

[13] Jan Hidders, Natalia Kwasnikowska, Jacek Sroka, et al. Petri Net + Nested Relational Calculus = Dataflow. ODBASE 2005, 2005: 220-237

[14] Kees van Hee, Natalia Sidorova, Marc Voorhoeve. Resource-Constrained Workflow Nets. Fundamenta Informaticae, 2006, 71:243-257

[15] Wang J, Tepfenhart W, Rosca D. Emergency response workflow resource requirements modeling and analysis. IEEE Transactions on Systems, Man, and Cybernetics, 2009, 39(3): 270-283

[16] W. M. P. van der Aalst and K. M. van Hee. Workflow Management: Models, Methods, and Systems. MIT Press, 2002.

Figure 4.   Major steps of data-driven Internet-based business process modeling for tank leak case