Introduction to Big Data Science
13th Period
Project: Situation Awareness and Statistical Analysis On
Contents
What is Situation Awareness (SA)?
3 Levels for SA
Role of Data Mining and Reasoning in SA
Extracting Information from Big Data
Awareness
The goal of computational awareness is to realize awareness in computing machines.
Awareness is the ability to perceive, to feel, or to be conscious of events, objects, or sensory patterns.
“Situation awareness is the perception of environmental elements with respect to time and/or space, the comprehension of their meaning, and the projection of their status in the near future after some variable has changed.” (Mica Endsley, Wikipedia).
A. Steinberg, et al., Rethinking the JDL Data Fusion Levels
JDL: Data Fusion Levels
M.R. Endsley, Theoretical Underpinnings of SA: A Critical Review
[Figure: ontology-based situation awareness. Labels: Collect Relevant Data (Provenance); Identify Situation Entities; Relate Situation Entities; Semantic Analysis (thematic, spatio-temporal, trust). Source: M. Kokar, et al., Ontology-based Situation Awareness (modified figure by A. Sheth).]
A novel architecture for active situation awareness
Image processing, pattern recognition, data mining, and signal processing can be applied in the perception layer to recognize low-level objects and data patterns.
Situation awareness means inferring conclusions from the observations made in the perception layer. Ontology-based rules are usually used for comprehension.
The top layer is for projection, which anticipates
[Figure: A novel architecture for active situation awareness. Layers, top to bottom: Projection; Comprehension (Situation); Perception; World. Predicates shown: recommendToparticipateTheEvent(Building, Event), needReplyTo(ITM), checkHisEvent(ITM), hasEvent(Building, Event), isRare(Event), giveHotTopic(ITM, ATopicHisBlog), sayCelebration(ITM, myBlog), Stand(People, Longline), isAT(People, Building), Wrote(ITM, myBlog). Data sources: Facebook, Twitter, Google, Web Data Service.]
[Figure: Perceptions by mining SNS data. Labels: Documents; SNS and Web Data Services (Twitter, Facebook); Document Processing; Latent Query for SA (Time, Space, Theme); Data; Event; Information Extraction; Classification (TF-IDF); Perception Information; Ontology for Comprehension at Upper Layer; Active Situation Awareness.]
Perception by mining SNS data
Select a data set from which to extract information for use in the comprehension layer.
The information can be exposed through Web APIs that provide facts to the rule engine. For example, we have analyzed Facebook users' sentences with data mining techniques to catch a user's intention or changes of mind.
There are various data and information set for
Ontology for Comprehension of the information
Comprehension of the information by inference over the ontology and rules
%% Cafeteria Event Inference
%% Rules
%%longLineStand(Human) :- stand(Human), long(Human).
mayHaveEvent(Place) :- longLineStand(Human), areAt(Human, Place).
hasEvent(Place,Event) :- mayHaveEvent(Place), foundEvent(Place, Event).
recommendToparticipateTheEvent(Place, Event) :- hasEvent(Place,Event), isRare(Event).
%% Facts
longLineStand(students).
areAt(students, cafeteria).
foundEvent(cafeteria, sinsobamatsuri).
isRare(sinsobamatsuri).
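The inference above can be sketched outside a Prolog engine as simple forward chaining. This is only an illustration, not the system's actual rule engine: the tuple encoding of facts and the function name `infer` are assumptions, and the sketch assumes the event name is spelled identically in `foundEvent` and `isRare`.

```python
# Facts are tuples: (predicate, arg1, ...). Predicate names follow the rules above.
facts = {
    ("longLineStand", "students"),
    ("areAt", "students", "cafeteria"),
    ("foundEvent", "cafeteria", "sinsobamatsuri"),
    ("isRare", "sinsobamatsuri"),
}

def infer(facts):
    """Apply the three cafeteria rules repeatedly until no new fact appears."""
    known = set(facts)
    while True:
        new = set()
        # mayHaveEvent(Place) :- longLineStand(Human), areAt(Human, Place).
        for f in known:
            if f[0] == "areAt" and ("longLineStand", f[1]) in known:
                new.add(("mayHaveEvent", f[2]))
        # hasEvent(Place, Event) :- mayHaveEvent(Place), foundEvent(Place, Event).
        for f in known:
            if f[0] == "foundEvent" and ("mayHaveEvent", f[1]) in known:
                new.add(("hasEvent", f[1], f[2]))
        # recommendToparticipateTheEvent(Place, Event) :- hasEvent(Place, Event), isRare(Event).
        for f in known:
            if f[0] == "hasEvent" and ("isRare", f[2]) in known:
                new.add(("recommendToparticipateTheEvent", f[1], f[2]))
        if new <= known:  # fixed point reached
            return known
        known |= new

derived = infer(facts)
print(("recommendToparticipateTheEvent", "cafeteria", "sinsobamatsuri") in derived)
```

If the `isRare` fact names a different event than `foundEvent`, the last rule never fires and no recommendation is derived.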
ASA System Architecture on SNS
[Figure: system components. Smart Phone; RESTful Services for Perception (Facebook Service, Twitter Service, Web Data Service); Facts; Mapping Ontologies; Domain Ontologies; Rules; Inference Engine.]
Scenarios
Scenario I
A student at our university bought a lunch box because he saw a long
waiting line in the university cafeteria. He did not know that the line was
for the new soba festival in the cafeteria. If his smartphone had told him
about the new soba festival when he was near the cafeteria, he would have
chosen the soba.
Scenario II, III
When I am in my office, a student comes in. When I shake my smartphone,
it tells me the following about the student, based on information from
Facebook:
(Example)
- The visitor's name: Leo Saito
- He has an interest in me
- Saito has events (part-time job, date)
Mining SNS Data
(By TF-IDF for Perception layer)
Function: Category_calculate { // calculate the category of a writing
Input: words // set of words split from the writing
Output: category // category of the words set
Data = learning data set
Max_Sum_of_TFIDF = 0
category = none
for i = 1 to n { // n = number of words in the words set
calculate IDFi = log2(number of all documents in Data /
number of documents in Data containing wordi) } // skip wordi if no document contains it
// IDFi = IDF value of wordi
for i = 1 to n { // n = number of words in the words set
for j = 1 to m { // m = number of documents in Data
calculate TFij = (frequency of wordi in Dataj / number of all words in Dataj)
calculate TFIDFij = TFij * IDFi }}
for j = 1 to m { // m = number of documents in Data
calculate Sum_of_TFIDFj = sum of TFIDF1j, TFIDF2j, ..., TFIDFnj
if Max_Sum_of_TFIDF < Sum_of_TFIDFj {
Max_Sum_of_TFIDF = Sum_of_TFIDFj
category = category of Dataj }}
return category }
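A minimal Python sketch of the same idea follows. It is an illustration under stated assumptions, not the project's code: the training data format (a list of `(category, words)` pairs), the sample documents, and the choice to skip words that appear in no training document are all assumptions made here.

```python
import math
from collections import Counter

def category_calculate(words, data):
    """Return the category of the training document with the largest summed TF-IDF.

    words: list of words split from the writing to classify.
    data:  list of (category, list_of_words) training documents (assumed format).
    """
    m = len(data)
    counts = [Counter(doc_words) for _, doc_words in data]
    # IDF per distinct query word; words found in no document are skipped.
    idf = {}
    for w in set(words):
        df = sum(1 for c in counts if w in c)
        if df:
            idf[w] = math.log2(m / df)
    best_score, best_category = 0.0, None
    for (category, doc_words), c in zip(data, counts):
        if not doc_words:
            continue
        # TF = frequency of the word in the document / number of words in the document
        score = sum((c[w] / len(doc_words)) * value for w, value in idf.items())
        if score > best_score:
            best_score, best_category = score, category
    return best_category

# Hypothetical training documents, invented for this example only.
training = [
    ("food", "soba festival cafeteria lunch soba".split()),
    ("study", "exam lecture report lecture".split()),
    ("sports", "soccer match stadium".split()),
]
print(category_calculate("new soba at the cafeteria".split(), training))
```

With this scoring, a writing that shares no words with any training document gets no category (`None`), which the caller has to handle.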
Function: determine the difference between the two categories {
Input: writing1, writing2 // each writing is a set of documents
Output: true or false // true if the categories differ, false if they agree
for i = 1 to n { // n = number of documents in writing1
category1i = Category_calculate(writing1i) }
category_of_writing1 = most common category among the documents in writing1
for j = 1 to m { // m = number of documents in writing2
category2j = Category_calculate(writing2j) }
category_of_writing2 = most common category among the documents in writing2
if category_of_writing1 = category_of_writing2
return false
else
return true }
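The comparison step can be sketched in Python as below. The classifier is passed in as a parameter; `keyword_classify` is a hypothetical stand-in used only to make the example runnable, and the sample sentences are invented.

```python
from collections import Counter

def most_common_category(documents, classify):
    """Classify each document and return the most frequent category."""
    labels = [classify(doc) for doc in documents]
    return Counter(labels).most_common(1)[0][0] if labels else None

def categories_differ(writing1, writing2, classify):
    """Return True if the dominant categories of the two writings differ."""
    return (most_common_category(writing1, classify)
            != most_common_category(writing2, classify))

# Hypothetical stand-in classifier, not the TF-IDF classifier itself.
def keyword_classify(doc):
    return "food" if "soba" in doc else "study"

earlier = ["exam tomorrow", "lecture notes", "soba lunch"]
later = ["soba festival", "new soba", "report due"]
changed = categories_differ(earlier, later, keyword_classify)
print(changed)
```

A `True` result is what the perception layer would report as a change in the user's dominant topic between the two periods.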
Rules for SA (Example 2)
1) ITM
wantsMyReply(ITM) :- wrote(ITM, myBlog) and thereis(questionMark, hisWriting).
enjoyMe(ITM) :- wroteNumberMorethan(ITM, myBlog, threshold).
giveHotTopic(ITM, ATopicHisBlog) :- wrote(ITM, ATopicHisBlog) and thereAreRepliesMorethan(ATopicHisBlog, threshold).
giveGoodEvaluation(ITM, ATopicHisBlog) :- wrote(ITM, ATopicHisBlog) and thereAreGoodRepliesMorethan(ATopicHisBlog, threshold).
sayCelebration(ITM, myBlog) :- wrote(ITM, myBlog) and thereis(celebration, myBlog).
haveNewEvent(ITM) :- wrote(ITM, hisEventBlog).
* Example of Upper Level Factor or Situation
needReplyTo(ITM) :- wantsMyReply(ITM) and sayCelebration(ITM, myBlog) and enjoyMe(ITM).
checkHisEvent(ITM) :- haveNewEvent(ITM) and giveHotTopic(ITM, ATopicHisBlog).
2) MC
wantsMyReply(MC) :- wrote(MC, myBlog) and thereis(questionMark, hisWriting).
enjoyMe(MC) :- wroteNumberMorethan(MC, myBlog, threshold).
giveHotTopic(MC, ATopicHisBlog) :- wrote(MC, ATopicHisBlog) and thereAreRepliesMorethan(ATopicHisBlog, threshold).
giveGoodEvaluation(MC, ATopicHisBlog) :- wrote(MC, ATopicHisBlog) and thereAreGoodRepliesMorethan(ATopicHisBlog, threshold).
sayCelebration(MC, myBlog) :- wrote(MC, myBlog) and thereis(celebration, myBlog).
haveNewEvent(MC) :- wrote(MC, hisEventBlog).
3) IL
hasNewEvent(IL) :- wroteSomeBlogforEvent(IL) --> * large complex task *
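As a rough illustration of how the perception facts behind needReplyTo might be computed from raw comments, consider the sketch below. Everything concrete in it is an assumption: the `(author, text)` comment format, the keyword list standing in for thereis(celebration, myBlog), and the threshold value.

```python
CELEBRATION_WORDS = {"congratulations", "congrats", "happy birthday"}  # assumed keywords
THRESHOLD = 2  # assumed value for the rules' `threshold`

def need_reply_to(person, comments_on_my_blog):
    """needReplyTo(P) :- wantsMyReply(P) and sayCelebration(P, myBlog) and enjoyMe(P)."""
    mine = [text for author, text in comments_on_my_blog if author == person]
    # wantsMyReply: the person wrote on my blog and a question mark appears
    wants_my_reply = any("?" in t for t in mine)
    # sayCelebration: the person wrote on my blog and a celebration phrase appears
    say_celebration = any(w in t.lower() for t in mine for w in CELEBRATION_WORDS)
    # enjoyMe: the person wrote on my blog more than `threshold` times
    enjoy_me = len(mine) > THRESHOLD
    return wants_my_reply and say_celebration and enjoy_me

# Invented sample comments on "my blog".
comments = [
    ("ITM", "Congratulations on the new paper!"),
    ("ITM", "Which dataset did you use?"),
    ("ITM", "Nice photo."),
    ("MC", "Is the seminar still on Friday?"),
]
print(need_reply_to("ITM", comments), need_reply_to("MC", comments))
```

Keyword matching is a deliberately crude stand-in here; in the architecture above these facts would come from the mining services in the perception layer.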