Figure 3.42: ORange’s web interface
ORange (Objective-aware Range Query Refinement) is a web-based tool for range query refinement. Essentially, ORange refines a range query to meet a specified cardinality constraint while taking into account the (dis)similarity between the initial query and its refined version. To showcase SAQR's efficiency and benefits, we designed and developed ORange as an application that guides police coordinators in allocating police officers to service zones, such that each police officer has a specific capacity (number of incidents) [121, 34]. That is, each police officer can handle only K incidents at a given time; therefore, a service zone allocated to her must contain K incidents. More or fewer incidents in her service zone correspond to a drop in quality of service or a waste of resources, respectively. Accordingly, we employed the SAQR schemes proposed in Section 3.3 to refine service zones given their capacity (cardinality) constraints. We used a historical dataset of crime incidents in the city of San Diego, CA, USA (see footnote 2) to estimate the number of incidents; therefore, this application is based on that city. Nonetheless, users can upload their own datasets to use ORange's capabilities.
The capabilities of ORange can be summarized as follows:
• A police coordinator can enter a cardinality constraint and visually select an initial range query (service zone) on a real map for a police officer.
• A police coordinator can see the new refined query provided by the application on the same input map, such that the new query satisfies the cardinality constraint.
Footnote 2: Extracted from clarinova.com-crime-incidents-casnd-7ba4, San Diego Regional Data Library, 2013-08-07.
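[Figure: architecture diagram. Components: User; ORange Web interface (Google Maps service, D3 Library, Google APIs), used to select and visualize range queries on the map; Query Refinement Engine; DBMS. Flows: the user provides the initial selected range query Q (service zone) and the cardinality constraint K; the web interface passes the original Q and the constraint K to the engine; the DBMS returns the result of Q; the engine returns the refined query Q'.]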
Figure 3.43: ORange’s complete system architecture
• Performance details (cost and deviation) are shown for each scheme, so that the schemes can be judged and compared against a baseline heuristic algorithm, Hill Climbing.
In the following subsections, we first introduce the application's architecture and then briefly present the application's setup and the dataset used. For a detailed description of ORange's underlying schemes, SAQR-S and SAQR-CS, which exploit the distance and cardinality constraints to effectively prune the search space, we refer the reader to Section 3.3.
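The Hill Climbing baseline mentioned above is not specified in this section. As a rough illustration only, the following Python sketch shows one generic way a hill-climbing refinement could proceed: starting from the initial rectangle, it repeatedly moves one boundary by a fixed step and keeps the move that most reduces the gap between the zone's incident count and K. The function names, the step size, and the objective (cardinality gap only, ignoring deviation and α) are assumptions made for this sketch, not ORange's implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]                 # (longitude, latitude)
Rect = Tuple[float, float, float, float]    # (min_lon, min_lat, max_lon, max_lat)


def cardinality(rect: Rect, points: List[Point]) -> int:
    """Count the points that fall inside the rectangle (bounds inclusive)."""
    min_lon, min_lat, max_lon, max_lat = rect
    return sum(min_lon <= x <= max_lon and min_lat <= y <= max_lat
               for x, y in points)


def neighbours(rect: Rect, step: float) -> List[Rect]:
    """All valid rectangles obtained by moving one boundary by +/- step."""
    result = []
    for i in range(4):
        for delta in (-step, step):
            cand = list(rect)
            cand[i] += delta
            # Discard candidates whose lower bound passes the upper bound.
            if cand[0] <= cand[2] and cand[1] <= cand[3]:
                result.append(tuple(cand))
    return result


def hill_climb(rect: Rect, points: List[Point], k: int,
               step: float = 1.0, max_iters: int = 1000) -> Rect:
    """Greedy local search: stop when no neighbour reduces |count - k|."""
    gap = abs(cardinality(rect, points) - k)
    for _ in range(max_iters):
        if gap == 0:
            break
        best = min(neighbours(rect, step),
                   key=lambda r: abs(cardinality(r, points) - k))
        best_gap = abs(cardinality(best, points) - k)
        if best_gap >= gap:
            break  # local optimum: no single move improves the gap
        rect, gap = best, best_gap
    return rect
```

Like any hill-climbing search, this sketch can stop at a local optimum whose cardinality differs from K, which is one reason such a heuristic serves as a baseline rather than as the main scheme.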
3.5.1 ORange Architecture
Figure 3.43 shows the detailed architecture of ORange and its building modules. ORange communicates with the user through a web interface and receives two inputs: a selected range query (service zone) and a constraint K. The former is captured by two modules, Google Maps APIs and the D3 library, which show a real-world map and draw a rectangular area on it, respectively. These inputs are then fed to the Query Refinement Engine. When the refinement engine finds the refined query, it sends it back to the web interface, which displays the new refined service zone to the user (again with the help of Google Maps APIs and the D3 library). If the user is not satisfied with the result, she can issue a new query and provide new constraints.
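To make this data flow concrete, the sketch below models the messages exchanged between the web interface and the Query Refinement Engine as plain Python data classes serialized to JSON. The class names, field names, and the JSON encoding are assumptions for illustration; the text does not specify the actual wire format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RangeQuery:
    """A rectangular service zone given by its SW and NE corners."""
    sw_lat: float
    sw_lng: float
    ne_lat: float
    ne_lng: float

    def __post_init__(self) -> None:
        # A rectangle is valid only if SW lies below and left of NE.
        if self.sw_lat > self.ne_lat or self.sw_lng > self.ne_lng:
            raise ValueError("SW corner must not exceed NE corner")


@dataclass(frozen=True)
class RefinementRequest:
    """What the web interface sends: initial query Q and constraint K."""
    query: RangeQuery
    k: int

    def to_json(self) -> str:
        return json.dumps(asdict(self))


@dataclass(frozen=True)
class RefinementResponse:
    """What the engine returns: the refined query Q'."""
    refined: RangeQuery

    @staticmethod
    def from_json(text: str) -> "RefinementResponse":
        data = json.loads(text)
        return RefinementResponse(RangeQuery(**data["refined"]))
```

Validating the rectangle at construction time keeps malformed selections from reaching the engine, which is one plausible place to put such a check.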
3.5.2 Application Setup
The ORange application is built as a client-server application with a front-end that handles all presentation tasks and a back-end that processes data. The front-end is a web interface consisting of an HTML page that passes input data from users to the system and presents output to users in a suitable way (see Figure 3.42). A visualization library called D3 is used to draw the selected rectangular area on the map provided by Google Maps APIs. The controlling parameters (the cardinality constraint K, α, and the selected scheme) are collected from users through HTML input tags. While these presentation tasks are all located in the front-end, all processing tasks are located in the back-end. Specifically, the refinement engine is located in the back-end and is implemented in Java. Its job is to receive the input parameters (query and control parameters) and return the optimal refined query and the performance indicators to the front-end for presentation. The front-end then shows the quality of refinement and the cost metric for all schemes (via charts in a dashboard) so that users can compare them.

All data is stored in a MySQL DBMS and, as explained above, we use a historical dataset of crime incidents in the city of San Diego, CA, USA. That dataset consists of one relation of 100k incidents. Each incident is represented by multiple attributes; however, we are only concerned with the longitude and latitude attributes of the crime incident. A partial schema of the relation is shown in Table 3.8.

Attribute    Description
---------    -----------
date         ISO date, in YY-MM-DD format
year         Four digit year
month        Month number extracted from the date
day          Day number, starting from Jan 1, 2000
dow          Day of week, as a number. 0 is Sunday
time         Time, in H:MM:SS format
type         Crime category, provided by SANDAG
address      Block address, street and city name
Latitude     Provided by the geocoder
Longitude    Provided by the geocoder
desc         Long description of incident

Table 3.8: Schema of the dataset used, SD_incidents_100k.
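Because only the latitude and longitude attributes matter, the cardinality of a candidate service zone reduces to a count over a bounding box. The sketch below illustrates such a query. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the MySQL database; the helper name count_incidents and the sample rows are made up for this example and are not taken from the dataset.

```python
import sqlite3


def count_incidents(conn, sw_lat, sw_lng, ne_lat, ne_lng):
    """Number of incidents inside the rectangle with the given SW and NE corners."""
    row = conn.execute(
        "SELECT COUNT(*) FROM SD_incidents_100k "
        "WHERE Latitude BETWEEN ? AND ? AND Longitude BETWEEN ? AND ?",
        (sw_lat, ne_lat, sw_lng, ne_lng),
    ).fetchone()
    return row[0]


# Tiny in-memory stand-in for the real relation (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SD_incidents_100k (Latitude REAL, Longitude REAL)")
conn.executemany(
    "INSERT INTO SD_incidents_100k VALUES (?, ?)",
    [(32.71, -117.16), (32.72, -117.15), (32.80, -117.10), (32.60, -117.30)],
)

print(count_incidents(conn, 32.70, -117.20, 32.75, -117.10))  # 2
```

On a relation of 100k rows, an index on the two coordinate columns would be a natural addition, although the text does not say whether ORange uses one.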
3.5.3 Step-by-step Example
As described earlier, the dataset represents the locations of historical crime incidents within the city of San Diego. Users can allocate a service zone to a police officer, i.e., that officer will be in charge of any incident reported in her service zone and must respond to it. Using the application interface in Figure 3.42, the user performs the following four steps:
Step 1: The user uses a selection tool to select a rectangular area on the map of San Diego, which represents the desired service zone. To keep the user informed, the NE (northeast) and SW (southwest) coordinates of the selected zone are also shown in the text boxes.
Step 2: Once the zone is selected, the user then enters the cardinality constraint and selects the controlling parameters (α and refinement scheme).
Step 3: To execute the refinement process, the user clicks on the Refine button, and waits until the processing is finished. This is when the refinement module takes over and starts navigating the search space looking for a refined query that has the minimum overall deviation.
Step 4: As soon as the refinement process finishes, the refined area is drawn on the same input map, but in a different color to distinguish between the initial and refined service zones. In addition, the deviation from the initially selected area is shown in the text box as a normalized value in the range [0, 1].
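The exact deviation measure belongs to the SAQR schemes of Section 3.3 and is not restated here. Purely as an illustration of how a value in [0, 1] can arise, the following sketch averages the shift of the four zone boundaries, each scaled by the extent of the map area along its axis. The formula, the function name, and the domain parameter are assumptions for this example, not ORange's definition.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def normalized_deviation(initial: Rect, refined: Rect, domain: Rect) -> float:
    """Mean boundary shift, each scaled by the domain extent on its axis.

    Assumes both rectangles lie inside `domain`, so the result is in [0, 1].
    """
    width = domain[2] - domain[0]
    height = domain[3] - domain[1]
    extents = (width, height, width, height)
    shifts = [abs(a - b) / e for a, b, e in zip(initial, refined, extents)]
    return sum(shifts) / len(shifts)
```

Under this assumed measure, a value of 0 means the refined zone coincides with the initial selection, and larger values indicate that the boundaries moved further from where the user drew them.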