The data acquisition solution consists of several components that communicate with one another to form an integrated system (see figure 6.1).
[Figure 6.1 components: DistributeEyes Client, DistributeEyes Central Server, HTML Training Application, MTurk]
Figure 6.1: The DistributeEyes components and their interactions
3 http://www.distributeeyes.com
Figure 6.2: DistributeEyes Project Settings Dialog Window
6.3.1 The DistributeEyes client
A user of the client (the “user”) creates a new project, identifies the different types of objects of interest (for historical reasons, we call these “phenotypes”), and provides a few examples of each. The user then initializes the project on the central DistributeEyes server (the “server”). Next, the images from which training data are to be extracted (“images”) are added to the project.
The “Distribute Images” panel (not shown) shows each image as a row in a table and allows users to select which images they want to distribute for online marking. The marking of one image is done as a “Human Intelligence Task” (“HIT”) on Amazon’s MTurk by a worker and is dubbed a “submission.”
A project settings panel allows customization of the following parameters: minimum number of total points, minimum number of positive points (objects) sought, minimum number of negative points (non-objects) sought, monetary reward for training, bonus reward for training, time-to-expire for the HIT, and special instructions.
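The parameters above map naturally onto a small record type. The following is a minimal sketch only: the class name, field names, units (US dollars, seconds), and validation rules are assumptions made for illustration and are not taken from the DistributeEyes client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectSettings:
    """Hypothetical container for the per-project parameters listed above."""
    min_total_points: int
    min_positive_points: int      # objects sought
    min_negative_points: int      # non-objects sought
    reward_usd: float             # assumed unit: US dollars
    bonus_usd: float              # assumed unit: US dollars
    hit_expiry_seconds: int       # assumed unit: seconds
    special_instructions: str = ""

    def validate(self) -> None:
        """Raise ValueError if any parameter is outside a sensible range."""
        minimums = (self.min_total_points,
                    self.min_positive_points,
                    self.min_negative_points)
        if min(minimums) < 0:
            raise ValueError("point minimums must be non-negative")
        if self.reward_usd < 0 or self.bonus_usd < 0:
            raise ValueError("rewards must be non-negative")
        if self.hit_expiry_seconds <= 0:
            raise ValueError("time-to-expire must be positive")


settings = ProjectSettings(
    min_total_points=20,
    min_positive_points=10,
    min_negative_points=5,
    reward_usd=0.10,
    bonus_usd=0.05,
    hit_expiry_seconds=3600,
    special_instructions="Mark every bird, including partially hidden ones.",
)
settings.validate()
```

The example values are invented and do not reflect the settings used in any actual project.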
After images are distributed, the client polls the server regularly to see if the HITs have been completed. Once a submission is detected, the client provides two means of assessing the quality of the work. The first is a window that shows the local area around each marked point, which allows for quick deletion of simple mistakes (see figure 6.3). The second is the phenotype training window, which displays the original image with an overlay of the training points (not shown, see Holmes et al., 2009 Figure 8, p9).
Figure 6.3: Worker submission check window
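The polling behavior described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the client's actual code: `fetch_completed` stands in for whatever request the client makes to the server, and the interval and poll limit are placeholder values.

```python
import time
from typing import Callable, Iterable, List, Set


def poll_for_submissions(fetch_completed: Callable[[], Iterable[str]],
                         pending: Set[str],
                         interval_seconds: float = 60.0,
                         max_polls: int = 10,
                         sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll until every pending HIT is completed or max_polls is reached.

    fetch_completed is a hypothetical callable returning the IDs of HITs
    that the server currently reports as completed. The return value lists
    the pending HIT IDs in the order their completion was detected.
    """
    remaining = set(pending)
    detected: List[str] = []
    for attempt in range(max_polls):
        for hit_id in fetch_completed():
            if hit_id in remaining:
                remaining.discard(hit_id)
                detected.append(hit_id)
        if not remaining:
            break
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before asking the server again
    return detected
```

Passing `sleep` as a parameter is a design choice for testability: a test can substitute a no-op so the loop runs without real delays.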
Usually within 5-10 seconds the user can recognize whether the worker did a satisfactory job and accept or reject accordingly. A rejection means zero compensation for the worker. The client also provides common canned feedback responses such as “Great job!” or “You missed too many objects. Try again.” If the worker went above and beyond, a bonus can be awarded. As an additional tool, the worker’s historical record (acceptances, rejections, bonuses, and feedback) is displayed.
6.3.2 The Central DistributeEyes Server
The server acts as an intermediary for passing data between the client and MTurk’s workers, renders the HTML training application, and serves the public homepage URL. User accounts are created via the public portal; after email verification, the user can download the DistributeEyes client.
6.3.3 The HTML Training Application and the Raw Data
Each image can be rendered inside a training application for marking by a worker. The training application includes every example image of each phenotype, a magnified view of the image where the worker marks or deletes points, magnification adjustments, a thumbnail, a scaled-to-fit view, and a help link.
Before the worker is allowed to use the application, he must “qualify” by watching a five-minute tutorial video that teaches training concepts and techniques, passing a six-question quiz (see figure 6.4), and agreeing that DistributeEyes can monitor his usage. The worker is forced to watch the entire video before having the opportunity to answer the questions, and he must pass the quiz in order to work on a HIT. Workers who pass the qualification quiz are recorded on the central server and do not need to re-qualify to work on future HITs.
Figure 6.4: The training video and example quiz questions
Then, the worker moves to the training application (see figure 6.5). The worker marks points via point-and-click until all goals set by the user are met and all special instructions are fulfilled. The “submit training” button then becomes enabled, and the worker can upload his points to the central server. When the worker submits, he is led through a series of dialogs inquiring if he has made the common mistakes that lead to rejections, and only then is he allowed to submit.
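The condition that enables the “submit training” button can be sketched as a check of the marked points against the project minimums. This is a hedged illustration: the function name and the representation of points as boolean flags are assumptions, and the special instructions are free text that a check like this cannot verify.

```python
from typing import Sequence


def goals_met(is_object: Sequence[bool],
              min_total: int,
              min_positive: int,
              min_negative: int) -> bool:
    """Return True when the marked points satisfy all numeric goals.

    is_object holds one flag per marked point: True for a positive point
    (object), False for a negative point (non-object).
    """
    positives = sum(1 for flag in is_object if flag)
    negatives = len(is_object) - positives
    return (len(is_object) >= min_total
            and positives >= min_positive
            and negatives >= min_negative)
```

For example, `goals_met([True, True, False], 3, 2, 1)` holds, whereas a worker who has marked only two positive points against a minimum of three total points would still see the button disabled.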
While the worker trains, we collect information about his actions (e.g., when and where he clicked, when he switched phenotypes, and when he used the magnification tools), and we log these actions every ten seconds.
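One way to realize this kind of periodic logging is to buffer actions locally and send them in batches. The sketch below is an assumption about how such a logger could work, not a description of the actual training application (which runs in the browser); the class name, event fields, and the `send` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ActionLogger:
    """Hypothetical buffer that forwards worker actions in timed batches."""
    send: Callable[[List[dict]], None]   # stand-in for the upload to the server
    flush_interval: float = 10.0         # seconds between batches
    _buffer: List[dict] = field(default_factory=list, init=False)
    _last_flush: float = field(default=0.0, init=False)

    def record(self, kind: str, timestamp: float,
               x: Optional[float] = None, y: Optional[float] = None) -> None:
        """Store one action (e.g. a click) and flush if the interval elapsed."""
        self._buffer.append({"kind": kind, "t": timestamp, "x": x, "y": y})
        self.flush_if_due(timestamp)

    def flush_if_due(self, now: float) -> None:
        """Send buffered actions once flush_interval seconds have passed."""
        if self._buffer and now - self._last_flush >= self.flush_interval:
            self.send(list(self._buffer))
            self._buffer.clear()
            self._last_flush = now
```

Batching keeps the number of requests to the server bounded regardless of how quickly the worker clicks.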
To be clear, we use “training” and “labeling” interchangeably to mean a worker’s submitted points for one image. These points are coordinates $(x_{i_p}, y_{i_p})$, where $i_p \in \{1, \ldots, n_p\}$ and $p \in \{1, \ldots, P\}$; $P$ denotes the total number of phenotypes (usually one) and $n_p$ denotes the total number of points that the worker labeled for phenotype $p$.
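To make the notation concrete, a labeling can be represented as a mapping from each phenotype index $p$ to its list of $n_p$ coordinate pairs. This is an illustrative representation only; the type names and the example coordinates are invented and do not reflect the server's storage format.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]          # one (x, y) coordinate
Labeling = Dict[int, List[Point]]    # phenotype index p -> its marked points


def points_per_phenotype(labeling: Labeling) -> Dict[int, int]:
    """Return n_p, the number of labeled points, for each phenotype p."""
    return {p: len(points) for p, points in labeling.items()}


# A labeling with P = 1 phenotype and n_1 = 3 points (made-up coordinates).
example: Labeling = {1: [(120.0, 44.5), (310.0, 97.0), (58.0, 201.0)]}
```

Here `len(example)` corresponds to $P$ and `points_per_phenotype(example)[1]` corresponds to $n_1$.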
Figure 6.5: The HTML training application embedded into MTurk. The worker is training the project “birds” as part of our experiment (see section 6.5).
6.3.4 Interfacing with MTurk
Amazon Mechanical Turk (MTurk) is a marketplace that coordinates the use of human intelligence to perform tasks that computers are unable to do. This labor paradigm has been termed “crowdsourcing,” which Amazon whimsically refers to as “artificial, artificial intelligence.”
Since it is the largest such crowdsourcing marketplace on the Internet, and it provides a convenient Application Programming Interface (API), it was chosen as the venue to host DistributeEyes tasks. To create an MTurk HIT, the HTML Training Application is rendered inside an IFrame, which is wrapped inside MTurk’s worker interface. From the perspective of the worker, everything occurs within MTurk; he is completely unaware of DistributeEyes.
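MTurk supports embedding an externally hosted page in an IFrame through its `ExternalQuestion` question type. The text does not state which question type DistributeEyes used, so the following is a sketch under that assumption; the helper name, the example URL, and the frame height are hypothetical.

```python
from xml.sax.saxutils import escape

# Namespace of MTurk's ExternalQuestion schema.
EXTERNAL_QUESTION_XMLNS = (
    "http://mturk.com/AWSMechanicalTurkDataSchemas/"
    "2006-07-14/ExternalQuestion.xsd"
)


def external_question_xml(url: str, frame_height: int = 800) -> str:
    """Build the question XML that asks MTurk to show `url` in an IFrame."""
    return (
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">'
        f"<ExternalURL>{escape(url)}</ExternalURL>"
        f"<FrameHeight>{frame_height}</FrameHeight>"
        "</ExternalQuestion>"
    )


# Hypothetical URL for a training page; not an actual DistributeEyes endpoint.
question = external_question_xml(
    "https://example.org/train?image=1&project=birds")
```

The resulting string would be supplied as the question payload when the HIT is created through the MTurk API. The URL is escaped because characters such as `&` are not valid inside XML text.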
After submission, the client will detect the completion within 10 minutes. Acceptances and rejections are relayed from the client to MTurk via the central server using the MTurk API. The server records each acceptance or rejection. Workers are warned upon each rejection. If a worker receives many rejections, he may be banned from working on DistributeEyes HITs.
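The bookkeeping described above can be sketched as a per-worker ledger. The text does not say how many rejections count as “many,” so the threshold below is a placeholder assumption, as are the class and method names.

```python
from collections import defaultdict
from typing import Dict, List


class WorkerLedger:
    """Hypothetical record of acceptances and rejections per worker."""

    def __init__(self, ban_threshold: int = 3) -> None:
        # ban_threshold is an assumed value; the source gives no number.
        self.ban_threshold = ban_threshold
        self.history: Dict[str, List[bool]] = defaultdict(list)

    def record(self, worker_id: str, accepted: bool) -> str:
        """Store one decision and return the resulting worker status."""
        self.history[worker_id].append(accepted)
        if accepted:
            return "accepted"
        rejections = self.history[worker_id].count(False)
        return "banned" if rejections >= self.ban_threshold else "warned"

    def is_banned(self, worker_id: str) -> bool:
        """Report whether the worker has reached the rejection threshold."""
        decisions = self.history.get(worker_id, [])
        return decisions.count(False) >= self.ban_threshold
```

A ledger of this kind would also supply the historical record (acceptances and rejections) that the client displays to the user when reviewing a submission.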