Methods - Tracking-by-Assignment as a Probabilistic Graphical Model with Applications in Developmental Biology

5. Evidence

5.2. Methods

plugin that allows the manual tracking of divisible objects. The user provides a point-like marker for every object at every time slice and connects the markers between consecutive time slices. Fig. 5.3 on the facing page shows one example volume of the zebrafish dataset loaded in View5D together with markers for all nuclei.

With its help we tracked the cell nuclei in both the zebrafish (sec. 5.1.1 on page 42) and the Drosophila dataset (sec. 5.1.2 on page 43). In the case of the zebrafish dataset we placed the markers directly on the raw images, i.e. we identified and tracked the nuclei manually. The full tracking protocol is reproduced in Appendix A. For the Drosophila dataset we tracked only the segmented objects and merely distinguished between false positive and true positive objects. That is, the Drosophila ground truth lets us judge the performance of the chain graph tracking model exactly, since we used the very same input information that is visible to the tracking algorithm, whereas the zebrafish ground truth is useful as a benchmark for a complete reconstruction pipeline consisting of a segmentation and a tracking step.

When placing the markers directly on the raw images we have to establish a correspondence between the markers and the segmented objects. We placed the markers on the pixel with maximum intensity inside a nucleus and calculated the maximum intensity position inside the segmented objects, too. Assuming a perfect segmentation, we could then simply match each marker to its nearest segmented object. Since we have to consider false positive (phantom nuclei) and false negative (missed nuclei) segmentations, we have to allow for markers and segmented objects that are not matched. This can be formulated as a weak asymmetric bipartite matching where markers resp. segmented objects are matched to their nearest neighbor segmented object resp. marker as long as the distance is below a certain threshold. Otherwise, they are matched with a hypothetical sink object resp. marker. Fig. 5.4a on the next page illustrates the idea.

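This matching problem can be formulated as a binary integer program

\[
\begin{aligned}
\min_{x} \quad & \sum_{i \in \mathrm{lhs}} \sum_{j \in \mathrm{rhs}} c_{i \leftrightarrow j} \cdot x_{i \leftrightarrow j} \\
\text{s.t.} \quad & \sum_{j} x_{i \leftrightarrow j} = 1, \quad \sum_{i} x_{i \leftrightarrow j} = 1, \quad \forall i, j \setminus \mathrm{sink} \\
& x_{i \leftrightarrow j} \in \{0, 1\}, \quad \forall i, j
\end{aligned}
\tag{5.1}
\]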

where $c_{i \leftrightarrow j}$ is the distance between the marker $i$ and the segmented object $j$, measured from the maximum intensity position. The sets lhs and rhs contain one index for every marker resp. segmented object and one additional index each representing the two sink nodes. The problem can be easily solved in practice with an off-the-shelf integer programming package. As a result we obtain three sets of objects: objects present in both the ground truth and the segmented data, phantom segmentations (false positives), and objects that were not segmented (false negatives). An example result is shown in Fig. 5.4b for the first time slice of the zebrafish dataset and a distance threshold of 25 pixels.

(a) Assigning ground truth markers and segmented objects formulated as an asymmetric bipartite matching. Markers (lhs) and segmented objects (rhs) are matched with their respective nearest neighbors below a certain distance threshold. Otherwise, they are matched with a sink node.

(b) Example result of matching ground truth and segmentation (plot title: "Manual vs. Automatic Nuclei Detection ct-keller-animal - timestep 0/24 (threshold: 25 px)"; both axes range from 0 to 1000; legend: ground truth, our result). Circles indicate ground truth and crosses segmented objects. Successful matches are green. Red crosses are false positives and red circles false negatives.

Figure 5.4.: Matching ground truth nuclei locations with segmented object locations.
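The matching with sink nodes described above can be illustrated with a minimal Python sketch. It is not the implementation used here: instead of an integer programming package it pads the cost matrix with sink copies and searches all assignments exhaustively, which is exact but only feasible for a handful of points. The function name `match_with_sinks`, the example coordinates, and the choice of a sink cost of half the threshold (so that a pair is preferred over two sink assignments exactly when its distance is below the threshold) are assumptions made for illustration.

```python
from itertools import permutations
from math import dist


def match_with_sinks(markers, objects, threshold):
    """Match markers (lhs) to segmented objects (rhs), allowing sink matches.

    Assumption: matching a node to a sink costs threshold / 2, so a pair is
    cheaper than two sink assignments exactly when its distance < threshold.
    """
    n, m = len(markers), len(objects)
    size = n + m  # pad to a square problem with sink copies
    sink_cost = threshold / 2.0

    def cost(row, col):
        if row < n and col < m:
            return dist(markers[row], objects[col])  # marker <-> object
        if row >= n and col >= m:
            return 0.0  # sink copy <-> sink copy, free filler
        return sink_cost  # marker <-> sink or sink <-> object

    # Exhaustive search over all assignments; illustration only (tiny inputs).
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(size)):
        total = sum(cost(row, perm[row]) for row in range(size))
        if total < best_cost:
            best_cost, best_perm = total, perm

    matches = [(i, best_perm[i]) for i in range(n) if best_perm[i] < m]
    matched_objects = {j for _, j in matches}
    false_negatives = [i for i in range(n) if best_perm[i] >= m]
    false_positives = [j for j in range(m) if j not in matched_objects]
    return matches, false_negatives, false_positives


# Hypothetical maximum intensity positions in pixels.
markers = [(0, 0), (100, 100), (500, 500)]
objects = [(3, 4), (102, 100), (900, 900)]
matches, false_negatives, false_positives = match_with_sinks(
    markers, objects, threshold=25
)
```

For realistic numbers of nuclei the exhaustive loop would have to be replaced by a proper integer programming or assignment solver; the three returned lists correspond to the three sets of objects named above.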

5.2.3. Measuring Tracking Performance

Performance of a contestant tracking relative to a ground truth tracking is measured in the following terms:

Recall: The number of tracking events that are found by the contestant and are also present in the ground truth relative to the number of all ground truth events.

Precision: The number of tracking events that are found by the contestant and are also present in the ground truth relative to the number of all contestant events.

Precision and recall make sense intuitively. In addition, we will derive the two measures by reasoning over the cardinalities of sets containing tracking events and obtain the f-measure as a unified error measure.

Tracking events are defined in the sense of Sec. 3.2 but limited to types that describe transitions, i.e. moves, divisions, appearances, and disappearances. In particular, true positive detections and false positive detections are not tracking events.

A tracking event can only be correct if all participating objects are true positive detections. Therefore, we do not consider true and false positive detections in the tracking precision and recall since they are already implicitly taken into account.

Formally, we have two sets: the set $G$ of tracking events present in the ground truth and the set $C$ of tracking events extracted by the contestant tracking. Furthermore, we can establish a matching between elements of the two sets when they describe the same tracking event type involving the same objects, i.e. the true positive tracking events.³ That is, we split each set again in two distinct subsets $G_m$, $G_{\bar m}$, $C_m$, and $C_{\bar m}$, with $m$ and $\bar m$ indicating match and no match.

Since they contain the successfully tracked events, the sets $G_m$ and $C_m$ have the same cardinality. Consequently, the set $G_{\bar m}$ contains true tracking events that were not found correctly by the contestant tracking method, and $C_{\bar m}$ contains the tracking events that didn't actually happen. In summary,

\[
G = G_m \cup G_{\bar m}, \qquad C = C_m \cup C_{\bar m}, \qquad |G_m| = |C_m|. \tag{5.2}
\]

This constellation is illustrated in Fig. 5.5 on the next page.
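The split into matched and unmatched subsets can be sketched in a few lines of Python. The event encoding as a tuple of event type, time step, and participating object identifiers is a hypothetical choice for this illustration, as are the example events; it presumes that ground truth and contestant refer to objects by shared identifiers, so that equal tuples describe the same tracking event.

```python
# Hypothetical encoding: (event type, time step, ids of participating objects).
ground_truth = {
    ("move", 0, (1, 1)),
    ("division", 0, (2, 3, 4)),
    ("disappearance", 1, (1,)),
}
contestant = {
    ("move", 0, (1, 1)),
    ("move", 0, (2, 3)),
    ("disappearance", 1, (1,)),
}


def split_events(G, C):
    """Split G and C into matched (m) and unmatched (not m) subsets."""
    matched = G & C  # same event type involving the same objects
    G_m, G_not_m = set(matched), G - matched
    C_m, C_not_m = set(matched), C - matched
    return G_m, G_not_m, C_m, C_not_m


G_m, G_not_m, C_m, C_not_m = split_events(ground_truth, contestant)
```

In this toy example the division is a true event that the contestant missed, and the contestant's move of object 2 is an event that did not actually happen.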

We can now formally define tracking recall and precision in terms of the ground truth and contestant sets:

\[
\mathrm{rec}(G, G_m) = \frac{|G_m|}{|G|}, \qquad \mathrm{prec}(C, C_m) = \frac{|C_m|}{|C|} \tag{5.3}
\]
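As a small numerical illustration of these definitions, the following Python sketch computes both measures from set cardinalities. The function names and the event counts are invented for the example and are not results from the datasets.

```python
def recall(n_ground_truth, n_matched):
    """Matched events relative to all ground truth events, |G_m| / |G|."""
    return n_matched / n_ground_truth


def precision(n_contestant, n_matched):
    """Matched events relative to all contestant events, |C_m| / |C|."""
    return n_matched / n_contestant


# Hypothetical counts: 10 ground truth events, 16 contestant events, 8 matched.
rec = recall(10, 8)
prec = precision(16, 8)
```

With these counts the contestant finds most of the true events (recall 0.8) but half of what it reports did not happen (precision 0.5).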

In empirical evaluations it is desirable to have only a single error measure. This allows a unique ranking of competing methods and provides a clear target objective for machine learning approaches. Of course, one could combine precision and recall in an ad hoc manner like the sum or the arithmetic mean. But by inspecting the definitions of recall and precision in terms of set cardinalities we can define an analogous performance measure covering both sets $G$ and $C$:

\[
\mathrm{perf}(G, G_m, C, C_m) = \frac{|G_m| + |C_m|}{|G| + |C|} \tag{5.4}
\]

³ This shouldn't be confused with true positive objects, which are a necessary but not sufficient


Figure 5.5.: Ground truth vs. contestant tracking method. The black dots represent tracking events like moves, divisions, and (dis-)appearances. There are two sets of tracking events, those in the ground truth and those found by the contestant tracking method. The events that were tracked correctly by the tracking method are present in both sets as indicated by the matching lines.

This combined performance score can also be expressed in terms of precision and recall. Using $|G_m| = |C_m|$ and expanding the fraction by $\frac{|C_m|}{|G| \cdot |C|}$, we obtain

\[
\mathrm{perf}(G, G_m, C, C_m) = \frac{2 \cdot \frac{|C_m|}{|C|} \cdot \frac{|G_m|}{|G|}}{\frac{|C_m|}{|C|} + \frac{|G_m|}{|G|}} = \frac{2 \cdot \mathrm{prec}(C, C_m) \cdot \mathrm{rec}(G, G_m)}{\mathrm{prec}(C, C_m) + \mathrm{rec}(G, G_m)} \tag{5.5}
\]
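The intermediate steps of this expansion can be spelled out as follows. This is only a sketch of the algebra; it uses nothing beyond $|G_m| = |C_m|$ and the definition of the combined score as the ratio of matched events to all events in both sets.

\[
\frac{|G_m| + |C_m|}{|G| + |C|}
= \frac{2\,|C_m|}{|G| + |C|}
= \frac{2\,|C_m| \cdot \frac{|C_m|}{|G| \cdot |C|}}{\left(|G| + |C|\right) \cdot \frac{|C_m|}{|G| \cdot |C|}}
= \frac{2 \cdot \frac{|C_m|}{|C|} \cdot \frac{|C_m|}{|G|}}{\frac{|C_m|}{|C|} + \frac{|C_m|}{|G|}}
= \frac{2 \cdot \frac{|C_m|}{|C|} \cdot \frac{|G_m|}{|G|}}{\frac{|C_m|}{|C|} + \frac{|G_m|}{|G|}}
\]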

We now see that the combined error measure is actually the harmonic mean of precision and recall and is therefore equivalent to the well-known f-measure (Costa et al., 2007).
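The equivalence can be checked numerically with a short Python sketch that uses exact rational arithmetic. The function names and the event counts are hypothetical; the check only confirms the identity for one example and is not a substitute for the derivation.

```python
from fractions import Fraction


def perf(n_g, n_gm, n_c, n_cm):
    """Combined score: matched events over all events in both sets."""
    return Fraction(n_gm + n_cm, n_g + n_c)


def f_measure(prec, rec):
    """Harmonic mean of precision and recall."""
    return 2 * prec * rec / (prec + rec)


# Hypothetical counts: |G| = 10, |C| = 16, |G_m| = |C_m| = 8.
n_g, n_c, n_m = 10, 16, 8
rec = Fraction(n_m, n_g)
prec = Fraction(n_m, n_c)

combined = perf(n_g, n_m, n_c, n_m)
harmonic = f_measure(prec, rec)
arithmetic = (prec + rec) / 2  # the ad hoc alternative, for comparison
```

Here the combined score and the harmonic mean agree exactly, while the arithmetic mean is larger, which shows that the harmonic mean penalizes the imbalance between precision and recall more strongly.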

In document Tracking-by-Assignment as a Probabilistic Graphical Model with Applications in Developmental Biology (Page 55-58)