Introduction - Distributed Multi-object Tracking with Multi-camera Systems Composed of Overlapp

In the last chapter, we employed multiple features and updated their parameters adaptively to improve the robustness of the tracking results. However, we did not consider that some information about the environment and the system setup may help to simplify the matching process and improve robustness further. For example, if we know the camera topology and the map of the roads beforehand, we can exclude candidates that cannot appear from a certain direction. Or, if we know that there are intersections or traffic lights in the blind region, we can choose a Gaussian mixture model (GMM) over a single Gaussian to improve the accuracy of the travel time model.
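The two uses of prior knowledge mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this work: the camera names, the `UPSTREAM` table, the candidate record layout, the mixture parameters, and the `min_likelihood` threshold are all hypothetical values chosen for illustration. The sketch first discards candidates whose upstream camera cannot feed the current camera, and then scores the remaining ones with a travel time model that is either a single Gaussian or a GMM.

```python
import math

# Hypothetical camera topology: downstream camera -> set of upstream cameras.
UPSTREAM = {"cam3": {"cam1", "cam2"}}


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def travel_time_likelihood(dt, components):
    """Likelihood of travel time dt under a mixture of Gaussians.

    components is a list of (weight, mean, std) tuples; a list with a
    single (1.0, mean, std) entry reduces to the single-Gaussian model.
    """
    return sum(w * gaussian_pdf(dt, m, s) for w, m, s in components)


def prune_candidates(candidates, downstream, now, components, min_likelihood=1e-4):
    """Keep candidates that are topologically and temporally plausible."""
    allowed = UPSTREAM.get(downstream, set())
    kept = []
    for cand in candidates:
        if cand["camera"] not in allowed:
            continue  # topology rules this direction out
        dt = now - cand["exit_time"]
        if dt <= 0:
            continue  # cannot arrive before leaving the upstream view
        if travel_time_likelihood(dt, components) >= min_likelihood:
            kept.append(cand)
    return kept


# Illustrative values only: free-flow traffic vs. a stop at a traffic light.
GMM = [(0.7, 20.0, 3.0), (0.3, 45.0, 5.0)]
candidates = [
    {"id": "a", "camera": "cam1", "exit_time": 0.0},
    {"id": "b", "camera": "cam4", "exit_time": 5.0},
    {"id": "c", "camera": "cam2", "exit_time": -200.0},
]
plausible = prune_candidates(candidates, "cam3", now=21.0, components=GMM)
```

With these made-up numbers, a vehicle delayed by a traffic light (travel time near 45 s) receives a negligible likelihood under the single Gaussian centered at 20 s, but a substantial one under the two-component mixture, which is the motivation for choosing the GMM when such structures exist in the blind region.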

We call this type of information domain knowledge. Domain knowledge includes the configuration of the cameras in the network, possible entrances to and exits from the camera views, occluding structures in the scene (e.g. columns), useful traffic rules, reasonable assumptions based on the environmental conditions, etc.

There are several benefits to incorporating domain knowledge. First of all, our system is capable of handling more complicated object tracking or event detection tasks. As mentioned earlier, related work has focused on object tracking and object matching across disjoint camera views. Most of it addresses only the problem of object re-identification or association, i.e. the objects observed in the current (downstream) camera must already have been detected in the previous (upstream) camera(s). These algorithms and experiments focus only on finding object correspondences between adjacent cameras. In other words, new objects coming from blind regions, or observed objects disappearing in the blind regions, are not considered, except in the work by Huang and Russell [89]. Related work that solves only the object association problem implicitly assumes a simple form of domain knowledge: there is only a small gap between two adjacent cameras, and no entrances or exits exist in this gap. In our system, we consider more complicated scenarios that involve entrances, exits, and intersections in the blind regions. In these cases, the objects detected in the downstream camera may be new objects coming from the blind region (i.e. they do not exist in the candidate lists received from the upstream camera(s)), and/or some of the objects in the candidate list may never show up in the downstream camera view. These scenarios make our application more challenging and more realistic.

In this chapter, we propose a distributed camera system for object tracking across non-overlapping views. Considering the uncertainties introduced by vision algorithms, a probabilistic result is preferred to a deterministic one. To incorporate the uncertainties of each stage (foreground detection, tracking, and object matching) in a principled way, we employ a probabilistic Petri Net (pPN). In our system, every camera performs multi-object tracking individually, and object matching is then performed if candidate data are received from the previous camera(s). The tracking process within a single camera and object matching across adjacent cameras are modeled by the pPN, which outputs a score for each object’s tracking and matching result. In our example three-camera setup, vehicles travel from Cameras 1 and 2 to Camera 3. Camera 3 maintains a pPN, which includes the transitions from the other cameras, or from entrances in the blind region, into Camera 3. Similarly, if there are more cameras in the network, each camera that has upstream adjacent cameras needs to maintain a pPN that includes the possible transitions from the previous cameras to the current camera.
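To make the idea of a per-camera net concrete, the following is a minimal sketch of a Petri net whose transitions carry probabilities. It rests on assumptions that are not taken from this chapter: the place and transition names, the probability values, and in particular the rule that the score is the product of the probabilities of the fired transitions are illustrative choices, and the formal pPN definition reviewed later may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    inputs: list   # names of input places
    outputs: list  # names of output places
    prob: float    # assumed probability attached to this transition


@dataclass
class ProbabilisticPetriNet:
    marking: dict = field(default_factory=dict)  # place name -> token count
    score: float = 1.0                           # accumulated score (assumed product rule)

    def enabled(self, t):
        """A transition is enabled when every input place holds a token."""
        return all(self.marking.get(p, 0) > 0 for p in t.inputs)

    def fire(self, t):
        """Consume one token per input place, produce one per output place."""
        if not self.enabled(t):
            raise ValueError(f"transition {t.name!r} is not enabled")
        for p in t.inputs:
            self.marking[p] -= 1
        for p in t.outputs:
            self.marking[p] = self.marking.get(p, 0) + 1
        self.score *= t.prob


# Hypothetical fragment of the net kept by Camera 3 for one vehicle.
leave_cam1 = Transition("leave_cam1", ["tracked_in_cam1"], ["in_blind_region"], 0.9)
match_cam3 = Transition("match_in_cam3", ["in_blind_region"], ["tracked_in_cam3"], 0.8)

net = ProbabilisticPetriNet(marking={"tracked_in_cam1": 1})
net.fire(leave_cam1)
net.fire(match_cam3)
```

In this toy fragment the token ends in the hypothetical place `tracked_in_cam3`, and `net.score` combines the confidence of the two stages. An entrance from the blind region could be modeled in the same style as an additional transition that puts a token into `tracked_in_cam3` without requiring one from an upstream camera.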

Another advantage of employing the pPN is that the domain knowledge can be efficiently incorporated into the algorithm. When a rich set of domain knowledge is available, the pPN also helps to implement and control the workflow.

The proposed approach can be generalized to various surveillance applications involving disjoint camera views, such as indoor human tracking or outdoor human/vehicle tracking. In this chapter, we first present wide-area tracking of vehicles as an example. This example shows how we fuse multiple features, train the parameters, and handle blind regions and “never-seen-before” objects. Then, a similar approach, together with a different set of domain knowledge, is employed for tracking people in a second example with a disjoint camera setup. This example is more challenging because, unlike vehicles, which move in certain lanes in fixed directions, people’s routes are more diverse. In the traffic scenario, the upstream camera assumes that a car will not reappear after leaving the camera view. A person, on the other hand, can come back to the view at any time. For such cases, we need to keep object trackers in a list for a certain amount of time after the objects leave the view. These examples and their results illustrate how our framework can be applied to different scenarios with different domain knowledge. We also present the pPN for each scenario, in which the domain knowledge is incorporated into the workflow.
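The retention of trackers for people who may return to the view can be sketched as a small time-limited store. This is an illustrative sketch only: the class name `RetiredTrackerList`, its methods, the use of plain timestamps in seconds, and the 30-second retention window are hypothetical and are not taken from the system described here.

```python
class RetiredTrackerList:
    """Keeps trackers of objects that left the view for a limited time."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._retired = {}  # object_id -> (exit_time, tracker_state)

    def retire(self, object_id, tracker_state, exit_time):
        """Store the tracker of an object that has just left the view."""
        self._retired[object_id] = (exit_time, tracker_state)

    def purge(self, now):
        """Drop trackers whose retention window has expired."""
        expired = [
            oid for oid, (exit_time, _) in self._retired.items()
            if now - exit_time > self.retention_seconds
        ]
        for oid in expired:
            del self._retired[oid]

    def reclaim(self, object_id, now):
        """Return and remove the stored tracker if it is still retained, else None."""
        self.purge(now)
        entry = self._retired.pop(object_id, None)
        return None if entry is None else entry[1]


# Hypothetical usage: person 7 leaves at t = 0 s and returns at t = 10 s.
retired = RetiredTrackerList(retention_seconds=30.0)
retired.retire(7, {"appearance": "placeholder"}, exit_time=0.0)
state = retired.reclaim(7, now=10.0)
```

In the traffic scenario, where a car is assumed not to reappear in the upstream view, such a list would simply not be needed for that camera; which cameras keep one is itself a piece of domain knowledge.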

In the rest of this chapter, we first briefly review the definitions of the Petri Net (PN) and the probabilistic Petri Net (pPN). Our pPN-based framework is then explained using an example of wide-area vehicle tracking. A people tracking example is also presented, which employs a different set of domain knowledge and shows how occlusion can be handled by the pPN. Experiments and comparisons with related work are then reported. The results show that our system is able not only to track objects across disjoint cameras with high accuracy, but also to distinguish new objects from already observed objects successfully. Finally, we discuss scalability and how to collect domain knowledge.
