Scene Modeling with Mobile Robot - Autonomous 3D Modeling of Unknown Objects for Active Scene E

Here, the application of the NBV algorithm to obtain 3D models of larger objects, in particular complete scenes or workspaces, for collision-free motion planning is shown. In contrast to the other experiments in this chapter, the surface quality-based NBV algorithm is not required here; only the PVS is updated, and NBVs are planned from it in each step. Once the number of free voxels in the PVS stagnates, the algorithm terminates, and an optimized model is created from the PVS and passed to the motion planner (see Section 3.2.2).

As described in Section 5.1.2, the DLR/Kuka omniRob platform autonomously navigates within its environment based on a previously recorded 2D map. However, the 2D map only allows collisions to be avoided when moving the platform, not during motions of the LWR arm. The robot arm needs to be moved, e.g., for picking up parts at one place and carrying them to another. Therefore, a precise 3D environment model of each scene the LWR will interact with is required in addition to the 2D map. This could be achieved by obtaining a complete 3D map of the environment. However, this is very costly, as a high-resolution model would be required for the complete site and, more importantly, in real production environments many workspaces are movable and might not be in exactly the same position as last time. Thus, we propose autonomously creating a 3D model of each scene separately using the mobile platform and afterward searching for the position of the scene each time interaction is required. Autonomous modeling is needed because workspaces such as shelves, conveyor belts or workbenches are often custom-made and no CAD data is available.

As the environment models of the initially unknown scenes are autonomously acquired in a preprocessing step, some means of referencing the robot to the 3D model is required during scene interaction such as object picking or placing. Thus, we propose using AprilTags (Olson, 2011) and attaching them to each workspace or scene (see Fig. 5.15 on page 122). AprilTags are similar to QR (Quick Response) codes but are designed to encode less data, which allows for precise detection of their 3D position with respect to the camera. Before modeling the scene, the human worker needs to attach the AprilTags in the scene, remove all objects not relevant for the scene model and teach a platform position in front of each unknown scene. The position teaching can be done either by manually moving the platform to a position in front of the scene or by marking the position on the 2D map. The position should be selected so that the robot is approximately centered in front of the workspace along the long side with the PTU facing the scene (see Fig. 5.3 on page 102).

For scene modeling, we use the stereo camera system on the PTU, since moving the PTU to view in different directions is a lot faster than moving the LWR. In contrast to the other experiments in this chapter, the viewpoint space is already very restricted due to the use of the PTU. Thus, scan candidate generation as suggested in the previous chapter is not applicable. Furthermore, a triangle mesh, as used in the previous experiments, is not a suitable representation here: not all parts of the workspace can be modeled owing to the limited sensor workspace, and the unmodeled parts could lead to collisions with the robot arm. Therefore, only a PVS is updated and utilized for the NBV planning. The PVS is initialized with the state unknown for the area where the workspace is assumed to be. The human can configure the approximate size of the unknown area manually for each scene; otherwise a predefined size is used. Finally, the robot autonomously explores the unknown scene and creates a 3D model utilizing viewpoints from only one side of the scene. During our experiments carried out at Grundfos³ and the Automatica exhibition⁴, none of the workspaces was accessible to the robot from the back, which meant that the robot could also only interact with the scene from the front. Moreover, when using the PTU, the space beneath low tables or shelves cannot be freed completely, and thus the robot cannot move into these areas. However, this is not a problem, as the LWR is not able to move to these positions either.

Fig. 5.15 shows two workspaces (left), a conveyor belt and a shelf, for which environment models (right) were obtained. These are two examples of several workspaces the mobile robot needed to interact with at a Grundfos factory in Denmark. Here, the task of the robot was to pick up industrial parts (cans and cores), assemble them, drop them into a small load carrier, and transport them to the shelf. As one can see, in both examples the mobile robot cannot move behind the workspace.

For the viewpoint space, i.e., the set of scan candidates of the omniRob, we allowed different platform positions in the y-direction in 250 mm increments. The minimum and maximum values were limited by the width of the PVS. At each platform position we sampled 15 viewing angles for the PTU: pan angles between −20° and 20° and tilt angles between −60° and 20°, both with an increment of 20°. Here, for the NBV selection, we used just the IG part of the utility function from Equation (4.30) by setting ω to 0. Additionally, a penalty was applied to the utility value only if the platform needed to be moved for an NBV:

$$f_{utility} = 0.8 \cdot \left(1 - \frac{o_y}{3\,\mathrm{m}}\right) \cdot f_{utility}. \qquad (5.2)$$

The penalty value depends on the platform offset o_y, i.e., the distance the platform would need to move in the y-direction from the current position to the NBV candidate, assuming a maximum distance of 3 m. The maximum distance was chosen since none of the workspaces during our experiments at Grundfos and the Automatica exhibition was wider than 3 m. The penalty was added to avoid too many platform movements: each time the platform is moved, an additional scan matching is required to calculate the platform offset in relation to the taught position. Scan matching is required as the platform odometry itself is not accurate enough. As no triangle mesh is needed, in contrast to the other experiments in this chapter, the initial viewpoint could also be planned by selecting an NBV within the unknown PVS.

Figure 5.15: Two workspaces (left), a conveyor belt and a shelf, in a real production environment at Grundfos in Denmark are shown as examples. At both workspaces, the mobile robot needs to pick up or place parts. The final scene models (right) show that the modeling is able to cope even with the very shiny shelf or conveyor belt.

³ Grundfos, http://www.grundfos.com/, 2014
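The candidate set and the penalized utility described above can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments: the function names, the use of the absolute offset, and the `info_gain_fn` callback (standing in for the IG part of Equation (4.30)) are assumptions made for illustration only.

```python
from itertools import product

Y_STEP_MM = 250            # platform increment in y (from the text)
MAX_OFFSET_MM = 3000       # assumed maximum platform travel of 3 m (from the text)
PAN_DEG = (-20, 0, 20)     # 3 pan angles, 20 degree increment
TILT_DEG = (-60, -40, -20, 0, 20)  # 5 tilt angles -> 15 views per position


def scan_candidates(y_min_mm, y_max_mm):
    """Enumerate (y, pan, tilt) candidates across the width of the PVS."""
    ys = range(y_min_mm, y_max_mm + 1, Y_STEP_MM)
    return list(product(ys, PAN_DEG, TILT_DEG))


def penalized_utility(info_gain, current_y_mm, candidate_y_mm):
    """Apply the penalty of Eq. (5.2) only if the platform has to move.

    Assumption: the offset o_y is taken as an absolute distance.
    """
    offset = abs(candidate_y_mm - current_y_mm)
    if offset == 0:
        return info_gain
    return 0.8 * (1.0 - offset / MAX_OFFSET_MM) * info_gain


def select_nbv(candidates, current_y_mm, info_gain_fn):
    """Pick the candidate with the highest penalized utility.

    info_gain_fn is a hypothetical callback returning the information
    gain of a candidate (the IG part of the utility function).
    """
    return max(
        candidates,
        key=lambda c: penalized_utility(info_gain_fn(c), current_y_mm, c[0]),
    )
```

With equal information gain everywhere, `select_nbv` prefers a view from the current platform position, which reflects the stated purpose of the penalty: a platform move is only chosen when the expected gain outweighs the cost of the additional scan matching.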

After an NBV is selected, the platform and PTU are moved to the NBV and a range image, referenced to the WCS, is acquired with the stereo system on the PTU. For performance reasons, the space update uses the range image downscaled by a factor of four, and the NBV planning likewise works at a range image resolution reduced by a factor of four, as described in Section 4.5.3. Additionally, for each AprilTag which is visible in an acquired range image, its position and orientation are saved if the incidence angle is less than 60°. For too high incidence angles, the AprilTag detection does not perform well (Olson, 2011). This procedure is repeated until the number of free voxels in the space no longer changes significantly. Finally, all voxels with a probability p of being occupied below 25% are removed and, based on all remaining voxels in the PVS, a data size-optimized model is created by removing inner voxels.
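The termination test and the final model extraction can be sketched as follows. This is a minimal illustration under stated assumptions: the PVS is represented as a plain dictionary from voxel index to occupancy probability, the 1% relative tolerance for "no significant change" is a hypothetical value (the text does not quantify it), and a voxel counts as inner when all six face neighbors are also kept.

```python
# Face neighbors of a voxel (6-neighborhood; an assumption of this sketch).
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def has_stagnated(free_counts, rel_tol=0.01):
    """True once the free-voxel count changed by at most rel_tol
    between the last two scans (rel_tol is a hypothetical threshold)."""
    if len(free_counts) < 2:
        return False
    prev, last = free_counts[-2], free_counts[-1]
    return abs(last - prev) <= rel_tol * max(prev, 1)


def extract_surface_model(pvs, p_min=0.25):
    """Drop voxels with occupancy probability below p_min, then drop
    inner voxels, i.e. those whose six face neighbors are all kept.

    pvs: dict mapping (x, y, z) voxel index -> occupancy probability.
    """
    kept = {v for v, p in pvs.items() if p >= p_min}

    def is_inner(v):
        return all(
            (v[0] + dx, v[1] + dy, v[2] + dz) in kept for dx, dy, dz in NEIGHBORS
        )

    return {v for v in kept if not is_inner(v)}
```

Removing inner voxels does not change the occupied volume seen by a collision checker from outside, which is why it can serve as a pure data-size optimization.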

Fig. 5.16 shows a shelf which was autonomously modeled at the Automatica exhibition. At the bottom left-hand side, the final PVS with a resolution l_v of 10 mm is shown, based on 25 range images obtained during the NBV planning. Note that the probability of occupancy is color coded from black (almost free), through gray (unknown) to white (occupied). Voxels which are free are not shown. At the bottom right-hand side of Fig. 5.16, the final model is shown, created by removing voxels in the PVS assumed to be free or free-floating. Smaller groups of voxels which are free-floating in the air are erased, as an obstacle always needs to be in contact with the floor. However, the final model still contains all voxels which could not be freed during the space update, including voxels which do not actually represent an obstacle. This can be seen for the area above the top shelf, where the stereo camera could not view everything. Still, this is sufficient, as the 3D model just needs to allow for collision-free motion planning with the LWR within reachable regions. For the shelf, the robot cannot reach objects on the bottom or top shelf, and therefore only the middle two shelves are of interest for motion planning. Fig. 5.16 top right shows the application of the scene model within a world model for motion planning during manipulation of small load carriers.

Figure 5.16: A shelf (top left) is modeled and utilized for collision-free motion planning during manipulation with the mobile robot (top right). For this purpose, a PVS has been updated based on 25 range images (bottom left) acquired during NBV planning. Note that the probability of occupancy is color coded from black (almost free), through gray (unknown) to white (occupied). Voxels which are free are not shown. In the final model (bottom right), free-floating regions and almost free voxels have been removed.
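The removal of free-floating voxel groups can be illustrated by a flood fill that starts at the floor layer. The sketch below is a simplification and not the method used for Fig. 5.16: it discards every group without floor contact regardless of its size, and both the 6-neighborhood connectivity and the `floor_z` index are assumptions.

```python
from collections import deque

# Face neighbors of a voxel (6-neighborhood; an assumption of this sketch).
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def remove_free_floating(voxels, floor_z=0):
    """Keep only voxels connected to the floor layer z == floor_z.

    voxels: iterable of (x, y, z) integer voxel indices.
    Returns the set of voxels reachable from the floor.
    """
    voxels = set(voxels)
    seeds = [v for v in voxels if v[2] == floor_z]
    reachable = set(seeds)
    queue = deque(seeds)
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBORS:
            n = (x + dx, y + dy, z + dz)
            if n in voxels and n not in reachable:
                reachable.add(n)
                queue.append(n)
    return reachable
```

Unknown voxels that remain attached to the structure, such as the region above the top shelf, survive this step, which matches the conservative behavior described above.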

As the shelf is viewed from several positions, the AprilTags are also viewed multiple times. Therefore, after the NBV algorithm terminates, the position and orientation of each detected AprilTag are optimized by averaging over all measurements of the same AprilTag type. When the robot arrives at this workspace again during its working procedure, it can reference itself to the workspace model using the AprilTags. This works well even if the workspace has been moved, e.g., by a worker, as the robot first searches for the AprilTags in the expected area. Furthermore, Fig. 5.17 shows an example of a press table workspace acquired at the Automatica exhibition. Here the PVS resolution l_v was set to 10 mm in the area where the omniRob will manipulate, namely the table top, to which six AprilTags are attached. The resolution of the rest is set to 20 mm for performance reasons, as the robot does not have to move close for picking and placing there. In areas where the robot needs to be able to move very close to obstacles, it is mandatory that the voxel size l_v is chosen small enough, as otherwise the robot will not be able to move very close to the actual surface. The reason for this is that obstacles are always represented larger in the PVS than they actually are. In Fig. 5.17, the robot arm of another mobile robot is modeled at a defined position, as the two mobile robots will not perform manipulation at the same time. Note that the PVS is only modeled up to a certain height, which is sufficient given the workspace of the robot arm.

Figure 5.17: A scene consisting of a press table, a work bench and another mobile robot (left) has been autonomously modeled (right) using different PVS resolutions for different areas of interest.

The created 3D scene models were directly applied to the motion planner, as described in Section 3.2.2, in order to manipulate detected objects in the scenes. This has been evaluated by performing fetch and carry tasks with the omniRob for a complete week at the Automatica exhibition. During this time, the collision-free motion planning based on the autonomously acquired scene models was applied each time an object was to be picked from a workspace or placed onto a workspace using the LWR. The robot arm was always able to successfully find a path and never collided with its environment. The autonomous modeling of the workspaces, which was performed in a preprocessing step, took between 5 and 20 minutes depending on the size of the workspace.
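The averaging of repeated AprilTag measurements can be sketched as follows. How the orientations were averaged is not stated in the text; the sketch assumes unit quaternions in (w, x, y, z) order and uses a sign-aligned, normalized mean, which is only a reasonable approximation when the measured orientations are close to each other. The tuple layout of a measurement is likewise hypothetical.

```python
import math
from collections import defaultdict

MAX_INCIDENCE_DEG = 60.0  # measurements at or above this angle are discarded


def average_tag_poses(measurements):
    """Average position and orientation per tag.

    measurements: iterable of (tag_id, (x, y, z), (w, x, y, z), incidence_deg).
    Returns a dict tag_id -> (mean_position, mean_unit_quaternion).
    """
    grouped = defaultdict(list)
    for tag_id, pos, quat, incidence in measurements:
        if incidence < MAX_INCIDENCE_DEG:
            grouped[tag_id].append((pos, quat))

    result = {}
    for tag_id, poses in grouped.items():
        n = len(poses)
        mean_pos = tuple(sum(p[i] for p, _ in poses) / n for i in range(3))

        # q and -q encode the same rotation, so flip each quaternion
        # into the hemisphere of the first one before summing.
        ref = poses[0][1]
        acc = [0.0, 0.0, 0.0, 0.0]
        for _, q in poses:
            sign = 1.0 if sum(a * b for a, b in zip(ref, q)) >= 0 else -1.0
            for i in range(4):
                acc[i] += sign * q[i]
        norm = math.sqrt(sum(c * c for c in acc))
        result[tag_id] = (mean_pos, tuple(c / norm for c in acc))
    return result
```

For widely scattered orientations, an eigenvector-based quaternion average would be the more robust choice; for repeated views of a static tag the simple normalized mean is usually adequate.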

In document Autonomous 3D Modeling of Unknown Objects for Active Scene Exploration (Page 140-146)