Discussion about 3D lidar classification - AUTOMATIC CLASSIFICATION OF LIDAR POINT CLOUDS IN A

In this research, two methods were developed and applied. Two methods were developed because, although both classified the points successfully, the second method, RG, is more efficient and faster than the first, TM. In addition, a smarter algorithm that takes advantage of both methods may be developed in future work.

No reference classified data are available, so the results cannot be evaluated against reference data. Because of the huge number of points, manual assessment of each point is not feasible either. On the other hand, the purpose of this research is not to classify all rail and wire points perfectly, which is not possible, but to classify the majority of rail and wire points so that further analysis, such as 3D modelling, taking measurements, checking changes of gauge and inspecting the continuity of rails, can be done.

Given these points, the following measures are used to evaluate the results:

a. Completeness.

b. Correctness.

c. Speed.

Completeness is assessed by visual inspection because, as mentioned at the beginning of this section, there is no reference classified data and manual assessment is not feasible. Visual inspection reveals the biggest gap in the classified points on each object of interest. Note that classification gaps result from failures of the classification algorithm as well as from gaps in the data.
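The biggest classification gap can also be expressed numerically once each classified point has an along-track position. The following is a minimal sketch, assuming such positions are available; the function name and the sample values are hypothetical and are not part of the method used in this research, where gaps were determined visually.

```python
def largest_gap(positions_m):
    """Return the largest distance between consecutive classified points.

    positions_m: along-track positions (metres) of the points assigned to
    one object, e.g. one rail. Hypothetical input for illustration only.
    """
    ordered = sorted(positions_m)
    if len(ordered) < 2:
        return 0.0
    return max(b - a for a, b in zip(ordered, ordered[1:]))


# Made-up positions with one 30 cm stretch without classified points.
rail_positions = [0.00, 0.05, 0.10, 0.40, 0.45, 0.50]
print(round(largest_gap(rail_positions), 2))
```

Such a number would describe gaps from both causes together, since it cannot tell a failure of the algorithm from a gap in the data.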

Correctness of the classified rail points is evaluated using the results of 3D modelling of rails (Diaz Benito, To be published in March 2012). He fitted 2-meter-long pieces of 3D rail models to the classified rail points and calculated the geometric distance from each classified rail point to the 3D model. Excluding the points that are far from the model leaves only the points that are very close to the rails, i.e. some outliers are removed. 3D modelling was not done for wires, so the only way to assess the correctness of the classified wire points is visual inspection.

The first three steps are the same for both methods, i.e. pre classification, grid generation and height jump detection (HJD).

In addition, each method has some specific steps.

The methods and their results are discussed below: first the steps common to both methods, then the steps particular to each method and finally a comparison between the two methods.

6.1. Discussion about common steps of Template Matching and Region Growing methods

6.1.1. Discussion about pre classification

Pre classification was done mainly to separate the different objects of interest from each other based on their heights. The height difference between rails and wires is large, while the height difference between contact and catenary wires is very small. Therefore, some contact and catenary wire points were confused with each other in the pre classification step, which affected the final results for contact and catenary wires.

This wrong classification occurs only where catenary and contact wires meet. Catenary wires usually lie at a higher elevation than contact wires and the two rarely meet, so the number of points wrongly classified for this reason is very small.

One way to avoid this wrong classification is to use elevation percentiles. Once the points on contact and catenary wires have been classified, the 80th percentile of the elevations of all contact wire points is calculated. All contact wire points higher than this 80th percentile elevation are then reclassified as catenary wire points.
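The percentile-based correction can be sketched as follows. This is a minimal illustration, assuming each point is stored as an (elevation, label) pair; the function names, the labels and the linear-interpolation percentile are assumptions made for this sketch and do not come from the implementation used in this research.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def reclassify_high_contact_points(points, q=80.0):
    """Relabel contact wire points above the q-th elevation percentile.

    points: list of (z, label) pairs, label being 'contact' or 'catenary'.
    Returns a new list; the input is not modified.
    """
    contact_z = [z for z, label in points if label == "contact"]
    if not contact_z:
        return list(points)
    threshold = percentile(contact_z, q)
    return [
        (z, "catenary") if label == "contact" and z > threshold else (z, label)
        for z, label in points
    ]


# Made-up elevations: five true contact points and one misplaced high point.
sample = [(5.50, "contact"), (5.51, "contact"), (5.52, "contact"),
          (5.53, "contact"), (5.54, "contact"), (6.40, "contact")]
print(reclassify_high_contact_points(sample))
```

Note that a fixed percentile relabels roughly the top 20% of contact wire points whether or not they are wrong, so the percentile would probably need tuning against the share of mixed-up points actually present.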

6.1.2. Discussion about grid generation

As mentioned in section 4.4.1.2, the size of the grid cells and the minimum number of points a cell must contain to be processed further are crucial choices, as they have a great influence on the results of the HJD method. A dataset with lower resolution requires bigger grid cells as well as a suitable minimum number of points per cell.
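The interaction between cell size and minimum point count can be illustrated with a minimal sketch. The function name, the square planimetric cells and the sample values are assumptions made for this illustration; the grid actually used is described in section 4.4.1.2.

```python
import math
from collections import defaultdict


def build_grid(points, cell_size, min_points):
    """Bin (x, y, z) points into square cells and drop sparse cells.

    Returns a dict mapping (column, row) to the list of points in that
    cell, keeping only cells with at least min_points points.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y, z))
    return {key: pts for key, pts in cells.items() if len(pts) >= min_points}


# Made-up points: three close together and one isolated point.
pts = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.0), (0.4, 0.4, 0.1), (1.2, 0.1, 0.0)]
print(sorted(build_grid(pts, cell_size=0.5, min_points=3)))  # small cells
print(sorted(build_grid(pts, cell_size=2.0, min_points=3)))  # bigger cells
```

With the small cells the isolated point is discarded, while the bigger cells keep all four points in one cell, which is the trade-off a lower resolution dataset forces.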

6.1.3. Discussion about height jump detection

This method is also highly dependent on point cloud density. For the terrestrial dataset, the HJD method found the majority of height jumps. In the low resolution airborne dataset, where rail points are difficult to distinguish from track bed points, HJD failed to detect all rail points. As both TM and RG build on the results of the HJD step, if too few rail points are classified in the HJD step, TM and RG also fail to classify the rest of the rail and wire points.
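The dependence on density can be illustrated with a simplified sketch of a per-cell height jump test. This is not the HJD implementation of this research: the test used here (height range of a cell between two limits) and the limits of 0.10 m and 0.25 m are assumptions chosen only for illustration.

```python
def has_height_jump(cell_heights, min_jump=0.10, max_jump=0.25):
    """Return True if the height range in a cell looks like a rail.

    cell_heights: elevations (metres) of the points in one grid cell.
    min_jump, max_jump: hypothetical limits for the rail height.
    """
    if len(cell_heights) < 2:
        return False
    jump = max(cell_heights) - min(cell_heights)
    return min_jump <= jump <= max_jump


# Dense cell: points on both the track bed and the top of the rail.
dense_cell = [0.00, 0.01, 0.02, 0.15, 0.16]
# Sparse cell over the same rail: only track bed points were sampled.
sparse_cell = [0.00, 0.01]

print(has_height_jump(dense_cell))
print(has_height_jump(sparse_cell))
```

In the sparse cell no point happens to fall on the rail, so no jump is found even though a rail is present, which mirrors the behaviour reported for the airborne dataset.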

To summarize, the final results of TM and RG are highly dependent on the results of the HJD step, and the results of HJD depend highly on point density. Figure 6-1 shows how rails appear in high resolution terrestrial point clouds. As is evident in the image, the points on the top and body of the rail can be clearly distinguished from the ones on the track bed. In contrast, Figure 6-2 depicts how rails appear in the low resolution airborne dataset. In this case, although some points on top of the rails can be recognized, the points on the body of the rail are mixed with track bed points.

Figure 6-1. Appearance of rails in high resolution terrestrial point clouds

6.2. Discussion about Template Matching

6.2.1. Methodology

First, regarding the specifications of the defined rail pattern, three points should be noted:

a. For the defined pattern, only contact wires were used because their shape is more homogeneous than that of catenary wires, in the sense of less variation in height and planimetric position. Contact wires were included to make the defined rail pattern more similar to the real configuration of the railway environment and, as a result, to obtain more accurate results.

b. The rails and contact wires are considered wider than they really are, to increase the correlation between the greyscale image and the kernel. In other words, the length and width of the rails in the pattern indicate the area where rails are expected to lie. With a bigger width, more rails and wires lie in the areas where they are expected, so the correlation between the greyscale image and the kernel increases, which makes it easier to differentiate between true and wrong rail patterns.

c. Other kinds of rail patterns were also tried, but their results were not as accurate as those of the applied pattern. The other rail patterns defined were:

i. The one with just rails and without wires.

ii. The one with 1 m long rails and contact wire above without a black area in its surroundings.

iii. The one with 2 m long rails and contact wire above.

iv. The one with two pairs of 1 m long rails and contact wire above.

Other rail patterns and their corresponding results and discussion can be found in Appendix 5.

Second, as the algorithm follows only one direction for each pair of rails, the TM algorithm fails to classify switch points in areas where a rail switch connects two pairs of rails. It only follows the rails in their main direction.

Figure 6-2. Appearance of rails in low resolution airborne point clouds

Third, to find the maximum correlation between the kernel and the greyscale image, the kernel was moved across the greyscale image and rotated through different angles. The rotation ran from -90 to +90 degrees in small steps of 1 degree, which slowed the algorithm considerably.
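A rough count of the work involved shows why the rotation slows the search. The sketch below is only an estimate under simplifying assumptions: the kernel is assumed to keep its size after rotation, every valid kernel position is assumed to be evaluated, and the image and kernel sizes are made up.

```python
def rotation_search_cost(image_shape, kernel_shape,
                         angle_min=-90, angle_max=90, step=1):
    """Estimate the cost of an exhaustive rotated template search.

    image_shape, kernel_shape: (rows, columns) in pixels.
    Returns (number of angles, number of kernel positions,
    number of pixel multiplications over the whole search).
    """
    angles = len(range(angle_min, angle_max + 1, step))
    positions = ((image_shape[0] - kernel_shape[0] + 1)
                 * (image_shape[1] - kernel_shape[1] + 1))
    multiplications = angles * positions * kernel_shape[0] * kernel_shape[1]
    return angles, positions, multiplications


# Hypothetical sizes: a 100 x 100 pixel image and a 10 x 10 pixel kernel.
print(rotation_search_cost((100, 100), (10, 10)))          # 1 degree steps
print(rotation_search_cost((100, 100), (10, 10), step=5))  # 5 degree steps
```

With 1 degree steps the search covers 181 angles, so the whole correlation is repeated 181 times; a coarser step would reduce this at the cost of angular accuracy.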

Fourth, the TM method was implemented in MATLAB because of its useful image processing tools. Other programming languages, such as Python, are usually faster.

6.2.2. Result on terrestrial laser scanned data

Completeness is evaluated by visual inspection. The classification algorithm covered the whole area, but there were small areas where no rail or wire points were classified. The biggest classification gap of rail points, i.e. the longest stretch in which no rail points were classified, was 30 cm, while the biggest classification gaps of contact and catenary wire points were 90 cm and 60 cm respectively.

Regarding the correctness of the classified rail points, some track bed points were wrongly classified as rail points. These points were very close to the rails, and as their number or percentage cannot be determined by visual inspection, it was determined from the 3D modelling. As mentioned, the classified terrestrial point clouds were used for 3D modelling of rails by Diaz Benito (To be published in March 2012). 75% and 95% of the classified rail points were within 7 cm and 13 cm of the 3D rail models respectively (Diaz Benito, To be published in March 2012). Figure 6-3 shows a cumulative histogram of the geometric distances of classified rail points to the 3D rail models. The total number of classified rail points was about 4606, of which more than 3500 lie less than 10 cm from the 3D rail models. There are three sources of error that cause the classified rail points to lie at a distance from the 3D model rather than exactly on it:

1. Precision of dataset

2. Error of classification

3. Error of modelling
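The figures quoted from the cumulative histogram (the distance within which 75% or 95% of the points lie, and the share of points within 10 cm) can be computed from a list of point-to-model distances. The sketch below is a minimal illustration; the function names and the sample distances are made up and are not the data of Diaz Benito.

```python
import math


def fraction_within(distances_m, limit_m):
    """Share of points whose distance to the model is at most limit_m."""
    return sum(1 for d in distances_m if d <= limit_m) / len(distances_m)


def distance_at_fraction(distances_m, fraction):
    """Smallest distance within which at least `fraction` of points lie."""
    ordered = sorted(distances_m)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]


# Made-up point-to-model distances in metres.
distances = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.12, 0.20]
print(fraction_within(distances, 0.10))
print(distance_at_fraction(distances, 0.75))
print(distance_at_fraction(distances, 0.95))
```

These distances mix all three sources of error listed above, so they give an upper bound on the classification error rather than the classification error itself.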

Figure 6-3. Cumulative histogram of geometric distances of classified rail points to 3D rail models for Template Matching (Diaz Benito, To be published in March 2012)

Regarding the correctness of the classified wire points, visual inspection shows that some points on towers were wrongly classified as catenary wire points.

The TM algorithm took about 30 minutes to classify the points on two pairs of straight rails, two contact wires and two catenary wires over a distance of about 100 meters of the railway environment.

Note that there were no curved rails in the terrestrial dataset, so the ability of the TM algorithm to classify them could not be assessed.

6.3. Discussion about Region Growing

6.3.1. Methodology

First, RG can follow only one direction for each rail, so if a rail switch connects two pairs of rails, it fails to classify it.

Second, the capability of the algorithm to grow each pair of rails together was not used, as the current algorithm classified the points with negligible gaps and mistakes. This capability was developed for areas with very complicated railway configurations where the current algorithm fails. Note that applying this capability improves the results but also takes more time.

Third, the direction of contact wires is another strong constraint that can be used in more complicated railway configurations, since contact wires and rails run in almost the same direction. Again, as the current algorithm worked well, this constraint was not applied.

Fourth, the maximum slope considered for rails is 5% and the maximum curvature is 5 cm in 1 m. These two values were verified by the company Movares and were accepted on the basis of Dutch railway environment specifications.
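The slope and curvature limits can be read as a test that a candidate point must pass before a rail is grown towards it. The sketch below shows one possible reading, not the RG implementation: the slope is taken as the height change per metre of horizontal distance, and the curvature as the sideways offset from the straight extension of the last rail segment per metre travelled. The function name and this interpretation are assumptions.

```python
import math

MAX_SLOPE = 0.05    # 5% height change per metre (value from the text)
MAX_LATERAL = 0.05  # 5 cm sideways deviation per metre (value from the text)


def accepts(prev, curr, cand):
    """Check whether cand is a plausible continuation of the rail.

    prev, curr: the last two accepted rail points as (x, y, z) in metres.
    cand: the candidate point as (x, y, z) in metres.
    """
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    heading = math.hypot(dx, dy)
    sx, sy = cand[0] - curr[0], cand[1] - curr[1]
    step = math.hypot(sx, sy)
    if heading == 0.0 or step == 0.0:
        return False
    slope = abs(cand[2] - curr[2]) / step
    # Perpendicular offset of cand from the line through prev and curr.
    lateral = abs(dx * sy - dy * sx) / heading
    return slope <= MAX_SLOPE and lateral / step <= MAX_LATERAL


prev, curr = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(accepts(prev, curr, (2.0, 0.03, 0.02)))  # gentle bend and slope
print(accepts(prev, curr, (2.0, 0.20, 0.00)))  # bends too sharply
print(accepts(prev, curr, (2.0, 0.00, 0.10)))  # rises too steeply
```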

6.3.2. Result on terrestrial laser scanned data

Visual inspection indicates that RG covered the whole area, although there are some areas where no rail or wire points are classified. The biggest classification gaps of rail, contact wire and catenary wire points are 40, 90 and 190 cm respectively.

Regarding the correctness of the classified rail points, the results of 3D modelling show that 75% and 95% of the classified rail points were within 5 cm and 11 cm of the 3D rail models respectively (Diaz Benito, To be published in March 2012). Figure 6-4 shows a cumulative histogram of the geometric distances of classified rail points to the 3D rail models. The total number of classified rail points was about 8900, of which more than 8000 lie less than 10 cm from the 3D rail models. As with TM, there are three sources of error that cause the classified rail points to lie at a distance from the 3D model rather than exactly on it:

1. Precision of dataset

2. Error of classification

3. Error of modelling

Regarding the correctness of the classified wire points, some points on towers are wrongly classified as catenary wire points. Visual inspection shows that the number of these points is smaller than in TM. This is because in TM catenary wires were classified by nearest neighbour analysis between the projected points in class 4 and rail points, while in RG this was done by nearest neighbour analysis between the projected points in class 4 and contact wires. As the planimetric positions of contact and catenary wires are much closer to each other than those of rails and catenary wires, RG leads to more accurate results.
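The effect of the choice of reference object can be illustrated with a small planimetric example. The sketch below is a simplified illustration with made-up coordinates and distance thresholds; the function names are hypothetical and the brute-force search stands in for whatever nearest neighbour search was actually used.

```python
import math


def nearest_distance(point_xy, reference_xy):
    """Planimetric distance from a point to its nearest reference point."""
    return min(math.hypot(point_xy[0] - rx, point_xy[1] - ry)
               for rx, ry in reference_xy)


def select_catenary(candidates_xy, reference_xy, max_dist):
    """Keep the candidates lying within max_dist of the reference points."""
    return [p for p in candidates_xy
            if nearest_distance(p, reference_xy) <= max_dist]


# Made-up layout: contact wire above the track axis, rails 0.75 m either side.
contact = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
rails = [(x, y) for x in (0.0, 1.0, 2.0) for y in (0.75, -0.75)]
# Projected class 4 candidates: a catenary point and a point on a tower.
candidates = [(1.0, 0.05), (1.0, 1.0)]

# Contact wires as reference: a tight threshold keeps only the catenary point.
print(select_catenary(candidates, contact, max_dist=0.2))
# Rails as reference: the threshold must be wide enough to reach the
# catenary point, and the tower point then passes as well.
print(select_catenary(candidates, rails, max_dist=0.8))
```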

RG took about 10 minutes to classify the points on two pairs of straight rails, two contact wires and two catenary wires over a distance of about 100 meters of the railway environment.

As there are no curved rails in the terrestrial dataset, the ability of RG to classify them could not be assessed with it, which is why the airborne dataset was used.

6.3.3. Result on airborne laser scanned data

As mentioned before, the airborne dataset was used to check the ability of the RG algorithm to classify points on curved rails in a low resolution dataset. RG managed to classify points on all four curved rails and seven curved contact wires, although with gaps. These gaps are mainly due to the low resolution of the dataset as well as the shadow effect of the train carriage standing on the rails.

Regarding completeness, visual inspection shows that areas with no classified rail or wire points are considerably more extensive than in the results for the terrestrial dataset. The biggest classification gaps of rail, contact wire and catenary wire points were 2, 1.8 and 6 meters respectively.

Figure 6-4. Cumulative histogram of geometric distances of classified rail points to 3D rail models by Region Growing (Diaz Benito, To be published in March 2012)

As 3D modelling was not done for the airborne dataset, the correctness of all classified rail and wire points was assessed by visual inspection. Some points on the track bed close to the rails were wrongly classified as rail points, but their number or percentage cannot be determined by visual inspection. For the wire points, no major wrong classification can be seen in the results.

RG took about 10 minutes to classify the points on six pairs of rails (two straight and four curved), seven contact wires and seven catenary wires over a distance of about 80 meters of a railway environment.

6.4. Comparison between Template Matching and Region Growing

In terms of classification correctness and completeness:

1. The rail and contact wire classification of RG is almost as complete as that of TM. The classification gaps in the results of both methods were similar.

2. There were no rail switches in either the terrestrial or the airborne dataset, but if there had been, both algorithms would have failed to classify the points on them.

3. The rail point classification of RG is more accurate than that of TM. This is evident in the results of the 3D rail modelling done by Diaz Benito (To be published in March 2012), which indicate that a higher percentage of the classified rail points lies close to the 3D rail models in RG than in TM. However, there were other sources of error that also placed rail points away from the 3D model.

4. Visual inspection shows that the number of wrongly classified catenary wire points is considerably smaller for RG than for TM.

a. As mentioned in section 6.3.2, this is because in RG the nearest neighbour analysis was done between the points of class 4 and contact wires, while in TM it was done between the points of class 4 and rail points, and the planimetric position of contact wires is much closer to that of catenary wires than the position of rails is.

In terms of algorithm efficiency and speed:

1. In TM, the HJD step must be done for the whole dataset, which is computationally expensive and time consuming, but in RG it is done only for the first 5 meters. So RG is faster and more efficient even in the first step of classification, which is HJD.

2. RG seeks the rail pattern in a small part of the dataset, i.e. the first 5 meters, and then tries to find the rest of the rail points in the neighbourhood of the rail patterns detected there, while TM seeks the rail pattern in all parts of the dataset.

3. RG is faster than TM, as it classified all points over a distance of about 100 meters of the railway environment in 10 minutes while TM did so in 30 minutes.

4. One of the reasons that RG is faster than TM is that RG is mainly implemented in Python and only its Hough Transform step is done in MATLAB, while a big proportion of TM is implemented in MATLAB, which makes TM much slower than RG.
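The reported run times can be turned into a rough processing rate. The sketch below only restates the figures given in this chapter (about 100 meters in 30 and 10 minutes for the terrestrial dataset); the function name is hypothetical and the rates are approximate because the distances and times are themselves approximate.

```python
def meters_per_minute(length_m, minutes):
    """Length of railway environment classified per minute of run time."""
    return length_m / minutes


tm_rate = meters_per_minute(100, 30)  # Template Matching, terrestrial data
rg_rate = meters_per_minute(100, 10)  # Region Growing, terrestrial data

print(round(tm_rate, 1), "m/min for TM")
print(round(rg_rate, 1), "m/min for RG")
print(round(rg_rate / tm_rate, 1), "times faster")
```

On these figures RG is about three times faster than TM on the terrestrial dataset, although part of that difference reflects the implementation languages rather than the methods themselves.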

In document AUTOMATIC CLASSIFICATION OF LIDAR POINT CLOUDS IN A RAILWAY ENVIRONMENT (Page 54-62)