Quality labelling, training and prediction

Laboratory for Manufacturing Systems and Automation (LMS)

The term machine learning refers to the automated detection of meaningful patterns in data. In the past couple of decades, it has become a common tool in almost any task that requires information extraction from large data sets [82]. Learning is, of course, a very wide domain. Consequently, the field of machine learning has branched into several subfields dealing with different types of learning tasks. One of the most common categorizations of machine learning techniques is the division into supervised and unsupervised methods.

Supervised versus Unsupervised:

Since learning involves an interaction between the learner and the environment, one can divide learning tasks according to the nature of that interaction. The first distinction is the difference between supervised and unsupervised learning. More abstractly, viewing learning as a process of “using experience to gain expertise,” supervised learning describes a scenario in which the “experience,” a training example, contains significant information that is missing in the unseen “test examples” to which the learned expertise is to be applied. In this setting, the acquired expertise is aimed at predicting that missing information for the test data. In such cases, the environment can be thought of as a teacher that “supervises” the learner by providing the extra information (labels). In unsupervised learning, however, there is no distinction between training and test data. The learner processes input data with the goal of coming up with some summary, or compressed version, of that data. Clustering a data set into subsets of similar objects is a typical example of such a task [82].

There is also an intermediate learning setting in which, while the training examples contain more information than the test examples, the learner is required to predict even more information for the test examples. For example, one may try to learn a value function that describes, for each setting of a chess board, the degree to which White’s position is better than Black’s. Yet, the only information available to the learner at training time consists of positions that occurred throughout actual chess games, labelled by who eventually won that game. Such learning frameworks are mainly investigated under the title of reinforcement learning [82].

Training - Prediction

In this thesis, after the feature extraction (image processing and PCA), several machine learning algorithms were tested to classify the simulation data. The main goal was to link the defective specimens with quality-labelled simulation trials in order to teach the algorithm how to predict welding defects. Thus, a supervised classification and prediction method had to be implemented.

A large number of welding simulation trials have been characterized in detail and assigned quality labels according to the defect they contain (O.K., lack of fusion, cracks, porosity, no seam).

This process, as described above, is called data labelling. In real experiments it has to be done very carefully, since the outcome of the classification process depends heavily on it.
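As a rough illustration of what a labelled training set might look like, the sketch below pairs feature vectors with the quality labels listed above and collapses them into the two-class split used later in this section. The function names, the feature values and the rule that only "O.K." maps to "GOOD" are assumptions made for illustration; they are not taken from the thesis implementation.

```python
# Quality labels as listed in the text (spelling normalised for matching).
QUALITY_LABELS = ("O.K.", "lack of fusion", "cracks", "porosity", "no seam")


def to_binary_label(quality_label):
    """Collapse a detailed quality label into "GOOD" or "NOT GOOD".

    Assumption: only "O.K." counts as GOOD, every defect label as NOT GOOD.
    """
    normalized = quality_label.strip().lower()
    known = {label.lower() for label in QUALITY_LABELS}
    if normalized not in known:
        # Refuse unknown labels instead of silently mislabelling a trial.
        raise ValueError(f"unknown quality label: {quality_label!r}")
    return "GOOD" if normalized == "o.k." else "NOT GOOD"


def build_training_set(trials):
    """Pair each trial's feature vector with its binary quality label.

    `trials` is an iterable of (features, quality_label) tuples.
    """
    return [(list(features), to_binary_label(label)) for features, label in trials]


# Invented placeholder values, not measurements from the thesis.
example_trials = [
    ([0.12, -0.40], "O.K."),
    ([1.35, 0.80], "Porosity"),
    ([0.95, -1.10], "lack of fusion"),
]
training_set = build_training_set(example_trials)
```

Rejecting unknown labels reflects the point made above: a mislabelled trial propagates directly into the classifier.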

After the creation of the training data sets, which contain both the feature data and the related quality labels, several classification algorithms (e.g. Logistic Regression, Support Vector Machines and Random Forests) were tested. A Support Vector Machine with a linear kernel was found to perform best among them. The “Support Vector Machine” (SVM) is a supervised machine learning algorithm that can be used for both classification and regression [82], although it is mostly used for classification problems. In this technique, each data item is plotted as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. Classification is then performed by finding the hyperplane that best separates the two classes. The support vectors are the observations that lie closest to this hyperplane and therefore determine its position.
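The idea of treating each observation as a point and searching for a separating hyperplane can be sketched in a few lines. The example below is a minimal, standard-library-only linear SVM trained by full-batch sub-gradient descent on the regularised hinge loss. It is not the implementation used in the thesis; the optimiser, the hyperparameters (`lam`, `learning_rate`, `steps`) and the toy data are all assumptions chosen for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, learning_rate=0.01, steps=5000):
    """Fit weights w and bias b of a linear SVM.

    Minimises (lam / 2) * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))
    by full-batch sub-gradient descent.
    points: list of equal-length feature vectors; labels: +1 or -1 per point.
    """
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]  # gradient of the regularisation term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # point lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane w . x + b = 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


# Invented, linearly separable toy data with two features per observation.
points = [[0, 0], [1, 0], [0, 1], [1, 1], [4, 4], [5, 4], [4, 5], [5, 5]]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

In this toy set the observations (1, 1) and (4, 4) are the ones closest to the boundary, so they play the role of support vectors.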

The SVM decision boundary is the frontier (a hyperplane, or a line in two dimensions) that best segregates the two classes. In short, an SVM works as a large-margin classifier: it finds a hyperplane between two or more linearly separable classes with the largest separation, or margin, between them [82].
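For the two-class, linearly separable case, the large-margin idea is commonly written as the following optimisation problem. The notation is an assumption introduced here for illustration: x_i is the feature vector of observation i, y_i in {-1, +1} is its class label, m is the number of training observations, and w and b define the hyperplane.

```latex
\begin{aligned}
\min_{w,\,b}\quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to}\quad & y_i\,\bigl(w^{\top} x_i + b\bigr) \ge 1, \qquad i = 1,\dots,m,
\end{aligned}
\qquad
\text{margin width} = \frac{2}{\lVert w \rVert}
```

Minimising the norm of w is therefore equivalent to maximising the margin, and the constraints that hold with equality identify the support vectors.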

Based on the principal component analysis performed above, the calculation of statistical features on the image data for the training data set was now feasible. In a first stage, the PCA output was fed to the SVM algorithm, and the training and classified data were separated into two classes (“GOOD” & “NOT GOOD”). The results of this stage are shown in Figure 22. As can be seen, the developed method successfully classified the simulation trials based on the training model.

Figure 22: "Good" & "NG" classification
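The step of reducing per-frame features to a few principal components before classification can be sketched as follows. This is a minimal, standard-library-only PCA based on power iteration with deflation, shown only to make the data flow concrete. The function name `pca_scores`, the iteration count and the example feature values are assumptions, and the thesis may well have used a library routine instead.

```python
def pca_scores(rows, n_components=2, iterations=500):
    """Project rows onto their leading principal components.

    Returns (scores, components): one score vector per row, and the unit-length
    component directions found by power iteration with deflation.
    """
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]

    components = []
    for _component in range(n_components):
        # Deterministic start vector with a non-zero entry in every dimension.
        v = [1.0 / (k + 1) for k in range(dim)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
        for _step in range(iterations):
            mv = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = sum(c * c for c in mv) ** 0.5
            if norm == 0.0:  # no variance left to explain
                break
            v = [c / norm for c in mv]
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                         for i in range(dim))
        components.append(v)
        # Deflation: remove the variance explained by the component just found.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dim)]
               for i in range(dim)]

    scores = [[sum(r[j] * comp[j] for j in range(dim)) for comp in components]
              for r in centered]
    return scores, components


# Invented placeholder features (three statistical features per frame).
frame_features = [
    [0.9, 0.2, 0.1],
    [1.1, 0.1, 0.0],
    [3.0, 1.2, 0.4],
    [3.2, 1.0, 0.5],
]
two_d_features, directions = pca_scores(frame_features)
```

Each row of `two_d_features` is then a two-dimensional point that can be passed to a linear classifier such as the SVM described above.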

As a next step, the algorithm can be extended to classify more than two classes. This is needed because more than two defect types may be present during laser welding. Indicative results from the literature [83] are therefore presented in the following figure, which illustrates the possible prediction of new simulation data characterized as “Cracking-Porosity”, “Lack of penetration” and “Good”.

Figure 23: Quality prediction with 3 defect classes [83]
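One common way to extend a two-class linear classifier to several classes is a one-vs-rest scheme: one linear model is kept per class and the class with the highest score wins. The text does not state which multi-class scheme was used in [83], so the sketch below is only an assumption, and the weights and bias values are hypothetical placeholders rather than trained parameters.

```python
def linear_score(weights, bias, features):
    """Signed score of one linear model: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict_class(models, features):
    """Return the class whose one-vs-rest linear model scores highest.

    `models` maps a class name to a (weights, bias) pair.
    """
    if not models:
        raise ValueError("at least one class model is required")
    return max(models, key=lambda name: linear_score(*models[name], features))


# Hypothetical per-class models over two features (e.g. two principal components).
models = {
    "Good": ([1.0, 0.0], 0.0),
    "Lack of penetration": ([-1.0, 1.0], 0.0),
    "Cracking-Porosity": ([-1.0, -1.0], 0.0),
}

predicted = predict_class(models, [2.0, 0.5])
```

In practice each (weights, bias) pair would come from training one binary classifier that separates its class from all the others.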

For the prediction of new data, labelled images from real camera measurements were acquired [83]. The aim was to calculate the features for each frame, use the SVM classifier to predict the welding quality, and validate that the developed method works with real measurements. To this end, the linear SVM was trained with two welding data sets, and the prediction for a new trial is obtained frame by frame, with each frame labelled by the algorithm. Using the first 2 principal components of the image data as features, a quality prediction accuracy of 90% on training and test data was achieved. Afterwards, the same procedure was applied to further experimental trials to predict the quality state for two classes (“Good” and “Not Good”). The results are shown in Figure 25. The algorithm successfully predicted the welding defect, identifying both the correct error class and the error position.
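The frame-by-frame prediction described above can be sketched as follows: each frame's feature vector is scored by an already trained linear model, the predictions are compared with the reference labels to obtain an accuracy, and the indices of frames predicted as defective give the error position along the seam. The weights, the bias, the frame values and the convention that a non-negative score means "Good" are assumptions for illustration only.

```python
def predict_frame(weights, bias, features):
    """Label one frame; a non-negative score is taken to mean "Good" (assumption)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "Good" if score >= 0.0 else "Not Good"


def predict_trial(weights, bias, frames):
    """Label every frame of a welding trial, in frame order."""
    return [predict_frame(weights, bias, features) for features in frames]


def accuracy(predicted, expected):
    """Fraction of frames whose predicted label matches the reference label."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def defect_positions(predicted):
    """Indices of the frames predicted as "Not Good" (the error positions)."""
    return [index for index, label in enumerate(predicted) if label == "Not Good"]


# Hypothetical trained model and invented per-frame features (two components each).
weights, bias = [1.0, 0.0], 0.0
frames = [[0.8, 0.1], [0.5, -0.2], [-0.6, 0.3], [-0.9, 0.0], [0.7, 0.2]]
reference = ["Good", "Good", "Not Good", "Not Good", "Good"]

frame_labels = predict_trial(weights, bias, frames)
trial_accuracy = accuracy(frame_labels, reference)
bad_frames = defect_positions(frame_labels)
```

Reporting frame indices rather than a single label per trial is what allows a defect to be localised along the weld.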

Laser Metal Deposition and Laser Welding paradigms

Figure 24: Two different instances from real video measurements captured through MATLAB - IR images

Figure 25: SVM classification of real data - Two classes (Good / Not Good)

In document Thesis. TITLE Quality assessment in laser welding (Page 41-46)