Implementation details - Machine learning for automatic configuration of structured parallel applications

This section discusses some implementation details of the Tesn library.

5.2.1 Dataset files

A dataset file contains all the information necessary to perform the learning and prediction tasks for tree input domains. The dataset handled by the library is stored in a file with a specific format, identified by the file extension gph. Its content is acquired by the TreeDatasetParser class, which uses a predictive parser based on the following grammar.

HEADER ::= TreeNum <int> MaxArity <int> LabelDim <int>

TREE_SAMPLES ::= TREE TREE_SAMPLES | ε

TREE ::= Name <stringId> Target <double> TreeDim <int> RECORDS

RECORDS ::= NODE_REPRESENTATION RECORDS | ε

NODE_REPRESENTATION ::= <id> CHILD_ID <parentId> LABELS

The dataset is composed of a header and a body. The header contains the information common to all the tree samples in the dataset: the number of available samples, the maximum tree arity allowed and the number of labels contained in each tree node. The body is the collection of all the tree samples in the dataset. Each tree is characterized by a string name that uniquely identifies the sample in the dataset, its target for the considered task, the dimension of the tree expressed as its number of nodes, and a description of the nodes. Every node is represented by a unique numeric identifier, the child identifiers, the parent identifier and the labels characterising the node. All nodes are sorted in inverse topological order, so that the data structure can be built easily and efficiently starting from the tree frontier. This node organization also speeds up the state computation phase. Indeed, since that phase must analyse the child nodes before their parents, the node information can be accessed sequentially, exploiting the levels of the memory hierarchy to improve performance.
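The benefit of storing children before parents can be illustrated with a minimal Python sketch. It is not the Tesn implementation: the record layout `(node_id, child_ids, label)`, the function name `compute_states`, and the scalar weights `w_in` and `w_hat` are all hypothetical, and the state is reduced to a single scalar with a TreeESN-style update (a squashed combination of the node label and the sum of the child states), which is assumed here for illustration.

```python
import math

def compute_states(nodes, w_in=0.5, w_hat=0.3):
    """Compute one scalar state per node in a single sequential pass.

    `nodes` must be sorted so that every child precedes its parent
    (inverse topological order). Each record is (node_id, child_ids, label).
    The weights are illustrative assumptions, not values used by Tesn.
    """
    states = {}
    for node_id, child_ids, label in nodes:
        # All children were visited earlier, so their states are already available.
        child_sum = sum(states[c] for c in child_ids)
        states[node_id] = math.tanh(w_in * label + w_hat * child_sum)
    return states

# Node 2 is the root; nodes 0 and 1 are its leaves and are listed first.
tree = [(0, [], 1.0), (1, [], -1.0), (2, [0, 1], 0.5)]
states = compute_states(tree)
```

Because the records are read strictly front to back and no recursion or second pass is needed, the access pattern is sequential. If a parent were listed before one of its children, the lookup of the child state would fail, which is why the ordering matters.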

The number of times each recursive production is applied is not determined by the grammar itself; it is obtained from the numeric attributes contained in the dataset description. For instance, the TreeNum attribute specifies the number of tree samples contained in the dataset. Based on the grammar described above and on the information contained in the dataset, a top-down, left-to-right parser has been developed that predictively acquires all the information available in the dataset file.
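A count-driven predictive parser of this kind can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: the concrete token layout is not specified in the text, so it assumes whitespace-separated tokens, exactly MaxArity child identifiers per node with -1 marking a missing child or parent, and LabelDim numeric labels per node. The function name `parse_gph` and the returned dictionary layout are hypothetical and do not describe the TreeDatasetParser class.

```python
def parse_gph(text):
    """Predictive, count-driven parse of a gph-like token stream (assumed layout)."""
    tokens = iter(text.split())

    def expect(keyword, convert):
        # The next keyword is always known in advance, so no backtracking is needed.
        token = next(tokens)
        if token != keyword:
            raise ValueError(f"expected {keyword!r}, found {token!r}")
        return convert(next(tokens))

    # HEADER
    tree_num = expect("TreeNum", int)
    max_arity = expect("MaxArity", int)
    label_dim = expect("LabelDim", int)

    trees = []
    for _ in range(tree_num):  # TreeNum bounds the TREE_SAMPLES recursion
        name = expect("Name", str)
        target = expect("Target", float)
        tree_dim = expect("TreeDim", int)
        nodes = []
        for _ in range(tree_dim):  # TreeDim bounds the RECORDS recursion
            node_id = int(next(tokens))
            children = [int(next(tokens)) for _ in range(max_arity)]
            parent = int(next(tokens))
            labels = [float(next(tokens)) for _ in range(label_dim)]
            nodes.append((node_id, children, parent, labels))
        trees.append({"name": name, "target": target, "nodes": nodes})
    return {"max_arity": max_arity, "label_dim": label_dim, "trees": trees}

# Hypothetical sample: one tree with two leaves (0, 1) and a root (2).
sample = """
TreeNum 1 MaxArity 2 LabelDim 1
Name t0 Target 3.5 TreeDim 3
0 -1 -1 2 0.1
1 -1 -1 2 0.2
2 0 1 -1 0.3
"""
dataset = parse_gph(sample)
```

The two `for` loops replace the recursive productions TREE_SAMPLES and RECORDS: the numeric attributes read earlier decide how many times each production is expanded, so the empty alternative never has to be detected from the input.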

The library currently handles a single dataset format; however, other dataset formats can be supported by implementing new parsers that produce a dataset object. For instance, a new parser able to handle tree-to-tree transduction datasets can be implemented.

5.2.2 Matrices mathematical operations

All the information required by the Tesn library is represented using matrices. For both the input units and the reservoir network, the input-to-reservoir connections and the connections between reservoir neurons are represented through adjacency matrices. In the same way, the dataset samples and the states produced by a dataset partition are stored in matrices. Matrix data structures therefore play a key role in the library, and they must be handled adequately. All matrices are represented as dense matrices of double values. Sometimes, however, an equivalent sparse representation is needed, as happens for the reservoir connection matrix. Indeed, since the reservoir network benefits from the small-world property, a sparsely connected reservoir can be used to train the TreeESN model without affecting the accuracy of the resulting predictive model. For this reason, the procedure that computes the tree state is provided for both dense and sparse matrices. Dense matrices are represented by the Dense class and sparse matrices by the Sparse class. Both classes provide methods to load the matrix content from a file and to store it to a file.
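The difference between the two representations can be made concrete with a small Python sketch of a matrix-vector product, the core operation of the state computation. The text does not say which storage scheme the Sparse class uses; the compressed sparse row (CSR) layout below is an assumption chosen for illustration, and all function names are hypothetical.

```python
def dense_matvec(matrix, vector):
    """Multiply a dense row-major matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def to_csr(matrix):
    """Convert a dense matrix to CSR arrays: values, column indices, row pointers."""
    values, cols, row_ptr = [], [], [0]
    for row in matrix:
        for j, a in enumerate(row):
            if a != 0.0:
                values.append(a)
                cols.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`.
        row_ptr.append(len(values))
    return values, cols, row_ptr

def csr_matvec(csr, vector):
    """Multiply a CSR matrix by a vector, touching only the stored non-zeros."""
    values, cols, row_ptr = csr
    result = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        result.append(sum(values[k] * vector[cols[k]] for k in range(start, end)))
    return result

# A sparsely connected 3x3 reservoir-like matrix with one non-zero per row.
W = [[0.0, 0.5, 0.0],
     [0.0, 0.0, -0.25],
     [1.0, 0.0, 0.0]]
x = [1.0, 2.0, 4.0]
```

Both `dense_matvec(W, x)` and `csr_matvec(to_csr(W), x)` compute the same product, but the sparse version performs one multiplication per stored entry instead of one per matrix cell, which is where a sparsely connected reservoir saves work.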

The operations performed on matrices are the most critical part of the Tesn library. Different kinds of operations must be implemented to train a TreeESN model, and their implementations strongly influence the performance and accuracy of the training procedure. It is therefore natural to rely on an external library to perform these operations. The mathematical library should address the problems related to:

• Numerical Stability – Since the numerical linear algebra operations taken into account may involve multiple computation steps, the library should limit the numeric error by using state-of-the-art algorithms. The algorithmic error typically derives from rounding and truncation errors that propagate during the computation.

• Performance – The implementation of the linear algebra operations must efficiently exploit modern architectures to achieve better performance. For example, it must take advantage of the vector ISA (Instruction Set Architecture) available on the target architecture.
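How rounding errors propagate across many computation steps, and how the choice of algorithm limits them, can be seen in a small Python sketch. It compares a plain accumulation loop with Kahan compensated summation, using `math.fsum` from the standard library as the reference value. The example is only an illustration of the numerical stability concern; it is not taken from the Tesn library or from BLAS/LAPACK, and the function names are hypothetical.

```python
import math

def naive_sum(values):
    """Accumulate left to right; each addition may add a rounding error."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Kahan compensated summation: carry the lost low-order bits forward."""
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation            # re-inject the error lost at the previous step
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total

values = [0.1] * 1_000_000
reference = math.fsum(values)  # correctly rounded sum, used as the reference
naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
```

On this input the error of the plain loop grows with the number of additions, while the compensated version stays close to the reference. A numerical library makes comparable algorithmic choices for far more complex operations such as factorizations.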

In principle, the ability to exploit the parallelism of the target architecture may also be taken into account when selecting the mathematical library. However, in the Tesn library this aspect is not fundamental, since in Chap. 7 it is assumed that parallelism is exploited at a coarser grain, in the validation procedure.

In the Tesn library, the mathematical operations are carried out using the Basic Linear Algebra Subprograms (BLAS) and Linear Algebra PACKage (LAPACK) libraries. Both libraries define an interface that many vendors use as a reference for their own implementations. These libraries are a de facto standard for scientific computing whenever linear algebra operations are involved. The BLAS library provides basic linear algebra operations such as vector scaling, vector dot products, linear combinations and matrix multiplication. It is used as a building block in widely used programming environments such as MATLAB, GNU Octave, Mathematica and R, and in the LINPACK and LAPACK libraries. The LAPACK library relies on the BLAS library to perform more complex linear algebra operations efficiently. It provides operations such as the LU, Cholesky and QR matrix factorizations, matrix inversion, the solution of linear systems, and operations dealing with eigenvalues and singular values. In the Tesn library, two different implementations have been tested: the OpenBLAS [5] and the Intel MKL [6] library. As stated previously, the BLAS/LAPACK library has been used in a context where a single operation is carried out by a single thread. For OpenBLAS this can be ensured by compiling the library with the appropriate flags. For the Intel MKL library, instead, it is sufficient to link the appropriate object module.

In the Tesn library, the code is designed to be easily optimized by the compiler. For example, the state computation code has been written to take advantage of the vectorization optimizations available on the target machine, for both the sparse and the dense matrix representations. For the other, more complex mathematical operations, calls to the BLAS/LAPACK library are used to handle the tasks efficiently. The implemented phases relying upon this library are:

• Linear Regression Training – For both the learning methods explained in Sect. 3.2.1 the mathematical library has been used. In particular:

– The classical Least Mean Square problem (Form. 3.11) has been solved by calling the LAPACKE_dgels function.

– The Tikhonov regularization (Form. 3.13) has been implemented using the basic matrix multiplication and transposition operations (in the BLAS library) and an inversion procedure. The latter has been computed using the LAPACK library by decomposing the matrix with the LU factorization (LAPACKE_dgetrf) and then applying the inversion (LAPACKE_dgetri).

• Reservoir Initialization – The reservoir initialization is based on the normalization of the contractivity factor (σ). To apply the normalization, it is necessary to compute the 2-norm of the reservoir connectivity matrix (W) and then rescale the matrix using the equation

$$\hat{W} = W \, \frac{\sigma}{k \, \lVert W \rVert_2}$$

where $\hat{W}$ is the normalized reservoir matrix and k is the tree arity. The 2-norm has been computed by the LAPACKE_dgesvd function using a singular value decomposition.

[5] OpenBLAS library – http://www.openblas.net/
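The reservoir normalization can be sketched in plain Python as follows. This is not the Tesn implementation: instead of calling LAPACKE_dgesvd, the sketch estimates the 2-norm (the largest singular value) with power iteration on WᵀW, a simpler technique swapped in here so that the example needs no external library. The function names, the fixed iteration count and the sample matrix are hypothetical.

```python
import math

def matvec(m, v):
    """Multiply a dense row-major matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def transpose(m):
    """Return the transpose of a dense row-major matrix."""
    return [list(col) for col in zip(*m)]

def spectral_norm(w, iterations=200):
    """Estimate ||W||_2 by power iteration on W^T W (assumes W is not all zeros)."""
    wt = transpose(w)
    v = [1.0] * len(w[0])
    for _ in range(iterations):
        v = matvec(wt, matvec(w, v))
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
    # v approximates the top right singular vector, so ||W v|| approximates ||W||_2.
    return math.sqrt(sum(x * x for x in matvec(w, v)))

def normalize_reservoir(w, sigma, k):
    """Rescale W so that k * ||W_hat||_2 equals the contractivity factor sigma."""
    scale = sigma / (k * spectral_norm(w))
    return [[scale * a for a in row] for row in w]

# Sample matrix whose singular values are sqrt(45) and sqrt(5).
W = [[3.0, 0.0], [4.0, 5.0]]
W_hat = normalize_reservoir(W, sigma=0.9, k=2)
```

After the rescaling, the product of the tree arity and the 2-norm of the normalized matrix equals σ, which is the property the equation is meant to enforce. A full singular value decomposition, as used in the library, is more robust than power iteration, which converges slowly when the two largest singular values are close.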

Chapter 6 Machine Learning Task

This chapter analyses the construction of a task dealing with the completion time of skeleton applications. The discussion follows the methodology presented in Sect. 4.2 throughout, from the creation of the datasets to the training of the TreeESN models.
