7 TEST CASE & FUTURE OUTLOOK

7.1 Tensile strength test case

While this thesis was being written, the tensile strength data model described in Appendix 3 was successfully used in a prototype database environment that is still under development. Tensile strength results from a single chosen test provider arrive as an Excel spreadsheet, as depicted in Picture 12. The data is passed to a conversion script that generates a JSON-formatted document in which each row of the spreadsheet is represented as a JSON object structured according to the tensile strength data model, as depicted in Picture 13. The generated JSON document is then inserted into a database.
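The row-by-row conversion described above could look roughly like the following minimal Python sketch. It is not the script used in the prototype. The column headers, field names and sample values are invented for illustration, since the actual spreadsheet layout and the data model of Appendix 3 are not reproduced here. The rows are assumed to have been read from the spreadsheet already, and the database insertion is left out.

```python
import json
from typing import Dict, List

# Hypothetical mapping from spreadsheet column headers to data model fields.
# The real headers (Picture 12) and fields (Appendix 3) may differ.
COLUMN_TO_FIELD = {
    "Specimen ID": "specimen_id",
    "Rp0.2 [MPa]": "yield_strength_mpa",
    "Rm [MPa]": "tensile_strength_mpa",
    "A [%]": "elongation_pct",
}


def row_to_document(row: Dict[str, object]) -> Dict[str, object]:
    """Convert one spreadsheet row into a JSON-serializable object."""
    document: Dict[str, object] = {"test_type": "tensile"}
    results: Dict[str, float] = {}
    for column, field in COLUMN_TO_FIELD.items():
        value = row[column]  # raises KeyError if the column is missing
        if field == "specimen_id":
            document[field] = str(value)
        else:
            results[field] = float(value)
    document["results"] = results
    return document


def rows_to_json(rows: List[Dict[str, object]]) -> str:
    """Serialize all rows as one JSON array of result objects."""
    return json.dumps([row_to_document(row) for row in rows], indent=2)


# Made-up sample row standing in for one line of the spreadsheet.
rows = [
    {"Specimen ID": "T-001", "Rp0.2 [MPa]": 512.0, "Rm [MPa]": 640.5, "A [%]": 14.2},
]
documents = json.loads(rows_to_json(rows))
```

Keeping the column-to-field mapping in one table means that a change in the spreadsheet headers only requires updating the mapping, not the conversion logic.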

Picture 12. Tensile strength results in an Excel spreadsheet.

It should be noted that this process is far from optimal. It contains several unnecessary stages in which the data of interest is converted from one format to another and possibly even copied manually between documents and applications, any of which can corrupt or degrade the data. Preferably, the tensile testing equipment would output data in a format that can be stored in the database as-is, ideally already structured as described by the data model.

Picture 13. Tensile strength result as a JSON object.

7.2 Plans for legacy data

The company has a large number of test results from many different projects stored in project folders. To use this already acquired data in a wider context, e.g. for analytics, the so-called legacy data must be converted to the new data models and stored in a database where it is far easier to access. Although the subject is somewhat outside the scope of this thesis, the work was ongoing during the writing process and is therefore briefly discussed here.

The way data is transferred from external laboratories to the company's data storage and analysis systems should also be streamlined. Currently, test data mainly arrives as email attachments and is then stored in individual project folders on a network drive. A direct data exchange method between the database systems of the company and the laboratories would be ideal. As the company's digitalization initiative progresses, new technologies, including cloud-based solutions, are becoming available and should be properly explored and assessed for this purpose.

7.2.1 Automatic conversion

Because some test reports show little consistency, developing an automatic conversion tool for every possible test type would be extremely laborious. In addition, most reports exist only as PDF documents, and reading them with, for example, a machine vision or character recognition system would likely produce corrupt data. This is a problem especially when the PDF has been scanned from a paper document. Moreover, the data extracted by such a system would in any case have to be entered manually into the new data models, as there is no guarantee that the reports are consistent. Reading Excel-formatted reports, e.g. with the Python programming language and the pandas data analysis library, is fairly straightforward, but once again the inconsistency makes automatic conversion almost impossible: if cells, rows, columns and variable names do not stay the same from report to report, there is little point in developing an automatic conversion tool.

However, if a fixed delivery format can be defined in collaboration with the laboratory producing the test data, creating an automatic conversion system is relatively simple. It remains to be seen whether laboratories are willing to provide test data in a fixed format retrospectively. Even as a short-term, low-tech solution, it would be beneficial if the laboratories were willing to adopt some kind of fixed spreadsheet template for data exchange while more convenient solutions are under development.
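To illustrate why a fixed template makes automation feasible, the sketch below checks the header row of a delivered sheet against an agreed template before any conversion is attempted. The header names are hypothetical placeholders, not an actual agreed format. The header row is assumed to have been read already, for example as the column names of a table loaded with pandas.

```python
from typing import List

# Hypothetical fixed template: exact header names in the agreed order.
EXPECTED_HEADERS = ["Specimen ID", "Rp0.2 [MPa]", "Rm [MPa]", "A [%]"]


def check_template(headers: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the sheet matches."""
    problems = []
    missing = [h for h in EXPECTED_HEADERS if h not in headers]
    extra = [h for h in headers if h not in EXPECTED_HEADERS]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    if extra:
        problems.append("unexpected columns: " + ", ".join(extra))
    if not missing and not extra and headers != EXPECTED_HEADERS:
        problems.append("columns do not match the template order")
    return problems


# A sheet that follows the template and one that does not.
ok_report = check_template(["Specimen ID", "Rp0.2 [MPa]", "Rm [MPa]", "A [%]"])
bad_report = check_template(["Sample", "Rm [MPa]", "A [%]"])
```

A report that fails the check can be rejected or routed to manual handling instead of being converted into silently incorrect data.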

7.2.2 Manual conversion

Entering existing test results manually into conversion tools could be a sensible way to convert at least part of the legacy data. Such tools are likely to be easier to implement when they are based on data models created per test type.

One option would be to build a web-based reporting system on the framework of the internal database portal that is already in development. The reporting system could generate a dynamic report template from the developed data models, to be filled in by the user. The conversion process itself is more laborious with this approach, and human error can corrupt the data, but changes in the format of the reports are much easier to deal with.
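The idea of deriving an entry template from a data model can be sketched as follows. This is only an illustration under simplifying assumptions: the model is reduced to a hypothetical table of field names, types and a required flag, and the web interface is left out entirely. The validation step shows one way to catch typing errors made during manual entry.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, simplified data model: field name -> (type, required).
TENSILE_MODEL: Dict[str, Tuple[type, bool]] = {
    "specimen_id": (str, True),
    "tensile_strength_mpa": (float, True),
    "elongation_pct": (float, False),
}


def empty_template(model: Dict[str, Tuple[type, bool]]) -> Dict[str, Any]:
    """Build a blank form with one entry per field in the data model."""
    return {field: None for field in model}


def validate_entry(model: Dict[str, Tuple[type, bool]],
                   entry: Dict[str, Any]) -> List[str]:
    """Check a filled-in form against the model and list any problems."""
    errors = []
    for field, (field_type, required) in model.items():
        value = entry.get(field)
        if value is None:
            if required:
                errors.append(f"{field}: required value is missing")
            continue
        try:
            field_type(value)  # can the entered value be read as this type?
        except (TypeError, ValueError):
            errors.append(f"{field}: cannot be read as {field_type.__name__}")
    return errors


# Simulated manual entry with a typing error (letter O instead of zero).
form = empty_template(TENSILE_MODEL)
form.update({"specimen_id": "T-001", "tensile_strength_mpa": "64O.5"})
errors = validate_entry(TENSILE_MODEL, form)
```

Because the form is generated from the model, adding a field to the model changes the template and the validation together.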

Another, even more manual method would be to take the time to collect the data considered most important, clean it into a tabular format, and convert it in a few batches. Conversion scripts are fairly straightforward to develop if the data cleanup is done properly, and this way there is no need to develop tools that might require updating and maintenance in the future.

8 CONCLUSION

As the digitalization initiative as a whole and the use of a database system for test results are still in their early stages, the data models described in this thesis should be regarded as initial sketches that are bound to be improved as system development progresses. The models are refined enough to be used in the prototype stages of system development, where feedback can be collected and the models improved accordingly, but their final implications are nearly impossible to assess without a longer testing period.

Though abstract by nature, data modeling and the construction of visual representations of data systems are powerful tools, especially when working with people who are unfamiliar with programming or database development in general. The precise definition of the method or technique used to create the models should not be the primary concern. Rather, the methods should be chosen so that all stakeholders can understand the focus of the process and contribute as it progresses.

Constructing a data model is not strictly necessary when implementing a document-based non-relational database system, but properly formatted documentation has proven extremely valuable, especially as systems expand and grow more complicated. A data storage system implemented without documentation could be considered a “hostage” of its developers if they are the only party with proper knowledge of how the data is structured and related.

The scope of this thesis was limited to a narrow selection of data-producing processes in order to achieve a proof-of-concept result and to document the data modeling process. The same principles of data modeling can be applied to any other data-producing tests and processes in the future. Following the process described in the previous chapter, i.e. progressing systematically from the conceptual to the physical model, iterating, and asking for feedback at every stage, is a good guideline for achieving functional models.

