Development of a Numeric On-line Decision Support System for Crop Fertilizer Optimization

By

Sadanand Shinde

Department of Bioresource Engineering Macdonald Campus of McGill University

Montreal, Quebec, Canada.

December 2017

A thesis submitted to McGill University in partial fulfillment of the requirements for the degree

of

Master of Science

©Sadanand Shinde, 2017

Table of Contents

Table of Contents
List of Tables
List of Figures
Table of Abbreviations
Abstract
Résumé
Acknowledgements
Preface and Contribution of Authors
1. Introduction
1.1 Why do we need a new comprehensive DSS for fertilizer optimization?
1.2 The problem statements
1.3 Research objectives
2. Literature Review
2.1 Production function
2.3 DSS with an agriculture perspective
2.4 Precision agriculture and its practices
2.5 Agriculture in Canada, and the importance of fertilization in an economic perspective
2.6 Challenges of Nitrogen fertilizer management
3 Materials and Methods
3.1 DSS architecture and overview
3.2.1 Modeling and numeric computation of NRCF
3.2.2 Expected profits and uncertainty
3.3 Database
3.4 Record similarity
3.4.1 Feature scaling
3.5 Errors to probability approach
3.6 Profit function
3.7 NumericAg components
3.7.1 Database
3.7.2 NumericAg – WebApp
4 Results and Discussion
4.1 Profit visualization
4.2 Sensitivity analysis
4.2.1 Climate (AWDR)
4.2.2 Temperature (CHU)
4.2.3 Soil Type
4.2.4 Previous crop
4.2.5 Tillage system
4.2.6 Joint sensitivity
4.2.7 Nitrogen Cost
4.2.8 Yield price
4.4 Discussion
4.5 Conclusion
References
APPENDICES
Appendix A. Few sample fertility trials from initial 1680 records and DB schema
Appendix B. Intermediate files produced by various modules
Appendix C. Software code of various modules to generate the results

List of Tables

Table 1. Soil texture class to clay ratio mapping.
Table 2. AWDR numeric to subjective categorization.
Table 3. Previous crop contribution to soil nitrogen content.
Table 4. CHU numeric to subjective categorization.
Table 5. Features used in the similarity with their weights.
Table 6. User context attributes numerical transformation for record similarity.
Table 7. The Similarity illustration between the user context and database trials.
Table 8. Fertility trials production parameters; the yield for each N rate and key soil attributes.
Table 9. SiteData - attributes that identify the location and address of the site.
Table 10. Sensitivity assessment of climate conditions on the economic benefits.
Table 11. Sensitivity assessment of temperature (CHU) on the economic benefits.
Table 12. Sensitivity assessment of soil on the economic benefits.
Table 13. Sensitivity assessment of previous crop on the economic benefits.
Table 14. Sensitivity assessment of tillage system on the net returns.
Table 15. Sensitivity assessment specifying a joint same value for all input features.
Table 16. Sensitivity assessment of fertilizer cost on the economic returns.
Table 17. Sensitivity assessment of yield price variation on the economical returns.
Table 18. Risk classification with probability threshold.

List of Figures

Figure 1. A conceptual overview of a typical DSS.
Figure 2. Farm operating expenses in Canada, 2013 (source: AAFC, 2015).
Figure 3. Fertilizer types, and usage in Canada between 2009 and 2013 (AAFC, 2015).
Figure 4. The architecture of the proposed decision support system.
Figure 5. Example of a quadratic-plateau yield response model.
Figure 6. Illustrating λ comparison between the soil classes.
Figure 7. The illustration of λ values for the soil feature similarity.
Figure 8. The λ values ranked corresponding to different powers.
Figure 9. Example of discrepancy in estimated yield and recorded yield.
Figure 10. Example showing distribution of errors for a model combination.
Figure 11. NumericAg – technical components and control flow diagram.
Figure 12. Homepage, has the user input form and navigation menus.
Figure 13. Add trial form to save past observed fertility trials in the database.
Figure 14. Admin functionality to perform administrative tasks.
Figure 15. Example of user context representing medium growing conditions.
Figure 16. Email sent to a user comprising of user input conditions and DSS results.
Figure 17. Profit response line, the expected NRCF at each fertilizer application rate.
Figure 18. Profit surface contour graph, with profits and probabilities at each N rate.
Figure 19. Profit response for the precipitation (AWDR) classes.
Figure 20. Profit responses of the temperature (CHU) cold, medium and hot conditions.
Figure 21. The expected profits for sand, clay and clay loam soil class at the given rates.
Figure 22. The expected profit at low, moderate and strong nutrient classes.
Figure 23. The expected profit lines with conventional till and no-till system.
Figure 24. The expected profits for the combined sensitivity of all features.
Figure 25. Probabilities of achieving profit thresholds for the joint sensitivity.
Figure 26. The expected profits for the fertilizer cost variable values.
Figure 27. The expected profits for the yield price flexible values.
Figure 28. Profit surface graph with risk classification.
Figure 29. Fertility trials with values recorded for the various parameters.
Figure 30. Database schema design diagram showing entities and relationship.
Figure 31. User context production parameters resulted from production parameter setup module.
Figure 32. Contains production parameters and their joint probability calculated from error to probability module.
Figure 33. Probabilities of achieving certain profit threshold and NRCF, EFB for each varying nitrogen application rate calculated after numeric simulation.
Figure 34. Profit classes with associated achieving probabilities at each nitrogen rate.

Table of Abbreviations

AAFC    Agriculture and Agri-Food Canada
AWDR    Abundant and well-distributed rainfall
CANB    Canadian Agricultural Nitrogen Budget
CHU     Corn heat unit
CME     Chicago Mercantile Exchange
DSS     Decision Support System
EFB     Expected fertilization benefits
EOFR    Economically optimum fertilization rate
EONR    Economically optimum nitrogen rate
NRCF    Net return over cost of fertilization
NUE     Nitrogen use efficiency
QP      Quadratic-plateau
SLC     Soil Landscapes of Canada
SOM     Soil organic matter
WWW     World Wide Web

Abstract

Fertilizer is a major agricultural input used to achieve high yields. The determination of an optimum fertilizer application rate involves various spatial (soil and landscape conditions) and temporal (weather and price) factors. The spatial and temporal controls require the determination of the optimum fertilizer range based on such factors as past management, soil characteristics, crop rotation, weather (historical, current and forecasts), commodity prices, cost, accessibility of input material, and risk preference. In every production scenario, the crop response to fertilization is uncertain due to numerous influencing factors and their interactions as well as anomalies. A quantitative representation of these uncertainties affects the definition of the best management practice in addition to the expected values. Therefore, a correctly implemented decision support system should help to determine fertilizer application rates that minimize the risks associated with fertilizer use, such as non-maximized profitability.

This thesis presents a framework, in the form of a decision support system (DSS), for the implementation of production controls (e.g. rate of agricultural inputs). The framework considers spatial, temporal and managerial factors in assessing the potential impact of crop fertilization (or the use of other amendments) on the bottom line of a given production scenario in the context of uncertain information. The proposed DSS includes a database, a user interface, and a numeric simulation model. The online web interface of the DSS assists the user in entering the site-specific production conditions. The database contains previously recorded fertility trials for a given agricultural input under different conditions. The numeric simulation process integrates the fertility trials, information about local conditions and economic considerations. The dynamic linkage of the site-specific spatial, temporal and management factors to production and profit functions enables the estimation of the economically optimum fertilizer rate, which maximizes net return over the cost of fertilization, together with the probability of achievement for each decision option. The discrepancies between estimated and recorded yield were considered as production uncertainties, whereas variations in prices were considered as economic uncertainties. The uncertainty-based treatment of each model input allows for a balance between the potential results of under-application and over-application.

The DSS allows for the input of historical farm trials to enhance the existing database and to continuously improve its accuracy. With the current version of the database, the DSS could estimate the optimum application rate with the expected net profit and the probability of achieving that profit. Based on illustrated examples, higher application rates tend to increase NRCF estimates with more uncertainty as compared to low application rates. The probability of NRCF reduction was higher when a lower application rate was accepted and vice versa. The DSS proposed higher application rates when fertilizer cost was lower or yield price was higher and vice versa. The probabilities of profit were used to determine the most certain (least risky) profitable scenario based on individual risk preferences. With the first version of the developed DSS, only nitrogen application for corn in Eastern Canada was used to illustrate the proposed framework. The DSS can be accessed online at http://www.numericag.com.

Résumé

Fertilizer is an important agricultural input needed to achieve high yields. Determining an optimal fertilizer application rate involves various spatial (soil and landscape conditions) and temporal (weather conditions and prices) conditions. The spatial and temporal controls require the determination of the optimal fertilizer range based on factors such as past management, soil characteristics, crop rotation, weather conditions (historical, current and forecast), commodity prices, cost and accessibility. In every production context, the crop response to fertilization is uncertain because of numerous influencing factors and their interactions, as well as anomalies. The quantitative representation of these uncertainties affects the definition of best management practices in addition to the expected values. Consequently, a correctly implemented decision support system helps to determine fertilizer application rates that minimize the risks associated with fertilizer use, such as non-maximized profitability.

This thesis presents a framework for the implementation of production controls (rates of agricultural inputs), conceptualized as a decision support system (DSS) that takes spatial, temporal and managerial factors into account to assess the potential impact of crop fertilization (or of other amendments) on the profits of a given production scenario, in the context of uncertain information. The proposed DSS comprises a database, a user interface and a numeric simulation model. The online web interface of the DSS helps the user enter the site-specific production conditions. The database contains previously recorded fertility trials for a given agricultural input under different conditions. The numeric simulation process integrates the fertility trials, information on local conditions and economic considerations. The dynamic coupling of site-specific spatial, temporal and management factors to the production and profit functions makes it possible to estimate the economically optimal fertilizer rate, to maximize net return over the cost of fertilization, in order to increase the probability of success for each decision option. The discrepancies between estimated and recorded yield were considered production uncertainties, whereas price variations were considered economic uncertainties. The uncertainty-based treatment of each model input makes it possible to balance the potential outcomes of under-application or over-application.

The DSS allows historical on-farm trial data to be entered to enhance the existing database and to continuously improve its accuracy. With the current version of the database, the DSS could estimate the optimal application rate with the expected net profit and the probability of achieving that profit. Based on illustrated examples, higher application rates tend to increase NRCF estimates with more uncertainty compared with low application rates. The probability of NRCF reduction was higher when a lower application rate was accepted, and vice versa. The DSS proposed higher application rates when fertilizer costs were lower or the yield price was higher, and vice versa. The probabilities of profit were used to determine the most certain (least risky) profitable scenario according to individual risk preferences.

With the first version of the developed DSS, only nitrogen application for corn in Eastern Canada was used to illustrate the proposed framework. The DSS can be accessed online at http://www.numericag.com

Acknowledgements

This research was supported in part by Agriculture and Agri-Food Canada and the Fertilizer Canada Agri-Innovation Program “A Canadian Research Network to Improve 4R Nutrient Stewardship for Environmental Health and Crop Production.”

I would also like to thank Dr. Viacheslav Adamchuk for giving me the opportunity to work under his supervision. I hope I lived up to your expectations. This jump into the unknown was rewarding, and I learned a great deal from you, both personally and professionally. Thank you.

I am thankful to Dr. Rene Lacroix for advising and consulting throughout my research. Working with you was indeed a pleasure. I am also thankful to Dr. Nicolas Tremblay of AAFC for being part of my supervision committee and Dr. Yacine Bouroubi of EffiGIS for assembling the initial database.

I am grateful to Dr. Nandkishor Dhawale and Mrs. Bhakti Shinde for referring me to this opportunity and helping throughout the master’s admission process. I’d like to thank everyone within the Precision Agriculture and Sensor System (PASS) research team for timely help and knowledge sharing. Working alongside you all has been a great experience.

I would like to thank my family, who allowed me to pursue and explore my own career path. I thank my wife Sneha for spending the initial days without me and for her unconditional support. I also thank my friends and Sainte Anne buddies (Jay, Bhakti, Dhananjay, Nilesh, Nita-Nandu, Rasika, Manjurul, Russia), who helped, advised and supported me during tough times.

Preface and Contribution of Authors

This thesis involves partial collaboration with Agriculture and Agri-Food Canada (AAFC). The initial 1680 fertility trials were taken from Dr. Nicolas Tremblay of AAFC.

Dr. Adamchuk contributed significantly to management and coordination; he proposed the idea of the error to probability transformation and drafted the record similarity matching method.

Dr. Rene Lacroix of Valacta helped with data analysis and with building the initial prototype of the DSS, and provided thoughtful comments on the thesis chapters.

The initial prototype of the research presented in this thesis was presented at the 11th European Conference on Precision Agriculture (ECPA 2017), held at the John McIntyre Centre, Edinburgh, UK, and published as a refereed conference proceedings paper: Adamchuk, V., R. Lacroix, S. Shinde, N. Tremblay, and H. Huang. 2017. An uncertainty-based comprehensive decision support system for site-specific crop management. Advances in Animal Biosciences, 8(2), 625-629 (doi:10.1017/S2040470017000462).

1. Introduction

The world’s population is expected to reach 9.2 billion people by 2050; this is a 34 percent increase from the current number. To keep up with rising populations and to be able to feed the world, global food production must increase by 70 percent. Farmers need to adapt and benefit from advancements in technology to increase productivity, primarily using information systems for better decision making. A decision-support system (DSS) is a subsystem of an information system that takes historical data/knowledge as a base and applies governing rules/methodologies to formulate the decision outcome. Decision support is based on predictive analysis to help management foresee the uncertainties/behavior of the system’s stakeholders. A DSS may not automate every process, and it may require human intuition to choose between alternatives as a final decision, but it provides detailed insight or power to formulate the decision.

DSSs are widely accepted and used in many business and operational domains such as strategic planning, operational management, healthcare, e-commerce, and transportation. The agricultural sector also has started to benefit from the use of technology in many areas. Many companies are considering a complete precision agriculture solution that integrates all farming operations into one product, through the combined use of technology and information systems for most processes/operations. Tata Consultancy Services (TCS) developed a digital farming solution called ‘InteGra’ that uses precision agriculture tools for crop planning, aggregation and ordering, crop cycle management, harvest planning, etc. IBM’s ‘Deep Thunder,’ a massive computing system for predictive analysis, could be used to determine optimal times to plant, irrigate and harvest crops, based on the dynamic environmental conditions of individual farm locations.

Technological development expands in all directions. Likewise, in the agricultural sector, technological advancements allow farm operations to use smart machines, sensors and drones. Many agricultural consulting companies and agronomy researchers are developing applications and solutions that use technology for effective agricultural decision making. Some DSS are in use in the precision agriculture domain (Venkatalakshmi and Devi, 2014), based on crop modeling and GIS for spatial analysis (Densham 1991). Similarly, but with different approaches and methodologies, we propose the development of a new DSS for crop fertilizer optimization. Specifically, the aim is to maximize a farmer's profit with crop fertilization as the agricultural input.

1.1 Why do we need a new comprehensive DSS for fertilizer optimization?

According to Agriculture and Agri-Food Canada, nitrogen constitutes 74% of overall fertilizer usage (AAFC 2015), which in turn is proportional to the agricultural expenses attributed to fertilization (i.e., out of $5 billion in expenses, $4 billion is due to fertilization cost). On the other hand, nitrogen (N) is the most limiting nutrient for crop production in many of the world’s agricultural areas, and its efficient use is essential for the economic sustainability of cropping systems. The primary focus of this research was to provide decision support for farmers and producers in choosing the most economical application rate, one that maintains a balance between over-application and under-application, to resolve the issues mentioned here.

Farmers need decision support to understand the spatial heterogeneity within the field for site-specific nitrogen fertilization, and for the required N management practices (types and content of nutrients in fertilizers). Improving fertilizer efficiency requires the consideration of various management factors related to the soil (water, tillage), fertilization techniques (variable rates, application method) and crop. Also, there are specific challenges in implementing precision agriculture since there are not many formal DSS and no well-designed strategies that are flexible enough to incorporate the entire range of practices and management processes observed in practice (McBratney et al., 2005). The lack of information support or tools has been a significant roadblock to full adoption of precision agriculture. Per Lowenberg-DeBoer (1996): “One of the key factors limiting adoption of precision farming technology is the lack of decision support.” Various factors will be necessary to derive site-specific fertilization decisions to capture real heterogeneity. The proposed DSS will have a site-specific production function through site-specific input values, and a uniform, or generalized, profit function.

1.2 The problem statements

Fertilizer is one of the major agricultural inputs for crop growth. The over-application of fertilizers has an adverse impact on the environment and leads to soil degradation and increased costs, whereas application below a sufficient rate incurs profit losses. Application rates vary based on soil type, previous crop, and current managerial practices. What amount of fertilizer should be applied to the crop for a specific soil class, in specific weather, for a defined growing season and under uncertain variations in yield price and fertilizer costs? These are daunting questions that farmers have to deal with in performing their farming operations. The lack of agricultural information and the use of conventional practices are not much help in an ever-changing, dynamic environment with highly variable market conditions. Farmers need support, in the form of consultation or technological expert systems, that can address these concerns. The traditional approach of soil sampling and laboratory analyses to gain a better knowledge of specific field conditions requires the use of elaborate apparatus and procedures. This process is costly and time-consuming, due to the need for sampling, transportation, sample preparation and analysis (Rossel and Bouma 2016).

Although soil analyses provide useful information, this is not sufficient for the optimal management of fertilization. The crop yield response to N depends heavily on a variety of agronomic factors, and net revenues depend on economic considerations (yield price, fertilizer cost). While conventional recommended practices and approaches work at a high level in general, they do not go far enough into the growing conditions specific to a particular farm. The dynamic linking of these parameters to the production function is necessary to determine the in-season, site-specific N rate. The proposed DSS should dynamically address both the specifics of the agronomic issues and the economics of the underlying user context.

1.3 Research objectives

The goal was to develop a DSS that enables the determination of the economically optimum nitrogen rate (EONR), which gives maximum fertilization benefits (net return over cost of fertilizer). The DSS should provide comprehensive statistics of net return over cost of fertilizer (NRCF) with associated probabilities and expected fertilization benefits (EFB) at each variable nitrogen rate. The probabilities associated with expected profits at each fertilizer application rate should help farmers to perform a risk-based assessment of fertilizer rates. The DSS has to provide flexibility in determining how sensitive the decision outcome is to changes in the input features. The primary characteristics of the proposed DSS include the dynamic linking of spatial (soil characteristics or variability), temporal (economy and local climate) and management (crop rotation, tillage practices) factors to the underlying crop production. It should also dynamically consider different site-specific conditions (soil, weather, management) while taking into account the uncertainties associated with the economy.
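To make the EONR objective concrete, the following is a minimal sketch of how a rate that maximizes NRCF could be located by scanning candidate N rates. It is an illustration under stated assumptions, not the implementation used in NumericAg: the function names (`nrcf`, `find_eonr`, `toy_response`), the definition of NRCF as yield revenue minus fertilizer cost, the toy yield response, and all prices are hypothetical.

```python
from typing import Callable, Iterable, Tuple


def nrcf(n_rate: float, yield_fn: Callable[[float], float],
         yield_price: float, n_cost: float) -> float:
    """Net return over cost of fertilizer at one N rate.

    Assumed definition: yield revenue minus fertilizer cost. Subtracting the
    revenue at zero N instead would shift every value by a constant and
    leave the optimum rate unchanged.
    """
    return yield_price * yield_fn(n_rate) - n_cost * n_rate


def find_eonr(rates: Iterable[float], yield_fn: Callable[[float], float],
              yield_price: float, n_cost: float) -> Tuple[float, float]:
    """Return the candidate rate with the highest NRCF, and that NRCF."""
    best = max(rates, key=lambda n: nrcf(n, yield_fn, yield_price, n_cost))
    return best, nrcf(best, yield_fn, yield_price, n_cost)


def toy_response(n_rate: float) -> float:
    """Hypothetical concave yield response (t/ha) to N rate (kg/ha)."""
    return 6.0 + 0.04 * n_rate - 0.0001 * n_rate ** 2


# Hypothetical prices: 180 $/t of grain and 1.2 $/kg of N.
eonr, best_return = find_eonr(range(0, 251), toy_response,
                              yield_price=180.0, n_cost=1.2)
print(eonr, round(best_return, 2))
```

With these assumed numbers, lowering `n_cost` moves the selected rate upward, which is the qualitative behavior the abstract reports for the DSS.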

2. Literature Review

2.1 Production function

A decision support system (DSS) is a computer-based information system that supports business or organizational decision-making activities (Keen 1980). A DSS helps people to make decisions about problems that may be rapidly changing and not easily specified in advance, i.e., unstructured or semi-structured decision problems. A DSS can be fully computerized, human-powered or a combination of both.

Sprague (1980) defines DSS as follows:

• DSS tends to aim at the less well structured, underspecified problem that upper-level managers typically face.

• DSS attempt to combine the use of models or analytic techniques with traditional data access and retrieval functions.

• DSS specifically focus on features that make them easy to use by non-computer-proficient people in an interactive mode.

• DSS emphasizes flexibility and adaptability to accommodate changes in the environment and the decision-making approach of the user.

A properly designed DSS is an interactive, software-based system intended to help decision-makers compile useful information from a combination of raw data, documents, and personal knowledge, or business models to identify and solve problems and make decisions. Figure 1 explains the points mentioned above.

Figure 1. A conceptual overview of a typical DSS.

DSS are mainly categorized into five types based on their dominant functionality: data-driven, document-driven, knowledge-driven, communications-driven and model-driven (Power and Sharda, 2009). Each DSS type serves a specific purpose. A data-driven DSS provides access to, and manipulation of, large databases of structured data and, especially, time series of internal company data and external data. Data-driven DSS emphasize analysis of data.

Document-driven DSS, a new type of DSS, are evolving to help managers retrieve and manage unstructured documents and web pages. A document-driven DSS integrates a variety of storage and processing technologies to provide complete document retrieval and analysis. Knowledge-driven DSS suggest or recommend actions to managers. These DSS have specialized problem-solving expertise. The expertise consists of knowledge about a domain, understanding of problems within that domain, and skills at solving some of these problems. A communications-driven DSS enables cooperation, supporting more than one person working on a shared task; examples include integrated tools like Google Docs.

A model-driven DSS emphasizes access to, and manipulation of, a quantitative model. Simple statistical and analytical tools provide the most elementary level of functionality. Model-driven DSS use data and parameters provided by decision-makers to aid them in analyzing a situation, but they are not usually data intensive. Some OLAP systems that allow complex analysis of data may be classified as hybrid DSS providing modeling, data retrieval, and data summarization functionality. Each of the DSS types has a specific purpose and is classified according to its functional usage. Here, we are interested in model-driven DSS since we want to incorporate statistics and analytics to make the estimations.

Traditionally, academics and practitioners in information systems have discussed building DSS regarding four major components:

1. The user interface
2. The database
3. The models and analytical tools
4. The DSS architecture and network

The proposed DSS incorporates the above-specified components. The database consists of historical data/records, the input form acts as the user interface, the production function is the model used for inference and the web application (NumericAg) serves as a communication architecture over the World Wide Web (WWW), to facilitate the information exchange.

Since we want to develop the DSS for the agricultural domain, the DSS needs to accommodate the domain understandings, methodologies and concepts that are in use. For example, a DSS should use the agricultural factors, conditions and an appropriate model for crop estimation.

In the proposed DSS, a production function has to be used as the physical yield assessment function to estimate EONR. A quadratic-plateau (QP) approach is known to be a good fit for biological response (Cerrato and Blackmer, 1990; Bullock and Bullock, 1994; Bongiovanni and Lowenberg-DeBoer, 2000; Adamchuk, 2013; Adamchuk et al., 2017). Thus, we adopted the quadratic-plateau model for yield estimation in the proposed DSS. The production function involves agronomic parameters and is mostly affected by spatial and temporal parameters. The DSS should consider the site-specific production parameters while estimating the site-specific yield.
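As an illustration of the quadratic-plateau form, the sketch below evaluates a yield response that rises quadratically with N rate until the vertex of the parabola and stays flat afterwards. This is the generic textbook parameterization, not necessarily the exact one fitted in this thesis; the function name `quadratic_plateau`, the coefficient names `a`, `b`, `c`, and the example coefficient values are assumptions.

```python
def quadratic_plateau(n_rate: float, a: float, b: float, c: float) -> float:
    """Yield under a quadratic-plateau response.

    Below the joint point N* = -b / (2c) the yield is a + b*N + c*N^2;
    at and above N* it stays at the plateau value a - b^2 / (4c).
    Assumes b > 0 and c < 0 so that the parabola has a maximum at N* > 0.
    """
    if b <= 0 or c >= 0:
        raise ValueError("expected b > 0 and c < 0 for a plateau response")
    n_joint = -b / (2.0 * c)
    # Clamping the rate at the joint point yields the plateau for N >= N*.
    n = min(n_rate, n_joint)
    return a + b * n + c * n * n


# Hypothetical coefficients: 6 t/ha without N, plateau near 167 kg N/ha.
for rate in (0, 100, 200, 300):
    print(rate, round(quadratic_plateau(rate, a=6.0, b=0.05, c=-0.00015), 3))
```

Clamping the rate rather than branching on two formulas keeps the curve continuous at the joint point by construction.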

Bakhsh et al. (2000) comment that ‘‘the lack of temporal stability in either the large-scale deterministic structure or small-scale stochastic structure shows that crop yield variability is not only controlled by intrinsic soil properties but also by extrinsic variables such as climate and management factors.’’ In total, these factors may affect the spatial as well as temporal variability in measured crop yield. Tremblay et al. (2012) found that soil properties and weather conditions affect soil nitrogen (N) availability and plant N uptake. They showed that the corn response to higher fertilization was greater on fine-textured soil than on medium-textured soil. Abundant and well-distributed rainfall (AWDR) also enhanced the corn response, although to a lesser extent. Applying N at optimal rates has the potential to improve nitrogen use efficiency (NUE), crop yield, and profitability as well as to reduce environmental impacts (Kyveryga et al., 2009). Soil properties (including texture, water holding capacity, and fertility) strongly affect soil N availability and crop yield (Zhu et al., 2009; Armstrong et al., 2009).

2.3 DSS with an agriculture perspective

Every DSS should be aware of the context of the underlying domain. As discussed earlier, decision making in farming involves various management factors, such as soil (understanding physical soil properties), water (time and cycles), crop (determination of current crop, crop rotation), time (growing season, economy) and fertilizer (application rate and type). Variation in one of these factors will change the yield response. Hence, a DSS would be helpful to accommodate these complex, dynamic interactions between management factors and yield. Also, it will assist in analyzing or predicting the effects of amendments on the final decision outcome. A model-driven DSS approach allows for the automation of farming decisions and fertilizer optimization.

We discussed earlier how a database could be used to feed historical records into an inference engine that will mine and uncover some governing patterns as a basis for inferences. The database with pre-recorded trials is also applicable in the agricultural sector. Past historical data can be fed into the DSS inference engine to estimate yields. The inference engine or model is the brain of a DSS. The inference engine constitutes a function or algorithm based on the rules/methodology acquired in yield estimation. Sometimes external conditions have an adverse impact on the final decision, and the linkage of external entities with production function is necessary. In the agriculture sector, the external entities include yield price and fertilizer costs, and climate.

In agriculture, we are interested in uncertainties that affect the desired or maximum profits.

Antón (2009) defined agriculture risk as coming from production uncertainty and price uncertainty. Production uncertainty is the amount and quality of the output that will result from a given bundle of production decisions, which is not known in advance with certainty. Uncontrolled elements such as weather conditions play a fundamental role in agricultural production. Also, the market price of the output is typically not known at the time that production decisions are taken.

Inelastic demand is often cited as the primary explanation for agricultural price variability. Rapidly changing prices and uncertain weather patterns have all contributed to the risks that farmers are facing (Boehlje and Trede, 1977). In the proposed DSS, we intend to deal with the uncertainties by incorporating the probabilities associated with the uncertainty of the input parameters/features.

The proposed DSS is in line with precision agriculture practices and objectives and therefore falls under the precision agriculture umbrella. Very few DSS are available in precision agriculture, especially for fertilizer optimization. A DSS called DSS4Ag (Hoskinson et al., 1999) reduced fertilizer costs by 39.7% and increased yield by 3.3%, which resulted in a net economic gain of US $14.31/ha compared to the uniform application rate used by farmers. However, DSS4Ag does not consider most of the higher-level controls and is not flexible enough to handle ad hoc management and economic uncertainties or to incorporate them into production and profit functions. A standardized Canadian Agricultural Nitrogen Budget (CANB v2.0) was suggested by Yang et al. (2006). CANB is a national-level model that operates on 3500 Soil Landscapes of Canada (SLC) polygons using generalized soil, landscape, climate, and Census of Agriculture socioeconomic data. The SLCs are a series of GIS coverages that show the major characteristics of soil and land for the whole country.

These standalone decision support systems (DSS4Ag and CANB) are not publicly available on the World Wide Web. Therefore, they do not meet the requirement of reaching as many farmers/users as possible in an easily accessible way. The proposed online DSS can be used over the internet to reach the largest possible number of farmers, with easy access and good usability.

2.4 Precision agriculture and its practices

Precision agriculture is an advanced management concept in farming that comprises a set of technologies, systems, machinery, and resources to analyze, predict, plant, grow and monitor crops more effectively. It consists of measuring, managing and monitoring crops by understanding spatial variability and soil characteristics. It is also used to provide variable treatments to crops using modern technologies and equipment, such as Satellite Farming (use of GIS and GPS), Variable Rate Technology (VRT) and Site-Specific Crop Management, to optimize production and use all resources efficiently (Gebbers and Adamchuk, 2010).

Site-specific nitrogen management refers to the higher level and lower level controls involved in crop production (Anselin et al., 2004). It should account for soil characteristics, agronomy parameters and economics of the specific site in providing the controlled inputs (e.g., irrigated water, fertilizer, tillage). Site-specific N management corresponds to variable nitrogen treatment within the farm rather than a uniform application rate, considering the site conditions.

Precision agriculture technologies can be used to collect data, model and design experiments, and perform trials using the latest technologies such as sensors, on-the-go soil analysis, GIS, etc. (Assimakopoulos et al., 2003). In addition, economic factors (i.e., commodity prices, amendment costs) are major determinants of optimal nitrogen application rates and need to be coupled to production functions in a DSS.

2.5 Agriculture in Canada, and the importance of fertilization in an economic perspective.

Agriculture plays a vital role in Canada’s economy. According to Agriculture and Agri-Food Canada (AAFC, 2015), agriculture products account for 6.6% of Canada’s GDP. Also, the food service industry is the largest employer in the agriculture and agri-food sector, accounting for 5.7% of all Canadian jobs (approximately 2.3 million people). Therefore, it is essential to look at the various aspects, techniques and practices that can help maximize crop production. Agriculture expenses totaled $42 billion in 2013 (Figure 2), and crop fertilization was the second largest expenditure item after animal feed. It accounted for $5.0 billion, i.e., 11.7% of total expenditures (Figure 2). According to Fertilizer Canada, which represents manufacturers and wholesale and retail distributors of nitrogen, phosphate, potash and sulfur fertilizers, fertilizers play an essential role in helping to feed the world. World food production has more than doubled since 1960. Today, an estimated one-third to one-half of the global food supply is directly linked to the use of commercial fertilizers. If we are to meet future food demands, we need to double our current levels of production. Continuing to make better and more efficient use of fertilizer will help us feed the growing population.

Because fertilization contributes so much to agricultural expenses, efficient use of fertilizers will help to reduce or optimize these expenses. Therefore, it is critical from an economic perspective to work on economically optimum fertilization rates (EOFR). Farm fertilizer expenses include all costs associated with the purchase of fertilizer and lime, including application costs. In Canada, fertilizer expenses were estimated to have reached $4.95 billion in 2013, a decline of 6% over 2012, but still higher than the 2008-2012 average annual expense of $4.2 billion. Fertilizer expenses in 2014 were forecast to be flat when compared to 2013.

Figure 2. Farm operating expenses in Canada, 2013 (AAFC, 2015).

Fertilizers contain one or more of three essential nutrients: nitrogen, phosphate, and potassium. The nitrogen fertilizers that are currently used in Canadian agriculture are primarily anhydrous ammonia, urea, nitrogen solution, ammonium nitrate and ammonium sulfate. Figure 3 describes the usage of the major types of fertilizers in Canadian agriculture in 2009 and 2013.

Because of its importance to plant growth and development, nitrogen is the most common nutrient used in agricultural production, accounting for 74% of total fertilizer usage, or about 5 million tonnes in 2013. The use of nitrogen increased at an annual growth rate of 4.8% from 2009 to 2013, with urea representing the most substantial volume used. Phosphate fertilizers accounted for 19% of total fertilizer usage, or about 1.3 million tonnes in 2013. Potash fertilizer accounted for 7% of total usage, or about 0.5 million tonnes in 2013.

Figure 3. Fertilizer types, and usage in Canada between 2009 and 2013 (AAFC, 2015).

Annual historical data from 1981 to 2013 show that the elasticity of fertilizer consumption with respect to the seeded area of major grains and oilseeds was estimated to be 0.88 in Canada. In other words, on average, a 1% increase in the seeded area resulted in a 0.88% increase in fertilizer use. Given the total seeded area and other factors, Canadian fertilizer usage was estimated to be higher in 2013 than in the previous year.

2.6 Challenges of Nitrogen fertilizer management

Optimum N use efficiency is imperative from an agronomic, economic and environmental perspective. Fertilization can improve soils, but it can also degrade them if proper amounts are not used. The choice of a suitable kind of fertilizer and rate of application is fundamental for crop growth, for maintaining soil quality and for avoiding negative impacts on the environment. It was also found that profit can decrease as a result of either over- or under-recommendation of N fertilization (Bullock and Bullock, 1994).

Conventional practices can sometimes result in sizeable fertilizer N losses, especially in extremely wet springs in the Corn Belt (Mathesius and Luce, 2009). In fact, only 30 to 50% of applied N is recovered by the crop in many cases (Raun and Johnson, 1999). Lost N not only reduces profits (through lost fertilizer and reduced yields in N-deficient areas) but can also lead to environmental contamination (i.e., nitrate leaching or greenhouse gas losses). Selassie (2015) found that improving soil fertility is one of the major factors in increasing soil productivity.

Nitrogen (N) recommendation rates provided by agronomists and soil and fertilizer consultants vary by soil and by crop across Canada (Yang et al., 2006). Producers often over-apply N fertilizer to corn because of the uncertainty in predicting the economic optimum nitrogen rate (EONR) (Dellinger et al., 2008). Optimal rates of N for corn are difficult to determine because they depend mainly on the interactions between weather, soil and crop management factors (Tremblay et al., 2010).

As McBratney et al. (2005) note in their paper on the future directions of precision agriculture, the challenges are enormous, but they are also clear. Concerted and coordinated research efforts are needed in areas such as the determination of appropriate criteria for the economic assessment of precision agriculture and the recognition and quantification of temporal variations. It is therefore economically important to apply the economic optimum nitrogen rate (EONR). The proposed DSS is intended to address these challenges through a comprehensive assessment of production as well as profits, specific to the underlying scenario.

3 Materials and Methods

3.1 DSS architecture and overview

The proposed DSS (NumericAg) supports a user in determining the most profitable fertilization scenario as well as the scenario with the lowest expected risk of mismanagement penalties. Nitrogen application for corn in Eastern Canada is used to illustrate the initial structure of this DSS; that is, the system seeks to maximize the net return over the cost of nitrogen fertilization (NRCF). As discussed in Chapter 2, nitrogen fertilizer application, as an agricultural input, is linked with site-specific past management practices, soil characteristics and other external factors. To combine all of these factors in the decision outcome, the proposed model-driven DSS (Figure 4) consists of several key components: 1) a user interface, 2) a database, 3) access to public online resources, and 4) a numeric simulation engine. Each component has several modules that perform specifically defined tasks.

Figure 4. The architecture of the proposed decision support system.

Figure 4 defines the orchestration and composition of the proposed system and describes the internal and external entities involved in the information flow. The process begins when the user accesses the DSS through an interface (web form) and specifies his/her production context in the respective input features. A series of modules then executes sequentially: fetching past recorded fertility trials (database), retrieving online data (external factors), and numeric simulation and modeling (production function). Finally, the generated results are sent to the user by email, together with recommendations. A user can specify different scenarios to analyze the sensitivity of the outcomes to the agronomy parameters by providing flexible inputs/scenarios. The system is online and is not a standalone application.

The objective of optimizing fertilizer usage is to maximize fertilization benefits, by estimating the potential profits that farmers should achieve. Farmers would want to get maximum benefits for the investment they made in fertilization. Yield sale is the only source of income here to get returns on investments. Therefore, the total estimated yield multiplied by the yield price, minus the nitrogen cost will return the marginal fertilization gains at each possible nitrogen rate.

However, for the same site/farm, at the same nitrogen rate, the yield could be different for different seasons, because of variations in climatic conditions. Even if a farmer could achieve the same yield as that of the previous season, the profits could be different due to changes in yield price and/or fertilizer cost. The yield price or nitrogen cost for the growing season may change per supply and demand conditions, which in turn would affect net gains. Variations in prices are to be treated as uncertainties in expected profits. These uncertainties were modeled through the probability of yield achievements from the residuals between observed and estimated yields, and the probability of variations in crop price and nitrogen cost. First, we will look at yield estimation through the production function and the probability of achieving the yield. Then, we will see the cost and price structure and associated probabilities.

The yield response was calculated through a selected production function, which is itself a function of the nitrogen application rate and agronomy parameters. Uncontrolled agricultural inputs introduce production uncertainties, which is why the yield response to the same conditions varies from season to season; hence, different results could be obtained for the same set of inputs. The uncontrolled inputs include weather (temperature and precipitation) and economics (both yield price and fertilizer cost).

3.2.1 Modeling and numeric computation of NRCF

A quadratic-plateau (QP) equation was used as the production function to model corn yield in response to N fertilization that will estimate the probable NRCF. Thus, yield response to nitrogen fertilization was defined as:

$$
Y =
\begin{cases}
a_0 + a_1 N + a_2 N^2, & \text{for } N < N_{Y\max} \\
a_0 + a_1 N_{Y\max} + a_2 N_{Y\max}^2, & \text{for } N \ge N_{Y\max}
\end{cases}
\qquad (1)
$$

where Y is the crop yield (t/ha), N is the specified fertilizer application rate (kg/ha), and NYmax is the minimum fertilizer application rate that results in the maximum yield. The coefficients a0, a1 and a2 define the second-order polynomial representation of the yield response to N application rates below NYmax. The parameters of the equation can be defined through the physical parameters Y0 (yield with no fertilization), Ymax (maximum achievable yield) and NYmax. The coefficients of the second-order polynomial can be rewritten as:

$$ a_0 = Y_0 \qquad (2) $$

$$ a_1 = \frac{2\,(Y_{\max} - Y_0)}{N_{Y\max}} \qquad (3) $$

$$ a_2 = \frac{Y_0 - Y_{\max}}{N_{Y\max}^2} \qquad (4) $$

Figure 5 explains the yield estimation and the physical parameters used in the QP model.

NYmax is the optimum application rate in the sense that, beyond NYmax, adding more fertilizer has no effect on the yield response and leads to over-application. The difference between Y0 and Ymax is the estimated yield increase obtained by increasing the nitrogen fertilizer rate. The maximum yield is reached at the NYmax application rate, after which the yield does not change (it reaches a plateau). All possible sets of parameters (models) within a plausible range can be considered for each N rate in the range of application rates. For example, Y0 from 0 to 19.5 t/ha in 0.5 t/ha increments, Ymax from Y0 to 20 t/ha in 0.5 t/ha increments and NYmax from 0 to 250 kg/ha in 10 kg/ha increments were used.
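The quadratic-plateau model and the parameter ranges quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the NumericAg implementation: the function names `qp_yield` and `parameter_grid` are hypothetical, and treating NYmax = 0 as a flat response at Ymax is an assumption made here to avoid a division by zero.

```python
def qp_yield(n_rate, y0, y_max, n_ymax):
    """Quadratic-plateau yield (t/ha) at an N rate of n_rate (kg/ha)."""
    if n_rate >= n_ymax:
        # Plateau branch. It also covers n_ymax == 0, which is treated
        # here (an assumption) as "no response to N": yield stays at y_max.
        return y_max
    a0 = y0                                  # yield with no fertilization
    a1 = 2.0 * (y_max - y0) / n_ymax         # linear coefficient
    a2 = (y0 - y_max) / n_ymax ** 2          # quadratic coefficient
    return a0 + a1 * n_rate + a2 * n_rate ** 2


def parameter_grid(y_step=0.5, y_top=20.0, n_step=10, n_top=250):
    """Generate every (y0, y_max, n_ymax) combination in the quoted ranges."""
    n_y = int(round(y_top / y_step))         # number of yield steps up to y_top
    for i in range(n_y):                     # y0 = 0, 0.5, ..., 19.5
        for j in range(i, n_y + 1):          # y_max = y0, ..., 20
            for n_ymax in range(0, n_top + n_step, n_step):  # 0, 10, ..., 250
                yield i * y_step, j * y_step, float(n_ymax)
```

For instance, `qp_yield(75, y0=6.0, y_max=12.0, n_ymax=150.0)` evaluates the quadratic branch halfway to the plateau, while any rate at or above 150 kg/ha returns the plateau value of 12 t/ha. Integer step counters are used in the grid so that the 0.5 t/ha increments do not accumulate floating-point error.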

3.2.2 Expected profits and uncertainty

The profit (objective function) was specifically defined as the net return over cost of fertilization:

$$ NRCF = Y \cdot c_Y - N \cdot c_N \qquad (5) $$

where Y is the crop yield predicted as a function of the N fertilization rate (t/ha), cY is the price of the harvested crop ($/t), N is the specified fertilizer application rate (kg/ha) and cN is the cost of fertilizer ($/kg). The NRCF for the i-th case (one of the model combinations) was calculated using:

$$ NRCF_i = Y\left(Y_{0,i},\, Y_{\max,i},\, N_{Y\max,i},\, N\right) \cdot c_{Y,i} - N \cdot c_{N,i} \qquad (6) $$

where Y(Y0,i, Ymax,i, NYmax,i, N) is the yield function of equation 1 evaluated with the i-th set of parameters, so that equation 6 combines equations 1 and 5. The probability of obtaining a specific NRCF can be calculated as the joint probability of occurrence of each of its components, assuming that they are independent of each other. The probability of NRCFi was calculated from its three main components:

$$ p\left(NRCF_i\right) = p\left(Y_i\right) \cdot p\left(c_{Y,i}\right) \cdot p\left(c_{N,i}\right) \qquad (7) $$

where p(Yi) is the probability of yield Y for the i-th model, p(cY,i) is the probability of the i-th discrete value of the yield price cY, and p(cN,i) is the probability of the i-th discrete value of the fertilizer cost cN.
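The NRCF calculation and its joint probability can be illustrated with a short Python sketch. The function names, the representation of the price and cost distributions as lists of (value, probability) pairs, and the example prices are assumptions made for illustration; the independence of yield, price and cost is the assumption stated in the text.

```python
from itertools import product


def nrcf(yield_t_ha, crop_price, n_rate, n_cost):
    """Net return over cost of fertilization ($/ha): Y * cY - N * cN."""
    return yield_t_ha * crop_price - n_rate * n_cost


def nrcf_cases(yield_t_ha, p_yield, n_rate, price_dist, cost_dist):
    """Return (NRCF, probability) for every discrete price/cost combination.

    price_dist and cost_dist are lists of (value, probability) pairs
    (a hypothetical layout). The probability of each case is the product
    of the yield, price and cost probabilities, i.e. they are treated
    as independent.
    """
    cases = []
    for (price, p_price), (cost, p_cost) in product(price_dist, cost_dist):
        value = nrcf(yield_t_ha, price, n_rate, cost)
        cases.append((value, p_yield * p_price * p_cost))
    return cases


# Hypothetical example: 10 t/ha at 100 kg N/ha, two prices and two costs.
example = nrcf_cases(
    yield_t_ha=10.0,
    p_yield=1.0,
    n_rate=100.0,
    price_dist=[(160.0, 0.5), (200.0, 0.5)],   # $/t, made-up values
    cost_dist=[(1.0, 0.25), (1.5, 0.75)],      # $/kg, made-up values
)
```

Because each discrete distribution sums to 1, the probabilities of the four cases in `example` also sum to 1 when the yield probability is 1.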

[Figure 5 plot: corn yield (t ha^-1) against N application rate (kg ha^-1), showing the fertility trials and the positions of Y0, Ymax and NYmax.]

Figure 5. Example of a quadratic-plateau yield response model.

3.3 Database

A core component of this DSS is the assembled set of fertility trial records, or observations, from multiple fields under different spatial and temporal conditions, replicated for specific nitrogen fertilizer rates (from 0 to 200 kg/ha in increments of 50 kg/ha). The initial database, currently implemented in MySQL, includes 320 records of yield data replicated for these 5 N rates. This means a total of 1680 fertility trials, resulting from an extensive meta-analysis study (Tremblay et al., 2012). Each record is specific to a site and a year and includes the corn yield recorded under different nitrogen application rates as well as observed weather, soil conditions, and management practices.

• Soil type: the original database recorded the soil texture class for each trial. The clay ratio was then derived from the respective soil texture to quantify the soil series as a continuous numeric value (Whiting et al., 2014). This association between soil type and clay ratio is shown in Table 1.

Table 1. Soil texture class to clay ratio mapping.

Sr. No.   Soil texture class   Clay ratio
1         Sand                 0.03
2         Fine sand            0.05
3         Loamy sand           0.08
4         Loamy fine sand      0.08
5         Sandy loam           0.13
6         Fine sandy loam      0.17
7         Loam                 0.3
8         Silt loam            0.3
9         Sandy clay loam      0.33
10        Sandy clay           0.41
11        Clay loam            0.5
12        Clay                 0.66
13        Silty clay loam      0.77
14        Silty clay           0.81
15        Heavy clay           0.88


• Precipitation (AWDR, mm): The rainfall data were based on the concept of abundant and well-distributed rainfall (AWDR) proposed by Tremblay et al. (2012). In addition to being calculated numerically, AWDR was grouped into five classes so that a user can easily specify the anticipated precipitation conditions (Table 2):

Table 2. AWDR numeric to subjective categorization.

Sr. No.   AWDR (mm)   Category
1         0 – 30      Very Dry
2         30 – 60     Dry
3         60 – 90     Medium Conditions
4         90 – 120    Wet
5         120 – 150   Very Wet

• Tillage: The tillage practices (i.e., conventional tillage or no-till) were numerically assigned to binary values 1 for conventional till and 0 for no-till. It is possible to indicate intermediate tillage practices, such as minimum or strip tillage as an intermediate value.

• Previous crop: The N contribution of the preceding crop (e.g., weak, medium, strong) was recorded as well. Each category (Table 3) was transformed into a numerical value, assuming that each crop had a standard and uniform N contribution rate (0 for weak, 0.5 for medium, and 1 for strong).

Table 3. Previous crop contribution to soil nitrogen content.

Sr. No.   Crop                                                N contribution
1         Low nutrient contribution (e.g., corn, potatoes)    0
2         Moderate nutrient contribution (e.g., forage)       0.5
3         Strong nutrient contribution (e.g., legumes)        1

• Temperature (CHU, °C): The corn heat unit (CHU) expresses temperature measurements in °C (Bootsma et al., 2005). To make it easier to specify temperature conditions for a given site, CHU was grouped into five classes (Table 4).

Table 4. CHU numeric to subjective categorization.

Sr. No.   CHU (°C)     Category
1         500 – 600    Cold
2         600 – 700    Cool
3         700 – 800    Warm
4         800 – 900    Hot
5         900 – 1000   Very Hot

• N rate and yield: The database contains five different N rates of 0, 50, 100, 150 and 200 kg/ha and the recorded yield at the respective N rates.

3.4 Record similarity

The traditional approach in modeling is to retrieve records from a database through attribute values that are identical to the user-defined context, e.g., through SQL query, and fit a model with records that are exactly matched. This approach is ideal if we have an extensive dataset. An alternative is to compute a similarity index and use it in the modeling process on all records present in the dataset: records with higher similarity would play a more significant role in model assessment. Similarity indices are widely used in many domains such as in the Google search engine, which provides hyperlinks in an ordered manner, based on the query text match with documents, the best match results being on top of the order. It is possible this way to compute a distance, i.e., how far and how close a database trial (record) is from the user provided conditions (interchangeably used as features/inputs/parameters).

If values are missing for the selected features in any of the database records, they can be imputed by drawing a new value at random from the population distribution specific to each feature. There are various other approaches to replacing missing values with an artificial input value. However, if many values are missing for a specific feature in the database, the results will not be reliable. For this research, it was decided to ignore a record if the value of any similarity assessment feature was missing. This condition reduced the original 1,680 records/trials to a non-missing subset of 1,140 trials. For this reason, SOM (missing from the majority of trials) was not used to assess similarity at this time.
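The rule of ignoring any record with a missing similarity feature can be expressed as a simple filter. The sketch below assumes that records are dictionaries and that a missing value is stored as `None` or is absent from the dictionary; the field names are hypothetical and do not reflect the actual database schema.

```python
# Hypothetical field names for the five similarity features.
SIMILARITY_FEATURES = ("clay_ratio", "chu", "awdr", "tillage", "prev_crop")


def complete_records(records, features=SIMILARITY_FEATURES):
    """Keep only the records in which every similarity feature has a value.

    A value of 0 (e.g., no-till, or a weak previous-crop contribution)
    is valid data, so the test is "is not None" rather than truthiness.
    """
    return [
        record
        for record in records
        if all(record.get(name) is not None for name in features)
    ]


trials = [
    {"clay_ratio": 0.30, "chu": 650, "awdr": 75, "tillage": 0, "prev_crop": 0},
    {"clay_ratio": None, "chu": 700, "awdr": 40, "tillage": 1, "prev_crop": 1},
    {"clay_ratio": 0.81, "chu": 641, "tillage": 1, "prev_crop": 1},  # no AWDR
]
usable = complete_records(trials)
```

Only the first of the three made-up trials survives the filter, because the second has no clay ratio and the third has no AWDR value.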

3.4.1 Feature scaling

Several criteria were established for the development of the record similarity assessment mechanism. First, it should reflect the fact that some features are more important than others when assessing similarity. Second, it should be highly sensitive to the presence or absence of a match, especially for essential features. Third, it should be able to handle numeric, continuous features. The approach adopted was based on a product, which rapidly decreases the similarity between records towards its minimum value in the absence of a match for some features, which in turn decreases their impact on the model to be fitted. A power value was used to decrease the similarity index rapidly if the database record does not match the user context. To simplify the presentation, let us assume that the record representing the user context is u. The similarity λj,u between any j-th record (a database trial) and the user-specified record u (user context parameters) over all K features can be calculated as:

$$ \lambda_{j,u} = \prod_{k=1}^{K} \left( 1 - w_k \, \frac{\left| x_{k,j} - x_{k,u} \right|}{x_{k,\max} - x_{k,\min}} \right)^{q} \qquad (8) $$

where w_k is the weight (in the range 0 to 1) assigned to the k-th feature, x_k,j is the value of the k-th feature of record j (representing a trial ID in the database), x_k,u is the value of the k-th feature of record u (representing the user context), x_k,max and x_k,min are the maximum and minimum values of the k-th feature, and q is the power of similarity (a high value reduces the influence of records that do not match the user inputs identically). Another factor discussed earlier is the importance of each feature. Some features have a strong influence on the yield response, and such features should be given a high weight when calculating the similarity index. For example, the soil type has a greater influence on the yield response than the tillage system. The weight of each feature was assigned based on domain knowledge and can be changed in the future. Table 5 lists the features used in the similarity calculation and their weights.

Table 5. Features used in the similarity with their weights.

Feature (k)    SoilType   CHU   AWDR   Tillage   PrevCrop
Weight (w_k)   1          1     1      0.1       0.5
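A minimal Python sketch of the weighted, power-based similarity index is given below. It assumes the product form of equation 8, with each feature difference scaled by the feature range. The weights come from Table 5; the feature ranges are assumptions taken from the bounds of Tables 1 to 4 (the system may instead use the minimum and maximum observed in the database), and the dictionary keys are hypothetical.

```python
# Weights from Table 5.
WEIGHTS = {"soil": 1.0, "chu": 1.0, "awdr": 1.0, "tillage": 0.1, "prev_crop": 0.5}

# Assumed (min, max) of each feature, taken from the bounds of Tables 1 to 4.
RANGES = {
    "soil": (0.03, 0.88),      # clay ratio
    "chu": (500.0, 1000.0),
    "awdr": (0.0, 150.0),
    "tillage": (0.0, 1.0),
    "prev_crop": (0.0, 1.0),
}


def similarity(record, user, ranges=RANGES, weights=WEIGHTS, q=2):
    """Similarity between a database record and the user context (0 to 1)."""
    lam = 1.0
    for feature, weight in weights.items():
        low, high = ranges[feature]
        # Difference between record and user value, scaled to the [0, 1] range.
        scaled_gap = abs(record[feature] - user[feature]) / (high - low)
        # Each feature multiplies the index by (1 - weight * gap) ** q.
        lam *= (1.0 - weight * scaled_gap) ** q
    return lam


user = {"soil": 0.5, "chu": 750.0, "awdr": 105.0, "tillage": 1.0, "prev_crop": 0.5}
no_till = dict(user, tillage=0.0)   # differs from the user only in tillage
```

With these inputs, a record identical to the user context scores 1, while `no_till` is penalized only lightly because tillage carries a weight of 0.1. A record at the opposite end of a feature with weight 1 drives the whole product to 0, which is the strong sensitivity to mismatches on essential features that the criteria above call for.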

As discussed earlier, the categorical features were transformed into numerical values for the feature scaling process. Each attribute x of a feature k was associated with a specific numeric quantity: the user-specified soil type was mapped to the clay ratio, CHU and AWDR were mapped to a numeric value (the average of the lower and upper bounds of the selected category) and the previous crop was mapped to the previous N contribution numeric value. For equation 8 to produce similarity values in the 0 to 1 interval, all the features must be scaled to the interval [0, 1].

To illustrate further, assume that a user sets up a context in the input form by specifying his/her production-specific conditions. The first step is to load all the non-missing records from the database. To scale the user context features, the categorical values of each feature are converted to their corresponding numerical values, for example soil type to clay ratio and previous crop (moderate) to the previous N contribution numeric value (50). For AWDR, the average (105) of the lower bound (90) and upper bound (120) of the wet category was taken. The CHU numeric value (650) was obtained in the same way from its warm category. The clay ratio was derived from the soil type, and the numerical value of the tillage class was fetched from the lookup table for the respective category.

Table 6. User context attributes numerical transformation for record similarity.

User context   AWDR   CHU    PrevCrop   SoilType   Tillage
Category       Wet    Warm   Moderate   Clay       Conventional
Numerical      105    650    50         0.6667     1
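The conversion from user-selected categories to numeric values can be sketched with lookup tables. The dictionaries below transcribe an excerpt of Table 1 and the classes of Tables 2 and 3; their names, the short previous-crop keys and the helper `class_midpoint` are hypothetical. The CHU classes of Table 4 could be handled in the same way as the AWDR classes.

```python
# Excerpt of Table 1 (soil texture class -> clay ratio).
CLAY_RATIO = {"Sand": 0.03, "Loam": 0.3, "Clay loam": 0.5, "Heavy clay": 0.88}

# Table 2 (AWDR category -> lower and upper bound, mm).
AWDR_CLASSES = {
    "Very Dry": (0, 30),
    "Dry": (30, 60),
    "Medium Conditions": (60, 90),
    "Wet": (90, 120),
    "Very Wet": (120, 150),
}

# Tillage coding described in the text and Table 3 (previous crop coding).
TILLAGE = {"Conventional": 1.0, "No till": 0.0}
PREV_CROP = {"Low": 0.0, "Moderate": 0.5, "Strong": 1.0}


def class_midpoint(bounds):
    """Numeric value of a category: the average of its lower and upper bounds."""
    low, high = bounds
    return (low + high) / 2.0


# Hypothetical user context expressed as categories, then converted.
context = {
    "awdr": class_midpoint(AWDR_CLASSES["Wet"]),
    "soil": CLAY_RATIO["Clay loam"],
    "tillage": TILLAGE["Conventional"],
    "prev_crop": PREV_CROP["Moderate"],
}
```

For the wet category, the midpoint of 90 and 120 mm gives the value of 105 used in the example above.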

The numerically transformed values were used in Equation 8 for each feature to calculate the total similarity coefficient (λ) for each database trial. For the above user-specified scenario, the maximum similarity was found for trial ID 261, with a computed similarity of λ = 0.51. The lowest similarity, 0.003, was found for database trial ID 855 (Table 7). In other words, the user context was closest to database fertility trial ID 261 (roughly a 50% overall match).

Table 7. The Similarity illustration between the user context and database trials.

ID    AWDR   CHU   SoilType     ClayRatio   Tillage        PrevCrop   Yield   λ
261   120    641   Silty clay   0.81        Conventional   Soybean    8.75    0.513
855   21     707   Sand         0.03        No till        Corn       7.06    0.003

Each record from the database has a similarity index computed against the user scenario (context), ranging from 0 to 1; the closer the index is to 1, the more similar the record. For the above scenario, with the power (q) set to 2, the lambda values ranged from 0.51 (highest) to 0.0037 (lowest). For illustration purposes, the database trials were ordered by increasing clay ratio. The lambda values were then obtained and compared for three independent user scenarios (user requests) that differed only in soil category: sand in scenario 1, clay loam in scenario 2 and heavy clay in scenario 3 (Figure 6). The same set of underlying database records was used for all three scenarios.

As expected, because the trials were ordered by clay ratio, the lambda value for the sand scenario was close to 1 for the initial set of sandy trials (an exact or near-exact match between the user soil and the database trial soil) and declined as the clay ratio increased. The same pattern was observed with clay loam: lambda was close to 1 in the middle portion of the database records, where the clay ratio is very close to that of clay loam (0.5). For the heavy clay scenario (clay ratio 0.88), lambda was low initially, increased with the clay ratio, and was close to 1 at the end portion of the database trials. Together, these scenarios show that the lambda value is sensitive to the difference between the user-specified soil and the soil of the database record/trial.

Figure 6. Illustrating λ comparison between the soil classes.

These λ values are the product, over all features, of the similarity between the user-provided value and the database trial value for that feature. This means that the final λ value is the product of the soil feature similarity, AWDR feature similarity, previous crop similarity and CHU similarity. Each feature contributes its own similarity, which increases or decreases the overall similarity value.

Figure 6 represents the total similarity, with the database trials ranked by increasing clay ratio for illustration purposes. Three independent user requests were submitted in turn, with the soil class set to sand, clay loam and heavy clay. The similarity of the soil feature alone was then computed between the user-provided soil type and the soil type of the database trials for the sand, clay loam and heavy clay scenarios, using the same dataset (Figure 7). The similarity value was found to be correlated with the closeness between the user’s clay ratio (sand, clay loam or heavy clay) and the clay ratio of each database trial. The soil similarity was at its maximum, or close to 1, when the user-specified soil type and the soil type of the database trial were very similar. Similar patterns were observed for the clay loam and heavy clay classes.

[Figure 6 plot: λ values (0 to 0.7) for all the database trials ranked according to clay ratio; series: Sand, Clay Loam, Heavy Clay.]

Figure 7. The illustration of λ values for the soil feature similarity.

Another factor to consider in the similarity assessment is the value of the power (q). The power lowers the lambda values in the absence of a match between the user and database features. For that reason, it is important to use a balanced value of the power. For illustration purposes, an arbitrary scenario was compared under different powers (1, 2 and 5) (Figure 8). For the same scenario, the similarity values decreased rapidly with increasing power. In other words, the similarity became more sensitive to the presence or absence of a match as the power increased: a high power penalized mismatches heavily, whereas a low power did not decrease the similarity as rapidly. A power of 2 was considered to be a balanced choice with respect to the current set of database trials. In the future, the value of the power can be re-analyzed and changed if new database trials become available.
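A small worked example shows how quickly the power reduces the similarity. Let b denote the value of the similarity product before the power is applied (a notation introduced here for illustration), and take a hypothetical record with b = 0.7:

```latex
\begin{aligned}
q = 1:&\quad \lambda = 0.7^{1} = 0.70 \\
q = 2:&\quad \lambda = 0.7^{2} = 0.49 \\
q = 5:&\quad \lambda = 0.7^{5} \approx 0.17
\end{aligned}
```

A record that matches exactly (b = 1) keeps λ = 1 for every power, so increasing q widens the gap between matching and non-matching records.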

[Figure 7 plot: λ values (0 to 1.2) for all the database trials ranked according to clay ratio; series: Sand, Clay Loam, Heavy Clay.]