
How does this Research Methodology facilitate the development of the [...] of this Study and add to the existing Body of Knowledge?

The concept of knowledge in the data mining context, as discussed in Chapter 2 and detailed in [160], is the generation of patterns that benefit the user and task, with the value of such patterns being their interestingness. In this regard, the aim of this study was to adopt a research methodology that leveraged well-recognised research frameworks, data analytical techniques, and mathematical principles to generate patterns that are interesting, and thus produce knowledge. Further, given that the research paradigm is “functionalist”, and that the researcher has taken a “value-free, observer of scientific fact” approach, the patterns and knowledge that result from this research approach are reproducible, scientifically coherent, and will be the same irrespective of who or what conducts the analysis [147].

Given this, the proposed research methodology has led to the development of the proposed targeted promotions algorithm, and this linkage is detailed throughout this study. The unique contributions of this study are also detailed in several sections throughout this study with the premise that: (1) it is unique, because it has not been done before, (2) it is effective, because it achieves the primary business objectives of the study, and (3) it is additive to knowledge, because it is comparatively better than other existing approaches. Based on this, it can be concluded that the proposed targeted promotions algorithm adds to the existing body of knowledge, and thus the proposed research methodology has been effective in supporting this objective.

3.5 Summary

The discussion on the development and justification of the research methodology adopted as part of this study highlighted several key points:

• The “functionalist” paradigm adopted as part of this study is well-supported in data mining research in that it is scientific and produces results that are reproducible and generally free from researcher bias.

• Amalgamating research methodologies to incorporate the development of data mining theory, computer-based algorithms, and the business context is an effective way to develop data mining solutions that address business challenges and contribute to knowledge.

• Testing the proposed algorithm against a variety of conditions is an effective way to demonstrate the uniqueness of the research, the contribution of the research to the existing body of knowledge, and to underscore the validity of the research approach that was adopted.

Mathematical Model and Algorithm

This chapter details the mathematical model and computer-based algorithm that formed the foundation of this study. It commences by mathematically formalising several definitions that have been used throughout this study, followed by the development of the mathematical model, which has two parts: identifying target items/itemsets, and identifying target customers. The marketing simulation approach is then discussed, before the chapter concludes with an outline of the computer-based algorithm and a summary of key points.

The unique contributions of this chapter, developed using the research methodology detailed in Chapter 3, are as follows:

• The creation of a novel, yet simple, mathematical model to support generalised decision making, that can not only significantly improve MBA, but can be applied to other fields that have similar decision-making constructs.

• The contribution to the overall body of knowledge on MBA, in particular customer and itemset targeting.

• The creation of a new algorithmic approach to improve targeted promotions within the retail sector, and which can be applied to other fields that have similar decision-making constructs.

4.1 Definitions

The following definitions are used throughout this study:

Items: Items are defined as per the original definition given in [5]. Let $I = \{I_1, I_2, \ldots, I_m\}$ be the set of all items, with the assumption that the quality and quantity of each item $I_i$ ($i = 1, \ldots, m$) remain constant across all stores, and that customers do not stockpile. These assumptions were necessary to ensure consistency of items across all stores.

Customers: Customers represent households, with all members within the household viewed as one customer. This approach is consistent with practice, in that loyalty programs and retail analysts such as Kantar typically consider all members of a single household as one customer [42][90]. A customer is denoted by $U$ and represents a household of size $f$, $f > 0$, with $U$ purchasing subsets of $I$, referred to as transactions, or baskets.

Transactions: All purchases are made in the form of transactions, or baskets, each of which contains a subset of $I$. For example, $T_S = \{I_2, I_9, I_{11}, \ldots, I_x\}$ is a single transaction from store $S$. Itemsets are defined as subsets of all items within a transaction. Customers make one transaction per time period, $W$, per store, and this study uses $W = 1$ week. This assumption takes into consideration the practical aspects of shopping, where the generally accepted length of a shopping period is one week, as it is in line with how most people plan their household activity [47][90][141].
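As an illustration only, the definitions above can be sketched as a small data structure. The class name `Transaction`, the helper `one_basket_per_week`, the identifiers such as "U1" and "S1", and the item names are all hypothetical and do not come from this study; the sketch simply assumes that a basket is a set of items tied to one household, one store, and one week.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One basket: the items a household bought at one store in one week."""
    customer: str      # household identifier U (hypothetical label)
    store: str         # store S (hypothetical label)
    week: int          # index of the time period W, with W = 1 week
    items: frozenset   # a subset of the item set I


def one_basket_per_week(transactions):
    """Check the assumption of one transaction per customer, store and week."""
    keys = [(t.customer, t.store, t.week) for t in transactions]
    return len(keys) == len(set(keys))


# Illustrative item set I; the item names are assumptions.
ITEMS = frozenset({"butter", "cheese", "bread", "milk"})

baskets = [
    Transaction("U1", "S1", 1, frozenset({"butter", "cheese"})),
    Transaction("U1", "S1", 2, frozenset({"butter", "bread"})),
    Transaction("U2", "S1", 1, frozenset({"milk"})),
]

# Every basket must be a subset of I.
assert all(t.items <= ITEMS for t in baskets)
print(one_basket_per_week(baskets))
```

Using immutable `frozenset` values mirrors the set-based definition of a transaction and makes the subset test against $I$ a single comparison.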

4.1.1 Support and Confidence

Both support and confidence are central to MBA. The standard definitions of support and confidence, as outlined in [5], were used throughout this study and are detailed in Equations (4.1) and (4.2).

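$$\text{support of item } I_i\big|_S = \mathrm{supp}(I_i)\big|_S = \left.\frac{\text{Number of transactions containing } I_i}{\text{Total number of transactions}}\right|_S \tag{4.1}$$

$$\text{confidence of item } I_i \text{ leading to items } I_i I_j\big|_S = \mathrm{conf}(I_i \rightarrow I_i I_j) = \left.\frac{\text{Number of transactions containing } I_i \text{ and } I_j}{\text{Number of transactions containing } I_i}\right|_S \tag{4.2}$$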
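A minimal sketch of how the two ratios could be computed for a single store is given below. It is not the implementation used in this study: the function names `support` and `confidence`, the list `store_s`, and the item names are assumptions made for illustration, and each transaction is assumed to be a set of item labels.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        raise ValueError("support is undefined for an empty transaction list")
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    if n_antecedent == 0:
        raise ValueError("confidence is undefined when the antecedent never occurs")
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_antecedent


# Illustrative transactions from one hypothetical store S.
store_s = [
    frozenset({"butter", "cheese"}),
    frozenset({"butter", "bread"}),
    frozenset({"butter", "cheese", "milk"}),
    frozenset({"milk"}),
]

print(support({"butter"}, store_s))                 # 3 of 4 transactions
print(confidence({"butter"}, {"cheese"}, store_s))  # 2 of the 3 butter transactions
```

In this toy data, butter appears in three of the four transactions, and two of those three also contain cheese, so the support of butter is 3/4 and the confidence of butter leading to butter and cheese is 2/3.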

In addition to the above, two user-defined parameters are defined as follows:

• Minimum support, $minsup$, is defined as the minimum support required for an item or itemset to be frequent. Like $\mathrm{supp}(I_i)$, $minsup$ is a probability-based parameter, and hence $0 < minsup \le 1$. Note that $minsup \ne 0$ is a practical constraint, as it is nonsensical to speak of transactions or databases with zero items.

• Minimum confidence, $minconf$, is defined as the minimum confidence required for two or more items to be associated. As with $minsup$, $0 < minconf \le 1$ and $minconf \ne 0$ are practical constraints, as the underlying principle of ARM is to look at items that are associated, that is where conf(Ii

4.1.2 The Apriori Principle

The Apriori principle, first detailed in [7], remains fundamental to the study of MBA, and is a direct result of probability theory. The Apriori principle is defined as follows: for a given set of transactions, $\mathrm{supp}(I_i) \ge \mathrm{supp}(I_i, I_j)$, where $I_i$ and $I_j$ are two items contained within these transactions. From a probability theory perspective, Apriori may be written as $P(A) \ge P(A \cap B)$. In practical retail terms, this means that the number of transactions that contain butter is always greater than or equal to the number of transactions that contain both butter and cheese. Given that support and probability are interchangeable, this study uses $\mathrm{supp}(I_i)$ and $P(I_i)$ interchangeably to denote support.
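The inequality above can be checked numerically, and it is also what allows infrequent items to be discarded before pairs are counted. The sketch below is illustrative only and is not the algorithm developed in this study: the functions `apriori_holds` and `frequent_pairs`, the sample transactions, and the threshold of 0.5 are all assumptions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def apriori_holds(transactions):
    """Verify supp(Ii) >= supp(Ii, Ij) for every pair of items in the data."""
    items = sorted(set().union(*transactions))
    return all(
        support({a}, transactions) >= support({a, b}, transactions)
        and support({b}, transactions) >= support({a, b}, transactions)
        for a, b in combinations(items, 2)
    )


def frequent_pairs(transactions, minsup):
    """Count only pairs built from frequent single items (Apriori pruning)."""
    items = sorted(set().union(*transactions))
    # A pair cannot be frequent unless both of its items are frequent.
    frequent_items = [i for i in items if support({i}, transactions) >= minsup]
    return {
        frozenset(pair): support(pair, transactions)
        for pair in combinations(frequent_items, 2)
        if support(pair, transactions) >= minsup
    }


# Illustrative transactions; the item names are assumptions.
sample = [
    frozenset({"butter", "cheese"}),
    frozenset({"butter", "bread"}),
    frozenset({"butter", "cheese", "milk"}),
    frozenset({"milk"}),
]

print(apriori_holds(sample))
print(frequent_pairs(sample, minsup=0.5))
```

With a threshold of 0.5, bread (support 0.25) is dropped before any pair is formed, so only three candidate pairs are counted instead of six, and only the butter and cheese pair meets the threshold.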

4.2

The Market Target Model for Identifying Tar-