1.2 Thesis Scope and Contributions
2.1.1 Customer Segmentation
Psychographic Segmentation
Psychographic segmentation aims to segment people in terms of how they think, feel, and act. The study is typically done by having customers complete a survey containing attitudinal and behavioral questions. Pedersen [2008] performs a psychographic segmentation to find customer groups based on their behaviors and attitudes toward electricity and conservation. The behaviors and attitudes are further detailed into several categories, such as how they use lighting, plug-in devices, dishwashing, laundry, space heating/cooling, and water. He finds that the customers can be divided into six customer groups, i.e., (from weakest to strongest conservation ethic) tuned-out & carefree, stumbling proponents, comfort seekers, entrenched libertarian, cost-conscious practitioners, and devoted conservationists. The result is then used by a utility company (whose customers are studied by Pedersen) for its eNewsletter campaign and conservation programs, such as selecting a group of customers to be conservation role models in their community.
In addition to customer behaviors and attitudes toward energy conservation, Sütterlin et al. [2011] also survey customer attitudes in purchasing new cars, their attitudes toward using public transport or their own cars, and their acceptance of public policies, such as renewing old power plants or increasing the price of appliances/cars with high energy consumption. As a consequence, while Pedersen’s segmentation is more about how people conserve energy, Sütterlin et al. also take into account customer attitudes in purchasing energy-saving appliances or goods. They characterize customers into six segments: idealistic energy-savers, selfless inconsequent energy-savers, thrifty energy-savers, materialistic energy consumers, convenience-oriented indifferent energy consumers, and problem-aware well-being-oriented energy consumers. The existence of the selfless inconsequent energy-savers is particularly interesting since they are inconsistent in translating their thinking into action, i.e., they are highly aware of the energy conservation problem but less energy-aware in their purchasing decisions.¹
In contrast to previous works, Sanquist et al. focus on how people consume energy, e.g., times per week the oven is used, dishwasher loads per week, hours per week the TV/computer is on, total hours the lights are on per day, and the size of the air-conditioned (AC) area in the house [Sanquist et al., 2012]. Additionally, they also develop a segmentation based on where customers live, i.e., city, town, suburb, and rural area. They find that people who live in the city consume the least amount of energy due to their low energy consumption for AC and laundry. Despite their low use of AC, people who live in rural areas consume the largest amount of energy due to their high energy consumption for laundry.
Load Pattern Segmentation
Psychographic segmentation relies on customers’ answers to a survey. However, customers’ answers might not actually be in line with what they do [Peattie, 2001]. To address this, load pattern segmentation aims to cluster customers based on how they consume energy in reality, i.e., based on their metered consumption.
Segmenting Commercial and Industrial Customers. There is a large body of literature on customer segmentation using load patterns of commercial and industrial customers to improve tariff structures (see, e.g., [Chen et al., 1997, Chicco et al., 2003, Figueiredo et al., 2005, Kitayama et al., 2002, Ramos and Vale, 2008, Ramos et al., 2007, Tsekouras et al., 2007]). Chen et al. [1997] perform the segmentation simply based on customer contractual data, i.e., customer activity type (commercial or industrial) and voltage level (low, high, or extra high). However, Chicco et al. [2003] show that grouping customers based on their contractual data might be ineffective in characterizing their electricity consumption behavior.
Therefore, several studies propose features that are derived from customer consumption data. Chicco et al. [2003], Figueiredo et al. [2005], and Ramos and Vale [2008] propose features based on statistics of customer daily consumption, such as the average, minimum, or maximum power demand during the day,² night impact, and lunch impact. Night impact is defined as the ratio between the average consumption during the night (23:00–06:00) and during the day (06:00–23:00), while lunch impact is defined as the ratio between the average consumption during lunch time (12:00–14:00) and during the day (06:00–23:00). In contrast, Ramos et al. [2007], Tsekouras et al. [2007], and Chicco [2012] propose features that represent a customer's typical daily load curve. More specifically, they divide the day into T time slots and define a feature vector of length T, where the ith element is the customer's typical consumption at the ith time slot.
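The feature definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: readings are hourly (T = 24), "typical" consumption is taken to be the mean over the observed days, and the function names and the sample customer are hypothetical rather than taken from the cited works.

```python
from statistics import mean

def hours_in(start, end):
    """Hours h with start <= h < end on a 24-hour clock, wrapping past midnight."""
    if start < end:
        return list(range(start, end))
    return list(range(start, 24)) + list(range(0, end))

def window_mean(hourly_load, start, end):
    """Average consumption over the window [start, end)."""
    return mean(hourly_load[h] for h in hours_in(start, end))

def night_impact(hourly_load):
    """Average night consumption (23:00-06:00) over average day consumption (06:00-23:00)."""
    return window_mean(hourly_load, 23, 6) / window_mean(hourly_load, 6, 23)

def lunch_impact(hourly_load):
    """Average lunch-time consumption (12:00-14:00) over average day consumption (06:00-23:00)."""
    return window_mean(hourly_load, 12, 14) / window_mean(hourly_load, 6, 23)

def typical_load_curve(daily_loads):
    """Feature vector of length T: element i is the mean consumption in slot i
    across all days (assumption: 'typical' means the per-slot mean)."""
    return [mean(slot) for slot in zip(*daily_loads)]

# Hypothetical customer: 1 kW at night, 2 kW during the day, 4 kW at lunch time.
day = [1.0] * 24
for h in range(6, 23):
    day[h] = 2.0
for h in (12, 13):
    day[h] = 4.0

print(night_impact(day), lunch_impact(day))
print(typical_load_curve([[1.0, 2.0], [3.0, 4.0]]))
```

For this sample customer the night impact is below 1 and the lunch impact above 1, which is the kind of contrast these ratio features are meant to expose.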
Several unsupervised learning algorithms have also been employed, such as kMeans [Tsekouras et al., 2007], hierarchical clustering [Ramos et al., 2007], and Self-Organizing Maps (SOM) [Figueiredo et al., 2005]. After customer segments have been built and a new tariff structure has been proposed, a supervised learning algorithm, e.g., a decision tree, can also be trained on top of the clusters to classify new customers and determine their tariffs [Figueiredo et al., 2005, Ramos and Vale, 2008, Ramos et al., 2007].

¹ This fact also supports Peattie’s observation that green purchasers are not necessarily the same persons as green consumers, and vice versa [Peattie, 2001].
² The ratio among them can also be considered, such as average/maximum, minimum/maximum, or mini-
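To make the cluster-then-classify idea concrete, the following is a minimal sketch of kMeans (Lloyd's algorithm) on typical daily load curves. Several things here are assumptions for illustration only: the four-slot curves are made-up data, the centroids are initialized with the first k curves to keep the run deterministic, and a new customer is assigned by nearest centroid, which is a simpler stand-in for the decision tree trained in the cited works.

```python
import math

def distance(a, b):
    """Euclidean distance between two load curves of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(curve, centroids):
    """Index of the centroid closest to the given curve."""
    return min(range(len(centroids)), key=lambda j: distance(curve, centroids[j]))

def kmeans(curves, k, iterations=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    # Assumption: deterministic initialization with the first k curves.
    centroids = [list(c) for c in curves[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(c, centroids) for c in curves]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [c for c, l in zip(curves, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical typical load curves with T = 4 slots per day.
curves = [
    [1.0, 1.0, 5.0, 5.0],  # late peak
    [5.0, 5.0, 1.0, 1.0],  # early peak
    [1.2, 0.8, 5.2, 4.8],  # late peak
    [4.8, 5.2, 0.8, 1.2],  # early peak
]
centroids, labels = kmeans(curves, k=2)
print(labels)

# Assigning a new customer to an existing segment (nearest-centroid stand-in).
new_customer = [0.9, 1.1, 4.9, 5.1]
print(nearest(new_customer, centroids))
```

In a tariff setting, the segment index returned for the new customer would then select the tariff designed for that segment.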
Segmenting Residential Customers. Due to the recent deployment of smart meters, there has been a growing interest in customer segmentation that focuses on residential customers. Similar to the works described previously, Räsänen and Kolehmainen [2009] and Flath et al. [2012] perform a customer segmentation to improve customer tariff structures.
When a new customer joins, a utility company typically associates her with a customer class based on her house type (e.g., detached, terraced), heating type (e.g., electric heating or not), and activity type (e.g., spare-time cottage, agriculture residence). In other words, the utility company performs psychographic segmentation. Instead, Räsänen and Kolehmainen propose to segment customers based on statistical features derived from their energy consumption, such as mean, standard deviation, skewness, kurtosis, etc. Räsänen and Kolehmainen then apply the kMeans algorithm and, by using the Index-of-Agreement,³ they show that the newly developed segments perform better than the segmentation originally developed by the utility company.

Similar to Räsänen and Kolehmainen, Flath et al. also apply the kMeans clustering algorithm. However, they use the customer load curve as features and aim to provide insight specifically into designing time-of-use pricing. They suggest four key steps: determining the number of time zones, identifying the starting time for each time zone, determining the price for each zone (one price could apply to several zones), and maximizing supplier profit (by deriving demand elasticity from field tests).

Segmenting Daily Load Curve. In contrast to the works described above, which aim to cluster customers, Pitt and Kirschen [1999], Cao et al. [2013], and Kwac et al. [2014] focus on clustering customers’ daily load curves. More specifically, given a set of (normalized) daily load curves collected from all customers, they aim to group similar curves in the same cluster. Thus, it is possible that the load curves of a single customer belong to several clusters. Pitt and Kirschen apply decision tree clustering (using day, month, and load factor as explanatory variables), while Cao et al. employ hierarchical clustering, SOM, and kMeans, and Kwac et al. use adaptive kMeans (setting a distance threshold parameter instead of k). Both Cao et al. and Kwac et al. use the customer daily load curve as features, and in contrast to the works presented up to this point (including Pitt and Kirschen’s), they aim to support demand response (DR) implementations and energy efficiency programs. For example, after clusters have been identified, utility companies could then classify the daily load curves of each customer and (depending on the program) target customers that have stable daily patterns (more predictable), or customers with peak demand at a particular time of day.
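The idea of fixing a distance threshold instead of k can be illustrated with a small sketch. It is not the algorithm of Kwac et al.: it is a simplified single-pass threshold scheme (often called leader clustering), and the normalization by daily total, the threshold value, and the sample curves are all assumptions made for illustration.

```python
import math

def normalize(curve):
    """Scale a daily load curve so its slots sum to 1, keeping shape but not magnitude
    (assumption: normalization by the daily total)."""
    total = sum(curve)
    return [x / total for x in curve] if total > 0 else list(curve)

def distance(a, b):
    """Euclidean distance between two curves of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def threshold_cluster(curves, threshold):
    """Single pass: a curve joins the nearest centre if it lies within `threshold`,
    otherwise it opens a new cluster. The number of clusters is not fixed in advance."""
    centres, counts, labels = [], [], []
    for curve in curves:
        best = None
        if centres:
            best = min(range(len(centres)), key=lambda i: distance(curve, centres[i]))
        if best is None or distance(curve, centres[best]) > threshold:
            centres.append(list(curve))
            counts.append(1)
            labels.append(len(centres) - 1)
        else:
            counts[best] += 1
            # Incremental update of the running mean of the cluster.
            centres[best] = [c + (x - c) / counts[best]
                             for c, x in zip(centres[best], curve)]
            labels.append(best)
    return centres, labels

# Hypothetical daily curves (T = 4 slots): three days of customer A, one day of customer B.
raw_days = [
    [1.0, 1.0, 4.0, 4.0],  # A, day 1: late peak
    [2.0, 2.0, 8.0, 8.0],  # A, day 2: same shape, higher level
    [4.0, 4.0, 1.0, 1.0],  # A, day 3: early peak
    [8.0, 8.0, 2.0, 2.0],  # B, day 1: early peak
]
centres, labels = threshold_cluster([normalize(d) for d in raw_days], threshold=0.2)
print(labels, len(centres))
```

In this made-up example, the days of customer A end up in two different clusters, while one day of A shares a cluster with customer B, which mirrors the observation that the load curves of a single customer may belong to several clusters.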