Capturing Value from Big Data through
Data-Driven Business Models
Patterns from the Start-up world
Philipp Hartman,
Dr Mohamed Zaki and Prof Duncan McFarlane Cambridge Service Alliance
“Data is the new oil”
1
0 5000 10000 15000 20000 25000 30000 35000 40000 45000Two general areas can be identified where
big data creates value
3
How to get value from Big Data?
Optimization of
existing business
1New Business
Models
1Chesbrough, Rosenbloom (2002):
Business model to capture value from an innovation
Crisculo (2012):
New technologies often first commercialized through start-up companies
Based on this motivation the research
question was developed
What types of business models that rely on data as a key resource (i.e.
data-driven business models) can be found in start up companies?
How to analyse data-driven business models? Su b q u e sti on s How to identify patterns? R es ear ch Que sti on
The research was done in five steps
5 Case studies Finding Patterns Data collection & coding Build the framework Literature ReviewHow to analyse data-driven business
models?
How to identify patterns?
The first step was a literature review with
three different topics
Li ter atu re R evie w Big Data Definition Value Creation Business Model Definition Business Model Frameworks Data driven business
Case studies Finding Patterns Data collection & coding Build the framework Literature Review
The first step was a literature review with
three different topics
7 Li ter atu re R evie w Big Data Definition Value Creation Business Model Definition Business Model Frameworks Related Work
Data driven business Models Cloud business models Case studies Finding Patterns Data collection & coding Build the framework Literature Review
Literature review: Business Model
Li ter atu re R evie w Big Data Definition Value Creation Business Model Definition Business Model Frameworks Data driven businessCase studies Finding Patterns Data collection & coding Build the framework Literature Review
Business model key components were
synthesized from existing frameworks
Existing Business Model Frameworks - Chesbrough & Rosenbloom 2002 - Hedman & Kaling 2003
- Osterwalder 2004 - Morris 2005
- Johnson, Christensen et. al. 2008 - Al-Debei 2010 - Burkhart 2012 V al u e Ca p tu ri n g V al u e Cr e ati on Key Resources Key Activities Cost structure Revenue Model Customer Segment Value Proposition
Business Model Definition Business Model Key Components
- No universally accepted definition of the concept
(Weill, Malone et al. 2011) - Most definitions refer to
Only a few papers are available in this field
Li ter atu re R evie w Big Data Definition Value Creation Business Model Definition Business Model Frameworks Data driven businessCase studies Finding Patterns Data collection & coding Build the framework Literature Review
The review was extend to cloud business
models
11 Li ter atu re R evie w Big Data Definition Value Creation Business Model Definition Business Model Frameworks Related WorkData driven business Models Cloud business models Case studies Finding Patterns Data collection & coding Build the framework Literature Review
The literature review identified several gaps
• Little academic research on big data and value creation – mostly whitepapers
• Gap in literature: data-driven business models
• Otto, Aier (2013) interesting paper but limited to specific industry > no generalization possible
• Similar research for cloud business models (cf. Labes, Erek et. Al. 2013) Case studies Finding Patterns Data collection & coding Build the framework Literature Review
The framework was build from literature
starting from the key components
Data-Driven-Business Model Data Sources Internal existing data Self-generated Data External Acquired Data Customer provided Free available Open Data Social Media data Web Crawled Data Key Activity Data Generation Crowdsourci ng Tracking & Other Data Acquisition Processing Aggregation Analytics descriptive predictive prescriptive Visualization Distribution Offering Data Information/ Knowledge Non-Data Product/Serv ice Target Customer B2B B2C Revenue Model Asset Sale Lending/Ren ting/Leasing Licensing Usage fee Subscription fee Advertising Specific cost advantage Data-Driven Business Model Data Sources Key Activity Offering Target Customer Revenue Model Specific cost advantage Data collection
& coding Case studies
Finding Patterns Literature Review Build the
framework
Features for each dimension
Data-Driven Business Model Framework
Business Model Key Components (Dimensions)
Data Sources Features for
Synthesizing the different sources leads to
the taxonomy
Data Sources Internal existing data Self-generated Data External Acquired Data Customer provided Free available Open DataSocial Media data Web Crawled
Dimension: Activities
15 Key Activity
Data Generation
Crowdsourcing
Tracking & Other Data Acquisition Processing Aggregation Analytics descriptive predictive prescriptive Visualization Distribution
Dimension: Offering
Offering
Data
Information/Knowledge
Non-Data
Product/Service
Dimension: Revenue Model
17 Revenue Model Asset Sale Lending/Renting/Leasing Licensing Usage fee Subscription fee AdvertisingDimension: Target Customer
Target Customer
B2B
B2C
Data collection & coding
The final framework
19 Case studies
Finding Patterns Literature Review Build the
framework Data-Driven-Business Model Data Sources Internal existing data Self-generated Data External Acquired Data Customer provided Free available Open Data Social Media data Web Crawled Data Key Activity Data Generation Crowdsourcing Tracking & Other Data Acquisition Processing Aggregation Analytics descriptive predictive prescriptive Visualization Distribution Offering Data Information/Kno wledge Non-Data Product/Service Target Customer B2B B2C Revenue Model Asset Sale Lending/Renting /Leasing Licensing Usage fee Subscription fee Advertising Specific cost advantage
Data collection and coding
Case studies Finding Patterns Build the frameworkLiterature Review Data collection
& coding
Data collection Data analysis
The data was generated using public
available sources
21 Tag: “big data”
“big data analytics”
1329 companies Data collection Company information • Company websites • Press releases Public sources • Coding of sources using data driven business model framework • Nvivo Data analysis 299 Sources ~3 sources/comp Sampling 100 Companies cleaning Random sample 100 binary feature vectors
Overall Analysis: Data Source
0% 10% 20% 30% 40% 50% 60% Acquired Data Customer&Partner-provided Data Free available Crowd Sourced Tracked & Other• >50% of companies rely on free available data
• >50% of companies use data provided by customers/partners
Overall Analysis: Key Activities
23 0% 10% 20% 30% 40% 50% 60% 70% 80% Aggregation Analytics Descriptive Analytics Predictive Analytics Prescriptive Analytics Data acquistion Data generation Data processing Distribution Visualization • >70% of companies use analytics - mostly descriptiveOverall Analysis: Revenue Model
0% 10% 20% 30% 40% 50% Advertising
Asset Sales Brokerage Fees Lending Renting Leasing Licensing Subscription fee Usage Fee No information • Majority of revenue models based on subscription
and/or usage fee • No information about the revenue model as many companies are in an early stage
Overall Analysis: Target Customer
25 70% 17% 13% B2B B2C both • There seems to be a noteworthy predominance of B2B business models • But no reference data foundBM patterns were identified using a
clustering approach
Ketchen, David J.; Shook, Christopher L. (1996): The Application of Cluster Analysis in Strategic Managment Reserach: An Analysis and Critique. In: Strat. Mgmt. J. 17 (6).
Han, Jiawei; Kamber, Micheline (2011): Data mining. Concepts and techniques.
Case studies Data collection
& coding Build the
framework
Literature Review Finding
Patterns 2. Clustering method 1. Clustering Variables 3. Number of Clusters 4. Validate & Interpret C.
7 Business Model Cluster were identified
27 Cluster 1 2 3 4 5 6 7 Da ta So u rce Acquired Data 0 0 1 0 0 0 0 Customer-provided Data 0 1 1 0 0 1 1 Free available 1 0 1 0 1 0 1 CrowdSourced 0 0 0 0 0 0 0Tracked, Generated & other 0 0 0 1 0 0 0
Ke y Act ivit y Aggregation 1 0 0 0 0 1 1 Analytics 0 1 1 1 1 0 1 Data acquistion 0 0 1 0 0 0 0 Data generation 0 0 0 1 0 0 1 Number of companies 17 28 5 16 14 6 14 Type A B - C D E F
6 significant Business Model types were
identified
Type B: “Analytics-as-a-Service”
Type C: “Data generation & Analytics”
Type D: “Free Data Knowledge Discovery” Type A: “Free Data Collector & Aggregator”
The 6 BM types are characterised by the key
activities and key data sources
29
Type F
Type A Type D
Type E Type B
Type C
Aggregation Analytics Data generation
Fr ee a vailable C u st ome r p rov ide d Tr ac ked & gen er at ed Key activity K ey Da ta Sour ce
Type D: “Free Data Knowledge Discovery”
1. DealAngel 2. Gild 3. Insightpool 4. Juristat 5. Market Prophit 6. MixRank 7. Numberfire 8. Olery 9. PeerIndex 10. PolyGraph 11. Review Signal 12. Tellagence 13. traackr - Free available - Social Media - Open Data - Web Crawled Key Activities Revenue Model Key Data Source- Analytics Target Customer 0 5 10 15 Descriptive Predictive Prescriptive 0 2 4 6 8 Subscription Usage Fee Companies
Type D: Examples
31 “Using patent-pending technology, Gild
evaluates the work of millions of developers so companies using Gild’s talent acquisition tools know who’s good and can target the right candidates.” • Key Data: Free available websites
(GitHub, Google Codes) • Key Activities: Analytics
• Revenue Model: Monthly subscription • Target Customer: B2B
“ Our goal is to provide the most
accurate and honest reviews possible by using the data consumers create. We listen to the conversations, analyze them and visualize them for consumers.”
• Key Data: Twitter
• Key Activities: Analytics
• Revenue Model: Advertising • Target Customer: B2B (B2C)
Finding Patterns
The cases studies will be validated the
framework and the clustering
Data collection & coding Build the
framework
Literature Review Case studies
4 case studies with companies from the sample such as Purpose:
1. Validate framework & clusters
2. Illustrate business model types through examples
3. Identify specific challenges
Summary
33 - Gap in literature identified
- RQ: What types of business models that rely on data as a key resource (i.e. data-driven business models) can be found in start up companies?
- 5 (7) different business model patterns identified
- Data-driven business model framework created to enable analysis - Pattern identification through clustering
Limitations & Outlook
Limitations • Only 100 samples
• Only start up companies
• Bias of data source (AngelList) • Statistical significance of
clustering result
• Only public available sources used
• No statement about success of
Outlook/Next Steps
1. Improve validity of findings
1. Increase sample size to test clusters
2. More Case-studies to illustrate/validate clusters
2. Include established organizations 3. Develop methodology to judge
(financial) performance of different business models
Appendix
Literature
Al-Debei, Mutaz M.; Avison, David (2010): Developing a unified framework of the business model concept. In Eur J Inf Syst 19 (3), pp. 359–376.
Burkhart, Thomas; Wolter, Stephan; Schief, Markus; Krumeich, Julian; Di Valentin, Christina; Werth, Dirk et al. (2012): A comprehensive approach towards the structural description of business models. In : Proceedings of the International Conference on
Management of Emergent Digital EcoSystems. New York, NY, USA: ACM (MEDES ’12), pp. 88–102.
Chesbrough, H.; Rosenbloom, R. (2002): The role of the business model in capturing value from innovation: evidence from Xerox Corporation's technology spin-off companies. In
Industrial and Corporate Change 11 (3), pp. 529–555.
Criscuolo, Paola; Nicolaou, Nicos; Salter, Ammon (2012): The elixir (or burden) of youth? Exploring differences in innovation between start-ups and established firms. In Research
Literature
37 Davenport, Thomas H.; Harris, Jeanne G. (2007): Competing on analytics. The new science of winning. Boston, Mass: Harvard Business School Press.
Everitt, Brian; Landau, Sabine; Leese, Morven (2011): Cluster analysis. 5th ed. London, New
York: Arnold; Oxford University Press.
Hagen, Christian; Khan, Khalid; Ciobo, Marco; Miller, Jason; Wall, Dan; Evans, Hugo; Yadava, Ajay (2013): Big Data and the Creative Destruction of Today's Business Models. ATKearney.
Han, Jiawei; Kamber, Micheline; Pei, Jian (2011): Data Mining. Concepts and Techniques. 3rd ed. Burlington: Elsevier Science.
Hedman, Jonas; Kalling, Thomas (2003): The business model concept: theoretical
underpinnings and empirical illustrations. In European Journal of Information Systems 12 (1), pp. 49–59.
Literature
Johnson, Mark W.; Christensen, Clayton M.; Kagermann, Henning (2008): Reinventing your business model. In Harvard Business Review 86 (12), pp. 57–68.
Labes, Stine; Erek, Koray; Zarnekow, Ruediger (2013): Common Patterns of Cloud Business Models. In : AMCIS 2013 Proceedings.
Manyika, James; Chui, Michael; Brown, Brad; Bughin, Jacques; Dobbs, Richard; Roxburgh, Charles; Hung Byres, Angela (2011): Big data: The next frontier for innovation,
competition, and productivity. McKinsey Global Institute.
Morris, Michael; Schindehutte, Minet; Allen, Jeffrey (2005): The entrepreneur's business model: toward a unified perspective. In Special Section: The Nonprofit Marketing
Landscape 58 (6), pp. 726–735.
Negash, Solomon (2004): Business Intelligence. In Communications of the Association for
Literature
39 Osterwalder, Alexander (2004): The Business Model Ontology. A Proposition in Design
Science Research. These. Ecole des Hautes Etudes Commerciales de l’Université de Lausanne, Lausanne.
Otto, Boris; Aier, Stephan (2013): Business Models in the Data Economy: A Case Study
from the Business Partner Data Domain. With assistance of Rainer Alt, Bogdan Franczyk. In : Proceedings of the 11th International Conference on Wirtschaftsinformatik (WI 2013) : The 11th International Conference on Wirtschaftsinformatik (WI 2013), vol. 1. Leipzig, pp. 475–489.
Pham, D. T.; Dimov, S. S.; Nguyen, C. D. (2005): Selection of K in K-means clustering. In
Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical
Engineering Science 219 (1), pp. 103–119.
Pham, D. T.; Afify, A. A. (2007): Clustering techniques and their applications in engineering.
In Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical
Literature
Rousseeuw, Peter J. (1987): Silhouettes: A graphical aid to the interpretation and
validation of cluster analysis. In Journal of Computational and Applied Mathematics 20 (0), pp. 53–65.
Schroeck, Michael; Shockley, Rebecca; Smart, Janet; Romero-Morales, Dolores; Tufano, Peter (2012): Analytics: The real-world use of big data. How innovative enterprises extract value from uncertain data. IBM Institute for Business Value; Saïd Business School at the University of Oxford.
Singh, Ranjit; Singh, Kawaljeet (2010): A Descriptive Classification of Causes of Data
Quality Problems in Data Warehousing. In IJCSI International Journal of Computer Science I
7 (3), pp. 41–50.
Weill, Peter; Malone, Thomas W.; Apel, Thomas G. (2011): The business models investors prefer. In MIT Sloan Management Review 52 (4), p. 17.
The Clustering Process
41 Variables relevant to determine clustering (Miligan 1996) #Variables has to match #samples (Mooi 2011) ~ 2m samples for m variables: 6-7 variablesAvoid high correlation between variables (<0.9) (Mooi 2011)
2 Dimensions: “Data source” &
“Key Activity” 9 variables max. correlation: 0,5 2. Clustering method 3. Number of Clusters 4. Validate & Interpret C. 1. Clustering Variables
The Clustering Process
Partitioning Hierarchical Density-based Grid-based Clustering Method (Han 2011) 4. Validate & Interpret C. 1. Clustering Variables 3. Number of Clusters 2. Clustering method K-MedoidsThere is no “one right solution” for the
number of clusters
43 large to reflect specific
differences k << n
1. Use a-priori knowledge to determine number of clusters 2. Visual approaches
3. Rule of thumb (Han 2011): 4. “Elbow” method 5. Statistical methods 𝑘 ~ 𝑛 2 → 𝑘 ~ 7 k? 2. Clustering method 4. Validate & Interpret C. 1. Clustering Variables 3. Number of Clusters
“Elbow” method
“Elbow Method” (cf. Ketchen 1993, Mooi 2011):
1. Hierarchical clustering first
2. Plot agglomeration coefficient against number of clusters 3. Search for “elbows”
2. Clustering method 4. Validate & Interpret C. 1. Clustering Variables 3. Number of Clusters
“Elbow” method
45 0.000 0.500 1.000 1.500 2.000 2.500 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98Clustering Coefficient (distance)
<29 7 16 2. Clustering method 4. Validate & Interpret C. 1. Clustering Variables 3. Number of Clusters Number of cluster k
Statistical Measure: Silhouette
0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 Silhouette Coefficient 2. Clustering method 4. Validate & Interpret C. 1. Clustering Variables 3. Number of Clusters For datum i: Compares distance within its cluster to distance to nearest neigbouring cluster−1 ≤ 𝑠(𝑖) ≤ 1