8 SUMMARY AND DISCUSSION
8.3 Limitations of the study’s methodology and suggestions for further work
The study has demonstrated with statistical significance that the analysis of social media message volumes and sentiments can be used to lead the returns of market-traded securities. The study’s limitations are now presented, and suggestions for additional technical work are provided.
8.3.1 Limitations in the data
This study is concerned with the analysis of time-series data. There are numerous limitations associated with the acquisition and processing of both the social media and the financial data used in this study.
Firstly, as discussed in Chapter 4, one of the key technical drawbacks at the time of conducting the study’s experiments was the unavailability of historic Twitter datasets in the public domain. Therefore, a large component of the study centred on the collection of Twitter data, first by managing the creation of SocialSTORM (see Chapter 4.1), and then by the creation and use of the Twitter Collection Framework (see Chapter 4.2) – a proprietary framework built for connecting to Twitter’s APIs to facilitate the programmatic filtering and downloading of Tweets for the study’s experiments. The development durations of both SocialSTORM and the Twitter Collection Framework inherently placed limits on the length of time which was available for the collection of social media data.
However, as discussed in Chapter 5.2, a chronological limit had to be set on the length of time available for data-sample collection in order to minimise the effects of routine quarterly updates[a] of ever-changing macroeconomic trends, whilst still offering a range of intra-day market volatilities. Furthermore, the dataset had to be sufficiently small to minimise the effects of seasonality[b] (as discussed in depth in Chapter 5.6.2.1). Finally, and perhaps most importantly, the data collection period had to be short enough to avoid encapsulating significant alterations to the Twitter platform. This is because it has been shown that dramatic alterations to Twitter’s core product can influence the consistency of Tweet data, driven by the resultant changes in the demographics of Twitter’s users (see Chapter 5.2). It should be noted that past studies in this space do not stipulate a minimum chronological data-size, as it is specific to each study; indeed, one past work on the analysis of Tweet message sentiments and volumes considered just a 32-day dataset[49].
Therefore, whilst limits did exist on the length of time which was available for the collection of the social media data in the study, it was still possible to select a 3-month dataset collection period that satisfied these carefully selected criteria.
Provided that the following effects can be mathematically modelled and mitigated, an extension of this study could be performed on a chronologically larger dataset, which would be expected to provide further insight into the dependencies between social media data and financial data:
The effect of seasonality on Twitter and financial data;
The effect of quarterly macroeconomic trend updates on financial data;
The effect of changes to Twitter’s product.
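As an illustration of the first item above, the following is a minimal sketch of one simple way in which a seasonal effect could be removed from a time series. It is not the method used in this study: it assumes, purely for illustration, that the seasonal pattern is a weekly cycle, and it subtracts each weekday's average deviation from the overall mean. The function name and the daily volume figures are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def remove_weekday_seasonality(series):
    """Subtract each weekday's mean deviation from the overall mean.

    `series` is a list of (date, value) pairs; a list of the same shape is returned.
    """
    overall = mean(value for _, value in series)
    by_weekday = defaultdict(list)
    for day, value in series:
        by_weekday[day.weekday()].append(value)
    # Seasonal component: how far each weekday's average sits from the overall average.
    seasonal = {wd: mean(vals) - overall for wd, vals in by_weekday.items()}
    return [(day, value - seasonal[day.weekday()]) for day, value in series]


# Hypothetical daily Tweet volumes over four full weeks:
# 100 messages on weekdays, 60 at weekends.
start = date(2024, 1, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(28)]
volumes = [(d, 100.0 if d.weekday() < 5 else 60.0) for d in days]

adjusted = remove_weekday_seasonality(volumes)
```

Because the hypothetical series contains nothing but the weekly pattern, every adjusted value collapses to the overall mean. On real data, the residual left after such an adjustment is what would be carried forward into the lead-lag analysis. A longer dataset would also call for annual or quarterly components, which this sketch does not attempt to model.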
Secondly, further limitations in the study exist from the perspective of Twitter data density. Due to the nature of the License Agreement between Twitter and its users, most programmatic connections to Twitter’s APIs provide access to up to 1% of all messages passed through its network. As discussed in Chapter 2.1, before the Tweet-collection process began, contractual access to 10% of all messages passed through its network was secured. Thus, whilst the study’s 10% dataset is a fully random sample of the full data feed available from Twitter, the analysis of the full feed of all Tweets could provide further insight into social media’s ability to lead financial data.
[a] Macroeconomic data is typically reported on a quarter-by-quarter basis. With reference to this study, the United States Department of Commerce Bureau of Economic Analysis and the UK’s Bank of England report macroeconomic data on a quarterly basis.
[b] Seasonality is the effect in time-series data that is driven by economic cycles influenced by the time of
8.3.2 Limitations of the sentiment classification system
This study is built on the analysis of Twitter sentiment using SentiStrength, a transparent dictionary-based classifier which has been shown to consistently outperform baseline competitors in ranking the sentiment of colloquial user-generated text from internet platforms[50] (see Chapter 5.1). However, the system is only capable of ranking sentiment on an arbitrary scale of ‘negative’ to ‘positive’. SentiStrength is strongly based on the work of Pennebaker et al.[51], whose Linguistic Inquiry and Word Count software (LIWC)[a] can also rank the sentiment of grammatically correct text on additional scales such as anxiety, optimism, anger, and sadness. Further work in the field of assessing whether social media data can lead financial data should therefore centre on the expansion of the SentiStrength package to incorporate the additional scales offered by Pennebaker’s LIWC software. This would make it possible to rank the colloquial and often grammatically incorrect text found in social media along additional mood dimensions. Analysing Tweets on these additional mood scales could therefore yield further insight into whether the sentiment of social media data can lead the returns of financial securities, provided the scales are adapted to accurately rank informal social media vernacular.
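To make the idea of additional mood scales concrete, the following is a minimal sketch of dictionary-based scoring along several dimensions. It is a toy illustration only: the lexicon entries, the dimension assignments, and the function name are hypothetical and are not taken from SentiStrength or LIWC, and a practical system would also need to handle negation, emphasis, misspellings, and emoticons.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to mood dimensions.
# These entries are NOT taken from SentiStrength or LIWC.
MOOD_LEXICON = {
    "worried": "anxiety",
    "nervous": "anxiety",
    "hopeful": "optimism",
    "confident": "optimism",
    "furious": "anger",
    "hate": "anger",
    "gutted": "sadness",
    "sad": "sadness",
}


def mood_counts(text):
    """Count lexicon hits per mood dimension in a lower-cased, tokenised text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(MOOD_LEXICON[tok] for tok in tokens if tok in MOOD_LEXICON)


scores = mood_counts("so worried about this stock but still hopeful, really hopeful")
```

Here `scores` records one hit for anxiety and two for optimism, while dimensions with no matching words count as zero. Each dimension could then be aggregated into its own time series and tested against financial returns in the same way as the single negative-to-positive scale.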
Furthermore, since SentiStrength is only capable of ranking the sentiments of text in English, this study’s approach ignores potentially valuable non-English data passed through Twitter’s network. Substantial scope therefore exists for extending the study’s approach to the analysis of non-English social media data, provided that SentiStrength’s dictionaries can be adapted to rank sentiments in other languages.
8.3.3 Limitations of using company names as Twitter filters
As demonstrated by this study in the case of Apple, Inc. CFDs, filtering Tweets by Ticker-ID alone, rather than by Ticker-ID AND/OR Company Name, shows a stronger ability for Twitter data to lead financial data (see Table 17). This is evidence that filtering Tweets by company name dilutes social media’s predictive power. The reason is that a Twitter filter which matches a company’s name (e.g., “Amazon”) does not guarantee that the filtered-in messages will only contain opinions on that firm. The messages can instead mention a company’s service (e.g., “Check out this great deal on Amazon.com”) or can be entirely unrelated (e.g., “The Amazon river is unbelievably long”). Thus, whilst this study does demonstrate instances where social media sentiment filtered by company name leads financial markets in a statistically significant manner, it is likely that the potential strength of such relationships is diminished, because filtering by company name cannot guarantee that only direct opinions on a company’s future performance are allowed through. Substantial scope therefore exists for extending the study to analyse only those Tweets which contain direct opinions on a firm’s future performance. Whilst this is an inherently complex linguistic exercise, such methodologies could employ principles based on advanced part-of-speech tagging[52] methods to infer whether a Tweet contains a direct opinion on a firm’s future financial performance or is merely discussing the firm. Such an exercise could provide stronger indications of social media’s ability to lead the financial markets.
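The dilution effect described above can be reproduced with a minimal sketch of the two filter types. This is an illustration rather than the filtering logic of the Twitter Collection Framework: it assumes that a Ticker-ID appears in a Tweet in the dollar-prefixed form (e.g., "$AMZN"), the function names are hypothetical, and the third example Tweet is invented for the purpose of the comparison.

```python
import re


def matches_ticker(text, ticker):
    """True if the text contains the dollar-prefixed ticker, e.g. "$AMZN"."""
    pattern = r"(?<![A-Za-z0-9_$])\$" + re.escape(ticker) + r"(?![A-Za-z0-9_])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches_company_name(text, name):
    """True if the company name appears as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


tweets = [
    "Check out this great deal on Amazon.com",
    "The Amazon river is unbelievably long",
    "Expecting $AMZN to beat earnings next quarter",  # hypothetical example
]

by_name = [t for t in tweets if matches_company_name(t, "Amazon")]
by_ticker = [t for t in tweets if matches_ticker(t, "AMZN")]
```

The company-name filter admits the first two Tweets, neither of which expresses a view on the firm, whereas the ticker filter admits only the third. Even the ticker filter does not establish that a Tweet is an opinion on future performance, which is why a further linguistic step such as part-of-speech analysis is suggested.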
8.4 Comparison to recent works in the space of market prediction with internet