Stefan Nann is co-founder and CEO of StockPulse, a data analytics company specializing in mining Emotional Data Intelligence.
“Facts only account for 10% of the reactions on the stock market; everything else is psychology.” André Kostolany, a stock market investor who made most of his fortune during the reconstruction of Europe after World War II, made this observation. Renowned for his shrewd and astute mixture of psychology and his sensible knowledge of stocks and markets, Kostolany became one of the most successful investors of the 20th century.
The Internet was just emerging from its infancy when Kostolany died in 1999. The evolution of data intelligence technologies has given us the capability to process and analyze vast amounts of online data, which means we can now test Kostolany’s intuition that markets are highly impacted by emotional reactions. By monitoring and analyzing data from social media sources, especially communication about stocks, it’s now possible to connect the dots between sentiment and market movements.
Like most other industries, the financial industry communicates by sharing information and data via the Internet and, eventually, through social media. People interacting on social media generate emotional data by expressing their emotions and opinions via tweets, forum posts, and blogs. They also consume it, and in the process are influenced by the sentiments, feelings, and opinions expressed by others. Scientific studies show that people are often influenced by the data they consume, and that their decisions or actions are partly aligned with it. There are many examples of Twitter being used to broadcast information that triggers widespread emotional market reactions almost instantly.
At StockPulse, we refer to social media content as User-Generated Content (UGC). UGC comprises a substantial part of communication via social media. UGC that carries and facilitates the exchange of emotions is referred to as “emotional data.” Data Intelligence refers to the collection and analysis of large amounts of data to uncover meaningful relationships between, and among, different data points to enhance decision making. When the methods and tools of Data Intelligence are applied to data that transport emotions, it can be referred to as Emotional Data Intelligence (or Sentiment Analysis).
StockPulse collects and analyzes data from social media sources around the clock in German, English, and Chinese. We have historical data from alternative sources that dates back to 2011. Our web crawlers are continuously scanning thousands of different Internet sources for relevant financial topics and communication, collecting several million tweets, chat messages, message board posts, news articles, and comments to news articles each day. Unstructured text documents are processed with methodologies from Natural Language Processing, along with more advanced sentiment analysis methods, to extract topics and investigate the hidden semantic structure of these large amounts of text.
Emotional Data Intelligence as a Tool for Trading Surveillance
Emotional data mined from digital social networks can be used in various ways for market surveillance. StockPulse cooperates with leading stock exchanges in Europe and the United States to deliver insights based on emotional sentiment data for trading surveillance.
For example, StockPulse is working with Nasdaq’s US Market Surveillance team to incorporate social media analytics into its surveillance system. Together, we are developing processes tailored for exchange trading surveillance to detect potential market manipulation through the use of social media. One example of targeted conduct is the use of spam or bots for attempted price manipulation.
There are several surveillance use cases our stock exchange clients are exploring, including:
Uncovering of “Pump and Dump” schemes
Detecting false, misleading, or exaggerated comments in social media that are followed or preceded by noticeable stock price movements is an important objective in trading surveillance. The analysis of social media and other alternative data sources has become increasingly important for this purpose.
A major part of classifying these kinds of messages is a set of comprehensive and sophisticated spam detection algorithms that detect whether users or authors in social media are distributing false or misleading information. Spam detection generally involves several steps. In the first step, a filter algorithm scans all messages for insulting words or phrases (so-called “rants” or “flames”). Most messages of this kind can be identified by their scurrilous or nasty language. There is also a more advanced approach that analyzes relationships between users (e.g., in the case of Twitter, analyzing the follower network of users or monitoring interactions such as likes, mentions, or re-tweets) and calculates a reputation rank or “author score” for every user. The impact of every message is based on this internal author score.
Further, there are manually curated and verified social media users (financial experts or renowned news agencies) who carry a higher author score by default. Elon Musk, Warren Buffett, and the Bloomberg news agency, for instance, are assigned to this category because their tweets and posts typically have a higher impact compared to the posts of lesser-known users.
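The two steps described above, together with the default score for curated users, can be sketched roughly as follows. This is only an illustration under stated assumptions: the word list, the curated account name, the saturating formula, and all constants are hypothetical and are not StockPulse's actual method, which is based on a fuller analysis of the follower network.

```python
from dataclasses import dataclass

# All names, word lists, and constants below are hypothetical placeholders.
FLAME_WORDS = {"idiot", "moron", "scammer"}
CURATED_SCORES = {"verified_news_agency": 0.95}  # manually verified accounts


@dataclass(frozen=True)
class Message:
    author: str
    text: str


def is_flame(text: str) -> bool:
    """Step 1: crude filter for insulting language (plain substring match)."""
    lowered = text.lower()
    return any(word in lowered for word in FLAME_WORDS)


def author_score(author: str, followers: int, interactions: int) -> float:
    """Step 2: reputation in [0, 1); curated accounts get a fixed default."""
    if author in CURATED_SCORES:
        return CURATED_SCORES[author]
    # Assumed saturating formula: more followers and interactions raise the
    # score, with diminishing returns.
    reach = followers / (followers + 10_000)
    engagement = interactions / (interactions + 1_000)
    return 0.5 * reach + 0.5 * engagement


def message_impact(msg: Message, stats: dict[str, tuple[int, int]]) -> float:
    """Impact of a message: zero for flames, otherwise the author's score."""
    if is_flame(msg.text):
        return 0.0
    followers, interactions = stats.get(msg.author, (0, 0))
    return author_score(msg.author, followers, interactions)
```

In this sketch, `stats` maps each author to a `(followers, interactions)` pair. An unknown author with no statistics gets an impact of zero, while a curated account keeps its default score regardless of its statistics.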
Twitter Expert Network and Alert System
Some market participants potentially have a higher impact on the movement of stock prices than others. With our curated list of verified social media users (e.g., Twitter accounts of CEOs of listed companies, influential politicians, journalists, analysts, and news agencies), StockPulse provides categories of users who possibly have a higher impact than regular users. However, at the level of a single stock, industry, or sector, a more individualized approach is needed to find the truly influential social media users.
Considering Tesla, for example, we know that Elon Musk is probably the most important person to follow if we want to stay on top of news about the company that could be relevant for Tesla’s stock price. Most likely, the same situation can be found for all stocks or industries. There will be certain experts for a specific field or company who need to be followed in order to get the most relevant information as quickly as possible. Regarding single stocks, the Twitter accounts of the company’s management might be the initial users with which to start the search. In other cases, some analysts can be explicit experts for a certain stock or industry.
We call these initial users, who are the most credible experts for a stock or business field, “root users.” The first step involves monitoring relevant social media statistics of these root users. These statistics could include the number of sent tweets, likes of other tweets, mentions of other users, favorites of other tweets, or re-tweets. In the next step, all users who are followed by the root users are also monitored, and the same statistics are recorded for them. The same step can then be repeated for the accounts those users follow. The result is a set of a few hundred, possibly even a few thousand, Twitter accounts that, with fairly high probability, have the largest relevance for the corresponding industry or stock.
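The expansion from root users to the accounts they follow can be read as a breadth-first traversal of the follow graph with a depth limit. The following is a minimal sketch under that assumption; the function name, the `following` dictionary, and the account names in the usage note are hypothetical, and in practice the follow relationships would have to be collected from the platform.

```python
from collections import deque


def expand_expert_network(roots, following, max_depth=2):
    """Return {account: distance from the nearest root user}.

    `following` maps an account to the accounts it follows (assumed to be
    collected beforehand). Accounts further than `max_depth` steps away
    from every root user are not included.
    """
    depth = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        user = queue.popleft()
        if depth[user] >= max_depth:
            continue  # do not expand beyond the depth limit
        for followed in following.get(user, ()):
            if followed not in depth:
                depth[followed] = depth[user] + 1
                queue.append(followed)
    return depth
```

For example, with `following = {"ceo": ["analyst_a", "reporter_b"], "analyst_a": ["trader_c"], "trader_c": ["someone_d"]}` and `roots = ["ceo"]`, the default depth limit of 2 includes `trader_c` but leaves out `someone_d`. The recorded distance could later serve as one input when ranking accounts by relevance.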
Knowledge about this “expert network” of social media users and their root statistics can be important for trading surveillance as it might be a relevant source to search for news or rumors after a suspicious price movement.
In addition to detecting the expert network of social media users for a single stock or industry, an alert system for specific Twitter accounts could be highly relevant. As soon as a Twitter user who has been identified as a relevant and credible source for a stock posts anything about that company, trading surveillance wants to know about it. Having this information as quickly as possible is key.
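Such an alert could, in its simplest form, be a lookup of the posting account in a watchlist followed by a check for company terms in the post. The sketch below assumes a hypothetical watchlist structure and uses plain substring matching, which is cruder than the entity recognition a production system would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    account: str
    text: str


def alerts_for(post, watchlist):
    """Return the companies for which this post should trigger an alert.

    `watchlist` maps a watched account to {company: [search terms]}.
    The structure is a hypothetical choice made for this illustration.
    """
    text = post.text.lower()
    companies = watchlist.get(post.account, {})
    return sorted(
        company
        for company, terms in companies.items()
        if any(term.lower() in text for term in terms)
    )
```

A post from an account that is not on the watchlist never raises an alert here, even if it mentions the company; that case is left to the broader monitoring described elsewhere in this article.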
Key Event Monitoring
Key events in financial markets can impact price movements to a large extent. We have defined more than 150 market-relevant events that are tracked in the communication our crawlers collect from social media. An event can be any topic that is characterized by key words describing it in more detail. Every key word can also be assigned a weight factor that describes its importance for the event.
For example, the event “Merger” can be described with the following key words: “merger, merger & acquisition, merger approval, merger deal, reverse merger, m&a, transaction, approved” (only a selection of all words). Key words can also be provided in multiple languages (English, German, and Chinese). Ultimately, this leads to a connection between entities (e.g., listed companies) and events if both are detected in the same document. Other relevant events for trading surveillance might be “IPOs,” “Bankruptcy Fears,” “Board Member Resignation,” “Fraud,” “Going Private,” “Joint Venture,” or “Regulatory Investigation.”
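Weighted key words and the link between entities and events can be illustrated with a small scoring routine. In this sketch the weights, the threshold, and the entity “Example Corp” are invented for illustration; only the key words themselves come from the selection listed above.

```python
import re

# Key words are taken from the "Merger" selection; the weights are assumed.
EVENTS = {
    "Merger": {
        "merger": 1.0,
        "merger deal": 1.5,
        "m&a": 1.0,
        "transaction": 0.5,
        "approved": 0.25,
    },
}
# Hypothetical entity with its lower-case aliases.
ENTITIES = {"Example Corp": ["example corp"]}


def count_phrase(phrase, lowered_text):
    """Count whole-phrase occurrences (not inside longer words)."""
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return len(re.findall(pattern, lowered_text))


def detect_events(text, threshold=1.0):
    """Return {event: score} for events whose weighted score >= threshold."""
    lowered = text.lower()
    scores = {}
    for event, keywords in EVENTS.items():
        score = sum(w * count_phrase(k, lowered) for k, w in keywords.items())
        if score >= threshold:
            scores[event] = score
    return scores


def link_entities_to_events(text):
    """Connect every entity and every event found in the same document."""
    lowered = text.lower()
    entities = [
        name
        for name, aliases in ENTITIES.items()
        if any(count_phrase(alias, lowered) for alias in aliases)
    ]
    events = detect_events(text)
    return [(e, ev, score) for e in entities for ev, score in events.items()]
```

Note that overlapping key words add up in this sketch: “merger deal” also counts as an occurrence of “merger.” The threshold keeps weak key words such as “approved” from triggering an event on their own.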
Trading surveillance teams can monitor any rumors or posts about mergers or other relevant events in social media in real time and get instant notifications if certain companies or events suddenly move into focus. In connection with the alert system described above, critical situations can be detected very quickly, allowing for a timely reaction.
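One simple way to decide that a company or event has “suddenly moved into focus” is to compare the current mention count with its recent history. The following sketch uses a z-score for this; the technique, the function name, and the thresholds are assumptions for illustration and not necessarily what StockPulse uses.

```python
from statistics import mean, pstdev


def is_spike(history, current, z_threshold=3.0, min_history=5):
    """Flag `current` if it lies far above the recent mention counts.

    `history` holds mention counts for previous intervals (e.g. hours).
    The threshold values are hypothetical defaults.
    """
    if len(history) < min_history:
        return False  # too little data for a meaningful baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return current > mu  # flat baseline: any increase stands out
    return (current - mu) / sigma >= z_threshold
```

With a history of roughly five mentions per hour, a jump to 30 mentions would be flagged, while a count of 6 would not.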
In Nasdaq’s case, incorporating real-time social media feeds into its US Surveillance program will enhance the company’s ability to monitor for potential market abuses. The company will be able to quickly consider a larger universe of factors when investigating unusual activity.
StockPulse also maintains comprehensive identifier mapping tables for securities worldwide. Identifier mapping is the assignment of relevant identifiers to a document based on recognized entities. For example, news discussing an analyst upgrade of video streaming company Netflix would be tagged with the ticker NFLX:US. Different processes regularly update all identifier mappings; for example, ISIN or ticker symbol changes are handled automatically. Currently, the system maintains more than 1 million mappings.
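As a rough illustration of identifier mapping, the sketch below tags a document with identifiers based on known aliases and supports a ticker change. The class and method names are hypothetical; only the Netflix example and its ticker NFLX:US come from the text above.

```python
import re


class IdentifierMap:
    """Hypothetical alias-to-identifier table for tagging documents."""

    def __init__(self) -> None:
        self._alias_to_id: dict[str, str] = {}

    def add(self, identifier: str, aliases: list[str]) -> None:
        """Register lower-cased aliases (names, symbols) for an identifier."""
        for alias in aliases:
            self._alias_to_id[alias.lower()] = identifier

    def rename(self, old_id: str, new_id: str) -> None:
        """Apply an identifier change (e.g. a new ticker) to all aliases."""
        for alias, identifier in list(self._alias_to_id.items()):
            if identifier == old_id:
                self._alias_to_id[alias] = new_id

    def tag(self, text: str) -> list[str]:
        """Return the sorted identifiers whose aliases occur in the text."""
        lowered = text.lower()
        found = set()
        for alias, identifier in self._alias_to_id.items():
            pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
            if re.search(pattern, lowered):
                found.add(identifier)
        return sorted(found)
```

After `add("NFLX:US", ["Netflix"])`, a headline such as “Analyst upgrades Netflix after earnings” would be tagged with `NFLX:US`. A linear scan over all aliases would not scale to more than a million mappings; a real system would need an indexed lookup.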
StockPulse clients come from different geographic areas and industries and include top-tier hedge funds, asset managers, banks, stock exchanges, private equity companies, financial publishers, online brokers, news portals, research companies, and universities. They use the data for different purposes: For example, hedge funds are seeking trading signals or trading models that will allow them to extend their existing portfolios. Stock exchanges use the data for trading surveillance. Private equity companies want to find new investment opportunities based on alternative data that are not yet accessible to everyone. Other uses include product monitoring, investor relations alerts, and CEO reputation management.
StockPulse is a German-based data analytics company specializing in mining Emotional Data Intelligence. The company collects and analyzes alternative data from social media, with a particular focus on financial markets, to improve and support data-driven decision making for institutional financial investors.
The views and opinions expressed herein are the views and opinions of the authors at the time of publication and may not be updated. They do not necessarily reflect those of Nasdaq, Inc.