Two decades ago, data seemed fairly finite. There was macroeconomic data, like GDP figures, and there was microeconomic data, like company reports. If you were cutting edge, you might have traded against meteorological data.
Today, that's all changed. Data has exploded. As J.P. Morgan's giant machine learning report explains, data sources have multiplied exponentially. Data can be generated by individuals (social media posts, product reviews, internet search trends), by business processes (company exhaust data, commercial transactions, credit card data, order book data), or by sensors (e.g. cameras that track shopping habits).
These new "alpha generating" data sources have become a significant source of competitive advantage for investors. J.P. Morgan predicts that analysts and portfolio managers who don't have access to these data sources will be left behind.
Not all the new data is used by banks, hedge funds and other financial services companies, but a lot is. JPM estimates that finance firms account for around 15% of spending in the $130bn big data market. The bank expects the market as a whole to grow to over $200bn by 2020.
If you want to work in or with big data, you need to get to know the leading companies in the burgeoning data industry. J.P. Morgan identifies a large number of them in its machine learning report, and we've extracted some of the most interesting below. The data they provide can be used to influence a broad range of investing decisions across equity and fixed income markets. Some, like iSentium or Quandl, have been set up by ex-bankers and analysts and are focused on the finance industry. Others, like Repustate, are more oriented towards monitoring brand perception among consumers.
Some areas of the data industry (e.g. blog sentiment and social media monitoring) are young and there are multiple small companies competing in the same space. Some consolidation in the list below therefore looks likely, but for the moment these are some of your best bets if you want a big data job that feeds into financial services. Good luck.
What? Trading alerts and analytics derived from relevant finance-related news drawn from over 300 million public news websites, blogs, social media websites such as Twitter, and public financial documents such as SEC filings.
What? Algorithms that assess the sentiment of unstructured information – such as high-value financial news feeds.
What? Collects data from dozens of sources that collectively build a clear picture of such things as company and brand health, product and product category pricing and demand trends, customer engagement, and company risk factors.
4. Audit Analytics
What? Tracks issues related to audit, compliance, governance, corporate actions, and federal litigation from a broad range of public disclosures, including approximately 20,000 SEC company registrants.
5. Brave New Coin
What? A Market-Data Engine for the Blockchain & Digital Equities industry.
What? 3D Stereo video using bespoke cameras for counting visitors at stores.
What? 'Transforms real-time data from Twitter and other public sources into actionable alerts.'
What? Reports on information extracted from more than 100 million websites in 40 countries.
What? Does analysis on data sources like Bitly, Blogs, Boards, Daily Motion, Disqus, FB, Instagram, IMDB, Intense Debate, LexisNexis, NewsCred, Reddit, Topix, Tumblr, Videos, Wikipedia, Wordpress, Yammer and YouTube.
10. Descartes Labs
What? Full imagery archives (some including data only a few hours old) from hundreds of satellites.
What? Up-to-date data on companies, retail stores, restaurants, oil wells and real estate.
12. Eagle Alpha
What? Provides analytical tools that enable clients to run proprietary analyses, and data sources that include a database of the best alternative datasets worldwide.
What? Patented pattern matching technology to project probable event outcomes and find relationships in Big Data.
What? An earnings data set with exclusive insight on over 2000 stocks.
15. Factset Revere
What? A database of supply chain relationships.
16. GDELT Project
What? Creating a platform that monitors the world's news media in print, broadcast, and web formats.
What? Data acquisition from satellite reconnaissance, artificial intelligence, and maritime freight tracking.
What? Real time trending news from the web, government wires, news wires, blogs and Twitter.
What? Turns news feeds into event-driven analytics feeds.
What? Scans and monitors millions of websites, blogs, and business news publications in real time to analyze 50,000+ stocks, topics, people, commodities and other assets.
What? iSense App transforms over 50 million Twitter messages per hour into a real-time sentiment time series.
What? A web intelligence company using cutting edge natural language processing and data science to extract value from non-traditional online sources.
What? Processes billions of unstructured documents every day. It translates text into profitable decisions.
What? Develops sentiment data across major news and social media outlets.
25. Markit Securities Finance
What? Global securities financing data.
What? A data platform to search companies and investors and create actionable lists of leads.
What? Millions of observations captured daily by a global network of contributors.
What? Financial, economic and alternative data.
29. Quant Connect
What? Huge amount of data relating to past trades.
What? Real-time Big Data analytics to predict macro trends, success and failures of companies and individual behaviors.
What? Price information from over 1,000 retailers across nearly 70 countries.
What? Transforms unstructured big data sets, such as traditional news and social media, into structured granular data and indicators to help financial services firms improve their performance.
What? Real-time analysis of text data.
What? Performs sentiment analysis and extracts semantic insights from social media, news, surveys, blogs and forums.
35. Return Path
What? Uses emails and purchase receipts to offer insight into purchase behaviour, brand affinity, and consumer preferences.
What? Largest online ecosystem of crowd-experts and influencers in global financial markets used to generate trading signals.
37. Sentiment Trader
What? Over 300 sentiment surveys, social sentiment and other indicators for equities, bonds, currencies, ETFs and commodities.
38. Social Market Analytics
What? Harnesses unstructured yet valuable information embedded in social media streams and provides actionable intelligence in real time to clients.
39. Superfly Insights
What? Detailed consumer purchasing insights using live transaction data in-app, online and in-store.
What? Sentiment analysis with classifications such as emotion, humour, gender, risk, speculation, and sarcasm.
41. Tick Data
What? Historical intraday stock, futures, options and forex data.
What? Collection and analysis of website data: publicly available information on company and government websites.
What? Insights based upon millions of anonymized consumer debit and credit transactions.
Different types of data and their history: