Corralling ‘Big Data’
For trading and investing firms, so-called big data — the extremely large data sets that flow through the business every day — provides a hoard of information about markets and their own market activity.
But going from simply owning the data to exploiting it is no mean feat in today’s high-speed electronic markets, where immediate computer responsiveness is prized. That’s especially the case given that a substantial amount of data is unstructured and first needs to be cleaned up and reformatted before being processed.
The value proposition of making the leap from possession to utilization is twofold. One, a proper harnessing of big data helps firms comply with increasingly ponderous and data-hungry financial regulation, most prominently the Markets in Financial Instruments Directive II (MiFID II). Two, big data can be used to improve pre-trade, at-trade and post-trade analysis, and by extension, trading performance.
“There is a need for more cross-asset, real-time risk management,” said Tom Kennedy, global head of analytics at technology and data provider Thomson Reuters. “Look at MiFID II’s reporting obligations across asset classes. How do you pull that content together to publish to your reporting platform? It’s a huge project just to join up disparate databases, and then you have to run an analytics platform on top of that.”
Added Kennedy, “The winners will be those who achieve that, and in turn use those capabilities to drive business opportunities, new product development, and alpha.”
Banking and financial services is data-intensive. The sector has 1.51 installed terabytes per $1 million of revenue as of 2016, topping 16 other industries including #2 media and entertainment at 1.18 terabytes and #3 healthcare at 0.91 terabytes, according to technology-economics research firm Rubin Worldwide. Banking and financial services’ data per $1 million of revenue has increased 61% since 2012, when it was 0.94 installed terabytes.
Data/revenue in financial services has surged 61% over the past five years
Much of the data that comes across a trading desk is structured, in the form of numbers from databases or spreadsheets. But there is also unstructured data in the form of text and voice — this can originate from sources as disparate as e-mail, chats, IMs, texts, PDF reports, conference-call transcripts, news articles, phone calls and voicemail.
So the first challenge is to make unstructured data usable.
“The goal is to bring to unstructured data the power of programming languages and query languages that you normally associate with structured data,” said Dan Seal, senior vice president at in-memory streaming database provider Kx Systems. “It’s about being able to make sense of the content and use of the content by bringing structured data processing to an unstructured landscape, and doing that quickly and efficiently.”
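Kx’s own stack is built on kdb+/q, but the idea Seal describes can be illustrated in a language-neutral way. The sketch below, with invented chat messages and a hypothetical regex, lifts structured fields out of free-form desk chatter and then queries them with ordinary SQL — the “structured processing on an unstructured landscape” he refers to, reduced to a minimal example.

```python
import re
import sqlite3

# Invented raw chat lines: unstructured text mixing free prose with trade details.
chat_lines = [
    "10:02 jsmith: bought 500 VOD.L @ 212.4, client flow",
    "10:05 akhan: sold 1200 BARC.L @ 189.9 for the macro book",
    "10:07 jsmith: coffee anyone?",  # no trade content; should be dropped
]

# A simple pattern lifts (time, trader, side, quantity, symbol, price) out of the text.
pattern = re.compile(
    r"(?P<time>\d{2}:\d{2}) (?P<trader>\w+): "
    r"(?P<side>bought|sold) (?P<qty>\d+) (?P<symbol>\S+) @ (?P<price>[\d.]+)"
)

# Keep only the lines the pattern recognizes, as structured records.
rows = [m.groupdict() for line in chat_lines if (m := pattern.search(line))]

# Once the data is structured, an ordinary query language applies.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE trades (time TEXT, trader TEXT, side TEXT,"
    " qty INT, symbol TEXT, price REAL)"
)
db.executemany(
    "INSERT INTO trades VALUES (:time, :trader, :side, :qty, :symbol, :price)", rows
)
bought_notional = db.execute(
    "SELECT SUM(qty * price) FROM trades WHERE side = 'bought'"
).fetchone()[0]
```

In production the extraction step would use trained language models rather than a single regex, but the shape of the pipeline — parse, normalize, then query — is the same.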
Under MiFID II, which comes into force in January 2018, the European Securities and Markets Authority and national regulators are charged with ensuring that trading firms maintain a complete and accurate list of all trades, including those conducted in over-the-counter derivative markets that are thinly traded compared with screen-based equities markets.
Big data is characterized by volume, variety and velocity, a 15-year-old observation by industry consultant Gartner that applies many times over on today’s institutional trading desks.
“Customers have different data in different places,” said Adam Garrett, North Asia head of enterprise capabilities and content for Thomson Reuters. “A key part of the big data challenge is getting all the data in one place and making sense of it. That can be done using an enterprise data warehouse solution or with a visualization layer linking different databases.”
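The second approach Garrett mentions — a layer that links separate databases rather than physically consolidating them — can be sketched in miniature. The example below is an assumption-laden toy: two in-memory SQLite databases stand in for a firm’s disparate stores, and a single attached-database query spans both, the essence of a federation layer.

```python
import sqlite3

# Two separate stores stand in for the "different databases" a desk might hold.
con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS risk")  # a second, independent database

# Trades live in one database...
con.execute("CREATE TABLE trades (symbol TEXT, qty INT)")
con.executemany("INSERT INTO trades VALUES (?, ?)", [("VOD.L", 500), ("BARC.L", 1200)])

# ...position limits live in the other.
con.execute("CREATE TABLE risk.limits (symbol TEXT, max_qty INT)")
con.executemany(
    "INSERT INTO risk.limits VALUES (?, ?)", [("VOD.L", 1000), ("BARC.L", 1000)]
)

# One query spans both databases, with no physical consolidation step.
breaches = con.execute(
    "SELECT t.symbol FROM trades t JOIN risk.limits l ON t.symbol = l.symbol "
    "WHERE t.qty > l.max_qty"
).fetchall()
```

Real deployments federate across different vendors and network locations, which is far harder than attaching two SQLite files — but the trade-off is the same: a warehouse moves the data once, a federation layer moves a little of it on every query.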
Stakeholders in markets worldwide are turning to technology to help corral unstructured data, with an eye toward scalability — that is, ensuring that a fintech solution that works on a small data set will not break down with a heavier input.
The Securities and Futures Commission of Hong Kong sees artificial intelligence as a way to manage unstructured data, according to James Lau, acting secretary for financial services and treasury for the government of Hong Kong.
“AI can analyse not only text messages but also social media footprints, including voice messages,” Lau said in a Nov. 7 speech at a fintech event in Hong Kong. “AI can do the parsing of myriads of regulatory rules and guidance notes from different jurisdictions, and relate them to reporting, surveillance and enforcement mechanisms.”
MiFID II is acting as a catalyst for the disruptive and innovative force of big data, by pushing buy-side and sell-side market participants to get their arms around their data. That’s according to Christian Voigt, senior regulatory adviser at Fidessa in London.
“Trading firms search for solutions that allow them to meet new regulatory requirements while protecting their bottom line,” Voigt told Markets Media. “Implementation has many challenging components such as ensuring data accuracy, aggregation across disparate systems, traceability or data security, which all firms need to get right.”
“But while many financial institutions will extend their already bulging data warehouses, being able to extract meaningful business intelligence will separate the wheat from the chaff,” Voigt continued. “In order to master the big data challenge it is crucial to see the collection of amassed data not as a regulatory burden but as the key to understanding markets and customers better than ever before.”
• First wave was finding data; second wave will be about interpreting data.
• Technology giants with data analytics and distribution muscle may become new competition.
• Institutional corporate-bond investors have a lot of catching up to do.
• MiFID II prompts asset managers to assess the true value of research.
• TickSmith works with CME Group to improve access to the US exchange’s historical data.