Last year, the London-based hedge fund Derwent Capital Markets bought research conducted by Indiana University computing professor Johan Bollen for close to $40 million. Who said academia wasn't a career that could make you rich?
His paper's title alone might explain the corporate interest: "Twitter Mood Predicts the Stock Market" claimed that, by using algorithms to detect mood data in tweets, Bollen could predict changes in the Dow Jones Industrial Average (INDEXDJX:.DJI) three to four days in advance with close to 88% accuracy.
Big Data can translate into big money for those who are looking at it the right way, and Bollen's research isn't the only Twitter-focused data crawler piquing investor interest.
Dataminr, a New York City-based social media analytics company, raised $13 million in a Series B round this September, adding to the $3.5 million in venture funds it already has. The company's favorable access to Twitter's "full firehose" of ceaselessly flowing data and its unique approach to data analysis and prediction have attracted funds from, among others, GSV Capital (NASDAQ:GSVC), whose past investments include Facebook, Groupon (NASDAQ:GRPN), and Dropbox.
Dataminr's co-founder and CEO Ted Bailey tells Minyanville that his company's relationship with Twitter is deep on many fronts. “From the CEO through the whole company, they support us on all levels.”
A good relationship with Twitter is important.
Keep in mind, the California-based PeopleBrowsr, a company that uses Twitter data to deliver analytics to Fortune 500 companies and government agencies, just took Twitter to court after Twitter began gradually restricting access to its Firehose. PeopleBrowsr's business is built around Twitter's original promise to maintain its data as an open ecosystem. A judge has granted PeopleBrowsr a temporary restraining order against Twitter, allowing it continued Firehose access.
How Twitter chooses to restrict its content going forward, and what that means for the many companies whose business relies on access to the Firehose, remains to be seen. Whether or not this is something Dataminr needs to worry about, it's not necessarily what's driving GSV (a noted backer of high-growth tech companies) to risk $2 million on this 3.5-year-old firm.
It's what Dataminr is doing with its access to Twitter's Firehose, however, that will likely be the payoff.
There's news, and then there's information
On Friday, November 30, at 10:50 a.m., major news outlets report that gold futures have dropped 75 basis points on rumors that Fitch is downgrading its ESM rating. Gold futures simultaneously break their critical $1720 support level, in what some see as a sign that gold’s recovery is over. Moody’s subsequently downgrades ESM to Aa1.
It’s a significant news point that no hawk-eyed trader would ever miss.
Subscribers to Dataminr knew about the rumor 19 minutes before major news pushed it out.
“Early information is the name of the game and it’s in our product every day,” Bailey tells Minyanville. “What we’re looking at isn’t news. News is an industry that’s built around reporters who actually bring their perspective into something, and that’s hugely valuable. But it’s opinionated and it’s different than a 140-character post about what you think or see or what you just heard.”
Dataminr says it works in the space that exists prior to news, a space that beats even major media's 'breaking news' and was once restricted to rushed telephone calls and abbreviated instant messages. This space is currently filled by Twitter (apparently supplanting Google (NASDAQ:GOOG), which once tracked trends), and Dataminr is using it to give its clients what it calls an information advantage.
What’s the value in knowing Google acquired Incentive Targeting 54 minutes before major news outlets published coverage? Or seeing the Coast Guard has reported an explosion at the Black Elk Oil rig in the Gulf of Mexico 25 minutes before major news reports drove BP (NYSE:BP) -- the newly crowned poster-child for gulf oil catastrophes -- to an intraday low of $39.58, a level not seen since early August?
It’s a huge advantage, believes Bailey, whose product serves top banks, $15-billion-plus hedge funds, and equities firms that have made their name on information advantage -- as well as a number of government agencies about which he remains tight-lipped. “The thing that is really unusual about Twitter, the phenomenon of micro blogging: All of it is completely open, and it has fostered a use case of people describing the world around them, seeing things in real time and reacting and publishing information as quick as possible.”
And they’re doing it on a colossal scale. When Bailey founded Dataminr in 2009, Twitter had close to 1 billion tweets total. Now there are a billion tweets published every two days. There was no certainty Twitter would mushroom into what it’s become. He admits it was a risk, but a risk that has paid off.
“Twitter isn’t just the biggest data set. It’s the data set,” Bailey says.
For anyone paying attention, big data has become hot news lately, and for very good reasons, according to data forecasting golden boy Nate Silver. In his New York Times best-selling book The Signal and the Noise, he writes about human beings' natural inclination to look for signs in large sets of data. “Our minds are quick. We are wired to detect patterns and respond to opportunities without much hesitation.”
But apparently not quick enough. Silver points out that no matter how big Big Data gets, we will always be hindered by our physiological disadvantage; 2.5 quintillion bytes of data are being generated every day, while the human brain can store roughly 3 terabytes over a lifetime. That’s peanuts in comparison.
An algorithm walks into a bar...
Wherever there are eyes and ears, there is data being created on Twitter. Lots of it at that. And as with all large data sets, patterns will inevitably emerge.
Dataminr has over 20 proprietary algorithms that scan Twitter’s Firehose in search of relevant hotspots of activity. Technically speaking, these algorithms seek out ‘abnormal activity’ utilizing a broad spectrum of approaches. They pinpoint activity from, say, a trusted source that has delivered information on an influential topic in the past, or a blossom of activity from handles that have never been important for any reason before but are all in the same location, tweeting about a real-time event.
In layman’s terms, Dataminr is employing a very complex, very intelligent system of filters to select information it sees as relevant to its clients’ needs. It then delivers that information very quickly.
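Dataminr's algorithms are proprietary and the company does not describe how they work, so the following Python is only a toy sketch of one kind of 'abnormal activity' mentioned above: a sudden burst of tweets from a single location. It assumes tweets have already been counted per location in fixed time windows; the function name, the z-score test, and the sample numbers are all illustrative assumptions, not Dataminr's method.

```python
from statistics import mean, pstdev

def find_location_bursts(history, current, threshold=3.0):
    """Return locations whose latest tweet count is unusually high.

    history: dict of location -> list of tweet counts in past windows
    current: dict of location -> tweet count in the latest window
    threshold: standard deviations above the baseline that count as abnormal
    (All names and the z-score approach are hypothetical, for illustration.)
    """
    bursts = []
    for location, count in current.items():
        past = history.get(location, [])
        if len(past) < 2:
            continue  # not enough baseline to judge this location
        baseline = mean(past)
        spread = pstdev(past) or 1.0  # fall back to 1.0 when the baseline is flat
        z_score = (count - baseline) / spread
        if z_score >= threshold:
            bursts.append((location, round(z_score, 1)))
    # Most abnormal locations first
    return sorted(bursts, key=lambda item: item[1], reverse=True)

# Made-up counts: a normally quiet area suddenly lights up,
# while a busy city stays within its usual range.
history = {
    "Gulf of Mexico": [2, 3, 1, 2, 2],
    "New York": [400, 420, 390, 410, 405],
}
current = {"Gulf of Mexico": 40, "New York": 415}

for location, score in find_location_bursts(history, current):
    print(location, score)
```

In this sketch the quiet location is flagged even though the busy one has far more tweets in absolute terms, because the measure is relative to each location's own baseline. A real system would also have to weigh source reliability and filter out noise.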
But quick can often mean low quality. Twitter has no editors. It has no censors. Anyone has the ability to propagate any type of information, be it true, false, or utter nonsense.
“There are a number of answers to the trustworthy question. The key to dealing with Twitter is knowing that you can’t really take one approach. Verification comes with the general expertise of those that are discussing the signal, or the geo proximity of those reporting the signal, and how normal that would be. And then there is the whole range in between,” says Bailey.
Truth and news, however, have never been solely synonymous, and rumors surely move markets too.
Larry Rafsky, the founder and CEO of Acquire Media, certainly believes this. His NewsEdge product line is a leading premium real-time news delivery system used throughout the financial world, and he is adamant that people must realize “Twitter is the new newsfeed!”
NewsEdge (which incorporates the aforementioned PeopleBrowsr's Twitter analytics platform) offers a product slightly different from Dataminr’s. Two integrated NewsEdge services, People are Talking and VORTEX (Velocity Of ReTweet EXpansion), combine to search Twitter for relevant mentions of public companies (or their brands) only, and measure the speed at which these tweets are being retweeted.
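How VORTEX actually computes its measure is not described, but the basic idea of retweet speed can be sketched in a few lines of Python. The example below assumes a list of retweet timestamps for a single tweet and reports retweets per minute over the most recent window; the function name, the five-minute window, and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def retweet_velocity(retweet_times, window_minutes=5):
    """Retweets per minute over the most recent window.

    retweet_times: list of datetime objects, one per retweet.
    (A hypothetical measure for illustration, not the VORTEX formula.)
    """
    if not retweet_times:
        return 0.0
    latest = max(retweet_times)
    cutoff = latest - timedelta(minutes=window_minutes)
    # Keep only retweets that fall inside the trailing window
    recent = [t for t in retweet_times if t > cutoff]
    return len(recent) / window_minutes

# Made-up timestamps: retweets arrive slowly at first, then pile up.
start = datetime(2012, 11, 30, 10, 31)
offsets_in_seconds = (0, 200, 400, 500, 550, 580, 590, 600)
times = [start + timedelta(seconds=s) for s in offsets_in_seconds]

print(retweet_velocity(times))
```

Comparing this figure across successive windows would show whether a mention is accelerating or fading, which matters more to a trader than the raw retweet count.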
“We don’t care about the truth,” Rafsky tells Minyanville. “We care about where the equity price is going.”