Twitter indexes all tweets, allows searching with ease
Washington: Twitter has indexed every public tweet posted since the microblogging platform's launch in 2006 and made them all searchable, a collection that amounts to hundreds of billions of messages. A new, more powerful search function will let users search for specific words used by specific users, hashtags used between a set of given dates and other variables. The company is making the complete results from its "Tweet Index" available through the "All" tab of search results generated by its Web client and iOS and Android apps, Twitter said in a blog post.
While the microblogging site already enables discovery of relatively fresh user-generated content via "an inverted index containing about a week's worth of recent tweets," the project to enable efficient searching of the "roughly half a trillion documents" contained on the site has been long in the making. "Since that first simple tweet over eight years ago, hundreds of billions of tweets have captured everyday human experiences and major historical events," project co-leader Yi Zhuang wrote.
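For readers unfamiliar with the term, an inverted index maps each word to the set of documents that contain it, so a query only has to look up its own words rather than scan every tweet. The following is a minimal sketch of that general idea in Python, not Twitter's implementation; the function names, the tokenizing rule and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words, keeping a leading # or @."""
    return re.findall(r"[#@]?\w+", text.lower())


def build_inverted_index(tweets):
    """Map every term to the set of tweet ids whose text contains it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for term in tokenize(text):
            index[term].add(tweet_id)
    return index


def search(index, query):
    """Return the ids of tweets that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data, invented for this example.
sample_tweets = {
    1: "just setting up my twttr",
    2: "Breaking news: setting records",
    3: "Search every #tweet ever published",
}
sample_index = build_inverted_index(sample_tweets)
matches = search(sample_index, "setting up")
```

In this sketch a query for "setting up" intersects the entries for "setting" and "up", leaving only the first sample tweet. A production index would also store positions, timestamps and ranking signals, none of which are modeled here.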
"Our search engine excelled at surfacing breaking news and events in real time, and our search index infrastructure reflected this strong emphasis on recency. But our long-standing goal has been to let people search through every tweet ever published," Zhuang wrote. The new Tweet Index "serves queries with an average latency of under 100 milliseconds," the company said.
While the real-time index of recent tweets is "fully stored in RAM for low latency and fast updates," the full index uses cheaper storage to keep its maintenance from being "prohibitively expensive." Twitter's priorities in creating the comprehensive, searchable index of tweets were to make the system modular, scalable as it grows over time, cost-effective, simple and quick for people to use, and developed incrementally, Zhuang said.
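The split between a fast in-memory index for recent tweets and a cheaper store for everything else can be pictured with a small two-tier sketch. This is an illustration only: the article does not say how tweets move between tiers or how queries are routed, so the roll-over step, the seven-day window (taken from the "about a week's worth" figure) and every name below are assumptions, and both tiers are plain Python dictionaries standing in for RAM and cheaper storage.

```python
from datetime import datetime, timedelta

# Assumption: the recent tier holds about a week of tweets.
RECENT_WINDOW = timedelta(days=7)


class TieredIndex:
    """Toy two-tier store: 'recent' stands in for RAM, 'archive' for cheaper storage."""

    def __init__(self):
        self.recent = {}   # tweet_id -> (created_at, text)
        self.archive = {}  # tweet_id -> (created_at, text)

    def add(self, tweet_id, created_at, text):
        """New tweets always land in the recent tier first."""
        self.recent[tweet_id] = (created_at, text)

    def roll_over(self, now):
        """Move tweets older than the window into the archive; return how many moved."""
        cutoff = now - RECENT_WINDOW
        old_ids = [tid for tid, (ts, _) in self.recent.items() if ts < cutoff]
        for tid in old_ids:
            self.archive[tid] = self.recent.pop(tid)
        return len(old_ids)

    def search(self, word, include_archive=True):
        """Return sorted ids of tweets containing the word, optionally skipping the archive."""
        word = word.lower()
        tiers = [self.recent]
        if include_archive:
            tiers.append(self.archive)
        hits = [
            tid
            for tier in tiers
            for tid, (_, text) in tier.items()
            if word in text.lower().split()
        ]
        return sorted(hits)


# Hypothetical usage with invented tweets and dates.
now = datetime(2014, 11, 19)
tiered = TieredIndex()
tiered.add(1, datetime(2006, 3, 21), "just setting up my twttr")
tiered.add(2, now - timedelta(days=1), "setting records today")
moved = tiered.roll_over(now)
all_hits = tiered.search("setting")
recent_hits = tiered.search("setting", include_archive=False)
```

Here the 2006 tweet is moved to the archive tier, so a search limited to the recent tier finds only the newer tweet, while a search across both tiers finds both.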