Facebook engineers identify big data challenges for Graph Search
'There's still a lot of work we have to do,' they say
IDG News Service - Facebook's engineers have many challenges ahead of them as they work to scale up Graph Search, the site's new social search tool. One stumbling block: an overabundance of data to sift through.
Take the example of searching for Japanese restaurants in New York City liked by people from Japan. A search that would seem likely to generate hundreds if not thousands of results spits back only two measly businesses.
The search engine, in its current beta form, simply does not have the processing power to sift through the millions of connections among Japanese people on the site to perform the search, Facebook engineers said Thursday during a small media briefing at the company's headquarters in Menlo Park, California.
"There's still a lot of work we have to do," said software engineer Michael Curtiss. "A query like this is very difficult computationally," to start with the 100 million in Japan, and then in a fraction of a second to sort through all the pages liked by people in Japan, he said.
"This is virtually intractable in the limited amount of time that we have," said the engineer, who helped to design the site's Unicorn search engine that provides Graph Search's infrastructure. "What we end up having to do is cut out possibly good results."
Facebook is taking a variety of approaches to solve this and other big data problems associated with Graph Search.
One strategy borrows a database concept known as "query optimization" to improve the speed and efficiency of certain types of searches.
In the case of the Japanese restaurant search, the technique could be applied to start with the Japanese restaurants in New York and the people who like them, rather than with everyone from Japan, and then filter those likes down to the ones made by people from Japan, Facebook engineers said.
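The reordering the engineers describe can be illustrated with a toy example. The sketch below is not Facebook's Unicorn code; the people, pages, likes, and function names are all hypothetical, made up to show why the starting point of a query matters. Both plans return the same restaurants, but they touch different numbers of "like" connections along the way.

```python
from collections import defaultdict

# Hypothetical toy data, invented for illustration only.
HOME = {"aiko": "Japan", "ken": "Japan", "yui": "Japan", "bob": "US"}
PAGES = {
    "Sushi Ko": ("japanese_restaurant", "New York"),
    "Ramen Ya": ("japanese_restaurant", "New York"),
    "Tokyo Tower": ("landmark", "Tokyo"),
    "J-Pop Daily": ("media", "Tokyo"),
    "Pizza Joe": ("pizzeria", "New York"),
}
LIKES = [
    ("aiko", "Sushi Ko"), ("aiko", "Tokyo Tower"), ("aiko", "J-Pop Daily"),
    ("ken", "Tokyo Tower"), ("ken", "J-Pop Daily"), ("ken", "Ramen Ya"),
    ("yui", "Tokyo Tower"), ("yui", "J-Pop Daily"),
    ("bob", "Sushi Ko"), ("bob", "Pizza Joe"),
]

# Index the likes in both directions so either plan can be followed.
likes_by_person = defaultdict(list)
likers_by_page = defaultdict(list)
for person, page in LIKES:
    likes_by_person[person].append(page)
    likers_by_page[page].append(person)


def is_target(page):
    """True for a Japanese restaurant in New York."""
    return PAGES[page] == ("japanese_restaurant", "New York")


def plan_people_first():
    """Start with people from Japan, then scan every page they like."""
    results, visited = set(), 0
    for person, country in HOME.items():
        if country != "Japan":
            continue
        for page in likes_by_person[person]:
            visited += 1  # one "like" connection examined
            if is_target(page):
                results.add(page)
    return results, visited


def plan_restaurants_first():
    """Start with matching restaurants, then filter their likers by home country."""
    results, visited = set(), 0
    for page in PAGES:
        if not is_target(page):
            continue
        for person in likers_by_page[page]:
            visited += 1  # one "like" connection examined
            if HOME[person] == "Japan":
                results.add(page)
    return results, visited


if __name__ == "__main__":
    print(plan_people_first())
    print(plan_restaurants_first())
```

In this toy data the restaurants-first plan examines fewer connections because only a handful of pages match the restaurant filter, while people from Japan like many unrelated pages. Whether the same ordering wins on real data depends on the relative sizes of the two starting sets, which is what a query optimizer tries to estimate.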
The company is also addressing the challenges at the hardware level, adding flash memory and other new features to the servers in its data centers to accommodate the increase in search traffic caused by Graph Search.
"We need to do extra work in data centers, buying new hardware platforms, [with] new types of servers being put up to support the computational needs of Unicorn," said Soren Lassen, who led the search infrastructure team behind Graph Search.
Facebook began rolling out Graph Search last month to a limited number of users in the U.S. The search tool is designed to let people comb through the social network's 1 trillion connections among users to search for people, places, photos and interests using phrases in plain English.
In principle, nothing can stop users from typing in a query that is unusually long, such as "Employers of friends of my friends who live in New York and who like Downton Abbey," engineers said, since Graph Search uses cues such as "Likes" and check-ins to more easily rank the results.
Eventually Graph Search will incorporate other metrics such as user comments and status updates to compile and rank results, but that's further down the line, the company said.