Facebook engineers identify big data challenges for Graph Search
'There's still a lot of work we have to do,' they say
IDG News Service - Facebook's engineers have many challenges ahead of them as they work to scale up Graph Search, the site's new social search tool. One stumbling block: an over-abundance of data to sift through.
Take the example of searching for Japanese restaurants in New York City liked by people from Japan. A search that would seem likely to generate hundreds, if not thousands, of results returns just two businesses.
The search engine, in its current beta form, simply does not have the processing power to sift through the millions of connections among Japanese people on the site to perform the search, Facebook engineers said Thursday during a small media briefing at the company's headquarters in Menlo Park, California.
"There's still a lot of work we have to do," said software engineer Michael Curtiss. "A query like this is very difficult computationally," he said, because it has to start with the 100 million in Japan and then, in a fraction of a second, sort through all the pages liked by people in Japan.
"This is virtually intractable in the limited amount of time that we have," said the engineer, who helped design the site's Unicorn search engine, which provides Graph Search's infrastructure. "What we end up having to do is cut out possibly good results."
Facebook is taking a variety of approaches to solve this and other big data problems associated with Graph Search.
One strategy involves "query optimization," a database technique for improving the speed and efficiency of certain types of searches.
In the case of the Japanese restaurant search, the technique could be applied to start with the restaurants that are liked, instead of starting with Japan, and then filter the likes down by people, Facebook engineers said.
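The reordering the engineers describe can be illustrated with a toy example. The Python sketch below is not Facebook's Unicorn code; the users, pages and likes are all made up for illustration, and the function names are hypothetical. It runs the same query two ways and counts how many "like" connections each plan has to examine.

```python
from collections import defaultdict

# Hypothetical toy data; none of these names come from Facebook.
HOME_COUNTRY = {"aiko": "Japan", "ken": "Japan", "yuki": "Japan", "maria": "USA"}
PAGE_INFO = {
    "ramen_a": ("japanese_restaurant", "New York"),
    "sushi_b": ("japanese_restaurant", "New York"),
    "pizza_c": ("pizzeria", "New York"),
    "tokyo_tower": ("landmark", "Tokyo"),
    "anime_page": ("tv_show", None),
    "band_page": ("musician", None),
}
LIKES = [
    ("aiko", "sushi_b"), ("aiko", "tokyo_tower"),
    ("aiko", "anime_page"), ("aiko", "band_page"),
    ("ken", "ramen_a"), ("ken", "tokyo_tower"), ("ken", "anime_page"),
    ("yuki", "tokyo_tower"), ("yuki", "band_page"),
    ("maria", "ramen_a"), ("maria", "pizza_c"),
]

# Index the likes in both directions so either plan can be followed.
likes_by_user = defaultdict(list)
likers_by_page = defaultdict(list)
for user, page in LIKES:
    likes_by_user[user].append(page)
    likers_by_page[page].append(user)


def is_target(page):
    """True for a Japanese restaurant located in New York."""
    return PAGE_INFO[page] == ("japanese_restaurant", "New York")


def people_first():
    """Plan A: start with people from Japan, then scan every page they like."""
    results, examined = set(), 0
    for user, country in HOME_COUNTRY.items():
        if country != "Japan":
            continue
        for page in likes_by_user[user]:
            examined += 1
            if is_target(page):
                results.add(page)
    return results, examined


def restaurants_first():
    """Plan B: start with matching restaurants, then look for one liker from Japan."""
    results, examined = set(), 0
    for page in PAGE_INFO:
        if not is_target(page):
            continue
        for user in likers_by_page[page]:
            examined += 1
            if HOME_COUNTRY[user] == "Japan":
                results.add(page)
                break  # one matching liker is enough for this page
    return results, examined


print(people_first())
print(restaurants_first())
```

Both plans return the same two restaurants, but with this toy data the people-first plan examines nine likes and the restaurants-first plan examines two. The gap comes from starting with the smaller set, which is the general idea behind reordering a query; how Graph Search actually chooses its plan is not described here.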
The company is also addressing the challenges at the hardware level, adding flash memory and other new features to the servers in its data centers to accommodate the increase in search traffic caused by Graph Search.
"We need to do extra work in data centers, buying new hardware platforms, [with] new types of servers being put up to support the computational needs of Unicorn," said Soren Lassen, who led the search infrastructure team behind Graph Search.
Facebook began rolling out Graph Search last month to a limited number of users in the U.S. The search tool is designed to let people comb through the social network's 1 trillion connections among users to search for people, places, photos and interests using phrases in plain English.
In principle, nothing can stop users from typing in a query that is unusually long, such as "Employers of friends of my friends who live in New York and who like Downton Abbey," engineers said, since Graph Search uses cues such as "Likes" and check-ins to more easily rank the results.
Eventually Graph Search will incorporate other metrics such as user comments and status updates to compile and rank results, but that's further down the line, the company said.