Facebook engineers identify big data challenges for Graph Search
'There's still a lot of work we have to do,' they say
IDG News Service - Facebook's engineers have many challenges ahead of them as they work to scale up Graph Search, the site's new social search tool. One stumbling block: an over-abundance of data to sift through.
Take the example of searching for Japanese restaurants in New York City liked by people from Japan. A search that would seem likely to generate hundreds, if not thousands, of results spits back only two measly businesses.
The search engine, in its current beta form, simply does not have the processing power to sift through the millions of connections among Japanese people on the site to perform the search, Facebook engineers said Thursday during a small media briefing at the company's headquarters in Menlo Park, California.
"There's still a lot of work we have to do," said software engineer Michael Curtiss. "A query like this is very difficult computationally," he said: it has to start with the 100 million in Japan and then, in a fraction of a second, sort through all the pages liked by people in Japan.
"This is virtually intractable in the limited amount of time that we have," said the engineer, who helped to design the site's Unicorn search engine that provides Graph Search's infrastructure. "What we end up having to do is cut out possibly good results."
Facebook is taking a variety of approaches to solve this and other big data problems associated with Graph Search.
One strategy involves a concept in computer databases known as "query optimization," to improve the speed and efficiency of certain types of searches.
In the case of the Japanese restaurant search, the technique could be applied to start with the restaurants that are liked, rather than with Japan, and then filter those likes down by people, Facebook engineers said.
The company is also addressing the challenges at the hardware level, by adding flash memory and other new features to the servers it uses at data centers, to accommodate the increase in search traffic caused by Graph Search.
"We need to do extra work in data centers, buying new hardware platforms, [with] new types of servers being put up to support the computational needs of Unicorn," said Soren Lassen, who led the search infrastructure team behind Graph Search.
Facebook began rolling out Graph Search last month to a limited number of users in the U.S. The search tool is designed to let people comb through the social network's 1 trillion connections among users to search for people, places, photos and interests using phrases in plain English.
In principle, nothing can stop users from typing in a query that is unusually long, such as "Employers of friends of my friends who live in New York and who like Downton Abbey," engineers said, since Graph Search uses cues such as "Likes" and check-ins to more easily rank the results.
Eventually Graph Search will incorporate other metrics such as user comments and status updates to compile and rank results, but that's further down the line, the company said.
Zach Miners covers social networking, search and general technology news for IDG News Service. Follow Zach on Twitter at @zachminers. Zach's e-mail address is zach_miners@idg.com