
New search engine takes 'DeepDyve' into the Dark Web

Engine offers eye into area of Web that traditional search engines can't penetrate

November 11, 2008 12:00 PM ET



Computerworld - DeepDyve Inc. today announced that it has launched a free search engine that can be used to access databases, scholarly journals, unstructured information and other data sources in the so-called "Deep Web" or "Dark Web," where traditional search technologies don't work.

The DeepDyve search engine makes it easier to find life sciences, patent and Wikipedia data in the Dark Web. The new engine indexes 500 million pages, said DeepDyve, which was known as Infovell before changing its name today.

The company said it will soon start indexing physical sciences content in the areas of information technology, clean technology and energy. That will help it meet its goal of expanding its index to more than 1 billion pages by the end of the year.

Because much of the content on the Deep Web is made up of technical publications, databases, scholarly publications and unstructured data, it has been difficult for traditional search engines to access it. To tackle this problem, DeepDyve is partnering with publishers and other providers of those sources of information to gain access to content overlooked by other engines, the company added.

Google Inc. announced earlier this month that it is ratcheting up its focus on the Dark Web by adding the ability to search PDF documents. In April, Google had announced that it was trying to find a way for its search engine to index HTML forms such as drop-down boxes and select menus that are typically part of the Dark Web.

"According to IDC, more than 42 million consumers spend 25 hours per month online researching business and personal information, and they are frustrated with the results they get back and the tools they have to use," said William Park, CEO of DeepDyve, in a statement. "DeepDyve gives information-savvy consumers unparalleled access to quality information found only in the Deep Web, with features and functionality that make it easy to find, filter and organize their results."

The company said that its technology is designed to allow users to type in a few words or copy an entire article into a query box to find all related articles located in the Deep Web.

Chris Sherman, a blogger at Search Engine Land, said that DeepDyve's approach to scouring the Deep Web is innovative. He credited the company's chief scientists, who are veteran genomics researchers. To crack the genetic codes contained in DNA sequences, he noted, researchers must understand the hidden patterns in massive amounts of data.

"DeepDyve takes a similar approach to understanding information on the Web," Sherman added. "Going far beyond basic keyword-based search, DeepDyve indexes every word in a document, but also computes the factorial combination of words and phrases in the document and uses some industrial-strength statistical techniques to assess the 'informational impact' of these combinations. In essence, this approach looks at the meaning of an entire document and uses that to compute relevance, rather than factors like snippets of text or anchor text in links pointing to documents."
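As a rough illustration of the idea Sherman describes, the sketch below indexes single words and short word combinations, then ranks documents against a query that can be as long as a full article. It is not DeepDyve's algorithm, which the company has not published here. The function names (`phrases`, `build_weights`, `rank`), the three-word phrase limit and the rarity-based weight standing in for "informational impact" are all assumptions made for the example.

```python
import math
from collections import Counter


def phrases(text, max_len=3):
    """Return every run of 1..max_len consecutive words in the text."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    ]


def build_weights(docs, max_len=3):
    """Weight each phrase by rarity: phrases in fewer documents score higher.

    Assumption: a smoothed inverse document frequency is used here as a
    stand-in for the 'informational impact' mentioned in the article.
    """
    doc_freq = Counter()
    for text in docs.values():
        doc_freq.update(set(phrases(text, max_len)))
    total = len(docs)
    return {p: math.log((1 + total) / (1 + df)) + 1.0 for p, df in doc_freq.items()}


def rank(query, docs, max_len=3):
    """Rank document ids by the summed weight of phrases shared with the query."""
    weights = build_weights(docs, max_len)
    query_phrases = set(phrases(query, max_len))
    scores = {}
    for doc_id, text in docs.items():
        shared = query_phrases & set(phrases(text, max_len))
        scores[doc_id] = sum(weights[p] for p in shared)
    return sorted(scores, key=scores.get, reverse=True)
```

Because the query is broken into the same phrases as the documents, pasting in a whole article works the same way as typing a few words: documents that share more rare words and word combinations with the query rise to the top.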

DeepDyve isn't a threat to Google now, nor is it likely to be in the future, but its tool is good for people who want to do research in areas that DeepDyve indexes.

"DeepDyve also offers a genuinely different 'second opinion' of the Web if you're wanting to look beyond the top results returned by Google and the other major search engines," Sherman noted. "With its limited initial offering, DeepDyve has just scratched the surface of what's available on the invisible Web, albeit in a very useful way. However, truly cracking the invisible Web problem still seems like a distant dream."


