Google service analyzes live streaming data
Google Cloud Dataflow can analyze both streaming and batched data with the same programming model
IDG News Service - Taking what many see as the next step in big data analysis, Google is previewing a service called Google Cloud Dataflow that analyzes live data, potentially giving users the ability to view trends and be alerted to events as they happen.
"There's an enormous amount of data being created, and so you need a way to ingest that in a more intelligent way," said Brian Goldfarb, Google Cloud Platform head of marketing. With big data, "the program models are different. The technologies are different. It requires developers to learn a lot and manage a lot to make it happen."
"It is a fully managed service that lets you create data pipelines for ingesting, transforming and analyzing arbitrary amounts of data in both batch or streaming mode, using the same programming model," Goldfarb said.
Google Cloud Dataflow is designed so the user can focus on devising proper analysis, without worrying about setting up and maintaining the underlying data piping and processing infrastructure.
It could be used for live sentiment analysis, for instance, where an organization estimates the popular sentiment around a product by scanning social networks such as Twitter. It could also be used as a security tool to watch activity logs for unusual behavior.
"There are a bunch of different business applications in which it could apply. In a lot of data-centric verticals, like retail or oil and gas, a technology like this could open the door to getting analytics," Goldfarb said.
It could also be used as an alternative to commercial ETL (extract, transform and load) programs, which are widely used to prepare data for analysis by business intelligence software.
Google Cloud Dataflow is based on technologies that the company built internally for its own use, following up on work it did on the MapReduce programming model, which is used in Apache Hadoop.
Live data stream analysis appears to be the next logical step in big data analysis, a field pioneered by Hadoop. Hadoop provides a way to analyze massive amounts of unstructured data spread across multiple servers. Originally, Hadoop used MapReduce as the platform to write programs that analyze the data.
MapReduce's limitation is that it can only analyze data in batch mode, which means all the data must be collected before it can be analyzed. A number of new software programs have been developed to get around the limitation of batch processing, such as Twitter Storm and Apache Spark, which are both available as open source and can run on Hadoop.
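The difference between the two modes can be illustrated with a minimal Python sketch. It is not the Cloud Dataflow API, and every name in it (count_words, analyze_batch, analyze_stream, the sample messages) is hypothetical. It only shows how one analysis step might be shared between a batch job, which produces a result after all the data is collected, and a streaming job, which produces an updated result as each message arrives.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_words(counts: Counter, message: str) -> Counter:
    """Shared analysis step: fold one message into the running word counts."""
    counts.update(message.lower().split())
    return counts


def analyze_batch(messages: List[str]) -> Counter:
    """Batch mode: the full data set exists up front; one final result is returned."""
    counts: Counter = Counter()
    for message in messages:
        count_words(counts, message)
    return counts


def analyze_stream(messages: Iterable[str]) -> Iterator[Counter]:
    """Streaming mode: yield a snapshot of the counts after every message."""
    counts: Counter = Counter()
    for message in messages:
        count_words(counts, message)
        yield Counter(counts)  # copy, so earlier snapshots are not mutated later


# Hypothetical sample data standing in for a feed of social media posts.
messages = ["great phone", "great camera", "poor battery"]

batch_result = analyze_batch(messages)
stream_results = list(analyze_stream(iter(messages)))
```

In this sketch the last streaming snapshot matches the batch result, because both paths apply the same count_words step. The streaming path simply makes intermediate results available before all the data is in.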
Google's own approach to live data analysis uses a number of technologies built by the company, notably Flume and MillWheel. Flume aggregates large amounts of data and MillWheel provides a platform for low-latency data processing.