A Big Data view of the Internet of Things

The Internet of Things turns out to be as much a big data challenge as it is a sensor and network (M2M) challenge. Data and analytics architectural choices will be important for government CIOs who want to realise the value of these solutions at an affordable cost.

First it was smart metering, then smart buildings, then intelligent cars, then smart cities and so forth. As the hype starts to clear and the convergence of sensors and actuators with traditional ICT begins to deliver tangible benefits, including in the public sector, as we discussed in a recent research study, the technology will move from early-stage development and pilots to the deployment of more optimised architectures.

One aspect of those Internet of Things architectures that the study recommended government CIOs watch out for is the data implications. As we said in the study: "Each connected 'thing' should be considered a point of data capture, analysis, and actionability. The provisioning, analysis, archiving, security, and retrieval of that data are at the core of IoT business services that can be produced and delivered. The value that can be harnessed from objects, either for improving internal productivity or enhancing constituent service, becomes a data issue much more than a network infrastructure issue." And we are talking about a lot of data, often generated continuously by embedded computing devices, so more often than not it will be a big data scenario.

All of that data, coming from thousands of devices added daily to the Internet, requires new architectural approaches that government CIOs need to consider. Without optimised approaches, governments will not be able to afford some of these technologies and harness the value of that information, be it from an environmental sensor that monitors pollution in a densely populated area, a movement sensor installed in the house of an elderly or ill person, or a chemical sensor embedded in the uniform of a firefighter that sends an alert when there are deadly gases in the air.

An important question will be where to put the intelligence. In other words, will it be more efficient to have dumb sensors that simply transmit all the data to a central server, where a data warehouse can be built and the analytical capabilities applied, or will it make sense to put some intelligence at the periphery, in sensors or nodes? In some cases, for example when real-time response is crucial, it will probably be more effective to put some intelligence locally, so that the sensor in the firefighter's uniform can not only send data to a server but also recognise an anomaly in the data and even alert the firefighter by triggering a sound or visual alarm.

In other cases, it could be more efficient to keep intelligence in sensors or peripheral nodes to a minimum, so that they use less energy and are easier to maintain, and instead consolidate the data centrally to run the analysis. For example, speed cameras do not necessarily need to calculate speed; they can send the raw data to a central server. Or it could be a mix of the two: the speed cameras could send only the anomaly data to a central server, so that network usage is optimised, and the algorithms that calculate the amount of the fine would be applied there.
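The mixed approach for speed cameras can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Reading` record, the 50 km/h limit, the function names and the fine formula are all hypothetical and do not come from any real enforcement system. The camera keeps only readings above the limit, and the central server computes the fine.

```python
from dataclasses import dataclass

SPEED_LIMIT_KMH = 50.0  # assumed limit, for illustration only


@dataclass
class Reading:
    plate: str
    speed_kmh: float


def edge_filter(readings, limit_kmh=SPEED_LIMIT_KMH):
    """Runs on the camera: keep only readings above the limit."""
    return [r for r in readings if r.speed_kmh > limit_kmh]


def central_fine(reading, limit_kmh=SPEED_LIMIT_KMH, rate_per_kmh=10.0):
    """Runs on the server: hypothetical fine proportional to excess speed."""
    return (reading.speed_kmh - limit_kmh) * rate_per_kmh


readings = [
    Reading("AA111", 42.0),
    Reading("BB222", 63.0),
    Reading("CC333", 50.0),
]

# Only the anomaly data crosses the network.
to_send = edge_filter(readings)

# The fine is calculated centrally from what was sent.
fines = {r.plate: central_fine(r) for r in to_send}
print(fines)
```

Only one of the three readings is transmitted here, which is the network saving the mixed approach is after. The trade-off is that the filtered-out readings are no longer available centrally for other purposes.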

Technology companies and academic researchers will certainly provide more and more effective answers to this question, but some of those solutions will be competing alternatives, and government CIOs will need to choose among them.

Another important question will be how to architect the data itself. With a widening range of use cases and stakeholders that can leverage data in different contexts, it will be very difficult to envision the perfect data architecture with a precisely defined core entity and all of its attributes. Going back to the speed cameras example, they are usually used to issue fines, but they could also be used to monitor traffic conditions, or to check whether cars have paid insurance or taxes, where those require a recognisable tag on the windshield or the number plate.
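One way to hedge against not knowing the core entity up front is to store each camera capture as a generic observation and derive use-case-specific views from it later. The sketch below is an assumption made for illustration, not a recommended schema: the `Observation` record, its attribute names and the two view functions are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Observation:
    """A generic capture: no single attribute is privileged as the core entity."""
    device_id: str
    observed_at: datetime
    attributes: dict = field(default_factory=dict)


def speeding_view(observations, limit_kmh):
    """Use case 1: plates of vehicles recorded above the limit."""
    return [
        o.attributes["plate"]
        for o in observations
        if o.attributes.get("speed_kmh", 0) > limit_kmh
    ]


def traffic_view(observations):
    """Use case 2: number of vehicles seen per camera."""
    counts = {}
    for o in observations:
        counts[o.device_id] = counts.get(o.device_id, 0) + 1
    return counts


observations = [
    Observation("cam-1", datetime(2013, 5, 1, 8, 0), {"plate": "AA111", "speed_kmh": 42}),
    Observation("cam-1", datetime(2013, 5, 1, 8, 1), {"plate": "BB222", "speed_kmh": 63}),
    Observation("cam-2", datetime(2013, 5, 1, 8, 2), {"plate": "CC333", "speed_kmh": 48}),
]

print(speeding_view(observations, limit_kmh=50))
print(traffic_view(observations))
```

A new use case, such as an insurance check, would then be another view over the same observations rather than a redesign of the core entity, provided the relevant attribute was captured in the first place.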

In the first iteration of architectural design it will be hard to figure out exactly all of the possible use cases, and hence whether the core data entity is the number plate, the insurance tag, or the car speed. Unexpected use cases come up because technology is evolving very fast. They also come up because, even when many use cases can be planned, separate government organisations are often accountable for those processes and activities, and they either lack a culture of collaboration or simply cannot collaborate because current laws do not permit them to share data. And sometimes there is also a mechanical engineering angle to be factored in.

Recently I was at the SAS analyst event, where their lead big data executive gave an interesting example. Accelerometers embedded in cars give insurance companies useful information for adapting premiums to driving style. But accelerometers are not installed in a standard way that allows detection of left-right and forward-backward movement; they are installed just to measure changes in speed. Yet information on backward and forward movement can be equally useful. For example, an accelerometer installed in a pick-up truck that is used to plow snow in a driveway or a parking lot will transmit a lot of accelerations in a short time, but those short backward and forward movements are unlikely to add much risk to the profile of that driver.
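To make the snow plow problem concrete, here is a deliberately naive heuristic. It is not the method SAS used, which was not described in detail. It assumes that the only data available are timestamps (in seconds) of acceleration events, and that a dense burst of such events is manoeuvring rather than risky driving. The function names and the thresholds `max_gap_s` and `min_burst_size` are hypothetical.

```python
def split_bursts(event_times, max_gap_s=10.0):
    """Group timestamps into runs whose neighbours are at most max_gap_s apart."""
    bursts = []
    for t in sorted(event_times):
        if bursts and t - bursts[-1][-1] <= max_gap_s:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


def risk_relevant_events(event_times, max_gap_s=10.0, min_burst_size=5):
    """Keep isolated events; drop dense bursts assumed to be manoeuvring."""
    kept = []
    for burst in split_bursts(event_times, max_gap_s):
        if len(burst) < min_burst_size:
            kept.extend(burst)
    return kept


# One isolated event, six events within half a minute, one isolated event.
events = [100.0, 900.0, 905.0, 911.0, 916.0, 922.0, 928.0, 2400.0]
print(risk_relevant_events(events))
```

A rule this simple would also discard genuinely erratic driving that happens to come in bursts, which is why a real solution needs more careful modelling than a fixed threshold.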

In the specific example provided at the event, SAS worked with the client to mathematically detect the relevant accelerations. But with the rapid evolution of big data and the Internet of Things, government CIOs should expect a lot of those unknown unknowns.

Posted by Massimiliano Claps

Copyright © 2013 IDG Communications, Inc.
