System, Cure Thyself
Self-healing software and hardware are on the way.
Computerworld - Some years ago at an insurance company, a server's file-locking process kept failing, and the vendor couldn't produce a patch to prevent it from happening. As a result, a file could be accessed by more than one user at a time.
The company's IT administrator ultimately wrote a custom script that simply restarted the locking process every time it failed, about every 10 minutes. "It was better than having several hundred users mad at me," recalls the administrator, Nick van der Zweep, now the director of virtualization and utility computing at Hewlett-Packard Co.
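The approach described above, restarting a process whenever it dies, is what is now commonly called a watchdog or supervisor loop. A minimal sketch follows; it is not van der Zweep's actual script, and the `supervise` function, its parameters and the restart cap are assumptions made for illustration.

```python
import time


def supervise(start_process, max_restarts, delay_seconds=0.0):
    """Start a process and restart it every time it exits.

    start_process: callable that launches the process and returns a handle
        with a blocking wait() method (subprocess.Popen objects qualify).
    max_restarts: cap on restarts so this sketch terminates; a production
        watchdog would more likely loop indefinitely and log each failure.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        process = start_process()
        process.wait()  # block until the supervised process exits
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay_seconds)  # brief pause to avoid a tight crash loop
```

With the standard library, `start_process` could be something like `lambda: subprocess.Popen(["lockd"])`, where `lockd` is a placeholder command name. Note that a loop like this works around the failure without diagnosing it, which is why it sits at the simple end of the self-healing spectrum.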
Van der Zweep's custom code was an early example of self-healing software, a general category that has earned the attention of researchers and vendors such as HP, IBM and Computer Associates International Inc. Many other companies are also actively researching and developing self-healing capabilities for their products.
For example, there are already products on the market that automatically correct, or self-heal, components or subsystems such as servers that have reached capacity. (In that case, a program can add more servers or more blades automatically.) But the focus over the next two to five years will be on developing entire networks and systems that self-heal across combinations of applications, storage and computing resources, say analysts and researchers.
Definitions of self-healing vary widely. "Self-healing ... connotes that when there are problems in the infrastructure, the infrastructure copes with them," says Alan Ganek, vice president of autonomic computing at IBM.
For example, Ganek says, when IBM ran the Web site for the U.S. Open tennis tournament in September, software handled workload spikes by delivering computing power from a new server to keep service levels high.
"Self-healing is the capability of any piece of technology to monitor itself and self-diagnose a problem, and then to start a solution that either bypasses or corrects the problem," says Jean-Pierre Garbani, an analyst at Forrester Research Inc. in Cambridge, Mass. For example, HP has products that can detect when a processor is going to fail by noticing single-bit memory errors in the cache, so that they can automatically turn on another HP processor at a customer site.
With those examples in mind, it is clear that self-healing can mean utility computing (as in marshaling resources when needed) as well as autonomic computing (as in correcting an underlying system problem when it occurs).
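The processor example, in which a rising number of correctable single-bit errors is treated as a warning of failure, can be illustrated with a sliding-window counter. This is only a sketch of the general idea, not HP's method; the `ErrorRateMonitor` class, its threshold and its window length are all hypothetical.

```python
from collections import deque


class ErrorRateMonitor:
    """Flag a component once correctable errors become too frequent."""

    def __init__(self, max_errors, window_seconds):
        self.max_errors = max_errors          # errors tolerated per window
        self.window_seconds = window_seconds  # length of the sliding window
        self._timestamps = deque()

    def record_error(self, now):
        """Record one error at time `now` (in seconds).

        Returns True when the count inside the window exceeds max_errors,
        which is the point at which a spare could be brought online.
        """
        self._timestamps.append(now)
        # Discard errors that have aged out of the window.
        while now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_errors
```

Occasional errors spread over a long period never trip the monitor, while a burst inside one window does. That captures the distinction between background noise and a part that is on its way out.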
Richard Ptak, an analyst at Ptak, Noel & Associates in Boston, says that there is a "great deal of confusion" about the term but that to be truly self-healing, a system must perform four functions: self-monitoring, self-analysis, planning and execution. Today, systems implement the four stages "with varying degrees of sophistication," he says.
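Ptak's four functions can be pictured as one pass through a control loop. The sketch below is a toy model under stated assumptions: the `Reading` type, the 90% load threshold and the "add capacity" action are invented for illustration and do not come from any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    component: str
    load: float  # fraction of capacity in use, 0.0 to 1.0


def monitor(state):
    """Self-monitoring: collect a reading for every component."""
    return [Reading(name, load) for name, load in state.items()]


def analyze(readings, threshold=0.9):
    """Self-analysis: pick out the components that are overloaded."""
    return [r for r in readings if r.load > threshold]


def plan(problems):
    """Planning: decide on a corrective action for each problem."""
    return [("add_capacity", r.component) for r in problems]


def execute(actions, state):
    """Execution: apply each action to the (simulated) system."""
    for action, component in actions:
        if action == "add_capacity":
            # Doubling capacity halves the relative load.
            state[component] = state[component] / 2


def heal_once(state):
    """Run one monitor, analyze, plan, execute cycle; return the actions."""
    actions = plan(analyze(monitor(state)))
    execute(actions, state)
    return actions
```

Calling `heal_once({"web": 0.96, "db": 0.40})` plans a single action for the overloaded `web` component and leaves `db` alone. Keeping the four stages as separate functions mirrors the point that today they are often implemented by different products with varying sophistication.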
The real challenge will be to drive implementation of all four steps to the lowest levels, down to devices and circuit elements, Ptak says. He predicts that within two years, manufacturers will have produced self-healing chips, which will support self-healing devices within four years, followed in perhaps another year by organic circuits, which will adapt themselves in order to correct deficiencies or failures. Manufacturers are likely to form partnerships in coming months to unite the four phases of self-healing, adds Jasmine Noel, Ptak's partner.
Meanwhile, start-up Vieo Inc. in Austin is trying to develop a single device that handles all four functions together, to replace a series of devices built by different companies, she says. Garbani notes that Intel Corp. may dominate the server processor market in five years, which could result in low-cost self-healing chips for servers.
In the Network
Researchers at the Georgia Institute of Technology are working with IBM-donated gear to develop self-healing systems for corporate settings. They are exploring how systems can respond to outages and other events more quickly than they can today, says Karsten Schwan, director of the university's Center for Experimental Research in Computer Systems.
One area of the research will be to find ways, perhaps through "network-aware middleware," to have systems self-heal across network layers, from Layer 1, the physical layer, to Layer 7, the application layer, Schwan says.
For example, TCP today slows the sending of packets at lower network layers, especially when they include rich multimedia content. "But this may not be in the interest of the servers running atop TCP," he says. With appropriate middleware, the application server could decide to take steps to affect the transmission, such as compressing the multimedia content more or marshaling more CPU resources, or maybe even sending a thumbnail of a picture instead of the full picture, Schwan says.
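The kind of decision Schwan describes, in which the application adapts what it sends to what the network can carry, might look like the following. This is a hedged sketch rather than the Georgia Tech middleware: the `choose_rendition` function, the rendition names and the sizes are hypothetical.

```python
def choose_rendition(renditions, throughput_kbytes_per_s, deadline_s):
    """Pick the best version of a piece of content that can arrive in time.

    renditions: list of (name, size_kbytes) pairs ordered from highest
        quality to smallest, e.g. full image, compressed image, thumbnail.
    throughput_kbytes_per_s: throughput currently observed on the connection.
    deadline_s: how long the application is willing to let the transfer take.
    """
    budget_kbytes = throughput_kbytes_per_s * deadline_s
    for name, size_kbytes in renditions:
        if size_kbytes <= budget_kbytes:
            return name
    # Nothing fits the budget: fall back to the smallest rendition.
    return renditions[-1][0]
```

Given renditions of 800, 200 and 20 kilobytes and a one-second deadline, a connection delivering 300 kilobytes per second would get the compressed version rather than a full picture that arrives late. The decision is made by the application, which knows what the content is worth, and not by the transport layer.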
As an indication of the interest in self-healing systems, the Defense Advanced Research Projects Agency is evaluating proposals to support research and testing for its Self-Regenerative Systems program. "Network-centric warfare demands robust systems that can respond automatically and dynamically to both accidental and deliberate faults," DARPA has pointed out in its solicitation for bids.
[Chart not reproduced. Source: Ptak, Noel & Associates, Boston]