Out-of-memory problem caused Mars rover's glitch
The rover systems are again working on the Martian surface
Computerworld - A shortage of memory on board the Spirit Mars rover caused it to become unresponsive on the Martian surface on Jan. 22, raising fears that the mission might end almost before it began in earnest.
Mike Deliman, a technical staff member at Wind River Systems Inc., which provided the real-time embedded operating system used in the mission, said the problem has been re-created in testing on Earth and appears to be entirely memory-related.
"It's not a software bug, it's not an application bug, and it's not a hardware bug," Deliman said. "It's a system constraint that we ran up against."
The Spirit rover dedicates 32MB of its 128MB of RAM to the onboard Wind River VxWorks operating system and a host of science applications. As the mission progresses, technicians are scheduled to periodically delete old files and directories to clear memory for reuse, he said.
But with all the excitement after the Mars landing on Jan. 3, and with data being returned to Earth by the rover, mission technicians did not perform that step quickly enough.
"We just ran out of memory, ran out of RAM," Deliman said. "This is why we initially lost contact" with the rover. The six-wheeled vehicle runs hundreds of tasks simultaneously in normal operations, with each operation using its own chunk of RAM, he said.
The VxWorks operating system runs on a specially prepared, radiation-hardened 20-MHz PowerPC CPU installed in each of the rovers, along with 128MB of RAM. The hardware was cutting-edge when it was chosen in the mid-1990s, but it then had to be treated to ensure its reliability in the radiation of deep space -- a process that takes five to 10 years.
"It's like having an old Windows machine that has a very little bit of disk space [remaining]," Deliman said. "When you run up against the end of your disk, if you don't clean it up, your system becomes unstable."
For about a week, scientists worked to figure out why the rover wasn't responding to commands from Earth and feared that a hardware problem could halt the rover in its tracks.
Technicians were eventually able to correct the problem when the rover went into a diagnostic mode, Deliman said. Diagnostic commands were beamed up to the machine, and a series of files and folders was deleted from a flash-memory-based file system board, allowing the rover to resume normal operations.
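The housekeeping step Deliman describes, deleting old files so that storage stays within a fixed limit, can be illustrated with a short sketch. This is not the rover's flight software; the `StoredFile` type, the `purge_oldest` function and the sizes used here are hypothetical names and values chosen only to show the idea of removing the oldest files first until usage fits a budget.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    size_mb: float
    created: int  # hypothetical mission day on which the file was written


def purge_oldest(files, budget_mb):
    """Remove the oldest files until the total size fits within budget_mb.

    Returns a (kept, deleted) pair of lists, both ordered oldest first.
    """
    kept = sorted(files, key=lambda f: f.created)
    deleted = []
    total = sum(f.size_mb for f in kept)
    while kept and total > budget_mb:
        oldest = kept.pop(0)  # the front of the list is the oldest file
        total -= oldest.size_mb
        deleted.append(oldest)
    return kept, deleted


if __name__ == "__main__":
    # Illustrative data only, not taken from the mission.
    files = [
        StoredFile("day1.dat", 10, 1),
        StoredFile("day2.dat", 12, 2),
        StoredFile("day3.dat", 8, 3),
        StoredFile("day4.dat", 6, 4),
    ]
    kept, deleted = purge_oldest(files, budget_mb=20)
    print("kept:", [f.name for f in kept])
    print("deleted:", [f.name for f in deleted])
```

A routine of this kind only helps if it runs before the limit is reached, which is the step the article says fell behind after the landing.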
The Spirit rover is now continuing to take photographs on the Martian surface and conducting experiments for NASA. A second Mars rover, Opportunity, landed on Jan. 24 and has also been operating on the surface of the red planet. A minor glitch with a heater that won't shut off on Opportunity's robotic arm is the only problem experienced by that machine so far.