IBM develops new clustered analytics processing platform
The GPFS-SNC distributed computing architecture supports POSIX
Computerworld - IBM today announced that it has created a new distributed computing architecture based on General Parallel File System technology that it says is twice as fast as existing clustered file systems and that provides management and advanced data-replication techniques.
Calling it the General Parallel File System-Shared Nothing Cluster (GPFS-SNC), IBM said the new architecture is designed to provide higher availability through advanced clustering technologies.
Prasenjit Sarkar, a master inventor in storage analytics and resiliency for IBM's research branch, said the system scales linearly, so that a file system with 40 nodes would have 12GB/sec. throughput, and a system with 400 nodes could achieve 120GB/sec. throughput.
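The linear-scaling claim can be checked with simple arithmetic. The following is a minimal sketch, not IBM code: the function names are hypothetical, and it assumes decimal units (1GB = 1,000MB) and perfectly linear scaling, which is what Sarkar describes.

```python
# Sketch of the linear-scaling arithmetic from the figures Sarkar quotes.
# Assumptions: decimal units (1 GB = 1,000 MB) and ideal linear scaling.
# Function names are illustrative only.

MB_PER_GB = 1_000


def per_node_throughput_mb(total_gb_per_sec: int, nodes: int) -> int:
    """Throughput contributed by each node, in MB/sec."""
    return (total_gb_per_sec * MB_PER_GB) // nodes


def projected_throughput_mb(nodes: int, per_node_mb: int) -> int:
    """Aggregate throughput if every node adds the same bandwidth."""
    return nodes * per_node_mb


# Derive the per-node figure from the 40-node data point (12 GB/sec).
per_node = per_node_throughput_mb(12, 40)

if __name__ == "__main__":
    for n in (40, 400):
        total_gb = projected_throughput_mb(n, per_node) / MB_PER_GB
        print(f"{n} nodes: {total_gb:g} GB/sec")
```

Under these assumptions each node contributes 300MB/sec., and the 400-node projection matches the 120GB/sec. figure quoted above.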
"It's very cost-effective bandwidth. You get 1MB/sec. per dollar," Sarkar said. "If you try to replicate that with a [storage area network], it gets very costly."
The new architecture is aimed at enabling applications that support high-performance analytics, data warehousing applications and cloud computing, he said.
Sarkar said that in the GPFS "shared nothing" cluster design, each node, a standard x86 server, has its own metadata, cache, data storage and management tools, while also having access to every other node in the cluster at the same time through Gigabit Ethernet ports.
"What we have done, in contrast to the Google file system, which has a single domain node, is we've distributed every aspect of the file system -- the metadata, the allocation, the lock management, the token management," he said. "Even if you take out a rack of servers [from the cluster], we'll still be able to continue to work."
By "sharing nothing," Sarkar said, new levels of availability, performance and scaling can be achieved with the clustered file system. Each node in the GPFS-SNC architecture is also self-sufficient. Tasks are divided up among these independent computers, and none has to wait on another, Sarkar said.
The GPFS-SNC code also supports POSIX, which enables a wide range of traditional applications to run on top of the file system, allowing both reads and writes to be performed.
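To illustrate what POSIX-style read and write access means for an application, here is a minimal sketch that creates a file, reads it, appends to it and overwrites a section in place. It is not GPFS-SNC code: it uses only Python's standard file API against a local temporary file, and the function name `posix_style_demo` is hypothetical.

```python
# Sketch of ordinary POSIX-style file operations: read, append, and
# in-place overwrite. Runs against a local temporary file, not GPFS-SNC;
# the function name is illustrative only.
import os
import tempfile


def posix_style_demo(path: str) -> tuple[bytes, bytes]:
    """Return the file contents before and after append + overwrite."""
    # Create the file with some initial content.
    with open(path, "wb") as f:
        f.write(b"hello world\n")

    # Open and read the file.
    with open(path, "rb") as f:
        before = f.read()

    # Append to the end of the file.
    with open(path, "ab") as f:
        f.write(b"appended\n")

    # Overwrite a section in the middle without rewriting the whole file.
    with open(path, "r+b") as f:
        f.seek(6)          # byte offset of "world"
        f.write(b"WORLD")  # same length, so nothing else shifts

    with open(path, "rb") as f:
        after = f.read()
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = posix_style_demo(os.path.join(tmp, "demo.txt"))
        print(before)
        print(after)
```

An application written this way needs no changes to run on any file system that offers the same semantics, which is the compatibility point being made about traditional applications.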
"You can open a file, you can read a file, then you can append to the file and overwrite any section. With Google's Hadoop distributed file system, you cannot append to a file, you can't overwrite any sections, so you're very limited in what you can do," Sarkar said.
GPFS-SNC also supports the whole range of enterprise data storage features, such as snapshots, backup, archiving, information life-cycle management, data caching, WAN data replication, and management policies. The architecture has a single global domain namespace, allowing virtual machines to be moved between hypervisor nodes.
"So for example in our cluster, you can run Hadoop as well as a clustered DB2 or Oracle databases," Sarkar said. "This allows us to have a general-purpose file system that [can be used by] a wide range of users."