Subsurface exploration data provider Searcher Seismic wanted to automate the storage and delivery of processed data sets that run from tens to hundreds of gigabytes, reducing human error along the way. The goal was to build a data-as-a-service system.
And it wanted to do so in a modern way. “We saw that machine learning and digitalisation were initiatives that were being quickly adopted by the oil and gas industry, and we wanted to ensure that we operated on a data platform that supported these new technologies”, Searcher Seismic VP of data and analytics Joshua Thorp told Computerworld Australia.
It was also looking to centralise its data storage, which was spread across various IT infrastructure platforms and physical media such as USB drives and high-density tape storage, and to provide a queryable access method.
Headquartered in Perth, Searcher Seismic has offices around the world and provides seismic data sets to the energy, oil, and gas industries. This data is used by the oil and gas industry to determine the location and size of oil and gas reservoirs, helping to prevent unnecessary drilling and thereby reduce environmental impact and lower operational costs.
Previously, the company used high-speed SAN systems, NAS storage and physical media in its data centre and IT facilities. It also used a metadata database to track the data within the various systems. For the delivery process, it used a library of scripting mechanisms and bespoke software that was developed both in-house and by external providers. These required heavy involvement by the data management team, and a single delivery was a multi-step process that often took days.
How Searcher Seismic designed its new data platform
The project had multiple phases, starting with extensive market research and the creation of a detailed list of functional requirements, which helped identify the best providers and also helped those providers understand the problem Searcher Seismic was trying to solve.
Searcher Seismic found there was nothing that fulfilled all its requirements, so it decided that an open source platform was the best option because that provided flexibility in deployment no matter what type of platform — public, private, or hybrid cloud — would ultimately be used. “We also wouldn’t be locked into a given proprietary architecture, and it would give us the flexibility to adopt what was needed for our data and our organisation”, Thorp said.
The process of selecting the best solution to deal with the company’s challenges took into consideration two key factors.
First, “having programmatic access to the data, with a note that the Python language was especially important for data science”, Thorp said. “We could unlock the true value of our data by having data scientists and software developers manipulate our data, instead of only geoscientists who could operate the technical geophysical software packages”.
Second, “having a data platform that utilised open source software, particularly the open source distributed computing software such as Hadoop and Spark. This allowed for many developers across all industries to collaboratively build modular software components to solve common data issues, and thus allowing for a large talent pool of developers”, Thorp said.
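To illustrate the kind of programmatic access Thorp describes, here is a minimal sketch of how a data scientist might query processed survey metadata with Python and Spark. It is not Searcher Seismic's actual code: the storage path and column names are hypothetical, standing in for whatever schema the company's platform uses.

```python
# Illustrative only: querying processed seismic survey metadata with PySpark.
# The path and column names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("seismic-catalogue").getOrCreate()

# Assume processed survey metadata has been landed as Parquet on the platform.
surveys = spark.read.parquet("/data/processed/survey_metadata")

# Data scientists and developers can filter and aggregate programmatically,
# without needing specialist geophysical software packages.
recent = (
    surveys
    .filter(F.col("acquisition_year") >= 2015)
    .groupBy("basin")
    .agg(F.sum("size_gb").alias("total_gb"), F.count("*").alias("surveys"))
    .orderBy(F.desc("total_gb"))
)

recent.show()
```

Because the data sits in open formats behind a distributed engine like Spark, the same query pattern works whether the platform runs in a public, private, or hybrid cloud, which is the flexibility Thorp points to.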
The approach enabled Searcher Seismic to keep existing systems that were successful and spared the business from having to develop everything from scratch. The company ultimately chose Hortonworks (which merged with Cloudera in 2019) as its data-processing platform and Pure Storage FlashBlade for the scalable, high-speed centralised storage component.
With the requirements identified and the technologies determined, Searcher Seismic ran a five-month proof of concept that resulted in a public cloud-based data platform holding the processed data in a queryable, high-performance database.
This was followed by a nine-month full development phase to build the platform on the company's existing data centre infrastructure, which allowed for a hybrid cloud model.
This was also when significant changes were made to the code base to optimise storage and access speeds and to add support for raw field data storage. At the same time, Searcher Seismic built a front-end web application so that users could access data through a simple geospatial map view and send data to any location.
The company is now refining Saismic, a commercial product it has built on its data platform, so customers can access its data in various ways.
Searcher Seismic’s effort pays off in lower costs, greater flexibility, and new business benefits
The result of Searcher Seismic’s effort was a hybrid cloud solution whose total cost of ownership proved to be between one tenth and one half that of a pure public cloud deployment, while allowing for higher overall performance, according to Thorp.
It also allowed for greater operational flexibility. “Our staff can effectively collaborate and remotely work, all within the same environment, without needing to be in the office. We can also easily scale the team up or down as needed throughout this time without having to make significant changes in our infrastructure or IT strategy, which helps take the load off the IT department and organisation in general,” Thorp added.
Both productivity and quality of work benefited too. The data delivery has been simplified, allowing for data to be delivered to any location across the world with “a few clicks”, Thorp said. A web-enabled component allows remote teams to easily see the entire database and visualise the data dynamically.
And a new business opportunity resulted. “Our greatest impact is that we have created a new product line using this technology,” Thorp explained. “Unlike traditional data licensing methods, we can offer a subscription-based service to ensure that clients have access to the newest data as we upload it.” Thorp is referring to Saismic, a cloud-based service that provides global seismic data on-demand with native support for deep learning and advanced analytics.