Mark Ramsey, SVP, R&D Data, at GlaxoSmithKline, spoke at a recent IBM/Kinetica executive breakfast, where he discussed how GSK uses GPUs and Kinetica to help transform the way that data is used as a strategic asset within their R&D organization.
GlaxoSmithKline (GSK) is a science-led global healthcare company that researches and develops a broad range of innovative products in three primary areas: pharmaceuticals, vaccines, and consumer healthcare. Their product portfolio includes global medicines for HIV/AIDS, tuberculosis, and malaria, as well as a range of consumer brands, including Sensodyne, Aquafresh, Horlicks, Panadol, and Tums.
Having seen the promise of GPU-accelerated compute, GSK had already made a sizable investment in a GPU-accelerated cluster for a specific use case: accelerated chemical simulations. They were using specialized, customized software to run those simulations on 125,000 GPU cores. Because writing parallel code for a GPU environment is difficult, the cluster was not being used for any other use cases. Dedicating such a platform to a single use case was not an effective use of it, so GSK sought a solution that would let them run many use cases on their GPU cores.
Kinetica enables GSK’s data science and analytics teams to tap into the power of their existing GPUs without writing custom code or using specialized software for each use case, which opens up a wide range of advanced and innovative use cases. Because Kinetica lets GSK communicate with the existing GPU cluster as if it were a typical relational database, users can interact with it through a traditional query language such as SQL. GSK’s data science and analytics teams can now run computationally intensive workloads in Kinetica while keeping the feel of a traditional relational environment. Below are summaries of a few specific use cases.
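As a rough illustration of that relational feel, the sketch below runs an ordinary SQL aggregation from Python. It is an assumption-laden stand-in, not Kinetica's API: Python's built-in sqlite3 module takes the place of the database so the example runs anywhere, and the assay_results table, its columns, and its values are hypothetical. The point is only that the analyst writes plain SQL with no mention of GPUs or parallelism.

```python
import sqlite3

# Stand-in database (assumption): sqlite3 is used only so the example is runnable.
# It is not Kinetica and does not use GPUs.
conn = sqlite3.connect(":memory:")

# Hypothetical table and values, invented for illustration.
conn.execute(
    "CREATE TABLE assay_results (compound_id TEXT, target TEXT, potency REAL)"
)
conn.executemany(
    "INSERT INTO assay_results VALUES (?, ?, ?)",
    [
        ("C-001", "kinase_a", 7.2),
        ("C-002", "kinase_a", 6.8),
        ("C-003", "kinase_b", 5.1),
        ("C-004", "kinase_b", 8.4),
    ],
)

# Plain SQL: how the work is parallelized is left entirely to the engine.
query = """
    SELECT target, COUNT(*) AS n, AVG(potency) AS mean_potency
    FROM assay_results
    GROUP BY target
    ORDER BY target
"""
rows = conn.execute(query).fetchall()

for target, n, mean_potency in rows:
    print(target, n, round(mean_potency, 2))
```

In a GPU-backed relational system, the same kind of query text would be submitted through that system's own connector, with the engine deciding how to distribute the work.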
UDFs Enable Advanced Analytic Workloads Across the Entire Complex
Machine learning and other algorithmic workloads are among the areas Ramsey and the GSK team are most excited about. One capability they specifically wanted from Kinetica was user-defined functions (UDFs). UDFs let users add a small payload to a query and have the Kinetica optimization engine spread it across the whole GPU complex.
Additionally, GSK can quickly deploy new models and eliminate the time and effort required to transform data and move it back and forth between a database and a separate data science system. With Kinetica’s user-defined functions, GSK can package up analytics as an extension to the optimization engine that’s running, and let Kinetica do the job for them.
When they’re running advanced machine learning or chemical simulations, GSK can now “turbocharge” those jobs by running them on a GPU complex. Kinetica allows GSK to use the GPU platform for more than one use case, because each use case becomes more like a query than a highly customized piece of software. Kinetica gives GSK a layer of abstraction between the GPU hardware and the workloads they submit to the platform.
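The general idea of handing an engine a small payload and letting it fan the work out can be sketched in a few lines of Python. This is a conceptual sketch only, under stated assumptions: a thread pool stands in for the GPU complex, and run_udf, score_partition, and the toy weights are hypothetical names invented here, not part of Kinetica's UDF interface.

```python
from concurrent.futures import ThreadPoolExecutor


def run_udf(udf, partitions, workers=4):
    """Toy 'engine': apply udf to every partition in parallel, keeping order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(udf, partitions))


def score_partition(rows):
    """Hypothetical user payload: score each (x, y) row with a toy linear model."""
    weights = (0.5, 2.0)
    return [weights[0] * x + weights[1] * y for x, y in rows]


# Data already split into partitions, as a distributed engine would hold it.
partitions = [
    [(1.0, 2.0), (3.0, 4.0)],
    [(5.0, 6.0)],
    [(7.0, 8.0), (9.0, 10.0)],
]

scored = run_udf(score_partition, partitions)
flat_scores = [score for part in scored for score in part]
print(flat_scores)
```

The user writes only score_partition; how the partitions are scheduled is the engine's concern, which is the abstraction the passage above describes.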
GSK has multi-billion-row data sets and wanted to run aggregations and queries on them. With 125,000 GPU cores, the organization didn’t want to tackle the labor-intensive task of figuring out how to write parallel queries that span those cores. “I don’t want to figure out how to write queries to go across 125k cores. Nima and the Kinetica team have already solved that problem,” says Ramsey.
“I don’t want to figure out how to write queries to go across 125k processors. Nima [CTO] and the team at Kinetica have already solved that problem.” Mark Ramsey, SVP, R&D Data at GSK
Kinetica’s parallel architecture makes it possible for GSK’s data science and analytics teams to load their relational data into the Kinetica environment, submit a query, and get results back quickly. Having moved from a more traditional database environment to an accelerated one, GSK is seeing results 75 to 100 times faster with Kinetica.
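To give a feel for the parallel query work that GSK did not want to write by hand, here is a minimal sketch of one common pattern: compute a partial aggregate per shard, then merge the partials. It assumes a thread pool in place of GPU cores and uses hypothetical helper names (partial_aggregate, parallel_mean); it does not describe how Kinetica plans or executes queries internally.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(shard):
    """Return (sum, count) for one shard; shards are processed independently."""
    return sum(shard), len(shard)


def parallel_mean(shards):
    """Merge per-shard (sum, count) pairs into a single global mean."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_aggregate, shards))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Tiny hypothetical shards; a real data set would hold billions of rows.
shards = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean_value = parallel_mean(shards)
print(mean_value)
```

An average cannot be merged directly from per-shard averages, which is why each shard returns a sum and a count. Handling details like this for every query is the kind of work a parallel engine takes off the analyst's plate.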
Looking Ahead to Streaming
Given the emergence of digital devices and upcoming initiatives, GSK plans to make use of streaming technology in the future. For now, GSK is focused on machine learning and fast data, where they can turbocharge slower-running queries and apply analytics and machine learning very efficiently.
GPU Databases Bring Next-Generation Solutions to Pharma and Healthcare
GPU-accelerated databases are reshaping the pharmaceutical and healthcare industries. Bringing new drugs to market is an expensive, lengthy, and risky process, requiring billions of dollars in upfront investments. With Kinetica’s GPU-accelerated database, healthcare and pharma companies can derive faster insights from vast volumes of streaming data. Kinetica can help researchers use algorithmic techniques such as machine learning, deep learning, and natural language processing (NLP) with easy-to-use visual analytics tools to interactively run thousands of simulations across complex data sets.
Kinetica returns results for advanced analytical queries on billions of rows of data in well under a second, making it an ideal solution for pharmaceutical research. By delivering a fast and powerful data and analytics solution, Kinetica accelerates the drug discovery, development, and clinical trials process while minimizing costs, reducing risks, and ultimately, saving lives.