Transforming Medical Research with a Big Data Services Platform

The era of big data has opened up new opportunities for medical innovation. A new wave of research projects aims to accelerate medical discovery by bringing vast amounts of personal and operational healthcare information together on readily available compute and storage, where it can be analyzed with leading-edge analytics tools and techniques.

The opportunities are tremendous but the challenges are many. Rapid progress requires architectures and processes that overcome the biggest challenges of data management in the healthcare sector: rigorous data privacy regulations; diverse information standards; a proliferation of application and data silos and complex integration points; streams of real-time data from monitors, scanners, imaging devices, wearables, and mobile devices; and vast databases that hold data of every conceivable type.

Partners HealthCare is one of the world’s leading medical research organizations. Encompassing both Massachusetts General and Brigham and Women’s – the major teaching hospitals for Harvard Medical School – Partners HealthCare supports thousands of research projects each year. The Enterprise Research and Infrastructure Services (ERIS) group provides enabling technical capabilities for their research and innovation communities, ensuring that teams have access to the technology infrastructure, data, tools, and support resources they need to meet their project and operational goals.

The research and innovation teams at Partners are pioneers in the use of big data technologies. However, it was clear that the aforementioned challenges, along with infrastructure limitations and a lack of supporting services, were impeding adoption and progress. Recognizing that their customers required a new approach to service delivery, ERIS worked with Dell EMC Services to architect, build, and operationalize a platform for developing and executing big data medical and translational research projects faster, more efficiently, and at lower cost.

The result of this collaboration is the Integrated Data Environment for Analytics platform (IDEA). The IDEA platform provides the Partners HealthCare community of researchers and innovators with four key service capabilities that are fundamental to the enablement of big data solutions – storage, compute, analytics, and platform.

  • A pay-as-you-go storage solution that offers secure and unlimited capacity for a wide variety of data sets at a range of price and performance points. Designed around Isilon, CloudPools, and Elastic Cloud Storage (ECS), it provides a single place to store, secure, and analyze unstructured data sets critical for research initiatives in a privacy-aware environment. Using the multi-protocol capabilities of Isilon OneFS, a single data copy can be accessed from any instrument or system, while maintaining an authentication and authorization model that is integrated with Partners’ existing security processes. CloudPools allows data to be encrypted and warm-archived to ECS or public cloud providers, thereby providing unlimited secure storage while adhering to Partners’ stringent security and regulatory requirements. This implementation strategy for a secure Data Lake is fundamental to enabling big data analytics in Healthcare and Life Science.
  • On-demand provisioning of Linux-based virtual machines pre-integrated with Partners’ security platform (authentication, authorization, and auditing), storage, and common analytics and development tools. Supporting services include patching, maintenance, backup, and high availability, relieving the research teams of common administrative burdens.
  • Integration with leading analytics and research applications that allows all data to be accessed and analyzed in place using a common data repository. Built upon the Dell EMC Analytic Insights Module (AIM), the platform provides foundational data management and processing capabilities based on the Hadoop ecosystem. Access to Spark, Hive, HBase, Sqoop, and HAWQ is available from purpose-built IDEA Virtual Desktop workstations. These high-powered VDI workstations include installations of popular open-source data science tools, including R, Python, RStudio, Jupyter notebooks, and Spyder. Multiple relational and NoSQL datastore options are available, including MySQL, PostgreSQL, Greenplum, and MongoDB. IDEA is securely and seamlessly integrated with the ERIS High Performance Computing (HPC) environment, allowing for the development of fully integrated data pipelines between the two systems.
  • An application development platform that allows research and development teams to rapidly translate their research analytics and processes into robust data applications that can be deployed as cloud resources for clinical and business use outside of the IDEA environment.

The IDEA platform is used across the research and clinical innovation enterprise. Its scalability and flexibility allow it to be used by both large, well-funded institutions and small innovation teams with limited budgets. Customers of the IDEA platform include:

  • The Center for Integrated Diagnostics, which integrates genomic profiling with advanced analytics across vast data sets to provide patients with a new approach for the personalized treatment of serious diseases. Using IDEA, the center has collaborated with Dell EMC and InterSystems on the development of a prototype next-generation precision medicine system (MRE), and has introduced several innovations into the clinical workflow.
  • The Martinos Center for Biomedical Imaging, one of the world’s premier research centers devoted to the development and application of advanced biomedical imaging technologies. The center is using IDEA data services to securely and efficiently share a 100+ TB neuroimaging data set, the Human Connectome Project, across its many research teams and analytics platforms.
  • The Center for Connected Health, a leader in the innovation and development of IoT-based solutions that empower patients to transform their health care experience. The center is using the compute and platform capabilities of IDEA in the development of next-generation mobile healthcare solutions.

These teams, and many others, are only just beginning to explore and understand the power of the IDEA platform, and its potential for supporting medical innovation. Their excitement is palpable. With the support of the ERIS team and Dell EMC, the research teams at Partners HealthCare are shaping the future of healthcare in the big data age.

About the Author: Rob Small

Rob has over twenty years’ experience working with customers on maximizing the value they receive from their data assets and information management investments. He joined the Dell Technologies Services business in 2008 and has spent most of that time working with customers across the healthcare sector, helping define, plan, and execute strategic, technology-driven business initiatives focused on big data and analytics. Prior to joining Dell Technologies, Rob held a number of leadership positions delivering data management software, services, and solutions for the pharmaceutical and biotech industries. A Londoner, long-time New Jersey resident, and avid trail runner, Rob spends too much of his too little free time running the state parks of Northern NJ. He is a proud board member of the Salt Shakers running club, a group that has raised over $50k in the last four years for breast cancer charities: saltshakersrun.com.