Money for nothing: Making sense of data collaborations in healthcare
Several leading health systems got together recently to announce the formation of Truveta, an independent company that will pool patient medical records from the participating health systems and analyze them for insights to drive healthcare outcomes. The announcement highlighted the benefits of sharing de-identified data for driving research, new therapies, and improved health outcomes.
In an initiative launched last year, UC San Francisco (UCSF) created a data platform called the UCSF Health Atlas that leverages over 150 social determinants of health variables. The data, which comes from public sources, covers California residents and attempts to correlate health outcomes with highly granular data on where individuals live. It is layered on EHR data and is intended to give researchers and clinicians invaluable insights into the nexus of social factors and health, especially for vulnerable and low-income populations.
The two initiatives described above sit at different ends of the data collaboration spectrum. However, their approach is the same: harnessing data from a wide range of sources and applying advanced analytics for insights to drive healthcare outcomes. At least some such initiatives also have an underlying goal of monetizing the data and the insights.
The data challenge for healthcare data consortiums
Data-sharing initiatives by health systems are not new. In 2019, Mercy Technology Services, the IT arm of Mercy Health, partnered with device maker Medtronic and tech firm SAP to launch a data orchestration and insights network aimed at better understanding costs and outcomes for Mercy Health’s patient population. Mercy claimed savings of $33 million in device and medical supplies costs in the previous three years through the use of real-world evidence (RWE). In France, pharma company Sanofi launched a collaborative initiative early this year to pool technology expertise and data to develop new digital solutions.
Over the past couple of years, health systems such as Mayo Clinic and Ascension Health have been in the news for entering into data-sharing partnerships with cloud providers, notably Google. The objectives are familiar: improving consumer engagement, applying AI/ML tools for advanced insights to drive health outcomes, and improving technology operations through cloud migration.
Relative to other sectors of the economy, healthcare has struggled with a balkanized landscape of data and remains mired in a thicket of regulatory constraints, interoperability challenges, and mistrust among participants. Katherine Lusk, Board Chair for AHIMA, an industry organization for health informatics professionals, states that we are still in the very early stages of harnessing data comprehensively. She believes that one of the biggest challenges for healthcare is the normalization of datasets using industry standards such as SNOMED to translate diverse, nuanced clinical data into clinical language and classification systems. While electronic health record (EHR) vendors have tried to normalize data within their platforms, their customers often set up their data dictionaries in enterprise-specific ways that make it challenging to exchange data seamlessly across health systems. Lusk points to industry initiatives such as Carequality, which are pushing for data normalization on a much broader scale, as an encouraging development.
The monetization challenge for data consortiums
All data sharing consortiums will need to answer a fundamental question: How will the data and insights be monetized? I discuss three monetization options here for data-sharing collaborations.
- Licensing of the data analytics platform: Health systems have struggled to stand up internal data lakes that can ingest, normalize and standardize large amounts of data from diverse sources. Even where we have seen success, such as with CRM or population health management programs, the result is limited datasets that sit in silos and connect via APIs to core transaction systems such as the EHR. Mona Baset, VP of Digital Services at SCL Health, points out that in any patient engagement initiative, the data piece often takes the longest to get right. Several newly launched initiatives have targeted this problem. Notable among these is the recently announced Amazon HealthLake, which has outlined a grand vision of enabling healthcare providers, health plans, and pharma companies to aggregate, organize and analyze health data at petabyte scale. Others include Innovaccer, a startup that has raised venture capital from Microsoft and recently attained “unicorn” status. Amazon’s vision may be a comprehensive approach to aggregating data across healthcare segments; however, in practice, this kind of approach has also been the most challenging to achieve in the past.
- Licensing of analytical insights: The Truveta consortium aims to generate insights that can drive improved healthcare outcomes through advanced analytics and AI. Even as Google doubles down on its relationship with Mayo and other health systems, news reports that IBM is throwing in the towel and exploring a sale of its troubled Watson Health business have raised questions about the broader struggles for AI in healthcare. Even if these data collaborations succeed in applying AI to drive new insights, who gets to benefit? While the Mayo and Ascension relationships with Google imply that the datasets and the insights are for the exclusive use of the sponsoring entity, organizations such as Truveta may or may not have such limitations. Smaller health systems that cannot invest in advanced data and analytics programs could see an opportunity to leverage insights and benchmarks developed by Truveta or by Amazon’s data lake (assuming that AI/ML algorithms trained on one patient population dataset can be extended to other population groups).
- Licensing of de-identified patient data: Equifax, Experian, and others have long aggregated and monetized data in consumer finance and other sectors. In the life sciences space, IQVIA and others have aggregated data on drug prescriptions and sold it to pharma companies for sales planning and clinical research. However, when it comes to patient medical information, selling even de-identified patient data has always been a sensitive matter, over and above regulatory compliance requirements such as those under HIPAA. The absence of a national patient identifier, along with the data normalization and interoperability issues that hamper the creation of master data repositories, makes it challenging to harness standardized and normalized data for all aspects of healthcare operations. Given that, there is likely a significant demand for a service such as Truveta that can deliver “clean” patient datasets to consortium members.
There are multiple avenues to make money from aggregated and de-identified data. Given that healthcare data belongs to patients, this raises questions about who stands to gain monetarily from these initiatives and how those gains will be passed on to consumers. The Mayo and Ascension agreements came under scrutiny last year over questions about who was gaining access to the data and how the data was being used beyond population health insights. That might explain why the Truveta announcement takes pains to assure us about “careful protection of patient privacy and security” and why the word “ethics” appears several times in the statement (including a quote from a VP of Ethics at one of the health systems for added emphasis).
The monetization of healthcare data has been a long-pursued goal for tech firms and healthcare enterprises. It may be too soon to tell if that elusive goal is within reach.