Clinical AI in the new world of ChatGPT

For many years, we have developed niche AI to help our hospital clients. Two years ago, we released our first platform (CAC) powered by a metered (fee-for-use) external AI service. Specifically, we use Microsoft’s Cognitive Services platform and its Natural Language Processing (NLP) API to translate freeform clinical text into medical concept codes. This AI cannot be used for patient care, but it is very useful for other purposes including assistance in the medical coding process.

At the time, this went against conventional approaches, as many industry peers were building proprietary NLP solutions to try to solve these problems. With the mainstream adoption of ChatGPT and other Large Language Models (LLMs), it is worth reviewing the current landscape and examining where things are headed.

The current state of clinical NLP

While commercial clinical NLP is available from vendors such as Microsoft, Google, IBM, and Amazon, the recent media exposure of LLMs has captured everyone’s imagination. ChatGPT has made it clear that AI will transform many professions, including medical transcription and clinical coding.

As a quick example, our home page shows some sample medical text that is interpreted by Microsoft’s AI. I invite you to type the following into ChatGPT (specifically GPT-4) to see the results (which may differ from those described below).

WARNING: Do not enter any sensitive data into any Language Model, including GPT-4. It is not intended for commercial use, does not comply with privacy standards, and is not accurate enough for general medical use.

Extract all ICD-10-CA and CCI codes from “The patient had a PICC line inserted and was mechanically ventilated to resolve O2 levels. UTI was confirmed by cultures and the patient developed AFib with blood pressure exceeding 115.”

The diagnosis codes seem reasonably accurate. The intervention codes are not. GPT-4 was not specifically trained for the purpose of interpreting clinical data; however, it did correctly identify three diagnosis and two intervention concepts from the text and it was familiar with Canadian-specific medical codes. This is a remarkable feat for a general-purpose model. Imagine the possibilities when dedicated generative models are refined for this purpose.

Practical barriers to medical coding with AI

Actual clinical text cannot be sent to a public LLM service like GPT-4, but the example above does demonstrate where things are headed. Let’s look at some of the current limitations for medical coding, with GPT-4 specifically and with clinical NLP in general.

Data Privacy and Security: The handling of sensitive patient data is of the utmost importance, so hospitals must ensure that any AI systems they use comply with data privacy regulations such as HIPAA and GDPR. Commercial vendors like Microsoft offer this level of security, whereas public language models like GPT-4 do not.
Accuracy and Reliability: Although clinical AI systems have shown high levels of accuracy in translating clinical text to standardized medical codes, they still have a long way to go. In the near term, AI will likely remain a coding assistance tool rather than a complete replacement for professional coders, although it could accurately code low-complexity cases such as typical day surgeries. Our CAC platform is built on this coding assistance principle and focuses on enhancing the overall productivity of the coding workflow.
Limited understanding of context: GPT-4 generates text based on the context provided to it; however, it may not always understand the context of medical codes and their relation to a patient’s medical history, diagnosis, and treatment. That being said, AI’s ability to consider context is increasing at an astonishing rate, and it may one day surpass humans in its ability to take into account all factors, not only for medical coding but for accurately diagnosing patients.
It can only code what it can see: Within the healthcare system, the availability of comprehensive medical data has been a persistent limitation for accurate medical coding. To ensure quality medical records, we must address several upstream issues, such as comprehensive integration of all available medical data and the thoroughness of clinical documentation recorded by clinicians.
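
To make the coding assistance principle under Accuracy and Reliability more concrete, here is a minimal sketch in Python. It assumes the AI service returns a confidence score with each suggested code, which is not something this article confirms for any particular vendor. The Suggestion class, the triage function, the 0.9 threshold, and the placeholder code strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    code: str          # placeholder string, not a real ICD-10-CA or CCI code
    confidence: float  # assumed provider-reported score between 0 and 1


def triage(suggestions, threshold=0.9):
    """Split suggestions into pre-filled codes and codes flagged for closer review.

    Both groups still go to a professional coder; the split only changes
    how prominently each suggestion is presented.
    """
    prefilled = [s for s in suggestions if s.confidence >= threshold]
    flagged = [s for s in suggestions if s.confidence < threshold]
    return prefilled, flagged


suggestions = [
    Suggestion("DX-EXAMPLE-1", 0.97),
    Suggestion("DX-EXAMPLE-2", 0.62),
]
prefilled, flagged = triage(suggestions)
```

In this sketch nothing is coded without a human in the loop. A site that trusted the AI on low-complexity cases could raise or lower the threshold for those cases only.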

Where things are going (and when)

No one knows. Everything is too new.

If we assume highly accurate AI, it is reasonable to predict that electronic medical records (EMRs) will eventually have sophisticated transcription and automated medical coding functionality built into them. In this scenario, coding software, as we know it, would be redundant.

The use of AI also calls into question the need for standardized codes and case mix groups (e.g., CMG in Canada) as ways of categorizing patients for convenient analysis. That need may no longer exist when clinicians, researchers, and administrators have the ability to create and analyze specialized patient cohorts within minutes. The nature of digitally representing patient populations would be redefined, as would hospital and system-wide planning, budgeting, and funding distribution methodologies.

However, please keep in mind that there are many practical limitations to this becoming a reality, beyond the current accuracy limitations of AI. I remind you that many hospitals still use paper charts and that we do not yet have a consolidated electronic patient record in Ontario. Replacing paper charts and consolidating records are far less challenging technical goals, yet neither has been achieved; change in healthcare is very slow.

What should our hospital do to prepare?

Given the current state of digital health policy in Canada, we offer these recommendations for general preparation for the use of AI:

Form an AI team: A dedicated team can monitor the use of AI in healthcare and determine whether there are new tools that can immediately improve the effectiveness of care in your organization. It will also help you better assess what is coming down the pipeline.
Improve your IT infrastructure: The use of cloud-based providers will become the norm; however, it takes time for a hospital to become comfortable with securely using cloud services that involve transmitting sensitive data to a third-party provider. It requires knowledge of advanced topics such as cybersecurity and API usage. These skills take time to develop.
Focus on data integration and data quality: The value of AI will come from the quantity and quality of data available to it. Comprehensive data integration and data quality initiatives are expensive and laborious, and there are few shortcuts.
Focus effort, find good partners, and don’t get caught up in the hype: There are many novel ways to use this technology and not all of them bring concrete benefits. Have a process to define the actual value derived from any of the many options, and bring in experts to help in the domain, as it may be impractical to internally develop certain advanced skills.

How vendors (like us) can adapt

Flexibility will be key. Similar to when the Internet became mainstream over 25 years ago, things will evolve rapidly, with many failed or quickly obsolete service models.

Basic questions remain unanswered, such as: Will the industry be dominated by large, general-purpose AI models from a handful of vendors, or will local, personalized AI models become pervasive? For complex, widely applicable uses such as clinical text interpretation, we continue to believe that the most effective AI offerings will be smaller, domain-specific models offered as metered services by vendors like Microsoft, IBM, and Google.

We also currently believe that the immediate value of AI will come from the safe and effective use of commercial APIs. Moving forward, we predict there will be less focus on advanced data science initiatives and more on software engineering efforts that make the most of third-party AI technology. Hospital-developed software solutions will likely diminish in favor of knowledge workers using pre-trained models with refined user interfaces.

Therefore, our approach is straightforward: create the data structures and software tools that best complement the strengths and weaknesses of commercial AI. This means creating intuitive user interfaces, addressing security details, effectively integrating data, and creating flexible back-ends that can adapt to changes in technology and providers.
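
As an illustration of a back-end that can adapt to changes in providers, here is a minimal Python sketch. It is not a description of our CAC platform. Every name in it (CodeSuggestion, CodingProvider, KeywordProvider, CodingService) is hypothetical, and the keyword lookup merely stands in for a call to a commercial NLP API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CodeSuggestion:
    code: str    # placeholder string, not a real medical code
    system: str  # code system label, e.g. "ICD-10-CA" or "CCI"


class CodingProvider(Protocol):
    """Anything that can turn free text into code suggestions."""

    def suggest(self, text: str) -> list[CodeSuggestion]: ...


class KeywordProvider:
    """Stand-in provider; a real adapter would wrap a vendor API here."""

    def __init__(self, lookup: dict[str, CodeSuggestion]) -> None:
        self._lookup = lookup

    def suggest(self, text: str) -> list[CodeSuggestion]:
        lowered = text.lower()
        return [s for term, s in self._lookup.items() if term in lowered]


class CodingService:
    """Application code depends on this class, never on a specific vendor."""

    def __init__(self, provider: CodingProvider) -> None:
        self._provider = provider

    def swap_provider(self, provider: CodingProvider) -> None:
        # Changing vendors touches one adapter, not the rest of the system.
        self._provider = provider

    def suggest(self, text: str) -> list[CodeSuggestion]:
        return self._provider.suggest(text)


service = CodingService(
    KeywordProvider({"uti": CodeSuggestion("DX-EXAMPLE-1", "ICD-10-CA")})
)
first = service.suggest("UTI was confirmed by cultures")

# Switch to a different (here, empty) provider without changing callers.
service.swap_provider(KeywordProvider({}))
second = service.suggest("UTI was confirmed by cultures")
```

The design choice is that user interfaces and data integration code talk only to CodingService, so a change of AI vendor or model is confined to a single adapter class.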

Summary

Interactive AI that provides accurate, detailed answers to extremely complex questions will soon be available to everyone. Natural language AI products will no longer be exclusively born from advanced data science teams. The commoditization of AI will enable entrepreneurs and software engineers to solve practical problems and build new business models. This is a common pattern seen many times over the past forty years in personal computing.

It will be messy, and it will take time for healthcare providers and their partners to adapt. The best short-term approach will be to find where the current value is and be adaptable as everything evolves, while managing the risks and distractions that will inevitably appear.

If you have any questions about using clinical Natural Language Processing (NLP) and other Artificial Intelligence (AI) in your hospital, please contact us at any time.