How to introduce Machine Learning into your hospital

Machine Learning (ML) and Artificial Intelligence (AI) are a recurring topic in healthcare articles, which tend to cite advanced applications such as identifying malignant tumors from radiology images or automatically diagnosing patients. The spectrum of uses for machine learning is much wider than that, and many of them apply to less captivating hospital operational purposes.

In this article, I hope to illustrate that there are many hospital Decision Support (DS) problems that can be best solved using ML and that you do not require an advanced degree in mathematics to take advantage of the technology.

We use Machine Learning extensively in our data quality software and similar techniques can be used by internal hospital DS teams. Advanced clinical problems seem to have captured the interest of many hospital executives. We’ve actually seen hospitals with well-funded advanced Machine Learning teams exclusively solving complex clinical/research issues while operational Decision Support is given far less attention. This is a gap that can be filled without significant investment.

Decision support in hospital environments

Anecdotally, we’ve seen that Decision Support (DS) teams mostly use a mixture of parameterized reports, ad-hoc database queries, Excel, and interactive data visualization tools (Qlik, Tableau, PowerBI) to support analytical uses. Many of these approaches are resource-intensive, inflexible, or leave it to end users to decipher and extract insight from the data.

With so much attention paid to interactive “self-serve” data visualization software, there is often a significant gap between the tools provided to users and what users actually need. Sometimes unit managers just need direct answers to targeted questions such as “Which of my patients are at risk for readmission?” and “What is my anticipated nursing workload for next week?” These valuable questions are not easily answered by flashy dashboards that just present raw data in more digestible formats. They often require something else.

When teams try to create analytics to answer questions like these, the first response is often to write custom code or scripts that apply business logic formulated by subject matter experts (SMEs). In effect, programmers try to recreate the SME’s thought process in code. This is very difficult to do well and becomes harder to manage as the logic grows more complex. This is where Machine Learning can help.

What is Machine Learning?

To understand the basic concept, let’s look at a simple example. Let’s say that you have visit counts for the past 20 days and you’d like to predict what the next week will look like. Plotted against time, the visit counts look linear, meaning they follow a straight line into the future. To predict what will happen, you need to know what the slope of that line is and extrapolate future values from there.

What Machine Learning does in this trivial example is to provide a best estimate of the line’s slope (and intercept) based on the data available. You need to collect the raw data and select an algorithm (a linear equation in this case) that you believe will best represent the shape of the data. The ML algorithm fills in the blanks (parameters) and gives an estimate as to how much error may be involved in using this model to predict new values. Machine learning is very good at finding the right parameters for a model of your choosing that best represents the problem at hand.
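As a rough illustration of the idea above, here is a minimal Python sketch that fits a straight line to 20 days of visit counts using ordinary least squares and extrapolates the next week. The visit numbers are made up for illustration, and the function names (fit_line, rmse) are hypothetical helpers, not part of any particular library.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up daily visit counts for days 1..20 (illustration only).
days = list(range(1, 21))
visits = [102, 105, 103, 108, 110, 109, 113, 115, 114, 118,
          120, 119, 123, 125, 124, 128, 130, 129, 133, 135]

# "Training" is just estimating the two parameters of the line.
slope, intercept = fit_line(days, visits)
error = rmse(days, visits, slope, intercept)

# Extrapolate the fitted line over the next 7 days.
forecast = [slope * day + intercept for day in range(21, 28)]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, rmse={error:.2f}")
print([round(value) for value in forecast])
```

The error figure here is measured on the same data used to fit the line, so it is an optimistic estimate of how well the line will predict new days.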

A practical example

Let’s look at a specific application of ML in our DQA software. Whenever Diagnostic Imaging (DI) is used on a hospital visit, it needs to be recorded on the patient record. Given the sheer volume of DI exams that occur every day, making sure that they’re all properly transcribed is a daunting task. Even being correct 99% of the time can result in hundreds of missed exams, which can impact hospital funding from the government.

When tasked with finding these discrepancies, a common approach for a DS analyst would be to write a SQL database query to compare data from the DI system against the coded medical record data. The problem is that codes from the DI system are different from the standardized intervention codes (CCI in Canada) found on medical abstracts.

The DS analyst may then start writing custom logic or even maintaining mapping tables to perform this translation. We took this approach in an early version of our software and it was a difficult process to manage and had a significant error rate. There was hidden logic that we didn’t fully appreciate at the time, such as exclusion criteria when multiple DI records are found for the same time and body area.

Using Machine Learning to find missing DI exams

Luckily, there are many ML algorithms that are very good at detecting whether something is likely present or not (this is an example of “binary classification”). You just need to “train” the algorithm using historical records that list all DI system and resulting CCI codes for each visit. The ML algorithm processes all the historical data to learn the underlying relationships and gives you the ability to predict CCI codes for new records and uncover potential data quality errors.

This type of analysis takes less than 100 lines of code, none of which is particularly complicated. We tested a wide variety of binary classification algorithms including Logistic Regression, Support Vector classifiers, and Decision Tree variations. All were very accurate (about 300% more accurate than our mapping approach) and took very little time to develop. Additionally, the same approach is immediately reusable for any hospital regardless of DI system since the algorithm learns from the data (DI codes) itself. We only need to write database queries to feed the algorithms the proper data and retrain them.
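To make the approach more concrete, here is a simplified Python sketch of one such classifier: a small logistic regression trained by gradient descent. It is not the code used in DQA. The DI codes, visit IDs, and history records are all invented for illustration, and the model predicts only whether a single, unnamed target CCI code should appear on a visit.

```python
import math

# Hypothetical DI system codes; a real DI system supplies its own.
DI_CODES = ["XR_CHEST", "CT_HEAD", "US_ABDO"]

def encode(di_codes):
    """Turn a visit's set of DI codes into a 0/1 feature vector."""
    return [1.0 if code in di_codes else 0.0 for code in DI_CODES]

def predict_proba(x, weights, bias):
    """Probability that the target CCI code belongs on this visit."""
    z = bias + sum(w * xj for w, xj in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(x, weights, bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# Made-up history: (DI codes on the visit, 1 if the target CCI code was abstracted).
history = [
    ({"XR_CHEST"}, 1),
    ({"XR_CHEST", "US_ABDO"}, 1),
    ({"XR_CHEST", "CT_HEAD"}, 1),
    ({"CT_HEAD"}, 0),
    ({"US_ABDO"}, 0),
    ({"CT_HEAD", "US_ABDO"}, 0),
]
rows = [encode(codes) for codes, _ in history]
labels = [label for _, label in history]
weights, bias = train_logistic(rows, labels)

# New visits: (visit id, DI codes, is the target CCI code on the abstract?).
new_visits = [
    ("V1", {"XR_CHEST"}, True),
    ("V2", {"XR_CHEST", "US_ABDO"}, False),
    ("V3", {"CT_HEAD"}, False),
]

# Flag visits where the model expects the code but the abstract lacks it.
flagged = [
    visit_id
    for visit_id, codes, coded in new_visits
    if not coded and predict_proba(encode(codes), weights, bias) > 0.5
]
print(flagged)
```

In practice an analyst would more likely call an existing library implementation than write the training loop by hand. The loop is spelled out here only to show that nothing about the DI system is hard-coded beyond the list of codes: the relationship between DI codes and the CCI code is learned from the history.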

How do I get started?

Professional data scientists are very expensive and out of reach for most DS departments, but I believe that many hospital analysts can add practical ML techniques to their toolboxes. I can recommend the book “Machine Learning with R”, which is a fantastic resource for beginners. It provides a comprehensive overview of most ML algorithm categories along with practice examples in R, a popular statistical and ML software package that is completely free. I’m sure there are similarly great books for Python (the most popular language for ML), but I work primarily with R.

[Figure: How to select an algorithm based on your analytical problem. Source: Microsoft]

With just a fundamental understanding of ML and evaluating model performance, analysts can begin getting value from these algorithms. As they gain more experience they can increase the accuracy of results through various techniques (algorithm tuning and feature engineering) and solve progressively more complex problems.
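As one example of what evaluating model performance can look like for a binary classifier, the Python sketch below compares predictions against known answers on records held back from training. The labels are made up, and classification_metrics is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up hold-out set: true labels and what a model predicted for them.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

Precision indicates how many flagged records were real problems, and recall indicates how many real problems were flagged. For a data quality check, the balance between the two determines how much manual review the results create.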

There is a learning curve and culture shift involved and it will take time to understand how to productionize these algorithms. Luckily, Microsoft SQL Server now has integrated R and Python support, and Microsoft has allocated significant resources towards making ML more approachable for data analysts and developers. They have a cloud service for machine learning that is easy to get started with.

If you are a 3terra client, you can also contact us to help implement new ML logic within your DQA implementation. DQA analytics utilizes many ML algorithms and already houses a wide variety of integrated hospital data that can provide new insight into your operations.


A rudimentary, practical understanding of Machine Learning may transform the way that you approach solving analytical problems. It has made our software far more accurate and easier to maintain. It may extend your limits in answering complex questions, some of which may have been thought unsolvable. It’s not all hype and it doesn’t exclusively belong in the domain of experts.