Let’s pretend that you’re a Data Scientist (or if you actually are, no pretending needed ;)). 

After weeks or even months of hard work understanding and refining a business problem, collecting and cleaning data, and building state-of-the-art models with the latest Machine Learning algorithms, you finally have a model that solves a real-world problem, and it’s ready to be put into production.

Pass that workflow to your MRM team or MLOps (IT) and let them do their thing—and congrats, you’re done. Or … are you? 

Contrary to popular belief, a Data Scientist’s job doesn’t end after a model is built. In fact, a model has absolutely no value for anyone until it’s deployed into a production environment and used on a regular basis by a business process.

Your next challenge: figuring out how your company actually deploys the model, and what happens after that. For now, let’s set aside the technical struggles of deploying a model, and the complete Machine Learning pipeline that eventually comes with it, and focus on what’s next.

Typically, this is when IT departments get involved. They may be responsible for model packaging, model hosting, and model serving (think API here). Every application under IT responsibility should be monitored. Most of the time, this involves tracking classical information such as:

  • Making sure the resource is running and accessible

  • Monitoring CPU, GPU, and RAM usage

  • Keeping logs of the resource (application logs, error logs, etc.)

  • A heartbeat check confirming the model returns results

And that’s it. 
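That last item, the heartbeat check, is often as simple as pinging the model endpoint with a known payload and confirming that an answer comes back. Here is a minimal sketch in Python; the endpoint URL, payload, and response field are hypothetical placeholders, not a real API.

```python
# Minimal heartbeat sketch: call the (hypothetical) model endpoint with a
# known payload and check that it answers with a prediction.
import requests

MODEL_URL = "https://models.example.com/churn/v1/predict"  # hypothetical endpoint
SAMPLE_PAYLOAD = {"features": {"age": 42, "plan": "premium"}}  # hypothetical input

def heartbeat() -> bool:
    """Return True if the model endpoint responds with a prediction."""
    try:
        resp = requests.post(MODEL_URL, json=SAMPLE_PAYLOAD, timeout=5)
        resp.raise_for_status()
        return "prediction" in resp.json()  # assumed response field
    except requests.RequestException:
        return False

if __name__ == "__main__":
    print("model alive:", heartbeat())
```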

Unfortunately, Machine Learning models are nothing like traditional IT applications. While these pieces of classical application-level information are needed, they’re not sufficient. Machine Learning models are awkward artifacts that can technically work, meaning they’ll return an answer when requested (e.g. a probability, a forecast, an expected class), while that answer is completely wrong for the very business problem they’re trying to solve. And a wrong answer will only hurt the business process that relies on it.

A lot of real-world complications can produce bad behavior in Machine Learning models. The most frequent ones are:

  • Inconsistent data being fed to the model

  • Poorly designed algorithms 

  • Poorly designed data processing (e.g. incomplete feature engineering or introducing data leakage)

  • Structural changes in the business problem we’re trying to solve

While the two middle bullet points (poorly designed algorithms and poorly designed data processing) should be addressed with the state-of-the-art modeling techniques Data Scientists are actually responsible for, the other two aren’t tied to their modeling work. Let’s look at each of those two more closely.

 

“Structural changes in the business problem we’re trying to solve”

When working on a real-world problem, we know that the use case will evolve over time. This is 100 percent normal. If the evolution is slow-paced, a deployed model can remain accurate for a year or more after being deployed.

On the other hand, a fast-deprecating model might be useless after a couple of weeks. Either way, a model will be impacted by any unexpected major change. For example, COVID-era lockdowns had a major impact on a lot of businesses, many of which didn’t anticipate that their pre-COVID models would no longer be a good fit. This is what’s known as concept drift: the statistical properties of the target variable a model is trying to predict change over time, rendering the predictions less accurate. Whether concept drift happens quickly or slowly, action needs to be taken on the model. Typically, retraining the model with more, fresher data is high on the priority list.

So when is concept drift important enough to trigger action? The answer depends on your use case. In order to know whether the drift is increasing, you need to monitor it. And if you rely solely on IT to manage this stage of the model’s lifecycle, you might miss it completely. Metrics that quantify concept drift are highly technical and Data Science–oriented; it’s not the role of an IT professional to interpret concept drift or to know what action to take. Ultimately, it’s a Data Scientist’s responsibility to make a Data Science project work. That’s why it’s crucial that you have direct access to information about concept drift yourself. That way you can recognize when action is needed, and take it.
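In practice, one common way to monitor this (once ground-truth labels eventually come back for recent predictions) is to compare the model’s recent performance against the performance measured at deployment time and raise an alert when the gap grows too large. Here is a minimal sketch, assuming a binary classifier scored with AUC; the baseline value, window, and threshold are illustrative assumptions, not recommendations.

```python
# Minimal concept-drift check: compare recent performance to the
# performance measured on the validation set at deployment time.
import numpy as np
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.87  # example value measured at deployment time
MAX_DROP = 0.05      # illustrative tolerance before alerting

def concept_drift_alert(y_true_recent: np.ndarray, y_score_recent: np.ndarray) -> bool:
    """Flag concept drift when recent AUC falls too far below the baseline."""
    recent_auc = roc_auc_score(y_true_recent, y_score_recent)
    return (BASELINE_AUC - recent_auc) > MAX_DROP

# Example usage on, say, the last 30 days of predictions joined with outcomes:
# if concept_drift_alert(y_true, y_score):
#     trigger_retraining()  # hypothetical hook into your training pipeline
```

If labels arrive too slowly for this kind of check, monitoring the distribution of the model’s output scores over time is a common fallback.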

 

“Inconsistent data being fed to the model”

The other major error that can make a model go off the rails is being fed the wrong data, also known as data drift. Here are some common examples (a quick validation sketch follows the list):

  • Relevant features start to go missing

  • New modalities (categories) appear in categorical features

  • The distribution of numerical features changes over time

  • Changes in the schema of data (e.g. an integer becomes a character or a date format changes)

  • Manually entered input containing a user error (e.g. you’re expecting an amount in millions of dollars, and it comes in plain dollars instead)
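Most of the items above can be caught with simple automated checks on each incoming batch, compared against reference values taken from the training data. Here is a minimal sketch assuming a pandas DataFrame; the expected columns, dtypes, and known categories are made-up examples.

```python
# Minimal data-validation sketch: schema, dtype, and modality checks on an
# incoming batch, against reference values derived from the training data.
import pandas as pd

EXPECTED_DTYPES = {"amount": "float64", "country": "object"}  # illustrative reference
KNOWN_COUNTRIES = {"FR", "DE", "US"}                          # modalities seen in training

def validate_batch(batch: pd.DataFrame) -> list[str]:
    """Return human-readable warnings about the incoming batch."""
    warnings = []
    for col, dtype in EXPECTED_DTYPES.items():
        if col not in batch.columns:
            warnings.append(f"missing feature: {col}")
        elif str(batch[col].dtype) != dtype:
            warnings.append(f"dtype change on {col}: {batch[col].dtype} (expected {dtype})")
    if "country" in batch.columns:
        new_modalities = set(batch["country"].dropna().unique()) - KNOWN_COUNTRIES
        if new_modalities:
            warnings.append(f"new modalities in country: {sorted(new_modalities)}")
    return warnings
```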

These types of errors are in no way the fault of the Data Scientist who owns the model. But garbage in will lead to garbage out. To prevent that, information about the data sent into the model needs to be computed and analyzed. That’s not to say you should raise an alert every time you see 0.1% of data missing in a given feature. But if, say, you notice that one of the top features is suddenly missing, you can expect things to go wrong for model users and, a couple of weeks later, for the data team. Like concept drift, data drift needs to be analyzed by a Data Scientist, assuming you have these kinds of analytics available. In other words, don’t expect your IT department to figure out what a Jensen-Shannon divergence is, or to identify any other fancy technical indicator that sums up data drift. This is a Data Scientist’s job.
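As for the distribution of a numerical feature, the Jensen-Shannon divergence mentioned above is one way to quantify how far a recent production window has drifted from the training data. Here is a minimal sketch using SciPy’s Jensen-Shannon distance; the binning choice and the 0.1 alert threshold are illustrative assumptions.

```python
# Minimal distribution-drift sketch for one numerical feature, comparing
# training values against a recent production window.
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_drift(train_values: np.ndarray, recent_values: np.ndarray, bins: int = 20) -> float:
    """Jensen-Shannon distance between the two empirical distributions."""
    edges = np.histogram_bin_edges(np.concatenate([train_values, recent_values]), bins=bins)
    p, _ = np.histogram(train_values, bins=edges, density=True)
    q, _ = np.histogram(recent_values, bins=edges, density=True)
    return float(jensenshannon(p, q))

# Example usage:
# drift = js_drift(train_df["amount"].to_numpy(), recent_df["amount"].to_numpy())
# if drift > 0.1:  # illustrative threshold
#     alert_data_science_team()  # hypothetical alerting hook
```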

All in all, we’ve seen that while IT monitoring is necessary for Machine Learning projects, classical IT indicators alone are not sufficient to provide a good level of service. They need to be rounded out with Data Science KPIs monitored by a Data Science professional.

About Provision.io

Provision.io brings powerful AI management capabilities to data science users so more AI projects make it into production and stay in production. Our purpose-built AI Management platform was designed by data scientists for data scientists and citizen data scientists to scale their value, domain expertise, and impact. The platform manages the hidden complexities and burdensome tasks that get in the way of realizing the tremendous productivity and performance gains AI can deliver across your business.


About the author

Florian Laroumagne

Senior Data Scientist & Co-founder

An engineer in computer science and applied mathematics, Florian specialized in Business Intelligence and then Data Science. He is certified by ENSAE and took 2nd place in the Best Data Scientist of France competition in 2018. Florian is the co-founder of Provision.io, a startup specializing in the automation of Machine Learning. He now leads several applied predictive modeling projects for various clients.