Modelops improves machine learning model development, testing, deployment, and monitoring. Follow these tips to keep model risks in check and increase the efficiency and usefulness of your ML initiatives.
Let’s say your company’s data science teams have documented business goals for areas where analytics and machine learning models can deliver business impacts. Now they are ready to start. They’ve tagged data sets, selected machine learning technologies, and established a process for developing machine learning models. They have access to scalable cloud infrastructure. Is that sufficient to give the team the green light to develop machine learning models and deploy the successful ones to production?
Not so fast, say some machine learning and artificial intelligence experts who know that every innovation and production deployment comes with risks that need reviews and remediation strategies. They advocate establishing risk management practices early in the development and data science process. “In the area of data science or any other similarly focused business activity, innovation and risk management are two sides of the same coin,” says John Wheeler, senior advisor of risk and technology for AuditBoard.
To draw an analogy with application development, software developers don’t just write code and deploy it to production without considering risks and best practices. Most organizations establish a software development life cycle (SDLC), shift left with devsecops practices, and create observability standards to remediate risks. These practices also ensure that development teams can maintain and improve code once it is deployed to production.
SDLC’s equivalent in machine learning model management is modelops, a set of practices for managing the life cycle of machine learning models. Modelops practices include how data scientists create, test, and deploy machine learning models to production, and then how they monitor and improve ML models to ensure they deliver expected results.
Risk management covers a broad range of potential problems and their remediation, so in this article I focus on the risks tied to modelops and the machine learning life cycle. Other related risk management topics include data quality, data privacy, and data security. Data scientists must also review training data for biases and consider other important responsible AI and ethical AI factors.
After talking to several experts, I identified five problem areas that modelops practices and technologies can help remediate.
Risk 1. Developing models without a risk management strategy
In the State of Modelops 2022 Report, more than 60% of AI enterprise leaders reported that managing risk and regulatory compliance is challenging. Data scientists are generally not experts in risk management, so in enterprises, a first step should be to partner with risk management leaders and develop a strategy aligned to the modelops life cycle.
Wheeler says, “The goal of innovation is to seek better methods for achieving a desired business outcome. For data scientists, that often means creating new data models to drive better decision-making. However, without risk management, that desired business outcome may come at a high cost. When striving to innovate, data scientists must also seek to create reliable and valid data models by understanding and mitigating the risks that lie within the data.”
To learn more about model risk management, see the white papers from Domino and ModelOp. Data scientists should also institute data observability practices.
Risk 2. Increasing maintenance with duplicate and domain-specific models
Data science teams should also create standards for which business problems to focus on and how to generalize models so they function across business domains and areas. Teams should avoid creating and maintaining multiple models that solve similar problems; they need efficient techniques for training models in new business areas.
Srikumar Ramanathan, chief solutions officer at Mphasis, recognizes this challenge and its impact. “Every time the domain changes, the ML models are trained from scratch, even when using standard machine learning principles,” he says.
Ramanathan offers this remediation. “By using incremental learning, in which we use the input data continuously to extend the model, we can train the model for the new domains using fewer resources.”
Incremental learning is a technique for training models on new data continuously or on a defined cadence. There are examples of incremental learning on Amazon SageMaker, Azure Cognitive Search, MATLAB, and River, a Python library.
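To make the idea concrete, here is a minimal sketch of incremental learning in plain Python. It is not the API of any of the platforms above: the OnlineLogisticRegression class, its update and predict_proba methods, and the synthetic "domains" are all hypothetical and exist only for illustration. The model is trained on one domain, then extended with a smaller batch from a new domain rather than being rebuilt from scratch.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical toy model: logistic regression updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate
        self.examples_seen = 0

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-35.0, min(35.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic gradient step on the log loss for a single example.
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
        self.examples_seen += 1


def make_domain(rng, n, threshold):
    """Synthetic domain (an assumption): label is 1 when x0 + x1 > threshold."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1 if x[0] + x[1] > threshold else 0))
    return data


def accuracy(model, data):
    correct = sum((model.predict_proba(x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)


rng = random.Random(42)
model = OnlineLogisticRegression(n_features=2)

# Initial training on the original domain.
for x, y in make_domain(rng, 1000, threshold=0.0):
    model.update(x, y)

# A new domain arrives with a shifted decision boundary.
new_domain_test = make_domain(rng, 1000, threshold=0.6)
accuracy_before = accuracy(model, new_domain_test)

# Extend the existing model with a smaller batch instead of starting over.
for x, y in make_domain(rng, 300, threshold=0.6):
    model.update(x, y)
accuracy_after = accuracy(model, new_domain_test)

print(f"new-domain accuracy before: {accuracy_before:.2f}, after: {accuracy_after:.2f}")
```

In practice, a team would more likely rely on a library or platform feature than on hand-written gradient steps; the point of the sketch is only that the second training pass reuses the existing weights.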
Risk 3. Deploying too many models for the data science team’s capacity
The challenge in maintaining models goes beyond the steps to retrain them or implement incremental learning. Kjell Carlsson, head of data science strategy and evangelism at Domino Data Lab, says, “An increasing but largely overlooked risk lies in the constantly lagging ability for data science teams to redevelop and redeploy their models.”
Similar to how devops teams measure the cycle time for delivering and deploying features, data scientists can measure their model velocity.
Carlsson explains the risk and says, “Model velocity is usually far below what is needed, resulting in a growing backlog of underperforming models. As these models become increasingly critical and embedded throughout companies—combined with accelerating changes in customer and market behavior—it creates a ticking time bomb.”
Dare I label this issue “model debt”? As Carlsson suggests, measuring model velocity and the business impacts of underperforming models is the key starting point for managing this risk.
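The experts quoted here don’t give a formula for model velocity, so the following is only one possible way to track it. The sketch assumes velocity is measured as the cycle time between flagging a model for rework and redeploying it, plus a count of models still waiting. The RefreshRecord type, the model names, and the dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class RefreshRecord:
    """Hypothetical log entry for one model refresh."""
    model_name: str
    flagged_on: date               # when the model was flagged as needing rework
    redeployed_on: Optional[date]  # None while the refresh is still outstanding


def cycle_times_in_days(records):
    """Days from flag to redeployment, for completed refreshes only."""
    return [
        (r.redeployed_on - r.flagged_on).days
        for r in records
        if r.redeployed_on is not None
    ]


def backlog(records):
    """Names of models that were flagged but have not been redeployed."""
    return [r.model_name for r in records if r.redeployed_on is None]


# Illustrative data, not real measurements.
records = [
    RefreshRecord("churn", date(2023, 1, 9), date(2023, 2, 20)),
    RefreshRecord("pricing", date(2023, 1, 16), date(2023, 1, 30)),
    RefreshRecord("fraud", date(2023, 2, 1), date(2023, 3, 3)),
    RefreshRecord("demand", date(2023, 2, 13), None),
]

median_cycle_time = median(cycle_times_in_days(records))
print(f"median cycle time: {median_cycle_time} days, backlog: {backlog(records)}")
```

Tracking the median alongside the backlog helps show whether the team is keeping pace or accumulating model debt.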
Data science teams should consider centralizing a model catalog or registry so that team members know which models exist, their status in the ML model life cycle, and the people responsible for managing them. Model catalog and registry capabilities can be found in data catalog platforms, ML development tools, and both MLops and modelops technologies.
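As a rough illustration of what a registry records, here is a minimal in-memory sketch. It is not modeled on any particular product: the Stage values, the ModelEntry fields, and the ModelRegistry methods are assumptions chosen to mirror the three things mentioned above, namely which models exist, where they are in the life cycle, and who owns them.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Hypothetical life cycle stages; real tools define their own."""
    DEVELOPMENT = "development"
    TESTING = "testing"
    PRODUCTION = "production"
    RETIRED = "retired"


@dataclass
class ModelEntry:
    name: str
    version: int
    stage: Stage
    owner: str


class ModelRegistry:
    """Toy registry keyed by (name, version)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        key = (entry.name, entry.version)
        if key in self._entries:
            raise ValueError(f"{entry.name} v{entry.version} is already registered")
        self._entries[key] = entry

    def promote(self, name, version, stage):
        # Raises KeyError if the model version was never registered.
        self._entries[(name, version)].stage = stage

    def in_stage(self, stage):
        matches = [e for e in self._entries.values() if e.stage is stage]
        return sorted(matches, key=lambda e: (e.name, e.version))

    def owned_by(self, owner):
        matches = [e for e in self._entries.values() if e.owner == owner]
        return sorted(matches, key=lambda e: (e.name, e.version))


# Illustrative entries; the names and owners are made up.
registry = ModelRegistry()
registry.register(ModelEntry("churn", 1, Stage.PRODUCTION, "ana"))
registry.register(ModelEntry("churn", 2, Stage.TESTING, "ana"))
registry.register(ModelEntry("pricing", 1, Stage.DEVELOPMENT, "raj"))

registry.promote("churn", 2, Stage.PRODUCTION)
registry.promote("churn", 1, Stage.RETIRED)

for entry in registry.in_stage(Stage.PRODUCTION):
    print(entry.name, entry.version, entry.owner)
```

A production registry would also persist entries and track lineage, approvals, and metrics, which is where the platforms mentioned above come in.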
Risk 4. Getting bottlenecked by bureaucratic review boards
Let’s say the data science team has followed the organization’s standards and best practices for data and model governance. Are they finally ready to deploy a model?
Risk management organizations may want to institute review boards to ensure data science teams mitigate all reasonable risks. Risk reviews may be reasonable when data science teams are just starting to deploy machine learning models into production and adopt risk management practices. But when is a review board necessary, and what should you do if the board becomes a bottleneck?
Chris Luiz, director of solutions and success at Monitaur, offers an alternative approach. “A better solution than a top-down, post hoc, and draconian executive review board is a combination of sound governance principles, software products that match the data science life cycle, and strong stakeholder alignment across the governance process.”
Luiz has several recommendations on modelops technologies. He says, “The tooling must seamlessly fit the data science life cycle, maintain (and preferably increase) the speed of innovation, meet stakeholder needs, and provide a self-service experience for non-technical stakeholders.”
Modelops technologies that have risk management capabilities include platforms from Datatron, Domino, Fiddler, MathWorks, ModelOp, Monitaur, RapidMiner, SAS, and TIBCO Software.
Risk 5. Failing to monitor models for data drift and operational issues
When a tree falls in the forest, will anyone take notice? We know the code needs to be maintained to support framework, library, and infrastructure upgrades. When an ML model underperforms, do monitors and trending reports alert data science teams?
“Every AI/ML model put into production is guaranteed to degrade over time due to the changing data of dynamic business environments,” says Hillary Ashton, executive vice president and chief product officer at Teradata.
Ashton recommends, “Once in production, data scientists can use modelops to automatically detect when models start to degrade (reactive via concept drift) or are likely to start degrading (proactive via data drift and data quality drift). They can be alerted to investigate and take action, such as retrain (refresh the model), retire (complete remodeling required), or ignore (false alarm). In the case of retraining, remediation can be fully automated.”
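Ashton doesn’t name a specific drift metric, so the sketch below swaps in one common choice, the population stability index (PSI), to show how a proactive data drift check might work for a single numeric feature. The 0.1 and 0.25 cutoffs are a widely used rule of thumb rather than a standard, and the drift_action function and its return labels are hypothetical.

```python
import math
import random


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-4) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_action(psi, warn=0.1, alert=0.25):
    """Map a PSI value to a next step; thresholds are rule-of-thumb assumptions."""
    if psi < warn:
        return "ignore"
    if psi < alert:
        return "investigate"
    return "retrain"


# Synthetic data standing in for a training baseline and two production windows.
rng = random.Random(7)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
stable = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_stable = population_stability_index(training, stable)
psi_shifted = population_stability_index(training, shifted)

print(f"stable window:  PSI={psi_stable:.3f} -> {drift_action(psi_stable)}")
print(f"shifted window: PSI={psi_shifted:.3f} -> {drift_action(psi_shifted)}")
```

A check like this covers only input data drift. Detecting concept drift, the reactive case Ashton describes, also requires comparing predictions against actual outcomes once they become available.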
The takeaway from this review is that data science teams should define their modelops life cycle and develop a risk management strategy for the major steps. They should partner with their compliance and risk officers and use tools and automation to centralize a model catalog, improve model velocity, and reduce the impacts of data drift.