A self-driving car bumps into a lamppost. Relying on an AI-based diagnostic tool, a doctor prescribes the wrong treatment to a patient. An AI-based defense system misfires a missile. An unfair decision by a banking chatbot drives customers away.
The above examples are probably sufficient to explain why there is a need to tread carefully while deploying AI-based solutions in the real world. While a wrong prediction made by a recommendation system in a domain like retail might be inexpensive, such predictions in domains like healthcare, self-driving vehicles, banking or defense can cause hefty monetary losses or even loss of lives. Many of these problems could be avoided if one were able to understand the reasoning behind an AI's decisions.
Towards this goal, AI researchers have come up with various ways to surface the inner workings of models. These techniques may be broadly categorized as either explaining the predictions of a black-box model, such as a traditional Convolutional Neural Network (CNN), or building a model that is inherently interpretable, such as a Decision Tree, while keeping it highly accurate.
Prof. Balaraman Ravindran, a Mindtree Faculty Fellow and professor at IIT Madras, who also heads the Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), and Mr. Abhishek Ghose, a Director in the Data Sciences Group at [24]7.ai and a PhD scholar at IIT Madras, have been interested in streamlining the use of interpretable models. They observed a tradeoff: when a model is large, it is difficult to explain how it arrives at its predictions, whereas a small model may be relatively inaccurate. The duo set out to ensure that small interpretable models can be constructed with the least possible loss in accuracy, and recently came up with a technique to build compact models that minimize this tradeoff.
“The technique essentially works by modifying the model training step for an arbitrary model family to produce high accuracy at small sizes. The practical benefit of this is that instead of picking an interpretable model family based on accuracy, one may construct an accurate but possibly large model from a preferred model family, and then use our method to make it compact,” says Prof. Ravindran.
The algorithm first trains a highly accurate probabilistic model - called the "oracle" - on the training data. The oracle's predictions are then used to learn a sampling distribution over the training data, and a sample drawn from this distribution is used to train the interpretable model. The step of learning the distribution is framed as an optimization problem. The team tested the technique on various real-world datasets and found that it can indeed produce small, accurate interpretable models.
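To make the overall workflow concrete, here is a minimal conceptual sketch in Python using scikit-learn. It is not the authors' implementation: in particular, the sampling distribution below is an illustrative stand-in (weighting instances by the oracle's predictive uncertainty), whereas the actual technique learns that distribution by solving an optimization problem.

```python
# Sketch: oracle-guided sampling to train a compact interpretable model.
# Assumptions: a gradient-boosted ensemble as the oracle, a shallow decision
# tree as the interpretable model, and entropy-based weights as a simple
# placeholder for the learned sampling distribution.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: train a highly accurate probabilistic "oracle" on the training data.
oracle = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Step 2: derive a sampling distribution over the training data from the
# oracle's predictions (here: normalized predictive entropy per instance).
proba = oracle.predict_proba(X_train)
entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
weights = entropy + 1e-6          # avoid an all-zero distribution
weights /= weights.sum()

# Step 3: draw a sample from that distribution and fit the compact,
# interpretable model on it.
idx = np.random.default_rng(0).choice(
    len(X_train), size=len(X_train), replace=True, p=weights
)
small_model = DecisionTreeClassifier(max_depth=3, random_state=0)
small_model.fit(X_train[idx], y_train[idx])

print("oracle accuracy:      ", oracle.score(X_test, y_test))
print("compact tree accuracy:", small_model.score(X_test, y_test))
```

The point of the sketch is the three-step shape of the method - train an oracle, turn its predictions into a distribution over training instances, and fit the small model on a sample from that distribution - rather than the particular weighting used here.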
“What makes our algorithm interesting is that it is not tied to a specific optimizer - it’s easy to make it faster as newer and better optimizers become available. Also, since our technique operates by identifying data instances that have the greatest influence on learning, there are parallels to the area of data valuation, say, by computing Data Shapley values. This is a connection worth exploring,” says Mr. Ghose, while discussing future research plans.
As AI systems become increasingly common in the real world, it is paramount that we correctly understand how these systems work. This would not only increase a modeler's confidence in the robustness of a system, but would also make it trustworthy for its users. In that context, the technique developed by the RBCDSAI researchers plays an important role in paving the way for the deployment of useful and benevolent AI systems.
Contributors
Abhishek Ghose, Balaraman Ravindran
Article
Abhishek Ghose and Balaraman Ravindran, Learning Interpretable Models Using an Oracle.
Python Library - compactem: build accurate small models!
Keywords
Machine Learning, Model-Agnostic Technique, Gated Recurrent Unit, Optimal Training Sample, Optimization Problem