Domain-agnostic methods for integration of prior knowledge in learning algorithms

Data analysis applications have shifted markedly in recent years: reasoning about systems is moving from the use of both domain knowledge and data to purely data-based analysis. This shift has been driven largely by the availability of tremendous amounts of data in all problem domains, and that availability is poised to increase even further due to the pervasive nature of the Internet of Things (IoT). In parallel, vigorous research in data science algorithms has generated a suite of apparently “all-weather” algorithms that can be used in conjunction with Big Data for reasoning and analysis in almost all problem domains. In such a scenario, it is important to step back and assess the relevance of domain knowledge in data analysis. Several fundamental questions need answers. Is domain knowledge unnecessary in data analysis; in other words, can everything that can be uncovered be uncovered with data and algorithms alone? Even if it were possible to work purely with data, would the resulting models be interpretable enough to be used in contexts that were not considered at the time of analysis? Conversely, if domain knowledge is indeed available, how would one incorporate it into data analysis algorithms to derive better and more interpretable models? Answering the last question requires fundamental research in techniques that incorporate domain knowledge into learning algorithms. This proposal addresses this question for a class of domain knowledge and machine learning algorithms.