What machine learning can teach about philosophy of science — and what it can’t

Tom Bäckström
4 min read · Nov 12, 2018


Machine learning is currently the hot trend in the engineering sciences. Though I'm divided as to whether the machine learning tradition follows a scientific philosophy which stands up to scrutiny, I do think that machine learning practice provides breakthrough insight into the philosophy of science. In particular, machine learning offers a new interpretation of what it means when we say that science can understand real-world events.

Machine learning is a data science where, simplifying somewhat, the goal is to learn a rule which takes a data set A and predicts a data set B. For example, given x-ray images of patients, we can try to predict whether each patient has cancer. The particulars are not overly important; the idea is that if we search over a family of sufficiently flexible functions, we can find one which makes accurate predictions.
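To make this concrete, here is a minimal sketch in Python, where the data and the relation between A and B are invented purely for illustration: we generate a noisy linear relation and search a simple family of functions, polynomials, for one that predicts B from A.

```python
import numpy as np

# Invented data: A holds measurements, B the quantity we want to predict.
rng = np.random.default_rng(0)
A = rng.uniform(-1.0, 1.0, size=200)
B = 2.0 * A + 0.1 * rng.normal(size=200)  # a noisy linear relation

# Search a simple family of functions (polynomials) for one that predicts B from A.
predict = np.poly1d(np.polyfit(A, B, deg=1))

print(predict(0.5))  # prediction for a new input; should be close to 1.0
```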

The interesting part is the way such models are evaluated: the existing data sets A and B are divided into two separate parts, a training set and an evaluation set; A.training and B.training as well as A.evaluation and B.evaluation, respectively. The model is constructed, or trained, with the training data, and its performance is measured on the evaluation data. By design, models are always supposed to be effective on the training data. The interesting question is whether the model is also effective in predicting results on data it has not seen before. In other words, optimally, the model should generalize beyond the data it has already seen: it should predict results for the evaluation set when it has seen only the training set. When we get the x-rays of new patients, the model should be able to tell whether they have cancer even though it has never seen the answer.
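As a sketch of this protocol, assuming scikit-learn is available and using made-up data in place of real x-rays, the split-train-evaluate cycle looks roughly like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Invented stand-ins for the x-ray example: A are the inputs, B the labels.
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 10))
B = (A[:, 0] + A[:, 1] > 0).astype(int)

# Divide the data into training and evaluation parts.
A_train, A_eval, B_train, B_eval = train_test_split(
    A, B, test_size=0.3, random_state=0)

# Construct (train) the model on the training data only...
model = LogisticRegression().fit(A_train, B_train)

# ...and measure performance on data the model has never seen.
print("evaluation accuracy:", accuracy_score(B_eval, model.predict(A_eval)))
```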

This is a task of both interpolation and extrapolation. A good model should capture the patterns in the data such that it can recognize the same patterns in new data. Whether a model can ever truly understand anything is a matter of philosophy, but that is beside the point: what matters is that a good model makes accurate predictions.

A central deficiency of machine learning methods is that even an accurate model does not usually provide the scientist with any additional understanding of the problem. At best, we find that there is a relation between A and B, such that we can use A to predict B. The model, however, does not explain that relationship. How is A related to B?

Understanding, however, is a matter of philosophy of science. How can we ever claim to understand something? According to the classic hypothetico-deductive view (associated with Popper rather than Kuhn), science is the process of observing the world, forming hypotheses based on those observations, and evaluating those hypotheses with new measurements. The problems with such a theory are well documented, and one of the issues is that this model does not take uncertainty into account. All measurements carry some uncertainty, such that it is impossible to state anything with certainty.

Machine learning, however, does not care about certainty, only about accuracy: how accurate are the predictions we can make? By giving up the concept of certainty and talking only about accuracy or likelihood, we can reinterpret science itself. We can proclaim, for example, that science is about making accurate predictions. This is a Bayesian perspective on the philosophy of science, where we try to find the most likely solutions to scientific problems.
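One way to picture this, as a hedged sketch with invented numbers and assuming scipy is available, is to score competing "theories" not as true or false but by the likelihood they assign to data they have not yet seen:

```python
import numpy as np
from scipy.stats import norm

# Invented observations of some phenomenon.
rng = np.random.default_rng(1)
observed = rng.normal(loc=1.0, scale=1.0, size=100)
unseen = rng.normal(loc=1.0, scale=1.0, size=100)

# Two candidate "theories", expressed as predictive distributions.
theory_a = norm(loc=observed.mean(), scale=observed.std())  # fitted to observations
theory_b = norm(loc=0.0, scale=2.0)                         # a vaguer guess

# Neither theory is declared certainly true or false; each is scored
# by how likely it makes the data it has not seen.
print("log-likelihood, theory A:", theory_a.logpdf(unseen).sum())
print("log-likelihood, theory B:", theory_b.logpdf(unseen).sum())
```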

Classical concepts in the philosophy of science, like Occam's razor, can then be interpreted within this Bayesian framework. Specifically, of two possible models, the simpler one has fewer potential sources of error, such that the simpler one is superior to the more complex one even when the accuracy of both models is similar. We thus interpret the complexity of a model as a source of uncertainty, which has the potential to reduce accuracy.
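A small illustration of this reading of Occam's razor, again with synthetic data: a simple and a needlessly complex model fit the training data comparably well, but the complex one tends to carry its extra degrees of freedom into larger errors on new data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented data obeying a simple law plus measurement noise.
x_train = rng.uniform(-1, 1, size=30)
y_train = 2.0 * x_train + 0.2 * rng.normal(size=30)
x_test = rng.uniform(-1, 1, size=30)
y_test = 2.0 * x_test + 0.2 * rng.normal(size=30)

for degree in (1, 9):  # a simple model and a needlessly complex one
    model = np.poly1d(np.polyfit(x_train, y_train, deg=degree))
    train_err = np.mean((model(x_train) - y_train) ** 2)
    test_err = np.mean((model(x_test) - y_test) ** 2)
    print(f"degree {degree}: train error {train_err:.3f}, test error {test_err:.3f}")
```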

As a scientific philosophy, the weakness of machine learning is its reliance on data alone. We do not have a fixed methodology for incorporating prior results into new models. In data science, standing on the shoulders of giants amounts to adding a new pile of data on top of the old pile. There is no specific approach for transferring knowledge between models. This weakness is emphasised in systems optimized with the end-to-end paradigm, where the system consists of a single machine learning block without any intermediate representations. The noble idea is that this gives the machine learning algorithm the greatest freedom to optimize for the final output quality; any separately designed intermediate representation is a potential restriction. By avoiding intermediate representations, however, we also lose intermediate, human-readable results, such that it becomes more difficult to interpret what the algorithm is doing.
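The contrast might be sketched like this, as a toy comparison with scikit-learn rather than anyone's production system: the modular pipeline exposes a designed, inspectable intermediate representation, while the end-to-end model offers only its final output.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Modular design: a hand-chosen intermediate representation (PCA features)
# sits between input and output, and a human can inspect it.
modular = make_pipeline(PCA(n_components=5), LogisticRegression()).fit(X, y)
print(modular.named_steps["pca"].explained_variance_ratio_)

# End-to-end design: a single block from input to output, free to optimize
# itself, but offering no designed intermediate result to read.
end_to_end = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                           random_state=0).fit(X, y)
```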

Still, in conclusion, I find that a scientific philosophy which focuses on accurate prediction has great potential. In a very utilitarian way, it avoids the problem of distinguishing between fact and fake: the utility of a theory is measured by the accuracy of its predictions. In addition to utility, this approach to the philosophy of science also focuses on efficiency. In other words, the more coherent our world-view becomes, the more efficiently we can operate in that world. To me, that sounds like a rational goal in science, life and everything.


Written by Tom Bäckström

An excited researcher of life and everything. Associate Professor in Speech and Language Technology at Aalto University, Finland.
