Friday, 20 October 2017

Paper Review: The Discipline of Machine Learning - Tom M. Mitchell

This paper is a must-read for anyone starting their journey into the world of Data Science and Machine Learning. It introduces the concept of Machine Learning in a simple and crisp manner, and it touches upon the key research questions surrounding the scope of machine learning algorithms and the variety of learning tasks still to be explored. Back in 2006, data scientists had already made much headway in commercial applications of ML such as Speech Recognition, Computer Vision and Bio-Surveillance (though of course we are still endeavoring to improve performance and accuracy in these fields today).

Although the paper was written over a decade ago, I believe the ideas expressed summarize much of what is known in the field today. The paper defines Machine Learning as a process in which a machine learns from its experience E and uses that learning to improve its performance P at carrying out a defined task T. The concept is fairly simple. The complexity of actually implementing it is a whole different question.
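To make the T, P and E framing concrete, here is a minimal sketch in Python. It is my own toy illustration and not something taken from the paper: the task, the hidden threshold of 0.6 and the function names are all hypothetical. The task T is to decide whether a number lies above a hidden threshold, the experience E is a list of labeled examples, and the performance P is accuracy on a fixed test set.

```python
import random

HIDDEN_THRESHOLD = 0.6  # hypothetical "true" rule; the learner never reads this directly


def label(x):
    """Task T: decide whether x lies at or above the hidden threshold."""
    return x >= HIDDEN_THRESHOLD


def learn_threshold(examples):
    """Estimate the threshold from experience E, a list of (x, label) pairs."""
    below = [x for x, y in examples if not y]
    above = [x for x, y in examples if y]
    lo = max(below) if below else 0.0  # largest value seen labeled False
    hi = min(above) if above else 1.0  # smallest value seen labeled True
    return (lo + hi) / 2               # guess the midpoint of the remaining gap


def accuracy(threshold, test_set):
    """Performance P: fraction of test points classified correctly."""
    correct = sum((x >= threshold) == y for x, y in test_set)
    return correct / len(test_set)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
test_set = [(x, label(x)) for x in (rng.random() for _ in range(1000))]
experience = [(x, label(x)) for x in (rng.random() for _ in range(200))]

# Measure P after the learner has seen more and more of E.
for n in (0, 5, 50, 200):
    estimate = learn_threshold(experience[:n])
    print(f"E = {n:3d} examples -> P = {accuracy(estimate, test_set):.3f}")
```

With no experience the learner can only guess the midpoint 0.5; as it sees more examples, the gap around the true threshold narrows and accuracy tends to rise, although with this simple midpoint rule it need not rise at every single step. Real systems differ mainly in how much harder T, P and E are to pin down.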

Take, for example, the plethora of personal assistants entering the market recently. Google Home just hit the market to take on its competitor Amazon Echo (Alexa), and Apple HomePod (Siri) will shortly be joining the ranks. Now, what really defines the performance of a personal assistant? One of the key functions of a PA is to provide answers to a user’s queries. To assess their performance, I put this question to Google and Siri: “What is a neural network?”

Google presented the dictionary definition, “noun – a computer system modelled on the human brain and nervous system.”

Siri took me to the Wikipedia page for Artificial Neural Networks, which starts off with “Artificial neural networks (ANNs), a form of connectionism, are computing systems inspired by the biological neural networks that constitute animal brains.”

Which definition warrants a better rating on performance? Google’s is an extremely simplistic one, while Siri’s leads to a detailed document on the subject. One might argue that the second adds more knowledge and hence is the better choice. Is it really, though? The definitions suit different kinds of users. The wiki page might be perfect for an aspiring data scientist, while the dictionary definition is probably all that a non-technical user needs to put their curiosity to rest.

The paper touches upon numerous aspects of machine learning research and leaves a reader with various avenues to start the exploratory journey into the field.
