Machine Learning – A Dive into the Mystery

Machine Learning: you have heard about it and you have seen it in action, but you are still unsure how it works. The fact that you are here shows that you are curious about what all the buzz is about.

A formal definition

Tom Mitchell, in his book Machine Learning, gives a slightly informal definition in the opening line of the preface:

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.

I like this definition because it states plainly what our goal is when developing such programs. On the more formal side, Mitchell gives a definition in the introduction that you will see repeated in almost every introductory Machine Learning article:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

This formalism tends to put readers off. But don’t let it scare you off, because this definition can help you in a way few other machine learning definitions can. We can use it as a template: put E, T and P at the top of a table and describe complex problems with less ambiguity. It pushes you away from a narrow view of the problem and makes you think about what data to collect (E), what decisions the program needs to make (T) and how you will evaluate its results (P). This power to resolve the ambiguity in a complex problem is like a superpower for those who are already the strongest people on the planet: PROGRAMMERS.

Now, let me answer the most awaited question:

What’s all the buzz about Machine Learning?

As I already said, the strongest people on the planet are PROGRAMMERS, and why so? Because there is hardly any industry left that doesn’t need its own IT department, either to develop the software that helps the business grow or to handle its loads of data effectively. But one kind of software is taking over the industry faster than any other: self-improving software, in short, software built on the concept of Machine Learning. Products such as Google’s messaging app Allo, Amazon’s product recommendations, Netflix’s suggestions of movies you will love to watch and many more (check out this YouTube Link) use machine learning to give their users something that almost seems to be self-aware.

Being a programmer myself, I understand how these high-level definitions and formalisms can take their sweet time to sink in. So let’s turn to what we do best and give this formalism a programmatic approach.

Programmatic Approach

In real-world scenarios, you will run into complex problems for which it is simply not feasible to write an if-statement for every case you need to cover. Let us take the example most commonly used to give a glimpse of the idea behind machine learning: spam email detection. How would you write a program that sorts incoming emails into the inbox folder and the spam folder?

A typical programmer’s approach would be to collect a number of examples and then look for patterns that separate the spam emails from the rest. Most commonly you’d abstract these patterns into heuristics that should work on future emails as well. You’d add clever tricks for the edge cases and try to push the accuracy up.

This manually derived, hard-coded program will only be as good as the programmer’s ability to understand the vital differences between spam and non-spam emails. And what will eventually haunt you is the maintenance nightmare: every new kind of spam means another rule to write and test by hand.
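To make the hand-written approach concrete, here is a minimal sketch in Python. The phrase list, the threshold and the name is_spam_by_rules are all made up for illustration; they are assumptions, not rules taken from any real filter.

```python
# Hypothetical hand-written heuristics: the phrases and the threshold
# are illustrative assumptions, not rules from a real spam filter.
SPAM_PHRASES = ("free money", "click here", "winner")


def is_spam_by_rules(email_text: str, threshold: int = 2) -> bool:
    """Flag an email as spam when enough hard-coded phrases appear in it."""
    text = email_text.lower()
    hits = sum(1 for phrase in SPAM_PHRASES if phrase in text)
    return hits >= threshold


if __name__ == "__main__":
    print(is_spam_by_rules("You are a WINNER! Click here for free money"))
    print(is_spam_by_rules("Meeting moved to 3pm"))
```

Every new spam trick means editing SPAM_PHRASES or the threshold by hand, which is where the maintenance pain comes from.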

I know the programmer inside you must be shouting at this point: “Automation! Automation!”

Let’s look at the above example in machine learning terms:
The experience (E) is the set of example emails we have collected.
The task (T) is a decision problem (classification): deciding whether an email is spam or not and placing it in the correct folder.
The performance measure (P) is accuracy, for example the percentage of emails that are classified correctly.
Applying a machine learning algorithm (to be discussed in upcoming articles) to these examples then produces a model (the learned heuristics) that works on new emails, and that is the basic route to automation.
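The E, T, P breakdown above can be sketched in a few lines of Python. This is only a toy: I have picked a tiny naive Bayes word-count classifier as the algorithm, and the four example emails, the labels "spam" and "ham" and the function names train, predict and accuracy are my own assumptions for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")


def train(examples):
    """Learn per-label word counts from (text, label) pairs (the experience E)."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Classify one email (the task T) with naive Bayes and add-one smoothing.

    Assumes every label appears at least once in the training examples.
    """
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_examples = sum(label_counts.values())
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for word in text.lower().split():
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of examples classified correctly (the performance measure P)."""
    correct = sum(1 for text, label in examples if predict(model, text) == label)
    return correct / len(examples)


# Toy, made-up examples; a real training set would be far larger.
training_set = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training_set)

if __name__ == "__main__":
    print(predict(model, "free money prize"))
    print(accuracy(model, training_set))
```

Notice that nobody wrote a rule about the word "free" here: the counts come from the examples. In practice you would measure accuracy on emails the model has not seen during training, since accuracy on the training examples themselves is overly optimistic.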

Some terms that are used regularly in machine learning:
Preparing the decision program is called training.
The collected examples are called the training set.
The resulting program is referred to as a model.

Now comes the biggest question on your mind:

Where to get started?

I know people have different preferences: some prefer books, while others prefer video tutorials. So, I have made a list of both to get you started.

Books:
Machine Learning by Mitchell
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Hastie, Tibshirani and Friedman
Pattern Recognition and Machine Learning by Bishop
Machine Learning: An Algorithmic Perspective by Marsland

Video Tutorials:
Coursera Machine Learning by Andrew Ng
Intro To Machine Learning by Udacity
Siraj Raval YouTube channel