Machine Learning 101
Once upon a time
The story of machine learning starts with Arthur Samuel. Shortly after WWII, the American scientist became obsessed with the idea of building a program that could defeat the world checkers champion. He gave us one of the earliest definitions of machine learning:
“It is a field of study that gives computers the ability to learn without being explicitly programmed.”
Another major early figure in the world of Artificial Intelligence (AI) is the British scientist Alan Turing. Even though much of his work was classified by the British authorities and only revealed at a later stage, he is now considered a major contributor to the development of AI. He is also famous for his part in breaking Enigma, the cipher used by the Germans to encrypt their communications during WWII. To this day, the Turing test is still used as a reference point for assessing artificial intelligence. Among his many great achievements, the brilliant mathematician is credited with the following statement, which was rather controversial at the time:
“A machine learns when it changes its behavior based on experience using data”
After these early precursors, Artificial Intelligence was further developed, enhanced and applied by tech companies such as IBM.
These advances led to the three types of machine learning we know today:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
Supervised learning :
This type of machine learning is the most common. Here is how it works in practice:
As in any machine learning scenario, there is a learning phase. In supervised learning, the program is fed input parameters together with the expected result, both taken from existing data. The machine makes a prediction, measures its error against the expected result and adjusts its formula to reduce that error a little each time, gradually crafting a more precise model. It can then apply what it has learnt to new data. This learning phase can take a long time, depending on the complexity of the data set and the precision we expect. For instance, spotting a cancer cell with a high degree of certainty will require a long learning time.
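To make the learning phase concrete, here is a minimal sketch in Python. It assumes a tiny made-up data set and a straight-line model, and uses gradient descent as one possible way of "working on the formula" to shrink the error; the data, the function names and the settings (`steps`, `learning_rate`) are illustrative assumptions, not a reference implementation.

```python
# Hypothetical training data: inputs xs with known expected results ys (here y = 2x + 1).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def mean_squared_error(w, b):
    """Average squared gap between the model's predictions and the expected results."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps=2000, learning_rate=0.02):
    """Fit y = w * x + b by gradient descent, recording the error after each step."""
    w, b = 0.0, 0.0
    errors = [mean_squared_error(w, b)]
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Nudge the formula in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        errors.append(mean_squared_error(w, b))
    return w, b, errors


w, b, errors = train()
print(f"learned model: y = {w:.2f} * x + {b:.2f}")
print(f"error before training: {errors[0]:.3f}, after: {errors[-1]:.6f}")
```

Raising `steps` gives a more precise model at the cost of a longer learning phase, which is the trade-off described above.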
Problems can be solved with:
- regression, when the variable to predict is continuous and can take any value within a range (price or demographic trends); linear regression is the simplest example
- classification, when the variable is discrete with a finite number of values (which type of animal is in the photo, is this cell infected or not…).
Unsupervised learning :
In unsupervised learning, the machine is fed a data set and no other information. In contrast to supervised learning, in which specific questions are asked along with their expected answers, unsupervised learning lets us simply ask the question and let the machine work towards a result by grouping data according to similarities and differences. A typical algorithm for this is k-means clustering.
This type of clustering can drive the product suggestions found on many e-commerce websites, such as “People also bought” or “You might also like”. It can also be applied to spot anomalies such as fraud, by setting odd behaviours apart.
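The grouping idea can be sketched in a few lines of Python. The example below is a bare-bones k-means, assuming a made-up list of customers and, as a simplification, starting from the first k points rather than a random draw; the data and function names are hypothetical.

```python
import math

# Hypothetical customers described by (visits per month, average basket in euros).
points = [(2, 15), (25, 110), (3, 20), (22, 95), (1, 12), (24, 105)]


def closest(point, centroids):
    """Index of the centroid nearest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, k, max_iterations=100):
    # Simplification: start from the first k points instead of a random draw.
    centroids = [tuple(map(float, p)) for p in points[:k]]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [closest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if not members:  # keep an empty cluster's centroid where it is
                new_centroids.append(centroids[i])
                continue
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        if new_centroids == centroids:  # nothing moved: the grouping is stable
            break
        centroids = new_centroids
    return labels, centroids


labels, centroids = k_means(points, k=2)
print(labels)
```

No labels are supplied anywhere: the occasional small-basket customers and the frequent big-basket customers end up in separate groups purely because of their similarities.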
Reinforcement learning :
With reinforcement learning, neither questions nor answers are supplied to the machine. It is left alone to experiment in its environment: an agent builds its own experience and assesses the impact of each action on that environment. In this scenario, the algorithm develops a series of actions designed to maximise the number of positive responses it receives, and so to increase its performance.
Take the case of a self-driving car: the program will try different things. If the road is walled in and the car is damaged after turning at a certain angle, it will learn by trial and error which angles are safe, so as to avoid damage the next time.
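The trial-and-error principle can be illustrated with a deliberately simplified Python sketch. It assumes a toy simulated environment in which any turn sharper than a made-up limit damages the car, and an "epsilon-greedy" agent that mostly repeats the best angle found so far but sometimes tries another at random. The angles, the limit, the rewards and the function names are all hypothetical, and a real self-driving system is far more involved.

```python
import random

ANGLES = [10, 20, 30, 40, 50]  # hypothetical steering angles, in degrees
CRASH_ABOVE = 30               # hypothetical limit: sharper turns hit the wall


def drive(angle):
    """Simulated environment: +1 for a safe turn, -1 when the car is damaged."""
    return 1.0 if angle <= CRASH_ABOVE else -1.0


def learn(trials=1000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    value = {a: 0.0 for a in ANGLES}  # estimated reward of each angle
    count = {a: 0 for a in ANGLES}    # how many times each angle was tried
    for _ in range(trials):
        if rng.random() < epsilon:
            angle = rng.choice(ANGLES)          # explore: try something at random
        else:
            angle = max(ANGLES, key=value.get)  # exploit: best angle known so far
        reward = drive(angle)
        count[angle] += 1
        # Running average of the rewards observed for this angle.
        value[angle] += (reward - value[angle]) / count[angle]
    return value


value = learn()
safe_angles = [a for a in ANGLES if value[a] > 0]
print(safe_angles)
```

The agent is never told where the limit is; it only sees the reward that follows each attempt, and the angles it ends up trusting are those whose experiences were positive.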
Choosing the algorithm:
The most widely used algorithms in machine learning include linear regression, k-nearest neighbours (KNN), neural networks and random forests.
To apply these algorithms, data scientists commonly use R, an open source software environment, as well as Python, notably for more complex deep learning tasks. The data science community shares a lot of resources and best practices, and classes are widely available to help you master the technology.
Numerous APIs are available, such as Google Cloud AI, Microsoft Azure Machine Learning, or Amazon’s AWS Machine Learning.
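As an illustration of one of these algorithms, here is a minimal KNN classifier in plain Python. It assumes a handful of made-up labelled animals described by weight and height; the data and the `knn_predict` helper are hypothetical, and in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter

# Hypothetical labelled examples: (weight in kg, height in cm) -> animal.
training_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((3.5, 24.0), "cat"),
    ((30.0, 60.0), "dog"),
    ((25.0, 55.0), "dog"),
    ((35.0, 65.0), "dog"),
]


def knn_predict(sample, data, k=3):
    """Label a sample by majority vote among its k nearest labelled examples."""
    neighbours = sorted(data, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


print(knn_predict((4.5, 26.0), training_data))   # prints "cat"
print(knn_predict((28.0, 58.0), training_data))  # prints "dog"
```

KNN has no real learning phase: it simply keeps the labelled examples and compares each new sample against them, which makes it easy to understand but slow on large data sets.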
Beyond Machine Learning :
Deep Learning is a subfield of Machine Learning. As companies are able to gather and store huge volumes of data, Deep Learning becomes more and more useful, as it specialises in extremely complex environments and Big Data. On some tasks, it aims to match or even surpass human capabilities.
Be it in medical research, advertising or digital marketing, AI is growing in power and in applications, offering new tools to researchers and marketers. At the crossroads of these worlds, 14eight has made Artificial Intelligence one of its main areas of focus for 2020, and already has a few projects in the pipeline!
Want to know more? Got a question, or a project you would like to get started? Get in touch with us!
- Banner illustration: https://icones8.fr