If you're wondering "What is Machine Learning?" or "What are the key concepts behind Machine Learning?", then this article is for you. In just 5 minutes, we'll explain the basics so you can (finally) understand what's behind Machine Learning.
What is Machine Learning?
Machine Learning is a type of artificial intelligence (AI) that enables software applications to become more accurate at predicting outcomes without being explicitly programmed to do so.
Machine Learning algorithms use historical data as input to predict new output values. Consider, for example, recommendation engines such as those used by social networks. These are a common use case for Machine Learning. In fact, there are many applications and use cases:
- fraud detection
- spam filtering
- detection of malware threats
- business process automation (BPA)
- robotic process automation (RPA)
- predictive maintenance
Why are we talking so much about Machine Learning?
Machine Learning already matters, and it is set to become fundamental in the coming years as AI spreads. It gives companies a view of trends in customer behavior and in how their business operates, while supporting the development of new products.
For example, many of today's leading companies, such as Facebook, Google and Uber, are making Machine Learning central to improving the experience they offer:
- display the most relevant and targeted content;
- improve the performance of their advertising or recommendation system;
- optimize resource allocation according to demand.
That's why adopting Machine Learning can turn into a real competitive advantage. We're talking here about potential cost savings, optimized customer/user experience, and time savings in operational processes.
The starting point: algorithms
As human beings, we learn from past experiences. We use our senses to obtain these "experiences" and later use them to survive. Machines, historically, have not learned this way: they follow instructions provided by humans. These sets of instructions are called algorithms.
An algorithm is simply a set of rules that a computer is able to follow. Think of the way we learned to do long division: maybe you learned to divide the first digits of the dividend by the divisor, subtract the partial product, bring down the next digit, and continue until only a remainder is left. That procedure is an algorithm, and it's the kind of thing we can program into a computer, which can do these calculations much, much faster than we can.
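To make the idea concrete, here is a minimal sketch of schoolbook long division written as a program. The function name `long_division` is ours, and the sketch assumes a non-negative dividend and a positive divisor.

```python
def long_division(dividend: int, divisor: int) -> tuple[int, int]:
    """Schoolbook long division; returns (quotient, remainder)."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("this sketch handles dividend >= 0 and divisor > 0 only")

    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        # "Bring down" the next digit of the dividend.
        remainder = remainder * 10 + int(digit)
        # How many times does the divisor fit into the running value?
        q = remainder // divisor
        quotient_digits.append(str(q))
        # Subtract what was used and carry the rest forward.
        remainder -= q * divisor

    return int("".join(quotient_digits)), remainder


print(long_division(7425, 6))  # (1237, 3)
```

Every step is a fixed rule written by a human; nothing here is learned from data, which is exactly what separates a classic algorithm from Machine Learning.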
Classify, predict, group with Machine Learning
With Machine Learning, the objective is usually either prediction or clustering.
Prediction is a process where, from a set of input variables, we estimate the value of an output variable. This technique applies to labeled data, that is, data in which each input comes with its known output.
This is called Supervised Learning. For example, using a set of characteristics of a house, we can estimate its selling price. Clustering, by contrast, groups similar data points together without any labels. More generally, Machine Learning can be classified into several types, which we describe below.
Supervised learning
For this type of Machine Learning, Data Scientists provide the algorithms with labeled training data and define the variables they want the algorithm to evaluate for correlations.
The input and output of the algorithm are specified.
Unsupervised learning
This type of Machine Learning involves algorithms that train on unlabeled data. The algorithm scans data sets for any significant connections.
Unlike in supervised learning, no expected output is supplied: the algorithm receives only the input data and has to find structure in it on its own.
Semi-supervised learning
This approach to Machine Learning involves a mixture of the two previous types: supervised and unsupervised learning.
Data Scientists can provide an algorithm with partly labeled training data, but the model is also free to explore the data on its own and develop its own understanding of the dataset.
Reinforcement learning
Data Scientists typically use reinforcement learning to teach a machine to perform a multi-step process for which there are clearly defined rules.
Data Scientists program an algorithm to perform a task, giving it positive or negative cues as it determines how to accomplish a task. But in most cases, the algorithm will decide for itself which steps to take along the way.
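As an illustration, the sketch below uses tabular Q-learning, one common reinforcement learning technique (the article does not name a specific one). Everything in it is a made-up toy: an agent walks along a corridor of five positions, receives a positive cue (+1) when it reaches the goal and a small negative cue (-0.01) for every other step. The environment, the constants and the function names are all assumptions for the example.

```python
import random

N_STATES = 5          # positions 0..4 in a corridor; 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True     # positive cue: goal reached
    return next_state, -0.01, False      # small negative cue: wasted step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Q-table: estimated long-term value of each (state, action) pair.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = train()
# Best learned action for each non-goal position.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent to "go right"; it only receives the cues and works out for itself which steps lead to the goal.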
What are the most common types of Machine Learning algorithms?
With these four main groups established, it becomes clear that everything starts with the algorithm and how it is defined. Now let's take a look at the most frequently used algorithms.
Linear regression
This is a supervised learning algorithm used to predict a continuous output value (e.g. the price of a house) based on one or more input characteristics (e.g. the size of the house).
It assumes that the relationship between input characteristics and output is linear, meaning that the change in output is proportional to the change in input.
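Here is a minimal sketch of linear regression with a single input feature, fitted with the ordinary least-squares formulas. The house sizes and prices are invented for the example, and `fit_line` is a name chosen here, not a library function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: size in square metres, price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept   # estimate for a 100 m² house
print(slope, intercept, predicted_price)
```

Because the toy prices are exactly three times the size, the fitted line has a slope of 3 and an intercept of 0; real data would be noisier and the line would only approximate it.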
Logistic regression
This is a supervised learning algorithm used for classification tasks, where the aim is to predict a discrete output label (e.g. spam or non-spam).
It is similar to linear regression, but applies a sigmoid function to the output to map predicted values to probabilities between 0 and 1.
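The prediction step can be sketched as follows. The weights and bias are hypothetical values picked by hand for the example; in practice they would be learned from labeled data. The two features (number of links, number of exclamation marks) are also assumptions.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    # Same weighted sum as linear regression...
    z = bias + sum(w * x for w, x in zip(weights, features))
    # ...then squashed into a probability.
    return sigmoid(z)


# Hypothetical model: features are (links, exclamation marks).
weights = [1.5, 0.8]
bias = -3.0

p_spam = predict_proba([3, 2], weights, bias)
label = "spam" if p_spam > 0.5 else "non-spam"
print(round(p_spam, 3), label)
```

The 0.5 threshold is a common default, not a rule; it can be moved depending on whether false alarms or missed spam are more costly.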
Decision trees
This is a supervised learning algorithm used for classification and regression tasks. It works by creating a tree model of decisions. Each internal node represents a decision based on the value of an input feature, each branch represents one outcome of that decision, and each leaf represents a prediction.
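To show how such a tree is read, here is a small hand-written tree stored as nested dictionaries, with a function that walks it from the root to a leaf. The features, thresholds and labels are invented; a real algorithm would learn them from training data.

```python
# Internal nodes test one feature; leaves hold the prediction.
tree = {
    "feature": "contains_link",
    "threshold": 0.5,
    "left": {"prediction": "non-spam"},            # contains_link <= 0.5
    "right": {                                     # contains_link > 0.5
        "feature": "exclamation_marks",
        "threshold": 2.5,
        "left": {"prediction": "non-spam"},
        "right": {"prediction": "spam"},
    },
}


def predict(node, sample):
    """Follow the branches until a leaf is reached."""
    while "prediction" not in node:
        goes_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["prediction"]


print(predict(tree, {"contains_link": 1, "exclamation_marks": 4}))
print(predict(tree, {"contains_link": 0, "exclamation_marks": 9}))
```

Each prediction is the result of a short chain of yes/no questions, which is why decision trees are often considered easy to explain.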
K-means clustering
This is an unsupervised learning algorithm used for clustering tasks, where the aim is to group together similar data points.
It works by randomly selecting a fixed number of "centroids", then assigning each data point to the nearest centroid, based on Euclidean distance. The algorithm then updates the centroids according to the average of the assigned data points, and repeats this process until convergence is achieved.
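The loop described above can be sketched in plain Python. The data points, the fixed random seed and the rule for empty clusters (keep the previous centroid) are assumptions made for this example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def mean(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # random initial centroids
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the average of its points.
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            new_centroids.append(mean(members) if members else centroids[i])
        if new_centroids == centroids:           # convergence: nothing moved
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.3, 7.9), (7.9, 8.2)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Note that the number of clusters `k` has to be chosen up front, and that a different random start can lead to a different result on less clear-cut data.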
Naive Bayes classification
Naive Bayes classification is a simple probabilistic classification method based on Bayes' theorem, with strong (naive) independence assumptions.
This is a supervised learning algorithm used for classification tasks. It assumes that, within a given class, the presence (or absence) of a particular feature is unrelated to the presence (or absence) of any other feature. This assumption is called class-conditional independence.
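Below is a minimal sketch of one variant, Bernoulli Naive Bayes, where each feature is simply "the word is present or not". The tiny training set is invented, and the add-one (Laplace) smoothing is a standard choice that we assume here to avoid zero probabilities.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (set_of_words, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)   # word_counts[label][word] = number of docs
    vocab = set()
    for words, label in docs:
        vocab |= words
        for w in words:
            word_counts[label][w] += 1
    return class_counts, word_counts, vocab


def predict(model, words):
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        log_p = math.log(n / total)                       # prior P(class)
        for w in vocab:
            # Smoothed P(word present | class); each word is treated
            # independently of the others: the "naive" assumption.
            p = (word_counts[label][w] + 1) / (n + 2)
            log_p += math.log(p if w in words else 1 - p)
        scores[label] = log_p
    return max(scores, key=scores.get)


docs = [
    ({"win", "money", "now"}, "spam"),
    ({"free", "money"}, "spam"),
    ({"meeting", "tomorrow"}, "non-spam"),
    ({"lunch", "tomorrow", "now"}, "non-spam"),
]
model = train(docs)
print(predict(model, {"free", "money", "now"}))
print(predict(model, {"meeting", "tomorrow"}))
```

Working with logarithms instead of multiplying raw probabilities is a practical detail: it avoids numbers that become too small to represent when there are many features.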
Some concrete uses of Machine Learning
Content recommendation algorithms
Today, Machine Learning can be used in a wide range of applications. One of the best-known examples of Machine Learning (which we all use) is the recommendation engine that powers the news feeds of social networks like Facebook, Instagram or TikTok.
Machine Learning is used here to personalize the news feed, deciding which content is displayed and in what order.
If a user frequently stops to read posts from a particular group or watch a particular type of video, the recommendation engine will start to display more activity from that group earlier in the feed, and more content of that type.
Behind the scenes, the engine tries to reinforce known patterns in the user's online behavior. If the user changes pattern and stops reading posts from that group in the coming weeks, the news feed will adjust accordingly.
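The behavior described here can be imitated with a deliberately simple toy. This is not how any real social network ranks content; the class `ToyFeedRanker`, the weekly decay factor and the scoring rule are all assumptions chosen to show the "reinforce, then adjust" effect.

```python
from collections import defaultdict


class ToyFeedRanker:
    """Scores sources by recent engagement; old engagement fades each week."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.scores = defaultdict(float)

    def record_read(self, source):
        # Each read reinforces the source's score.
        self.scores[source] += 1.0

    def end_of_week(self):
        # Past engagement counts for less as time goes by.
        for source in self.scores:
            self.scores[source] *= self.decay

    def rank(self, sources):
        # Highest score first; unknown sources score 0.
        return sorted(sources, key=lambda s: self.scores.get(s, 0.0), reverse=True)


ranker = ToyFeedRanker()
for _ in range(3):
    ranker.record_read("cycling club")
print(ranker.rank(["news", "cycling club"]))   # the club comes first

# The user loses interest in the club and reads news instead.
for _ in range(3):
    ranker.end_of_week()
    ranker.record_read("news")
print(ranker.rank(["news", "cycling club"]))   # news now comes first
```

Real recommendation engines learn from far richer signals, but the principle is the same: recent behavior pushes content up, and a change in behavior gradually pushes it back down.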
Use cases for Marketing, IS, HR...
In addition to recommendation engines, other uses of machine learning include the following:
- Customer relationship management. CRM software can use machine learning models to analyze e-mails and prompt sales team members to respond first to the most important messages. More advanced systems can even recommend potentially effective responses.
- Business intelligence. BI and analytics vendors use machine learning in their software to identify potentially important data points, patterns of data points and anomalies.
- Human Resources Information Systems. HRIS can use machine learning models to filter applications and identify the best candidates for a vacancy.
- Autonomous cars. Machine learning algorithms can even enable a semi-autonomous car to recognize a partially visible object and alert the driver.
- Virtual assistants. Intelligent assistants typically combine supervised and unsupervised machine learning models to interpret natural speech and provide context.