When it comes to creating artificial intelligence, several approaches have been attempted to build an "intelligent" algorithm. However, if we look at recent developments, all the models we encounter are derived from one main paradigm: Machine Learning (often abbreviated as ML). Whether we're talking about LLMs, neural networks, or even simpler models, they all follow the principles of Machine Learning.
But what are these principles? How does a Machine Learning model work? That's what this article aims to highlight.
Before diving into the topic, I'd like to clarify a few key concepts.
Before getting into what makes a Machine Learning algorithm special, here's a brief reminder of what we generally expect from one. We want this program, given input data, to correctly predict a value: the target. In other words, we want our model to generalize rules that apply to all the data it's likely to encounter, so it can accurately predict the target. All ML problems are variations of this challenge; what changes most often is what the predicted value represents and its format (number, text, image, etc.).
A Machine Learning algorithm is something quite particular in the world of algorithms. Typically, the most common analogy for explaining an algorithm is a cooking recipe. You have ingredients, which represent the algorithm's inputs, and steps to follow, which represent the algorithm's rules. With these two elements, you produce the final dish, which represents the algorithm's output -- what we're trying to automate in the case of a computer program.
Simplified diagram of a traditional program
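To ground the recipe analogy, here is a minimal sketch of a traditional program in Python. The scenario and thresholds are made up purely for illustration; the point is that the programmer writes every rule by hand, and only the output is computed automatically.

```python
# A "traditional" program: the programmer writes the rules by hand.
# Hypothetical scenario: deciding whether to water a plant.
def should_water(soil_moisture: float, rained_today: bool) -> bool:
    # Rule 1: never water right after rain.
    if rained_today:
        return False
    # Rule 2: water only when the soil is dry.
    return soil_moisture < 0.3

# Inputs (the "ingredients") go in, the hand-written rules produce the output.
print(should_water(soil_moisture=0.2, rained_today=False))  # True
```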
A Machine Learning program works in reverse compared to a traditional program: you give it input data, possibly the expected results, and it figures out on its own which rules transform the inputs into outputs -- or, if no expected results are provided, it tries to find commonalities among the data.
Diagram of a Machine Learning program
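By contrast, here is a minimal sketch of the same idea as a Machine Learning program, using scikit-learn and a tiny toy dataset invented for illustration: we supply the inputs and the expected results, and the rules are inferred during training.

```python
# A Machine Learning program: we provide inputs and expected results,
# and the algorithm infers the transformation rules during training.
from sklearn.tree import DecisionTreeClassifier

# Toy data invented for illustration; a real dataset would be much larger.
X = [[0.2, 0.1], [0.8, 0.9], [0.3, 0.2], [0.9, 0.8]]  # input data
y = [0, 1, 0, 1]                                      # expected results

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)                       # training: the rules are learned here
print(model.predict([[0.25, 0.15]]))  # prediction on data the model hasn't seen
```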
This mechanism of finding rules based on inputs is what we call learning, or training, and it's the common thread across every Machine Learning algorithm.
From there, one of the prerequisites of this field is to build a dataset that can be fed to the algorithm to generate these rules and ensure that the resulting Machine Learning model is sufficiently performant for the task we want to accomplish.
Now that we've covered the general principle of Machine Learning, the question is: what strategies exist for teaching a machine? There are several, and I'll present three of the most common ones here:

- Supervised learning: the algorithm receives both the input data and the expected results (labels), and learns the rules that map one to the other.
- Unsupervised learning: the algorithm receives only input data, with no expected results, and tries to find commonalities or structure among the data on its own.
- Reinforcement learning: the algorithm learns by trial and error, interacting with an environment and adjusting its behavior according to the rewards or penalties it receives.
There are other types of learning, but these are the ones I find most notable and easiest to remember.
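To make the distinction concrete, here is a minimal sketch contrasting learning from labeled data with learning from unlabeled data, again with scikit-learn and toy numbers invented for the example.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

X = [[1.0], [1.2], [4.8], [5.1]]  # toy inputs for illustration

# With labels: the expected results guide the learning.
y = [0, 0, 1, 1]
classifier = LogisticRegression().fit(X, y)
print(classifier.predict([[1.1]]))   # -> [0]

# Without labels: the algorithm looks for commonalities on its own.
clustering = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(clustering.labels_)            # groups found without any labels
```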
Now that we've covered the main ML mechanisms in fairly general terms, let's look at the main obstacles you encounter when doing Machine Learning. If I had to sum up the majority of these obstacles in one word, it would be "bias": ML models can be affected by biases stemming from a variety of factors.
One of the first biases you can encounter arises when measuring your model's performance. To make sure the model has truly generalized the rules it was supposed to learn, you measure the model's score on data it hasn't seen before -- test data. The risk of measuring performance on data the model already saw during training is that the model may have memorized the data, meaning it fails to generalize. That's why the dataset is split into training and test data (typically 70-80% for training and 20-30% for testing).
This problem is tied to the method used to measure the model's performance.
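As a sketch of this measurement method, here is how a dataset is typically split with scikit-learn's train_test_split; the synthetic data and the 80/20 split are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real dataset.
X, y = make_classification(n_samples=1000, random_state=0)

# Keep 20% of the data aside; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The score that matters is the one measured on the unseen test data.
print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```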
Overfitting and underfitting are problems related to the compatibility between the chosen ML algorithm and the dataset.
Overfitting is a problem where the model fails to generalize and instead memorizes the data. The main indicator of overfitting is that the model's score measured on training data is significantly higher than the score measured on test data. This means the model is too complex for the problem at hand and needs to be simplified. It can also be a consequence of a dataset that's too small.
Underfitting, on the other hand, is a problem where the model fails to capture and adapt to the complexity of the dataset. An indicator of this is that the model's score measured on both training and test data is very low. The algorithm needs to be adjusted so it better captures the data's complexity. Like overfitting, this can also result from a dataset that's too small.
Example of underfitting and overfitting
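To illustrate how these two problems show up in practice, here is a small sketch that compares training and test scores for models of increasing complexity; tree depth stands in for complexity, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_depth limits how complex the tree can become.
for depth in (1, 5, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train score={tree.score(X_train, y_train):.2f}, "
          f"test score={tree.score(X_test, y_test):.2f}")

# Low scores on both sets point to underfitting; a high training score
# combined with a much lower test score points to overfitting.
```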
Another area requiring attention is the composition of the dataset. There are all kinds of biases related to the representativeness of different types of data in the initial dataset. For example, when building a facial recognition model, the algorithm must recognize all types of faces, regardless of the person's ethnicity, whether they wear glasses, their hair color, etc. And for that, the training dataset must contain images of people with varied characteristics, with no population being over-represented.
Here, I used the example of facial recognition, but this problem can arise with other types of data as well.
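One simple way to start checking representativeness is to look at how each attribute's values are distributed in the dataset. In the sketch below, the column names and values are hypothetical and only serve to illustrate the idea.

```python
import pandas as pd

# Hypothetical metadata about the images in a face dataset;
# the columns and values are made up for illustration.
faces = pd.DataFrame({
    "wears_glasses": [False, False, True, False, False, False],
    "hair_color": ["brown", "brown", "black", "brown", "brown", "blond"],
})

# Proportion of each value per attribute: a strong imbalance
# signals that some groups may be under-represented.
for column in faces.columns:
    print(faces[column].value_counts(normalize=True), end="\n\n")
```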
ML models are particular computer programs that try to infer rules from the data fed to them as input. There are several ways to train a model, depending on whether our data is labeled or not. There are also several obstacles and biases to avoid. In future articles, we'll revisit these approaches and see how, in practice, you create a model and navigate these various biases.
A pillar of Lamalo, Yohann combines technical expertise with a gift for teaching. An architect at heart and a talented developer, he brings his energy and skills to the scale-up Lamalo. Ever the educator, he is always willing to share his knowledge.