While the term AI is being used in an ever-growing number of contexts, and it's becoming increasingly difficult to define, I propose we study one of its foundational technologies: the neural network.
The neural network is used in Machine Learning and Deep Learning, in algorithms whose purpose is to learn. The goal is for the algorithm to find patterns and draw generalizations from a large dataset. These patterns and generalizations allow the algorithm to perform tasks without having received specific instructions from a human beforehand.
We're going to study how a neural network can be used to predict whether an animal is a cat or a dog based on its fur length and its height.
The dataset we'll use is fictional and has 3 columns, as you can see in the following table:
| Fur Length | Height | Animal |
|---|---|---|
| 5.25 | 3.15 | cat |
| 2.76 | 6.66 | dog |
| 3.23 | 6.55 | dog |
| 3.87 | 1.62 | cat |
| 4.21 | 3.94 | cat |
| … | … | … |
The goal of this application is to allow the computer to predict whether the animal in question is a dog or a cat, knowing only its fur length and height.
The dataset will be easier to understand if we represent it on a graph:
[Figure: graph visualizing the fictional dataset we're going to study]
When we look at the graph, for us humans, it's easy to see that dogs and cats are separated into 2 groups, with a clear boundary between them. Thanks to our visual abilities and graph-reading skills, solving this problem is trivial. But how does a machine do it? With a neural network.
For now, let's consider the neural network as a black box. We'll open it up in a moment to look inside.
Here's what we expect from the neural network: when we give it an animal's fur length and height, it tells us whether it's a cat or a dog.
The neural network works with numbers, so we define its inputs and output as follows: the input x1 is the animal's fur length, the input x2 is its height, and the output r is the number the network computes.
The network's response r is a number, but our goal is for it to recognize dogs and cats. So we decide that if r >= 0, the network's answer is "cat," and if r < 0, the answer is "dog."
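This decision rule can be written as a one-line function (a minimal sketch; the name `classify` is mine, not from the article):

```python
def classify(r: float) -> str:
    """Map the network's numeric output r to a label."""
    # Convention chosen above: r >= 0 means "cat", r < 0 means "dog".
    return "cat" if r >= 0 else "dog"

print(classify(3.0))   # a non-negative output is classified as "cat"
print(classify(-1.5))  # a negative output is classified as "dog"
```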
Now that we know what the neural network needs as input and what we want it to output, let's open the black box to see what's inside.
There are fewer things than expected in this black box. That's because we're going to use a perceptron, a historic model at the origin of the machine learning research field. The perceptron can be thought of as a neural network with just one neuron.
Inside the box, we see 2 things: the weights (w0, w1, and w2) and the neuron that combines them.
The learning part of the algorithm happens at the weight level. The weights are initialized to arbitrary values, often random. For our application, we'll set them all to 1:
w0 = w1 = w2 = 1
The neuron will use x1, x2, w0, w1, and w2 to calculate r. Don't worry, the calculation is simple:
r = w0 + w1 * x1 + w2 * x2
In plain language:
"The neuron sums its inputs (xi) multiplied by their associated weights (wi). And w0, the bias, is a weight that isn't associated with any input."
So with the current weight values:
r = 1 + 1 * x1 + 1 * x2
r = 1 + x1 + x2
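The neuron's calculation above can be sketched in a few lines of Python (the function name and defaults are mine, not from the article):

```python
def neuron_output(x1: float, x2: float,
                  w0: float = 1.0, w1: float = 1.0, w2: float = 1.0) -> float:
    """Weighted sum of the inputs plus the bias w0."""
    return w0 + w1 * x1 + w2 * x2

# With all weights set to 1, r = 1 + x1 + x2:
print(neuron_output(8, 5))  # → 14.0
```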
Now that we've defined the weights and know how to calculate r, we can make our first predictions! They'll probably be wrong, because we gave the weights w arbitrary values.
We'll use the following animals to make our predictions:
[Image: a Maine Coon cat, a Doberman dog, a long-haired Chihuahua dog, and a Munchkin cat]
This means we expect the neural network to give the following results:
| Animal | Fur Length (x1) | Height (x2) | Expected Result (r) |
|---|---|---|---|
| Maine Coon Cat | 8 | 5 | >= 0 (cat) |
| Doberman Dog | 1 | 6 | < 0 (dog) |
| Chihuahua Dog | 2 | 5 | < 0 (dog) |
| Munchkin Cat | 4 | 2 | >= 0 (cat) |
Let's check whether with all weights w equal to 1, the neural network's predictions are correct:
| Animal | Calculation | Result | Correct? |
|---|---|---|---|
| Maine Coon Cat | r = 1 + 8 + 5 | 14 | 14 >= 0 => "cat": correct |
| Doberman Dog | r = 1 + 1 + 6 | 8 | 8 >= 0 => "cat": incorrect |
| Chihuahua Dog | r = 1 + 2 + 5 | 8 | 8 >= 0 => "cat": incorrect |
| Munchkin Cat | r = 1 + 4 + 2 | 7 | 7 >= 0 => "cat": correct |
We can see that for now, r is always greater than 0, so the neural network always answers "cat." This is not the expected result.
This is where the learning part comes in, which is the real strength of a neural network. Learning consists of finding weights w that let the network give the correct answer as often as possible. In practice, it means testing the network many times and slightly adjusting the weights each time it makes a mistake, guided by algorithms that identify which weights contribute most to the error.
This is a lengthy process that a computer performs using highly optimized matrix calculations. In our application, we'll take more of a trial-and-error approach.
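Rather than adjusting the weights by hand, the classic perceptron learning rule automates this trial-and-error: after each wrong prediction, each weight is nudged in the direction that reduces the error. A minimal sketch (the encoding of labels as +1/-1, the learning rate, and the epoch count are my illustrative choices, not from the article):

```python
# Four labeled examples: (x1, x2, target), with target +1 for "cat", -1 for "dog".
data = [(8, 5, 1), (1, 6, -1), (2, 5, -1), (4, 2, 1)]

w0, w1, w2 = 1.0, 1.0, 1.0  # same arbitrary starting weights as in the text
lr = 0.1                    # learning rate: how strongly each mistake moves the weights

for epoch in range(100):
    errors = 0
    for x1, x2, target in data:
        r = w0 + w1 * x1 + w2 * x2
        predicted = 1 if r >= 0 else -1
        if predicted != target:
            # Perceptron update rule: shift the weights toward the correct answer.
            w0 += lr * target
            w1 += lr * target * x1
            w2 += lr * target * x2
            errors += 1
    if errors == 0:  # every example classified correctly: stop early
        break

print(w0, w1, w2)  # weights that separate the two groups
```

Because the two groups are linearly separable, the perceptron convergence theorem guarantees this loop eventually stops making mistakes.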
I propose using the following values:
w0 = 1
w1 = 1
w2 = -1
This makes the neural network's calculation:
r = w0 + w1 * x1 + w2 * x2
r = 1 + 1 * x1 - 1 * x2
r = 1 + x1 - x2
Let's test the neural network again with these new weights:
| Animal | Calculation | Result | Correct? |
|---|---|---|---|
| Maine Coon Cat | r = 1 + 8 - 5 | 4 | 4 >= 0 => "cat": correct |
| Doberman Dog | r = 1 + 1 - 6 | -4 | -4 < 0 => "dog": correct |
| Chihuahua Dog | r = 1 + 2 - 5 | -2 | -2 < 0 => "dog": correct |
| Munchkin Cat | r = 1 + 4 - 2 | 3 | 3 >= 0 => "cat": correct |
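These results can be confirmed programmatically (the variable names and dictionary layout are mine):

```python
# Weights found by trial and error: w0 = 1, w1 = 1, w2 = -1.
w0, w1, w2 = 1, 1, -1

# Each animal: (fur length x1, height x2, expected label).
animals = {
    "Maine Coon Cat": (8, 5, "cat"),
    "Doberman Dog": (1, 6, "dog"),
    "Chihuahua Dog": (2, 5, "dog"),
    "Munchkin Cat": (4, 2, "cat"),
}

for name, (x1, x2, expected) in animals.items():
    r = w0 + w1 * x1 + w2 * x2
    prediction = "cat" if r >= 0 else "dog"
    print(f"{name}: r = {r}, prediction = {prediction}, expected = {expected}")
```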
All results are correct! Because our problem is particularly simple, we were able to find the right weight values very quickly.
Since we have a simple problem solved by a very simple neural network, a brief mathematical derivation shows that the neuron's output r is actually proportional to the signed distance of each point to the line with equation y = (-w1/w2)*x + (-w0/w2) = x + 1.
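To make the geometry concrete: setting r = 0 gives the decision boundary w0 + w1*x1 + w2*x2 = 0, which with our weights is the line x2 = x1 + 1, and dividing r by the norm of (w1, w2) gives the signed distance of a point to that line. A short check (a sketch with my own function name, not from the article):

```python
import math

w0, w1, w2 = 1, 1, -1  # the weights found above

def signed_distance(x1: float, x2: float) -> float:
    """Signed distance from (x1, x2) to the line w0 + w1*x1 + w2*x2 = 0."""
    return (w0 + w1 * x1 + w2 * x2) / math.hypot(w1, w2)

# A point on the line x2 = x1 + 1 is at distance 0:
print(signed_distance(3, 4))  # → 0.0
# The Maine Coon cat (8, 5) lies on the positive ("cat") side:
print(signed_distance(8, 5))
```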
As for whether this is the right solution? There are actually multiple solutions: several different lines can separate the 2 groups of points. A dataset with more points near the boundary between the 2 groups would constrain the choice and lead to a better solution.
In this application, we used a neural network for binary classification. Here are the key takeaways:
Once you're familiar with these concepts, imagining what happens inside a neural network becomes much more approachable. Even if a neural network is large, with multiple layers, the behavior of each neuron is the same as what we applied in our model.
Animation by 3Blue1Brown: https://youtu.be/aircAruvnKk
This animation represents a neural network that can read handwritten digits. It has 4 layers: the first is the input layer, the last is the output layer. Each line connecting 2 neurons represents a weight, and there are 13,002 of them in total in this neural network. The animation shows that we perform the neuron calculations in order, layer by layer, until we reach the output layer which expresses the result.
If, at the end of this article, you can watch the animation above without feeling like it's black magic, then I've achieved my goal. Image generation models and LLMs like ChatGPT are obviously much more complex and difficult to understand. But you can still tell yourself that even these complicated models are a "simple" succession of calculations performed by "neurons," and that the quality of these models depends on the quality of their training... Just like our little model that predicts whether an animal is a dog or a cat.
A former Rebooter who previously worked at Météo-France, Oscar marked his time at Reboot with his unifying energy: organizing AI meetups, running Twitch quizzes, and energizing the Strasbourg tech community. A technical profile paired with a real talent for bringing people together around AI.