Imagine you had to explain a complex domain, like the basics of contract law, to a computer. You wouldn't just list random laws. You would define key concepts (like 'Contract', 'Contracting Party', 'Obligation') and the relationships that connect them (a 'Contracting Party' signs a 'Contract', a 'Contract' creates an 'Obligation').
An ontology is precisely that: a formal, structured way of representing the knowledge of a specific domain. It explicitly defines:
Think of it as creating an ultra-detailed dictionary and relational map for a given subject (in this case, law), designed so that a computer can understand and use it. It provides a shared vocabulary and framework, ensuring that everyone (whether flesh-and-blood or silicon-based AI) is talking about the same thing, in the same way, without ambiguity.
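To make the idea concrete, here is a minimal sketch of the contract-law example as plain Python data. The names `concepts`, `relations`, and `related` are illustrative assumptions, not part of any ontology library; a real ontology adds formal semantics on top of this basic shape.

```python
# Concepts (classes) of the toy contract-law domain used in the example above.
concepts = {"Contract", "ContractingParty", "Obligation"}

# Relationships stored as (subject, verb, object) triples between concepts.
relations = [
    ("ContractingParty", "signs", "Contract"),
    ("Contract", "creates", "Obligation"),
]


def related(subject: str) -> list[tuple[str, str]]:
    """Return the (verb, object) pairs that start from the given concept."""
    return [(verb, obj) for subj, verb, obj in relations if subj == subject]


# Every relationship must connect two declared concepts.
assert all(s in concepts and o in concepts for s, _, o in relations)

print(related("Contract"))  # [('creates', 'Obligation')]
```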
Even before the current AI boom, ontologies were valuable because they enable:
Large Language Models (LLMs), such as ChatGPT, Claude, or Gemini, are incredibly good at understanding and generating text. However, they also have weaknesses:
This is where ontologies become extremely useful partners for LLMs:
In short, ontologies bring the structure, consistency, and factual foundation that can make LLMs more reliable, accurate, and trustworthy, especially when applied to complex and specialized tasks (such as in law) within automated systems. They bridge the gap between the broad language understanding of LLMs and the deep, structured knowledge required for many real-world applications, knowledge that is always hard-won by very human experts.
Let's stay in the legal domain: imagine we want to represent legal reasoning through ontologies based on rulings from the Cour de cassation (France's highest court of appeal). Here are the development steps we would need to follow:
Here we use the langchain-community package, which provides ready-to-use methods for reading and extracting text from each page of a PDF document.
For our ontology, we decided it was relevant to extract the following from a ruling:
The idea here is to:
Let's start with the prompt. The goal is to reuse the fields from our model (CourtRulingAnalysis), but also to guide the LLM further by specifying that we're interested in extracting generic facts and reasoning rather than the specifics of case X or Y.
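As a rough sketch of that idea, the snippet below derives the prompt's field list from the model itself, so the two cannot drift apart. It uses a standard-library dataclass as a stand-in for the structured-output model; the field names and types, the `FIELD_HINTS` wording, and the `build_prompt` helper are all assumptions made for illustration, not the article's actual prompt.

```python
from dataclasses import dataclass, fields


@dataclass
class CourtRulingAnalysis:
    """Stand-in for the structured output model; fields are assumed."""
    facts: list[str]
    principles: list[str]
    decision: str
    reasoning: str


# Hypothetical per-field guidance pushing the LLM towards generic wording.
FIELD_HINTS = {
    "facts": "the facts, stated generically (no names, dates or amounts)",
    "principles": "the legal principles the court relies on",
    "decision": "the outcome of the ruling",
    "reasoning": "the reasoning linking facts and principles to the decision",
}


def build_prompt(ruling_text: str) -> str:
    """Build a prompt that lists every model field with its guidance."""
    field_lines = "\n".join(
        f"- {f.name}: {FIELD_HINTS[f.name]}" for f in fields(CourtRulingAnalysis)
    )
    return (
        "You are analysing a ruling from the Cour de cassation.\n"
        "Extract the following fields in generic terms, "
        "not the specifics of this particular case:\n"
        f"{field_lines}\n\n"
        f"Ruling:\n{ruling_text}"
    )
```

Deriving the list with `fields()` means that adding a field to the model without a matching hint fails immediately with a `KeyError` instead of silently producing an incomplete prompt.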
This function creates a LangChain chain that applies both the prompt we wrote and the output formatting when calling the LLM. The format_docs function you see simply concatenates the PDF pages of the ruling so that the context isn't fragmented.
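A minimal sketch of such a concatenation helper follows. The `Page` class is a hypothetical stand-in so the example runs without LangChain installed; it only mirrors the `page_content` attribute that LangChain documents expose. The blank-line separator is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one loaded PDF page."""
    page_content: str


def format_docs(docs) -> str:
    """Join the text of every page into one string, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


pages = [Page("First page of the ruling."), Page("Second page of the ruling.")]
full_text = format_docs(pages)
```

Passing the whole ruling as one string lets the LLM see the facts and the court's reasoning together, even when they sit on different pages.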
Finally, we have the function that wraps all this logic and lets us perform every step, from document ingestion and concatenation, to applying the prompt, all the way to generating the structured model output.
We are now ready to pass the returned object to our method that will either create or update our ontology.
We will use the owlready2 package, which makes it easy to manipulate ontologies in Python. The idea is as follows: we create an ontology and save it to a file if it doesn't exist, or we update it with a new ruling if the file already exists.
This create-or-load logic runs at the beginning of our function, and then, further in the same function:
... we define the concepts (classes) and the relationships between concepts in our ontology using the syntax provided by owlready2. For example, we declare that a ruling can have a fact. The name of the relationship is ours to choose, but it should always start with a verb so that relationships are easy to tell apart from the concepts making up the ontology.
Similarly, we also define data properties here, that is, the literal-valued attributes of an entity. Since everything we extract is a string, each entity simply gets a hasText attribute.
Now that we have defined the structure of our ontology, we are ready, still within the same function, to add items (rulings) following this format:
Here, for each ruling, we create an instance of CourtRuling, which we hydrate with the facts, principles, decision, and reasoning. Finally, we persist our ontology to disk. Let's visualize the result!
This function uses the Python graphviz library (whose dependencies must also be installed on your system) to create a visualization, in the form of a directed graph going from left to right, of our ontology. If your result is pixelated, try increasing the DPI. This gives us, for example, the following result:
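To make the layout options concrete, here is a dependency-free sketch that writes the DOT source by hand instead of going through the graphviz Python package. The `to_dot` helper and the edge list are assumptions for illustration. `rankdir=LR` produces the left-to-right direction, and `dpi` controls the resolution of raster output.

```python
def to_dot(edges, dpi=300):
    """Build DOT source for a left-to-right directed graph.

    `edges` is an iterable of (source, relation, target) tuples.
    """
    lines = [
        "digraph ontology {",
        "  rankdir=LR;",           # lay the graph out from left to right
        f"  graph [dpi={dpi}];",   # raise this if the rendered image is pixelated
        "  node [shape=box];",
    ]
    for source, relation, target in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges mirroring a ruling and its extracted content.
dot_source = to_dot([
    ("ruling_001", "hasFact", "ruling_001_fact_0"),
    ("ruling_001", "hasPrinciple", "ruling_001_principle_0"),
])
print(dot_source)
```

Saved to a file, this source can be rendered with the Graphviz command-line tool, for example `dot -Tpng ontology.dot -o ontology.png`.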
We now have a representation that is:
Now that we've gotten started with creating and visualizing ontologies using LLMs, here are some practical and concrete applications we could explore, armed with this new knowledge.
Our legal ontology captures the relationships between case law, facts, principles, decision rationales, and reasoning. We could deliver real added value with features like these:
Each of these applications would leverage the structured knowledge and relationships captured by our ontology, enabling more sophisticated analysis than simple text searches or unstructured approaches.
More than just formatting, an ontology also enables search (with querying systems available in the main libraries), visualization, inference, and LLM grounding. It is therefore a very powerful tool that can be applied to many other domains!
CTO of the scale-up LAMALO, Yacine is a full-stack developer who never sits still: JavaScript, Node.js, Python, LLMs, voice UX... Always keeping an eye on new technology, he turns the latest innovations into concrete solutions!