
Faculté des Sciences appliquées
MASTER THESIS

Automate Law Firms "Fund Review" processes using AI Document intelligence and Generative AI

Crucifix, Arnaud ULiège
Promotor(s) : Ernst, Damien ULiège
Date of defense : 8-Sep-2025/9-Sep-2025 • Permalink : http://hdl.handle.net/2268.2/24860
Details
Title : Automate Law Firms "Fund Review" processes using AI Document intelligence and Generative AI
Author : Crucifix, Arnaud ULiège
Date of defense  : 8-Sep-2025/9-Sep-2025
Advisor(s) : Ernst, Damien ULiège
Committee's member(s) : Debruyne, Christophe ULiège
Louppe, Gilles ULiège
Royer, Stéphane 
Language : English
Discipline(s) : Engineering, computing & technology > Computer science
Institution(s) : Université de Liège, Liège, Belgium
Degree: Master in computer science, with a specialized focus in "computer systems security"
Faculty: Master thesis of the Faculté des Sciences appliquées

Abstract

[en] The increasing volume and complexity of legal and financial documents require a shift from
manual review to intelligent automation. However, current AI paradigms present critical risks
in this field. Traditional supervised models fail because large-scale, paragraph-level labeled
data is scarce, while domain-specialized transformers such as FinBERT or LegalBERT depend on
fine-tuning over datasets that are unavailable in our domain. Because these models are shaped by
their training corpora, they do not reflect the specific structure, semantics, and regulatory
nuances of fund prospectuses, constitutions, and agreements. No large-scale, paragraph-level
dataset maps the legal texts of our corpus to predefined legal categories, and without this
supervision classifiers cannot be trained reliably. It also implies that no taxonomy is
available in our domain, although one is crucial for generating fund review reports. Large
Language Models (LLMs) such as GPT-5, despite their impressive general reasoning abilities,
have broad and ungrounded knowledge. In a legal and financial context, relying solely on an
LLM’s internal representation introduces risk: it cannot explain its reasoning paths or verify
its sources.
Therefore, this thesis introduces a different philosophy: a hybrid architecture for the
automated analysis, synthesis, and querying of legal corpora. It combines the best of both
worlds: a knowledge graph (KG) that acts as the system’s "legal brain", and an LLM used not as
a source of truth but as a sophisticated language interface to the KG. Instead of
relying on an LLM’s internal knowledge, we externalize all legal reasoning into this symbolic
structure. By grounding LLM outputs in this graph, we ensure that retrieval, summarization,
and drafting of legal texts are legally valid, context-aware, and aligned with the structural logic
of the legal corpus.
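
To make the grounding idea above concrete, here is a minimal Python sketch. It is not the thesis implementation: the `Triple` record, the `retrieve` and `build_grounded_prompt` helpers, the substring matching, and the sample facts are all hypothetical and chosen only for illustration. It shows the general pattern of answering from graph facts that carry a source reference, and of declining when the graph has nothing relevant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Triple:
    """One hypothetical KG fact, with the paragraph it was extracted from."""
    subject: str
    predicate: str
    obj: str
    source: str


def retrieve(triples: List[Triple], question: str) -> List[Triple]:
    """Naive retrieval: keep facts whose subject or object appears in the question."""
    q = question.lower()
    return [t for t in triples if t.subject.lower() in q or t.obj.lower() in q]


def build_grounded_prompt(triples: List[Triple], question: str) -> Optional[str]:
    """Build a prompt restricted to retrieved facts, or None if nothing matches."""
    facts = retrieve(triples, question)
    if not facts:
        # Nothing in the graph: decline instead of letting the model guess.
        return None
    lines = [f"[{t.source}] {t.subject} {t.predicate} {t.obj}" for t in facts]
    return (
        "Answer using only the facts below and cite the bracketed source "
        "of every fact you use.\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Invented sample facts, for illustration only.
    kg = [
        Triple("Fund A", "has management fee", "1.5%", "prospectus-4.2"),
        Triple("Fund B", "has management fee", "2.0%", "prospectus-7.1"),
    ]
    print(build_grounded_prompt(kg, "What is the management fee of Fund A?"))
```

In this sketch the LLM call itself is left out on purpose: the point is only that the text handed to the model comes from the graph and keeps its provenance.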
The framework’s efficacy is demonstrated through three key achievements: (1) the
construction of a knowledge graph from a corpus of legal documents, using structure-aware parsing
algorithms that preserve legal semantics; (2) the automated discovery of an emergent legal
taxonomy using graph-based community detection on semantic similarity links, eliminating the
need for a predefined classification scheme when writing fund review reports; and (3) the
implementation of a Graph-RAG (Retrieval-Augmented Generation) system that queries the
KG to provide accurate, domain-aware, and legally interpretable texts.
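
As an illustration of achievement (2), the following Python sketch groups paragraphs into communities over a similarity graph. The abstract does not name the embedding model or the community detection algorithm, so both are stand-ins here: bag-of-words cosine similarity replaces semantic embeddings, and a simple deterministic label propagation replaces whichever algorithm the thesis uses. The function names, the threshold, and the sample paragraphs are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_similarity_graph(
    paragraphs: Dict[str, str], threshold: float
) -> Dict[str, Dict[str, float]]:
    """Link two paragraphs when their similarity reaches the threshold."""
    vectors = {pid: Counter(text.lower().split()) for pid, text in paragraphs.items()}
    graph: Dict[str, Dict[str, float]] = {pid: {} for pid in paragraphs}
    ids = sorted(paragraphs)
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            w = cosine(vectors[u], vectors[v])
            if w >= threshold:
                graph[u][v] = w
                graph[v][u] = w
    return graph


def label_propagation(
    graph: Dict[str, Dict[str, float]], max_rounds: int = 20
) -> Dict[str, str]:
    """Each node repeatedly adopts the label with the largest total edge weight
    among its neighbours; ties go to the smallest label, so runs are repeatable."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated paragraphs keep their own label
            scores: Dict[str, float] = defaultdict(float)
            for neighbour, weight in graph[node].items():
                scores[labels[neighbour]] += weight
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    # Invented sample paragraphs, for illustration only.
    sample = {
        "p1": "the management fee is charged annually to the fund",
        "p2": "the performance fee is charged annually to the fund",
        "p3": "investors may redeem shares on each dealing day",
        "p4": "investors may redeem shares on any dealing day",
    }
    print(label_propagation(build_similarity_graph(sample, threshold=0.5)))
```

Each resulting label identifies one candidate category; in a taxonomy-discovery setting, a human or an LLM would still have to name the categories, a step this sketch does not cover.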


File(s)

Document(s)

File
Access TFE_CRUCIFIX_Arnaud.pdf
Description:
Size: 15.6 MB
Format: Adobe PDF
File
Access TFE_summary.pdf
Description:
Size: 109.81 kB
Format: Adobe PDF

Author

  • Crucifix, Arnaud ULiège Université de Liège > Master sc. inform. fin. spéc. comput. syst. secur.

Promotor(s)

Committee's member(s)

  • Debruyne, Christophe ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Représentation et ingénierie des données
  • Louppe, Gilles ULiège Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data
  • Royer, Stéphane

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.