For the following discussion topics, assume that we have access to three types of
documents:
1. Law articles: the ground truth for the law in the Netherlands.
2. Case law: past rulings from various courts on real cases, including the facts
and circumstances of each case.
3. Commentaries: interpretations and opinions of experts on particular law
articles, legal concepts and/or jurisprudence.
Discussion topic 1: Domain-Specific Re-ranking Model
Outline your approach to developing a re-ranking model customized for the legal
domain. Assume you have access to a high-quality, varied index of legal documents and
a retriever that returns a set of N such documents.
Example questions:
- How would you go about collecting, processing and annotating data?
- What types of features/metadata could be particularly useful for re-ranking in a
legal context?
- How would you train the reranker?
- How would you evaluate the reranker?
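As one possible starting point for the evaluation question, here is a minimal sketch of nDCG@k, a common ranking metric. It assumes graded relevance labels (for example 0 to 3) have been annotated for each of the N retrieved documents; the function names and the label scale are illustrative assumptions, not part of the brief. Because a reranker only reorders the N candidates, the ideal ordering is computed over those same candidates.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items, in ranked order."""
    # rank is 0-based, so the discount for the first item is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """nDCG@k: DCG of the given order divided by DCG of the best possible order.

    `relevances` holds the (hypothetical) annotated relevance grade of each
    document, listed in the order the reranker returned them.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Comparing this score before and after re-ranking, averaged over a held-out set of queries, would give a first indication of whether the reranker improves on the retriever's original order.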
Discussion topic 2: Improving a RAG pipeline with legal domain knowledge
Our Retrieval-Augmented Generation (RAG) pipeline faces challenges that are particular
to the legal domain. For instance, a regulation might be outdated (without this being
clear from the metadata), or a precedent might have been overruled.
Think about other failure cases to which a legal RAG pipeline is especially susceptible,
and what experiments you would run to address them. If you can’t come up with
additional failure cases, use our examples.
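To make the two example failure cases concrete, the following is a rough sketch of a validity filter applied to retrieved documents before generation. It assumes that an upstream enrichment step has already produced validity metadata, which is exactly what the brief says may be missing; the `LegalDoc` class and its fields (`valid_from`, `valid_until`, `overruled_by`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class LegalDoc:
    """Hypothetical retrieved document with assumed validity metadata."""
    doc_id: str
    valid_from: date
    valid_until: Optional[date] = None   # None means still in force
    overruled_by: Optional[str] = None   # id of the overruling decision, if any


def is_applicable(doc: LegalDoc, as_of: date) -> bool:
    """Return True if the document may be used for a question about `as_of`."""
    # Simplification: an overruled precedent is dropped regardless of date.
    if doc.overruled_by is not None:
        return False
    if as_of < doc.valid_from:
        return False
    return doc.valid_until is None or as_of <= doc.valid_until


def filter_applicable(docs: List[LegalDoc], as_of: date) -> List[LegalDoc]:
    """Keep only documents that were in force on the reference date."""
    return [d for d in docs if is_applicable(d, as_of)]
```

An experiment could compare answer quality with and without such a filter on questions whose correct answer depends on the reference date.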