Nina Hristozova (TR Labs) will talk about an application of text summarization to Legal Court Documents and will give us a glimpse into the emerging field of Explainable AI and how it is applied to the summarization use case.


  • Problem Background
  • Abstractive Text Summarization Approach
  • Adding Explainability to the AI System
  • Key Takeaways

I will talk about how we added AI capabilities to an existing product at Thomson Reuters. This product monitors more than 200 courts across the US and sends out alerts to customers based on pre-defined filters. Behind the scenes, the product is supported by a team of editors who monitor and collect new court cases and perform various editorial tasks. One of the most challenging tasks is to write a summary of the key legal issues in a given court case. These summaries help our customers understand the essence of a case without having to read the entire complaint, so they can react faster and work more efficiently.
To speed up the editorial process, an AI-powered summarization model was built to automatically generate a first version of those summaries. We experimented with a summarization model trained on court cases and their associated editor-written summaries: nearly 1 million documents, most of them about 5,000 words long!

This was not a straightforward task: the court cases consist of complex legal language, and the summaries were not extractive. Thankfully, advances in Deep Learning (DL) models for sequence generation, together with open sourcing, enabled us to tackle this challenging task and achieve close to human-level performance.

Now, put yourself in the shoes of our editors – you push a button, and you see an auto-generated summary for a given court case. Wouldn’t you want to know how the DL model arrived at it? We learned that having an auto-generated summary already helps our editors a lot in terms of time savings. An explainability layer on top of that helped them become even more efficient and it strengthened their trust in the AI system.

What model did we use to summarize the court documents and how did we add an extra layer of explainability?



Nina Hristozova – Data Scientist | Thomson Reuters

Nina is a Data Scientist at Thomson Reuters (TR) Labs. She holds a BSc in Computer Science from the University of Glasgow, Scotland.

As part of her role at TR she has worked on a wide range of projects applying ML and DL to NLP problems. Her current focus is on abstractive summarization of legal text.

Outside of work she continues to spread the love for NLP as a Co-organizer of the NLP Zurich Meetup. As a hobby Nina plays and coaches volleyball.

May 27 @ 10:45
10:45 — 11:15 (30′)

Day 3 | 20th of May – Telecom + Media
