Abstractive Text Summarization & Explainability for Legal Documents

<Session_Outline/>

Nina Hristozova (TR Labs) will talk about an application of text summarization to Legal Court Documents and will give us a glimpse into the emerging field of Explainable AI and how it is applied to the summarization use case.

<Key_Takeaways/>

Problem Background
Abstractive Text Summarization Approach
Adding Explainability to the AI System
Key Takeaways

I will talk about how we added AI capabilities to an existing product at Thomson Reuters. This product monitors more than 200 courts across the US and sends out alerts to customers based on pre-defined filters. Behind the scenes, the product is supported by a team of editors who monitor and collect new court cases and perform various editorial tasks. One of the most challenging tasks is to write a summary of the key legal issues in a given court case. These summaries help our customers understand the essence of a case without having to read the entire complaint, so they can react faster and work more efficiently.
To speed up the editorial process, an AI-powered summarization model was built to automatically generate a first version of those summaries. We experimented with a summarization model trained on court cases and their associated editor-written summaries – nearly 1 million documents, most of them about 5,000 words long!

This was not a straightforward task – the court cases consist of complex legal language and the summaries were not extractive, so they could not be produced by simply copying sentences from the source. Thankfully, advances in Deep Learning (DL) models for sequence generation, together with the open sourcing of such models, enabled us to tackle this challenging task and achieve close to human-level performance.

Now, put yourself in the shoes of our editors – you push a button, and you see an auto-generated summary for a given court case. Wouldn’t you want to know how the DL model arrived at it? We learned that having an auto-generated summary already helps our editors a lot in terms of time savings. An explainability layer on top of that helped them become even more efficient and it strengthened their trust in the AI system.

What model did we use to summarize the court documents and how did we add an extra layer of explainability?

————————————————————————————————————————————————————

<Speaker_Bio/>

Nina Hristozova – Data Scientist | Thomson Reuters

Nina is a Data Scientist at Thomson Reuters (TR) Labs. She holds a BSc in Computer Science from the University of Glasgow, Scotland.

As part of her role at TR she has worked on a wide range of projects applying ML and DL to NLP problems. Her current focus is on abstractive summarization of legal text.

Outside of work she continues to spread the love for NLP as a Co-organizer of the NLP Zurich Meetup. As a hobby Nina plays and coaches volleyball.

May 27 @ 10:45

10:45 — 11:15 (30′)

Day 3 | 20th of May – Telecom + Media
