Abstractive legal text summarization using attention mechanisms

Alomar, Rafah

Date

2024

Authors

Alomar, Rafah

Abstract

Automating the summarization of legal documents can save legal professionals significant time by distilling complex, terminology-heavy texts. In the Turkish legal domain, most existing work focuses on extractive summarization. Our study, the first to explore abstractive summarization for Turkish legal documents, compiled a large dataset of higher court decisions and their summaries. The training set comprises 13,000 summaries generated using ChatGPT, while the test set contains 2,922 summaries written by Law Faculty students at Marmara University. Using these datasets, we fine-tuned and evaluated several pretrained transformer models. Although extractive methods outperformed abstractive ones on ROUGE scores, the abstractive approach generated more coherent and concise summaries. In terms of F1 score, the BERT2BERT models performed best, BART achieved the highest precision (0.44), and GPT-2 yielded the best recall. This research serves as a foundational step for the future development of abstractive summarization techniques for Turkish legal documents.

Keywords

Abstractive Summarization, Computer Engineering, Legal Text Summarization, Pre-trained Language Models, Transformers

URI

https://katalog.marmara.edu.tr/veriler/yordambt/cokluortam/5A/65f3ef7e0c8cb.pdf
https://hdl.handle.net/11424/296624

Collections

Theses
