Indonesian Text Summarization Application Using the Text-to-Text Transfer Transformer (T5) Model

  • Mohammad Yani, Politeknik Negeri Indramayu
  • Nur Siti Khodijah, Politeknik Negeri Indramayu
  • Rendi, Politeknik Negeri Indramayu
  • Muhamad Mustamiin, Politeknik Negeri Indramayu
Keywords: text summarization, T5 model, multi-format text, abstractive

Abstract

In the digital era, information can be easily accessed through various available media such
as search engines, academic repositories, social media, news portals, and websites.
However, the information available is often presented in lengthy textual form. For instance,
information about 'Joko Widodo' on the Wikipedia page contains at least 8,716 characters
of text. Extracting the main points from such a long text efficiently can be challenging.
Several studies have developed text summarization applications.
Nevertheless, a summarization application that accepts input in multiple formats has not
yet been developed. In this research, the authors propose a text summarization system
that accepts input in multiple formats. This text summarization
system processes information inputs in the form of text, document files, web pages, and
images, which are subsequently summarized using the Text-to-Text Transfer Transformer
(T5) model to generate a summarized output that can be stored in various file formats.
Evaluation results indicate that text summarization with document or text file inputs yields
higher ROUGE scores than with non-text inputs, reaching an average score of 0.87.
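
To illustrate how a ROUGE score such as the one reported above can be obtained, the following is a minimal sketch of ROUGE-N computed from n-gram overlap between a generated summary and a reference summary. The abstract does not state which ROUGE variant or tokenizer was used, so the function names `ngram_counts` and `rouge_n`, the lowercase whitespace tokenization, and the choice of reporting precision, recall, and F1 are all assumptions made for illustration, not the authors' evaluation code.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; a real evaluation may use a different tokenizer.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "presiden meresmikan jalan tol baru"
    expected = "presiden meresmikan jalan tol baru di jawa barat"
    print(rouge_n(generated, expected, n=1))
```

A score near 1 means the generated summary shares most of its n-grams with the reference; averaging such scores over a test set gives a single figure comparable in kind to the one quoted in the abstract.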

Published
2024-10-28