SUMMARIZING INDONESIAN NEWS ARTICLES USING GRAPH CONVOLUTIONAL NETWORK

Authors

  • Garmastewira Garmastewira, Institut Teknologi Bandung
  • Masayu Leylia Khodra, Institut Teknologi Bandung

DOI:

https://doi.org/10.32890/jict2019.18.3.4675

Keywords:

Graph Convolutional Network (GCN), Personalized Discourse Graph (PDG), ROUGE-2, summarization

Abstract

Multi-document summarization transforms a set of related documents into one concise summary. Existing approaches to summarizing Indonesian news articles do not take relationships between sentences into account and depend heavily on Indonesian language tools and resources. In this paper, we employ a Graph Convolutional Network (GCN), which accepts a word embedding sequence and a sentence relationship graph as input, to summarize Indonesian news articles. Our system comprises four main components: preprocessing, graph construction, sentence scoring, and sentence selection. The sentence scoring component is a neural network that uses a Recurrent Neural Network (RNN) and a GCN to produce scores for all sentences. We use three different representation types for the sentence relationship graph. The sentence selection component then generates the summary with one of two techniques: greedily choosing the sentences with the highest scores, or using Maximum Marginal Relevance (MMR). The evaluation shows that the GCN summarizer with the Personalized Discourse Graph (PDG) representation achieves the best results, with an average ROUGE-2 recall of 0.370 for 100-word summaries and 0.378 for 200-word summaries. Greedy sentence selection gives better results for 100-word summaries, while MMR performs better for 200-word summaries.
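
The two sentence selection techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scores are assumed to be given (in the paper they come from the RNN and GCN scoring component), and the Jaccard token-overlap similarity, the trade-off weight `lam`, and the rule of skipping any sentence that would exceed the word budget are all illustrative assumptions.

```python
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the paper's measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_greedy(sentences: Sequence[str], scores: Sequence[float],
                  word_limit: int) -> List[int]:
    """Pick sentences in descending score order while they fit the budget."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        n_words = len(sentences[i].split())
        if used + n_words <= word_limit:  # assumption: skip sentences that overflow
            chosen.append(i)
            used += n_words
    return chosen


def select_mmr(sentences: Sequence[str], scores: Sequence[float],
               word_limit: int, lam: float = 0.7,
               sim: Callable[[str, str], float] = jaccard) -> List[int]:
    """Maximum Marginal Relevance: trade score against redundancy."""
    chosen, used = [], 0
    remaining = set(range(len(sentences)))
    while remaining:
        def mmr(i: int) -> float:
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max((sim(sentences[i], sentences[j]) for j in chosen),
                             default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy

        best = max(sorted(remaining), key=mmr)  # sorted() makes ties deterministic
        remaining.remove(best)
        n_words = len(sentences[best].split())
        if used + n_words <= word_limit:
            chosen.append(best)
            used += n_words
    return chosen


# Hypothetical sentences and scores, for illustration only.
sents = [
    "the flood hit jakarta today",
    "the flood hit jakarta today again",
    "officials opened new shelters",
]
scores = [0.9, 0.8, 0.5]
greedy_pick = select_greedy(sents, scores, word_limit=11)
mmr_pick = select_mmr(sents, scores, word_limit=11, lam=0.5)
```

With these toy inputs, the greedy selector returns indices `[0, 1]`, two near-duplicate sentences, while the MMR selector returns `[0, 2]` because the second sentence is penalized for overlapping with the first.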

 

Published

10-06-2019

How to Cite

Garmastewira, G., & Khodra, M. L. (2019). SUMMARIZING INDONESIAN NEWS ARTICLES USING GRAPH CONVOLUTIONAL NETWORK. Journal of Information and Communication Technology, 18(3), 345–365. https://doi.org/10.32890/jict2019.18.3.4675