A Computational Measure for the Semantic Readability of Segmented Texts

Santucci, Valentino; Bartoccini, Umberto; Zanda, Fabio
2022-01-01

Abstract

In this paper we introduce a computational procedure for measuring the semantic readability of a segmented text. The procedure consists of three main steps. First, natural language processing tools and unsupervised machine learning techniques are used to obtain a vectorized numerical representation of each section or segment of the input text, so that similar or semantically related segments are modeled by nearby points in a vector space. Second, the shortest and longest Hamiltonian paths passing through these points are computed. Lastly, the lengths of these paths and the length of the path given by the original ordering of the segments are combined into an arithmetic expression to derive an index, which may be used to gauge the semantic difficulty that a reader is expected to experience when reading the text. A preliminary experimental study is conducted on seven classic narrative texts written in English, obtained from Project Gutenberg. The experimental results appear to be in line with our expectations.
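The three steps can be illustrated with a minimal sketch. The abstract does not state the arithmetic expression used for the index, nor how the Hamiltonian paths are computed, so everything below is an assumption made for illustration: the function names are hypothetical, the segment vectors are hand-made 2-D points rather than learned embeddings, the paths are found by brute force over all orderings (feasible only for a handful of segments), and the index is a simple min-max normalization of the original ordering's path length.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Sum of Euclidean distances between consecutive points in `order`."""
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def readability_index(points):
    """Hypothetical index (NOT the paper's formula).

    Returns 0.0 when the original ordering is a shortest Hamiltonian path
    and 1.0 when it is a longest one, by min-max normalization.
    """
    n = len(points)
    original = path_length(points, list(range(n)))
    # Brute force: enumerate every ordering of the segments (n! of them).
    lengths = [path_length(points, p) for p in permutations(range(n))]
    shortest, longest = min(lengths), max(lengths)
    if longest == shortest:
        return 0.0  # all orderings are equally long
    return (original - shortest) / (longest - shortest)


# Toy "segment vectors", made up for illustration only.
smooth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]  # topics drift gradually
jumpy = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # topics jump back and forth

print(readability_index(smooth))  # 0.0
print(readability_index(jumpy))   # 0.75
```

Under this assumed normalization, a text whose segments follow one another along the shortest path scores 0, and larger values indicate an ordering that forces longer semantic jumps between consecutive segments.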
2022
978-3-031-10535-7
978-3-031-10536-4
Semantic readability of texts, Natural Language Processing, Unsupervised machine learning, Hamiltonian path
Files in this item:

Santucci2022_Chapter_AComputationalMeasureForTheSem.pdf (not available)
Description: Publisher's version
Type: Publisher's version (PDF)
License: Publisher's copyright
Size: 657.18 kB
Format: Adobe PDF

_ICCSA_2022____Narrative_Index.pdf (open access)
Description: Preprint from Overleaf
Type: Pre-print document
License: Creative Commons
Size: 731.46 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12071/31768