A Computational Measure for the Semantic Readability of Segmented Texts
Santucci, Valentino; Bartoccini, Umberto; Zanda, Fabio
2022-01-01
Abstract
In this paper we introduce a computational procedure for measuring the semantic readability of a segmented text. The procedure consists of three main steps. First, natural language processing tools and unsupervised machine learning techniques are used to obtain a vectorized numerical representation of each section or segment of the input text, so that similar or semantically related segments are modeled by nearby points in a vector space. Second, the shortest and longest Hamiltonian paths passing through these points are computed. Lastly, the lengths of these two paths and that of the path induced by the original ordering of the segments are combined into an arithmetic expression to derive an index, which may be used to gauge the semantic difficulty that a reader is expected to experience when reading the text. A preliminary experimental study is conducted on seven classic narrative texts written in English, obtained from the well-known Project Gutenberg. The experimental results appear to be in line with our expectations.

File | Description | Type | License | Access | Size | Format
---|---|---|---|---|---|---
Santucci2022_Chapter_AComputationalMeasureForTheSem.pdf | Publisher's version | Publisher's version (PDF) | Publisher's copyright | Not available | 657.18 kB | Adobe PDF
_ICCSA_2022____Narrative_Index.pdf | Preprint from Overleaf | Pre-print document | Creative Commons | Open access | 731.46 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
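
The three-step procedure described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not state the embedding method, the distance, or the arithmetic expression, so the sketch uses hand-written 2-D points as stand-ins for segment embeddings, Euclidean distance, and a hypothetical normalization `(original - shortest) / (longest - shortest)`. The function names are likewise hypothetical, and the brute-force search over all orderings is only feasible for a handful of segments.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Sum of Euclidean distances between consecutive points visited in `order`."""
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def extreme_hamiltonian_lengths(points):
    """Return (shortest, longest) Hamiltonian path lengths by brute force.

    Enumerates all n! orderings, so this is only usable for very small n.
    """
    lengths = [path_length(points, p) for p in permutations(range(len(points)))]
    return min(lengths), max(lengths)


def normalized_index(points):
    """Hypothetical index (an assumption, not the paper's formula).

    It is 0 when the original ordering is a shortest Hamiltonian path
    and 1 when it is a longest one.
    """
    original = path_length(points, list(range(len(points))))
    shortest, longest = extreme_hamiltonian_lengths(points)
    if longest == shortest:
        return 0.0  # all orderings are equally long; nothing to compare
    return (original - shortest) / (longest - shortest)


# Toy 2-D points standing in for segment embeddings (assumed data).
segments_in_order = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
segments_shuffled = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0)]

print(normalized_index(segments_in_order))
print(normalized_index(segments_shuffled))
```

Under this assumed normalization, a text whose segments already follow a shortest path through the embedding space scores lowest, while an ordering that jumps between distant segments scores higher. For realistic numbers of segments, the exhaustive enumeration would have to be replaced by a heuristic or exact solver for the underlying path problems.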