A Computational Measure for the Semantic Readability of Segmented Texts


In this paper we introduce a computational procedure for measuring the semantic readability of a segmented text. The procedure consists of three main steps. First, natural language processing tools and unsupervised machine learning techniques are used to obtain a vectorized numerical representation of each section or segment of the input text, so that similar or semantically related segments are modeled by nearby points in a vector space. Second, the shortest and longest Hamiltonian paths passing through these points are computed. Lastly, the lengths of these paths and the length of the path given by the original ordering of the segments are combined into an arithmetic expression to derive an index, which may be used to gauge the semantic difficulty that a reader is expected to experience when reading the text. A preliminary experimental study is conducted on seven classic narrative texts written in English, obtained from Project Gutenberg. The experimental results appear to be in line with our expectations.
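The last two steps of the procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the arithmetic expression, so the min-max normalization used here is an assumption, as are the Euclidean distance, the brute-force path search, and the names `path_length` and `readability_index`. Segment vectors are taken as given (toy 2-D points stand in for the embeddings produced by the first step).

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Total Euclidean length of the path visiting points in the given order."""
    order = list(order)
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def readability_index(points):
    """Hypothetical index in [0, 1]; ASSUMED formula, not taken from the paper.

    0 means the original ordering is already a shortest Hamiltonian path,
    1 means it is a longest one.
    """
    n = len(points)
    original = path_length(points, range(n))
    # Brute force over all orderings: factorial cost, only viable for tiny n.
    lengths = [path_length(points, p) for p in permutations(range(n))]
    shortest, longest = min(lengths), max(lengths)
    if longest == shortest:
        return 0.0  # degenerate case: every ordering has the same length
    return (original - shortest) / (longest - shortest)


# Toy segment vectors (hypothetical stand-ins for real embeddings).
smooth = [(0, 0), (1, 0), (2, 0), (3, 0)]   # segments drift steadily
jumpy = [(0, 0), (3, 0), (1, 0), (2, 0)]    # same points, less orderly sequence

print(readability_index(smooth))  # 0.0
print(readability_index(jumpy))   # 0.75
```

Under this assumed normalization, a lower value means that consecutive segments stay semantically close. For texts with more than a handful of segments, the exhaustive search would have to be replaced by a heuristic or an exact solver for the Hamiltonian path problem.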


Santucci, Valentino;Bartoccini, Umberto;Mengoni, Paolo;Zanda, Fabio

2022-01-01



Year: 2022

ISBN: 978-3-031-10535-7; 978-3-031-10536-4

Keywords: Semantic readability of texts, Natural Language Processing, Unsupervised machine learning, Hamiltonian path

Publication type: 4.1 Contribution in conference proceedings

Files in this item:

File	Description	Type	License	Size	Format	Access
Santucci2022_Chapter_AComputationalMeasureForTheSem.pdf	Publisher's version	Publisher's version (PDF)	Publisher's copyright	657.18 kB	Adobe PDF	Not available
_ICCSA_2022____Narrative_Index.pdf	Preprint from Overleaf	Pre-print document	Creative Commons	731.46 kB	Adobe PDF	Open access

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12071/31768
