Corpora di italiano L2: difficoltà di annotazione e trascrizione "allargata"

Spina S.; Atzori L.; Chiapedi N.

2009-01-01

Abstract

The spoken Italian L2 Corpus of the University for Foreigners of Perugia was created to support both the study of second language acquisition and the improvement of methods for teaching Italian as a foreign language. The first part of the Corpus has gone through several stages of development. Particularly important is the transition from transcription and manual XML annotation of the data to automatic annotation: since it is extremely difficult to train software to recognize learners' interlanguage automatically, a method based on a “widened” transcription has been adopted. It is characterized by a first stage of manual treatment of the data, in which audio recordings are transcribed by hand together with their linguistic, contextual and structural annotation. A taxonomy of the main problematic features of interlanguage productions, based on the observation of Chinese learners' data, has been created in order to establish unambiguous criteria for transcription.

Year: 2009
ISBN: 978-88-557-0168-6
Keywords: learner corpora; annotation
Publication type: 2.1 Contribution in a volume (chapter or essay)

Files in this item:

File	Access	License	Size	Format
atzori_chiapedi_spina.pdf	Not available	Not specified	806.12 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12071/2005
