In this paper I describe the Academic Italian Word List (AIWL), a frequency list of the most common non-technical words used in written academic communication. The project arises from the need to expand the academic vocabulary of non-native students at Italian universities. The AIWL is corpus-based: it is extracted from a balanced, POS-tagged and lemmatized corpus of written academic Italian, the Academic Italian Corpus (AIC). The AIC contains 1 million words in 240 texts drawn from different subject areas and text types. The lexical units extracted from the AIC (single words as well as word combinations) are ordered by frequency and selected using a statistical measure of dispersion across the subject areas. The AIWL aims to provide a computational and lexicographical resource to support the development of natural language processing applications for use in an online learning environment. The paper details the theoretical assumptions and the extraction methodology behind the frequency list, and outlines the main features of the lexical units included in the AIWL.
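The frequency-plus-dispersion selection described in the abstract can be illustrated with a minimal sketch. The abstract does not name the dispersion measure, so Juilland's D is used here purely as an assumption; the function names, the input layout (a mapping from subject area to a list of lemmas), and the 0.6 threshold are likewise hypothetical and not taken from the paper.

```python
from collections import Counter
from math import sqrt
from statistics import mean, pstdev


def juilland_d(rel_freqs):
    """Juilland's D for one lemma: 1 - V / sqrt(n - 1), with V = sd / mean.

    ASSUMPTION: the paper's actual dispersion measure is not stated in the
    abstract; Juilland's D is only one common choice.
    """
    n = len(rel_freqs)
    mu = mean(rel_freqs)
    if n < 2 or mu == 0:
        return 0.0
    return 1 - (pstdev(rel_freqs) / mu) / sqrt(n - 1)


def build_word_list(corpus, min_dispersion=0.6):
    """Return (lemma, total_frequency, dispersion) rows, most frequent first.

    corpus: dict mapping a subject area to a non-empty list of lemmas
            (hypothetical layout).
    min_dispersion: illustrative threshold, not a value from the paper.
    """
    counts = {area: Counter(lemmas) for area, lemmas in corpus.items()}
    sizes = {area: len(lemmas) for area, lemmas in corpus.items()}
    vocabulary = set().union(*counts.values())

    rows = []
    for lemma in vocabulary:
        # Relative frequency per area, so areas of different size are comparable.
        rel = [counts[area][lemma] / sizes[area] for area in corpus]
        d = juilland_d(rel)
        if d >= min_dispersion:
            total = sum(counts[area][lemma] for area in corpus)
            rows.append((lemma, total, d))

    # Order by descending frequency, breaking ties alphabetically.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows
```

A lemma that occurs evenly in every subject area scores close to 1 and is kept, while a lemma concentrated in a single area scores close to 0 and is filtered out, which is how a dispersion criterion separates general academic vocabulary from discipline-specific terminology.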
|Title:||AIWL: una lista di frequenza dell’italiano accademico|
|Publication date:||2010|
|Appears in types:||4.1 Contribution in conference proceedings|