Linguistic Corpus and the Essence of Data Structure
Paper ID : 1026-ICIL
Authors:
Masood Ghayoomi *
Street No 64 (west), Kordestan Highway
Abstract:
Corpus linguistics has earned a special position in linguistic studies. This approach has changed how linguistic research is conducted and has made the linguistic corpus a useful tool for evaluating linguistic theories and research hypotheses. To save time and simplify searching in a linguistic corpus, software is essential for querying the corpus and extracting samples. To that end, the linguistic data must be structured and organized so that the software can process it.
In this paper, we examine why structuring and organizing the data in a linguistic corpus is essential, and we introduce a standard structure for data annotation at several levels, namely the phonological, morphological, and syntactic levels, to show the importance of imposing a structure and making use of it. Finally, the data structure of the Persian Linguistic DataBase (PLDB) is studied as a practical example.
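As a rough illustration of what annotation at several levels could look like in practice, the following minimal Python sketch stores phonological, morphological, and syntactic information for each token and lets software query it. Every name here (Token, Sentence, find_by_pos, dependents_of), the tag labels, and the English sample sentence are assumptions made for illustration only; this is not the structure used by the PLDB.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """One annotated word; all field names are hypothetical."""
    form: str                    # surface form as it appears in the text
    phonemes: List[str]          # phonological level: rough phoneme labels
    morphemes: List[str]         # morphological level: morpheme segmentation
    pos: str                     # part-of-speech tag from an assumed tagset
    head: Optional[int] = None   # syntactic level: index of governing token, None for the root


@dataclass
class Sentence:
    tokens: List[Token] = field(default_factory=list)

    def find_by_pos(self, pos: str) -> List[str]:
        """Return the surface forms of all tokens carrying the given tag."""
        return [t.form for t in self.tokens if t.pos == pos]

    def dependents_of(self, head_index: int) -> List[str]:
        """Return the surface forms of tokens governed by the token at head_index."""
        return [t.form for t in self.tokens if t.head == head_index]


# A two-word sample sentence, annotated by hand for illustration.
sentence = Sentence([
    Token("dogs", ["d", "o", "g", "z"], ["dog", "-s"], "NOUN", head=1),
    Token("bark", ["b", "a", "r", "k"], ["bark"], "VERB", head=None),
])

print(sentence.find_by_pos("NOUN"))
print(sentence.dependents_of(1))
```

Because each level is an explicit field rather than free text, a search such as "all nouns" or "all dependents of the main verb" becomes a simple filter, which is the kind of software-driven extraction that structured data makes possible.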
Keywords:
digital humanities, information technology, linguistic corpus, corpus linguistics, data structure
Status : Paper Accepted (Oral Presentation)
10th International Iranian Conference on Linguistics