Definition context extraction from the COVID-19 corpus with CQL

Authors

  • Ana Eliza Pereira Bocorny, Universidade Federal do Rio Grande do Sul
  • Rozane Rebechi, Universidade Federal do Rio Grande do Sul
  • Cristiane Krause Kilian, Instituto Superior de Educação Ivoti

DOI:

https://doi.org/10.11606/issn.2317-9511.v42p125-138

Keywords:

COVID-19, terminology, corpus linguistics, definition context (DC) extraction, definitional segments (DS)

Abstract

Terms represent the concepts of a domain, and by understanding them readers gain access to the knowledge contained in specialized texts. Understanding the meaning of terms is therefore important not only for researchers who share the results of their studies, but also for professionals and students from various areas who apply specialized information in their learning and working contexts. Because knowledge evolves quickly, the terminology created to designate new concepts is not always promptly included in dictionaries, which can pose a considerable challenge for those who need access to specialized knowledge. After presenting approaches used over the last twenty years for the automatic extraction of definition traits (DT) and definition contexts (DC), we propose the use of Corpus Query Language (CQL) to retrieve information that helps in understanding the terminology used in specialized texts. In particular, we confirmed the usefulness of search syntaxes built with CQL for this purpose by applying them to the COVID-19 Corpus. The path presented in this study can help not only specialists in the medical field, but also translators, lexicographers and teachers to process the knowledge contained in specialized texts faster and more accurately.
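
As a rough illustration of the idea of retrieving definition contexts through lexical patterns, the sketch below mimics such a search in plain Python. It is not the authors' method and does not reproduce their CQL syntaxes, which this page does not list. The marker list, the function name `find_definition_contexts` and the sample sentences are all assumptions made for this example. In a corpus tool that supports CQL, a comparable query might look something like `[lemma="be"] [word="defined"] [word="as"]`, again given only as an assumed example.

```python
import re

# Hypothetical definitional markers chosen for illustration only;
# they are NOT the search syntaxes used in the article.
DEFINITION_MARKERS = [
    r"is defined as",
    r"refers to",
    r"is known as",
    r"is (?:a|an|the)\b",
]

# A candidate term (one to four words, starting with a capital letter),
# followed by a definitional marker, followed by the defining segment.
PATTERN = re.compile(
    r"^(?P<term>[A-Z][\w\-]*(?:\s[\w\-]+){0,3}?)\s+"
    r"(?P<marker>" + "|".join(DEFINITION_MARKERS) + r")\s+"
    r"(?P<definition>.+)$"
)


def find_definition_contexts(sentences):
    """Return (term, marker, definition) tuples for sentences that match."""
    hits = []
    for sentence in sentences:
        match = PATTERN.match(sentence.strip())
        if match:
            hits.append(
                (match.group("term"), match.group("marker"), match.group("definition"))
            )
    return hits


if __name__ == "__main__":
    # Invented sample sentences, not taken from the COVID-19 Corpus.
    sample = [
        "COVID-19 is defined as an infectious disease caused by SARS-CoV-2.",
        "Herd immunity refers to indirect protection from infection.",
        "Patients were enrolled in March.",
    ]
    for term, marker, definition in find_definition_contexts(sample):
        print(f"{term} | {marker} | {definition}")
```

A surface pattern like this will both miss definitions and return false positives. Queries over a lemmatized and part-of-speech tagged corpus can be more selective, since they can constrain lemmas and tags rather than raw strings.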

Author Biographies

  • Ana Eliza Pereira Bocorny, Universidade Federal do Rio Grande do Sul

    Professor at the Instituto de Letras, Universidade Federal do Rio Grande do Sul. E-mail: ana.bocorny@gmail.com.

  • Rozane Rebechi, Universidade Federal do Rio Grande do Sul

    Professor at the Instituto de Letras, Universidade Federal do Rio Grande do Sul. E-mail: rozane.rebechi@ufrgs.br.

  • Cristiane Krause Kilian, Instituto Superior de Educação Ivoti

    Professor at the Instituto Superior de Educação Ivoti. E-mail: cristianekkilian@gmail.com.

Published

2022-11-09

Section

Articles

How to Cite

Bocorny, A. E. P., Rebechi, R., & Kilian, C. K. (2022). Definition context extraction from the COVID-19 corpus with CQL. TradTerm, 42, 125-138. https://doi.org/10.11606/issn.2317-9511.v42p125-138