Automatic extraction of tacit knowledge in a digital library: using artificial intelligence mechanisms

Authors

DOI:

https://doi.org/10.11606/issn.2178-2075.incid.2025.231405

Keywords:

Digital Library, Artificial Intelligence, Machine Learning, Text Mining, Clustering, Knowledge Graphs

Abstract

Objective: This study reports the results of applied research that built intelligent agents to extract and display tacit knowledge contained in the scientific articles of a digital library database. The agents combine natural language processing, text mining, and machine learning with visual presentations of the data (graphs and lists).

Methodology: The study is qualitative and interpretive, drawing on interactions among the researchers and on analysis of the extracted data. Its method is Design Science Research, chosen for its metatheoretical character. Its epistemological basis is Design Science, which generates knowledge by examining how science and technology research designs artifacts to solve problems. The proof of concept, which uses AI techniques to extract knowledge from a digital library of scientific articles, constitutes applied, domain-based research.

Results: The searches run by the intelligent agents yielded satisfactory outcomes. After the agents identified the terms that authors use most often in their scientific articles, text mining and clustering algorithms (unsupervised machine learning) grouped authors with similar research interests.

Conclusion: A digital library previously built on traditional data input and display structures now includes a set of intelligent agents that go beyond the trivial filters such libraries typically offer, such as searches by author, event, subject, or keyword.
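The pipeline summarized in the Results (find each author's most frequent terms, then group authors with similar interests without labels) can be illustrated with a minimal sketch. The abstract does not name the algorithms or libraries used, so everything below is an assumption made for illustration: the toy corpus, the stopword list, the `threshold` value, and the choice of cosine similarity with single-linkage grouping are hypothetical stand-ins, not the study's implementation.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: author -> concatenated article text (not from the study).
ARTICLES = {
    "author_a": "Ontology design for digital library metadata and ontology reuse",
    "author_b": "Metadata quality in a digital library using ontology alignment",
    "author_c": "Neural network training for image classification with deep learning",
    "author_d": "Deep learning and neural network models for image segmentation",
}

# Assumed minimal stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "and", "for", "in", "the", "of", "with", "using"}


def term_counts(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords, count terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_authors(articles, threshold=0.3):
    """Group authors whose term vectors are similar (single linkage).

    Two authors land in the same group when a chain of pairwise
    similarities at or above `threshold` connects them.
    """
    vectors = {author: term_counts(text) for author, text in articles.items()}
    authors = sorted(vectors)
    parent = {a: a for a in authors}  # union-find forest

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, a in enumerate(authors):
        for b in authors[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for a in authors:
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())


if __name__ == "__main__":
    for author, text in ARTICLES.items():
        print(author, term_counts(text).most_common(3))
    print(cluster_authors(ARTICLES))
```

On this toy corpus the two library-science authors and the two deep-learning authors share no terms across the divide, so they fall into separate groups. The resulting groups map naturally onto a graph view (authors as nodes, similarities as edges), which is one plausible reading of the graph presentations mentioned in the Objective.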

Author Biographies

  • André Luiz de Castro Leal, Federal Rural University of Rio de Janeiro

    PhD in Informatics from the Pontifical Catholic University of Rio de Janeiro (PUC-Rio);

    Professor at the Federal Rural University of Rio de Janeiro (UFRRJ), Seropédica, RJ, Brazil.

  • Sanderson Nascimento Milagres Filho, Federal Rural University of Rio de Janeiro

    Undergraduate student in Information Systems at the Federal Rural University of Rio de Janeiro (UFRRJ), Seropédica, RJ, Brazil.

  • Lorena Vasconcellos Oliveira Magalhaes, Federal Rural University of Rio de Janeiro

    Undergraduate student in Information Systems at the Federal Rural University of Rio de Janeiro (UFRRJ), Seropédica, RJ, Brazil.

  • Weslei de Carvalho Vianna, Federal Rural University of Rio de Janeiro

    Undergraduate student in Information Systems at the Federal Rural University of Rio de Janeiro (UFRRJ), Seropédica, RJ, Brazil.

  • Gizelle Kupac Vianna, Federal Rural University of Rio de Janeiro

    PhD in Systems and Computer Engineering from the Federal University of Rio de Janeiro (UFRJ);

    Professor at the Federal Rural University of Rio de Janeiro (UFRRJ), Seropédica, RJ, Brazil.

Published

2025-11-18

Section

Articles

How to Cite

LEAL, André Luiz de Castro; MILAGRES FILHO, Sanderson Nascimento; MAGALHAES, Lorena Vasconcellos Oliveira; VIANNA, Weslei de Carvalho; VIANNA, Gizelle Kupac. Automatic extraction of tacit knowledge in a digital library: using artificial intelligence mechanisms. InCID: Revista de Ciência da Informação e Documentação, Ribeirão Preto, Brazil, v. 16, p. e-231405, 2025. DOI: 10.11606/issn.2178-2075.incid.2025.231405. Available at: https://revistas.usp.br/incid/article/view/231405. Accessed: 1 Feb. 2026.