Automatic extraction of tacit knowledge in a digital library: using artificial intelligence mechanisms
DOI: https://doi.org/10.11606/issn.2178-2075.incid.2025.231405
Keywords: Digital Library, Artificial Intelligence, Machine Learning, Text Mining, Clustering, Knowledge Graphs
Abstract
Objective: This study reports the results of applied research that built intelligent-agent solutions based on natural language processing, text mining, and machine learning, together with visual presentations of the data (graphs and lists), to extract and display the tacit knowledge contained in the scientific articles of a digital library database.
Methodology: The study is qualitative and interpretive, arising from interactions among the researchers and from analysis of the extracted data. Methodologically, it is grounded in Design Science Research, as this approach is metatheoretical. Epistemologically, it is based on Design Science, which generates knowledge by examining how science-and-technology research designs artifacts to solve problems. The proof of concept, which uses AI techniques to extract knowledge from the data in a digital library of scientific articles, constitutes applied, domain-based research.
Results: Searches conducted by computational intelligent agents yielded satisfactory outcomes: after identifying the terms most commonly used by the authors in their scientific articles, text mining and specific algorithms grouped authors with similar research interests (clustering through unsupervised machine learning).
Conclusion: As a result of this research, a digital library previously built with traditional data input and display structures now includes a set of intelligent agents that perform processes beyond the trivial filters typically offered, such as searches by author, event, subject, or keywords.
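The grouping step described in the Results can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so everything below is an assumption made for illustration only: the toy corpus, the stopword list, the `top_terms`, `cosine`, and `cluster_authors` helpers, and the choice of cosine similarity with a simple single-link threshold rule as the unsupervised grouping method.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: author -> concatenated text of their articles.
ARTICLES = {
    "author_a": "digital library metadata indexing digital library retrieval",
    "author_b": "library indexing retrieval metadata digital collections",
    "author_c": "neural network training gradient neural neural optimization",
    "author_d": "gradient optimization neural network convergence",
}

# Assumed minimal stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "for"}


def top_terms(text, k=5):
    """Return the k most frequent non-stopword terms with their counts."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return Counter(dict(Counter(tokens).most_common(k)))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def cluster_authors(profiles, threshold=0.3):
    """Single-link grouping: an author joins every cluster that contains
    at least one member whose similarity reaches the threshold, and those
    clusters are merged into one."""
    clusters = []
    for author in sorted(profiles):
        linked = [
            c for c in clusters
            if any(cosine(profiles[author], profiles[m]) >= threshold for m in c)
        ]
        merged = {author}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return sorted(sorted(c) for c in clusters)


profiles = {author: top_terms(text) for author, text in ARTICLES.items()}
groups = cluster_authors(profiles)
print(groups)
```

On this toy corpus the two authors writing about libraries end up in one group and the two writing about neural networks in another. The threshold of 0.3 is arbitrary; on a real collection it would need tuning, and a weighting scheme such as TF-IDF or a standard clustering algorithm could replace the raw counts and threshold rule used here.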
License
Copyright (c) 2025 André Luiz de Castro Leal, Sanderson Nascimento Milagres Filho, Lorena Vasconcellos Oliveira Magalhaes, Weslei de Carvalho Vianna, Gizelle Kupac Vianna

This work is licensed under a Creative Commons Attribution 4.0 International License.
By submitting texts to InCID: Revista de Ciência da Informação e Documentação, the author agrees to the DOAJ terms for open-access journals adopted by the journal:
- the journal is granted the right of first publication under the Creative Commons Attribution License (CC BY 4.0), which permits accessing, printing, reading, distributing, remixing, adapting, and building upon the work, with acknowledgment of authorship.
- the version of the work published in this journal may be distributed non-exclusively, for example in institutional repositories, provided that authorship and initial publication in InCID are acknowledged.
- readers may read, download, distribute, print, and link to the full text of the files without requesting prior permission from the authors and/or editors, provided that the terms of the Creative Commons Attribution License (CC BY 4.0) are respected.
The published work is considered a collaboration; the author therefore receives no remuneration for it and is charged nothing for its publication.
The texts are the responsibility of their authors. Citations and transcriptions are permitted provided the sources are acknowledged.