Assessing the validity of ChatGPT-4o and Google Gemini Advanced when responding to frequently asked questions in endodontics
DOI:
https://doi.org/10.1590/
Keywords:
Artificial intelligence, ChatGPT, Endodontics, Google Gemini, Large language models
Abstract
Artificial intelligence (AI) is transforming access to dental information via large language models (LLMs) such as ChatGPT and Google Gemini. Both models are increasingly being used in endodontics as a source of information for patients. Therefore, as developers release new versions, the validity of their responses must be continuously compared to professional consultations. Objective: This study aimed to evaluate the validity of the responses provided by the most advanced LLMs [Google Gemini Advanced (GGA) and ChatGPT-4o] to frequently asked questions (FAQs) in endodontics. Methodology: A cross-sectional analytical study was conducted in five phases. The top 20 endodontic FAQs submitted by users to chatbots and collected from Google Trends were compiled. In total, nine academically certified endodontic specialists with educational roles scored GGA and ChatGPT-4o responses to the FAQs using a five-point Likert scale. Validity was determined using high (4.5-5) and low (≥4) thresholds. Fisher's exact test was used for comparative analysis. Results: At the low threshold, both models obtained 95% validity (95% CI: 75.1%-99.9%; p=.05). At the high threshold, ChatGPT-4o achieved 35% (95% CI: 15.4%-59.2%) and GGA, 40% (95% CI: 19.1%-63.9%) validity (p=1). Conclusions: ChatGPT-4o and GGA responses showed high validity under lenient criteria that significantly decreased under stricter thresholds, limiting their reliability as a stand-alone source of information in endodontics. While AI chatbots show promise to improve patient education in endodontics, their validity limitations under rigorous evaluation highlight the need for careful professional monitoring.
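The interval estimates and the model comparison reported in the Results can be illustrated with a short sketch. It is not the authors' analysis code: the counts of valid responses (19, 7 and 8 out of 20) are inferred from the reported percentages, and the choice of an exact (Clopper-Pearson) binomial interval is an assumption that happens to be consistent with the reported limits. The function names are hypothetical, and only the Python standard library is used.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for the proportion k/n."""

    def solve(tail, target):
        # tail(p) increases with p; bisect for the p where tail(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= k | p) = alpha/2. Upper limit: P(X <= k | p) = alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x)

    observed = weight(a)
    support = range(max(0, c1 - r2), min(r1, c1) + 1)
    # Sum every table that is no more likely than the observed one.
    extreme = sum(weight(x) for x in support if weight(x) <= observed)
    return extreme / comb(r1 + r2, c1)


if __name__ == "__main__":
    n_faqs = 20  # number of FAQs stated in the abstract
    # Counts of valid responses, inferred from the reported percentages (assumption).
    valid = {"both models, low": 19, "ChatGPT-4o, high": 7, "GGA, high": 8}
    for label, k in valid.items():
        lo, hi = clopper_pearson(k, n_faqs)
        print(f"{label}: {k}/{n_faqs} = {k / n_faqs:.0%}, "
              f"95% CI {lo:.1%} to {hi:.1%}")
    # High threshold: ChatGPT-4o (7 valid, 13 not) versus GGA (8 valid, 12 not).
    print("Fisher p (high threshold):", fisher_exact_two_sided(7, 13, 8, 12))
```

With only 20 questions per model, the exact intervals are wide, which is why a five-point difference between 35% and 40% is not distinguishable by Fisher's exact test.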
License
Copyright (c) 2025 Nicolás Dufey-Portilla, Ana Billik Frisman, Maximiliano Gallardo Robles, Fernando Peña-Bengoa, Consuelo Cabrera Ávila, Venkateshbabu Nagendrababu, Paul M. H Dummer, Marc Garcia-Font, Francesc Abella Sans

This work is licensed under a Creative Commons Attribution 4.0 International License.
All journal content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC-BY) license.