Artificial Intelligence and the Challenges of Digital Forensic Science in the 21st Century

Rafael Padilha; Antônio Theóphilo; Fernanda A. Andaló; Didier A. Vega-Oliveros; João P. Cardenuto; Gabriel Bertocco; José Nascimento; Jing Yang; Anderson Rocha

doi:10.1590/s0103-4014.2021.35101.009

Authors

Rafael Padilha Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0003-1944-5475
Antônio Theóphilo Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0003-1408-0745
Fernanda A. Andaló Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0002-5243-0921
Didier A. Vega-Oliveros Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0001-9569-3775
João P. Cardenuto Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0002-8370-6329
Gabriel Bertocco Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0002-7701-7420
José Nascimento Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0003-3450-6029
Jing Yang Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0002-0035-3960
Anderson Rocha Universidade Estadual de Campinas. Instituto da Computação https://orcid.org/0000-0002-4236-8212

Keywords:

Digital forensic science, Artificial Intelligence, Machine learning, Social media, Fake news

Abstract

Digital Forensic Science emerged from the need to address forensic problems in the digital age. Its most recent challenge stems from the rise of social media, intensified by advances in Artificial Intelligence. The massive production of data on social media has made forensic analysis more complex, especially with the refinement of computational models capable of generating highly realistic artificial content. Artificial Intelligence techniques are therefore needed to handle this immense volume of information. In this article, we present challenges and opportunities associated with applying these techniques and provide examples of their use in real situations. We discuss the problems that arise in sensitive contexts and how the scientific community has addressed these topics. Finally, we outline future research directions to be explored.

Author Biographies

Rafael Padilha, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Contributed equally to the development of the article. Email: rafael.padilha@ic.unicamp.br / https://orcid.org/0000-0003-1944-5475.
Antônio Theóphilo, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Contributed equally to the development of the article. Email: antonio.theophilo@ic.unicamp.br / https://orcid.org/0000-0003-1408-0745.
Fernanda A. Andaló, Universidade Estadual de Campinas. Instituto da Computação

is a collaborating researcher at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: feandalo@ic.unicamp.br / https://orcid.org/0000-0002-5243-0921.
Didier A. Vega-Oliveros, Universidade Estadual de Campinas. Instituto da Computação

is a postdoctoral researcher at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: davo@unicamp.br / https://orcid.org/0000-0001-9569-3775.
João P. Cardenuto, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: phillipe.cardenuto@ic.unicamp.br / https://orcid.org/0000-0002-8370-6329.
Gabriel Bertocco, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: gabriel.bertocco@ic.unicamp.br / https://orcid.org/0000-0002-7701-7420.
José Nascimento, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: jose.nascimento@ic.unicamp.br / https://orcid.org/0000-0003-3450-6029.
Jing Yang, Universidade Estadual de Campinas. Instituto da Computação

is a PhD candidate at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: jing.yang@ic.unicamp.br / https://orcid.org/0000-0002-0035-3960.
Anderson Rocha, Universidade Estadual de Campinas. Instituto da Computação

is an associate professor at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: anderson.rocha@ic.unicamp.br / https://orcid.org/0000-0002-4236-8212.
