New perspectives for the Annelida collection (National Museum/UFRJ) database: using data visualization to analyze and manage biological collections

Authors

  • Camila Simões Martins de Aguiar Messias
  • Carlos Cesar de Oliveira Fonseca
  • Monique Cristina dos Santos
  • Asla M. Sá
  • Joana Zanol

DOI:

https://doi.org/10.1590/

Keywords:

Polychaetes, Biological collection, Management, Interactive visual representations

Abstract

Collection management faces many challenges in keeping stored items preserved and the information associated
with them accurate and organized. For these biodiversity repositories to expand and be used, it is essential that their
databases are unambiguous and that errors are quickly identified and corrected. This work aims to show the use of
interactive visual representations (IVRs) of the collection’s metadata as tools to inspect the data and help solve
these challenges. To do this, we used the Annelida collection database from the National Museum (MN) of the
Federal University of Rio de Janeiro (UFRJ). Interactive graphs of the metadata within this database (catalog
date, taxonomic identification and determiners, sampling, depth, geographic localization, and collector data) were
created with the Altair library in the Python 3 language. Data analyses using these graphs made it possible to
identify anomalous patterns in the data and fill in missing records. They also provided an understanding of the
spatial and bathymetric distribution of the specimens deposited over time, and the growth rate of the collection in
each family, which supports projections of future growth and planning for the physical organization of vials. Graphs
support the management of collections with digital entry forms and help make the metadata associated with
cataloged specimens more readily available. IVRs can also be used to give credit to the researchers involved in
building biological collections. Visualization tools are thus efficient at recognizing global patterns in databases and
at solving biological collection management tasks.

Annelida collection management with data visualization

Ocean and Coastal Research 2024, v72(suppl 1):e24016

Messias et al.

Published

2024-06-11

Section

Original Article

How to Cite

New perspectives for the Annelida collection (National Museum/UFRJ) database: using data visualization to analyze and manage biological collections. (2024). Ocean and Coastal Research, 72(Suppl. 1). https://doi.org/10.1590/