A Natural History of Data


Spindle diagrams from Jack Sepkoski’s factor analysis of Phanerozoic marine taxonomic diversification. J. John Sepkoski Jr., “A Factor Analytic Description of the Phanerozoic Marine Fossil Record,” Paleobiology 7 (1981), p. 38.

A Natural History of Data examines the history of practices and rationalities surrounding data in the natural sciences between 1800 and the present. One feature of this history is the emergence of the modern digital database as the locus of scientific inquiry and practice, along with the consensus that we are now living in an era of “data-driven” science. A major component of the project, however, involves critically examining this development in order to historicize our modern fascination with data and databases. I do not take it for granted, for example, that digital databases are discontinuous with more traditional archival practices and technologies, nor do I assume that earlier eras of science were less “data-driven” than the present. The project does seek, though, to develop a more nuanced appreciation of how data and databases have come to occupy such a central place in the modern scientific imagination.

The central motivation behind this project is to historicize the development of data and database practices in the natural sciences, but it is also defined by a further set of questions, including: What is the relationship between data and the physical objects, phenomena, or experiences that they represent? How have tools and available technologies changed the epistemology and practice of data over the past 200 years? What are the consequences of the increasing economies of scale as ever more massive data collections are assembled? Have new technologies of data changed the very meaning and ontology of data itself? How have scientific representations (e.g. diagrams, graphs, photographs, atlases, and compendia) changed in conjunction with the evolution of data practices? And, ultimately, is there something fundamentally new about the modern era of science in its relationship to and reliance on data and databases?

To date, I have focused this project on the history of data in paleontology, both because it is a discipline I know very well from my past work and because paleontology has a particularly rich and fascinating history of archival and data-oriented practices. My initial findings suggest that this history is extremely complex: the data practices of 19th-century paleontologists were in many ways similar to those employed today, yet significant differences in the technology, representation, and epistemology of data separate the 19th century from the late 20th. A further task of this project is to broaden the investigation to other disciplines, such as geology/geophysics, paleoanthropology, archaeology, and biology. I am especially interested in the ways that disciplines with a historical epistemology (e.g. “natural history,” but also some of the human sciences) have evolved data practices and epistemologies.