The Infrastructures of Sequence Data in Biology

Hallam Stevens

This project provided two examples of the very specific ways in which data is tied to the technologies and practices of computing and information technology. The examples suggest that big data cannot exist outside of these computational infrastructures. First, it described Walter Goad's work at Los Alamos, which ultimately led to the development of the GenBank database. Here the use of data arose from specific kinds of computational practices that were originally developed for weapons work. Second, it examined some of the work of building and updating the Ensembl database, run by the European Bioinformatics Institute. In this case, the generation, structuring, and use of the data are tied to the technologies and practices of the World Wide Web. The examples highlight some of the novelties of big data and big data practices: not only its size, but also how it is manipulated and used through numerical methods, relational database structures, machine learning, hyperlinking, and topological analysis. The project suggested that this novelty warrants new social science approaches to studying data, approaches that allow us to follow it inside machines, software, and databases. That is, we need to supplement material culture approaches with data culture approaches.

Max-Planck-Institut für Wissenschaftsgeschichte (Max Planck Institute for the History of Science)
