Signal, Symbol, Measure, Model

Xiaochang LI

An illustration depicting the envisioned design of the "phonoscribe", a voice-operated typewriter proposed by John B. Flowers. Source: Darling, Lloyd: The Marvelous Voice Typewriter: Talk to It and It Writes. In: Popular Science Monthly, July 1916.

In 1988, during a workshop on speech recognition hosted at the secluded Arden House estate in New York’s Central Valley, computer scientist Robert Mercer made an uncannily prescient observation: “There’s no data like more data.” Mercer, at the time a member of IBM’s Continuous Speech Recognition group, was describing the group’s then-unorthodox approach to speech recognition, which predicted text sequences using statistical models derived automatically from large sets of speech and text data rather than from linguistic knowledge. The remark has since become a rallying axiom for so-called “big data” or “data-driven” practices, which have today risen to prominence as privileged and pervasive forms of knowledge production. The computational techniques popularized by the Continuous Speech Recognition group at IBM have proven similarly foundational, not only becoming standard across natural language processing but also finding their way into computational modeling practices in domains as diverse as finance and bioinformatics.

This project examines the influence of speech recognition and acoustic processing on the development of machine learning and data-driven knowledge practices. It traces how the pursuit of speech recognition made language at once conceptually and technically amenable to informatics, becoming a critical stage for negotiations over the parameters of human and computational forms of knowledge. Through the latter half of the twentieth century, research in speech recognition and related language technologies helped draw a particular form of statistical thinking, one rooted in Bayesian inference and forged through signal processing, into close alliance with both popular fantasies of machine intelligence and the commercial priorities of the computing industry. In doing so, this research played a pivotal role in the development and proliferation of machine learning, data-driven analytics, and sibling algorithmic practices in both scientific endeavor and everyday life.