A Fistful of Data: Franco Moretti and the stuplime experience of big data

In the past few years, the digital humanities have become one of the sexier modes of literary study in the academy. The merging of scientific processes of inquiry and the art of subjective, qualitative study has been embodied in the work of Franco Moretti, professor of English at Stanford University and founder of the Stanford Literary Lab. Moretti is part of the vanguard of the digital humanities, striking forward with equal parts innovative method and charisma. In many ways, the digital humanities have become synonymous with Moretti’s name and his work, which has included Graphs, Maps, Trees: Abstract Models for a Literary History (2005) and Distant Reading (2013), among others. Moretti’s scientific approach to literature has sparked a discussion on big data and what it means to use quantitative studies in the humanities. His work has involved looking at thousands of titles of literature and analyzing the metadata from those titles; he has analyzed relationships between characters over time and changing title lengths during a constrained period, and has created visualization maps to illustrate his findings. The digital humanities and the use of big data to analyze literature have been critiqued in the academy, but what is our emotional reaction to this big data, as an initial instinct rather than as a slowly formed intellectual opinion? This paper will explore the use of big data in the digital humanities and how this big data results in an emotional response akin to awe and boredom. Our confrontation with big data results in the experience of what Sianne Ngai calls “stuplimity,” a term that will be further defined later, but which explains the complex emotional response to something both sublime and tedious. This response to literature’s big data is not caused by the individual texts themselves, but rather by the enormity of the data as a whole.
It also inspires a sense of reverential respect for the individual partaking in the data collection: are digital humanists the new heroes, traversing untravelled frontiers? Finally, I will consider how the collectors and “scientists” of big data are heroes of tedium, uncovering new information about the study of the humanities by means of quantitative formalism.

Franco Moretti, big data, and distant reading

Moretti’s work has received its fair share of criticism. Scholars have suggested that much of his work reveals data that is either wrong or underwhelming: his study of title length in his essay “Style, Inc. Reflections on Seven Thousand Titles” looks at titles from British novels between 1740 and 1850, ultimately revealing little more than the shifting cultural and stylistic changes in naming. The earlier titles he considered featured as many as twenty words, while the later titles had as few as one word. This is significant, he says, because the early titles served as short descriptions or summaries of the work contained, while more recent, shorter titles “adopted a signifying strategy that made readers look for a unity in the narrative structure” (“Style, Inc.”). His findings reveal other somewhat interesting aspects of shifting naming conventions – like the presence of adjectives in short titles and their relationship to narratives of unstable domesticity, rather than of adventure (see The Unfashionable Wife versus The Vampyre) – and attempt to make our processes of formal analysis better than they already are.

From “Style, Inc.”, Franco Moretti

In Moretti’s essay “Conjectures on World Literature,” also collected in Distant Reading, he discusses the vast amount of literature that exists – thousands of books in English, French, Chinese, etc. – and notes that only a small number of these books, the canon, are taught and actually read. A vast number of books have not even been read and are not remembered or ever thought of. This discussion brings to light the vastness of literature as a collection of thoughts and data, something that, as Joshua Rothman points out, is “perhaps best approached in a statistical way” (New Yorker). Rothman goes on to suggest that “By turning those books into data, and analyzing that data, you can discover facts about literature in general – facts that are true not just about a small number of canonized works but about what the critic Margaret Cohen has called the ‘Great Unread’” (New Yorker). So there may be something valuable in big data for the humanities after all. It isn’t possible to close read every title in this “Great Unread,” and so perhaps we need to approach it more distantly.

Moretti’s writing about his data findings is infused with charm and excitement, which in turn makes the reader excited about his findings; it’s certainly true that reading Moretti is an exhilarating experience, one that “invests the discipline with such an air of discovery and possibility” (LA Review of Books). His essays are paired with data visualizations, created in his literary lab, mapping the relationships between time and title length, among other such findings. These visualizations help to condense the large quantities of data into easily understandable and digestible tables, perhaps necessary for readers from a humanities background with an aversion to numerical statements. Moretti’s work is more about the method than the result, however, an approach that also results in “an eagerness to highlight his own false steps” (LA Review of Books); this scientific method of falsifiability is intended to give his successful results more authority and infuse his study with trustworthiness. Moretti’s notes on these “failed” studies leave very few of his projects intact, an aspect of the digital humanities that is borrowed right from the hard sciences.

Moretti’s approach marks an important shift in the study of the humanities and the way in which it defines its task: it is important not only to derive meaning from literature, but also to study its larger existence within history. Moretti’s “distant reading” is the pursuit of this: he aims to look at literature as a large set of individual pieces – or data points – existing in tandem with each other. Rather than zooming in on the individual texts, as close reading does, he has zoomed far out to look at them as a whole, creating a new mode of literary study. Moretti, in an attempt to analyze “national literatures as they have evolved, differentiated and cross-pollinated over several centuries,” ends up reading literature distantly in terms of “temporality rather than scope” (LA Review of Books). In a review of his collection of essays titled Distant Reading, Kathleen Fitzpatrick questions why he reads this history distantly, and why he treats literature like data rather than as representations, suggesting that his method of very distant reading is perhaps too distant, and that he engages in the “objective knowledge of world-scale systems, without the complications of the particular” (LA Review of Books). Moretti’s attempt at collecting large sets of data about different sets of literature across time and geography is essentially a vast zooming out, an overwhelming quantitative look at what the humanities have produced over the past several hundred years.

The ability to look this distantly is only now available to us because of our improved technology; never before have we been able to consider all of literature at once, to consider a collection that was previously too large for us to treat accurately or fairly. This is exciting. The power of this kind of quantitative data collection is in the new kinds of access to interpretation it provides. Moretti’s gathering up of literature is a kind of quantitative formalism that, because of its enormity and newness, leaves us with questions about its accuracy and in which ways it is, or isn’t, contributing to our understanding of literature and the humanities. Fitzpatrick aptly says that Distant Reading, and Moretti’s work on big data as a whole, “raises the question not of whether one ought to read distantly, but of what one can read only distantly, and what one requires closeness in order to capture” (LA Review of Books).

Sianne Ngai’s “stuplimity”

When we are confronted with this kind of big data, our reaction tends to be one of awe or stupefaction. The digital humanities have a tendency to draw us down as we get lost in the database, in the vastness of the data. Big data, the critics say, removes the subjective experience of individual works, reducing them to numbers in a spreadsheet: we are greeted by a wall of data, a looming, overwhelming mass, one that seems to overtake the particulars, that is too devoid of the individual to retain our attention, so that we find it as dull as an accountant’s ledger. There is no room for the individual in big data. This kind of experience is not limited to numbers or data sets, and can also be seen in examples of art or other collections of objects. Sianne Ngai’s essay “Stuplimity” in the book Ugly Feelings proposes a new term for this kind of emotional response to big data. Ngai refers to “the aesthetic experience in which astonishment is paradoxically united with boredom as stuplimity” (Ngai, 271). She further describes it as “a concatenation of boredom and astonishment – a bringing together of what ‘dulls’ and what ‘irritates’ or agitates; of sharp, sudden excitation and prolonged desensitization, exhaustion, or fatigue” (ibid, 271).

Listen to this if you want to experience stuplimity

The first part of stuplimity involves the experience of the sublime. As Ngai explains, the sublime, “conscripted to theorize an observer’s response to things in nature of great or infinite magnitude or of terrifying might, has had a revitalized cachet in what Arthur C. Danto describes as the twentieth-century avant-garde’s attempt to separate the concepts of art and beauty” (ibid, 265). While Kant’s original definition of the sublime was restricted to the experience of “rude nature,” adaptations of the term define it as a confrontation with terror or dread invoked by greatness, but “both sublimes involve an initial experience of being overwhelmed in a confrontation with totality that makes the observer painfully aware of her limitations – or at least at first” (ibid, 265). Art has invoked the sublime through repetition, in mimicking the vastness of a waterfall or mountain in nature through the creation of insurmountable concepts. An example of this can be seen in Gertrude Stein’s writings, which are intentionally recursive to the point of senselessness. Stein once said “There is no such thing as repetition. Only insistence.” An example of her repetition is in the following passage, from her poem “If I Told Him: A Completed Portrait of Picasso”:

“Shutters shut and shutters and so shutters shut and shutters and so and so shutters and so shutters shut and so shutters shut and shutters and so. And so shutters shut and so and also. And also and so and so and also.”

Other recursive works, including the music of Philip Glass – experimental in its repetitive structures – and the pop art of Andy Warhol, exemplify this “aesthetic strategy in avant-garde practices” (ibid, 262). The Kantian aesthetic sublime is then transferred to the aesthetic experience of art, of man-made creations.

Atlas, Gerhard Richter

On the subject of repetition in art, Ngai uses the example of the art installations of Ann Hamilton, whose work has included pieces composed of 16,000 teeth, 750,000 pennies, 800 men’s shirts, and floors covered in hair. She also mentions Gerhard Richter’s installation titled Atlas, which “confronts the spectator with 643 sheets displaying more than 7,000 items – snapshots, newspaper cuttings, sketches, color fields – arranged on white rectangular panels” (ibid, 263). She says that in the viewing of the installation, a collection of large quantities of objects, “the fatigue of the viewer’s responsivity approaches the kind of exhaustion involved in the attempt to read a dictionary” (ibid, 263). Another apt example comes from Janet Zweig’s computer/printer installation, titled Her Recursive Apology. The work was created by programming several computers and printers to print, “in the smallest possible type,” random apologies over and over again on continuously fed paper. The result was a dense block of text, mundane from a distance, but featuring thousands of lines of apologetic speech acts – which the artist explains is a commentary on the gendered act of apology. As Ngai notes, “For both Stein and Zweig, where system and subject converge is more specifically where language piles up and becomes ‘dense’” (ibid, 264). These installations “register as at once exciting and enervating, astonishing yet tedious” (ibid, 264). Each of the above examples is a system of organizing banal objects and stuff – from hair to words – creating a new mode of looking at the objects as a group. As Ngai says, “particulars ‘thicken’ to produce new individualities” (ibid, 264).

Her Recursive Apology, Janet Zweig

But the experience of encountering large quantities of objects or data all at once is not fully explained by the sublime. The sense of limitation and inadequacy is undermined when the self realizes its capacity for “reason as a superior faculty – one capable of grasping the totality of infinity that the imagination could not in the form of a noumenal or supersensible idea, and also revealing the self’s final superiority to nature” (ibid, 266). The sets of data – or enumerations of objects or concepts – take the place of the infinite in the sublime, the experience of which leads to “representational or conceptual fatigue, if not collapse. Such tiredness results even when the narrator subdivides the enormity of what we are asked to imagine into more manageable increments” (ibid, 274). This is stuplimity, where the sensation of the sublime leads to the dulling of the senses, and eventually to boredom, as a matter of course. If we consider Moretti’s collections of data as examples of the infinite – it is nearly impossible to imagine all of literature as a single entity, as strings of cultural trends that go on forever – we can imagine the experience that one might encounter upon a visit to Niagara Falls, or upon flipping through volumes of an accountant’s books: an initial feeling of inadequacy and perhaps awe, but also an eventual loss of interest. Those who venture to collect big data are those who seek to discover new meanings with new methods, though the act may be tedious and repetitive. Moretti, and those like him, are seeking new frontiers, uncovering the bits of information hidden beneath the mounds of unwieldy data. Big data collectors are the cowboys of the twenty-first century, going where no person has gone before.

Heroes of Tedium

David Foster Wallace once wrote that “True heroism is minutes, hours, weeks, year upon year of the quiet, precise, judicious exercise of probity and care – with no one there to see or cheer.” With some insight from passages from his novel The Pale King, we can consider how collectors of data push through the tedium of the work to uncover new ways of understanding. Moretti appears as a figure of the picaresque, a roguish hero of tedious tasks. His eagerness to uncover new historical knowledge about literature does not always result in success stories; often, Moretti’s work comes to a full stop at a dead end, and he records all of this, making note of his failures. But it is not in the results that Moretti ultimately succeeds: it is in the grandeur of his approach that Moretti gains his aesthetic power. Where the Kantian aesthetic hero once braved the sublime ocean, Moretti now braves the stuplime sea of data in the name of discovery. His persona as the maverick, as the personality behind the objective data, is important in retaining some of the humanities in his work: if we can’t have the subjectivity of the old methods of literary study, then we at least need a character to lead us through the unknown.

Unlike an accountant, Moretti is ready to be the face of faceless data, but like an accountant, he is willing to wade through the data to get there:

“You have wondered, perhaps, why all real accountants wear hats? They are today’s cowboys. As will you be. Riding the American range. Riding herd on the unending torrent of financial data. The eddies, cataracts, arranged variations, fractious minutiae. You order the data, shepherd it, direct its flow, lead it where it’s needed, in the codified form in which it’s apposite” (Wallace).

Replace “financial” with “literary” and you have Moretti: the good, the bad, and the ugly feelings.



Works Cited

English, James F., Kathleen Fitzpatrick, and Alexander R. Galloway. “Franco Moretti’s ‘Distant Reading’: A Symposium.” LA Review of Books. Web. 29 March 2016.

Moretti, Franco. Distant Reading. Verso, 2013. Print.

Moretti, Franco. “Style, Inc.” Distant Reading. Verso, 2013.

Moretti, Franco. Graphs, Maps, Trees: Abstract Models for a Literary History. Verso, 2005. Print.

Ngai, Sianne. “Stuplimity.” Ugly Feelings. Cambridge, MA: Harvard University Press, 2005. Print.

Rothman, Joshua. “An Attempt to Discover the Laws of Literature.” The New Yorker.

Wallace, David Foster. The Pale King. New York: Little, Brown and Co., 2011. Print.

2 Replies to “A Fistful of Data: Franco Moretti and the stuplime experience of big data”

  1. Daryn,

    Your essay makes a strong case for digital humanities and a distant reading of literature using the methods and ideas of Franco Moretti, without coming across as dismissive of alternative, “closer” modes of analysis. I find the conclusion that art and technology need not be isolated, and can in fact be complementary, to be very refreshing and hopeful. By drawing the connection between the sublime experience of vastness and technology, you demonstrate that art and science are linked, rather than in opposition.

    Your examples show that technology is art, and can itself be a vessel for art or a way of expressing art. As Stein and arguably any writer or poet demonstrates, language itself is a form of technology, and one which is to be experimented with. Each of the examples you cite, from music to poetry to art installations, is a “system of organizing banal objects and stuff – from hair to words” and each analyzed body of text could be such a system of, maybe not banal, but unrelated objects. Through distant reading, one could create a “new mode of looking at the objects” – or texts, or books, or stories – as a group.

    Personally, I find it admirable that Moretti applies the scientific method to literary studies, incorporating all results of experiments, even those that contradict Moretti’s intentions. You make note of this as well, writing that:

    “Like any good picaresque hero, Moretti takes his setbacks in stride, and never gives up on the idea of progress. The very fact that his ingenious models can be knocked to pieces, and his hard-won results summarily falsified, serves as a kind of negative validation, a sign that he is on the right track, practicing a genuine science of literary form, anchored — unlike the softer, safer, merely interpretative work of more tractable critics — to the bedrock of empirical evidence.”

    This statement exemplifies the importance of Moretti’s contribution to the realm of digital humanities. What’s unique about Moretti’s work is the application of the scientific method that honours the importance of falsifiability, thus aligning Moretti’s work with that of “hard” scientists, and bridging the gap between the humanities and other disciplines.

    Your inclusion of Kathleen Fitzpatrick’s review, in which she criticizes Moretti’s work as ignorant of the representative importance of literature, directly points to the main argument against the digital humanities. I’m glad that you address this near the beginning of the paper, as it is likely the reaction of many readers. Fitzpatrick positions the analysis of literature as an issue of either/or, arguing that Moretti “treats literature like data rather than as representations.” What is unclear to me is why the data itself is not a representation. James English’s review also discusses this reverence toward data, and the potentially problematic influence of this, rooted in the fear of the infection of the arts by science and technology. While we should be wary of an over-reliance on any single source of data, I think that within the humanities, we need to overcome our fear of data, recognize that this is another form of literary analysis, and work to incorporate it into our broader exploration of literature.

    Finally, a common critique of Moretti’s work, and digital humanities overall, is the triviality of it. Many ask what the point of distant reading is, and to what end we are collecting this information. The point, to me at least, based on your discussion of his work, is the “idea of progress” noted in the above quote. We as publishers, bibliophiles, and artists should embrace progress rather than fight it, as fighting it suggests that art is weak or vulnerable, and I believe that we can all agree that it is not in the least.

  2. This is a terrific piece for so many reasons, but especially because your use of the concept of stuplimity as a lens through which to understand DH works extremely well. The analogy is not only apt, but it offers the opportunity to deconstruct what is happening when you aggregate all the texts and try to view them as a whole.
