1 November 2016
With the rise of technology and computer algorithms in all aspects of modern society, the collection of user data has become not only a possibility but an inevitability. For years now, marketers have used data to target their campaigns, understand their audience, and cater to their customers’ needs. As publishing moves more and more onto online platforms, the question arises of how big data can or should be applied in a creative entertainment industry. This is more than a practical question; it is also a moral and ethical one, because books carry immense cultural capital in our society. To many, a book created with user data in mind is not a pure creative expression but a soulless product that caters to lowbrow tastes. But in the semi-chaotic market of bookselling, where entire companies can be raised or ruined by a single title’s unpredictable performance, the temptation to replace intuition with mathematics is difficult to resist.
Despite being a relatively recent phenomenon, e-reader data collection can already track a large number of metrics, including “How far you read in a book and how fast […] what books you buy, […] your reading habits, what part of a story turns you off and makes you want to stop reading, when you read, how fast you read certain parts of books, and even what device you read them on” (Lambert). Digital tracking also makes it possible to always “know where your readers are (geographically) and how that might influence their reading habits and even what they read” (Lambert). Beyond retailers, subscription services such as Scribd and the now-defunct Oyster also track subscriber activity, primarily to judge completion of a book for scaled pricing (Howard). Effectively, every digital reading platform has the legal freedom to collect, analyze, and utilize user data, opening doors that simply did not exist in print.
Previous tracking of reader preferences relied on intuition and vague market data: “In the past, before digital reading, publishers had at hand the blunt instrument of units sold and could draw inferences by analyzing sales by region and broad demographics, and then anecdotally what people, or reviewers anyway, thought of the content” (Kobo). For an industry with an uncertain future, more sophisticated tracking is an opportunity that cannot be passed up. Alexandra Alter reinforces this need to advance, saying that “Publishing has lagged far behind the rest of the entertainment industry when it comes to measuring consumers’ tastes and habits. TV producers relentlessly test new shows through focus groups; movie studios run films through a battery of tests and retool them based on viewers’ reactions.” A dependence on experience, hunches and “postmortem measure[s] of success [that] can’t shape or predict a hit” (Alter) seems to be another symptom of the publishing industry’s difficulty in abandoning outdated traditions.
As with any form of private data collection, consumers are wary of how their information is being collected and used. In 2014, Adobe drew ire when it was discovered that its Digital Editions e-reader was finding and transmitting data from users’ libraries back to Adobe servers in unencrypted plain text (Gallagher). Gallagher also suggests that Adobe “may be in violation of a recently passed New Jersey Law, the Reader Privacy Act” as well as “The American Library Association’s Code of Ethics.” Nate Hoffelder, who originally discovered and spread word of the security concern, describes the situation as “spying on users” and a “massively boneheaded stupid mistake,” emphasizing just how sensitive a subject online privacy is to consumers. While this data is valuable to publishers, transparency and security need to be of utmost importance lest they find their brand’s reputation stained by a data leak or unexplained overreach.
Kobo’s white paper “Publishing in the Era of Big Data” concludes that “We are at the very earliest stages of the possible when it comes to applying Big Data to the publishing world but even with these relatively simple tools, much can be learned to benefit overall business” (11). Publishers have already begun using the troves of data available to inform their business decisions and are experimenting with more creative uses. At the most basic level, readership analysis can be used to judge a book’s performance with more depth and detail than previous methods allowed. One application is reviving older titles: “Perhaps the most compelling use of ebook tracking data could be used to give backlist a boost. Kobo highlights an unnamed book that has high user engagement but low sales, meaning most people read it all of the way through, but not too many people are buying it in the first place” (Howard). Using engagement (completion of a book and time spent reading) as a proxy for quality, Kobo and others hope to find great books that were forgotten due to poor marketing or positioning and give them a second chance at success. Tracking engagement for a particular author allows publishers to see more clearly how that author’s titles are performing, which can help determine whether to sign the author for more books and how large an advance to offer. Kobo (5) uses the example of tracking readership across an entire series to see where engagement peaked (and then determine why) and to decide when it is time to change the formula or bring the series to a close.
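The backlist idea described above, high engagement paired with low sales, can be illustrated with a minimal sketch. This is not Kobo’s method: the `TitleStats` record, the `backlist_candidates` function, the sample figures, and both thresholds are hypothetical, and completion rate stands in for engagement only for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class TitleStats:
    """Hypothetical per-title figures; not a real retailer data format."""
    title: str
    units_sold: int
    readers_started: int
    readers_finished: int

    @property
    def completion_rate(self) -> float:
        # Share of readers who started the book and also finished it.
        if self.readers_started == 0:
            return 0.0
        return self.readers_finished / self.readers_started


def backlist_candidates(titles, min_completion=0.7, max_units=1000):
    """Return titles with high completion but low sales, best first.

    Both thresholds are illustrative assumptions, not industry figures.
    """
    flagged = [
        t for t in titles
        if t.completion_rate >= min_completion and t.units_sold <= max_units
    ]
    return sorted(flagged, key=lambda t: t.completion_rate, reverse=True)


# Invented sample catalog for illustration only.
catalog = [
    TitleStats("Title A", units_sold=50000, readers_started=20000, readers_finished=8000),
    TitleStats("Title B", units_sold=800, readers_started=500, readers_finished=450),
    TitleStats("Title C", units_sold=600, readers_started=400, readers_finished=120),
]

picks = [t.title for t in backlist_candidates(catalog)]
print(picks)
```

In this invented catalog only “Title B” is flagged: it sells poorly, yet most of the readers who start it finish it, which is the profile Howard describes as a candidate for a second marketing push.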
At a more profound level than sales decisions, however, some publishers and authors are wondering how big data can be incorporated directly into the creative process. Jodie Archer and Matthew Jockers have created an algorithm that can read and analyze a book, then use the trends found in this data to determine common qualities among bestsellers (Althoff). They suggest that these trends can be used to identify future blockbusters and show publishers where to spend their time and resources. Some fear, though, that Archer and Jockers’ blockbuster algorithm “Can homogenize the market or try and somehow take [editors’] jobs away from them” (Archer qtd. in Althoff). This fear reflects a general anxiety surrounding the use of readership data in publishing. Lynn Neary writes, “The idea that data collected from e-readers might be used by publishers to improve a writer’s work strikes [author Jonathan] Evison as wrong.” Jonathan Galassi, president of Farrar, Straus & Giroux, echoes the thought: “The thing about a book is that it can be eccentric, it can be the length it needs to be, and that is something the reader shouldn’t have anything to do with […] We’re not going to shorten ‘War and Peace’ because someone didn’t finish it” (qtd. in Alter). Most writers and publishers seem to agree that while collected data is intriguing and can and should inform marketing decisions, it has no place in the creative process. This opinion is not unanimous, however, and some authors, like Scott Turow, disagree: “I would love to know if 35 percent of my readers were quitting after the first two chapters […] because that frankly strikes me as, sometimes, a problem I could fix” (qtd. in Neary).
Because ebooks and the collection of big data are so new, only time will tell how deeply they become ingrained in the publishing process. Althoff points to “A larger movement in the publishing industry to replace gut instinct and wishful thinking with data.” In a field as typically conservative and slow to adapt as publishing, it is encouraging to see that marketers have already explored creative uses for reader data and begun implementing them, much as supermarkets, online retailers, and the like have done. But when it comes to the art of writing and the creative process, big data is more of a double-edged sword. A rift may be forming between those who view writing as an independent outlet that cannot be influenced by commercial demand and those who take a more economic approach. For years, industries such as film, music, and television have created works specifically to meet consumers’ wants, and in the uncertain market of publishing, data collection may be the most practical solution.