Publishing and privacy: why publishers should back personal data services

Abstract

Publishers, like other corporations, will have to develop a strategy that protects users’ data if they want to maintain a relationship of trust. As I will argue in this essay, publishers are ideally placed to support initiatives that preserve privacy because of reading’s historical link with interiority and solitude, and because their business model is based on the idea of information as valuable. In this essay, I will outline one tool for preserving privacy, the Personal Data Service (PDS).  These services protect individuals’ privacy, while allowing corporations access to better and richer data. They create a sense of transparency that allows corporations to interact with people in a more meaningful and value-creating way.

I will argue that publishers should support PDSs. First, they allow publishers potential access to a wealth of verified, trusted data. Second, they create a private space for readers who choose to read without being tracked. Third, they parallel what publishers are already doing by creating ways for people to set and control access to information.

Context

If horror movies are the expression of our collective fears (Phillips 2005), then Unfriended is the perfect case study. The action unfolds entirely on-screen, following the interactions of five friends on Skype, Facebook chat and text messages–all of which are being manipulated by an unknown stranger. As Spotify’s suggestions grow steadily creepier, gory deaths ensue (Gingold 2014). Blood splatter aside, Unfriended perfectly captures a zeitgeist of fear about information technology. The victims’ intimate encounters are watched and recorded by a sinister, faceless entity. Their inability to carve out a private space reveals a cultural anxiety about being tracked, manipulated, and exposed online.

This anxiety is not only expressed in horror movies: a 2015 Pew study found that 91% of Americans feel they have lost control over corporations’ use of their personal information online (Madden 2015). And they have good reason to feel this way. Facebook founder and CEO Mark Zuckerberg declared in 2010 that privacy was no longer a “social norm,” to much criticism (Johnson 2010). Taxi start-up Uber came under fire for allowing employees a “God view” of individuals’ movements in real time (Timberg 2014). Online services store massive amounts of “anonymized” data that is not really anonymous–one study found that 95% of people could be identified using only four location points, even when the data was of low accuracy (de Montjoye et al. 2013). And the 2013 Edward Snowden revelations showed that data is not only being used by corporations: governments are aggregating and analyzing massive quantities of information to build “a pattern of life” of anyone even loosely associated with suspected terrorists (Davis et al. 2013).

A growing number of people are asking whether their personal data belongs in corporations’ hands. “There is a growing view… that data is a personal asset,” says Alan Mitchell, a UK strategic advisor on privacy. “The full potential value can only be realised if individuals are able to control what personal information they share with who, for what purposes, under what terms and conditions; and if they can realise the benefits (including financial benefits) of doing so” (Mitchell 2012). A World Economic Forum report adds that data ownership should be thought of in terms of old English common law: as the right to possess, use, and distribute, rather than as physical ownership (Dutta and Mia 2009). Individuals should be allowed to control who accesses their data and why, as well as to come up with new uses for it.

The problem is that personal data is siloed across dozens of websites. Users must navigate confusing and constantly shifting End User License Agreements (EULAs) to discover what information is being collected about them. Their consent is passive: if they don’t agree to the terms of use, their only option is to avoid using the service. When it comes to services such as Facebook, LinkedIn, or Twitter, which are used by millions of people worldwide to network and socialize, the decision to abstain can affect professional and social life (Steinfield et al. 2009, Burke et al. 2010, Kim and Lee 2011). Consenting to EULAs or abstaining from a service are not adequate choices: people need a regulated, safe way to set privacy levels they are comfortable with.

Personal Data Service

Enter the Personal Data Service (PDS). PDSs are based on the idea that individuals own their data and should be allowed to control the flow of their personal information. The PDS stores or aggregates data, displays it to the user, and allows users to download and share it in machine-readable format (Reed 2010). Users set their own terms for who can access their data and why. This affords the user what Helen Nissenbaum terms “context-relative informational norms”: the ability to share data with appropriate parties only (Nissenbaum 2009). People willingly share financial information with their bank, and medical information with their doctor, but aren’t comfortable telling their bank manager about their bunions or their doctor about their student debt. Sharing information in context creates a space where people feel more secure about their privacy (Nissenbaum 2009). At the same time, PDSs give companies access to a richer combined data set, including Volunteered Personal Information (VPI) (Mitchell 2012). A study into PDS use found that people share 12% more information when they are explicitly told how their data will be used (Ctrl Shift 2014). Consumers can create a single online identity with their likes and dislikes, allowing corporations to give them more valuable and relevant offers. In other words, companies can interact with them as individuals rather than as a demographic. By providing a safe space for individuals to store their sensitive data, PDSs benefit both companies and individuals.

So how do they work? To start, the PDS provider must access personal data in some way. Some may simply aggregate data stored by online services and display it to the user as a dashboard or in a database format (Ctrl Shift 2014). However, more robust implementations will store the data on a central or personal server: “given the huge number of data sources that a user interacts with on a daily basis, interoperability is not enough. Rather, the user needs to actually own a secured space, a Personal Data Store acting as a centralized location where his data live” (openPDS). In this implementation, the PDS acts as a buffer between the web service and the end user. It captures any user-generated raw data (such as GPS location, form entries, or preferences) and stores it securely.

The web service never accesses this captured data directly. Instead, it sends a query to the PDS and the PDS sends back an answer (de Montjoye 2014). For example, Netflix may want to know whether to recommend House of Cards or Star Trek to you. It will send a request to the PDS with code that uses some combination of demographic information, viewing habits, and geospatial location to predict what you would prefer. The PDS evaluates whether the information requested fits in with the privacy preferences you have set. If it does, it sends back the result to Netflix: in this case, “House of Cards” or “Star Trek.” Netflix need never know the fine-grained information that led to this result. In another case, Netflix may run similar code against multiple users’ aggregated data to draw conclusions about an entire population. In either case, “the dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information” (de Montjoye 2014). Companies can continue to use personal data while respecting individuals’ privacy.
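The question-and-answer exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the openPDS implementation: the `Query` and `PersonalDataStore` classes, the field names, and the `recommend` function are all hypothetical, invented here to show how a store might check a request against the user's preferences and release only a low-dimensional answer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class Query:
    """A question sent by a web service (hypothetical shape, not a real API)."""
    service: str                                  # who is asking
    needs: Set[str]                               # raw fields the code wants to read
    compute: Callable[[Dict[str, object]], str]   # code that runs inside the store
    allowed_answers: Set[str]                     # the only values that may leave the store


@dataclass
class PersonalDataStore:
    """Holds raw data and the user's privacy preferences (illustrative only)."""
    raw_data: Dict[str, object]
    # service name -> set of fields the user lets that service query
    preferences: Dict[str, Set[str]] = field(default_factory=dict)

    def answer(self, query: Query) -> Optional[str]:
        permitted = self.preferences.get(query.service, set())
        # Refuse the whole query if it asks for anything the user has not allowed.
        if not query.needs <= permitted:
            return None
        # The service's code sees only the requested fields, never the full store.
        visible = {k: self.raw_data[k] for k in query.needs if k in self.raw_data}
        result = query.compute(visible)
        # Only a short, pre-declared answer is released, not the underlying data.
        return result if result in query.allowed_answers else None


def recommend(data: Dict[str, object]) -> str:
    """Assumed recommendation rule: pick the show matching the most-watched genre."""
    hours = data["viewing_hours_by_genre"]
    return "House of Cards" if hours.get("drama", 0) >= hours.get("sci-fi", 0) else "Star Trek"


store = PersonalDataStore(
    raw_data={
        "viewing_hours_by_genre": {"drama": 12, "sci-fi": 5},
        "gps_trace": [(49.28, -123.12)],
    },
    preferences={"netflix": {"viewing_hours_by_genre"}},
)

query = Query("netflix", {"viewing_hours_by_genre"}, recommend,
              {"House of Cards", "Star Trek"})
nosy_query = Query("netflix", {"gps_trace"}, lambda d: str(d), {"House of Cards"})

print(store.answer(query))       # the service learns a title, nothing more
print(store.answer(nosy_query))  # None: location was never shared with this service
```

In this sketch the service receives a single title and never sees the viewing hours behind it, while a request touching a field the user has not shared is refused outright. A real PDS would also need authentication, sandboxing of the submitted code, and checks on aggregated answers, none of which are modelled here.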

PDS implementation (de Montjoye 2014)

Like a bank, the PDS enters into a binding legal contract with its customers, pledging to protect their data from unwanted access. Mydex, for example, legally structured its business as a nonprofit organization that could never be acquired by a corporation or government (Mydex). “We knew trust was absolutely paramount,” a spokesperson states (Mydex). Like banks, in order to compete amongst themselves, PDSs will need to prove to their clients that they are trustworthy. This means keeping personal information safe from attack and exploitation: for example, by including checks to make sure that the information returned to service queries is sufficiently anonymous, and identifying and blocking untrustworthy requests (de Montjoye 2014).

Test runs of PDSs have been a success: 81% of beta users of one service, openPDS, said they would use it in their personal life (de Montjoye 2014). Despite this, adoption is slow. “Deployment on a large-scale is a chicken-and-egg problem; users are waiting for compatible services while services are waiting for user adoption,” de Montjoye says. Without wide consumer demand, companies are unlikely to give up control of their databanks. However, political support in combination with technological advancement may be enough to spur change (de Montjoye 2014).

That support may be coming. In 2011 the UK launched its voluntary Midata program, asking corporations to put data back into the hands of users (Midata 2014). In 2012, the European Commission proposed a reform of data protection, affirming “individuals’ right to be forgotten, to have easier access to their data, and to be able to easily transfer them” (openPDS). In response to this changing view of privacy, dozens of PDSs have sprung up around the world, gathering millions of dollars each in financial backing (Ctrl Shift 2012).

PDS and Publishers

With PDSs on the rise, publishers need to pay attention. As content creators who often rely on advertising and market data, publishers can benefit by becoming early backers of this service. PDSs allow publishers to access more data about their customers, levelling the playing field between publishers and Amazon. They create a sense of privacy for the reader, preserving the mystique of picking up a book. And they are rooted in a concept of ownership of information, which aligns nicely with publishers’ ideals.

Data and marketing

With the rise of PDS, publishers will have new advantages. Currently, traditional publishers are disadvantaged when it comes to gathering data because booksellers act as intermediaries between them and their consumers. PDSs will give publishers access to a wealth of information from different streams about what their users want and like. “The proposed framework removes barriers to entry for new businesses, allowing the most innovative algorithmic companies to provide better data-powered services,” de Montjoye writes of openPDS (de Montjoye 2014). Publishers, who take on the financial risk of backing a book with little data (Dunlop 2015), will finally have the same advantages as Amazon and Kobo.

Privacy

Beyond the commercial benefits of PDS, the values it serves to uphold–privacy and ownership of information–could align with publishers’ ethos and re-establish their foothold in the community. PDSs are built on a desire for individuals to maintain privacy by controlling access to their information, and publishers have an intimate relationship with privacy. In fact, some argue that privacy coevolved with the technology of print. Jagodzinski notes that the traditional definition of “privacy” was negative: it denoted an individual who, through the absence of public office, had no power to lead his community (Jagodzinski 1999, 23). As books became portable and abundant, literacy rose and with it, silent reading. People were able to absorb information in solitude rather than through public conversation. A sense of interiority developed, and the word “privacy” evolved to take on a more positive meaning (Jagodzinski 1999). Spacks elaborates on this, connecting reading with “individual fantasy,” “withdrawal from the public sphere,” and “the opportunity to explore and solidify the self” (Spacks 2003, 28–29). Such outcomes are only possible when reading takes place in a relatively private space–not private in the sense of solitary, but in the sense of having an unwatched space to think about and process the text. This privacy allows people to explore new ideas free from outside judgement.

But reading is no longer a private activity: it is now yet another way to gather data. Ebook vendors such as Kobo track not only which books are bought, but when and how often they’re read. The information is accurate enough to track reader engagement chapter by chapter and even page by page (Kobo 2014). Online, websites store incredibly detailed information about what you read, building a profile of you from page to page. Academic journals are no better: 16 out of the top 20 research journals allow trackers to spy on their readers (Hellman 2015). Many users do not realize to what extent they are being watched. “The psychological privacy afforded by communication channels may lull users into a false assumption of informational privacy,” Walther writes (Walther 2011, 4). But as awareness grows (Madden 2015), users will employ PDSs to limit when and how they are tracked. With a PDS, data collection is transparent to the user and sufficiently anonymous to the tracker. Readers will be able to build a private space online that mimics that of print.

Copyright and access

The advent of PDS will also change the way people relate to information as property. Proponents of open access to information have long argued that “information wants to be free” (Clarke 2000). However, most people are less comfortable with free access to their own information. In a culture where so much content is available for free online, the new slogan is, “If you are not paying for it, you’re the product being sold” (Fitzpatrick 2010). “Free” information is paid for in advertising dollars spent by companies trying to reach and track engaged audiences.

Books are no exception. In the past, Nakamura says, we paid for books but our conversations about them were free. Now that paradigm has shifted: “today books are free through Google Books and Internet Archive and, much to the consternation of publishers, through torrent sites like Pirate Bay and Media Fire, but we pay to create readerly communities on social networks like Goodreads. We pay with our attention and our readerly capital, our LOLs, rankings, conversations, and insights” (Nakamura 2013, 7). User-created data is an indirect payment. Even with print books, customers trade their data for rewards and discounts. Booksellers’ use of data is cast in terms of labour and production: “each transaction customers make using their loyalty cards produces valuable data for these booksellers. In effect, they are outsourcing the costly labor of market research to their most loyal customers, who ironically buy back the labor they’ve freely given with each subsequent purchase” (Striphas 2010). Information exchange becomes a grim commerce, where the customer is the worker, product and buyer. In such exchanges, information is anything but free: it is heavily commodified. Content is still paid for, although the payment is invisible to many users.

PDSs, on the other hand, spring from the idea of data as a form of personal property. The first American text to argue for the right to privacy grounds it in property law, similarly to notions of copyright: “The right of property in its widest sense, including all possession, including all rights and privileges, and hence embracing the right to an inviolate personality, affords alone that broad basis upon which the protection which the individual demands can be rested” (Warren and Brandeis 1890). The right to an “inviolate” self is a form of property over all private information about that self. What determines whether information is private? To Warren and Brandeis, information is private until it is published: “The common law secures to each individual the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others… The right to privacy ceases upon the publication of the facts by the individual, or with his consent” (Warren and Brandeis 1890). Publication is seen as the boundary at which individuals relinquish control over their information.

With the advent of digital publishing, however, “publication” is no longer as clear cut. Are you “publishing” when you post to a Facebook profile that is restricted to friends? When you send an email through an online service, is it still private if it’s filtered through an algorithm that builds a profile of you? PDS erases these distinctions by allowing individuals to decide for themselves when their information is “published” and when it is private. In effect, individuals act as their own publishers, curating and editing their data and setting terms for access. In some PDS implementations, they are even allowed to set a price for access to their data. This way of looking at information bodes well for publishers and content providers in general. Information can still be “free,” in the sense of accessible to anyone. But publishers may be able to find a price that captures the value of the content they provide. If people recognize their own information as valuable, they will begin to view others’ information as valuable too (Lanier 2013). This could be good news for those segments of the publishing industry that have not yet figured out a way to make money off free content.

Conclusion

Personal data services have yet to catch on, but the problem they address is real. Movies like Unfriended are just the tip of the cultural iceberg. A sense of anxiety pervades the digital space, as people worry about the permanence of their digital footprint and about the ways they are being tracked. Addressing privacy concerns online is not just an issue for publishers; it’s a social issue–but it’s one that publishers have a vested interest in. The commercial benefits of opening data access to publishers are enormous, and could help tremendously to identify and reach new markets. In a world where people feel “digitally crowded” (Joinson 2011), publishers could open up a cool oasis of privacy, allowing readers to explore new territory without fear of surveillance. And if publishers want consumers to treat content as valuable, they can begin by treating consumers’ information in the same way. Publishers will need to take a long, hard look at digital privacy and how it fits in with their vision for the future.

References

Burke, Moira, Cameron Marlow, and Thomas Lento. “Social Network Activity and Social Well-Being.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1909–12. ACM, 2010. http://dl.acm.org.proxy.lib.sfu.ca/ft_gateway.cfm?id=1753613&ftid=770043&dwn=1&CFID=496986671&CFTOKEN=32475132.

Clarke, Roger. “Information Wants to Be Free.” Roger Clarke, 2000. http://www.rogerclarke.com/II/IWtbF.html.

Davis, Kenan, Nadia Popovich, Kenton Powell, Ewen MacAskill, Ruth Spencer, and Lisa van Gelder. “NSA Files Decoded: Edward Snowden’s Surveillance Revelations Explained.” The Guardian, November 1, 2013. http://www.theguardian.com/world/interactive/2013/nov/01/snowden-nsa-files-surveillance-revelations-decoded#section/6.

De Montjoye, Yves-Alexandre, Cesar A. Hidalgo, Michel Verleysen, and Vincent D. Blondel. “Unique in the Crowd: The Privacy Bounds of Human Mobility.” Sci. Rep. 3 (March 25, 2013). doi:10.1038/srep01376.

De Montjoye, Yves-Alexandre, Erez Shmueli, Samuel S. Wang, and Alex Sandy Pentland. “openPDS: Protecting the Privacy of Metadata through SafeAnswers.” Edited by Tobias Preis. PLoS ONE 9, no. 7 (July 9, 2014): e98790. doi:10.1371/journal.pone.0098790.

Dunlop, Laura. “Accessing Big Data: The Key to Publishers Taking Back the Power.” PUB 802: Canadian Centre for Studies in Publishing, SFU, February 27, 2015. http://tkbr.publishing.sfu.ca/pub802/2015/02/accessing-big-data-the-key-to-publishers-taking-back-the-power/.

Dutta, Soumitra, and Irene Mia. “Global Information Technology Report 2008-2009.” World Economic Forum, 2009. http://hd.media.mit.edu/wef_globalit.pdf.

Fitzpatrick, Jason. “If You’re Not Paying for It; You’re the Product.” LifeHacker, November 23, 2010. http://lifehacker.com/5697167/if-youre-not-paying-for-it-youre-the-product.

Gingold, Michael. “‘UNFRIENDED’ (aka ‘CYBERNATURAL’; Fantasia Movie Review).” Fangoria, July 21, 2014. http://www.fangoria.com/new/unfriended-cybernatural-fantasia-movie-review/.

Hellman, Eric. “16 of the Top 20 Research Journals Let Ad Networks Spy on Their Readers.” Go To Hellman, March 12, 2015. http://go-to-hellman.blogspot.ca/2015/03/16-of-top-20-research-journals-let-ad.html.

Jagodzinski, Cecile M. Privacy and Print: Reading and Writing in Seventeenth-Century England. University of Virginia Press, 1999.

Johnson, Bobby. “Privacy No Longer a Social Norm, Says Facebook Founder.” The Guardian, January 11, 2010, sec. Technology. http://www.theguardian.com/technology/2010/jan/11/facebook-privacy.

Joinson, Adam N., David J. Houghton, Asimina Vasalou, and Ben L. Marder. “Digital Crowding: Privacy, Self-Disclosure, and Technology.” In Privacy Online, 33–45. Springer, 2011.

Kim, Junghyun, and Jong-Eun Roselyn Lee. “The Facebook Paths to Happiness: Effects of the Number of Facebook Friends and Self-Presentation on Subjective Well-Being.” Cyberpsychology, Behavior, and Social Networking 14, no. 6 (2011): 359–64. doi:10.1089/cyber.2010.0374.

Lanier, Jaron. Who Owns the Future? First Simon & Schuster hardcover edition. New York: Simon & Schuster, 2013.

Madden, Mary. “Privacy and Cybersecurity: Key Findings from Pew Research.” Pew Research Center, January 16, 2015. http://www.pewresearch.org/key-data-points/privacy/.

Midata Voluntary Programme: Review. Consumer Protection. UK: Department for Business, Innovation & Skills, July 8, 2014. https://www.gov.uk/government/publications/midata-voluntary-programme-review.

Mitchell, Alan. “Personal Data Stores Will Liberate Us from a Toxic Privacy Battleground.” Wired UK, May 30, 2012. http://www.wired.co.uk/news/archive/2012-05/30/ideas-bank-personal-data-stores.

Nakamura, Lisa. “‘Words with Friends’: Socially Networked Reading on Goodreads.” PMLA 128, no. 1 (2013): 238–43.

“New Market for ‘Empowering’ Personal Data Services ‘Will Transform Relationships between Customers and Brands.’” Ctrl Shift, March 20, 2014. https://www.ctrl-shift.co.uk/news/2014/03/20/new-market-for-empowering-personal-data-services-will-transform-relationships-between-customers-and-brands/.

Nissenbaum, Helen. Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford University Press, 2009.

“openPDS/SafeAnswers – The Privacy-Preserving Personal Data Store.” OpenPDS. Accessed April 6, 2015. http://openpds.media.mit.edu/.

Personal Data Stores. Ctrl Shift, April 30, 2012. https://www.ctrl-shift.co.uk/research/product/64.

Phillips, Kendall R. Projected Fears: Horror Films and American Culture. Praeger Publishers, 2005.

Reed, Drummond. “Revision: ‘Personal Data Service’ AND ‘Personal Data Store’ Go Together.” Equals Drummond, October 6, 2010. http://equalsdrummond.name/2010/10/06/revision-personal-data-service-and-personal-data-store/.

Spacks, Patricia Meyer. Privacy: Concealing the Eighteenth-Century Self. Chicago, IL, USA: University of Chicago Press, 2003. http://site.ebrary.com/lib/sfu/docDetail.action?docID=10468497.

Steinfield, Charles, Joan M. DiMicco, Nicole B. Ellison, and Cliff Lampe. “Bowling Online: Social Networking and Social Capital within the Organization.” In Proceedings of the Fourth International Conference on Communities and Technologies, 245–54. ACM, 2009. http://socio-informatics.de/fileadmin/IISI/upload/2009/p245.pdf.

St. John, Jeffrey. “The Late Age of Print: Everyday Book Culture from Consumerism to Control, by Ted Striphas,” 2010.

Timberg, Craig. “Is Uber’s Rider Database a Sitting Duck for Hackers?” The Washington Post, December 1, 2014. http://www.washingtonpost.com/blogs/the-switch/wp/2014/12/01/is-ubers-rider-database-a-sitting-duck-for-hackers/.

Trepte, Sabine, and Leonard Reinecke, eds. Privacy Online. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://link.springer.com/10.1007/978-3-642-21521-6.

“Understanding Personal Data Stores (PDS).” Mydex. Accessed April 6, 2015. https://mydex.org/understand-pds/.

Viseu, Ana, Andrew Clement, and Jane Aspinall. “Situating Privacy Online.” Information, Communication & Society 7, no. 1 (January 1, 2004): 92–114. doi:10.1080/1369118042000208924.

Walther, Joseph B. “Introduction to Privacy Online.” In Privacy Online, 3–8. Springer, 2011.

Warren, Samuel D., and Louis D. Brandeis. “The Right to Privacy.” Harvard Law Review, 1890, 193–220.

The Limits of “Unlimited” Ebook Subscription Services

Introduction

In the fantasy-comic short story The Choosing of the Bride (Die Brautwahl, 1819), German writer E.T.A. Hoffmann tells the bizarre adventures of three suitors competing for the hand of the young and beautiful mistress Albertine Vosswinkel. The first suitor is the amusingly pedantic bureaucrat and obsessive bibliophile “Herr Chancellery Private Secretary” Tusmann; the second, the young and charming painter Edmund Lehsen; the third, the wealthy, greedy and revolting Baron Benjamin. Albertine’s destiny rests on a game of chance that follows the popular fairy-tale pattern of the casket choice, echoing Shakespeare’s Merchant of Venice. Albertine’s suitors must choose among three caskets; the one who picks the casket containing the demoiselle’s portrait will win her hand in marriage. Although the finale can be easily predicted – the winner being her beloved Edmund – the opening of the “wrong” caskets comes with some surprising revelations. Both the Baron and Tusmann are rewarded with a gift more valuable, to them, than Albertine herself: whereas the first rejoices at the discovery of a magic file that prevents his precious ducats from deteriorating, the second is startled at finding “a little book bound in parchment” with nothing but blank pages inside. As the secretary learns shortly after, the small “packet of paper” in the casket is not an ordinary book, but is “the richest, completest library anyone has ever possessed,” for every time he takes the volume out of his pocket, it becomes whatever title he wishes to read. [1]

Tusmann’s magic book is the materialization of every bibliophile’s dream. Today, some internet retailers and up-and-coming digital content providers are trying to turn this dream into a reality by offering readers access to thousands of titles online with the alluring promise of the freedom of unlimited reading. But is this exciting prospect of a universal, personal library a pipe dream or an already existing reality that is bound to prosper? In the course of this essay, I will attempt to answer this question by investigating – and ultimately calling into question – the viability of broad-based subscription services in consumer publishing.

The popularity of Netflix-like subscription services: or a book is not a film (nor a song)

Book subscription models are hardly a novelty, dating back as far as the eighteenth century, when printers used to “solicit customers to ‘subscribe’ to a particularly expensive work or collection of works in advance of publication as a way to reduce risk by prefunding the project.” [2] Whereas in the late twentieth century, with the advent of digital publishing, the term was associated with the libraries’ and businesses’ subscription to digital journals, collections, and databases, today, the word has come to identify streaming media services that offer instant access to vast collections of content.

The huge success of streaming technologies such as Netflix and Spotify has earned them the reputation of digital content subscription providers par excellence, to the point that subscription-based e-book lending libraries such as Oyster, Scribd, and Kindle Unlimited have often been labeled the “Netflix of books” (or “Spotify of books”). Beyond the allure of the epithet, too often applied without much discrimination to the book market, the comparison calls for a fundamental distinction: the mode of consumption of digital video or audio products and that of eBooks are remarkably different. Compare the average Spotify user with the average Kindle Unlimited reader: while the former may listen to a large number of songs every day, each one for a short time, and often several times, the latter is likely to read only a few titles over a longer period, usually once, with a more focused attention span. [3] Furthermore, the attention of the music consumer need not be entirely absorbed in the act of listening, this type of content often being consumed simultaneously with other activities. The difference becomes less visible if we compare the experience of watching a movie on Netflix with that of reading a book on a dedicated device. Both situations, in fact, require the user’s total attention in a more relaxed, “lean back” situation (although this is not always true for books, and still less for eBooks, which often involve the reader’s active participation and are consumed in “lean forward” mode).

Another important distinction to be made concerns the audience size. Scribd, one of the leading book membership services, has grown to 80 million monthly readers since its inception in 2013. The paying subscribers are in the order of “hundreds of thousands,” but the numbers are not nearly as big as Netflix’s and Spotify’s, with their respective 57.4 million and 15 million global subscribers. [4] [5] Obviously, given that Netflix and Spotify have been around for years, the statistics announced by the younger ebook subscription services are promising. Yet – and I’m now venturing into the realm of speculation – it is unlikely that the number of book gluttons willing to pay for “binge reading” will ever approach that of voracious (paying) users of “all-you-can-watch” and “all-you-can-listen” streaming services. An additional issue to be addressed is that of the “potential degradation of high-value markets” connected with the increase of book subscriptions. [6] As a recent BISG report acknowledges, low-price access to vast libraries of content may lead to a devaluation of ebooks (and books in general) in the mid term, and possibly translate “into an unwillingness to pay a higher price to own a book.” [7]

Freedom of unlimited reading: a promise to debunk

Unlimited books, audio books, and comics to be instantly accessed and consumed in “total freedom” (Scribd); unlimited listening and reading on any device, freedom to explore thousands of titles (Kindle Unlimited); unlimited reading – anytime, anywhere – of “as many books as [readers] want” (Oyster). “Freedom” and “unlimited access” are the common distinctive features and marketing mantra of the big current broad-based subscription services for books. Their offers even seem to exceed the dream of the universal lending library described in Hoffmann’s tale, as Scribd, Oyster and Kindle Unlimited readers, through the magic of algorithms and user-generated recommendations, may discover titles about which they have never heard. But is this really true? What is the extent of this “unlimited” offer and of our “freedom” as subscribers-readers? As journalist Cameron Fuller observes, when it comes to books, the very notion of unlimited becomes questionable: “[u]nlimited books do nothing if you don’t read them, and reading has a limit. It has a limit in time and speed, something most people do not possess.” [8] For Kindle Unlimited subscribers, the limit is in the choice of titles, since a significant portion of the books available comes from Amazon’s own imprints (Thomas & Mercer, 47th North, Montlake Romance, and so on) and self-publishing platform, Kindle Direct Publishing. In addition, the majority of the books on the New York Times bestsellers list available through the Kindle Store, as well as the top selling titles published by the major publishers (Hachette, Macmillan, Simon & Schuster, HarperCollins, and Penguin Random House) are not included in the offer. Interestingly, even if to a lesser extent, a similar restriction on the collections can be found in connection with the only ebook subscription service, Oyster, that managed to bring three of the Big Publishers on board (HarperCollins, Simon & Schuster and, most recently, Macmillan). These publishers, in fact, are putting into the service only their backlist titles, leaving the new, “most attractive commercial titles” out of the offer. [9]

This also raises some questions about subscribers’ freedom of choice: most subscription services for books, indeed, are offered through third-party aggregators rather than directly by publishers, and the selection of titles is subordinate to commercial agreements where readers occupy a marginal place, if any at all. As Shatzkin has recently pointed out, the problem with ebook subscription services is that, over time, “the power of ‘brand’ passes from the individual titles (and authors) to the subscription service itself.” As a result “a subscriber-reader can become used to choosing from what the service offers and will either not know about, skip, or accept purchasing the occasional book s/he wants outside the service if it isn’t offered inside.” [10] In other words, the tantalizing promise of an unlimited offer actually translates into a limited range of prepackaged choices. The risk, in the long term, is the narrowing of the reader’s perspective. Yes, it is true, by “remov[ing] the purchase from the process after the initial acquisition of access,” subscription services relieve the reader of the burden of potentially making an erroneous buying choice, by presenting them with a set of prepackaged options, but… is this truly an exercise of freedom or, rather, a limitation on it?

Cost-savings: a real benefit or another myth to be exposed?

Along with convenience and ease of access, low price is a key factor of success in the new subscription economy. Ebook subscription services are no exception. Voracious readers – especially the price-sensitive ones – are attracted to online book membership services because in them their insatiable appetite for books can find easy gratification with minimum expense. And yet, even for those readers, the idea that purchasing a subscription plan is cost-effective can be deceptive. Let’s take Kindle Unlimited as an example. The service costs $9.99 per month. A true deal, if it weren’t for the fact that most of the titles on Kindle Unlimited are priced very low ($0.99 or $2.99) and, in order to actually benefit from the service, the reader would have to read more than three $2.99 books or ten $0.99 books per month. [11] Casual readers on a Kindle Unlimited plan, it goes without saying, will be highly likely to have their needs unmet. Safari Books Online, the first subscription service for books, is a different story, not only because it is targeted at a niche audience – mainly IT and business professionals – but also because it uses a more viable financial model, which assigns “a percentage of the revenue as a pool to compensate publishers rather than guaranteeing a purchase for every read” as Oyster and Scribd do. [12] Furthermore, the monthly fee for the basic plan (PRO), which offers access to an extended library of ebooks, audio books, video courses and conference talks, is very affordable. It costs $39 per month, a price only slightly above that of a single downloadable book online. [13]
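
The break-even point behind the Kindle Unlimited comparison is easy to check. The sketch below is only an illustration: the function name books_to_break_even is hypothetical, the prices are the ones quoted above, and amounts are kept in whole cents to avoid rounding surprises. Strictly counted, three $2.99 titles ($8.97) or ten $0.99 titles ($9.90) fall just short of the $9.99 fee, so the subscription starts to pay for itself with the fourth or the eleventh book.

```python
def books_to_break_even(monthly_fee_cents: int, book_price_cents: int) -> int:
    """Smallest number of books whose combined list price meets or exceeds the fee."""
    if book_price_cents <= 0:
        raise ValueError("book price must be positive")
    # Ceiling division on integers: -(-a // b) equals ceil(a / b).
    return -(-monthly_fee_cents // book_price_cents)


KINDLE_UNLIMITED_FEE_CENTS = 999  # $9.99 per month, as quoted in the text

for price_cents in (299, 99):  # the two typical price points quoted in the text
    needed = books_to_break_even(KINDLE_UNLIMITED_FEE_CENTS, price_cents)
    print(f"${price_cents / 100:.2f} titles: {needed} per month to cover the fee")
```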

Conclusion

The subscription economy, rather than an emerging trend, is an established fact in several media industries, and an already existing reality in the current publishing landscape. Consumer publishers’ ability to use this model successfully in their business, especially in the long term, will depend on their willingness to abandon the “one-size-fits-all” policy – or the chimeric dream of a “one-library-fits-all” solution – in favour of a mixed strategy that reaches (and meets the needs of) different kinds of customers through different market pathways. [14]

Works Cited

1. E.T.A. Hoffman, J. Hollingdale (trans.), Tales of Hoffman (London: Penguin, 1982). I owe the reference to Gino Roncaglia’s pioneering study La quarta rivoluzione (Roma: Laterza, 2010), 70-73.
2. “Digital Books and the new Subscription Economy: Executive Summary” (New York: Book Industry Study Group, 2014), 10. https://www.bisg.org/publications/digital-books-and-new-subscription-economy-0.
3. Daniele De Veris, “Daniele De Veris intervista Gino Roncaglia,” Insula Europea, January 1, 2015, accessed April 1, 2015. http://www.insulaeuropea.eu/leinterviste/interviste/deveris_roncaglia.htm.
4. Brad Stone, “Scribd’s E-Book Subscription Service, Now With Audiobooks,” Bloomberg, November 6, 2014, accessed March 31, 2015. http://www.bloomberg.com/bw/articles/2014-11-06/scribd-launches-audiobook-service
5. Frank Pallotta, “Netflix gains 4.3 million subscribers in 4th quarter,” CNN Money, January 20, 2015, accessed April 2, 2015. http://money.cnn.com/2015/01/20/media/netflix-earnings/
6. “Digital Books and the new Subscription Economy,” 7.
7. Ibid.
8. Cameron Fuller, “Why Oyster Isn’t ‘The Netflix Of Books’,” International Business Times, January 09, 2014, accessed April 1, 2015. http://www.ibtimes.com/why-oyster-isnt-netflix-books-1534086.
9. Michael Shatzkin, “Subscription Services for eBooks Progress to Becoming a Real Experiment,” The Shatzkin Files, May 27, 2014, accessed April 1, 2015. http://www.idealog.com/blog/subscription-services-ebooks-progress-becoming-real-experiment/.
10. Ibid.
11. Piotr Kowalczyk, “Kindle Unlimited Ebook Subscription: 8 Things Readers Need to Know,” Ebook Friendly, April 5, 2015, accessed April 5, 2015. http://ebookfriendly.com/kindle-unlimited-ebook-subscription/.
12. Shatzkin, 2014.
13. Andrew Savikas, “Welcome to the New Safari,” Safari Books Online, July 8, 2014, accessed April 1, 2015. https://blog.safaribooksonline.com/2014/07/08/new-safari/.
14. “Digital Books and the new Subscription Economy,” 6, 13. For more details on this view, please see the thorough research on subscription models recently conducted by the Book Industry Study Group.

BIBLIOGRAPHY

Digital Books and the new Subscription Economy: Executive Summary. New York: Book Industry Study Group, 2014. https://www.bisg.org/publications/digital-books-and-new-subscription-economy-0.

Fuller, Cameron. “Why Oyster Isn’t ‘The Netflix Of Books’,” International Business Times, January 09, 2014, accessed April 1, 2015. http://www.ibtimes.com/why-oyster-isnt-netflix-books-1534086.

Kowalczyk, Piotr. “Kindle Unlimited Ebook Subscription: 8 Things Readers Need to Know,” Ebook Friendly, April 5, 2015, accessed April 5, 2015. http://ebookfriendly.com/kindle-unlimited-ebook-subscription/.

Lunden, Ingrid. “Spotify Now Has 15M Paying Users, 60M Overall Active Subscribers.” TechCrunch, January 12, 2015, accessed April 2, 2015. http://techcrunch.com/2015/01/12/spotify-now-has-15m-paying-users-60m-overall/.

Pallotta, Frank. “Netflix gains 4.3 million subscribers in 4th quarter.” CNN Money, January 20, 2015, accessed April 2, 2015. http://money.cnn.com/2015/01/20/media/netflix-earnings/.

Roncaglia, Gino. La quarta rivoluzione: sei lezioni sul futuro del libro. Roma: Laterza, 2010.

____________ “A Tangled Tale: Biblioteche digitali, subscription services e promozione della lettura” (Seminar Presentation). Convegno Stelline, March 14, 2015, accessed April 1, 2015. https://prezi.com/bahcrznhhtyg/a-tangled-tale/?utm_source=facebook&utm_medium=ending_bar_share.

Savikas, Andrew. “Welcome to the New Safari.” Safari Books Online, July 8, 2014, accessed April 1, 2015. https://blog.safaribooksonline.com/2014/07/08/new-safari/.

Shatzkin, Michael. “Subscription Services for eBooks Progress to Becoming a Real Experiment.” The Shatzkin Files, May 27, 2014, accessed April 1, 2015. http://www.idealog.com/blog/subscription-services-ebooks-progress-becoming-real-experiment/.

Weber, Harrison. “Netflix beats Q3 expectations, but adds just 3M subscribers.” VB News, October 15, 2014, accessed April 2, 2015. http://venturebeat.com/2014/10/15/netflix-beats-expectations-with-3m-new-subscribers-in-q3-0-96-earnings-per-share/.

What can fiction publishers learn from altmetrics?

Introduction

Gone are the days of red ink-stained editors straining their eyes over a slush pile, a pack of Marlboros and a glass of cheap scotch their only company in the dim, slatted light of their tiny office. Today, editorial acquisitions are increasingly driven by sales data. Editors look at the number of books the author has previously sold, as well as sales of comparable titles, before making the decision to acquire a book. But are these numbers telling the whole story? Trade fiction publishers can learn from scholarly presses, who are developing far more nuanced metrics to rank article, author, and journal impact. This paper will examine these metrics, then explore the challenges and opportunities for trade publishers in building similar rankings for themselves.

Altmetrics defined

Scholars receive funding from different sources, so there is a strong need to measure and evaluate the impact of their research. Bibliometrics, “the statistical analysis of books, articles, or other publications,”1 is one way of evaluating impact. The most common metric used is citation count, in which often-cited articles are ranked more highly than those with lower citation counts. Different indices, such as the h-index2 and the Journal Impact Factor (JIF),3 have been developed to measure impact through citations. These metrics, however, present several problems. The first is that they’re slow: citations can take years to accrue, making it difficult to accurately measure recent work.4 Secondly, measures like JIF are proprietary5 and easily manipulated,6 making them difficult to rely on.7 Thirdly, these measures don’t take into account other ways in which academic work can be shared or engaged with. Not all work is published in article form, and articles can generate conversations in other ways besides citation.
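
To make the citation-based indices concrete, here is a minimal sketch of the h-index as Hirsch defines it: the largest number h such that an author has h papers with at least h citations each. The citation counts in the example are invented for illustration, and the function name h_index is simply a convenient label.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical author with five papers.
print(h_index([10, 8, 5, 4, 3]))
```

Because the index only grows as citations accumulate, it illustrates the slowness problem described above: a recent paper contributes nothing until other scholars have had time to cite it.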

To counter this, organizations such as Altmetric.com have developed altmetrics, or article-level metrics. These nuanced measurements take into account a variety of different article features, including blog, Wikipedia, Twitter and Facebook mentions, article downloads, mainstream media citations, StackExchange and other forum posts, and more. What characterizes different altmetrics is that they draw on disparate sources created by a wide user base to evaluate articles.

And it works. Studies have shown that altmetrics can predict citation rates, but at a more rapid speed.9 10 11 Of course, these studies still privilege traditional citation metrics by using them as benchmarks of success. In fact, the real power of altmetrics is in their ability to identify different ways of measuring the impact of a text. “Altmetrics are fast, using public APIs to gather data in days or weeks. They’re open–not just the data, but the scripts and algorithms that collect and interpret it,” Jason Priem writes in “Altmetrics: A Manifesto.”12  But more importantly, he adds, their diversity makes them better able to navigate an academic landscape which now includes raw data sets, tweets, blogs, and other items in addition to traditional articles.13 In addition, altmetrics have the ability to emphasize other forms of impact: the ability of a work to provoke discussion, to cross over between disciplines, or to inform students and non-academic readers.

Fiction publishers and acquisition assessment

Fiction publishers invest in their authors in much the same way universities invest in academics. Publishers take on the financial risk of printing a book and paying an author’s advance. Only too often, they have no idea whether their risk will pay off. Trade publishers live in what Tom Davenport calls a “disadvantaged” industry: a B2B2C industry in which retailers, such as Amazon and Indigo, hold all the data about the end customers.14 In the past, acquisitions were made blindly, based on gut instinct or an editor’s sense of the market. “It was difficult to discern sales patterns to see how well or poorly a book was doing until much later—sometimes months later—when the publishers receive returns,” Amanda Regan writes in her report on sales data.15 Publishers were taking huge financial risks on authors without knowing if they would be successful. Even when a publisher’s instinct paid off, the process was still problematic: “when acquisitions editors say that books select themselves, what they are actually saying is that they choose books within a framework of values they see no need to question.”16 By ignoring data, publishers were potentially ignoring larger cultural trends about what people wanted to read, acting on the assumption that the public was just like them.

In 2005, BookNet Canada launched its SalesData service,17 which provides subscribers with week-to-week aggregated sales data for any ISBN. Since then, publishers have shifted to acquiring books based, at least in part, on sales data: that of an author’s past works, or of comparable titles. But these metrics come with their own set of problems. First of all, they are overly simplistic: publishers often use a single data point, total sales, as a predictor of future success. Secondly, they are subjective: comparable titles are still picked entirely according to editorial discretion. Finally, they favour mass-market bestsellers, since higher sales figures simply indicate broad appeal. For most books, however, a more successful strategy is to market directly to a niche audience who will engage with the content. To compete for attention with data-rich companies such as Facebook and Google, Davenport argues, “editing and editorial decision-making will have to become data-driven. Social media will have to be mined for sentiment along with content clickstream data. Publishers will have to compile insights on what really works, combining data analytics with knowledge management.”18 Acquisitions editors need more sophisticated metrics if they are to properly assess a work’s impact.

Altmetrics for fiction publishers

Fiction publishers can develop more nuanced metrics, similar to altmetrics used by academic publishers. One of the major components of altmetrics is online engagement, and this feature translates well to trade book publishers. Publishers should track mentions of a book or author on every social media platform, including Twitter, Facebook, Goodreads, and more. They should also quantify mentions in blogs, articles, and other media. There have been a few attempts to measure social media sentiment towards academic articles.19 Trade publishers may wish to expand upon these attempts and track sentiment across social media, since readers will be unlikely to pick up a book that has received overwhelmingly negative reviews.

Publishers could also track fan fiction online as a measure of strong engagement: “Within publishing, these writers represent the kind of ‘prosumer’ audience that has broadened the market for things like digital cameras, home theater and more. Online, there is a ‘flood of amateur collaboration’ we can embrace and benefit from,” Brian O’Leary writes.20 This includes not only text-based fan fiction but videos, photos, songs and art posted online that reference the work. A more ambitious project for publishers may be to track searches for keywords which appear in a book. If many people look up, in order, an obscure word used in Chapter 1, a song reference in Chapter 2, and a movie referenced in Chapter 3, they are engaging with the text across media in a way that is valuable to the publisher.

As well as measuring transmedia engagement, publishers can measure intertextual influence. Although by convention fiction books don’t cite each other, they do exist in a network of influence. Fifty Shades of Grey author E.L. James owes a debt to Stephenie Meyer, and most Western fiction owes a debt to Shakespeare. Is it possible to trace this influence across fiction? In the most traditional sense, publishers could count quotations from their books in other works, but this would likely only impact a very few well-known books. So far only a few studies have attempted to track an author’s influence on other texts. One, conducted by the Stanford Literary Lab, mapped David Foster Wallace’s works within a network of texts by extracting mentions of other books and authors from Amazon and mass media reviews. “In recommendation networks, the more times a text is recommended ‘by’ another text, the higher its prestige value. In review networks, where the links (based on co-occurrences) have no directionality, it is even simpler: nodes with the most links are the most prestigious,” Ed Finn explains.21 The study found that Amazon reviewers situated Wallace in a richer and more diverse network of texts than mass media reviewers did. While Finn’s method needs some work (both “Wallace is the next Shakespeare” and “Shakespeare, he ain’t” connect the author to Shakespeare), his approach is interesting. Publishers could use similar techniques to map texts in a network using Amazon reviews, Goodreads bookshelves and other user-generated data. This would help them not only to better measure impact, but also to better position and market books.
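
The review-network idea quoted above can be sketched in a few lines. This is not Finn’s code or data: it is a minimal illustration, assuming each review has already been reduced to the set of authors it mentions, and the sample reviews and function names are invented. Two authors are linked whenever they appear in the same review, and the links have no direction, so “prestige” is simply the number of distinct authors a node is linked to.

```python
from collections import defaultdict
from itertools import combinations


def review_network(reviews):
    """Map each author to the set of authors co-mentioned with them in any review."""
    neighbours = defaultdict(set)
    for mentioned in reviews:
        # Every pair of authors named in the same review gets an undirected link.
        for a, b in combinations(sorted(set(mentioned)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def prestige_ranking(neighbours):
    """Authors ordered by number of distinct links, most-linked first (ties by name)."""
    degrees = ((name, len(linked)) for name, linked in neighbours.items())
    return sorted(degrees, key=lambda item: (-item[1], item[0]))


# Hypothetical reviews, each reduced to the authors it mentions.
sample_reviews = [
    {"David Foster Wallace", "Thomas Pynchon", "Don DeLillo"},
    {"David Foster Wallace", "William Shakespeare"},
    {"Thomas Pynchon", "Don DeLillo"},
]

for author, links in prestige_ranking(review_network(sample_reviews)):
    print(f"{author}: {links}")
```

The sketch shares the weakness noted above: a co-occurrence link records that two authors were mentioned together, not whether the comparison was flattering.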

The above metrics assess the impact of a book after it hits the market to measure author influence. But is there a way to measure the future influence of first-time authors? In his 2012 study, Rui Yan found 11 features of articles and mapped them against citation counts. Three of these features were based solely on the content of the article. The first, novelty, measured the novelty of the statements in the article. Yan found that citation count increased with novelty up to a point, and began to decrease after a certain threshold. This showed, he argued, that works which strayed too far from the norm were less likely to be widely cited. The second feature, topic rank, measured the popularity of the subject, and correlated with citation count. The third, diversity, measured the number of topics in the article and found that in general, citation count increased with diversity.22 Trade publishers could develop similar factors for text novelty, genre, and subject, and test them accordingly, to help them assess incoming manuscripts.

Future applications

Some may argue against evaluating authors’ and books’ impacts at all. They may point to the difficulty of quantifying artistic merit, or the problem of identifying talent which may only be recognized years down the line. These concerns are valid, and I am not suggesting that impact metrics should be the only consideration editors use to drive decisions. The purpose of these measures, as with academic altmetrics, is not solely to inform acquisitions. Instead, they fill two other necessary functions.

First, they filter existing published content to help it get into the hands of the right readers. In her book Planned Obsolescence, Kathleen Fitzpatrick develops a framework for how reviews can help filter academic articles after publication. She talks about establishing a “trust metric” that will rank authors based on their reputation and authority within the community.23 The same concept applies to trade publishing, where the marketing department works hard to establish an author’s authority through jacket copy, interviews, and more. With an impact metric for both the book and the author, readers can find titles they trust within an over-saturated market. These metrics could be customized to each user’s tastes, like the Amazon “Recommended for you” feature. They could even be tweakable by the user, allowing them to emphasize different features of the model (e.g. choosing to rank social media mentions more highly than mass media mentions, or vice versa).24
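
A user-tweakable metric of this kind could be as simple as a weighted average in which the reader chooses the weights. The sketch below rests on several assumptions: the source names, mention counts, and weights are invented, and a real metric would need to normalize counts that are not directly comparable across sources.

```python
def impact_score(mentions: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-source mention counts; unweighted sources count as zero."""
    total_weight = sum(weights.get(source, 0.0) for source in mentions)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(count * weights.get(source, 0.0) for source, count in mentions.items())
    return weighted_sum / total_weight


# Hypothetical mention counts for one title.
mentions = {"social_media": 120, "mass_media": 4}

# Two readers emphasizing different features of the same model.
social_first = {"social_media": 3, "mass_media": 1}
press_first = {"social_media": 1, "mass_media": 3}

print(impact_score(mentions, social_first))
print(impact_score(mentions, press_first))
```

The same title scores very differently under the two weightings, which is the point: the ranking reflects what each reader chooses to value.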

Second, metrics provide authors with an assessment of their own impact which can translate into other, indirect benefits. In his book The Long Tail, Chris Anderson writes about “The Reputation Economy”: “Down in the tail, where distribution and production costs are low (thanks to the digital technologies), business considerations are often secondary. Instead, people create for a variety of other reasons — expression, fun, experimentation, and so on. The reason one might call it an economy at all is that there is a coin of the realm that can be every bit as motivating as money: reputation. Measured by the amount of attention a product attracts, reputation can be converted into other things of value: tenure, audiences, and lucrative offers of all sorts.”25 Most small- and mid-list authors currently reside in the long tail. They often do not earn enough from advances or royalties to support themselves, but choose to write for other reasons. However, with trusted, recognized measures of their impact, they could turn their book into a better job, a speaking engagement, or a more profitable contract, just as an academic leverages high metrics into tenure, promotion, or increased funding.

Conclusion

I have tried to limit my analysis to forms of data that are already accessible to publishers: social media, reviews, and the manuscript itself. However, publishers need to demand access to data from retailers such as Amazon and Kobo. This data should include reader engagement with purchased books (such as time spent reading and completion rate), references to their own titles in works published by other presses (including indirect mentions and quotations), and data about consumer buying habits (including networks of books bought by readers of their book). With this information publishers could develop even better author metrics, and compensate for the fact that far fewer trade books are accessible online for free or through subscription services as compared to academic books and articles.

As trade publishers reevaluate their metrics, though, this lack of accessibility may change. Trade publishers who adopt an altmetric-like model will need to be aware that their own impact as a press is measurable too. This provides an opportunity for them to define their brand in their readers’ eyes. But it also leads to increased competition, not for buyers, but for attention. To succeed, publishers will need to learn to value readers’ engagement on its own terms, rather than as a direct lead-in to sales. They will have to make sure their texts are easily shareable and clippable, and use the data they gather to inform marketing and production as well as acquisitions.

References

1. “Bibliometrics Definition.” OECD Glossary of Statistical Terms. Accessed February 27, 2015. http://stats.oecd.org/glossary/detail.asp?ID=198.
2. Hirsch, J. E. “An Index to Quantify an Individual’s Scientific Research Output.” Proceedings of the National Academy of Sciences 102, no. 46 (November 15, 2005): 16569–72. doi:10.1073/pnas.0507655102.
3. “Journal Impact Factor.” Journal Impact Factor. Accessed February 27, 2015. http://jifactor.com/.
4. Thelwall, Mike, Stefanie Haustein, Vincent Larivière, and Cassidy R. Sugimoto. “Do Altmetrics Work? Twitter and Ten Other Social Web Services.” Edited by Lutz Bornmann. PLoS ONE 8, no. 5 (May 28, 2013): e64841. doi:10.1371/journal.pone.0064841.
5. Rossner, M., H. Van Epps, and E. Hill. “Show Me the Data.” The Journal of Cell Biology 179, no. 6 (December 17, 2007): 1091–92. doi:10.1083/jcb.200711140.
6. The PLoS Medicine Editors. “The Impact Factor Game.” PLoS Medicine 3, no. 6 (2006): e291. doi:10.1371/journal.pmed.0030291.
7. Priem, Jason, Dario Taraborelli, Paul Groth, and Cameron Neylon. “Altmetrics: A Manifesto,” 2010. http://altmetrics.org/manifesto/.
8. “What Does Altmetric Do?” Altmetric. Accessed February 27, 2015. http://www.altmetric.com/whatwedo.php.
9. Eysenbach, Gunther. “Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact.” Edited by Anne Federer. Journal of Medical Internet Research 13, no. 4 (2011): e123. doi:10.2196/jmir.2012.
10. Thelwall, “Do Altmetrics Work?”
11. Yan, Rui, Congrui Huang, Jie Tang, Yan Zhang, and Xiaoming Li. “To Better Stand on the Shoulder of Giants.” In Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 51–60. ACM, 2012. doi:10.1145/2232817.2232831.
12. Priem, “Altmetrics: A Manifesto.”
13. Priem, “Altmetrics: A Manifesto.”
14. Davenport, Tom. “Book Publishing’s Big Data Future.” Harvard Business Review, March 2014. https://hbr.org/2014/03/book-publishings-big-data-future/.
15. Regan, Amanda Jia’en. Data-Driven Publishing: Using Sell-through Data as a Tool for Editorial Strategy and Developing Long-Term Bestsellers. MPub Project Report. Vancouver, BC: Simon Fraser University, Spring 2012. http://publishing.sfu.ca/?p=1915&preview=true
16. Curran, J., qtd. in C. Clayton Childress. “Decision-Making, Market Logic and the Rating Mindset: Negotiating BookScan in the Field of US Trade Publishing.” European Journal of Cultural Studies 15, no. 5 (2012): 604–20. http://ecs.sagepub.com/content/15/5/604
17. “About SalesData.” BookNet Canada. Accessed February 27, 2015. http://www.booknetcanada.ca/salesdata/.
18. Davenport, “Book Publishing’s Big Data Future.”
19. Thelwall, Mike, Andrew Tsou, Scott Weingart, Kim Holmberg, and Stefanie Haustein. “Tweeting Links to Academic Articles.” Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, no. 17 (2013): 1–8. http://dialnet.unirioja.es/servlet/revista?codigo=5578
20. O’Leary, Brian. From Competitors to Collaborators: 12 Steps for Publishers in the Digital Age, 2014. http://www.magellanmediapartners.com/publishing-innovation/12-steps-for-publishers-in-the-digital-age/.
21. Finn, Ed. “Becoming Yourself: The Afterlife of Reception,” 2011. Stanford Literary Lab Pamphlets. litlab.stanford.edu/LiteraryLabPamphlet3.pdf.
22. Yan, “To Better Stand on the Shoulder of Giants.”
23. Fitzpatrick, Kathleen. Planned Obsolescence: Publishing, Technology, and the Future of the Academy. NYU Press, 2011. http://www.plannedobsolescence.net/about/.
24. Davis, Phil. “Visualizing Article Performance-Altmetric Searches for Appropriate Display.” The Scholarly Kitchen, September 30, 2013. http://scholarlykitchen.sspnet.org/2013/09/30/visualizing-article-performance-altmetrics-searches-for-appropriate-display/.
25. Anderson, Chris. The Long Tail: Why the Future of Business Is Selling Less of More. Hyperion, 2006. http://www.thelongtail.com/about.html

Too Fast Too Facile: The Rise Of Online Annotations

In 2014, technocrats and open source crusaders from around the world gathered at an annual conference in California to ruminate over the possibilities of palliating an information-saturated internet with the use of online annotations. Conspicuous among the attendees were representatives from Genius, formerly Rap Genius, which has been provisioned with millions of dollars of VC funding since its inception in 2009. The thrust of the conference was the creation of a universal online annotation system that would not only critique and question the veracity of online content but also network it by hyperlinking and minimizing the degrees of separation between reams of webpages which might otherwise be insulated from each other.

At the conference, Nick Stenning, a developer with hypothes.is, made the most compelling case for online annotation. “…the web will be a vast, varied assembly of sources of information. Annotation provides us with the way of navigating that information…without requiring that the publishers provide it themselves.” The crux of his argument lies in the phrase “without requiring that the publishers provide it themselves.” As it happens, it is often the web publisher, who has sole discretion over inserting hyperlinks to sources and related webpages, that lays down the route a seeker of information must take in navigating the web. Apart from the comments section, which sits at the end of the page (and is therefore inconspicuous, relatively decontextualized, and vulnerable to deletion), the user has no way of linking one webpage to another. Annotations, by contrast, are inline: they function as incisive, line-specific commentary that lets users paste hyperlinks to related webpages without requiring the publisher’s imprimatur.

And indeed that is how Genius, with its melioristic mandate to “Annotate The World,” could effect a change. Of all the emergent annotation platforms, Genius seems poised to break new ground, not only because it is buoyant with venture capital but also because it is being shepherded by Marc Andreessen, the co-founder of the now defunct Mosaic, one of the earliest web browsers, which also introduced online annotations to the nascent internet community of the early 90s. Having failed to create an annotatable web on the first attempt, Andreessen hopes, with Genius, to reinvigorate online annotation, this time for good. But the concept of online annotations predates Mosaic. Even though it can be argued that the idea harks all the way back to Vannevar Bush’s Memex Machine, it was Ted Nelson who, with his seminal Project Xanadu, first began to think critically about the possibilities of creating an annotatable web.

Despite having similar intentions, Nelson’s and Andreessen’s visions were fundamentally dissimilar. Long before hypertext linked countless blogs and primitive webpages that had hitherto existed in isolation, Nelson, who actually coined the term, was busy devising a radically different but vastly superior version of the internet as we know it today. An exposition of how Project Xanadu differs from the contemporary World Wide Web would require another paper, but a brief excursus into its fundamentals is crucial to drawing lessons for online annotations.

To Nelson, the World Wide Web is an aborted and slipshod version of what he had in mind for Xanadu. “[Xanadu] has always been much more ambitious…where documents may be closely compared side by side and closely annotated; where it is possible to see the origins of every quotation; The Web trivialized this original Xanadu model, vastly but incorrectly simplifying these problems…Fonts and glitz, rather than content connective structure, prevail.”

It is Nelson’s emphasis on connective structure that makes the WWW pale in comparison to Xanadu. Two-way links, as he labels them, allow the user to view a document that borrows, references or derives content side by side with the source document from which it borrows, references or derives. In fact, by using beams to connect content to its source document, Xanadu visualizes connections not just between documents but between lines and paragraphs scattered across documents.

A screen shot of Xanadu’s working deliverable, courtesy: www.kottke.org

Stated laconically, Xanadu not only traces the genealogy of documents but functions as a kind of omniscient library system, mapping the web of interconnections in the accumulation of human knowledge.

One could argue that the web, as it exists today, already allows a source document and a derivative document to be compared by displaying them in different tabs or windows. It does not, however, provide for two-way links. For example, a news report about discrepancies in a company’s financial statement would hyperlink to the financial statement released on the company’s website. But, despite the availability of myriad backlink tools that notify a webmaster every time another webpage links to their website, it is unlikely that the company would reciprocate by linking its financial statement to the news report. As mentioned earlier, this is because hyperlinking to another webpage is at the discretion of the web publisher. Xanadu’s provision for two-way linking ensures that no document can exist in isolation: it displays not just links (or beams, as illustrated above) to a source document but also links from a source document to all documents that draw on it.

In the ground-breaking Death Of The Author, literary theorist Roland Barthes wrote:
“The text is a tissue of citations, resulting from the thousand sources of culture”

Arguably, Barthes’ aphorism is a more elegant summary of the more banal dictum: All Knowledge Is Derivative. With two-way links, Xanadu puts it into practice by tracing every quotation or idea to its source, so that no document or text exists without being grounded in the scholarship that precedes and influences it and in that which follows and is influenced by it.

But how does this inform annotations?

Although Xanadu never got off the ground and is considered more or less vaporware, online annotations can be an effective tool in mitigating some of the damage that the shoddy implementation of hypertext has wrought on the web. And this is where Genius can be of service.

Any user, after signing up for a free Genius account, can annotate the web. Genius differs from other annotation platforms, which tend to be browser plugins: it sits on top of the web, which is to say that its address precedes the URL. For example, past Genius annotations on the LA Times website can be viewed, and new annotations made, by going to http://genius.com/www.latimes.com. If a Genius user finds a story in the Seattle Times that is related to another story in the LA Times, they can make an annotation on http://genius.com/www.seattletimes.com and insert a hyperlink to http://genius.com/www.latimes.com. This is, however, still a one-way link. But with the right backlink software, Genius will be notified that a user has linked to http://genius.com/www.latimes.com and can use bots to display the inbound link from the Seattle Times (http://genius.com/www.seattletimes.com) on the LA Times page (http://genius.com/www.latimes.com).
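
The bookkeeping behind such a scheme can be sketched in a few lines of Python. This is only an illustration, not a description of how Genius works: the LinkIndex class and its method names are hypothetical, and the sketch assumes that every annotation link is reported to one central service as a (source, target) pair. The point is that a one-way link becomes two-way as soon as it is recorded in both directions.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical two-way link index kept by an annotation service."""

    def __init__(self):
        self.outbound = defaultdict(set)  # page -> pages it links to
        self.inbound = defaultdict(set)   # page -> pages that link to it

    def add_link(self, source, target):
        # A single one-way link is recorded in both directions.
        self.outbound[source].add(target)
        self.inbound[target].add(source)

    def links_for(self, page):
        # Everything a two-way view of this page would need to display.
        return {
            "links_to": sorted(self.outbound.get(page, ())),
            "linked_from": sorted(self.inbound.get(page, ())),
        }


index = LinkIndex()
index.add_link("http://genius.com/www.seattletimes.com",
               "http://genius.com/www.latimes.com")

# The LA Times page now knows about the Seattle Times annotation.
view = index.links_for("http://genius.com/www.latimes.com")
print(view["linked_from"])
```

Because the index is kept by the annotation layer rather than by either newspaper, neither site has to add or approve the return link.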

What makes this prospect of two-way linking irresistible is that Genius can do this without needing the permission of either newspaper. The whole mechanism may seem tedious, even unworkable. But with the weight of influential investors and millions of dollars behind it, Genius is perfectly positioned to delve into two-way linking and to channel funds into conceiving new ways of accomplishing it.

Two-way linking is essential not only for more transparency and navigability of information as the two examples illustrated but also for creating a highly interlinked web. More connections and reciprocal connections would create an infinitely networked and heuristic World Wide Web where information would be more accessible and one where users can amble from one website to another without solely relying on search engines and a list of favored websites as their gateways to the web. It would pave the way for a more equitable internet—one where information would be scattered across multitudes of websites—and where a few media organizations would not hold a monopoly over privileged information and take editorial calls over the publishing of sensitive content.

Nelson’s Project Xanadu was a spectacular failure, but it foresaw problems that have only come to the surface in the last two decades, as big internet companies implemented hypertext with little foresight and content began proliferating across the internet.

Online annotation, which is yet to come to fruition, can, to some degree, bring us closer to a Xanadu-like internet. But Genius, with its emphasis on the ‘Worse Is Better’ model of business, seems to be prioritizing scaling up over and above other imperatives. In fact, the founders of Rap Genius are taking comfort in the fact that the introduction of hypertext was met with similar consternation, which eventually fizzled out. In doing so, they are evincing the same haste and impatience that the internet behemoths demonstrated on their road to El Dorado.

Nelson wouldn’t be surprised.

Bibliography

RapGenius Rebrands With $40M, Aims to ‘Annotate the World’, Lora Kolodny, Wall Street Journal
Perspectives on Annotation, W3 TPAC Conference, Oct 2014
Why Andreessen Horowitz Is Investing in Rap Genius
Toward an ecology of hypertext notation, Catherine C Marshall, Xerox Palo Alto Research Center
Pioneering hypertext project Xanadu released after 54 years, kottke.org
The Death Of The Author, Roland Barthes
Xanalogical Structure, Needed Now More than Ever: Parallel Documents, Deep Links to Content, Deep Versioning and Deep Re-Use, Project Xanadu

The curse of Xanadu, Gary Wolf, Wired
Genius Idea, Reeves Wiedeman, New York Magazine

The hybrid model and copyright: could Wattpad do for publishers what YouTube did for music and movies?

Abstract

Lawrence Lessig famously claimed that the US is becoming less of a “free culture,” with the ability of new creators to build on and remix older works heavily regulated. He argued that copyright laws have changed to protect the media companies rather than the creators.1 This essay begins by examining how YouTube has provided an incentive for big media companies to overlook copyright infringement by monetizing unauthorized use. It uses YouTube’s business model as a framework to discuss publishing models and digital copyright infringement, with particular emphasis on Wattpad, which bills itself as “the YouTube of publishing.”

Introduction

The turn of the millennium was an interesting time to be a pirate. In 1998, the Sonny Bono Act extended the American copyright term by twenty years, meaning that works slated to enter the public domain over the following two decades (most notably, the earliest Mickey Mouse movie) would stay protected until 2019 or later.2 Meanwhile, the World Wide Web was exploding in popularity, and with it, file-sharing software such as Napster, and, later, Kazaa, Limewire, and torrents. Despite heavy DRM (digital rights management) and lawsuits from major record companies, so-called “pirates” showed no signs of slowing down. Heavier locks led to more sophisticated workarounds; new file-sharing platforms sprang up as fast as lawsuits crippled others. Media companies pushed for more restrictive copyright protection in courts, while online, people all over the world performed what Anil Dash calls “a massive act of civil disobedience,” posting videos online with the words “no infringement intended.” These words indicate that they purposely violated copyright because the law didn’t match their expectations of how they could share and reuse content.3

The deadlock between media companies and pirates came from a clash between two coexisting economies. In the first, “commercial” or “commodity” culture, items have a value and are bought for cash. In the second, “sharing” or “gift” culture, items are exchanged reciprocally and offering cash is inappropriate4 5—just imagine handing a fifty-dollar bill to great-aunt May in exchange for the hand-knit sweater she gives you for your birthday. Online, the two economies clash: “Within commodity culture, sharing content may be viewed as economically damaging; in the informal gift economy, by contrast, the failure to share material is socially damaging.”6 Audiences wanted to share with their online communities or even between devices, and heavy DRM made piracy the easiest way to do so. They wanted to create new videos or music based on their favourite clips, and to share with others, without having to pay exorbitant fees.

YouTube and the hybrid economy

Fast forward to the present day. Copyright law hasn’t changed, but now anyone can watch He-Man singing a falsetto version of a 4 Non Blondes single7 or see Buffy stake Edward.8 Creators of remixes rarely find themselves faced with enormous lawsuits. What has changed?

YouTube, owned by Google, is a video-sharing platform that has been a hotbed for copyright infringement. In 2007, Viacom launched a $1 billion lawsuit against Google for allowing copyrighted videos to be uploaded.9 The two companies reached a settlement in 2014, and that same year, Google released a white paper outlining its approach to stopping piracy. Google began by working to make YouTube a “better, more convenient alternative” to illegal sharing. Meanwhile, it aims to “follow the money,” taking a two-pronged approach to ensure money flows to the right places.10

First, Google cuts off pirate sites from its ad program, removing an economic incentive for them to operate. Second, and most importantly, it compensates rights holders for infringing uses of their work. More than five thousand rights holders use YouTube’s Content ID program, which fingerprints the videos and music they upload and notifies them when a new upload matches their content. Rights holders can choose between three options: to earn a percentage of ad revenue from the video; to track its viewing statistics; or to remove the video from YouTube. The vast majority choose to make money. Some earn a six-figure income from YouTube views, and the program has made over a billion dollars for the media industry since it was launched. Google’s policy has achieved what years of lobbying against copyright never could: “When copyright owners choose to monetize or track user-submitted videos, it allows users to continue to freely remix and upload a wide variety of new creations using existing works.”11 It has set a new de facto standard for copyright where remixes are not only accepted but encouraged; after all, they generate profit for the rights holders without them having to do any work.
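
The match-then-choose logic of such a program can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not YouTube’s implementation: the ReferenceLibrary class, the Policy names and the sample data are all hypothetical, and an exact SHA-256 hash stands in for the fingerprint, which means it only catches byte-identical copies. A real system would need a perceptual fingerprint that survives re-encoding and editing.

```python
import hashlib
from enum import Enum


class Policy(Enum):
    MONETIZE = "monetize"  # rights holder earns a share of ad revenue
    TRACK = "track"        # rights holder sees viewing statistics
    BLOCK = "block"        # the matching upload is removed


def fingerprint(data: bytes) -> str:
    # Stand-in only: an exact hash matches identical files and nothing else.
    return hashlib.sha256(data).hexdigest()


class ReferenceLibrary:
    """Hypothetical registry of works submitted by rights holders."""

    def __init__(self):
        self._claims = {}  # fingerprint -> (rights holder, chosen policy)

    def register(self, holder, data, policy):
        self._claims[fingerprint(data)] = (holder, policy)

    def check_upload(self, data):
        # Returns (holder, policy) if the upload matches a registered work,
        # otherwise None.
        return self._claims.get(fingerprint(data))


library = ReferenceLibrary()
library.register("Example Records", b"original song bytes", Policy.MONETIZE)

match = library.check_upload(b"original song bytes")
if match is not None:
    holder, policy = match
    print(f"Matched a work claimed by {holder}; policy: {policy.value}")
```

The design choice worth noticing is that the decision sits with the rights holder rather than the platform: the same match can lead to revenue sharing, tracking or removal, depending on the policy registered with the work.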

YouTube marries the commodity economy to the gift economy, creating what Lawrence Lessig describes as a “hybrid economy.”12 Hybrid economies benefit from the work of the communities they foster, walking a delicate line to monetize that effort without offending their users. Lessig claims that nearly every interesting online company follows a hybrid business model, and it’s not hard to see why: they make use of a diverse range of human talent and interests to create a democratic platform where anyone can find what they are looking for. In this model, “consumption is no longer necessarily seen as an end point in an economic chain of production but as a dynamic site of innovation and growth in itself.”13 Consuming a movie doesn’t end at buying the DVD—it continues online with reviews, remixes, and fan videos.

Can this model work for publishers?

Publishers have an unusual relationship to piracy. Unlike other major media companies, few publishers have taken action against digital copyright infringement, and the handful of cases that made it to court were largely unsuccessful. J.D. Lipton notes that it’s surprising publishers have not been more active in copyright litigation, since “the threats to digital publishing from unbridled copying and distribution are arguably greater than the threats to other digitized industries.”14 Readers do not tend to reread books, so they are unlikely to purchase a book after a first, pirated read, even if they enjoyed it. And unlike music and movie companies, publishers rarely profit from events, shows or merchandise sales.

There are several reasons why publishers are less concerned about piracy than their movie and music counterparts. The first is the barrier to entry: a digital book requires a tablet or e-reader to enjoy comfortably. Another is lack of demand: Tim O’Reilly famously said, “Obscurity is a far greater threat to authors and creative artists than piracy.”15 O’Reilly saw piracy as a hidden tax on the few books that were successful enough to be shared online.

But a happier reason why publishers are not worried about piracy may be that publishers are already ahead of the curve when it comes to copyright. Offline, book publishing already exists in a hybrid economy. Content is available for free from public libraries. Fair use citation is not only not litigated against, it is encouraged. In fact, essays such as this one can be seen as the original remix, drawing disparate sources together to create something new and (hopefully) contribute to a larger conversation.

The problem? As content moves online, these models will break down. Publishers will need to find a model that, as Techdirt writer Mike Masnick puts it, connects with fans and gives them a reason to buy.16 Otherwise, publishers risk losing sales to piracy or, worse, dooming themselves to irrelevance as readers turn to media they can consume the way they want to.

Wattpad’s transition from sharing to hybrid model

One possible model might be Wattpad, which bills itself as “the YouTube for books.”17 Wattpad is a self-publishing platform that allows users to read stories, comment on them and post their own for free. Wattpad fits in nicely with Lessig’s characteristics of a sharing or gift economy.18 It capitalizes on the “Long Tail”: since it can store a vast number of stories, it can cater to niche interests without additional cost. It plays “Little Brother,” gathering minute data on its audience so it can rebuild the site according to their needs. And it can be “Lego-ized,” meaning that it can fit together with other content, both on- and off-site, to build something new. Users can add any content they want, influencing the range of content from vampire romance to nonfiction. They can comment on each other’s works, influencing books in progress with their suggestions. Like a sharing economy, Wattpad is still struggling to make money in a way that doesn’t alienate its users. It currently experiments with banner ads and native advertising.

Wattpad could adopt a YouTube-style model towards copyrighted works, compensating authors when their works are used in fan fiction. Fan fiction is a murky copyright area. It is technically illegal, except in specific circumstances such as parody. Despite this, most authors embrace fan fiction as a way to connect with their audience and build a community around their work. Now, however, “the rise of self-publishing has coincided with a blurring of the lines between noncommercial fan fiction and commercial self-published work,” Lipton argues.19 Rather than force fans to take down their works, Wattpad could share a portion of ad revenue with the rights holders. Wattpad also allows its users to select from different licenses, ranging from All Rights Reserved to Creative Commons licenses,20 so self-published authors can choose how their own work will be used.

As Wattpad gains momentum as a platform, it will also attract pirates. New York Times bestselling author Jasinda Wilder’s book was posted on Wattpad without her permission. The person who posted it gave the work a different title and author name. Wilder’s book was read 41,000 times before a user identified it as a pirated version and notified the author. Wattpad removed the book and blocked the user, but did not notify readers that a different author had crafted the book they’d enjoyed. Wilder estimates that she lost $168,000 in royalties from the incident,21 although it’s doubtful whether the readers would have paid for her book in-store. If Wattpad had handled the situation differently, she might have gained a new audience, as well as data (through comments and profiles) about who was reading her book and how.

Conclusion

A Publishers Weekly article on the Jasinda Wilder incident theorized that “plagiarizers on sites like Wattpad often commit their crimes to develop a following, so that they can rely on an established audience to purchase their own paid work.”22 Interestingly, the language used identifies the crime as “plagiarism,” rather than “piracy,” showing the different attitude towards sharing text as opposed to other media. But as Wattpad continues to grow, its users will probably share content on it for different reasons. Like the teenagers posting “no infringement intended” videos to YouTube, readers will copy their favourite books to Wattpad because they want to share them with their community of followers and take advantage of Wattpad’s in-line commenting system. Publishers will then have to make a difficult decision: take down these books and risk alienating a fan community, or leave them online? If Wattpad examines YouTube’s model for combating piracy, it may be able to truly cement its place in online reading while keeping publishers and readers happy, transitioning from a sharing economy to a true hybrid model.

References

1. Lessig, Lawrence. Free Culture: How Big Media Uses Technology and the Law to Lock down Culture and Control Creativity. New York: Penguin Press, 2004.

2. Copyright Term Extension Act, 1998. http://www.copyright.gov/legislation/s505.pdf.

3. Dash, Anil. “The Web We Lost.” Dashes.com. Accessed January 9, 2015. http://dashes.com/anil/2012/12/the-web-we-lost.html.

4. Lessig, Lawrence. Remix: Making Art and Culture Thrive in the Hybrid Economy. New York: Penguin Books, 2008, 117.

5. Jenkins, Henry, Sam Ford and Joshua Green. Spreadable Media: Creating Value and Meaning in a Networked Culture. New York: New York University Press, 2013, 60.

6. Jenkins, 64.

7. “He Man Sings 4 Non Blondes (Original).” YouTube. Accessed January 30, 2015. https://www.youtube.com/watch?v=X8Nc8RCLy1s.

8. “Buffy vs Edward: Twilight Remixed — [original Version].” YouTube. Accessed January 31, 2015. https://www.youtube.com/watch?v=RZwM3GvaTRM.

9. “Viacom International, Inc. et Al v. Youtube, Inc. et Al, 1:07-Cv-02103, No. 1 (S.D.N.Y. Mar. 13, 2007).” Docket Alarm, March 13, 2007. https://www.docketalarm.com/cases/New_York_Southern_District_Court/1--07-cv-02103/Viacom_International_Inc._et_al_v._Youtube_Inc._et_al/1/.

10. How Google Fights Piracy. Google, October 17, 2014. http://googlepublicpolicy.blogspot.ca/2014/10/continued-progress-on-fighting-piracy.html.

11. How Google Fights Piracy, Google.

12. Lessig, Remix, 177.

13. Burgess, Jean, and Joshua Green. YouTube: Online Video and Participatory Culture. John Wiley & Sons, 2013, 13.

14. Lipton, Jacqueline D. “Copyright, Plagiarism, and Emerging Norms in Digital Publishing.” Vanderbilt Journal of Entertainment and Technology Law 16, no.5, Spring (2014): 585. http://www.lexisnexis.com/lnacui2api/api/version1/getDocCui?lni=5CP4-X3W0-02C9-D08K&csi=239002&hl=t&hv=t&hnsd=f&hns=t&hgn=t&oc=00240&perma=true.

15. O’Reilly, Tim. “Piracy Is Progressive Taxation, and Other Thoughts on the Evolution of Online Distribution.” OpenP2P. Com, 2002. http://www.openp2p.com/pub/a/p2p/2002/12/11/piracy.html.

16. Masnick, Mike. “My MidemNet Presentation: Trent Reznor And The Formula For Future Music Business Models.” Techdirt 6 (2009). http://www.techdirt.com/articles/20090201/1408273588.shtml.

17. Landau, Emily. “The Wattpad Cult: Why Toronto’s Buzziest Tech Start-up Is a Self-Publishing App Beloved by Teen Girls.” Toronto Life, 2014. http://www.torontolife.com/informer/tech-informer/2014/11/10/the-wattpad-cult/3/.

18. Lessig, Remix, 128-141.

19. Lipton, “Copyright, Plagiarism, and Emerging Norms in Digital Publishing.”

20. “Copyrights – Help Center.” Wattpad. Accessed January 28, 2015. http://support.wattpad.com/hc/en-us/articles/200773104-Copyrights.

21. Rosen, Judith. “Wattpad Pirates Get Craftier.” Publishers Weekly, November 13, 2014. http://www.publishersweekly.com/pw/by-topic/digital/copyright/article/64657-wattpad-pirates-get-craftier.html.

22. Rosen, “Wattpad Pirates Get Craftier.”