Wikipedia: An Underrated Academic Resource

Introduction

Wikipedia has been regarded as “one of the iconic sites of the Web 2.0,” [9] “the king of online references,” [7] and “the connected world’s go-to reference source.” [6] As of December 2014, the online encyclopaedia contained more than twenty-six million articles in over 250 languages. [3] In addition, it is “consistently ranked in the top ten of the most popular websites in the world.” [9] However, academics have long been critical of Wikipedia’s credibility as an academic resource. Even today, the “Harvard Guide to Using Sources” includes a page outlining what the university thinks is “wrong with Wikipedia.” The information on this page reiterates the most common objections to the online encyclopaedia’s accuracy: anyone can contribute to it; the expertise of contributors is not evaluated; the information may be outdated; and the entries are not reviewed by experts. [2] Nevertheless, there is evidence of a growing shift in the attitude of some academics in favour of Wikipedia. For instance, a 2013/2014 study conducted at four California State University campuses found that faculty perception had shifted in favour of Wikipedia over the five-year period preceding the study. [3] This paper will argue that this shift in perception is well founded, given the potential of the Wikipedia model to produce high-quality content and provide an open-access publishing venue for academic content.

High Quality Peer Production 

Wikipedia is an excellent example of the kind of effective commons-based peer production that Yochai Benkler discusses in his book The Wealth of Networks. He argues that the development of the Web 2.0 and the networked information economy has been accompanied by a “rise of effective, large-scale cooperative efforts – peer production of information, knowledge, and culture,” which people participate in for personal gratification and not just capital gain. [11] Accordingly, Wikipedia owes its success to its “Wikipedians” – the huge community of people who volunteer to contribute to it. It has the huge advantage of benefiting from their cumulative knowledge. As Cass R. Sunstein writes in Infotopia: How Many Minds Produce Knowledge: “The involvement of many people ensures that Wikipedians are able to produce a much more comprehensive resource than a small group could, even a small group of experts.” [4] Thus, Wikipedia’s success exemplifies the potential of peer-produced content.

The quality of Wikipedia’s content has been favourably evaluated by various studies based on its comprehensiveness, currency, readability, and accuracy. [9] Due to the broad range of topics covered in its millions of entries, Wikipedia is considered to be “one of the most comprehensive sources in existence” [9] and has been recognized as “the most serious online alternative to the Encyclopaedia Britannica.” [11] In fact, Wikipedia has proven to be comparable and even superior based on certain measures of quality. For example, its “live, continuous online publishing model” allows it to have an unprecedented level of currency that is generally unmatched even by Britannica. [9] With regard to readability, a comparative study of Wikipedia and Britannica’s online entries found no significant differences in such quantitative measures as lexical density and the length of sentences and words. [9] Furthermore, a famous expert-led study carried out by the reputable scientific journal Nature in 2005 concluded that the accuracy of Wikipedia’s science entries is comparable to that of Encyclopaedia Britannica’s. Similar numbers of inaccuracies were found in both encyclopaedias, with the average Britannica entry containing around three errors and the average Wikipedia entry containing around four. [5] Considering that “the reliability of Wikipedia articles has generally improved over time” [9] and that ten years have passed since this study was conducted, it is reasonable to expect that today’s Wikipedia content is even more accurate.

Arguably, the two most important quality assurance mechanisms that Wikipedia has in place are its verifiability policy and community of dedicated Wikipedians. By requiring contributors to support their additions with citations to trustworthy external sources, Wikipedia is not only improving its reliability, but also helping ensure that articles are written based on verifiable facts or perspectives and not just personal opinion. [9] Any errors that do make it onto Wikipedia are usually rapidly corrected by the huge numbers of volunteers that are constantly revising its pages. [4] Thus, continuous real-time editing doesn’t only benefit Wikipedia’s currency, but inevitably its accuracy as well.

Alternative Venue for Academic Content

The popular use of Wikipedia by students, as well as the general public, should motivate academics to help improve its accuracy by contributing their expertise to it. Among Harvard’s objections to the online encyclopaedia is that its articles are generally neither written nor reviewed by experts. [2] The obvious way to address this concern is by getting experts involved. Alongside formal academic publishing, contributing to Wikipedia could help academics disseminate high-quality academic content on a platform that’s freely available to everyone, including those who cannot afford the cost of higher education. Some of those in academia are already working towards making this a reality.

Charles Matthews, a former Cambridge University math professor, has already edited more than 200,000 pages on Wikipedia. [7] Meanwhile, at Imperial College in London, a group of faculty and students has come together to “start legitimizing Wikipedia as a research source” by working to improve the content of its articles. [7] However, perhaps most impressively, in 2011 Harvard University Professor Mahzarin R. Banaji started an initiative to encourage the 25,000 members of the Association for Psychological Science (APS) to work towards improving the thousands of psychology articles on Wikipedia. [8, 1] She wants to ensure that the discipline is represented “as fully and as accurately as possible and thereby promote the free teaching of psychology worldwide.” [8] Soon after this project was launched, the American Sociological Association began a similar initiative based on the APS example. [10] Hopefully this will become a trend that will encourage academics in other disciplines to do the same.

Conclusion

Considering its huge size and popularity, it’s fair to assume that Wikipedia is a massive publishing project that isn’t going anywhere and cannot be ignored. In her book Planned Obsolescence: Publishing, Technology, and the Future of the Academy, Kathleen Fitzpatrick writes that: “Failing to engage fully with the intellectual merits of a project like Wikipedia, or with the ways in which Wikipedia represents one facet of a far-reaching change in contemporary epistemologies, is a mistake that we academics make at our own peril.” [6] It’s clear that all academics should start approaching Wikipedia more favourably. As detailed above, many studies have already confirmed the free online encyclopaedia’s quality and reliability, which can be even further improved with time. The Wikipedia model provides an excellent peer production platform that academics and students can utilise to publish high-quality content for anyone on the internet to access. It’s encouraging to see that some academics and initiatives are recognizing this potential and working towards shifting perspectives in Wikipedia’s favour. In the future, Wikipedia may become not only the most widely used, but also the most trusted, academic resource by students and educators alike. 

References

  1. “APS Wikipedia Initiative,” Association for Psychological Science. http://www.psychologicalscience.org/index.php/members/aps-wikipedia-initiative
  2. “What’s Wrong with Wikipedia?” Harvard University. http://isites.harvard.edu/icb/icb.do?keyword=k70847&pageid=icb.page346376
  3. Aline Soules, “Faculty perception of Wikipedia in the California State University System,” New Library World (2015), Vol. 116 Iss 3/4: 213–226. http://www.emeraldinsight.com/doi/abs/10.1108/NLW-08-2014-0096
  4. Cass R. Sunstein, Infotopia: How Many Minds Produce Knowledge (New York: Oxford University Press, 2006). https://global.oup.com/academic/product/infotopia-9780195189285?cc=ca&lang=en&
  5. J. Giles, “Internet Encyclopaedias go head to head,” Nature (2005), Vol. 438 No. 7070: 900-901. http://www.nature.com/nature/journal/v438/n7070/full/438900a.html
  6. Kathleen Fitzpatrick, Planned Obsolescence: Publishing, Technology, and the Future of the Academy (New York: NYU Press, 2009) http://mcpress.media-commons.org/plannedobsolescence/
  7. Liz Dwyer, “Could the Days of Wikipedia Being a Banned Research Source Be Over?” Good, March 25, 2011. http://magazine.good.is/articles/could-the-days-of-wikipedia-being-a-banned-research-source-be-over
  8. Liz Dwyer, “Harvard Academic Starts Initiative to Boost Accuracy of Wikipedia’s Psychology Articles,” Good, June 2, 2011. http://magazine.good.is/articles/harvard-academic-starts-initiative-to-boost-accuracy-of-wikipedia-s-psychology-articles
  9. M. Mesgari, C. Okoli, M. Mehdi, F. Å. Nielsen, and A. Lanamäki, “The sum of all human knowledge: A systematic review of scholarly research on the content of Wikipedia,” Journal of the Association for Information Science and Technology (2015), Vol. 66: 219–245. http://spectrum.library.concordia.ca/978618/1/WikiLit_Content_-_open_access_version.pdf
  10. Piotr Konieczny, “Rethinking Wikipedia for the Classroom,” Contexts (2014), Vol. 13 No. 1: 80-83 http://ctx.sagepub.com/content/13/1/80.full.pdf+html
  11. Yochai Benkler, The Wealth of Networks (New Haven: Yale University Press, 2006) http://cyber.law.harvard.edu/wealth_of_networks/Download_PDFs_of_the_book

The Rise of E-Reading Mobile Apps on the Prospects of Publishing

Introduction

The publishing industry is a dynamic sector that has undergone several transformations since the advent of the Internet. As technology has advanced, players in this industry have developed systems, ranging from online retailing and digital printing to, more recently, digital book formats, to meet the increasing demands of their target audience. These technologies seem to have caught on well with the market and are gaining traction.

In spite of this growth in digital publishing, one challenge that persisted was limited access to electronically published content such as books, magazines, and newspapers. This is because some e-books were only made available on specially designed e-readers sold by specific bookstores. Publishers’ target markets were narrowed, and potential readers without these e-readers were left out.

To improve readers’ access to digital books, several mobile e-reading apps have been developed. This trend follows the gradual growth of mobile apps and users’ preference for them over mobile websites. These e-reading apps provide users with faster and easier access to digitally published content (books, newspapers and periodicals) on any mobile device, expanding the scope and reach of publications and providing increased returns to publishers.

This study seeks to examine the advent of e-reading mobile apps and the impact they are having on the publishing industry (publishers and readers). It focuses on the factors propelling the usage of mobile apps, the effects they have on publishing firms, users’ experiences with these apps as well as the future prospects of e-reading mobile apps for the publishing industry, including some recommendations on what publishers can do to maximize the potential of these reading apps.

Mobile Apps

To begin with, mobile apps are application software developed for mobile devices such as phones and tablets, usually for information retrieval and communication (Mobile Apps, n.d.). These applications are either pre-installed by the manufacturer or installed by the user, either at a cost or for free.

Most businesses use mobile apps as a revenue-generating platform and as a way of promoting their content or products, sharing information with consumers, tracking their users, and notifying users of special events and emergencies when the need arises. For example, financial institutions use mobile apps to interact with their customers and provide them with services such as paying bills, transferring and receiving money, and checking account balances.

E-Reading Apps

E-reading apps are a special kind of app produced specifically for reading. They are provided through existing retail eBook distribution channels such as Kindle, Nook, Marvin, Kobo, and Google Play Books. They offer readers value-added features such as highlighting text, annotating, making notes, and audio features.

Each app is designed for particular platforms, which helps to differentiate one app from another. For example, Aldiko is specifically designed for Android and supports EPUB, PDF, and Adobe DRM-encrypted eBooks. It offers the ability to add notes and highlight text while reading. FBReader is also for Android, has a dictionary, and supports finding books online. Google Play Books also has font and typeface customization, text highlighting, a dictionary, and map search. iBooks comes with a feature to sync all collections, bookmarks, and notes (Corpuz, 2013). These features offered to readers through apps are seen as tools for making reading easier and more comfortable.

Impact of Mobile Apps on Publishing

The advent of e-reading apps has affected the publishing industry both positively and negatively.

First of all, the usage of mobile reading devices has undoubtedly increased the audience of many publishers. Because of the increase in the number of potential readers, publishers’ sales volumes are likely to be affected positively. Whereas eBooks were previously available only to owners of e-readers, the advent of apps has expanded the market by reaching owners of mobile devices other than traditional e-readers.

Secondly, e-reading apps give publishers the opportunity to know how readers interact with books. Publishers are able to track how many hours a reader spends on a particular book, how much time they spend in a session, where they start and pause, and which genres they prefer to read (Alter, 2012). This tracking allows publishers to learn the preferences of their audience and to analyze and refine their services based on readers’ reactions while reading a book (Alter, 2012).

Also, e-reading apps have become an important tool for publishers in promoting and branding their products, as well as offering users easy access to their publications. Apps serve as a way of increasing a book’s visibility to customers and providing them with information as they request it. These apps also serve as a channel for publishers to interact with their readers.

Despite the above-mentioned benefits of mobile apps, there are some challenges that pose threats to publishers.

First, readers have high expectations of quality but are not ready to pay the price that comes with meeting such demand (Hall, 2013). Irrespective of how many resources are invested in the production of eBooks, publishers are pushed to consider pricing their books very low in order to attract more buyers.

Another concern expressed by most publishers has to do with the illegal means by which readers gain access to published books. Even with Digital Rights Management (DRM) systems in place, some readers have found ways of removing DRM from eBooks. This leaves publishers at risk of sales returns that do not properly reflect how widely their books are read.

Readers’ experiences with e-reading mobile apps

To examine the different experiences people have with e-reading apps, user reviews of three e-reading apps, from Kobo, Kindle and Nook, were observed.

Kobo Reading Apps

“Still love its simple yet effective interface I’ve been using this app since 2011 & still love it. I now have it on 3 devices and syncs beautifully! Don’t know what people are complaining about” March 26, 2015

“About epub extension I like this app but not the way this crumbles epub book internally… besides, this doesn’t catch up epub extensions itself, so the only way is it to import if recognises the file” March 14, 2015

Kindle Reading Apps

“Obviously all such apps basically do what they are supposed to, but actually I do notice differences, other than ‘fancy visuals like 3D page-turning.’ I find Apple’s books implementation on the iPad much easier than the Kindle app. Not at all because of the skeuomorphic 3D look, but rather because of the ease-of-use” May 2, 2013

“I love my iPad and the Kindle app. My only frustration is that even though I have an Amazon Prime membership, I can’t access the Kindle Lending Library because I don’t own the actual ‘Kindle Device.’” July 13, 2013

Nook Reading Apps

“I love the recently updated version of the nook app. Shopping is easier, I can see the balance on my gift cards, and locating the book I am currently reading is easier than ever. I recommend this app to people all the time” April 3, 2015

“I love the new version of this app! It’s more like the nook. I also like that you can watch video directly from it rather than needing to download the other app” April 4, 2015

These comments from users indicate that people like the reading apps but are not fully satisfied with what is being offered. They expect more than what the apps offer now. This is a call for publishers and app developers to start extending their services beyond what they currently provide.

The prospects of e-reading mobile apps on the future of the publishing industry

With the evolution of mobile apps, mobile users’ attention is gradually shifting from the web to apps. People’s attention and use of the internet are gradually moving from personal computers (PCs) to mobile devices. A study covering February 2013 to January 2014 in the US indicates that the usage of mobile devices has eclipsed the use of the internet on PCs (Murtagh, 2014). About 99.5% of these consumers use mobile devices to access content or information. A breakdown of this study is shown below.

[Figure: breakdown of mobile versus PC internet usage in the US (Murtagh, 2014)]

On mobile devices, apps are the preferred platform compared to websites. A Compuware survey of 3,534 smartphone users in the UK, US, France, Germany, India and Japan reports that 85% of American consumers prefer to use mobile apps (Moth, 2013). The reasons for their preference, as indicated in the study, are that mobile apps are convenient and fast, ease browsing, offer a better user experience, and make it easy to access bank accounts and to shop. The diagram below shows consumers’ preferences and reasons for using mobile apps.

[Figure: consumers’ preferences and reasons for using mobile apps (Moth, 2013)]

This is an indication that mobile apps are gradually taking over the market for most businesses. For publishers, apps would be a great avenue to reach their audience, since potential readers spend most of their time on these mobile devices. Given the progressive increase in the use of mobile devices, there is a likelihood that mobile apps for books will take over the book market.

For developing nations, this would be an opportunity for most readers to get access to books through their mobile phones. The mobile phone is the most used mobile device in developing nations because of the high cost of other devices such as tablets, laptops, and e-readers. With its many developing countries, Africa is seen as the fastest-growing continent for mobile phones, with a rate of about 89% (Zell, 2013). For readers in such regions to get access to eBooks, mobile e-reading apps would be a great asset, both to them and to publishers, whose market would expand.

This is not to predict the future of publishing but to look, from a positive perspective, at how publishers can benefit from these apps while giving readers easy access to books and providing them with features that enhance their reading and learning.

Maximizing the potential of E-reading Apps

In light of the user reviews above, publishers need to explore new and innovative ways of reaching readers.

First of all, to make these apps more useful to readers, publishers could provide new interactive features. One example is a more reader-friendly audio app for the visually impaired that enables them to highlight, make notes, and annotate while they listen to their audiobooks. The iOS Kindle reading app offers a similar feature (Moscaritolo, 2013). Even though these consumers cannot perform such activities by sight, the audio app can be built to receive commands and provide the services the reader requests while reading. A clear example of such a feature is Google’s TalkBack, provided on Android. Just as TalkBack tells users which icons they have tapped, audiobook apps could let the user interact with the text through a similar talk-back function. This would enable them to request services such as highlighting text, finding the meaning of words, and annotating. This feature could also attract more users with visual disabilities.

Secondly, publishers could strengthen subscription models to maintain readership on these mobile devices. This would be a way to offer readers reading packages (monthly, half-yearly, yearly, etc.). An example of such a subscription model for e-reading is Oyster. Oyster allows readers on both iOS and Android to subscribe. This subscription model comes with a free trial and offers one-month, three-month, six-month and one-year subscription plans. It allows readers to get unlimited access to a catalogue of books that is updated every day. It also allows readers to search and explore recommended titles based on their likes. Subscribers are also able to download books and read them offline. Such a subscription model would help publishers build their audience and offer readers packages that help bolster sales.

Given the potential of the mobile app market in developing countries, there is also a need to offer readers there the opportunity to pay for books online. To provide that service, mobile payment systems could be adopted, mainly because debit and credit card payments are not very common in such regions. These mobile service solutions could create a payment option for these readers because they have already served as a means of offering financial services to certain communities in some developing countries (Chaia et al., 2010). This mode of online payment is still very uncommon but would be an opportunity for publishers to connect with new readers as they expand their market to the developing world.

Conclusion

I agree with Natasha Clark’s statement that “Mobile could be the key to developing your business and bringing in more traffic, advertising and sales” (Clark, 2014), and I believe that mobile apps are great channels through which companies are branding their businesses and products. The publishing industry is undoubtedly part of this evolution, as publishers use these apps to brand their firms, market books, reach out to readers, serve the needs of readers, offer features that make online reading very comfortable, and, most importantly, enhance the learning process through the various tools provided. E-reading apps are a flourishing and rewarding feature of the publishing industry.

References

Clark, N. (2014) Should your Business Develop a mobile app? http://business-technology.co.uk/2014/06/should-your-business-develop-a-mobile-app/ accessed on 3/31/2015

Compuware http://www.compuware.com/

Moth, D. (2013) 85% of consumers favour apps over mobile websites https://econsultancy.com/blog/62326-85-of-consumers-favour-apps-over-mobile-websites/ accessed on 3/31/2015.

Corpuz, J. (2013) 12 Best eBook Reader Apps http://www.tomsguide.com/us/pictures-story/583-best-ereader-apps.html accessed on 3/31/2015.

Alter, A. (2012) Your E-Book is Reading You? http://www.wsj.com/articles/SB10001424052702304870304577490950051438304 accessed on 3/31/2015.

Edwards, J. (2014) Mobile Apps Are Killing The Free Web, Handing A Censored Duopoly to Google and Apple http://www.businessinsider.com/mobile-web-vs-app-usage-statistics-2014-4 accessed on  3/31/2015.

Fiegerman, S. (2013) Oyster Releases the First True Netflix for E-Book App http://mashable.com/2013/09/05/oyster-launch/ accessed on 3/31/2015.

Google Play book https://play.google.com/store/apps/details?id=com.google.android.apps.books&hl=en

Google TalkBack http://www.androidcentral.com/what-google-talk-back accessed 3/31/2015

Hall, F. (2013) The Business of Digital Publishing: An Introduction to the Digital Book and Journal Industries. Routledge: London

Khalaf, S. (2014) Apps Solidify Leadership Six Years into the  Mobile Revolution. http://www.flurry.com/bid/109749/Apps-Solidify-Leadership-Six-Years-into-the-Mobile-Revolution#.VSB7_XufjIV accessed on 3/31/2015.

Kindle e reading app https://www.amazon.ca/gp/digital/fiona/kcp-landing-page?ie=UTF8&ref_=klp_mn

Kobo https://www.kobo.com/koboarc7hd#reading life

Marvin http://www.appstafarian.com/marvin.html

Mobile Apps (n.d.) http://en.wikipedia.org/wiki/Mobile_app accessed on 4/3/2015

Moscaritolo, A. (2013) Amazon updates iOS Kindle Reading App for Blind, Visually Impaired http://www.pcmag.com/article2/0,2817,2418410,00.asp accessed on 3/31/2015.

Murtagh, R. (2014) Mobile now Exceeds PC: The Biggest shift since the internet Began http://searchenginewatch.com/sew/opinion/2353616/mobile-now-exceeds-pc-the-biggest-shift-since-the-internet-began accessed on 3/31/2015.

Nielsen (2014) An Era of Growth: The cross-platform report Q4 2014 http://www.nielsen.com/us/en/insights/reports/2014/an-era-of-growth-the-cross-platform-report.html accessed on 3/31/2015.

Siegler, M. (2008). “Analyst: There’s a great future in iPhone apps”. Venture Beat.

Zell, H. (2013). Print vs Electronic, and the Digital Revolution in Africa http://www.academia.edu/2514725/Print_vs_Electronic_and_the_Digital_Revolution_in_Africa

Publishing and privacy: why publishers should back personal data services

Abstract

Publishers, like other corporations, will have to develop a strategy that protects users’ data if they want to maintain a relationship of trust. As I will argue in this essay, publishers are ideally placed to support initiatives that preserve privacy because of reading’s historical link with interiority and solitude, and because their business model is based on the idea of information as valuable. In this essay, I will outline one tool for preserving privacy, the Personal Data Service (PDS).  These services protect individuals’ privacy, while allowing corporations access to better and richer data. They create a sense of transparency that allows corporations to interact with people in a more meaningful and value-creating way.

I will argue that publishers should support PDSs. First, they allow publishers potential access to a wealth of verified, trusted data. Second, they create a private space for readers who choose to read without being tracked. Third, they parallel what publishers are already doing by creating ways for people to set and control access to information.

Context

If horror movies are the expression of our collective fears (Philips 2005), then Unfriended is the perfect case study. The action unfolds entirely on-screen, following the interactions of five friends on Skype, Facebook chat and text messages–all of which are being manipulated by an unknown stranger. As Spotify’s suggestions grow steadily creepier, gory deaths ensue (Gingold 2014). Blood splatter aside, Unfriended perfectly captures a zeitgeist of fear about information technology. The victims’ intimate encounters are watched and recorded by a sinister, faceless entity. Their inability to carve out a private space reveals a cultural anxiety about being tracked, manipulated, and exposed online.

This anxiety is not only expressed in horror movies: a 2015 Pew study found that 91% of Americans feel they have lost control over corporations’ use of their personal information online (Madden 2015). And they have good reason to feel this way. Facebook founder and CEO Mark Zuckerberg declared in 2010 that privacy was no longer a “social norm,” to much criticism (Johnson 2010). Taxi start-up Uber came under fire for allowing employees a “God view” of individuals’ movements in real time (Timberg 2014). Online services store massive amounts of “anonymized” data that is not really anonymous–one study found that researchers could identify 95% of people using only four GPS location points, even when the data was of low accuracy (de Montjoye et al. 2012). And the 2013 Edward Snowden scandal revealed that data is not only being used by corporations: governments are aggregating and analyzing massive quantities of information to build “a pattern of life” of anyone even loosely associated with suspected terrorists (Davis et al. 2013).

There is a growing number of people who ask whether their personal data belongs in the corporations’ hands. “There is a growing view… that data is a personal asset,” Alan Mitchell, UK strategic advisor on privacy, says. “The full potential value can only be realised if individuals are able to control what personal information they share with who, for what purposes, under what terms and conditions; and if they can realise the benefits (including financial benefits) of doing so.” (Mitchell 2012). A World Economic Forum report adds that data ownership should be thought of in terms of old English common law: as the right to possess, use, and distribute, rather than as physical ownership (Dutta and Mia 2009). Individuals should be allowed to control who accesses their data and why, as well as to come up with new uses for it.

The problem is that personal data is siloed across dozens of websites. Users must navigate confusing and constantly shifting End User License Agreements (EULAs) to discover what information is being collected about them. Their consent is passive: if they don’t agree to the terms of use, their only option is to avoid using the service. When it comes to services such as Facebook, LinkedIn, or Twitter, which are used by millions of people worldwide to network and socialize, the decision to abstain can impact professional and social life (Dimicco 2009, Burke et al. 2010, Kim and Lee 2011). Consenting to EULAs or abstaining from a service are not adequate choices: people need a regulated, safe way to set privacy levels they are comfortable with.

Personal Data Service

Enter the Personal Data Service (PDS). PDSs are based on the idea that individuals own their data and should be allowed to control the flow of their personal information. The PDS stores or aggregates data, displays it to the user, and allows users to download and share it in machine readable format (Reed). Users set their own terms for who can access their data and why. This affords the user what Helen Nissenbaum terms “context-relative informational norms”: the ability to share data with appropriate parties only (Nissenbaum 2009). People willingly share financial information with their bank, and medical information with their doctor, but aren’t comfortable telling their bank manager about their bunions or their doctor about their student debt. Sharing information in context creates a space where people feel more secure about their privacy (Nissenbaum 2009). At the same time, PDSs give companies access to a richer combined data set, including Volunteered Personal Information (VPI) (Mitchell 2012). A study into PDS use found that people share 12% more information when they are explicitly told how their data will be used (Ctrl Shift 2014). Consumers can create a single online identity with their likes and dislikes, allowing corporations to give them more valuable and relevant offers. In other words, companies can interact with them as individuals rather than as a demographic. By providing a safe space for individuals to store their sensitive data, PDSs benefit both companies and individuals.

So how do they work? To start, the PDS provider must access personal data in some way. Some may simply aggregate data stored by online services and display it to the user as a dashboard or in a database format (Ctrl Shift 2014). However, more robust implementations will store the data on a central or personal server: “given the huge number of data sources that a user interacts with on a daily basis, interoperability is not enough. Rather, the user needs to actually own a secured space, a Personal Data Store acting as a centralized location where his data live” (openPDS). In this implementation, the PDS acts as a buffer between the web service and the end user. It captures any user-generated raw data (such as GPS location, form entries, or preferences) and stores it securely.

The web service never accesses this captured data directly. Instead, it sends a query to the PDS and the PDS sends back an answer (de Montjoye 2014). For example, Netflix may want to know whether to recommend House of Cards or Star Trek to you. It will send a request to the PDS with code that uses some combination of demographic information, viewing habits, and geospatial location to predict what you would prefer. The PDS evaluates whether the information requested fits the privacy preferences you have set. If it does, it sends the result back to Netflix: in this case, “House of Cards” or “Star Trek.” Netflix need never know the fine-grained information that led to this result. In another case, Netflix may run similar code against multiple users’ aggregated data to draw conclusions about an entire population. In either case, “the dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information” (de Montjoye 2014). Companies can continue to use personal data while respecting individuals’ privacy.
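The query-and-answer exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not openPDS code: the class PersonalDataStore, its answer method, the field names, and the recommend function are all hypothetical, and a real PDS would add authentication, auditing, and anonymity checks.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Optional, Set


@dataclass
class PersonalDataStore:
    """Hypothetical PDS: holds raw data and returns only answers."""
    raw_data: Dict[str, Any]
    # Fields the user has agreed to let services compute on (assumed policy model).
    allowed_fields: Set[str] = field(default_factory=set)

    def answer(
        self,
        requested_fields: Iterable[str],
        question: Callable[[Dict[str, Any]], str],
    ) -> Optional[str]:
        """Run the service's question inside the PDS and return only its result."""
        requested = set(requested_fields)
        # Refuse the query if it touches anything outside the user's preferences.
        if not requested <= self.allowed_fields:
            return None
        visible = {name: self.raw_data[name] for name in requested}
        # Only the low-dimensional answer leaves the store, never the raw data.
        return question(visible)


def recommend(data: Dict[str, Any]) -> str:
    """Hypothetical service-side code: choose a title from viewing habits."""
    if data["sci_fi_hours"] > data["drama_hours"]:
        return "Star Trek"
    return "House of Cards"


pds = PersonalDataStore(
    raw_data={"sci_fi_hours": 12, "drama_hours": 5, "gps": (49.28, -123.12)},
    allowed_fields={"sci_fi_hours", "drama_hours"},
)

allowed_result = pds.answer(["sci_fi_hours", "drama_hours"], recommend)
refused_result = pds.answer(["gps"], lambda data: str(data["gps"]))
print(allowed_result)  # Star Trek
print(refused_result)  # None
```

In this sketch the service learns a single title and nothing about hours watched or location, and a request for a field the user has not opened up is simply refused.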

PDS implementation (de Montjoye 2014)

Like a bank, the PDS enters into a binding legal contract with its customers, pledging to protect their data from unwanted access. Mydex, for example, legally structured their business as a nonprofit organization that could never be acquired by a corporation or government (Mydex). “We knew trust was absolutely paramount,” a spokesperson states (Mydex).  Like banks, in order to compete amongst themselves, PDSs will need to prove to their clients that they are trustworthy. This means keeping personal information safe from attack and exploitation: for example, by including checks to make sure that the information returned to service queries is sufficiently anonymous, and identifying and blocking untrustworthy requests (de Montjoye 2014).

Test runs of PDSs have been a success: 81% of beta users of one service, openPDS, said they would use it in their personal life (de Montjoye 2014). Despite this, adoption is slow. “Deployment on a large-scale is a chicken-and-egg problem; users are waiting for compatible services while services are waiting for user adoption,” de Montjoye says. Without wide consumer demand, companies are unlikely to give up control of their databanks. However, political support in combination with technological advancement may be enough to spur change (de Montjoye 2014).

That support may be coming. In 2011, the UK launched its voluntary Midata program, asking corporations to put data back into the hands of users (Midata 2014). In 2012, the European Commission proposed a reform of data protection that affirms “individuals’ right to be forgotten, to have easier access to their data, and to be able to easily transfer them” (openPDS). In response to this changing view of privacy, dozens of PDSs have sprung up around the world, gathering millions of dollars each in financial backing (Ctrl Shift 2012).

PDS and Publishers

With PDSs on the rise, publishers need to pay attention. As content creators who often rely on advertising and market data, publishers can benefit by becoming early backers of this service. PDSs allow publishers to access more data about their customers, levelling the playing field between publishers and Amazon. They create a sense of privacy for the reader, preserving the mystique of picking up a book. And they are rooted in a concept of ownership of information, which aligns nicely with publishers’ ideals.

Data and marketing

With the rise of PDS, publishers will have new advantages. Currently, traditional publishers are disadvantaged when it comes to gathering data because booksellers act as intermediaries between them and their consumers. Through PDSs, publishers will have access to a wealth of information from different streams about what their readers want and like. “The proposed framework removes barriers to entry for new businesses, allowing the most innovative algorithmic companies to provide better data-powered services,” de Montjoye writes of openPDS (de Montjoye 2014). Publishers, who take on the financial risk of backing a book with little data (Dunlop 2015), will finally have the same advantages as Amazon and Kobo.

Privacy

Beyond the commercial benefits of PDS, the values it serves to uphold–privacy and ownership of information–could align with publishers’ ethos and re-establish their foothold in the community. PDSs are built on a desire for individuals to maintain privacy by controlling access to their information, and publishers have an intimate relationship with privacy. In fact, some argue that privacy coevolved with the technology of print. Jagodzinski notes that the traditional definition of “privacy” was negative: it denoted an individual who, through the absence of public office, had no power to lead his community (Jagodzinski 1999, 23). As books became portable and abundant, literacy rose and with it, silent reading. People were able to absorb information in solitude rather than through public conversation. A sense of interiority developed, and the word “privacy” evolved to take on a more positive meaning (Jagodzinski 1999). Spacks elaborates on this, connecting reading with “individual fantasy,” “withdrawal from the public sphere,” and “the opportunity to explore and solidify the self” (Spacks 2003, 28-29). Such outcomes are only possible when reading takes place in a relatively private space–not private in the sense of solitary, but in the sense of having an unwatched space to think about and process the text. This privacy allows people to explore new ideas free from outside judgement.

But reading is no longer a private activity: it is now yet another way to gather data. Ebook vendors such as Kobo track not only which books are bought, but when and how often they’re read. The information is accurate enough to track reader engagement chapter by chapter and even page by page (Kobo 2014). Online, websites store incredibly detailed information about what you read, building a profile of you from page to page. Academic journals are no better: 16 out of the top 20 research journals allow trackers to spy on their readers (Hellman 2015). Many users do not realize to what extent they are being watched. “The psychological privacy afforded by communication channels may lull users into a false assumption of informational privacy,” Walther writes (Walther 2011, 4). But as awareness grows (Madden 2015), users will employ PDSs to limit when and how they are tracked. With a PDS, data collection is transparent to the user and sufficiently anonymous to the tracker, so readers will be able to build a private space online that mimics that of print.

Copyright and access

The advent of PDS will also change the way people relate to information as property. Proponents of open access to information have long argued that “information wants to be free” (Clarke 2000). However, most people are less comfortable with free access to their own information. In a culture where so much content is available for free online, the new slogan is, “If you are not paying for it, you’re the product being sold” (Fitzpatrick 2010). “Free” information is paid for in advertising dollars spent by companies trying to reach and track engaged audiences.

Books are no exception. In the past, Nakamura says, we paid for books but our conversations about them were free. Now that paradigm has shifted: “today books are free through Google Books and Internet Archive and, much to the consternation of publishers, through torrent sites like Pirate Bay and Media Fire, but we pay to create readerly communities on social networks like Goodreads. We pay with our attention and our readerly capital, our LOLs, rankings, conversations, and insights” (Nakamura 7). User-created data is an indirect payment. Even with print books, customers trade their data for rewards and discounts. Booksellers’ use of data is cast in terms of labour and production: “each transaction customers make using their loyalty cards produces valuable data for these booksellers. In effect, they are outsourcing the costly labor of market research to their most loyal customers, who ironically buy back the labor they’ve freely given with each subsequent purchase” (Striphas 2010). Information exchange becomes a grim commerce, where the customer is the worker, product and buyer. In such exchanges, information is anything but free: it is heavily commodified. Content is still paid for, although the payment is invisible to many users.

PDSs, on the other hand, spring from the idea of data as a form of personal property. The first American text to argue for the right to privacy grounds it in property law, similarly to notions of copyright: “The right of property in its widest sense, including all possession, including all rights and privileges, and hence embracing the right to an inviolate personality, affords alone that broad basis upon which the protection which the individual demands can be rested” (Warren and Brandeis 1890). The right to an “inviolate” self is a form of property over all private information about that self. What determines whether information is private? To Warren and Brandeis, information is private until it is published: “The common law secures to each individual the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others… The right to privacy ceases upon the publication of the facts by the individual, or with his consent” (Warren and Brandeis 1890). Publication is seen as the boundary at which individuals relinquish control over their information.

With the advent of digital publishing, however, “publication” is no longer as clear cut. Are you “publishing” when you post to a Facebook profile that is restricted to friends? When you send an email through an online service, is it still private if it’s filtered through an algorithm that builds a profile of you? PDS erases these distinctions by allowing individuals to decide for themselves when their information is “published” and when it is private. In effect, they are acting as their own publishers, curating and editing their data and setting terms for access. In some PDS implementations, they are even allowed to set a price for access to their data. This way of looking at information bodes well for publishers and content providers in general. Information can still be “free,” in the sense of accessible to anyone. But publishers may be able to find a price that captures the value of the content they provide. If people recognize their own information as valuable, they will begin to view others’ information as valuable too (Lanier 2013). This could be good news for those segments of the publishing industry that have not yet figured out a way to make money off free content.

Conclusion

Personal data services have yet to catch on, but the problem they address is real. Movies like Unfriended are just the tip of the cultural iceberg. A sense of anxiety pervades the digital space, as people worry about the permanence of their digital footprint and about the ways they are being tracked. Addressing privacy concerns online is not just an issue for publishers, it’s a social issue–but it’s one that publishers have a vested interest in. The commercial benefits of opening data access to publishers are enormous, and could help tremendously to identify and reach new markets. In a world where people feel “digitally crowded” (Joinson 2011), publishers could open up a cool oasis of privacy, allowing readers to explore new territory without fear of surveillance. And if publishers want consumers to treat content as valuable, they can begin by treating consumers’ information in the same way. Publishers will need to take a long, hard look at digital privacy and how it fits in with their vision for the future.

 

References

Burke, Moira, Cameron Marlow, and Thomas Lento. “Social Network Activity and Social Well-Being.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1909–12. ACM, 2010. http://dl.acm.org.proxy.lib.sfu.ca/ft_gateway.cfm?id=1753613&ftid=770043&dwn=1&CFID=496986671&CFTOKEN=32475132.

 

Clarke, Roger. “Information Wants to Be Free.” Roger Clarke, 2000. http://www.rogerclarke.com/II/IWtbF.html.

 

Davis, Kenan, Nadia Popovich, Kenton Powell, Ewen MacAskill, Ruth Spencer, and Lisa van Gelder. “NSA Files Decoded: Edward Snowden’s Surveillance Revelations Explained.” The Guardian, November 1, 2013. http://www.theguardian.com/world/interactive/2013/nov/01/snowden-nsa-files-surveillance-revelations-decoded#section/6.

 

De Montjoye, Yves-Alexandre, Cesar A. Hidalgo, Michel Verleysen, and Vincent D. Blondel. “Unique in the Crowd: The Privacy Bounds of Human Mobility.” Sci. Rep. 3 (March 25, 2013). doi:10.1038/srep01376.

 

De Montjoye, Yves-Alexandre, Erez Shmueli, Samuel S. Wang, and Alex Sandy Pentland. “openPDS: Protecting the Privacy of Metadata through SafeAnswers.” Edited by Tobias Preis. PLoS ONE 9, no. 7 (July 9, 2014): e98790. doi:10.1371/journal.pone.0098790.

 

Dunlop, Laura. “Accessing Big Data: The Key to Publishers Taking Back the Power.” PUB 802: Canadian Centre for Studies in Publishing, SFU, February 27, 2015. http://tkbr.publishing.sfu.ca/pub802/2015/02/accessing-big-data-the-key-to-publishers-taking-back-the-power/.

 

Dutta, Soumitra, and Irene Mia. “Global Information Technology Report 2008-2009.” World Economic Forum, 2009. http://hd.media.mit.edu/wef_globalit.pdf.

 

Fitzpatrick, Jason. “If You’re Not Paying for It; You’re the Product.” LifeHacker, November 23, 2010. http://lifehacker.com/5697167/if-youre-not-paying-for-it-youre-the-product.

 

Gingold, Michael. “‘UNFRIENDED’ (aka ‘CYBERNATURAL’; Fantasia Movie Review).” Fangoria, July 21, 2014. http://www.fangoria.com/new/unfriended-cybernatural-fantasia-movie-review/.

 

Hellman, Eric. “16 of the Top 20 Research Journals Let Ad Networks Spy on Their Readers.” Go To Hellman, March 12, 2015. http://go-to-hellman.blogspot.ca/2015/03/16-of-top-20-research-journals-let-ad.html.

 

Jagodzinski, Cecile M. Privacy and Print: Reading and Writing in Seventeenth-Century England. University of Virginia Press, 1999.

 

Johnson, Bobby. “Privacy No Longer a Social Norm, Says Facebook Founder.” The Guardian, January 11, 2010, sec. Technology. http://www.theguardian.com/technology/2010/jan/11/facebook-privacy.

 

Joinson, Adam N., David J. Houghton, Asimina Vasalou, and Ben L. Marder. “Digital Crowding: Privacy, Self-Disclosure, and Technology.” In Privacy Online, 33–45. Springer, 2011.

 

Kim, Junghyun, and Jong-Eun Roselyn Lee. “The Facebook Paths to Happiness: Effects of the Number of Facebook Friends and Self-Presentation on Subjective Well-Being.” Cyberpsychology, Behavior, and Social Networking 14, no. 6 (2011): 359–64. doi:10.1089/cyber.2010.0374.

 

 

Lanier, Jaron. Who Owns the Future?. First Simon & Schuster hardcover edition. New York: Simon & Schuster, 2013.

 

Madden, Mary. “Privacy and Cybersecurity: Key Findings from Pew Research.” Pew Research Center, January 16, 2015. http://www.pewresearch.org/key-data-points/privacy/.

 

Midata Voluntary Programme: Review. Consumer Protection. UK: Department for Business, Innovation & Skills, July 8, 2014. https://www.gov.uk/government/publications/midata-voluntary-programme-review.

 

Mitchell, Alan. “Personal Data Stores Will Liberate Us from a Toxic Privacy Battleground.” Wired UK, May 30, 2012. http://www.wired.co.uk/news/archive/2012-05/30/ideas-bank-personal-data-stores.

 

Nakamura, Lisa. “‘Words with Friends’: Socially Networked Reading on Goodreads.” PMLA 128, no. 1 (2013): 238–43.

 

“New Market for ‘Empowering’ Personal Data Services ‘Will Transform Relationships between Customers and Brands.’” Ctrl Shift, March 20, 2014. https://www.ctrl-shift.co.uk/news/2014/03/20/new-market-for-empowering-personal-data-services-will-transform-relationships-between-customers-and-brands/.

 

Nissenbaum, Helen. Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford University Press, 2009.

 

“openPDS/SafeAnswers – The Privacy-Preserving Personal Data Store.” OpenPDS. Accessed April 6, 2015. http://openpds.media.mit.edu/.

 

Personal Data Stores. Ctrl Shift, April 30, 2012. https://www.ctrl-shift.co.uk/research/product/64.

 

Phillips, Kendall R. Projected Fears: Horror Films and American Culture. Praeger Publishers, 2005.

 

Reed, Drummond. “Revision: ‘Personal Data Service’ AND ‘Personal Data Store’ Go Together.” Equals Drummond, October 6, 2010. http://equalsdrummond.name/2010/10/06/revision-personal-data-service-and-personal-data-store/.

 

Spacks, Patricia Meyer. Privacy: Concealing the Eighteenth-Century Self. Chicago, IL, USA: University of Chicago Press, 2003. http://site.ebrary.com/lib/sfu/docDetail.action?docID=10468497.

 

Steinfield, Charles, Joan M. DiMicco, Nicole B. Ellison, and Cliff Lampe. “Bowling Online: Social Networking and Social Capital within the Organization.” In Proceedings of the Fourth International Conference on Communities and Technologies, 245–54. ACM, 2009. http://socio-informatics.de/fileadmin/IISI/upload/2009/p245.pdf.

 

St. John, Jeffrey. “The Late Age of Print: Everyday Book Culture from Consumerism to Control, by Ted Striphas,” 2010.

 

Timberg, Craig. “Is Uber’s Rider Database a Sitting Duck for Hackers?” The Washington Post, December 1, 2014. http://www.washingtonpost.com/blogs/the-switch/wp/2014/12/01/is-ubers-rider-database-a-sitting-duck-for-hackers/.

 

Trepte, Sabine, and Leonard Reinecke, eds. Privacy Online. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://link.springer.com/10.1007/978-3-642-21521-6.

“Understanding Personal Data Stores (PDS).” Mydex. Accessed April 6, 2015. https://mydex.org/understand-pds/.

 

Viseu, Ana, Andrew Clement, and Jane Aspinall. “Situating Privacy Online.” Information, Communication & Society 7, no. 1 (January 1, 2004): 92–114. doi:10.1080/1369118042000208924.

 

Walther, Joseph B. “Introduction to Privacy Online.” In Privacy Online, 3–8. Springer, 2011.

 

Warren, Samuel D., and Louis D. Brandeis. “The Right to Privacy.” Harvard Law Review, 1890, 193–220.

Computer Generated Fiction

If computers are able to write content that is indistinguishable from human authored writing, what will this mean? If they can one day write anything from travel guides to literary novels, will people read them? Would they trust that a computer knows where the best restaurants in Atlanta are or the best hotels in Paris? What about literature? Humans plumb the depths of their emotions and draw on extraordinary experiences to create great literary works; could a computer possibly do the same? If this comes to fruition and people can distinguish between a human and a computer, which will they prefer? If they prefer the computer-generated content, what would this mean for authors?

Starting with what computers can already do, we can look at information compilations such as travel guides. People will read them because a computer can pull human reviews from the internet and compile more of them, more quickly, than any human ever could, which makes it the best authority on anything that is subject to reviews. It will likely have a better review for you than your best friend’s actual experience.

Once computers have advanced past that stage and on to simple or formulaic literature, humans will not care who or what wrote it as long as it is entertaining. The most common formulaic genre is romance, and romance novels are the best-selling paperback fiction category in North America. Computers can be fed formulaic plot lines and stock characters to work with, and will be able to read through a million novels to learn which words and phrases humans like best. Humans already read extremely formulaic books and will not mind if a computer writes them instead of a human, especially since the computer is pulling from sources written by humans.

Other formulaic examples come from syndicates like the Stratemeyer Syndicate, which put out Nancy Drew and The Hardy Boys. The Stratemeyer Syndicate is an excellent example of why computers could become “fiction factories,” as the syndicate was known. Founder Edward Stratemeyer hired unknown writers, gave them anything from a few sentences to a three-page outline and a plot, and expected to receive a finished book two weeks later, complete with cliffhanger chapter endings and consistent-sounding dialogue.

Computers will soon be able to do even better; already they can write fiction that is almost comparable to that written by humans. With access to every available book and online resource, they can draw on almost all human documentation thus far, which gives them not only ideas and phrases that have received positive human feedback, but also millions of human experiences and the emotions these evoked. Alexander Prokopovich’s algorithm wrote its own version of Anna Karenina, entitled True Love, in 2008. It sounds close to a lot of human writing apart from the odd phrase or two: ‘Kitty couldn’t fall asleep for a long time. Her nerves were strained as two tight strings.’ The Georgia Institute of Technology has developed a program called Scheherazade that can write fiction that sounds convincingly human. For example:

John took another deep breath as he wondered if this was really a good idea, and entered the bank. John stepped into line behind the last person and waited his turn. When the person before John had finished, John slowly walked up to Sally. The teller said, “Hello, my name is Sally, how can I help you?” Sally got scared when John approached because he looked suspicious. John pulled out a handgun that was concealed in his jacket pocket. John wore a stern stare as he pointed the gun at Sally. Sally was very scared and screamed out of fear for her life. In a rough, coarse voice, John demanded the money. John threw the empty bag onto the counter. John watched as Sally loaded the bag and then grabbed it from her once she had filled it. Sally felt tears streaming down her face as she let out sorrowful sobs. John strode quickly from the bank and got into his car tossing the money bag on the seat beside him. John slammed the truck door and, with tyres screaming, he pulled out of the parking space and drove away.

Robot fiction reviewer Nicholas Lezard actually thought it was an excerpt from a new Dan Brown novel, but then realised Scheherazade could have been programmed using algorithms based on Brown.

Over the years people have come up with tests to see if a computer can pass as a human. One such test is the Turing Test. Invented by Alan Turing, it places a human judge at a terminal in one room and an unseen correspondent, either another human or a computer, at a terminal in a separate room. The judge corresponds via text with whoever or whatever is in the other room and then has to figure out whether the correspondent is a human or a computer. So far no computer program has definitively passed this test. People have come up with alternate Turing Tests in which people read different articles or stories and try to figure out whether a human or a computer wrote them. One such test can be found at http://www.nytimes.com/interactive/2015/03/08/opinion/sunday/algorithm-human-quiz.html.

As time moves on, humans as a whole will be unlikely to prefer one type of writing over the other. Some people will happily read formulaic, computer-generated novels, others will be intrigued and will voraciously read literary novels written by computers, while traditionalists will stick with their human-written works.

If the majority of humans ever do prefer computer-generated content, this will affect authors because they will be less in demand. If a computer can write the new Dan Brown while authors are working on literary novels, which already don’t sell as well as thrillers, those authors may lose out on work. That being said, there will always be traditionalists, so computers writing fiction may actually push human authors harder to compete, drawing forth some of the best literature we have ever read.

As of right now, although they are close, computers cannot write fiction equal to that authored by humans. “The hardest [for the computers] to crack will be the elements of great writing we ourselves struggle to explain: the poetic force of the sentences, the unique insights of the author, the sense of a connection.”

 

Sources:


http://www.scientificamerican.com/article/computers-vs-brains/

Studying the Romance Novel

http://www.trussel.com/books/strat.htm

www.bbc.com/culture/story/20150122-could-a-robot-write-a-novel

http://www.theguardian.com/books/2014/nov/11/can-computers-write-fiction-artificial-intelligence

http://psych.utoronto.ca/users/reingold/courses/ai/turing.html

 

The Limits of “Unlimited” Ebook Subscription Services

Introduction

In the fantasy-comic short story The Choosing of the Bride (Die Brautwahl, 1819), German writer E.T.A. Hoffmann tells the bizarre adventures of three suitors competing for the hand of the young and beautiful mistress Albertine Vosswinkel. The first suitor is the amusingly pedantic bureaucrat and obsessive bibliophile “Herr Chancellery Private Secretary” Tusmann; the second, the young and charming painter Edmund Lehsen; the third, the wealthy, greedy and revolting Baron Benjamin. Albertine’s destiny rests on a game of chance that follows the popular fairy-tale pattern of the casket–choice, echoing Shakespeare’s Merchant of Venice. Albertine’s suitors, indeed, must choose among three caskets; the one who picks the casket containing the demoiselle’s portrait will win her hand in marriage. Although the finale can be easily predicted – the winner being her beloved Edmund – the opening of the “wrong” caskets comes with some surprising revelations. Both the Baron and Tusmann will be rewarded with a gift more valuable, to them, than Albertine herself: whereas the first rejoices at the discovery of a magic file that prevents his precious ducats from deteriorating, the second is startled at finding “a little book bound in parchment” with nothing but blank pages inside. As the secretary will learn shortly after, the small “packet of paper” in the casket is not an ordinary book, but is “the richest, completest library anyone has ever possessed,” for every time he takes the volume out of his pocket, it will become whatever title he wishes to read. [1]

Tusmann’s magic book is the materialization of every bibliophile’s dream. Today, some internet retailers and up-and-coming digital content providers are trying to turn this dream into a reality by offering readers access to thousands of titles online with the alluring promise of the freedom of unlimited reading. But is this exciting prospect of a universal, personal library a pipe dream or an already existing reality that is bound to prosper? In the course of this essay, I will attempt to answer this question by investigating – and ultimately calling into question – the viability of broad-based subscription services in consumer publishing.

The popularity of Netflix-like subscription services: or a book is not a film (nor a song)

Book subscription models are hardly a novelty, dating back as far as the eighteenth century, when printers used to “solicit customers to ‘subscribe’ to a particularly expensive work or collection of works in advance of publication as a way to reduce risk by prefunding the project.” [2] Whereas in the late twentieth century, with the advent of digital publishing, the term was associated with the libraries’ and businesses’ subscription to digital journals, collections, and databases, today, the word has come to identify streaming media services that offer instant access to vast collections of content.

The huge success of streaming technologies such as Netflix and Spotify has earned them the reputation of digital content subscription providers par excellence, to the point that subscription-based e-book lending libraries such as Oyster, Scribd, and Kindle Unlimited have often been labeled the “Netflix of books” (or “Spotify of books”). Beyond the allure of the epithet, too often applied without much discrimination to the book market, the comparison calls for a fundamental distinction: the mode of consumption of digital video or audio products and that of eBooks are remarkably different. Compare the average Spotify user with the average Kindle Unlimited reader: while the former may listen to a large number of songs every day, each one for a short time, and often several times, the latter is likely to read only a few titles over a longer period, usually once, with a more focused attention span. [3] Furthermore, the attention of the music consumer need not be entirely absorbed in the act of listening, this type of content often being consumed simultaneously with other activities. The difference becomes less visible if we compare the experience of watching a movie on Netflix with that of reading a book on a dedicated device. Both situations, in fact, require the user’s total attention in a more relaxed, “lean back” situation (although this is not always true for books, and even less so for eBooks, which often involve the reader’s active participation and are consumed in “lean forward” mode).

Another important distinction to be made concerns the audience size. Scribd, one of the leading book membership services, has grown to 80 million monthly readers since its inception in 2013. The paying subscribers are in the order of “hundreds of thousands,” but the numbers are not nearly as big as Netflix’s and Spotify’s, with their respective 57.4 million and 15 million global subscribers. [4] [5] Obviously, given that Netflix and Spotify have been around for years, the statistics announced by the younger ebook subscription services are promising. Yet – and I’m now venturing into the realm of speculation – it is unlikely that the number of book gluttons willing to pay for “binge reading” will ever approach that of voracious (paying) users of “all-you-can-watch” and “all-you-can-listen” streaming services. An additional issue to be addressed is that of the “potential degradation of high-value markets” connected with the increase of book subscriptions. [6] As a recent BISG report acknowledges, low-price access to vast libraries of content may lead to a devaluation of ebooks (and books in general) in the mid term, and possibly translate “into an unwillingness to pay a higher price to own a book.” [7]

Freedom of unlimited reading: a promise to debunk

Unlimited books, audio books, and comics to be instantly accessed and consumed in “total freedom” (Scribd); unlimited listening and reading on any device, freedom to explore thousands of titles (Kindle Unlimited); unlimited reading – anytime, anywhere – of “as many books as [readers] want” (Oyster). “Freedom” and “unlimited access” are the common distinctive features and marketing mantra of the big current broad-based subscription services for books. Their offers even seem to exceed the dream of the universal lending library described in Hoffmann’s tale as Scribd, Oyster and Kindle Unlimited readers, through the magic of algorithms and user-generated recommendations, may discover titles about which they have never heard. But is this really true? What is the extent of this “unlimited” offer and of our “freedom” as subscribers-readers? As journalist Cameron Fuller observes, when it comes to books, the very notion of unlimited becomes questionable: “[u]nlimited books do nothing if you don’t read them, and reading has a limit. It has a limit in time and speed, something most people do not possess.” [8] For Kindle Unlimited subscribers, the limit is in the choice of titles, since a significant portion of the books available comes from Amazon’s own imprints (Thomas & Mercer, 47th North, Montlake Romance, and so on) and self-publishing platform, Kindle Direct Publishing. In addition, the majority of the books on the New York Times bestsellers list available through the Kindle Store, as well as the top selling titles published by the major publishers (Hachette, Macmillan, Simon & Schuster, HarperCollins, and Penguin Random House) are not included in the offer. Interestingly, even if to a lesser extent, a similar restriction on the collections can be found in connection with the only ebook subscription service, Oyster, that managed to bring three of the Big Publishers on board (HarperCollins, Simon & Schuster and, most recently, Macmillan). These publishers, in fact, are putting into the service only their backlist titles, leaving the new, “most attractive commercial titles” out of the offer. [9]

This also raises some questions about subscribers’ freedom of choice: most subscription services for books, indeed, are offered through third-party aggregators rather than directly by publishers, and the selection of titles is subordinate to commercial agreements where readers occupy a marginal place, if any at all. As Shatzkin has recently pointed out, the problem with ebook subscription services is that, over time, “the power of ‘brand’ passes from the individual titles (and authors) to the subscription service itself.” As a result “a subscriber-reader can become used to choosing from what the service offers and will either not know about, skip, or accept purchasing the occasional book s/he wants outside the service if it isn’t offered inside.” [10] In other words, the tantalizing promise of an unlimited offer actually translates into a limited range of prepackaged choices. The risk, in the long term, is the narrowing of the reader’s perspective. Yes, it is true, by “remov[ing] the purchase from the process after the initial acquisition of access,” subscription services relieve the reader of the burden of potentially making an erroneous buying choice, by presenting them with a set of prepackaged options, but… is this truly an exercise of freedom or, rather, a limitation on it?

Cost-savings: a real benefit or another myth to be exposed?

Along with convenience and ease of access, low price is a key factor of success in the new subscription economy. Ebook subscription services are no exception. Voracious readers – especially the price-sensitive ones – are attracted to online book membership services because there their insatiable appetite for books can find easy gratification at minimum expense. And yet, even for those readers, the idea that purchasing a subscription plan is cost-effective can be deceptive. Let’s take Kindle Unlimited as an example. The service costs $9.99 per month. A true deal, if it weren’t for the fact that most of the titles on Kindle Unlimited are priced very low ($0.99 or $2.99), so that in order to actually benefit from the service, the reader would have to read more than three $2.99 books or ten $0.99 books per month. [11] Casual readers on a Kindle Unlimited plan, it goes without saying, are highly unlikely to recoup the fee. Safari Books Online, the first subscription service for books, is a different story, not only because it is targeted at a niche audience – mainly IT and business professionals – but also because it uses a more viable financial model, which assigns “a percentage of the revenue as a pool to compensate publishers rather than guaranteeing a purchase for every read” as Oyster and Scribd do. [12] Furthermore, the monthly fee for the basic plan (PRO), which offers access to an extended library of ebooks, audio books, video courses and conference talks, is very affordable. It costs $39 per month, a price only slightly above that of a single downloadable book online. [13]
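
The break-even arithmetic behind the Kindle Unlimited example can be checked with a short sketch. It assumes only the figures quoted above ($9.99 per month, titles priced at $0.99 or $2.99); the function name `books_to_break_even` is illustrative and does not come from any of the cited sources. Prices are handled in cents so that the arithmetic stays exact.

```python
# Illustrative sketch: how many books at a given list price a subscriber
# must read in a month before the subscription costs no more than buying
# those books outright. Prices are in cents to keep the arithmetic exact.

MONTHLY_FEE_CENTS = 999  # the $9.99 Kindle Unlimited fee quoted in the essay


def books_to_break_even(price_cents, fee_cents=MONTHLY_FEE_CENTS):
    """Return the smallest number of books whose total price covers the fee."""
    if price_cents <= 0:
        raise ValueError("price must be positive")
    # Ceiling division using integers only.
    return -(-fee_cents // price_cents)


for price in (299, 99):
    count = books_to_break_even(price)
    print(f"${price / 100:.2f} books needed per month: {count}")
```

Under these assumptions the sketch gives four books at $2.99 and eleven at $0.99, since three $2.99 titles ($8.97) and ten $0.99 titles ($9.90) both fall just short of the monthly fee.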

Conclusion

The subscription economy, rather than an emerging trend, is an established fact in several media industries, and an already existing reality in the current publishing landscape. Consumer publishers’ ability to use this model successfully in their business, especially in the long term, will depend on their willingness to abandon the “one-size-fits-all” policy – or the chimerical dream of a “one-library-fits-all” solution – in favour of a mixed strategy that reaches (and meets the needs of) different kinds of customers through different market pathways. [14]

Works Cited

1. E.T.A. Hoffman, J. Hollingdale (trans.), Tales of Hoffman (London: Penguin, 1982). I owe the reference to Gino Roncaglia’s pioneering study La quarta rivoluzione (Roma: Laterza, 2010), 70-73.
2. “Digital Books and the new Subscription Economy: Executive Summary” (New York: Book Industry Study Group, 2014), 10. https://www.bisg.org/publications/digital-books-and-new-subscription-economy-0.
3. Daniele De Veris, “Daniele De Veris intervista Gino Roncaglia,” Insula Europea, January 1, 2015, accessed April 1, 2015. http://www.insulaeuropea.eu/leinterviste/interviste/deveris_roncaglia.htm.
4. Brad Stone, “Scribd’s E-Book Subscription Service, Now With Audiobooks,” Bloomberg, November 6, 2014, accessed March 31, 2015. http://www.bloomberg.com/bw/articles/2014-11-06/scribd-launches-audiobook-service
5. Frank Pallotta, “Netflix gains 4.3 million subscribers in 4th quarter,” CNN Money, January 20, 2015, accessed April 2, 2015. http://money.cnn.com/2015/01/20/media/netflix-earnings/
6. “Digital Books and the new Subscription Economy,” 7.
7. Ibid.
8. Cameron Fuller, “Why Oyster Isn’t ‘The Netflix Of Books’,” International Business Times, January 09, 2014, accessed April 1, 2015. http://www.ibtimes.com/why-oyster-isnt-netflix-books-1534086.
9. Michael Shatzkin, “Subscription Services for eBooks Progress to Becoming a Real Experiment,” The Shatzkin Files, May 27, 2014, accessed April 1, 2015. http://www.idealog.com/blog/subscription-services-ebooks-progress-becoming-real-experiment/.
10. Ibid.
11. Piotr Kowalczyk, “Kindle Unlimited Ebook Subscription: 8 Things Readers Need to Know,” Ebook Friendly, April 5, 2015, accessed April 5, 2015. http://ebookfriendly.com/kindle-unlimited-ebook-subscription/.
12. Shatzkin, 2014.
13. Andrew Savikas, “Welcome to the New Safari,” Safari Books Online, July 8, 2014, accessed April 1, 2015. https://blog.safaribooksonline.com/2014/07/08/new-safari/.
14. “Digital Books and the new Subscription Economy,” 6, 13. For more details on this view, please see the thorough research on subscription models recently conducted by the Book Industry Study Group.

 

BIBLIOGRAPHY

Digital Books and the new Subscription Economy: Executive Summary. New York: Book Industry Study Group, 2014. https://www.bisg.org/publications/digital-books-and-new-subscription-economy-0.

Fuller, Cameron. “Why Oyster Isn’t ‘The Netflix Of Books’,” International Business Times, January 09, 2014, accessed April 1, 2015. http://www.ibtimes.com/why-oyster-isnt-netflix-books-1534086.

Kowalczyk, Piotr. “Kindle Unlimited Ebook Subscription: 8 Things Readers Need to Know,” Ebook Friendly, April 5, 2015, accessed April 5, 2015. http://ebookfriendly.com/kindle-unlimited-ebook-subscription/.

Lunden, Ingrid. “Spotify Now Has 15M Paying Users, 60M Overall Active Subscribers.” Techcrunch, January 12, 2015, accessed April 2, 2015. http://techcrunch.com/2015/01/12/spotify-now-has-15m-paying-users-60m-overall/.

Pallotta, Frank. “Netflix gains 4.3 million subscribers in 4th quarter.” CNN Money, January 20, 2015, accessed April 2, 2015. http://money.cnn.com/2015/01/20/media/netflix-earnings/.

Roncaglia, Gino. La quarta rivoluzione: sei lezioni sul futuro del libro. Roma: Laterza, 2010.

____________ “A Tangled Tale: Biblioteche digitali, subscription services e promozione della lettura” (Seminar Presentation). Convegno Stelline, March 14, 2015, accessed April 1, 2015. https://prezi.com/bahcrznhhtyg/a-tangled-tale/?utm_source=facebook&utm_medium=ending_bar_share.

Savikas, Andrew. “Welcome to the New Safari.” Safari Books Online, July 8, 2014, accessed April 1, 2015. https://blog.safaribooksonline.com/2014/07/08/new-safari/.

Shatzkin, Michael. “Subscription Services for eBooks Progress to Becoming a Real Experiment.” The Shatzkin Files, May 27, 2014, accessed April 1, 2015. http://www.idealog.com/blog/subscription-services-ebooks-progress-becoming-real-experiment/.

Weber, Harrison. “Netflix beats Q3 expectations, but adds just 3M subscribers.” VB News, October 15, 2014, accessed April 2. http://venturebeat.com/2014/10/15/netflix-beats-expectations-with-3m-new-subscribers-in-q3-0-96-earnings-per-share/.

 

 

Publishers are from Venus, Data is from Mars: Can they play together nicely?

Research centers like BookNet Canada, and tech companies like Kobo, are collecting data on readers and books. They provide valuable knowledge to publishers and can answer questions publishers have long wanted answered. Where do readers hear about their books? Does a one-dollar price difference mean more or fewer buyers? Are ebook readers print readers as well? This research helps publishers make what they feel to be safer moves in the content they publish and how they publish it. However, publishers should be cautious about how they use this data. As a creative industry, publishers first need to learn how to sift through massive amounts of data rather than try to accommodate it all. Also, if publishers rely solely on data to make decisions, they can harm the collective minority of readers and mid-list authors. The individual reader and author are still valuable, and if publishers stop taking risks on new authors and new genres, we’ll be left with a nationwide book list with no diversity, uniqueness, or ingenuity.

Trade publishers have largely relied on risk and chance when acquiring and publishing their lists. They do their best to follow current genre trends, look to other media to see what characters people are connecting with, maintain proven plot progressions, and base their sales and marketing strategies on what has worked for them before (The Economist). Even with years of experience, every publishing company takes manuscripts, crafts them into books, lets them out into the world, and crosses its fingers that readers will respond well.

“Amazon executives considered publishing people ‘antediluvian losers with rotary phones and inventory systems designed in 1968 and warehouses full of crap.’ Publishers kept no data on customers, making their bets on books a matter of instinct rather than metrics.”
(Packer, New Yorker)

Today publishers have more than instincts at their disposal: they have data. But with great data comes great responsibility (Uncle Ben, Spiderman). In the Huffington Post article “How to Write and Publish the (almost) Perfect Book”, Penny Sansevieri outlines 11 important points that publishers must consider with every book. Roughly half of them rely solely on data: identifying the market, knowing the list of competing titles, understanding the target audience and how it can be reached, planning the sales strategy, identifying the best places to sell the book, and choosing an appropriate launch date (Sansevieri, Huffington Post). Data is so valuable to the business of publishing, and publishing companies have neglected to gather and use it for so long, that they are now playing a game of catch-up.

“As an industry we lag behind most major consumer industries, including the music, TV, and film industries, in using data to make informed decisions about our content and audience. We have been super-resistant to this idea that we should let audience insights drive content development, to the point that when asked, most folks in the editorial and content production areas of mainstream publishers are unable to give even the most basic metrics on who their actual customers are, and how much it costs the company to get and retain that customer.”
(Kristen McLean as qtd. in Publishing Perspectives)

Thankfully this is starting to change. With BookNet Canada, founded in 2002, publishers can track book sales over time, and through BookNet’s consumer research they can also understand needs in the market (Theriault, Publishing SFU). BookNet’s newest research includes analysis of audiobooks, the effects of the Canada Reads competition on book sales, and infographics on cookbook buyers (BNC Blog). For many publishers, BookNet was the first source of data available to them.

“It’s funny to think now how little info we had before BookNet. None of us really knew what was selling out there, except anecdotally. You had reps calling every week with a list of titles to ask how many we’d sold in the last week. That was the ‘market research,’ so not very sophisticated or accurate. A lot of companies just used the ‘Oh, well it’s selling well out there!’ model. Okay, but how do you define ‘well’? We had no idea what stock levels were, what returns levels were likely to be, or what reprints would be needed until it had reached a panicked state and reprints of books for Christmas were delivered in January, which was no use to anybody. It was very unprofessional, with a lot of ‘by guess and by golly.'”
(Peter Waldock as qtd. in “First, Do No Harm”: Five Years of Book-Industry Data Sharing With BookNet Canada SalesData)

Since BookNet and other data-driven strategies emerged, two major problems have appeared that, if not dealt with effectively, could leave Canadian publishing in a bigger mess than it was in before. The first is an inability to read and sort data effectively, caused by a lack of data skills among people working in publishing and by the time crunch to catch up to other industries. The foundations of publishing are “creative, not analytical”, and it would be unrealistic to expect publishers to make this change quickly: “We’re talking about an evolutionary change in our DNA. It won’t happen overnight, and we won’t go from being one type of organism to a completely different type of organism in one jump.” (McLean as qtd. in Publishing Perspectives)

Data will do nothing for publishers unless they know how to use it, and not many in publishing do. Some publishers, like HarperCollins, have made efforts to maximize their potential and have a digital division filled with employees who are comfortable working with data, but even they are having a difficult time. The Chief Digital Officer of HarperCollins, Chantal Restivo-Alessi, says:

“Sifting through the data is a lot. We’re really trying to find what buckets of data we should combine and aggregate. There’s a lot of talk, for example, about social listening. What parts of that do we bring closer to, say, the sales information and how do we bring them together in a way that is visual enough? There’s a lot of noise there because there are other marketing activities that might take place at the same time. We’re really early days to at least try to bring some of the data sets together—even deciding which ones, rather than trying to do it all. We could be here doing the monster of all projects.” (Restivo-Alessi as qtd. in Fast Company)

Even top publishers, who have the funds and resources, have not yet been able to work smoothly with data, which puts mid-range and small publishers at an even greater risk. As a creative industry, publishing employs creative people, and publishers “have rarely been known for attracting big data types” (Davenport, Harvard Business Review). Becoming data-driven changes the entire operating system of a publishing company. Everything from which books are chosen to how they are designed, and later sold and marketed, involves instinct and intuition. Asking publishers to become data-friendly companies is like asking them to forget everything they know and start over (Davenport, Harvard Business Review).

The second problem is what reliance on data does to the list itself. If all books were published solely on the basis of data, all the books and authors that sit outside the top categories would drop out. Using BookNet Canada’s report “The Canadian Book Consumer 2013”, publishers can see that Canadian book buyers are more likely to be female (57%) and more likely to buy print books (79%) than ebooks (17%), and that mystery is the most popular fiction genre (9%) (BookNet Canada). While this is useful for painting a picture of the Canadian market, choosing to publish only women’s mystery novels in paperback in order to maximize sales leaves out every reader and author outside of these categories. Worse still is what would happen if multiple publishers made similar decisions based on the same data. The books that do not appear to make money would be replaced by books that are shown to, and since multiple publishing companies have the same data from BookNet, an overwhelming number of books could pour into these “hot” categories while the rest fade out (The Economist).

Publishers are not just literary gatekeepers; they are also businesses that need to be profitable in order to survive. However, by abandoning risky acquisitions and mid-list authors they are altering an integral part of their foundation. There needs to be a careful balance between the two, as there are also self-publishers who are ready and waiting to take over the risky mid-list. Once they take it, publishers may not be able to take it back.

“Where publishers used to be able to justify nurturing a writer’s career over several books in the hope that he might gain an audience, disappointing sales may strangle a potentially promising career in its crib. In a world where we know what sells, it’s hard for businesses to justify producing a demonstrably unpopular product.” (Warner, Chicago Tribune)

Self-publishers, however, aren’t as concerned with profits as publishers are; according to The Economist, “they are doing it to leave a mark, if only a digital one” (The Economist). Self-published books cover a wide variety of genre-specific literature, as well as non-fiction. They offer a place for dropped mid-list authors to further develop themselves (Warner, Chicago Tribune). If publishers drop this market, the group of minority readers will go elsewhere to find their books, abandoning the publisher.

The remaining points in Penny Sansevieri’s article “How to Write and Publish the (almost) Perfect Book” do not come from data: choosing the right title, being open to all ideas, choosing the right look for the book, finding and understanding the key message of the book, and managing the relationship between feeling passionately about a book and being able to step back and see it objectively (Sansevieri, Huffington Post). These are strengths of the people who work in publishing, and they are just as important as data. Embracing data is vital to the publishing industry as a whole. It can help alleviate much of the risk, better inform future prospects and decisions, and help publishers understand their readers better than ever before. However, publishers need to understand the data before they can toss it around.

“Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc., to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” (Rotella, Forbes)

Publishers must sift through troves of data and understand the whole picture the data is showing them, as well as which data is and is not relevant to them. If they can learn the language of data and use it in conjunction with their instincts and intuition, they can avoid abandoning millions of minority readers and authors.

“I’ve worked almost my entire life—with the exception of a small blip in banking—in creative businesses. I have a lot respect for what creative people can really bring. It’s something that’s mysterious. If everyone could do it, everyone would be a billionaire. Identifying the hits and working with the talent is a really hard job. You have to respect the intuition as much as you respect the data.”
(Restivo-Alessi as qtd. in Fast Company)

 

Sources

“From Papyrus to Pixels: The Digital Transformation has only just begun” Essays. The Economist, 10 October 2014.

“The Canadian Book Consumer 2013” BNC Research Reports. BookNet Canada, December 2014.

“The BNC Blog” BNC Research Reports. BookNet Canada, 2015.

Anderson, Porter. “Publishing is Now a ‘Data Game’” Publishing Perspectives; Digital. Publishing Perspectives, 17 Sept 2013.

Davenport, Tom. “Book Publishing’s Big Data Future” HBR: Customers. Harvard Business Review, 3 Mar 2014.

Greenfield, Rebecca. “How HarperCollin’s Chief Digital Officer Uses Big Data to make Publishing more” Most Creative People. Fast Company, 23 Jan 2014.

Packer, George. “Cheap Words: Amazon is good for customers. But is it good for books?” A Reporter at Large. The New Yorker, 17 Feb 2014.

Rotella, Perry. “Is Data The New Oil?” Tech. Forbes, 2 April 2012.

Sansevieri, Penny. “How to Write and Publish the (almost) Perfect Book” Huff Post Books. Huffington Post, 28 May 2011.

Theriault, Chelsea. “‘First, Do No Harm’ Five years of Book-Industry Data Sharing with Booknet Canada Salesdata” Publishing SFU: Project Reports. Simon Fraser University, 2010

Warner, John. “The Biblioracle: Stuck in the mid-list with you” Features. Chicago Tribune, 24 Jan 2014.

So You Want to Be a Publisher.

 

Whoever came up with the saying “Jack of all trades, master of none” never saw hybrid publishing coming. In a world (and web) where indie publishers are as boundless as the sea in Coleridge’s The Rime of the Ancient Mariner[1], and where, as in the poem, most perish, small publishers starting out today cannot afford to lean on the publishing practices of the past, in which individuals focused on a single component of the industry; they must be competent in a variety of skill sets if they are to survive.

First, to define the term “indie” or “small” publisher. The scope from which these terms are currently used can be vast. “Independent publishing — that is, publishing whatever an individual or small group think is worthy of dumping their time and money into — is nothing new…”[2] New? No, but today definitely more abundantly diverse than in the past; from rudimentary, single-novel, self-publishing authors to multi-million dollar enterprises that fall short of “the big five” traditional publishing houses. While those at either end of the scale may find this discussion useful, this paper aims particularly at publishers starting out or looking to adapt to today’s changing technologies and who plan to offer enough of a catalogue to make these suggestions beneficial to their process.

Next, to define hybrid publishing and its processes, as this will be the direction of this paper. In its workflows and outputs, this form of publishing differs both from traditional publishing and from the fixed-layout e-publishing we have mainly seen to date: it is one “in which both print and electronic editions of the same basic content can be published in a parallel or complementary fashion.”[3] It is more than providing PDFs of the print version for electronic distribution; it is about producing files that will reflow on any platform, for any device, with the majority of the process moving through a single workflow.

Returning to skill sets, understanding and implementing new tools as they emerge will be a vital component of remaining relevant to the publishing conversation. Small publishers need to understand that “you have access to tools, distribution and best practices knowledge to publish ebooks faster, smarter and less expensively than large publishers.”[4] From analytics to design, from workflow tools to distribution, these tools can be found for every aspect of their business.

To counter, publisher Spencer Madsenon argues, “Just because you know how layers work doesn’t mean you’re a graphic designer. Just because you’ve got a DSLR doesn’t mean you’re a photographer. Wherever you have the opportunity to work with a professional, take it. Helvetica isn’t the only font out there and there are people who make fonts for a living. Make sure the cover art you go with looks good online, since that’s where you’re selling it.”[5] That’s all fine and dandy, and of course as a publisher you are trying to produce the type of work you are happy with and that you feel will sell, and it’s great if financial resources allow this type of outsourcing, but “small publishers are under a great deal of pressure to keep project costs low, often due to smaller budgets.”[6]

Now there are aspects of Madsenon’s argument that hold true even for smaller-budget presses. This is where self-evaluation comes in. Let’s use Madsenon’s photography example. Does an individual’s high-school photography class, and their DSLR, give them enough of a base knowledge with which to work and provide quality photography for their needs? Or are they better served spending two hours online searching for and buying a photo from a stock image provider like Shutterstock at five photos for $50? Or how about analytics? Does one learn the Google ropes through tens of hours of tutorials or pay a service such as Chartbeats at $9.95/month to oversee their site(s)? The point here is that nobody knows everything and that sometimes it makes more economic sense to outsource, but the less a small publisher has to do this, the more operating capital stays in the company coffers.

Production workflow is one area that can get costly, and thus should be a skill area indie publishers learn moving forward. Hybrid publishing complicates this even further because “going electronic or hybrid, requires changes in the way you work during the publishing process, from delivered manuscripts to final publication. The software tools such as Microsoft Word, and Adobe InDesign, were created for the world of analog print and desktop publishing. Although it is possible to create electronic publications from these formats, in most cases this will be a painful, slow, inefficient and expensive process. Small publishers need to take a DIY approach using technical alternatives.”[7] These technical alternatives include open-source document conversion tools like Pandoc and Calibre; Markdown, the basic markup language (soon to be CommonMark); and EPUB, the publication format that specifies and documents the things reflowable publications need to include: content documents, style sheets, images, media, scripts, fonts, and more.

Now while it doesn’t work for every type of end product (for example, interactive children’s books and the like), “the easiest and least expensive method of hybrid publishing is to convert the source text to Markdown, manually edit the Markdown into a well-structured document, use Pandoc to convert Markdown to EPUB, and import EPUB into Calibre for final adjustments and conversion into other ebook formats (including Amazon Kindle)”[8]. Becoming familiar with a small number of tools can do much of the heavy lifting of hybrid publishing. Does it sound easy to us non-coders? No. Does it sound fun? Probably not for most. But to push through where others remain stagnant, to look at discoverability with the eye of an opportunist rather than as a challenge, to not meet the same fate as many a traditional publisher, gaining the skill sets of hybrid publishing is what’s needed to stay relevant.
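
For readers curious what that chain looks like in practice, the conversion steps quoted above can be sketched as two command-line calls. The sketch below only assembles and prints the commands (a dry run) rather than executing them. It assumes Pandoc and Calibre’s `ebook-convert` tool are installed; the file name `manuscript.md`, the title, the `.azw3` target and the helper `build_commands` are hypothetical placeholders, not part of the toolkit being quoted.

```python
import shlex
from pathlib import Path


def build_commands(markdown_file, title, kindle_ext=".azw3"):
    """Return the Pandoc and Calibre command lines for one manuscript.

    markdown_file, title and kindle_ext are illustrative inputs; adjust
    them to the project at hand.
    """
    source = Path(markdown_file)
    epub = source.with_suffix(".epub")
    kindle = source.with_suffix(kindle_ext)
    return [
        # Step 1: Pandoc turns the hand-edited Markdown into an EPUB.
        ["pandoc", str(source), "-o", str(epub),
         "--metadata", f"title={title}", "--toc"],
        # Step 2: Calibre's ebook-convert derives a Kindle-compatible file.
        ["ebook-convert", str(epub), str(kindle)],
    ]


# Dry run: print the commands instead of executing them.
for command in build_commands("manuscript.md", "Working Title"):
    print(shlex.join(command))
```

To run the steps for real, each list could be passed to `subprocess.run(command, check=True)`. The manual editing of the Markdown and the “final adjustments” in Calibre that the quotation mentions remain hands-on work that no script replaces.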

“Hybrid publishing will sooner or later confront them [traditional publishers and PDF-only e-publishers] with the need to re-think traditional publication formats, editorial and production workflows, and distribution. The changes required may well be greater and more extensive than initially expected.”[9] But it is not only these skills that should be kept in-house whenever possible; small publishers must expand beyond production workflow skills to marketing, basic website design and maintenance, and all other aspects of the business they can get their hands dirty in. When to outsource and when not to is a question nearly as old as business itself. In an industry with historically low profit margins, much of the choice is often reduced to necessity. But the ability to learn things we want to know has never been closer at hand.

Small publishers need to be the ones pushing, the ones making waves. “What small presses should consider is creating a workflow that is both structured and flexible enough to cater to the variety of demands of publishing within a hybrid publishing strategy. A strategy based on publishing across different media while keeping the main part of the work in-house rather than outsourcing it.”[10]

Familiarizing oneself with multiple tools never before used may be a difficult road, as may attending to elements like metadata, which “is significantly more important in the context of hybrid publishing than it is in traditional publishing. Carefully applied metadata will ensure that the publication can be found online in databases and bookstores such as Amazon.”[11] But that’s what it’s going to take if you want to keep ahead of the competition.

Unfortunately there is no ‘magic wand’ that will turn a print book design into an electronic publication at the touch of a button. [12]

Yeah. No one said it would be easy.

 

 

 

 Citations

 [1]  Coleridge, Samuel Taylor. The Rime of the Ancient Mariner, (New York: Dover, 1970)

[2]  Diamond, Jason. 25 Independent Presses That Prove This Is the Golden Age of Indie Publishing, Oct 1, 2013, web. http://flavorwire.com/417838/25-independent-presses-that-prove-this-is-the-golden-age-of-indie-publishing

[3]  Institute of Network Cultures. web.  http://networkcultures.org/blog/publication/from-print-to-ebooks-a-hybrid-publishing-toolkit-for-the-arts/

[4]  Coker, Mark (Founder, Smashwords), Ebook Publishing Gets More Difficult From Here: How Indie Authors Can Survive and Thrive, 11/21/2014, Huffington Post: Books, web.  http://www.huffingtonpost.com/mark-coker/ebook-publishing-gets-more-difficult-from-here_b_6200508.html

[5]  Madsenon, Spencer. I Made the Mistake of Starting a Small Press and So Can You, Mar 12, 2014, The Toast, web.  http://the-toast.net/2014/03/12/guide-to-starting-a-small-press/#FlcMf2hKx1LXYI7B.99

[6]  Institute of Network Cultures. web.  http://networkcultures.org/blog/publication/from-print-to-ebooks-a-hybrid-publishing-toolkit-for-the-arts/ 8

[7]  Ibid., 9

[8]  Ibid., 85

[9]  Ibid., 7

[10] Ibid., 89

[11] Ibid., 103

[12] Ibid., 7

Facebook and News Publishers: The Writing is on the Wall

Publishing has been swallowed up in the vortex of big data and the pimply-faced prodigies from Silicon Valley are to blame. The sentiment is overfamiliar, almost prosaic—its utterances are recurring conceits of most every critique and discourse on the Information Age. Yet, it’s not without reason: the fraught relationship between content creators and enablers is now poised to enter its second decade and neither side is willing to sign an armistice. Yet.

In March this year, The New York Times ran a scoop on itself: the report quoted anonymous sources confirming that the Grand Old Lady of journalism had been holding backchannel talks with Facebook to work out a business deal that would effectively allow the world’s largest social network to host the media giant’s content on its website. Two other media concerns have joined NYT at the negotiating table: the highly adaptive National Geographic and Buzzfeed, the dark overlord of click bait. Details of the business plan haven’t been disclosed, but there are strong indications that Facebook has offered to share revenue from ads which would be displayed alongside the publisher’s content. The deal, if inked, will not just prove to be a turning point in the new internet economy but could also bring about a paradigm shift in the ‘adapt or die’ environment of digital disruption that has, over the years, thrown the publishing industry into an existential crisis. More on that later.

First things first: why is Facebook, a colossal player in the information economy, making overtures to NYT? Media gazers believe it’s because the world’s largest social network is feeling the heat from a steady loss in user base as newer social media platforms take hold. By hosting content within its contours, Facebook can limit the number of outbound links and contain users within itself. Additionally, there is the time factor: according to Facebook, the average time it takes a news article to load on a third-party website is eight seconds; the lag may seem innocuous, but to Facebook it is time lost in a breakneck digital environment where even milliseconds matter and impatient users can just as easily navigate to quicker alternatives. Sounds plausible enough, but a closer look at Facebook’s recent activities shows the move is part of a bigger plan.

It all began a few months ago when Facebook tweaked its algorithm to demote videos from third-party websites like YouTube and Dailymotion and lend more visibility to videos embedded on its own platform; the proposal to host written content from publishers is thus in step with Facebook’s larger plan of making all content native to its website.

Is the impulse to nativize all content an attempt to supersede the other, substantially larger news aggregator: Google, which, for all intents and purposes, exists as a kind of gateway to the web at large? Indeed, by cozying up to publishers, offering them access to its billion-plus users and the ad revenue perks that come with it, is Facebook trying to undermine Google News? Already, Facebook has left Twitter and Reddit behind to become the second largest provider of referral traffic to news websites. Publishers, especially new media ones such as BuzzFeed, have shown an 80% leap in referral traffic on the back of Facebook. In the case of BuzzFeed, Facebook has overtaken Google as the leading provider of referral traffic: in fact, BuzzFeed, which has seen spectacular success over the last few years, would have stood scant chance had it been confined to Google News’ highly intransigent search algorithms, which are given to promoting newsworthy content from established publishers over lighter content from start-ups. Over the years, Facebook’s quiet entry into the content space has reaped significant dividends for publishers, especially newer upstarts like BuzzFeed, Vox and Gawker, which rely on the high shareability and clickability of their content to spread like a virus across the social network. Indeed, both Vox and BuzzFeed have made much of their charmed association with Facebook, releasing fawning case studies of their lucrative partnerships.

One of the principal differences between Google News and Facebook’s newsfeed is that while the latter lets users decide what’s interesting, Google’s algorithm is a nod to the time-honored tradition of news publishing: whoever breaks the scoop first gets visibility over and above the others. While Facebook’s algorithm has been a boon to newer publishers who, save a few exceptions, do not scruple to uphold substance over shareability, it has similarly given free rein to content farms. Only recently did it put strictures in place to prioritize high-quality content over memes and gifs. More crucially, unlike Google, Facebook is not a search engine; it merely makes an article visible, assuming the article satisfies its prerequisites, if your friends and the networks around them have liked and shared it.

It is no great revelation that news dissemination has become highly socialized: more and more internet users receive their news from social networks instead of from search results. ‘If the news is that important, it’ll find me’ seems to be the general refrain among users, particularly younger ones. In the current state of the news business, it isn’t just publishers who are pitted against each other as they vie for ever more clicks. There is a bigger contest in progress: new battle lines are being drawn between search and social, and social has staked its claim to Google’s numero uno position. By hosting content within its own platform, Facebook is coming after Google and seems determined to fight tooth and nail to escape its long shadow over online content.

Meanwhile, publishers are mere spectators, their mouths agape, as the big bears of Silicon Valley tussle it out. But now, Facebook’s offer to host their content has created a frenzy among insiders in media circles, whose opinions are firmly divided. While NYT considers its options, The Guardian, which has, over the years, reinvented itself as a suave digital news publication with an international focus, is reportedly trying to band together with other publications to enter into an alliance with Facebook, but only on its own terms; the same NYT report indicates that the British media company is trying to drum up support in order to cajole Facebook into letting news publishers retain control over advertising. This is a politic move on The Guardian’s part, considering that advertising control allows publications to collect and harvest data on users, which is then used to bargain for more revenue from advertisers. It’s difficult to say if Facebook will bite, because to many publishers its offer, which trades advertising control for gargantuan scale in the form of privileged access to a billion-plus users, is already irresistible. Truly, the gains from leveraging such scale cannot be overestimated; moreover, even if Facebook withholds advertising control from publishers, it still helps them create optimally targeted marketing campaigns: for example, it unveiled new tools in December to give publishers access to specific demographics, premised on age, location and other parameters, from its masterful collection of detailed data on its legions of users.

Scale is paramount, but is it enough? NYT, which, until recently, was in dire financial straits before being rescued by Mexican billionaire Carlos Slim, has fought a long and hard battle to weather digital disruption. Both NYT and The Guardian have concentrated considerable resources to stabilize their online presence and have managed to import their print legacies into the digital space so much so that advertisers have finally begun to take notice and pump considerable money into their coffers. NYT went a step further and, against market wisdom, introduced a paywall to milk its brand value. How does it expect to transplant its content into Facebook without risking its robust subscription model? Is Facebook’s offer a Faustian bargain that cannot be accepted without jeopardizing all that NYT has accomplished in the last few years?

But unlike NYT, BuzzFeed, inarguably one of the largest content spewers on the World Wide Web, does not have to contend with such problems. Even as Facebook has been on a silent crusade against click-baiters over the last few months, it continues to be BuzzFeed’s largest traffic driver. But (and this is where the long-established rules of online publishing get tossed out of a moving car) BuzzFeed couldn’t care less if users stopped visiting its website. Yes, we have arrived at the pearly gates of a new epoch; months before Facebook advanced its new business proposal, BuzzFeed CEO Jonah Peretti had announced his company’s grand plan of channeling content across different platforms. It’s important to keep in mind that BuzzFeed gamed the new internet economy while it was still a bit player; unlike other publishers, it does not rely on display ads or advertisements native to its platform. Rather, it has built its own species of advertorials: editorial products crafted to market its client brands. And it has already embarked on a campaign to entrench these products in as many platforms as it can. Peretti has said that so long as his content reaches hundreds of millions of readers, he does not care where it lives. Indeed, in January this year, BuzzFeed received 420 million views via referrals from Twitter, Pinterest and Facebook, but it generated a whopping 18 billion impressions on those platforms. When it comes to BuzzFeed’s unique operating model, advertisers don’t care where the content is being consumed, as long as it’s being consumed. In fact, as Peretti points out, organic social shares of their content qualify as earned media, as opposed to viral hits, which can be easily obtained by buying cheap traffic.

For BuzzFeed, Facebook’s offer is a no-brainer. But when it comes to legacy publishers such as NYT, The Guardian and the other old-timers, the rules of operation are markedly different. With their rich publishing histories, their cultural capital is inextricably tied up with their brand appeal. The crucial reason why BuzzFeed and its like work best with Facebook is that they are digital natives: they are in the business of floating individual stories to rack up hundreds of millions of views. But the old guard are institutions in and of themselves; the emphasis is on the whole rather than on one isolated story. Granted, the imperatives of social media have put an end to newspapers’ age-old practice of running slot fillers, token stories published merely to fill space; but legacy publishers still exist as whole entities even as their reliance on social media increases. To neglect their platforms in the face of an inexorably changing digital landscape is tantamount to undercutting the very foundations of their traditional publishing model.

Notwithstanding all the ominous talk, it is still possible for them to come out unscathed. In NYT’s case, the outcome pivots on its online subscriptions. One of the reasons why online subscribers are important is that they fetch more advertising money. How would its online subscribers react upon finding that the story they have paid good money to read is being freely shared on Facebook? Will they unsubscribe? Will we ultimately witness the fall of the paywall? It’s equally possible that there won’t be a backlash at all. Paid subscribers, especially in an environment where everything is free, exhibit an unyielding brand loyalty and reverence for august publications. The Financial Times, which has performed numerous experiments with its paywall, can attest to that.

There are no authoritative answers to these questions, largely because no one is privy to the back-room deals that are in progress at the moment. But one thing is manifestly clear: online publishing is in for a seismic change. Whether the old guard chooses to jump aboard or stay marooned on its own platforms, there’s no denying that Facebook is on a single-minded mission to overshadow Google. Even if it fails to win bigger publishers over, it will go after smaller fish. The game is about to change, and those who change first will have a head start. Once again, the writing is on the wall: sooner or later you’ll have to join us or you will perish.

Bibliography

Facebook May Host News Sites’ Content, New York Times
Facebook turns 10 but are its days numbered?, BBC News
Why you may never need to leave Facebook ever again, Digital Trends
What the Rise of Native Video on Facebook & Twitter Means for Brands, Adweek
Facebook Reminds Media Companies: We Still Really, Really Like You, All Things D
Facebook Is Totally Dominating Google In Referrals … For Buzzfeed, Social News Daily
Finding Political News Online, the Young Pass It On, The New York Times
BuzzFeed’s New Strategy: Fishing for Eyeballs in Other People’s Streams, Recode
Is the Financial Times the perfect digital model?, The Guardian
How Vox.com approaches publishing on Facebook, Vox
Why ad buyers are upbeat on The New York Times’ digital transformation, Digiday

Inclusivity of Women in Digital Publishing

It is no secret that the publishing industry is behind in technology. The reason most often cited is a lack of funds to take on uncertain ventures. However, sexism within tech industries, as well as in technology itself, is a considerable factor in alienating publishing professionals, most of whom are women. Tech industries need improvements, but the publishing industry must also take steps to protect itself from the influence of sexism in tech while incorporating new technologies into its publishing initiatives. Steps include providing welcoming environments in and outside the workplace, mentoring other women, making our voices heard, forgoing the traditional, acknowledging successful women in digital publishing, introducing young girls to technology, and boycotting sexist publications.

Women in Tech

Technologies need to be designed for inclusion of women. There is a clear lack of women in tech industries, which in turn affects the technologies that are developed.1 According to Audrey Watters in “Men Explain Technology to Me,” because tech industries are dominated by men, technologies are generally designed for men.2 For example, danah boyd, a principal researcher at Microsoft Research and research assistant professor in media, culture, and communication at New York University, suggests that the Oculus Rift is sexist in its design because of the different ways it affects women and men.3 Furthermore, Watters, whose main concern is education technology, wonders, “How do [male] privileges, ideologies, expectations, values get hard-coded into ed-tech?”4 The same question can be applied to publishing technologies.

Women and Publishing Technologies

The publishing industry, on the other hand, is made up of mostly women.5 It is clear that if the industry needs incentive to adopt new technologies, publishing professionals need to feel that those technologies are inclusive of their demographic.

The following are steps that should be taken by women and men in the publishing industry to ensure that digital publishing is inclusive of women.

  1. Provide welcoming learning environments. Independent feminist-activist spaces such as Geek Feminism and Double Union work within communities to bring an interest in tech to women outside of corporate environments.6 Another example is Ladies Learning Code in Toronto, which supports and encourages women to learn programming.
  2. Create new, non-traditional platforms. We can change the industry by forgoing the traditional, male-dominated hurdles altogether and creating our own publishing initiatives. For example, Model View Culture is “an independent media platform covering technology, culture, and diversity” that was started by women who saw a need for a publishing platform that speaks to women and minorities in tech who feel excluded by the industry.7
  3. Be loud. Part of the problem is that women are not as loud as their male counterparts.8 According to Hope Leman, who works with databases in the healthcare industry, women in digital publishing can have their voices heard by speaking at conferences and networking with other women in publishing, either in person or via social media.9 Meeting and speaking with as many people as possible will “normalize the idea” of women working in tech-related fields.10
  4. Mentor other women. Women working in digital publishing can make an effort to mentor women in their workplaces.11 We can also volunteer to mentor women in our lives who are interested in tech.12
  5. Don’t purchase sexist publications. As consumers, we can choose not to purchase digital publications that promote sexism. One such publication is Bustle, a news site for women that is mostly run by men, that puts “world news and politics alongside beauty tips.”13 The suggestion that women can only handle serious topics when accompanied by beauty tips is demeaning.
  6. Introduce children to technology early. Breaking into tech fields is a much bigger challenge for girls than for boys, so they benefit from an extra push. Children can be introduced to technology in preschool, in an attempt to eliminate the stereotype that boys are better than girls in natural aptitude. Elise Boulard, who works as an ebook distribution sales manager, says the gender imbalance in tech fields starts in childhood, when children are told that “girls are good at arts and boys at math.”14 One way to fight this problem, suggests Leman, is by supporting initiatives such as Josie Robin, Science Fiend, “a STEM inspired ebook for kids.”15 Meghan MacDonald, digital project manager at Penguin Random House Canada, agrees: she says a huge hurdle for girls is getting past the discouragement of people around them and feeling that they don’t belong in technology-related fields because of their gender.16 Furthermore, in an interview of successful women in digital publishing, everyone agreed that programming should be taught in schools to help break the illusion that girls and women cannot code.17
  7. Acknowledge women already doing the work. In the publishing industry, we should acknowledge the work of women who bring tech to publishing, such as Brenda Walker, CEO and founder of EndTap.18 Liza Daly, vice president of engineering at Safari Books Online, says that we should emphasize the work of women working in digital publishing to show role models and success stories to girls and women interested in technology.19
  8. Work with HR. Because tech roles tend to be dominated by men, even in the publishing industry,20 companies can make sure their HR department takes diversity seriously. According to Shanley Kane, founder of Model View Culture, HR departments are often ignorant of the issues faced by minorities in tech and need to be educated in creating environments and recruitment practices that are welcoming to all.21
  9. Donate. Donating to organizations like Girl Develop It helps women learn to work with technologies at an affordable price.22 23

These are just a few examples of the ways women can support each other, and be supported, in digital publishing. Since there is a need for systemic changes within tech and digital publishing, the most important thing to take away is that we need to rethink the ways technologies are built and introduced within our communities. When women working in publishing feel accepted and encouraged to explore the world of tech, the publishing industry will be much more likely to adopt new technologies.

 


Endnotes

1  Audrey Watters, “Men Explain Technology to Me: On Gender, Ed-Tech, and the Refusal to Be Silent,” Hack Education, November 18, 2014, http://hackeducation.com/2014/11/18/gender-and-ed-tech/.

2  Ibid.

3  Danah Boyd, “Is the Oculus Rift Sexist? (plus Response to Criticism),” Apophenia, April 3, 2014, http://www.zephoria.org/thoughts/archives/2014/04/03/is-the-oculus-rift-sexist.html.

4  Watters, “Men Explain Technology to Me: On Gender, Ed-Tech, and the Refusal to Be Silent.”

5  Rachel Deahl, “Where the Boys Are Not,” Publishers Weekly, September 20, 2010, http://www.publishersweekly.com/pw/by-topic/industry-news/publisher-news/article/44510-where-the-boys-are-not.html.

6  Mary, “Model View Culture: Where Tech Intersects with Social and Cultural Lenses,” Geek Feminism Blog, March 20, 2014, http://geekfeminism.org/2014/03/20/model-view-culture-where-tech-intersects-with-social-and-cultural-lenses/.

7  Elizabeth Spiers, “‘Speaking up Every. F*cking. Time’: How One Feminist Publisher Is Taking on the Worst of Silicon Valley (and Some of Her Allies, Too),” Medium: Matter, July 9, 2014, https://medium.com/matter/speaking-up-every-fucking-time-a61a24aa7629.

8  Laura Brady and Christen Thomas, “There’s No Ceiling If You Start at the Top! Women in Digital Publishing and Tech,” Digital Book World, September 30, 2013, http://www.digitalbookworld.com/2013/theres-no-ceiling-if-you-start-at-the-top-women-in-digital-publishing-and-tech/.

9  Ibid.

10  Thursday Bram, “Attracting Girls to STEM: Four Ways Women in Tech Can Help,” Que, May 13, 2014, http://www.quepublishing.com/articles/article.aspx?p=2211696.

11  Ibid.

12  Ibid.

13  Spiers, “‘Speaking up Every. F*cking. Time’: How One Feminist Publisher Is Taking on the Worst of Silicon Valley (and Some of Her Allies, Too).”

14  Brady and Thomas, “There’s No Ceiling If You Start at the Top! Women in Digital Publishing and Tech.”

15  Ibid.

16  Ibid.

17  Ibid.

18  Ibid.

19  Ibid.

20  Spiers, “‘Speaking up Every. F*cking. Time’: How One Feminist Publisher Is Taking on the Worst of Silicon Valley (and Some of Her Allies, Too).”

21  Shanley Kane, “HR Antipatterns at Startups,” Model View Culture, June 9, 2014, https://modelviewculture.com/pieces/hr-antipatterns-at-startups.

22  Bram, “Attracting Girls to STEM: Four Ways Women in Tech Can Help.”

23  “Who We Are,” Girl Develop It, accessed April 4, 2015, https://www.girldevelopit.com/.

References

Chen, Elia. “Twelve Hours. Sixty Students. Eight Challenges.” Medium, January 30, 2015. https://medium.com/@ecat108/mv-hacks-promotes-gender-equality-in-technology-industry-9970c85cd4b.

“Cyberfeminism.” Wikipedia: The Free Encyclopedia. Accessed March 31, 2015. http://en.wikipedia.org/wiki/Cyberfeminism.

Ellis, Joanna. “Celebrating Female Digital Publishing Pioneers.” Digital Women UK, 2014. http://www.digitalwomenuk.co.uk/celebrating-female-digital-publishing-pioneers/.

“Feminist Technoscience.” Wikipedia: The Free Encyclopedia. Accessed March 31, 2015. http://en.wikipedia.org/wiki/Feminist_technoscience.

Kember, Sarah. “Notes Towards a Feminist Futurist Manifesto.” Ada: A Journal of Gender, New Media, and Technology, 2012. http://adanewmedia.org/2012/11/issue1-kember/.

“Publishing Needs to Re-Focus on Gender.” The Bookseller, April 17, 2014. http://www.thebookseller.com/news/publishing-needs-re-focus-gender.

Watters, Audrey. “Top Ed-Tech Trends of 2014: Social Justice.” Hack Education, December 18, 2014. http://2014trends.hackeducation.com/justice.html.

FEMBOT Collective. Accessed March 31, 2015. http://fembotcollective.org/.

Self-publishers setting the stage, but traditional publishers have a part to play

Self-publishing is setting the stage[1] for the future of publishing with the prevalence of “do-it-yourself” tools and applications, almost diminishing the value of the traditional publisher as gatekeeper.

The digital context has given ordinary readers tools[2] to become self-published authors and publishers: online platforms and user-friendly technology with which to start up their own publishing, marketing and data analysis businesses. One such author is Scott Nicholson, who has published more than 70 books and sells them online through Amazon for the Kindle and other e-readers. “He handles the entire process himself,”[3] and the lucrative 70% royalties on e-book sales attract authors more than the traditional publisher’s offer of a mere 25%.[4] As Amy-Mae Elliott writes, “with the advent of e-books, social reading sites and simple digital self-publishing software and platforms, all that has changed. An increasing proportion of authors now actively choose to self-publish their work, giving them better control over their books’ rights, marketing, distribution and pricing” (Mashable, February 2014).

Moreover, editors and designers, as well as graduate publishing students, are also forming start-up businesses geared towards content strategies for publishers and authors. For traditional publishers, the online context has emphasized the role of the publisher as incubator and consultant.

According to Bowker’s statistics, “more than three million new titles were published in 2010. Of these, over 2.7 million were non-traditionally published books, including print-on-demand and self-published titles.”[5]

Traditional publishers, who already face competition from retail giants such as Amazon, now have to consider their competitive edge against a surprising opponent: the consumer or, in this respect, the reader. We can see this in social computing, which Alan Liu describes as an evolutionary form of reading where the reader assumes the role of annotator and thereby contributes to the work of the original author. In this sense, however, authorship matters less than the overall collaboration on the project. Readers, who range from academics to literary students and ordinary non-scholars, are able to develop a shared network with others and create a community from which they can grow an audience base. Self-publishing tools such as CreateSpace, online coding academies that teach website creation, and artificial-intelligence website builders such as The Grid[6] give readers who become self-published authors the ability to create a brand around themselves and successfully publish online and printed books without the help of the traditional publisher who once administered this task.

This paper argues that technology has revolutionized the way we approach publishing, its function, and who has the right to publish. Matthew Ingram says that, in the online context of the web, how we view publishing has been narrowed down to the push of a button.[7]

The future of the book: “Almost as constant as the appeal of the book has been the worry that appeal is about to come to an end. The rise of digital technology—and especially Amazon, underlined those fears” (The Economist, From Papyrus to Pixels)[8].

Traditional publishers find themselves competing for the same market with ordinary people who have little to no experience in the publishing field but who are able to attract and maintain an audience with free, user-friendly tools and platforms on the web. Additionally, the serialization of content has been popularized by people consuming media in short form, on a mobile device or tablet, and often on the go. The reader who consumes in this fashion is well placed to come up with the right solution where publishers are missing the mark.

This ties into Brian O’Leary’s view of “Context, not Container” in the book A Futurist’s Manifesto, especially as the publishing industry takes on the forms popular on the Web 2.0. In the same sense, contexts such as social computing have blurred the lines between author and reader, with both having the capacity to adopt the role of publisher through networked channels.

What this means for traditional publishers is a change not only in their business models but also in their approach to the nature of the digitized age. They have to align themselves with networked trends and find innovative ways to approach online distribution, marketing, and content creation. Additionally, instead of focusing on the plight of traditional publishing in the age of technology, this paper draws attention to the opportunities self-publishers exploit and how traditional publishers can co-exist alongside them within this context.

“The book is now a place as well as a thing and you can find its location mapped in cyberspace,” writes researcher Paula Berinstein,[9] who discusses the notion of the networked book, where authors, publishers, and readers gather to think about, discuss, annotate, and refer to the book. One can say that this was sparked by online journaling platforms such as blogs, and now by the Web 2.0, which makes the book searchable, linkable, divisible, and mutable (Berinstein, 2015). A case study such as Gamer Theory (spelled GAM3R 7H3ORY) by McKenzie Wark, which started off as a draft online and invited reader interaction through annotation, comments and feedback, shows how a networked book can be transformed into a better book for online and print. The book contained index cards with comments from readers, including prestigious academics. It was also acquired by Harvard University Press for publication in 2007, and an online edition is available. This changing approach to how the book is created, curated, promoted, and distributed points towards cooperation between self-publishers and traditional publishers in a digitized context.

Other opportunities show that traditional publishers will need to unbundle their content and services in order to remain relevant. “They will have to reimagine their role. [They] could start offering ‘light’ versions of their services, such as print-only distribution, or editing, and not taking a cut of the whole pie.”[10] Moreover, publishers will need to work harder at proving to authors that they are capable of reaching a far larger audience. This challenge stems from the accessibility of the same technology, which makes self-publishing easier and opens up new and alternative ways of discovering, marketing, sharing, distributing, and imitating the books of other self-published and traditionally published authors; think fan fiction.

Furthermore, traditional publishers need not be at loggerheads with self-publishers, but should rather look for opportunities to collaborate, asserting their importance in publishing quality content with the assistance of editors and customized content strategies. A recent case study shows how a self-published dystopian science fiction short story, “Wool,” released on the web, led to a film adaptation and a contract with a traditional publisher, Simon & Schuster, to buy the license rights to print the book. “Most writers still sign with publishers when they have the chance, because print books remain such a sizeable chunk of the market.”[11] That said, the self-published author retains the rights to the e-book.

Besides this, self-published authors attract readers by selling their books at a low price, often in e-book format. This puts traditional publishers under pressure to lower their prices too, especially in genre fiction such as romance, where the romance publisher Harlequin suffered financial losses and was ultimately acquired by HarperCollins in May 2014.[12] The acquisition has, for the most part, led to international opportunities for the now-imprint to publish in over 30 languages worldwide, a move it hopes will attract authors. Here again, we can see that traditional publishers are able to exploit an international brand presence.

In his article “A modest proposal for publishers and authors,” Jonathan Fields discusses the nature of the relationship between the two in the digitized age and how they can co-exist through partnerships. He says that traditional publishers, even as well-known brands, did not have direct access to buyers and, according to him, still do not.[13]

Self-publishers who are able to attract and maintain a profitable audience can explore the benefits that traditional publishers and booksellers are able to offer in partnership. Barnes & Noble’s Nook Press has launched PubIt!, which offers self-published authors e-book publishing and print book packages. Potential self-publishers can build their book and prepare downloadable manuscript files, with instructions on how to create and format e-books and print books on demand, as well as the technologies available to do this. Authors also have the option of acquiring professional input from Nook Press in any part of the publishing process.

Its author services packages can be purchased and guide authors through the publishing process to create a printed book which is ready for shipping within a week.[14] The Nook Press print platform creates print books for personal use, whereas the e-book platform creates digital books for sale through Nook and the Barnes & Noble website, which distributes directly to the reader.

According to the company’s press release, PubIt! attracts at least 20% independent authors every term and titles increase by 24% in the Nook Store. The report also states that at least 30% of customers purchase self-published content, accounting for at least a quarter of Nook book sales every month.[15]

In conclusion, self-publishers have approached the web as a platform or context of endless opportunity, whereas traditional publishers have perceived it as a threat to their business models and, in turn, their very purpose of publishing. Essentially, the stage is already set for a new form of publishing in which self-publishers are able to introduce new standards of creating and curating content, and of marketing and distributing it, with user-friendly, accessible and even free tools. The smart traditional publishers, and even booksellers, have, as we have seen, used this as an inspiration to expand their own models and collaborate with successful self-publishers, and even with emerging bloggers and annotators, by offering unbundled professional services and content strategies, as well as editing and formatting tools with which to publish their own books. The new stage of “techno-publishing,” a term I coined myself, is a place to invest in coding skills, multiplatform marketing and content disaggregation for the right audience at the right time; it is where the business of publishing is right now. What is left is for us to decide which part we’ll play in it as future publishers.

 

Work cited:

Elliott, A. 2014. “People-Powered Publishing is changing all the rules.” at http://mashable.com/2014/02/09/self-publishing-digital/

McGuire and O’Leary (2012) “Context, not Container” in A Futurist’s Manifesto. Press Books. http://book.pressbooks.com/chapter/context-not-container-brian-oleary


Please Put Your Hands Together For Facebook’s New Clown

In the trail of navel-gazing and soul-searching that’s streaking like condensation on the window through which Facebook and the New York Times are about to shake hands, the debate about the deal that would see the media organization publish content directly to Facebook has boiled down to a duel between control and independence on one side and access to audience on the other. The thermometer stuck into the crowd of media critics and pundits reads hot with fear over the potential deal. This potential deal, best described as a “leap of faith for news organizations” (words straight from the Times), is built on the idea that Facebook has journalism’s best interests in mind, an idea that Chris Cox, chief product officer at Facebook, relayed to David Carr (2014) when reports of the deal first surfaced in late 2014. Fundamentally, it seems unlikely that it’s in the best interest of digital publishers to cede power to a third-party company with its own shareholders to answer to. Ultimately, this partnership is a wedge that could sever the Times from its social audience by allowing Facebook to corral that audience, while setting Facebook on the road to becoming a publisher itself.

First and foremost, this partnership gives Facebook the opportunity to build a fence around its herd of users while making the pen so absolutely comfortable that the herd just couldn’t fathom leaving. When Facebook approached media organizations with the idea of publishing natively to its platform, the conversation was driven with user experience at the forefront. Currently, readers click on links posted by media organizations, wait the average eight seconds for the story to load, then consume it. This process of referral has proven beneficial in the way it drives enormous amounts of traffic to media organizations, but as Matt Buchanan (2014) of The Awl sees it, these enormous amounts of outbound links from Facebook are a wasteful byproduct of extracting users’ attention. With this deal, Facebook has created a way to reduce clicks, load time, and most importantly, referrals (waste), all in the name of improving the user experience, with the fine print reading that media organizations will be limited to the infrastructure developed by Facebook and, by doing so, will cede some level of control over the type and cost of advertising that runs beside the content. Cutting through the PR-speak, Joshua Benton at Nieman Lab writes what should be painfully obvious: “Facebook has far better data about individual users than any publisher has, and it wants to keep its users on Facebook.” This should be no surprise, as this is the same company that only permits Instagram users to have one functioning hyperlink on their profile and none in their posts.

The building of this fence is a dance in power dynamics between publishers and Facebook that implicates millennials. These fickle readers do not turn to the New York Times to find out about the world. They find out about the world from the New York Times story that their friends shared on Facebook (Honan, 2014). Reflecting on a strategy group involving millennials and a newspaper he was consulting for, Alan Mutter (2014) of Reflections of a Newsosaur found ideas that fall in the same line: The only media that matters to millennials is social media. They trust their friends for recommendations and aren’t loyal to one brand of media over another. They click on content that interests or amuses them, whether news, entertainment, or advertising: it’s all good. This crowd is attractive in the eyes of the Times for its size and youth. This deal presents an opportunity for the Times to publish a style of article that resonates with millennials in a sphere they frequent. But how does the New York Times brand benefit from this? Rarely is viral content on Facebook held in the esteem that the New York Times strives for as the hallmark of quality journalism. And do these fly-by Times readers via Facebook leave that ecosystem to find the Times’ owned channels and, moreover, do they come to these owned channels with their wallets open?

The end-game is a herd of news consumers (read: hundreds of millions) who turn to Facebook for their news more than they do now. These future readers may read and trust the Times in Facebook, but what happens if Facebook decides to shift its posture on hosting content from the Times on its platform? The readers the Times worked on feeding remain behind the fence. Do the loyalists in the herd follow the Times outside of the pasture? Some, sure. All? Surely not.

When Carr (2014) first reported on the makings of this partnership, David Bradley, owner and leader of the Atlantic Media Company, publisher of The Atlantic, spoke of an emerging competition that personifies the perils of the Facebook and New York Times deal: “In my last trip to the Valley, the best minds were talking about the same issue: Is the coming contest between platforms and publishing companies an existential threat to journalism? At least in the Valley, largely the answer I heard was ‘Yes.’”

With its emphasis on hosting videos on its platform, and now stories, Facebook is assembling a multi-purpose stage for others to put their content on display while it builds the fence to keep its herd in. Especially in light of Bradley’s comments, it is not unthinkable that Facebook could swap the content on display from the New York Times to, say, its own as the contest thickens. One aspect of this partnership that has critics concerned is publishers’ loss of the ability to collect data on their readers. While the details of how data about readership would be shared (or how advertising revenue would be shared, for that matter) have not been released, it’s presumable that Facebook would share this data in a manner similar to the way it shares data about Pages via its Insight program. While this information is invaluable to most media organizations, for organizations like the Times, it may not go deep enough. So while the Times may lose certain reader insights, Facebook gains a whole world of new insight that the company would not previously have been privy to: information like what stories readers of the Times are more likely to read through to completion, what authors are read the most, and what stories are read most in certain parts of the world. All of this data puts Facebook in a position to better understand the corral of readers it has amassed, in turn putting Facebook in a better position to take the reins and publish directly to the audience itself, not unlike how Netflix uses the data it collects to create shows like House of Cards.

As Mat Yurow, associate director of audience development at the New York Times, wrote, this deal may very well be the publishing industry’s iTunes moment. If Facebook can build an entertaining enough circus for the herd within its pen, the New York Times may be reduced to a bit part in the rotating cast of clowns brought in to seamlessly entertain while Facebook works on replacing those clowns altogether.

Sources

Benton, J. (2015, March 15). Facebook wants to be the new World Wide Web, and news orgs are apparently on board. Retrieved April 3, 2015, from http://www.niemanlab.org/2015/03/facebook-wants-to-be-the-new-world-wide-web-and-news-orgs-are-apparently-on-board/

Buchanan, M. (2014, August 11). Content Distributed. Retrieved April 3, 2015, from http://www.theawl.com/2014/08/content-distributed

Carr, D. (2014, October 26). Facebook Offers Life Raft, but Publishers Are Wary. Retrieved March 30, 2015, from http://www.nytimes.com/2014/10/27/business/media/facebook-offers-life-raft-but-publishers-are-wary.html?_r=2

Mutter, A. D. (2014, December 11). How newspapers lost the millennials. Retrieved April 5, 2015, from http://newsosaur.blogspot.ca/search?q=How newspapers lost the Millennials

Honan, M. (2014, December 17). Inside the Buzz-Fueled Media Startups Battling for Your Attention | WIRED. Retrieved January 15, 2015, from http://www.wired.com/2014/12/new-media-2/

Yurow, M. (2015, February 6). Please, [Insert Tech Platform Here], Take My Business! Retrieved April 3, 2015, from https://medium.com/@myurow/please-snapchat-facebook-twitter-take-my-content-2e1d96897144

Optimizing Algorithms for the Publishing Industry

Maintaining technological relevance and a subsequent competitive edge are two of the publishing industry’s greatest challenges, regardless of medium. The publishing landscape is changing at a pace previously unheard of, and this has resulted in an endless flood of theories, apps, programs, and designs, all purporting to hold the key to one’s success. One example of this trend is the idea of the computerized author and publisher, which has recently begun to pick up steam. Increasingly, advances in algorithmic machine learning are seen as heralding a new future, one in which human writing and publishing are obsolete. Such perspectives, however, can be seen as both an overreaction and an overreaching of the facts at hand. By presenting the question as one of machine versus human, the publishing world risks missing out on the algorithms’ full potential and yet another chance for improvement. Instead of seeing such algorithms as a means of replacement, the industry must see them as a way of advancing and increasing efficiency. If publishers can recognize these algorithms for the tools that they are, they have the ability to streamline the publishing process like never before. In this essay, I will thus explore the ways in which the implementation of such algorithms can modernize the pre-publishing tasks of topic exploration, market research, and acquisitions.

In the last decade, numerous organizations and businesses have explored the potential for algorithms within the publishing sphere, with each trying in various ways to create “a method and apparatus for automated authoring and marketing” (Abrahams). In most instances, these advancing technologies and algorithms are suggested as an alternative to human labour and the costs and time associated with it. Take for example Philip M. Parker, a “chair professor of Management Science at INSEAD” (Abrahams) and head of Icon Group International, a publishing and tech company. In the last ten years, Parker has “written more than one million titles” (McGuinness), all thanks to an algorithm that he created. The program itself is extremely simple, boasting a completion time of twenty minutes per book. Working off of large linguistic databases, the algorithm is designed to respond to a human-entered topic. Parker feeds the algorithm “a recipe for writing a particular genre […]. The computer [then] uses the recipe to select data from the database and write and format it into a book” (Abrahams). The entire process costs about twenty-three cents, and the resulting book can quickly be sold through Amazon or printed on demand when ordered.
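The recipe-and-database process quoted above can be pictured with a minimal sketch. It is an illustration only, not Parker’s actual system: the names DATABASE, RECIPE, and write_book are hypothetical, and the database entries are placeholder text standing in for real data.

```python
# Hypothetical illustration: DATABASE, RECIPE and write_book are invented
# for this sketch and do not describe Parker's real software.

# Placeholder "database": one topic with a few fields of stand-in text.
DATABASE = {
    "royal jelly supplements": {
        "definition": "Placeholder definition of the topic.",
        "market": "Placeholder market figures for the topic.",
        "outlook": "Placeholder outlook for the topic.",
    },
}

# A "recipe" for one genre: an ordered list of (chapter heading, database field).
RECIPE = [
    ("Introduction", "definition"),
    ("Market Overview", "market"),
    ("Outlook", "outlook"),
]


def write_book(topic, recipe, database):
    """Select the data for `topic` and format it into chapters, following `recipe`."""
    facts = database[topic]
    lines = [topic.title(), ""]
    for number, (heading, field) in enumerate(recipe, start=1):
        lines.append(f"Chapter {number}: {heading}")
        lines.append(facts[field])
        lines.append("")
    return "\n".join(lines)


book = write_book("royal jelly supplements", RECIPE, DATABASE)
print(book)
```

The human contribution in this sketch is limited to choosing the topic and designing the recipe, which is also why the output can only ever be as interesting as the data it rearranges.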

Given their speed and efficiency, machine-run algorithms like this are being hailed as the next wave of betterment for publishing industries across the globe. In some circles, it is even being argued that they will eliminate the need for many human positions, including those of “authors, editors, graphic artists, data analysts, translators, distributors and marketing personnel” (Abrahams). The problem with this idea, however, is that such a move would erase 80% of the industry and its employees, making it a far less attractive option. Beyond this, the technology itself is not yet advanced enough to match or better the results of human labour, particularly when it comes to writing and production in the creative field. By restricting algorithms to the replacement of human content creation, publishers run the risk of overestimating and misusing the technology. Though Parker is able to produce endless streams of books, for example, they are often encyclopedic in nature, exploring bland subjects like wax, sour red cabbage pickles, and royal jelly supplements (Abrahams).

Given these restrictions, it is highly unlikely that such algorithms will replace the human author or publisher in the near future, but that does not mean algorithms have no role in the industry. Rather, these platforms and programs are as vital to publishing advancement as ever, if in a more combinatorial manner. Instead of presenting the situation as human or machine, publishers need to recognize algorithms as a tool, and enact hybrid models of automation.

Although the products produced by companies such as Parker’s Icon Group International, or the similarly structured Nimble Books by Zimmerman, are subpar at best, their processes are not without value. Much like Parker, Zimmerman employs an algorithm that is able to “[search] a corpus of content and [select] articles that match” (Woods) a given topic, before organizing and formatting the contents into a book. These algorithms are able to search multiple online databases for a given topic or theme, analyzing and organizing obscene amounts of information in minutes. This alone has great implications for the publishing process. Instead of looking to package these findings in passable book format, publishers need to recognize algorithms as “accelerat[ing] and enhanc[ing] the traditional process” (Woods) of market and topic research. With more and more human knowledge and history being placed online, having programs in place that can quickly explore these channels will allow publishers to better understand a potential book or topic’s place in the market, audience, and public eye. Algorithms that consolidate the most relevant and important information in one place are essentially offering simplified, convenient, and “highly applicable market research within minutes, for just pennies” (Conner).

This in turn can give publishers a better sense of what users, demographics, and groups a certain book attracts, ultimately allowing for “more precision” (Conner) in audience targeting. The same can be said for topic research, with algorithmic investigations working to discover popular or oversaturated genres and comparable titles. Access to such databases would help ensure that a potential book was “genuinely unique, […] [and] potentially patentable” (Conner), consequently relieving some copyright and monetary risks. Authors and publishers alike would thus be able to see how certain styles or genres of books were received, giving them the opportunity to tailor their own products and marketing for success.

Beyond looking at historical placement and reception of texts to help decide acquisitions, algorithm-run platforms can also improve the publishing process via statistical stylometry. Statistical stylometry is the “statistical analysis of variations in literary style between one writer or genre” and another (Stony Brook), and it can be used to predict commercial and critical success. In 2013, Professor Yejin Choi from Stony Brook University unveiled a study in which an algorithm was used to analyze 800 books from Project Gutenberg, a platform which “houses 42,000 books that are available for free download” (Stony Brook). The study was one of the first to try to provide quantitative insights into the relationship between book success and writing style. It looked at “1,000 sentences from the beginning of each book […] [and] performed systematic analyses based on lexical and syntactic features” (Stony Brook), before comparing those statistics to the number of downloads. The study itself uncovered numerous interesting trends and correlations, such as how less successful books tend to be “characterized by a higher percentage of verbs, adverbs, and foreign words” (Stony Brook), while successful books make more frequent use of discourse connectives.
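To make the idea of a lexical feature concrete, here is a minimal sketch that measures how often discourse connectives appear in the opening sentences of a text. It is not the method used in the Stony Brook study: the connective list, the naive sentence splitter, and the function names are all assumptions chosen for illustration, and a real analysis would also need a part-of-speech tagger for the syntactic features.

```python
import re

# Assumption: a tiny hand-picked list of discourse connectives, for illustration only.
CONNECTIVES = {"and", "but", "because", "however", "therefore", "although", "so", "while"}


def opening_sentences(text, limit=1000):
    """Split text into sentences (naively, on ., ! or ?) and keep the first `limit`."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return sentences[:limit]


def connective_rate(text, limit=1000):
    """Share of words in the opening sentences that are discourse connectives."""
    words = []
    for sentence in opening_sentences(text, limit):
        words.extend(re.findall(r"[a-z']+", sentence.lower()))
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in CONNECTIVES)
    return hits / len(words)


sample = "She waited, but no one came. However, the door was open."
rate = connective_rate(sample)
print(f"{rate:.3f}")
```

A figure like this, computed for every book in a collection, is the kind of statistic that could then be compared against download counts or sales.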

For the sake of this essay, however, the finding I am focusing on is the algorithm’s ability to correctly determine the success of a book given its writing pattern. At the conclusion of the study, Choi determined that the algorithm was “effective in distinguishing highly successful literature from its less successful counterpart, achieving accuracy rates as high as 84%” (Stony Brook). Though not completely accurate, an algorithm this proficient could be used to further improve processes of acquisition. If used as a tool, this program could help publishers determine which books have a higher probability of being a success, and consequently, which books to take on. This could also help first-time authors get published, since presses would be more likely to take on the risk of an unknown writer if given statistical evidence of a profitable return. Predicting the success of literary works has always posed a “massive dilemma for publishers” (Conner), and algorithms such as this one present an opportunity to improve this process, taking some of the guesswork out of acquisitions in a way previously impossible.

Although algorithmic programs present almost infinite opportunities for the publishing world, publishers should be careful in how they decide to implement such technology. Rather than introducing algorithms as a means of replacing human labour and content production, publishers should use them as a tool for improving and advancing the systems currently in place. Given their talent for quickly searching and consolidating mass amounts of online information, algorithms have the ability to completely modernize processes of research and acquisitions, ultimately cutting costs and time, while improving accuracy. This in turn presents an alternative future for publishing, one in which human creation is accelerated and perfected by the machine, rather than erased.

Works Cited

Abrahams, Marc. “How to Write 85,000 Books.” Annals of Improbable Research. 2008. www.neatorama.com/2010/10/05/how-to-write-85000-books/

Conner, Cheryl. “Could your next book be written by a machine?” August 23, 2012. www.forbes.com/sites/cherylsnappconner/2012/08/23/could-your-next-book-be-written-by-a-machine/

McGuinness, Ross. “Meet the robots writing your news articles: the rise of automated journalism.” July 10, 2014. www.metro.co.uk/2014/07/10/meet-the-robots-writing-your-news-articles-the-rise-of-automated-journalism-4792284/

Stony Brook University. “Some elements of writing style differentiate successful fiction.” Science Daily. January 6, 2014. www.sciencedaily.com/releases/2014/01/140106094151.htm

Woods, Dan. “How Algorithmically created content will transform publishing.” Aug 13, 2012. www.forbes.com/site/danwoods/2012/08/13/how-algorithmically-created-content-will-transform-publishing/

Gamification in Publishing: A Brief Argument for its Current Use and Future Expansion

This essay argues that gamification in publishing is a beneficial development and should be sustained in any of its present forms or potential uses. Gamification, as will be defined, can influence positive reading behaviour through the reader’s play and interaction with the content by means of intrinsic and/or extrinsic motivators. Presently, companies like Scholastic, Kobo, and the Huffington Post have successfully integrated gamification into their publishing (or platform) models, but many publishers remain skeptical of its intrinsic, long-term value to the reader’s experience. It should be noted, however, that “gamification” strategies (although not traditionally defined as such) have been a long-standing part of print editorial, production, distribution, and marketing plans since before the inception of digital reading. Furthermore, reader psychology naturally allows for the acceptance of a gamified environment through the reader’s inherent motivation to complete, share, and collect content. As such, the adoption of gamification in digital production by publishers and content providers not only benefits the industry as a whole, but should also be expanded beyond the elementary use of points, badges, and leaderboards (PBL), and adopted cross-genre.

Gamification is a buzzword coined in 2002 by Nick Pelling, a British-born computer programmer[1], to describe the use of game structure and mechanics in a non-game context. Gamified content is such that a user’s motivation within a game structure cultivates higher engagement with the content. Yu-kai Chou, in his book Actionable Gamification, references games like Candy Crush or Angry Birds that “require the same repetitive action” from the user over weeks or months; in the real world, this would be described as a chore or “grunt work”, but it now becomes “fun and addictive” because the user gains a feeling of accomplishment and gratification that is immediate and tangible[2]. These feelings are produced by extrinsic motivation (obtaining checkpoints and awards) or by intrinsic motivation (fulfilling creative expression or social contribution). Chou says, “[users] want these feelings enough that anything that stands in the way, be it grunt work or otherwise, is worth doing and doing urgently[3].”

It should be noted that publishers don’t consider reading a chore, and they assume that “people buy books for the joy of reading[4],” likening enjoyment of books to the act of reading itself. This is misguided. Reading is simply a tool in our toolkit, much like our ears are for listening, and our eyes are for watching. Our ability to read is not the reason why we do it; it is, in fact, emotional drivers that motivate us to read, like learning new information, being entertained through storytelling, escaping the real world, empathizing with a character, keeping up-to-date on current affairs or hot topics, etc. Furthermore, none of these emotional drivers requires that we read; any of them can be fulfilled through alternate forms of media, like music, video, and video games. As such, and as a result of a converging media field[5], gamification is helping to justify the act of reading by substantiating a reader’s emotional drivers through a system of reward. Now, those people who believe reading is a chore, those who spend their valuable time watching YouTube instead, or those who can’t seem to finish a book they have started three times over, are given the motivation they need to engage with the book in a positive way. Nancy Davis Kho, an online content professional, agrees, saying, “game theorists believe that integrating [gamification] techniques into [digital] interactions appeals to a range of fundamental human desires, including rewards, status, achievement, self-expression, competition, and even altruism[6]” that can, at the same time, be highly profitable for the publisher, author, and third party content provider.

Examples of successful gamification can be seen in Scholastic’s 39 Clues, an online game room that supplements the book series, driving kids to creatively problem-solve to discover clues in the book that help them level-up in their online adventure. Kobo, in its newer devices, overlaid a system of points, badges, and leaderboards that tracks the user’s progress, rewarding them for things like fastest reading time, which can be shared on social media. One of the earliest adopters of content gamification was the Huffington Post, which enabled readers to earn points and badges for any article comments that gained respect from the community[7]. It also developed a long-standing game, “Predict the News,” that engages users through a system of voting and ranking. In gamification, profitability is inseparable from the reader’s engagement with the content. If content gamification is designed with consideration for the reader’s emotional drivers, publishers can expect a return on investment, such as enhanced brand value through reader loyalty, enhanced product value through added features, higher visibility through social sharing, and data collection through reader interaction.

Furthermore, a sound gamification strategy cannot rely on the developer to gauge the reader’s emotional drivers; audience information that is best gathered from the publisher is necessary in developing a successful product. Acquiring reader loyalty, for example, may require new levels or hidden chapters of fresh content because “players quickly become disillusioned with games that don’t present new challenges[8].” Publishers are best suited to either generate new content or recommend further reading at these milestones. As a second example, fully engaging the reader within the gamification funnel (the game mechanics that pull a reader through the book[9]) may require the publisher’s insight on a reader’s changing needs and motivations as the story unfolds. At each point of reader interaction that the publisher can foresee and implement into the gamified reading structure, they are furthering their knowledge of reading behaviours and reading communities through the collection of live reader data. This information can then be reintegrated into the system, perhaps to help build a reader’s profile avatar, to perfect recommendation algorithms, or simply to optimize later iterations of a similar game.

It should be noted that publishers have been using “gamification” strategies (although not traditionally defined as such) since the beginning of modern publishing. While this essay will not cover gamification history, some examples of its use in editorial, production, distribution, and marketing plans for print books will help to contextualize its natural fit within the publishing industry. Firstly, chapter headings, subheadings, and other textual organizational devices act as physical milestones for the reader to work toward and to track their progress against. Badges and leaderboards are new gamified devices to assert the reader’s progress, but they mirror exactly the kind of motivation and sense of accomplishment that the reader naturally feels. Using broken-up chapters to engage audiences has been a publishing strategy since the serialized novels of the Victorian era. Now, a reader who unlocks a hidden chapter level, or who wins an excerpt trophy, has the same level of engagement. Secondly, readers naturally want to share their knowledge and opinions of a book, and this motivational driver has been baked into the publishing model through reviews and critiques, ratings and book awards, reader-to-reader recommendations, etc. Social media, as a subset of content gamification, is essentially the digital version of a book club where readers are motivated to share, lead a discussion, gather followers, etc. Gamified books can reflect this by having the user log in with their social account. Finally, the reader’s urge to collect and/or display books has informed print production and design throughout history. Whether these books are collected with the aim of acquiring an award within the game community, or displayed in an online library that people can rank and share, these gamified devices are not changing a reader’s behaviour from what it once was in the print age; gamification is simply enhancing innate reader behaviour to comply with a digital reading world.

As such, the adoption of gamification in digital production by publishers and content providers not only benefits the industry as a whole, but should also be expanded to include new gamification devices that would allow for its cultivation cross-genre. Some of the examples mentioned previously are not genre specific and could be applied to literary fiction, non-fiction, genre fiction, educational books and the like. However, as Kho suggests, “one obstacle preventing deeper penetration of gamification into publishing organizations may be the concern that gaming will sully the seriousness of how content is received,” noting that, “earliest forays by some publishers into gamification were around relatively ‘light’ content, such as recipes[10].” As such, publishers see potential for gamification and interactivity within a limited scope of “light” or “playful” content, like children’s books. In Kho’s same article, Bunchball Inc.’s founder and chief product officer, Rajat Paharia, suggests, “publishers may think that gamification is more appropriate for entertainment properties[11].” Again, these are misinterpretations on the part of the publisher and reveal their bias that some content should be, and wants to be, consumed in a “serious way”. In fact, gamification devices are neither inherently serious nor inherently playful; they simply reflect human motivational drivers that we all share in interacting with the world around us. Our need to, for example, find meaning, become empowered, influence others, be surprised, avoid hazards, acquire rarities, take ownership, or accomplish a goal[12] can each be a playful or a serious endeavour. The fact that publishers can digitally harness these powerful motivational drivers through gamification should be an exciting development in the history of publishing.

References

[1] “Gamification.” 2015. Wikipedia, the Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Gamification&oldid=651735429.

[2] Chou, Yu-kai. 2014. Actionable Gamification. Leanpub. https://leanpub.com/actionable-gamification-beyond-points-badges-leaderboards.

[3] ibid.

[4] Kalder, Daniel. 2011. “Ready Reader One: Why Gamification Is Key to Publishing’s Future.” Publishing Perspectives. http://publishingperspectives.com/2011/09/gamification-key-publishing-future/.

[5] Ruth, Linda. 2013. “Gamification: Publishing’s Most Important Challenge.” Publishing Executive. http://www.pubexec.com/blog/gamification-publishings-most-important-challenge.

[6] Kho, Nancy Davis. 2012. “Getting Gamified: Publishers Score Big With Online Games.” EContent Magazine. http://www.econtentmag.com/Articles/Editorial/Feature/Getting-Gamified-Publishers-Score-Big-With-Online-Games-80935.htm.

[7] ibid.

[8] ibid.

[9] Rathi, Nandini. 2015. “10 Proven Gamification Strategies for Publishers to Maximize Engagement.” Betaout. https://www.betaout.com/blog/10-proven-gamification-strategies-for-publishers-to-maximize-engagement/.

[10] Kho, “Getting Gamified: Publishers Score Big With Online Games.”

[11] ibid.

[12] Chou. A summary of Chou’s Octalysis concept.

Greenfield, Jeremy. 2013. “One Idea to Save Illustrated Ebooks: Gamification.” Digital Book World. http://www.digitalbookworld.com/2013/one-idea-to-save-illustrated-ebooks-gamification/.

Nawotka, Edward. 2015. “Does Gamification Turn Readers Into Winners and Losers?” Publishing Perspectives. http://publishingperspectives.com/2011/09/gamification-winners-and-losers/.

Star, Kam. 2013. “Behavioural Design”. presented at the Behaviour Design day at Digital Shoreditch. http://playgen.com/behavioural-design/.

How Publishers Can Get Rich or Die Tryin’ in the Digital Landscape

In a time of content abundance, “discoverability is becoming a bigger problem for authors and publishers,” according to JellyBooks founder Andrew Rhomberg (2014). But is the quantity of new books really creating the problem? Perhaps this is actually an example of a square peg being forced into a round hole. Book publishers continue to apply traditional or antiquated marketing techniques to a completely new marketing and selling environment. If book publishers re-strategized their marketing plans to use the key tools available to them to optimize the discoverability of a text, wouldn’t today’s discoverability actually be better than in the old days of paper books in a 4-storey bookstore? This paper will examine the most important elements of book discoverability in the digital world and will recommend priorities for publishers as they adopt new and improved marketing techniques. Compared to traditional print book marketing strategies, such as jacket copy, print advertising, and co-op bookstore placement, discoverability in the digital landscape is better than ever before and presents an enormous opportunity for publishers in terms of awareness and sales.

1. Metadata

In 2013, 50% of print books and 90% of ebooks were discovered online (Booknet 2013). This online discovery takes place in multiple ways, including online browsing, retailer recommendations, author sites, and social networks. Though online browsing has the highest impact on online discovery, the impact is spread across multiple sources. Therefore, and not surprisingly, my recommendation for a publisher’s first priority in digital discoverability is metadata, because publishers must first reach the online book shopper. Though the online book shopper may or may not discover books on Amazon’s or Chapters’ site, they are buying them there, and online recommendation algorithms also depend on quality metadata (Bellis). As of 2012, 25% of Canadian book buying was happening on the internet across ebook, audiobook, and print book purchases (Booknet). Because they represent a quarter of the market, online buyers need to be able to find and purchase the book they want, based on any feature, including author, subject matter, BISAC code, colour of the cover, or anything else that relates to the text. The publisher with the “best, most complete metadata offers the greatest chance for consumers to buy books” (Dawson).

Metadata presents many opportunities of which publishers should take advantage. One opportunity for publishers is to appeal to more than one type of consumer. In traditional print book marketing, the single audience for the book is identified, perhaps a secondary audience is considered, and then marketing is strategically planned to reach that specific consumer. In marketing books digitally, there’s no longer a limitation to one audience or consumer for a given title. This is a huge opportunity for publishers, and it is largely being overlooked in today’s digital climate (Dawson). Many books are suitable for several specific target markets, and each has its own vernacular through which it can be reached (Dawson). It’s necessary to have different audiences addressed within a book’s metadata, because not everyone needs or wants to know the same thing about a given title. Where publishers in the past needed to segment, they can now reach multiple audiences through their metadata. The commitment from the publisher is to hire one individual to focus entirely on metadata entry. Of course, effective metadata requires research into search terms and audience data, and research requires associates who know how to do it well (Shatzkin). This is where publishers’ staffing needs to look beyond ability with social media and publicity pitches. Publishers need marketers who understand the importance of metadata and have the skills to make the best use of it.

Another opportunity with metadata today is to harness the power of the backlist title. Where in the past a bookstore had limited shelf space to hold stock, today’s bookstore has endless storage space. Amazon, for example, sells products that it doesn’t even physically hold. In the past, backlist sales have typically been an afterthought, a pleasant surprise in the profitability of a title. However, in the digital landscape, effective metadata and virtually endless stock space allow a longer lifespan for backlist titles. HarperCollins executive Susan Katz says backlist sales are an opportunity for all genres, if publishers place the proper focus on preparation of backlist metadata. “If you don’t take care of the metadata, it will take care of you,” she joked at Digital Book World 2014 (DBW). Further, backlist titles present the opportunity to address current events or news stories that make a title relevant once again, and thorough and effective metadata is the way to monitor the relevance of past titles (DBW).

The first priority for publishers when it comes to digital book discoverability needs to be focused on quality and extensive metadata for titles. Metadata is the bare minimum that publishers need to complete because online discoverability is literally impossible without proper metadata. The publishers who don’t prioritize effective metadata risk poor sales for one simple reason: because no one can find their books to begin with (Dawson).

2. Inbound Marketing

The second major priority for publishers, after metadata, should be an inbound marketing strategy. Inbound marketing would allow publishers to build up owned content on their own website that will work to attract search engine users (basically the entire general public) to their site content and eventually, their products. Plus, besides the manpower required to create content, a full inbound marketing campaign can be orchestrated on a free platform like WordPress with the help of affordable services like MailChimp for newsletters. Inbound marketing, coined by Hubspot in 2006, works from the basis of creating quality content, such as blog posts, email newsletters, calls-to-action, and social media, that pulls people in by aligning that content with the customer’s own interests. The four actions of inbound marketing (Attract, Convert, Close, Delight) work to move potential customers down the sales funnel to eventually create brand ambassadors out of paying customers (Hubspot). A publisher’s website is the only online venue that they have complete control over (Izenwasser). They don’t have any control over Amazon, or Kobo, or social media sites. The website is all they have, and that control over the content, the search-engine optimization, and the conversions, is exactly what publishers need to steer customers to purchase directly. Booknet reports that the top way readers become aware of new titles is through online browsing, therefore publishers need a way to reach readers early in the sales funnel, on their own site (2012).

Inbound marketing is also a great opportunity for publishers to begin building direct relationships with customers. Historically, a publisher’s relationship with the consumer has been almost non-existent, as the publisher sells to a distributor, the distributor sells to a retailer, and the retailer builds the relationship with the customer (and gets all of the customer data). Due to the importance of data and loyalty, “establishing a direct customer relationship is probably the most important task for publishers in 2015 and beyond” (Izenwasser). Inbound marketing works to drive customers back to the website, creating a relationship between the visitor and the brand, even if they never buy products. Then, when they search to purchase a publisher’s book later on, they’ll recognize the publisher name (plus Google will reward the publisher for having created a previous relationship with the customer), and will potentially purchase products (Barnes). Barnes says to think of inbound marketing as a way to produce more reliable search results for customers. “Want to attract the right visitors to that online shop you’ve gone to the bother of creating? Inbound marketing is the best way to spark up a meaningful, fruitful relationship with customers who love what you’re doing” (Barnes). The audience research done in the metadata phase of a marketing campaign will also be useful in the inbound campaigns. Content can be created to attract various audience segments and aid discoverability of a title. Imagine going to a publisher website and seeing content that is interesting and valuable to you, as a reader, rather than a half-hearted visual of book covers or an online store that is obviously less streamlined than Amazon’s or Chapters Indigo’s. Build the relationship, and then send readers to the books and online store. Izenwasser notes that the advantage here is for small publishers, who are likely to have an easier time building up brand awareness. He credits this to the fact that smaller presses have fewer titles and imprints, and often more focused lists, which will allow a more cohesive online experience.

Lastly, tools such as Google Analytics can be used by publishers not only to improve inbound marketing efforts but also to expand understanding of audiences for content and for books, which in turn will feed back to the metadata for an imprint and for backlist texts. The “right” search terms for a book or audience may not be the same in both Google and Amazon, and this level of data analysis will allow publishers to segment between online representations of a given title (Shatzkin).

Discoverability for print books is split between in-person (52%) and online (49%), as of 2013 (Booknet). This means that half of print books, and the vast majority of ebooks, are discovered in the online environment. Publishers need to build up their online presence where they have control over the content, reach customers early in the sales funnel, and build up the direct relationship with customers in order to capitalize on this key discoverability venue.

3. Social & Startups

Once publishers have managed their resources to cover both metadata and inbound marketing, there is some room to play with other exciting digital marketing platforms. Here is where a publisher will strive to reach passionate readers, not the typical reader. These are the types of readers who access special platforms to learn about books.

Social media, though used in inbound marketing to drive traffic to a website, is also worth consideration to a further extent once a publisher has successfully managed effective metadata and inbound marketing strategies. Beyond directing traffic, social media can also be valuable for discoverability of books within the confines of a given social ecosystem. This includes but is not limited to the social trinity of Facebook, Twitter, and Instagram. Pinterest, Goodreads, Tumblr, and Reddit all provide publishers with unique opportunities to interact with potential customers rather than just push out posts, and build relationships with unique audiences. Despite being rented media, they’re still worth investigating (as long as the metadata and website are already taken care of), plus, they provide marketers with even more data to apply towards audience research.

Book marketing startups could also be enormously beneficial to publishers. As an example, Jellybooks is a platform that allows readers to sample ebooks and discover their next book by browsing by genre. Among its many tools for publishers, two stand out as being especially beneficial. The first is its main service, which offers cloud-hosted book samples for readers to download or share with friends. This is a worthy opportunity for publishers because, according to Booknet, reading an excerpt is the primary mode of ebook discovery for Canadians (Booknet 2012). Not only does this tool offer readers a no-risk opportunity to browse titles, but Jellybooks also provides publishers with full access to download and share statistics as part of its Jellybooks VIP program (Jellybooks). The second tool worth noting is its book widgets, which can be applied to publisher websites, author websites, blogs, and Tumblrs to allow for easy and direct sample downloads. Again, data is shared with publishers, including which widgets are the most effective at spreading the word and who creates the most downloads and shares. This also adds value to the publisher website by offering an additional service. Not only does Jellybooks facilitate ebook sampling, but it also links out for purchase of a book upon completion of the sample, which can include a publisher estore (for a print book or an ebook) as a purchase option (Jellybooks). Direct purchases will benefit from the improved brand recognition that publishers have already put in place by honing their inbound marketing strategy.

Social media and tech startups provide exciting new opportunities for publishers to explore. However, these opportunities should not be explored until complex metadata and strategic inbound marketing have already been put into place. The communities on social sites and on startups such as Jellybooks are specialized, and while they are worthy of exploration, they do not represent the majority of book buyers in the Canadian market.

Conclusion

Andrew Rhomberg says no author or publisher can sustain sales momentum without a virtuous feedback loop (2015): Discover, Sample, Buy, Read, Share, Discover. Where discovery traditionally happened in a bookstore by reading jacket copy, today’s publishers can focus metadata on specific audiences to aid discoverability. Further, where in the past advertising was used to push product information towards customers, publishers can now use inbound marketing to draw the right customers in towards their products. Where contact between readers and publishers has historically been non-existent, publishers can use their websites and social media accounts to build direct customer relationships. Lastly, sampling can now be done digitally, in the comfort of one’s home, rather than in a physical bookstore. Success in these important digital marketing venues will only come from correct prioritization and allocation of resources. It’s easy to declare that discoverability is the next big problem in publishing, but publishers would do better to revolutionize their marketing techniques and take advantage of the great opportunities digital discoverability provides instead.

Works Cited

Barnes, Emma. “The How and Why of Inbound Book Marketing.” Digital Book World. 26 Feb 2015. Web. 31 Mar 2015.
http://www.digitalbookworld.com/2015/the-how-and-why-of-inbound-book-marketing/

Bellis, Rich. “New Ebook Discovery Efforts Differ on Means.” Digital Book World. 12 Mar 2015. Web. 31 Mar 2015.
http://www.digitalbookworld.com/2015/new-ebook-discovery-efforts-differ-on-means/

Booknet. “The Canadian Book Consumer 2013: Book Purchases by Channel.” 2013. Web. 1 Apr 2015.
http://content.lib.sfu.ca/cdm/ref/collection/sfulibr/id/1671

Booknet. “The Canadian Book Consumer 2012: Annual Report.” 2012. Web. 1 Apr 2015. http://edocs.lib.sfu.ca/restricted/BookNet/BNC_BookConsumer_annual-report2012.pdf

Dawson, Laura. “What We Talk About When We Talk About Metadata.” A Futurist’s Manifesto. ed. Hugh McGuire & Brian O’Leary. O’Reilly Media. 2012. Web. 31 Mar 2015.
http://book.pressbooks.com/chapter/metadata-laura-dawson

DBW. “Selling Back-List Titles? Think Audiences and Metadata.” Digital Book World. 24 Apr 2014. Web. 31 Mar 2015.
http://www.digitalbookworld.com/2014/selling-back-list-titles-think-audience-and-metadata/

Hubspot. “The Inbound Methodology: The best way to turn strangers into customers and promoters of your business.” Web. 2 Apr 2015.
http://www.hubspot.com/inbound-marketing

Izenwasser, Murray. “Why Book Marketing (Still) Starts and Ends with the Website.” Digital Book World. 12 Dec 2014. Web. 1 Apr 2015.
http://www.digitalbookworld.com/2014/why-book-marketing-still-starts-and-ends-with-the-website/

Jellybooks. “Jellybooks for Publishers.” Web. 2 April 2015. http://publishers.jellybooks.com/

Rhomberg, Andrew. “Discoverability, Not Discovery, Is Publishing’s Next Big Challenge.” Digital Book World. 6 Jan 2014. Web. 30 Mar 2015.
http://www.digitalbookworld.com/2014/discoverability-not-discovery-is-publishings-next-big-challenge/

Rhomberg, Andrew. “Publish and They Will Come… Right?” Digital Book World. 8 Jan 2015. Web. 30 Mar 2015.
http://www.digitalbookworld.com/2015/publish-and-they-will-come-right/

Shatzkin, Mike. “Peter McCarthy and I have a new business and publishing has a new digital marketing service.” The Shatzkin Files. 7 Apr 2014. Web. 2 Apr 2015.
http://www.idealog.com/blog/peter-mccarthy-new-business-publishing-new-digital-marketing-service/

Topic modeling for BISAC code selection

From the first book we read, or, more often, have read to us, we begin to form preferences. We find authors we like, writing styles or reading levels that we enjoy consuming, plotlines that compel us to keep reading, and characters we connect with. Underlying all of these nuanced preferences, however, is one very specific penchant: genre.

The genre of a book is almost always the first criterion considered by a reader. When we answer the question, “what kind of books do you like?” we, more often than not, respond with a list of our preferred genres. It is the defining feature that creates our personal dichotomy of books we like and those we do not. Occasionally, this can be overturned by the preference for an author whose writing spans genres, but this seldom happens. While J.K. Rowling may have the power to draw readers to any kind of book she might produce, most authors could not draw equal readership for, say, their writing in the two genres of romance and true crime.

Think even of the setup of a typical bookstore. If you live in Canada, you are likely imagining a Chapters-Indigo, and would confirm the store as being laid out in sections based on the genre of the books found within. The same is true online, as we encounter and browse through websites that offer us catalogues of books sorted by genre; categorized into the neat and tidy packages of biography, fiction and literature, science fiction, and more. Whether wandering into a brick and mortar store, or browsing through an online retailer’s website, we start with genre and go from there.

Understanding then how important book genre is to a reader’s selection and purchase of a book, it would seem logical to assume that the system by which books are assigned a genre is one of consistency and standardization; with strict guidelines and processes in place to ensure that the genre being selected is accurate—reflective of the book’s content. Unfortunately, this assumption is untrue.

The process by which books are categorized into genres begins with a publisher’s selection of what is known as a Book Industry Standards and Communications (BISAC) code, which is entered into the publisher’s metadata system (likely ONIX) to be shared with booksellers and retailers along with the rest of the book’s data (e.g. title, author, price, etc.).

These codes are created and managed by the Book Industry Study Group’s Subject Codes Committee, are updated on an annual basis, and run the gamut between being vague (e.g. BIO000000 BIOGRAPHY & AUTOBIOGRAPHY / General) and specific (e.g. CRA044000 CRAFTS & HOBBIES / Needlework / Cross-Stitch) [1]. As of 2014, the BISG has created 52 subject headings under which BISAC codes are listed, many of which have been carefully cross-listed to ensure that no two codes are redundant.

Despite this extensive and detailed list being developed with apparent care by the BISG, no definitions for the various subject headings or codes are provided. Very little guidance surrounding the selection of BISAC codes based on book content is given, and no categorization tools or aids are supplied.

The BISG website certainly does little to assist publishers in their selection of BISAC codes. In the FAQ section of their BISAC Tutorial and FAQ page they answer the question “How do I choose the BISAC Subject Heading for a specific book?” with the following vague and ineffectual response:

The first step in determining the proper heading for a book would be to identify which of the 52 major areas within the list is most appropriate for the title. Once that section is identified, look for the term that most closely fits the content of the book. If the title has numerous facets, it is recommended that the process be repeated for other relevant major sections. If database systems are sophisticated enough, a recommendation is to do a Keyword or Find search on the entire list in order to identify all the terms that may be appropriate for the book. This is especially effective if it is difficult to determine the proper major section for the term one imagines would be used. This will also help alert the user to cases where similar subjects appear in different sections to reflect different ways of approaching the topic (e.g., “HEALTH & FITNESS / Sexuality”, “PSYCHOLOGY / Human Sexuality”, “RELIGION / Sexuality & Gender Studies”, “SELF-HELP / Sexual Instruction”, not to mention related subjects under JUVENILE FICTION, JUVENILE NONFICTION, and SOCIAL SCIENCE). [2]

Beyond this, the only concrete documentation provided to assist publishers in their selection of BISAC codes is an optional download of a document called Best Practices of Product Metadata. The document reminds publishers that “BISAC subject should be assigned based on book’s content—not on the merchandising plans of the publisher” [3] and offers limited (and commonsense) advice including:

  • There should be consistency across formats. In other words, hardcover, paperback, mass market, large print, audio books, and e-books should all have the same BISAC subjects.
  • Works of juvenile nonfiction should be assigned subjects in the JUVENILE NONFICTION section only. Collections containing both juvenile nonfiction and juvenile fiction may also be assigned subjects in the JUVENILE FICTION section.
  • Use subjects in the FOREIGN LANGUAGE STUDY section for works about the languages specified, whether these works are of an instructional, historical, or linguistic nature. Do not use subjects in this section to indicate the language of a work: works should be classified based on their subject content without regard to the language in which they are written (of course, if a work is about a language and written in that language, a subject in this section should be assigned) [4].

Only two small pieces of the advice offered even remotely pertain to a book’s content and its relation to selecting a BISAC code. They are:

  • Use subjects in the HEALTH & FITNESS section for works aimed at nonprofessionals. For scholarly works and/or works aimed at medical or health care professionals, use subjects in the MEDICAL section.
  • Certain other subject combinations also apply to titles intended for a lay person vs. those intended for a professional. These combinations include Nature vs. Science, Self-Help vs. Psychology [5].

The rest of the information provided is focused on the entry process for the BISAC codes into metadata systems such as ONIX, and other administrative or clerical tasks associated with BISAC code selection (e.g. how many codes you can select, the fact that a general code is not required if a more specific code from the same subject heading is selected, etc.).

In the absence of industry standards, what’s left is publisher intuition and interpretation, with each publisher (or their proxy) applying their own definitions to the subject headings and selecting BISAC codes as they see fit. The result is an unorganized system with no consistency across the millions of books published and released into the North American market each year.

The question then, is how do we fix this broken system and implement a consistent process for selecting BISAC codes? The first solution that springs to mind is to define the BISAC subject headings and formulate guidelines outlining which elements within a book’s content correspond to specific subject headings. Although this would likely have some positive impact on the consistency of the BISAC codes being assigned by publishers, it would not be enough because it does not fully resolve the key issue with the current system: the potential for human interpretation and bias. Even with guidelines in place, a system that relies on individuals to understand, interpret, and apply standards is bound to experience variance in the output being produced. The answer then must include moving the process of selecting BISAC codes outside the responsibilities of individuals and into the hands of technology, where subjective interpretation is replaced by the objective and programmable application of rules and standards.

Enter natural language processing. A field of study that combines computer science, artificial intelligence, and linguistics, natural language processing (NLP) is concerned with the interaction of computers with human (natural) languages. It includes experimentation with numerous computer-completed “tasks” such as speech recognition (converting speech to its textual equivalent), translation (converting text from one language to another), and, most relevant to the issue of selecting a BISAC code, topic modeling (determining a document’s topic based on elements within the text) [6].

According to Princeton researcher David M. Blei, topic modeling is a statistical method of analyzing the words of original texts “to discover the themes that run through them, [and] how those themes are connected to each other.” [7] This method functions when a computer processes a text and identifies specific patterns within it.

These patterns can include the recurrent use of certain words or phrases, or the repeated appearance of relationships (in terms of grammar, order, and position) between words or phrases, and are measured or quantified so as to provide the statistical likelihood that a text pertains to a specific theme or subject matter [8].
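The simplest of these patterns, recurring words and recurring adjacent word pairs, can be counted in a few lines. The following is only a minimal sketch, not the method Blei describes: the function names and the sample sentence are hypothetical, and real systems use far richer linguistic features.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pattern_counts(text):
    """Count single words and adjacent word pairs (bigrams) in a text."""
    tokens = tokenize(text)
    words = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return words, bigrams


def relative_frequency(counter):
    """Convert raw counts into proportions of the total."""
    total = sum(counter.values())
    return {item: n / total for item, n in counter.items()} if total else {}


# Hypothetical sample text, invented for illustration only.
sample = "Thread the needle, then stitch. Each cross stitch uses one thread."
words, bigrams = pattern_counts(sample)
word_share = relative_frequency(words)
```

Proportions such as `word_share["stitch"]` are the kind of measurement a model can then compare against what is typical for a given subject.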

In order for topic modeling to work, the computer processing the text relies on a set of algorithms or rules defining which patterns it should be looking for and which topics correspond with these patterns. These rules are generally created using lexical databases and other linguistic (syntactic and semantic) information, which, for the scope of this essay, will not be discussed in detail.

In an introductory paper discussing topic modeling, Blei goes on to describe the benefits of topic modeling by asking readers to “[i]magine searching and exploring documents based on the themes that run through them” [9].

We might “zoom in” and “zoom out” to find specific or broader themes; we might look at how those themes changed through time or how they are connected to each other. Rather than finding documents through keyword search alone, we might first find the theme that we are interested in, and then examine the documents related to that theme [10].

Blei’s description of how topic modeling is related to search and discoverability sounds strikingly similar to the way readers already search for and select books. Replace “theme” with “subject heading” and interpret zooming in as selecting a more specific BISAC code within a subject heading, and topic modeling and the selection of a BISAC code are mirrored processes. The only difference is that, at present, one is completed by humans and the other by computers.

Applying the process of topic modeling to the selection of BISAC codes then, we would begin by developing a standard for the linguistic, semantic and syntactic patterns associated with specific BISAC codes. This can be done in multiple ways, but the most obvious way would be through an examination of the patterns present in a massive corpus of books already identified as having a specific BISAC code.

With these patterns identified and thus the topic modeling rules or algorithms set, a publisher could run the text of a book through topic modeling software. The computer would process the text of a book, measuring and recording the patterns it observes, and then, using the frequency and proportions of these patterns, identify the book’s subject matter, and in turn, the most appropriate BISAC code.
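The two steps above (learn word patterns from books that already carry a code, then score a new text against them) can be sketched as follows. This is an illustration under stated assumptions, not a production design: it uses a small supervised naive Bayes classifier over word counts rather than a full topic model of the kind Blei describes, the class name `BisacSuggester` is hypothetical, and the two one-sentence training texts are invented stand-ins for a large labeled corpus. The two BISAC codes are the examples cited earlier in this essay.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BisacSuggester:
    """Hypothetical helper: multinomial naive Bayes over word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # code -> word frequencies
        self.doc_counts = Counter()              # code -> training texts seen
        self.vocabulary = set()

    def train(self, code, text):
        """Record the words of one text already labeled with a code."""
        tokens = tokenize(text)
        self.word_counts[code].update(tokens)
        self.doc_counts[code] += 1
        self.vocabulary.update(tokens)

    def scores(self, text):
        """Return a log-probability score for every known code."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for code, counts in self.word_counts.items():
            total_words = sum(counts.values())
            log_prob = math.log(self.doc_counts[code] / total_docs)
            for token in tokens:
                # Add-one smoothing so an unseen word does not rule out a code.
                log_prob += math.log((counts[token] + 1) / (total_words + vocab_size))
            result[code] = log_prob
        return result

    def suggest(self, text):
        """Return the code with the highest score."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Invented training sentences; a real system would train on whole books.
suggester = BisacSuggester()
suggester.train("CRA044000", "Thread the needle and follow the cross stitch pattern on the fabric.")
suggester.train("BIO000000", "She was born in a small town and her early life shaped her career.")
best = suggester.suggest("A pattern for stitching with coloured thread on fabric")
```

With such a tiny corpus the result is only indicative; the point is that the same rules are applied to every text, which is the consistency argument made here.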

By placing this task in the hands of computers, not only would the process become extremely expedient, but the consistency and impartiality with which BISAC codes are selected would also be drastically increased. Without human interference, BISAC codes would be applied solely based on the content of a book, and the biases, interpretations, and marketing ploys of publishers would be removed from the process entirely.

As an added bonus, with the right tools, the topic modeling software could be linked directly to the ONIX metadata for each book, feeding its selection directly into the database. Each year when the list of BISAC codes is updated, the software could automatically re-process the text and update the BISAC codes when necessary or appropriate. Currently, because of the manual process used to select BISAC codes, even as the list of codes is updated, the BISAC codes assigned to books are never updated. Making BISAC code selection an automatic computer task would keep ONIX genre metadata up-to-date and consistent, and would prevent books assigned a now outdated or discontinued BISAC code from falling off the radar or being excluded from search results or retailer sorting algorithms (e.g. Amazon’s recommendations or subject/genre categorizations) that depend on or factor in BISAC codes. Topic modeling and its application to BISAC code selection is an obvious fix to a system that so clearly is not functioning.
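A rough sketch of that annual refresh might look like the following. Several things here are assumptions rather than facts from the sources: the element names follow the ONIX 3.0 subject composite as I understand it, the scheme identifier value "10" is assumed to denote BISAC subject headings, the record layout and function names are hypothetical, and "OLD000000" is an invented placeholder for a discontinued code.

```python
import xml.etree.ElementTree as ET

# Assumption: "10" is the ONIX subject-scheme value for BISAC Subject Headings.
BISAC_SCHEME_ID = "10"


def subject_element(code, heading):
    """Build an ONIX-style Subject element for one BISAC code."""
    subject = ET.Element("Subject")
    ET.SubElement(subject, "SubjectSchemeIdentifier").text = BISAC_SCHEME_ID
    ET.SubElement(subject, "SubjectCode").text = code
    ET.SubElement(subject, "SubjectHeadingText").text = heading
    return subject


def refresh_code(record, current_list, suggest):
    """Re-run the classifier only when the stored code has left the list."""
    if record["bisac"] not in current_list:
        record["bisac"] = suggest(record["text"])
    return record


# The two codes below are the examples quoted earlier in this essay.
current_list = {
    "BIO000000": "BIOGRAPHY & AUTOBIOGRAPHY / General",
    "CRA044000": "CRAFTS & HOBBIES / Needlework / Cross-Stitch",
}

# Hypothetical stored record carrying an invented, no-longer-listed code.
record = {"bisac": "OLD000000", "text": "cross stitch patterns for beginners"}

# A stand-in for the topic modeling software; it always answers CRA044000.
record = refresh_code(record, current_list, suggest=lambda text: "CRA044000")

xml_text = ET.tostring(
    subject_element(record["bisac"], current_list[record["bisac"]]),
    encoding="unicode",
)
```

Records whose code is still on the current list pass through unchanged, so the refresh only touches books that would otherwise drop out of retailer categorizations.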

With publishers everywhere clamoring about the volume of books flooding the market and the accompanying issue of discoverability, metadata (which includes BISAC codes) and the importance of its accuracy have risen to the forefront of the conversation [11]. Laura Dawson, a book industry veteran and Product Manager, Identifiers at Bowker, explicitly states, “The publisher (and retailer) with the best, most complete metadata offers the greatest chance for consumers to buy books. The publisher with poor metadata risks poor sales—because no one can find those books” [12].

And yet, even with this rise in concern and understanding of the importance of metadata, broken systems such as the human-based selection of BISAC codes still persist within the industry. Given the above discussion of the importance and omnipresence of genre in the purchasing decisions of buyers, and the known existence of topic modeling software, which offers publishers a clear path toward accurate metadata, the question is raised: when will publishers stop talking about their problems and actually start solving them?

References
[1] https://www.bisg.org/tutorial-and-faq#General
[2] https://www.bisg.org/publications/best-practices-product-metadata
[3] Ibid.
[4] Ibid.
[5] Ibid.
[6] http://en.wikipedia.org/wiki/Natural_language_processing
[7] https://www.cs.princeton.edu/~blei/papers/Blei2012.pdf
[8] Ibid.
[9] Ibid.
[10] Ibid.
[11] http://toc.oreilly.com/2010/06/sifting-through-all-these-book.html
[12] http://book.pressbooks.com/chapter/metadata-laura-dawson