Optimizing Algorithms for the Publishing Industry
Maintaining technological relevance and the competitive edge that follows from it are two of the publishing industry’s greatest challenges, regardless of medium. The publishing landscape is changing at an unprecedented pace, and this has resulted in an endless flood of theories, apps, programs, and designs, all purporting to hold the key to success. One example of this trend is the idea of the computerized author and publisher, which has recently begun to pick up steam. Increasingly, advances in machine learning are seen as heralding a new future, one in which human writing and publishing are obsolete. Such perspectives, however, both overreact to and overreach the facts at hand. By framing the question as one of machine versus human, the publishing world risks missing the algorithms’ full potential and yet another chance for improvement. Instead of seeing such algorithms as a means of replacement, the industry must see them as a means of advancement and increased efficiency. If publishers can recognize these algorithms for the tools that they are, they will be able to streamline the publishing process like never before. In this essay, I will therefore explore the ways in which implementing such algorithms can modernize the pre-publishing tasks of topic exploration, market research, and acquisitions.
In the last decade, numerous organizations and businesses have explored the potential for algorithms within the publishing sphere, each trying in various ways to create “a method and apparatus for automated authoring and marketing” (Abrahams). In most instances, these technologies are proposed as an alternative to human labour and the costs and time associated with it. Take, for example, Philip M. Parker, a “chair professor of Management Science at INSEAD” (Abrahams) and head of Icon Group International, a publishing and tech company. In the last ten years, Parker has “written more than one million titles” (McGuinness), all thanks to an algorithm that he created. The program itself is extremely simple and boasts a completion time of twenty minutes per book. Working from large linguistic databases, the algorithm is designed to respond to a human-entered topic. Parker feeds the algorithm “a recipe for writing a particular genre […]. The computer [then] uses the recipe to select data from the database and write and format it into a book” (Abrahams). The entire process costs about twenty-three cents, and the finished book can quickly be sold through Amazon or printed on demand when ordered.
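To make the “recipe” idea concrete, the following is a minimal sketch of recipe-driven text assembly. It is not Parker’s actual system, whose internals the sources do not describe. The names `DATABASE`, `REFERENCE_RECIPE`, and `write_book` are hypothetical, and the stored facts are placeholder values chosen only for illustration.

```python
from string import Template

# Hypothetical stand-in for the large databases described above.
# The stored values are illustrative placeholders, not real market data.
DATABASE = {
    "royal jelly supplements": {
        "definition": "a dietary product made from a secretion of honey bees",
        "market": "health-food retailers",
    },
}

# Hypothetical "recipe" for one genre: an ordered list of
# (section heading, sentence template) pairs.
REFERENCE_RECIPE = [
    ("Introduction", Template("This book surveys $topic, $definition.")),
    ("Market", Template("The main sales channel for $topic is $market.")),
]


def write_book(topic, recipe, database):
    """Fill each section template with the data stored for the topic."""
    facts = database[topic]
    sections = []
    for heading, template in recipe:
        body = template.substitute(topic=topic, **facts)
        sections.append(f"{heading}\n{body}")
    return "\n\n".join(sections)


book = write_book("royal jelly supplements", REFERENCE_RECIPE, DATABASE)
print(book)
```

The human contribution in this sketch is limited to choosing the topic and designing the recipe, which mirrors the division of labour described in the passage above.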
Given their speed and efficiency, machine-run algorithms like this are being hailed as the next wave of improvement for publishing industries across the globe. In some circles, it is even argued that they will eliminate the need for many human positions, including those of “authors, editors, graphic artists, data analysts, translators, distributors and marketing personnel” (Abrahams). The problem with this idea, however, is that such a move would erase 80% of the industry and its employees, which hardly makes it a positive option. Beyond this, the technology itself is not yet advanced enough to match or better the results of human labour, particularly when it comes to writing and production in the creative field. By restricting algorithms to the replacement of human content creation, publishers run the risk of overestimating and misusing the technology. Though Parker is able to produce endless streams of books, for example, they are often encyclopedic in nature, exploring bland subjects like wax, sour red cabbage pickles, and royal jelly supplements (Abrahams).
Given these restrictions, it is highly unlikely that such algorithms will replace the human author or publisher in the near future, but that does not mean algorithms have no role in the industry. Rather, these platforms and programs remain vital to the advancement of publishing, if in a more combinatorial manner. Instead of framing the situation as human or machine, publishers need to recognize algorithms as tools and adopt hybrid models of automation.
Although the products of companies such as Parker’s Icon Group International, or the similarly structured Nimble Books by Zimmerman, are subpar at best, their processes are not without value. Much like Parker, Zimmerman employs an algorithm that is able to “[search] a corpus of content and [select] articles that match” (Woods) a given topic, before organizing and formatting the contents into a book. These algorithms are able to search multiple online databases for a given topic or theme, analyzing and organizing vast amounts of information in minutes. This alone has great implications for the publishing process. Instead of looking to package these findings in a passable book format, publishers need to recognize algorithms as “accelerat[ing] and enhanc[ing] the traditional process” (Woods) of market and topic research. With more and more human knowledge and history being placed online, having programs in place that can quickly explore these channels will allow publishers to better understand a potential book or topic’s place in the market, its audience, and its standing in the public eye. Algorithms that consolidate the most relevant and important information in one place are essentially offering simplified, convenient, and “highly applicable market research within minutes, for just pennies” (Conner).
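The search-and-select step can be sketched in a few lines. This is an illustrative simplification under my own assumptions, not the method used by Nimble Books or Icon Group: it ranks articles by the share of their words that match the topic, and the function names and the three-article corpus are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score(article, topic_terms):
    """Share of the article's tokens that are topic terms."""
    tokens = tokenize(article)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[term] for term in topic_terms)
    return hits / len(tokens)


def select_articles(corpus, topic, limit=2):
    """Return the titles of the best-matching articles, best first."""
    topic_terms = set(tokenize(topic))
    ranked = sorted(
        ((score(text, topic_terms), title) for title, text in corpus.items()),
        reverse=True,
    )
    # Articles with no matching terms are dropped entirely.
    return [title for s, title in ranked[:limit] if s > 0]


# Hypothetical miniature corpus, for illustration only.
corpus = {
    "Beekeeping basics": "Bees produce honey and royal jelly in the hive.",
    "Pickling at home": "Red cabbage pickles need vinegar, salt, and time.",
    "Candle making": "Wax candles are poured, cooled, and trimmed.",
}
print(select_articles(corpus, "royal jelly"))  # ['Beekeeping basics']
```

A production tool would need far richer matching than raw word overlap, but even this sketch shows why the same machinery suits market and topic research: the ranked list is useful on its own, whether or not it is ever packaged as a book.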
This in turn can give publishers a better sense of which users, demographics, and groups a certain book attracts, ultimately allowing for “more precision” (Conner) in audience targeting. The same can be said for topic research, with algorithmic investigations working to identify popular or oversaturated genres and comparable titles. Access to such databases would help ensure that a potential book was “genuinely unique, […] [and] potentially patentable” (Conner), consequently reducing some copyright and financial risks. Authors and publishers alike would thus be able to see how certain styles or genres of books were received, giving them the opportunity to tailor their own products and marketing for success.
Beyond looking at the historical placement and reception of texts to help decide acquisitions, algorithm-run platforms can also improve the publishing process via statistical stylometry. Statistical stylometry is the “statistical analysis of variations in literary style between one writer or genre [and another]” (Stony Brook), and it can be used to predict commercial and critical success. In 2013, Professor Yejin Choi from Stony Brook University unveiled a study in which an algorithm was used to analyze 800 books from Project Gutenberg, a platform which “houses 42,000 books that are available for free download” (Stony Brook). The study was one of the first to try to provide quantitative insights into the relationship between book success and writing style. It looked at “1,000 sentences from the beginning of each book […] [and] performed systematic analyses based on lexical and syntactic features” (Stony Brook), before comparing those statistics to the number of downloads. The study uncovered numerous interesting trends and correlations: less successful books, for instance, tend to be “characterized by a higher percentage of verbs, adverbs, and foreign words” (Stony Brook), while successful books make more frequent use of discourse connectives.
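One of the lexical features mentioned above, the use of discourse connectives, is simple enough to sketch. The code below is my own illustration and not the study’s pipeline: the `CONNECTIVES` list is a small hypothetical sample, the sentence splitter is deliberately naive, and features such as the percentage of verbs or adverbs are omitted because they would require a part-of-speech tagger.

```python
import re

# Illustrative sample only; this is not the feature set used in the study.
CONNECTIVES = {"and", "but", "because", "however", "therefore", "although", "since"}


def first_sentences(text, limit=1000):
    """Split on sentence-ending punctuation and keep the first `limit` sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return sentences[:limit]


def connective_rate(text, limit=1000):
    """Fraction of word tokens in the sampled sentences that are connectives."""
    words = []
    for sentence in first_sentences(text, limit):
        words.extend(re.findall(r"[a-z']+", sentence.lower()))
    if not words:
        return 0.0
    return sum(word in CONNECTIVES for word in words) / len(words)


sample = "She left because it rained. However, he stayed. The dog slept."
print(first_sentences(sample, limit=2))
print(round(connective_rate(sample), 3))  # 0.182
```

Computed over the opening sentences of each book, a rate like this becomes one column in a table of features, which can then be set against a measure of success such as download counts.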
For the purposes of this essay, however, the finding I am focusing on is the algorithm’s ability to correctly predict the success of a book from its writing patterns. At the conclusion of the study, Choi determined that the algorithm was “effective in distinguishing highly successful literature from its less successful counterpart, achieving accuracy rates as high as 84%” (Stony Brook). Though not completely accurate, an algorithm this proficient could be used to further improve the acquisitions process. If used as a tool, this program could help publishers determine which books have a higher probability of success and, consequently, which books to take on. This could also help first-time authors get published, since presses would be more likely to take on the risk of an unknown writer if given statistical evidence of a profitable return. Predicting the success of literary works has always posed a “massive dilemma for publishers” (Conner), and algorithms such as this one present an opportunity to improve this process, taking some of the guesswork out of acquisitions in a way that was previously impossible.
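It is worth being clear about what an accuracy rate measures, since it determines how much weight an acquisitions team should give such a prediction. Accuracy is the share of books whose predicted label matches the observed outcome. The short sketch below shows the calculation; the function name and the five labels are hypothetical and do not come from the study.

```python
def accuracy(predicted, actual):
    """Share of books whose predicted label matches the observed outcome."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one book is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for five books: True means "successful".
predicted = [True, True, False, True, False]
actual = [True, False, False, True, False]
print(accuracy(predicted, actual))  # 0.8
```

By the same arithmetic, an accuracy of 84% means that roughly one prediction in six is wrong, which is why such a score is better treated as one input to an editor’s judgment than as a verdict.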
Although algorithmic programs present a wealth of opportunities for the publishing world, publishers should be careful in how they decide to implement such technology. Rather than introducing algorithms as a means of replacing human labour and content production, publishers should use them as tools for improving and advancing the systems currently in place. Given their talent for quickly searching and consolidating massive amounts of online information, algorithms have the ability to modernize the processes of research and acquisitions, ultimately cutting costs and time while improving accuracy. This in turn presents an alternative future for publishing, one in which human creation is accelerated and refined by the machine, rather than erased.
Abrahams, Marc. “How to Write 85,000 Books.” Annals of Improbable Research. 2008. www.neatorama.com/2010/10/05/how-to-write-85000-books/
Conner, Cheryl. “Could your next book be written by a machine?” Forbes. August 23, 2012. www.forbes.com/sites/cherylsnappconner/2012/08/23/could-your-next-book-be-written-by-a-machine/
McGuinness, Ross. “Meet the robots writing your news articles: the rise of automated journalism.” Metro. July 10, 2014. www.metro.co.uk/2014/07/10/meet-the-robots-writing-your-news-articles-the-rise-of-automated-journalism-4792284/
Stony Brook University. “Some elements of writing style differentiate successful fiction.” Science Daily. January 6, 2014. www.sciencedaily.com/releases/2014/01/140106094151.htm
Woods, Dan. “How algorithmically created content will transform publishing.” Forbes. August 13, 2012. www.forbes.com/site/danwoods/2012/08/13/how-algorithmically-created-content-will-transform-publishing/