Digging for Gold: Reader Analytics and Data Mining in Manuscripts

As a publisher, if I had an all-access pass to book data, I would concentrate on my authors, their writing, and my editorial team. I’m not talking about producing blockbuster after blockbuster, but simply having more hits than misses. Besides, people only read so many books a year, which means the number of blockbusters is finite. If I only wanted to produce blockbusters, I’d be putting out two or three books a year and somehow facing a drastically reduced field of competition. No, I don’t need to sell a million copies of my author’s latest work (although that would be nice), but I do want to give their book the best possible chance to make it. How would I do this? By using reader analytics and data mining, of course. Other publishers have already acknowledged the advantages.

A perfected Jellybooks would be my tool of choice. Being able to pinpoint where a reader struggles or stops reading would benefit both the editor and the author. If the majority of readers are calling it quits after chapter three, then some changes need to be made in the writing. My editor knows this book is a winner since the ending is spectacular, reflective, and thought-provoking, except no one is going to know that unless they get to the end! If the book lulls and you lose your audience (who are far less trained than my editors and their gut to recognize real talent and art, the je ne sais quoi of good writing), then it doesn’t matter how much potential the book has. Maybe all it will take is a little tweak to keep readers hooked.
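
To make the "calling it quits after chapter three" idea concrete, here is a minimal Python sketch of a drop-off check. It assumes the only data available is the last chapter each test reader reached; that input format, the function names, and the sample numbers are all invented for illustration and are not Jellybooks' actual export or method.

```python
from collections import Counter


def completion_curve(last_chapter_read, total_chapters):
    """Share of readers who reached each chapter (index 0 = chapter 1).

    Assumes every reader started chapter 1 and read the chapters in order.
    """
    if not last_chapter_read:
        raise ValueError("need at least one reader")
    n_readers = len(last_chapter_read)
    stopped_at = Counter(last_chapter_read)
    curve = []
    still_reading = n_readers
    for chapter in range(1, total_chapters + 1):
        curve.append(still_reading / n_readers)
        # Readers whose last chapter was this one never reach the next.
        still_reading -= stopped_at[chapter]
    return curve


def biggest_drop(curve):
    """Return (chapter, share_lost): the chapter after which most readers quit."""
    drops = [curve[i] - curve[i + 1] for i in range(len(curve) - 1)]
    worst = max(range(len(drops)), key=drops.__getitem__)
    return worst + 1, drops[worst]


# Invented sample: the last chapter each of ten test readers reached
# in a ten-chapter manuscript.
sample = [3, 3, 3, 10, 3, 5, 10, 3, 2, 10]
curve = completion_curve(sample, total_chapters=10)
chapter, share_lost = biggest_drop(curve)
print(f"Biggest drop-off: after chapter {chapter} ({share_lost:.0%} of readers)")
```

A check like this only tells the editor where readers leave, not why; deciding what to change in that chapter is still editorial work.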

Wouldn’t the authors have a problem with this? Sharing their precious baby before it’s ready for the cold world, when it still needs some time to incubate with their editor? Yes, writers are sensitive, and having their work picked apart by a bunch of strangers certainly doesn’t seem appealing; opinions on beta reading are mixed. I would encourage them to reconsider and to look at it as an investment in beta testing: although it may be painful, it would at least give their book the best chance it could get before being released to the real cold world. Wouldn’t they appreciate a test flop before a real flop? At least they would have the time to go back and tweak their manuscript some more.

Plus, research suggests there are only six basic emotional arcs of storytelling, and by data mining the manuscripts my editors could make sure that they keep on track with patterns readers are familiar with. Of course, this doesn’t mean the stories can’t break rules, and it’s possible to build complex arcs by using the basic building blocks in sequence to create something unique. If my editors are able to catch an unexpected dip or spike in an already established arc, then it would be easier for them to home in on the problem area and adjust it accordingly. Data mining manuscripts offers editors a map to the potential problem areas, and the chance to dig in and use their editorial training to adjust these segments. Generally, a good editor would be able to find these problem areas and lulls regardless, but an algorithm speeds up the process and leaves more time for workshopping the section.
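
As a rough illustration of what "catching a dip or spike" could look like, the Python sketch below scores equal-sized segments of a manuscript with a tiny sentiment word list and flags any segment that departs sharply from its neighbours. Everything in it is an assumption made for the example: the word lists are toy placeholders, the function names are hypothetical, and the neighbour-comparison rule is a simple stand-in rather than the method used in any published emotional-arc research.

```python
import re
from statistics import mean, pstdev

# Toy lexicon, invented for illustration; a real analysis would need a
# much larger, validated sentiment word list.
POSITIVE = {"love", "joy", "hope", "happy", "win", "laugh", "bright"}
NEGATIVE = {"death", "fear", "loss", "sad", "pain", "dark", "cry"}


def segment_scores(text, n_segments):
    """Split the text into n roughly equal word chunks and score each as
    (positive words - negative words) / words in the chunk."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    scores = []
    for i in range(n_segments):
        chunk = words[i * total // n_segments:(i + 1) * total // n_segments]
        if not chunk:
            scores.append(0.0)  # text shorter than the number of segments
            continue
        hits = sum((w in POSITIVE) - (w in NEGATIVE) for w in chunk)
        scores.append(hits / len(chunk))
    return scores


def flag_outliers(scores, threshold=1.5):
    """Indices of interior segments whose score differs from the average of
    their two neighbours by more than `threshold` standard deviations."""
    spread = pstdev(scores)
    if spread == 0:
        return []
    flagged = []
    for i in range(1, len(scores) - 1):
        neighbours = mean([scores[i - 1], scores[i + 1]])
        if abs(scores[i] - neighbours) > threshold * spread:
            flagged.append(i)
    return flagged
```

Calling `flag_outliers(segment_scores(manuscript_text, 20))` would return the positions of segments worth a closer look. A flagged segment is not necessarily a flaw; a deliberate shock in the plot would be flagged too, which is where the editor's judgment comes in.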

Data mining manuscripts and using reader analytics isn’t about removing the human element from editorial work; quite the contrary. Reader analytics is the study of how people actually read, while data mining manuscripts simply expedites the grunt work editors would have to go through regardless. Editors can use these tools to streamline their work on the manuscript and combine the results with their gut instincts and human experience to help a book reach its full potential.

2 Replies to “Digging for Gold: Reader Analytics and Data Mining in Manuscripts”

  1. Hi Jaiden,
    Thanks for your response. I liked how you focused on using data to essentially help the author. I think a perfected Jellybooks would be a great way to get the author feedback on their work. You’re definitely right that data mining and the human element aren’t (and shouldn’t be) opposed to one another.

  2. I like how targeted this response is on reader data and the ways that it could help a publisher. I was left wanting to know two more things about your thinking that, if included, would have made this post stronger. The first is the ways in which you think Jellybooks data could be improved. You say you’d want a “perfected Jellybooks,” but don’t say what that looks like. The second is a better sense of how this data would really improve sales. You talk about efficiencies in the editorial process, but these seem like business efficiencies, not better choices and sales. Unless, as I suspect, you believe that good editing leads to better books, and better books sell better. I, perhaps cynically, am not so sure that is the case, so I wanted to see you make the argument for it.
