Digging for Gold: Reader Analytics and Data Mining in Manuscripts

As a publisher, if I had an all-access pass to book data, I would concentrate on my authors, their writing, and my editorial team. I’m not talking about producing blockbuster after blockbuster, but simply having more hits than misses. Besides, people only read so many books a year, which means the number of blockbusters is finite. If I only wanted to produce blockbusters, I’d be putting out two or three books a year and somehow facing a drastically reduced field of competition. No, I don’t need to sell a million copies of my author’s latest work (although that would be nice), but I do want to give their book the best possible chance to make it. How would I do this? By using reader analytics and data mining, of course. Other publishers have already acknowledged the advantages.

A perfected Jellybooks would be my tool of choice. Being able to pinpoint where a reader struggles or stops reading would be useful for both the editor and the author. If the majority of readers are calling it quits after chapter three, then some changes need to be made in the writing. My editor knows this book is a winner because the ending is spectacular, reflective, and thought-provoking, but no one is going to know that unless they get to the end! If the book lulls and you lose your audience (who are far less trained than my editors and their gut to recognize real talent and art, the je ne sais quoi of good writing), then it doesn’t matter how much potential the book has. Maybe all it will take is a little tweak to keep readers hooked.
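To make the "calling it quits after chapter three" idea concrete, here is a minimal sketch of how that kind of drop-off could be counted. It assumes the only data available is the last chapter each test reader reached; the function names and the sample numbers are hypothetical and are not taken from Jellybooks or any real analytics tool.

```python
from collections import Counter

def dropoff_by_chapter(last_chapter_read, total_chapters):
    """Share of all test readers who stopped at each chapter before the last one.

    last_chapter_read: one entry per reader, the final chapter they reached.
    A reader whose last chapter equals total_chapters counts as having finished.
    """
    readers = len(last_chapter_read)
    if readers == 0:
        return {}
    stops = Counter(last_chapter_read)
    return {ch: stops.get(ch, 0) / readers for ch in range(1, total_chapters)}

def worst_chapter(dropoff):
    """Chapter where the largest share of readers gave up."""
    return max(dropoff, key=dropoff.get)

# Hypothetical sample: eight test readers of a twelve-chapter manuscript.
sample = [3, 3, 3, 12, 5, 3, 12, 2]
dropoff = dropoff_by_chapter(sample, total_chapters=12)
print(worst_chapter(dropoff), dropoff[worst_chapter(dropoff)])
```

A result like this only tells the editor where to look. Deciding what is wrong with that chapter is still the editor’s job.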

Wouldn’t the authors have a problem with this? Sharing their precious baby before it’s ready for the cold world, when it still needs some time to incubate with their editor? Yes, writers are sensitive, and having their work picked apart by a bunch of strangers certainly doesn’t seem appealing. There are mixed opinions on beta reading, too. I would encourage them to reconsider and to look at it as an investment in beta testing. Although it may be painful, it would at least give their book the best possible chance before it is released to the real cold world. Wouldn’t they appreciate a test-flop before a real flop? At least they would have time to go back and tweak their manuscript some more.

Plus, researchers have identified only six basic emotional arcs of storytelling, and by data mining the manuscripts my editors could make sure they stay on track with patterns readers are familiar with. Of course, this doesn’t mean the stories can’t break rules, and it’s possible to build complex arcs by using the basic building blocks in sequence to create something unique. If my editors are able to catch an unexpected dip or spike in an already established arc, it becomes easier for them to home in on the problem area and adjust it accordingly. Data mining manuscripts offers editors a map to the potential problem areas, and the chance to dig in and use their editorial training to fix those segments. Generally, a good editor would be able to find these problem areas and lulls regardless, but an algorithm speeds up the process and leaves more time for workshopping the section.
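As a rough illustration of what "catching a dip or spike" might look like, the sketch below scores a manuscript in fixed-size windows of words and flags windows where the score jumps sharply. Everything here is an assumption made for the example: the tiny word lists, the window size, the threshold, and the function names. A real tool would use a far larger sentiment lexicon and smoother windows.

```python
import re

# Toy word lists, purely illustrative; not a real sentiment lexicon.
POSITIVE = {"joy", "love", "hope", "smile", "win", "happy"}
NEGATIVE = {"grief", "fear", "loss", "cry", "dark", "alone"}

def emotional_arc(text, window=50):
    """Average sentiment (-1 to 1) for each consecutive window of words."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [int(w in POSITIVE) - int(w in NEGATIVE) for w in words]
    arc = []
    for start in range(0, len(scores), window):
        chunk = scores[start:start + window]
        arc.append(sum(chunk) / len(chunk))
    return arc

def flag_jumps(arc, threshold=0.2):
    """Indexes of windows whose score differs sharply from the window before."""
    return [i for i in range(1, len(arc))
            if abs(arc[i] - arc[i - 1]) >= threshold]

# Hypothetical manuscript: an upbeat opening followed by a sudden bleak stretch.
manuscript = "hope joy love win " * 5 + "grief fear loss cry " * 5
arc = emotional_arc(manuscript, window=20)
print(arc, flag_jumps(arc, threshold=0.5))
```

A flagged window is not automatically a flaw. The jump may be exactly the twist the author intended, which is why the editor still makes the call.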

Data mining manuscripts and using reader analytics isn’t about removing the human element from editorial work. Quite the contrary. Reader analytics is the study of how people actually read, while data mining manuscripts simply expedites the grunt work editors would have to go through regardless. Editors can use these tools to streamline their work on a manuscript and combine the results with their gut instincts and human experience so that a book can reach its full potential.

Ah, Internet writing. What does one call thee?

What does it mean “to publish”? The Oxford English Dictionary defines it as making information available to the public. In “A Writing Revolution,” published in Seed Magazine, Denis Pelli and Charles Bigelow make claims about what publishing means today. Yes, what they consider contemporary publishing is supported with graphs and statistics, which convey that the Internet is making it ever easier for anyone to essentially publish (make things public); however, I’m not entirely on board with calling what they describe “publishing”.
