The world of publishing is a game of chance. This game is played no matter whether you are a writer vying for publication, a publisher taking a gamble on a writer’s work, a bookstore ordering copies of this writer’s work to put on display or a reader taking a chance on the published work of a new or at least previously unfamiliar writer. Life, like publishing, is also a game of chance and some gambles have better odds than others. At times, depending on the nature of the gamble, you can foresee the outcome of a decision or an action of your choosing. But not always.
Part of the beauty and the angst of this thing called life is, in fact, the mystery. The unknown. The chances. Delving into the unknown shapes us. It marks and it mars us in business and in our everyday lives. And publishing, like life, carries as much risk as a high-stakes game of poker. In the realm of publishing there is more at stake than the individual. Placed in a precarious position and at risk in the gamble are the jobs of the editors at the publishing house, the reputation of the writer, the success of the bookstore and the wooing of the reader.
Because of all those whose well-being stands on the edge, choosing a work that will be successful in the marketplace is of primary importance. Money makes the world go round. Therefore, there must be a more efficient way to get rid of the “duds” and maximize the success of what is sadly becoming a (somewhat) dwindling market: the book.
What could this solution possibly be? Well, some very smart math people figured out an algorithm to determine what qualities the most popular titles have in common.
There are formulas to determine chemical reactions, formulas to determine area and perimeter, and algorithms to determine the probability of an infinite number of things. But this time, math has gone too far. The world of writing and literature is not a game of numbers, and I very much dislike the fact that they are bringing numbers into this sacred realm. Sure, there are people who crunch the numbers to determine how much of an advance a writer should get, how much a book should cost based on the number of pages it occupies, and whether the book should appear in hardcover or paperback or both. I’m not in denial of the importance of math, numbers or money. But an algorithm to determine the potential success of a book? Seriously?
According to gizmodo.com, in a recent article titled “Will Your Novel Be A Best Seller? Ask this Super Accurate Algorithm,” this algorithm can determine the potential success of a book with an 84% accuracy rate. Researchers from Stony Brook University have been working to develop a system called “Statistical Stylometry” which can mathematically examine words and grammar in books. Utilizing Project Gutenberg, these researchers have tested their algorithm on literary works of the past that went on to find success. If you want to read a little more in-depth about the study, see the abstract of the paper titled “Success with Style: Using Writing Style to Predict the Success of Novels.”
While I began this post with some philosophizing about the risks of life, and while I do in fact respect the ingenuity of these researchers in developing this algorithm and their interest in literature in general, I still take issue with this situation. My complaints are threefold. First, I find the use of math to determine the success of literature to be rather insulting. A mathematical equation to explain the ebbs and flows of language is ridiculous and limiting. Second, from what I’ve read of the study, they do not take into account the changes that take place in popular language over time. And third, they do not take into account the importance of those failed works in the grand scheme of things.
According to the study, “there are potentially many influencing factors, some of which concern the intrinsic content and quality of the book, such as interestingness, novelty, style of writing, and engaging storyline, but external factors such as social context and even luck can play a role” (http://aclweb.org/anthology/D/D13/D13-1181.pdf). This is all true and from the intro, one can assume that their intent is to reduce the rate of rejection for works that are diamonds in the rough, meaning: those works that initially faced numerous rounds of rejection before being picked up and then subsequently becoming best sellers.
So, let’s delve into a couple of the factors that the algorithm takes into consideration and what the researchers concluded from their results. Their sample of each text works as follows: the researchers pulled the first 1000 sentences from each text and used various markers, such as part-of-speech tags, to determine the voice/style of the writer. This is a good sample and a good idea, but they don’t seem to take into account the different styles of speech and writing of previous time periods. Cadences and sentence structure vary not only from writer to writer, but also across time periods and locations. They may address this issue further in the study, but the abstract did not give any indication of how this was handled.
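For the curious, here is roughly what “pull the first 1000 sentences” could look like in Python. This is only a minimal sketch of my own and not the researchers’ code: the function name `first_sentences` is made up for illustration, and the sentence splitter is a naive one that simply breaks after a period, exclamation point or question mark.

```python
import re

def first_sentences(text, limit=1000):
    """Return up to `limit` sentences from the start of `text`.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    Abbreviations such as "Mr." will fool this naive splitter.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s][:limit]

# Tiny demonstration with a limit of 3 instead of 1000.
sample = first_sentences(
    "Call me Ishmael. Some years ago I went to sea! Why? No reason.",
    limit=3,
)
print(sample)
```

A real pipeline would need a proper sentence tokenizer, and dialogue-heavy or older prose would likely make that step harder.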
Some characteristics of successful versus unsuccessful works according to the study are as follows: “prepositions, nouns, pronouns, determiners and adjectives are predictive of highly successful books whereas less successful books are characterized by higher percentage of verbs, adverbs, and foreign words” (http://aclweb.org/anthology/D/D13/D13-1181.pdf). Thus, descriptiveness in a work is important. Creating a scene that the reader can taste, touch and feel is a big part of the writing bit. I can get behind this. Additionally, “more successful books use discourse connectives and prepositions more frequently, while less successful books rely more on topical words that could be almost cliché, e.g., “love”, typical locations, and involve more extreme (e.g., “breathless”) and negative words (e.g., “risk”)” (http://aclweb.org/anthology/D/D13/D13-1181.pdf). Thus, according to this algorithm, works whose language falls into a certain range did not and will not achieve success in the marketplace.
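To make the idea of counting word types a little more concrete, here is a toy Python sketch of my own. It is not the study’s method: the word lists in `CATEGORIES` are tiny and hypothetical, and the function name `category_shares` is invented for illustration. The researchers relied on real part-of-speech tagging, whereas this just measures what share of a passage’s words appear in each hand-made list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists standing in for a real
# part-of-speech tagger. They are assumptions for illustration only.
CATEGORIES = {
    "preposition": {"in", "on", "at", "of", "to", "with", "from", "by"},
    "determiner": {"the", "a", "an", "this", "that"},
    "connective": {"and", "but", "because", "however", "although", "so"},
}

def category_shares(text):
    """Return the fraction of tokens in `text` that fall in each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    total = len(tokens)
    return {name: (counts[name] / total if total else 0.0) for name in CATEGORIES}

shares = category_shares("The cat sat on the mat because it was warm.")
print(shares)
```

Numbers like these would then be compared across many books, which is exactly where my worry about shifting language over time comes in.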
But then the study makes a counter-intuitive point. “Successful books tend to bear closer resemblance to informative articles” (http://aclweb.org/anthology/D/D13/D13-1181.pdf). They don’t elaborate on why this may be so. I hypothesize that this could in fact be true in their study because informative/journalistic articles are associated with a particular grade level of reading. The easier the reading level, the more mass appeal a book will have. This is certainly not an indicator that it is actually worthy of best-seller status, in my opinion. There are numerous poorly written books that became best sellers and that, from a literary perspective (and yes, I am a bit of a literary snob), do not actually deserve the recognition that they receive.
These bad titles and these unexpected surprises are important, and you can see this in the way that literary studies has been working to change what it preserves and how it educates its students. In the past, literary studies focused on the big authors, the writers who are the quintessential examples of popularity in their times. But then scholars began to turn their heads to the smaller writers, those who may or may not have been prolific. Their stories have value too. They teach us about the time period, they teach us about the everyday writer. They teach us about the different avenues that writing can take. All of these factors are important, and without these works a whole history would have been lost to us.
Computers and mathematical equations do a plethora of amazing things. But an equation to determine the success of a book is just not something I can accept. Literature is about art, about teaching and delighting (to use the words of the ancients), and while the publishing world, for those who work in it, is primarily about making money, it will be a very sorry day for all of us if something like this algorithm ever actually becomes a part of their business practice and is used as anything other than entertainment.