Building a Poetrybot

[07-2018 / English / 1416 words]

Automated text generation is an intriguing topic. As a writer and reader, I find its implications both fascinating and terrifying. To me, it is very clear that the literary world will be completely disrupted within the next 20 or 30 years. However, this doesn’t mean that humans will no longer influence literary production. Their influence will just be very different.

Together with my friends Alexander Simunics and Emanuel Spurny, I built a system that generates short poetry based on visual input. Poetrybot© is written in Processing and Java and uses two Google APIs - check the code on GitHub. A simplified information flow looks like this: (1) a user-generated image is tagged by the Cloud Vision API, (2) the tag (e.g. ‘face’) is fed into the Custom Search API, (3) a web scraper searches the resulting URLs for HTML paragraph tags, (4) this ‘source text’ is cleaned and tokenized, and (5) a body of text of any length is generated in a stochastic process (a ‘Markov chain’).
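To make steps (3) and (4) a bit more concrete, here is a minimal sketch in Python. It is not the project's Processing/Java code, just an illustration using only the standard library. The class and function names (`ParagraphExtractor`, `extract_paragraphs`, `clean_and_tokenize`) are my own, and the cleaning rules (collapse whitespace, drop tokens without any letter or digit) are an assumption about what 'cleaned' might reasonably mean.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text that sits inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []   # finished paragraph strings
        self._pieces = []      # text pieces of the paragraph being read

    def _flush(self):
        self.paragraphs.append("".join(self._pieces))
        self._pieces = []
        self.in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self.in_paragraph:   # previous <p> was never closed
                self._flush()
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            self._flush()

    def handle_data(self, data):
        # Text inside nested tags such as <a> still belongs to the paragraph.
        if self.in_paragraph:
            self._pieces.append(data)


def extract_paragraphs(html):
    """Step (3): return the text of every <p> element in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def clean_and_tokenize(paragraphs):
    """Step (4): split on whitespace and drop tokens without letters or digits.

    Punctuation stays attached to the words (assumed cleaning rule).
    """
    tokens = []
    for text in paragraphs:
        for token in text.split():
            if any(ch.isalnum() for ch in token):
                tokens.append(token)
    return tokens


if __name__ == "__main__":
    page = "<h1>Menu</h1><p>The face is <a href='#'>a mirror</a>.</p>"
    print(clean_and_tokenize(extract_paragraphs(page)))
```

Leaving punctuation attached to the words is a deliberate choice in this sketch: the generator then reproduces commas and full stops without needing any grammar of its own.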

Markov text generators have been around since the 1980s, and they still work on the same simple concept: a memoryless probability distribution, in which the next word depends only on the current word (or the last few words) and not on anything that came before. This enables the automatic creation of sometimes stunningly good (and often outright silly) text pieces. When they are read in a creative context, these pieces can even seem like legitimate poetic work written in free verse. The examples below are actual works produced by poetrybot©, printed on my awesome Epson TM-T20II thermal printer. You can see what the poems look like and where their strengths and weaknesses lie.

In a Markov text generator, the source text matters a lot. The word combinations that occur in a source file mark the borders of the system. When you use the internet as a source, you get – to put it positively – a wide spectrum of results. Note that the examples above are not a representative sample of the output. You don’t see the countless advertising slogans, mindless lists of products, and dust-dry Wikipedia fragments. Over many sessions of playing around with the program and printing the results, the examples above were the ones that didn’t suck. Basically, these poems are the pride of internet creation. But it has to get better than ‘nose supports python3’, right?
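The idea that the source 'marks the borders' can be shown with a small word-level Markov generator. The following is a minimal Python sketch, not the project's Processing/Java code. The function names `build_chain` and `generate` and the `order` parameter are my own, and stopping at a dead end is an assumption made to keep the example short.

```python
import random
from collections import defaultdict


def build_chain(tokens, order=1):
    """Map each state (a tuple of `order` words) to every word that followed it.

    Followers are stored with repetition, so picking one uniformly at random
    reproduces the frequencies of the source text.
    """
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return dict(chain)


def generate(chain, length, rng=None):
    """Random walk through the chain, producing at most `length` words."""
    rng = rng or random.Random()
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state only occurred at the very end of the source.
            break
        word = rng.choice(followers)
        output.append(word)
        # Memoryless step: only the last `order` words are carried forward.
        state = state[1:] + (word,)
    return output[:length]


if __name__ == "__main__":
    source = "the dog sees the cat and the cat sees the dog and the bird".split()
    chain = build_chain(source)
    print(" ".join(generate(chain, 12)))
```

With `order=1`, every pair of neighbouring words in the output is a pair that also occurs somewhere in the source. Longer stretches, however, can be new, which is where the surprises come from.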

Using the internet as inspiration definitely has a certain charm. However, it is much more effective to use the work of real authors and poets as a starting point. By injecting more coherent forms of literature, I found that there is definitely room for improvement. Though Markov-chain-based systems are not actually ‘learning’, I like to imagine them as little children parroting the word choices of their parents. So what happens if a child is reared by authors like Mo Yan, William S. Burroughs, Chinua Achebe and Neal Stephenson in a collaborative (and monolingual) effort? I tried it out and fed their great books Red Sorghum, Naked Lunch, Things Fall Apart and Cryptonomicon into poetrybot© as txt files.

The result is a lot of strangeness like the following somewhat macabre sentence:

> It has 33 lost because their town all their foul-smelling urine. The dead very well, missing factor could look for cooking?

Another section allusively connects algebra with the divine:

> Randy pays respect for calculus, butOf course! He doesn't deny the aisle after it.

However, most of the sentences are primarily confusing:

> human beings again, this for in close eye out flutes sang this zeta function that yourself, said an ominous was tremendously startling events that cannot hold; Mere anarchy is sick Indian copper

as well as:

> Commander, I'm really work. Get these people a bona fide croaker, neither animal nor your way... You haven't forgotten the bush league short ones who sang in any arithmetic problem down.


> Commander Yu. Adjutant Ren smacked him alone. He even though nothing else knew who desired children left their stomach and candiru is some had yet vicious dog?

In case you are wondering what a ‘candiru’ is – that’s a term from Naked Lunch which William S. Burroughs will gladly explain to you:

> “the candiru is a small eel-like fish or worm about one-quarter inch through and two inches long patronizing certain rivers of ill repute in the Greater Amazon Basin, will dart up your prick or your asshole or a woman's cunt faute de mieux, and hold himself there by sharp spines with precisely what motives is not known since no one has stepped forward to observe the candiru's life-cycle in situ” (excerpt from Naked Lunch)

Then again, there are also lots of gripping sentences like:

> I blasted my dear friend. And Junk hung directly at night, because I read him no patience beyond an alienated staircase leading him later with license to thinking.

You can find the full 5,000-word output here.

What is stunning in this textual assemblage is its strange capacity to fabricate creative expressions. For example, the interesting word groupings ‘sick Indian copper’ or ‘bush league short ones’ are not contained in the source texts – the program made them up on its own. This quality of associative juxtaposition seems to me like a valuable tool for expressive poets. On the other hand, the equally interesting groupings ‘bona fide croaker’ and ‘alienated staircase’ are stolen from Burroughs and Stephenson. A Markov-chain-based approach to writing is essentially a form of making cut-ups on steroids. Based on my experience with contemporary poetry, I sometimes wonder who else has already had this idea.
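Whether a grouping was 'made up' or 'stolen' can be checked mechanically by comparing the n-grams of the output with those of the source. Here is a minimal Python sketch with hypothetical helper names (`ngrams`, `split_by_origin`). The toy source sentence is invented for illustration and is not taken from any of the four books.

```python
def ngrams(tokens, n):
    """All runs of n consecutive tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def split_by_origin(output_tokens, source_tokens, n):
    """Sort the output's n-grams into (copied, novel) relative to the source."""
    seen = set(ngrams(source_tokens, n))
    copied, novel = [], []
    for gram in ngrams(output_tokens, n):
        (copied if gram in seen else novel).append(gram)
    return copied, novel


if __name__ == "__main__":
    # Invented toy source: 'sick Indian' and 'Indian copper' both occur,
    # but never directly after one another.
    source = "a sick Indian summer and some Indian copper wire".split()
    output = "sick Indian copper wire".split()

    print(split_by_origin(output, source, 2))  # word pairs
    print(split_by_origin(output, source, 3))  # word triples
```

In this toy case every word pair of the output is copied, while the triple ('sick', 'Indian', 'copper') is new. A chain that only remembers one word can glue two source fragments together at a shared word, which is the cut-up effect in miniature.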

In a kind of literary Turing test, I generated several poems with poetrybot© and read them in a poetry course at university. I told everybody that I had written the poems myself. My professor and some fellow students repeatedly praised me for my ‘associative and complex’ free-verse style. My only adjustment to the automatically generated text was to introduce line breaks. This little modification is surprisingly important for readability and an overall ‘poetic’ appearance.

However, the potential of this form of text generation is clearly limited, because a key component is missing: meaning. An automatically generated short poem of a few lines can seem thoughtful, whereas thousands of lines from the same model look like incoherent rambling (think of a less sophisticated Ulysses). Unfortunately, a Markov-chain-based approach will never be able to generate cogent fiction. Real storytelling with a recognizable plot and character development is very hard to automate.

To me, the most promising approach to this problem lies in machine learning. Using a word-based LSTM model trained on great literature, I want to create a fictionbot capable of generating ultrashort stories of about 100-200 words with a very simple storyline. To make the model responsive, an upstream word-to-vector model vectorizes textual input and enables the creation of stories from ‘seed words’. Ideally, a workflow would look like this: (1) a human chooses a model trained on a source text to set the style and theme (e.g. 1984 for a dystopian feeling), (2) the human specifies some keywords to steer the story in the desired direction, (3) the keywords are vectorized and matched with fitting terms in the model (e.g. the keyword ‘banana’ increases the likelihood that other fruit-related terms appear), and (4) the finished story is generated from the trained model.
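Step (3) is the least obvious part of this workflow, so here is one way it could work, sketched in Python. Everything in it is an assumption: the fictionbot does not exist yet, the two-dimensional 'embeddings' are made-up toy numbers and not the output of a trained word-to-vector model, `base_probs` merely stands in for the next-word distribution an LSTM would produce, and `bias_distribution` with its `strength` parameter is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bias_distribution(base_probs, embeddings, keywords, strength=2.0):
    """Re-weight next-word probabilities towards words close to the keywords.

    Each word's probability is multiplied by exp(strength * similarity),
    where similarity is its best cosine match with any known keyword,
    and the result is normalised to sum to 1 again.
    """
    keyword_vectors = [embeddings[k] for k in keywords if k in embeddings]
    weighted = {}
    for word, prob in base_probs.items():
        sims = [cosine(embeddings[word], kv) for kv in keyword_vectors]
        weighted[word] = prob * math.exp(strength * max(sims, default=0.0))
    total = sum(weighted.values())
    return {word: value / total for word, value in weighted.items()}


if __name__ == "__main__":
    # Made-up toy vectors: first axis 'fruit-like', second axis 'state-like'.
    embeddings = {
        "banana": (1.0, 0.1),
        "apple": (0.9, 0.2),
        "war": (0.0, 1.0),
        "ministry": (0.1, 0.9),
    }
    # Stand-in for what a trained model might predict as the next word.
    base_probs = {"apple": 0.25, "war": 0.5, "ministry": 0.25}

    print(bias_distribution(base_probs, embeddings, ["banana"]))
```

With the keyword ‘banana’, the fruit-related word ‘apple’ gains probability at the expense of ‘war’ and ‘ministry’, while the stylistic model still decides which words are candidates in the first place.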

If such a model really works, we would be at the starting point of a major disruption of the literary world. Why use hundreds of thousands of words to tell a story when you can achieve a similar result with about a hundred well-selected keywords and a thematically and stylistically fitting model for text generation? In practice, such automated literary production could resemble present-day programming: a high-level language is used to outline the meaning of a program, and in a subsequent step this code is automatically compiled to make it executable.

Following this train of thought, a potential fictionbot would serve as a compiler for an array of modular semantic clusters, carefully arranged by a human author. Of course, such an endeavor would require extraordinary amounts of calibration and fine-tuning. It would also potentially raise the issue of plagiarism, because it necessarily relies on very large amounts of high-quality literary source text.