Another long wait at the airport is being filled with writing up some of the 10 pages of notes I scribbled while attending the WebNLG’16 workshop and the 9th International Natural Language Generation conference 2016 (INLG’16), which were held from 6 to 10 September in Edinburgh, Scotland.
There were two keynote speakers, Yejin Choi and Vera Demberg, several long and short presentations, and a bunch of posters and demos, all of which have full or short papers in the (soon to appear) ACL proceedings online. My impression was that, overall, the ‘hot’ topics were image-to-text, summarisation and simplification, followed by some question generation and statistical approaches to NLG.
The talk by Yejin Choi was about sketch-to-text, or rather pretty much anything-to-text, such as image captioning and recipe generation based on the ingredients, and one could even do it with sonnets. She used a range of techniques to achieve it, such as probabilistic CFGs and recurrent neural networks. Vera Demberg’s talk, on the other hand, was about psycholinguistics for NLG. It started from the ‘uniform information density hypothesis’ and looked at how high-surprisal words and grammatical errors affect a person reading the text. It appears that there’s more pupil jitter when there’s a grammar error. The talk then moved on to how one can model and predict information density, for which there are syntactic, semantic, and event surprisal models. For instance, with the semantic one, take ‘Peter felled a tree’: how predictable is ‘tree’, given that it’s already more or less entailed by the word ‘felled’? Some results were shown for the most likely fillers for, e.g., ‘serves’ as in ‘the waitress serves…’ and ‘the prisoner serves…’, which could then be used to find suitable word candidates in sentence generation.
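To make the surprisal idea concrete, here is a minimal sketch; it is not taken from the talk. It estimates the surprisal of a sentence-final object given the subject, as -log2 P(object | subject), from raw counts. The tiny corpus, the fixed "the SUBJECT serves the OBJECT" sentence shape, and the function name `object_surprisal` are all assumptions made up for illustration; real surprisal models are trained on large corpora and are far more sophisticated.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only. Every sentence has the
# shape "the SUBJECT serves the OBJECT".
corpus = [
    "the waitress serves the coffee",
    "the waitress serves the soup",
    "the waitress serves the coffee",
    "the prisoner serves the sentence",
    "the prisoner serves the sentence",
    "the prisoner serves the soup",
]

def object_surprisal(subject, obj, sentences):
    """Return the surprisal in bits of `obj` given `subject`.

    Surprisal is -log2 P(obj | subject), with the probability estimated
    by relative frequency. Unseen combinations get infinite surprisal
    (no smoothing in this sketch).
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if words[1] == subject:          # second word is the subject
            counts[words[-1]] += 1       # last word is the object
    total = sum(counts.values())
    if total == 0 or counts[obj] == 0:
        return math.inf
    return -math.log2(counts[obj] / total)

for subject in ("waitress", "prisoner"):
    for obj in ("coffee", "soup", "sentence"):
        print(subject, obj, object_surprisal(subject, obj, corpus))
```

In this toy setting, ‘coffee’ has low surprisal after ‘the waitress serves the’ and ‘sentence’ has low surprisal after ‘the prisoner serves the’, which is the kind of signal one could use to rank word candidates during generation.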
The best paper award went to “Towards generating colour terms for referents in photographs: prefer the expected or the unexpected?”, by Sina Zarrieß and David Schlangen. While the title might sound a bit obscure, the presentation was very clear. There is the colour spectrum, and people assign names to the colours, which one could map to RGB colour values for images. This is all well and good on the colour strip, but when a colour is put in the context of other colours and background knowledge, the colour term humans would use to describe that patch of an image isn’t always in line with the actual RGB colour. The authors approached the problem by viewing it as a multi-class classification problem and used a multi-layer perceptron with some top-down recalibration, and voilà, the software returns the intended colour, most of the time. (Knowing the name of the colour, one can then go on to try to automatically annotate images with text.)
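For contrast, the context-free ‘colour strip’ view can be sketched as a nearest-prototype lookup: name a patch after the colour term whose reference RGB value is closest. This is explicitly not the authors’ method (they used a multi-layer perceptron with recalibration); the prototype values and the function name `nearest_colour_term` below are assumptions chosen for illustration.

```python
# Hypothetical reference RGB values for a handful of colour terms.
PROTOTYPES = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}

def nearest_colour_term(rgb):
    """Return the colour term whose prototype is closest to `rgb`.

    Closeness is squared Euclidean distance in RGB space; the patch is
    named in isolation, with no knowledge of the object or its context.
    """
    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, PROTOTYPES[name]))
    return min(PROTOTYPES, key=squared_distance)

print(nearest_colour_term((250, 10, 10)))
print(nearest_colour_term((240, 240, 30)))
```

A lookup like this names each patch purely from its pixel values, which is exactly where it parts ways with human descriptions: a patch of grass in deep shadow may sit nearer to ‘black’ in RGB space while people would still call it green.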
As for the other plenary presentations, I did make notes of all of them, but will select only a few due to time limitations. The presentation by Advaith Siddharthan on summarisation of news stories for children was quite nice, as it needed three aspects together: summarising text (with NLG, not just repeating a few salient sentences), simplifying it with respect to children’s vocabulary, and editing out or rewording the harsh news bits. Another paper on summaries was presented by Sabita Acharya, which is likely to be relevant also to my student’s work on NLG for patient discharge notes. Sabita focussed on trying to get doctors’ notes and plans of care into a format understandable by a layperson, and used the UMLS in the process. A different topic was NLG for automatically describing graphs to blind people, with grade-appropriate lexicons (for 4th and 5th grade learners and for students). Kathy McCoy outlined how they were happy to recall their computer science classes on seeing that the problem could be solved with graph search, with its states, actions, and goals. They evaluated the generated text for the graphs, as many others did in their research, with crowdsourcing using Mechanical Turk. One other paper that is definitely on my post-conference reading list is the one about mereology and geographic entities for weather forecasts, which was presented by Rodrigo de Oliveira. For instance, ‘the south’ in a Scottish weather forecast refers to a different region than ‘the south’ in a forecast for the UK as a whole, and the task was how to generate the right term for the intended region.
My 1-minute lightning talk of Langa’s and my long paper went well (one other speaker of the same session even resentfully noted afterward that I got all the accolades of the session), as did the poster and demo session afterward. The contents of the paper on part-whole relations in isiZulu were introduced in a previous post, and you can click on the thumbnail on the right for a png version of the poster (which has less text than the blog post). Note that the poster highlights only three of the 11 part-whole relations discussed in the paper.
ENLG and INLG will merge and become a yearly INLG, there is a SIG for NLG (www.siggen.org), and one of the ‘challenges’ for this upcoming year will be on generating text from RDF triples.
Irrelevant for the average reader, I suppose, was that there were some 92 attendees, most of whom attended the social dinner where there was a ceilidh (Scottish traditional music by a band, with traditional dancing by the participants), where it was even possible to have many (traditional) couples for the couples dances. There was some overlap in attendees between CNL16 and INLG16, so while it was my first INLG it wasn’t all brand new, yet there were also new people to meet and network with. As a welcome surprise, it was even mostly dry and sunny during the conference days in the otherwise quite rainy Edinburgh.
(links TBA shortly—neither Google nor duckduckgo found their pdfs yet)
 Sina Zarrieß and David Schlangen. Towards generating colour terms for referents in photographs: prefer the expected or the unexpected? INLG’16. ACL, 246-255.
 Iain Macdonald and Advaith Siddharthan. Summarising news stories for children. INLG’16. ACL, 1-10.
 Sabita Acharya, Barbara Di Eugenio, Andrew D. Boyd, Karen Dunn Lopez, Richard Cameron, Gail M Keenan. Generating summaries of hospitalizations: A new metric to assess the complexity of medical terms and their definitions. INLG’16. ACL, 26-30.
 Joan Byamugisha, C. Maria Keet, Brian DeRenzi. Tense and aspect in Runyankore using a context-free grammar. INLG’16. ACL, 84-88.
 Priscilla Morales, Kathleen McCoy, and Sandra Carberry. Enabling text readability awareness during the micro planning phase of NLG applications. INLG’16. ACL, 121-131.
 Rodrigo de Oliveira, Somayajulu Sripada and Ehud Reiter. Absolute and relative properties in geographic referring expressions. INLG’16. ACL, 256-264.
 C. Maria Keet and Langa Khumalo. On the verbalization patterns of part-whole relations in isiZulu. INLG’16. ACL, 174-183.