# Some explorations into book publishing logistics

Writing a book is only one part of the whole process of publishing a book. There’s the actual thing that eventually needs to get out into the wide world. Hard copy? E-book? Print-on-demand? All three or only a subset? Taking a step back: where are you as author located, where are the publisher and the printer, and where is the prospective audience? Is the prospective readership IT-savvy enough to even consider e-books as an option? Is the book’s content suitable for reading on devices with a gazillion different screen sizes? Here’s a brief digest of what came out of my analysis paralysis over the too many options, none of which has it all – not ever, it seems.

I’ve written about book publishing logistics and choices for my open textbook, but that is, well, a textbook. My new book, No Taming of the Enthusiast, is of a different genre and aimed at a broader audience. Also, I’m a little wiser on the practicalities of hard copy publishing. For instance, it took nearly 1.5 months for the College Publications-published textbook to arrive in Cape Town, having travelled all the way from Europe where the publisher and printer are located. Admittedly, these days aren’t the best days for international cargo, but such a delivery time is a bit too long for the average book buyer. I’ve tried buying books from other overseas retailers and book sellers over the past few years: same story. On top of that, in South Africa, you then have to go to the post office to pick up the parcel and pay a picking-up-the-parcel fee (or whatever the fee is for), in addition to the book’s cost and shipping fee. And it may get stuck in Customs limbo. This is not a good strategy if I want to reach South African readers. Also, it would be cool to get at least some books all the way onto the shelves of local book stores.

A local publisher then? That would be good for contributing my bit to stimulating the local economy as well. It has the hard copy logistics problem in reverse at least in part, however: how to get the books from so far down south to other places in the world where buyers may be located. Since the memoir is expected to have an international audience as well, some international distribution is a must. This requirement still gives three options: a multinational hard copy publisher that distributes to main cities with various shipping delays, print-on-demand (soft copy distributed, printed locally wherever it is bought), or e-book.

Let’s take the e-books detour for a short while. Uptake of e-books is low – some 20% at best – and there are lively subjective opinions on why people don’t like e-books. I prefer hard copies as well, but tolerate soft copies for work. Both are useful for different types of use: a hard copy for serious reading and a soft copy for skimming and searching, so as to save oneself endless flicking to look something up. The same is happening with my textbook, to some extent at least: people pay to have it nicely printed and bound even though they could do that with the pdf themselves or just read the pdf. As for other genres, some are better in print in any case, such as colourful cookbooks, but others should tolerate e-readers quite well, such as fiction when it’s just plain text.

After the tech tests, I read through the first few pages of one of the two epub e-books, and have abandoned it since. Although the epub file resized well, and I suppose that’s a pat on the back for the software developers, it looks ugly on the dual laptop/tablet and the smartphone I checked it with. It offers nowhere near the same neat affordances as a physical book. For the time being, I’ll buy an e-book only if there’s no option to buy a hard copy and I really, really, want to read it. Otherwise, I’ll just let it slide – there are plenty of interesting books that are accessible and my reading time is limited.

So, now what for my new book? There is no perfect solution. I don’t want to be an author of something I would not want to read (the e-book), but it can be set up if there’s enough demand for it. Then, for the hard copies route, if you’re not already a best-selling author or a VIP who dabbles in writing, it’s not possible to get it both published ‘fast’ – in, say, at most 6 months cf. the usual 1.5-2 years with a traditional publisher – and have it distributed ‘globally’. Even if you are quite the hotshot writer, you have to be rather patient and contend with limited reach.

Then what about me, as a humble award-winning textbook writer who wrote a memoir as well, and who can be patient but generally isn’t for long? First, I still prefer hard copies, first and foremost. Second, there’s the decision to favour either local or global in the logistics. Eventually, I decided to favour local and found a willing South African publisher, Porcupine Press, to publish it under their imprint, and then went for print-on-demand for elsewhere. PoD will take a few days’ lead time for a buyer outside South Africa, but that’s little compared to international shipping times and costs.

How to do the PoD? A reader/buyer need not worry and simply will be able to buy it from the main online retailers later in the upcoming week, with the exact timing depending on how often they run their batch update scripts and how much manual post-processing they do.

From the publishing and distribution side: it turns out someone has thought about all that already. More precisely, IngramSpark has set up an international network of local distributors that has a wider reach than, notably, KDP for the Kindle, if that floats your boat (there are multiple comparisons of the two on many more parameters, e.g., here and here). You load the softcopy files onto their system and then they push it into some 40000 outlets, including the main international ones like Amazon and multiple national ones (e.g., Adlibris in Sweden, Agapea in Spain). Anyway, that’s how it works in theory. Let’s see how that works in practice. The ‘loading onto the system’ stage started last week and should be all done some time this upcoming week. Please let me know if it doesn’t work out; we’ll figure something out.

Meanwhile, for people in South Africa who can’t wait for the book store distribution, which will likely take another few weeks to cover the Joburg/Pretoria and Cape Town book shops (and possibly be on the shelf only in January): 1) it’s on its way for distribution through the usual sites, such as TakeALot and Loot, over the upcoming days (plus some days that they’ll take to update their online shop); 2) you’ll be able to buy it from the Porcupine Press website once they’ve updated their site when the currently-in-transit books arrive there in Gauteng; 3) for those of you in Cape Town, which is also where the company that did the actual printing is located (did I already mention logistics matter?): I received some copies for distribution on Thursday and I will bring copies to the book launch next weekend. If the impending ‘family meeting’ is going to mess up the launch plans due to an unpleasant, more impractical adjusted lockdown level, or you simply can’t wait: you may contact me directly as well.

# Progress on generating educational questions from ontologies

With increasing student numbers, but not as much more funding for schools and universities, and the desire to automate certain tasks anyhow, there have been multiple efforts to generate and mark educational exercises automatically. There are a number of efforts for the relatively easy tasks, such as for learning a language, which range from the entry level with simple vocabulary exercises to advanced ones of automatically marking essays. I’ve dabbled in that area as well, mainly with 3rd-year capstone projects and 4th-year honours student projects [1]. Then there’s one notch up with fact recall and concept meaning recall questions, and further steps up, such as generating multiple-choice questions (MCQs) with not just obviously wrong distractors but good distractors to make the question harder. There’s quite a bit of work done on generating those MCQs in theory and in tooling, notably [2,3,4,5]. As a recent review [6] also notes, however, there are still quite a few gaps. Among others, these concern the generalisability of theory and systems – can you plug any structured data or knowledge source into question templates? – and the types of questions. Most of the research on ‘not-so-hard to generate and mark’ questions has been done for MCQs, but there are multiple other types of questions that should also be doable to generate automatically, such as true/false, yes/no, and enumerations. For instance, with an axiom such as $impala \sqsubseteq \exists livesOn.land$ in an ontology or knowledge graph, a suitable question generation system may then generate “Does an impala live on land?” or “True or false: An impala lives on land.”, among other options.
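To make the impala example concrete, here is a minimal sketch of the template idea in Python. It is not the system from the paper: the `ExistentialAxiom` class, the helper functions, and the very naive verb and article handling are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ExistentialAxiom:
    """Hypothetical container for an axiom of the form C ⊑ ∃R.D."""
    subclass: str         # C, e.g. "impala"
    object_property: str  # R, e.g. "livesOn"
    filler: str           # D, e.g. "land"


# Assumed, tiny lookup for irregular verbs; a real system would need far more.
IRREGULAR_BASE_FORMS = {"has": "have"}


def split_camel_case(name: str) -> list[str]:
    """Split a camel-case name into lower-case words: 'livesOn' -> ['lives', 'on']."""
    return [word.lower() for word in re.findall(r"[A-Za-z][a-z]*", name)]


def indefinite_article(noun: str) -> str:
    """Naive a/an choice that looks at the first letter only."""
    return "an" if noun[:1].lower() in "aeiou" else "a"


def yes_no_question(axiom: ExistentialAxiom) -> str:
    """Fill the template 'Does a/an C <verb> D?'."""
    verb, *rest = split_camel_case(axiom.object_property)
    # Naive de-inflection of a third-person verb: 'lives' -> 'live'.
    base = IRREGULAR_BASE_FORMS.get(verb, verb[:-1] if verb.endswith("s") else verb)
    predicate = " ".join([base, *rest])
    article = indefinite_article(axiom.subclass)
    return f"Does {article} {axiom.subclass} {predicate} {axiom.filler}?"


def true_false_statement(axiom: ExistentialAxiom) -> str:
    """Fill the template 'True or false: A/An C <verb> D.'."""
    predicate = " ".join(split_camel_case(axiom.object_property))
    article = indefinite_article(axiom.subclass).capitalize()
    return f"True or false: {article} {axiom.subclass} {predicate} {axiom.filler}."


axiom = ExistentialAxiom("impala", "livesOn", "land")
print(yes_no_question(axiom))
print(true_false_statement(axiom))
```

Since the axiom is asserted in the ontology, the expected answer to both sentences is affirmative, which is what makes automated marking feasible in principle.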

We set out to make a start with tackling those sorts of questions, for the type-level information from an ontology (cf. facts in the ABox or knowledge graph). The only work done there, when we started with it, was for the slick and fancy Inquire Biology [5], whose tech was not available for inspection and use, so we had to start from scratch. In particular, we wanted to find a way to plug any ontology into a system and generate those other, non-MCQ, types of educational questions (10 in total), where the questions generated are at least grammatically good and for which the answers can also be generated automatically, so that we get to automated marking as well.
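As a rough illustration of what automated marking could look like once the answer is generated along with the question, here is a small Python sketch for yes/no questions. The function names and the set of accepted answer spellings are assumptions for this example, not something taken from our system.

```python
# Assumed spellings a student might type for each expected answer.
ACCEPTED_ANSWERS = {
    "yes": {"yes", "y", "true"},
    "no": {"no", "n", "false"},
}


def normalise(answer: str) -> str:
    """Lower-case the answer and drop surrounding whitespace and final punctuation."""
    return answer.strip().lower().rstrip(".!")


def mark_yes_no(expected: str, given: str) -> bool:
    """Return True if the student's answer matches the generated expected answer."""
    return normalise(given) in ACCEPTED_ANSWERS[expected]


# The expected answer "yes" would come from the axiom behind the question.
print(mark_yes_no("yes", " Yes. "))
print(mark_yes_no("yes", "no"))
```

Enumeration questions would need a set comparison rather than a single lookup, but the principle of comparing against a generated answer stays the same.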

Initial explorations started in 2019 with an honours project to develop some basics and a baseline, which was then expanded upon. Meanwhile, we have designed, developed, and evaluated some more, which was written up in the paper “Generating Answerable Questions from Ontologies for Educational Exercises” [7]. It has been accepted for publication and presentation at the 15th International Conference on Metadata and Semantics Research (MTSR’21), which will be held online next week.

In short:

• Different types of questions, and the answers they have to provide, place different prerequisites on the content of the ontology, in the form of certain types of axioms. We specified those for 10 types of educational questions.
• Three strategies of question generation were devised: a ‘simple’ one that takes the vocabulary and axioms and plugs them into a template, one guided by some more semantics in the ontology (a foundational ontology), and one that didn’t really care about either but rather took a natural language approach. Variants were added to cater for differences in naming and other variations, amounting to 75 question templates in total.
• The human evaluation with questions generated from three ontologies showed that while the semantics-based one was slightly better than the baseline, the NLP-based one gave the best results on syntactic and semantic correctness of the sentences (according to the human evaluators).
• It was tested with several ontologies in different domains, and the generalisability looks promising.
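
The first point, that a question type can only be generated if the ontology contains the right kinds of axioms, can be sketched as a simple lookup. The mapping below is purely illustrative and is not the set of prerequisites specified in the paper; the axiom kinds and question type names are assumptions chosen to keep the example short.

```python
from enum import Enum, auto


class AxiomKind(Enum):
    SUBCLASS = auto()      # C ⊑ D
    EXISTENTIAL = auto()   # C ⊑ ∃R.D
    DISJOINTNESS = auto()  # C ⊓ D ⊑ ⊥


# Illustrative only: NOT the prerequisites as specified in the paper.
PREREQUISITES = {
    "yes/no": {AxiomKind.EXISTENTIAL},
    "true/false": {AxiomKind.EXISTENTIAL},
    "enumeration": {AxiomKind.SUBCLASS},
}


def supported_question_types(available: set[AxiomKind]) -> list[str]:
    """Return the question types whose required axiom kinds are all present."""
    return sorted(
        question_type
        for question_type, needed in PREREQUISITES.items()
        if needed <= available
    )


# An ontology with only a class hierarchy supports fewer question types.
print(supported_question_types({AxiomKind.SUBCLASS}))
print(supported_question_types({AxiomKind.SUBCLASS, AxiomKind.EXISTENTIAL}))
```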

To be honest to those getting their hopes up: there are some issues that cause it never to make it to ‘100% fabulous!’ if one still wants to design a system that should be able to take any ontology as input. A main culprit is the naming of elements in the ontology, which varies widely across ontologies. There are several guidelines for how to name entities, such as using camel case or underscores, and those things can easily be coded into an algorithm, indeed. But developers don’t stick to them consistently, or there’s an ontology import that uses another naming convention, so that there will likely be a glitch in the generated sentences here or there. Or they name things within the context of the hierarchy where they put the class, but in the question it is out of that context and then looks weird or is even meaningless. I moaned about this before; e.g., ‘American’ as the name of the class that should have been named ‘American Pizza’ in the Pizza ontology. Or the word used for the name of the class can have different POS tags such that it makes the generated sentence hard to read; e.g., ‘stuff’ as a noun or a verb.
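The ‘easily coded’ part of the naming conventions can be sketched in a few lines of Python. This is an illustrative helper, not the code from our repository; the function name `label_to_words` is made up for this example, and it deliberately leaves capitalisation alone.

```python
import re


def label_to_words(name: str) -> str:
    """Turn a camel-case or underscore-separated entity name into spaced words."""
    # Underscores and hyphens become spaces: 'american_pizza' -> 'american pizza'.
    spaced = re.sub(r"[_\-]+", " ", name)
    # Space at lower-to-upper transitions: 'AmericanPizza' -> 'American Pizza'.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    # Space between an acronym and a following word: 'DNAMolecule' -> 'DNA Molecule'.
    spaced = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", spaced)
    # Collapse any repeated whitespace.
    return " ".join(spaced.split())


for name in ["AmericanPizza", "american_pizza", "livesOn", "DNAMolecule", "American"]:
    print(name, "->", label_to_words(name))
```

Note what this does not solve: ‘American’ comes out as ‘American’, because the missing ‘Pizza’ lives in the class hierarchy and not in the name, and no amount of tokenising tells you whether ‘stuff’ is a noun or a verb.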

Be this as it may, overall, promising results were obtained and are being extended (more to follow). Some details can be found in the (CRC of the) paper, and the algorithms and data are available from the GitHub repo. The first author of the paper, Toky Raboanary, recently made a short presentation video about the paper for the yearly Open Evening/Showcase, which was held virtually; that page is still available online.

References

