Some explorations into book publishing logistics

Writing a book is only one part of the whole process of publishing a book. There’s the actual thing that eventually needs to get out into the wide world. Hard copy? E-book? Print-on-demand? All three or a subset only? Taking a step back: where are you as author located, where are the publisher and the printer, and where is the prospective audience? Is the prospective readership IT savvy enough for e-books to even consider that option? Is the book’s content suitable for reading on devices with a gazillion different screen sizes? Here’s a brief digest from after my analysis paralysis of the too many options where none has it all – not ever, it seems.

I’ve written about book publishing logistics and choices for my open textbook, but that is, well, a textbook. My new book, No Taming of the Enthusiast, is of a different genre and aimed at a broader audience. Also, I’m a little wiser on the practicalities of hard copy publishing. For instance, it took nearly 1.5 months for the College Publications-published textbook to arrive in Cape Town, having travelled all the way from Europe where the publisher and printer are located. Admittedly, these days aren’t the best days for international cargo, but such a delivery time is a bit too long for the average book buyer. I’ve tried buying books with other overseas retailers and book sellers over the past few years—same story. On top of that, in South Africa, you then have to go to the post office to pick up the parcel and pay a picking-up-the-parcel fee (or whatever the fee is for), on top of the book’s cost and shipping fee. And it may get stuck in Customs limbo. This is not a good strategy if I want to reach South African readers. Also, it would be cool to get at least some books all the way onto the shelves of local book stores.

A local publisher then? That would be good for contributing my bit to stimulating the local economy as well. It has the hard copy logistics problem in reverse at least in part, however: how to get the books from so far down south to other places in the world where buyers may be located. Since the memoir is expected to have an international audience as well, some international distribution is a must. This requirement still gives three options: a multinational hard copy publisher that distributes to main cities with various shipping delays, print-on-demand (soft copy distributed, printed locally wherever it is bought), or e-book.

Let’s take the e-books detour for a short while. There is a low percentage of uptake of e-books – some 20% at best – and lively subjective opinions on why people don’t like ebooks. I prefer hard copies as well, but tolerate soft copies for work. Both are useful for different types of use: a hard copy for serious reading and a soft copy for skimming and searching so as to save oneself endless flicking to look up something. It’s happening the same with my textbook as well, to some extent at least: people pay for it to have it nicely printed and bound even though they can do that with the pdf themselves or just read the pdf. For other genres, some are better in print in any case, such as colourful cookbooks, but others should tolerate e-readers quite well, such as fiction when it’s just plain text.

In deciding whether to go for an e-book, I did explore usability and readability of e-books for non-work books to form my own opinion on it. I really tried. I jumped into the rabbit hole of e-reader software with their pros and cons, and settled on Calibre eventually as best fit. I read a fixed-size e-book in its entirety and it was fine, but there was a glitch in that it did not quite adjust to the screen size of the device easily and navigating pages was awkward; I didn’t try to search. I also bought two e-book novels from smashwords (epub format) and tested one for cross-device usability and readability. Regarding the ‘across devices’: I think I deserve to share and read e-books on all my devices when I duly paid for the copyrighted books. And, lo and behold, I indeed could do so across unconnected devices through emailing myself on different email addresses. The flip side of that is that it means that once any epub is downloaded by one buyer (separately, not into e-books software), it’s basically a free-for-all. There are also epub to pdf converters. The hurdles to do so may be enough of a deterrent for an average reader, but it’s not even a real challenge for anyone in IT or computing.

After the tech tests, I’ve read through the first few pages of one of the two epub e-books – and abandoned it since. Although the epub file resized well, and I suppose that’s a pat on the back for the software developers, it renders ugly on the dual laptop/tablet and smartphone I checked it with. It offers not nearly the same neat affordances of a physical book. For the time being, I’ll buy an e-book only if there’s no option to buy a hard copy and I really, really, want to read it. Else to just let it slide – there are plenty of interesting books that are accessible and my reading time is limited.

Spoiler alert on how the logistics ended up eventually 🙂

So, now what for my new book? There is no perfect solution. I don’t want to be an author of something I would not want to read (the e-book), but it can be set up if there’s enough demand for it. Then, for the hard copies route, if you’re not already a best-selling author or a VIP who dabbles in writing, it’s not possible to get it both published ‘fast’ – in, say, at most 6 months cf. the usual 1.5-2 years with a traditional publisher – and have it distributed ‘globally’. Even if you are quite the hotshot writer, you have to be rather patient and contend with limited reach.

Then what about me, as humble award-wining textbook writer who wrote a memoir as well, and who can be patient but generally isn’t for long? First, I still prefer hard copies first and foremost nonetheless. Second, there’s the decision to either favour local or global in the logistics. Eventually, I decided to favour local and found a willing South African publisher, Porcupine Press, to publish it under their imprint and then went for the print-on-demand for elsewhere. PoD will take a few days lead time for an outside-South-Africa buyer, but that’s little compared to international shipping times and costs.

How to do the PoD? A reader/buyer need not worry and simply will be able to buy it from the main online retailers later in the upcoming week, with the exact timing depending on how often they run their batch update scripts and how much manual post-processing they do.

From the publishing and distribution side: it turns out someone has thought about all that already. More precisely, IngramSpark has set up an international network of local distributors that has a wider reach than, notably, KDP for the Kindle, if that floats your boat (there are multiple comparisons of the two on many more parameters, e.g., here and here). You load the softcopy files onto their system and then they push it into some 40000 outlets, including the main international ones like Amazon and multiple national ones (e.g., Adlibris in Sweden, Agapea in Spain). Anyway, that’s how it works in theory. Let’s see how that works in practice. The ‘loading onto the system’ stage started last week and should be all done some time this upcoming week. Please let me know if it doesn’t work out; we’ll figure something out.

Meanwhile for people in South Africa who can’t wait for the book store distribution that likely will take another few weeks to cover the Joburg/Pretoria and Cape Town book shops (an possibly on the shelf only in January): 1) it’s on its way for distribution through the usual sites, such as TakeALot and Loot, over the upcoming days (plus some days that they’ll take to update their online shop); 2) you’ll be able to buy it from the Porcupine Press website once they’ve updated their site when the currently-in-transit books arrive there in Gauteng; 3) for those of you in Cape Town, and where the company that did the actual printing is located (did I already mention logistics matter?): I received some copies for distribution on Thursday and I will bring copies to the book launch next weekend. If the impending ‘family meeting’ is going to mess up the launch plans due to an unpleasant more impractical adjusted lockdown level, or you simply can’t wait: you may contact me directly as well.

Version 1.5 of the textbook on ontology engineering is available now

“Extended and Improved!” could some advertisement say of the new v1.5 of “An introduction to ontology engineering” that I made available online today. It’s not that v1 was no good, but there were a few loose ends and I received funding from the digital open textbooks for development (DOT4D) project to turn the ‘mere pdf’ into a proper “textbook package” whilst meeting the DOT4D interests of, principally, student involvement, multilingualism, local relevance, and universal access. The remainder of this post briefly describes the changes to the pdf and the rest of it.

The main changes to the book itself

With respect to contents in the pdf itself, the main differences with version 1 are:

  • a new chapter on modularisation, which is based on a part of the PhD thesis of my former student and meanwhile Senior Researcher at the CSIR, Dr. Zubeida Khan (Dawood).
  • more content in Chapter 9 on natural language & ontologies.
  • A new OntoClean tutorial (as Appendix A of the book, introduced last year), co-authored with Zola Mahlaza, which is integrated with Protégé and the OWL reasoner, rather than only paper-based.
  • There are about 10% more exercises and sample answers.
  • A bunch of typos and grammatical infelicities have been corrected and some figures were updated just in case (as the copyright stuff of those were unclear).

Other tweaks have been made in other sections to reflect these changes, and some of the wording here and there was reformulated to try to avoid some unintended parsing of it.

The “package” beyond a ‘mere’ pdf file

Since most textbooks, in computer science at least, are not just hardcopy textbooks or pdf-file-only entities, the OE textbook is not just that either. While some material for the exercises in v1 were already available on the textbook website, this has been extended substantially over the past year. The main additions are:

There are further extras that are not easily included in a book, yet possibly useful to have access to, such as list of ontology verbalisers with references that Zola Mahlaza compiled and an errata page for v1.

Overall, I hope it will be of some (more) use than v1. If you have any questions or comments, please don’t hesitate to contact me. (Now with v1.5 there are fewer loose ends than with v1, yet there’s always more that can be done [in theory at least].)

p.s.: yes, there’s a new front cover, so as to make it easier to distinguish. It’s also a photo I took in South Africa, but this time standing on top of Table Mountain.

Some experiences on making a textbook available

I did make available a textbook on ontology engineering for free in July 2018. Meanwhile, I’ve had several “why did you do this and not a proper publisher??!?” I had tried to answer that already in the textbook’s FAQ. Turns out that that short answer may be a bit too short after all. So, here follows a bit more about that.

The main question I tried to answer in the book’s FAQ was “Would it not have been better with a ‘proper publisher’?” and the answer to that was:

Probably. The layout would have looked better, for sure. There are several reasons why it isn’t. First and foremost, I think knowledge should be free, open, and shared. I also have benefited from material that has been made openly available, and I think it is fair to continue contributing to such sharing. Also, my current employer pays me sufficient to live from and I don’t think it would sell thousands of copies (needed for making a decent amount of money from a textbook), so setting up such a barrier of high costs for its use does not seem like a good idea. A minor consideration is that it would have taken much more time to publish, both due to the logistics and the additional reviewing (previous multi-author general textbook efforts led to nothing due to conflicting interests and lack of time, so I unlikely would ever satisfy all reviewers, if they would get around reading it), yet I need the book for the next OE installment I will teach soon.

Ontology Engineering (OE) is listed as an elective in the ACM curriculum guidelines. Yet, it’s suited best for advanced undergrad/postgrad level because of the prerequisites (like knowing the basics of databases and conceptual modeling). This means there won’t be big 800-students size classes all over the world lining up for OE. I guess it would not go beyond some 500-1000/year throughout the world (50 classes of 10-20 computer science students), and surely not all classes would use the textbook. Let’s say, optimistically, that 100 students/year would be asked to use the book.

With that low volume in mind, I did look up the cost of similar books in the same and similar fields with the ‘regular’ academic publishers. It doesn’t look enticing for either the author or the student. For instance this one from Springer and that one from IGI Global are all still >100 euro. for. the. eBook., and they’re the cheap ones (not counting the 100-page ‘silver bullet’ book). Handbooks and similar on ontologies, e.g., this and that one are offered for >200 euro (eBook). Admitted there’s the odd topical book that’s cheaper and in the 50-70 euro range here and there (still just the eBook) or again >100 as well, for a, to me, inexplicable reason (not page numbers) for other books (like these and those). There’s an option to publish a textbook with Springer in open access format, but that would cost me a lot of money, and UCT only has a fund for OA journal papers, not books (nor for conference papers, btw).

IOS press does not fare much better. For instance, a softcover version in the studies on semantic web series, which is their cheapest range, would be about 70 euro due to number of pages, which is over R1100, and so again above budget for most students in South Africa, where the going rate is that a book would need to be below about R600 for students to buy it. A plain eBook or softcover IOS Press not in that series goes for about 100 euro again, i.e., around R1700 depending on the exchange rate—about three times the maximum acceptable price for a textbook.

The MIT press BFO eBook is only R425 on takealot, yet considering other MIT press textbooks there, with the size of the OE book, it then would be around the R600-700. Oxford University Press and its Cambridge counterpart—that, unlike MIT press, I had checked out when deciding—are more expensive and again approaching 80-100 euro.

One that made me digress for a bit of exploration was Macmillan HE, which had an “Ada Lovelace day 2018” listing books by female authors, but a logics for CS book was again at some 83 euros, although the softer area of knowledge management for information systems got a book down to 50 euros, and something more popular, like a book on linguistics published by its subsidiary “Red Globe Press”, was down to even ‘just’ 35 euros. Trying to understand it more, Macmillan HE’s “about us” revealed that “Macmillan International Higher Education is a division of Macmillan Education and part of the Springer Nature Group, publishers of Nature and Scientific American.” and it turns out Macmillan publishes through Red Globe Press. Or: it’s all the same company, with different profit margins, and mostly those profit margins are too high to result in affordable textbooks, whichever subsidiary construction is used.

So, I had given up on the ‘proper publisher route’ on financial grounds, given that:

  • Any ontology engineering (OE) book will not sell large amounts of copies, so it will be expensive due to relatively low sales volume and I still will not make a substantial amount from royalties anyway.
  • Most of the money spent when buying a textbook from an established publisher goes to the coffers of the publisher (production costs etc + about 30-40% pure profit [more info]). Also, scholarships ought not to be indirect subsidy schemes for large-profit-margin publishers.
  • Most publishers would charge an amount of money for the book that would render the book too expensive for my own students. It’s bad enough when that happens with other textbooks when there’s no alternative, but here I do have direct and easy-to-realise agency to avoid such a situation.

Of course, there’s still the ‘knowledge should be free’ etc. argument, but this was to show that even if one were not to have that viewpoint, it’s still not a smart move to publish the textbook with the well-known academic publishers, even more so if the topic isn’t in the core undergraduate computer science curriculum.

Interestingly, after ‘publishing’ it on my website and listing it on OpenUCT and the Open Textbook Archive—I’m certainly not the only one who had done a market analysis or has certain political convictions—one colleague pointed me to the non-profit College Publications that aims to “break the monopoly that commercial publishers have” and another colleague pointed me to UCT press. I had contacted both, and the former responded. In the meantime, the book has been published by CP and is now also listed on Amazon for just $18 (about 16 euro) or some R250 for the paperback version—whilst the original pdf file is still freely available—or: you pay for production costs of the paperback, which has a slightly nicer layout and the errata I knew of at the time have been corrected.

I have noticed that some people don’t take the informal self publishing seriously—even below the so-called ‘vanity publishers’ like Lulu—notwithstanding the archives to cater for it, the financial take on the matter, the knowledge sharing argument, and the ‘textbooks for development’ in emerging economies angle of it. So, I guess no brownie points from them then and, on top of that, my publication record did, and does, take a hit. Yet, writing a book, as an activity, is a nice and rewarding change from just churning out more and more papers like a paper production machine, and I hope it will contribute to keeping the OE research area alive and lead to better ontologies in ontology-driven information systems. The textbook got its first two citations already, the feedback is mostly very positive, readers have shared it elsewhere (reddit, ungule.it, Open Libra, Ebooks directory, and other platforms), and I recently got some funding from the DOT4D project to improve the resources further (for things like another chapter, new exercises, some tools development to illuminate the theory, a proofreading contest, updating the slides for sharing, and such). So, overall, if I had to make the choice again now, I’d still do it again the way I did. Also, I hope more textbook authors will start seeing self-publishing, or else non-profit, as a good option. Last, the notion of open textbooks is gaining momentum, so you even could become a trendsetter and be fashionable 😉

On ‘open access’ CS conference proceedings

It perhaps sounds nice and doing-good-like, for the doe-eyed ones at least: publish computer science conference proceedings as open access so that anyone in the world can access the scientific advances for free. Yay. Free access to scientific materials is good for a multitude of reasons. There’s downside in the set-up in the way some try to push this now, though, which amounts to making people pay for what used to be, and still mostly is, for free already. I take issue with that. Instead of individualising a downside of open access by heaping more costs onto the individual researchers, the free flow of knowledge should be—and remain—a collectivised effort.

 

It is, and used to be, the case that most authors put the camera-ready-copy (CRC) on their respective homepages and/or institutional repositories, and it used to be typically even before the conference (e.g., mine are here). Putting the CRC on one’s website or in an openly accessible institutional repository seems to happen slightly less often now, even though it is legal to do so. I don’t know why. Even if it were not entirely legal, a collective disobedience is not something that the publishers easily can fight. It doesn’t help that Google indexes the publisher quicker than the academics’ webpages, so the CRCs on the authors’ pages don’t turn up immediately in the search results even whey the CRCs are online, but that would be a pathetic reason for not uploading the CRC. It’s a little extra effort to lookup an author’s website, but acceptable as long as the file is still online and freely available.

Besides the established hallelujah’s to principles of knowledge sharing, there’s since recently a drive at various computer science (CS) conferences to make sure the proceedings will be open access (OA). Like for OA journal papers in an OA or hybrid journal, someone’s going to have to pay for the ‘article processing charges’. The instances that I’ve seen close-up, put those costs for all papers of the proceedings in the conference budget and therewith increase the conference registration costs. Depending on 1) how good or bad the deal is that the organisers made, 2) how many people are expected to attend, and 3) how many papers will go in the volume, it hikes up the registration costs by some 50 euro. This is new money that the publishing house is making that they did not use to make before, and I’m pretty sure they wouldn’t offer an OA option if it were to result in them making less profit from the obscenely lucrative science publishing business.

So, who pays? Different universities have different funding schemes, as have different funders as to what they fund. For instance, there exist funds for contributing to OA journal article publishing (also at UCT, and Springer even has a list of OA funders in several countries), but that cannot be used in this case, for the OA costs are hidden in the conference registration fee. There are also conference travel funds, but they fund part of it or cap it to a maximum, and the more the whole thing costs, the greater the shortfall that one then will have to pay out of one’s own research fund or one’s own pocket.

A colleague (at another university) who’s pushing for the OA for CS conference proceedings said that his institution is paying for all the OA anyway, not him—he easily can have principles, as it doesn’t cost him anything anyway. Some academics have their universities pay for the conference proceedings access already anyway, as part of the subscription package; it’s typically the higher-ranking technical universities that have access. Those I spoke to, didn’t like the idea that now they’d have to pay for access in this way, for they already had ‘free’ (to them) access, as the registration fees come from their own research funds. For me, it is my own research funds as well, i.e., those funds that I have to scramble together through project proposal applications with their low acceptance rates. If I’d go to/have papers at, say, 5 such conferences per year (in the past several years, it was more like double that), that’s the same amount as paying a student/scientific programmer for almost a week and about a monthly salary for the lowest-paid in South Africa, or travel costs or accommodation for the national CS&IT conference (or both) or its registration fees. That is, with increased registration fees to cover the additional OA costs, at least one of my students or I would lose out on participating in even a local conference, or students would be less exposed to doing research and obtaining programming experience that helps them to get a better job or better chance at obtaining a scholarship for postgraduate studies. To name but a few trade-offs.

Effectively, the system has moved from “free access to the scientific literature anyway” (the online CRCs), to “free access plus losing money (i.e.: all that I could have done with it) in the process”. That’s not an improvement on the ground.

Further, my hard-earned research funds are mine, and I’d like to decide what to do with it, rather than having that decision been taken for me. Who do the rich boys up North think they are to say that I should spend it on OA when the papers were already free, rather than giving a student an opportunity to go to a national conference or devise and implement an algorithm, or participate in an experiment etc.! (Setting aside them trying to reprimand and ‘educate’ me on the goodness—tsk! as if I don’t know that the free flow of scientific information is a good thing.)

Tell me, why should the OA principles trump the capacity building when the papers are free access already anyway? I’ve not seen OA advocates actually weighing up any alternatives on what would be the better good to spend money on. As to possible answers, note that an “it ought to be the case that there would be enough money for both” is not a valid answer in discussing trade-offs, nor is a “we might add a bit of patching up as conference registration reduction for those needy that are not in the rich inner core” for it hardly ever happens, nor is a “it’s not much for each instance, you really should be able to cover it” because many instances do add up. We all know that funding for universities and for research in general is being squeezed left, right, and centre in most countries, especially over the past 8-10 years, and such choices will have to, and are being, made already. These are not just choices we face in Africa, but this holds also in richer countries, like in the EU (fewer resources in relative or absolute terms and greater divides), although a 250 euro (the 5 conferences scenario) won’t go as far there as in low-income countries.

Also, and regardless the funding squeeze: why should we start paying for free access that already was a de facto, and with most CS proceedings publishers, also a de jure, free access anyway? I’m seriously starting to wonder who’s getting kickbacks for promoting and pushing this sort of scheme. It’s certainly not me, and nor would I take it if some publisher would offer it to me, as it contributes to the flow of even more money from universities and research institutes to the profits of multinationals. If it’s not kickbacks, then to all those new ‘conference proceedings need to be OA’ advocates: why do you advocate paying for a right that we had for free? Why isn’t it enough for you to just pay for a principle yourself as you so desire, but instead insist to force others to do so too even when there is already a tacit and functioning agreement going on that realises that aim of free flow of knowledge?

Sure, the publisher has a responsibility to keep the papers available in perpetuity, which I don’t, and link rot does exist. One easily could write a script to search all academics’ websites and get the files, like citeseer used to do well. They get funding for such projects for long-term archiving, like arxiv.org does as well, and philpapers, and SSRN as popular ones (see also a comprehensive list of preprint servers), and most institution’s repositories, too (e.g., the CS@UCT pubs repository). So, the perpetuity argument can also be taken care of that way, without the researchers actually having to pay more.

Really, if you’re swimming in so much research money that you want to pay for a principle that was realised without costs to researchers, then perhaps instead do fund the event so that, say, some student grants can be given out, that it can contribute to some nice networking activity, or whatever part of the costs. The new “we should pay for OA, notwithstanding that no one was suffering when it was for free” attitude for CS conference proceedings is way too fishy to actually being honest; if you’re honest and not getting kickbacks, then it’s a very dumb thing to advocate for.

For the two events where this scheme is happening that I’m involved in, I admit I didn’t forcefully object at the time it was mentioned (nor had I really thought through the consequences). I should have, though. I will do so a next time.