On ‘open access’ CS conference proceedings

It perhaps sounds nice and doing-good-like, for the doe-eyed ones at least: publish computer science conference proceedings as open access so that anyone in the world can access the scientific advances for free. Yay. Free access to scientific materials is good for a multitude of reasons. There’s a downside to the set-up in the way some try to push this now, though, which amounts to making people pay for what used to be, and still mostly is, free already. I take issue with that. Instead of individualising a downside of open access by heaping more costs onto the individual researchers, the free flow of knowledge should be, and remain, a collectivised effort.

 

It is, and used to be, the case that most authors put the camera-ready copy (CRC) on their respective homepages and/or institutional repositories, and typically they used to do so even before the conference (e.g., mine are here). Putting the CRC on one’s website or in an openly accessible institutional repository seems to happen slightly less often now, even though it is legal to do so. I don’t know why. Even if it were not entirely legal, collective disobedience is not something that the publishers can easily fight. It doesn’t help that Google indexes the publisher quicker than the academics’ webpages, so the CRCs on the authors’ pages don’t turn up immediately in the search results even when the CRCs are online, but that would be a pathetic reason for not uploading the CRC. It’s a little extra effort to look up an author’s website, but acceptable as long as the file is still online and freely available.

Besides the established hallelujahs to principles of knowledge sharing, there has recently been a drive at various computer science (CS) conferences to make sure the proceedings will be open access (OA). As with OA journal papers in an OA or hybrid journal, someone’s going to have to pay the ‘article processing charges’. The instances that I’ve seen close-up put those costs for all papers of the proceedings in the conference budget and thereby increase the conference registration costs. Depending on 1) how good or bad the deal is that the organisers made, 2) how many people are expected to attend, and 3) how many papers will go in the volume, it hikes up the registration costs by some 50 euro. This is new money that the publishing house is making that they did not make before, and I’m pretty sure they wouldn’t offer an OA option if it were to result in them making less profit from the obscenely lucrative science publishing business.

So, who pays? Different universities have different funding schemes, and different funders differ in what they fund. For instance, there exist funds for contributing to OA journal article publishing (also at UCT, and Springer even has a list of OA funders in several countries), but those cannot be used in this case, for the OA costs are hidden in the conference registration fee. There are also conference travel funds, but they fund only part of the cost or cap it at a maximum, and the more the whole thing costs, the greater the shortfall that one then will have to pay out of one’s own research fund or one’s own pocket.

A colleague (at another university) who’s pushing for OA for CS conference proceedings said that his institution is paying for all the OA anyway, not him; he can easily have principles, as it doesn’t cost him anything. Some academics have their universities pay for access to the conference proceedings already, as part of the subscription package; it’s typically the higher-ranking technical universities that have access. Those I spoke to didn’t like the idea that now they’d have to pay for access in this way, for they already had ‘free’ (to them) access, as the registration fees come from their own research funds. For me, it is my own research funds as well, i.e., those funds that I have to scramble together through project proposal applications with their low acceptance rates. If I were to go to, or have papers at, say, 5 such conferences per year (in the past several years, it was more like double that), that’s the same amount as paying a student/scientific programmer for almost a week and about a monthly salary for the lowest-paid in South Africa, or travel costs or accommodation for the national CS&IT conference (or both) or its registration fees. That is, with increased registration fees to cover the additional OA costs, at least one of my students or I would lose out on participating in even a local conference, or students would be less exposed to doing research and obtaining programming experience that helps them to get a better job or a better chance at obtaining a scholarship for postgraduate studies. To name but a few trade-offs.

Effectively, the system has moved from “free access to the scientific literature anyway” (the online CRCs), to “free access plus losing money (i.e.: all that I could have done with it) in the process”. That’s not an improvement on the ground.

Further, my hard-earned research funds are mine, and I’d like to decide what to do with them, rather than having that decision taken for me. Who do the rich boys up North think they are to say that I should spend it on OA when the papers were already free, rather than giving a student an opportunity to go to a national conference or devise and implement an algorithm, or participate in an experiment etc.! (Setting aside them trying to reprimand and ‘educate’ me on the goodness of it. Tsk! As if I don’t know that the free flow of scientific information is a good thing.)

Tell me, why should the OA principles trump capacity building when the papers are free access already anyway? I’ve not seen OA advocates actually weighing up any alternatives on what would be the better good to spend money on. As to possible answers, note that an “it ought to be the case that there would be enough money for both” is not a valid answer in discussing trade-offs, nor is a “we might add a bit of patching up as conference registration reduction for those needy that are not in the rich inner core”, for it hardly ever happens, nor is an “it’s not much for each instance, you really should be able to cover it”, because many instances do add up. We all know that funding for universities and for research in general is being squeezed left, right, and centre in most countries, especially over the past 8-10 years, and such choices will have to be made, and are being made already. These are not just choices we face in Africa; this holds also in richer countries, like in the EU (fewer resources in relative or absolute terms and greater divides), although 250 euro (the 5 conferences scenario) won’t go as far there as in low-income countries.

Also, and regardless of the funding squeeze: why should we start paying for free access that already was de facto, and with most CS proceedings publishers also de jure, free access anyway? I’m seriously starting to wonder who’s getting kickbacks for promoting and pushing this sort of scheme. It’s certainly not me, nor would I take it if some publisher were to offer it to me, as it contributes to the flow of even more money from universities and research institutes to the profits of multinationals. If it’s not kickbacks, then to all those new ‘conference proceedings need to be OA’ advocates: why do you advocate paying for a right that we had for free? Why isn’t it enough for you to just pay for a principle yourself as you so desire, but instead insist on forcing others to do so too, even when there is already a tacit and functioning agreement going on that realises that aim of free flow of knowledge?

Sure, the publisher has a responsibility to keep the papers available in perpetuity, which I don’t, and link rot does exist. One could easily write a script to search all academics’ websites and get the files, like CiteSeer used to do well. Such projects get funding for long-term archiving, as do arxiv.org, PhilPapers, and SSRN among the popular ones (see also a comprehensive list of preprint servers), and most institutions’ repositories, too (e.g., the CS@UCT pubs repository). So, the perpetuity argument can also be taken care of that way, without the researchers actually having to pay more.
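To give an idea of how little is needed for the "script" mentioned above, here is a minimal sketch in Python that pulls the PDF links out of an author's publications page. It uses only the standard library; the names `PdfLinkCollector` and `extract_pdf_links`, the sample page, and the example.org URL are all made up for illustration, and a real harvester would also need to fetch the pages, respect robots.txt, and deduplicate and archive the files.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkCollector(HTMLParser):
    """Collects absolute URLs of <a href="..."> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page's own URL.
        absolute = urljoin(self.base_url, href)
        # Look only at the path, so query strings do not hide the extension.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            if absolute not in self.pdf_links:
                self.pdf_links.append(absolute)


def extract_pdf_links(html, base_url):
    """Return the PDF links found in `html`, in order of first appearance."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_links


# Hypothetical publications page of a hypothetical author.
sample_page = """
<html><body>
  <a href="papers/paper1.pdf">Paper 1 (CRC)</a>
  <a href="/pubs/paper2.PDF?download=1">Paper 2 (CRC)</a>
  <a href="about.html">About me</a>
  <a href="papers/paper1.pdf">Paper 1 again</a>
</body></html>
"""

links = extract_pdf_links(sample_page, "https://example.org/~author/")
for link in links:
    print(link)
```

Fetching the page itself could be done with `urllib.request.urlopen`, and the resulting list would then be handed to whatever archive does the long-term storage.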

Really, if you’re swimming in so much research money that you want to pay for a principle that was realised without costs to researchers, then perhaps instead do fund the event so that, say, some student grants can be given out, or so that it can contribute to some nice networking activity, or whatever part of the costs. The new “we should pay for OA, notwithstanding that no one was suffering when it was for free” attitude for CS conference proceedings is way too fishy to actually be honest; if you’re honest and not getting kickbacks, then it’s a very dumb thing to advocate for.

For the two events where this scheme is happening that I’m involved in, I admit I didn’t forcefully object at the time it was mentioned (nor had I really thought through the consequences). I should have, though. I will do so next time.

Additional suggestions for conference blogging

Lunch Over IP has an interesting blog post with tips for conference bloggers (pdf) covering a range of topics: tools, location, preparation, software, speakers, style, quotes, audience, context, linking, tagging, timing, mistakes, and collaboration. These suggestions by Ethan Zuckerman and Bruno Giussani are useful for blogging about ‘general’ conferences, but I would like to add a few suggestions for scientific conference blogging, and for computer science conferences in particular, which are the principal outlets for the latest research in that field (as opposed to journal articles in other disciplines).

The main modifications concern preparation, speakers, and timing, and are based on the conferences and workshops I did blog about (ORM’06, DL’07, OWLED’07, AI for cultural heritage 2007, AI*IA’07, IFIP TC9 ICT for warfare, OWLED’08, ISWC’08, ICT for Peace Symposium’08), the differences in quality of those posts, the ones I started writing about but abandoned, the ones that I intended to blog about but did not, and why for some I did not even start the process.

By far the most important point is preparation. Look up the accepted papers, decide on a theme, try to get the relevant papers beforehand, and read them. Split them into three stacks: the ones that are worth the “blog-attention” regardless, the “potentials”, and the ones to “discard”. For those where there is no paper available before the conference, skim through the paper upon receiving the proceedings, or at least mark them so that you go to the presentation and check the paper afterward.

Then, at the conference, attend the presentations and make notes primarily for those ones you have pre-selected and only by exception one that seemed unexpectedly interesting or that generated quite a bit of debate from the audience; in a good conference, there is too much new information to digest properly to summarize all presentations adequately, so not only preparation but also selection is important. Lunch Over IP mentions collaboration, which might be useful provided you team up with someone who has different interests or attends a parallel session. On the other hand, it also can be useful to have multiple reports and arguments about the same paper & presentation, in particular if one is attending an interdisciplinary conference. Further, even a lousy presentation but good paper should be worth mentioning: a scientific conference is not a marketing exercise where better-sold goods deserve more attention, but instead those papers that add something significant (the presenter could be a brilliant but nervous PhD student, humble researcher, or socially-challenged professor). Vice versa, a good presentation may mask a lousy paper; if there is such an ‘unexpectedly interesting’ presentation, then before blogging about it be sure to check the paper and consult an expert if it is not precisely your area of research.

The third point is timing, which Lunch Over IP would like to see as liveblogging: posting within 10 minutes after the presentation. Well, no; let us call the opposite lagblogging. Aside from the fact that new things may pop up during the presentation (e.g., having misunderstood a section, newer material being presented, criticism from the audience you had not thought of), one should back up any posted comments with argumentation, which takes time to write, or compare the paper with another one on the same topic that might be scheduled afterward, or even in another timeslot. Or perhaps there are links between papers one has not thought of before. Such papers should be synthesized into one analysis and not processed and published in a piecemeal fashion. Being able to connect dots is important in science, and when you do it in your post, the readers will appreciate that: not being at the venue, your blog readers were not exposed to the amalgamation of topics and papers, so your synthesis will give added value. Make a connected ‘flow’ out of the selected papers and presentations. In that case, being a day or two (or three) later is fine.

Last, but this may be just my personal opinion, when I read other people’s conference blog posts, I really do not care who you rubbed shoulders with. First and foremost, I want to know what is useful to check out (and why), what was the ‘vibe’ of a panel session to get an idea of what lives in that research community, and what was deemed worthy of ‘keynote speech’ by the organisers (and was it really worthwhile listening to?).