My road travelled from microbiology to computer science

From bites to bytes or, more precisely, from foods to formalisations, sprinkled with a handful of humanities and a dash of design. It does add up. The road I travelled into computer science has nothing to do with any ‘gender blabla’, nor with an idealistic drive to solve the world food problem by other means, nor had I become fed up with the broad theme of agriculture. But then what was it? I’m regularly asked about that road into computer science, for various reasons. There are those who are curious or nosy, some deem it improbable and think I must be making it up, and yet others chiefly speculate about where I obtained the money to pay for it all. So here goes, in a fairly long write-up, since I did not take a straight path, let alone a shortcut.

If you’ve seen my CV, you know I studied “Food Science, free specialisation” at Wageningen University in the Netherlands. It is the university to go to for all things to do with agriculture in the broad sense. Somehow I made it into computer science, but it did not happen there. The motivation does come from there, though: being at the forefront of science, it has an ambiance that facilitates exposure to a wide range of topics and techniques, both within the education system and among fellow students. (Also, it really was the best quality education I ever had, which deserves to be said—and I’ve been around enough to have ample comparison material.)

And yet.

One might speculate that all the hurdles with mathematics and PC use when I was young were the motivation to turn to computing. Definitely not. Instead, it happened when I was working on my last, and major, Master’s thesis in the Molecular Ecology section of the Laboratory of Microbiology at Wageningen University, having drifted a little away from microbes in food science.

My thesis topic was about trying to clean up chemically contaminated soil using bacteria that would eat the harmful compounds, rather than cleaning up the site by disrupting the ecosystem with excavations and chemical treatments of the soil. In this case, it was about 3-chlorobenzoate, an intermediate degradation product of, mainly, paint spills that had been going on since the 1920s; the molecule substantially reduces the growth and yield of maize, which is undesirable. I set out to examine a bunch of configurations, varying the amount of 3-chlorobenzoate in the soil, the presence of the Pseudomonas B13 bacteria, and the distance to the roots of the maize plants, and to assess their effects on the growth of the maize. The bacteria were expected to clean up more of the 3-chlorobenzoate in the area near the roots (the rhizosphere), and there were some questions about what the bacteria would do once the 3-chlorobenzoate ran out (mainly: would they die or feed on other molecules?).

The bird’s-eye view still sounds interesting to me, but there was a lot of boring work to do to find the answer. There were days when the only excitement was opening the incubator to see whether my beasts had grown on the agar plate in the petri dish; if they had (yay!), I was punished with counting the colonies. Staring at dots on the agar plate and counting them. Then there were the analysis methods to be used, of which two turned out to be crucial for changing track, mixed with a minor logistical issue to top it off.

First, there was the PCR technique for amplifying genetic material so it can be sequenced, which, by now, during COVID-19 times, may be a familiar term. There are machines that do the procedure automatically. In 1997, it was still a cumbersome procedure, taking about a day of near non-stop work to sequence the short ribosomal RNA (16S rRNA) strand extracted from the collected bacteria. That was how we could figure out whether any of those white dots in the petri dish were, say, the Pseudomonas B13 I had inoculated the soil with, or some other soil bacterium. You extract the genetic material, multiply it, sequence it, and then compare it. That last step was the coolest.

The 16S rRNA of a bacterium averages around 1500 base pairs, represented as a sequence of some 1500 capital letters consisting of A’s, C’s, G’s, and U’s. For comparison: the SARS-CoV-2 genome is about 30000 base pairs. You really don’t want to compare even one such sequence by hand against another similar sequence of letters, let alone manually check your newly PCR-ed sequence against many others to figure out which bacterium you likely had isolated or which one is phylogenetically most closely related. Instead, we sent the sequence, as a string of flat text with those ACGU letters, to a database called the RNABase, and we received an answer with a list of more or less likely matches within a few hours to a day, depending on the time of submission.

It was like magic. But how did it really do that? What is a database? How does it calculate the alignments? And since it can do this cool stuff that’s not doable by humans, what else can you do with such techniques to advance our knowledge about the world? How much faster can science advance with these things? I wanted to know. I needed to know.
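For the curious: part of the answer to the alignment question is dynamic programming. A minimal sketch of global alignment scoring in the style of Needleman–Wunsch, which is the classic technique behind such comparisons (the sequences, reference names, and scoring values below are toy illustrations, not what the actual database used):

```python
def nw_score(a: str, b: str, match=1, mismatch=-1, gap=-1) -> int:
    """Global alignment score of sequences a and b via dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    # dp[i][j] = best score for aligning a[:i] with b[:j]
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + gap        # a[:i] aligned to gaps only
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + gap        # b[:j] aligned to gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + step,  # (mis)match
                           dp[i - 1][j] + gap,       # gap in b
                           dp[i][j - 1] + gap)       # gap in a
    return dp[-1][-1]

# Rank made-up reference sequences against a made-up query:
query = "ACGUGGCUAC"
refs = {"candidate-1": "ACGUGGCUAA", "candidate-2": "UUGGACACGA"}
ranked = sorted(refs, key=lambda n: nw_score(query, refs[n]), reverse=True)
```

A real service would of course work on 1500-letter sequences against thousands of references, with heuristics to avoid the full quadratic comparison, but the core idea is this table-filling.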

The other technique I had to work with was not new to me, but I had to scale it up: High-Performance Liquid Chromatography (HPLC). You give the machine a solution and it separates out the component molecules, so you can figure out what’s in the solution and how much of each is in there. Different types of molecules stick to the column inside the machine for different lengths of time, so they come out at different moments. The machine then spits out the result as a graph, where the peaks scattered across the x axis indicate the different substances in the solution and the size of a peak indicates the concentration of that molecule in the sample.

I had taken multiple soil samples closer to and farther away from the rhizosphere, from different boxes with maize plants and different treatments of the soil, rinsed them, and ran the solutions through the HPLC. The task then was to compare the resulting graphs to see if there was a difference between treatments. Printed out, they covered a large table of about 1.5 by 2 metres, and I had to look closely at them and try to do some manual pattern matching on the shape and size of the graphs and sub-graphs. There was no program that could compare the graphs automatically. I tried to overlay printouts and hold them in front of the ceiling light. With every printed graph about 20x20cm in size, you can calculate how many I had and how many 1-by-1 comparisons that amounts to (this is left as an exercise to the reader). It felt primitive, especially considering all the fancy toys in the lab and on the PC. Couldn’t those software developers also develop a tool to compare graphs?! Now that would have been useful. But no. If only I could develop such a useful tool myself; then I would not have to wait for the software developers until they cared to develop it.
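The tool I was wishing for would not even have needed to be fancy. Treating each chromatogram as a series of intensity readings, even a plain correlation score would have flagged which pairs of graphs deserved a closer look (the traces below are invented for illustration):

```python
def pearson(x, y):
    """Pearson correlation between two equally sampled intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Two toy chromatograms with peaks in the same places but different
# concentrations, and one with its peak somewhere else entirely:
trace_a = [0, 1, 8, 1, 0, 0, 3, 0]
trace_b = [0, 2, 16, 2, 0, 0, 6, 0]
trace_c = [0, 0, 0, 5, 9, 5, 0, 0]

similar = pearson(trace_a, trace_b)    # near 1.0: same substances
different = pearson(trace_a, trace_c)  # much lower: different substances
```

Real chromatogram comparison would first need to align retention times between runs, but even this crude score beats holding printouts up to the ceiling light.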

On top of that manual analysis, it seemed unfair that I had to copy the data from the HPLC machine in the basement of the building onto a 3.5 inch floppy disk and walk up to the third floor, to the shared MSc thesis students’ desktop PCs, to be able to process it, whereas the PCR data was accessible from my desktop PC even though the PCR machine was on the ground floor. The PC could access the internet and present data from all over the world, even, so surely it should be able to connect to the HPLC downstairs?! Enter questions about computer networks.

The first step in trying to get some answers was to inquire with the academics in the department. “Maybe there’s something like ‘theoretical microbiology’, or whatever it’s called, that focuses on data analysis and modelling of microbiology? It is the fun part of the research—and avoids lab work?”, I asked my supervisor and more generally around the lab. “Not really,” was the answer, continuing, “ok, sure, there is some, but theory-only without the evidence from experiments isn’t it.” Despite all the advanced equipment, of which computing is an indispensable component, they still deemed that wetlab research trumped theory and computing alone. “Those technologies are there to help answer the new and more advanced questions faster, not to replace the processes”, I was told.

Sigh. Pity. So be it, I supposed. But I still wanted answers to those computing questions. I also wanted to do a PhD in microbiology and then probably move to some other discipline, since I sensed that after another 4-6 years I might become bored with microbiology. Then there was the logistical issue that I still could not walk well, which made wetlab work difficult and would therefore make obtaining a PhD scholarship harder. Lab work was a hard requirement for a PhD in microbiology, and it wasn’t exactly the most exciting part of studying bacteria. So, I might as well swap to something else straight away. Since there were those questions in computing that I wanted answers to, there we have the inevitable conclusion: move to greener, or at least as green, pastures.

***

How to obtain those answers in computing? Signing up for a sort of ‘top-up’ degree for the computing aspects would have been nice, so as to do that brand new thing called bioinformatics. There were no such top-up degrees in the Netherlands at the time, and the only one that came close was a full degree in medical informatics, which is not what I wanted: I didn’t want to know about all the horrible diseases people can get.

The only way to combine it was to enrol in the first year of a degree in computing. The snag was the money. I was finishing up my five years of state funding for the master’s degree (old system, so it included the BSc), and the state paid for only one such degree. The only way to do it was to start working, save money, and pay for it myself at some point in the near future once I had enough. Going into IT in industry out in the big wide world sounded somewhat interesting as a second-choice option, since with such skills it should be easier to work anywhere in the world, and I still wanted to travel the world as well.

Once I finished the thesis in molecular ecology and graduated with a master’s degree in January 1998, I started looking for work whilst receiving unemployment benefit. IT companies only offered ‘conversion’ courses, such as a crash course in COBOL—the Y2K bug was alive and well—or some IT admin course, including the Microsoft Certified Systems Engineer (MCSE) programme, with the catch that you’d have to keep working for the IT company for three years to pay off the debt of that training. That sounded like bonded labour and not particularly appealing.

One day, flicking through the newspapers on the lookout for interesting job offers, an advertisement caught my eye: a year-long conversion course for an MCSE, consisting of five months of full-time training and, for the rest of the year, a practice period in industry whilst maintaining one’s unemployment benefit, whose amount was just about sufficient to get by, after which all was paid off. A sizeable portion of the funding came from the European Union. The programme was geared toward giving a second chance to basket cases, such as the long-term unemployed and the disabled. I was not a basket case, not yet at least. I tried nonetheless, applied for a position, and was invited for an interview. My main task was to convince them that I was basket case-like enough to qualify for the programme, but good enough to pass fast and with good marks. The arguments worked and I was accepted. A foot in the door.

We were a class of 16 people: 15 men and me, the only woman. I completed the MCSE successfully, and then also completed a range of other vocational training courses whilst employed in various IT jobs: Unix system administration, ITIL service management, a bit of Novell Netware and Cisco, and some more online self-study training sessions, all paid for by the companies I was employed at. The downside of those trainings was that they all were, in my humble opinion, superficial; the how-to technology changes fast, and the prospect of perpetual rote learning did not appeal to me. I wanted to know the underlying principles, so that I wouldn’t have to keep updating myself with the latest trivial modification in an application. It was time to take the next step.

I was working at the time for Eurologic Systems in Dublin, Ireland, as a systems integration test engineer for fibre channel storage enclosures, which are boxes with many hard drives stacked up and connected for fast access to lots of data stored on the disks. They were a good employer, but they had few training opportunities, since it was an R&D company with experienced and highly educated engineers. I asked HR if I could sign up elsewhere, with, say, the Open University, and whether they’d pay for some of it, maybe? “Yes,” the humane HR lady said, “that’s a good idea, and we’ll pay for every course you pass whilst in our employment.” Deal!

So, I enrolled with the Open University UK. I breezed through my first year, even though I had skipped their first-year courses and jumped straight into second-year ones. My second year went just as smoothly. The third year I paid for myself, since I had opted for voluntary redundancy, and was allowed to take it in the second round, because I wanted to get back on track with my original plan to go into bioinformatics. The dotcom bubble had burst and Eurologic could not escape some of its effects. While they were not fond of seeing me go, they knew I’d leave soon anyway, and they were happy to see the redundancy money put to good use to finish my Computing & IT degree. With that finished, I’d finally be able to do the bioinformatics I had been after since 1997, or so I thought.

My honours project was on database development, with a focus on conceptual data modelling languages. I rediscovered the Object-Role Modelling language from the lecture notes of the Saxion University of Applied Sciences that I had bought out of curiosity when I did the aforementioned MCSE course (in Enschede, the Netherlands). The database was about bacteriocins, which are produced by bacteria and can be used in food for food safety and preservation. A first real step into bioinformatics. Bacteriocins have something to do with genes, too, and in searching for conceptual models about genes, I stumbled into a new world in 2003: one with the Gene Ontology and the notion of ontologies to solve the data integration problem. Marking and marks processing took a bit longer than usual that year (the academics were on strike), and I was awarded the BSc(honours) degree (1st class) in March 2004. By that time, there were several bioinformatics conversion courses available. Ah, well.

The long route taken did give me some precious insight that no bioinformatics conversion top-up degree can give: a deeper understanding of the indoctrination into disciplinary thinking and ways of doing science. That is, of what the respective mores are, how to question, how to identify a problem, how to look at things, and the ways of answering questions and solving problems. Of course, when there’s, say, an experimental method, the principles of the method are the same—hypothesis, set up experiment, do experiment, check results against hypothesis—as are some of the results-processing tools (e.g., statistics), but there are substantive differences. For instance, in computing, you break down the problem, isolate it, and solve that piece of something that’s all human-made. In microbiology, it’s about trying to figure out how nature works, with all its interconnected parts that may interfere and complicate the picture. In the engineering side of food science, it was more along the lines of: once we figure out what it does and what we really need, can we find something that does what we need, or can we make it do it to solve the problem? It doesn’t necessarily mean one is less cool; just different. And hard to explain to someone who has only ever studied one degree in one discipline, many of whom have the ‘my way or the highway’ attitude or think everyone is homologous to them. If you manage to create the chance to do a second full degree, take it.

***

Who am I to say that a top-up degree is unlike the double indoctrination into a discipline’s mores? Because I also did a top-up degree, in yet another discipline. Besides studying the last year of Computing & IT with a full-time load, I had also signed up for a conversion Master of Arts in Peace & Development studies at the University of Limerick, Ireland. The Computing & IT degree didn’t seem like it would be a lot of work, so I was looking for something to do on the side. I had also started exploring what to do after completing the degree, in particular perhaps signing up for a masters or PhD in bioinformatics. And so it was that I stumbled upon the Master of Arts in Peace & Development studies in the postgraduate prospectus. Reading up on the aims and the courses, this coursework-and-dissertation masters looked like it might actually help me answer some questions that had been nagging me since I spent some time in Peru. Before going to Peru, I was a committed pacifist; violence doesn’t solve problems. Then Peru’s Movimiento Revolucionario Túpac Amaru (MRTA) seized the Japanese ambassador’s residence in Lima in late 1996, when I was in Lima. They were trying to draw attention to the plight of the people in the Andes and demanded more resources and investments there. I’d seen the situation there, with its malnutrition, limited potable water, and limited to no electricity, in stark contrast to the coastal region. The Peruvians I spoke to did not condone the MRTA’s methods, but the group had a valid point, or so went the consensus. Can violence ever be justified? Maybe violence could be justified if all else had failed in trying to address injustices? If it is used, will it lead to something good, or merely set up the next cycle of violence and oppression?

I clearly did not have a Bachelor of Arts, but I had done some courses roughly in that area in my degree in Wageningen, and a range of extra-curricular activities. Perhaps that, and more, would help persuade the selection committee? I put it all in detail in the application form, in the hope it would make it look like I could pull this off and thus increase my chances. I was accepted into the programme. Yay. Afterwards, I heard from one of the professors that it had been an easy decision, “since you already have a Masters degree, of science, no less”. This door, too, was opened thanks to that first degree, paid for by the state merely because I qualified for tertiary education. The money to pay for this study came from my savings and the severance package from Eurologic. I had earned too much money in industry to qualify for state subsidy in Ireland; fair enough.

Doing the courses, I could feel I was missing the foundations, both regarding the content of some established theories here and there and in how to tackle things. By that time, I was immersed in computing, where you break things down into smaller sub-components, and that systematising is also reflected in the reports you write. My essays and reports have sections and subsections and suitably itemised lists—Ordnung muss sein. But no, we’re in a fluffy humanities space, and it should have been ‘verbal diarrhoea’. That was my interpretation of some essay feedback I had received, which claimed that there was too much structure and that it should have been one long piece of text without a visually identifiable beginning, middle, and end. That was early in the first semester. A few months into the programme, I realised that the only way I’d be able to pull off the dissertation was to drag the topic as much as I could into an area I was comparatively good at: modelling and maths.

That is: to stick with my disciplinary indoctrinations as much as possible, rather than fully descend into what to me still resembled mud and quicksand. For sure, there’s much more to the humanities than meets the average scientist’s eye, and I gained an appreciation of it during that degree, but that does not mean I was comfortable with it. In addition, for the thesis topic, there were still those questions about the ‘terrorists’ I wanted answered. Combine the two, and voilà, my dissertation topic: applying game theory to peace negotiations in the so-called ‘terrorist theatre’. Prof. Moxon-Browne was not only a willing but also an eager supervisor, and a great one at that. The fact that he could not wait to see my progress was a good stimulus to work and achieve that progress.

In the end, the dissertation had some ‘fluffy’ theory, some mathematical modelling, and some experimentation. It looked into three-party negotiations, in contrast to the common zero-sum approach in the literature: the government and two aggrieved groups, one politically oriented and the other violent. For instance, in the case of South Africa, the Apartheid government on the one side and the ANC and the MK on the other; in the case of Northern Ireland, the UK/Northern Ireland government, Sinn Féin, and the IRA. The strategic benefit of who teams up with whom during negotiations, if anyone, depends on their relative strength: mathematically, in several identified power-dynamic circumstances, an aggrieved participant could obtain a larger slice of the pie for the victims outside a coalition than within one, and the desire, or not, for a coalition among the aggrieved groups depended on their relative power. This deviated from the widespread assumption at the time that the aggrieved groups should always band together. I hoped it would still be enough for a pass.
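The flavour of that result can be illustrated with a deliberately invented toy example (the numbers and coalition values below are mine, not the dissertation’s): if the concession a government makes to a joint bloc is smaller than the sum of what each aggrieved group can extract on its own, then, in cooperative-game terms, the characteristic function is not superadditive, and banding together is not the best strategy.

```python
# Toy characteristic-function game: v(S) is the share of the pie that
# coalition S can extract from the government. All values are invented
# purely to illustrate non-superadditivity.
v = {
    frozenset(): 0,
    frozenset({"political"}): 40,              # credible moderate negotiator
    frozenset({"violent"}): 25,                # concessions to end the violence
    frozenset({"political", "violent"}): 55,   # joint bloc, tainted by association
}

separate = v[frozenset({"political"})] + v[frozenset({"violent"})]  # 65
together = v[frozenset({"political", "violent"})]                   # 55
coalition_pays_off = together >= separate                           # False here
```

With other numbers, of course, the coalition can pay off; the point is that it depends on the relative power, which is exactly what the dissertation explored.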

It was awarded a distinction. It turned out that my approach was fairly novel. Perhaps therein lies a retort in favour of the top-up degrees against the ‘do both’ advice I mentioned before: a fresh look at the matter, if not interdisciplinarity or transdisciplinarity. I can see it with the dissertation topics of our conversion Masters in IT students as well. They’re all interesting, and topics that perhaps no disciplinarian would have produced.

***

The final step, then. With a distinction in the MA in Peace & Development in my pocket and a first in the BSc(honours) in Computing & IT at around the same time, what next? The humanities topics were becoming too depressing even with a detached scientific mind—too many devastating problems and too little agency to influence them—and I had worked toward the plan to go into bioinformatics for so many years already. Looking for jobs in bioinformatics, they all demanded a PhD. With the knowledge and experience amassed over the two full degrees, I could do all the tasks they wanted the bioinformatician to do. However, without that PhD, there was no chance I’d make it through the first selection round. Or that’s what I thought at the time. I tried one or two regardless—rejected for lack of a PhD. Maybe I should have applied more widely nonetheless, since, in hindsight, the PhD requirement was the system’s way of saying they wanted someone well-versed in both fields, not someone trained to become an academic, since most of those jobs are software development jobs anyway.

Disappointed that I still couldn’t be the bioinformatician I thought I’d be able to be after those two degrees, I sighed and resigned myself to the idea that, gracious sakes, I’d get that PhD, too, then, and defer the dream a little longer.

In a roundabout way, I ended up at the Free University of Bozen-Bolzano (FUB), Italy. They paid for the scholarship, and there was generous project funding for conference attendance. Meanwhile, in the bioinformatics field, things had moved on from databases for molecular biology to bio-ontologies to facilitate data integration. The KRDB research centre at FUB was into ontologies, but rather from the logic side of things. Fairly soon after I commenced the PhD studies, my supervisor, who did not even have a PhD in computer science, told me in no uncertain terms that I was enrolled in a PhD in computer science, that my scientific contributions had to be in computer science, and that if I wanted to do something in ‘bio-whatever’, that was fine, but I’d have to do it in my own time. Crystal clear.

The ‘bio-whatever’ petered out, since I had to step up the computer science content: I had only three years to complete the PhD. On the bright side, passion comes the more you investigate something. Modelling, with some examples in bio, and ontologies and conceptual modelling it was. I completed my PhD in three years(-ish), fully indoctrinated in the computer science way. Journey completed.

***

I’ve not yet mentioned the design I alluded to at the start of the blog post. It has nothing to do with moving into computer science. At all. Weaving the interior design into the narrative didn’t work well, and it falls under the “vocational training courses whilst employed in various IT jobs” phrase earlier on. The costs of the associate diploma at the Portobello Institute in Dublin? I earned most of it (1200 pounds or so? I can’t recall exactly, but it was somewhere between 1-2K) in a week: we got double pay for working a shift on New Year (the year 2000, no less), and then I volunteered for the double pay for 12h shifts instead of regular 8h shifts for the week thereafter. One week of extra work for an interesting hobby in the evening hours for a year was a good deal in my opinion, and it allowed me to explore whether I liked the topic as much as I thought I might in secondary school. I passed with a distinction and also got Rhodec certified. I still enjoy playing around with interiors, as a hobby, and have given up the initial idea (from 1999) to use IT with it, since tangible samples work fine.

So, yes, I really have completed degrees in science, engineering, and political science straddling into the humanities, and a little bit of the arts. A substantial chunk was paid for by the state (‘full scholarships’), companies chimed in as well, and I paid some of it from my hard-earned money. As to the motivations for the journey: I hope I made those clear, despite cutting out some text in an attempt to reduce the post’s length. (Getting into university in the first place and staying in academia after completing a PhD are two different stories altogether, and left for another time.)

I still have many questions, but I also realise that many will remain unanswered even if the answer is already known to humanity, since life is finite and there’s simply not enough time to learn everything. In any case: do study what you want, not what anyone tells you to study. If the choice is between a study and, say, a down payment on a mortgage for a house, then, if completing the study will give good prospects and relieve you from a job you are not aiming for, go for it—that house may be bought later and be a tad smaller. It’s your life you’re living, not someone else’s.