Countless articles have announced the death of symbolic AI, which includes, among others, ontology engineering, in favour of data-driven AI with deep learning, even more loudly so since large language model-based apps like ChatGPT have captured the public’s attention and imagination. There are those who don’t even realise there is more to AI than deep learning with neural networks. But there is; have a look at the ACM Computing Classification or scroll down to the screenshots at the end of this post if you’re unaware of that. With all the hype and narrow focus, doom and gloom is being predicted, with a new AI winter on the cards. But is it? It’s not like we all ditched mathematics at school when portable calculators became cheap gadgets, so why would we ditch the rest of AI now that there are machine and deep learning, Large Language Models (LLMs), and an app that attracts attention? Let me touch upon a few examples to illustrate that ontologies have not become obsolete, nor will they.
How exactly do you think data integration is done? Maybe ChatGPT can tell you what’s involved, superficially, but it won’t actually do it for you. Consider, for instance, a paper published earlier this month, on finding clusters of long Covid patient symptoms [Reese23], described in a press release: they obtained data of 20,532 relevant patients from 38 (!!) data partners, where the authors mapped the clinical findings taken from the electronic health records “to computable terms contained in the Human Phenotype Ontology (HPO), a standard framework for describing human traits … This allowed the researchers to analyze the data across the entire cohort.” (italics are mine). Here’s an illustration of the idea:
Could reliable data integration possibly be done by LLMs? No, not even in the future. NLP with electronic health records is an option, true, but it won’t harmonise terminology for you, nor will it integrate different electronic health record systems.
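To make the term-harmonisation step a bit more concrete, here is a minimal sketch, and emphatically not the pipeline used in [Reese23]. It assumes each data partner has its own local vocabulary and that someone has curated a mapping from those local terms to shared ontology term identifiers. The site names, findings, and the `EX:` identifiers are all made-up placeholders (they are not real HPO identifiers), and `harmonise` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical curated mappings from each site's local vocabulary to shared
# ontology term IDs. The "EX:" identifiers are placeholders, not real HPO IDs.
SITE_TO_ONTOLOGY = {
    "site_a": {"tiredness": "EX:0001", "SOB": "EX:0002"},
    "site_b": {"fatigue": "EX:0001", "shortness of breath": "EX:0002"},
}


def harmonise(records):
    """Group patient IDs by shared ontology term; collect unmapped findings."""
    patients_by_term = defaultdict(set)
    unmapped = []
    for site, patient_id, finding in records:
        term = SITE_TO_ONTOLOGY.get(site, {}).get(finding)
        if term is None:
            # No curated mapping: flag it for a human rather than guess.
            unmapped.append((site, patient_id, finding))
        else:
            patients_by_term[term].add(patient_id)
    return dict(patients_by_term), unmapped


# Toy records: (site, patient ID, locally recorded finding).
records = [
    ("site_a", "a1", "tiredness"),
    ("site_b", "b7", "fatigue"),
    ("site_b", "b9", "shortness of breath"),
    ("site_a", "a2", "brain fog"),
]
cohort, unmapped = harmonise(records)
```

Once “tiredness” at one site and “fatigue” at another point to the same term, the patients can be analysed as one cohort. Anything without a mapping is reported instead of silently guessed at, which is the difference with probabilistic matching.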
LLMs aren’t good at playing with data in the myriad of ways where ontologies are used to power ‘intelligent’ applications. Take data generated in the automation of scientific experiments, for instance: cell types in the brain need to be annotated and processed to try to find new cell types, and annotations with those new types are then added, which are used downstream in queries and further analysis [Tan23]. There is no new stuff in off-the-shelf LLMs, so they can’t help; ontologies can – and do. Ontologies are used and extended as needed to document the new ground truth, which won’t ever be replaced by LLMs, nor by the approximations that machine learning’s outputs are.
What about intelligent analysis of real-time data? Those LLMs won’t be of assistance there either. Take, e.g., energy-optimised building systems control: the system takes real-time data that is linked to an ontology and then it can automatically derive energy conservation measures for the building and its use [Pruvost22].
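As a rough illustration of what ‘data linked to an ontology’ buys you, consider the toy sketch below. It is not the system of [Pruvost22]; the class hierarchy, zone names, sensor readings, the single rule, and the 21 °C setpoint are all assumptions made up for this example. The point is only that a rule stated once for `Room` also applies to anything the hierarchy says is a kind of room.

```python
# Toy class hierarchy (child -> parent); all names are hypothetical.
SUBCLASS_OF = {
    "MeetingRoom": "Room",
    "Office": "Room",
    "Room": "Zone",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def suggest_measures(zone_types, readings, setpoint_c=21.0):
    """Suggest lowering heating in unoccupied rooms warmer than the setpoint."""
    measures = []
    for zone, reading in readings.items():
        if (is_a(zone_types[zone], "Room")
                and not reading["occupied"]
                and reading["temp_c"] > setpoint_c):
            measures.append(f"lower heating setpoint in {zone}")
    return measures


# Each zone is typed with a class from the hierarchy above.
zone_types = {"R101": "MeetingRoom", "R102": "Office", "Atrium": "Zone"}
# A snapshot of made-up sensor readings.
readings = {
    "R101": {"occupied": False, "temp_c": 23.5},
    "R102": {"occupied": True, "temp_c": 23.0},
    "Atrium": {"occupied": False, "temp_c": 24.0},
}
measures = suggest_measures(zone_types, readings)
```

Here only the empty meeting room triggers the rule: the office is occupied, and the atrium is a zone but not a room.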
Much has been written on ChatGPT and education. It’s an application domain that permits no mistakes on the teaching side of it and, in fact, demands vetted quality. There are many tasks, from content presentation to assessment. ChatGPT can generate quiz questions, indeed, but only on general knowledge. It can generate a response as well, but whether that will be the correct answer is another matter altogether. We also need other types of educational questions besides MCQs, in many disciplines, on specific texts and textbooks with their particular vocabulary, and have the answer computed for automated marking. Computing correct questions and answers can be done with ontologies and some basic automated reasoning services [Raboanary22]. One obtains precision with ontologies that cannot be had with probabilistic guessing. Or take the Foundational Model of Anatomy ontology as a concrete example, which is used to manage the topics in anatomy classes augmented with VR [Soergel22]. Ontologies can also be used as a method of teaching, in art history no less, to push students to dig into the details and be precise [Bertens22] – the opposite of the bland, handwavy, roughly, sort of, non-committal, and fickle responses ChatGPT provides, at times, to open questions.
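To give a flavour of computing both question and answer from an ontology, here is a toy sketch. It is not the approach of [Raboanary22]: it assumes a tiny made-up taxonomy, one question template, and subsumption checking by walking up the hierarchy as a stand-in for an automated reasoner. All class names and the helper functions are hypothetical.

```python
from itertools import permutations

# Toy taxonomy (child -> parent); made up, not taken from any real ontology.
SUBCLASS_OF = {
    "Lion": "Carnivore",
    "Carnivore": "Animal",
    "Zebra": "Herbivore",
    "Herbivore": "Animal",
}


def ancestors(cls):
    """All direct and indirect superclasses of cls."""
    result = set()
    parent = SUBCLASS_OF.get(cls)
    while parent is not None:
        result.add(parent)
        parent = SUBCLASS_OF.get(parent)
    return result


def yes_no_questions():
    """Yield (question, answer) pairs; the answer is derived, not guessed."""
    classes = sorted(set(SUBCLASS_OF) | set(SUBCLASS_OF.values()))
    for sub, sup in permutations(classes, 2):
        question = f"Is every {sub} also a kind of {sup}?"
        yield question, sup in ancestors(sub)


qa = dict(yes_no_questions())
```

Every answer follows from the hierarchy, including the ones that were never stated directly, such as every lion being an animal. That is what makes automated marking possible.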
They’re just a few application examples that I lazily came across in the timespan of a mere 15 minutes (including selecting them) – one via the LinkedIn timeline, others via a Google Scholar search on “ontologies” with a “since 2022” filter (17,300 results this morning) and clicking a few links that sounded appealing, and one I’m involved in.
This post is not a cry of desperation before sinking, but, rather, mainly one of annoyance. Technology blinkers of any kind are no good and one had better have more than just a hammer in one’s toolbox. Not everything can be solved by LLMs and deep learning, and Knowledge Representation (& Reasoning) is not dead. It may have been elbowed to the side by the new kids on the block. I suspect that those in the ‘symbolic AI is obsolete’ camp simply aren’t aware – or would like to pretend not to be aware – of the many different AI-driven computing tasks that need to be solved and implemented. Tasks for which there are no humongous amounts of text or non-text data to grab and learn from. Tasks that are not tolerant of outputs that are noisy or plain wrong. Tasks that require current data, not stale stuff that is a year old or older. Tasks where past data are not a good predictor for the future. Tasks in specialised domains. Tasks that are quirky to a locale. And so on. The NLP community has already recognised that LLMs’ outputs need fixing, which I was pleasantly surprised by when I attended EMNLP’22 in December (see my EMNLP22 trip report for a few pointers).
Also, and casting the net a little wider, our academic year is about to start, where students need to choose projects and courses, including, among others, another installment of ontology engineering, of logic for AI, Computer Vision, and so on. Perhaps this might assist in choosing and in reflecting that computing as a whole is not going to be obsolete either. ChatGPT and GitHub Copilot can probably pass our 1st-year practical assignments, but there’s so much more computing beyond that, which relies on students understanding the foundations and problem-solving methods. Why should the whole rest of AI, and even computing as a discipline, become obsolete the instant a tool can, at best, regurgitate the known coding solutions to common basic tasks? There are still mathematicians notwithstanding all the devices more powerful than a pocket calculator and there are linguists regardless of the free availability of Google Translate’s services; so why would software engineers not remain when there’s a code-completion tool for basic tasks?
Perhaps you still do not care about ontologies and knowledge representation & reasoning. That’s fine; everyone has their interests – just don’t mistake new interests for the obsolescence of established topics. In case you do want to know more about ontologies and ontology engineering: you may like to have a look at my award-winning open textbook, with exercises, tools, and slides.
p.s.: here are those screenshots on the ACM classification and AI, annotated:
[Bertens22] Bertens, L. M. F. Modeling the art historical canon. Arts and Humanities in Higher Education, 2022, 21(3), 240-262.
[Pruvost22] Pruvost, Hervé and Olaf Enge-Rosenblatt. Using Ontologies for Knowledge-Based Monitoring of Building Energy Systems. Computing in Civil Engineering 2021. American Society of Civil Engineers, 2022, pp. 762-770.
[Raboanary22] Raboanary, T., Wang, S., Keet, C.M. Generating Answerable Questions from Ontologies for Educational Exercises. 15th Metadata and Semantics Research Conference (MTSR’21). Garoufallou, E., Ovalle-Perandones, M-A., Vlachidis, A (Eds.). 2022, Springer CCIS vol. 1537, 28-40.
[Reese23] Reese, J. et al. Generalisable long COVID subtypes: findings from the NIH N3C and RECOVER programmes. eBioMedicine, Volume 87, 104413, January 2023.
[Soergel22] Soergel, Dagobert, Olivia Helfer, Steven Lewis, Matthew Wysocki, David Mawer. Using Virtual Reality & Ontologies to Teach System Structure & Function: The Case of Introduction to Anatomy. 12th International Conference on the Future of Education 2022. 1 July 2022.
[Tan23] Tan, S.Z.K., Kir, H., Aevermann, B.D. et al. Brain Data Standards – A method for building data-driven cell-type ontologies. Scientific Data, 2023, 10, 50.