# More lively Theory of Computation lectures with peer instruction

Although I don’t teach theory of computation these days, I did encounter something that I’m pretty sure of that would have made the lectures more lively compared to my attempts (blogged about earlier here), and deserves some ‘air time’: peer instruction [1].

My database systems, computer literacy, and networks students already encountered it during some of my lectures, but to briefly recap the general idea (pioneered by Eric Mazur): you insert ‘quiz questions’ (i.e., concept tests) into the lectures, students vote on the multiple choice question, then there is a discussion among the students about the answers, and a re-vote. This sounds perhaps too playful for higher education, but it has been shown to improve course grades by 34-45% [1, 2], thanks to, among others, active engagement, focusing on understanding principles, and student discussion involving ‘teaching each other’.

Even if one thinks it can be suitable for some courses but not others (like tough computer science courses), there’s even peer instruction course material for something so abstract as theory of computation, pioneered by Cynthia Lee and available through peerinstruction4cs. And it works [2].

The peer instruction slides for theory of computation and several other courses can be downloaded after registration. This I did, both out of curiosity and for getting ideas how I might improve the quizzes for the networks course I’m teaching, for which no materials are available yet. Here, I’ll give two examples of such questions for theory of computation to give an indication that it is feasible.

The first one is about tracing through an NFA, and grasping the notion that if any trace through the NFA computation ends in a final state with all input read, it accepts the string. The question is shown in the figure below, which I rendered with the AutoSim automata simulator.

Peer instruction quiz question about NFAs.

A and D can be excluded ‘easily’: A has the second trace as [accept] but it rejects, and D has the first trace [reject] but it accepts (which can be checked nicely with the AutoSim file loaded into the tool). The real meat is with options B and C: if one trace accepts and the other rejects, then does the NFA as a whole accept the string or not? Yes, it does; thus, the answer is B.

The other question is further into the course material, on understanding reductions. Let us have ATM and it reduces to MYSTERY_LANGUAGE. Which of the following is true (given the above statement is true):

1. a) You can reduce to MYSTERY_LANG from ATM.
2. b) MYSTERY_LANG is undecidable.
3. c) A decider for MYSTERY_LANG (if it exists) could be used to decide ATM.
4. d) All of the above

Answering this during the lecture requires more than consulting a ‘cheat sheet’ on reductions. A is a rephrasing of the statement, so that holds (though this option may not be very fair with students whose first language is not English). You will have to remember that ATM is undecidable, and, if visually oriented, the following figure:

Graphical depiction of reduction

So, if ATM is undecidable (the “P1” on the left), then so is MYSTERY_LANGUAGE (the “P2” on the right); for the aficionados: there’s a theorem for that in Chapter 9 of [3], being ‘if there is a reduction from P1 to P2, then if P1 is undecidable, then so is P2’. So B holds. Thus, the answer is probably D, but let’s look at C just to be sure of D. Because ATM is ‘in’ MYSTERY_LANGUAGE, then if we can decide MYSTERY_LANGUAGE, then so can we decide ATM—it just happens to be the case we can’t (but if we could, it would have been possible). Thus, indeed, D is the answer.

Sure, there is more to theory of computation than quizzes alone, but feasible it is, and on top of that, student perception of peer instruction is apparently positive [4]. But before making the post too long on seemingly playful matters in education, I’m going to have a look at how that other game—the second semi-final—will unfold to see how interesting the world cup final is going be.

References

[1] Crouch, C.H., Mazur, E. (2001). Peer Instruction: Ten years of experience and results. American Journal of Physics, 69, 970-977.

[2] Bailey Lee, C., Garcia, S., Porter, L. (2013). Can Peer Instruction Be Effective in Upper-Division Computer Science Courses?. ACM Transactions on Computing Education, 13(3): 12-22.

[3] Hopcroft, J.E., Motwani, R. & Ullman, J.D. (2007). Introduction to Automata Theory, Languages, and Computation, 3rd ed., Pearson Education.

[4] Simon, B., Esper, S., Porter, L., Cutts, Q. (2013). Student experience in a student-centered peer instruction classroom. Proceedings of the International Computing Education Research Conference (ICER’13), ACM Conference Proceedings. pp129-136.

# Facts and opinions about Theory of Computation in Computer Science Curricula

Computing curricula are regularly reassessed and updated to reflect changes in the discipline, and, at the time of writing, the ACM/IEEE curriculum ‘CS2013’—also called the Srawman Draft [1], about which I blogged earlier—is under review. South Africa does not have its own computing organisation that for official accreditation and does not provide official guidelines on quality and curriculum planning, and therefore such international curricula guidelines provide the main guiding principles for curriculum development here. As is known, there’s theory and there’s practice, so a fact-based assessment about what the guidelines say and what is happening in the trenches may be in order. Here, I will focus on assessment of the curriculum with respect to one course in particular: theory of computation (ToC)—consisting of, roughly, formal languages, automata, Turing machines, computability, and complexity—which introduces the mathematical and computational principles that are the foundations of CS, such as the foundations of programming languages and algorithms, and the limits of computation. One may wonder (well, at least I did): How are ToC topics implemented in CS curricula around the world at present? Are there any country or regional differences? What is the sentiment around teaching ToC in academia that may influence the former?

To answer these questions, I examined how it is implemented in computer science curricula around the world, and the sentiment around teaching it by means of two surveys. The first survey was an examination of curricula and syllabi of computer science degrees around the world on whether, and if so, how, they include ToC. The second one was an online opinion survey (about which I reported before). The paper describing the details [2] has been accepted at the 21st Annual Meeting of the Southern African Association for Research in Mathematics, Science, and Technology Education (SAARMSTE’13), and here I will cherry-pick a few salient outcomes. (or go straight to the answers of the three questions)

Syllabi survey

The 17 traditional and comprehensive universities from South Africa were consulted online on offerings of CS programmes and also 15 universities from continental Europe, 15 from the Anglo-Saxon countries, 6 from Asia, 7 from Africa other than South Africa, and 8 from Latin America. It appeared that 27% of the South African universities—University of KwaZulu-Natal, University of South Africa, University of Western Cape, and the University of Witwatersrand—has ToC in the CS curriculum, compared to 84% elsewhere in the world, which is depicted in Figure 1. The regional difference between Europe (93% inclusion of ToC) and the Anglo-Saxon countries (80%) may be an artefact of the sample size, but it deserves further analysis.

Figure 1. Comparison of ToC in the CS curriculum in South Africa and in other countries around the world (FLAT = formal languages and automata theory, i.e., not including topics such as Turing machines computability and complexity).

The level of detail of the ToC course contents as described in the course syllabi online varied (but see below for better data from the opinion survey), and only 43 universities (of the 59 with data) had sufficient information online regarding timing of ToC in their degree programme. Five universities had it explicitly in the MSc degree programme, with the rest mainly in year 2, 3, or 4 (but see below for better data from the opinion survey), and 5 universities have it spread over 2 or more courses, whereas the rest offers it in a single course (sometimes with further advanced courses covering topics such as probabilistic automata and other complexity classes).

Opinion survey

The opinion survey was quite successful, with a response of n=77. 58 respondents had filled in their affiliation, and they were employed mainly at universities and research institutes: 12 respondents gave an South African academic affiliation and thus the majority of respondents were from around the world, including, among others, the USA, UK, Canada, Germany, Italy, Switzerland, Indonesia, China, Cuba, Brazil, and Argentina. A more detailed characterisation of the respondents, as well as the complete (anonymised) raw question answers with percentages, sorted by question (exported from LimeSurvey) are online at http://www.meteck.org/files/tocsurvey/.

There was a near-unanimous agreement (76 of the 77) that ToC should be in programme, 74 (96% of the answers) have it currently in the programme, and 82% had it in their degree programme when they were a student. Overall, the timing when it is taught in the programme has varied little over the years (see Figure 2). Further, for 90% of the responses, ToC is core in the curriculum, and secure in the programme for 86% (only few reasons were provided for “under threat/other”: that it has been removed from some specialisations but not all (in part due to the computer science vs. information systems tensions), or threatened due to low enrolment numbers).

Figure 2. Comparison between when the survey respondents did ToC in their degree and in which year in the programme it is taught at their university (where applicable).

Regarding the topics that should be part of a ToC course, the following is observed. The survey listed 46 topics and for each one, a possible answer [essential/extended/peripheral/no answer] could be given. The complete list of ToC topics ordered on percent ‘essential’ is shown in Table 1. In short: it is perceived decidedly that formal languages, automata, Turing machines, complexity, computability and decidability themes form part of one coherent offering, although the detail of the sub-topics covered may vary.

Table 1. Ordering of the 46 ToC topics, by calculating the percentage of responses that marked it as ‘essential’ out of the given answers.

Given the plentiful anecdotes, hearsay, and assertions in other articles about teaching ToC concerning difficulties with ToC teaching and learning, the survey also included some questions about that. The data provided by the respondents do substantiate the existence of issues to some extent: 44% of the responses answered that there are no issues and everything runs smoothly, or: a slight majority does, which can be subdivided into 32% ‘causes problems in the academic system each year’ and 24% where ‘management/student affairs has gotten used to the fact there are problems’. Several respondents provided additional information regarding the issues, mentioning low pass rates (n=3), that students struggle because they do not see the usefulness of ToC for their career (n=4), that it also depends on the quality of the teacher (n=2), and low enrolment numbers (n=2). For 45%, the first-time pass rates remain below 60% and with 80% of the respondents, the pass rate remains below 80%. The correlation between pass rate and issues is 0.79 (n is to small to draw any conclusions for the other combinations of pass rates, class sizes, extrapolated course content, and having issues).

Discussion

There is much one can discuss about with respect to the data (and more is included in the paper than I cover here in this blog post). Considering the curriculum analysis first, it can be summarized that ToC in CS is solidly in the programme, is oftentimes taught in a single course, and mostly in 2nd and 3rd year of the undergrad CS programme. Interestingly, there is a discrepancy between the ‘essential’ content according to the survey and the newly proposed ACM curriculum guidelines; compare Table 1 with Figure 3.

Figure 3. Proposed CS2013’s ToC topics in the Strawman draft (layout edited). 100% of tier-1 and >80% of tier-2 is core and has to be covered, and an undefined amount of the elective topics to facilitate track-development (no tracks have been defined in the CS2013 draft yet).

Considering the Strawman’s “core” topics, one may question the feasibility of imparting a real understanding of complexity classes P and NP without also touching upon computability and Turing machines. Furthermore, the hours indicated in Figure 3 are meant as minimum hours of fact-to-face lectures (i.e., 8 lessons at a South African university, or at least almost 3 weeks of a standard 16 credit course), which, if this minimum is adhered to, amounts to a very superficial treatment of partial ToC topics. As an aside: my ToC students at UKZN now go through all of it (in accordance with the topics listed in the handbook). Comparing the ‘essentials’ list with the older curriculum guidelines [3, 4], however, one observes that they are much more in agreement.

Quite a bit has changed in the computing arena since the late 1980s, and most notably the specialization and diversification of the field. ToC matters more for CS than other recently recognised specialisations within computing—e.g., software engineering, net-centric computing, information systems, and computational biology—and this diversification is, or has to be, recognised by the curriculum developers [5, 6], which should result in putting more or less weight on the core topics (see [7] for a detailed analysis on sub-disciplines within computing and a proposed weighting of curriculum themes). But recall that the Strawman draft is about (different track within) CS only. The diversification and its effect on computing curricula is noticeable clearly only when one compares it with the Software Engineering curriculum guidelines [8]: these guidelines include only a little bit on finite state machines, grammars, and complexity and computability in the “Data structures and algorithms” and “Discrete structures II” themes. It may be the case that, in praxis, those degree programmes called “computer science” indeed do contain the more fundamental topics, such as ToC (and logic, formal methods etc.), and that other ‘tracks’ actually have been given different names already, hence, would have been filtered out unintentionally a priori in the data collection stage of the curriculum survey.

Concerning issues teaching ToC, on an absolute scale, that 56% faces issues with their ToC courses is substantial, and, conversely, it deserves a comparative analysis to uncover what it is that the other half does so as to not have such issues. Based on the comments in the survey and outside (follow-up emails with survey respondents), there are a few directions: it may help to demonstrate better the applicability of ToC topics in the students’ prospective career, have experienced good teachers, and appropriate preparation in prior courses to increase the pass rates. Further, having issues might be related to the quantity and depth of material covered in a ToC course with respect to nominal course load. The data hints also to another possible explanation: even with a 80-100% pass rate and no low enrolment the ‘gotten used to the issues’ was selected occasionally, and vv., with a 41-60% pass rate that everything runs smoothly, thereby indicating that having issues might also be relative to a particular university culture and expectations of students, academics, and management.

Answers to the questions

Looking again at the questions raised at the start, here are the (short) answers to them:

1. How are ToC topics implemented in CS curricula around the world at present? ToC topics in the actual international curricula are more in line with the older curriculum guidelines of [3, 4] than the more recent versions that put less weight on ToC topics. The timing in the curriculum regarding when to teach ToC remains largely stable and for a majority is scheduled in the 2nd or 3rd year.
2. Are there any country or regional differences? There are country/regional differences, the largest one being that ToC is taught at only 27% of the South African traditional and comprehensive universities versus at 84% of the consulted curricula elsewhere in the world. Even including those SA universities with partial ToC coverage does not make up for the differences with elsewhere in the world or any of the proposed CS curriculum guidelines. Other geographic or language-based differences are not deducible from the data, or: based on the data, region does not matter substantially regarding inclusion of ToC in the CS curriculum, except that the slight difference between Europe and the Anglo-Saxon countries deserves further attention.
3. What is the sentiment around teaching ToC in academia that may influence the former? Opinion on ToC is overwhelmingly in favour of having it in the curriculum, and primarily in the 2nd or 3rd year. Also, a large list of topics is considered to be ‘essential’ to the course, and this list is substantially larger than the recent international curricula Strawman drafts’ core for ToC topics (and more like the Strawman drafts’ total ToC list). Despite noted issues with the course, the voices from the field clearly indicate that ToC is here to stay.

In closing (for now): ToC is solidly in the CS degree programme, and perhaps ought to be introduced more widely in South Africa. And just in case you think something along the line of “well, we have pressing issues to solve in South Africa and no time for follies like doodling DFAs and tinkering with Turing machines”: CS and development of novel and good quality software requires an understanding of ToC topics. For instance, to develop a correct isiZulu grammar checker for text processing software or a parser for natural language processing, scalable image pattern recognition algorithms to monitor wildlife tracks with pictures taken in situ in, say, the Kruger park, an ontology-driven user interface for the Department of Science & Technology’s National Recordal System for indigenous knowledge management, and proper data integration to harmonize and streamline service delivery management, to name but a few application scenarios. Foreigners will not do all this for you (and they have their own problems they want to solve), or only for large consulting fees that otherwise could have been used to, among others, install potable water for the 1.3 million South Africans that don’t have it now, provide them closed toilets, ARV etc.

References

[1] ACM/IEEE Joint Task Force on Computing Curricula. (2012). Computer Science Curricula 2013 Strawman Draft (Feb. 2012). ACM/IEEE.

[2] Keet, C.M. An Assessment of Theory of Computation in Computer Science Curricula. 21st Annual Meeting of the Southern African Association for Research in Mathematics, Science, and Technology Education (SAARMSTE’13). BellVille, South Africa, January 14-17, 2013.

[3] Denning, P.J., Comer, D.E., Gries, D., Mulder, M.C., Tucker, A., Turner, A.J. & Young, P.R. (1989) Computing as a discipline. Communications of the ACM, 32(1), 9-23.

[4] UNESCO-IFIP. (1994). A modular curriculum in computer science. UNESCO and IFIP report ED/94/WS/13. 112p.

[5] Sahimi, M., Roach, S., Cuadros-Vargas, E. & Reed, D. (2012). Computer Science curriculum 2013: reviewing the Strawman report from the ACM/IEEE Task Team. In: Proceedings of the 43rd ACM technical Symposium on Computer Science Education (pp. 3-4). Raleigh, North Carolina, USA, February 29 – March 3, 2012. New York: ACM Conference Proceedings.

[6] Rosenbloom, P.S. (2004). A new framework for computer science and engineering. IEEE Computer, 37(11), 23-28.

[7] ACM/IEEE Joint Task Force on Computing Curricula. (2005). The overview report. ACM, AIS, IEEE-CS, September 30, 2005.

[8] ACM/IEEE Joint Task Force on Computing Curricula. (2004). Software Engineering 2004. ACM, IEEE-CS, August 23, 2004.

# MOOCs, computer-based teaching aids, and taking notes

This post initially started out to be directed toward the current COMP314 Theory of Computation students at UKZN, who, like last year, are coming to terms not only with the subject domain, but also with the fact that I write the lecture notes on the blackboard (which is green, btw). The post eneded up containing some general reflections on the use/non-use of computer-based teaching aids and, in extension, the Massive Open Online Courses (MOOCs), with illustrations taken mainly from the ToC course.

First a note to the students: I am aware most of you don’t really like taking notes during the lectures and quite a few still do not do so—despite that you know that you’re served a summary of the bulky textbook, saving you to summarise it. But those who do make notes, or at least rewrite the notes from someone else or fill up the gaps in their own notes, go much more quickly through the exercises than those who do not. For instance, I occasionally gave as an exercise an example that was done in class on the board already. The diligent note-takers’ response is along the line of “yeah, that was trivial, and, by the way, we did that already in class, here’s the solution. Give me a real challenge!” compared to starting the ‘exercise’ from scratch by those who did not take notes, and who do not even recollect we did it in class. In the end (and by observation, not scientific rigour of the double blind experiment), the former group completes more, and more advanced, exercises in the same or less amount of time. Maybe I should shout that from the rooftops.

Taking notes means you listen, read, and write, therewith processing the knowledge that I’m trying to get from my brain into yours. Doing something actively, versus passively listening (or, worse, letting it go into one ear and immediately release it through the other, or dozing off), makes you think about the material at least once, which then saves time later on because you’ll remember some, if not all, of it. In addition, imagine how fast I could go through the material if I would not have to take the time to write it on the board, but just click the down arrow, and therewith having the opportunity to cover even more material than I already do whereas you would hardly have time to think about the matter at all, let alone ask questions. Besides, note taking is a good exercise in general, because there often won’t be any handouts prepared for you once you’re out there in industry, so you might as well practice for those situations now. (See also Mole’s column on why to chalk it upon the blackboard.)

Nonetheless, unlike, say, 15-20 years ago when we did not know any better than reading the material beforehand and taking notes during the lectures, in the present-day slides-era, quite a few students are not happy with the note-taking effort and ‘quality of the teaching aids provided’, as they are used to the slides in other courses. I distribute the slides, too, for a course like ontologies and knowledge bases, because there is no textbook (I even wrote a 150-pages long set of lecture notes), but for Theory of computation, we use a textbook (Hopcoft, Mottwani, and Ullman) that has the details at undergraduate level. Nor were last year’s students happy with the note-taking. Some students searched online back then, and I vividly remember one student sharing his opinion about that with me, unsolicited: “ma’am, we searched online for better material, but it’s all just as bad as yours! So we won’t hold it against you”. Well, thanks.

Closely related to that ‘searching around online’, however, are possible less pleasant side effects. I will mention two. First, last year some students looked up alternative explanations for the diagonalization language and some things being incomputable, having come across a version of the barbershop paradox. Strictly speaking, it is not an alternative explanation of the same thing; or: if you are going to use that, it is a wrong analogy. I’ll explain why in class in a few weeks when we’ll cover chapters 8 and 9 of HMU.

Second, this year I have seen some non-notes students watching a YouTube video on the pumping lemma and on CFGs during the exercises in the lab; such repeats take up extra time. The note-takers, on the other hand, flicked through their pages in a fraction of the time, hence, were ahead in the exercises simply because they had used the lecture time more effectively. (And I generously assume that what was presented in the YouTube video was correct, which need not be the case; see previous point).

This does not mean there is no room for improvement regarding teaching aids for Theory of Computing, which I’ve written about last year. This year I tried the AutoSim automata simulator and the CFG Grammar Editor again, which still does not have much uptake by the students, afaik, and I’m trying out Gradiance. The Gradiance system is an online MCQs question bank with explanations of the answers and hints in case the answer given was incorrect—i.e., turning the gained experience and knowledge about common conceptual mistakes into an educational feature)—and one can assign homework and grade assignments. The dedication to do the (optional) homework is dwindling since the 5th week into the semester, but the automatic grading and providing the option for self-study for the interested and/or determined-to-pass students are can be great (the system is almost cheat-proof, but not entirely).

To get to the point after some meandering: some types of computer-based teaching aids can be useful, just not all of them all of the time. True, computing looks at what can be automated, and what can be automated efficiently, and so I could apply that notion to everything—up to totally automated course offerings, which is the direction that the MOOCs are going. However, computing also concerns solving problems, and being able to recognise when a problem is best solved by computation, and when other solutions may be more appropriate. For instance, low pass rates may be considered to be a problem, but this does not imply that e-learning is the solution to that; non-determinism and epsilon-transitions are concepts that are apparently not easy to grasp, and the simulator is more illustrative than my coloured chalk trying to simulate a run on the blackboard; in the pre-CMS era, course admin was a chore and there were, perhaps, instances of ‘lost’ assignment or project submissions (though that did not happen when I was a student in the ‘90s), which Moodle alleviates and prevents, respectively. So, software can indeed solve some problems.

This brings me to the other end of the spectrum: the Massive Open Online Courses, or MOOCs; yes, there is even a MOOC for Theory of Computation, by Ullman himself. Which problems do they purport to solve, and do they? Despite reading a lot of pop-articles about it over the past year (see, e.g., the feature series from the Chronicle of Higher Education on MOOCs), it still is not clear to me with respect to the solutions for lecturing issues. One recurring argument in the flurry of news articles is that it is great [to/because it] give[s] the poor sods around the world some crumbs from the elite universities; well, the ‘poor’ with a good Internet connection and the money to pay for the data transfer, that is, not Joe and Joanne Soap in Chesterville, Soweto, etc., and the interested potential student has to have a good command of the English language as well.

Another recurring argument with the MOOCs is that one can learn from the best, and, en passant, implying, or even explicitly stating, that the MOOC lecturers are assumed to be better teachers by definition than anyone else. Sure, there are lecturers who teach stuff that is wrong, but how widespread is that, and what is the cause of that? If anyone has hard data about such claims, please let me know. For the sake of argument, let’s assume mistakes are widespread, and that it is because we, as non-elite university lecturers, are undereducated and incompetent teachers. Is a MOOC the solution? Or maybe giving lecturers the time to learn their material and prepare the lectures better, i.e., local capacity building? Not all of us are undereducated. People who have taught a course many times, do research in that field, and possibly even have written textbooks on the topic, tend to be better lecturers, because they generally have reflected more on the teaching and already have come across all possible conceptual mistakes the students make, can anticipate it, and therewith even prevent that from happening at least to a larger extent than a novice lecturer; those people are not all and only at Stanford, Harvard, and MIT.

Second, the MOOCs—for the time being, at least—are based on a push-mechanism of knowledge transfer, yet lecturing consists of much more than talking at the front of the classroom. There is interaction with questions and answers, there is context, and so on. For instance, regarding motivational context, to introduce the data mining in a database course with a story about the nearby Pick ‘n Pay at Westwood mall that the on-campus students go to, and, once signed up for their customer loyalty card, how PnP will find out you’re a student even if you did not say so on the application form. Or the problems with Johannesburg’s integrated services delivery management system as a real life example of data integration issues and the urgent need to have educated South African computer scientists to solve them. Or, given that UKZN’s CS student have had a lot of java programming in the first and second year, to show them the java language as a CFG in the theory of computation course. In addition, sometimes to the pleasant and sometimes frightened, surprise of my students, I actually do know about half of the roughly 70 registered students by name, and know who can be prodded into action only when an exercise is marked with ‘super hard’, who does not want to know the answer straight away but just a little hint to move on him/herself to find the solution, the determined to solve it on their own, the insecure who needs a bit of encouragement, and so on.

To have a MOOC suit you, you would have to have done the same prior courses—unless the MOOC is only an introduction to topic x—, you would not care about the absence of a few context/motivational stories, your learning style has to match with that of the MOOC, and you have to be very disciplined on your own. To name but a few potential hurdles in the positive light.

Further, there are the exercises and tests. There are some new tools for the automatic grading of exercises and assignments (Gradiance would fit here, I suppose, but not the regular textbook exercises), or semi-automatic with software+TA, and virtual ‘MOOC study groups’ are popping up in social media. If you don’t know any better, it is probably great. Like thinking pizza is tasty all around the world—until you experience how it tastes in Naples. Such online groups are not bad—I participated in it myself when I was studying at the Open University UK, and it is better than no contact with other students at all—but it does not compare with the face-to-face meetings with fellow students, where the lectures are discussed, notes compared and brushed up, exercises discussed and solved, peer-explanation happens, students motivating one another, and so on.

Overall, then, if MOOCs are going to become the standard, the world will be poorer for it. For sidelining competent lecturers, de-incentivising weaker lecturers from acting on their responsibility to brush up their knowledge and skills, de-contextualising and hamburgerising courses, impoverishing the academic learning environment by narrowing down education to a mere push-mechanism of knowledge transfer, and dehumanizing students into boring conformity (eenheidsworst in Dutch). Add to that mix the cultural imperialism, and we are well on our way to a ‘brave new world’.

In the meantime, participate in the lectures and process the information, and take notes. I don’t think MOOCs will kill the regular universities, but imagine if you really were the last generation to go to (or work at) a real university… Exploit the advantages that a face-to-face university offers you, and cherish it while it lasts!

# Theory of Computation topics mostly as elective in the planned ACM/IEEE CS2013 curriculum

As part of the intermittent research I have taken up on teaching Theory of Computation, I have been reading a few papers on curriculum development for Computer Science. As it happens, there is currently a draft—the so-called ‘Strawman draft’—up for discussion until July 15 in this forum. Upon community input, that draft may be updated and it will become the new version of ACM curriculum guidelines for CS (and its variants) in 2013; see [1] for a 2-page context.

One thing that may surprise the CS researcher and professor who is in the system since a while, is that since the Computing Curricula CC2001 version, and carried over in the Strawman draft, most of automata is elective, only a little complexity is core, and, e.g., Turing machines is elective—which was unlike the seminal first version that described computing as a discipline and principal curriculum topics in 1989 [2] that did have formal languages, automata, Turing machines, complexity and computability in the core. More precisely, the CS2013 proposal is as follows (layout edited):

AL/Basic Automata Computability and Complexity

[3 Core-Tier1 hours, 3 Core-Tier2 hours]

Topics:

[Core-Tier1] Finite-state machines, Regular expressions, The halting problem

[Core-Tier2] Context-free grammars (cross-reference PL/Syntax Analysis), P vs NP (tractable and intractable problems), Definition of P, NP, and NP-complete, Exemplary NP-complete problems (e.g., SAT, Knapsack)

[Elective]

Topics: Review definitions of the classes P and NP; introduce EXP, NP-completeness (Cook’s theorem), Classic NP-complete problems, Reduction Techniques,

AL/Advanced Automata Theory and Computability

[Elective]

Topics: Sets and languages, Regular languages (Review of deterministic finite automata (DFAs), Nondeterministic finite automata (NFAs), Equivalence of DFAs and NFAs, Review of regular expressions; their equivalence to finite automata, Closure properties, Proving languages non-regular via the pumping lemma or alternative means), Context-free languages (Push-down automata (PDAs), Relationship of PDAs and context-free grammars, Properties of context-free languages), Turing machines, or an equivalent formal model of universal computation, Nondeterministic Turing machines, Chomsky hierarchy, The Church-Turing thesis, Computability, Rice’s Theorem, Examples of uncomputable functions, Implications of uncomputability,

Aside from pondering how one can do DFAs and regular expressions but not also NFAs (to name but one), this list contrasts quite markedly with the one from the survey I conducted earlier, which I blogged about and whose results are online. One of the questions of the survey was whether something like a theory of computation course should be part of a CS curriculum, which was responded with a near-unanimous “yes”, in contrast with the CC2001 and the Strawman report. Another question was about which topics should be covered, where a list of 46 topics was provided and respondents could add more. For each topic, they had to select it as being ‘essential’, suitable for an ‘extended version’ of a ToC course, or as ‘peripheral/may be skipped’. Summarizing the full computed preferences of the topics, we see a different picture than the categorization in the Strawman Draft, above; the top-20 topics with their corresponding percentage where the respondents answered with ‘essential’ are:

Regular expressions 98.55, Deterministic Finite Automata (DFA) 95.71, The Turing Machine (basics) 94.37, Context-free Grammars (definition, ambiguity, simplification, derivations) 94.20, Non-Deterministic Finite Automata (NFA, epsilon-NFA) 88.41, Equivalences & conversion RE and automata 85.29, Problems a computer cannot solve 85.29, Halting problem 82.81, Properties of Regular languages 80.30, Regular grammars (RG) 80.00, Examples of some undecidable problems/languages 78.13, Church-Turing thesis 77.94, Computability and decidability 76.81, Equivalences & conversion DFA, NFA, epsilon-NFA 73.85, P, NP, NP-complete, NP-hard, co-NP 73.53, Universal Turing Machine 72.06, Undecidability 68.57, Pumping lemma for Regular languages 68.18, Some well-known NP problems (e.g., TSP, SAT, Node cover) 68.18, Chomsky normal form, hierarchy 67.16.

I understand that not everything can go in the curriculum, as there are more and more topics that ought to be in a CS curriculum whilst the study programme time remains the same, but the rationale behind the Strawman Draft’s lists on ToC topics eludes me. Meanwhile, my COMP314 students will get a full plate of ToC in the upcoming semester (including, among others, Turing Machines, and they will be reminded of Turing’s centenary this year [as an aside: even Nature finally recognized something from Computer Science [3], honouring it with a special, too]).

References

[1] Mehran Sahami, Ernesto Cuadros-Vargas, Steve Roach, David Reed. Computer Science Curriculum 2013: Reviewing the Strawman Report from the ACM/IEEE-CS Task Force. SIGCSE’12, February 29–March 3, 2012, Raleigh, North Carolina, USA.

[2] Peter J. Denning, Douglas E. Comer, David Gries, Michael C. Mulder, Allen Tucker, A. Joe Turner, and Paul R. Young. Computing as a Discipline. Communications of the ACM, 1989, 32(1): 9-23.

[3] Chouard, T. Turing at 100: Legacy of a universal mind. Nature, 2012, 482 (7386), 455-455.

# Preliminary results of the Theory of Computation survey

As you may remember from the post on making Theory of Computation (ToC) more lively, I taught ToC for the first time last year at UKZN, where it also was a new core course in the CS degree programme, i.e., the students and the system also had to get used to ToC. As usual, anything can be improved upon (if you think not: look harder; they always can, at least in theory). To commence with that in a solid way, we’ve decided first to collect some data to go beyond the familiar anecdotes.

Internationally, many stories make the rounds through the grapevine about ToC. Those stories revolve around, among others, it being a difficult subject for the students, low pass rates, the course being threatened from being removed from a the programme, and textbooks becoming out of print (e.g., Pearson does not want to make reprints of Hopcropft, Mottwani & Ullman’s book unless they get single orders for more than 300 books, according to their rep for SA).

While the individual stories are true, how prevalent are they really?  How widespread are ‘low pass rates’, and when is it ‘low’? What are the enrollment numbers elsewhere? Do they have problems in the university system? It being a new course in the programme here as a result of merging a 16 credit Formal Languages & Automata Theory and a 16 credit Algorithms & Complexity, what topics are really essential in a ToC course? Should it be a core course, and if so, in which year of the programme?

These are some of the questions we were curious about as to what the answers would be. To find out, there’s a (still ongoing) survey of ToC syllabi at the various universities around the world and an opinion-survey to obtain data that cannot be found by just looking at syllabi, but concern the context around ToC, like enrollment numbers, pass rates, whether it should be in the programme vs. actually in the programme, and so on. The opinion-survey was open from 16 March to 1 April (accessible here), and I’ve put the preliminary results online, as promised in the announcement. (A paper summarizing the results and integrating it with the results of the syllabi-survey is in the pipeline, but somehow it struck a chord, and relatively many survey respondents wanted to know the results and all the details can’t go in the page-limited paper anyway).

In total, there were 77 people—mainly academics—who completed the survey, mostly from outside SA and covering all continents of the world. There’s the survey setup, results in digested format, discussion, and conclusions, as well as the raw data with aggregated numbers by question answer, and the list of ToC topics ordered by being essential. In short: The survey responses show an overwhelming agreement that ToC should be taught and a majority prefers to have it in the 2nd or 3rd year in an undergraduate programme. It is taught at most of the institutions that the respondents are affiliated with, and the course is mostly solidly in the programme as a core course. About half of the respondents note there are issues with the course, for various reasons, including, but not limited to, low pass rates and low enrollment. Roughly half observe first-time pass rates below 60%, and for only 20% the pass rate exceeds 80%. Whilst noting that several respondent spread ToC content over more than one course or integrate it with other courses, there is agreement on the typical topics that are considered as essential to ToC, covering regular and context-free languages (and grammars), automata (at least DFA, NFA, epsilon-NFA), Turing machines, undecidability, computability and complexity, where the subtopics covered vary a bit.

Several respondents also gave additional feedback and opinion via email. In case you would like so, too, drop me a line, or add it in the comments section here on the blog.

# On making a Theory of Computation course more lively

Some 71 BSc in CS students at UKZN encountered the joys of automata, regular expressions, context free grammars, Turing machines, undecidability, and intractability. I’ll be teaching theory of computation again next year, and am looking for more ways to brighten up the material further in an attempt to increase the amount of students’ interest in the topic. The list below made into the course material only partially, but I intend to include more of it next year (given that, looking back at the time I was a student, I sure would have enjoyed the course more if at least some of this had made it into the course material back then).

The textbook for the course is Hopcroft, Motwani and Ullman’s Introduction to Automata Theory, Languages, and Computation from Stanford University [1]. Their slides and well as Diego Calvanese’s notes from the Free University of Bozen-Bolzano were useful in the preparation, but this is still fairly dry matter. To make it livelier, I had looked around for illustrative simulators, games and the like.

The nicest one I found when preparing the course are the AutoSim automaton simulator (which generated a lukewarm response, at best) and the CFG Grammar Editor made by Carl Burch at Hendrix College, but I may have missed even better ones. It is sort of entertaining for the answers to the exercises of chapters 2 and 5 of the Ullman text book, and in due time I’ll post them here. Recently, JFLAP came to my attention, developed at Duke University, which has lots of more features compared to AutoSim, like an implemented state minimization algorithm and conversions between DFAs NFAs and REs. It also has an accompanying textbook, but, alas, that book only covers languages and automata, not the undecidability and complexity that is also part of the course. There are many simulators of Turing machines around, from Java implementations to physical simulators (see, e.g., this list), with the Lego of Doom, developed by four students at Århus University, as most entertaining, but not exactly classroom material, and there are nice illustrative videos of the Turing Machine in the classic style by Mike Davey. If you have any suggestions, please leave them in the comments.

The intention was to brighten up the second part of the course also with some games and puzzles as examples, in addition to the standard examples of the Travelling Salesman Problem, Knapsack, Clique, SAT etc. The clique generated some interest, as it could be effectively related to Facebook (the students form a clique themselves: they have a facebook page for the course). But, overall, there wasn’t enough time to go into sufficient detail. The plan was to revisit Sudokus, which was a 2nd-year programming exercise but also happens to be NP-complete [2], like so many other games (among others: Solitaire, FreeCell, Same Game/Clickomania, etc.—mentioning this generated interest); check out, e.g., Wikipedia’s list of game complexity of 34 games, the separate list of NP-complete games and puzzles and PSPACE-complete ones and David Eppstein’s list with complexity of puzzles, solitaire and two-player games. The games generated interest. Then there are non-game fun problems, like the new Pancake Flipping Problem (sorting a stack of pancakes with one spatula) that is NP-hard [3], and certain ways of balloon twisting with equal length balloons is NP-complete [4]—both brought under my attention thanks to the Improbable Research blog. There are many more NP-complete problems that may (should?) interest the students, like more serious ones about efficiently allocating blood in a blood bank, timetabling etc. Likewise for EXPTIME problems, such as reasoning over UML Class Diagrams [5], which it did, to some extent. The Complexity Zoo, however, seemed to scare them off.

Even inclusion of some more context, old and new, may brighten it up, like the material about Alan Turing, Chomsky’s original paper [6] (tried this, didn’t generate interest), where the naming of NP-hard comes from [7] by Donald Knuth, an additional note on P and NP by Moshe Vardi [8], and blog posts about “Big problems with big iron”—or: the main gains are to be made in improving the algorithms, not the hardware—by Richard Lipton, or the SEP entry on the Church-Turing Thesis [9] for the philosophically-minded (tried this, without response).

For those of you studying or teaching a theory of computation course: this year’s midterm, quiz, and exam (with solutions!), can be downloaded from my website here, which might be of use.

References

[1] Hopcroft, J.E., Motwani, R. and Ullman, J.D. (2001/2007). Introduction to Automata Theory, Languages, and Computation, 2nd/3rd ed., Pearson Education.

[2] Takayuki Yato and Takahiro Seta. 2003. Complexity and Completeness of Finding Another Solution and Its Application to Puzzles. IEICE Trans. Fundamentals, E86-A (5):1052-1060.

[3] Laurent Bulteau, Guillaume Fertin and Irena Rusu. Pancake Flipping is Hard.  arXiv:1111.0434v1. (Submitted on 2 November 2011).

[4] Erik D. Demaine and Martin L. Demaine and Vi Hart. Computational balloon twisting: the theory of balloon polyhedra. In: Proceedings of the 20th Canadian Conference on Computational Geometry (CCCG 2008), Montréal, Québec, Canada, August 2008.

[5] Daniela Berardi, Diego Calvanese, and Giuseppe De Giacomo. Reasoning on UML class diagrams. Artificial Intelligence, 168(1-2):70-118, 2005.

[6] Chomsky, N. Three models for the description of language. IRE Transactions on Information Theory, 1956, (2): 113–124.

[7] Donald E. Knuth. A Terminological Proposal. SIGACT News, January 1974, pp13-18.

[8] Moshe Y. Vardi. On P, NP, and Computational Complexity. Communications of the ACM. Vol. 53 No. 11, Page 5, November 2010.

[9] Copeland, B. Jack. The Church-Turing Thesis. The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/fall2008/entries/church-turing/&gt;.