Teaching Programming to Liberal Arts Students:Using Loop Invariants

David Arnow
Department of Computer and Information Science
Brooklyn College City University of New York
Brooklyn, New York 11210
e-mail: arnow@sci.brooklyn.cuny.edu

ABSTRACT: Loop invariants have long been present in advanced undergraduate and graduate courses on programming methodology or program correctness. Recently there has been an increased interest in using loop invariants in teaching more elementary courses. In this paper, its successful use in teaching elementary programming in a computer literacy course for non-majors is described. The techniques described here, that are necessary in order to work successfully with this population, are also applicable to the teaching of programming to computer science majors.

Publication Information: This Brooklyn College Technical Report was presented at the SIGCSE Technical Symposium during ACM Week 1994 and appeared in the conference proceedings for that symposium: Teaching Programming to Liberal Arts Students: Using Loop Invariants. Proceedings of the 25th SIGCSE Technical Symposium, Phoenix, Arizona, March 1994

Acknowledgment of Support: This work was made possible by an undergraduate curriculum development grant from the National Science Foundation , CISE Division (USE-9150719).

Prologue

In this paper I describe one of several outcomes of an NSF-sponsored curriculum initiative at Brooklyn College. This initiative concerns the teaching of computer science to the liberal arts student who will not pursue technical and scientific studies. Often this kind of teaching is called "computer literacy". At the outset of the funding period, a successful pilot course had been running at Brooklyn College for several semesters. The design of this computer literacy course course is described in detail elsewhere [Arnow, 1991]. It was one of several efforts to design computer literacy courses with a principles-oriented approach [Myers, 1989]. The course is part of Brooklyn College's well-known Core Curriculum, a set of 10 required general education courses [Cheney, 1989]. Because the course is mandated by this curriculum to address mathematical literacy as well, and because the course designer has views similar to those expressed by Myers [1990], the first third of the course is a study of elementary propositional logic.

Only a small subset of Pascal is taught: readln, writeln, assignment and the while statement. Ultimately the students write simple programs that have a flavor of primitive recursion theory: programs that add by repeated incrementing, that multiply by repeated addition, divide by repeated subtraction, that exponentiate by repeated multiplication and so forth. The appropriate conclusions are drawn in class.

The project that was funded sought to further develop the curriculum by achieving a greater integration of the component parts and by increasing the activities that students can do to reinforce their learning. In order to accomplish this, we introduced a unit on formal approaches to program correctness, which nicely integrates their earlier work in logic with their work in programming. In addition, a rich set of exercises for the students results from this topic. This work was described elsewhere by the author [Arnow, 1992].

To make this unit possible, and because ultimately it gives students a deeper understanding of the nature of computer science as both an empirical and a mathematical discipline, we redesigned the programming part of the course so that loop invariants are introduced and used as soon as the while statement is presented. Now, curiously enough, we found ourselves teaching computer science to non-majors at a level that in some sense was more sophisticated than our introductory courses for majors! The success of this approach has therefore not only transformed the computer literacy course but also begun to change the way we teach majors.

Introduction

The debate concerning the importance of loop invariants has not concluded but one part of the debate should be over: teaching the use of loop invariants to correctly construct loops is pedagogically feasible at all levels of computer science instruction. Astrachan [1991] presented a graphic approach that facilitates teaching the subject to students in undergraduate algorithms and data structures courses. McCracken [1992] describes a CS1 course, designed by Troeger [1993] at City College, that successfully gives considerable empahsis to this material. Tam [1992] describes an approach suitable for potential undergraduate majors in a first semester programming course. Central to all these presentations and papers is the point that it is not the topic of loop invariants that must be omitted when one moves from an advanced computer science course to a beginning one; rather, the way in which it is presented must be modified. As Tam acknowledges, Dijkstra's and Gries' texts [Dijkstra, 1976; Gries, 1981] require a degree of mathematical sophistication that most first year college students do not have. However, loop invariance is too useful a mental tool to withold from beginning programming students. Less formal ways can be found to present this material to these students. Tam uses examples that illustrate a general strategy for constructing loops. He points out that his students benefit immediately in terms of programming ability and that they are better motivated and prepared for a more formal treatment of the subject later in their studies.

At Brooklyn College, we have recently introduced loop invariants as a topic not only in the early courses for majors but in a computer literacy course for non-majors. It is the latter effort that is of concern here, because the needs and difficulties of the student population in that course place special demands on the way this topic must be handled. Furthermore, a successful treatment of the topic in a course at that level demonstrates beyond any doubt that the topic, if appropriately handled, can be taught at any level of college-level computer science instruction. Finally, the techniques used in the computer literacy course are also useful for beginning computer science majors.

The Context

The students are an ethnically diverse group of working-class students, many of whom are immigrants or members of minority groups. Half the first year students are required to take at least one remedial course. For a large number of students, English is not their native tongue. In spite of these difficulties, they are able to benefit from and in many cases thoroughly master the material in this course- and this is as important a lesson as any others presented in this paper.

Introducing Loop Invariants

Students need to be introduced to new concepts in stages, a little at a time. A good rule of thumb is to introduce as much of an idea as early as possible, but only to introduce as much as can be handled. In the case of loop invariants, it is possible to introduce part of the idea at the same time that the while statement is introduced. This is useful because it maximizes the time that the students are exposed to the idea, giving them a greater opportunity to digest it; it also forever associates the idea of a loop invariant with the while statement.

Actually, one can and should pave the way for loop invariants even before loops are mentioned. Loop invariants are assertions about the values of variables before or after particular statements. It is possible and useful to introduce explicitly the idea of such an assertion when assignment statements are taught. This way, by the time the while makes its appearance, the students are familiar with statements such as "the assertion that x>4 after x:=y is false" or "the assertion that x<>y after x:=y+1 is true".

When the while statement is introduced for the first time, it is not possible to present the full definition and use of the loop invariant. However, we can:

give a preliminary definition of a loop invariant: it is an assertion that is true just before the while statement and just after the loop body and
point out that by its very nature, a loop invariant will also be true immediately after the while statement.

A precise motivation for the interest in such points must be deferred till later, but the students will usually accept the idea that we need an ability to make assertions about the values of variables at different points in the program and that these considerations help us do that. At this point it is not essential to worry about useful invariants. Examples of easily recognized invariants are what is needed at this point:

x := 0;						n := m;
while x<20 do begin		t := 0;
  	x := x+5				while t<10 do begin
end							    t := t+1;
							        n := n+1
						       end

The student should see that invariants for the first loop include x>=0, x is a multiple of 5, 2+3=5, etc. and that invariants for the second loop include n>=m. The students can also be teased with more subtle ones such as "tp;n is a constant", but at this point the emphasis should be on the most obvious ones. This is the moment also to ensure that it is understood that x<20 and t<10 are not invariants. The explanations for why these assertions are or are not loop invariants can be given in operational terms: remember, these are students who are encountering loops for the first time!
Finally, this is also the time for emphasizing that after the while statement, the condition is false and any invariant is true. All of these points must be reinforced with examples and exercises such as:

Consider the following loop:

	x := 0;
	while x<>a do begin
		... loop body ...
	end
Given that a>b is a loop invariant what assertions will be true after the while statement? What can be said about the relationship between x and b after the loop?

Note that at this stage, the students' skills are analytic, not synthetic. They have been introduced to the idea of an invariant as a means for analysing a loop. The next step is to introduce it as a tool for synthesizing one.

Using Loop Invariants

Loop invariants should be used to construct loops, not to analyse them after the fact. If students are to believe this, then the use of loop invariants must be seen as a tool that makes loop construction easier, not harder. If it appears that finding the appropriate loop invariant is more difficult than starting to write code and muddling along, then students will always choose the latter.

One way to guarantee hostility and alienation on the part of students is to develop a program along the following lines:

Determine the variables needed for the loop.
Express the condition that is desired after the loop (the goal of the loop).
Extract from this condition the loop invariant and the while condition.
Use the information from (3) to write the loop.

Students can generally handle (1), (2) and (4). No matter how many times you go over (3) in class, most students will fail to grasp it and even those who do will not be convinced that this is a good way of writing a loop. They think it's easier to write code and muddle along. They are right!

There may be some computer scientists who can honestly carry out step (3) in the above programme. Nevertheless, I believe that when most of us do it, either for the "benefit" of our students or our own coding, we are able to do so only because we tacitly have carried out an earlier step. That step is identifying a loop strategy for solving the problem and understanding both the invariant that is implicit in that strategy and the condition for continuing the iteration. It is on the basis of this strategy that step (1) is carried out. It should be on the basis of this strategy that the invariant and the loop condition are to be found, not from an expression of the condition that is desired after the loop.

A good example is everyone's favorite (from Hoare [1969] to Tam [1992]): the division by repeated subtraction problem. It is certainly one that is covered in our computer literacy course. The wrong approach here is to make remarks along the following lines:

We will write code to divide by repeated subtraction. The count of how many subtractions we do will be the quotient and whatever is left when we can't subtract any more (i.e. when the result of subtraction would be negative)- that will be our remainder.
(We need a variable to hold our dividend, a variable to hold our divisor, a variable to hold what is left so far (the remainder) and a variable to hold the count of how many times we have subtracted (the quotient). Let's call them a,b,r,q respectively.
When we have finished looping a=q*b+r and guaranteeing that we've gone sufficiently but not too far, 0<=r<y.
Let's pick the loop termination condition as r<y and voilà: the loop condition is r>=b and the invariant is a=q*b+r and r>=0.

Steps (1) and (2) are perfectly acceptable to students. Step (3) is difficult for weaker students but they can learn to appreciate the value of formalizing the goal in this way. However, a generation of students can fall through the gap in step (4).

The right approach requires spending more time with the discussion of the loop strategy and then deriving, very informally of course, the loop invariant and the loop condition from the strategy itself. The following may be overkill for some students but it will work out a lot better than steps 3 and 4 above:

(1) Work out in detail a numerical example of dividing by repeated subtraction. Better yet, illustrate it graphically. Let's try, for example, 13/4. We start with, say, 13 triangles, and proceed to discover how many groups of 4 we can make. While doing so, we keep track of all relevant values in each step.

(2a) Where's the invariant? At this point it is easy for the students to recognize what is invariant here. Even the weakest can see that the total number of triangles is constant and nearly all can see that at each step of the way the total equals the number of groups times the groupsize plus whatever is left.
(2b) When do you stop? Similarly, virtually all the students can articulate the termination condition: the procedure must stop when the number left is less than the groupsize.
(2c) How do you ever stop? Again, virtually all the students realize that the algorithm advances towards termination by making a new group (doing a subtraction) at each step.
(3) Choose variable names for each quantity referred to in the procedure above:

	var	a:	integer,	{ number of triangles	}
		b:	integer,	{ group size		}
		q:	integer;	{ number of groups	}
		r:	integer;	{ what's left		}
(4) Once variable names have been chosen, the student can express steps (2a-2c) as assertions and code fragments:

(4a) Invariant: a=q*b + r
(4b) When to stop: r<y
(4c) Make progress to termination: q := q+1

(5) The student can the put the pieces of (4a-4c) together:

		{Invariant: a=q*b + r}
		while r>=y do begin
			r := r - b
			q := q+1
		end
(6) Finally the student must make sure that the invariant is satisfied, before the loop and then at the end of each loop body. Since there are many ways of initially satisfying a=q*b+r for a given a,b, the student, like any sensible person, will allow herself to be guided by the informal strategy worked out above and use the invariant as a check (q:=0;r:=a). The same is not true for the loop body: if a=q*b+r is true before the loop body and q is incremented by 1 then only decrementing r by b will restore the balance. Most students though will not take this calculational approach but still let themselves be guided here by the informal strategy and use the invariant as a check.

There are disadvantages to this approach. For example, the invariant is incomplete (it omits r>=0) and there is a certain amount of muddling along. But the main point is that in this approach:

the students discover for themselves a useful invariant (if not the full invariant) and the termination condition before writing any code;
they use this information to guide them in their coding.

If these students are recruited to be CS majors, then as they gain greater experience, they may come up with more complete invariants, and use these more directly in guiding their coding. This will be possible because now as beginning programmers they have discovered the value of the loop invariant.

On the other hand, if these students never take another CS course (which is our general expectation), they will have been given a very balanced picture of programming and computer science. By teaching loop invariants to non-majors, we expose them to computer science as a field that unites the formal and practical, and they will have encountered programming as a science.

Conclusion

There is no longer any doubt about the feasibility of teaching loop invariants to beginners. The only thing that remains is further improvement of our methods for doing so. Several such improvements have been suggested above:

Loop invariants should be introduced simultaneously with the while statement itself and used initially as a means for analysing loops.
Students will find loop invariants more accessible and hence helpful if they learn to derive them from a consideration of the loop strategy rather than from the condition that is desired at the end of the loop.

Acknowledgments. The author appreciates the suggestions and help of Carroll Zahn of Pace University, Douglas Troeger of City College, and David Gries of Cornell University.

References

Owen Astrachan, Loop Invariants as Pictures. The Proceedings of the Twenty-second SIGCSE Technical Symposium on Computer Science Education, (March, 1991).
David M. Arnow, The Iliad and the WHILE Loop. The Proceedings of the Twenty-second SIGCSE Technical Symposium on Computer Science Education, (March, 1991).
David M. Arnow, Program Correctness Proofs in a Computer Literacy Course. The Proceedings of Frontiers In Education '92, (November, 1992).
Lynne V. Cheney, 50 Hours- A Core Curriculum for College Students. National Endowment for the Humanities, (October, 1989).
Edsgar W. Dijkstra, A Discipline of Programming. Prentice-Hall (1976).
C. Van Dyke, Taking 'Computer Literacy' Literally. Comm. ACM, 30, 5, (May, 1987).
David Gries, The Science of Programming. Springer-Verlag (1981).
C.A.R. Hoare, An Axiomatic Basis for Computer Programming. Comm. ACM, 12, 5, (May, 1969).
D. McCracken, SIGCSE Keynote Address, The Twenty-third SIGCSE Technical Symposium on Computer Science Education, (March, 1992).
J. Paul Myers, Jr., The New Generation of Computer Literacy. The Proceedings of the Twentieth SIGCSE Technical Symposium on Computer Science Education, (February, 1989).
J. Paul Myers, Jr., The Central Role of Mathematical Logic in Computer Science. The Proceedings of the Twenty-first SIGCSE Technical Symposium on Computer Science Education, (March , 1990).
W. Tam, Teaching Loop Invariants by Example, The Proceedings of the Twenty-third SIGCSE Technical Symposium on Computer Science Education, (March, 1992).
D. Troeger, Experiences Teaching Loop Invariants to Beginners, Logic, Loops and Literacy Workshop, (May, 1993).

Back to David Arnow's CS Education Page.

tc