Generic Programming
Part II: Algorithms on Algebraic Structures
Chapter 4: Arithmetic

Last updated 17 June 1998

§ 4.1  Introduction
Send comments about this page to
Jim Dehnert,
John Wilkinson, or
Alex Stepanov.
§ 4.1  Power: The Russian Peasant Algorithm
An operation of great utility, one that is supported directly by many
languages, is that of raising a number to a positive integer power.
In this section we discuss this algorithm generically.
For any set with a binary operation *, we can define a power
function inductively by
power(x, 1) = x
power(x, n) = power(x, n1) * x if n > 1

For a general binary operation, however, this power function is not
very interesting: if, for example,
we defined it with the operation applied in
the other order, we would in general get an entirely different
operation. The power function becomes interesting when we require
the operation to be associative. If * is associative, we
can show easily by induction the familiar law of exponents:
power(x, m+n) = power(x, m) * power(x, n)

As we shall see, this property not only makes the power function
interesting; it also makes it possible to compute it efficiently.
For a generic power algorithm, it is clear that their are two types
involved: the type of n should be some type that can represent
natural numbers, and the type of x
should be of some type supporting an associative binary operation.
Mathematicians call such a set a semigroup. This suggests
the interface
template<class SemigroupElement, class Integer, class SemigroupOperation>
SemigroupElement power(SemigroupElement a,
Integer n,
SemigroupOperation op) {
..............
}

(Here as always the template parameter names are merely mnemonics,
intended to suggest requirements on the template arguments.)
It is, however, a little more convenient to add an additional
requirement on the semigroup, namely that it possess an identity,
that is an element e such that
op(x, e) = op(e, x) = x for every x.

Such a semigroup is known as a monoid, and we are not
really losing significant generality by restricting ourselves to
monoids, since it is a mathematical fact that
we can always adjoin an identity element to any
semigroup.
This gives the interface
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
..............
}

Of course only the names have been changed!
We now proceed to try to develop an algorithm.
Let's start by initializing the result:
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
MonoidElement result = identity_element(op);
................
return result;
}

This initialization accomplishes something very important: if a0 is
the initial value of a and n0 is the initial value of n, we have
established the equality op(result, power(a, n)) = power(a0, n0).
If we can reduce n to zero keeping this relation invariant, we will
be sure of having a correct algorithm.
This suggests the following straightforward solution:
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
MonoidElement result = identity_element(op);
while (n != 0) {
result = op(result, a);
n;
}
return result;
}

This is correct, but not very efficient: it has linear complexity
in n and would be quite unsatisfactory for very large values of n.
Note that this algorithm uses only the inductive definition of the
power function. It does not exploit the law of exponents, and could
be used perfectly well for arbitrary binary operations. Applying
the law of exponents, however, allows a major improvement.
The identity
power(x, m+n) = op(power(x, m), power(x, n))

when m is equal to n gives
power(x, 2n) = square(power(x, n), op)

where we define square(x, op) to be op(x, x).
This means that if n is even, we can maintain the invariant with
a much greater reduction in n: we can square a and halve n. This
observation leads to the improved algorithm
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
MonoidElement result = identity_element(op);
while (n != 0) {
if (is_even(n)) {
square(a, op);
halve(n);
}
else {
result = op(result, a);
n;
}
}
return result;
}

This version has logarithmic complexity. Probably we're not going to
achieve dramatic improvement from this point, but let's see if we can
squeeze it a bit.
Note that in the else branch, once we have reduced n by 1 we
know it is even, so we have a redundant test in the next iteration.
We might as well fold it in now. This gives us
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
MonoidElement result = identity_element(op);
while (n != 0) {
if (is_even(n)) {
square(a, op);
halve(n);
}
else {
result = op(result, a);
n;
if (n != 0) {
square(a, op);
halve(n);
}
}
}
return result;
}

Now if n is odd, the combination
is equivalent to just
so we can simplify this to
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
MonoidElement result = identity_element(op);
while (n != 0) {
if (is_even(n)) {
square(a, op);
halve(n);
}
else {
result = op(result, a);
if (n != 1) {
square(a, op);
halve(n);
}
}
}
return result;
}

And the comparison against 1 can be eliminated by separating out the
last iteration, provided we are a little careful: if n is not zero
initiallly, then repeated halvings must take it through 1. So let's
treat the case n == 0 separately at the beginning:
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
if (n == 0) return identity_element(op);
MonoidElement result = identity_element(op);
while (n != 1) {
if (is_even(n)) {
square(a, op);
halve(n);
}
else {
result = op(result, a);
square(a, op);
halve(n);
}
}
result = op(result, a);
return result;
}

But we can do better if we note that whenver n is nonzero and even,
it is wasteful to check for both n nonzero and n even, since repeating
halving must eventually make n odd. Applying this observation gives us
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
if (n == 0) return identity_element(op);
MonoidElement result = identity_element(op);
while (n != 1) {
if (is_even(n)) {
square(a, op);
halve(n);
}
else {
result = op(result, a);
do {
square(a, op);
halve(n);
} while (is_even(n));
}
}
result = op(result, a);
return result;
}

But now the inner loop always terminates with n odd. If we could get
n odd at the beginning of the first iteration, we could eliminate the
is_even test altogether. But we can use our previous observation again
to guarantee this:
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
if (n == 0) return identity_element(op);
MonoidElement result = identity_element(op);
while (is_even(n)) {
halve(n);
square(a, op);
}
while (n != 1) {
result = op(result, a);
do {
square(a, op);
halve(n);
} while (is_even(n));
}
result = op(result, a);
return result;
}

Now either the first iteration of the outer loop or the instruction
following the loop has a multiplication by the identity. This can
be eliminated if we assign a to result just before the loop and move
the multiplication to the end of the loop:
template<class MonoidElement, class Integer, class MonoidOperation>
MonoidElement power(MonoidElement a,
Integer n,
MonoidOperation op) {
if (n == 0) return identity_element(op);
while (is_even(n)) {
halve(n);
square(a, op);
}
MonoidElement result = a;
while (n != 1) {
do {
square(a, op);
halve(n);
} while (is_even(n));
result = op(result, a);
}
return result;
}

§ 4.3  Greatest Common Divisor
The most important fact about the greatest common divisor d of two
integers x and y is that there always exist integers u and v for which
In some applications, finding these integers is more important than
finding the gcd. In particular, if gcd(x, y) = 1, then finding u and v
satisfying (1) solves the problem of finding a multiplicative inverse
for x modulo y or for y modulo x.
Fortunately, solving this problem is not much more difficult that
finding the gcd, and we can extend the Euclidean algorithm to provide
a solution at very little additional cost. We call this the
Extended Euclidean Algorithm.
Given x and y, we want to find u and v such that
where d is the gcd of x and y. An obvious approach is to replace
d by a variable r, find some initial value of r for which
is easy to solve, and then find a way to decrease r while maintaining
the equation (2).
Now there are two initial values for r for which (2) is easy to solve,
namely x and y, for which we have the solutions
and
respectivley.
So let's introduce variables r0, u0, v0, r1, u1, v1, and initialize
them as follows:
r0 = x; u0 = 1; v0 = 0;
r1 = y; u1 = 0; v1 = 1;

so that we have the equations
(3) u0 * x + v0 * y == r0

and
(3) u1 * x + v1 * y == r1

Now we can get a simliar equation with a smaller r by setting
q = r0/r1;
r = r0  q * r1;
u = u0 = q * u1;
v = v0  q * v1;

and we have
ux + vy = r, where r < r1.

as we see simply by multiplying equation (4) by q and
subtracting from equation (3).
This leads to the following code:
template <class EuclideanRingElement>
triple <EucldeanRingElement, EuclideanRingElement, EuclideanRingElement>
extended_gcd(const EuclideanRingElement & x,
const EuclideanRingElement & y) {
if (x == 0) return make_triple(0, 1, y);
if (y == 0) return make_triple(1, 0, x);
EuclideanRingElement r0 = x, u0 = 1, r1 = y, u1 = 0, v1 = 1;
while (r1 != 0) {
EuclideanRingElement q = r0 / r1;
EuclideanRingElement r = r0  q * r1;
EuclideanRingElement u = u0  q * u1;
EuclideanRingElement v = v0  q * v1;
r0 = r1; u0 = u1; v0 = v1;
r1 = r; u1 = u; v0 = v;
}
return make_triple(u0, v0, r0);
}

Now let's see if we can optimize this a little. We notice that the
variables v0, v1, v are not used in the computation of the other
variables. They are just going along for the ride, so to speak,
except that v0 is part of the final result. Their usefulness is that
they let us see we are maintaining the invariants throughout the
computation.
But we can simply compute v0 after we exit the loop, since it is just
the solution of
Eliminating the v's from the code gives us
template <class EuclideanRingElement>
triple <EucldeanRingElement, EuclideanRingElement, EuclideanRingElement>
extended_gcd(const EuclideanRingElement & x,
const EuclideanRingElement & y) {
if (x == 0) return make_triple(0, 1, y);
if (y == 0) return make_triple(1, 0, x);
EuclideanRingElement r0 = x, r1 = y, u1 = 0;
while (r1 != 0) {
EuclideanRingElement q = r0 / r1;
EuclideanRingElement r = r0  q * r1;
EuclideanRingElement u = u0  q * u1;
r0 = r1; u0 = u1;
r1 = r; u1 = u;
}
return make_triple(u0, (r0  y0 * x) / y, r0);
}
