Part I: Fundamental Concepts
Chapter 2: Fundamentals
Last updated 10 April 1998
Generic programming is based on dividing programs into components, for example key data structures and algorithms, which can be matched with other components satisfying a minimal set of assumptions. Its success depends upon minimizing the set of necessary assumptions, while still providing enough guarantees that the programmer of a generic component can make effective choices in its implementation.
This chapter discusses a minimal set of guarantees, and provides some illustrative examples. These guarantees involve the most basic operations of a programming language, and should seem quite intuitive to practicing programmers. However, they involve questions which we seldom think about explicitly, and require careful thought and understanding precisely because they will be assumed by programmers with no knowledge of the components satisfying them.
Many books contain a chapter on "fundamentals" or "background" which is review material for many readers, and can safely be ignored or skimmed if it looks familiar. That would be a mistake here. Although the subjects are superficially familiar, the details are important. Indeed, it might be useful to revisit this chapter after reading the later material which depends on it, since some of the implications will be clearer at that point.
Interfaces are a critical aspect of good software engineering. We are acutely aware of many of the interfaces we deal with, from the operations defined for the types we use, to the parameters required by functions we call, to the data and procedural interfaces exported by library packages we use.
Interfaces are contracts between the provider of a feature and a client. Naively, they are restrictions: A type interface provides some operations, but not others. We can call a function only with the supported parameter types. To use a library package, we must satisfy certain restrictions on program state when we call its routines. Less obviously, but much more important, the interface and its restrictions free the feature provider to provide a much more effective implementation because it need not be concerned with unspecified (and even unanticipated or unanticipatable) capabilities or situations. Similarly, it frees the client by excluding undesirable side effects from the feature provider.
There are many considerations in designing a good interface. An ideal interface should provide the functionality expected by the client, in an intuitive way, without sacrificing performance. It should also result in client code which is easy to understand for maintenance, which will often result from the other requirements. Often this ideal is difficult or impossible to achieve, when these requirements are in conflict, or when multiple clients have differing needs. In such cases, making the best tradeoff can be difficult, requiring excellent judgment or intuition. This leads many to believe that programming is as much art as science. Indeed, although experience provides principles which may be used for these tradeoffs, and we will suggest a number in this book, judgment remains important.
However, it is often possible to objectively identify good and bad interfaces. A clear understanding of which clients are important, of what their requirements are, and of what the implementation needs, can make many choices quite clear. We will see many such cases in the examples in this book, and the remainder of this chapter will discuss some of the most fundamental ones.
Generic programming's central premise is that we can decompose the programming task into orthogonal components, and write each component with no more assumptions about the others than are necessary. We will discover that certain sets of assumptions occur over and over. We will call these sets of assumptions concepts.
More concretely, a concept describes an interface between algorithms and types. It describes a base type, affiliated types, operations on these types, and properties of the types and operations. Concepts are not simply types. Whereas a type is a concrete specification of the representation of a set of values, along with the full set of operations on those values, a concept is an abstract specification which imposes a minimal set of restrictions on its base type, in terms of the affiliated types and operations, so that any type with affiliated types and operations satisfying those restrictions can be used with algorithms interfacing with that concept. This hierarchy is analogous to the biological concepts of an individual animal (value), its species, which is a specific description of animals with the same characteristics as the individual (type), and its genus, which is a more abstract description of a variety of species with similar characteristics (concept). Thus, we can say that a type (along with its affiliated types and operations) is an instance of a concept in the same sense that dog is an instance of mammal.
Concepts are analogous to theories in mathematical logic, including the fact that there are two quite different ways of specifying one. A theory in mathematical logic may be specified by giving the set of theorems that it satisfies; analogously, a concept may be specified by giving the set of programs which work correctly with it. Alternatively, a theory may be specified as the set of models which satisfy it, or a concept may be specified by giving the set of programming components (e.g. data structures) which implement it.
Although we define concepts in terms of a set of types, operations, and their properties, it is often more convenient to specify a concept in terms of other concepts. Thus, we may create a concept from another by refining it (adding one or more affiliated types, operations, or properties), or by abstracting it (removing affiliated types, operations, or properties from its definition).
Concepts provide us with the means of distinguishing generic components from concrete program components. Whereas a concrete component defines its interface in terms of types for its function parameters and results, or data structure members, a generic component defines its interface in terms of concepts, and may be instantiated by replacing the concepts with specific types and operations that satisfy the properties of the concept, allowing it to be used with any other components having interfaces satisfying the concept(s).
Beyond the obvious interfaces in programs, there are others that we don't usually think about. This is sometimes because they are so pervasive that we take them for granted, for instance the basic syntax and semantics of our programming language. Or it may be because they are so subtle or intuitive that it doesn't occur to us to think about them, such as the precise semantics of operations like assignment and equality testing.
The data types we define in our programs interact with all the other components we use. To achieve maximum benefit, our generic components must work as well with these user-defined types as with the built-in C++ data types. In fact, they should work the same way, since writing multiple versions of code is what we are trying to avoid in generic programming. That we can contemplate doing so is due to the fact that newer programming languages like C++ allow us not only to define new data types, but also to define all of the standard operations for those types. This is a powerful capability, but using it effectively requires that we be very clear about the interface requirements for these fundamental operations. This is necessary even though these are operations which we intuitively understand (or think we do) without thinking about them, e.g. assignment.
An important issue in matching the semantics of these operators to the operators of the language's built-in types arises from the specification of the standard operators on user-defined types as functions. The appropriate definition of an operator requires understanding not only what effects must occur, but also what the best parameter/result function interface is. Moreover, because generic program components consist largely of function definitions, our desire to match the behavior of user-defined types to that of built-in types also means that their parameter passing behavior should match. In C++, this means that our new types should work like built-in types when they are passed either by value, i.e. copied into the formal parameter so that formal parameter modification does not affect the actual argument, or by reference, i.e. passing the actual parameter's address so that the formal becomes a new name for the actual.
This chapter addresses these fundamental assumptions about types and operations on them. We will derive a number of requirements, which are satisfied by the built-in types of C++, and which must be satisfied by user-defined types to be fully useful in generic programs. As with all interface specifications, these requirements may seem restrictive. But they are fundamental to our ability to accomplish anything on a generic basis. Furthermore, there is a side benefit to our attempt to match traditional type semantics. It gives us a much sounder basis for reading and understanding programs, since programs using the traditional primitive operators will behave as we expect.
It is interesting to contrast our approach with the traditional abstract data type paradigm. The ADT approach suggests that a new type should be defined with all of the operations and algorithms relevant to it, and then it can be used as a component of more complex modules. Object-oriented programming elaborates on this approach, providing language mechanisms (inheritance) for incorporating simpler types into more complex types, as well as propagating the operations defined on them. By contrast, generic programming takes a more top-down approach. We first develop an interesting algorithm or data structure (container), then identify the minimum requirements for the operations it needs on the underlying types, and finally apply those requirements to specify usable types, or concepts.
We have suggested above that a clear specification of fundamental operations on types is important to generic programming. Which operations are fundamental? In this section, we will identify several, all of which have default C++ definitions for all built-in data types and reasonable extensions to user-defined types. As we introduce them, we will also begin to discuss their required semantics. As we do so, a couple of common threads will emerge.
First is an emphasis on operators doing what we intuitively expect. We have all heard arguments about how important it is to be able to read code and understand it in order to maintain it. Generic programming amplifies this concern -- the writer of a generic algorithm cannot know about the implementations of the types and operators he uses, since such an algorithm is intended to work with a wide variety of types.
The second is a concern that we be able to apply various natural optimizations to generic code using these fundamental operators. Again, we want to optimize our generic algorithms without being concerned about whether the operators really have the semantics we expect.
The assignment operator, e.g. x=y, copies the value of an object y to another object x. To understand this apparently simple statement a bit better, consider the examples in the following table.
Example | original | optimized
A | x = x; | (nothing)
B | x = y; v = x; | x = y; v = y;
C | x = y; f(x); | x = y; f(y);
Example A copies the value of variable x to itself. We intuitively expect this to be a useless operation, and require that we be able to remove it, since a programmer seeing it in a generic code fragment will naturally do so. This implies that the assignment must not perform any essential work besides the copy; otherwise, removing it would change the program's behavior.
Example B copies the value of variable y to x, and then copies x to variable v. This should be equivalent to copying v directly from y, i.e. the intermediate copy should not affect the final result. This optimization allows the two copies to proceed in parallel, and may allow the first copy to be removed if x is not otherwise needed.
Example C is similar to example B, if we assume f is a function taking a value (copy) parameter of the same type as x and y. In the original form, this code copies y to x, and then copies x to the formal parameter of function f. Again, we should be able to simply copy y to the formal parameter, allowing the two copies of y to proceed in parallel, and possibly allowing elimination of the first copy.
In C and C++, assignment also returns a reference to the target object, and we will assume this as well, although it is not critical to our development. The default C++ assignment operator for constructed types recursively assigns each component. For components that do not have user-defined assignments, this will ultimately do a bitwise copy of components of built-in types.
Some programming languages have treated assignment as copying a reference to y instead of its value -- we explicitly prefer the value semantics of C++. However, we anticipate a later discussion by pointing out that the value to be copied for a complex object requires careful definition.
The equality operator, e.g. a==b, compares two objects of the same type, and returns a Boolean (true/false) value indicating whether they are equal. There is no default C++ equality operator for constructed types.
Generic algorithms often use the equality operator, and they expect somewhat more than the definition above. First, the equality operator should define what mathematicians call an equivalence relation, that is, it should satisfy the following conditions, for any a, b, and c:

    a == a                                  (reflexivity)
    a == b implies b == a                   (symmetry)
    a == b and b == c implies a == c        (transitivity)
Next, consider the following code fragment:
Example | original | optimized
D | if (a == b) { if (a == b) statement1; statement2; } | if (a == b) { statement1; statement2; }
As programmers, we naturally expect that the inner equality test is redundant, and may be removed. Generic components should be able to make the same assumption. Hence, we require that the equality operator have the semantics that multiple equality tests on the same unmodified objects should return the same result, and that they do not have side effects which prevent their removal.
C++ also provides an inequality operator, e.g. a!=b, which is closely tied to the equality operator. For built-in types it is defined as the negation of the equality operator, and STL (but not the base C++ language) provides this default definition as a template for user-defined types without an explicit user-defined inequality operator. This is the behavior we expect from mathematics, and we require it of user-defined inequality operators. That is, a user should be able to write either a!=b or !(a==b) without worrying about whether they are equivalent. Consider another code fragment, similar to Example D above:
Example | original | optimized
E | if (a == b) { if (a != b) statement1; statement2; } | if (a == b) { statement2; }
In this case, we again expect the inner comparison to be redundant, and require that we be able to remove it along with statement1. That is, we want the same conditions on reproducibility and side effects as for the equality operator, as well as the condition that they produce inverse results.
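The STL default mentioned above amounts to something like the following template (a sketch):

    template <class T>
    inline bool operator!=(const T& x, const T& y) {
        return !(x == y);
    }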
When a new object is created, it occupies memory, and contains an initial value. In many programming languages, the initial value is undefined, or sometimes a fixed binary value, typically zero. Later languages have allowed the programmer to provide an initial value with some limitations, for instance that it must be constructed from compile time constants. C++ provides a completely general mechanism, allowing the definition of one or more operators, called constructors, which given zero or more parameter values, calculate the initial value. In fact, they may do arbitrary calculation. For example, a constructor for an object which controls a physical device might include code to force the device to a known state.
Constructors are used in a number of contexts where objects are created. One of these, of course, is data declarations, but there are others where we intuitively just expect a copy to be made, for instance when an actual parameter is copied to a formal value parameter in a subprogram call, or when the return value of a function is copied out.
Constructors which copy a single parameter to a new object of the same type are called copy constructors, and they are our principal concern here. We expect them to have behavior similar to the assignment operator, since both copy values. For example, consider a modification of assignment Example B above, where T is a type name.
Example | original | optimized
F | T x = y; v = x; | T x = y; v = y;
This example differs from example B only in that it copies the value of variable y to a newly declared variable x using a copy constructor instead of to a previously declared x using assignment. We expect it to behave like the assignment example, and expect to be able to perform the same optimizations.
As for assignment, the requirement that copy constructors copy the values of complex objects will be discussed with more precision later. C++ provides default copy constructors, which it defines to recursively perform copy construction of each member of a composite object.
Destructors are the inverse of constructors. That is, when an object goes out of scope in C++, a destructor is called to do any necessary cleanup of its memory, or any other necessary final processing such as turning off a controlled device, before the object is deallocated. Our only requirement for destructors is that they clean up the same object that is constructed or copied by constructors and assignment. The significance of this seemingly simple requirement will become clearer when we discuss the parts of complex objects later.
As an aside, we note that C++ also allows the explicit invocation of constructors and destructors. This is useful when it is necessary to separate the allocation and deallocation of memory from the construction and destruction of the objects which occupy it, for instance when a vector of objects is allocated as a unit and then initialized element by element.
The requirements for destructors are simpler than for the other fundamental operators in the sense that, since it is not valid to call a destructor twice for the same object, we need not be concerned with how they behave when redundant.
Many data types support a natural ordering, and C++ provides comparison operators for querying that ordering, namely <, <=, >=, and >. It provides default definitions for the operators on all built-in types, and allows user definition of them for user-defined types. Note that defining a default ordering for all built-in types automatically implies a natural ordering for composed types, given by composing the orderings of the components lexicographically.
Important generic algorithms, for instance sorting and searching, are based on ordering relations, and are naturally defined to use these operators. Although it is not necessary to require their definition and use, it is convenient to use them if available, and we require that they behave reasonably if defined. This means for our purposes that they define what mathematicians call a total order on the underlying type, that is for any a, b, and c of that type:

    exactly one of a < b, b < a, and a == b holds       (trichotomy)
    a < b and b < c implies a < c                       (transitivity)
We will address the axiomatization of orderings more precisely in Section 2.7.
These axioms again suggest several optimization possibilities, similar to the equality operator cases:
Example | original | optimized
G | if (a < b) { if (a == b) statement1; statement2; } | if (a < b) { statement2; }
H | if (a < b) { if (a > b) statement1; statement2; } | if (a < b) { statement2; }
I | if (a < b) { if (b < a) statement1; statement2; } | if (a < b) { statement2; }
Example G reflects our expectation that only one of the three relations <, >, or ==, holds between two values, and example H is another case of that rule. Example I reflects the anti-symmetric property, i.e. that if a<b then it cannot be true that b<a. All of these examples again exhibit what should be a familiar rule by now -- these operators must not have side effects which would invalidate an immediate re-test, or make it impossible to eliminate them when redundant.
We have proposed a number of restrictions above on the fundamental operators we expect to use in writing generic program components. It is worth revisiting our justification. These restrictions are not part of the C++ language definition; indeed, C++ allows much greater leeway in defining these operators for user-defined types.
However, our desired properties are not entirely without precedent. They are valid for all built-in types in C++, and indeed in all mainstream production programming languages. Moreover, these properties are heavily used for optimization in modern compilers, and virtually any programmer would be astounded to find them violated for the built-in types. Because the C++ language definition fails to make these guarantees, it is difficult for optimizers to make the optimizations we suggest automatically, but we can expect programmers to make them without giving the question much thought. Proposing to do generic programming without such guarantees, when there is typically no way to inspect the underlying operators but efficiency is vital, makes little sense.
In addition, in the case of the comparison operators, we have well-established mathematical tradition behind the total ordering axioms we require. We should always remember that our programs, in addition to specifying actions for a computer, will typically be read and modified by other programmers. As mathematicians have known for centuries, the symbols we use to communicate ideas are arbitrary, but by following conventions we can eliminate most of the time spent decoding, and focus our energy on the ideas being expressed.
Another observation is worthwhile here. In mathematics, and the other sciences, the formal articulation of properties is always preceded by an intuitive understanding based on observation. Mathematicians generally develop a theory by first observing mathematical facts, then postulating theorems to explain them, and finally formalizing the axiomatic basis required for formal expression and proof.
Our work on underlying assumptions for generic programming is analogous to this. We have observed extensively the assumptions made by programmers and by compilers about simple type behavior. We have also observed that writing efficient generic programs requires that we make similar assumptions about the user-defined types we support. This book is an effort to present the axioms which embody these assumptions, and to demonstrate that a large body of extremely useful, general, and efficient code can be built on the resulting framework. Ultimately, the validity of these ideas must be judged based on the utility of the results.
We continue our discussion of requirements on abstract data types by investigating the relationships among equality, assignment, construction, and (where defined) destruction operators. We address these in the context of C++ semantics, with the extended assumptions from the preceding section, although the issues are common to all programming languages which support construction and destruction of user data types.
To begin the discussion, consider an example. We will look at a variable-length vector type, which consists of a current length member and a pointer to the actual data vector. We will assume that the data vector is owned by the header type, i.e. that distinct objects do not share data vectors.
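In outline, such a definition might look like this (a sketch; the member names n and v are the ones used in the later discussion of swap):

    class IntVec {
    private:
        int  n;     // current length
        int* v;     // pointer to the data vector
    public:
        IntVec(int size) : n(size), v(new int[size]) {}
        ~IntVec() { delete [] v; }
        int& operator[](int i) { return v[i]; }
        int  size() const { return n; }
        friend bool operator==(const IntVec& a, const IntVec& b) {
            return a.v == b.v;      // compares the vector pointers
        }
    };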
This looks like a reasonable definition, on the surface. It contains a constructor which, given a desired vector length, initializes an object with the requested length and a pointer to a vector which it allocates with that length. It also has a destructor which deallocates the data vector, an equality operator which compares the vector pointers, an indexing function, and a size function.
However, it is not a good type definition, for a number of reasons. Can you see why? First, think about passing an object of this type by value to a function.
In C++, an argument passed by value is copied to the local parameter by invoking the copy constructor. Since the type doesn't provide a non-default copy constructor, the C++ default is used: the data members are copied bitwise. As a result, we don't actually have a full copy of the vector in the function -- our local parameter is sharing the data vector. Contrary to expectations, modifications of the parameter's data vector will also affect the original. Worse still, let's consider what happens when we call the most trivial of functions:
void foo ( IntVec ) {}
The body of this function does nothing. However, the C++ semantics of passing a parameter by value requires that the IntVec actual parameter be copied to the formal parameter using the copy constructor, with the result mentioned above that the formal parameter contains a data vector pointer which is a copy of the actual parameter's data vector pointer. After doing nothing, the function returns to its caller. But before it can do so, because the formal parameter is going out of scope, its destructor must be called. This will deallocate the data vector which it shares with the actual parameter, leaving the actual parameter with a dangling pointer to deallocated memory. Thus, it is not possible to pass our IntVec objects by value to a function without destroying them.
Clearly then, we must provide a copy constructor:
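Such a copy constructor might read (a sketch, continuing the IntVec definition above):

    IntVec(const IntVec& x) : n(x.n), v(new int[x.n]) {
        for (int i = 0; i < n; ++i)     // copy the source's data vector
            v[i] = x.v[i];
    }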
This copy constructor allocates a new data vector for the copy, and copies the source object's data vector into it. This now does what we expect for a copy, and illustrates our first requirement:
Copy 1: A type representing values that are not entirely contained within a single C++ class object must have a non-default copy constructor that copies the entire value.
This is our first encounter with an issue which will arise frequently. What is logically an object need not correspond exactly to a C++ type. When we speak of an object, we mean not only the primary C++ variable to which we refer explicitly, but also any additional components which are part of the logical object. Even with this intuitive understanding, however, the phrase "entire value" above will require deeper discussion later.
Let us return to our example. Is IntVec correct now? Consider a test of a simple swap function:
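A sketch of swap and test_of_swap as described below (the exact form of the test is an assumption):

    #include <assert.h>

    void swap(IntVec& a, IntVec& b) {
        IntVec tmp = a;     // copy construction
        a = b;              // assignment
        b = tmp;            // assignment
    }

    void test_of_swap(IntVec& a, IntVec& b) {
        IntVec oldA = a;    // save copies of the original values
        IntVec oldB = b;
        swap(a, b);
        assert(a == oldB);  // the two values should have been exchanged
        assert(b == oldA);
    }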
Will our IntVec definition pass a test_of_swap? This routine saves copies of the original parameters, calls swap on them, and then tests the resulting values against the saved copies. Consider carefully what each of the steps will do with our definition, and then try the following test program.
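A sketch of such a test program (the signature of initialize_IntVec is an assumption):

    int main() {
        IntVec a(3);
        IntVec b(3);
        initialize_IntVec(a, 0);    // assumed to set a to {0,1,2}
        initialize_IntVec(b, 1);    // assumed to set b to {1,2,3}
        test_of_swap(a, b);
        return 0;
    }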
This fails -- why? Let us identify precisely what happens, statement by statement.
The two IntVec declarations create two dynamic vectors of length 3, a and b. The calls to initialize_IntVec then initialize them, a to {0,1,2} and b to {1,2,3}. Since the dynamic data vectors for a and b do not have variable names, let us call them av1 and bv1, with the digit 1 indicating that these are the first copies created.
We then call test_of_swap. It begins by creating copies of a and b using the copy constructor, for use later in determining whether swap worked correctly. These copies, oldA and oldB, point to new copies of the data vectors, which we will call av2 and bv2.
Next our test program calls the swap function. It copies a to a temporary named tmp, again using the copy constructor because it does so in the declaration of tmp. In doing so, it again copies the data vector for a, and we will call the new copy av3. At this point, we have three copies of a's data vector, and two copies of b's data vector, all at different addresses.
Now, the swap function assigns b to a, and tmp to b. It uses the built-in C++ assignment operator, which simply assigns the objects component by component, in this case copying the length member and the data vector pointer. Therefore, after the assignments, a will point to bv1, and b will point to av3. When we return from swap, the test "a == oldB" compares the data vector pointers, that is, it compares the address of bv1 to the address of bv2, and fails.
It should be clear what we need to do to solve this problem, which is due to the fact that our type has remote parts, i.e. parts which are not contiguous to the primary record. The principle is as follows:
Equality 1: When comparing the values of two objects of a type with remote parts, the equality operator must ignore the pointers to the remote parts, and compare the parts themselves.
Another way of viewing this requirement highlights the interaction between copy operations (constructors and assignment) and the equality operator:
Equality 2: When comparing the values of two objects of a type with remote parts, the equality operator should compare only the values of parts which are copied by copy operations, ignoring parts which serve only to locate the remote parts, and which always have distinct values in distinct objects.
As we shall see in the next section, equality must deal with additional subtleties as well, but this is adequate for our current example. It gives us a new definition of the IntVec equality operator:
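A sketch of the corrected operator:

    friend bool operator==(const IntVec& a, const IntVec& b) {
        if (a.n != b.n)                 // lengths must match
            return false;
        for (int i = 0; i < a.n; ++i)   // compare the remote parts themselves
            if (a.v[i] != b.v[i])
                return false;
        return true;
    }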
With this new equality, the test above will probably pass. However, there is still a subtle bug. Can you identify it? Consider the following revised test program, which repeats the test, and extend the analysis we did above:
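A sketch, simply repeating the test of the earlier program:

    int main() {
        IntVec a(3);
        IntVec b(3);
        initialize_IntVec(a, 0);
        initialize_IntVec(b, 1);
        test_of_swap(a, b);
        test_of_swap(a, b);     // repeat the test
        return 0;
    }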
This modified test will probably fail.
To see why, let us go back to the state at the end of the first call to the swap function. Recall that b points to av3, i.e. the copy of the a vector which was created to initialize tmp. When we exit the swap function, what happens to tmp? Since it is a local variable of swap, it goes out of scope and its destructor is called. In this case, the destructor deallocates its data vector, which also happens to be the new data vector of b. As a result, we return from swap with b pointing to deallocated memory for its data vector. Although it will probably "pass" the test "b==oldA" because that memory has not yet been re-allocated and modified, it will only be a matter of time until b's value is apparently changed by a seemingly irrelevant event, and the program fails. In the case of our modified test, that irrelevant event is likely to be the allocation and initialization of a new tmp variable in the same memory during the second call to swap. However, this timing is highly dependent on the implementation's memory allocation policies, and the dangling pointer to invalid memory could continue to "work" indefinitely if the deallocated memory is left unused.
Once again, the correction is obvious once we understand the problem. We must define an assignment operator which copies the data vector just as the copy constructor does.
Assignment 1: A type representing values that are not entirely contained within a single C++ class object must have a non-default assignment operator that copies the entire value.
However, there is a complication in the assignment case which does not occur in the constructor case. Suppose we simply use the copy constructor:
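For instance (a sketch, constructing the copy in place):

    IntVec& operator=(const IntVec& x) {
        new (this) IntVec(x);   // copy-construct in place; requires <new>
        return *this;
    }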
This code will certainly copy the value of x, including its data vector, to the left-hand side assignment operand. But what happens to the previous value of that operand? It will simply be overwritten, which is incorrect for any type with a non-trivial destructor: in our example, the old data vector is never deallocated. Therefore we must first call the destructor:
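A sketch:

    IntVec& operator=(const IntVec& x) {
        this->~IntVec();        // destroy the old value
        new (this) IntVec(x);   // copy-construct the new value in place
        return *this;
    }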
Even this is not quite correct, however. What will happen on the assignment "a=a"? We will destroy the old value of a before attempting to copy the value. To solve the problem, it is necessary to avoid the destructor call for a self-assignment. These observations give us two more conditions on assignment:
Assignment 2: Assignment operators for types with non-trivial destructors must destroy the old value of the left-hand side before copying the value of the right-hand side.
Assignment 3: The destructor call required by Assignment 2 must be avoided when assigning an object to itself.
Once we meet these conditions, we finally have a good definition:
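A sketch:

    IntVec& operator=(const IntVec& x) {
        if (this != &x) {           // do nothing for self-assignment
            this->~IntVec();        // destroy the old value
            new (this) IntVec(x);   // copy-construct the new value in place
        }
        return *this;
    }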
That is, before calling the destructor and constructor, we must first verify that the source and destination objects are different. If they are the same, nothing needs to be done.
Observe that this is a generally valid model for assignment -- it will always work. There are simpler cases which do not require the explicit test, for instance where there are no non-default destructors. However, this is correctly viewed as a special case optimization, and not as an exception to the validity of the general definition. It is unfortunate that this is not the default C++ definition -- in its absence, non-default assignment operators are required much more often than should be necessary.
Let us finish up our discussion of the IntVec example by presenting its full, correct definition:
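Collecting the pieces developed above (a sketch):

    #include <new>      // for placement new

    class IntVec {
    private:
        int  n;     // current length
        int* v;     // pointer to the data vector
    public:
        IntVec(int size) : n(size), v(new int[size]) {}
        IntVec(const IntVec& x) : n(x.n), v(new int[x.n]) {
            for (int i = 0; i < n; ++i)
                v[i] = x.v[i];
        }
        ~IntVec() { delete [] v; }
        IntVec& operator=(const IntVec& x) {
            if (this != &x) {
                this->~IntVec();
                new (this) IntVec(x);
            }
            return *this;
        }
        friend bool operator==(const IntVec& a, const IntVec& b) {
            if (a.n != b.n)
                return false;
            for (int i = 0; i < a.n; ++i)
                if (a.v[i] != b.v[i])
                    return false;
            return true;
        }
        int& operator[](int i) { return v[i]; }
        int  size() const { return n; }
    };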
We now have enough information to summarize the required operations on what we will call regular types. Except where otherwise specified, we will assume that all types with which we deal are regular.
Table: Required Operations on Regular Types
(syntax for a type named 'Regular')

Operation | C++ Syntax (invocation) | Result type | Conditions, comments
Default constructor | Regular a; or Regular() | n/a | Creates a valid object
Copy constructor | Regular a = b; or Regular a(b); | n/a | Postcondition: a == b
Copy constructor (placement) | new (&a) Regular(b) | n/a | Postcondition: a == b; constructed in memory at &a
Reference | &a | Regular& | May pass parameter by reference
Destructor | (&a)->~Regular() | n/a | Destroys all parts of the object
Assignment | a = b | Regular& | assert(&a == &(a = b))
Equality | a == b | bool | a == a; a == b implies b == a; a == b && b == c implies a == c; &a == &b implies a == b
Inequality | a != b | bool | a != b iff !(a == b)
In the previous section, we discussed the necessary relationships between the equality, assignment, construction, and destruction operators, but we glossed over their precise definitions. The correct definition of equality and copy operators is intimately tied to the concept of the parts of a data structure. We shall discuss these concepts in detail in this section.
We call a type primitive if it is directly supported by the underlying computer hardware, or if it has built-in language support similar to that for hardware types. We call a type composite if it is composed of member types, which we call its parts. Furthermore, the parts may be composed of parts themselves, which we call indirect parts of the original types. We apply the same terminology to objects as to the types of which they are instances.
For example, integer, pointer, and floating point types are primitive in C++. Our variable-length vector of the previous section is composite, composed from the length member and the data vector members, and a list data type would be composite as well.
For primitive objects, the definitions of the equality and copy operators are easy -- we simply compare or copy the objects' values. For composite objects, however, there are more possibilities to choose from. Even for record types (e.g. C structs), the problem is not difficult -- a member-by-member definition works well for these operators. But for more complex data structures, such as lists or trees, the question can become quite difficult. Our objective in this section is to determine how we must define these operators to give all composite types the same value semantics as primitive types.
Composite types have developed slowly over the history of programming languages. The first high-level programming language, Fortran, supported primitive data types (e.g. INTEGER and REAL), and arrays as the only method of creating composite types. Arrays were extremely important in allowing large numbers of data items to be processed, but their capabilities were quite limited. The most obvious restriction limits their component parts to be homogeneous, i.e. all of the same type. Just as important, Fortran avoided choosing a correct definition of the equality and assignment operators for arrays by not defining them at all.
The next step was to allow composite types with heterogeneous parts. These were called records in Pascal and Ada, structs in C, and eventually classes in C++. Pascal supported neither assignment nor equality operators on such types and C supported only assignment; not until C++ did a mainstream language provide default definitions of the key operators, and even then not equality.
Even record types, however, do not allow self-contained definition of complex data structures such as lists or trees. Because such structures change dynamically as programs execute, providing static definitions for them as single structures would be difficult. Instead, the approach is to define their parts as class types, and to connect the parts using pointers. When an object has parts which are allocated elsewhere, we call them remote parts. Even C++ does not attempt to properly define equality and assignment for objects with remote parts, although it provides the mechanisms for user definition.
It is our objective to provide definitions of equality and copy operators which will make even variable size data structures look and feel like primitive types or simple records. However, it is these complex objects which most stress our intuition. In order to produce consistent operator definitions, based on a clear understanding of the parts of an object, it is useful to consider some of their history in mathematics and philosophy.
First, let us look at the meaning of equality. In mathematical logic, two values of a given type are considered equal if and only if any function defined on that type returns the same result when applied to either value. Naively, this definition matches our intuition quite well -- it is natural to assume that any function will produce equal results when applied to equal values. Unfortunately, there are two serious problems with this definition.
First, it is not a good operational definition of equality, since we cannot very well try all possible functions and compare their results. Fortunately, there is another definition which does produce a workable procedure for equality testing, and corresponds to how we would naturally compare two objects by hand. This is to verify that the two objects have matching parts, with the same structure connecting the parts, and with the corresponding parts having equal values.
Second, the mathematical definition is not even a valid one for programming objects or indeed for physical objects. Recall that we want two objects to compare equal after we copy the value of one to the other. However, in most programming languages we can define functions based on the addresses of objects. In the case of two distinct objects, their addresses are not equal, and such functions can return different results for objects with equal values.
For a more subtle example of the same problem, consider our variable-length vector from the previous section. If we copy one vector to the other, and then try to test equality by comparing each component of the objects, we will correctly determine that the lengths are equal, and that the data components of the vectors are equal, but we will also determine that the pointers in the vector headers are unequal because they point to different copies of the data values. Thus we can see that even our alternate definition of equality will not work for objects containing pointers between their components, unless we treat those pointers differently from other parts. Clearly they are different from other parts of the data object, and ought to be ignored in equality comparisons. How can we more precisely characterize this distinction?
To improve our intuition, let us redirect our attention from mathematics to the physical world. Consider a chair. It has a variety of physical characteristics, such as its color, weight, arm style, number of legs, etc. Another chair might have precisely the same physical characteristics, and we would consider them to be equal. We would hold this opinion whether they were next to one another at a table, or on opposite sides of the world. That is, the location of the chair has no bearing on our view of its "value." Nor does the location of its arm or its leg.
These observations provide the basis for the most important distinction between parts of an object. We can see that in addition to the essential parts of an object, which carry its value, there are other parts which serve only to keep track of the locations of non-adjacent parts, e.g. pointers between the records which comprise a complex data structure. We will call these latter parts structural links, because they provide structural connections between the parts of an object which actually hold its value. They must not be considered in equality comparisons. As we saw earlier, they should also not be copied -- rather the remote parts to which they point must be copied, and the new link will point to the copied parts.
The distinction between essential parts of an object and its structural links is not the only relevant distinction, however. Let us return to our chair example. One chair might have a person sitting in it, while the other holds a sleeping dog. Although these facts might be viewed as attributes of the chairs, we would still consider the chairs to be equal. That is to say, the content of a chair, although it is clearly associated with the chair, is not viewed as an essential part of the chair's value.
This situation is common in programming as well. For example, an insurance company database containing records of insurance policies would probably associate with each policy a reference to another record for the customer holding the policy. However, it is unlikely that someone comparing two policies for equality would want to take the policyholder into consideration.
There is a more compelling consideration, though, involving the destruction of objects, where it is important to be clear about where one object ends and another begins. We will be writing generic code which destroys objects, and if we want it to work correctly we must follow consistent rules, which we can depend upon without knowledge of the specific type. Although one could conceivably treat each data structure as a unique case, with specialized protocols for construction and destruction, and make this work for a custom program, generic programming requires more discipline.
When we destroy an object by calling its destructor, we want to know that we have destroyed the entire object, and that we have not destroyed part or all of another object. We can deal with the destroyed object being part of a larger object -- indeed, we will destroy complex objects in general by destroying their parts -- but the destroyed object may only be part of a single object. To put this another way, we do not allow the sharing of parts between objects. If two objects share a part, one must be part of the other.
Similarly, to avoid problems with our paradigm of destroying an object by destroying its parts, we must avoid circularity among objects. That is, an object cannot be part of itself, and therefore cannot be part of any of its parts.
These requirements, inspired by the needs of destructors, provide us with the second important distinction between parts. A part is not an essential part of the value of an object if its purpose is simply to provide a link to a distinct object. Thus, in our insurance company example above, the customer link is not an essential part of the insurance policy object.
Philosophical Background

We can find these distinctions made by philosophers as far back as Aristotle, who viewed objects as consisting of form and matter. What he called an object's form is the fundamental pattern for an object, which is analogous to the concept of type in programming languages. For example, all chairs would have the same form. What Aristotle called matter is the actual physical instantiation of the object, which is analogous to the concept of value in programming languages, and is central to our search for valid copy and equality operators. Much later, in the late middle ages, John Duns Scotus added haecceity (from Latin "hic" - this) as the principle of individuation, i.e. that which distinguishes between two identical chairs or houses. Although there have been extensive metaphysical arguments about whether matter or haecceity constitutes the proper principle of individuation, in the case of computers the trichotomy is quite clear:
    form      - the type of the object
    matter    - the value of the object
    haecceity - the address of the object
This leaves us with three categories of object parts:

    essential parts, which carry the value of the object;
    structural links, which serve only to connect the parts of a single object, locating its remote parts;
    relationship links, which refer to distinct objects.
We can use these categories to finally resolve how the fundamental operators must treat composite objects. Equality should compare the essential parts of an object, following structural links to compare remote essential parts, but not comparing the structural links themselves. Moreover, equality should normally ignore the relationship links. Copy operators (i.e. assignment and copy constructors) should copy the essential parts, including remote essential parts found by following structural links, linking the remote parts in the new object with new structural links. Again, copies should normally ignore relationship links. Finally, destructors should destroy all parts of an object, following structural links to destroy remote parts, but not following relationship links to distinct objects.
< to be continued >
An ADDRESSABLE part is a part to which a reference can be obtained (through public member functions)
An ACCESSIBLE part is a part of which the value can be determined by public member functions.
Every addressable part is also accessible (if a reference is available, it's trivial to obtain the value).
An OPAQUE object is an object with no addressable parts.
An ITERATOR is an object which refers to another object, in particular, it provides operator*() returning a reference to the other object.
Two non-iterator objects are equal when all the corresponding non-iterator accessible parts are equal
Two iterators are equal when they refer to the same object (i == j iff &*i == &*j)
An implicit function AREA is defined for all objects
For the primitive objects the area is equal to sizeof()
For the composite objects the area is equal to the sum of the areas of its parts
An object is FIXED SIZE if it has the same set of parts over its lifetime
An object is EXTENSIBLE if not fixed size
A part is called PERMANENTLY PLACED if it resides at the same memory location over its lifetime (Knowing that a part is permanently placed or not allows us to know how long a pointer which points to it is valid)
An object is called PERMANENTLY PLACED if every part of the object is permanently placed
An object is called SIMPLE if it is fixed size and permanently placed
For a regular object the complexity of all the fundamental operations (constructors, destructors, assignment and equality) is linear in the area of the object.
Two objects are equal if they have the same number of ESSENTIAL parts and their corresponding essential parts are equal.
A function is REGULAR if it returns equal results for equal values of arguments.
Question: Why is it important for an optimizer to know if a function is regular?
In this section, we will develop a simple example: a function which swaps two values. It is a simple example of a generic algorithm, but more importantly, it is a key function for a number of generic algorithms, and appropriate tailoring to the characteristics of the type on which it operates can make a significant difference in the performance of generic algorithms such as sorting.
First, consider how we would normally swap two integer values:
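For two int variables a and b:

    int tmp = a;
    a = b;
    b = tmp;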
This code fragment simply creates a temporary variable initialized to the first value, copies the second value to the first variable, and finally copies the temporary variable back to the second variable. It is code which we have all written many times, and it is as efficient as possible. In fact, a clever compiler may avoid doing any data movement beyond reading the old values of the variables a and b, and writing the new values.
What does this code fragment assume? It requires that the copy constructor initialize tmp to the value of a, and that the two following assignments copy the right-hand-side values to their left-hand-side variables. Since we have placed these requirements on any regular type T, we can easily turn this code fragment into a generic function for any such type T:
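A sketch of the template:

    template <class T>
    inline void swap(T& a, T& b) {
        T tmp = a;      // copy constructor
        a = b;          // assignment
        b = tmp;        // assignment
    }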
This function is straightforward. The parameters are passed by reference, since they must be modified. Otherwise, this swap template requires no comment in the general case. It will very efficiently swap the values of a and b by copying each to the other, using a temporary intermediate.
In particular cases, however, it may be extremely inefficient. Why? Think about our variable-length integer vector type in section 2.2. There, the above code will carefully copy not only the vector length, but also the vector data. However, that is unnecessary. We can accomplish the same semantics by simply swapping the pointers to the vector data. Since those pointers are private members of the class, we can implement swap as a member function, and then invoke it from a global function with the above interface.
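A sketch, assuming the IntVec definition from earlier with data members n and v:

    class IntVec {
        int  n;     // current length
        int* v;     // pointer to the data vector
    public:
        // ... constructors, destructor, and operators as before ...
        void swap(IntVec& x) {
            ::swap(n, x.n);     // generic swap of the length members
            ::swap(v, x.v);     // generic swap of the data vector pointers
        }
    };

    inline void swap(IntVec& a, IntVec& b) {
        a.swap(b);
    }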
Observe that we swap the data members n and v of the class using our generic swap function above, since we can't do better.
The lesson here is that for a type with remote parts, the swap function can be implemented more efficiently than the general template by simply swapping the pointers to the remote parts instead of copying the remote parts. When implementing a type intended to be used with generic algorithms like sorting, it is well worthwhile to provide such a tailored swap function.
A less obvious lesson is that it is useful to recognize cases like swap where tailored versions will be much more efficient than a general version. In such cases, we can provide a template for the general version, while describing an interface that providers of new types can use to give us a more efficient version when that is beneficial. Following this principle, the algorithms in the C++ Standard Template Library consistently use the swap function when swapping values, rather than in-line code, in order to take advantage of such specialized implementations.
We discussed the optimization requirements of individual operations in section 2.2. Here, we want to give examples of a couple of traditional optimizations (e.g. common subexpression elimination, copy propagation), and point out that regular type semantics allows us to apply them at a higher level, in spite of the implementation of the fundamental operations as functions.
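For instance (a sketch, with a hypothetical function f over a regular type T): because equality on a regular type depends only on the values compared and has no side effects, a repeated test is a common subexpression even though operator== is a user-defined function, and the second call may be eliminated:

    template <class T>      // T is any regular type
    bool f(const T& a, const T& b) {
        bool r1 = (a == b);     // user-defined operator== is a function call
        // ... computation that does not modify a or b ...
        bool r2 = (a == b);     // common subexpression: may be replaced by r1
        return r1 && r2;
    }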
In our description of fundamental operations above, we described basic requirements on the standard C++ comparison operators <, <=, >=, and >. Since many algorithms are highly dependent on ordering relations and the comparison operators which represent them, and are sometimes sensitive to subtle distinctions between different possible ordering relations, we will provide a careful characterization in this section.
What is an ordering relation? In its most general form, it is a transitive binary relation on a set of values. That is, if we write "a prec b" for a pair (a,b) satisfying the ordering relation, then the following property holds:

    a prec b and b prec c implies a prec c      (transitivity)
This is a fairly weak axiom, and most ordering relations of interest satisfy additional important properties. We shall discuss several of them in the following paragraphs.
The first such property is strictness. A strict ordering is one in which a value does not precede itself. That is, the following axiom also holds:

    for no a does a prec a hold     (irreflexivity)
The standard "less than" relation on numbers is a strict order. By contrast, "less than or equal" is not strict.
Another important property produces a total ordering, that is one in which any pair of values is either equal or ordered by the relation:

    for any a and b, at least one of a prec b, b prec a, or a == b holds
These possibilities are not exclusive. Both the standard "less than" and "less than or equal" relations on numbers are total. An ordering which is not total is partial.
Finally, there are ordering relations which are extremely important in programming, and which behave much like total orderings but are not total. Consider what is required to sort a set of employee records by salary. In a company of any significant size, it is likely that there will be employees with equal salaries. Assuming that we don't care how they are ordered in the sorted list, we want an ordering relation which treats the employee records with equal salaries as though they were equal, but is otherwise a total ordering based on the salary component.
Such an ordering relation is called a weak ordering. More precisely, it divides the set being ordered into disjoint subsets which are equivalent with respect to the ordering (i.e. two elements in the same subset are ordered the same relative to all other elements), and induces a total ordering on the subsets. Weak orderings may be either strict or non-strict.
Note that although any ordering determines a decomposition of the set being ordered into equivalence classes of elements which are ordered the same relative to other elements, the set of equivalence classes need not be totally ordered in general. The weak ordering constraint is a fairly strong one.
Now that we have a rudimentary understanding of possible ordering relations, we can turn our attention to the comparison operators which implement them. As we noted in section 2.2, C++ defines the standard comparison operators <, <=, >=, and > for built-in types, as do almost all programming languages. These operators implement a total ordering on the built-in types: < and > are strict, while <= and >= are non-strict. Furthermore, these operators have a well understood relation to one another:

    a > b   if and only if  b < a
    a <= b  if and only if  !(b < a)
    a >= b  if and only if  !(a < b)
These relationships show that we could implement all four operators in terms of the < operator. In fact, STL does this, providing templates for the three other operators which produce the correct functions given a reasonable definition of the < operator. Any of the four operators could have served as the base operator in this sense, but a specific choice had to be made to avoid ambiguity.
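Those templates are essentially as follows (a sketch of the library's definitions):

    template <class T>
    inline bool operator>(const T& x, const T& y) {
        return y < x;
    }

    template <class T>
    inline bool operator<=(const T& x, const T& y) {
        return !(y < x);
    }

    template <class T>
    inline bool operator>=(const T& x, const T& y) {
        return !(x < y);
    }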
Since the composition of components in generic programming depends on equivalent behavior from the same operators (i.e. those with the same name) on different types, we require that any types defining these standard comparison operators define them with the same properties. This implies that they implement a total ordering. Given the templates for the other operators in STL mentioned above, a user defining a type need only define an appropriate < operator.
For a composite type, a suitable < operator can be defined lexicographically. That is, the first member fields are compared using their < operators; if they are equal, the second fields are compared, and so on. This is the same approach typically used to compare character strings.
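For instance, for a hypothetical struct with two int members:

    struct Pair {       // a hypothetical composite type
        int first;
        int second;
    };

    inline bool operator<(const Pair& a, const Pair& b) {
        if (a.first < b.first) return true;     // first members decide...
        if (b.first < a.first) return false;
        return a.second < b.second;             // ...with ties broken by the second
    }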
The STL algorithms which depend on orderings use these standard comparison operators by default, with these assumptions. As we observed above, however, what we want is often not a total ordering. The same STL algorithms normally also provide for use of an alternate ordering relation, in the form of a comparison operator. They consistently require that this operator implement a strict weak ordering. The choice between a strict and a non-strict ordering is arbitrary, in the sense that either one would produce working, efficient algorithms. However, one must be chosen, because some of the algorithms require subtle differences in implementation depending on the choice. In fact, some STL algorithms might not terminate given a non-strict comparison operator.
It is important to make one further point about comparison operators. The performance of many algorithms is dominated by the cost of comparisons. Furthermore, some of those algorithms make different choices for the cases a<b, a==b, and a>b. Typically, a comparison function can distinguish these three cases just as easily as it can produce a Boolean result for one of the standard two-way operators. In cases where the comparison is expensive, for example character strings, using a three-way comparison operator which returns a negative, zero, or positive integer can produce a significant performance improvement. We shall see several such cases in this book.
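C programmers already use this convention with strcmp; a sketch of the calling pattern:

    #include <string.h>

    void classify(const char* s1, const char* s2) {
        int c = strcmp(s1, s2);     // one pass distinguishes all three cases
        if (c < 0) {
            // handle s1 < s2
        } else if (c == 0) {
            // handle s1 == s2
        } else {
            // handle s1 > s2
        }
    }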
In this section, we are going to discuss one of the more common generic applications of the ordering relations from the preceding section, namely the min and max functions. In doing so, we will encounter a number of important programming issues.
First, we will thoroughly investigate a simple template for the min function. For this simplest form, we need only assume that we have an operand type for which the "<" operator provides a strict total order. Recall from section 2.2 that this requirement is satisfied by all C++ built-in types, and that it is easily satisfied for any user-defined type simply by defining "<" lexicographically.
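Reflecting the line-by-line discussion below, the template reads along these lines:

    template <class StrictTotalOrderable>
    inline StrictTotalOrderable& min(StrictTotalOrderable& x,
                                     StrictTotalOrderable& y) {
        return y < x ? y : x;   // returns x, the first operand, on ties
    }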
Let us work our way through this example line by line. In the template header, we specify a template parameter "StrictTotalOrderable." This reflects our requirement that the operand type be totally ordered by "<". Of course, the name carries no implicit meaning to a C++ compiler, so there will be no compiler enforcement of our requirement. It would be ideal if the language allowed us to specify a concept template parameter, i.e. a type or class along with the additional requirements we wished to impose, such as a "<" operator in this case. However, this ideal will need to wait for a future language design.
We begin the second line by declaring our function inline. This is an easy choice, since the body of the function will seldom be more code than a call would have been, and will always be faster. Inlining the call will provide the compiler with maximum information for optimization. More interesting in this line is the return type. Why do we return a reference instead of a value? This is because we want to be able to do the following:
min(x,y) = 5;
This statement assigns the value 5 to whichever of x and y contained the smaller value, regardless of what that smaller value was. In general, this treatment allows us to use min to identify which of two variables contains the smaller value and use that knowledge directly in contexts where we do not care what the actual minimum value is. Of course, this usage is not legal when the operands are not C++ lvalues (i.e. assignable objects), but our definition still supports all the other uses of min for such operands.
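A short concrete illustration:

int x = 3, y = 7;
min(x, y) = 5;   // x held the smaller value, so x becomes 5; y remains 7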
The next lines are our formal parameter declarations. Again, the interesting question is why we pass the parameters by reference rather than by value. There are several reasons. They may be large objects, and will promptly be passed to the "<" operator -- there is no reason to make extra copies. In an inline function, the reference parameters give the compiler the most flexibility for optimization, since they do not require the semantics of a full copy. Since this function does not modify its parameters, there is no reason to demand copy semantics. Finally, of course, in order to return a reference to the smaller actual parameter, we must pass a reference rather than a value in the first place.
The last line of interest is the function body. It is straightforward, raising no obvious questions, but it does reflect a subtle decision. Why don't we write it as:
return x < y ? x : y;
The answer concerns how we want this function to behave in the case where the two operands are equal. In particular, we want it to consistently return the first operand in that case -- this will often prevent unnecessary activity in the caller. With the test written as x < y, equal operands make the test false and y is returned; writing the test as y < x instead returns the first operand, x, whenever the operands are equal.
Next, we must understand what this simple min function cannot do. The first problem is a result of the behavior of the C++ type system. Consider what happens when x and/or y are const objects, a case which will often occur inside functions with const parameters. The non-const reference parameters do not accommodate const arguments well: in the mixed case, where only one operand is const, the two parameters deduce different types for "StrictTotalOrderable" and the call fails, and more generally the compiler will object wherever passing a const value would silently discard the const attribute. In order to deal with this, we need to define an alternate form of the template:
template <class StrictTotalOrderable>
inline const StrictTotalOrderable& min(
    const StrictTotalOrderable& x,
    const StrictTotalOrderable& y)
{
    return y < x ? y : x;
}
Observe that this version is identical to the base version except for the const attributes. If either of the actual parameters is const, this version will be matched instead. Because we return a reference to one of the actual parameters, the result type must also carry the const attribute.
The second inadequacy of the original version is a more general generic programming issue. The templates above use the operator "<" -- this is the standard definition and by far the most common interpretation for numeric types, but for user-defined types we are much more likely to be interested in alternate orderings. The ordering we wish to use may not meet our requirements for "<". In particular, it may not be a total ordering. For instance, we often need to compare two structures on the basis of a subset of their fields, producing an ordering in which they may compare as equivalent because those fields are equal, even though other fields are not. Even if we want to use a total ordering, there may be more than one total ordering of interest, and the current one may not be defined with the built-in operator "<".
For all of these reasons, we need a version of the template in which the comparison operator is itself a parameter. This new comparison operator does not need to define a total ordering -- a strict weak ordering will do. Therefore, we will rename the template formal parameter for the operands, although this has no real semantic effect in C++.
template <class StrictWeakOrderable, class StrictWeakOrderingRelation>
inline StrictWeakOrderable& min(
    StrictWeakOrderable& x,
    StrictWeakOrderable& y,
    StrictWeakOrderingRelation less)
{
    return less(y, x) ? y : x;
}

template <class StrictWeakOrderable, class StrictWeakOrderingRelation>
inline const StrictWeakOrderable& min(
    const StrictWeakOrderable& x,
    const StrictWeakOrderable& y,
    StrictWeakOrderingRelation less)
{
    return less(y, x) ? y : x;
}
This pair of templates is completely analogous to the preceding pair, and all of the earlier comments apply. Only the second template formal parameter and the third function formal parameter are new, and lead to new observations. First, we note that the new parameter is declared with a new class type, StrictWeakOrderingRelation, and used like a function. This will typically be what we call a function object, which we will discuss more thoroughly in the next chapter.
Second, observe that the new parameter is passed by value. Why? As a function object, the required function address will usually be implicit in its type, and the object itself will often be smaller than a pointer. Passing it by value avoids forcing the caller to pass a pointer, and the callee to dereference it, in those cases where the call is not inlined.
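For illustration, here is a hypothetical function object (the name LessFirst and the choice of representation are ours) implementing the ordering used in the commutativity example later in this section -- a strict weak ordering on pairs of integers which compares only the first elements:

#include <utility>

struct LessFirst {
    // Strict weak ordering: pairs with equal first elements are
    // equivalent under this relation even if their second elements differ.
    bool operator()(const std::pair<int, int>& a,
                    const std::pair<int, int>& b) const
    { return a.first < b.first; }
};

void example()
{
    std::pair<int, int> p(0, 1), q(0, 2);
    std::pair<int, int>& m = min(p, q, LessFirst());
    // p and q are equivalent under LessFirst, so m refers to p.
}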
Now that we have a full set of min function templates, it is straightforward to produce the corresponding max function templates -- we leave this as an exercise to the reader. The only tricky aspect is that we place the same requirement on max as we did on min, namely that it must return a reference to the first parameter if the two are equivalent.
We will close the discussion of the min and max operators by commenting on several of their properties. It is trivially verified that they are idempotent:

min(x,x) == x
max(x,x) == x
More importantly, min and max are associative operators, i.e.

min(min(x,y), z) == min(x, min(y,z))
max(max(x,y), z) == max(x, max(y,z))
Finally, these are almost commutative operators, with two exceptions. These exceptions may or may not be important for any particular use, so we will note them explicitly. As long as the relational operator used induces a total ordering, as "<" does, they will be commutative in terms of the result returned, but not in terms of the results' addresses. That is:

min(x,y) == min(y,x)
&min(x,y) != &min(y,x)   (possible when x == y)
If the ordering is not total, the operators are not even commutative in terms of values, in the case of two operands which are equivalent but not equal. For example, consider the case of a pair-of-integers operand type, where the relational operator compares the first element of the pair and ignores the second. In this case, we have:

min((0,1), (0,2)) == (0,1)
but:

min((0,2), (0,1)) == (0,2)

because min returns its first operand when the two are equivalent.
In spite of these exceptions, however, the min and max operators may be treated as commutative for many purposes.
To conclude this section, we will take a brief look at another algorithm which is quite similar to min/max, and can use them in its implementation. It is median, which finds the median, or middle, element of a set. Specifically, we will look at a function of three values which returns a value that is neither larger nor smaller than the other two. (This rather odd statement, of course, is necessary because equal or equivalent operands may mean that there is no strictly middle value.) Our interest in this function, other than its similarity to the min/max functions we have just studied, results from its use in other algorithms we will see later, such as efficient implementation of quicksort.
We will impose the same requirements on median as we did on min and max. In particular, for the base version, we require a "<" operator inducing a strict total ordering, and if more than one of the operands is equivalent to the median value, we want to return the first. With these requirements stated, here is a template for median:
template <class StrictTotalOrderable>
inline StrictTotalOrderable& median(
    StrictTotalOrderable& x,
    StrictTotalOrderable& y,
    StrictTotalOrderable& z)
{
    // Built from the min and max templates above (max is the exercise);
    // this form returns the first operand equivalent to the median value.
    return max(min(x, y), min(max(x, y), z));
}
We will not analyze this function as we did min, but we encourage the reader to do so as an exercise. Moreover, the reader should attempt to define the same alternate forms as were necessary for min and max. Finally, implementing a function to return the median of 5 values is an interesting exercise.