Large-Scale C++ Software Design
John Lakos

logical design
that which pertains to modeling, IsA, HasA, UsesA, classes, functions, operators, public vs private, member function vs free function, virtual vs non-virtual
physical design
that which pertains to partitioning, DependsOn, files, directories, libraries, compile and link coupling, cycles
component
the smallest unit of physical design
package
a collection of components organized as a physically cohesive unit
encapsulation
contained implementation details (type, data, or function) are not accessible programmatically through the interface of the component; a logical property of design
insulation
contained implementation details (type, data, or function) can be altered without forcing clients of the component to recompile; a physical property of design
internal linkage
name is local to its translation unit and cannot collide at link time with an identical name in another translation unit
external linkage
name can interact with other translation units at link time (e.g. global data)
compile-time dependency
Y depends on X if x.h is needed in order to compile y.c
link-time dependency
Y depends on X if y.o contains undefined symbols that x.o will help resolve (either directly or indirectly) at link time
handle
a class that maintains a pointer to an object that is programmatically accessible through the public interface of the handle class [430]

Politically incorrect

  1. The component is the fundamental unit of design.
  2. All classes for a component go in one header and one source file.
  3. Physical design must run in parallel with logical design.
  4. Common sense in physical design is more important than ideological integrity in logical design.
  5. Clients must #include everything they are directly or indirectly dependent upon.
  6. The theory of OO encapsulation is often the problem.
  7. Manager classes are okay.
  8. Encapsulation does not remotely approximate insulation.
  9. A small set of coding rules is more important than high-powered tools.
  10. Compile-time dependencies are inconvenient; link-time dependencies are a killer.

Summary

  1. Small project experience does not scale to large projects.
  2. A sound physical design is essential to the success of larger systems.
  3. Common sense design rules make physical dependencies explicit. [ch 2]
  4. The component (not the class) is the fundamental unit of design. [ch 3]
  5. Hierarchical testing improves reliability while reducing costs. [ch 4]
  6. CCD (Cumulative Component Dependency) is a metric for monitoring the link-time cost of incremental regression testing. [ch 4]
  7. Techniques exist for untangling cyclically-dependent designs. [ch 5]
  8. Techniques exist for insulating clients from implementation details. [ch 6]
  9. The package extends these concepts to larger projects. [ch 7]

Miscellaneous

  1. Place a redundant include guard around each include directive. [85]
  2. Explicitly include all header files you depend on; do not rely on one header file to include another. [113]
  3. The dominant purpose of a name prefix is to identify uniquely the physical package in which the component or class is defined. [490]
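
A minimal sketch of items 1 and 3, shown as one listing so that it compiles as a single file. The package prefix `geom`, the component `geom_point`, and the guard macro names (`INCLUDED_GEOM_POINT`, `INCLUDED_CSTDLIB`) are hypothetical, chosen only for illustration.

```cpp
// ---- geom_point.h (hypothetical component in a package named "geom") ----
#ifndef INCLUDED_GEOM_POINT          // internal guard: prevents double definition
#define INCLUDED_GEOM_POINT

class geom_Point {                   // the "geom_" prefix identifies the owning package
    int d_x;
    int d_y;
  public:
    geom_Point(int x, int y) : d_x(x), d_y(y) {}
    int x() const { return d_x; }
    int y() const { return d_y; }
};

#endif

// ---- geom_pointutil.h (hypothetical client header) ----
// Redundant guard around the include directive itself: once the macro is
// defined, the preprocessor skips the #include line, so the header file is
// not even reopened.
#ifndef INCLUDED_CSTDLIB
#include <cstdlib>
#define INCLUDED_CSTDLIB
#endif

// If geom_point.h were a separate file, the directive here would read:
//   #ifndef INCLUDED_GEOM_POINT
//   #include "geom_point.h"
//   #endif

struct geom_PointUtil {
    // Distance from the origin measured along the axes.
    static int manhattanLength(const geom_Point& p)
    {
        return std::abs(p.x()) + std::abs(p.y());
    }
};
```

Modern compilers often detect the internal guard and avoid rereading the file on their own, so the redundant guard matters mostly on older toolchains.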

Ground rules

  1. All definitions with external linkage should be declared in the component's .h file.
  2. When something has external linkage, it's not okay to forward declare it in the .c file.
  3. Only #include what is needed to compile in isolation.
  4. Don't rely on other files to #include .h files that you "depend" on.
  5. The component's .h file must be the first #include in the component's .c file.
  6. Then #include all other .h files from least global to most global.

Major design rules

  1. Put global data in a struct. [70]
  2. Avoid free functions (except operator functions) at file scope in .h files. [72]
  3. Avoid free functions with external linkage (including operator functions) in .c files. [72]
  4. Avoid preprocessor macros in .h files. [75]
  5. Only classes, structs, unions, and free operator functions should be declared at file scope in .h files. [77]
  6. Only classes, structs, unions, and inline functions should be defined at file scope in .h files. [77]
  7. Include a header file only if you make direct substantive use of a class or free function defined in the header. [135]
  8. A .h file should only export what it must.
  9. All definitions with external linkage go in the component's .c file and the declarations go in the component's .h file. [115]
  10. No local declarations in the .c file for entities with external linkage. These declarations belong in the appropriate .h file and they are accessed by #include'ing that .h file.
  11. Do not use a local declaration for a non-local definition. Include the necessary .h file instead. [119]
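
A short sketch of how internal and external linkage look in practice, combined with rule 1 (global data in a struct). The component name `ex_registry` and every identifier in it are hypothetical; the two "files" are shown in one listing so it compiles on its own.

```cpp
// ---- would live in ex_registry.h ----
struct ex_Registry {                 // rule 1: global data wrapped in a struct
    static int s_entryCount;         // declaration only; the name has external linkage
    static int nextId();             // member function; external linkage
};

// ---- would live in ex_registry.c ----
int ex_Registry::s_entryCount = 0;   // the one definition with external linkage

const int MAX_ID = 8;                // const data at file scope: internal linkage

static int s_lastId = 0;             // static file-scope data: internal linkage

static int clampId(int id)           // static free function: internal linkage
{
    return id > MAX_ID ? MAX_ID : id;
}

int ex_Registry::nextId()
{
    ++s_entryCount;
    s_lastId = clampId(s_lastId + 1);
    return s_lastId;
}
```

Only `ex_Registry` and its members are visible to the linker; `MAX_ID`, `s_lastId`, and `clampId` cannot collide with identical names in other translation units.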

Components

  1. The root names of the .c and .h files should match exactly. [110]
  2. The .c file of every component should include its own .h file as the first line of code (even before system include files). [110]
  3. Logical entities declared within a component should not be defined outside that component. [108]
  4. A component defining a function will usually have a physical dependency on any component defining a type used by that function. [127]
  5. Avoid cyclic dependencies among components. [185]
  6. Friendship within a component is an implementation detail of that component. [137]
  7. Granting (local) friendship to classes defined within the same component does not violate encapsulation. [139]
  8. Friendship affects access privilege but not implied dependency [141]. Granting friendship does not create dependencies but can induce physical coupling in order to preserve encapsulation [308].
  9. Escalating the level at which encapsulation occurs can remove the need to grant private access to cooperating components within a subsystem. [315]
  10. Defining an iterator class along with a container class in the same component enables user extensibility, improves maintainability, and enhances reusability while preserving encapsulation. [140]
  11. Minimizing the number and size of exported header files enhances usability. [503]
  12. Minimizing the use of externally defined types in a component's interface facilitates reuse in a wider variety of contexts. [558]
  13. A good test for encapsulation is to see whether a given interface will simultaneously support two significantly different implementation strategies without modification. [562]
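
Items 6, 7, and 10 can be sketched with a container and its iterator defined in the same component. The names `ex_IntBag`, `ex_IntBagIter`, and `ex_IntBagUtil` are hypothetical, and `std::vector` is just one possible implementation choice.

```cpp
#include <cstddef>
#include <vector>

class ex_IntBagIter;

class ex_IntBag {
    std::vector<int> d_items;        // implementation detail, never exposed
    friend class ex_IntBagIter;      // local friendship: same component
  public:
    ex_IntBag() {}
    void add(int value) { d_items.push_back(value); }
};

class ex_IntBagIter {
    const ex_IntBag& d_bag;
    std::size_t      d_index;
  public:
    explicit ex_IntBagIter(const ex_IntBag& bag) : d_bag(bag), d_index(0) {}
    int  isValid() const { return d_index < d_bag.d_items.size(); }
    void next() { ++d_index; }
    int  value() const { return d_bag.d_items[d_index]; }
};

// A client-side extension that would live in a different component.
// It needs no friendship because the iterator is part of the public interface.
struct ex_IntBagUtil {
    static int sum(const ex_IntBag& bag)
    {
        int total = 0;
        for (ex_IntBagIter it(bag); it.isValid(); it.next()) {
            total += it.value();
        }
        return total;
    }
};
```

Because clients see only `add` and the iterator, the vector could be replaced by a linked list without changing `ex_IntBagUtil::sum`, which is the two-implementations test from item 13.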

Testing

  1. Distributing system testing throughout the design hierarchy can be much more effective per testing dollar than testing at only the highest level interface. [159]
  2. Testing a component in isolation is an effective way to ensure reliability. [162]
  3. Hierarchical testing requires a separate test driver for every component. [175]
  4. Testing only the functionality directly implemented within a component enables the complexity of the test to be proportional to the complexity of the component. [178]
  5. Components that use objects "in name only" can be thoroughly tested, independently of the named object. [250]
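
A rough sketch of a per-component test driver in the spirit of items 2 to 4. The component `ex_Counter` and the driver function name are hypothetical; a real driver would normally be a separate file with its own `main`, which is replaced here by an ordinary function so the listing stays self-contained.

```cpp
// ---- component under test (hypothetical): ex_counter.h / ex_counter.c ----
class ex_Counter {
    int d_count;
  public:
    ex_Counter() : d_count(0) {}
    void increment() { ++d_count; }
    void reset() { d_count = 0; }
    int  count() const { return d_count; }
};

// ---- stand-alone driver for this component only ----
// Exercises just the functionality implemented directly in ex_Counter.
// Returns 0 on success, otherwise the number of failed checks.
int ex_counterTestDriver()
{
    int failures = 0;

    ex_Counter c;
    if (c.count() != 0) ++failures;      // default-constructed state

    c.increment();
    c.increment();
    if (c.count() != 2) ++failures;      // increment accumulates

    c.reset();
    if (c.count() != 0) ++failures;      // reset restores the initial state

    return failures;
}
```

The driver links against nothing above `ex_Counter` in the hierarchy, so its size tracks the complexity of the component rather than of the whole system.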

Levelization (breaking dependencies)

  1. Escalation. Moving mutually dependent functionality higher in the physical hierarchy. [325]
  2. Demotion. If peer components are cyclically dependent, it may be possible to demote the interdependent functionality from each of these components to a potentially new lower-level (shared) component upon which each of the original components depends. [229]
  3. Opaque pointers. A pointer is said to be opaque if the definition of the type to which it points is not included in the current translation unit [251]. Components that use objects "in name only" can be thoroughly tested, independently of the named object.
  4. Dumb data. Refers to a generalization of the concept of opaque pointers. Dumb data is any kind of information that an object holds but does not know how to interpret. Such data must be used in the context of another object, usually at a higher level. [257]
  5. Redundancy. Deliberately repeating code or data in order to avoid unwanted coupling brought on by reuse. [269]
  6. Callbacks. A callback is a function, provided by a client to a subsystem, that allows the callee to perform a specific operation in the context of the caller. [275]
  7. Manager class. The idea of a "mediator". Creating a class that owns and coordinates lower-level objects. Often makes a system easier to understand and maintain. [290]
  8. Factoring. Factoring means extracting pockets of cohesive functionality and moving them to a lower level where they can be independently tested and reused. [294]
  9. Escalating encapsulation. The idea of a "facade" or "wrapper". Moving the point at which implementation details are hidden from clients to a higher level in the physical hierarchy. Escalating the level at which encapsulation occurs can remove the need to grant private access to cooperating components within a subsystem. [315]
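
Two of the techniques above, opaque pointers (with dumb data) and callbacks, can be sketched together. All names (`ex_Window`, `ex_WindowRef`, `ex_Timer`, `ex_TickCounter`) are hypothetical, and the `void *` client-data convention is one common way to write a callback, not necessarily the only one.

```cpp
class ex_Window;                          // used "in name only"; never defined here

// Opaque pointer: the holder stores the address but cannot look inside
// the object it names, so it has no compile-time dependency on ex_Window.
class ex_WindowRef {
    ex_Window *d_window_p;                // ex_Window is an incomplete type here
  public:
    explicit ex_WindowRef(ex_Window *window) : d_window_p(window) {}
    ex_Window *window() const { return d_window_p; }
};

// Callback: the low-level timer knows only a function-pointer type.
typedef void (*ex_TickCallback)(void *clientData);

class ex_Timer {
    ex_TickCallback d_callback;
    void           *d_clientData_p;       // dumb data: held but never interpreted
  public:
    ex_Timer(ex_TickCallback callback, void *clientData)
    : d_callback(callback), d_clientData_p(clientData) {}
    void tick() const { d_callback(d_clientData_p); }
};

// ---- higher-level client; ex_Timer does not depend on it ----
struct ex_TickCounter {
    int d_ticks;
    static void onTick(void *self)
    {
        // Only the client knows how to interpret the dumb data.
        ++static_cast<ex_TickCounter *>(self)->d_ticks;
    }
};
```

The dependency arrow points from `ex_TickCounter` down to `ex_Timer` only, so `ex_Timer` can be tested in isolation with any stub function.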

OOD

  1. A protocol class is a nearly perfect insulator. [386]
  2. A protocol class can be used to eliminate both compile- and link-time dependencies. [389]
  3. Holding only a single opaque pointer to a structure containing all of a class's private members enables a concrete class to insulate its implementation from its clients. [402]
  4. All fully insulated implementations can be modified without affecting any header file. [404]
  5. Deciding what to insulate, when, how much, and at what cost raises many issues. [448-462]
  6. Settling for less than full encapsulation is sometimes the right choice. [571]
  7. The indiscriminate use of callbacks can lead to designs that are difficult to understand, debug, and maintain. [279]
  8. The need for callbacks can be a symptom of a poor overall architecture. [282]
  9. Establishing hierarchical ownership of lower-level objects makes a system easier to understand and more maintainable (triangle decomposition). "Dumb-bell" decompositions are bad. [290]
  10. Hiding header files from clients is no substitute for proper encapsulation. It may make programmatic access difficult, but it is routinely still possible. Additionally, clients will still be impacted if implementation choices change. [316]
  11. Virtual functions implement variation in behavior; data members implement variation in value. [601]
  12. Member functions that are not public expose general users to uninsulated implementation details. [613]
  13. A variety of problems can be solved by adding an extra level of indirection. [671]
  14. Design patterns are an effective way of communicating reusable concepts and ideas at an architectural level. [731]
  15. Design patterns, like the design process itself, address both logical and physical issues. [732]
  16. Self-registering objects are "cute". Non-invasive extension mechanisms are an inappropriate dream.
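
Items 1 to 4 can be sketched as a protocol class plus a concrete class that holds a single opaque pointer to its private members. The names `ex_Shape`, `ex_Square`, `ex_Square_i`, and `d_this` are hypothetical, and the file boundaries are only comments so that the listing compiles as one file.

```cpp
// ---- ex_shape.h: protocol class ----
// No data members, no inline functions, only pure virtual functions apart
// from the destructor.
class ex_Shape {
  public:
    virtual ~ex_Shape();                 // declared first, defined out-of-line
    virtual double area() const = 0;
};

// ---- ex_shape.c ----
ex_Shape::~ex_Shape() {}

// ---- ex_square.h: concrete class with an insulated implementation ----
struct ex_Square_i;                      // private members, declared in name only

class ex_Square : public ex_Shape {
    ex_Square_i *d_this;                 // the single opaque pointer
    ex_Square(const ex_Square&);             // declared but not implemented in this sketch
    ex_Square& operator=(const ex_Square&);  // declared but not implemented in this sketch
  public:
    explicit ex_Square(double side);
    virtual ~ex_Square();
    virtual double area() const;
};

// ---- ex_square.c: the only place that sees the layout ----
struct ex_Square_i {
    double d_side;
};

ex_Square::ex_Square(double side) : d_this(new ex_Square_i)
{
    d_this->d_side = side;
}

ex_Square::~ex_Square() { delete d_this; }

double ex_Square::area() const { return d_this->d_side * d_this->d_side; }
```

Adding a member to `ex_Square_i` touches only the .c portion, so clients of either header would not need to recompile; the price is an extra allocation and an indirection on every access.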

C++

  1. Default arguments can be an effective alternative to function overloading, especially where insulation is not relevant. [619]
  2. Never pass a user-defined type to a function by value; pass it by const reference. [622]
  3. Whenever a parameter is passed by reference or pointer, and it is neither modified nor stored, the parameter should be declared const. [629]
  4. Avoid declaring parameters passed by value as const. [629]
  5. Passing in the address of a previously constructed object to be assigned the return value (called return by argument) can improve performance while preserving total encapsulation. [565]
  6. Returning a non-const object from a const member function can rupture the const-correctness of a system. [607]
  7. Avoid declaring a function inline whose body produces object code that is larger than the object code produced by the equivalent non-inline function call itself. [631]
  8. Avoid declaring a function inline that the compiler will not inline. [632]
  9. Explicitly declare (either public or private) the constructor and assignment operator for any class defined in a header file, even when the default implementations are adequate. [650]
  10. In general, an object cannot be copied (or moved) using a bitwise copy. [721]
  11. In every class that declares or is derived from a class that declares a virtual function, explicitly declare the destructor as the first virtual function in the class and define it out-of-line. [651]
  12. In classes that do not otherwise declare virtual functions, explicitly declare the destructor as non-virtual and define it appropriately (either inline or out-of-line). [654]
  13. Supplying support for derived-class authors in the form of protected member functions of a base class exposes public clients of the base class to uninsulated implementation details of the derived classes. [364]
  14. The construction of each non-local static object in a program potentially contributes to invocation time. [533]
  15. Avoid "hiding" a non-virtual base class function in a derived class. [602]
  16. Static member functions are commonly used to implement non-primitive functionality in a separate utility class. [604]
  17. Avoid "cast" operators, especially to fundamental integral types; make the conversion explicit instead. [649]
  18. Constructors that enable implicit conversion, especially from widely used or fundamental types (e.g., int), erode the safety afforded by strong typing. [646]
  19. Instrumenting global operators new and delete is a simple but effective way to understand and test the behavior of dynamic memory allocation within a system. [692]
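
A few of these guidelines (items 2, 3, 5, 6, 9, 12, and 18) can be sketched in one small class. `ex_Name` and `ex_NameUtil` are hypothetical, and `std::string` is used only for brevity.

```cpp
#include <string>

class ex_Name {
    std::string d_text;
  public:
    explicit ex_Name(const std::string& text);   // item 18: no implicit conversion
    ex_Name(const ex_Name& other);               // item 9: declared explicitly
    ex_Name& operator=(const ex_Name& rhs);      // item 9: declared explicitly
    ~ex_Name();                                  // item 12: non-virtual, no virtual functions
    const std::string& text() const;             // item 6: const access from a const member
};

ex_Name::ex_Name(const std::string& text) : d_text(text) {}
ex_Name::ex_Name(const ex_Name& other) : d_text(other.d_text) {}
ex_Name& ex_Name::operator=(const ex_Name& rhs)
{
    d_text = rhs.d_text;
    return *this;
}
ex_Name::~ex_Name() {}
const std::string& ex_Name::text() const { return d_text; }

struct ex_NameUtil {
    // Items 2 and 3: a user-defined type that is neither modified nor stored
    // is passed by const reference.
    static int length(const ex_Name& name)
    {
        return static_cast<int>(name.text().size());
    }

    // Item 5: return by argument. The caller supplies an already constructed
    // object, passed by address because it is modified.
    static void join(ex_Name *result, const ex_Name& first, const ex_Name& last)
    {
        *result = ex_Name(first.text() + " " + last.text());
    }
};
```

Non-primitive operations such as `join` sit in a separate utility struct as static member functions, which also matches item 16.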

General programming

  1. Use assert statements to document the assumptions made in implementation. [90]
  2. For functions that return an error status, an integral value of 0 should always mean success. [615]
  3. Functions that answer a yes-or-no question should be worded appropriately (e.g. isValid()) and return an int value of either 0 (no) or 1 (yes). [617]
  4. In a procedural interface, having clients explicitly destroy only those objects that they explicitly create reduces confusion over ownership and can lead to improved performance. [437]
  5. Functions should never take the address of an argument passed by reference and store it in a location that will persist after the function terminates. If storing an address is necessary, the client should be required to pass the address explicitly (a pointer parameter). [626]
  6. Avoid using short in the interface; use int instead. [633]
  7. Avoid using unsigned in the interface; use int instead. [637,667]
  8. Avoid using long in the interface; use assert(sizeof(int) >= 4) and use either int or a user-defined large-integer type instead. [642]
  9. Avoid using float or long double in the interface; use double instead. [645]
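
Items 1 to 3 and 6 to 8 can be illustrated with a small date class. `ex_Date` is hypothetical and its validity check is deliberately simplified (it ignores month lengths), so treat it as a sketch of the conventions rather than a usable calendar type.

```cpp
#include <cassert>

class ex_Date {
    int d_year;                          // plain int throughout the interface
    int d_month;
    int d_day;
  public:
    // Yes-or-no question: returns 1 (yes) or 0 (no).
    static int isValid(int year, int month, int day);

    ex_Date(int year, int month, int day);

    // Returns 0 on success and a non-zero status on failure; the object
    // is left unchanged on failure.
    int setMonth(int month);

    int month() const;
};

int ex_Date::isValid(int year, int month, int day)
{
    // Simplified range check for illustration only.
    return year >= 1 && month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

ex_Date::ex_Date(int year, int month, int day)
: d_year(year), d_month(month), d_day(day)
{
    assert(isValid(year, month, day));   // documents the precondition
}

int ex_Date::setMonth(int month)
{
    if (!isValid(d_year, month, d_day)) {
        return -1;                       // non-zero: failure
    }
    d_month = month;
    return 0;                            // zero: success
}

int ex_Date::month() const { return d_month; }
```

With 0 meaning success, a caller can write `if (date.setMonth(m)) { /* handle error */ }` and later distinguish failure causes by status value without changing the signature.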