Introduction

How does a person learn to program? One answer has likened the process to learning to play chess: This book addresses itself to steps 2 and 3. It assumes the reader has already been introduced to programming and Java. The focus here is on: practicing problem solving, seeking insight, and leveraging algorithms and data structures.

The core values emphasized throughout this book are:

Practice, practice, practice

We are what we repeatedly do. Excellence, then, is not an act, but a habit. [Aristotle]
In real estate, the three key considerations for every piece of property are: location, location, and location. In programming, the same can be said about actually writing programs.
Good judgment is the result of experience. Experience is the result of bad judgment. [Mark Twain.
There is no substitute for experience; and experience comes from mistakes. Optimally we learn from the mistakes of others. Typically we learn from our own mistakes.

The current mantra of software engineering is, In methodology we trust. While analysis and design are indispensible, the practice of programming is where: requirements are ultimately settled, problems are concretely understood, and solutions are actually crafted. Whether its called: prototyping, proof of concept, exploratory analysis, construction, integration, maintenance, or refactoring; it is all programming. And like every other discipline, success is largely a function of practice.

The good thing about up-front design is that you have a unifying vision. The bad thing is that it contains hundreds or thousands of guesses, most of which are probably wrong. [Kent Beck]

No matter how much analysis you do, there are some things about a system that won't reveal themselves until design time, and more things that won't reveal themselves until a program is up and running. Because of this, its critical to move fairly quickly through analysis and design to implement a test of the proposed system. [Bruce Eckel]

The problem cannot truly be understood until the solution exists. [Robert C. Martin]

As a demonstration of the pilgrimage of programming, consider the following example from The Elements of Programming Style.

The code was originally written in FORTRAN. In that language, array indices start at 1. The implementation has been modified to reflect Java's 0-based indexing. What is the purpose of this fragment of code?

Integer division is being used here. That means any remainder is truncated. As a result, whenever "j" is greater than "i", or "i" is greater than "j", the result is zero. The only situation where the result is not zero is when "i" equals "j". The purpose of this fragment is to initialize the 2-dimensional array as an identity matrix (all elements on the diagonal are "1" and all other elements are "0").

This code is exceedingly clever, but is it too clever for its own good? Code is read many more times than it is written. Any maintenance programmer facing this code is very likely to believe that something dark and sinister is going on here, and will superstitiously label the code "hands off". Another issue might be performance - every assignment requires two divisions and a multiplication.

Can we refactor this implementation so that it is more reader-friendly? The following implementation is entirely self-documenting.

But – every iteration of the inner loop requires an if-else evaluation. Is there an alternative with less overhead? Lets try replacing the if-else block with a ternary operator expression. The overhead seems to have been reduced but has it really? The ternary operator is more compact for the human reader, but, as far as the compiler is concerned, the byte code is probably all too similar. Lets try one more idea - All elements are assigned a value of zero, and then a single element in each row is re-assigned a value of one. While this implementation might be too pedestrian for some; for all of us up to our hip-pockets in alligators - simple is enormously beautiful!

It took us many attempts, but significant progress was made. It has been said, The planning is everything; the plan is nothing. In our pilgrimage the end product has improved, but perhaps the real benefit has been the journey itself. The discipline of programming is about: practice, practice, practice.

Less is more

Engineering is the art of not constructing; or, to define it rudely but not inaptly, it is the art of doing that well one dollar which any bungler can do with two. [Arthur Wellington]

A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away. [Antoine de Saint-Exupery]

It takes uncommon effort to resist the siren call of complexity. Surrending to a mechanical, brute-force approach to programming is very common; but equally unfortunate. The fruits of that style of programming are large, unwieldy systems. Over time, these applications become increasingly bloated, and eventually fail under their own weight. The alternative to this style of programming is a commitment to reflection, minimalism, and simplicity.

Methodologies can also contribute to complexity. They generally promote a world view of: say what you do and do what you say, or, plan your work and work your plan. Given their plug and chug approach to problem solving, and their preoccupation with models, documents, and artifacts; it is easy to take your eye off the ball, and believe that institutional formalisms have supplanted individual creativity as the dominant source of leverage in building software solutions. It might even be reasonable to suggest that methodologies can often spawn bureaucracies.

Too often we confuse effort and progress. [Fred Brooks]

While many fine, useful software systems have been developed by committees, those software systems that have excitedpassionate fans are those that product of one or a few designing minds, great designers. Consider Unix, APL, Pascal, Modula, Smalltalk, Fortran; and contrast them with COBOL, PL/I, Algol, MVS/370, and MS-DOS. [Fred Brooks]

In software, it is the energies of creative people that make all the difference in the world. A handful of creative people are able to maintain architectural focus, and resist divergent demands that would result in big and bloated.
The best people outperform the worst by 10:1. The best outperform the median performer by 2.5:1. The better-than-median performers outperform the other half by 2:1. [Tom Demarco and Tim Lister]
Small is beautiful. Simple is sexy.
A five-to-one ratio between the shortest and longest implementations is quite typical, and usually the shorter programs are better. [Peter DeGrace and Leslie Hulet Stahl]
But – small is hard. It takes uncommon commitment and unflagging discipline. Much of the genuis of the Gettysburg Address was due to its attention to focus and economy of words.
If I had had more time, I would have written less. [Mark Twain]

I have made this letter longer than usual, because I lack the time to make it short. [Blaise Pascal]

Lets consider a word count program. If I were to: start at the beginning, put one foot in front of the other, and not stop until the problem is solved; I might produce an implementation like that in listing 0.1. Notice the insertWord() function. It sequentially searches the supplied wordList parameter, and manually inserts each newWord at the appropriate position. If instead of this all-too-pragmatic, step-by-step approach, I were to imagine a more fanciful world. A world in which certain kinds of arrays could be indexed with Strings (instead of being shackled to integer indices), then we could use each newWord as its own index into the wordList array. If such a magic array existed, we could get rid of the sequential search code. Additionally, if that magic array data structure also supported a key-value pair capability, we could eliminate the WordStruct class as well.

Both of these dream-like aspirations are satisfied by the TreeMap class in Java's standard library. It provides an associative array capability – any object type can be used as the index (or key), and any object type can be the value paired with each key. The user has direct access to the value, simply by supplying the key to the get() or put() methods. [See appendix A for additional discussion of the TreeMap class and its usage idioms.]

The entire refactored implementation is in listing 0.2. By finding (or inventing) a different data structure, we have cut the size of the insertWord() function in half, and have eliminated the WordStruct class altogether. We have eliminated 12 lines of unnecessary weight, and some amount of complexity. Less, truly is ... more.

One more gratuitous quote. The TreeMap class is not commonly reused. It requires a little extra effort to find, to learn, and to use. This is an all too common situation when it comes to reuse.

Creating a culture of software reuse in most development organizations often feels a bit like trying to wean people with runny noses from Kleenex to handkerchiefs. Tissues are clean and convenient, while handkerchiefs get yucky and require maintenance. [Michael Schrage]

Listing 0.1 - original implementation

Listing 0.2 - refactored implementation

Abstraction is key

Programmers routinely fail to create new abstractions. They write "flat" code, in which all the functionality is ar the same logical level. They use subroutines as a glorified macro facility, not as a way to create abstractions. [Lou Grinzo]

Only about 20-30% of developers are probably really good at object-oriented abstraction. That doesn't mean theat the remaining 70-80% are bad, evil, or inadequate programmers. Rather, this is a recognition of the fact that some people are better than others at looking at the world and discovering or inventing abstractions of reality. [Grady Booch]

Abstraction requires effort. It is an act of the will. Nothing short of discipline is necessary if it is going to flourish.

Abstractions are illusions that programmers create for themselves so that they can maximize cohesion within a function or class, and minimize coupling between functions or classes. Abstraction is one of the primary ways in which programmers practice elegance.

Elegance, I fear, can be neither taught nor mandated. It can, however, be displayed, admired, and sought. [Jon Bentley]
One could even say it is similar to these qualities I found on the Internet that are caught, not taught. Data structures and algorithms routinely are the by-product of the practice of abstraction.
There are lots of applications in service where the most sophisticated algorithm is bubble sort, and the most sophisticated data structure is linked list. [Scott Meyers]
Data structures and algorithms are a dominant theme throughout this book. Lets consider another example from The Elements of Programming Style that demonstrates the leverage afforded by these manifestations of elegance. The following function originally appeared in a programming textbook. It converts a julian date to its corresponding month/day/year representation. Notice the clever use of the variable "leap" to account for the extra day that is introduced every fourth year. Also, notice the very repetitive logic codified in the enormous "case" statement. There ought to be a way to collapse all these "trees" into a more interesting "forest".

The following alternative algorithm probably took a little extra time to conceive, but, may have taken less time to write and test, and definitely takes less effort to maintain. I first saw this implementation in The C Programming Language published in 1978.

Instead of a cumbersome "case" statement, and its inherent control logic overhead; the author has created an array data structure. The elements of the array are the number of days per month. The fidelity of the data is easy to inspect visually. An alternative could be an array of running totals (i.e. 31, 59, 90, ...), so that the "-=" operator is not required in the for loop. But this choice would not have been as easy to understand by all subsequent readers of this algorithm.

To accomodate leap year calculations, a little extra storage was used to provide a parallel instance of the days-per-month table. While this use of memory is less than optimal, the choice has enhances the program's readability. Another trade-off is the addition of a thirteenth column to the table. Humans expect months to be represented with the numbers one through twelve. Arrays in Java are indexed starting at zero. To forego the need of incrementing a variable's value so that it can be used to output the correct month, the "zeroth" column is smiply not used.

All the logic for deciding if the specified year is a leap year is captured in a single compound ternary operator. It is arguable whether this shoots the function's readability in the foot; but this kind of code is a fairly common C idiom. Once the leap year decision is made, it is represented as an integer - not a boolean. This allows the decision to be used as an index into the days-per-month table.

Several tests and the resultant output follow.

The combination of the daysPerMonthTable data structure and the algorithm for counting down the input julian number to compute the corresponding month and day have produced a result that is simpler, smaller, and (dare I say) downright beautiful!

In review, the values that will be emphasized throughout each project are: