Back to Basics: Algorithms

Computer scientists look for simplicity, speed and reliability. Sometimes they find elegance.


But in the 1960s, computers moved into the business world in a big way, and soon two ugly realities intruded. The first was the matter of "bugs" -- a term popularized by Grace Hopper. Computers made lots of mistakes because programmers made lots of mistakes. The second was sorting, a machine-intensive job that came to dominate, and sometimes overwhelm, computing.

Virtually every major application required sorting. For example, if you wanted to eliminate duplicate mailings from your customer master file, which was sorted by customer number, you might have had to re-sort it by last name within ZIP code. Sorting and merging big files often went on repeatedly throughout the day. Even worse, very few of the records being sorted would fit into those tiny memories, and often they were not even on disk; they were on slow, cumbersome magnetic tapes. When the CEO called the data processing shop and asked, "When can I get that special report?" the DP guy might have said it would take 24 hours because of all the sorting that was needed.

So IT people learned that algorithms mattered. The choice of algorithm could have a huge effect on both programmability and processing efficiency.

If algorithms were simple, they could be easily coded, debugged and later modified. Simple ones were less likely to have bugs in the first place, and if you used an existing algorithm rather than inventing your own, some of the debugging had already been done. But simple ones were often not the most efficient. They were not the ones that would speed up sorting enough to give the CEO's request a same-day turnaround.

The Search for Elegance

Computer science took on these challenges and came up with families of algorithms with fancy names like "Induction," "Recursion" and "Divide and Conquer." And programmers developed methods (themselves algorithms) for assessing the efficiency and general goodness of algorithms.

A simple sort algorithm called Bubblesort was developed early on. It involved reading through the file to be sorted, looking successively at pairs of adjacent records. If they were out of order, the two records in the pair were simply swapped. By passing through the file repeatedly and overlapping the pairs, in-sequence records would "bubble up" to the top until eventually the entire file was in sequence. It was easy to understand, program and debug, but it wasn't very efficient because it required many passes through the file.
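The swap-adjacent-pairs idea described above can be sketched in a few lines of Python (a minimal illustration, not production code -- the function name and list-based interface are my own):

```python
def bubble_sort(records):
    """Repeatedly pass through the list, swapping adjacent out-of-order
    pairs, until a full pass makes no swaps."""
    n = len(records)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):
            if records[i] > records[i + 1]:
                # Out of order: swap the adjacent pair.
                records[i], records[i + 1] = records[i + 1], records[i]
                swapped = True
    return records
```

Each full pass is another trip through the data, which is exactly why the method was so costly on tape-bound files.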

Descriptions of all the clever algorithms that improved on Bubblesort would fill a book, but many students of algorithms would give the grand-slam award for elegance to Quicksort, which was invented in the early 1960s by Charles Antony Richard Hoare, a British computer scientist. (See box below.)

Depending on file size and other factors, it can take Quicksort just seconds to sort a file that would take other routines minutes or hours to process. Hoare, who also pioneered methods for proving the correctness of programs, was knighted in 2000 for his achievements.
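Quicksort's divide-and-conquer idea can be conveyed with a simplified sketch (this recursive, list-copying version is easier to read than Hoare's original in-place partitioning, and is offered only as an illustration):

```python
def quicksort(items):
    """Pick a pivot, split the list into smaller/equal/larger parts,
    and recursively sort the two outer parts."""
    if len(items) <= 1:
        return items  # A list of 0 or 1 items is already sorted.
    pivot = items[len(items) // 2]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

Because each partition step roughly halves the work, Quicksort typically touches the data far fewer times than Bubblesort's repeated passes.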

Another exercise in algorithms is the famous "traveling salesman problem" (TSP), in which a salesman leaves from home to visit a number of cities and wants to minimize his distance traveled. If there are five cities to visit, there are 60 possible routes to choose from, and the obvious algorithm is to compute the distances for all of them and pick the best one. But the TSP suffers from an unfortunate phenomenon known as "combinatorial explosion."

If there are 10 cities to visit, there are 1.8 million paths, and at 20 cities, the poor salesman (or his computer) has roughly 1.2 quintillion routes to consider. Clearly, at some point, the try-them-all algorithm becomes impractical.
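The route counts above come from a simple formula: with n cities and distances that are the same in both directions, there are n!/2 distinct tours (each route and its reversal count as one). A quick check in Python:

```python
import math

def route_count(n_cities):
    """Distinct symmetric tours of n cities: n! orderings,
    halved because a route and its reverse are the same tour."""
    return math.factorial(n_cities) // 2

# 5 cities -> 60 routes; 10 cities -> 1,814,400 routes.
```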

So mathematicians and computer scientists have come up with all kinds of ingenious ways to "solve" the TSP, which is but one example in a broad class of important problems. Some algorithms give exact solutions for, say, a few hundred cities or fewer. But with bigger problems, we usually turn to the so-called heuristic algorithms that produce "pretty good" but not optimal solutions. There is a dizzying assortment of such things, including dynamic programming, genetic algorithms and Markov chains.

But if you have 1,000 cities to visit and you don't have a Ph.D. in mathematics, you might try the "nearest neighbor" algorithm, which has proved to be remarkably good in many cases. With this algorithm, at each city you simply travel next to the nearest unvisited city, and you keep doing that until you have gone to every one.
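The nearest-neighbor rule is simple enough to sketch directly (a minimal illustration assuming cities are (x, y) coordinates and the tour starts at the first city; the function name is my own):

```python
import math

def nearest_neighbor_tour(cities):
    """Greedy TSP heuristic: from the current city, always travel
    to the nearest city not yet visited. Returns the visit order
    as a list of indices into `cities`."""
    unvisited = list(range(1, len(cities)))
    tour = [0]  # Start at the first city.
    while unvisited:
        current = cities[tour[-1]]
        # Pick the unvisited city closest to where we are now.
        nxt = min(unvisited, key=lambda i: math.dist(current, cities[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

It is fast and easy to understand, but, as with other heuristics, it can miss the truly optimal route.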
