Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output in a finite amount of time. An algorithm is thus a sequence of computational steps that transform the input into the output.
An algorithm for a computational problem is correct if, for every problem instance provided as input, it halts finishes its computing in finite time and outputs the correct solution to the problem instance. A correct algorithm solves the given computational problem.
This book also presents several data structures. A data structure is a way to store and organize data in order to facilitate access and modifications. Using the appropriate data structure or structures is an important part of algorithm design.
To express an algorithm, we will use Pseudocode. What separates pseudocode from real code is that in pseudocode, we employ whatever expressive method is most clear and concise to specify a given algorithm. Another difference between pseudocode and real code is that pseudocode often ignores aspects of software engineeringsuch as data abstraction, modularity, and error handlingin order to convey the essence of the algorithm more concisely.
Insertion Sort
Our first algorithm, insertion sort, solves the sorting problem :
Input: A sequence of numbers .
Output: A permutation (reordering) of the input sequence such that .
The numbers to be sorted are also known as the keys. When we want to sort numbers, it’s often because they are the keys associated with other data, which we call satellite data. Together, a key and satellite data form a record.
The pseudocode for insertion sort is given as the procedure Insertion-Sort.
It takes two parameters: an array containing the values to be sorted and the number of values of sort. The values occupy positions through of the array, which we denote by . When the Insertion-Sort procedure is finished, array contains the original values, but in sorted order.
Insertion-Sort
for to while and
The index indicates the “current key” being inserted into the final array. At the beginning of each iteration of the for loop, which is indexed by , the subarray (a contiguous portion of the array) consisting of elements constitutes the currently sorted array, and the remaining subarray corresponds to the elements remaining in the given array. In fact, elements are the elements originally in positions through , but now in sorted order. We state these properties of formally as a loop invariant:
Loop invariants help us understand why an algorithm is correct. When you’re using a loop invariant, you need to show three things:
Initialization: It is true prior to the first iteration of the loop.
Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
Termination: The loop terminates, and when it terminates, the invariantusually along with the reason that the loop terminatedgives us a useful property that helps show that the algorithm is correct.
A loop-invariant proof is a form of mathematical induction, where to prove that a property holds, you prove a base case and an inductive step.
Let’s see how these properties hold for insertion sort.
Initialization:
We start by showing that the loop invariant holds before the first loop iteration, when . The subarray consists of just the single element , which is in fact the original element in . Moreover, this subarray is sorted.
Maintenance:
Next, we tackle the second property: showing that each iteration Next, maintains the loop invariant. Informally, the body of the for loop works by moving the values in , , , and so on by one position to the right until it finds the proper position for , at which point it inserts the value of . The subarray then consists of the elements originally in , but in sorted order. Incrementing for the next iteration of the for loop then preserves the loop invariant.
Termination:
Finally, we examine loop termination. The loop variable starts at and increases by in each iteration. Once ‘s value exceeds , the loop terminates. So, the loop terminates once equals . Substituting for in the wording of the loop invariant yields that the subarray consists of the elements originally in , but in sorted order. Hence, the algorithm is correct.
We typically organize compound data into objects, which are composed of attributes. We access a particular attribute using the syntax: the object name, followed by a dot, followed by the attribute name.
Analyzing Algorithms
Analyzing an algorithm has come to mean predicting the resources that the algorithm requires. You might consider resources such as memory, communication bandwidth, or energy consumption. Most often, however, you’ll want to measure computational time.
Most of this book assumes a generic one-processor, random-access machine (RAM) model of computation as the implementation technology, with the understanding that algorithms are implemented as computer programs. In the RAM model, instructions execute one after another, with no concurrent operations. The RAM model assumes that each instruction takes the same amount of time as any other instruction and that each data accessusing the value of a variable or storing into a variabletakes the same amount of time as any other data access.
Analysis of Insertion Sort
How long does the Insertion-Sort procedure take? How do we analyze insertion sort?
Even though the running time can depend on many features of the input, we’ll focus on the one that has been shown to have the greatest effect, namely the size of the input, and describe the running time of a program as a function of the size of its input.
The best notion for input size depends on the problem being studied. For many problems, such as sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input. For many other problems, such as multiplying two integers, the best measure of input size is the total number of bits needed to represent the input in ordinary binary notation.
The running time of an algorithm on a particular input is the number of instructions and data accesses executed.
Now, let’s analyze the Insertion-Sort procedure. To analyze the Insertion-Sort procedure, let’s view it with the time cost of each statement and the number of times each statement is executed.
For each let denote the number of times the while loop test is executed for that value of . When a for or while loop exits in the usual waybecause the test in the loop header comes up FALSEthe test is executed one time more than the loop body. Because comments are not executable statements, assume that they take no time.