Notes on Algorithms
An algorithm is a defined procedure to accomplish a specific task. It must accept input, also known as instances, and transform the input to achieve the desired output. The transformation applied by an algorithm is defined by a series of steps which it must follow. Often, algorithms are used to solve algorithmic problems which can be formally defined as follows:
Problem: <description of the problem>
Input: <data provided to the algorithm>
Output: <what the algorithm must produce>
When creating an algorithm, several goals should be achieved:
- correctness - always produces the correct output
- efficiency - produces the output with the least amount of computation
- ease - trivial to implement
It may not always be possible to achieve all these goals simulatenously and, therefore, a balance between them may be required.
Heuristics
Heuristics are used in place of algorithms when it is not possible to achieve the correct output as they can provide approximations. Different heuristics will provide output with varying degrees of correctness and, therefore, the heuristic used for a problem should be selected carefully so that it matches as closely as possible.
Algorithm correctness
To establish the correctness of an algorithm, a goal discussed above, proof is required. To show proof, the following is required:
- proof statement - statement on what is being proven
- assumptions
- chain of reasoning - the process from assumptions to the proof statement
- QED - shows that the proof has finished, “thus it is demonstrated”
The following sections will discuss how to achieve each of the parts shown above.
Proof statement
To define the proof statement, i.e. describe the problem, the following parts must be stated:
- the set of allowed input
- properties required from the ouput
Two traps are often encountered when defining the proof statement:
- ill-defined problems - ambiguous words are used in the statement
- compound goals - multiple goals are selected
A technique to achieving a correct and efficient algorithm is to reduce the set of allowed input.
Chain of reasoning
Several notation methods can be used to show the sequence of steps taken by an algorithm, i.e. show the chain of reasoning. The different notation methods are the following:
- English
- pseudocode
- programming language
These methods describe an algorithm in different ways, i.e. different languages, and have varying degrees of accuracy and ease of understanding. While programming languages have the highest accuracy, they can be the most difficult to understand. On the other hand, English can be easily understood, but, the words used may introduce ambiguity and, therefore, reduce the accuracy of the description. When selecting a notation method, a decision between the level of accuracy and understanding needs to be made.
Proof of incorrectness
Counter-examples are inputs that result in an algorithm producing incorrect output and have the following properties:
- verifiability
- show the output produced by the algorithm
- show the correct answer for the given input
- simplicity - remove all unnecessary details
When searching for weaknesses in an algorithm, first look at the problem in detail and think about what issues/difficulties may occur when solving. Second, systematically cover all possible inputs, including small, extreme, and tie input.
If incorrectness is not found in an algorithm, proof of the correctness of the algorithm is still required.
Proof of correctness
Mathematical induction is a proof technique which can be used as proof of the correctness of an algorithm. It involves proving a base case and then assuming it is correct up to a value. Induction has the following steps:
- proved a formula for a base case, such as 1
- assume the formula is correct up to
n - 1
- show the formula is correct for
n
Recursion is similar to mathematical induction and has the following steps:
- the recursive case breaks the problem down into smaller sub-problems
- the base case stops the recursion and returns a value
- show that the algorithm is correct for the case
n
Modeling a problem
Algorithms are designed to work on rigorously defined abstract data structures, discussed below. Therefore, when describing a problem, it must be done abstractly, such that it becomes a sequence of transformations on abstract data structures. The following sections discuss the different abstract data structures that are available.
Combinatorial objects
The following are common data structures:
- permutations - arrangements/orderings of items, e.g. {1, 3, 2, 4}
- subsets - a selection of items from a set of items
- trees - hierarchical relationships between items
- graphs - relationships between arbitrary pairs of objects
- points - locations in a geometric space
- polygons - regions in a geometric space
- string - sequences of characters/patterns
Recursive objects
Recursion involves the combination of a set of smaller items, resulting in a larger item. The abstract data structures discussed above can be handled recursively, shown below:
- permutations - removing an element results in a permutation with
n - 1
elements - subsets - removing an element in subset results in a subset of
n - 1
elements - trees - the combination of nodes, smaller items, results in a larger tree, larger item
- graphs - the combination of nodes results in a larger graph
- points - points can be combined into a larger set or broken down into smaller sets
- polygons - similar to the points object
- strings - the characters of a string can be combined to produce a larger string
Recursive objects require the decomposition rules, and base cases. Decomposition rules describe how a recursive object may be broken down into a smaller object, and is known as the recursive case. Base cases define the smallest object that can be achieved from a recursive object, and results in the decomposition of the object terminating.
Further discussion about recursion can be found on my post Notes on Recursive Algorithms
Conclusion
This post briefly looked at algorithms, firstly what they are and how they can be used to solve problems by transforming given input to a required output. Secondly, how to prove the correctness of an algorithm was discussed, including tools and techniques that can be used. Finally, the reasoning for modeling a problem abstractly so that an algorithm can perform transformations on abstract data structures was presented.
These notes where prepared from the book “The Algorithm Design Manual” by Steven S. Skiena. This book provides an excellent starting point for understanding algorithms and how they work. You can find the book over at the Springer website.