In this post, we would like to describe how to implement the Gauss algorithm for solving systems of linear equations in Haskell.
Explanation of the Gauss Algorithm
Systems of linear equations are commonly written as matrix equations. Given a matrix of coefficients and a vector of values, the objective is to solve the equation
in terms of the unknown vector .
In Haskell, we would express this as follows. First, let us define a few convenient type synonyms.
> type Number = Double
> type Vector = [Number]
> type Row = [Number]
> type Matrix = [Row]
In other words, vectors are simply represented as lists of numbers and matrices are lists of rows, which in turn are lists of numbers.
A matrix equation can then be written as
mapMatrix a x == b
where the mapMatrix
function multiplies a matrix with a vector
> mapMatrix :: Matrix -> Vector -> Vector
> mapMatrix rows v = [sum (zipWith (*) row v) | row <- rows]
The Gauss algorithm gives a solution to this equation. In other words, we will implement a function gauss
that satisfies the equation
mapMatrix a (gauss a b) == b
i.e. the vector x = gauss a b
is a solution to the equation mapMatrix a x == b
.
To understand the Gauss algorithm, let’s look at the case of a system of three equations with three unknowns. For instance, consider the following example
which corresponds to the Haskell code
> exampleA = [[1,1,0], [0,1,1], [1,0,1]] :: Matrix
> exampleb = [2,3,4] :: Vector
and the equation
mapMatrix exampleA x == exampleb
The key idea is that we can multiply an equation with a number or subtract equations to obtain a different system of equations which nonetheless has the same solution. After all, if a sum of numbers on the left-hand side of an equation is supposed to be equal to a number on the right-hand side, then twice this sum is still equal to twice this number.
However, by subtracting equations from each other, we can simplify the system considerably. For instance, subtracting the first equation from the third equation gives the system
which is much simpler than the first one, as the unknown is now only contained in the first equation. If we can solve the last two equations
in terms of and , then we can use the solution and substitute it into the first equation to solve it for the remaining unknown .
But we know how to solve the system of two equations, simply by repeating the procedure of subtracting the first equation from all the others to eliminate the first unknown from the equations. In general, we have to multiply the first equation with a suitable number subtracting. In the particular example, we would have to multiply the first equation by before subtracting it from the last one to obtain
As you can see, the final equation is easily solved for .
Implementation in Haskell
The implementation of this procedure as a computer algorithm proceeds in two phases: the first phase consists of eliminating unknowns from the equations, while the second phase calculates the solution unknown by unknown. The first phase can also be understood as bringing the matrix into an upper triangular form, which means that all coefficients below the diagonal are zero.
Here is the code. First of all, it is easier to combine the matrix of coefficients and the right-hand side vector into a larger matrix of size .
> gauss :: Matrix -> Vector -> Vector
> gauss a b = x
> where
> b' = map (\y -> [y]) b
> a' = zipWith (++) a b' -- combine with right-hand side
>
> x = resubstitute $ triangular a'
Then, the algorithm transforms the matrix into triangular form and performs the substitutions to obtain the unknown vector .
First phase – triangular form
For the first phase, we take the first row and subtract a suitable multiple of it from the other rows so that the first column of coefficients will become zero. We can drop this column of zeroes and continue the algorithm on a smaller matrix.
> triangular :: Matrix -> Matrix
> triangular [] = []
> triangular m = row:(triangular rows')
> where
> (row:rows) = rotatePivot m -- see discussion below
> rows' = map f rows
> f bs
> | (head bs) == 0 = drop 1 bs
> | otherwise = drop 1 $ zipWith (-) (map (*c) bs) row
> where
> c = (head row)/(head bs) -- suitable multiple
Note that we represent the triangular matrix with rows of decreasing length, this makes resubstitution simpler later.
ghci> triangular exampleA
[[1.0,1.0,0.0],[1.0,1.0],[-2.0]]
Of course, there is a problem if the first coefficient in the first row is already zero, because then we cannot eliminate the other coefficients. However, in this case, we simply look for another row that has a non-zero coefficient in the first column, this is the job of the rotatePivot
function. (If no such row exists, then the system of equations does not have a unique solution, we will simply ignore this case.)
> rotatePivot :: Matrix -> Matrix
> rotatePivot (row:rows)
> | (head row) /= 0 = (row:rows)
> | otherwise = rotatePivot (rows ++ [row])
Second phase – resubstitution
When the matrix is in triangular form, the equations are very easy to solve. We simply start with the equation that contains only one unknown and work back from there.
When working with lists in Haskell, it’s easiest to start from the head of the list. To formulate the resubstitution, we therefore apply a few administrative steps to the output of the triangular
. Namely, we first reverse the columns to have the last unknown in the head column, and we reverse the rows as well to start with the equation that has only one unknown. Afterwards, we have to undo the column reversion in the result.
> resubstitute :: Matrix -> Vector
> resubstitute = reverse . resubstitute' . reverse . map reverse
Resubstitution itself is not very difficult. We simply calculate the solution from the first equation with only one unknown and substitute this result into the other equations, eliminating one variable.
> resubstitute' :: Matrix -> Vector
> resubstitute' [] = []
> resubstitute' (row:rows) = x:(resubstitute' rows')
> where
> x = (head row)/(last row)
> rows' = map substituteUnknown rows
> substituteUnknown (a1:(a2:as')) = ((a1-x*a2):as')
Examples
Have fun solving linear equations in Haskell.
ghci> let x = gauss exampleA exampleb ghci> x [1.5,0.5,2.5] ghci> mapMatrix exampleA x == exampleb True
> example1, example2, example3 :: Matrix > example1 = [[1,1],[1,2]] > example2 = [[1,1,1,3],[1,2,3,7],[3,4,6,31]] > example3 = [[0,0,1],[1,0,0],[0,1,0]]
ghci> gauss example1 [2,3] [1.0,1.0] ghci> gauss example2 [1,0,0,1] [8.881784197001252e-16,6.9999999999999964,-1.9999999999999991] ghci> gauss example3 [1,1,1] [1.0,1.0,1.0]
Since lists are O(n) in access, wouldn’t it be better to use Arrays?