WiGLE is a popular platform which can be used for finding the location of a device using the names of WiFi networks in its vicinity. I’ve written about this before, and wrote some Python code to interact with their API. This API has since been retired and replaced with a new one, as of December […]

There’s a difficult-to-Google, but extremely useful equation for linear regression called the Normal Equation. I used to have an implementation in VBA and somehow lost is. As it happened, I needed it today so I spent a little while recreating it.

This isn’t what I actually need the equation for, just a test case for making sure the equation works. If you want to try this out yourself, the data is the `abalone data set`

from UCI. It’s a fairly simple data set which collates a whole load of data about abalone, along with the number of rings they have in their shells – an indicator of age, like tree rings.

The regression model I built is a simple one, with all the features as predictive variables and the ring count as the independent variable.

Here’s the code:

1 2 3 4 5 6 7 8 9 10 11 12 13 |
Function NormalEquation(X As Variant, y As Variant, Transpose As Boolean) ' theta = (X’ * X)^-1 * (X’ * y) ' Where n is the number of cases you have and k is the number of features: ' X is an n rows x k columns array ' y is an n rows x 1 column array Dim XTy As Variant Dim XTX As Variant Dim theta As Variant XTX = Application.MMult(Application.Transpose(X), X) XTy = Application.MMult(Application.Transpose(X), y) NormalEquation = Application.MMult(Application.MInverse(XTX), XTy) If Transpose Then NormalEquation = Application.Transpose(NormalEquation) End Function |

To use it, set up the data, set up named ranges for `X`

, `y`

and `theta `

(`theta `

should be a 1D array of length k). In the picture below, `X`

is red, `y`

is orange and `theta`

is blue.

Use this code to call the function:

1 2 3 4 5 6 7 8 9 |
Sub TestNormalEquation() Dim X As Variant Dim y As Variant Dim theta As Variant X = Range("X").Value2 y = Range("Y").Value2 theta = NormalEquation(X, y, True) Range("theta") = theta End Sub |

To use the results, the simplest way is to enter `=ROUND(SUMPRODUCT(theta,C3:K3),0)`

in cell `M3`

and fill down to the bottom of the data.

There you have it, a really powerful linear regression function ready for anything you want to throw at it.