# Fitting with lmfit

General-purpose fitting in Python can sometimes be a bit more
challenging than one might at first suspect given the robust nature of
tools like Numpy and Scipy. First we had `leastsq`

. It works, although
often requires a bit of manual tuning of initial guesses and *always*
requires manual calculation of standard error from a covariance matrix
(which isn't even one of the return values by default). Later we got
`curve_fit`

which is a bit more user friendly and even estimates and
returns standard error for us by default! Alas, `curve_fit`

is just a
convenience wrapper on top of `leastsq`

and suffers from some of the
same general headaches.

These days, we have the wonderful lmfit package. Not only can lmfit make fitting more user friendly, but it also is quite a bit more robust than using scipy directly. The documentation is thorough and rigorous, but that can also mean that it can be a bit overwhelming to get started with it. Here I work through a basic example in two slightly different ways in order to demonstrate how to use it.

## Generating the data

Let's assume we have data that resembles a decaying sine wave (e.g., a damped oscillator). lmfit has quite a few pre-defined models, but this is not one of them. We can simulate the data with the following code:

```
import numpy as np
x = np.linspace(0, 10, 100)
y = np.sin(5*x)*np.exp(-x/2.5)
```

Real data is noisy, so let's add some noise:

```
import numpy.random as npr
y += npr.choice([-1, 1], size=y.shape)*npr.random(size=y.shape)/5
```

The resulting data:

## Using models

The easiest way to work with lmfit is to ignore the `lmfit.minimize`

function shown in the "Getting Started" section of the documentation and
instead jump straight to the higher-level (and more useful) `Model`

class. For one-time fitting, the `lmfit.models.ExpressionModel`

class is
provided. When creating a new `ExpressionModel`

, you simply pass a
string that is interpreted as a Python expression. For our decaying sine
example, we might do this:

```
import lmfit
model = lmfit.models.ExpressionModel("ampl * sin((x - x0)*freq) * exp(-x/tau) + offset")
```

Let's make our initial guess for performing the fit under the constraint that the offset is fixed at 0:

```
params = model.make_params(ampl=1, x0=0, freq=10, tau=1, offset=0)
params["offset"].set(vary=False)
```

To fit, we pass the data and the parameters as arguments and the independent variable as a keyword argument:

```
fit = model.fit(y, params, x=x)
```

To visually check if the fit is good, lmfit provides both `plot_fit`

and
`plot_residuals`

methods for model instances. The former shows the data,
the initial guess, and its found best fit:

We can also see the found parameters with standard errors and goodness
of fit data with a fit report (`print(model.fit_report())`

):

```
[[Model]]
Model(_eval)
[[Fit Statistics]]
# function evals = 102
# data points = 100
# variables = 4
chi-square = 1.337
reduced chi-square = 0.014
Akaike info crit = -419.379
Bayesian info crit = -408.959
[[Variables]]
ampl: 1.02147340 +/- 0.068013 (6.66%) (init= 1)
offset: 0 (fixed)
tau: 2.53669407 +/- 0.239335 (9.43%) (init= 1)
x0: -0.00823894 +/- 0.012256 (148.76%) (init= 0)
freq: 4.98932400 +/- 0.035399 (0.71%) (init= 10)
[[Correlations]] (unreported correlations are < 0.100)
C(ampl, tau) = -0.718
C(x0, freq) = 0.684
C(ampl, x0) = 0.139
```

## Reusable models

For improved reusability of models, a better approach is to subclass
`lmfit.models.Model`

directly. This allows us to implement a `guess`

method to automate creating initial guesses. Following the pattern used
in defining the models in the `lmfit.models`

module, we can define our
decaying sine model like so:

```
class DecayingSineModel(lmfit.Model):
def __init__(self, *args, **kwargs):
def decaying_sine(x, ampl, offset, freq, x0, tau):
return ampl * np.sin((x - x0)*freq) * np.exp(-x/tau) + offset
super(DecayingSineModel, self).__init__(decaying_sine, *args, **kwargs)
def guess(self, data, **kwargs):
params = self.make_params()
def pset(param, value):
params["%s%s" % (self.prefix, param)].set(value=value)
pset("ampl", np.max(data) - np.min(data))
pset("offset", np.mean(data))
pset("freq", 1)
pset("x0", 0)
pset("tau", 1)
return lmfit.models.update_param_vals(params, self.prefix, **kwargs)
```

Note that the point of the prefix is so that composite models can be constructed (the prefix prevents namespace clashes). Now we can fit as before but guess the starting parameters without thinking about it:

```
model = DecayingSineModel()
params = model.guess(y, x=x)
fit = model.fit(y, params, x=x)
```

which results in a similar fit as before:

## Extracting data from the fit

In many cases we might want to extract parameters and standard error
estimates programatically rather than by reading the fit report (e.g.,
if the fit will be used to produce a data point on another plot, then
the standard error can be used for computing error bars). This is all
included in the `fit`

result via its `params`

attribute. We can print
the parameter values and errors like this:

```
for key in fit.params:
print(key, "=", fit.params[key].value, "+/-", fit.params[key].stderr)
```

## Final thoughts

I've only scratched the surface of `lmfit`

's features, but the examples
here demonstrate a good portion of the daily requirements of working
with data from an experiment. As alluded to earlier, `lmfit`

comes with
many built-in models which makes it a pleasure to use for peak
fitting (something that is often particularly difficult when using scipy
directly).

Finally, although `lmfit`

can handle linear models just fine, I would
instead recommend the statsmodels package. Using the power of
pandas `DataFrame`

s, models can be defined in a similar manner as
with `lmfit`

's `ExpressionModel`

s.

A Jupyter notebook containing the above examples can be found here.