# Coordinate Optimization¶

Upon initialization of a `Cell` object, the parameters of the coordinate system are initialized with on initial guesses based on the binary image of the cell. This is only a rough estimation aimed to provide a starting point for further optimization. In `ColiCoords`, this fitting and optimization is handled by `symfit`, a python package that provides a symbolic and intuitive API for using minimizers from `scipy` or other minimization solutions.

The shorthand approach for optimizing the coordinate system is:

```cell.optimize()
```

By calling `optimize()` the coordinate system is optimized for the current `Cell` object by using the default settings. This means the optimization is performed on the binary image using the `Powell` minimizer algoritm. Although the optimization is fast, different data sources and minimizers can yield more accurate results.

## Input data classes and cell functions¶

`ColiCoords` can optimize the coordinate system of the cell based on all compatible data classes, either binary, brightfield, fluorescence, or STORM data. All optimization methods are implemented by calculating some property by applying the current coordinate system on the respective data element, and then comparing this property to the measured data by calculating the chi-squared.

For example, optimimzation based on the brightfield image can be done as follows:

```cell.optimize('brightfield')
```

Where it is assumed that the brightfield data element is named ‘brightfield’. The appropriate function that is used for the optimization is chosen automatically based on the data class and can be supplied optionally by the cell_function keyword argument. Note that optimizaion by brightfield cannot be used to determine the value for the cell’s radius parameter, for this the function `measure_r()` has to be used.

In the case of brightfield optimization, a `NumericalCellModel` is generated which is used by `symfit` to perform the minimization. When the default cell_function (`CellImageFunction`) is called it first calculates the radial distribution of the brightfield image, and this radial distribution is then used to reconstruct the brightfield image. The resulting image is an estimation of the measured brightfield image and by iterative bootstrapping of this process the optimal parameters can be found. This particular optimization strategy can be use for any roughly isotropic image - ie a cell image that looks identical in all directions radially outward - and is thus independent of brightfield image type and focal plane and can also be applied to selected fluorescence images.

The most accurate approach of optimization is by optimizing on a STORM dataset of a membrane marker. Here, the default cell_function used (`CellSTORMMembraneFunction`) calculates for every localization the distance r to the midline of the cell. This is compared to the current radius parameter of the coordinate system to give the chi-squared. This fitting is a special case since the dependent data (y-data) also depends on the optimization parameter r. To allow a variable dependent data for fitting, the class `RadialData` is used, which mimics a `ndarray`, however whose value depends on the current value of the r parameter.

## Minimizers and bounds¶

Optimization can be done by any `symfit` compatible minimizer. All minimizers are imported via `colicoords.minimizers`. More details on the different minimizers can be found in the `symfit` or `scipy` docs.

The default minimizer, `Powell` is fast but does not always converge to the global minimum. To increase the probability to find the global minimum, the minimizer `DifferentialEvolution` is used. This minimizer searches the parameter space defined by bounds on `Parameter` objects defined in the model scan for candidate solutions.

## Multiprocessing and high-performance computing¶

The optimization process can take up to tens of seconds per cell, especially if a global minimizer is used. Although the process only needs to take place once, the optimization process of several thousands of cells can take too much time to be conveniently executed on normal desktop PCs. `ColiCoords` therefore supports multiprocessing so that the user can take advantage of parallel high-performance computing. To perform optimization in parallel:

```cells.optimize_mp()
```

Where cells is a `CellList` object. The cells to be divided is equally distributed among the spawned processes, which is by default equal to the number of physical cores present on the host machine.

## Models and advanced usage¶

The default model used is `NumericalCellModel`. Contrary to typical `symfit` workflow, the `Parameter` objects are defined and initialized by the model itself, and then used to make up the model. To adjust parameter values and bound manually, the user must directly interact with a `CellFit` object instead of calling `optimize()`.

```from colicoords import CellFit
fit = CellFit(cell)
print(fit.model.params) # [a0, a1, a2, r, xl, xr]
# Set the minimum bound of the `a0` parameter to 5.
fit.model.params.min = 5
# Se the value of the `r`parameter to 8.
fit.model.params.value = 8
```

The fitting can then be executed by calling `fit.execute()` as usual.

## Custom minimization functions¶

The minimization function cell_function is a subclass of `CellMinimizeFunctionBase` by default. This when this object is used it is initialized by `CellFit` with the instance of the cell object and the name of the target data element. These attributes are then accessible in the custom `__call__` method of the function object.

The `__call__` function must take the coordinate parameters with their values as keyword arguments and should return the calculated data which is compared to the target data element to calculate the chi-squared. Alternatively, the target_data property can be used, as is done for `CellSTORMMembraneFunction` to specify a different target.

Alternatively, any custom callable can be given as cell_function, as long as it complies with the above requirements.