API

Pre-processing functions

Smoothing data

Kinbiont.smoothing_data

Kinbiont.smoothing_data (Function)
smoothing_data(
data::Matrix{Float64};
method="rolling_avg",
pt_avg=7,
thr_lowess=0.05
)

Arguments:

  • data::Matrix{Float64}: Matrix of size 2xN, where N is the number of time points (single curve).

  • method::String = "rolling_avg": Method for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".

  • pt_avg::Int = 7: Number of points used for rolling average smoothing or initial condition generation.

  • thr_lowess::Float64 = 0.05: Parameter for lowess smoothing.

Output:

  • Matrix{Float64}: Array of smoothed data.
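
As a rough illustration of what a rolling average does to a 2xN matrix, the sketch below averages both rows over windows of pt_avg consecutive points. The helper rolling_avg_sketch is hypothetical and written in plain Julia; it is not Kinbiont's implementation, and the exact windowing used by the package may differ.

```julia
using Statistics

# Hypothetical helper (not part of Kinbiont): moving average over windows of
# `pt_avg` consecutive points, applied to the time row and the observable row.
function rolling_avg_sketch(data::Matrix{Float64}; pt_avg::Int=7)
    n = size(data, 2)
    n_out = n - pt_avg + 1
    smoothed = Matrix{Float64}(undef, 2, n_out)
    for i in 1:n_out
        window = i:(i + pt_avg - 1)
        smoothed[1, i] = mean(data[1, window])  # averaged time
        smoothed[2, i] = mean(data[2, window])  # averaged observable
    end
    return smoothed
end

times = collect(0.0:1.0:9.0)
od = 0.1 .+ 0.01 .* times
data = permutedims(hcat(times, od))  # 2xN layout: row 1 is time, row 2 is OD
smoothed = rolling_avg_sketch(data; pt_avg=3)
```

With this windowing the output has N - pt_avg + 1 columns, which is one reason the smoothed curve can be shorter than the input.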

Correction for multiple scattering

Kinbiont.correction_OD_multiple_scattering

Kinbiont.correction_OD_multiple_scattering (Function)
correction_OD_multiple_scattering(
data::Matrix{Float64},
calibration_curve::String;
method="interpolation"
)

Multiple scattering correction of a given time series.

Arguments:

  • data::Matrix{Float64}: Matrix of size 2xN, where N is the number of time points (single curve).

  • calibration_curve::String: Path to the calibration data (.csv file). Used to correct the data for multiple scattering.

  • method::String = "interpolation": Method for performing the multiple scattering correction. Options include "interpolation" and "exp_fit" (adapted from Meyers, A., Furtmann, C., & Jose, J., Enzyme and Microbial Technology, 118, 1-5, 2018).

Output:

  • Matrix{Float64}: Array with the corrected data.
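
The idea behind the "interpolation" option can be sketched as a lookup through calibration points. The example below assumes the calibration data is a set of (measured OD, corrected OD) pairs sorted by measured OD, and it clamps values outside the calibrated range; both the helper name and these choices are assumptions for illustration, not the package's behavior.

```julia
# Hypothetical helper: linear interpolation through calibration points.
# `measured` must be sorted in increasing order.
function interp_correction_sketch(od::Float64, measured::Vector{Float64},
                                  corrected::Vector{Float64})
    od <= measured[1] && return corrected[1]      # clamp below the range
    od >= measured[end] && return corrected[end]  # clamp above the range
    j = searchsortedlast(measured, od)            # measured[j] <= od < measured[j + 1]
    w = (od - measured[j]) / (measured[j + 1] - measured[j])
    return corrected[j] + w * (corrected[j + 1] - corrected[j])
end

# Illustrative calibration points, not real calibration data.
measured = [0.0, 0.5, 1.0]
corrected = [0.0, 0.6, 1.5]

data = [0.0 1.0; 0.25 0.75]  # row 1: time, row 2: measured OD
corrected_data = copy(data)
corrected_data[2, :] = interp_correction_sketch.(data[2, :], Ref(measured), Ref(corrected))
```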

Fitting one kinetics

Log-Lin fitting

Kinbiont.fitting_one_well_Log_Lin

Kinbiont.fitting_one_well_Log_Lin (Function)
fitting_one_well_Log_Lin(
data::Matrix{Float64},
name_well::String,
label_exp::String;
type_of_smoothing="rolling_avg",
pt_avg=7,
pt_smoothing_derivative=7,
pt_min_size_of_win=7,
type_of_win="maximum",
threshold_of_exp=0.9,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
start_exp_win_thr=0.05
)

Fits a log-linear model to a single growth curve. The function evaluates the specific growth rate, identifies an exponential window based on a statistical threshold, and performs a log-linear fit within that window.

Arguments:

  • data::Matrix{Float64}: The growth curve data. The first row contains time values, and the second row contains the observable values (e.g., OD).
  • name_well::String: The name of the well.
  • label_exp::String: The label of the experiment.

Key Arguments:

  • type_of_smoothing="rolling_avg": Method of data smoothing. Options are "NO", "rolling_avg", or "lowess".
  • pt_avg=7: Number of points used in the rolling average smoothing.
  • pt_smoothing_derivative=7: Number of points for evaluating specific growth rate. If less than 2, uses interpolation; otherwise, a sliding window approach is used.
  • pt_min_size_of_win=7: Minimum size of the exponential windows in terms of the number of smoothed points.
  • type_of_win="maximum": Method for selecting the exponential phase window. Options are "maximum" or "global_thr".
  • threshold_of_exp=0.9: Quantile threshold, between 0 and 1, used to define the exponential window.
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO", "avg_subtraction", and "time_avg".
  • blank_value=0.0: Average value of the blank, used only if do_blank_subtraction is not "NO".
  • blank_array=[0.0]: Array of blank values, used only if do_blank_subtraction is not "NO".
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options are "thr_correction", "blank_correction", or "remove".
  • thr_negative=0.01: Threshold value for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction=false: Flag indicating whether to correct for multiple scattering.
  • calibration_OD_curve="NA": Path to calibration data for multiple scattering correction, used if multiple_scattering_correction is true.
  • method_multiple_scattering_correction="interpolation": Method for correcting multiple scattering, options include "interpolation" or "exp_fit".
  • thr_lowess=0.05: Threshold for lowess smoothing.
  • start_exp_win_thr=0.05: Minimum OD value that should be reached to start the exponential window.

Output:

  • A data structure containing:
    1. method: Method used for fitting.
    2. A vector containing:
      • label_exp: Experiment label.
      • name_well: Name of the well or sample.
      • start of exp win: Start of the exponential window.
      • end of exp win: End of the exponential window.
      • Maximum specific GR: Maximum specific growth rate.
      • specific GR: Specific growth rate.
      • 2 sigma CI of GR: Confidence interval of the growth rate (±2 sigma).
      • doubling time: Doubling time.
      • doubling time - 2 sigma: Doubling time minus 2 sigma.
      • doubling time + 2 sigma: Doubling time plus 2 sigma.
      • intercept log-lin fitting: Intercept of the log-linear fitting.
      • 2 sigma intercept: Confidence interval of the intercept (±2 sigma).
      • R^2: Coefficient of determination (R-squared).
    3. The fit in the exponential window.
    4. The log data.
    5. The 95% confidence band of the fit.
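
The core of a log-linear fit can be sketched in a few lines: a least-squares line through (time, log(OD)) gives the specific growth rate as the slope, and the doubling time follows as log(2) divided by that rate. The helper loglin_sketch is hypothetical and skips the window selection and confidence intervals that the function above reports.

```julia
using Statistics

# Hypothetical helper: ordinary least-squares line through (t, log(od)).
function loglin_sketch(t::Vector{Float64}, od::Vector{Float64})
    y = log.(od)
    slope = cov(t, y) / var(t)            # specific growth rate
    intercept = mean(y) - slope * mean(t)
    return slope, intercept
end

# Synthetic, noise-free exponential growth used only for illustration.
t = collect(0.0:0.5:5.0)
od = 0.05 .* exp.(0.4 .* t)

gr, intercept = loglin_sketch(t, od)
doubling_time = log(2) / gr
```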

Analysis of segments

Kinbiont.segment_gr_analysis

Kinbiont.segment_gr_analysis (Function)
segment_gr_analysis(
data::Matrix{Float64},
name_well::String,
label_exp::String;
n_max_change_points=0,
type_of_smoothing="rolling_avg",
pt_avg=7,
pt_smoothing_derivative=7,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
type_of_detection="slinding_win",
type_of_curve="original",
win_size=14,
n_bins=40,
method_peaks_detection="peaks_prominence"
)

This function performs segmentation analysis on a time-series dataset by detecting change points. For each segment identified, it evaluates the minimum and maximum growth rate, the minimum and maximum derivative, the delta OD of the segment, and the times of change points.

Arguments:

  • data::Matrix{Float64}: The matrix containing the growth curve data. Time values should be in the first row, and the observable (e.g., OD) should be in the second row.
  • name_well::String: Name of the well under study.
  • label_exp::String: Label for the experiment.

Key Arguments:

  • n_max_change_points::Int: Maximum number of change points to consider. If set to 0, the function will determine the number of change points based on the detection method and other parameters.
  • type_of_smoothing="rolling_avg": Method for smoothing the data. Options include "NO" (no smoothing), "rolling_avg" (rolling average), and "lowess" (locally weighted scatterplot smoothing).
  • pt_avg=7: Size of the rolling average window for smoothing, applicable if type_of_smoothing is "rolling_avg".
  • pt_smoothing_derivative=7: Number of points used for the evaluation of specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is applied.
  • smoothing=false: Boolean flag to enable or disable smoothing. Set to true to apply smoothing, or false to skip smoothing.
  • thr_lowess=0.05: Parameter for lowess smoothing if type_of_smoothing is "lowess".
  • type_of_detection="slinding_win": Method for detecting change points. Options include "slinding_win" (sliding window approach) and "lsdd" (least squares density difference).
  • type_of_curve="original": Specifies the input curve for change point detection. Options include "original" (for the raw time series) and "deriv" (for the specific growth rate time series).
  • method_peaks_detection="peaks_prominence": Method for detecting peaks in the dissimilarity curve. Options include "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold to select peaks).
  • n_bins=40: Number of bins used to generate thresholds if method_peaks_detection is "thr_scan".
  • win_size=14: Size of the window used by the change point detection algorithms.

Output:

If res = segment_gr_analysis(...):

  • res[1]: A string describing the method used for segmentation and analysis.
  • res[2]: Array containing the parameters evaluated for each segment.
  • res[3]: Intervals corresponding to the detected change points.
  • res[4]: Preprocessed data, including smoothed values and calculated growth rates.
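
To make the per-segment quantities more concrete, the sketch below splits a curve at given change-point times and reports the start time, end time, and delta OD of each segment. The helper segment_summary_sketch and the change point are made up for illustration; the change-point detection itself is not reproduced here.

```julia
# Hypothetical helper: summarize segments delimited by change-point times.
function segment_summary_sketch(data::Matrix{Float64}, change_points::Vector{Float64})
    times = data[1, :]
    edges = vcat(times[1], sort(change_points), times[end])
    summary = Vector{NTuple{3,Float64}}()
    for k in 1:(length(edges) - 1)
        idx = findall(t -> edges[k] <= t <= edges[k + 1], times)
        delta_od = data[2, idx[end]] - data[2, idx[1]]
        push!(summary, (edges[k], edges[k + 1], delta_od))
    end
    return summary
end

data = [0.0 1.0 2.0 3.0 4.0 5.0 6.0;
        0.1 0.2 0.4 0.8 0.9 0.95 1.0]
summary = segment_summary_sketch(data, [3.0])  # one assumed change point at t = 3
```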

ODE fitting

Fitting a hardcoded model

Kinbiont.fitting_one_well_ODE_constrained

Kinbiont.fitting_one_well_ODE_constrained (Function)
fitting_one_well_ODE_constrained(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model::String,
param;
integrator=Tsit5(),
pt_avg=1,
pt_smooth_derivative=7,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
blank_array=zeros(100),
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function uses an ordinary differential equation (ODE) model to fit the data from a single well.

Arguments:

  • data::Matrix{Float64}: The growth curve data. The first row contains time values, and the second row contains the observable values (e.g., OD).
  • model::String: The ODE model to be used for fitting. See documentation for the full list.
  • name_well::String: The name of the well.
  • label_exp::String: The label for the experiment.
  • param: Initial guess for the model parameters, provided as a vector of Float64.

Key Arguments:

  • integrator=Tsit5(): SciML integrator used for solving the ODE. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter estimation, from the BBO optimization library.
  • type_of_smoothing="rolling_avg": Method for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=1: Size of the rolling average window for smoothing.
  • smoothing=false: Boolean flag to apply data smoothing. Set to true to smooth the data; false to skip smoothing.
  • type_of_loss="RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2"; see the documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative=7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, a sliding window approach is applied.
  • multiple_scattering_correction=false: Boolean flag to perform multiple scattering correction. Set to true to apply correction, requiring a calibration curve.
  • calibration_OD_curve="NA": Path to the .csv file containing calibration data, used if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Threshold parameter for lowess smoothing.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Flag to enable or disable multistart optimization. If true, the Tik-Tak restart strategy is used (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using ODE", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the ODE model fit parameters.
    3. The numerical solution of the fitted ODE.
    4. Time coordinates corresponding to the fitted ODE.
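
The general idea of ODE fitting, integrating a model for candidate parameters and keeping the set with the lowest loss, can be sketched without any packages. The example below uses a logistic ODE, a hand-written fixed-step RK4 integrator in place of Tsit5(), and a coarse grid search inside lower and upper bounds in place of the BBO optimizer. All names are hypothetical, and these stand-ins are deliberately simpler than what the function above uses.

```julia
# Logistic ODE dN/dt = mu * N * (1 - N / K), with p = [mu, K].
logistic_rhs(n, p) = p[1] * n * (1 - n / p[2])

# Fixed-step RK4 integration on the given time grid (stand-in for Tsit5()).
function solve_rk4(n0::Float64, p::Vector{Float64}, times::Vector{Float64})
    sol = similar(times)
    sol[1] = n0
    for i in 2:length(times)
        h = times[i] - times[i - 1]
        n = sol[i - 1]
        k1 = logistic_rhs(n, p)
        k2 = logistic_rhs(n + h * k1 / 2, p)
        k3 = logistic_rhs(n + h * k2 / 2, p)
        k4 = logistic_rhs(n + h * k3, p)
        sol[i] = n + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    end
    return sol
end

# Mean squared difference between the numerical solution and the data.
l2_loss(sol, obs) = sum(abs2, sol .- obs) / length(obs)

# Coarse grid search inside [lb, ub] (stand-in for a global optimizer).
function grid_fit_sketch(obs, times, lb, ub; n_grid=20)
    best_p, best_loss = copy(lb), Inf
    for mu in range(lb[1], ub[1]; length=n_grid), k in range(lb[2], ub[2]; length=n_grid)
        loss = l2_loss(solve_rk4(obs[1], [mu, k], times), obs)
        if loss < best_loss
            best_p, best_loss = [mu, k], loss
        end
    end
    return best_p, best_loss
end

times = collect(0.0:0.5:10.0)
obs = solve_rk4(0.05, [0.8, 1.2], times)  # synthetic, noise-free data
best_p, best_loss = grid_fit_sketch(obs, times, [0.1, 0.5], [2.0, 2.4])
```

A grid search scales poorly with the number of parameters, which is one reason a dedicated optimizer and optional multistart are exposed as keyword arguments.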

Fitting a custom model

Kinbiont.fitting_one_well_custom_ODE

Kinbiont.fitting_one_well_custom_ODE (Function)
fitting_one_well_custom_ODE(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model::Any,
param,
n_equation::Int;
integrator=Tsit5(),
pt_avg=1,
pt_smooth_derivative=0,
smoothing=false,
type_of_loss="RE",
blank_array=zeros(100),
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
type_of_smoothing="lowess",
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function fits a user-defined ordinary differential equation (ODE) model to time-series data from a single well.

Arguments:

  • data::Matrix{Float64}: The growth curve data. The first row contains time values, and the second row contains the observable values (e.g., OD).
  • model::Any: The user-defined function representing the ODE model to be fitted. The function should define the ODE system to be solved; see the documentation for examples.
  • name_well::String: The name of the well.
  • label_exp::String: The label for the experiment.
  • param: Initial guess for the model parameters, provided as a vector of Float64.
  • n_equation::Int: Number of ODEs in the model.

Key Arguments:

  • integrator=Tsit5(): SciML integrator used for solving the ODE. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter estimation, from the BBO optimization library.
  • type_of_smoothing="lowess": Method for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=1: Size of the rolling average window for smoothing.
  • smoothing=false: Boolean flag to apply data smoothing. Set to true to smooth the data; false to skip smoothing.
  • type_of_loss="RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2"; see the documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative=0: Number of points for evaluating the specific growth rate. If less than 2, uses interpolation; otherwise, uses a sliding window approach.
  • multiple_scattering_correction=false: Boolean flag to apply multiple scattering correction. Set to true to apply correction, requiring a calibration curve.
  • calibration_OD_curve="NA": Path to the .csv file containing calibration data, used if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Threshold parameter for lowess smoothing.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to enable or disable multistart optimization. If true, the Tik-Tak restart strategy is used (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using ODE", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the ODE model fit parameters.
    3. The numerical solution of the fitted ODE.
    4. Time coordinates corresponding to the fitted ODE.
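
A user-defined model might look like the sketch below. It assumes the in-place form f!(du, u, param, t) that SciML integrators such as Tsit5() work with; the model itself (biomass growing on a limiting resource) and all names are hypothetical, chosen only to show how n_equation and the length of param relate to the function body.

```julia
# Hypothetical two-equation model in the in-place SciML form.
function custom_ode_sketch!(du, u, param, t)
    # u[1]: biomass, u[2]: limiting resource (names chosen for illustration)
    growth = param[1] * u[1] * u[2] / (param[2] + u[2])
    du[1] = growth
    du[2] = -growth / param[3]
    return nothing
end

n_equation = 2                  # number of ODEs defined above
param_guess = [0.5, 0.1, 0.8]   # illustrative initial guess, one entry per parameter

# Evaluate the right-hand side once to check that it is well formed.
du = zeros(n_equation)
custom_ode_sketch!(du, [0.05, 1.0], param_guess, 0.0)
```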

ODE model selection

Kinbiont.ODE_Model_selection

Kinbiont.ODE_Model_selection (Function)
ODE_Model_selection(
data::Matrix{Float64},
name_well::String,
label_exp::String,
models_list::Vector{String},
param_array::Any;
lb_param_array::Any=nothing,
ub_param_array::Any=nothing,
integrator=Tsit5(),
pt_avg=3,
beta_smoothing_ms=2.0,
smoothing=false,
type_of_smoothing="rolling_avg",
thr_lowess=0.05,
type_of_loss="L2",
blank_array=zeros(100),
pt_smooth_derivative=7,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
verbose=false,
correction_AIC=true,
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs automatic model selection for multiple ODE models fitted to time-series data from a single well. It selects the best-fitting model based on the Akaike Information Criterion (AIC) or the corrected AIC (AICc).

Arguments:

  • data::Matrix{Float64}: The growth curve data. The first row contains time values, and the second row contains the observable values (e.g., OD).
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • models_list::Vector{String}: A vector of strings that define the ODE models to fit. See documentation for the full list.
  • param_array::Any: Initial guess for the model parameters, provided as a vector or matrix.

Key Arguments:

  • lb_param_array::Any=nothing: Lower bounds for the parameters, compatible with the models. Use nothing for no bounds.
  • ub_param_array::Any=nothing: Upper bounds for the parameters, compatible with the models. Use nothing for no bounds.
  • integrator=Tsit5(): SciML integrator used for solving the ODEs. Consider KenCarp4(autodiff=true) for piecewise models.
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer from the BBO optimization library.
  • type_of_smoothing="rolling_avg": Method for smoothing the data. Options: "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=3: Size of the rolling average window for smoothing.
  • beta_smoothing_ms=2.0: Penalty parameter for evaluating AIC (or AICc).
  • smoothing=false: Boolean flag to apply data smoothing. Set to true to smooth the data; false to skip smoothing.
  • type_of_loss="L2": Type of loss function used for optimization. Options include "L2" (L2 norm), "RE" (relative error), "L2_derivative", and "blank_weighted_L2"; see the documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative=7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, uses a sliding window approach.
  • multiple_scattering_correction=false: Boolean flag to apply multiple scattering correction. Set to true if correction is required, necessitating a calibration curve.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for multiple scattering correction. Options are "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Parameter for lowess smoothing.
  • correction_AIC=true: Boolean flag to apply finite sample correction to AIC (AICc). Set to true to correct.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to enable or disable multistart optimization. If true, the Tik-Tak restart strategy is used (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used for model selection.
    2. best_model_params: Matrix containing the parameters of the best model.
    3. fit_numerical: Numerical array of the best model fit.
    4. fit_times: Time coordinates corresponding to the best model fit.
    5. model_stats: Statistical summary for each model fitted, including AIC/AICc values.
    6. best_model_score: Score (AIC/AICc) of the best model.
    7. best_model_params_values: Parameters of the best model.
    8. min_aic_or_aicc: Minimum value of AIC or AICc for the best model.
    9. best_model_string: String representation of the best model.
    10. all_params: Parameters of the fit of all models.
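
The scoring step can be sketched with the standard least-squares form of AIC, where the loss is the mean squared residual, and its finite-sample correction. Treating beta_smoothing_ms as the multiplier of the parameter count is an assumption made for this illustration; the helper aic_sketch and the numbers are hypothetical.

```julia
# Assumed forms, for illustration only:
#   AIC  = n * log(loss) + beta * k
#   AICc = AIC + 2k(k + 1) / (n - k - 1)
function aic_sketch(loss, n_points, n_params; beta=2.0, correction=true)
    aic = n_points * log(loss) + beta * n_params
    if correction
        aic += 2 * n_params * (n_params + 1) / (n_points - n_params - 1)
    end
    return aic
end

# Made-up losses for two candidate models fitted to 50 points.
candidates = ["two-parameter model" => (loss=0.002, k=2),
              "three-parameter model" => (loss=0.00199, k=3)]
scores = [aic_sketch(c.second.loss, 50, c.second.k) for c in candidates]
best_model = candidates[argmin(scores)].first
```

Here the extra parameter lowers the loss only marginally, so the penalty term makes the simpler model the preferred one.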

ODE Morris sensitivity

Kinbiont.one_well_morris_sensitivity

Kinbiont.one_well_morris_sensitivity (Function)
one_well_morris_sensitivity(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model::String,
lb_param::Vector{Float64},
ub_param::Vector{Float64};
N_step_morris=7,
integrator=Tsit5(),
pt_avg=1,
pt_smooth_derivative=7,
write_res=false,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
blank_array=zeros(100),
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
thr_lowess=0.05,
multistart=false,
n_restart=50,
opt_params...
)

This function performs Morris sensitivity analysis to evaluate the sensitivity of model parameters to variations in their initial guesses. This analysis is useful for assessing the robustness of nonlinear model fits.

Arguments:

  • data::Matrix{Float64}: The growth curve data. Time values are in the first row, and the fit observable (e.g., OD) is in the second row.
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • model::String: The ODE model to be used for fitting.
  • lb_param::Vector{Float64}: Vector containing the lower bounds for the model parameters.
  • ub_param::Vector{Float64}: Vector containing the upper bounds for the model parameters.

Key Arguments:

  • N_step_morris=7: Number of steps for the Morris sensitivity analysis.
  • param=lb_param .+ (ub_param.-lb_param)./2: Initial guess for the model parameters, calculated as the midpoint of the lower and upper bounds.
  • integrator=Tsit5(): SciML integrator used for solving the ODEs. Consider KenCarp4(autodiff=true) for piecewise models.
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer used for fitting the model.
  • type_of_smoothing="rolling_avg": Method for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=1: Size of the rolling average window for smoothing.
  • pt_smooth_derivative=7: Number of points used for evaluating the specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is used.
  • smoothing=false: Boolean flag indicating whether to apply smoothing to the data (true) or not (false).
  • type_of_loss="RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2"; see the documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used if multiple_scattering_correction=true.
  • multiple_scattering_correction=false: Boolean flag to apply multiple scattering correction. Set to true if correction is required.
  • method_multiple_scattering_correction="interpolation": Method for multiple scattering correction. Options include "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Parameter for lowess smoothing.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to enable or disable multistart optimization. If true, the Tik-Tak restart strategy is used (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

  • A data structure containing:
    1. method: A string describing the Morris sensitivity analysis method used.
    2. results_fit: The results of the model fits for all starting parameter sets used in the sensitivity analysis.
    3. initial_guesses: Initial guess of parameters for each run of the sensitivity analysis.
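
The sketch below shows how a set of starting points could be laid out between the bounds: the midpoint formula is the one given for param above, and each parameter is then moved across N_step_morris evenly spaced levels while the others stay at the midpoint. This one-at-a-time layout is a simplification for illustration and is not the trajectory sampling of the Morris method itself.

```julia
lb_param = [0.1, 0.3]
ub_param = [9.0, 1.0]
N_step_morris = 7

# Midpoint of the bounds, as in the default initial guess described above.
param = lb_param .+ (ub_param .- lb_param) ./ 2

# Evenly spaced levels for each parameter between its bounds.
levels = [collect(range(lb_param[i], ub_param[i]; length=N_step_morris))
          for i in eachindex(lb_param)]

# Simplified one-at-a-time design: vary one parameter, keep the rest at the midpoint.
guesses = Vector{Vector{Float64}}()
for i in eachindex(param), v in levels[i]
    g = copy(param)
    g[i] = v
    push!(guesses, g)
end
```

Each entry of guesses would then serve as the starting point of one fit, and the spread of the fitted parameters across runs indicates how robust the fit is.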

NL Fitting

Kinbiont.fit_NL_model

Kinbiont.fit_NL_model (Function)
fit_NL_model(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model_function::Any,
u0;
pt_avg=1,
pt_smooth_derivative=7,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
penality_CI=3.0,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function fits a nonlinear function to the time series input data of a single well.

Arguments:

  • data::Matrix{Float64}: The dataset containing the growth curve. The first row should represent time values, and the second row should represent the variable to fit (e.g., optical density). Refer to the documentation for proper formatting.
  • name_well::String: The name of the well being analyzed.
  • label_exp::String: The label for the experiment to identify the results.
  • model_function::Any: The nonlinear model function to be used for fitting. This can be a custom function or a predefined model specified as a string (see documentation for available models).
  • u0::Vector{Float64}: Initial guess for the model parameters.

Key Arguments:

  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter estimation, from the BBO optimization library.
  • type_of_smoothing::String = "rolling_avg": Method for smoothing the data. Options include "NO" (no smoothing), "rolling_avg" (rolling average), and "lowess" (locally weighted scatterplot smoothing).
  • pt_avg::Int = 1: Size of the rolling average window for smoothing if type_of_smoothing is "rolling_avg".
  • smoothing::Bool = false: Flag to apply data smoothing. Set to true to smooth the data; false to skip.
  • type_of_loss::String = "RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array::Vector{Float64} = zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, a sliding window approach is applied.
  • multiple_scattering_correction::Bool = false: Flag to perform multiple scattering correction. Set to true to apply correction, requiring a calibration curve.
  • calibration_OD_curve::String = "NA": Path to a .csv file containing calibration data for optical density, used if multiple_scattering_correction is true.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit" (based on Meyers et al., 2018).
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • penality_CI::Float64 = 3.0: (Deprecated) Penalty for enforcing continuity at segment boundaries.
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • cons::Any = nothing: Constraints for optimization.
  • multistart::Bool = false: Flag to enable multistart optimization. Set to true to use multiple starting points.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used for fitting.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using NL", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the fitted model.
    3. The numerical solution of the fitted nonlinear model.
    4. Time coordinates corresponding to the fitted nonlinear model.
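
A custom nonlinear model can be sketched as a plain function of the parameters and the time points. The form model(p, times) used below is an assumption about the expected signature, and the relative-error loss is one illustrative definition rather than the package's own; both helper names are hypothetical.

```julia
using Statistics

# Hypothetical custom model in the assumed form model(p, times): exponential growth.
nl_exp_sketch(p, times) = p[1] .* exp.(p[2] .* times)

# Illustrative relative-error loss (an assumed form, not Kinbiont's definition).
relative_error(pred, obs) = mean(abs2, 1 .- pred ./ obs)

times = collect(0.0:1.0:5.0)
obs = nl_exp_sketch([0.05, 0.4], times)  # synthetic, noise-free observations

u0 = [0.03, 0.3]                         # illustrative initial guess
loss_at_guess = relative_error(nl_exp_sketch(u0, times), obs)
loss_at_truth = relative_error(nl_exp_sketch([0.05, 0.4], times), obs)
```

An optimizer would start from u0 and move toward parameters that drive this loss toward zero.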

Kinbiont.fit_NL_model_with_sensitivity

Kinbiont.fit_NL_model_with_sensitivity (Function)
fit_NL_model_with_sensitivity(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model_function::Any,
lb_param::Vector{Float64},
ub_param::Vector{Float64};
nrep=9,
pt_avg=1,
pt_smooth_derivative=7,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
write_res=false,
penality_CI=3.0,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs Morris sensitivity analysis on the nonlinear fit optimization.

Arguments:

  • data::Matrix{Float64}: The growth curve data matrix. Time values are in the first row and the observable values (e.g., optical density) are in the second row.
  • name_well::String: The name of the well for which the model is being fitted.
  • label_exp::String: Label for the experiment, used for identifying results.
  • model_function::Any: The nonlinear model function to be used for fitting. This can be a custom function or a predefined model name.
  • lb_param::Vector{Float64}: Lower bounds for the model parameters. Defines the hyperspace for sensitivity analysis.
  • ub_param::Vector{Float64}: Upper bounds for the model parameters. Defines the hyperspace for sensitivity analysis.

Key Arguments:

  • nrep::Int = 9: Number of steps for the Morris sensitivity analysis. Determines the number of sampling points.
  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter estimation, from the BBO optimization library.
  • type_of_smoothing::String = "rolling_avg": Method for smoothing the data. Options include "NO" (no smoothing), "rolling_avg" (rolling average), and "lowess" (locally weighted scatterplot smoothing).
  • pt_avg::Int = 1: Size of the rolling average window for smoothing if type_of_smoothing is "rolling_avg".
  • smoothing::Bool = false: Flag to apply data smoothing. Set to true to enable smoothing; false to skip.
  • type_of_loss::String = "RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array::Vector{Float64} = zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, a sliding window approach is applied.
  • multiple_scattering_correction::Bool = false: Flag to perform multiple scattering correction. Set to true to apply correction, requiring a calibration curve.
  • calibration_OD_curve::String = "NA": Path to a .csv file containing calibration data for optical density, used if multiple_scattering_correction is true.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit" (based on Meyers et al., 2018).
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • write_res::Bool = false: Flag to write results to a file. Set to true to enable file writing.
  • penality_CI::Float64 = 3.0: (Deprecated) Penalty for enforcing continuity at segment boundaries.
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • cons::Any = nothing: Constraints for optimization.
  • multistart::Bool = false: Flag to enable multistart optimization. Set to true to use multiple starting points.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used for fitting and sensitivity analysis.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using NL", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the fitted model.
    3. The numerical solution of the fitted nonlinear model.
    4. Time coordinates corresponding to the fitted nonlinear model.
    5. Final parameters from the sensitivity analysis.
    6. Sensitivity analysis results including mean parameter values from the analysis.
    7. Standard deviation of the parameters from the sensitivity analysis.
source

Kinbiont.fit_NL_model_bootstrap

Kinbiont.fit_NL_model_bootstrap - Function
fit_NL_model_bootstrap(
data::Matrix{Float64},
name_well::String,
label_exp::String,
model_function::Any,
u0;
lb_param=nothing,
ub_param=nothing,
nrep=100,
size_bootstrap=0.7,
pt_avg=1,
pt_smooth_derivative=7,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
write_res=false,
penality_CI=3.0,
path_to_results="NA",
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs nonlinear (NL) fitting of the growth curve data using a bootstrap approach to evaluate confidence intervals and mitigate issues with poor initializations.

Arguments:

  • data::Matrix{Float64}: The growth curve data matrix, where the first row contains time values, and the second row contains the observable values (e.g., optical density (OD)).
  • name_well::String: The name of the well for which the model is being fitted.
  • label_exp::String: The label for the experiment, used for identifying the results.
  • model_function::Any: The nonlinear model function to be used for fitting. This can be a custom function or a predefined model name.
  • u0: Initial guess for the model parameters.

Key Arguments:

  • lb_param::Vector{Float64} = nothing: Lower bounds for the model parameters. Defines the parameter space.
  • ub_param::Vector{Float64} = nothing: Upper bounds for the model parameters. Defines the parameter space.
  • size_bootstrap::Float64 = 0.7: Fraction of the data used for each bootstrap iteration.
  • nrep::Int = 100: Number of bootstrap iterations to perform.
  • pt_avg::Int = 1: Number of points to use for initial conditions or rolling average smoothing.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, applies a sliding window approach.
  • type_of_smoothing::String = "rolling_avg": Method for smoothing the data. Options include "NO" (no smoothing), "rolling_avg" (rolling average), and "lowess" (locally weighted scatterplot smoothing).
  • smoothing::Bool = false: Flag to apply data smoothing. Set to true to enable smoothing; false to skip.
  • type_of_loss::String = "RE": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array::Vector{Float64} = zeros(100): Array containing data of blanks for correction.
  • calibration_OD_curve::String = "NA": Path to the .csv file containing calibration data for optical density, used if multiple_scattering_correction is true.
  • multiple_scattering_correction::Bool = false: Flag to apply multiple scattering correction using the given calibration curve.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit" (based on Meyers et al., 2018).
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • penality_CI::Float64 = 3.0: (Deprecated) Penalty for enforcing continuity at segment boundaries.
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • cons::Any = nothing: Constraints for optimization.
  • multistart::Bool = false: Flag to enable multistart optimization. Set to true to use multiple starting points.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter estimation, from the BBO optimization library.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).
  • write_res::Bool = false: Flag to write results to a file. Set to true to enable file writing.
  • path_to_results::String = "NA": Path to the folder where results will be saved if write_res is true.

Output:

  • A data structure containing:
    1. method: A string describing the method used, including details of the bootstrap approach.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using NL", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the fitted model.
    3. The numerical solution of the fitted nonlinear model.
    4. Time coordinates corresponding to the fitted nonlinear model.
    5. The parameters of each bootstrap fit.
    6. The parameters of each bootstrap fit after considering only the best 95% of the losses.
    7. Mean parameters from the bootstrap analysis.
    8. Standard deviation of parameters from the bootstrap analysis.
    9. (Optional) Results saved in the specified path_to_results folder if write_res is true.
source
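
The bootstrap idea described above can be illustrated with a minimal, self-contained sketch. It does not call Kinbiont and is not its implementation: the helper names fit_exponential and bootstrap_sketch are hypothetical, a closed-form log-linear fit stands in for the optimizer-based NL fit, and subsampling without replacement is an assumption about how size_bootstrap is applied.

```julia
using Random, Statistics

# Closed-form least-squares fit of log(y) = log(N0) + mu * t.
# Stand-in for the optimizer-based NL fit (assumption of this sketch).
function fit_exponential(t::Vector{Float64}, y::Vector{Float64})
    ly = log.(y)
    tm, lm = mean(t), mean(ly)
    mu = sum((t .- tm) .* (ly .- lm)) / sum((t .- tm) .^ 2)
    n0 = exp(lm - mu * tm)
    loss = sum((ly .- (log(n0) .+ mu .* t)) .^ 2)
    return [n0, mu], loss
end

# Hypothetical helper: repeat the fit on random subsets of the 2xN data matrix.
function bootstrap_sketch(data::Matrix{Float64}; nrep=100, size_bootstrap=0.7,
                          rng=MersenneTwister(1))
    n = size(data, 2)
    n_sub = max(2, floor(Int, size_bootstrap * n))
    params = Matrix{Float64}(undef, nrep, 2)
    losses = Vector{Float64}(undef, nrep)
    for i in 1:nrep
        # Subsample without replacement (assumption), keeping time order.
        idx = sort(randperm(rng, n)[1:n_sub])
        p, l = fit_exponential(data[1, idx], data[2, idx])
        params[i, :] = p
        losses[i] = l
    end
    # Keep only the fits whose loss is within the best 95%.
    keep = losses .<= quantile(losses, 0.95)
    kept = params[keep, :]
    return (all_params=params, kept_params=kept,
            mean_params=vec(mean(kept, dims=1)),
            std_params=vec(std(kept, dims=1)))
end

# Synthetic exponential growth curve with 2% multiplicative noise.
t = collect(0.0:0.5:10.0)
noise_rng = MersenneTwister(42)
od = 0.01 .* exp.(0.3 .* t) .* (1 .+ 0.02 .* randn(noise_rng, length(t)))
data = Matrix(transpose(hcat(t, od)))   # first row: time, second row: OD

res = bootstrap_sketch(data)
println("mean parameters: ", res.mean_params)
println("std of parameters: ", res.std_params)
```

The spread of the per-replicate parameters (std_params here) is what gives a rough confidence estimate for each fitted parameter.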

NL Model selection

Kinbiont.NL_model_selection

Kinbiont.NL_model_selection - Function
NL_model_selection(
data::Matrix{Float64},
name_well::String,
label_exp::String,
list_model_function::Any,
list_u0;
lb_param_array::Any = nothing,
ub_param_array::Any = nothing,
method_of_fitting="NA",
nrep=100,
size_bootstrap=0.7,
pt_avg=1,
pt_smooth_derivative=7,
smoothing=false,
type_of_smoothing="rolling_avg",
type_of_loss="RE",
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
write_res=false,
beta_smoothing_ms=2.0,
penality_CI=8.0,
correction_AIC=false,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
multistart=false,
n_restart=50,
cons=nothing,
opt_params...
)

This function performs nonlinear (NL) model selection from an array of NL models using AIC or AICc, depending on user inputs. It performs nrep iterations to estimate the posterior distribution of parameters. It uses the blank distribution as noise.

Arguments:

  • data::Matrix{Float64}: The dataset with the growth curve. The first row represents times, and the second row represents the variable to fit (e.g., OD).
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • list_model_function::Any: Array containing functions or strings representing the NL models.
  • list_u0: Array of initial guesses for the parameters for each model.

Key Arguments:

  • lb_param_array::Any = nothing: Array of lower bounds for the parameters, compatible with the models.
  • ub_param_array::Any = nothing: Array of upper bounds for the parameters, compatible with the models.
  • method_of_fitting::String = "NA": Method of performing the NL fit. Options are "Bootstrap", "Normal", and "Morris_sensitivity".
  • nrep::Int = 100: Number of iterations for estimating the posterior distribution.
  • size_bootstrap::Float64 = 0.7: Fraction of data used in each bootstrap run, applicable if method_of_fitting is "Bootstrap".
  • pt_avg::Int = 1: Number of points to generate initial conditions or perform smoothing.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, applies a sliding window approach.
  • smoothing::Bool = false: Whether to apply smoothing to the data. Set to true to enable smoothing; false to skip.
  • type_of_smoothing::String = "rolling_avg": Method for smoothing the data. Options are "NO", "rolling_avg", and "lowess".
  • type_of_loss::String = "RE": Type of loss function used for optimization. Options include "RE", "L2", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array::Vector{Float64} = zeros(100): Data of all blanks in a single array.
  • calibration_OD_curve::String = "NA": Path to the .csv file containing calibration data for optical density, used if multiple_scattering_correction is true.
  • multiple_scattering_correction::Bool = false: Whether to apply multiple scattering correction using the given calibration curve.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit".
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • penality_CI::Float64 = 8.0: Penalty for enforcing continuity at segment boundaries.
  • correction_AIC::Bool = false: Whether to apply finite sample correction to AIC.
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • multistart::Bool = false: Flag to enable or disable multistart optimization. Set to true to use multiple starting points.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).
  • write_res::Bool = false: Flag to write results to a file. Set to true to enable file writing.
  • path_to_results::String = "NA": Path to the folder where results will be saved if write_res is true.

Output:

  • A data structure containing:
    1. method: A string describing the method used for model selection.
    2. Parameters array: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using NL", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the fitted model.
    3. The numerical solution of the fitted nonlinear model.
    4. Time coordinates corresponding to the fitted nonlinear model.
    5. The AIC values for all models.
    6. The loss value of the top model.
source
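
To make the role of beta_smoothing_ms and correction_AIC more concrete, here is a small sketch of an AIC-style score for least-squares fits. The formula is an assumption (the standard least-squares AIC with the usual factor 2 replaced by a tunable penalty, plus the standard finite-sample AICc term); the function name aic_sketch and the candidate values are hypothetical and do not come from Kinbiont.

```julia
# AIC-style score for a least-squares fit (assumed form, not Kinbiont's code):
#   n * log(RSS / n) + beta * k, with an optional finite-sample correction.
function aic_sketch(rss::Float64, n_data::Int, n_param::Int;
                    beta::Float64=2.0, correction::Bool=false)
    score = n_data * log(rss / n_data) + beta * n_param
    if correction
        # Standard AICc term; requires n_data > n_param + 1.
        score += (2 * n_param * (n_param + 1)) / (n_data - n_param - 1)
    end
    return score
end

# Hypothetical candidates: (name, residual sum of squares, number of parameters).
candidates = [("model_A", 0.50, 2), ("model_B", 0.45, 4)]
n_data = 30

scores = [aic_sketch(rss, n_data, k; correction=true) for (_, rss, k) in candidates]
best = candidates[argmin(scores)][1]
println("scores: ", scores)
println("selected model: ", best)
```

In this example the four-parameter model fits slightly better, but the penalty on the extra parameters outweighs the gain, so the simpler model has the lower score.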

Segmented fitting

ODE segmentation

Kinbiont.selection_ODE_fixed_intervals

Kinbiont.selection_ODE_fixed_intervals - Function
selection_ODE_fixed_intervals(
data::Matrix{Float64},
name_well::String,
label_exp::String,
list_of_models::Vector{String},
param_array,
intervals_changepoints::Any;
lb_param_array::Any=nothing,
ub_param_array::Any=nothing,
type_of_loss="L2",
integrator=Tsit5(),
smoothing=false,
type_of_smoothing="lowess",
thr_lowess=0.05,
pt_avg=1,
pt_smooth_derivative=0,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
beta_smoothing_ms=2.0,
correction_AIC=true,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function fits ordinary differential equation (ODE) models to segmented time-series data. Users provide fixed change points, and the function fits models to each segment defined by these points.

Arguments:

  • data::Matrix{Float64}: The growth curve data. Time values are in the first row, and the fit observable (e.g., OD) is in the second row.
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • list_of_models::Vector{String}: List of ODE models to be considered for fitting. See documentation for the full options.
  • param_array: Vector of initial guesses for model parameters.
  • intervals_changepoints::Any: Array containing the list of change points, e.g., [0.0, 10.0, 30.0]. These define the segments for which models will be fitted.

Key Arguments:

  • lb_param_array::Any: Lower bounds for the parameters for each model (compatible with the models).
  • ub_param_array::Any: Upper bounds for the parameters for each model (compatible with the models).
  • integrator=Tsit5(): SciML integrator used for solving the ODEs. Use KenCarp4(autodiff=true) for piecewise models.
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer from OptimizationBBO.
  • type_of_smoothing="lowess": Method for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=1: Size of the rolling average window for smoothing.
  • smoothing=false: Boolean flag indicating whether to apply smoothing to the data (true) or not (false).
  • type_of_loss="L2": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • pt_smooth_derivative=0: Number of points for evaluation of specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is used.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used if multiple_scattering_correction=true.
  • multiple_scattering_correction=false: Boolean flag to apply multiple scattering correction. Set to true if correction is needed.
  • method_multiple_scattering_correction="interpolation": Method for multiple scattering correction. Options include "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Parameter for lowess smoothing.
  • beta_smoothing_ms=2.0: Penalty parameter for AIC (or AICc) evaluation.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to enable or disable multistart optimization. Setting it to true uses the Tik-Tak restart strategy (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

If res = selection_ODE_fixed_intervals(...):

  • res[1]: Parameters of each segment.
  • res[2]: Intervals corresponding to each ODE segment.
  • res[3]: Time coordinates of the fitted solution.
  • res[4]: Numerical values of the fitted solutions.
  • res[5]: The fit loss score for each segment.
source
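
As an illustration of how fixed change points partition a curve into segments, the following sketch splits a 2xN data matrix at a list of time values. It is independent of Kinbiont: the helper split_at_changepoints is hypothetical, and both the treatment of the list as interior change points and the assignment of a boundary point to the segment on its left are assumptions.

```julia
# Hypothetical helper: split a 2xN matrix (row 1: time, row 2: observable)
# into consecutive segments delimited by the given change points.
function split_at_changepoints(data::Matrix{Float64}, cps::Vector{Float64})
    edges = vcat(-Inf, sort(cps), Inf)
    segments = Matrix{Float64}[]
    for i in 1:(length(edges) - 1)
        # A point exactly on a change point goes to the left segment (assumption).
        mask = (data[1, :] .> edges[i]) .& (data[1, :] .<= edges[i + 1])
        any(mask) && push!(segments, data[:, mask])
    end
    return segments
end

t = collect(0.0:1.0:20.0)
data = Matrix(transpose(hcat(t, 0.01 .* exp.(0.2 .* t))))

segs = split_at_changepoints(data, [5.0, 12.0])
for (i, s) in enumerate(segs)
    println("segment ", i, ": t in [", s[1, 1], ", ", s[1, end], "], ", size(s, 2), " points")
end
```

Each resulting sub-matrix has the same 2xN layout as the input, so it could be passed to a single-curve fitting routine on its own.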

Kinbiont.segmentation_ODE

Kinbiont.segmentation_ODE - Function
segmentation_ODE(
data_testing::Matrix{Float64},
name_well::String,
label_exp::String,
list_of_models::Vector{String},
param_array::Any,
n_max_change_points::Int;
lb_param_array::Any=nothing,
ub_param_array::Any=nothing,
detect_number_cpd=true,
fixed_cpd=false,
integrator=Tsit5(),
type_of_loss="L2",
type_of_detection="slinding_win",
type_of_curve="original",
pt_avg=1,
smoothing=true,
path_to_results="NA",
win_size=14,
pt_smooth_derivative=7,
beta_smoothing_ms=2.0,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
save_all_model=false,
method_peaks_detection="peaks_prominence",
n_bins=40,
type_of_smoothing="lowess",
thr_lowess=0.05,
correction_AIC=true,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs model selection for ordinary differential equation (ODE) models across different segments of the input growth time series data. Segmentation is achieved using a change point detection algorithm, allowing the identification of multiple segments where different models can be fitted.

Arguments:

  • data_testing::Matrix{Float64}: The growth curve data. Time values are in the first row, and the fit observable (e.g., OD) is in the second row.
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • list_of_models::Vector{String}: List of ODE models to be considered for fitting.
  • param_array::Any: Initial guesses for the model parameters.
  • n_max_change_points::Int: Maximum number of change points to be considered. The function will evaluate models with varying numbers of change points up to this maximum.

Key Arguments:

  • lb_param_array::Any: Lower bounds for the parameters for each model (compatible with the models).
  • ub_param_array::Any: Upper bounds for the parameters for each model (compatible with the models).
  • integrator=Tsit5(): SciML integrator used for solving the ODEs. Use KenCarp4(autodiff=true) for piecewise models.
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer from OptimizationBBO.
  • type_of_smoothing="lowess": Method of choice for smoothing the data. Options include "NO", "rolling_avg" (rolling average), and "lowess".
  • pt_avg=1: Size of the rolling average window for smoothing.
  • pt_smooth_derivative=7: Number of points for evaluation of the specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is used.
  • smoothing=true: Boolean flag indicating whether to apply smoothing to the data (true) or not (false).
  • type_of_loss="L2": Type of loss function used for optimization. Options include "RE" (relative error), "L2" (L2 norm), "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • blank_array=zeros(100): Array containing data of blanks for correction.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used if multiple_scattering_correction=true.
  • multiple_scattering_correction=false: Boolean flag to apply multiple scattering correction. Set to true if correction is needed.
  • method_multiple_scattering_correction="interpolation": Method for multiple scattering correction. Options include "interpolation" or "exp_fit" (adapted from Meyers et al., 2018).
  • thr_lowess=0.05: Parameter for lowess smoothing.
  • beta_smoothing_ms=2.0: Penalty parameter for AIC (or AICc) evaluation.
  • type_of_detection="slinding_win": Method of change point detection. Options include "slinding_win" (sliding window approach) and "lsdd" (least squares density difference).
  • type_of_curve="original": Defines the input curve for change point detection. Options include "original" for the raw time series and "deriv" for the specific growth rate time series.
  • method_peaks_detection="peaks_prominence": Method for peak detection in the dissimilarity curve. Options include "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold to identify peaks).
  • n_bins=40: Number of bins used for threshold scanning if method_peaks_detection="thr_scan".
  • detect_number_cpd=true: Boolean flag to test all possible combinations of change points up to n_max_change_points. Set to false to only fit using n_max_change_points.
  • fixed_cpd=false: Boolean flag to return the fit using exactly n_max_change_points change points if true.
  • win_size=14: Size of the window used by change point detection algorithms.
  • path_to_results="NA": Path to save the results.
  • save_all_model=false: Boolean flag to save all tested models if true.
  • auto_diff_method=nothing: Differentiation method for the optimizer, if required.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to enable or disable multistart optimization. Setting it to true uses the Tik-Tak restart strategy (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts for multistart optimization, used if multistart=true.
  • opt_params...: Optional parameters for the optimizer (e.g., lb=[0.1, 0.3], ub=[9.0, 1.0], maxiters=2000000).

Output:

If res = segmentation_ODE(...):

  • res[1]: String describing the method used for segmentation and model fitting.
  • res[2]: Matrix of parameters for each ODE segment.
  • res[3]: Numerical values of the fitted solutions.
  • res[4]: Time coordinates of the fitted solutions.
  • res[5]: Intervals corresponding to the identified change points.
  • res[6]: AICc (or AIC) of the final model, indicating the goodness of fit.
source
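
The sliding-window detection and peak selection mentioned in the arguments can be sketched in a few lines. This is a deliberately simplified stand-in, not Kinbiont's algorithm: the dissimilarity is just the absolute difference between the means of two adjacent windows, peaks are ranked by height rather than by prominence, and the function names are hypothetical.

```julia
using Statistics

# Dissimilarity between the window before and the window starting at index i.
function sliding_window_dissimilarity(y::Vector{Float64}, win_size::Int)
    n = length(y)
    d = zeros(n)
    for i in (win_size + 1):(n - win_size + 1)
        left = y[(i - win_size):(i - 1)]
        right = y[i:(i + win_size - 1)]
        d[i] = abs(mean(right) - mean(left))
    end
    return d
end

# Local maxima of the dissimilarity curve, keeping the n_peaks highest ones.
function top_peaks(d::Vector{Float64}, n_peaks::Int)
    peaks = [i for i in 2:(length(d) - 1) if d[i] > d[i - 1] && d[i] >= d[i + 1]]
    sort!(peaks, by = i -> d[i], rev = true)
    return sort(peaks[1:min(n_peaks, length(peaks))])
end

# Piecewise-constant test signal with jumps at indices 21 and 41.
y = vcat(fill(0.1, 20), fill(0.5, 20), fill(0.9, 20))
d = sliding_window_dissimilarity(y, 5)
cps = top_peaks(d, 2)
println("detected change point indices: ", cps)
```

Here win_size plays the same role as the documented keyword: a larger window smooths the dissimilarity curve but blurs change points that lie close together.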

NL segmentation

Kinbiont.selection_NL_fixed_interval

Kinbiont.selection_NL_fixed_interval - Function
selection_NL_fixed_interval(
data_testing::Matrix{Float64},
name_well::String,
label_exp::String,
list_of_models::Vector{String},
list_u0,
intervals_changepoints::Any;
lb_param_array::Any = nothing,
ub_param_array::Any = nothing,
type_of_loss="L2",
method_of_fitting="NA",
smoothing=false,
size_bootstrap=0.7,
nrep=100,
type_of_smoothing="lowess",
thr_lowess=0.05,
pt_avg=1,
pt_smooth_derivative=0,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
beta_smoothing_ms=2.0,
penality_CI=8.0,
correction_AIC=true,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function fits a segmented nonlinear (NL) model to a curve, using specified change points.

Arguments:

  • data_testing::Matrix{Float64}: The dataset with the growth curve, where the first row represents times, and the second row represents the variable to fit (e.g., OD).
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • list_of_models::Vector{String}: Array containing functions or strings representing the NL models.
  • list_u0: Initial guesses for the parameters for each model.
  • intervals_changepoints::Any: Array containing the change points for segmentation, e.g., [0.0, 10.0, 30.0].

Key Arguments:

  • lb_param_array::Any = nothing: Array of lower bounds for the parameters, compatible with the models.
  • ub_param_array::Any = nothing: Array of upper bounds for the parameters, compatible with the models.
  • type_of_loss::String = "L2": Type of loss function used. Options include "L2", "RE", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • method_of_fitting::String = "NA": Method for performing the NL fit. Options are "Normal", "Bootstrap", and "Morris_sensitivity".
  • nrep::Int = 100: Number of iterations for Bootstrap or Morris sensitivity analysis.
  • smoothing::Bool = false: Whether to apply smoothing to the data.
  • type_of_smoothing::String = "lowess": Method for smoothing the data. Options include "NO", "rolling_avg", and "lowess".
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • pt_avg::Int = 1: Number of points for generating initial conditions or performing smoothing.
  • pt_smooth_derivative::Int = 0: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, applies a sliding window approach.
  • multiple_scattering_correction::Bool = false: Whether to apply multiple scattering correction using the given calibration curve.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" or "exp_fit".
  • calibration_OD_curve::String = "NA": Path to the .csv file with calibration data for optical density, used if multiple_scattering_correction is true.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • penality_CI::Float64 = 8.0: Penalty for enforcing continuity at segment boundaries.
  • correction_AIC::Bool = true: Whether to apply finite sample correction to AIC.
  • size_bootstrap::Float64 = 0.7: Fraction of data used in each bootstrap run, applicable if method_of_fitting is "Bootstrap".
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • cons::Any = nothing: Constraints for optimization.
  • multistart::Bool = false: Whether to use multistart optimization. Set to true to use multiple starting points.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).

Output:

If results_NL_fit = selection_NL_fixed_interval(...):

  • results_NL_fit[1]: Parameters of each segment.
  • results_NL_fit[2]: Numerical solutions of the fit.
  • results_NL_fit[3]: Time coordinates of the numerical solutions of the fit.
  • results_NL_fit[4]: Loss of the best model.
source
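
The difference between the "L2" and "RE" choices of type_of_loss can be illustrated with two textbook definitions. These are assumed forms for illustration only (a mean squared error and a mean squared relative error); Kinbiont's own loss functions may be normalized or weighted differently, and the names l2_loss and re_loss are hypothetical.

```julia
# Mean squared error: large observable values dominate the sum.
l2_loss(y::Vector{Float64}, yhat::Vector{Float64}) =
    sum((y .- yhat) .^ 2) / length(y)

# Mean squared relative error: each residual is scaled by the data value,
# so low-OD points weigh as much as high-OD points (requires nonzero y).
re_loss(y::Vector{Float64}, yhat::Vector{Float64}) =
    sum(((y .- yhat) ./ y) .^ 2) / length(y)

# Same absolute error (0.1) at a low and at a high observable value.
y = [0.1, 1.0]
yhat = [0.2, 1.1]

println("L2 loss: ", l2_loss(y, yhat))
println("RE loss: ", re_loss(y, yhat))
```

With an absolute loss both points contribute equally, while with a relative loss the low-value point dominates, which matters for growth curves spanning several orders of magnitude.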

Kinbiont.segmentation_NL

Kinbiont.segmentation_NL - Function
segmentation_NL(
data_testing::Matrix{Float64},
name_well::String,
label_exp::String,
list_of_models::Any,
list_u0,
n_change_points::Int;
lb_param_array::Any=nothing,
ub_param_array::Any=nothing,
type_of_loss="L2_fixed_CI",
method_of_fitting="NA",
type_of_detection="sliding_win",
type_of_curve="original",
smoothing=false,
nrep=100,
type_of_smoothing="lowess",
thr_lowess=0.05,
pt_avg=1,
win_size=7,
pt_smooth_derivative=0,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
beta_smoothing_ms=2.0,
method_peaks_detection="peaks_prominence",
n_bins=40,
detect_number_cpd=false,
fixed_cpd=false,
penality_CI=8.0,
size_bootstrap=0.7,
correction_AIC=true,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs model selection for nonlinear (NL) models while segmenting the time series using change point detection algorithms.

Arguments:

  • data_testing::Matrix{Float64}: The dataset with the growth curve, where the first row represents times, and the second row represents the variable to fit (e.g., OD).
  • name_well::String: Name of the well.
  • label_exp::String: Label of the experiment.
  • list_of_models::Any: Array containing functions or strings representing the NL models.
  • list_u0: Initial guesses for the parameters of each model.
  • n_change_points::Int: Number of change points to use. The results will vary based on the type_of_detection and fixed_cpd arguments.

Key Arguments:

  • lb_param_array::Any = nothing: Array of lower bounds for the parameters, compatible with the models.
  • ub_param_array::Any = nothing: Array of upper bounds for the parameters, compatible with the models.
  • type_of_loss::String = "L2_fixed_CI": Type of loss function used. Options include "L2_fixed_CI", "RE", "L2", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • method_of_fitting::String = "NA": Method for performing the NL fit. Options are "Normal", "Bootstrap", and "Morris_sensitivity".
  • type_of_detection::String = "sliding_win": Change point detection algorithm. Options include "sliding_win" and "lsdd" (Least Squares Density Difference).
  • type_of_curve::String = "original": Curve used for change point detection. Options are "original" (original time series) and "deriv" (specific growth rate time series).
  • smoothing::Bool = false: Whether to apply smoothing to the data.
  • nrep::Int = 100: Number of iterations for Bootstrap or Morris sensitivity analysis.
  • type_of_smoothing::String = "lowess": Method for smoothing the data. Options include "NO", "rolling_avg", and "lowess".
  • thr_lowess::Float64 = 0.05: Threshold parameter for lowess smoothing.
  • pt_avg::Int = 1: Number of points for generating initial conditions or performing smoothing.
  • win_size::Int = 7: Size of the window used by the change point detection algorithms.
  • pt_smooth_derivative::Int = 0: Number of points for evaluating the specific growth rate. Uses interpolation if less than 2; otherwise, applies a sliding window approach.
  • multiple_scattering_correction::Bool = false: Whether to apply multiple scattering correction using the given calibration curve.
  • method_multiple_scattering_correction::String = "interpolation": Method for performing multiple scattering correction. Options are "interpolation" and "exp_fit".
  • calibration_OD_curve::String = "NA": Path to the .csv file with calibration data for optical density, used if multiple_scattering_correction is true.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • method_peaks_detection::String = "peaks_prominence": Method for peak detection on the dissimilarity curve. Options include "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold).
  • n_bins::Int = 40: Number of bins used to generate the threshold in thr_scan method for peak detection.
  • detect_number_cpd::Bool = false: Whether to test all possible combinations of change points up to n_change_points and return the best based on AICc.
  • fixed_cpd::Bool = false: If true, performs fitting using only the top n_change_points and tests a single combination.
  • penality_CI::Float64 = 8.0: Penalty for enforcing continuity at segment boundaries.
  • size_bootstrap::Float64 = 0.7: Fraction of data used in each bootstrap run, applicable if method_of_fitting is "Bootstrap".
  • correction_AIC::Bool = true: Whether to apply finite sample correction to AIC.
  • auto_diff_method::Any = nothing: Differentiation method for the optimizer, if required.
  • cons::Any = nothing: Constraints for optimization.
  • multistart::Bool = false: Whether to use multistart optimization.
  • n_restart::Int = 50: Number of restarts for multistart optimization, used if multistart is true.
  • opt_params...: Additional optional parameters for the optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).

Output:

  • A data structure containing:
    1. method: A string describing the method used.
    2. Matrix with each row containing: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific GR using NL", "maximum specific GR using data", "objective function value (i.e., loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the selected NL model.
    3. The y-coordinates of the NL fit.
    4. The x-coordinates of the NL fit.
    5. The change point intervals.
source
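
When detect_number_cpd is enabled, several combinations of candidate change points are compared. The sketch below only shows how such combinations could be enumerated; it is an assumption about the search space (including the empty set, i.e. no segmentation), not Kinbiont's code, and the helper name changepoint_subsets and the candidate values are hypothetical.

```julia
# Enumerate every subset of the candidate change points with at most n_max elements.
function changepoint_subsets(candidates::Vector{Float64}, n_max::Int)
    subsets = Vector{Vector{Float64}}()
    for mask in 0:(2^length(candidates) - 1)
        # Bit i of the mask decides whether candidate i is included.
        chosen = [candidates[i] for i in eachindex(candidates)
                  if ((mask >> (i - 1)) & 1) == 1]
        length(chosen) <= n_max && push!(subsets, chosen)
    end
    return subsets
end

subsets = changepoint_subsets([4.0, 9.0, 15.0], 2)
for s in subsets
    println(isempty(s) ? "no change point" : s)
end
```

Each subset would then be fitted segment by segment and scored (for example with AICc), which is why the cost grows quickly with the number of candidate change points.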

Fitting one .csv file

Log-Lin fitting

Kinbiont.fit_one_file_Log_Lin

Kinbiont.fit_one_file_Log_Lin - Function
fit_one_file_Log_Lin(
label_exp::String, 
path_to_data::String; 
path_to_annotation::Any = missing,
path_to_results="NA",
write_res=false,
type_of_smoothing="rolling_avg",
pt_avg=7,
pt_smoothing_derivative=7, 
pt_min_size_of_win=7, 
type_of_win="maximum", 
threshold_of_exp=0.9,
do_blank_subtraction="avg_blank",
avg_replicate=false, 
correct_negative="remove", 
thr_negative=0.01, 
multiple_scattering_correction=false, 
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05, 
verbose=false,
blank_value = 0.0,
blank_array = [0.0],
start_exp_win_thr=0.05
)

Fits a logarithmic-linear model to data from a .csv file. The function assumes that the first column of the file represents time. It evaluates the specific growth rate, identifies an exponential window based on a statistical threshold, and performs a log-linear fitting.
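
The log-linear step can be pictured with a minimal sketch on synthetic data. It assumes an ordinary least-squares line through (time, log(OD)) inside an already chosen exponential window; the variable names and values are illustrative and this is not the package's internal code.

```julia
using Statistics

# Synthetic exponential growth: OD(t) = 0.01 * exp(0.5 t) (illustrative values).
times = collect(0.0:0.5:5.0)
od = 0.01 .* exp.(0.5 .* times)

# Ordinary least-squares line through (t, log(OD)).
y = log.(od)
slope = cov(times, y) / var(times)          # specific growth rate
intercept = mean(y) - slope * mean(times)   # log of the extrapolated OD at t = 0

# Coefficient of determination of the linear fit.
residuals = y .- (intercept .+ slope .* times)
r2 = 1 - sum(residuals .^ 2) / sum((y .- mean(y)) .^ 2)

println((slope, intercept, r2))
```

On noise-free exponential data the slope recovers the growth rate used to generate the curve and R^2 is essentially 1.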

Arguments:

  • label_exp::String: The label of the experiment.
  • path_to_data::String: Path to the .csv file containing the data.
  • path_to_annotation::Any = missing: Optional path to a .csv file with annotation data.
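
A minimal sketch of a data file with time in the first column is given below. The header row with well names ("Time", "A1", "A2"), the file name, and the values are assumptions made for illustration; check the data format section of the documentation for the exact layout the package expects.

```julia
# Build a small table: first column is time, each further column is one well.
times = [0.0, 0.5, 1.0, 1.5]
wells = Dict("A1" => [0.010, 0.013, 0.017, 0.022],
             "A2" => [0.011, 0.014, 0.018, 0.024])
well_names = sort(collect(keys(wells)))

path_to_data = joinpath(mktempdir(), "example_data.csv")   # hypothetical file name
open(path_to_data, "w") do io
    println(io, join(vcat("Time", well_names), ","))        # assumed header row
    for (i, t) in enumerate(times)
        println(io, join(vcat(t, [wells[w][i] for w in well_names]), ","))
    end
end

lines = readlines(path_to_data)
println(lines[1])
```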

Key Arguments:

  • path_to_results="NA": Path to the folder where results will be saved.
  • write_res=false: Boolean flag to indicate whether to write the results to the specified path.
  • type_of_smoothing="rolling_avg": Method of data smoothing. Options are "NO", "rolling_avg", or "lowess".
  • pt_avg=7: Number of points used in the rolling average smoothing.
  • pt_smoothing_derivative=7: Number of points for evaluating specific growth rate. If less than 2, uses interpolation; otherwise, a sliding window approach is used.
  • pt_min_size_of_win=7: Minimum size of the exponential windows in terms of the number of smoothed points.
  • type_of_win="maximum": Method for selecting the exponential phase window. Options are "maximum" or "global_thr".
  • threshold_of_exp=0.9: Threshold in quantile to define the exponential windows, between 0 and 1.
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO", "avg_blank", and "time_avg".
  • blank_value=0.0: Average value of the blank, used only if do_blank_subtraction is not "NO".
  • blank_array=[0.0]: Array of blank values, used only if do_blank_subtraction is not "NO".
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options are "thr_correction", "blank_correction", or "remove".
  • thr_negative=0.01: Threshold value for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction=false: Flag indicating whether to correct for multiple scattering.
  • calibration_OD_curve="NA": Path to calibration data for multiple scattering correction, used if multiple_scattering_correction is true.
  • method_multiple_scattering_correction="interpolation": Method for correcting multiple scattering, options include "interpolation" or "exp_fit".
  • thr_lowess=0.05: Threshold for lowess smoothing.
  • verbose=false: Flag to enable verbose output.
  • start_exp_win_thr=0.05: Minimum OD value that should be reached to start the exponential window.

Output:

  • A data structure containing:
    1. method: Method used for fitting.
    2. A matrix with each row containing:
      • label_exp: Experiment label.
      • name_well: Name of the well or sample.
      • start of exp win: Start of the exponential window.
      • end of exp win: End of the exponential window.
      • Maximum specific GR: Maximum specific growth rate.
      • specific GR: Specific growth rate.
      • 2 sigma CI of GR: Confidence interval of the growth rate (±2 sigma).
      • doubling time: Doubling time.
      • doubling time - 2 sigma: Doubling time minus 2 sigma.
      • doubling time + 2 sigma: Doubling time plus 2 sigma.
      • intercept log-lin fitting: Intercept of the log-linear fitting.
      • 2 sigma intercept: Confidence interval of the intercept (±2 sigma).
      • R^2: Coefficient of determination (R-squared).
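
The doubling-time columns relate to the growth rate through the standard identity t_d = ln(2) / μ. The sketch below assumes that the reported 2 sigma value is the half-width of the interval on the rate and that the bounds are obtained by shifting the rate; how the package actually propagates the interval is not stated here, and the numbers are made up.

```julia
growth_rate = 0.5      # specific growth rate from the fit (made-up value, 1/h)
ci_2sigma = 0.05       # assumed half-width of the 2-sigma interval on the rate

doubling_time = log(2) / growth_rate
# A faster rate gives a shorter doubling time, so the bounds swap sides.
doubling_time_lower = log(2) / (growth_rate + ci_2sigma)
doubling_time_upper = log(2) / (growth_rate - ci_2sigma)

println((doubling_time_lower, doubling_time, doubling_time_upper))
```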
source

Analysis of segments

Kinbiont.segment_gr_analysis_file

Kinbiont.segment_gr_analysis_fileFunction
segment_gr_analysis_file(
path_to_data::String,
label_exp::String; # Label of the experiment
path_to_annotation=missing,
n_max_change_points=0,
type_of_smoothing="rolling_avg",
pt_avg=7,
pt_smoothing_derivative=7,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
type_of_detection="slinding_win",
type_of_curve="original",
do_blank_subtraction="avg_blank",
win_size=14,
n_bins=40,
path_to_results="NA",
write_res=false,
avg_replicate=false,
blank_value=0.0,
blank_array=[0.0],
correct_negative="remove",
thr_negative=0.01,
verbose=false,
)

This function analyzes change points in growth curve data from a .csv file and evaluates various metrics for each segment.

Arguments:

  • path_to_data::String: Path to the .csv file containing the data. The file should have time values in the first column and the OD values of each well in the subsequent columns.
  • label_exp::String: Label or identifier for the experiment.

Key Arguments:

  • path_to_annotation::Any = missing: Optional path to a .csv file containing annotations. If provided, it will be used for averaging replicates.
  • n_max_change_points::Int: Maximum number of change points to consider. The actual number of change points in the results may vary based on the detection method and other parameters.
  • type_of_smoothing="rolling_avg": Method for smoothing the data. Options include "NO" (no smoothing), "rolling_avg" (rolling average), and "lowess" (locally weighted scatterplot smoothing).
  • pt_avg=7: Size of the rolling average window for smoothing, applicable if type_of_smoothing is "rolling_avg".
  • pt_smoothing_derivative=7: Number of points used for the evaluation of specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is applied.
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO" (no blank subtraction), "avg_blank" (subtract average blank value), and "time_avg" (subtract time-averaged blank value).
  • blank_value=0.0: Used as the average blank value if path_to_annotation = missing and do_blank_subtraction != "NO".
  • blank_array=[0.0]: Array of blank values used if path_to_annotation = missing and do_blank_subtraction != "NO".
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options include "thr_correction" (threshold correction), "blank_correction" (impute negative values based on blank distribution), and "remove" (remove negative values).
  • thr_negative=0.01: Threshold used for correcting negative values if correct_negative is "thr_correction". Values below this threshold will be adjusted to this value.
  • multiple_scattering_correction=false: Whether to correct data for multiple scattering using a calibration curve.
  • calibration_OD_curve="NA": Path to the .csv file containing calibration data for multiple scattering correction.
  • method_multiple_scattering_correction="interpolation": Method for performing multiple scattering correction. Options include "interpolation" and "exp_fit".
  • thr_lowess=0.05: Parameter for lowess smoothing if type_of_smoothing is "lowess".
  • type_of_detection="slinding_win": Method for detecting change points. Options include "slinding_win" (sliding window approach) and "lsdd" (least squares density difference).
  • type_of_curve="original": Specifies the curve used for change point detection. Options include "original" (raw time series) and "deriv" (specific growth rate time series).
  • method_peaks_detection="peaks_prominence": Method for detecting peaks in the dissimilarity curve. Options include "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold to select peaks).
  • n_bins=40: Number of bins used to generate thresholds if method_peaks_detection is "thr_scan".
  • win_size=14: Size of the window used by change point detection algorithms.
  • path_to_results="NA": Path to the folder where results will be saved.
  • write_res=false: Whether to write results to the specified folder.
  • avg_replicate=false: Whether to average replicates if an annotation path is provided.
  • verbose=false: Whether to print additional information during processing.
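
The blank subtraction and negative-value options can be sketched as follows. This is a simplified reading of the descriptions above ("avg_blank", "thr_correction", "remove") applied to one made-up well, not the package's implementation; in a real curve the matching time points would be dropped together with the removed readings.

```julia
using Statistics

od = [0.08, 0.09, 0.12, 0.20, 0.35]      # raw readings of one well (made-up values)
blanks = [0.09, 0.10, 0.11]              # blank readings (made-up values)

# Average-blank subtraction: remove the mean blank from every reading.
od_sub = od .- mean(blanks)

# Threshold correction: raise anything below thr_negative up to thr_negative.
thr_negative = 0.01
od_thr = max.(od_sub, thr_negative)

# "remove"-style handling: drop the non-positive points instead.
keep = od_sub .> 0
od_removed = od_sub[keep]

println(od_thr)
println(od_removed)
```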

Output:

If res = segment_gr_analysis_file(...):

  • res[1]: A string describing the method used for segmentation and analysis.
  • res[2]: A matrix with rows containing: [label_exp, name_well, max_specific_gr, min_specific_gr, t_of_max, od_of_max, max_deriv, min_deriv, start_OD_of_segment, delta_OD, segment_number].
  • res[3]: List of change points for each well or sample.
  • res[4]: Preprocessed data including smoothed values and corrected data.
source

ODE fitting

Kinbiont.fit_file_ODE

Kinbiont.fit_file_ODEFunction
fit_file_ODE(
label_exp::String,
path_to_data::String,
model::String, 
param;
path_to_annotation::Any=missing,
integrator=Tsit5(), 
path_to_results="NA", 
loss_type="RE", 
smoothing=false, 
type_of_smoothing="lowess",
verbose=false, 
write_res=false,
pt_avg=1, 
pt_smooth_derivative=7, 
do_blank_subtraction="avg_blank", 
avg_replicate=false,
correct_negative="remove",
thr_negative=0.01, 
multiple_scattering_correction=false, 
method_multiple_scattering_correction="interpolation",
thr_lowess=0.05,
blank_value=0.0,
blank_array=[0.0],
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function fits an ODE model to the data in a .csv file. The function assumes that the first column is time; see the documentation for an example of the data format.

Arguments:

  • label_exp::String: The label of the experiment.
  • path_to_data::String: Path to the .csv file containing the data.
  • model::String: Name of the ODE model to be fitted. See the documentation for the complete list.
  • param: Vector{Float64}, initial guess for the model parameters.

Key Arguments:

  • integrator=Tsit5(): SciML integrator to use. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer from Optimization.jl.
  • loss_type="RE": Type of loss function. Options include "RE", "L2", "L2_derivative", and "blank_weighted_L2"; see the documentation for the full list.
  • avg_replicate=false: Bool, whether to average the replicates. Works only if an annotation path is provided.
  • path_to_annotation::Any = missing: Path to the annotation .csv file.
  • write_res=false: Bool, whether to write the results to the path_to_results folder.
  • path_to_results="NA": String, path to the folder where the results are saved.
  • type_of_smoothing="lowess": String, method for smoothing the data. Options: "NO", "rolling_avg" (rolling average of the data), and "lowess".
  • pt_avg=1: Int, number of points used for the rolling average smoothing.
  • pt_smooth_derivative=7: Int, number of points for evaluating the specific growth rate. If less than 2, an interpolation algorithm is used; otherwise, a sliding window approach is applied.
  • pt_min_size_of_win=7: Int, minimum size of the exponential windows in number of smoothed points.
  • type_of_win="maximum": String, how the exponential phase window is selected ("maximum" or "global_thr").
  • threshold_of_exp=0.9: Float, threshold of the growth rate, expressed as a quantile, used to define the exponential windows; a value between 0 and 1.
  • multiple_scattering_correction=false: Bool, if true, uses the given calibration curve to correct the data for multiple scattering.
  • calibration_OD_curve="NA": String, path to the .csv calibration data, used only if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": String, how the multiple scattering curve is inferred. Options: "interpolation" or "exp_fit"; the latter uses an exponential fit from "Direct optical density determination of bacterial cultures in microplates for high-throughput screening applications".
  • thr_lowess=0.05: Float64, parameter of the lowess smoothing.
  • correct_negative="remove": String, how to treat negative values after blank subtraction. If "thr_correction", a threshold is imposed on the minimum value of the blank-subtracted data; if "blank_correction", the blank distribution is used to impute negative values; if "remove", the negative values are removed.
  • blank_value=0.0: Used only if path_to_annotation = missing and do_blank_subtraction != "NO". It is used as the average value of the blank.
  • blank_array=[0.0]: Used only if path_to_annotation = missing and do_blank_subtraction != "NO". It is used as the array of blank values.
  • do_blank_subtraction="avg_blank": String, how to perform the blank subtraction. Options: "NO", "avg_blank" (subtraction of the average value of the blanks), and "time_avg" (subtraction of the time-averaged value of the blanks).
  • auto_diff_method=nothing: Differentiation method, to be specified if required by the optimizer.
  • cons=nothing: Equation constraints for the optimization.
  • multistart=false: Whether to use multistart optimization. If true, the Tik-Tak restart is used (from Benchmarking Global Optimizers, Arnoud et al., 2019).
  • n_restart=50: Number of restarts, used if multistart=true.
  • opt_params...: Optional parameters for the chosen optimizer (e.g., lb = [0.1, 0.3], ub = [9.0, 1.0], maxiters = 2000000).

Output:

  • A data struct containing:
  1. Method string.
  2. Matrix with each row containing: ["name of model", "well", "param_1", "param_2", ..., "param_n", "maximum specific gr using ode", "maximum specific gr using data", "objective function value (i.e. loss of the solution)"], where "param_1", "param_2", ..., "param_n" are the parameters of the selected ODE, as listed in the documentation.
  3. The fittings.
  4. The preprocessed data.
source

Kinbiont.fit_file_custom_ODE

Kinbiont.fit_file_custom_ODEFunction
fit_file_custom_ODE(
label_exp::String,
path_to_data::String, 
model::Any, 
param::Vector{Float64},
n_equation::Int;
path_to_annotation::Any=missing,
integrator=Tsit5(),
path_to_results="NA", 
loss_type="RE",
smoothing=false, 
type_of_smoothing="lowess",
verbose=false,
write_res=false, 
pt_avg=1, 
pt_smooth_derivative=7,
do_blank_subtraction="avg_blank",
avg_replicate=false,
correct_negative="remove", 
thr_negative=0.01, 
multiple_scattering_correction=false, 
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",  
thr_lowess=0.05,
blank_value=0.0,
blank_array=[0.0],
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

Fits a customizable Ordinary Differential Equation (ODE) model to a dataset from a .csv file.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data.
  • model::Any: Function representing the ODE model to be fitted. See documentation for examples.
  • param::Vector{Float64}: Initial guesses for the model parameters.
  • n_equation::Int: Number of ODEs in the system.
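
A custom model could look like the sketch below, which assumes the in-place form f!(du, u, p, t) used by SciML solvers. The function name, the logistic right-hand side, and the parameter values are hypothetical and serve only to show how param and n_equation line up with the model.

```julia
# Hypothetical one-equation model in the in-place form used by SciML solvers:
# the function writes the derivative into du.
function logistic_custom!(du, u, param, t)
    growth_rate, carrying_capacity = param
    du[1] = growth_rate * u[1] * (1 - u[1] / carrying_capacity)
    return nothing
end

param_guess = [0.5, 1.2]   # initial guesses, one per model parameter
n_equation = 1             # the system above has a single state variable

# Evaluate the right-hand side once to check shapes before fitting.
du = zeros(n_equation)
logistic_custom!(du, [0.6], param_guess, 0.0)
println(du)
```

Evaluating the right-hand side by hand like this is a cheap way to catch indexing mistakes before handing the function to a solver.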

Key Arguments:

  • integrator=Tsit5(): SciML integrator to use. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer for parameter fitting from Optimization.jl.
  • loss_type="RE": Type of loss function. Options include "RE", "L2", "L2_derivative", and "blank_weighted_L2", see documentation for the full list.
  • path_to_annotation::Any=missing: Path to a .csv file with annotation data. Required if avg_replicate=true.
  • path_to_results="NA": Path to the folder where results will be saved.
  • write_res=false: Boolean flag indicating whether to write the results to the specified path.
  • smoothing=false: Boolean flag for applying data smoothing.
  • type_of_smoothing="lowess": Method for smoothing the data. Options are "NO", "rolling_avg", or "lowess".
  • pt_avg=1: Number of points for rolling average smoothing.
  • pt_smooth_derivative=7: Number of points for evaluating the specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is applied.
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO", "avg_blank", and "time_avg".
  • blank_value=0.0: Average value of the blank, used only if do_blank_subtraction is not "NO".
  • blank_array=[0.0]: Array of blank values, used only if do_blank_subtraction is not "NO".
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options are "thr_correction", "blank_correction", or "remove".
  • thr_negative=0.01: Threshold value for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction=false: Flag indicating whether to correct for multiple scattering.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used only if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for correcting multiple scattering. Options include "interpolation" or "exp_fit".
  • thr_lowess=0.05: Threshold for lowess smoothing.
  • verbose=false: Boolean flag for verbose output.
  • multistart=false: Flag to use multistart optimization. If true, the Tik-Tak restart is used (from Benchmarking Global Optimizers, Arnoud et al., 2019).
  • n_restart=50: Number of restarts used if multistart=true.
  • auto_diff_method=nothing: Method for differentiation, if required by the optimizer.
  • cons=nothing: Constraints for optimization.
  • opt_params...: Additional optional parameters for the optimizer, such as lb, ub, and maxiters.

Output:

  • A data structure containing:
    1. method: Method used for fitting.
    2. A matrix where each row includes:
      • name of model: Name of the fitted ODE model.
      • well: Identifier for the data well or sample.
      • param_1, param_2, ..., param_n: Parameters of the selected ODE model.
      • maximum specific gr using ode: Maximum specific growth rate calculated using the ODE.
      • maximum specific gr using data: Maximum specific growth rate calculated from the data.
      • objective function value: Loss value indicating the fit quality.
    3. The fittings (model outputs).
    4. The preprocessed data.
source

Kinbiont.ODE_model_selection_file

Kinbiont.ODE_model_selection_fileFunction
ODE_model_selection_file(
label_exp::String, 
path_to_data::String, 
models_list::Vector{String}, 
param_array::Any;
lb_param_array::Any=nothing, 
ub_param_array::Any=nothing, 
path_to_annotation::Any=missing,
integrator=Tsit5(), 
path_to_results="NA", 
loss_type="L2", 
type_of_smoothing="lowess",
beta_smoothing_ms=2.0,
verbose=false, 
write_res=false, 
pt_avg=1, 
pt_smooth_derivative=7,
do_blank_subtraction="avg_blank", 
avg_replicate=false,
correct_negative="remove",
thr_negative=0.01,  
multiple_scattering_correction=false, 
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",  
thr_lowess=0.05,
correction_AIC=true,
blank_value=0.0,
blank_array=[0.0],
multistart=false,
n_restart=50,
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs model selection for ordinary differential equations (ODEs) on a dataset provided in a .csv file.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data.
  • models_list::Vector{String}: List of ODE models to evaluate. See documentation for the full list.
  • param_array::Any: Initial guesses for the parameters of the models. It can be a matrix where each row corresponds to initial guesses for the models in models_list.

Key Arguments:

  • lb_param_array::Any=nothing: Lower bounds for the model parameters. Must be compatible with the models.
  • ub_param_array::Any=nothing: Upper bounds for the model parameters. Must be compatible with the models.
  • path_to_annotation::Any=missing: Path to the .csv file with annotation data. Required if avg_replicate=true.
  • integrator=Tsit5(): SciML integrator to use. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer used for parameter fitting from Optimization.jl.
  • loss_type="L2": Type of loss function. Options include "L2", "RE", "L2_derivative", and "blank_weighted_L2", see documentation for the full list.
  • path_to_results="NA": Path to the folder where results will be saved.
  • write_res=false: Boolean flag indicating whether to write the results to the specified path.
  • type_of_smoothing="lowess": Method for smoothing the data. Options are "NO", "rolling_avg", or "lowess".
  • pt_avg=1: Number of points for rolling average smoothing.
  • pt_smooth_derivative=7: Number of points for evaluating the specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is applied.
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO", "avg_blank", and "time_avg".
  • blank_value=0.0: Average value of the blank, used only if do_blank_subtraction is not "NO".
  • blank_array=[0.0]: Array of blank values, used only if do_blank_subtraction is not "NO".
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options are "thr_correction", "blank_correction", or "remove".
  • thr_negative=0.01: Threshold value for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction=false: Flag indicating whether to correct for multiple scattering.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used only if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for correcting multiple scattering. Options include "interpolation" or "exp_fit".
  • thr_lowess=0.05: Threshold for lowess smoothing.
  • correction_AIC=true: Boolean flag indicating whether to apply finite sample correction to the Akaike Information Criterion (AIC).
  • beta_smoothing_ms=2.0: Penalty parameter for AIC (or AICc) evaluation.
  • verbose=false: Boolean flag for verbose output.
  • multistart=false: Flag to use multistart optimization. If true, the Tik-Tak restart is used (from Benchmarking Global Optimizers, Arnoud et al., 2019).
  • n_restart=50: Number of restarts used if multistart=true.
  • auto_diff_method=nothing: Method for differentiation, if required by the optimizer.
  • cons=nothing: Constraints for optimization.
  • opt_params...: Additional optional parameters for the optimizer, such as lb, ub, and maxiters.
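
To make the roles of correction_AIC and beta_smoothing_ms more concrete, here is a sketch of a least-squares AIC with a tunable per-parameter penalty and the usual finite-sample correction. The formula is the common textbook form and is an assumption; the function name aic_least_squares is hypothetical and the package's own evaluation may differ in detail.

```julia
# Least-squares AIC with a tunable penalty `beta` per parameter (beta = 2 gives
# the textbook AIC), plus the usual small-sample correction term.
function aic_least_squares(data, fit, n_param; beta=2.0, correction=true)
    n = length(data)
    rss = sum((data .- fit) .^ 2)
    aic = n * log(rss / n) + beta * n_param
    if correction
        aic += (2 * n_param^2 + 2 * n_param) / (n - n_param - 1)
    end
    return aic
end

data = collect(1.0:10.0)
fit = data .+ 0.1            # same residuals for both candidate models

aic_small = aic_least_squares(data, fit, 2)   # model with 2 parameters
aic_large = aic_least_squares(data, fit, 4)   # model with 4 parameters
println((aic_small, aic_large))
```

With identical residuals the model with fewer parameters gets the lower score, which is the behaviour model selection relies on.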

Output:

  • A data structure containing:
    1. method: The method used for model selection.
    2. A matrix where each row includes:
      • name of model: Name of the ODE model.
      • well: Identifier for the data well or sample.
      • param_1, param_2, ..., param_n: Parameters of the selected ODE model.
      • maximum specific gr using ode: Maximum specific growth rate calculated using the ODE.
      • maximum specific gr using data: Maximum specific growth rate calculated from the data.
      • objective function value: Loss value indicating the fit quality.
    3. The fittings (model outputs).
    4. The preprocessed data.
source

NL fitting

Kinbiont.fit_NL_model_file

Kinbiont.fit_NL_model_fileFunction
fit_NL_model_file(
label_exp::String,
path_to_data::String,
model::Any,
u0;
lb_param::Vector{Float64}=nothing,
ub_param::Vector{Float64}=nothing,
path_to_annotation::Any=missing,
method_of_fitting="NA",
nrep=100,
errors_estimation=false,
path_to_results="NA",
loss_type="RE",
smoothing=false,
type_of_smoothing="lowess",
verbose=false,
write_res=false,
pt_avg=1,
pt_smooth_derivative=7,
do_blank_subtraction="avg_blank",
avg_replicate=false,
correct_negative="remove",
thr_negative=0.01,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
penality_CI=8.0,
size_bootstrap=0.7,
blank_value=0.0,
blank_array=[0.0],
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs nonlinear (NL) model fitting on the data in a given .csv file.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data. The file should be formatted with time in the first column and corresponding data values in subsequent columns.
  • model::Any: Function or string representing the NL model to be fitted.
  • u0: Initial guess for the parameters of the NL model.
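
A user-defined NL model might be written as in the sketch below. The calling convention model(p, times), the function name, and all numeric values are assumptions for illustration; the bounds follow the lb_param and ub_param keywords from the signature above and are ordered like u0.

```julia
# Hypothetical NL model: exponential growth evaluated at all time points.
# Assumed calling convention: model(p, times) returns the predicted curve.
function nl_exponential(p, times)
    return p[1] .* exp.(p[2] .* times)
end

u0 = [0.01, 0.5]                 # initial guess: starting OD and growth rate
lb_param = [0.0, 0.0]            # lower bounds, same order as u0
ub_param = [1.0, 5.0]            # upper bounds, same order as u0

times = collect(0.0:1.0:4.0)
prediction = nl_exponential(u0, times)
println(prediction)
```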

Key Arguments:

  • lb_param::Vector{Float64} = nothing: Lower bounds for the parameters of the model.
  • ub_param::Vector{Float64} = nothing: Upper bounds for the parameters of the model.
  • path_to_annotation::Any = missing: Path to a .csv file with annotation data, if available.
  • method_of_fitting::String = "NA": Method for NL fitting. Options include "Bootstrap", "Normal", and "Morris_sensitivity".
  • nrep::Int = 100: Number of repetitions for methods like Morris sensitivity or bootstrap. Used only if method_of_fitting is "Bootstrap" or "Morris_sensitivity".
  • path_to_results::String = "NA": Path to the folder where results will be saved.
  • loss_type::String = "RE": Type of loss function used. Options include "RE" (relative error), "L2", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • smoothing::Bool = false: Whether to apply smoothing to the data.
  • type_of_smoothing::String = "lowess": Type of smoothing. Options include "NO", "rolling_avg", and "lowess".
  • pt_avg::Int = 1: Number of points for rolling average smoothing or initial condition generation.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. If less than 2, uses an interpolation algorithm.
  • do_blank_subtraction::String = "avg_blank": Method for blank subtraction. Options include "NO", "avg_blank", and "time_avg".
  • avg_replicate::Bool = false: If true, averages replicates if annotation data is provided.
  • correct_negative::String = "remove": Method for treating negative values after blank subtraction. Options include "thr_correction", "blank_correction", and "remove".
  • thr_negative::Float64 = 0.01: Threshold for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction::Bool = false: If true, uses a calibration curve to correct data for multiple scattering.
  • method_multiple_scattering_correction::String = "interpolation": Method for correcting multiple scattering. Options include "interpolation" and "exp_fit".
  • calibration_OD_curve::String = "NA": Path to the .csv file containing calibration data, used if multiple_scattering_correction is true.
  • thr_lowess::Float64 = 0.05: Parameter for lowess smoothing.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • penality_CI::Float64 = 8.0: Penalty parameter for ensuring continuity in segmentation.
  • size_bootstrap::Float64 = 0.7: Fraction of data used for each bootstrap run, used only if method_of_fitting is "Bootstrap".
  • correction_AIC::Bool = true: If true, performs finite sample correction for AIC.
  • blank_value::Float64 = 0.0: Average value of blanks used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • blank_array::Vector{Float64} = [0.0]: Array of blank values used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • auto_diff_method::Any = nothing: Differentiation method to be specified if required by the optimizer.
  • cons::Any = nothing: Constraints for optimization equations.
  • multistart::Bool = false: If true, performs multistart optimization.
  • n_restart::Int = 50: Number of restarts for multistart optimization if multistart is true.
  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer used for the fitting process. Default is BBO_adaptive_de_rand_1_bin_radiuslimited().
  • opt_params...: Additional optional parameters for the optimizer.

Outputs:

  • Method String: Description of the fitting method used.
  • Results Matrix: A matrix where each row contains:
    • "name of model": The name of the model used.
    • "well": The name of the well.
    • "param_1", "param_2", ..., "param_n": Parameters of the selected NL model.
    • "maximum specific gr using NL": Maximum specific growth rate obtained using the NL model.
    • "maximum specific gr using data": Maximum specific growth rate obtained from the data.
    • "objective function value": The value of the objective function (i.e., loss of the solution).
  • Fittings: The numerical values of the fitted solution.
  • Times: The time points of the data.
  • The data set that was fitted.
source

Kinbiont.fit_NL_model_selection_file

Kinbiont.fit_NL_model_selection_fileFunction
fit_NL_model_selection_file(
label_exp::String,
path_to_data::String,
list_model_function::Any,
list_u0;
lb_param_array::Any = nothing,
ub_param_array::Any = nothing,
path_to_annotation::Any = missing,
method_of_fitting="Normal",
nrep=10,
path_to_results="NA",
loss_type="RE",
smoothing=false,
type_of_smoothing="lowess",
verbose=false,
write_res=false,
pt_avg=1,
pt_smooth_derivative=7,
do_blank_subtraction="avg_blank",
avg_replicate=false,
correct_negative="remove",
thr_negative=0.01,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
thr_lowess=0.05,
beta_smoothing_ms=2.0,
penality_CI=8.0,
size_bootstrap=0.7,
correction_AIC=true,
blank_value = 0.0,
blank_array = [0.0],
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs nonlinear (NL) model selection on a segmented time series using AIC or AICc. It operates on an entire .csv file of data.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data. The file should be formatted with time in the first column and corresponding data values in subsequent columns.
  • list_model_function::Any: Array of functions or strings representing the NL models to be tested.
  • list_u0::Any: Initial guesses for the parameters of each NL model.

Key Arguments:

  • lb_param_array::Vector{Vector{Float64}} = nothing: Array of lower bounds for the parameters of each model. Each entry corresponds to the parameter bounds for a specific model.
  • ub_param_array::Vector{Vector{Float64}} = nothing: Array of upper bounds for the parameters of each model. Each entry corresponds to the parameter bounds for a specific model.
  • path_to_annotation::Any = missing: Path to a .csv file with annotation data, if available.
  • method_of_fitting::String = "Normal": Method for NL fitting. Options include "Bootstrap", "Normal", and "Morris_sensitivity".
  • nrep::Int = 10: Number of repetitions for methods like Morris sensitivity or bootstrap. Used only if method_of_fitting is "Bootstrap" or "Morris_sensitivity".
  • path_to_results::String = "NA": Path to the folder where results will be saved.
  • loss_type::String = "RE": Type of loss function used. Options include "RE" (relative error), "L2", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • smoothing::Bool = false: Whether to apply smoothing to the data.
  • type_of_smoothing::String = "lowess": Type of smoothing. Options include "NO", "rolling_avg", and "lowess".
  • pt_avg::Int = 1: Number of points for rolling average smoothing or initial condition generation.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. If less than 2, uses an interpolation algorithm.
  • do_blank_subtraction::String = "avg_blank": Method for blank subtraction. Options include "NO", "avg_blank", and "time_avg".
  • avg_replicate::Bool = false: If true, averages replicates if annotation data is provided.
  • correct_negative::String = "remove": Method for treating negative values after blank subtraction. Options include "thr_correction", "blank_correction", and "remove".
  • thr_negative::Float64 = 0.01: Threshold for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction::Bool = false: If true, uses a calibration curve to correct data for multiple scattering.
  • method_multiple_scattering_correction::String = "interpolation": Method for correcting multiple scattering. Options include "interpolation" and "exp_fit".
  • calibration_OD_curve::String = "NA": Path to the .csv file containing calibration data, used if multiple_scattering_correction is true.
  • thr_lowess::Float64 = 0.05: Parameter for lowess smoothing.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • penality_CI::Float64 = 8.0: Penalty parameter for ensuring continuity in segmentation.
  • size_bootstrap::Float64 = 0.7: Fraction of data used for each bootstrap run, used only if method_of_fitting is "Bootstrap".
  • correction_AIC::Bool = true: If true, performs finite sample correction for AIC.
  • blank_value::Float64 = 0.0: Average value of blanks used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • blank_array::Vector{Float64} = [0.0]: Array of blank values used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • auto_diff_method::Any = nothing: Differentiation method to be specified if required by the optimizer.
  • cons::Any = nothing: Constraints for optimization equations.
  • multistart::Bool = false: If true, performs multistart optimization.
  • n_restart::Int = 50: Number of restarts for multistart optimization if multistart is true.
  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer used for the fitting process. Default is BBO_adaptive_de_rand_1_bin_radiuslimited().
  • opt_params...: Additional optional parameters for the optimizer.

Outputs:

A data struct containing:

  • Method String: Description of the fitting method used.
  • Results Matrix: A matrix where each row contains:
    • "name of model": The name of the model used.
    • "well": The name of the well.
    • "param_1", "param_2", ..., "param_n": Parameters of the selected NL model.
    • "maximum specific gr using NL": Maximum specific growth rate obtained using the NL model.
    • "maximum specific gr using data": Maximum specific growth rate obtained from the data.
    • "objective function value": The value of the objective function (i.e., loss of the solution).
  • Fittings: The numerical values of the fitted solution.
  • Times: The time points of the data.
  • The data set that was fitted.
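
As a rough illustration of how beta_smoothing_ms and correction_AIC could enter the model-selection score, the sketch below uses the common least-squares form of AIC. The helper aic_score and the residual values are hypothetical, and whether Kinbiont evaluates the score with exactly this expression is an assumption.

```julia
# Hypothetical helper (not part of Kinbiont): least-squares AIC with an adjustable penalty.
# `beta` plays the role of beta_smoothing_ms; the classical AIC uses 2.0.
function aic_score(rss::Float64, n_points::Int, n_params::Int;
                   beta::Float64 = 2.0, correction::Bool = true)
    score = n_points * log(rss / n_points) + beta * n_params
    if correction
        # Finite-sample (AICc) term; only meaningful when n_points > n_params + 1.
        score += 2 * n_params * (n_params + 1) / (n_points - n_params - 1)
    end
    return score
end

# Two candidate models fitted to the same 20 points (made-up residual sums of squares).
candidates = [("model_A", 0.80, 2), ("model_B", 0.78, 5)]
scores = [aic_score(rss, 20, k) for (_, rss, k) in candidates]
best_model = candidates[argmin(scores)][1]
```

With these made-up values the small gain in fit of model_B does not compensate for its three extra parameters, so model_A has the lower score.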
source

Segmented fitting

Kinbiont.fit_NL_segmentation_file

Kinbiont.fit_NL_segmentation_fileFunction
fit_NL_segmentation_file(
label_exp::String,
path_to_data::String,
list_model_function::Any,
list_u0,
n_change_points::Int;
lb_param_array::Vector{Vector{Float64}}=nothing,
ub_param_array::Vector{Vector{Float64}}=nothing,
path_to_annotation::Any = missing,
method_of_fitting="NA",
nrep=10,
path_to_results="NA",
loss_type="RE",
smoothing=false,
type_of_smoothing="lowess",
verbose=false,
write_res=false,
pt_avg=1,
pt_smooth_derivative=7,
do_blank_subtraction="avg_blank",
avg_replicate=false,
correct_negative="remove",
thr_negative=0.01,
multiple_scattering_correction=false,
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",
size_bootstrap=0.7,
thr_lowess=0.05,
detect_number_cpd=true,
type_of_detection="sliding_win",
type_of_curve="original",
fixed_cpd=false,
penality_CI=8.0,
beta_smoothing_ms=2.0,
win_size=7,
n_bins=40,
correction_AIC=true,
blank_value = 0.0,
blank_array = [0.0],
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
auto_diff_method=nothing,
multistart=false,
n_restart=50,
cons=nothing,
opt_params...

)

This function performs nonlinear (NL) model selection on a segmented time series using AIC or AICc. It operates on an entire .csv file of data.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data. The file should be formatted with time in the first column and corresponding data values in subsequent columns.
  • list_model_function::Any: Array of functions or strings representing the NL models to be tested.
  • list_u0::Any: Initial guesses for the parameters of each NL model.
  • n_change_points::Int: Maximum number of change points to consider in the segmentation process.

Key Arguments:

  • lb_param_array::Vector{Vector{Float64}} = nothing: Array of lower bounds for the parameters of each model. Each entry corresponds to the parameter bounds for a specific model.
  • ub_param_array::Vector{Vector{Float64}} = nothing: Array of upper bounds for the parameters of each model. Each entry corresponds to the parameter bounds for a specific model.
  • path_to_annotation::Any = missing: Path to a .csv file with annotation data, if available.
  • method_of_fitting::String = "NA": Method for NL fitting. Options include "Bootstrap", "Normal", and "Morris_sensitivity". Default is "NA".
  • nrep::Int = 10: Number of repetitions for methods like Morris sensitivity or bootstrap. Used only if method_of_fitting is "Bootstrap" or "Morris_sensitivity".
  • path_to_results::String = "NA": Path to the folder where results will be saved.
  • loss_type::String = "RE": Type of loss function used. Options include "RE" (Residual Error), "L2", "L2_derivative", and "blank_weighted_L2". See documentation for the full list.
  • smoothing::Bool = false: Whether to apply smoothing to the data.
  • type_of_smoothing::String = "lowess": Type of smoothing. Options include "NO", "rolling_avg", and "lowess".
  • pt_avg::Int = 1: Number of points for rolling average smoothing or initial condition generation.
  • pt_smooth_derivative::Int = 7: Number of points for evaluating the specific growth rate. If less than 2, uses an interpolation algorithm.
  • do_blank_subtraction::String = "avg_blank": Method for blank subtraction. Options include "NO", "avg_subtraction", and "time_avg".
  • avg_replicate::Bool = false: If true, averages replicates if annotation data is provided.
  • correct_negative::String = "remove": Method for treating negative values after blank subtraction. Options include "thr_correction", "blank_correction", and "remove".
  • thr_negative::Float64 = 0.01: Threshold for correcting negative values if correct_negative is "thr_correction".
  • multiple_scattering_correction::Bool = false: If true, uses a calibration curve to correct data for multiple scattering.
  • method_multiple_scattering_correction::String = "interpolation": Method for correcting multiple scattering. Options include "interpolation" and "exp_fit".
  • calibration_OD_curve::String = "NA": Path to the .csv file containing calibration data, used if multiple_scattering_correction is true.
  • thr_lowess::Float64 = 0.05: Parameter for lowess smoothing.
  • beta_smoothing_ms::Float64 = 2.0: Penalty parameter for AIC (or AICc) evaluation.
  • penality_CI::Float64 = 8.0: Penalty parameter for ensuring continuity in segmentation.
  • size_bootstrap::Float64 = 0.7: Fraction of data used for each bootstrap run, used only if method_of_fitting is "Bootstrap".
  • correction_AIC::Bool = true: If true, performs finite sample correction for AIC.
  • blank_value::Float64 = 0.0: Average value of blanks used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • blank_array::Vector{Float64} = [0.0]: Array of blank values used if do_blank_subtraction is not "NO" and path_to_annotation is missing.
  • type_of_detection::String = "sliding_win": Method for detecting change points. Options include "sliding_win" for sliding window approach and "lsdd" for least square density difference (LSDD) from ChangePointDetection.jl.
  • type_of_curve::String = "original": Specifies the curve on which change point detection is performed. Options include "original" (the original time series) and "deriv" (the specific growth rate time series).
  • fixed_cpd::Bool = false: If true, returns the fitting using exactly n_change_points change points.
  • detect_number_cpd::Bool = true: If true, all possible combinations of lengths 1, 2, ..., n_change_points are tested, and the best combination for AICc is returned.
  • method_peaks_detection::String = "peaks_prominence": Method for peak detection on the dissimilarity curve. Options include "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold to select peaks).
  • n_bins::Int = 40: Number of bins used to generate the threshold for peak detection if method_peaks_detection is "thr_scan".
  • win_size::Int = 7: Size of the window used by the change point detection algorithms.
  • auto_diff_method::Any = nothing: Differentiation method to be specified if required by the optimizer.
  • cons::Any = nothing: Constraints for optimization equations.
  • multistart::Bool = false: If true, performs multistart optimization.
  • n_restart::Int = 50: Number of restarts for multistart optimization if multistart is true.
  • optimizer::Any = BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer used for the fitting process. Default is BBO_adaptive_de_rand_1_bin_radiuslimited().
  • opt_params...: Additional optional parameters for the optimizer.

Outputs:

  • Method String: Description of the fitting method used.

  • Results Matrix: A matrix where each row contains:

    • "name of model": The name of the model used.
    • "well": The name of the well.
    • "param_1", "param_2", ..., "param_n": Parameters of the selected NL model.
    • "maximum specific gr using NL": Maximum specific growth rate obtained using the NL model.
    • "maximum specific gr using data": Maximum specific growth rate obtained from the data.
    • "objective function value": The value of the objective function (i.e., loss of the solution).
  • Fittings: Details of the model fittings.

  • Preprocessed Data: Data that has been preprocessed according to the specified arguments.

  • Change Points Time: Time of change points detected for each well.

  • AIC (or AICc) Score: AIC or AICc score of the best model for each well.
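
The "sliding_win" detection can be pictured with the minimal sketch below: a dissimilarity curve compares the windows before and after each point, and the highest peaks become change points. The function names, the mean-difference dissimilarity, and the ranking by peak height (rather than prominence) are simplifying assumptions, not Kinbiont's implementation.

```julia
using Statistics

# Dissimilarity at index i: absolute difference between the mean of the
# `win_size` points before i and the mean of the `win_size` points from i onwards.
function sliding_dissimilarity(signal::Vector{Float64}, win_size::Int)
    n = length(signal)
    dissimilarity = zeros(n)
    for i in (win_size + 1):(n - win_size + 1)
        left = mean(signal[(i - win_size):(i - 1)])
        right = mean(signal[i:(i + win_size - 1)])
        dissimilarity[i] = abs(right - left)
    end
    return dissimilarity
end

# Keep the indices of the `n_change_points` highest local maxima, in time order.
function pick_change_points(dissimilarity::Vector{Float64}, n_change_points::Int)
    peaks = Int[i for i in 2:(length(dissimilarity) - 1)
                if dissimilarity[i] > dissimilarity[i - 1] &&
                   dissimilarity[i] >= dissimilarity[i + 1]]
    ranked = sort(peaks; by = i -> dissimilarity[i], rev = true)
    return sort(ranked[1:min(n_change_points, length(ranked))])
end

# Toy signal with a single jump between the 20th and 21st sample.
signal = vcat(fill(0.1, 20), fill(1.0, 20))
change_points = pick_change_points(sliding_dissimilarity(signal, 7), 1)
```

On this toy signal the only peak of the dissimilarity curve sits at index 21, the first sample after the jump.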

source

Kinbiont.segmentation_ODE_file

Kinbiont.segmentation_ODE_fileFunction
segmentation_ODE_file(
label_exp::String,
path_to_data::String, 
list_of_models::Vector{String}, 
param_array::Any, 
n_max_change_points::Int;
lb_param_array::Any=nothing, 
ub_param_array::Any=nothing, 
path_to_annotation::Any=missing,
detect_number_cpd=true,
fixed_cpd=false,
integrator=Tsit5(), 
type_of_loss="L2", 
type_of_detection="sliding_win",
type_of_curve="original",
do_blank_subtraction="avg_blank",
correct_negative="remove",
thr_negative=0.01,
pt_avg=3, 
smoothing=true, 
path_to_results="NA",
win_size=7, 
pt_smooth_derivative=0,
beta_smoothing_ms=2.0,
avg_replicate=false,
multiple_scattering_correction=false, 
method_multiple_scattering_correction="interpolation",
calibration_OD_curve="NA",  
write_res=false,
save_all_model=false,
method_peaks_detection="peaks_prominence",
n_bins=40,
type_of_smoothing="lowess",
thr_lowess=0.05,
verbose=false,
correction_AIC=true,
blank_value=0.0,
blank_array=[0.0],
optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(),
multistart=false,
n_restart=50,
auto_diff_method=nothing,
cons=nothing,
opt_params...
)

This function performs model selection for ordinary differential equation (ODE) models while segmenting the time series data using change point detection algorithms.

Arguments:

  • label_exp::String: The label for the experiment.
  • path_to_data::String: Path to the .csv file containing the data.
  • list_of_models::Vector{String}: List of ODE models to evaluate. See documentation for the full list.
  • param_array::Any: Initial guesses for the parameters of the models.
  • n_max_change_points::Int: Maximum number of change points to use. The number of detected change points may vary based on the type_of_detection and fixed_cpd settings.

Key Arguments:

  • lb_param_array::Any=nothing: Lower bounds for the model parameters. Must be compatible with the models.
  • ub_param_array::Any=nothing: Upper bounds for the model parameters. Must be compatible with the models.
  • path_to_annotation::Any=missing: Path to the .csv file with annotation data. Required if avg_replicate=true.
  • detect_number_cpd=true: Boolean flag indicating whether to detect the optimal number of change points. If true, all combinations from length 1 to n_max_change_points are tested, and the best is selected based on AICc.
  • fixed_cpd=false: Boolean flag indicating whether to use a fixed number of change points. If true, the fitting will use exactly n_max_change_points.
  • integrator=Tsit5(): SciML integrator to use. For piecewise models, consider using KenCarp4(autodiff=true).
  • optimizer=BBO_adaptive_de_rand_1_bin_radiuslimited(): Optimizer from Optimization.jl used for parameter fitting.
  • type_of_loss="L2": Type of loss function. Options include "L2", "RE", "L2_derivative", and "blank_weighted_L2", see documentation for the full list.
  • path_to_results="NA": Path to the folder where results will be saved.
  • write_res=false: Boolean flag indicating whether to write the results to the specified path.
  • type_of_smoothing="lowess": Method for smoothing the data. Options are "NO", "rolling_avg", or "lowess".
  • pt_avg=3: Number of points for rolling average smoothing.
  • pt_smooth_derivative=0: Number of points for evaluating the specific growth rate. If less than 2, interpolation is used; otherwise, a sliding window approach is applied.
  • win_size=7: Size of the window used by the change point detection algorithms.
  • type_of_detection="sliding_win": Change point detection algorithm to use. Options include "sliding_win" and "lsdd" (Least Squares Density Difference from ChangePointDetection.jl).
  • type_of_curve="original": Curve on which change point detection is performed. Options are "original" (original time series) or "deriv" (specific growth rate time series).
  • method_peaks_detection="peaks_prominence": Method for detecting peaks in the dissimilarity curve. Options are "peaks_prominence" (orders peaks by prominence) and "thr_scan" (uses a threshold to choose peaks).
  • n_bins=40: Number of bins used to generate the threshold for peak detection if method_peaks_detection="thr_scan".
  • smoothing=true: Boolean flag to enable or disable smoothing.
  • multiple_scattering_correction=false: Boolean flag indicating whether to correct for multiple scattering.
  • calibration_OD_curve="NA": Path to the .csv file with calibration data, used only if multiple_scattering_correction=true.
  • method_multiple_scattering_correction="interpolation": Method for correcting multiple scattering. Options include "interpolation" or "exp_fit".
  • thr_lowess=0.05: Threshold for lowess smoothing.
  • correct_negative="remove": Method for handling negative values after blank subtraction. Options include "thr_correction", "blank_correction", and "remove".
  • blank_value=0.0: Average value of the blank, used only if do_blank_subtraction is not "NO".
  • blank_array=[0.0]: Array of blank values, used only if do_blank_subtraction is not "NO".
  • do_blank_subtraction="avg_blank": Method for blank subtraction. Options include "NO", "avg_subtraction", and "time_avg".
  • correction_AIC=true: Boolean flag indicating whether to apply finite sample correction to AIC (or AICc).
  • beta_smoothing_ms=2.0: Penalty parameter for AIC (or AICc) evaluation.
  • auto_diff_method=nothing: Method for differentiation, if required by the optimizer.
  • cons=nothing: Constraints for optimization.
  • multistart=false: Boolean flag to use multistart optimization. If true, Tik-Tak restarts are used (from "Benchmarking Global Optimizers", Arnoud et al., 2019).
  • n_restart=50: Number of restarts used if multistart=true.
  • opt_params...: Additional optional parameters for the optimizer, such as lb, ub, and maxiters.

Output:

  • A data structure containing:
    1. method: The method used for model selection.
    2. A matrix where each row includes:
      • name of model: Name of the ODE model.
      • well: Identifier for the data well or sample.
      • param_1, param_2, ..., param_n: Parameters of the selected ODE model.
      • maximum specific gr using ode: Maximum specific growth rate calculated using the ODE.
      • maximum specific gr using data: Maximum specific growth rate calculated from the data.
      • objective function value: Loss value indicating the fit quality.
    3. The fittings (model outputs).
    4. The preprocessed data.
    5. The change points detected in the data for each well.
    6. The AIC (or AICc) score for each well.
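
The difference between detect_number_cpd and fixed_cpd can be sketched as an enumeration of candidate change-point sets. The helpers choose and candidate_sets are hypothetical, and the detected indices are made up; how Kinbiont generates and scores the candidates internally is not shown here.

```julia
# All ways to choose `k` items from `items`, preserving their order (stdlib only).
function choose(items::Vector{Int}, k::Int)
    k == 0 && return [Int[]]
    length(items) < k && return Vector{Int}[]
    with_first = Vector{Int}[vcat(items[1], rest) for rest in choose(items[2:end], k - 1)]
    without_first = choose(items[2:end], k)
    return vcat(with_first, without_first)
end

# Candidate sets of change points: every size from 1 to the maximum,
# or only the maximum size when `fixed_cpd` is true.
function candidate_sets(detected::Vector{Int}, n_max_change_points::Int;
                        fixed_cpd::Bool = false)
    sizes = fixed_cpd ? (n_max_change_points:n_max_change_points) : (1:n_max_change_points)
    return reduce(vcat, [choose(detected, k) for k in sizes]; init = Vector{Int}[])
end

detected = [12, 25, 31]                       # made-up change point indices
all_sets = candidate_sets(detected, 2)        # sizes 1 and 2
fixed_sets = candidate_sets(detected, 2; fixed_cpd = true)  # size 2 only
```

Each candidate set would then be fitted segment by segment, and the set with the best AIC (or AICc) kept.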
source

ML downstream analysis

Kinbiont.downstream_decision_tree_regression

Kinbiont.downstream_decision_tree_regressionFunction
downstream_decision_tree_regression(
Kinbiont_results::Matrix{Any},
feature_matrix::Matrix{Any},
row_to_learn::Int;
max_depth = -1,
verbose = true,
pruning_purity = 1.0,
min_samples_leaf = 5,
min_samples_split = 2,
min_purity_increase = 0.0,
n_subfeatures = 0,
do_pruning = true,
pruning_accuracy = 1.0,
seed = 3,
do_cross_validation = false,
n_folds_cv = 3,
)

This function performs regression using a decision tree algorithm on results from Kinbiont.

Arguments:

  • Kinbiont_results::Matrix{Any}: The matrix containing results from fitting one or more files using Kinbiont.

  • feature_matrix::Matrix{Any}: Matrix of features used for machine learning analysis. The number of rows should match the number of columns (minus one) in Kinbiont_results, with the first column containing well names that match the well names in the second row of Kinbiont_results.

  • row_to_learn::Int: The index of the row in the Kinbiont_results matrix that will be used as the target for the regression.

Key Arguments:

  • max_depth::Int = -1: Maximum depth of the decision tree. If -1, there is no maximum depth.

  • verbose::Bool = true: If true, the function will output additional details.

  • pruning_purity::Float64 = 1.0: Purity threshold for pruning. If set to 1.0, no pruning will be performed.

  • min_samples_leaf::Int = 5: Minimum number of samples required in a leaf node.

  • min_samples_split::Int = 2: Minimum number of samples required to perform a split.

  • min_purity_increase::Float64 = 0.0: Minimum increase in purity required for a split.

  • n_subfeatures::Int = 0: Number of features, selected at random, for evaluating the tree. If 0, all features are considered.

  • do_pruning::Bool = true: If true, post-inference impurity pruning will be performed.

  • pruning_accuracy::Float64 = 1.0: Purity threshold used for post-pruning. A value of 1.0 means no pruning.

  • seed::Int = 3: Random seed for reproducibility.

  • do_cross_validation::Bool = false: If true, performs n-fold cross-validation.

  • n_folds_cv::Int = 3: Number of folds for cross-validation. Ignored if do_cross_validation is false.

Outputs:

  1. tree_model::Any: The trained decision tree model.

  2. importance_score::Vector{Float64}: Importance scores of each feature used in the model.

  3. importance_rank::Vector{Int}: Ranking of features based on their importance scores.

  4. cross_validation_score::Union{Float64, Nothing}: Cross-validation score if do_cross_validation is true; otherwise, nothing.

  5. leaf_vs_samples::Matrix{Int}: Matrix where the first row represents a label of a leaf node and the second row contains the values of the samples.

source

Kinbiont.downstream_symbolic_regression

Kinbiont.downstream_symbolic_regressionFunction
downstream_symbolic_regression(
Kinbiont_results::Matrix{Any},
feature_matrix::Matrix{Any},
row_to_learn::Int;
options = SymbolicRegression.Options(),
)

This function performs symbolic regression on the results obtained from fitting models using Kinbiont. It uses a feature matrix to train a symbolic regression model to predict a specific row of the Kinbiont results.

Arguments:

  • Kinbiont_results::Matrix{Any}: The matrix containing results from fitting one or more files using Kinbiont.

  • feature_matrix::Matrix{Any}: Matrix of features used for machine learning analysis. The number of rows should match the number of columns (minus one) in Kinbiont_results, with the first column containing well names that match the well names in the second row of Kinbiont_results.

  • row_to_learn::Int: The index of the row in the Kinbiont_results matrix that will be the target for machine learning inference.

Key Arguments:

  • options::SymbolicRegression.Options(): Options for the symbolic regression process. This argument uses the SymbolicRegression.Options class, allowing customization of the symbolic regression parameters. See SymbolicRegression.jl API documentation for details.

Outputs:

  • trees::Vector{SymbolicRegression.Tree}: A vector of trees representing the hall of fame results from the symbolic regression process.

  • res_output::Matrix{Any}: A matrix containing the hall of fame results, where each row includes:

    1. Complexity score of the equation.
    2. Mean Squared Error (MSE) of the equation.
    3. The symbolic equation itself.

  • predictions::Matrix{Float64}: For each equation in the hall of fame, this matrix contains the predicted values for each sample. Columns represent the different equations, and rows correspond to the samples.

  • index_annotation::Vector{Int}: An index vector used to order the rows of the feature_matrix to match the columns of the Kinbiont_results.
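
The expected layout of Kinbiont_results and feature_matrix, and the role of index_annotation, can be illustrated with toy matrices. All values and well names below are made up, and the matching by findfirst is a sketch of the idea rather than the package's own code.

```julia
# Toy results matrix: first column holds row labels, then one column per well.
results = Any[
    "name of model" "logistic" "logistic" "logistic"
    "well"          "A1"       "A2"       "A3"
    "param_1"       0.52       0.31       0.44
]

# Toy feature matrix: one row per well, well name first, features after it.
features = Any[
    "A3" 0.0 37.0
    "A1" 1.0 30.0
    "A2" 0.5 30.0
]

wells_in_results = results[2, 2:end]
# For every results column, find the feature row carrying the same well name.
index_annotation = Int[findfirst(==(well), features[:, 1]) for well in wells_in_results]

row_to_learn = 3
target = Float64.(results[row_to_learn, 2:end])          # one value per well
predictors = Float64.(features[index_annotation, 2:end]) # rows reordered to match
```

After the reordering, row i of predictors and entry i of target refer to the same well.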

source

Simulations

ODE simulations

Kinbiont.ODE_sim

Kinbiont.ODE_simFunction
ODE_sim(
model::String, 
n_start::Vector{Float64}, 
tstart::Float64, 
tmax::Float64,
delta_t::Float64, 
param_of_ode::Vector{Float64};
integrator=KenCarp4(),
)

This function solves the hardcoded ODE models of Kinbiont.jl.

Arguments:

  • model::String: The model to simulate. For the possible options please check the documentation.
  • n_start::Vector{Float64}: The starting conditions.
  • tstart::Float64: The start time of the simulation.
  • tmax::Float64: The final time of the simulation.
  • delta_t::Float64: The time step of the output.
  • param_of_ode::Vector{Float64}: The parameters of the ODE model.

Key argument:

  • integrator=KenCarp4(): The chosen solver from the SciML ecosystem for ODE integration, default KenCarp4 algorithm.

Output:

  • Returns a standard SciML solution object: if sim = ODE_sim(...), then sim.t is the array of times and sim.u is the array of the numerical solution.
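
To show the shape of the output (a time grid from tstart to tmax in steps of delta_t, with one state per time), here is a self-contained stand-in. It uses a hand-written logistic model, a scalar initial condition, and a fixed-step RK4 scheme, all of which are assumptions made for the illustration; ODE_sim itself relies on SciML integrators and takes n_start as a vector.

```julia
# Hand-written stand-in for a hard-coded model: logistic growth dN/dt = r*N*(1 - N/K).
logistic_rhs(n, params) = params[1] * n * (1 - n / params[2])

# Fixed-step classical Runge-Kutta (RK4); NOT the SciML solver used by ODE_sim.
function rk4_sim(rhs, n_start::Float64, tstart::Float64, tmax::Float64,
                 delta_t::Float64, params::Vector{Float64})
    times = collect(tstart:delta_t:tmax)
    states = similar(times)
    states[1] = n_start
    for i in 1:(length(times) - 1)
        n = states[i]
        k1 = rhs(n, params)
        k2 = rhs(n + 0.5 * delta_t * k1, params)
        k3 = rhs(n + 0.5 * delta_t * k2, params)
        k4 = rhs(n + delta_t * k3, params)
        states[i + 1] = n + delta_t * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    end
    # Named tuple mimicking the sim.t / sim.u access pattern.
    return (t = times, u = states)
end

# Growth rate 0.5 and carrying capacity 1.0, simulated from t = 0 to t = 50.
sim = rk4_sim(logistic_rhs, 0.01, 0.0, 50.0, 0.5, [0.5, 1.0])
```

Here sim.t and sim.u have the same length, and the trajectory saturates near the carrying capacity by the end of the simulation.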
source

Stochastic simulations

Kinbiont.stochastic_sim

Kinbiont.stochastic_simFunction
stochastic_sim(
model::String,
n_start::Int, 
n_mass_start::Float64, 
tstart::Float64, 
tmax::Float64, 
delta_t::Float64, 
k_1_val::Float64,
k_2_val::Float64, 
alpha_val::Float64, 
lambda::Float64, 
n_mol_per_birth::Float64,
volume::Float64,
)

This function performs a stochastic simulation of a model, considering cell growth and nutrient consumption over time using Poisson approximation.

Arguments:

  • model::String: The model to simulate. Possible options are "Monod", "Haldane", "Blackman", "Tessier", "Moser", "Aiba-Edwards", and "Verhulst".
  • n_start::Int: The number of starting cells.
  • n_mass_start::Float64: The starting concentration of the limiting nutrient.
  • tstart::Float64: The start time of the simulation.
  • tmax::Float64: The final time of the simulation.
  • delta_t::Float64: The time step for the Poisson approximation.
  • k_1_val::Float64: The value of parameter k1.
  • k_2_val::Float64: The value of the parameter k2.
  • alpha_val::Float64: The maximum possible growth rate.
  • lambda::Float64: The lag time, simulated as a zero-growth time span at the start.
  • n_mol_per_birth::Float64: The nutrient consumed per division (mass).
  • volume::Float64: The volume.

Output (if sim = stochastic_sim(...)):

  • sim[1]: array of the number of individuals in the population.
  • sim[2]: array of the limiting nutrient concentration, expressed as biomass-equivalent mass.
  • sim[3]: array of the times of the simulation.
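
A Poisson-approximation step for the "Monod" case could look like the sketch below. The sampler, the function poisson_monod_sim, the conversion of consumed mass to concentration through volume, and the cap on births are all assumptions made for illustration; they are not taken from Kinbiont's source.

```julia
using Random

# Poisson sampler: Knuth's method for small means, rounded normal approximation otherwise.
function poisson_sample(rng::AbstractRNG, mean_value::Float64)
    mean_value <= 0 && return 0
    if mean_value > 30
        return max(0, round(Int, mean_value + sqrt(mean_value) * randn(rng)))
    end
    threshold = exp(-mean_value)
    count = 0
    product = rand(rng)
    while product > threshold
        count += 1
        product *= rand(rng)
    end
    return count
end

function poisson_monod_sim(n_start::Int, nutrient_start::Float64, tstart::Float64,
                           tmax::Float64, delta_t::Float64, k_1::Float64,
                           alpha::Float64, lambda::Float64,
                           n_mol_per_birth::Float64, volume::Float64;
                           rng::AbstractRNG = MersenneTwister(1))
    times = collect(tstart:delta_t:tmax)
    cells = zeros(Int, length(times))
    nutrient = zeros(length(times))
    cells[1] = n_start
    nutrient[1] = nutrient_start
    for i in 1:(length(times) - 1)
        # Zero growth during the lag phase, Monod kinetics afterwards.
        rate = times[i] < lambda ? 0.0 : alpha * nutrient[i] / (k_1 + nutrient[i])
        births = poisson_sample(rng, rate * cells[i] * delta_t)
        # Births cannot consume more nutrient than is left.
        births = min(births, floor(Int, nutrient[i] * volume / n_mol_per_birth))
        cells[i + 1] = cells[i] + births
        nutrient[i + 1] = max(0.0, nutrient[i] - births * n_mol_per_birth / volume)
    end
    return cells, nutrient, times
end

lag_time = 2.0
cells, nutrient, times = poisson_monod_sim(10, 5.0, 0.0, 20.0, 0.1, 1.0, 0.5,
                                           lag_time, 0.01, 1.0)
```

The returned tuple follows the same order as the documented output: population, nutrient, times.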
source

Various

Kinbiont.specific_gr_evaluation

Kinbiont.specific_gr_evaluationFunction
specific_gr_evaluation(data_smoothed::Matrix{Float64}, 
pt_smoothing_derivative::Int)

This function evaluates the specific growth rate of a smoothed time series using a sliding window log-linear fitting approach.

Arguments:

  • data_smoothed::Matrix{Float64}: A 2xN matrix of smoothed data. The first row contains time points, and the second row contains the corresponding OD (optical density) values. This matrix represents a single growth curve.

  • pt_smoothing_derivative::Int: Size of the window used for computing the numerical derivative of the log-transformed data. If pt_smoothing_derivative is less than 2, the function uses an interpolation algorithm to evaluate the numerical derivative.

Output:

  • specific_gr::Vector{Float64}: An array containing the specific growth rate as a function of time.
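
The sliding window log-linear approach can be sketched as follows: within each window, the specific growth rate is the slope of log(OD) against time. The function names and the handling of the curve edges (one estimate per full window) are assumptions, so the output length may differ from that of specific_gr_evaluation.

```julia
# Slope of an ordinary least-squares line through the points (x, y).
function ols_slope(x::Vector{Float64}, y::Vector{Float64})
    x_mean = sum(x) / length(x)
    y_mean = sum(y) / length(y)
    return sum((x .- x_mean) .* (y .- y_mean)) / sum((x .- x_mean) .^ 2)
end

# Specific growth rate = slope of log(OD) versus time inside each window.
function sliding_specific_gr(data::Matrix{Float64}, window::Int)
    times, log_od = data[1, :], log.(data[2, :])
    n_windows = length(times) - window + 1
    return [ols_slope(times[i:(i + window - 1)], log_od[i:(i + window - 1)])
            for i in 1:n_windows]
end

# Pure exponential growth with rate 0.3 should give a flat estimate of 0.3.
times = collect(0.0:0.5:10.0)
data = permutedims(hcat(times, 0.05 .* exp.(0.3 .* times)))  # 2xN matrix
growth_rates = sliding_specific_gr(data, 7)
```

On real data the estimate varies over time, and its maximum is the usual candidate for the maximum specific growth rate.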
source