Fit models selected after calibration — fit

This function fits models selected after candidate model training and testing using the function calibration().

Usage

fit_selected(calibration_results, partition_method = "kfolds",
             n_partitions = 1, train_proportion = 0.7, type = "cloglog",
             write_models = FALSE,
             file_name = NULL, parallel = FALSE, ncores = NULL,
             progress_bar = TRUE, verbose = TRUE, seed = 1)

Arguments

calibration_results: an object of class calibration_results returned by the calibration() function.
partition_method: (character) method used for data partitioning. Available options are "kfolds", "subsample", and "bootstrap". See Details for more information.
n_partitions: (numeric) number of partitions or folds to generate. If partition_method is "subsample" or "bootstrap", this defines the number of replicates. If "kfolds", it specifies the number of folds. Default is 4.
train_proportion: (numeric) proportion of occurrence and background points to be used for model training in each replicate. Only applicable when partition_method is "subsample" or "bootstrap". Default is 0.7 (i.e., 70% for training and 30% for testing).
type: (character) the format of prediction values for computing thresholds. For maxnet models, valid options are "raw", "cumulative", "logistic", and "cloglog". For glm models, valid options are "cloglog", "response" and "raw". Default is "cloglog".
write_models: (logical) whether to save the final fitted models to disk. Default is FALSE.
file_name: (character) the file name, with or without a path, for saving the final models. This is only applicable if write_models = TRUE.
parallel: (logical) whether to fit the final models in parallel. Default is FALSE.
ncores: (numeric) number of cores to use for parallel processing. Default is NULL and uses available cores - 1. This is only applicable if parallel = TRUE.
progress_bar: (logical) whether to display a progress bar during processing. Default is TRUE.
verbose: (logical) whether to display detailed messages during processing. Default is TRUE.
seed: (numeric) integer value used to specify an initial seed to split the data. Default is 1.

Value

An object of class 'fitted_models' containing the following elements:

species: a character string with the name of the species.
Models: a list of fitted models, including partitions (trained with the parts of the data) and full models (trained with all available records).
calibration_data: a data.frame containing a column (pr_bg) that identifies occurrence points (1) and background points (0), along with the corresponding values of predictor variables for each point.
selected_models: a data frame with the ID and summary of evaluation metrics for the selected models.
weights: a numeric vector specifying weights for the predictor variables (if used).
pca: a list of class prcomp representing the result of principal component analysis (if performed).
addsamplestobackground: a logical value indicating whether any presence sample not already in the background was added.
omission_rate: the omission rate determined during the calibration step.
thresholds: the thresholds to binarize each partition and the consensus (mean and median), calculated based on the omission rate set in calibration().

Details

This function also computes model consensus (mean and median), the thresholds to binarize model predictions based on the omission rate set during model calibration to select models.

Examples

# An example with maxnet models
data(calib_results_maxnet, package = "kuenm2")

# Fit models using calibration results
fm <- fit_selected(calibration_results = calib_results_maxnet,
                   n_partitions = 4)
#> Fitting partitions...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==========================                                            |  38%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |======================================================================| 100%
#> 
#> Fitting full models...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |======================================================================| 100%

# Output the fitted models
fm
#> fitted_models object summary
#> ============================
#> Species: Myrcia hatschbachii 
#> Algortihm: maxnet 
#> Number of fitted models: 2 
#> Models fitted with 4 replicates

# An example with GLMs
data(calib_results_glm, package = "kuenm2")

# Fit models using calibration results
fm_glm <- fit_selected(calibration_results = calib_results_glm,
                       partition_method = "subsample",
                       n_partitions = 5)
#> Fitting partitions...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |======================================================================| 100%
#> 
#> Fitting full models...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%

# Output the fitted models
fm_glm
#> fitted_models object summary
#> ============================
#> Species: Myrcia hatschbachii 
#> Algortihm: glm 
#> Number of fitted models: 1 
#> Models fitted with 5 replicates