slim_gsgp.utils

slim_gsgp.utils.diversity

slim_gsgp.utils.diversity.gsgp_pop_div_from_vectors(sem_vectors)[source]

Calculate the diversity of a population from semantic vectors.

Parameters:

sem_vectors (torch.Tensor) – The tensor of semantic vectors.

Returns:

The average pairwise distance between semantic vectors.

Return type:

float

Notes

https://ieeexplore.ieee.org/document/9283096
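As an illustration of the quantity described above, here is a minimal sketch of an average pairwise distance. It assumes Euclidean distance and averages over distinct pairs only; the library works on a torch.Tensor and may normalize differently. The name average_pairwise_distance and the plain-list input are hypothetical stand-ins, not part of slim_gsgp.

```python
import math
from itertools import combinations


def average_pairwise_distance(vectors):
    """Mean Euclidean distance over all distinct pairs of vectors (assumed metric)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0  # fewer than two vectors: no pairs to compare
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Three semantic vectors, one per individual (illustrative values).
population_semantics = [
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
]
diversity = average_pairwise_distance(population_semantics)
print(diversity)
```

A population whose individuals all produce the same outputs would score 0 under this measure, which is why it can serve as a semantic diversity indicator.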

slim_gsgp.utils.diversity.niche_entropy(repr_, n_niches=10)[source]

Calculate the niche entropy of a population.

Parameters:
  • repr_ (list) – The list of individuals in the population.

  • n_niches (int) – Number of niches to divide the population into.

Returns:

The entropy of the distribution of individuals across niches.

Return type:

float

Notes

https://www.semanticscholar.org/paper/Entropy-Driven-Adaptive-RoscaComputer/ab5c8a8f415f79c5ec6ff6281ed7113736615682

https://strathprints.strath.ac.uk/76488/1/Marchetti_etal_Springer_2021_Inclusive_genetic_programming.pdf
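The entropy of a distribution across niches can be sketched as follows. This is an assumption-laden illustration: it takes one number per individual (for example a size measure) as the niching criterion, splits the observed range into equal-width niches, and uses the natural logarithm. The documentation above does not state how individuals are assigned to niches, so niche_entropy_sketch and its sizes argument are hypothetical.

```python
import math


def niche_entropy_sketch(sizes, n_niches=10):
    """Shannon entropy (natural log) of how `sizes` spread over equal-width niches."""
    lo, hi = min(sizes), max(sizes)
    if lo == hi:
        return 0.0  # every individual falls into the same niche
    width = (hi - lo) / n_niches
    counts = [0] * n_niches
    for s in sizes:
        # Clamp so the maximum value lands in the last niche.
        idx = min(int((s - lo) / width), n_niches - 1)
        counts[idx] += 1
    total = len(sizes)
    # Empty niches contribute nothing to the entropy.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


uniform = niche_entropy_sketch([1, 2, 3, 4], n_niches=4)
collapsed = niche_entropy_sketch([5, 5, 5, 5], n_niches=4)
print(uniform, collapsed)
```

Entropy is highest when individuals are spread evenly across niches and drops to zero when the whole population occupies a single niche.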

slim_gsgp.utils.logger

slim_gsgp.utils.logger.drop_experiment_from_logger(experiment_id: str, log_path: str) -> None[source]

Remove an experiment from the logger CSV file. If the given experiment_id is -1, the last saved experiment is removed.

Parameters:
  • experiment_id (str or int) – The experiment ID to be removed. If -1, the most recent experiment is removed.

  • log_path (str) – Path to the file containing the logging information.

Return type:

None

slim_gsgp.utils.logger.log_settings(path: str, settings_dict: list, unique_run_id: UUID) -> None[source]

Log the settings to a CSV file.

Parameters:
  • path (str) – Path to the CSV file.

  • settings_dict (list) – List of settings dictionaries to log.

  • unique_run_id (UUID) – Unique identifier for the run.

Return type:

None

slim_gsgp.utils.logger.logger(path: str, generation: int, elite_fit: float, timing: float, nodes: int, additional_infos: list | None = None, run_info: list | None = None, seed: int = 0) -> None[source]

Logs information into a CSV file.

Parameters:
  • path (str) – Path to the CSV file.

  • generation (int) – Current generation number.

  • elite_fit (float) – Elite’s validation fitness value.

  • timing (float) – Time taken for the process.

  • nodes (int) – Count of nodes in the population.

  • additional_infos (list, optional) – Population’s test fitness value(s) and diversity measurements. Defaults to None.

  • run_info (list, optional) – Information about the run. Defaults to None.

  • seed (int, optional) – The seed used in random, numpy, and torch libraries. Defaults to 0.

Return type:

None

slim_gsgp.utils.logger.merge_settings(sd1: dict, sd2: dict, sd3: dict, sd4: dict) -> dict[source]

Merge multiple settings dictionaries into one.

Parameters:
  • sd1 (dict) – First settings dictionary.

  • sd2 (dict) – Second settings dictionary.

  • sd3 (dict) – Third settings dictionary.

  • sd4 (dict) – Fourth settings dictionary.

Returns:

Merged settings dictionary.

Return type:

dict
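A minimal sketch of merging four settings dictionaries into one. It assumes that, when a key appears in more than one dictionary, the later dictionary takes precedence; the documentation above does not specify the precedence. The function name merge_settings_sketch and the example keys are hypothetical.

```python
def merge_settings_sketch(sd1, sd2, sd3, sd4):
    """Combine four dictionaries; on duplicate keys the later dictionary wins."""
    return {**sd1, **sd2, **sd3, **sd4}


merged = merge_settings_sketch(
    {"pop_size": 100},
    {"n_iter": 50},
    {"seed": 0},
    {"pop_size": 200},  # overrides the value from the first dictionary
)
print(merged)
```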

slim_gsgp.utils.utils

slim_gsgp.utils.utils.check_slim_version(slim_version)[source]

Validate the slim_gsgp version supplied by the user and assign the corresponding values to the parameters op, sig and trees.

Parameters:

slim_version (str) – Name of the slim_gsgp version.

Returns:

Parameters reflecting the kind of operation considered, the use of the sigmoid and the use of multiple trees.

Return type:

op, sig, trees

slim_gsgp.utils.utils.generate_random_uniform(lower, upper)[source]

Generate a random number within a specified range using numpy random.uniform.

Parameters:
  • lower (float) – The lower bound of the range for generating the random number.

  • upper (float) – The upper bound of the range for generating the random number.

Returns:

A function that when called, generates a random number within the specified range.

Return type:

Callable

Notes

The returned function takes no input and returns a random float between lower and upper whenever called.
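The pattern described in the note, a function that returns a zero-argument sampler, can be sketched with a closure. This illustration uses the standard library's random.uniform in place of numpy's random.uniform, and the name make_uniform_sampler is hypothetical.

```python
import random


def make_uniform_sampler(lower, upper):
    """Return a zero-argument function that samples from [lower, upper]."""
    def sample():
        # The bounds are captured from the enclosing call.
        return random.uniform(lower, upper)
    return sample


sampler = make_uniform_sampler(-1.0, 1.0)
draws = [sampler() for _ in range(5)]
print(draws)
```

Returning a callable rather than a number lets the caller draw a fresh value each time one is needed, without passing the bounds around again.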

slim_gsgp.utils.utils.get_best_max(population, n_elites)[source]

Get the best individuals from the population with the maximum fitness.

Parameters:
  • population (Population) – The population of individuals.

  • n_elites (int) – Number of elites to return.

Returns:

  • list – The list of elite individuals.

  • Individual – Best individual from the elites.

slim_gsgp.utils.utils.get_best_min(population, n_elites)[source]

Get the best individuals from the population with the minimum fitness.

Parameters:
  • population (Population) – The population of individuals.

  • n_elites (int) – Number of elites to return.

Returns:

  • list – The list of elite individuals.

  • Individual – Best individual from the elites.
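The elite selection performed by get_best_max and get_best_min can be pictured with the following sketch. It is not the library's implementation: Candidate is a hypothetical stand-in for an individual with a fitness attribute, a plain list stands in for the Population object, and n_elites is assumed to be at least 1.

```python
from collections import namedtuple

# Hypothetical stand-in for an individual carrying a fitness value.
Candidate = namedtuple("Candidate", ["name", "fitness"])


def best_individuals(individuals, n_elites, minimize=True):
    """Return (elites, best), ranking by fitness in the requested direction."""
    ranked = sorted(individuals, key=lambda ind: ind.fitness, reverse=not minimize)
    elites = ranked[:n_elites]
    return elites, elites[0]


pop = [Candidate("a", 0.9), Candidate("b", 0.2), Candidate("c", 0.5)]
elites_min, best_min = best_individuals(pop, n_elites=2, minimize=True)
elites_max, best_max = best_individuals(pop, n_elites=2, minimize=False)
print(best_min, best_max)
```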

slim_gsgp.utils.utils.get_random_tree(max_depth, FUNCTIONS, TERMINALS, CONSTANTS, inputs, p_c=0.3, grow_probability=1, logistic=True)[source]

Get a random tree using either grow or full method.

Parameters:
  • max_depth (int) – Maximum depth of the tree.

  • FUNCTIONS (dict) – Dictionary of functions.

  • TERMINALS (dict) – Dictionary of terminals.

  • CONSTANTS (dict) – Dictionary of constants.

  • inputs (torch.Tensor) – Input tensor for calculating semantics.

  • p_c (float, default=0.3) – Probability of choosing a constant.

  • grow_probability (float, default=1) – Probability of using the grow method.

  • logistic (bool, default=True) – Whether to use logistic semantics.

Returns:

The generated random tree.

Return type:

Tree

slim_gsgp.utils.utils.get_terminals(X)[source]

Get terminal nodes for a dataset.

Parameters:

X (torch.Tensor) – The tensor from which the set of TERMINALS is derived; the terminals correspond to its columns.

Returns:

Dictionary of terminal nodes.

Return type:

dict

slim_gsgp.utils.utils.gs_rmse(y_true, y_pred)[source]

Calculate the root mean squared error.

Parameters:
  • y_true (array-like) – True values.

  • y_pred (array-like) – Predicted values.

Returns:

The root mean squared error.

Return type:

float
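For reference, the root mean squared error is the square root of the mean of the squared differences between true and predicted values. The sketch below computes it on plain Python lists rather than tensors; the name rmse and the length check are illustrative choices, not the library's code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 6.0])
print(error)
```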

slim_gsgp.utils.utils.gs_size(y_true, y_pred)[source]

Get the size of the predicted values.

Parameters:
  • y_true (array-like) – True values.

  • y_pred (array-like) – Predicted values.

Returns:

The size of the predicted values.

Return type:

int

slim_gsgp.utils.utils.mean_(x1, x2)[source]

Compute the mean of two tensors.

Parameters:
  • x1 (torch.Tensor) – The first tensor.

  • x2 (torch.Tensor) – The second tensor.

Returns:

The mean of the two tensors.

Return type:

torch.Tensor

slim_gsgp.utils.utils.protected_div(x1, x2)[source]

Implement division protected against a zero denominator.

Performs element-wise division of x1 by x2. Wherever x2 is zero, the function returns the numerator’s value instead.

Parameters:
  • x1 (torch.Tensor) – The numerator.

  • x2 (torch.Tensor) – The denominator.

Returns:

Result of protected division between x1 and x2.

Return type:

torch.Tensor
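The behaviour described above can be sketched on plain Python lists. This illustration assumes an exact comparison with zero; the library operates on torch.Tensor objects and may use a tolerance instead. The name protected_div_sketch is hypothetical.

```python
def protected_div_sketch(x1, x2):
    """Element-wise x1 / x2; where the denominator is zero, return the numerator."""
    return [n if d == 0 else n / d for n, d in zip(x1, x2)]


result = protected_div_sketch([6.0, 5.0, -4.0], [3.0, 0.0, 2.0])
print(result)
```

Protected division keeps evolved expressions defined for every input, so a tree that happens to divide by zero still yields a finite semantic value.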

slim_gsgp.utils.utils.show_individual(tree, operator)[source]

Display an individual’s structure with a specified operator.

Parameters:
  • tree (Tree) – The tree representing the individual.

  • operator (str) – The operator to display (‘sum’ or ‘prod’).

Returns:

The string representation of the individual’s structure.

Return type:

str

slim_gsgp.utils.utils.tensor_dimensioned_sum(dim)[source]

Generate a sum function over a specified dimension.

Parameters:

dim (int) – The dimension to sum over.

Returns:

A function that sums tensors over the specified dimension.

Return type:

function

slim_gsgp.utils.utils.train_test_split(X, y, p_test=0.3, shuffle=True, indices_only=False, seed=0)[source]

Split X and y tensors into train and test subsets.

This function replicates the behaviour of scikit-learn’s ‘train_test_split’.

Parameters:
  • X (torch.Tensor) – Input data instances.

  • y (torch.Tensor) – Target vector.

  • p_test (float (default=0.3)) – The proportion of the dataset to include in the test split.

  • shuffle (bool (default=True)) – Whether to shuffle the data before splitting.

  • indices_only (bool (default=False)) – Whether to return only the indices representing training and test partition.

  • seed (int (default=0)) – The seed for the random number generators.

Returns:

  • X_train (torch.Tensor) – Training data instances.

  • y_train (torch.Tensor) – Training target vector.

  • X_test (torch.Tensor) – Test data instances.

  • y_test (torch.Tensor) – Test target vector.

  • train_indices (torch.Tensor) – Indices representing the training partition; returned when indices_only is True.

  • test_indices (torch.Tensor) – Indices representing the test partition; returned when indices_only is True.
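A seeded split of this kind can be sketched by shuffling row indices and slicing them. The version below works on plain lists and is only an illustration: the helper split_indices, the rounding rule for the test-set size, and the use of the standard library's random module are assumptions, not the library's implementation.

```python
import random


def split_indices(n_samples, p_test=0.3, shuffle=True, seed=0):
    """Return (train_indices, test_indices) for a dataset of n_samples rows."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_test = int(round(n_samples * p_test))  # assumed rounding rule
    return indices[n_test:], indices[:n_test]


# Toy dataset: ten rows of features and ten targets.
X = [[i, i * 2] for i in range(10)]
y = [i * 3 for i in range(10)]

train_idx, test_idx = split_indices(len(X), p_test=0.3, seed=0)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]
print(len(X_train), len(X_test))
```

Working with indices first makes the indices_only behaviour straightforward: the same index lists either are returned directly or are used to slice X and y.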

slim_gsgp.utils.utils.validate_inputs(X_train, y_train, X_test, y_test, pop_size, n_iter, elitism, n_elites, init_depth, log_path, prob_const, tree_functions, tree_constants, log, verbose, minimization, n_jobs, test_elite, fitness_function, initializer, tournament_size)[source]

Validates the inputs based on the specified conditions.

Parameters:
  • tournament_size

  • X_train (torch.Tensor) – Training input data.

  • y_train (torch.Tensor) – Training output data.

  • X_test (torch.Tensor, optional) – Testing input data.

  • y_test (torch.Tensor, optional) – Testing output data.

  • pop_size (int, optional) – The population size for the genetic programming algorithm (default is 100).

  • n_iter (int, optional) – The number of iterations for the genetic programming algorithm (default is 100).

  • elitism (bool, optional) – Indicate the presence or absence of elitism.

  • n_elites (int, optional) – The number of elites.

  • init_depth (int, optional) – The depth value for the initial GP trees population.

  • log_path (str, optional) – The path at which the log directory is created and results are saved.

  • log (int, optional) – Level of detail to utilize in logging.

  • verbose (int, optional) – Level of detail to include in console output.

  • minimization (bool, optional) – If True, the objective is to minimize the fitness function. If False, maximize it (default is True).

  • fitness_function (str, optional) – The fitness function used for evaluating individuals (default is from gp_solve_parameters).

  • initializer (str, optional) – The strategy for initializing the population (e.g., “grow”, “full”, “rhh”).

  • n_jobs (int, optional) – Number of parallel jobs to run (default is 1).

  • prob_const (float, optional) – The probability of introducing constants into the trees during evolution.

  • tree_functions (list, optional) – List of allowed functions that can appear in the trees. Check the documentation for the available functions.

  • tree_constants (list, optional) – List of constants allowed to appear in the trees.

  • test_elite (bool, optional) – Whether to test the elite individual on the test set after each generation.

slim_gsgp.utils.utils.verbose_reporter(dataset, generation, pop_val_fitness, pop_test_fitness, timing, nodes)[source]

Prints a formatted report of generation, fitness values, timing, and node count.

Parameters:
  • generation (int) – Current generation number.

  • pop_val_fitness (float) – Population’s validation fitness value.

  • pop_test_fitness (float) – Population’s test fitness value.

  • timing (float) – Time taken for the process.

  • nodes (int) – Count of nodes in the population.

Returns:

Outputs a formatted report to the console.

Return type:

None