slim_gsgp.utils

slim_gsgp.utils.diversity

slim_gsgp.utils.diversity.gsgp_pop_div_from_vectors(sem_vectors)[source]

Calculate the diversity of a population from semantic vectors.

Parameters:

sem_vectors (torch.Tensor) – The tensor of semantic vectors.

Returns:

The average pairwise distance between semantic vectors.

Return type:

float

Notes

https://ieeexplore.ieee.org/document/9283096
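
As a rough illustration of "average pairwise distance", the sketch below computes the mean Euclidean distance over all unordered pairs of vectors using plain Python lists. The function name average_pairwise_distance, the choice of Euclidean distance, and the handling of populations with fewer than two individuals are assumptions made for this example; the library works on torch tensors and may normalise differently.

```python
import math
from itertools import combinations


def average_pairwise_distance(sem_vectors):
    """Mean Euclidean distance over all unordered pairs of vectors (illustrative only)."""
    pairs = list(combinations(sem_vectors, 2))
    if not pairs:
        # Assumption: with fewer than two vectors there is no pairwise distance.
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical semantic vectors (one per individual).
population = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
diversity = average_pairwise_distance(population)
```

Here the three pairwise distances are 5, 10 and 5, so the resulting diversity is 20/3.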

slim_gsgp.utils.diversity.niche_entropy(repr_, n_niches=10)[source]

Calculate the niche entropy of a population.

Parameters:

repr_ (list) – The list of individuals in the population.
n_niches (int) – Number of niches to divide the population into.

Returns:

The entropy of the distribution of individuals across niches.

Return type:

float

Notes

https://www.semanticscholar.org/paper/Entropy-Driven-Adaptive-RoscaComputer/ab5c8a8f415f79c5ec6ff6281ed7113736615682
https://strathprints.strath.ac.uk/76488/1/Marchetti_etal_Springer_2021_Inclusive_genetic_programming.pdf
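
The idea of an entropy over niches can be sketched as follows. This is not the library's implementation: it assumes each individual is summarised by its node count, that niches are equal-width bins between the smallest and largest count, and that the natural logarithm is used. The name niche_entropy_sketch is hypothetical.

```python
import math


def niche_entropy_sketch(sizes, n_niches=10):
    """Shannon entropy of individuals spread over equal-width size niches (illustrative)."""
    lo, hi = min(sizes), max(sizes)
    # Assumption: if every individual has the same size, use a single effective niche.
    width = (hi - lo) / n_niches or 1.0
    counts = [0] * n_niches
    for size in sizes:
        # Clamp so the largest individual falls into the last niche.
        index = min(int((size - lo) / width), n_niches - 1)
        counts[index] += 1
    total = len(sizes)
    # Empty niches contribute nothing to the entropy.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical node counts: two small and two large individuals, two niches.
entropy = niche_entropy_sketch([1, 1, 11, 11], n_niches=2)
```

With the population split evenly over two niches, the entropy equals log(2); a population concentrated in one niche yields zero.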

slim_gsgp.utils.logger

slim_gsgp.utils.logger.drop_experiment_from_logger(experiment_id: str, log_path: str) → None[source]

Remove an experiment from the logger CSV file. If the given experiment_id is -1, the last saved experiment is removed.

Parameters:

experiment_id (str or int) – The experiment ID to be removed. If -1, the most recent experiment is removed.
log_path (str) – Path to the file containing the logging information.

Return type:

None

slim_gsgp.utils.logger.log_settings(path: str, settings_dict: list, unique_run_id: UUID) → None[source]

Log the settings to a CSV file.

Parameters:

path (str) – Path to the CSV file.
settings_dict (list) – Dictionary of settings.
unique_run_id (UUID) – Unique identifier for the run.

Return type:

None

slim_gsgp.utils.logger.logger(path: str, generation: int, elite_fit: float, timing: float, nodes: int, additional_infos: list | None = None, run_info: list | None = None, seed: int = 0) → None[source]

Logs information into a CSV file.

Parameters:

path (str) – Path to the CSV file.
generation (int) – Current generation number.
elite_fit (float) – Elite’s validation fitness value.
timing (float) – Time taken for the process.
nodes (int) – Count of nodes in the population.
additional_infos (list, optional) – Population’s test fitness value(s) and diversity measurements. Defaults to None.
run_info (list, optional) – Information about the run. Defaults to None.
seed (int, optional) – The seed used in random, numpy, and torch libraries. Defaults to 0.

Return type:

None

slim_gsgp.utils.logger.merge_settings(sd1: dict, sd2: dict, sd3: dict, sd4: dict) → dict[source]

Merge multiple settings dictionaries into one.

Parameters:

sd1 (dict) – First settings dictionary.
sd2 (dict) – Second settings dictionary.
sd3 (dict) – Third settings dictionary.
sd4 (dict) – Fourth settings dictionary.

Returns:

Merged settings dictionary.

Return type:

dict
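
A minimal sketch of merging four settings dictionaries is shown below. It assumes that, when the same key appears more than once, the value from the later dictionary wins; the reference above does not state how the library resolves such conflicts. The function name and the example keys are hypothetical.

```python
def merge_settings_sketch(sd1, sd2, sd3, sd4):
    """Combine four dictionaries into one; later dictionaries override earlier keys."""
    return {**sd1, **sd2, **sd3, **sd4}


# Hypothetical settings groups, for illustration only.
merged = merge_settings_sketch(
    {"pop_size": 100},
    {"n_iter": 50},
    {"seed": 0},
    {"pop_size": 200},  # overrides the value from the first dictionary
)
```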

slim_gsgp.utils.utils

slim_gsgp.utils.utils.check_slim_version(slim_version)[source]

Validate the slim_gsgp version given as input by the user and assign the correct values to the parameters op, sig and trees.

Parameters:

slim_version (str) – Name of the slim_gsgp version.

Returns:

Parameters reflecting the kind of operation considered, the use of the sigmoid and the use of multiple trees.

Return type:

op, sig, trees

slim_gsgp.utils.utils.generate_random_uniform(lower, upper)[source]

Generate a random number within a specified range using numpy random.uniform.

Parameters:

lower (float) – The lower bound of the range for generating the random number.
upper (float) – The upper bound of the range for generating the random number.

Returns:

A function that when called, generates a random number within the specified range.

Return type:

Callable

Notes

The returned function takes no input and returns a random float between lower and upper whenever called.
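
The "function that returns a function" pattern described in the note can be sketched like this. To keep the example dependency-free it uses the standard library's random.uniform instead of numpy's random.uniform, and the names generate_random_uniform_sketch and sampler are hypothetical.

```python
import random


def generate_random_uniform_sketch(lower, upper):
    """Return a zero-argument callable that draws a float between lower and upper."""

    def generate_num():
        # Stand-in for numpy's random.uniform(lower, upper).
        return random.uniform(lower, upper)

    return generate_num


sampler = generate_random_uniform_sketch(0.0, 1.0)
samples = [sampler() for _ in range(5)]  # each call draws a fresh value
```

Returning a callable rather than a value lets the caller store the bounds once and draw new numbers later, for example each time a mutation step is applied.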

slim_gsgp.utils.utils.get_best_max(population, n_elites)[source]

Get the best individuals from the population with the maximum fitness.

Parameters:

population (Population) – The population of individuals.
n_elites (int) – Number of elites to return.

Returns:

list – The list of elite individuals.
Individual – Best individual from the elites.

slim_gsgp.utils.utils.get_best_min(population, n_elites)[source]

Get the best individuals from the population with the minimum fitness.

Parameters:

population (Population) – The population of individuals.
n_elites (int) – Number of elites to return.

Returns:

list – The list of elite individuals.
Individual – Best individual from the elites.
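
The two return values of get_best_min and get_best_max can be illustrated with a small stand-in population. The Ind class and both helper names below are hypothetical, and the sketch assumes that elites are simply the n_elites individuals with the lowest (or highest) fitness; the library's Population and Individual classes are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Ind:
    """Hypothetical stand-in for an individual with a fitness value."""
    name: str
    fitness: float


def best_min_sketch(individuals, n_elites):
    """Return the n_elites lowest-fitness individuals and the single best one."""
    elites = sorted(individuals, key=lambda ind: ind.fitness)[:n_elites]
    return elites, elites[0]


def best_max_sketch(individuals, n_elites):
    """Return the n_elites highest-fitness individuals and the single best one."""
    elites = sorted(individuals, key=lambda ind: ind.fitness, reverse=True)[:n_elites]
    return elites, elites[0]


pop = [Ind("a", 0.9), Ind("b", 0.1), Ind("c", 0.5)]
min_elites, min_best = best_min_sketch(pop, n_elites=2)
max_elites, max_best = best_max_sketch(pop, n_elites=2)
```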

slim_gsgp.utils.utils.get_random_tree(max_depth, FUNCTIONS, TERMINALS, CONSTANTS, inputs, p_c=0.3, grow_probability=1, logistic=True)[source]

Get a random tree using either grow or full method.

Parameters:

max_depth (int) – Maximum depth of the tree.
FUNCTIONS (dict) – Dictionary of functions.
TERMINALS (dict) – Dictionary of terminals.
CONSTANTS (dict) – Dictionary of constants.
inputs (torch.Tensor) – Input tensor for calculating semantics.
p_c (float, default=0.3) – Probability of choosing a constant.
grow_probability (float, default=1) – Probability of using the grow method.
logistic (bool, default=True) – Whether to use logistic semantics.

Returns:

The generated random tree.

Return type:

Tree

slim_gsgp.utils.utils.get_terminals(X)[source]

Get terminal nodes for a dataset.

Parameters:

X (torch.Tensor) – The tensor from which the set of TERMINALS is derived; each terminal corresponds to one of its columns.

Returns:

Dictionary of terminal nodes.

Return type:

dict
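
One way to picture a dictionary of terminals that "correspond to the columns" is a mapping from a terminal name to a column index. The sketch below assumes terminals are named x0, x1, and so on; that naming scheme, the function name, and the use of a column count instead of a tensor are assumptions for illustration.

```python
def get_terminals_sketch(n_columns):
    """Map a hypothetical terminal name to the index of the column it reads."""
    return {f"x{i}": i for i in range(n_columns)}


# A dataset with three input columns yields three terminals.
terminals = get_terminals_sketch(3)
```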

slim_gsgp.utils.utils.gs_rmse(y_true, y_pred)[source]

Calculate the root mean squared error.

Parameters:

y_true (array-like) – True values.
y_pred (array-like) – Predicted values.

Returns:

The root mean squared error.

Return type:

float
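
For reference, the root mean squared error is the square root of the mean of the squared differences between true and predicted values. The sketch below shows that computation on plain Python lists; the name rmse_sketch is hypothetical, and the library version operates on tensors.

```python
import math


def rmse_sketch(y_true, y_pred):
    """Square root of the mean squared difference between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Only the last prediction is off (by 2), so the RMSE is sqrt(4 / 3).
error = rmse_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```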

slim_gsgp.utils.utils.gs_size(y_true, y_pred)[source]

Get the size of the predicted values.

Parameters:

y_true (array-like) – True values.
y_pred (array-like) – Predicted values.

Returns:

The size of the predicted values.

Return type:

int

slim_gsgp.utils.utils.mean_(x1, x2)[source]

Compute the mean of two tensors.

Parameters:

x1 (torch.Tensor) – The first tensor.
x2 (torch.Tensor) – The second tensor.

Returns:

The mean of the two tensors.

Return type:

torch.Tensor

slim_gsgp.utils.utils.protected_div(x1, x2)[source]

Implements division protected against a zero denominator.

Performs division between x1 and x2. If x2 is (or has) zero(s), the function returns the numerator’s value(s).

Parameters:

x1 (torch.Tensor) – The numerator.
x2 (torch.Tensor) – The denominator.

Returns:

Result of protected division between x1 and x2.

Return type:

torch.Tensor
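
The behaviour described above can be sketched element by element on plain lists: wherever the denominator is zero, the numerator is returned unchanged. This follows the description in this entry only; whether the library also treats near-zero denominators as zero is not stated here, so the exact comparison with 0 is an assumption, as is the name protected_div_sketch.

```python
def protected_div_sketch(x1, x2):
    """Element-wise division that returns the numerator where the denominator is zero."""
    return [n if d == 0 else n / d for n, d in zip(x1, x2)]


# The middle element has a zero denominator, so its numerator (5.0) is kept.
result = protected_div_sketch([6.0, 5.0, -2.0], [3.0, 0.0, 4.0])
# result == [2.0, 5.0, -0.5]
```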

slim_gsgp.utils.utils.show_individual(tree, operator)[source]

Display an individual’s structure with a specified operator.

Parameters:

tree (Tree) – The tree representing the individual.
operator (str) – The operator to display (‘sum’ or ‘prod’).

Returns:

The string representation of the individual’s structure.

Return type:

str

slim_gsgp.utils.utils.tensor_dimensioned_sum(dim)[source]

Generate a sum function over a specified dimension.

Parameters:

dim (int) – The dimension to sum over.

Returns:

A function that sums tensors over the specified dimension.

Return type:

function
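
The closure pattern behind tensor_dimensioned_sum can be sketched with nested lists standing in for a 2-D tensor. The name dimensioned_sum_sketch and the restriction to two dimensions are assumptions made to keep the example free of dependencies.

```python
def dimensioned_sum_sketch(dim):
    """Return a function that sums a 2-D nested list over the given dimension."""

    def summed(rows):
        if dim == 0:
            # Collapse the rows: one total per column.
            return [sum(column) for column in zip(*rows)]
        if dim == 1:
            # Collapse the columns: one total per row.
            return [sum(row) for row in rows]
        raise ValueError("this sketch only handles 2-D data (dim 0 or 1)")

    return summed


data = [[1, 2, 3], [4, 5, 6]]
column_totals = dimensioned_sum_sketch(0)(data)  # [5, 7, 9]
row_totals = dimensioned_sum_sketch(1)(data)     # [6, 15]
```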

slim_gsgp.utils.utils.train_test_split(X, y, p_test=0.3, shuffle=True, indices_only=False, seed=0)[source]

Splits X and y tensors into train and test subsets.

This method replicates the behaviour of Sklearn’s ‘train_test_split’.

Parameters:

X (torch.Tensor) – Input data instances.
y (torch.Tensor) – Target vector.
p_test (float (default=0.3)) – The proportion of the dataset to include in the test split.
shuffle (bool (default=True)) – Whether to shuffle the data before splitting.
indices_only (bool (default=False)) – Whether to return only the indices representing training and test partition.
seed (int (default=0)) – The seed for random numbers generators.

Returns:

X_train (torch.Tensor) – Training data instances.
y_train (torch.Tensor) – Training target vector.
X_test (torch.Tensor) – Test data instances.
y_test (torch.Tensor) – Test target vector.
train_indices (torch.Tensor) – Indices representing the training partition.
test_indices (torch.Tensor) – Indices representing the test partition.
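
The split can be pictured as shuffling the row indices with a fixed seed and cutting them at the test proportion. The sketch below does this with the standard library only; it assumes the test size is the floor of p_test times the number of samples and that the test partition is taken from the front of the shuffled indices. Both choices, and the name split_indices_sketch, are illustrative assumptions.

```python
import math
import random


def split_indices_sketch(n_samples, p_test=0.3, shuffle=True, seed=0):
    """Return (train_indices, test_indices) for a dataset of n_samples rows."""
    indices = list(range(n_samples))
    if shuffle:
        # A dedicated generator keeps the split reproducible for a given seed.
        random.Random(seed).shuffle(indices)
    n_test = math.floor(p_test * n_samples)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices_sketch(10, p_test=0.3, seed=0)

# Applying the indices to hypothetical data rows.
X = [[i, i * 2] for i in range(10)]
X_train = [X[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
```

Returning only the indices, as indices_only=True does in the function documented above, is useful when the same partition must be applied to several tensors.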

slim_gsgp.utils.utils.validate_inputs(X_train, y_train, X_test, y_test, pop_size, n_iter, elitism, n_elites, init_depth, log_path, prob_const, tree_functions, tree_constants, log, verbose, minimization, n_jobs, test_elite, fitness_function, initializer, tournament_size)[source]

Validates the inputs based on the specified conditions.

Parameters:

tournament_size
X_train (torch.Tensor) – Training input data.
y_train (torch.Tensor) – Training output data.
X_test (torch.Tensor, optional) – Testing input data.
y_test (torch.Tensor, optional) – Testing output data.
pop_size (int, optional) – The population size for the genetic programming algorithm (default is 100).
n_iter (int, optional) – The number of iterations for the genetic programming algorithm (default is 100).
elitism (bool, optional) – Indicate the presence or absence of elitism.
n_elites (int, optional) – The number of elites.
init_depth (int, optional) – The depth value for the initial GP trees population.
log_path (str, optional) – The path where the log directory is created and results are saved.
log (int, optional) – Level of detail to utilize in logging.
verbose (int, optional) – Level of detail to include in console output.
minimization (bool, optional) – If True, the objective is to minimize the fitness function. If False, maximize it (default is True).
fitness_function (str, optional) – The fitness function used for evaluating individuals (default is from gp_solve_parameters).
initializer (str, optional) – The strategy for initializing the population (e.g., “grow”, “full”, “rhh”).
n_jobs (int, optional) – Number of parallel jobs to run (default is 1).
prob_const (float, optional) – The probability of introducing constants into the trees during evolution.
tree_functions (list, optional) – List of allowed functions that can appear in the trees. Check the documentation for the available functions.
tree_constants (list, optional) – List of constants allowed to appear in the trees.
test_elite (bool, optional) – Whether to test the elite individual on the test set after each generation.

slim_gsgp.utils.utils.verbose_reporter(dataset, generation, pop_val_fitness, pop_test_fitness, timing, nodes)[source]

Prints a formatted report of generation, fitness values, timing, and node count.

Parameters:

generation (int) – Current generation number.
pop_val_fitness (float) – Population’s validation fitness value.
pop_test_fitness (float) – Population’s test fitness value.
timing (float) – Time taken for the process.
nodes (int) – Count of nodes in the population.

Returns:

Outputs a formatted report to the console.

Return type:

None