slim_gsgp.utils
slim_gsgp.utils.diversity
- slim_gsgp.utils.diversity.gsgp_pop_div_from_vectors(sem_vectors)[source]
Calculate the diversity of a population from semantic vectors.
- Parameters:
sem_vectors (torch.Tensor) – The tensor of semantic vectors.
- Returns:
The average pairwise distance between semantic vectors.
- Return type:
float
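As a rough illustration of the quantity described above, the following sketch averages the distance over every unordered pair of vectors. It uses plain Python lists rather than torch tensors, and both the Euclidean metric and the averaging over unique pairs are assumptions; the library may normalise differently. The name average_pairwise_distance is hypothetical.

```python
import math
from itertools import combinations


def average_pairwise_distance(sem_vectors):
    """Mean Euclidean distance over all unordered pairs of vectors (assumed metric)."""
    pairs = list(combinations(sem_vectors, 2))
    if not pairs:
        return 0.0  # fewer than two vectors: there is no pair to compare
    total = sum(math.dist(a, b) for a, b in pairs)
    return total / len(pairs)


vectors = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
print(average_pairwise_distance(vectors))
```

A population whose individuals all share the same semantics would score 0.0 under this measure.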
- slim_gsgp.utils.diversity.niche_entropy(repr_, n_niches=10)[source]
Calculate the niche entropy of a population.
- Parameters:
repr_ (list) – The list of individuals in the population.
n_niches (int) – Number of niches to divide the population into.
- Returns:
The entropy of the distribution of individuals across niches.
- Return type:
float
Notes
https://www.semanticscholar.org/paper/Entropy-Driven-Adaptive-RoscaComputer/ab5c8a8f415f79c5ec6ff6281ed7113736615682
https://strathprints.strath.ac.uk/76488/1/Marchetti_etal_Springer_2021_Inclusive_genetic_programming.pdf
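One possible reading of niche entropy is sketched below. It assumes each individual is summarised by a single number (for example, its node count), that niches are equal-width bins over the observed range, and that the entropy uses the natural logarithm. None of these choices is confirmed by this page, and niche_entropy_sketch is a hypothetical name.

```python
import math
from collections import Counter


def niche_entropy_sketch(sizes, n_niches=10):
    """Shannon entropy of how individuals spread over equal-width size niches."""
    lo, hi = min(sizes), max(sizes)
    # Fall back to a width of 1.0 when every size is identical.
    width = (hi - lo) / n_niches or 1.0
    # Map each size to a niche index in [0, n_niches - 1].
    niches = Counter(min(int((s - lo) / width), n_niches - 1) for s in sizes)
    total = len(sizes)
    return -sum((c / total) * math.log(c / total) for c in niches.values())


print(niche_entropy_sketch([1, 1, 10, 10], n_niches=2))
```

Under these assumptions the entropy is zero when all individuals fall into one niche and grows as they spread more evenly.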
slim_gsgp.utils.logger
- slim_gsgp.utils.logger.drop_experiment_from_logger(experiment_id: str, log_path: str) → None[source]
Remove an experiment from the logger CSV file. If the given experiment_id is -1, the last saved experiment is removed.
- Parameters:
experiment_id (str or int) – The experiment ID to be removed. If -1, the most recent experiment is removed.
log_path (str) – Path to the file containing the logging information.
- Return type:
None
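The behaviour described above could be sketched as follows, using only the standard csv module. The position of the experiment ID within each row (id_column) is a hypothetical parameter introduced for illustration, every row is assumed to contain that column, and drop_rows_by_id is not the library's function.

```python
import csv


def drop_rows_by_id(log_path, experiment_id, id_column=1):
    """Rewrite the CSV at log_path without the rows of one experiment."""
    with open(log_path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows:
        return  # nothing logged yet
    if experiment_id == -1:
        # -1 means "the most recently logged experiment" (ID of the last row).
        experiment_id = rows[-1][id_column]
    kept = [row for row in rows if row[id_column] != str(experiment_id)]
    with open(log_path, "w", newline="") as f:
        csv.writer(f).writerows(kept)
```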
- slim_gsgp.utils.logger.log_settings(path: str, settings_dict: list, unique_run_id: UUID) → None[source]
Log the settings to a CSV file.
- Parameters:
path (str) – Path to the CSV file.
settings_dict (list) – Dictionary of settings.
unique_run_id (UUID) – Unique identifier for the run.
- Return type:
None
- slim_gsgp.utils.logger.logger(path: str, generation: int, elite_fit: float, timing: float, nodes: int, additional_infos: list | None = None, run_info: list | None = None, seed: int = 0) → None[source]
Logs information into a CSV file.
- Parameters:
path (str) – Path to the CSV file.
generation (int) – Current generation number.
elite_fit (float) – Elite’s validation fitness value.
timing (float) – Time taken for the process.
nodes (int) – Count of nodes in the population.
additional_infos (list, optional) – Population’s test fitness value(s) and diversity measurements. Defaults to None.
run_info (list, optional) – Information about the run. Defaults to None.
seed (int, optional) – The seed used in random, numpy, and torch libraries. Defaults to 0.
- Return type:
None
- slim_gsgp.utils.logger.merge_settings(sd1: dict, sd2: dict, sd3: dict, sd4: dict) → dict[source]
Merge multiple settings dictionaries into one.
- Parameters:
sd1 (dict) – First settings dictionary.
sd2 (dict) – Second settings dictionary.
sd3 (dict) – Third settings dictionary.
sd4 (dict) – Fourth settings dictionary.
- Returns:
Merged settings dictionary.
- Return type:
dict
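A minimal sketch of merging several settings dictionaries is shown below. The precedence rule (later dictionaries override earlier ones on duplicate keys) is an assumption, and merge_dicts is a hypothetical name.

```python
def merge_dicts(*dicts):
    """Combine dictionaries left to right into a new dictionary."""
    merged = {}
    for d in dicts:
        merged.update(d)  # assumed rule: later dictionaries win on duplicate keys
    return merged


print(merge_dicts({"pop_size": 100}, {"n_iter": 50}, {"seed": 0}, {"pop_size": 200}))
```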
slim_gsgp.utils.utils
- slim_gsgp.utils.utils.check_slim_version(slim_version)[source]
Validate the slim_gsgp version given as input by the user and assign the correct values to the parameters op, sig and trees.
- Parameters:
slim_version (str) – Name of the slim_gsgp version.
- Returns:
Parameters reflecting the kind of operation considered, the use of the sigmoid and the use of multiple trees.
- Return type:
op, sig, trees
- slim_gsgp.utils.utils.generate_random_uniform(lower, upper)[source]
Generate a random number within a specified range using numpy random.uniform.
- Parameters:
lower (float) – The lower bound of the range for generating the random number.
upper (float) – The upper bound of the range for generating the random number.
- Returns:
A function that when called, generates a random number within the specified range.
- Return type:
Callable
Notes
The returned function takes no input and returns a random float between lower and upper whenever called.
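The closure pattern described in the notes can be sketched as follows. This version uses the standard library's random.uniform in place of numpy's, and make_uniform_sampler is a hypothetical name.

```python
import random


def make_uniform_sampler(lower, upper):
    """Return a zero-argument function that draws a float between lower and upper."""
    def sampler():
        return random.uniform(lower, upper)
    return sampler


draw = make_uniform_sampler(-1.0, 1.0)
print(draw(), draw())  # two independent draws from the same range
```

Returning a function rather than a number lets each call produce a fresh value, which is useful when the sampler is stored and evaluated later.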
- slim_gsgp.utils.utils.get_best_max(population, n_elites)[source]
Get the best individuals from the population with the maximum fitness.
- Parameters:
population (Population) – The population of individuals.
n_elites (int) – Number of elites to return.
- Returns:
list – The list of elite individuals.
Individual – Best individual from the elites.
- slim_gsgp.utils.utils.get_best_min(population, n_elites)[source]
Get the best individuals from the population with the minimum fitness.
- Parameters:
population (Population) – The population of individuals.
n_elites (int) – Number of elites to return.
- Returns:
list – The list of elite individuals.
Individual – Best individual from the elites.
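Elite selection for both the maximisation and minimisation cases might look like the sketch below. It assumes each individual exposes a fitness attribute; Ind is a hypothetical stand-in for the library's Individual, and a plain list replaces Population.

```python
import heapq
from collections import namedtuple

# Hypothetical stand-in for an individual with a fitness value.
Ind = namedtuple("Ind", ["name", "fitness"])


def best_min(individuals, n_elites):
    """Return the n_elites lowest-fitness individuals and the single best one."""
    elites = heapq.nsmallest(n_elites, individuals, key=lambda ind: ind.fitness)
    return elites, elites[0]


def best_max(individuals, n_elites):
    """Return the n_elites highest-fitness individuals and the single best one."""
    elites = heapq.nlargest(n_elites, individuals, key=lambda ind: ind.fitness)
    return elites, elites[0]


pop = [Ind("a", 3.0), Ind("b", 1.0), Ind("c", 2.0)]
print(best_min(pop, 2))
print(best_max(pop, 2))
```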
- slim_gsgp.utils.utils.get_random_tree(max_depth, FUNCTIONS, TERMINALS, CONSTANTS, inputs, p_c=0.3, grow_probability=1, logistic=True)[source]
Get a random tree using either grow or full method.
- Parameters:
max_depth (int) – Maximum depth of the tree.
FUNCTIONS (dict) – Dictionary of functions.
TERMINALS (dict) – Dictionary of terminals.
CONSTANTS (dict) – Dictionary of constants.
inputs (torch.Tensor) – Input tensor for calculating semantics.
p_c (float, default=0.3) – Probability of choosing a constant.
grow_probability (float, default=1) – Probability of using the grow method.
logistic (bool, default=True) – Whether to use logistic semantics.
- Returns:
The generated random tree.
- Return type:
- slim_gsgp.utils.utils.get_terminals(X)[source]
Get terminal nodes for a dataset.
- Parameters:
X (torch.Tensor) – The tensor from which the set of TERMINALS is derived; the terminals correspond to its columns.
- Returns:
Dictionary of terminal nodes.
- Return type:
dict
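Since terminals correspond to the columns of X, one plausible shape for the returned dictionary is a mapping from a generated name to a column index. The "x0", "x1", ... naming scheme below is an assumption, as is the use of a nested list in place of a tensor; terminals_from_columns is a hypothetical name.

```python
def terminals_from_columns(X):
    """Map one terminal name per column of X to that column's index."""
    n_columns = len(X[0])
    return {f"x{i}": i for i in range(n_columns)}  # naming scheme is assumed


print(terminals_from_columns([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
```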
- slim_gsgp.utils.utils.gs_rmse(y_true, y_pred)[source]
Calculate the root mean squared error.
- Parameters:
y_true (array-like) – True values.
y_pred (array-like) – Predicted values.
- Returns:
The root mean squared error.
- Return type:
float
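For reference, the root mean squared error is the square root of the mean of the squared differences between true and predicted values. The sketch below computes it on plain Python sequences rather than tensors; rmse is a hypothetical name, not the library's function.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean squared difference between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


print(rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0]))  # → 2.0
```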
- slim_gsgp.utils.utils.gs_size(y_true, y_pred)[source]
Get the size of the predicted values.
- Parameters:
y_true (array-like) – True values.
y_pred (array-like) – Predicted values.
- Returns:
The size of the predicted values.
- Return type:
int
- slim_gsgp.utils.utils.mean_(x1, x2)[source]
Compute the mean of two tensors.
- Parameters:
x1 (torch.Tensor) – The first tensor.
x2 (torch.Tensor) – The second tensor.
- Returns:
The mean of the two tensors.
- Return type:
torch.Tensor
- slim_gsgp.utils.utils.protected_div(x1, x2)[source]
Implements division protected against a zero denominator.
Performs division between x1 and x2. If x2 is (or has) zero(s), the function returns the numerator’s value(s).
- Parameters:
x1 (torch.Tensor) – The numerator.
x2 (torch.Tensor) – The denominator.
- Returns:
Result of protected division between x1 and x2.
- Return type:
torch.Tensor
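The element-wise rule described above can be sketched on plain lists as follows. Testing for an exact zero is an assumption; the library may instead treat denominators below some small threshold as zero. protected_div_sketch is a hypothetical name.

```python
def protected_div_sketch(x1, x2):
    """Divide element-wise; where the denominator is zero, return the numerator."""
    return [n / d if d != 0 else n for n, d in zip(x1, x2)]


print(protected_div_sketch([6.0, 5.0, -4.0], [2.0, 0.0, 8.0]))  # → [3.0, 5.0, -0.5]
```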
- slim_gsgp.utils.utils.show_individual(tree, operator)[source]
Display an individual’s structure with a specified operator.
- Parameters:
tree (Tree) – The tree representing the individual.
operator (str) – The operator to display (‘sum’ or ‘prod’).
- Returns:
The string representation of the individual’s structure.
- Return type:
str
- slim_gsgp.utils.utils.tensor_dimensioned_sum(dim)[source]
Generate a sum function over a specified dimension.
- Parameters:
dim (int) – The dimension to sum over.
- Returns:
A function that sums tensors over the specified dimension.
- Return type:
function
- slim_gsgp.utils.utils.train_test_split(X, y, p_test=0.3, shuffle=True, indices_only=False, seed=0)[source]
Splits X and y tensors into train and test subsets.
This method replicates the behaviour of scikit-learn’s ‘train_test_split’.
- Parameters:
X (torch.Tensor) – Input data instances.
y (torch.Tensor) – Target vector.
p_test (float (default=0.3)) – The proportion of the dataset to include in the test split.
shuffle (bool (default=True)) – Whether to shuffle the data before splitting.
indices_only (bool (default=False)) – Whether to return only the indices representing training and test partition.
seed (int (default=0)) – The seed for the random number generators.
- Returns:
X_train (torch.Tensor) – Training data instances.
y_train (torch.Tensor) – Training target vector.
X_test (torch.Tensor) – Test data instances.
y_test (torch.Tensor) – Test target vector.
train_indices (torch.Tensor) – Indices representing the training partition.
test_indices (torch.Tensor) – Indices representing the test partition.
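The index-level logic behind such a split (what indices_only=True would expose) can be sketched with the standard library. Rounding the test size down and placing the test indices first are assumptions, and split_indices is a hypothetical name.

```python
import math
import random


def split_indices(n, p_test=0.3, shuffle=True, seed=0):
    """Return (train_indices, test_indices) for a dataset with n rows."""
    indices = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_test = math.floor(n * p_test)  # assumed: test size is rounded down
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(10)
print(len(train_idx), len(test_idx))  # → 7 3
```

The two index lists can then be used to select the matching rows of X and y, which keeps instances and targets aligned.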
- slim_gsgp.utils.utils.validate_inputs(X_train, y_train, X_test, y_test, pop_size, n_iter, elitism, n_elites, init_depth, log_path, prob_const, tree_functions, tree_constants, log, verbose, minimization, n_jobs, test_elite, fitness_function, initializer, tournament_size)[source]
Validates the inputs based on the specified conditions.
- Parameters:
tournament_size
X_train (torch.Tensor) – Training input data.
y_train (torch.Tensor) – Training output data.
X_test (torch.Tensor, optional) – Testing input data.
y_test (torch.Tensor, optional) – Testing output data.
pop_size (int, optional) – The population size for the genetic programming algorithm (default is 100).
n_iter (int, optional) – The number of iterations for the genetic programming algorithm (default is 100).
elitism (bool, optional) – Indicate the presence or absence of elitism.
n_elites (int, optional) – The number of elites.
init_depth (int, optional) – The depth value for the initial GP trees population.
log_path (str, optional) – The path where the log directory is created and where results are saved.
log (int, optional) – Level of detail to utilize in logging.
verbose (int, optional) – Level of detail to include in console output.
minimization (bool, optional) – If True, the objective is to minimize the fitness function. If False, maximize it (default is True).
fitness_function (str, optional) – The fitness function used for evaluating individuals (default is from gp_solve_parameters).
initializer (str, optional) – The strategy for initializing the population (e.g., “grow”, “full”, “rhh”).
n_jobs (int, optional) – Number of parallel jobs to run (default is 1).
prob_const (float, optional) – The probability of introducing constants into the trees during evolution.
tree_functions (list, optional) – List of allowed functions that can appear in the trees. Check the documentation for the available functions.
tree_constants (list, optional) – List of constants allowed to appear in the trees.
test_elite (bool, optional) – Whether to test the elite individual on the test set after each generation.
- slim_gsgp.utils.utils.verbose_reporter(dataset, generation, pop_val_fitness, pop_test_fitness, timing, nodes)[source]
Prints a formatted report of generation, fitness values, timing, and node count.
- Parameters:
generation (int) – Current generation number.
pop_val_fitness (float) – Population’s validation fitness value.
pop_test_fitness (float) – Population’s test fitness value.
timing (float) – Time taken for the process.
nodes (int) – Count of nodes in the population.
- Returns:
Outputs a formatted report to the console.
- Return type:
None