Comparing Continuous Problems¶

Download this as a Jupyter notebook

This example notebook shows how pyXla distinguishes between the sphere and deceptive problems.

The sphere function is defined as:

(1)¶\[f_1(x) = x^2,\]

and is visualised below:

The deceptive problem function is defined as:

(2)¶\[\begin{split} f_2(x) = \begin{cases} (0.4x - 2)^2 & \text{if } x > 0.625\\ (10x - 4)^2 - 2 & \text{otherwise}, \end{cases}\end{split}\]

and is visualised below:

Equation (2) is called deceptive because it has two local minima as shown in its visualisation above.

To begin, we load the two problems as samples:

from pyxla import load_data
from pyxla.sampling import HilbertCurveSampler

sphere_sample = {
    "name": "Sphere",
    "X": HilbertCurveSampler(sample_size=100, dim=1, l_bound=-5, u_bound=5),
    "F": lambda x: x**2
}

deceptive_sample = {
    "name": "Deceptive",
    "X": HilbertCurveSampler(sample_size=100, dim=1, l_bound=0, u_bound=6),
    "F": lambda x: (0.4 * x - 2) ** 2 if x > 0.625 else (10 * x - 4) ** 2 - 2
}

load_data(sphere_sample) # loaded in-place
load_data(deceptive_sample) # loaded in-place

WARNING:root:The Hilbert curve with dimension 1 is just a number line. You are sampling around points on a number line.

WARNING:root:The Hilbert curve with dimension 1 is just a number line. You are sampling around points on a number line.

Compare their objective distribution:¶

from pyxla import distr_f

feat, plot = distr_f(sphere_sample)
feat

{'f0_min': 0.0004981582236944362,
 'f0_max': 25.0,
 'f0_mean': 8.775342857034197,
 'f0_med': 7.082262541307175,
 'f0_q1': 1.5691808708098762,
 'f0_q3': 14.963831690474318,
 'f0_sd': 7.7932466371195765,
 'f0_skew': 0.5891926049831345,
 'f0_kurt': -0.9508927602010697,
 'f0_rank_min': 1,
 'f0_rank_max': 100,
 'f0_rank_mean': 50.49,
 'f0_rank_med': 50.5,
 'f0_rank_q1': 25.25,
 'f0_rank_q3': 75.75,
 'f0_rank_sd': 29.028545686039195,
 'f0_rank_skew': -0.00195083703523429,
 'f0_rank_kurt': -1.197706288002058}

../../_images/ac03babfcb3b451c5fd6a87ba7edeaef6764acde90186d217e8c3f5545310e21.png

feat, plot = distr_f(deceptive_sample)
feat

{'f0_min': -1.9388984378949934,
 'f0_max': 14.0,
 'f0_mean': 0.9722300997922091,
 'f0_med': 0.4296740956498242,
 'f0_q1': 0.05381754728009795,
 'f0_q3': 1.5110723390074454,
 'f0_sd': 1.8990680679351262,
 'f0_skew': 3.9627638197474186,
 'f0_kurt': 22.626186986306745,
 'f0_rank_min': 1,
 'f0_rank_max': 100,
 'f0_rank_mean': 50.5,
 'f0_rank_med': 50.5,
 'f0_rank_q1': 25.25,
 'f0_rank_q3': 75.75,
 'f0_rank_sd': 29.011491975882016,
 'f0_rank_skew': 0.0,
 'f0_rank_kurt': -1.2002400240024003}

../../_images/0ef291fb40f98e87b225bad87754190cf6d330d8b2bab9c7848927612b667d67.png

Fitness distance correlation (FDC) reveals more:¶

from pyxla import fdc

corr, plot = fdc(sphere_sample)
corr

WARNING:root:No D file is present, thus, computing the D file... Computing an entire D file can be time consuming. Instead, you can call the function with the keyword argument `compute_D_file` set to `False` to speed up computation, as only the required distances will be calculated.

INFO:root:D file has been loaded to the current sample and is saved to ./Sphere_D.csv

{'fdc_f0': np.float64(0.9997329725006666)}

../../_images/9d0155575e0503ebea1984a6dc04eb5e265f85eeee82ec8b397716016e3a748d.png

According to the output above, the sphere problem shows a high fitness-distance correlation between the objective values and distance to the nearest local minimum. This means that the search space is not deceptive, in that a search in the solution space will likely arrive at the global minimum since solutions become fitter and fitter as the search gets closer to the global minimum.

corr, plot = fdc(deceptive_sample)
corr

WARNING:root:No D file is present, thus, computing the D file... Computing an entire D file can be time consuming. Instead, you can call the function with the keyword argument `compute_D_file` set to `False` to speed up computation, as only the required distances will be calculated.

INFO:root:D file has been loaded to the current sample and is saved to ./Deceptive_D.csv

{'fdc_f0': np.float64(-0.6049924992499249)}

../../_images/ee157408a4244811b3e4e72f25592f190269604263541cc35fee8ec8f3374352.png

Conversely, the output above shows a negative correlation between distance to the nearest local minimum and objective values for the deceptive problem. Therefore, as one nears the global minimum, solutions tend to deteriorate in fitness, signalling the presence of multiple local minima. In this case, pyxla.fdc() indicates that the problem is indeed deceptive.

Rank distance correlation (RDC)¶

pyxla.rdc() corrobotates pyxla.fdc() since it operates on the objective ranks rather than the values.

from pyxla import rdc

corr, plot = rdc(sphere_sample)
corr

{'rdc_f0': np.float64(0.9997329725006666)}

../../_images/abeac1e24a669f9be4a6fd611e37b8c0469def5f28d0d5d7397c2f765cd57ea2.png

corr, plot = rdc(deceptive_sample)
corr

{'rdc_f0': np.float64(-0.6049924992499249)}

../../_images/70733fc3186405d6839b607ff47560f69c8851b6b2d60b6e4ba7062fd0e5c2ff.png

Pairwise distance correlation (PDC)¶

from pyxla import pdc

corr, plot = pdc(sphere_sample)
corr

{'pdc_F': np.float64(0.4424863038086047),
 'pdc_f0-rank': np.float64(0.34639998516274456)}

../../_images/362e86ad72c2e7ae842dca684e0897b8ecbdde09500c649b82c0b268116c6d96.png

pyxla.pdc() shows that for the sphere function, there is low correlation between pairwise distance on the solution space and pairwise distance on the objective space. This shows that the sphere function is not monotonic.

corr, plot = pdc(deceptive_sample)
corr

{'pdc_F': np.float64(0.625330939163191),
 'pdc_f0-rank': np.float64(0.6639146001272278)}

../../_images/9f4f41e8eec23184c66d2434a369839727064331518e5f8e6d26c9e98c9e2c95.png

Similarly the deceptive function it not monotonic and appears more rugged that the sphere function.

Neighbouring solutions’ objective values (fitness) correlation (NFC)¶

from pyxla import nfc

corr, plot = nfc(sphere_sample)
corr

{'nfc_all_X_for_f0': np.float64(0.9971552257266543)}

../../_images/41384df60092b17bdeab40a0eec56be7b58688d8cfc70fba2b5b70b27e3bcd75.png

The constrained sphere problem is found to have high evolvability, evidenced by a high NFC value. This is expected, since the sphere function is one of the simplest continuous optimisation problems.

corr, plot = nfc(deceptive_sample)
corr

{'nfc_all_X_for_f0': np.float64(0.9306122448979591)}

../../_images/c7e4cbd634b9486eba1d691fed7e1d5e9d0960849b366c1d4344a5a888c1eb67.png

The deceptive problem has a high NFC but a lower value that the sphere function as expected, since the sphere problem is an easier problem.

Neighbouring solutions’ ranks correlation (NRC)¶

pyxla.nrc() corroborates pyxla.nfc() since it is based on the objective ranks.

from pyxla import nrc

corr, plot = nrc(sphere_sample)
corr

{'NRC_for_f0_ranks': '0.9972'}

../../_images/a1891b4627bb115036cb942159a7231e05990b35bbe312e9eefd23c0eebbd626.png

corr, plot = nrc(deceptive_sample)
corr

{'NRC_for_f0_ranks': '0.9306'}

../../_images/5f338292b9967971fe67756bbf36fda79ed1b83ae41ac793b1c84efd5ea4a172.png

Dispersion of best solution (`disp_best`)¶

pyxla.disp_best() can tell if the problem has a global structure or whether it has many funnels.

from pyxla import disp_best

corr, plot = disp_best(sphere_sample)
corr

/builds/aliefooghe/pyxla-wg/src/pyxla/__init__.py:1403: RuntimeWarning: divide by zero encountered in log
  forward = lambda x: np.log(x / init_percentage) / np.log(growth_factor)

{'f0': np.float64(-2.33027833865432)}

../../_images/370bd7f485477f464118abc58d03e171b0746b0d48713d260baab96559e68b1a.png

corr, plot = disp_best(deceptive_sample)
corr

/builds/aliefooghe/pyxla-wg/src/pyxla/__init__.py:1403: RuntimeWarning: divide by zero encountered in log
  forward = lambda x: np.log(x / init_percentage) / np.log(growth_factor)

{'f0': np.float64(0.6570628896308044)}

../../_images/942afd430f92a7833e4b4844c2e587d1df2e01ca7ce1d70984115001e5937370.png

The sphere problem’s objective has a negative dispersion metric value which indicates the presence of a global structure, as expected. The deceptive problem, on the other hand, has a positive dispersion metric value, indicating the presence of funnels. The scatter plots visually express the dispersion, where for the constrained sphere problem, the spread of pairwise distances grows as the sample size of the best solution is increased. Conversely, for the deceptive problem, the gaps between the clusters of points for lower sample sizes show that indeed the best solutions are much further apart, accounting for the lack of a global structure.