Auxiliary Functions

This notebook covers the usage of the auxiliary functions provided in pyXla.

A key feature of the pyXla framework is the separation of sampling and analysis. A set of functions is provided to support sampling: sample_X, compute_F, compute_V, compute_D, and compute_N.

Each function corresponds to a given input file, as indicated by its suffix.

When a sample is loaded declaratively via domain and function specification (method 3 in Loading and Sampling), these functions are used under the hood. They are also available to the user for finer control over sampling.

The auxiliary functions are all imported from pyxla.util:

from pyxla.util import sample_X, compute_F, compute_V, compute_D, compute_N

One can generate an X file as below:

from pyxla.sampling import HilbertCurveSampler

sampler = HilbertCurveSampler(
    sample_size=100, dim=2, return_neighbourhood=True # will return an N file too
)

X, N = sample_X(sampler)
X.head()

	x0	x1
0	2.842576	22.228536
1	2.167817	14.444034
2	10.415036	19.279644
3	11.483241	17.699912
4	17.678508	16.553606

Specifying return_neighbourhood=True generates an N file as well:

N.head()

	id1	id2
0	0	1
1	1	2
2	2	3
3	3	4
4	4	5
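The N.head() output above pairs each row with the one that follows it, which is what one would expect if the points are ordered along the curve and neighbours are taken as consecutive indices. A minimal plain-Python sketch of that pairing, under this assumption (the helper name consecutive_pairs is hypothetical and not part of pyXla):

```python
def consecutive_pairs(n_points):
    """Return (id1, id2) pairs linking each index to its successor."""
    return [(i, i + 1) for i in range(n_points - 1)]


# Six points give five links, matching the shape of the N.head() output.
pairs = consecutive_pairs(6)
print(pairs)
```

This only illustrates the structure of the pairs; how pyXla derives them internally may differ.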

The F file can be generated from the X file by specifying an objective function or multiple objective functions:

def sphere(X):
    return X[0]**2 + X[1]**2

def summation(X):
    return X.sum()

F = compute_F([sphere, summation], X)
F.head()

	f0	f1
0	502.188060	25.071112
1	213.329562	16.611852
2	480.177659	29.694680
3	445.151707	29.183153
4	586.551520	34.232114
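The values in F are consistent with each objective being evaluated once per row of X, giving one column per objective. The sketch below reproduces the first two rows in plain Python from the rounded X values shown earlier. The helper evaluate_rowwise is hypothetical and is not pyXla's implementation.

```python
# First two rows of X as displayed above (rounded to six decimals).
rows = [(2.842576, 22.228536), (2.167817, 14.444034)]


def sphere(x):
    return x[0] ** 2 + x[1] ** 2


def summation(x):
    return sum(x)


def evaluate_rowwise(funcs, rows):
    """Evaluate every objective on every row: one output column per objective."""
    return [[f(row) for f in funcs] for row in rows]


F_rows = evaluate_rowwise([sphere, summation], rows)
for values in F_rows:
    print([round(v, 3) for v in values])
```

Because the inputs are rounded, the results agree with the f0 and f1 columns only to about five decimal places.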

The process is similar for the V file:

V = compute_V([lambda X: sphere(X) - 2], X)
V.head()

	v0
0	500.188060
1	211.329562
2	478.177659
3	443.151707
4	584.551520

Computing a D file is straightforward:

D = compute_D({"X": X}, metric='canberra') # you can define a function or specify any of scipy's distance functions
D.head()

		d
id1	id2
0	1	0.346942
	2	0.642222
	3	0.716572
	4	0.869289
	5	0.823582
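To make the metric concrete, here is a plain-Python sketch of the Canberra distance applied to rows 0 and 1 of X. It follows the usual convention of two points in, one number out, with 0/0 terms counted as zero. Whether compute_D expects exactly this signature for a user-defined metric is an assumption.

```python
def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom:  # treat 0/0 as contributing nothing
            total += abs(a - b) / denom
    return total


# Rows 0 and 1 of X as displayed above (rounded to six decimals).
x_0 = (2.842576, 22.228536)
x_1 = (2.167817, 14.444034)
print(round(canberra(x_0, x_1), 4))
```

The result should be close to the value of d listed for the pair (0, 1) in the D output.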

The N file is equally straightforward: either specify a neighbourhood function or supply a literal, "hilbert-curve" or "X-index".

"hilbert-curve" uses the Hilbert curve to efficiently generate neighbourhood information (see pyxla.sampling.hilbert_curve_neighbour_sampling()).

"X-index" extracts neighbourhood from the default index of a DataFrame.

import random

def randomly_neighbours(a, b):
    # Ignores both points and flips a coin, so the result differs between runs.
    return random.choice([True, False])

N = compute_N({"X": X}, neighbourhood_func=randomly_neighbours)
N.head()

	id1	id2
0	0	2
1	0	7
2	0	8
3	0	9
4	0	12
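The output above lists pairs with id1 < id2, which is what one would get by testing a pairwise predicate on every unordered pair of points and keeping the pairs for which it returns True. The sketch below illustrates that idea with a deterministic predicate. Both neighbour_pairs and within_radius are hypothetical names, and it is an assumption that the neighbourhood function receives the two points themselves.

```python
from itertools import combinations


def within_radius(a, b, radius=5.0):
    """Hypothetical predicate: neighbours if Euclidean distance <= radius."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5 <= radius


def neighbour_pairs(points, predicate):
    """Return (id1, id2) index pairs with id1 < id2 for which predicate is true."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(points), 2)
        if predicate(a, b)
    ]


# Rows 0 to 3 of X as displayed earlier (rounded to six decimals).
points = [
    (2.842576, 22.228536),
    (2.167817, 14.444034),
    (10.415036, 19.279644),
    (11.483241, 17.699912),
]
print(neighbour_pairs(points, within_radius))
```

A deterministic predicate such as this one gives the same N file on every run, unlike the random example above.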