{ "cells": [ { "cell_type": "markdown", "id": "95097ed3", "metadata": {}, "source": [ "# Auxiliary Functions" ] }, { "cell_type": "markdown", "id": "2d5148d1", "metadata": {}, "source": [ "{nb-download}`Download this as a Jupyter notebook `" ] }, { "cell_type": "markdown", "id": "7dc47ae6", "metadata": {}, "source": [ "This notebook covers the usage of the auxiliary functions provided in `pyXla`." ] }, { "cell_type": "markdown", "id": "6c4b2b10", "metadata": {}, "source": [ "A key feature of the `pyXla` framework is the separation of sampling and analysis. A set of functions is provided to support sampling:\n", "\n", "1. {func}`pyxla.util.sample_X`\n", "1. {func}`pyxla.util.compute_F`\n", "1. {func}`pyxla.util.compute_V`\n", "1. {func}`pyxla.util.compute_D`\n", "1. {func}`pyxla.util.compute_N`\n", "\n", "Each function produces the input file indicated by its suffix; for example, `compute_F` produces the `F` file.\n", "\n", "When a sample is loaded declaratively via domain and function specification (method 3 in {doc}`loading_and_sampling`), these functions are used under the hood. They are also available to the user for finer control over sampling." ] }, { "cell_type": "markdown", "id": "8410c616", "metadata": {}, "source": [ "The auxiliary functions are all imported from {mod}`pyxla.util`:" ] }, { "cell_type": "code", "execution_count": 1, "id": "3d0b6a3f", "metadata": {}, "outputs": [], "source": [ "from pyxla.util import sample_X, compute_F, compute_V, compute_D, compute_N" ] }, { "cell_type": "markdown", "id": "9c293d9f", "metadata": {}, "source": [ "An `X` file can be generated from a sampler as follows:" ] }, { "cell_type": "code", "execution_count": 2, "id": "428cac3a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   x0         x1
0  6.453459   12.845543
1  3.665404   25.092207
2  18.456522  15.515011
3  19.369122  10.219224
4  15.620123  8.727153
\n", "
" ], "text/plain": [ " x0 x1\n", "0 6.453459 12.845543\n", "1 3.665404 25.092207\n", "2 18.456522 15.515011\n", "3 19.369122 10.219224\n", "4 15.620123 8.727153" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pyxla.sampling import HilbertCurveSampler\n", "\n", "sampler = HilbertCurveSampler(\n", " sample_size=100, dim=2, return_neighbourhood=True # will return an N file too\n", ")\n", "\n", "X, N = sample_X(sampler)\n", "X.head()" ] }, { "cell_type": "markdown", "id": "98dffd81", "metadata": {}, "source": [ "Specifying `return_neighbourhood=True` generates an `N` file as well:" ] }, { "cell_type": "code", "execution_count": 3, "id": "b7c28974", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   id1  id2
0  0    1
1  1    2
2  2    3
3  3    4
4  4    5
\n", "
" ], "text/plain": [ " id1 id2\n", "0 0 1\n", "1 1 2\n", "2 2 3\n", "3 3 4\n", "4 4 5" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "N.head()" ] }, { "cell_type": "markdown", "id": "bae808cc", "metadata": {}, "source": [ "The `F` file can be generated from the `X` file by supplying one or more objective functions:" ] }, { "cell_type": "code", "execution_count": 4, "id": "1fb486af", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   f0          f1
0  206.655109  19.299002
1  643.054056  28.757612
2  581.358767  33.971533
3  479.595420  29.588346
4  320.151425  24.347275
\n", "
" ], "text/plain": [ " f0 f1\n", "0 206.655109 19.299002\n", "1 643.054056 28.757612\n", "2 581.358767 33.971533\n", "3 479.595420 29.588346\n", "4 320.151425 24.347275" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def sphere(X):\n", " return X[0]**2 + X[1]**2\n", "\n", "def summation(X):\n", " return X.sum()\n", "\n", "F = compute_F([sphere, summation], X)\n", "F.head()" ] }, { "cell_type": "markdown", "id": "e5c9938b", "metadata": {}, "source": [ "The process is similar for the `V` file:" ] }, { "cell_type": "code", "execution_count": 5, "id": "f70200c0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   v0
0  204.655109
1  641.054056
2  579.358767
3  477.595420
4  318.151425
\n", "
" ], "text/plain": [ " v0\n", "0 204.655109\n", "1 641.054056\n", "2 579.358767\n", "3 477.595420\n", "4 318.151425" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "V = compute_V([lambda X: sphere(X) - 2], X)\n", "V.head()" ] }, { "cell_type": "markdown", "id": "4eb5012b", "metadata": {}, "source": [ "Computing a `D` file is straightforward:" ] }, { "cell_type": "code", "execution_count": 6, "id": "077bfb78", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
          d
id1  id2
0    1    0.598340
     2    0.575984
     3    0.614037
     4    0.606185
     5    1.430009
\n", "
" ], "text/plain": [ " d\n", "id1 id2 \n", "0 1 0.598340\n", " 2 0.575984\n", " 3 0.614037\n", " 4 0.606185\n", " 5 1.430009" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D = compute_D({\"X\": X}, metric='canberra') # you can define a function or specify any of scipy's distance functions\n", "D.head()" ] }, { "cell_type": "markdown", "id": "801a7b8b", "metadata": {}, "source": [ "The `N` file is equally straightforward: either specify a neighbourhood function, or supply the literal `\"hilbert-curve\"` to use the Hilbert curve to efficiently generate neighbourhood information (see {func}`pyxla.sampling.hilbert_curve_neighbour_sampling`)." ] }, { "cell_type": "code", "execution_count": 7, "id": "3b210d54", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   id1  id2
0  0    1
1  0    2
2  0    3
3  0    4
4  0    5
\n", "
" ], "text/plain": [ " id1 id2\n", "0 0 1\n", "1 0 2\n", "2 0 3\n", "3 0 4\n", "4 0 5" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import random\n", "\n", "def randomly_neighbours(a, b):\n", "    # toy neighbourhood function: ignores both points and picks True or False at random\n", "    return random.choice([True, False])\n", "\n", "N = compute_N({\"X\": X}, neighbourhood_func=randomly_neighbours)\n", "N.head()" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.14.4" } }, "nbformat": 4, "nbformat_minor": 5 }