{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "95097ed3",
   "metadata": {},
   "source": [
    "# Auxiliary Functions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d5148d1",
   "metadata": {},
   "source": [
    "{nb-download}`Download this as a Jupyter notebook <auxiliary_functions.ipynb>`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7dc47ae6",
   "metadata": {},
   "source": [
    "This notebook covers the usage of the auxiliary function provided in `pyXla`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c4b2b10",
   "metadata": {},
   "source": [
    "A key feature of the `pyXla` framework is the separation of sampling and analysis. A set of functions are provided to support sampling. They are:\n",
    "\n",
    "1. {func}`pyxla.util.sample_X`\n",
    "1. {func}`pyxla.util.compute_F`\n",
    "1. {func}`pyxla.util.compute_V`\n",
    "1. {func}`pyxla.util.compute_D`\n",
    "1. {func}`pyxla.util.compute_N`\n",
    "\n",
    "Each function corresponds to a given input file as indicated by it suffix.\n",
    "\n",
    "When a sample is loaded declaratively via domain and function specification (method 3 in {doc}`loading_and_sampling`), these function are used under the hood. These function are available to the user for finer control over sampling."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8410c616",
   "metadata": {},
   "source": [
    "The auxiliary functions are all imported from {mod}`pyxla.util`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "3d0b6a3f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyxla.util import sample_X, compute_F, compute_V, compute_D, compute_N"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c293d9f",
   "metadata": {},
   "source": [
    "One can generate an `X` file as below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "428cac3a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>x0</th>\n",
       "      <th>x1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6.453459</td>\n",
       "      <td>12.845543</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3.665404</td>\n",
       "      <td>25.092207</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>18.456522</td>\n",
       "      <td>15.515011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>19.369122</td>\n",
       "      <td>10.219224</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>15.620123</td>\n",
       "      <td>8.727153</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          x0         x1\n",
       "0   6.453459  12.845543\n",
       "1   3.665404  25.092207\n",
       "2  18.456522  15.515011\n",
       "3  19.369122  10.219224\n",
       "4  15.620123   8.727153"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pyxla.sampling import HilbertCurveSampler\n",
    "\n",
    "sampler = HilbertCurveSampler(\n",
    "    sample_size=100, dim=2, return_neighbourhood=True # will return an N file too\n",
    ")\n",
    "\n",
    "X, N = sample_X(sampler)\n",
    "X.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98dffd81",
   "metadata": {},
   "source": [
    "Specifying `return_neighbourhood=True` generates an `N` file as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "b7c28974",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id1</th>\n",
       "      <th>id2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   id1  id2\n",
       "0    0    1\n",
       "1    1    2\n",
       "2    2    3\n",
       "3    3    4\n",
       "4    4    5"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "N.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bae808cc",
   "metadata": {},
   "source": [
    "The `F` file can be generated from the `X` file by specifying an objective function or multiple objective functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "1fb486af",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>f0</th>\n",
       "      <th>f1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>206.655109</td>\n",
       "      <td>19.299002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>643.054056</td>\n",
       "      <td>28.757612</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>581.358767</td>\n",
       "      <td>33.971533</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>479.595420</td>\n",
       "      <td>29.588346</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>320.151425</td>\n",
       "      <td>24.347275</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           f0         f1\n",
       "0  206.655109  19.299002\n",
       "1  643.054056  28.757612\n",
       "2  581.358767  33.971533\n",
       "3  479.595420  29.588346\n",
       "4  320.151425  24.347275"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def sphere(X):\n",
    "    return X[0]**2 + X[1]**2\n",
    "\n",
    "def summation(X):\n",
    "    return X.sum()\n",
    "\n",
    "F = compute_F([sphere, summation], X)\n",
    "F.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e5c9938b",
   "metadata": {},
   "source": [
    "The process is similar for the `V` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "f70200c0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>v0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>204.655109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>641.054056</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>579.358767</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>477.595420</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>318.151425</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           v0\n",
       "0  204.655109\n",
       "1  641.054056\n",
       "2  579.358767\n",
       "3  477.595420\n",
       "4  318.151425"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "V = compute_V([lambda X: sphere(X) - 2], X)\n",
    "V.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4eb5012b",
   "metadata": {},
   "source": [
    "Computing a `D` file is straightforward:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "077bfb78",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>d</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id1</th>\n",
       "      <th>id2</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">0</th>\n",
       "      <th>1</th>\n",
       "      <td>0.598340</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.575984</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.614037</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.606185</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>1.430009</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                d\n",
       "id1 id2          \n",
       "0   1    0.598340\n",
       "    2    0.575984\n",
       "    3    0.614037\n",
       "    4    0.606185\n",
       "    5    1.430009"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "D = compute_D({\"X\": X}, metric='canberra') # you can define a function or specify any of scipy's distance functions\n",
    "D.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "801a7b8b",
   "metadata": {},
   "source": [
    "The `N` file is equally straightforward; either specify a neighbourhood function of supply the literal `\"hilbert-curve\"` to use the Hilbert curve to efficiently generate neighbourhood information (see {func}`pyxla.sampling.hilbert_curve_neighbour_sampling`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3b210d54",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id1</th>\n",
       "      <th>id2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   id1  id2\n",
       "0    0    1\n",
       "1    0    2\n",
       "2    0    3\n",
       "3    0    4\n",
       "4    0    5"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def randomly_neighbours(a, b):\n",
    "    import random\n",
    "    return random.choice([True, False])\n",
    "\n",
    "N = compute_N({\"X\": X}, neighbourhood_func=randomly_neighbours)\n",
    "N.head()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}