{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Foraging locust with dynamical systems (data)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "To demonstrate how dynamical systems models can be used to reason about foraging behavior, we analyze data from desert locusts, available at https://zenodo.org/records/7541780, collected as part of the paper Information integration for decision-making in desert locusts by Günzel, Oberhauser, and Couzin-Fuchs. The main result we aim to replicate with this analysis is that locusts' foraging decisions are based on a combination of individually derived and socially derived information.\n", "\n", "This notebook focuses on initial data processing. We load the locust data (training and test sets) and compartmentalize it to prepare for further dynamical systems analysis, which models the transition rates at which agents move between different compartments. The dynamical systems models are demonstrated in subsequent notebooks.\n", "\n", "Please go through the locust dynamical systems notebooks in this order:\n", "\n", "- [locust_ds_data.ipynb](./locust_ds_data.ipynb)\n", "\n", "- [locust_ds_class.ipynb](./locust_ds_class.ipynb)\n", "\n", "- [locust_ds_validate.ipynb](./locust_ds_validate.ipynb)\n", "\n", "- [locust_ds_interpret.ipynb](./locust_ds_interpret.ipynb)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import time\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import pyro\n", "import seaborn as sns\n", "\n", "pyro.settings.set(module_local_params=True)\n", "\n", "sns.set_style(\"white\")\n", "\n", "import plotly.io as pio\n", "\n", "pio.renderers.default = \"notebook\"\n", "\n", "seed = 123\n", "pyro.clear_param_store()\n", "pyro.set_rng_seed(seed)\n", "\n", "from collab.foraging import locust as lc\n", "from collab.foraging import toolkit as ft\n", "from collab.utils import find_repo_root\n", "\n", "root = find_repo_root()\n", "\n", "# users can ignore smoke_test.\n", "# it's for automatic testing on GitHub, to make sure the\n", "# notebook runs on future updates to the repository\n", "smoke_test = \"CI\" in os.environ\n", "subset_starts = 1 # 420\n", "subset_ends = 30 if smoke_test else 900\n", "desired_frames = 30 if smoke_test else 180\n", "notebook_starts = time.time()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start by taking two of the datasets (to serve as training and validation datasets), with the same number of agents and the same reward placement, but quite different starting positions. We load, clean, and animate the data. Cleaning involves:\n", "- discretizing the arena into a grid of a specific size and rewriting agents' positions accordingly.\n", "- adding the reward positions described in the original project documentation to the dataframe.\n", "- extracting every k-th frame so that the total number of frames is `desired_frames`. The recording was made at 25 frames per second, for 45000 frames in total; the locusts in the experiment move slowly enough that this lower temporal resolution is adequate.\n", "- potentially picking a subset of the dataset, determined by `subset_starts` and `subset_ends`.\n", " " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "original_frames: 45000\n", "original_shape: (675000, 4)\n", "resulting_frames: 180\n", "resulting_shape: (2700, 4)\n", "min_time 1\n", "max_time 180\n", "original_frames: 45000\n", "original_shape: (675000, 4)\n", "resulting_frames: 180\n", "resulting_shape: (2700, 4)\n", "min_time 1\n", "max_time 180\n" ] } ], "source": [ "training_data_code = \"15EQ20191202\"\n", "validation_data_code = \"15EQ20191205\"\n", "training_data_path = os.path.join(\n", " root, f\"data/foraging/locust/{training_data_code}_tracked.csv\"\n", ")\n", "validation_data_path = os.path.join(\n", " root, f\"data/foraging/locust/{validation_data_code}_tracked.csv\"\n", ")\n", "\n", "\n", "tdf = lc.load_and_clean_locust(\n", " path=training_data_path,\n", " desired_frames=desired_frames,\n", " grid_size=100,\n", " rewards_x=[0.68074, -0.69292],\n", " rewards_y=[-0.03068, -0.03068],\n", " subset_starts=subset_starts,\n", " subset_ends=subset_ends,\n", ")\n", "\n", "\n", "vdf = lc.load_and_clean_locust(\n", " path=validation_data_path,\n", " desired_frames=desired_frames,\n", " grid_size=100,\n", " rewards_x=[0.68074, -0.69292],\n", " rewards_y=[-0.03068, -0.03068],\n", " subset_starts=subset_starts,\n", " subset_ends=subset_ends,\n", ")\n", "\n", "\n", "training_object = tdf[\"subset\"]\n", "validation_object = vdf[\"subset\"]\n", "\n", "training_rewards = (\n", " training_object.rewardsDF.iloc[:, [0, 1]].drop_duplicates().reset_index(drop=True)\n", ")\n", "\n", "validation_rewards = (\n", " validation_object.rewardsDF.iloc[:, [0, 1]].drop_duplicates().reset_index(drop=True)\n", ")\n", "\n", "\n", "tdf = training_object.foragersDF\n", "vdf = validation_object.foragersDF\n", "\n", "# shared across the training and validation datasets\n", "start, end, N_obs = min(tdf[\"time\"]), max(tdf[\"time\"]), len(tdf[\"time\"].unique())" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "