{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Foraging toolkit demo - communicating foragers, Part I (simulation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Outline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* [Introduction](#introduction)\n", "* [Simulation Setup](#simulation-setup)\n", "* [The Simulation Algorithm](#the-simulation-algorithm)\n", "* [Optional - Weak Communicators](#optional---weak-communicators-simulation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a multi-agent context, communication of information between foragers is an important\n", "feature of group-level behavior, as is the impact of different environmental conditions. We thus explore how one\n", "might infer to what extent agents communicate with each other to facilitate foraging. We ask whether the\n", "benefit of communicating information would be different for different environments, using multiple simulations\n", "with a range of communication-related hyper-parameters. In environments where food is highly clustered, it\n", "takes longer for birds to find food, but in all environments, using information communicated from other birds\n", "improves foraging success. The Bayesian inference methods are able to correctly compare the extent to which\n", "simulated agents communicate about the locations of the rewards.\n", "\n", "The communicating foragers demo is divided into two notebooks:\n", "\n", "1. Simulation (this one)\n", "2. Inference [communicators_inference.ipynb](./communicators_inference.ipynb) - proceed there after completing this notebook\n", "\n", "The users are advised to read through the demo notebooks in [docs/foraging/random-hungry-followers](../random-hungry-followers/) folder to get familiarized with the foraging toolkit.\n", "\n", "The main reference is [1], in particular Fig.3.\n", "\n", "---\n", "\n", "[1] R. Urbaniak, M. Xie, and E. Mackevicius, “Linking cognitive strategy, neural mechanism, and movement statistics in group foraging behaviors,” Sci Rep, vol. 14, no. 1, p. 21770, Sep. 2024, [doi: 10.1038/s41598-024-71931-0.](https://www.nature.com/articles/s41598-024-71931-0)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import logging\n", "import os\n", "from itertools import product\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import plotly.io as pio\n", "\n", "import collab.foraging.communicators as com\n", "import collab.foraging.toolkit as ft\n", "from collab.utils import find_repo_root\n", "\n", "pio.renderers.default = \"notebook\"\n", "\n", "repo_root = find_repo_root()\n", "\n", "logging.basicConfig(level=logging.INFO, format=\"%(asctime)s %(message)s\")\n", "\n", "# if you need to change the number of frames, replace 50 with your desired number\n", "\n", "# this is an alternative continuous development setup\n", "# for automated notebook testing\n", "# feel free to ignore this\n", "dev_mode = False # set to True if you want to generate your own csvs\n", "smoke_test = \"CI\" in os.environ\n", "N_frames = 10 if smoke_test else 50" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulation setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We simulate grid world environments with food patches of varying degrees of spatial clustering, controlling for the total amount of food in the environment. 
In each environment, there are 16 total food units (`N_total_food_units` below), distributed randomly in patches of size $1 \\times 1$, $2 \\times 2$, or $4 \\times 4$.\n", "\n", "We parameterize the extent to which agents share information about food locations. In these simulations, agents follow a policy that selects the action $A_{\\text{opt}}=\\arg \\max _A(V(T(A, S)))$, where $T$ is a state transition function and $V$ computes the expected future value from an estimate of expected reward that includes both food the agent can directly observe (within a radius of 5 steps) and food it perceives other birds eating at farther locations. In the real world, this estimate could be achieved by observing other birds and/or listening to their calls. The weighting of social information (reward locations communicated by other birds) relative to individually observed information is given by the **communication parameter** (`c_trust` below), which ranges from 0 (no communication) to 1 (full reliance on social information); a toy sketch of this weighting appears just before the next section.\n", "\n", "As we shall see below, birds that communicate appear to navigate more directly to food locations than birds that search independently." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we define the parameters of the forward model. We run two simulations, with `c_trust` set either to $0$ (no communication) or $0.6$. We save the parameters for each simulation, as well as the metadata for all simulations, in CSV files in the `data` directory." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
c_trustsight_radiusreward_patch_dimsim index
00.0540
10.6541
\n", "
" ], "text/plain": [ " c_trust sight_radius reward_patch_dim sim index\n", "0 0.0 5 4 0\n", "1 0.6 5 4 1" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "/Users/dima/git/collab2/data/foraging/communicators/communicators_strong/\n" ] } ], "source": [ "# Simulation setup 1 for the communication detection problem\n", "\n", "home_dir_strong_sim = os.path.join(\n", " repo_root, \"data/foraging/communicators/communicators_strong/\"\n", ")\n", "# # agent parameters\n", "sight_radius = [5]\n", "c_trust = [0, 0.6] # 0: ignorers\n", "N_agents = 8\n", "\n", "# # environment parameters\n", "edge_size = 30\n", "N_total_food_units = 16\n", "reward_patch_dim = [4] # clustered is 4, distributed is 1\n", "\n", "# simulation parameters\n", "N_runs = 1 # How many times would you like to run each case?\n", "N_frames = N_frames\n", "\n", "# Generate a dataframe containing all possible combinations of the parameter values specified above.\n", "param_list = [i for i in product(c_trust, sight_radius, reward_patch_dim)]\n", "metadataDF = pd.DataFrame(param_list)\n", "metadataDF.columns = [\"c_trust\", \"sight_radius\", \"reward_patch_dim\"]\n", "metadataDF[\"sim index\"] = np.arange(len(metadataDF)).astype(int)\n", "N_sims = len(metadataDF) if not smoke_test else 1\n", "\n", "# save metadata to home directory\n", "if dev_mode and not smoke_test:\n", " metadataDF.to_csv(os.path.join(home_dir_strong_sim, \"metadataDF.csv\"))\n", " pd.DataFrame(\n", " [\n", " {\n", " \"N_sims\": N_sims,\n", " \"N_runs\": N_runs,\n", " \"N_frames\": N_frames,\n", " \"N_agents\": N_agents,\n", " \"N_total_food_units\": N_total_food_units,\n", " \"edge_size\": edge_size,\n", " }\n", " ]\n", " ).to_csv(os.path.join(home_dir_strong_sim, \"additional_meta_params.csv\"))\n", "\n", "display(metadataDF)\n", "\n", "# Simulations set right,\n", "# Before you start, keep in sight,\n", "# Data safe from overwrite.\n", "print(home_dir_strong_sim)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The simulation algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Successor Representation, used in RL models, is a predictive map that represents the temporally discounted expected occupancy of future states, from any starting state. In a world with $N$ states, the Successor Representation is an $N \\mathrm{x} N$ matrix, defined as $M=\\sum_{t=0}^{\\infty}(\\gamma T)^t=(I-\\gamma T)^{-1}$, where $T$ is the transition matrix between states and $I$ is the identity matrix.\n", "\n", "In the simulations below, $N=N_x\\times N_y$ is the number of grid locations. For $i=1,\\dots,N$, let $(x_i,y_i)$ denote state $i$. Let $n(i)$ denote the number of possible neighboring states such that $\\forall j \\in n(i)$ we have $\\operatorname{dist}((x_i,y_i),(x_j,y_j))\\leqslant \\text{maxStepSize}$ (below, $\\text{maxStepSize}=3$). 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## The simulation algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Successor Representation, used in RL models, is a predictive map that represents the temporally discounted expected occupancy of future states, from any starting state. In a world with $N$ states, the Successor Representation is an $N \\times N$ matrix, defined as $M=\\sum_{t=0}^{\\infty}(\\gamma T)^t=(I-\\gamma T)^{-1}$, where $T$ is the transition matrix between states, $\\gamma \\in (0,1)$ is a temporal discount factor, and $I$ is the identity matrix.\n", "\n", "In the simulations below, $N=N_x\\times N_y$ is the number of grid locations. For $i=1,\\dots,N$, let $(x_i,y_i)$ denote state $i$. Let $n(i)$ denote the set of states reachable from state $i$ in one step, i.e., all $j$ such that $\\operatorname{dist}((x_i,y_i),(x_j,y_j))\\leqslant \\text{maxStepSize}$ (below, $\\text{maxStepSize}=3$), and let $|n(i)|$ denote its size. Then the transition matrix is defined as\n", "$$\n", "T(i,j)=\\begin{cases} \\frac{1}{|n(i)|} & j \\in n(i);\\\\ 0 & \\text{else} \\end{cases}.\n", "$$" ] },
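{ "cell_type": "markdown", "metadata": {}, "source": [ "To make these definitions concrete, the sketch below builds $T$ for a small grid and computes $M$ both by matrix inversion and by truncating the series. It is illustrative only (the grid size and $\\gamma$ are arbitrary choices made here), not the toolkit's implementation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from itertools import product\n", "\n", "# a toy grid, much smaller than the simulations below\n", "edge = 5\n", "max_step = 3\n", "states = list(product(range(edge), range(edge)))  # (x, y) grid locations\n", "N = len(states)\n", "\n", "# transition matrix: uniform over states within max_step (Euclidean distance)\n", "T = np.zeros((N, N))\n", "for i, (xi, yi) in enumerate(states):\n", "    nbrs = [\n", "        j for j, (xj, yj) in enumerate(states) if np.hypot(xi - xj, yi - yj) <= max_step\n", "    ]\n", "    T[i, nbrs] = 1.0 / len(nbrs)\n", "\n", "gamma = 0.9\n", "M = np.linalg.inv(np.eye(N) - gamma * T)  # Successor Representation\n", "\n", "# sanity check: the inverse matches the truncated series sum_t (gamma T)^t\n", "M_series = sum(np.linalg.matrix_power(gamma * T, t) for t in range(200))\n", "print(np.allclose(M, M_series, atol=1e-6))  # True" ] },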
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Simulation of communicator foragers** ([1, Algorithm 4, Appendix])\n", "\n", "- For each run:\n", "\n", "  - INITIALIZATION\n", "    - Initialize grid environment with $N_{\\text{states}}$ states\n", "    - Add random food patches to the environment\n", "    - Initialize foragers:\n", "      - Assign each forager $b$ to a random state $S_{t=0}^b$ on the grid\n", "      - Assign the same values for $c_{\\text{trust}}$ and sight radius to all foragers\n", "    - Initialize an $N_{\\text{states}}$-length vector $\\vec{\\phi}_{\\text{agents}}$ which indicates the number of foragers at each state\n", "\n", "  - SIMULATE (`com.SimulateCommunicators`)\n", "    - For each frame:\n", "      - Update rewards at each state based on the rate of foragers' calorie consumption:\n", "        $\\vec{r}_S \\leftarrow \\vec{r}_S - \\text{rate} \\cdot \\vec{\\phi}_{\\text{agents}}$\n", "\n", "      - _If the total remaining food falls below a threshold, generate additional random food patches_\n", "\n", "      - For each forager $b$:\n", "        - Compute the vector $\\vec{\\phi}_{\\text{visible}}$ indicating which states are within this forager's sight radius\n", "          (Euclidean distance $<$ sight radius)\n", "        - Update expected rewards (food calories) of states within the forager's sight radius:\n", "          $\\vec{w}_{\\text{self}} \\leftarrow \\vec{r}_S \\odot \\vec{\\phi}_{\\text{visible}}$\n", "        - Move the forager out of its old state:\n", "          $\\vec{\\phi}_{\\text{agents}}[S_t^b] \\leftarrow \\vec{\\phi}_{\\text{agents}}[S_t^b] - 1$\n", "        - Update expected rewards (food calories) for the states occupied by other agents:\n", "          $\\vec{w}_{\\text{others}} \\leftarrow \\vec{r}_S \\odot \\vec{\\phi}_{\\text{agents}}$\n", "        - Update this forager's estimate of the value of each state using the reward expectation vectors and the\n", "          Successor Representation matrix:\n", "          $V(S) \\leftarrow ((1-c_{\\text{trust}}) \\vec{w}_{\\text{self}} + c_{\\text{trust}} \\vec{w}_{\\text{others}})^T M$\n", "        - Collect the values of all states the forager could consider moving to (within the sight radius) into a vector\n", "          $V_{\\text{eligible}}$\n", "        - The forager decides its next location:\n", "          $S_{t+1}^b \\leftarrow \\operatorname{argmax}[V_{\\text{eligible}}]$\n", "        - Move the forager to its new state:\n", "          $\\vec{\\phi}_{\\text{agents}}[S_{t+1}^b] \\leftarrow \\vec{\\phi}_{\\text{agents}}[S_{t+1}^b] + 1$" ] },
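{ "cell_type": "markdown", "metadata": {}, "source": [ "The next cell runs this algorithm via `com.SimulateCommunicators`. As a bridge between the pseudocode and that call, here is a compact toy re-implementation of a single frame. It reuses `states`, `N`, and `M` from the toy grid cell above; the forager count, sight radius, and depletion rate are made-up values, and this sketch is not the toolkit's actual code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "rng = np.random.default_rng(0)\n", "coords = np.array(states)  # from the toy grid cell above\n", "\n", "# toy state: a few food patches and three foragers at random locations\n", "rewards = np.zeros(N)\n", "rewards[rng.choice(N, size=4, replace=False)] = 4.0\n", "positions = rng.choice(N, size=3, replace=False)\n", "phi_agents = np.zeros(N)\n", "for s in positions:\n", "    phi_agents[s] += 1\n", "\n", "c, sight, rate = 0.6, 2.0, 0.1  # toy c_trust, sight radius, consumption rate\n", "\n", "# one frame of the pseudocode above\n", "rewards = np.clip(rewards - rate * phi_agents, 0, None)  # calorie consumption\n", "for b, s in enumerate(positions):\n", "    dists = np.hypot(*(coords - coords[s]).T)\n", "    visible = dists < sight  # phi_visible\n", "    w_self = rewards * visible  # directly observed rewards\n", "    phi_agents[s] -= 1  # move the forager out of its old state\n", "    w_others = rewards * phi_agents  # rewards at other foragers' states\n", "    V = ((1 - c) * w_self + c * w_others) @ M\n", "    eligible = np.flatnonzero(visible)  # states the forager may move to\n", "    positions[b] = eligible[np.argmax(V[eligible])]\n", "    phi_agents[positions[b]] += 1  # move the forager into its new state\n", "\n", "print(\"positions after one frame:\", coords[positions].tolist())" ] },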
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-11-01 09:50:43,962 - Starting simulation setting 1/2, about to run it 1 times.\n", "2024-11-01 09:51:30,410 - Starting simulation setting 2/2, about to run it 1 times.\n" ] } ], "source": [ "def run_simulations(home_dir, fresh_start=True):\n", "    if fresh_start:\n", "        start = 0\n", "    else:\n", "        resultsDF = pd.read_csv(os.path.join(home_dir, \"resultsDF.csv\"))\n", "        start = resultsDF.iloc[-1][\"sim index\"].astype(\n", "            int\n", "        )  # resume from the last existing batch\n", "\n", "    logging.info(f\"Starting from batch {start+1}.\")\n", "\n", "    for si in range(start, N_sims):\n", "        # 1. pull out parameters from row si in the metadata\n", "        df_row = metadataDF.iloc[[si]]\n", "        c_trust = df_row[\"c_trust\"].iloc[0]\n", "        sight_radius = df_row[\"sight_radius\"].iloc[0]\n", "        reward_patch_dim = df_row[\"reward_patch_dim\"].iloc[0].astype(int)\n", "\n", "        logging.info(\n", "            f\"Starting simulation setting {si+1}/{N_sims}, about to run it {N_runs} times.\"\n", "        )\n", "\n", "        # 2. do multiple runs of the simulation and store the results in a results dataframe\n", "        batch_results = []\n", "        for ri in range(N_runs):\n", "            # initialize environment\n", "            env = com.Environment(\n", "                edge_size=edge_size,\n", "                N_total_food_units=N_total_food_units,\n", "                patch_dim=reward_patch_dim,\n", "            )\n", "            env.add_food_patches()\n", "\n", "            # run simulation\n", "            sim = com.SimulateCommunicators(\n", "                env, N_frames, N_agents, c_trust=c_trust, sight_radius=sight_radius\n", "            )\n", "            sim.run()\n", "\n", "            # compute success measures: time to first food for each forager\n", "            time_to_first_allforagers = np.zeros(N_agents)\n", "            for forager_id in range(1, N_agents + 1):\n", "                singleforagerDF = sim.all_foragersDF.loc[\n", "                    sim.all_foragersDF.forager == forager_id\n", "                ]\n", "                time_to_first_allforagers[forager_id - 1] = (\n", "                    com.compute_time_to_first_reward(\n", "                        singleforagerDF, sim.all_rewardsDF, N_frames\n", "                    )\n", "                )\n", "            mean_time_to_first_reward = np.mean(\n", "                time_to_first_allforagers\n", "            )  # average across foragers\n", "            num_foragers_failed = np.sum(\n", "                time_to_first_allforagers == N_frames\n", "            )  # number of foragers that failed to reach food\n", "\n", "            # save the simulation results in a folder named sim{si}_run{ri} in the home directory\n", "            if dev_mode and not smoke_test:\n", "                sim_folder = \"sim\" + str(si) + \"_run\" + str(ri)\n", "                sim_dir = os.path.join(home_dir, sim_folder)\n", "                if not os.path.isdir(sim_dir):\n", "                    os.makedirs(sim_dir)\n", "                sim.all_foragersDF.to_csv(os.path.join(sim_dir, \"foragerlocsDF.csv\"))\n", "                sim.all_rewardsDF.to_csv(os.path.join(sim_dir, \"rewardlocsDF.csv\"))\n", "\n", "            # combine the metadata and the success measures for the results dataframe\n", "            results_onesim = {\n", "                \"c_trust\": c_trust,\n", "                \"sight_radius\": sight_radius,\n", "                \"reward_patch_dim\": reward_patch_dim,\n", "                \"sim index\": si,\n", "                \"run index\": ri,\n", "                \"time to first food\": mean_time_to_first_reward,\n", "                \"num foragers failed\": num_foragers_failed,\n", "            }\n", "            batch_results.append(results_onesim)\n", "\n", "        batch_resultsDF = pd.DataFrame(batch_results)\n", "\n", "        # append to the loaded resultsDF when resuming, otherwise create it\n", "        if \"resultsDF\" in locals():\n", "            resultsDF = pd.concat(\n", "                [resultsDF, batch_resultsDF], ignore_index=True, axis=0\n", "            )\n", "        else:\n", "            resultsDF = batch_resultsDF.copy()\n", "\n", "        if dev_mode and not smoke_test:\n", "            resultsDF.to_csv(os.path.join(home_dir, \"resultsDF.csv\"))\n", "            logging.info(f\"Saved results for batch {si+1}/{N_sims}.\")\n", "\n", "\n", "run_simulations(home_dir_strong_sim, fresh_start=True)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# load the saved simulation data\n", "\n", "\n", "def load_communicators(sim_folder):\n", "    sim_dir = os.path.join(home_dir_strong_sim, sim_folder)\n", "    foragerlocsDF = pd.read_csv(os.path.join(sim_dir, \"foragerlocsDF.csv\"), index_col=0)\n", "    rewardlocsDF = pd.read_csv(os.path.join(sim_dir, \"rewardlocsDF.csv\"), index_col=0)\n", "\n", "    # simulation time and grid coordinates start at 1; shift them to 0 without modifying the original simulations\n", "    foragerlocsDF[\"forager\"] = foragerlocsDF[\"forager\"] - 1\n", "    foragerlocsDF[\"time\"] = foragerlocsDF[\"time\"] - 1\n", "    rewardlocsDF[\"time\"] = rewardlocsDF[\"time\"] - 1\n", "\n", "    communicators = ft.dataObject(\n", "        foragersDF=foragerlocsDF, rewardsDF=rewardlocsDF, grid_size=35\n", "    )\n", "\n", "    return communicators\n", "\n", "\n", "noncommunicators = load_communicators(\"sim0_run0\")\n", "communicators = load_communicators(\"sim1_run0\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make sure the results make sense, we can animate the runs.\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ft.animate_foragers(\n", " noncommunicators, plot_rewards=True, width=600, height=600, point_size=8\n", ")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# animate\n", "\n", "ft.animate_foragers(\n", " communicators, plot_rewards=True, width=600, height=600, point_size=8\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optional - weak communicators simulation" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
c_trustsight_radiusreward_patch_dimsim index
00.000510
10.000521
20.000542
30.005513
40.005524
...............
2950.68052295
2960.68054296
2970.69051297
2980.69052298
2990.69054299
\n", "

300 rows × 4 columns

\n", "
" ], "text/plain": [ " c_trust sight_radius reward_patch_dim sim index\n", "0 0.000 5 1 0\n", "1 0.000 5 2 1\n", "2 0.000 5 4 2\n", "3 0.005 5 1 3\n", "4 0.005 5 2 4\n", ".. ... ... ... ...\n", "295 0.680 5 2 295\n", "296 0.680 5 4 296\n", "297 0.690 5 1 297\n", "298 0.690 5 2 298\n", "299 0.690 5 4 299\n", "\n", "[300 rows x 4 columns]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "/Users/dima/git/collab2/data/foraging/communicators/communicators_weak/\n" ] } ], "source": [ "# custom list of locations for simulation setup 2\n", "# with focus on low values of c_trust\n", "# the main simulation run is commented out\n", "# uncomment if you want to run many of simulations\n", "min_value = 0.0\n", "max_value = 0.7\n", "density1 = 0.005\n", "density2 = 0.01\n", "c_locations = []\n", "\n", "current_value = min_value\n", "while current_value < 0.3:\n", " c_locations.append(current_value)\n", " current_value += density1\n", "while current_value <= max_value:\n", " c_locations.append(current_value)\n", " current_value += density2\n", "\n", "# # Simulation setup 2 for the impact of communication\n", "home_dir_weak = os.path.join(\n", " repo_root, \"data/foraging/communicators/communicators_weak/\"\n", ")\n", "# agent parameters\n", "sight_radius = [5]\n", "c_trust = c_locations\n", "# 0: ignorers,\n", "N_agents = 9\n", "\n", "# environment parameters\n", "edge_size = 45\n", "N_total_food_units = 16\n", "reward_patch_dim = [1, 2, 4] # clustered is 4, distributed is 1\n", "\n", "# simulation parameters\n", "N_runs = 2 # How many times would you like to run each case?\n", "N_frames = N_frames\n", "\n", "# Generate a dataframe containing all possible combinations of the parameter values specified above.\n", "param_list = [i for i in product(c_trust, sight_radius, reward_patch_dim)]\n", "metadataDF = pd.DataFrame(param_list)\n", "metadataDF.columns = [\"c_trust\", \"sight_radius\", \"reward_patch_dim\"]\n", "metadataDF[\"sim index\"] = np.arange(len(metadataDF)).astype(int)\n", "N_sims = len(metadataDF)\n", "\n", "# save metadata to home directory\n", "if dev_mode and not smoke_test:\n", " metadataDF.to_csv(os.path.join(home_dir_weak, \"metadataDF.csv\"))\n", " pd.DataFrame(\n", " [\n", " {\n", " \"N_sims\": N_sims,\n", " \"N_runs\": N_runs,\n", " \"N_frames\": N_frames,\n", " \"N_agents\": N_agents,\n", " \"N_total_food_units\": N_total_food_units,\n", " \"edge_size\": edge_size,\n", " }\n", " ]\n", " ).to_csv(os.path.join(home_dir_weak, \"additional_meta_params.csv\"))\n", "\n", "display(metadataDF)\n", "\n", "print(home_dir_weak)\n", "\n", "# uncomment if you want to run 600 simulations\n", "# and turn on dev_mode if you want to overwrite the csvs\n", "# run_simulations(home_dir_weak, fresh_start = True)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.8" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }