Thursday, November 07, 2024

A Large-Scale Geographically Explicit Synthetic Population with Social Networks for the US

In numerous posts, we have discussed synthetic populations and their use in agent-based modeling, but many other modeling styles also utilize synthetic populations. In our own work we often spend significant amounts of time creating such synthetic populations, especially those grounded in data, due to the time needed to collect and preprocess the data and to generate the final synthetic population. To alleviate this, we (Na (Richard) Jiang, Fuzhen Yin, Boyu Wang and myself) have a new paper published in Scientific Data, entitled "A Large-Scale Geographically Explicit Synthetic Population with Social Networks for the United States." Our aim in this paper is to build and provide a geographically explicit synthetic population along with its social networks using open data, including that from the latest 2020 U.S. Census, which can be used in a variety of geo-simulation models.

Summary of the Resulting Datasets.

Specifically, in the paper we outline how we created a synthetic population of 330,526,186 individuals representing America's 50 states and Washington D.C. Each individual has a set of geographical locations that represent their home, work or school addresses. Additionally, these individuals are not isolated: they are embedded in a larger social setting based on their household, working and studying relationships (i.e., social networks).
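To make the idea of individuals with locations and social networks concrete, here is a minimal sketch in Python. It is not the code from the paper: the `Person` fields, the `build_network` helper and the toy coordinates are all assumptions made for illustration, and linking everyone who shares a household, workplace or school into a fully connected group is a simplification rather than the network generation method described in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Person:
    """A hypothetical synthetic individual (field names are illustrative)."""
    pid: int
    age: int
    household_id: str
    home: tuple[float, float]            # (longitude, latitude)
    workplace_id: Optional[str] = None   # None if the person does not work
    school_id: Optional[str] = None      # None if the person is not in school


def build_network(people, key):
    """Return undirected edges linking people who share the same `key` value."""
    groups = defaultdict(list)
    for person in people:
        group_id = getattr(person, key)
        if group_id is not None:
            groups[group_id].append(person.pid)
    edges = set()
    for members in groups.values():
        # Assumption: every pair within a group knows each other.
        edges.update(combinations(sorted(members), 2))
    return edges


# Two made-up households whose members overlap at one workplace and one school.
people = [
    Person(1, 41, "hh1", (-78.87, 42.89), workplace_id="w1"),
    Person(2, 39, "hh1", (-78.87, 42.89), workplace_id="w2"),
    Person(3, 9, "hh1", (-78.87, 42.89), school_id="s1"),
    Person(4, 35, "hh2", (-78.85, 42.90), workplace_id="w1"),
    Person(5, 10, "hh2", (-78.85, 42.90), school_id="s1"),
]

household_edges = build_network(people, "household_id")
work_edges = build_network(people, "workplace_id")
school_edges = build_network(people, "school_id")

print(sorted(household_edges))  # [(1, 2), (1, 3), (2, 3), (4, 5)]
print(sorted(work_edges))       # [(1, 4)]
print(sorted(school_edges))     # [(3, 5)]
```

Keeping one edge set per setting (household, work, school) means a model can treat each type of contact differently, for example by giving household ties a higher interaction rate.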

The work (e.g., data collection, data preprocessing and generation processes) was coded using Python 3.12, and all the scripts used are available at: https://github.com/njiang8/geo-synthetic-pop-usa, while the resulting datasets (85 GB uncompressed) are available at OSF: https://osf.io/fpnc2/.

To give you a sense of the paper, below we provide its abstract, along with some results and our efforts to validate the synthetic population. The full reference and a link to the paper can be found at the bottom of the post.

Abstract:

Within the geo-simulation research domain, micro-simulation and agent-based modeling often require the creation of synthetic populations. Creating such data is a time-consuming task and often lacks social networks, which are crucial for studying human interactions (e.g., disease spread, disaster response) while at the same time impacting decision-making. We address these challenges by introducing a Python based method that uses the open data including that from 2020 U.S. Census data to generate a large-scale realistic geographically explicit synthetic population for America's 50 states and Washington D.C. along with the stylized social networks (e.g., home, work and schools). The resulting synthetic population can be utilized within various geo-simulation approaches (e.g., agent-based modeling), exploring the emergence of complex phenomena through human interactions and further fostering the study of urban digital twins.

Keywords: Synthetic Population, U.S. Census 2020, Agent-Based Modeling, Geo-Simulation, Social Networks.

Data Generation Workflow and Resulting Datasets.

A Sample of Social Networks for One Household and Their Home, Work and Educational Social Networks from the Generated Data.

Sample of Generated Social Networks Extracted from the City of Buffalo, New York: (a) Household; (b) Work; (c) School; (d) Daycare.

Validation of the Synthetic Population at Different Levels: (a) Population under 18 Different Age Groups; (b) Household under Different Household Types.

Full Reference:

Jiang, N., Yin, F., Wang, B. and Crooks, A.T. (2024), A Large-Scale Geographically Explicit Synthetic Population with Social Networks for the United States, Scientific Data, 11, 1204. https://doi.org/10.1038/s41597-024-03970-1 (pdf)




Friday, November 01, 2024

Pattern of Life Human Mobility Simulation (Demo)

In the past we have written about how we can use agent-based models to capture basic patterns of life, and have even developed simulations, but until now we have never really demonstrated how we go about this. At the SIGSPATIAL 2024 conference we (Hossein Amiri, Will Kohn, Shiyang Ruan, Joon-Seok Kim, Hamdi Kavak, Dieter Pfoser, Carola Wenk, Andreas Zufle and myself) have a demonstration paper entitled "The Pattern of Life Human Mobility Simulation" in which we show:

  1. How to run the Patterns of Life Simulation with the graphical user interface (GUI) to visually explore the mobility patterns of a region.
  2. How to run the Patterns of Life Simulation headless (without GUI) for large-scale data generation.
  3. How to adapt the simulation to any region in the world using OpenStreetMap data.
  4. How recent scalability improvements allow us to simulate hundreds of thousands of agents.
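To give a flavor of what running headless for data generation (item 2 in the list above) means, below is a toy sketch in Python. It is not the Patterns of Life Simulation and does not use its code or interface: the `simulate` and `to_csv` functions, the unit-square "city" and the fixed 9-to-5 schedule are all assumptions made purely for illustration.

```python
import csv
import io
import random


def simulate(n_agents, n_hours, seed=0):
    """Run a toy commuting model with no GUI and return trajectory records."""
    rng = random.Random(seed)
    # Each hypothetical agent gets a random home and work location.
    agents = [
        {
            "id": i,
            "home": (rng.uniform(0, 1), rng.uniform(0, 1)),
            "work": (rng.uniform(0, 1), rng.uniform(0, 1)),
        }
        for i in range(n_agents)
    ]
    records = []
    for hour in range(n_hours):
        hour_of_day = hour % 24
        for agent in agents:
            # Assumption: everyone is at work from 09:00 to 17:00, otherwise at home.
            place = "work" if 9 <= hour_of_day < 17 else "home"
            x, y = agent[place]
            records.append((hour, agent["id"], place, x, y))
    return records


def to_csv(records):
    """Serialize trajectory records to CSV text, one row per agent per hour."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["hour", "agent_id", "place", "x", "y"])
    writer.writerows(records)
    return buffer.getvalue()


records = simulate(n_agents=3, n_hours=48)
print(len(records))  # 144
print(to_csv(records).splitlines()[0])  # hour,agent_id,place,x,y
```

Because nothing is drawn to the screen, the only cost per step is updating agents and appending records, which is the general reason headless runs suit large-scale data generation.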

If this sounds of interest, below we show the GUI of the model, along with the steps to generate a trajectory dataset or a new map for the simulation. At the bottom of the post you can find the paper's full reference and a link to download it, while at https://github.com/onspatial/generate-mobility-dataset you can find the source code for the enhanced simulation and data-processing tools for you to experiment with.

Abstract: 

We demonstrate the Patterns of Life Simulation to create realistic simulations of human mobility in a city. This simulation has recently been used to generate massive amounts of trajectory and check-in data. Our demonstration focuses on using the simulation twofold: (1) using the graphical user interface (GUI), and (2) running the simulation headless by disabling the GUI for faster data generation. We further demonstrate how the Patterns of Life simulation can be used to simulate any region on Earth by using publicly available data from OpenStreetMap. Finally, we also demonstrate recent improvements to the scalability of the simulation allows simulating up to 100,000 individual agents for years of simulation time. During our demonstration, as well as offline using our guides on GitHub, participants will learn: (1) The theories of human behavior driving the Patterns of Life simulation, (2) how to simulate to generate massive amounts of synthetic yet realistic trajectory data, (3) running the simulation for a region of interest chosen by participants using OSM data, (4) learn the scalability of the simulation and understand the properties of generated data, and (5) manage thousands of parallel simulation instances running concurrently.

Keywords: Patterns of Life, Simulation, Trajectory, Dataset, Customization

A screenshot of the graphical user interface of the Patterns of Life Simulation. The GUI shows the map and the movements of agents on the left side and the social network of agents and their statistical properties on the right side. 

Steps to generate a trajectory dataset.
Steps to generate a new map for the simulation.

Full reference:

Amiri, H., Kohn, W., Ruan, S., Kim, J-S., Kavak, H., Crooks, A.T., Pfoser, D., Wenk, C. and Zufle, A. (2024) The Pattern of Life Human Mobility Simulation (Demo Paper), ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Atlanta, GA. (pdf)

Thursday, October 31, 2024

Studying Contagious Disease Spread: An ABM Framework

In the past we have written about synthetic populations and their use in agent-based models. We are finding such synthetic populations to be extremely useful in the creation or initialization of agent-based models. To give you a sense of how we are utilizing such synthetic populations, at the 7th ACM SIGSPATIAL International Workshop on Geospatial Simulation (GeoSim 2024), Na (Richard) Jiang and myself have a new paper entitled "Studying Contagious Disease Spread Utilizing Synthetic Populations Inspired by COVID-19: An Agent-based Modeling Framework."

In the paper we show how we can utilize a method to create a geographically-explicit synthetic population, along with its social networks, and how this can be used to study contagious disease spread (and various lineages of the disease) in Western New York. If this sounds of interest, below you can read the abstract from the paper, see some of the results and find the full reference and the link to the paper. The model itself and the data needed to run it are available at https://osf.io/zrtuj/.
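For readers new to this kind of model, the sketch below shows in Python how a disease can spread over a social network using susceptible, exposed, infectious and recovered (SEIR) states. It is a toy example, not the model from the paper: the ring-shaped contact network, the function names and every parameter value (`P_INFECT`, `EXPOSED_DAYS`, `INFECTIOUS_DAYS`) are assumptions chosen for illustration only.

```python
import random

P_INFECT = 0.2        # assumed daily transmission probability per infectious contact
EXPOSED_DAYS = 3      # assumed days spent exposed before becoming infectious
INFECTIOUS_DAYS = 5   # assumed days spent infectious before recovering


def step(states, contacts, timers, rng):
    """Advance every person by one day and return the new state mapping."""
    new_states = dict(states)
    for person, state in states.items():
        if state == "S":
            infectious = sum(states[c] == "I" for c in contacts[person])
            # Probability that at least one infectious contact transmits.
            if rng.random() < 1 - (1 - P_INFECT) ** infectious:
                new_states[person] = "E"
                timers[person] = EXPOSED_DAYS
        elif state in ("E", "I"):
            timers[person] -= 1
            if timers[person] == 0:
                if state == "E":
                    new_states[person] = "I"
                    timers[person] = INFECTIOUS_DAYS
                else:
                    new_states[person] = "R"
    return new_states


def run(contacts, seed_case, days, seed=42):
    """Seed one infectious person and record daily S/E/I/R counts."""
    rng = random.Random(seed)
    states = {person: "S" for person in contacts}
    states[seed_case] = "I"
    timers = {seed_case: INFECTIOUS_DAYS}
    history = []
    for _ in range(days):
        states = step(states, contacts, timers, rng)
        history.append({s: sum(v == s for v in states.values()) for s in "SEIR"})
    return history


# A made-up contact network: ten people, each linked to two neighbors in a ring.
n = 10
contacts = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
history = run(contacts, seed_case=0, days=200)
print(history[-1])
```

Swapping the ring for household, work and school edges from a synthetic population would let infections follow social ties rather than an arbitrary network, which is the general idea the post describes.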

Abstract

The COVID-19 pandemic has reshaped societies and brought to the forefront simulation as a tool to explore the spread of the diseases including that of agent-based modeling. Efforts have been made to ground these models on the world around us using synthetic populations that attempt to mimic the population at large. However, we would argue that many of these synthetic populations and therefore the models using them, miss the social connections which were paramount to the spread of the pandemic. Our argument being is that contagious diseases mainly spread through people interacting with each other and therefore the social connections need to be captured. To address this, we create a geographically-explicit synthetic population along with its social network for the Western New York (WNY) Area. This synthetic population is then used to build a framework to explore a hypothetical contagious disease inspired by various of COVID-19. We show simulation results from two scenarios utilizing this framework, which demonstrates the utility of our approach capturing the disease dynamics. As such we show how basic patterns of life along with interactions driven by social networks can lead to the emergence of disease outbreaks and pave the way for researchers to explore the next pandemic utilizing agent-based modeling with geographically explicit social networks.

Keywords: Agent-based Modeling, Synthetic Populations, Social Networks, COVID-19, Disease Modeling.

Single Lineage Results: (a) Overall SEIR Dynamic; (b) Contact Tracing Example.

Western New York Commuting Pattern.

Disease Dynamics when Two Lineages are Introduced.

Reference: 

Jiang, N. and Crooks, A.T. (2024), Studying Contagious Disease Spread Utilizing Synthetic Populations Inspired by COVID-19: An Agent-based Modeling Framework, Proceedings of the 7th ACM SIGSPATIAL International Workshop on Geospatial Simulation (GeoSim 2024), Atlanta, GA., pp. 29-32. (pdf)

Wednesday, October 30, 2024

Agent-Based Models and Geography

Just a quick post. In the recently released Encyclopedia of Human Geography, edited by Barney Warf, we were asked to write a short chapter entitled "Agent-based Models and Geography." In the chapter we discuss how, over the last several decades, agent-based modeling has gained widespread adoption in geography, and introduce the reader to what agent-based models are, how they have developed and the types of geographical applications that can be explored with them, especially when linked to Geographical Information Systems (GIS). The chapter concludes with a brief summary along with a discussion of challenges and opportunities with agent-based modeling (ABM). If this sounds of interest, below you can find the full reference and link to the chapter.

Example application domains for agent-based models over various spatial and temporal scales. More examples and further details can be found at https://www.gisagents.org/

Full Reference:

Crooks, A.T. and Jiang, N. (2024), Agent-based Models and Geography, in Warf, B. (ed.), The Encyclopedia of Human Geography, Springer, Cham, Switzerland, https://doi.org/10.1007/978-3-031-25900-5_258-1. (pdf)

An Agent-based model of COVID-19 Vaccine uptake in New York State

In the past we have explored how agent-based modeling can be used to study vaccine uptake and how the diffusion of different vaccine opinions in hybrid spaces (e.g., physical, relational and cyber) can affect individuals' vaccination decisions. But this prior work was limited to just one small area. We know that urban and rural communities have different levels of digital connectivity, and we were wondering if our initial findings are applicable to other counties which are more urban or to a larger study area. To explore this, at the 7th ACM SIGSPATIAL International Workshop on Geospatial Simulation (GeoSim 2024) we (Fuzhen Yin, Na Jiang, Lucie Laurian and myself) have a paper entitled "Agent-based Modeling of COVID-19 Vaccine uptake in New York State: Information Diffusion in Hybrid Spaces".

This paper significantly extends our previous work in a number of ways. First, we move from a single rural county to the entire state of New York, whose 62 counties differ substantially in socioeconomic status. Furthermore, we move from a small population of 120,000 to over 20 million agents. Doing so allows us to compare vaccination uptake in different areas (e.g., urban versus rural communities, second home destinations versus college towns). We also use different parameters to initialize hybrid spaces for urban and rural populations to understand how individuals' preferences on hybrid spaces affect information diffusion and vaccination rates at a macro level. Lastly, we updated the decision-making rules for minors (i.e., those under 18), which allows us to better simulate young population groups: we make the assumption that minors need to have at least one of their guardians in the family network already vaccinated before they can take vaccines. By extending the model we can accurately simulate the vaccination rates for New York state (mean absolute error=6.93) and for the majority of counties within it (81%).
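The rule for minors and the mean absolute error used to compare simulated and observed vaccination rates can both be sketched in a few lines of Python. This is not the Mesa model from the paper: the `Agent` class, the `willing` flag (a stand-in for the outcome of the opinion dynamics) and the numbers in the example are assumptions made up for illustration, not results.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A hypothetical agent; `willing` stands in for the opinion dynamics."""
    aid: int
    age: int
    willing: bool
    vaccinated: bool = False
    guardians: list = field(default_factory=list)  # ids of guardians in the family network


def can_vaccinate(agent, agents_by_id):
    """Return True if the agent may get vaccinated at this step."""
    if agent.vaccinated or not agent.willing:
        return False
    if agent.age < 18:
        # Minors need at least one guardian who is already vaccinated.
        return any(agents_by_id[g].vaccinated for g in agent.guardians)
    return True


def mean_absolute_error(simulated, observed):
    """Average absolute difference between paired simulated and observed rates."""
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(simulated)


parent = Agent(aid=1, age=40, willing=True)
child = Agent(aid=2, age=12, willing=True, guardians=[1])
agents_by_id = {a.aid: a for a in (parent, child)}

print(can_vaccinate(child, agents_by_id))   # False: no guardian vaccinated yet
parent.vaccinated = True
print(can_vaccinate(child, agents_by_id))   # True: the guardian is now vaccinated

# Made-up vaccination rates (in percent) for two areas, purely illustrative.
print(mean_absolute_error([60.0, 70.0], [65.0, 68.0]))  # 3.5
```

Checking guardians through the family network, rather than through a separate lookup, means the rule reuses the same household ties that the synthetic population already provides.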

If this sounds of interest, below you can read the abstract of our paper, see our various hybrid spaces over New York state along with our updated model logic and the aggregate results. The full reference and the link to the paper can be found at the bottom of the post. The model itself, which was created in Mesa, and the data needed to run the model can be found at: https://osf.io/3khyq/. We share our modeling scripts, input data and results so that interested readers can reproduce or extend our work as they see fit, and also to conform with the FAIR principles (findable, accessible, interoperable and reusable).

Abstract
During the COVID-19 pandemic, social media become an important hub for public discussions on vaccination. However, it is unclear how the rise of cyber space (i.e., social media) combined with traditional relational spaces (i.e., social circles), and physical space (i.e., spatial proximity) together affect the diffusion of vaccination opinions and produce different impacts on urban and rural population's vaccination uptake. This research builds an agent-based model utilizing the Mesa framework to simulate individuals' opinion dynamics towards COVID-19 vaccines, their vaccination uptake and the emergent vaccination rates at a macro level for New York State (NYS). By using a spatially explicit synthetic population, our model can accurately simulate the vaccination rates for NYS (mean absolute error=6.93) and for the majority of counties within it (81%). This research contributes to the modeling literature by simulating individuals vaccination behaviors which are important for disease spread and transmission studies. Our study extends geo-simulations into hybrid-space settings (i.e., physical, relational, and cyber spaces).

Keywords: Agent-based modeling, GIS, Information diffusion, Hybrid spaces, Social networks, Health informatics, Vaccines, COVID-19. 

Schematic representation of hybrid spaces. Physical space includes family and group quarter networks. Relational space represents people's social circles in work, school and daycare. Cyber space is a social media network. This figure only displays 2% of the total population in NYS (around 200,000 agents) for visualization purposes.

Modeling process and structure: from data to agent-behaviors.

Mapping the differences (i.e., mean absolute error (MAE)) in vaccination rate between simulated and ground truth data. 

Reference
Yin, F., Jiang, N., Crooks, A.T. and Laurian, L. (2024), Agent-based Modeling of Covid-19 Vaccine Uptake in New York State: Information Diffusion in Hybrid Spaces, Proceedings of the 7th ACM SIGSPATIAL International Workshop on Geospatial Simulation (GeoSim 2024), Atlanta, GA., pp. 11-20. (pdf)