Spatio-Temporal Accessibility of Pharmacy Care in Vermont, USA¶

Comparing Results of Alternative Study Versions¶

Authors¶

  • Sam Roubin, sroubin@middlebury.edu, https://orcid.org/0009-0005-5490-3744, Middlebury College
  • Joseph Holler*, josephh@middlebury.edu, https://orcid.org/0000-0002-2381-2699, Middlebury College
  • Peter Kedron, peterkedron@ucsb.edu, https://orcid.org/0000-0002-1093-3416, University of California, Santa Barbara

* Corresponding author

Materials and procedure¶

Computational environment¶

Similar to Kang et al. (2020), this study was run using CyberGIS-Jupyter. This study uses an updated software environment from the reproduction study, using Python Jupyter Notebooks in the CyberGISX environment available at https://cybergisxhub.cigi.illinois.edu/. In particular, we use the Python 3-0.9.0 Kernel running Python 3.8.12, pandas 1.3.5, geopandas 0.10.2, networkx 2.6.3 and osmnx 1.1.2.

In [16]:
# Import modules
import numpy as np
import pandas as pd
import geopandas as gpd
import seaborn as sns
import re
from shapely.geometry import Point, LineString, Polygon
import matplotlib.pyplot as plt
import os
from IPython.display import display, clear_output
from shapely.ops import nearest_points   #for pharmacy_setting function
from scipy.stats import kruskal
from matplotlib import colors
from matplotlib.colors import Normalize
from matplotlib.cm import ScalarMappable
import matplotlib.patheffects as path_effects
from tabulate import tabulate
from matplotlib.colors import ListedColormap
from mpl_toolkits.axes_grid1.anchored_artists import AnchoredSizeBar
from matplotlib.font_manager import FontProperties
from matplotlib.patches import Patch
from matplotlib.lines import Line2D
import imageio
import math
import warnings
import scikit_posthocs as sp
warnings.filterwarnings("ignore")
print('\n'.join(f'{m.__name__}=={m.__version__}' for m in globals().values() if getattr(m, '__version__', None)))
numpy==1.22.0
pandas==1.3.5
geopandas==0.10.2
seaborn==0.11.2
re==2.2.1
imageio==2.13.5
scikit_posthocs==0.8.1

Check Directories¶

In [4]:
# Use to set work directory properly
if os.path.basename(os.getcwd()) == 'code':
    os.chdir('../../')
os.getcwd()
Out[4]:
'/home/jovyan/work/OR-VT-Pharmacy'
In [25]:
# define the desired result sets to compare
set1 = "009"
set2 = "001"

# save figures? Switch to True to save figures
figsave = True

def make_fig_file(fignum):
    fig_folder = "./results/figures/sets_" + set1 + "_" + set2
    os.makedirs(fig_folder, exist_ok=True)
    return fig_folder + "/figure" + str(fignum) + ".jpg"
In [26]:
# load results
coi = ["access_w", "access_s", "access_su"]
colnames1 = {item: item + set1 for item in coi}
colnames2 = {item: item + set2 for item in coi}

def loadResults(setNum):
    setDesc_file = "./data/derived/public/result_sets/results_" + setNum + ".txt"
    setResult_file = "./data/derived/public/result_sets/results_" + setNum + ".gpkg"
    with open(setDesc_file, 'r') as f:
        setDesc = f.read()
    print(setDesc)
    results = gpd.read_file(setResult_file)
    results = results.loc[:, :'minority_pop'].join(results[coi + ['geometry']])
    return results

print("First Result Set:\n")
results1 = loadResults(set1).rename(columns = colnames1)

print("\nsubtracting Second Result Set:\n")
results2 = loadResults(set2).rename(columns = colnames2)[['GEOID'] + list(colnames2.values())]

results_comb = results1.merge(results2, on="GEOID")
First Result Set:

Result Set ID: 009
Beta: 262
Discretization: min travel time
Distances: [300, 600, 900, 1200, 1800, 2400] (seconds) [5, 10, 15, 20, 30, 40] (minutes)
Weights: [1, 0.91, 0.68, 0.42, 0.22, 0.03]
Time series: 2023
Pharmacy technician value: pt5 (0.5)
Demand population: total_pop
Geographic level of aggregation: block

subtracting Second Result Set:

Result Set ID: 001
Beta: 262
Discretization: min travel time
Distances: [600, 1200, 1800] (seconds) [10, 20, 30] (minutes)
Weights: [1, 0.68, 0.22]
Time series: 2023
Pharmacy technician value: pt5 (0.5)
Demand population: total_pop
Geographic level of aggregation: county_subdivision
In [27]:
results_comb
Out[27]:
GEOID NAME necta total_pop elderly_pop minority_pop access_w009 access_s009 access_su009 geometry access_w001 access_s001 access_su001
0 5001773675 Tunbridge town, Orange County, Vermont None 1337.0 274.0 55.0 1.837275 1.046994 0.794941 MULTIPOLYGON (((494016.737 152571.233, 494092.... 1.815529 1.019366 0.716094
1 5000975175 Victory town, Essex County, Vermont None 70.0 23.0 4.0 1.361813 0.763767 0.615186 MULTIPOLYGON (((546966.722 227691.457, 547619.... 1.508539 0.827602 0.669212
2 5000159650 Ripton town, Addison County, Vermont None 739.0 128.0 86.0 2.251090 1.499919 0.988657 MULTIPOLYGON (((454852.030 171483.437, 457345.... 2.000679 1.318757 0.879293
3 5000108575 Bridport town, Addison County, Vermont None 1225.0 282.0 82.0 3.680058 2.270777 1.104514 MULTIPOLYGON (((426786.611 165125.435, 426899.... 4.389905 2.686803 1.258349
4 5001709325 Brookfield town, Orange County, Vermont None 1244.0 293.0 95.0 3.327164 1.986888 1.660490 MULTIPOLYGON (((485984.066 169492.152, 488501.... 4.526437 2.724492 2.190710
... ... ... ... ... ... ... ... ... ... ... ... ... ...
248 5001903550 Barton town, Orleans County, Vermont None 2872.0 736.0 168.0 4.645133 2.803754 2.202458 MULTIPOLYGON (((519223.514 249590.456, 519586.... 5.088039 3.110618 2.456050
249 5000126300 Ferrisburgh town, Addison County, Vermont Metropolitan NECTA 2646.0 581.0 159.0 2.839361 1.723960 1.333271 MULTIPOLYGON (((428034.770 184292.295, 428115.... 2.439431 1.500809 1.162266
250 5000785150 Winooski city, Chittenden County, Vermont Metropolitan NECTA 7997.0 947.0 1674.0 8.424424 4.662536 3.572591 MULTIPOLYGON (((444153.318 221625.708, 444517.... 9.143431 5.087106 3.897623
251 5002756350 Pomfret town, Windsor County, Vermont Micropolitan NECTA 916.0 265.0 53.0 2.209271 0.976377 0.771353 MULTIPOLYGON (((492994.175 129381.628, 494043.... 2.571441 1.096089 0.884632
252 5002702575 Baltimore town, Windsor County, Vermont None 229.0 50.0 4.0 3.399196 2.037168 1.449587 MULTIPOLYGON (((493532.085 96445.694, 496532.9... 3.479168 2.202876 1.359497

253 rows × 13 columns

In [28]:
# Calculate Difference and Pct Difference
for i in coi:
    results_comb[i] = results_comb[i + set1] - results_comb[i + set2]
    results_comb[i + "pctdif"] = results_comb[i] / (results_comb[i + set1] + results_comb[i + set2]) * 100
    
results_comb
mapping_df = results_comb
mapping_df
Out[28]:
GEOID NAME necta total_pop elderly_pop minority_pop access_w009 access_s009 access_su009 geometry access_w001 access_s001 access_su001 access_w access_wpctdif access_s access_spctdif access_su access_supctdif
0 5001773675 Tunbridge town, Orange County, Vermont None 1337.0 274.0 55.0 1.837275 1.046994 0.794941 MULTIPOLYGON (((494016.737 152571.233, 494092.... 1.815529 1.019366 0.716094 0.021746 0.595325 0.027628 1.337026 0.078847 5.218064
1 5000975175 Victory town, Essex County, Vermont None 70.0 23.0 4.0 1.361813 0.763767 0.615186 MULTIPOLYGON (((546966.722 227691.457, 547619.... 1.508539 0.827602 0.669212 -0.146726 -5.111774 -0.063835 -4.011308 -0.054027 -4.206370
2 5000159650 Ripton town, Addison County, Vermont None 739.0 128.0 86.0 2.251090 1.499919 0.988657 MULTIPOLYGON (((454852.030 171483.437, 457345.... 2.000679 1.318757 0.879293 0.250411 5.889570 0.181163 6.427232 0.109365 5.854791
3 5000108575 Bridport town, Addison County, Vermont None 1225.0 282.0 82.0 3.680058 2.270777 1.104514 MULTIPOLYGON (((426786.611 165125.435, 426899.... 4.389905 2.686803 1.258349 -0.709846 -8.796153 -0.416026 -8.391716 -0.153835 -6.510536
4 5001709325 Brookfield town, Orange County, Vermont None 1244.0 293.0 95.0 3.327164 1.986888 1.660490 MULTIPOLYGON (((485984.066 169492.152, 488501.... 4.526437 2.724492 2.190710 -1.199273 -15.270357 -0.737604 -15.655802 -0.530220 -13.767660
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
248 5001903550 Barton town, Orleans County, Vermont None 2872.0 736.0 168.0 4.645133 2.803754 2.202458 MULTIPOLYGON (((519223.514 249590.456, 519586.... 5.088039 3.110618 2.456050 -0.442906 -4.550478 -0.306864 -5.188451 -0.253592 -5.443638
249 5000126300 Ferrisburgh town, Addison County, Vermont Metropolitan NECTA 2646.0 581.0 159.0 2.839361 1.723960 1.333271 MULTIPOLYGON (((428034.770 184292.295, 428115.... 2.439431 1.500809 1.162266 0.399930 7.576167 0.223151 6.919916 0.171005 6.852414
250 5000785150 Winooski city, Chittenden County, Vermont Metropolitan NECTA 7997.0 947.0 1674.0 8.424424 4.662536 3.572591 MULTIPOLYGON (((444153.318 221625.708, 444517.... 9.143431 5.087106 3.897623 -0.719007 -4.092743 -0.424570 -4.354727 -0.325032 -4.351038
251 5002756350 Pomfret town, Windsor County, Vermont Micropolitan NECTA 916.0 265.0 53.0 2.209271 0.976377 0.771353 MULTIPOLYGON (((492994.175 129381.628, 494043.... 2.571441 1.096089 0.884632 -0.362169 -7.575631 -0.119712 -5.776317 -0.113280 -6.840612
252 5002702575 Baltimore town, Windsor County, Vermont None 229.0 50.0 4.0 3.399196 2.037168 1.449587 MULTIPOLYGON (((493532.085 96445.694, 496532.9... 3.479168 2.202876 1.359497 -0.079972 -1.162662 -0.165709 -3.908182 0.090090 3.207095

253 rows × 19 columns

Results¶

Hypothesis 1 - Spatial Dimension¶

Calculate mean access by NECTA type.

In [29]:
# Weekday accessibility by classification table
mapping_df['necta'].fillna('Rural', inplace=True)
# Replace 'Metropolitan NECTA' and 'Micropolitan NECTA' with 'Metropolitan' and 'Micropolitan', respectively
mapping_df['necta'] = mapping_df['necta'].replace({'Metropolitan NECTA': 'Metropolitan', 'Micropolitan NECTA': 'Micropolitan'})

# Group by NECTA and calculate means
means_by_metro = mapping_df.groupby('necta').mean()
weekdaymean_by_metro = means_by_metro[['access_w']]

weekdaymean_by_metro.columns = ['Mean Access']
table_1 = tabulate(weekdaymean_by_metro, headers='keys', tablefmt='simple_grid')
print(table_1)
necta           Mean Access
------------  -------------
Metropolitan     -0.0611722
Micropolitan     -0.157
Rural            -0.0570708

Figure 2: Weekday accessibility¶

In [30]:
# Map of weekday pharmacy accessibility
fig2, ax = plt.subplots(figsize=(12, 12), facecolor = 'white')

extreme = math.ceil(max([abs(min(mapping_df['access_w'])) , max(mapping_df['access_w']) ]))

mapping_df.plot(column='access_w', cmap='PRGn', legend = False, ax=ax, vmax=extreme, vmin=-extreme)
mapping_df.dissolve().boundary.plot(ax=ax, color='black', linewidth=1)

# Dissolve the geometries for each group to merge adjacent polygons 
metropolitan_df = mapping_df[mapping_df['necta'] == 'Metropolitan']
micropolitan_df = mapping_df[mapping_df['necta'] == 'Micropolitan']

metropolitan_boundary = metropolitan_df.dissolve(by='necta')['geometry'].boundary
micropolitan_boundary = micropolitan_df.dissolve(by='necta')['geometry'].boundary

# Plot exterior boundaries of the 'Metropolitan NECTA' group
metropolitan_boundary.plot(ax=ax, color='black', linewidth=1.2)

# Plot exterior boundaries of the 'Micropolitan NECTA' group
micropolitan_boundary.plot(ax=ax, color='black',linestyle = "dashed", linewidth=1.5)

legend_elements = [
    Line2D([0], [0], color='black', lw=2, label='Metropolitan'),
    Line2D([0], [0], color='black', linestyle = 'dashed',lw=2, label='Micropolitan')
]
ax.legend(handles=legend_elements, loc='lower right', fontsize=12, bbox_to_anchor=(1,.2))
colorbar = plt.cm.ScalarMappable(cmap='PRGn', norm=plt.Normalize(vmin=-extreme, vmax=extreme))

# Colorbar settings
cbar = plt.colorbar(colorbar, shrink = .5, pad=.02, label = 'Difference in Accessibility', location="bottom")
cbar.set_ticks([-extreme, -(extreme/2), 0, extreme/2, extreme])

    
plt.axis('off')
plt.show()

# Save figure
if figsave:
    fig2.savefig(make_fig_file(2), dpi=300)

Figure 2. Spatial accessibility during conventional weekday business hours, representing a time period when all pharmacies are operational (maximum accessibility).

Statistical Significance¶

In [31]:
# Check for normal distribution in weekday metro, micro, rural
access_w_metro = mapping_df[mapping_df['necta'] == 'Metropolitan']['access_w']
access_w_micro = mapping_df[mapping_df['necta'] == 'Micropolitan']['access_w']
access_w_rural = mapping_df[mapping_df['necta'] == 'Rural']['access_w']

plt.hist(access_w_rural, bins=10)
plt.hist(access_w_metro, bins=10)
plt.hist(access_w_micro, bins=10)
print("Not normal distribution. Cannot use ANOVA test. Use Kruskal-Wallis instead.")
Not normal distribution. Cannot use ANOVA test. Use Kruskal-Wallis instead.
In [32]:
# Kruskal Wallis Test for significant difference of means between necta classification during conventional business hours
h_statistic_1, p_value_1 = kruskal(access_w_metro, access_w_micro, access_w_rural)
                 
print("Kruskal-Wallis H Statistic:", h_statistic_1)
print("P-value:", p_value_1)

alpha = 0.05
if p_value_1 < alpha:
    print("Reject the null hypothesis. There is a significant difference in mean access during conventional weekday \nbusiness hours between metropolitan, micropolitan, and rural towns.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference in mean access between groups.")
Kruskal-Wallis H Statistic: 5.056607097415508
P-value: 0.07979427262586712
Fail to reject the null hypothesis. There is no significant difference in mean access between groups.
In [33]:
dunn_data = np.concatenate([access_w_metro, access_w_micro, access_w_rural])
dunn_groups = ['access_w_metro']*len(access_w_metro) + ['access_w_micro']*len(access_w_micro) + ['access_w_rural']*len(access_w_rural)
dunn_df = pd.DataFrame({'value': dunn_data, 'group': dunn_groups})

# Perform Dunn's test with p-value adjustment (e.g., 'holm')
dunn_results = sp.posthoc_dunn(dunn_df, val_col='value', group_col='group', p_adjust='holm')
print("\nDunn's Post Hoc Test Results (p-values):\n", dunn_results)
Dunn's Post Hoc Test Results (p-values):
                 access_w_metro  access_w_micro  access_w_rural
access_w_metro         1.00000        0.661450        0.661450
access_w_micro         0.66145        1.000000        0.079513
access_w_rural         0.66145        0.079513        1.000000
In [34]:
# Create the scatter plot
import matplotlib.patches as mpatches

nectas = {'Rural': '#ccebc5',
          'Micropolitan': '#7bccc4', 
          'Metropolitan': '#0868ac'
          }

color_list = [nectas[group] for group in mapping_df['necta']]

legend_handles = []
for key, value in nectas.items():
    patch = mpatches.Patch(color=value, label=key)
    legend_handles.append(patch)

mapping_df['pop_density'] = mapping_df['total_pop'] / (mapping_df.geometry.area / 10**6)
mapping_df.plot.scatter('pop_density', 'access_w', c=color_list, alpha=0.7)

# Add labels and title
plt.xlabel("Population Density")
plt.xscale('log')
plt.ylabel("Weekday Access")
plt.legend(handles=legend_handles)
Out[34]:
<matplotlib.legend.Legend at 0x7f97c56fd910>

Scatterplot of Population and Weekday Access to illustrate relationship between NECTA classification, population density of county subdivisions, and spatial accessibility.

Hypothesis 2 - Temporal Dimension¶

Calculate mean access by type of day, and then map accessibility by county subdivision for each type of day.

In [35]:
# Mean accessibility by day table
mean_access_day = mapping_df[['access_w', 'access_s', 'access_su']].mean()
mean_access_day_df = mean_access_day.to_frame().rename(columns={0: 'Mean Access'})
mean_access_day_df.index = ['Weekday', 'Saturday', 'Sunday']
table_2 = tabulate(mean_access_day_df, headers='keys', tablefmt='simple_grid')
print(table_2)
            Mean Access
--------  -------------
Weekday      -0.0765971
Saturday     -0.0481056
Sunday       -0.0266851

Figure 3: Accessibility variation by day of the week¶

In [36]:
# Map accessibility by day of the week
mapping_df1 = mapping_df

extreme = math.ceil(max([abs(mapping_df1[['access_w', 'access_s', 'access_su']].min().min()),  # Min accessibility value
mapping_df1[['access_w', 'access_s', 'access_su']].max().max()]))  # Max accessibility value

fig3, axs = plt.subplots(1, 3, figsize=(22.5, 10), facecolor = 'white')
plt.subplots_adjust(wspace=-.4)

for i, column in enumerate(['access_w', 'access_s', 'access_su']):
    ax = axs[i]
    mapping_df1.plot(column=column, cmap='PRGn', linewidth=0.2, ax=ax, edgecolor='0.8', legend=False,
                vmin=-extreme, vmax=extreme)
    mapping_df1.dissolve().boundary.plot(ax=ax, color='black', linewidth=1)
    
    # Plot only the merged exterior boundaries of the 'Metropolitan NECTA' group
    metropolitan_boundary.plot(ax=ax, color='black', linewidth=.9)

    # Plot only the merged exterior boundaries of the 'Micropolitan NECTA' group
    micropolitan_boundary.plot(ax=ax, color='black', linestyle = 'dashed', linewidth=.9)
    
    axs[0].set_title(label='a) Weekday', fontsize=14)
    axs[1].set_title('b) Saturday', fontsize=14)
    axs[2].set_title('c) Sunday', fontsize=14)
    ax.axis('off')

cbar = plt.colorbar(plt.cm.ScalarMappable(norm=plt.Normalize(vmin=-extreme, vmax=extreme), cmap='PRGn'), ax=axs, #Max was set to 20 for visualization purposes
                    orientation='horizontal', pad=.02, shrink=.2)
cbar.set_label('Difference in Accessibility')
cbar.set_ticks([-extreme, -(extreme/2), 0, extreme/2, extreme])

#fig.patch.set_edgecolor('black') # Figure Border
#fig.patch.set_linewidth(2)       # Figure Border
plt.subplots_adjust(right=1)

cbar.ax.set_position([0.45, 0.05, .2, 0.05])

ax.legend(handles=legend_elements, loc='lower right', fontsize=12, bbox_to_anchor=(.9,-.129))

plt.show()

#Save Figure
if figsave:
    fig3.savefig(make_fig_file(3), dpi=300)

Figure 3. Spatial accessibility across days of the week. Each map represents the maximum accessibility on each day when all pharmacies that plan to open that day are operational.

Statistical Significance¶

In [37]:
access_w_test = pd.Series(mapping_df['access_w'])
access_s_test = pd.Series(mapping_df['access_s'])
access_su_test = pd.Series(mapping_df['access_su'])

# Check for distribution between days of week. 
plt.hist(access_w_test, bins=10) 
plt.hist(access_s_test , bins=10) 
plt.hist(access_su_test, bins=10) 
print("Distribution is not normal. Cannot use ANOVA test. Use Kruskal-Wallis instead.")
Distribution is not normal. Cannot use ANOVA test. Use Kruskal-Wallis instead.
In [38]:
# Run Kruskal-Wallis Test
h_statistic_2, p_value_2 = kruskal(access_w_test, access_s_test, access_su_test)
                 
print("Kruskal-Wallis H Statistic:", h_statistic_2)
print("P-value:", p_value_2)

alpha = 0.05
if p_value_2 < alpha:
    print("Reject the null hypothesis. There is a significant difference in mean access between weekdays, Saturdays, and Sundays.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference in mean access between days.")
                    
Kruskal-Wallis H Statistic: 1.3426207532213146
P-value: 0.5110384859669458
Fail to reject the null hypothesis. There is no significant difference in mean access between days.
In [39]:
dunn_data = np.concatenate([access_w_test, access_s_test, access_su_test])
dunn_groups = ['access_w_test']*len(access_w_test) + ['access_s_test']*len(access_s_test) + ['access_su_test']*len(access_su_test)
dunn_df = pd.DataFrame({'value': dunn_data, 'group': dunn_groups})

# Perform Dunn's test with p-value adjustment (e.g., 'holm')
dunn_results = sp.posthoc_dunn(dunn_df, val_col='value', group_col='group', p_adjust='holm')
print("\nDunn's Post Hoc Test Results (p-values):\n", dunn_results)
Dunn's Post Hoc Test Results (p-values):
                 access_s_test  access_su_test  access_w_test
access_s_test        1.000000        0.858826       0.888254
access_su_test       0.858826        1.000000       0.858826
access_w_test        0.888254        0.858826       1.000000

Hypothesis 3 - Spatio-Temporal Dynamics¶

Calculate mean access by NECTA classification and type of day. Simplify NECTA classification to rural and urban categories by merging micropolitan and metropolitan into "urban". Then calculate the percentage change from urban to rural for each type of day by finding the normalized percent difference with:
(Metro - Micro Access) / (Metro + Micro) * 100
(Metro - Rural Access) / (Metro + Rural) * 100
(Micro - Rural Access) / (Micro + Rural) * 100

In [40]:
# Accessibility by Day and metropolitan/micropolitan Table
means_by_metro = mapping_df.groupby('necta').mean()[['access_w','access_s','access_su']]
means_by_metro = means_by_metro.rename(columns={"access_w": "Weekday", 
                                                "access_s": "Saturday",
                                                "access_su": "Sunday"})
means_by_metro = means_by_metro.transpose()
means_by_metro = means_by_metro.rename(columns={"Metropolitan": "Metro", 
                                                "Micropolitan": "Micro"})

def pctdiff(df, col1, col2):
    newcol = col1 + "_" + col2
    df[newcol] = (df[col1] - df[col2])/(df[col1] + df[col2]) * 100
    return df

means_by_metro = round(pctdiff(means_by_metro, "Metro", "Micro"), 2)
means_by_metro = round(pctdiff(means_by_metro, "Metro", "Rural"), 2)
means_by_metro = round(pctdiff(means_by_metro, "Micro", "Rural"), 2)

means_by_metro

# print(tabulate(means_by_metro, tablefmt = 'fancy_grid', headers=["","N","Weekday Mean Access", "Saturday Mean Access", "Sunday Mean Access"]))
#means_by_metro
Out[40]:
necta Metro Micro Rural Metro_Micro Metro_Rural Micro_Rural
Weekday -0.06 -0.16 -0.06 -43.92 -0.0 45.45
Saturday -0.04 -0.09 -0.04 -44.77 -0.0 38.46
Sunday -0.03 -0.08 -0.01 -44.61 50.0 77.78

Table 1. Mean Accessibility on weekdays, Saturdays, and Sundays, broken down by Metropolitan, Micropolitan, and Rural county subdivisions. Normalized percent differences are calculated to see how accessibility gaps differ throughout the week.