To respond effectively to the Human Immunodeficiency Virus (HIV) epidemic, communities need local data to guide decision making and plan appropriate prevention services. Because no single data source provides comprehensive local data, integrating data from different sources is essential for effective HIV prevention and care program planning.1 A principal challenge to planning groups concerns the best means to organize and analyze diverse data. In this article, we illustrate the utility of one potential tool to meet this challenge, geographic information systems (GIS) analyses.
Geographic information systems are electronic databases for collecting, storing, managing, transforming, analyzing, and displaying data that are linked to geographical features and locations. By providing a framework for integrating spatially referenced data, GIS tools facilitate spatial analysis and can produce maps that make data accessible. GIS tools can enable planners to understand where specific populations at risk live and what kind of access they have to existing services.
In this article, we apply GIS tools and simple spatial analysis methods to describe service needs, combining the kinds of data sets that most planning groups could easily access. We focus our analyses on an epicenter city, Chicago. We extend prior work using GIS for AIDS service planning2 by exploring the needs of a high-priority population for HIV prevention services, young black men who have sex with men (MSM).3-10
Methods
We used GIS mapping to produce a series of maps that show where the population of black MSM lives, where HIV is most prevalent, and where specific kinds of services are located. We also used empirical Bayesian estimation, Spearman rank correlations, and Moran's I to analyze relationships among variables.
Data sources
We obtained Census 2000 data for each of the 57 ZIP code tabulation areas (ZCTAs)11 in Chicago, along with GIS shape files of the ZCTA boundaries, from the Northeastern Illinois Planning Commission (http://www.nipc.org). ZIP codes and ZCTAs are not identical: ZIP codes are attributes of individual addresses used to facilitate delivering mail, and ZCTAs are approximations of the geographic boundaries of the ZIP code service areas in effect at the time of the 2000 Census.11 We selected ZCTAs as the appropriate spatial boundary system because we wanted to merge census data with locally collected survey data for which the only geographic identifiers available were participants' ZIP codes. We used population estimates originally drawn from Summary File 1 to calculate the following variables for each ZCTA: the percentage of residents who were black or African American, and the number of black males aged 15 to 24.
The Chicago Department of Public Health (CDPH) provided us with HIV prevalence data (number of HIV cases per 100,000 population in 2002) for most ZCTAs in the city. To protect confidentiality, the CDPH would not release prevalence data for small geographical units (eg, census tracts), or for ZCTAs that had either populations of fewer than 1,000 or fewer than five HIV cases. These ZCTAs were eliminated from some analyses.
HIV prevention services data were collected in November 2005. Using Chicago-area directories of HIV/AIDS service organizations, we identified 130 agencies that offered at least one form of HIV prevention. We contacted each organization via telephone to gather information regarding the specific prevention services they provide (eg, services targeted to MSM). We gathered this information from agency Web sites if we were unable to contact them by phone. We collected these data from 95.4 percent (n = 124) of the 130 identified service agencies.
Aggregating the services data by ZIP code enabled us to discern the number of prevention services in each ZCTA specifically tailored to MSM. We assumed that the six agencies we were unable to contact did not provide services tailored to MSM. To correct for the fact that ZCTAs vary in both geographic size and population, we calculated two kinds of prevention service measures for each ZCTA: MSM prevention service density (organizations/square mile), and MSM prevention service rate for young black men (number of organizations/1,000 black males aged 15 to 24).
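The two service measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study (which was written in R); the `Zcta` record, its field names, and the example values are hypothetical. Returning `None` for a zero denominator is also an assumption about how an undefined rate might be handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zcta:
    """Hypothetical record for one ZCTA; field names are illustrative only."""
    zcta: str
    msm_agencies: int        # agencies offering prevention services tailored to MSM
    area_sq_miles: float     # land area of the ZCTA
    black_males_15_24: int   # census count of black males aged 15 to 24


def service_density(z: Zcta) -> float:
    """MSM prevention organizations per square mile."""
    return z.msm_agencies / z.area_sq_miles


def service_rate(z: Zcta) -> Optional[float]:
    """MSM prevention organizations per 1,000 black males aged 15 to 24.

    Returns None when there are no such residents, because the rate is
    undefined in that case (an assumption made for this sketch).
    """
    if z.black_males_15_24 == 0:
        return None
    return 1000.0 * z.msm_agencies / z.black_males_15_24


# Made-up example values, not data from the study.
example = Zcta("00000", msm_agencies=3, area_sq_miles=4.0, black_males_15_24=1500)
print(service_density(example))  # 0.75 organizations per square mile
print(service_rate(example))     # 2.0 organizations per 1,000 young black men
```

Normalizing by area and by the population of interest gives two different views of the same counts: density describes how closely spaced services are, and the rate describes how many services exist relative to the people they are meant to reach.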
CITY Project survey data
The survey data were derived from the Community Intervention Trial for Youth (CITY) Project, a study of MSM 15 to 25 years of age. A time-place sampling strategy5,12,13 was used to recruit study participants at diverse venues where MSM were likely to congregate. Interviewers administered a brief screening instrument to determine the eligibility of potential respondents. Men were considered eligible to participate in the survey if they were between 15 and 25 years of age, reported having sexual contact with a male in the past year (defined as oral or anal sex or any other physical contact with another male that leads to orgasm), were a resident of the Chicago metropolitan area, and self-identified as black. Nine hundred and ten eligible and consenting men completed a structured 20-minute interview.
For the present analyses, we excluded 122 men who resided in a ZIP code located outside the city of Chicago's official boundaries. Among those in our analytic sample (n = 788), the mean age was 21.5 years (SD = 2.3); 31 percent were under 21 years of age. Fifty-eight percent (n = 454) of the young men identified as gay.
Measures of unprotected sexual activity
Participants were asked about the number of men they had sexual contact with in the past 3 months, the number of men they had anal sex with in the past 3 months, and how many of these anal sex encounters were without condoms. Participants who reported having had unprotected anal intercourse with another man within the last 3 months were coded as having engaged in risky behavior. For our analyses, we computed a Bayesian rate for the percentage of young black MSM in each ZCTA who reported engaging in unprotected anal intercourse in the prior 3 months. We used global empirical Bayesian estimation14 to weight percentages toward the global rate where the sample in the ZCTA was smaller and toward the observed rate where the sample in the ZCTA was larger.
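The shrinkage behavior described above can be illustrated with a short Python sketch. The article does not give the estimator's formula, so the method-of-moments form below is an assumption: it is one common way to compute a global empirical Bayesian rate and may differ in detail from the routine actually used. The function name and arguments are hypothetical, and every sample size is assumed to be greater than zero.

```python
from typing import List, Sequence


def global_eb_rates(events: Sequence[int], sample_sizes: Sequence[int]) -> List[float]:
    """Shrink each area's observed rate toward the global rate.

    events[i]       -- respondents in area i reporting the behavior
    sample_sizes[i] -- respondents sampled in area i (assumed > 0)

    Uses a method-of-moments estimate of the between-area variance
    (an assumed variant, not necessarily the one used in the study).
    """
    total_n = sum(sample_sizes)
    global_rate = sum(events) / total_n
    raw_rates = [e / n for e, n in zip(events, sample_sizes)]
    mean_n = total_n / len(sample_sizes)

    # Sample-size-weighted variance of the raw rates around the global rate.
    weighted_var = sum(
        n * (r - global_rate) ** 2 for r, n in zip(raw_rates, sample_sizes)
    ) / total_n

    # Remove the part attributable to sampling noise; clamp at zero.
    between_var = max(weighted_var - global_rate / mean_n, 0.0)

    estimates = []
    for r, n in zip(raw_rates, sample_sizes):
        denom = between_var + global_rate / n
        # The weight approaches 1 for large samples and 0 for small ones.
        weight = between_var / denom if denom > 0 else 0.0
        estimates.append(global_rate + weight * (r - global_rate))
    return estimates


# Made-up counts: one area with 2 respondents, two areas with 100 each.
print(global_eb_rates([2, 10, 90], [2, 100, 100]))
```

In this made-up example the area with only two respondents is pulled substantially toward the global rate, while the two areas with 100 respondents stay close to their observed rates.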
Exploratory spatial analyses
We conducted our analyses and generated maps with R version 2.3.1.15 In addition to the base R software, we used several add-on packages for R16-21 to facilitate data management and spatial analysis. To visualize aggregate data from the different sources, we created choropleth maps. Each variable's distribution was split into quintiles where possible. In cases where the frequency distribution was so severely skewed and/or multimodal as to make quintile shading a poor representation of the data, we used natural breaks.
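As an illustration of quintile shading, the following Python sketch assigns each value to one of five classes and reports when quintiles are not usable. The rule for detecting an unusable split (repeated cut points, as happens when many areas share the same value) is an assumption made for this sketch; the article does not state how the choice between quintiles and natural breaks was made, and natural breaks are not implemented here.

```python
from bisect import bisect_left
from statistics import quantiles
from typing import List, Optional, Sequence


def quintile_classes(values: Sequence[float]) -> Optional[List[int]]:
    """Return a class index from 0 (lowest fifth) to 4 (highest fifth) per value.

    Returns None when the four cut points are not all distinct, which
    signals that a different classification scheme is needed
    (a hypothetical rule chosen for this sketch).
    """
    cuts = quantiles(values, n=5, method="inclusive")
    if len(set(cuts)) < len(cuts):
        return None
    # Count how many cut points lie strictly below each value.
    return [bisect_left(cuts, v) for v in values]


# Evenly spread values split cleanly into five classes.
print(quintile_classes(list(range(1, 11))))

# A heavily skewed variable (mostly zeros) cannot be split into quintiles.
print(quintile_classes([0, 0, 0, 0, 0, 0, 0, 0, 1, 5]))
```

A variable such as MSM prevention service density, where many areas have a value of zero, is the kind of distribution for which a quintile split breaks down.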
To examine relationships among variables, we computed Spearman rank correlation coefficients (rS), which are designed for nonnormally distributed data and are robust to the influence of outliers. We tested each variable for spatial autocorrelation by computing a global Moran's I, which is similar to a correlation coefficient.14 A value of -1.00 indicates perfect negative autocorrelation (neighboring ZCTAs are dissimilar), a value of 0 indicates no autocorrelation (values are spatially random), and a value of +1.00 indicates perfect positive autocorrelation (neighboring ZCTAs are similar). Moran's I (hereafter called I) is dependent on how one defines which zones are neighbors to one another; we defined neighbors as ZCTAs that share a common border. This definition of neighbors is consistent both with contagious diffusion processes22 and with recent work on the role of streets connecting adjacent areas in explaining residential segregation.23
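Both statistics can be sketched in plain Python. This is an illustration only: the study used R add-on packages, the function names are hypothetical, and the unstandardized binary weights (1 for ZCTAs sharing a border, 0 otherwise) are an assumption, since the article does not state the weighting style. Significance tests are omitted.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither variable is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def morans_i(values: Sequence[float], neighbors: Dict[int, List[int]]) -> float:
    """Global Moran's I with binary contiguity weights.

    neighbors[i] lists the indices of zones that share a border with zone i
    and must be symmetric (if j is in neighbors[i], i is in neighbors[j]).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(nbs) for nbs in neighbors.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, nbs in neighbors.items() for j in nbs)
    return (n / s0) * cross / sum(d * d for d in dev)


# Four made-up zones arranged in a row: 0-1-2-3.
row = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morans_i([1, 2, 3, 4], row))    # similar neighbors: positive I
print(morans_i([1, -1, 1, -1], row))  # alternating neighbors: negative I
```

Because I depends entirely on the `neighbors` structure, a different definition of adjacency (for example, distance-based neighbors) would generally yield a different value for the same variable.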
Because the ZCTAs vary widely in geographic area (Range = 0.02-13.32 sq. miles, M = 4.11, SD = 2.99) and total population (7-114,000, M = 51,830, SD = 30,740), we mapped ratios instead of raw counts. Several ZCTAs in downtown Chicago are anomalous because they have small geographic areas and resident populations, but large numbers of prevention service providers. These factors create implausible rates, so we eliminated these areas from our analyses.
Results
Descriptive statistics, Spearman rank correlations, and the global Moran's I for each variable in our analyses are presented in Table 1.
To see where young black MSM live, we compared several maps. Figure 1A shows that ZCTAs with high percentages of black residents are concentrated (I = 0.57, P < .01) in the south end of the city, and to a lesser extent on the west side. The percentage of residents who are young black men, shown in Figure 1B, is weakly spatially autocorrelated (I = 0.14, P < .05) but quite highly correlated with the percentage of residents who are black (rS = 0.74, P < .01). The rate of young black MSM per 1,000 population (I = 0.09, P < .01, Fig 1C) is weakly autocorrelated, but the rate of young black MSM per 1,000 young black men (I = 0.26, P < .01, Fig 1D) shows stronger autocorrelation.
Comparing Figures 1A and 1C shows that young black MSM, when considered as a portion of the total population, live where the overall black population is most concentrated (rS = 0.63, P < .01). But, a more nuanced view of where young black MSM live emerges when we examine the rate of young black MSM with reference to the population of young black men. Comparison of Figures 1A and 1D shows that proportionally more young black MSM reside in the northwestern tip of Chicago, the near-north side on the lakeshore, and in one pocket on the west side (Brighton Park, Archer Heights, and Gage Park) than expected, given where the overall black population is concentrated. Few young black MSM live in the northeast of Chicago.
Levels of HIV prevalence are autocorrelated (I = 0.32, P < .01), with high prevalence clustering along the shore on the northeast edge of the city and west of the downtown area (Fig 2A). HIV prevalence is correlated with the percentage of residents who are black (rS = 0.42, P < .01), but not with the rate of young black MSM per 1,000 young black men (rS = -0.07, ns).
Figure 2B shows a map of the density of HIV prevention services for MSM and Figure 2C shows the rate of those services relative to the population of young black men. The highest levels of service density and service rates occur in downtown Chicago and the northeast along the lakeshore. In the latter map, the two downtown ZCTAs where there were no young black male residents are marked with unfilled circles. As Figure 2B illustrates, over half of the ZCTAs do not contain any prevention services tailored to the needs of MSM (n = 32 with density = 0.00). Furthermore, unlike HIV prevalence, neither MSM prevention service density (I = -0.02, ns) nor MSM prevention service rate (I = -0.03, ns) is spatially autocorrelated. Thus, while HIV prevalence clusters spatially, prevention services tailored to MSM are randomly distributed in spatial terms.
Comparing Figures 1C and 1D to Figures 2B and 2C shows that service density is clearly highest in the downtown area and along the northeast edge of the city, where few young black MSM live. MSM service density is uncorrelated with the rate of young black MSM per 1,000 population (rS = 0.17, ns), and the rate of young black MSM per 1,000 young black men (rS = 0.00, ns). The MSM prevention service rate is similarly uncorrelated with the rates of young black MSM per 1,000 population (rS = 0.13, ns) and per 1,000 young black male residents (rS = -0.01, ns).
The map in Figure 2D shows the rates of unprotected anal intercourse among young black MSM. The rates of unprotected anal intercourse are not spatially autocorrelated (I = -0.06, ns), and are also not correlated with HIV prevalence (rS = 0.07, ns) or with MSM prevention service rate for young black men (rS = 0.07, ns). Clearly, prevention services tailored to MSM are not systematically located where young black MSM who are engaging in risky sexual behaviors actually live.
To identify underserved areas, we looked for a combination of three characteristics among the 44 ZCTAs for which we could compute Bayesian estimates of the rate of unprotected anal intercourse. We selected ZCTAs with high concentrations of young black MSM (ie, in the top two quintiles relative to either the total population or to the population of young black men), high rates (ie, in the top quintile) of unprotected anal intercourse among young black MSM, and no services tailored for MSM (service rate of zero). We also identified contrasting areas that are adequately served (relatively speaking) by selecting ZCTAs with low rates (ie, in the bottom quintile) of unprotected anal intercourse among young black MSM, high concentrations of young black MSM (as defined above for selecting underserved areas), and MSM prevention service rates greater than zero. The three underserved areas and the contrasting adequately served areas are both shown in Figure 3.
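The selection rules above amount to a simple filter over per-ZCTA summaries, which can be sketched as follows. The `ZctaSummary` record, its field names, and the example rows are hypothetical; quintiles are coded here from 1 (lowest) to 5 (highest) as an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ZctaSummary:
    """Hypothetical per-ZCTA summary; quintiles run from 1 (lowest) to 5 (highest)."""
    name: str
    msm_per_population_quintile: int   # young black MSM per 1,000 population
    msm_per_young_men_quintile: int    # young black MSM per 1,000 young black men
    uai_quintile: int                  # estimated rate of unprotected anal intercourse
    msm_service_rate: float            # MSM services per 1,000 young black men


def high_concentration(z: ZctaSummary) -> bool:
    """Top two quintiles on either measure of young black MSM concentration."""
    return max(z.msm_per_population_quintile, z.msm_per_young_men_quintile) >= 4


def is_underserved(z: ZctaSummary) -> bool:
    """High concentration, top-quintile risk, and no MSM-tailored services."""
    return high_concentration(z) and z.uai_quintile == 5 and z.msm_service_rate == 0


def is_adequately_served(z: ZctaSummary) -> bool:
    """High concentration, bottom-quintile risk, and at least some services."""
    return high_concentration(z) and z.uai_quintile == 1 and z.msm_service_rate > 0


# Made-up rows for illustration, not data from the study.
areas: List[ZctaSummary] = [
    ZctaSummary("Area A", 5, 3, 5, 0.0),
    ZctaSummary("Area B", 2, 4, 1, 1.2),
    ZctaSummary("Area C", 1, 2, 5, 0.0),
]
print([z.name for z in areas if is_underserved(z)])        # ['Area A']
print([z.name for z in areas if is_adequately_served(z)])  # ['Area B']
```

Writing the criteria as explicit predicates makes it easy for a planning group to adjust a threshold (for example, widening the risk criterion to the top two quintiles) and see how the set of priority areas changes.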
In Humboldt Park, the underserved ZCTA west and a little north of downtown, there are 0.193 young black MSM per 1,000 population, 31.4 percent of whom report having engaged in unprotected anal intercourse recently. South Shore, the underserved ZCTA on the southeast lakeshore, has 23.4 young black MSM per 1,000 young black men and 30.5 percent of the young black MSM report engaging in unprotected anal intercourse. Finally, Elmwood Park has 21.5 young black MSM per 1,000 young black men and a rate of unprotected anal intercourse of 29.3 percent among young black MSM. Both Humboldt Park and South Shore have high HIV prevalence (172.7 and 182.4 cases per 100,000 population); prevalence data were not available for Elmwood Park.
Discussion
We explored the utility of spatial analysis to inform community planning. Specifically, we used GIS techniques to describe the availability of HIV prevention services in Chicago and the correspondence between HIV prevention service locations for MSM and the residential location of young black MSM. In broad terms, we found that HIV prevention services are located in areas where prevalence is high, but not in areas where young black MSM live. The co-location of HIV prevention services in high-prevalence areas reflects the historical evolution of the epidemic in Chicago. HIV prevalence is highest in the places most affected early in the epidemic, the home of Chicago's northside gay community. The rise of HIV prevalence in Chicago's black areas and among its young black MSM is more recent, a shift that our data indicate Chicago may not be fully prepared for. Black communities on the south side of Chicago have disproportionately lower service densities when compared to other areas of the city. Twelve ZIP codes in which young black MSM cluster have no HIV prevention services for MSM. Black MSM may have poor access to MSM-focused HIV prevention services in their local communities. The presence of more HIV prevention services within these neighborhoods may increase awareness and service use among young black MSM.24-27
Our data suggest a different set of priority service areas for blacks in Chicago as an overall population than for young black MSM as a subpopulation of blacks. By combining and spatially displaying data on rates of self-reported unprotected anal intercourse, young black MSM residential density, and HIV prevention service density for MSM, we identified three high-priority areas for increasing service to young black MSM, two of which would not have emerged in an analysis of high-priority areas for blacks as an overall population. These results and the differences between results at the population and subpopulation level illustrate the range of insights that can be obtained by applying spatial analysis techniques to priority at-risk populations.
Advantages to spatial analysis techniques
Although much of the information we present could be presented in a traditional correlation matrix, a clear advantage to representing data in spatial form is that little statistical or geographical training is required to interpret maps accurately and to identify patterns. The choropleth maps we present here can easily convey where people and particular types of services are located. Given that community planning bodies are diverse and combine people with expertise in many areas, but not necessarily in statistics or complex research methods, GIS techniques provide a fairly straightforward means to describe local patterns and present complicated data.
We found that the choropleth maps were fairly robust to outliers compared with traditional statistics. In fact, visual analysis of maps provided us with clues about the presence of outliers. Chicago's downtown area provides a good illustration of the choropleth maps' robustness. In computing traditional correlations, the downtown areas, which are a very small portion of the city's total area and also relatively unpopulated, would distort the correlational findings. The choropleth maps, on the other hand, would not present a distorted picture of the city as a function of these outliers. To the contrary, the choropleth represented visually that the downtown areas are unlike much of the rest of the city.
A third advantage of spatial analyses was that they allowed us to integrate diverse data. We combined publicly available data with data from service directories and survey data to characterize the attributes of geographic areas of Chicago. Integrating these sources of data only required that we be able to link specific pieces of data to a zonal unit (in our case, a ZIP code tabulation area) so that the attributes of each area were described. By combining data in this way, we were able to create a holistic picture of the city and of its young black MSM's ease of access to local prevention services. We should note that our ability to integrate these data with confidence was a function of the minimally acceptable quality of each source of data.
Challenges to using spatial analysis
Using GIS as a planning tool is not without its challenges. We were unable to obtain HIV surveillance data for the specific target population of interest to us because the small number of cases in any one ZIP code posed a potential threat to individuals' confidentiality. Mapping any kind of data about the attributes of individuals who live in a particular area poses such a risk, as well as risks of stigmatization to residents. The problems associated with stigma and privacy that arise when using these techniques might be particularly acute for small geographic areas and for those in which the populations of interest are small. GIS may not be an ethically viable approach for planning when working in a small territory or with a very small population.
A second issue concerns the problem of missing data when sources other than comprehensive ones such as the Census are integrated into an analysis. Spatial analyses work best when one can look at small geographic units and when complete data are available for each of those units. For individual-level data, ideally there should be a large number of cases per geographic unit. Even working at the gross level of aggregation of a ZIP code, we still confronted problems associated with missing data or very few cases per geographic unit. Creating an accurate portrait of the city under such circumstances required the use of complex estimation techniques and reporting data in proportions rather than in raw counts.
Conclusion
GIS techniques hold great potential for service planning. GIS techniques provide researchers and planners with the tools to visualize and understand the needs of at-risk populations. The present work extends the literature by focusing on the availability of services to a specific priority population that is at high risk for HIV infection. Future work in this area might examine the co-location of specific service types and residential population. For instance, we might examine the density of HIV testing services, outreach services, or other specific types of prevention relative to the spatial distribution of a population. Additional work might also explore whether changes in the density of particular types of services are associated with decreases in risk in particular zonal areas.
REFERENCES