Introduction

In December 2019, an outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ignited, resulting in global spread of coronavirus disease 2019 (COVID-19) [1]. Since then, the spread of COVID-19 across the United States (U.S.) has led to significant morbidity and mortality along with an initial surge that overwhelmed healthcare systems [2, 3]. Similar to past pandemics, reports have revealed stark disparities in mortality from COVID-19 in the U.S. among marginalized populations, particularly Black Americans [4,5,6,7,8,9,10,11,12].

Many factors have been hypothesized to influence a greater risk for infection and death from COVID-19 among Black Americans, including densely populated housing, a greater burden of chronic disease, limited healthcare access, higher poverty rates, and higher likelihood of employment as essential workers [13, 14]. These factors apart from medical care, collectively described as social determinants of health (SDH), can be influenced by social policies and shape health in powerful ways. Rather than biological differences, many have argued that differences in SDH are the underlying drivers of COVID-19 disparities [5, 15,16,17,18,19]. The CDC identifies 5 key areas of SDH, which include neighborhood conditions, educational attainment, economic stability, healthcare access, and social contexts [20,21,22,23]. An increasing number of studies have begun to suggest that racial disparities may reflect discrimination propagated by mutually reinforcing, inequitable systems--referred to as structural racism--which could ultimately influence the way in which minorities experience COVID-19 and other illnesses [24,25,26,27].

In this analysis, we used multiple regression models informed by publicly available, county-level data to quantitatively explore how SDH impact COVID-19 mortality in Black Americans. Although literature suggests SDH contribute to racial disparities in COVID-19 mortality, we aimed to provide a quantitative analysis that formally investigates this relationship. Understanding the extent to which SDH may contribute to disparities in COVID-19 outcomes can help facilitate rational public programs that prioritize addressing the most impactful determinants.

Methods

COVID-19 Data

We obtained COVID-19 case and death counts by U.S. county from January 22, 2020, through October 28, 2020, from a publicly available data repository by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) [28]. Data were available for 2831 counties. After excluding 800 counties with fewer than 5 deaths and 5 counties with incomplete variable data, we included a total of 2026 counties in the final analysis. We calculated the case rate and death rate per 100,000 people by county using JHU CSSE data and publicly available census data [29].
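The exclusion rule and rate calculation described above can be illustrated with a minimal sketch. The `County` record, the field names, and the example numbers are hypothetical and used only for illustration; they are not drawn from the JHU CSSE or census files.

```python
from dataclasses import dataclass

MIN_DEATHS = 5  # counties with fewer than 5 deaths were excluded


@dataclass
class County:
    name: str        # hypothetical label, not a real county
    population: int  # census population
    deaths: int      # cumulative COVID-19 deaths


def rate_per_100k(count: int, population: int) -> float:
    """Return events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


def is_included(county: County) -> bool:
    """Apply the minimum-deaths inclusion rule."""
    return county.deaths >= MIN_DEATHS


# Made-up counties for illustration only.
counties = [
    County("A", 50_000, 12),
    County("B", 8_000, 3),     # dropped: fewer than 5 deaths
    County("C", 200_000, 302),
]

death_rates = {
    c.name: rate_per_100k(c.deaths, c.population)
    for c in counties
    if is_included(c)
}
```

The same `rate_per_100k` helper would apply to case counts; the check for incomplete variable data is omitted here for brevity.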

Social Determinants of Health

Given the disparities in death rates reported in Black Americans, we focused on the impacts of SDH during COVID-19 as they pertain to this population, represented by percent non-Hispanic Black residents by county. Based on data availability at the county level and literature on SDH, we selected 20 potential variables of interest and categorized them into socioeconomic, health status, educational, and socio-demographic factors (online supplement Table 1) [20, 26]. After considering the quality, completeness, representation of each category, and potential collinearity of the variables, we chose six final variables. Socioeconomic variables included index of concentration at the extremes (ICE) income and percent uninsured. ICE is a metric created to measure the spatial distribution and polarization of measures of privilege, such as income, across a community [30]. ICE income is defined as the difference between the number of economically privileged households living above the 80th income percentile and the number of deprived households living below the 20th income percentile, divided by the total number of households [31]. Possible values range from −1 to +1, with more negative numbers associated with lower levels of economic privilege and more positive numbers associated with greater levels of privilege. We chose this measure of income disparity because recent literature has linked it to racial inequalities in health outcomes such as pre-term birth and infant mortality [31, 32]. Insurance status is relevant, particularly during a pandemic, as inadequate access to healthcare is a pathway between racism and health outcomes [26]. Percent low birthweight, a measure of health status, is a marker of SDH, while percent adults without high school (HS) diploma, a measure of education, is a well-studied SDH [26, 33, 34]. The socio-demographic variables included incarceration rate and percent households without internet. Given that Black Americans are disproportionately affected by mass incarceration, inclusion of incarceration rate helps capture the detrimental effects of incarceration on the health, employment opportunities, and educational attainment of Black Americans [35]. Percent households without internet is an SDH that may be particularly relevant during a pandemic given the need for timely access to news, guidelines, online learning, and remote work [26, 36]. Data on SDH indicators came from the County Health Rankings database and the Vera Institute of Justice [37, 38].
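The ICE income definition above amounts to a single ratio. The sketch below shows that arithmetic, assuming the household counts above the 80th and below the 20th income percentile have already been tabulated; the function and parameter names are illustrative and do not come from the cited sources.

```python
def ice_income(n_privileged: int, n_deprived: int, n_total: int) -> float:
    """Index of concentration at the extremes for income.

    n_privileged: households above the 80th income percentile
    n_deprived:   households below the 20th income percentile
    n_total:      all households in the county
    Returns a value between -1 (all deprived) and +1 (all privileged).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_privileged < 0 or n_deprived < 0:
        raise ValueError("household counts must be non-negative")
    if n_privileged + n_deprived > n_total:
        raise ValueError("extreme groups cannot exceed total households")
    return (n_privileged - n_deprived) / n_total


# Made-up county: 300 privileged and 100 deprived households out of 1,000.
example = ice_income(300, 100, 1_000)

# Boundary cases: every household deprived, every household privileged.
all_deprived = ice_income(0, 1_000, 1_000)
all_privileged = ice_income(1_000, 0, 1_000)
```

A county with equal numbers of privileged and deprived households scores 0 regardless of how many households fall in the middle of the distribution.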

Covariates

We selected county-level variables known or thought to impact COVID-19 outcomes as covariates for our models, including population density per square kilometer, days since first COVID-19 death, percent over age 65, percent smokers, and percent with obesity, diabetes, chronic obstructive pulmonary disease (COPD), and hypertension [39]. Given variation in testing by county, we used days since first death, as opposed to days since first case, to adjust for differences in outbreak timing in the models because it is relatively independent of testing capacity. Data on clinical covariates and demographics came from the County Health Rankings database and the Centers for Medicare & Medicaid Services [37, 40]. The sources and definitions for all the variables in this study are provided in Table 1.

Table 1 Definition and source of county-level variables in the study

Statistical Analysis

All statistical analyses were performed in IBM SPSS Statistics 26 (Armonk, New York). We used independent, two-tailed t-tests to compare SDH and covariates between counties in the lowest and highest quartile of death rates. We then fit three models using negative binomial regression with a log link to assess the association of SDH with COVID-19 mortality independent of covariate effects. The dependent variable was total COVID-19 deaths, with total county population used as an offset variable. Each variable was treated as a continuous variable, except for ICE income, which was analyzed as a categorical variable in quintiles. The highest quintile contained counties in which the majority of households are most privileged and the lowest quintile contained counties in which the majority of households are most deprived [31]. In model 1 (individual SDH model), we regressed COVID-19 deaths on each SDH separately, controlling for covariates. In model 2 (full SDH model), we regressed COVID-19 deaths on percent Black residents and all SDH together, again controlling for covariates. Finally, model 3 (subgroup and interaction model) was a subgroup analysis, stratifying by counties below (low adverse SDH model, model 3a) or above (high adverse SDH model, model 3b) the median value of each SDH to examine the association of percent Black residents and COVID-19 death rates in counties with varying levels of SDH. Furthermore, an interaction model (model 3c) was fit to test for moderator effects of the SDH on the association of percent Black residents with COVID-19 mortality. Interaction terms were the product of percent Black residents and a dichotomous variable that was equal to 0 if the county was in the low adverse SDH group and equal to 1 if it was in the high adverse SDH group. For all models we calculated incidence rate ratios (IRR) with 95% confidence intervals (CI). The contents of each model are shown in Table 2.
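The structure of the full SDH model can be written compactly. The notation below is introduced here for illustration and is not taken from the original analysis: $D_i$ is the number of COVID-19 deaths in county $i$, $N_i$ the county population (the offset), $B_i$ percent Black residents, $S_{ik}$ the SDH measures (with ICE income entering as quintile indicators), and $C_{ij}$ the covariates.

```latex
\begin{align}
\log \mathbb{E}[D_i] &= \log N_i + \beta_0 + \beta_B B_i
    + \sum_{k} \beta_k S_{ik} + \sum_{j} \gamma_j C_{ij}, \\
\mathrm{IRR}_k &= \exp(\beta_k).
\end{align}
```

Because of the log link and the population offset, each $\mathrm{IRR}$ is the multiplicative change in the death rate per one-unit increase in the corresponding variable. In the interaction model, the linear predictor would additionally contain a term $\delta \, B_i H_i$, where $H_i \in \{0, 1\}$ marks the high adverse SDH group; $\exp(\delta)$ is then the ratio of the IRR for percent Black residents in the high group to that in the low group. Whether a main effect for $H_i$ was also included is not stated in the text, so it is left out of this sketch.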

Table 2 Description of regression models included in analysis

Results

The analysis included 2026 counties from the District of Columbia and all states (online supplement Table 2). Baseline characteristics of included and excluded counties are shown in online supplement Table 3 and online supplement Table 4. A comparison of the SDH and covariates between counties in the lowest and highest quartile of COVID-19 death rates is shown in Table 3. Counties in the lowest and highest quartiles had mean COVID-19 death rates of 20.9 and 151.0 per 100,000 (p < 0.001), respectively. Counties in the lowest quartile had 5.0% Black residents, compared with 22.7% in the highest quartile (p < 0.001). Counties in the highest quartile of death rates had greater levels of adverse SDH than counties in the lowest quartile of death rates (p < 0.001). Counties in the highest quartile had significantly lower socioeconomic status, educational attainment, and internet access, and significantly higher rates of low birthweight and incarceration. All covariates, except population density, days since first death, and percent over age 65, showed differences between the quartiles (p < 0.001). Counties in the highest quartile had a significantly increased prevalence of medical comorbidities. While timing of the first COVID-19 death was similar, the days since first case was lower in counties in the highest quartile than in counties in the lowest quartile, suggesting a lag in testing in counties with higher death rates.

Table 3 Comparison of variables between counties in the lowest and highest quartiles of COVID-19 death rates

The percent Black residents and each SDH were significantly associated with the COVID-19 death rate (individual SDH model, Table 4). Each one percentage point increase in percent Black residents, percent uninsured adults, percent low birthweight, percent adults without HS diploma, incarceration rate, and percent households without internet in a county increased the rate of COVID-19 deaths by 0.9% (95% CI 0.5–1.3%), 1.9% (95% CI 1.1–2.7%), 7.6% (95% CI 4.4–11.0%), 3.5% (95% CI 2.5–4.5%), 5.4% (95% CI 1.3–9.7%), and 3.4% (95% CI 2.5–4.2%), respectively. The lowest and second lowest quintiles of the ICE income, which include less privileged counties, are associated with increased COVID-19 death rates by 67.5% (95% CI 35.9–106.6%) and 36% (95% CI 13.0–63.6%), respectively. A sensitivity analysis of an alternate income measure, median income in a county, had a similar association with COVID-19 death rates as the ICE income measure (results not shown).

Table 4 Regression results for individual SDH and full SDH models

When including the six SDH together, percent Black residents, ICE income quintiles, percent uninsured adults, percent low birthweight, and incarceration rate in a county were no longer associated with COVID-19 death rates (full SDH model, Table 4). Percent households without internet (IRR 1.024, 95% CI 1.013–1.034) and percent adults without HS diploma (IRR 1.017, 95% CI 1.004–1.031) remained positively associated with COVID-19 death rate.
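To relate the IRRs reported here to the percent changes quoted for the individual SDH model, the conversion is simple arithmetic. The sketch below uses hypothetical helper names; the only input taken from the text is the IRR of 1.024 for percent households without internet. The compounding rule for larger increases follows from the log link and is shown as an illustration, not as a result reported in the study.

```python
def irr_to_percent_change(irr: float) -> float:
    """Percent change in the death rate per one-unit increase in a predictor."""
    return (irr - 1.0) * 100.0


def irr_for_increase(irr_per_unit: float, units: float) -> float:
    """IRR implied by an increase of `units`; effects multiply under a log link."""
    if irr_per_unit <= 0:
        raise ValueError("an IRR must be positive")
    return irr_per_unit ** units


# IRR for percent households without internet in the full SDH model.
internet_irr = 1.024

# Roughly a 2.4% higher death rate per one percentage point increase.
pct_per_point = irr_to_percent_change(internet_irr)

# Implied IRR for a ten percentage point increase (illustrative extrapolation).
irr_ten_points = irr_for_increase(internet_irr, 10)
```

Note that effects compound multiplicatively, so the implied change for a ten-point increase is somewhat larger than ten times the one-point percent change.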

The predictive power of percent Black residents in a county on COVID-19 death rate was dependent on the level of adverse SDH (subgroup and interaction model, Table 5). In an analysis of counties below median severity of adverse SDH (low adverse SDH model), there was no significant positive association between percent Black residents and COVID-19 death rate, except for percent uninsured adults. However, in an analysis of counties above the median severity of adverse SDH (high adverse SDH model), the percent Black residents in a county was significantly associated with increased COVID-19 death rates for each SDH, with IRRs ranging from 1.007 to 1.013. In the interaction analysis, none of the interaction terms between percent Black residents and the SDH variables differed significantly from zero.

Table 5 Regression results for subgroup and interaction models

Discussion

In this study, we quantitatively assessed the relationship between SDH and COVID-19 mortality with a focus on the racial disparities. Consistent with recent reports, our epidemiologic assessment at the county level indicates that the burden of COVID-19 mortality is higher in counties with high proportions of Black residents [4,5,6,7,8, 13]. We found that this association is independent of clinical risk factors [39] – many of which disproportionately affect Black residents [7]. Importantly, the full SDH model results showed that when all SDH measures are included in a regression, there is no longer a relationship between Black race and COVID-19 mortality. Furthermore, in our subgroup analysis stratified by SDH, we found that percent Black residents in a county is a predictor of COVID-19 mortality only in counties with higher degrees of adverse SDH, thus suggesting that social constructs and policies mediate the disparate COVID-19 outcomes in Black Americans. This precludes genetic differences as a possible explanation for COVID-19 racial disparities and challenges the harmful belief that racial disparities in illness primarily have a biological basis. Overall, this study provides both qualitative and quantitative evidence that SDH play a significant role in influencing increased COVID-19 mortality for Black Americans.

In the full SDH regression model, two SDH emerged as significant positive predictors of COVID-19 mortality: percent adults without HS diploma and percent households without internet. Education frequently emerges as a strong predictor of health outcomes, including mortality, in studies examining SDH [26, 36]. The relationship between Black race and education is largely attributable to long-standing educational discrimination, residential segregation, and marginalization [36]. The finding that internet connectivity is also associated with COVID-19 mortality is particularly relevant in the climate of a pandemic. The internet is essential for social distancing, remote work, and online learning, as well as access to timely and accurate information from public health entities. We were only able to analyze data for this study at the county level; however, a more detailed analysis that distinguishes rural, suburban, and urban locales may provide more information about how regional variations in internet connectivity affect COVID-19 mortality.

Ultimately, these findings support the hypothesis that SDH are important drivers of COVID-19 racial disparities for Black Americans in the U.S. Our results are consistent over a diverse set of SDH variables representing areas of economic stability, healthcare access, educational attainment, and social contexts. This suggests that racial disparities in COVID-19 outcomes for Black Americans stem from multiple sources which compound to create the overall effect. This study provides a method for public health policymakers to identify areas with high adverse SDH, which is crucial because these are high-risk areas for racial disparities in COVID-19 mortality and other harmful health outcomes. Furthermore, this study raises the possibility of targeting changes to SDH as a mechanism to reduce racial disparities in COVID-19 outcomes. These findings also may allow policymakers to monitor SDH indicators as a metric for improvement in health equity in the future. Multiple prior studies have linked SDH to structural racism, which is deeply ingrained in the U.S. legal and economic systems, shaped by historical injustices, and perpetuated by bias. As a next step, further research is needed to evaluate the effect of validated markers of structural racism on COVID-19 mortality, and to explore these associations over time as the pandemic evolves [41, 42]. Additional studies related to bias experienced within the healthcare system related to testing, triage, and treatment may also shed additional insights on COVID-19 racial disparities.

There are important limitations to this study. Firstly, our data are at the county level due to the limited availability of public data, thus precluding the ability to measure effects at a more local level. This is important because counties can include a heterogeneous group of cities and neighborhoods with different demographics and levels of adverse SDH. Secondly, there are known limitations of COVID-19 mortality data. Reported COVID-19 deaths are affected by the accuracy of cause-of-death determinations and reflect the outbreak several weeks prior because of the long course of infection [43]. Additionally, reported deaths are likely to be an underestimate because of underdiagnosis from the lack of testing early in the pandemic [43]. Thirdly, the use of publicly available data acquired at different time periods may also add a layer of imprecision. Nonetheless, because of their chronic nature, SDH can be expected to remain relatively stable over time. Fourthly, confounding variables are impossible to fully control for; however, we included covariates as controls to minimize this effect. Lastly, other SDH variables that could not be assessed in this study because of limited county-level data availability could also contribute to COVID-19 mortality.

Conclusions

In this epidemiological assessment, we quantitatively studied the impact of race and SDH on COVID-19 mortality and the contribution of SDH to racial disparities in COVID-19 mortality in the U.S. at the county level. Consistent with historical health inequities for Black Americans, our analysis demonstrates that SDH have contributed to the disproportionate impact of the COVID-19 pandemic on Black Americans. By identifying key determinants, this analysis can help inform targeted data-driven public health policies to mitigate racial disparities in the COVID-19 pandemic as well as future health crises.