Research Work

Parallel Trends in an Unparalleled Pandemic: Difference-in-differences for infectious disease policy evaluation

Preprint by S. Feng & A. Bilinski

Abstract: Researchers frequently employ difference-in-differences (DiD) to study the impact of public health interventions on infectious disease outcomes. DiD assumes that treatment and non-experimental comparison groups would have moved in parallel in expectation, absent the intervention (the “parallel trends assumption”). However, the plausibility of the parallel trends assumption in the context of infectious disease transmission is not well understood. Our work bridges this gap by formalizing epidemiological assumptions required for common DiD specifications, positing an underlying Susceptible-Infectious-Recovered (SIR) data-generating process. We demonstrate that popular specifications can encode strict epidemiological assumptions. For example, DiD modeling incident case numbers or rates as outcomes will produce biased treatment effect estimates unless untreated potential outcomes for the treatment and comparison groups come from a data-generating process with the same initial infection and equal transmission rates at each time step. Applying a log transformation or modeling log growth allows for different initial infection rates under an “infinite susceptible population” assumption, but invokes conditions on transmission parameters. We then propose alternative DiD specifications based on epidemiological parameters – the effective reproduction number and the effective contact rate – that are both more robust to differences between treatment and comparison groups and can be extended to complex transmission dynamics. Given the minimal power difference between incidence and log incidence models, we recommend a default of the more robust log specification. Our alternative specifications have lower power than incidence or log incidence models, but higher power than log growth models. We illustrate implications of our work by re-analyzing published studies of COVID-19 mask policies.
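The contrast between incidence and log incidence specifications can be illustrated with a small simulation. The sketch below is not the preprint's code: it assumes a simplified "infinite susceptible population" growth process, and the function names and parameter values (BETA, GAMMA, the initial infection levels) are hypothetical. Neither group is treated, so the true effect is zero; the two groups share transmission parameters and differ only in initial infections.

```python
import math

def incidence_path(i0, beta, gamma, steps):
    """New infections per step, assuming an effectively infinite susceptible pool."""
    infectious = i0
    path = []
    for _ in range(steps):
        new_cases = beta * infectious          # incident cases this step
        path.append(new_cases)
        infectious = infectious + new_cases - gamma * infectious
    return path

def did(treated, comparison, pre, post):
    """Two-period difference-in-differences on the supplied outcome series."""
    return (treated[post] - treated[pre]) - (comparison[post] - comparison[pre])

# Hypothetical parameters: equal transmission, different initial infections.
BETA, GAMMA, STEPS = 0.3, 0.1, 10
treated = incidence_path(i0=100.0, beta=BETA, gamma=GAMMA, steps=STEPS)
comparison = incidence_path(i0=10.0, beta=BETA, gamma=GAMMA, steps=STEPS)

# No intervention occurs, so an unbiased estimator should return roughly zero.
did_incidence = did(treated, comparison, pre=2, post=8)
did_log = did([math.log(y) for y in treated],
              [math.log(y) for y in comparison], pre=2, post=8)

print(f"DiD on incidence:     {did_incidence:.2f}")
print(f"DiD on log incidence: {did_log:.2e}")
```

In this setup the incidence DiD is far from zero purely because of the different starting points, while the log incidence DiD is zero up to floating-point error.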

The effect of COVID-19 lockdowns on fertility in the Democratic Republic of Congo

Work under review by S. Feng, G. Kyomba, SM. Mayaka, & KA. Grépin

Abstract: Most countries implemented public health measures, including lockdowns, during the COVID-19 pandemic. It has been speculated that the pandemic will affect fertility, but the direction, magnitude, and mechanisms of these effects are not well understood. Using data from the national health management information system and an augmented synthetic control methodology, we examined the impact of a lockdown of Kinshasa in April 2020 on the subsequent fertility of women, which we proxy by the number of births in health facilities months after the policy was implemented. Seven months after the lockdown, we see a large increase in births in Kinshasa, as compared to control areas, which at its peak represents an additional 5000 monthly births, or a 45% increase relative to baseline. We also observe increases in complementary maternal health services but not in other health services. Increased births were observed among women both older and younger than 20. Lockdown policies have likely affected fertility, and future pandemic preparedness plans should anticipate these effects and find strategies to mitigate any negative unintended effects.

Addressing missing values in routine health information system data: an evaluation of imputation methods using data from the Democratic Republic of the Congo during the COVID-19 pandemic

Published in Population Health Metrics in December 2021 by S. Feng, C. Hategeka, & KA. Grépin

Abstract: Poor data quality is limiting the use of data sourced from routine health information systems (RHIS), especially in low- and middle-income countries. An important component of this data quality issue comes from missing values, where health facilities, for a variety of reasons, fail to report to the central system. Using data from the health management information system in the Democratic Republic of the Congo and the advent of the COVID-19 pandemic as an illustrative case study, we implemented seven commonly used imputation methods and evaluated their performance in terms of minimizing bias in imputed values and parameter estimates generated through subsequent analytical techniques, namely segmented regression, which is widely used in interrupted time series studies, and pre–post-comparisons through paired Wilcoxon rank-sum tests. We also examined the performance of these imputation methods under different missing mechanisms and tested their stability to changes in the data. For regression analyses, there were no substantial differences found in the coefficient estimates generated from all methods except mean imputation and exclusion and interpolation when the data contained less than 20% missing values. However, as the missing proportion grew, k-NN started to produce biased estimates. Machine learning algorithms, i.e. missForest and k-NN, were also found to lack robustness to small changes in the data or consecutive missingness. On the other hand, multiple imputation methods generated the overall most unbiased estimates and were the most robust to all changes in data. They also produced smaller standard errors than single imputations. For pre–post-comparisons, all methods produced p values less than 0.01, regardless of the amount of missingness introduced, suggesting low sensitivity of Wilcoxon rank-sum tests to the imputation method used.
We recommend the use of multiple imputation in addressing missing values in RHIS datasets and appropriate handling of data structure to minimize imputation standard errors. In cases where necessary computing resources are unavailable for multiple imputation, one may consider seasonal decomposition as the next best method. Mean imputation and exclusion and interpolation, however, always produced biased and misleading results in the subsequent analyses, and thus, their use in the handling of missing values should be discouraged.
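As a rough illustration of why mean imputation can distort downstream regression estimates, the sketch below fits an ordinary least squares trend to a synthetic monthly series before and after mean-imputing a run of consecutive missing months. The series, the missing months, and the helper `ols_slope` are all hypothetical and are not drawn from the study's data or code.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic facility counts with a steady upward trend of 5 per month.
months = list(range(12))
true_counts = [100.0 + 5.0 * m for m in months]

# Assume three consecutive months went unreported.
missing = {8, 9, 10}
observed = [y for m, y in zip(months, true_counts) if m not in missing]
observed_mean = sum(observed) / len(observed)

# Mean imputation: fill every gap with the mean of the observed values.
mean_imputed = [observed_mean if m in missing else y
                for m, y in zip(months, true_counts)]

slope_true = ols_slope(months, true_counts)
slope_mean_imputed = ols_slope(months, mean_imputed)

print(f"Trend on complete data:     {slope_true:.2f}")
print(f"Trend after mean imputation: {slope_mean_imputed:.2f}")
```

In this toy setup the imputed values sit well below the true late-series counts, so the estimated trend is pulled toward zero.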

Tracking Health Seeking Behavior During an Ebola Outbreak via Mobile Phones and SMS

Work published in npj Digital Medicine in October 2018 by S. Feng, KA. Grépin, & R. Chunara

Abstract: The recent Ebola outbreak in West Africa was an exemplar for the need to rapidly measure population-level health-seeking behaviors, in order to understand healthcare utilization during emergency situations. Taking advantage of the high prevalence of mobile phones, we deployed a national SMS-poll and collected data about individual-level health and health-seeking behavior throughout the outbreak from 6694 individuals from March to June 2015 in Liberia. Using propensity score matching to generate balanced subsamples, we compared outcomes in our survey to those from a recent household survey (the 2013 Liberian Demographic Health Survey). We found that the matched subgroups had similar patterns of delivery location in aggregate, and utilizing data on the date of birth, we were able to show that facility-based deliveries were significantly decreased during, compared to after, the outbreak (p < 0.05), consistent with findings from retrospective studies using healthcare-based data. Directly assessing behaviors from individuals via SMS also enabled the measurement of public and private sector facility utilization separately, which has been a challenge in other studies in countries, including Liberia, that rely mainly on government sources of data. In doing so, our data suggest that public facility-based deliveries returned to baseline values after the outbreak. Thus, we demonstrate that with the appropriate methodological approach to account for different population denominators, data sourced via mobile tools such as SMS polling could serve as an important low-cost complement to existing data collection strategies, especially in situations where higher-frequency data than can be feasibly obtained through surveys is useful.

Beware a Perfect Pre-Trend: Pitfalls of Synthetic Controls with Infectious Disease Outcomes

Work in progress by S. Feng & A. Bilinski

Abstract: Synthetic control methods (SCMs) have emerged as valuable tools for observational policy evaluation. Conventional SCMs use a weighted average of donor units to construct a synthetic comparison group that most closely resembles the trajectory of the treatment group prior to intervention, enabling imputation of counterfactual outcomes in the absence of treatment. While SCMs have been widely employed to evaluate infectious disease and other non-linear outcomes, non-linearities can introduce substantial bias, even when SCM appears to fit perfectly by approximating the treated unit well in the pre-intervention period. While raw counts of epidemic outcomes, such as confirmed cases and deaths, are commonly used in SCM applications for modeling infectious disease, we show that constructing synthetic controls using raw counts will produce a biased treatment effect estimate unless the treatment and all selected donor control units with non-zero weights have identical transmission parameters and initial conditions. If this is not the case, the treatment and synthetic comparison groups will begin to diverge after the pre-intervention period, even if they appear to match nearly perfectly in the pre-intervention period. As an alternative, using the natural logarithm of the counts is equivalent to modeling the transmission dynamics in the treatment group as a weighted geometric mean of those in the control groups, but still requires equal initial conditions. For more robust estimation, we propose a novel normalized log incidence SCM that captures both the non-linear infectious disease transmission dynamics and the inherent differences in initial conditions between treatment and control groups.
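The geometric-mean interpretation of log-scale synthetic controls can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: the donor units follow hypothetical exponential growth paths with equal initial counts, and the weights are fixed by hand rather than estimated.

```python
import math

def exponential_counts(initial, growth, steps):
    """Counts that grow by a constant factor each period."""
    return [initial * growth ** t for t in range(steps)]

# Two hypothetical donor units with different transmission (growth) rates.
STEPS = 8
donors = [exponential_counts(10.0, 1.1, STEPS),
          exponential_counts(10.0, 1.3, STEPS)]
weights = [0.5, 0.5]  # assumed synthetic control weights, summing to 1

# Synthetic control built on log counts, mapped back to the count scale.
log_scale_synthetic = [
    math.exp(sum(w * math.log(d[t]) for w, d in zip(weights, donors)))
    for t in range(STEPS)
]

# Weighted geometric mean of the donor counts.
geometric_mean = [
    math.prod(d[t] ** w for w, d in zip(weights, donors))
    for t in range(STEPS)
]

# Conventional synthetic control: weighted arithmetic mean of raw counts.
raw_synthetic = [
    sum(w * d[t] for w, d in zip(weights, donors))
    for t in range(STEPS)
]

# Period-over-period growth factors of each synthetic series.
log_growth = [b / a for a, b in zip(log_scale_synthetic, log_scale_synthetic[1:])]
raw_growth = [b / a for a, b in zip(raw_synthetic, raw_synthetic[1:])]

print("log-scale growth factors:", [round(g, 4) for g in log_growth])
print("raw-count growth factors:", [round(g, 4) for g in raw_growth])
```

The log-scale series grows at a constant factor (the weighted geometric mean of the donor growth factors), whereas the growth factor of the raw-count average drifts upward as the faster-growing donor comes to dominate, which is one way to see why a raw-count fit can diverge after the pre-intervention period.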

#WuhanDiary and #WuhanLockdown: gendered posting patterns and behaviours on Weibo during the COVID-19 pandemic

Published in BMJ Global Health in April 2022 by CCR. Gan, S. Feng, H. Feng, K. Fu, SE. Davies, & KA. Grépin

Abstract: Social media can be both a source of information and misinformation during health emergencies. During the COVID-19 pandemic, social media became a ubiquitous tool for people to communicate and represents a rich source of data researchers can use to analyse users’ experiences, knowledge and sentiments. Research on social media posts during COVID-19 has identified, to date, the perpetuity of traditional gendered norms and experiences. Yet these studies are mostly based on Western social media platforms. Little is known about gendered experiences of lockdown communicated on non-Western social media platforms. Using data from Weibo, China’s leading social media platform, we examine gendered user patterns and sentiment during the first wave of the pandemic between 1 January 2020 and 1 July 2020. We find that Weibo posts by self-identified women and men conformed with some gendered norms identified on other social media platforms during the COVID-19 pandemic (posting patterns and keyword usage) but not all (sentiment). This insight may be important for targeted public health messaging on social media during future health emergencies.

Identifying early-measured variables associated with APACHE IVa providing incorrect in-hospital mortality predictions for critical care patients

Published in Scientific Reports in November 2021 by S. Feng & JA. Dubin

Abstract: APACHE IVa typically provides useful and accurate predictions of in-hospital mortality and length of stay for patients in critical care. However, there are factors which may preclude APACHE IVa from reaching its ceiling of predictive accuracy. Our primary aim was to determine which variables available within the first 24 h of a patient’s ICU stay may be indicative of the APACHE IVa scoring system making occasional but potentially illuminating errors in predicting in-hospital mortality. We utilized the publicly available multi-institutional ICU database, eICU, available since 2018, to identify a large observational cohort for our investigation. APACHE IVa scores are provided by eICU for each patient’s ICU stay. We used Lasso logistic regression, with cross-validation to select the penalization parameter, to build parsimonious final models separately for each of our two responses (errors) of interest: APACHE falsely predicting in-hospital death (Type I error) and APACHE falsely predicting in-hospital survival (Type II error). We then assessed the performance of the models with a random holdout validation sample. While the extremeness of the APACHE prediction led to dependable predictions for preventing either type of error, distinct variables were identified as being strongly associated with the two different types of errors occurring. These included a primary set of predictors consisting of mean SpO2 and worst lactate for predicting Type I errors, and worst albumin and mean heart rate for Type II. In addition, a secondary set of predictors was identified: changes recorded in care limitations for the patient’s treatment plan, worst pH, whether cardiac arrest occurred at admission, and whether vasopressor was provided for predicting Type I errors; and age, whether the patient was ventilated in day 1, mean respiratory rate, worst lactate, worst blood urea nitrogen test, and mean aperiodic vitals for Type II.
The two models also differed in their performance metrics in their holdout validation samples, in large part due to the lower prevalence of Type II errors compared to Type I. The eICU database was a good resource for evaluating our objective, and important recommendations are provided, particularly identifying key variables that could lead to APACHE prediction errors when APACHE scores are sufficiently low to predict in-hospital survival.