Chapter 2 Causal Inference and Counterfactuals

2.1 Before-After Designs

The first “expert” consultant you hire indicates that to estimate the impact of HISP, you must calculate the change in health expenditures over time for the households that enrolled. The consultant argues that because HISP covers all health costs, any decrease in expenditures over time must be attributable to the effect of HISP. Using the subset of enrolled households, you calculate their average health expenditures before the implementation of the program and then again two years later.

Compare average household health expenditures before and after being enrolled in the program in villages covered by HISP.

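# Packages used below; the HISP data are assumed to be loaded already as `df`.
library(dplyr)     # filter() and %>%
library(estimatr)  # lm_robust()
library(texreg)    # htmlreg()

# Before-after comparison among enrolled households in treatment localities,
# with standard errors clustered by locality.
m_ba1 <- lm_robust(health_expenditures ~ round,
                   clusters = locality_identifier,
                   data = df %>% filter(treatment_locality == 1 & enrolled == 1))

# Same comparison, adding household-level controls.
m_ba2 <- lm_robust(health_expenditures ~ round + age_hh + age_sp + educ_hh +
                     educ_sp + female_hh + indigenous + hhsize + dirtfloor +
                     bathroom + land + hospital_distance,
                   clusters = locality_identifier,
                   data = df %>% filter(treatment_locality == 1 & enrolled == 1))

# Report both models side by side.
htmlreg(list(m_ba1, m_ba2), doctype = FALSE,
        custom.model.names = c("No Controls", "With Controls"),
        custom.coef.map = list('round' = "Post-Enrollment",
                               '(Intercept)' = "Intercept"),
        caption = "Change in Health Expenditures for Households Enrolled in Program",
        caption.above = TRUE)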

Change in Health Expenditures for Households Enrolled in Program

                   No Controls       With Controls
Post-Enrollment    -6.65*            -6.71*
                   [-7.11; -6.19]    [-7.17; -6.25]
Intercept          14.49*            24.73*
                   [14.20; 14.78]    [23.54; 25.92]
R2                 0.21              0.48
Adj. R2            0.21              0.48
Num. obs.          5929              5929
RMSE               6.44              5.22
N Clusters         98                98

* Null hypothesis value outside the confidence interval.

Does the before-and-after comparison control for all the factors that affect health expenditures over time?

No, it is very unlikely that this analysis controls for all the factors that may affect health expenditures over time. For example, other health interventions may have been operating at the same time in the villages receiving HISP, and these could also have increased or decreased health expenditures. Likewise, a financial crisis in the country could have reduced health expenditures, so that households would have spent less on health even in the absence of HISP. The before-and-after comparison would wrongly attribute any such change to the program.
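The crisis example can be made concrete with a small simulation. This is only a sketch on made-up data, not the HISP sample: `true_effect`, `crisis_shift`, the baseline mean, and the sample size are all hypothetical values chosen for illustration.

```r
# Simulated illustration of before-after bias (hypothetical numbers throughout).
set.seed(1)
n <- 1000

true_effect  <- -10  # hypothetical program effect on spending
crisis_shift <- -3   # hypothetical drop caused by a concurrent crisis

# Baseline spending, then follow-up spending affected by both forces.
before <- rnorm(n, mean = 30, sd = 2)
after  <- before + true_effect + crisis_shift + rnorm(n, mean = 0, sd = 1)

# The before-after estimate is the simple change in means.
ba_estimate <- mean(after) - mean(before)

print(round(c(before_after = ba_estimate, truth = true_effect), 2))
```

In this setup the before-after estimate lands near `true_effect + crisis_shift` rather than near `true_effect`, because the design cannot separate the program from anything else that changed over the same period.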

Based on these results produced by the before-and-after analysis, should HISP be scaled up nationally?

No, based on these results HISP should not be scaled up nationally. The before-and-after estimates suggest that the program decreased average health expenditures for enrolled households by about $6.65 ($6.71 with controls), which is much less than the threshold level of $10 that was determined by the government.

2.2 Enrolled vs. Non-Enrolled

Another consultant suggests that it would be more appropriate to estimate the counterfactual in the post-intervention period: that is, two years after the program started. The consultant correctly notes that of the 4,959 households in the baseline sample, only 2,907 actually enrolled in the program, so approximately 41 percent of the households in the sample remain without HISP coverage. The consultant argues that all households within the 100 pilot villages were eligible to enroll. These households all share the same health clinics and are subject to the same local prices for pharmaceuticals. Moreover, most households are engaged in similar economic activities. The consultant argues that in these circumstances, the outcomes of the nonenrolled group after the intervention could serve to estimate the counterfactual outcome of the group enrolled in HISP. You therefore decide to calculate average health expenditures in the post-intervention period for both the households that enrolled in the program and the households that did not.

# Enrolled vs. non-enrolled households in treatment localities at follow-up
# (round == 1), with standard errors clustered by locality.
m_ene1 <- lm_robust(health_expenditures ~ enrolled,
                    clusters = locality_identifier,
                    data = df %>% filter(treatment_locality == 1 & round == 1))

# Same comparison, adding household-level controls.
m_ene2 <- lm_robust(health_expenditures ~ enrolled + age_hh + age_sp + educ_hh +
                      educ_sp + female_hh + indigenous + hhsize + dirtfloor +
                      bathroom + land + hospital_distance,
                    clusters = locality_identifier,
                    data = df %>% filter(treatment_locality == 1 & round == 1))

# Report both models side by side.
htmlreg(list(m_ene1, m_ene2), doctype = FALSE,
        custom.model.names = c("No Controls", "With Controls"),
        custom.coef.map = list('enrolled' = "Enrollment",
                               '(Intercept)' = "Intercept"),
        caption = "Difference in Health Expenditures Between Households Enrolled and Not Enrolled in Program",
        caption.above = TRUE)

Difference in Health Expenditures Between Households Enrolled and Not Enrolled in Program

                 No Controls         With Controls
Enrollment       -14.46*             -9.98*
                 [-15.13; -13.80]    [-10.58; -9.39]
Intercept        22.30*              30.35*
                 [21.63; 22.98]      [27.87; 32.82]
R2               0.33                0.45
Adj. R2          0.33                0.45
Num. obs.        4960                4960
RMSE             10.18               9.17
N Clusters       100                 100

* Null hypothesis value outside the confidence interval.

Does this analysis likely control for all the factors that determine differences in health expenditures between the two groups?

No, it is unlikely that the multivariate analysis controls for all the factors that drive differences in health expenditures between the two groups. There may be unobservable factors that determine why some households enroll in HISP and others do not, such as personal preferences regarding health or the motivation of the household decision maker. If those factors also affect health spending, the comparison confounds their influence with the effect of the program.
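A short simulation shows how an unobserved factor can distort this comparison. It is a sketch on made-up data, not the HISP sample: the variable `motivation`, its coefficient, and `true_effect` are hypothetical, and the direction of the bias depends entirely on those assumed values.

```r
# Simulated illustration of selection bias (hypothetical numbers throughout).
set.seed(2)
n <- 5000

true_effect <- -10       # hypothetical program effect on spending
motivation  <- rnorm(n)  # unobserved trait of the household decision maker

# More motivated households are more likely to enroll...
enrolled <- as.integer(motivation + rnorm(n) > 0)

# ...and, in this made-up setup, also spend less regardless of the program.
spending <- 40 - 3 * motivation + true_effect * enrolled + rnorm(n)

# Naive comparison of group means at follow-up.
naive_estimate <- mean(spending[enrolled == 1]) - mean(spending[enrolled == 0])

print(round(c(naive = naive_estimate, truth = true_effect), 2))
```

Because `motivation` drives both enrollment and spending, the naive difference in means overstates the reduction in this example. Controlling for observed characteristics would not help here, since `motivation` is assumed to be unobserved.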

Based on these results produced by the enrolled-nonenrolled method, should the HISP be scaled up nationally?

Based strictly on the point estimate from the multivariate linear regression, HISP should not be scaled up nationally: it decreased health expenditures by $9.98, which is less than the government-determined threshold level of $10. However, the $9.98 estimate is very close to $10, and the confidence interval reported in the table, [-10.58; -9.39], includes -10, so the estimate is not statistically different from the threshold. Therefore, you might still argue that HISP should be expanded nationally.
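The comparison with the threshold can be checked directly from the numbers in the "With Controls" column. The sketch below hard-codes the reported estimate and interval bounds; the variable names are illustrative only.

```r
# Values copied from the "With Controls" column of the table above.
estimate  <- -9.98
ci_lower  <- -10.58
ci_upper  <- -9.39

# Government threshold, expressed as a change in expenditures.
threshold <- -10

# Does the point estimate alone meet the threshold (a reduction of at least $10)?
meets_threshold <- estimate <= threshold

# Is the threshold inside the reported confidence interval?
threshold_in_ci <- threshold >= ci_lower && threshold <= ci_upper

print(c(meets_threshold = meets_threshold, threshold_in_ci = threshold_in_ci))
```

The point estimate falls just short of the threshold, while the threshold itself lies inside the interval, which is why the two readings of the result diverge. With the fitted model available, the same bounds could presumably be read from `confint(m_ene2)` instead of being typed in by hand.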