Methods for policy evaluation
While the risk factors underlying health inequalities are reasonably well recognized, how to reduce health inequalities with policies, once they reach a more structural social level, is less well understood.
Different sources, types and levels of evidence about the pathways between complex social interventions and inequalities in health have to be put together and synthesized. A number of highly innovative strategies and methods were put into action in SOPHIE to successfully address these challenges.
Main findings
Combining methods to address the challenge of evaluating impacts of structural policies
Approaches aligned with the principles of evidence-based medicine, including before-and-after measurements and control groups, are sometimes applicable in practice to structural policies, even though the researcher does not hold control over the policy execution. In this, we should recognise that most structural policies are inherently complex. Their population health impact may strongly depend on the ways in which they are implemented, the populations that they target, and the broader context. It then becomes problematic to just summarise the evaluation in terms of a simple 'yes/no effect'. What is needed are alternative designs capable of generating more nuanced insights from which careful lessons that may be transferable to other populations can be drawn.
Quantitative approaches were applied mostly with the aim of assessing whether a structural policy has had a demonstrable impact on health-related outcomes, how large this effect appeared to be, and whether this impact differed in accordance with socioeconomic status. Some studies made comparisons between places, such as countries, regions, or small districts and found associations between variations in structural policies and variations in health outcomes. Other studies made comparisons over time, including those before and after the introduction of new policy measures. In time series analyses, more detailed data on trends over time were used to assess whether the introduction of a policy was followed - immediately or gradually - by a change in health outcomes. Finally, some studies applied 'quasi-experimental' approaches that combined geographical and time aspects. One way of doing this was to apply the before-after design to both the population exposed to the intervention and to a control population (e.g. those living in another neighbourhood).
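The before-after design with a control population described above can be sketched in a few lines. This is a minimal illustration only, not the analysis used in any SOPHIE study: the function name `diff_in_diff` and all outcome values are hypothetical, and the calculation assumes that the control population would have followed the same trend as the exposed population in the absence of the policy.

```python
from statistics import mean


def diff_in_diff(exposed_before, exposed_after, control_before, control_after):
    """Before-after change in the exposed group minus the change in the control group."""
    change_exposed = mean(exposed_after) - mean(exposed_before)
    change_control = mean(control_after) - mean(control_before)
    return change_exposed - change_control


# Hypothetical outcome scores (e.g. a self-rated health index), invented for illustration.
exposed_before = [60.0, 62.0, 58.0]   # neighbourhood that received the intervention
exposed_after = [66.0, 68.0, 64.0]
control_before = [61.0, 59.0, 60.0]   # comparison neighbourhood
control_after = [62.0, 63.0, 61.0]

estimate = diff_in_diff(exposed_before, exposed_after, control_before, control_after)
print(f"Estimated policy effect: {estimate:.1f} points")
```

Subtracting the change in the control population removes trends that affected both populations alike, which a simple before-after comparison in the exposed population alone cannot do.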
In addition to these quantitative approaches, the SOPHIE project applied in-depth analyses that included qualitative research methods in order to identify the mechanisms through which a structural policy could influence population health. Several methods have been used, including concept mapping and multiple case studies. In particular, we explored the feasibility and informative value of 'realist' approaches, including realist reviews and realist evaluations. These approaches were particularly aimed at testing expectations regarding how a structural policy could affect population health, and under which conditions (i.e. when/where) the same policy would set specific mechanisms in motion.
1. The study had to be able to control for confounding national factors such as measures of national income. A particular challenge may be to control for other policies that were developed in parallel with the structural policies of interest.
2. The study had to maximise the number of countries studied, which in the case of Europe is fewer than 30. In particular, the study had to find ways to apply statistical controls to several confounders even with such a limited number of countries (e.g. multi-level analysis with controls for individual-level confounders).
3. Problems in the comparability of the information on the structural policies of interest may induce systematic or random measurement bias and affect study outcomes. While using comparable though simplistic indicators may be a solution, it may also cause problems of measurement validity.
Many SOPHIE studies found ways to address these challenges in their particular case, depending on the topic of interest and the data that were available.
The application of time trend analysis, rather than cross-sectional comparisons, yielded stronger evidence on the impact of structural policies. The experience of SOPHIE studies is that stronger evidence could be obtained if:
1. The structural policies of interest rapidly changed over time. Sudden policy changes generate 'natural policy experiments' that can be assessed in a quasi-experimental design. In contrast, policy changes that are gradually implemented (e.g. over a period of 5 to 10 years) are generally harder to evaluate in terms of their population health impact.
2. The available data sources are continuous (e.g. monthly, yearly or every two years) instead of covering only a few points in time. Continuous time series make it easier to follow trends in health-related outcomes accurately after a policy change, and thus to assess delayed dose-response relationships.
3. No major developments occur in other fields. Confounding may occur due to concurrent changes in other policy areas. Such problems can be solved with multi-country studies that compare 'experiment' to 'control' countries which differ particularly with regard to the policy of interest.
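To make the first two conditions concrete, the following is a minimal sketch of an interrupted time series comparison for a sudden policy change. It is an assumption-laden illustration rather than the method of any particular SOPHIE study: the helper names `fit_line` and `interrupted_time_series` and the yearly series are hypothetical, the data are noise-free, and the sketch ignores autocorrelation, seasonality and concurrent changes in other policy areas.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def interrupted_time_series(series, change_point):
    """Fit separate trends before and after the policy change.

    Returns the jump in level at the change point and the change in slope.
    """
    times = list(range(len(series)))
    pre_intercept, pre_slope = fit_line(times[:change_point], series[:change_point])
    post_intercept, post_slope = fit_line(times[change_point:], series[change_point:])
    # Compare both fitted lines at the moment the policy was introduced.
    level_change = (post_intercept + post_slope * change_point) - (
        pre_intercept + pre_slope * change_point
    )
    return {"level_change": level_change, "trend_change": post_slope - pre_slope}


# Hypothetical yearly outcome: steady rise, then a drop of 3 units and a flatter trend.
series = [50 + 0.5 * t for t in range(6)]
series += [50 + 0.5 * t - 3 - 0.2 * (t - 6) for t in range(6, 12)]

result = interrupted_time_series(series, change_point=6)
print(result)
```

With only a few observations on either side of the policy change, the two trend lines cannot be estimated reliably, which is one reason why continuous data sources matter. A gradual implementation blurs the change point itself, so the level change becomes hard to attribute to the policy.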
According to the experience of SOPHIE researchers, several challenges have to be faced in order to fully seize the potential of realist approaches:
1. The strength of the evidence strongly depends on the quality and richness of the information that could be obtained from primary sources or published studies. For example, in a field dominated by quantitative studies, there may be little qualitative information on the mechanisms of interest, and a 'realist' synthesis may not have sufficient material to test and refine initial expectations.
2. A clear and efficient working protocol must be developed. Published studies using 'realist' approaches vary greatly in the ways in which they are executed and their results are presented. Further standardisation (see e.g. www.ramesesproject.org) is expected to increase the efficiency, quality and transparency of each individual study.
3. On another level, scientists using 'realist' approaches should find ways to deal with the current publication pressure, as these methods can be more time-consuming than quantitative studies and harder to publish in most high-impact scientific journals. Similarly, because of the lack of simple answers, it may be more difficult to disseminate the results to policy makers or professionals, especially those trained only in positivist, quantitative science.
However, 'realist' approaches are an indispensable complement to 'black box' studies that only aim to demonstrate and quantify the impact of a structural policy. These approaches need to be further developed, applied and promoted in public health.
Policy implications
In future work, stronger evidence on the health equity impact of structural policies could be obtained by further application of the quasi-experimental design, i.e. a pre-post, intervention-control design. As a general rule, the strength of evidence increases with (1) a larger number of control and intervention areas; (2) a more detailed measurement of intervention exposure and of confounders; and (3) the inclusion of more subsequent years of observation.
Qualitative or 'realist' approaches are indispensable for understanding and predicting the impact of structural policies. They can best be applied together with a primary, prospective collection of data, such that expectations regarding 'mechanisms of change' can be assessed with rigour and detail. Moreover, in an ideal situation, these approaches would be complemented by quantitative comparative approaches that aim to test and quantify the expected health impacts, to simultaneously assess 'whether, how much' and 'how, when' a structural policy would reduce inequalities in health.
Broad structural policies and 'regimes' may be most relevant for the reduction of health inequalities. However, they may be hard to assess in terms of their precise impact, while the variety of mechanisms and context-dependencies may be overwhelming. While specific structural policies of interest may be assessed with greater detail and rigour, the demonstrated impact may turn out to be deceptively small as compared to the current magnitude of health inequalities. Further research will have to find a balance each time between the brush and the tweezers.
In the SOPHIE project, we learned that the key challenge is to get stronger evidence regarding the impact of such policies on people's health. If such an impact can be assessed for a population at large, it is also possible in principle to assess this impact for subpopulations stratified by socioeconomic status or other measures of inequality, provided that the available data offer sufficient statistical power.
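The proviso about the available data can be illustrated with a small stratified comparison. The sketch below is hypothetical throughout: the function `mean_difference_ci`, the stratum labels and the outcome values are invented for illustration, and the interval uses a simple normal approximation that is too optimistic for samples this small.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference_ci(exposed, unexposed, z=1.96):
    """Difference in means with an approximate 95% confidence interval."""
    diff = mean(exposed) - mean(unexposed)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(exposed) / len(exposed) + variance(unexposed) / len(unexposed))
    return diff, diff - z * se, diff + z * se


# Hypothetical outcome scores by socioeconomic stratum: (exposed, unexposed).
strata = {
    "low SES": ([70.0, 72.0, 74.0, 76.0], [66.0, 68.0, 70.0, 72.0]),
    "high SES": ([80.0, 82.0], [79.0, 81.0]),
}

results = {name: mean_difference_ci(exp, unexp) for name, (exp, unexp) in strata.items()}
for name, (diff, lower, upper) in results.items():
    print(f"{name}: difference {diff:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Each stratum is estimated from fewer observations than the population at large, so the intervals widen as strata shrink. A stratum with too few observations may yield an interval so wide that no conclusion about differential impact can be drawn.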
Novel statistical approaches such as Propensity Score Matching and Difference-in-Difference methods did not appear to offer a satisfactory solution to the key challenges. The key issue was often related to observation and measurement. In SOPHIE, most comparative studies would not have been possible without the many European-wide surveys that were initiated in the last 10 to 15 years. We strongly recommend further development of these international surveys over the coming decade.
Research team
Work Package 6 (Cross-cutting methods for policy evaluation) has been led by Anton Kunst, Academic Medical Center at the University of Amsterdam.
The Centre for Research on Inner City Health was also involved, with responsibility for training activities.