• By Dartington SRU
  • Posted on Wednesday 11th September, 2013

Plurality and rigour

Imagine two interventions, each evaluated using contrasting methods. One measures impact by looking at children receiving an intervention and comparing their well-being before with the situation after. The other is based on a randomised controlled trial, meaning the progress of children receiving the intervention is contrasted with a control group who did not get it. Which evaluation should we trust more?

Standards of evidence help to answer this question. There has been a growing interest in standards in the last few years. In the US, this culminated in May this year in the Director of the Office of Management and Budget, Jeffrey D. Zients, announcing that he will rigorously examine the evidence base of all proposals for funding by the US Federal Government. In the UK, this has been mirrored in interest from the UK Permanent Secretary Sir Jeremy Heywood in ‘kite marking’ evidence-based social policy.

All very good, but which standards should be used to determine if something is evidence-based? In the US there are already over 20 sets of standards that are relevant to commissioners of children’s services. Although the UK has come late to the game, several options are available. The Government’s so-called Magenta Book sets a broad direction of travel but is not intended for application to specific questions. NICE (the National Institute for Health and Care Excellence) addresses a wide range of questions – including, but extending beyond, impact on outcomes – and therefore takes a less prescriptive approach to standards. SCIE, which looks at what works in social care, eschews standards altogether, preferring an approach that makes best use of various types of evidence from a range of sources.

The National Academy for Parenting Practitioners has standards that cover a reasonably wide spectrum of evidence of impact, ranking evaluations from simple pre-post tests without a control group to the use of two or more RCTs, but its focus is limited to parenting programmes. The Institute for Effective Education’s Best Evidence Encyclopaedia takes a similar approach to education programmes, while the Education Endowment Foundation is developing standards that help to identify the best teaching practices.

Organisations like the Centre for Excellence and Outcomes (C4EO) and the Centre for Analysis of Youth Transitions (CAYT) focus on rating the wide variety of innovation in children and youth services, and are therefore less concerned with robust evaluations of well-established programmes.

At the Social Research Unit at Dartington we wrote the standards that underpin Project Oracle, and co-authored those used by Graham Allen's government-sponsored review of Early Intervention and by Blueprints for Healthy Youth Development. Our specialism has been to collaborate with world experts to produce a high standard (which focuses on interventions evaluated with a control group) applied to all aspects of children’s health and development.

If the US is any guide, it will not be long before there are spats, with claims that ‘my standards are better than yours.’ We urge a more considered response.

Plurality is welcome. Our colleague and collaborator on Investing in Children at the Washington State Institute for Public Policy, Steve Aos, calls himself an ‘independent investment advisor’, but rather than providing counsel on stocks and shares to private investors he explains the likely costs and benefits of competing policy approaches and programmes to politicians and decision-makers. Just as there are many investment advisors in the stock market from which to choose, he feels that there should be many options in the public sector.

In our view it all depends on the question. If the focus is parenting programmes, the National Academy’s list is arguably the place to start. For education programmes go to the Best Evidence Encyclopaedia. If the challenge is which children’s services innovations deserve financial backing then the approaches advocated by C4EO or CAYT are strongest. What matters is that each option has clearly articulated and defensible standards.

Dynamism is essential. It is unhelpful to think of standards as fixed and unchanging. The Social Research Unit and the Institute for Effective Education focus on high standards, but we anticipate them getting higher still as practice changes: in the future there will be more evidence-based programmes, practices, policies and processes from which to choose, and therefore a greater need to prioritise which ones to invest in. Each year will bring new methods, new measures and new understandings about the frailties of what is now considered ‘robust’ evaluation. The standards will therefore need to become more nuanced and, no doubt, tougher.

We still depend on human judgement. Standards will generally comprise clearly articulated dimensions – in our case intervention specificity, system readiness, evaluation quality and impact – underpinned by criteria that can be rated by skilled researchers and practitioners. But few cases are clear-cut. What happens when the sample size is on the margins of having sufficient statistical power to demonstrate impact, but the study is otherwise extremely robust? What happens when several evaluations undertaken in collaboration with the programme originator produce positive results, but one without the programme originator’s involvement suggests negative results? In theory it is possible to code all of these eventualities, but in practice sound, skilled human judgement will add more value.

A 'mother of all' standards?

All of this speaks to the need for diversity allied to rigour, and strong collaboration and sharing between those helping to articulate better standards of evidence. At the Social Research Unit, for example, we know that the success of our high standards of evidence depends on the success of organisations like C4EO and CAYT in promoting good innovation: we simply won’t be able to build the number of evidence-based programmes and approaches unless more people embark on the journey from innovation to proven impact. We would also like to think that by encouraging children’s services innovators to aspire to meet our high standards in the future we will help C4EO and CAYT in their work.

Inevitably, if spats occur, there will be pressure to bring all of this work together in a ‘mother of all standards’, an idea that we know from experience would be unmanageable even if it were in any way desirable.

Finally, we continue to argue for humility. We know so little that we are not in a position to speak with absolute authority. There are benefits to drawing successive lines in the sands of changing policy and practice, learning as we go, continually getting better at the difficult task of improving the health and development of our children.
