A Response to the 2022 Wyse and Bradbury Paper.
Recently, a paper by Dominic Wyse and Alice Bradbury, titled “Reading wars or reading reconciliation? A critical examination of robust research evidence, curriculum policy and teachers' practices for teaching phonics and reading”, has been making a big splash on social media. The paper’s authors claim that there exists robust new research evidence that synthetic phonics is inferior to a Whole Language approach. However, despite the paper being quite long, the authors present no compelling evidence to support their overall claim.
One argument the authors make is that standardized test scores in the UK, a country that uses synthetic phonics, are going down. However, the standardized tests used prior to the introduction of the phonics-based curriculum are different from the ones used after, which makes the comparison less than useful. Moreover, the types and grades of test scores collected afterward also fluctuate. The main argument here seems to hinge on the fact that test scores dropped between the introduction of the phonics curriculum in the UK and today. This is true. However, the difference is only 5%, and the drop was not linear: the scores fluctuated over the years, with the highest average scores coming in 2015, 4 years after the change in curriculum. If we run a significance test comparing the individual yearly test scores to the mean result, we get the absurdly high p-value of .49. To put this in context, a result is conventionally considered statistically significant only if its p-value is below .05. In other words, the authors are using small random fluctuations in yearly standardized test scores to disprove the efficacy of a policy decision.
This is not to say that standardized test scores are not an important tool in evaluating public policy, but given the scope of something like national test scores, a 5% movement cannot be seen as a valid test of policy efficacy, especially when we consider how many other factors are going to impact results viewed through such a macro lens. If we compare this to the PISA scores, the UK had a reading score of 494 in 2009 and was ranked 25th in the world, whereas in 2018 the UK had a reading score of 504 and was ranked 12th. Of course, in fairness, this might not be a statistically significant result either: while the UK moved up 13 ranks, the actual score only improved by 2%.
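The PISA arithmetic above can be checked in a few lines. This is only a minimal sketch using the scores and ranks quoted in the paragraph; the function and variable names are illustrative, and the calculation says nothing about statistical significance, only about the size of the change.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100.0


# UK PISA reading figures as quoted in the text.
uk_2009 = {"score": 494, "rank": 25}
uk_2018 = {"score": 504, "rank": 12}

score_gain = uk_2018["score"] - uk_2009["score"]
score_gain_pct = percent_change(uk_2009["score"], uk_2018["score"])
# A lower rank number is better, so ranks gained is old rank minus new rank.
ranks_gained = uk_2009["rank"] - uk_2018["rank"]

print(f"Score gain: {score_gain} points ({score_gain_pct:.1f}%)")
print(f"Ranks gained: {ranks_gained}")
```

Run as written, this reports a gain of 10 points (2.0%) and 13 ranks, which is why a large jump in rank can sit alongside a very small change in the underlying score.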
The authors of this paper also point to the high PISA ranking of Canada, which uses a Balanced Literacy approach and ranks 8th in the world, as proof that a Balanced Literacy approach works better than a synthetic phonics approach. However, as the Ontario IDA recently revealed, the test scores on which the Canadian ranking is based often exclude the majority of students with serious reading difficulties and provide scribes to many of the struggling readers who do participate. For example, in 2019, 18.3% of grade 6 students and 9.3% of grade 3 students tested on standardized literacy tests had a scribe. This does not exactly make for a fair statistical comparison with jurisdictions where exclusions and modifications are substantially lower. Moreover, studies that have compared UK decoding levels with Canadian decoding levels have found the UK to be performing substantially better. Take this infographic from the IDA’s 2021 report “Lifting the Veil on EQAO Scores.”
Strangely, the authors of this paper also point to several countries with Whole Language curriculums and lower PISA scores than the UK as proof that Whole Language is somehow positive. But I fail to see how pointing out that countries with a Whole Language curriculum underperform the UK, which has a synthetic phonics curriculum, is a point in favor of Whole Language.
The authors of the paper also conducted their own “meta-analysis” of the topic. However, their inclusion criteria seem somewhat biased. For starters, they only included papers that were included in two research reviews: Bowers 2020 and Torgerson 2019. Of course, these are the two meta-analyses on the subject best known for having the lowest effect sizes. They also excluded any papers written before 2008, as well as any papers that did not include an evaluation of methodological quality and a self-analysis of publication bias. The resulting meta-analysis only included 55 studies, compared to Hattie’s meta-analysis, which includes over 1000 studies and has an ES of .60.
However, they then exclude all of the 55 remaining studies for other qualitative reasons, claiming: “In summary, no studies met all the criteria of: experimental design with random allocation; longitudinal design; sample of children whose reading was typical; delivered by standard class teachers; reading comprehension measures included, and undertaken in England with the English language.” In other words, with over 1000 studies done on the topic, the authors of this paper could not find a single paper that fully met their screening process. Moreover, for their “meta-analysis” the authors do not actually attempt to calculate an effect size or present any statistical evidence based on their analysis of the research. They only present a qualitative criticism of the papers written on the subject. In other words, they do not have any experimental data to support their perspective, so they attack the integrity of the experimental research that does exist.
Lastly, the authors of the paper conducted a survey of year 2 elementary school teachers in the UK. Their survey included 634 respondents, and 49 of these respondents said they were against the instruction of phonics. The authors appear to highlight this as proof that teachers were dissatisfied with phonics being included in the curriculum. However, this data still means that roughly 92% of teachers were neutral or positive with regard to phonics instruction. Moreover, 66% of respondents said that synthetic phonics was their primary instructional focus for literacy, and 71% of teachers used the phonics screening test to help improve their instruction.
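The survey proportions follow directly from the two counts quoted above. The sketch below is only an illustration with made-up variable names, and it assumes, as the paragraph does, that every respondent who did not say they were against phonics instruction was neutral or positive.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return part / whole * 100.0


# Counts as quoted in the text.
total_respondents = 634
against_phonics = 49

against_pct = share(against_phonics, total_respondents)
# Assumption: everyone not counted as "against" is neutral or positive.
neutral_or_positive = total_respondents - against_phonics
neutral_or_positive_pct = share(neutral_or_positive, total_respondents)

print(f"Against phonics: {against_phonics} of {total_respondents} ({against_pct:.1f}%)")
print(f"Neutral or positive: {neutral_or_positive} of {total_respondents} ({neutral_or_positive_pct:.1f}%)")
```

This works out to about 7.7% against and about 92.3% neutral or positive, so the dissatisfied group is a small minority of the sample.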
The authors of this paper advocate for a Whole Language instruction approach, claiming that there is “robust research evidence” against synthetic phonics. However, they present zero experimental evidence, and their “meta-analysis” (if you can call it that) does not actually include any statistical analysis (or even any studies). Indeed, their entire argument seems to hinge on the fact that UK standardized test scores have fluctuated 5% in the past decade and that Canada has 3% higher PISA scores. Of course, none of these differences is large enough to reject the null hypothesis, nor is this the way the efficacy of a pedagogical method would be measured by any responsible scholar.
On the other hand, meta-analysis after meta-analysis shows that Phonics outperforms Whole Language. Indeed, even the Bowers 2020 meta-analysis, which they cite as proof that Whole Language is better than phonics, does not try to make this assertion. According to John Hattie’s meta-analysis of the subject, Whole Language has an ES of .06 and Phonics has an ES of .60. That’s a ten-fold difference. Balanced Literacy, which outperforms Whole Language, still only has an ES of .38, according to a 2017 meta-analysis by Graham, et al. The reality is that the comparison of phonics instruction to Whole Language is a settled debate, and no serious scholar would suggest that the evidence shows greater efficacy for Whole Language. To quote Starett, “Rather than engage in debates about whether phonics should or should not be taught, effective teachers of reading and writing ask when, how, how much, and under what circumstances phonics should be taught.”
OECD. (2010). PISA Rankings. Retrieved from <https://www.oecd.org/pisa/pisaproducts/46619703.pdf>.
FactsMaps. (2018). PISA 2018 Worldwide Ranking. Retrieved from <http://factsmaps.com/pisa-2018-worldwide-ranking-average-score-of-mathematics-science-reading/>.
NRP. (2001). Teaching Children to Read: An Evidence Based Assessment of the Scientific Literature on Reading Instruction. United States Government. Retrieved from <https://www.nichd.nih.gov/sites/default/files/publications/pubs/nrp/Documents/report.pdf>.
IDA. (2021). Lifting the Veil on EQAO Test Scores. Retrieved from <https://www.idaontario.com/wp-content/uploads/2021/09/LiftingTheCurtainOnEQAO69747.pdf>.
Graham, S., et al. (2017). Effectiveness of Literacy Programs Balancing Reading and Writing Instruction: A Meta-Analysis. Reading Research Quarterly, 53(3).
Hattie, J. (2021). Visible Learning Metax. Retrieved from <https://www.visiblelearningmetax.com/>.
Wyse, D., & Bradbury, A. (2022). Reading wars or reading reconciliation? A critical examination of robust research evidence, curriculum policy and teachers' practices for teaching phonics and reading. Review of Education, 10, e3314. https://doi.org/10.1002/rev3.3314
Hansford, N. (2021). Morphology: A Secondary Meta-analysis. Retrieved from <https://www.pedagogynongrata.com/morphology>.