Direct Mapping and Set Variability Reading Instruction Meta-Analysis

I was recently asked about the efficacy of explicitly teaching set variability in reading. Set variability refers to the idea that graphemes and digraphs can represent different phonemes in different contexts, and that different spellings can represent the same phoneme; for example, both “ck” and “k” often make the same sound. Explicit instruction around this concept aims to help children develop metacognitive strategies that make them more adaptable when sounding out words with set variability. This could be especially important for irregularly spelled words. Direct Mapping and the 5 Cueing Method appear to be common direct instruction approaches for teaching students to adapt to set variability.

Dr. Robert Savage, who has written several papers on the topic, states that Direct Mapping Set Variability Instruction (DMSVI) should include:

“1) linking taught grapheme-phoneme correspondences to text containing these specific items (‘Direct Mapping’) 

2) an intense focus on teaching alternate vowel digraph pronunciations

3) teaching a 2-stage process for reading both regular and exception words (‘Set-for-Variability’).” (Savage et al., 2018).

Proponents of DMSVI have also suggested that the following cueing steps should be taught to students:

(1) Say the word aloud.

(2) Decide if you know the word.

(3) If you don’t, think of words that sound like the word.

(4) Choose a word that sounds most like the word you said.

(5) Check: Does the word you have chosen make sense in context?
(Dyson et al., 2017).


Personally, I prefer to rely on meta-analyses to determine the efficacy of an intervention. However, at the time of writing this article, no meta-analysis had been published on the topic, so I decided to conduct my own. I was able to find four studies on the topic to date. While this is a relatively small number of studies, all four were extremely well done, which makes me more confident in the results; however, I still think there is a need for more research in this area. All four studies had meaningful sample sizes and were randomized controlled trials (RCTs).

To conduct a true and proper meta-analysis, I would have combined the data from each study into one pooled data set and recalculated the effects. This approach would have allowed me to find a result that weighted each study according to its sample size. However, I was unable to do this because I did not have access to the raw data for any of these studies. Instead, I calculated the mean of the effect sizes reported across the studies. This allowed me to understand the average impact found across the studies, but it does not weight for sample size. That being said, I recalculated the effect sizes for one of the studies, because the authors calculated their effect sizes based on the difference between their pre-test and post-test, rather than the difference between their experimental group and control group. I still included the original authors’ data in my calculation, but I also added the recalculated effect sizes. I also calculated the mean effect size without my recalculations, so that the reader can see the impact those calculation differences made.
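To make the difference between the two approaches concrete, here is a minimal Python sketch comparing a simple mean of effect sizes with a sample-size-weighted mean. The effect sizes and sample sizes in it are placeholders chosen for illustration, not the values from the four studies, and the function names are my own. Formal meta-analyses usually weight by inverse variance instead; sample-size weighting is shown only because it is the simpler scheme described above.

```python
def unweighted_mean(effect_sizes):
    """Simple average: every study counts equally, regardless of its size."""
    return sum(effect_sizes) / len(effect_sizes)


def sample_size_weighted_mean(effect_sizes, sample_sizes):
    """Average in which each study counts in proportion to its sample size."""
    weighted_sum = sum(es * n for es, n in zip(effect_sizes, sample_sizes))
    return weighted_sum / sum(sample_sizes)


# Placeholder inputs for illustration only (NOT the values from the studies).
example_effect_sizes = [0.30, 0.50, -0.20]
example_sample_sizes = [400, 100, 250]

simple = unweighted_mean(example_effect_sizes)
weighted = sample_size_weighted_mean(example_effect_sizes, example_sample_sizes)

print(f"unweighted mean ES: {simple:.2f}")
print(f"sample-size-weighted mean ES: {weighted:.2f}")
```

With these placeholder numbers, the small study with the largest effect pulls the simple mean upward, while the weighted mean stays closer to the results of the larger studies.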


Of the four studies, two had a positive result, one had a null result, and one had a negative result. Overall, the mean effect size for DMSVI was .14, which is a very small but statistically significant effect size. If I exclude my added calculations, that number jumps to a small but statistically significant effect size of .25. While I would normally suggest that effect sizes this small mean an intervention’s effect is negligible, in this situation I think that would be the wrong conclusion, for several reasons. For starters, all of these studies were RCTs, which generally produce much lower effect sizes. Moreover, in 2 of the 4 studies the control group was using what the authors called “current best practices”, which they defined as explicit synthetic phonics instruction. Explicit synthetic phonics was shown in the NRP meta-analysis to be better than unsystematic phonics instruction by an ES of .46, and phonics instruction in general has been shown to be a high yield intervention, as can be seen in John Hattie’s 2020 meta-analysis, which put phonics interventions at an ES of .6. So in 2 of these studies DMSVI was truly being tested against the current best practices. I think it is therefore reasonable to assume that DMSVI is likely a high yield strategy.

That being said, I do want to make three very important caveats. Firstly, there is still not enough research on the topic, and I did find some weaknesses in the current literature (which I will explore more below). Secondly, DMSVI is not a singular approach but a combination of approaches; it is therefore very challenging to determine which parts of DMSVI are having the greatest impact. Thirdly, DMSVI was used in these studies in combination with systematic synthetic phonics instruction, so it is something that could be added to current best practices, not something to replace them.

Studies Summary:

Study 1:

The first study I looked at was titled Preventative Reading Interventions Teaching Direct Mapping of Graphemes in Texts and Set-for-Variability Aid At-Risk Learners and was written by Robert Savage, George Georgiou, Rauno Parrila and Kristina Maiorino. The full paper can be found online for free here:

This was not only the best of the four studies, but it might be the best education study I have ever read, which makes me more confident in its results. As I have said before, I would rather have one well-conducted study than five poorly conducted ones.

This study was an RCT. It had 497 children from 42 classrooms nested within 21 schools in 5 school boards. All participants were in kindergarten or grade 1. Students who made zero progress were removed from the data for both the control group and the experimental group to control for outliers. The control group received what the authors believed was current best practice instruction, which they defined as systematic synthetic phonics instruction. All control group classes used either Jolly Phonics, Soundprints or Success for All.

The study ran from September to January, and delayed post-tests were conducted between May and June. Sessions were 30 minutes long and were conducted 3 times a week. In total, students received 11-12 hours of instruction over 10 weeks. The assessments and instruction were carried out by research assistants, who were given 6 hours of training. All effect sizes were calculated using the Hedges’ g formula. The effect sizes for the other 3 studies were Cohen’s d calculations, which might be a limitation of this meta-analysis; however, both types of effect size are meant to be interpreted the same way. The p-value was found to be less than .05, suggesting statistical significance.
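For readers wondering how much the choice between Cohen’s d and Hedges’ g matters, the sketch below shows the standard textbook relationship between the two: Hedges’ g is Cohen’s d multiplied by a small-sample correction factor. The means, standard deviations and group sizes are hypothetical and are not taken from any of the four studies; the function names are my own.

```python
import math


def cohens_d(mean_exp, mean_ctrl, sd_exp, sd_ctrl, n_exp, n_ctrl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_exp + n_ctrl - 2
    )
    return (mean_exp - mean_ctrl) / math.sqrt(pooled_var)


def hedges_g(d, n_exp, n_ctrl):
    """Cohen's d with the usual approximate small-sample bias correction."""
    correction = 1 - 3 / (4 * (n_exp + n_ctrl) - 9)
    return d * correction


# Hypothetical scores (NOT from the studies): same means and SDs,
# once with small groups and once with large groups.
d_small = cohens_d(55, 50, 10, 10, 10, 10)
g_small = hedges_g(d_small, 10, 10)

d_large = cohens_d(55, 50, 10, 10, 250, 250)
g_large = hedges_g(d_large, 250, 250)

print(f"n=10 per group:  d={d_small:.3f}, g={g_small:.3f}")
print(f"n=250 per group: d={d_large:.3f}, g={g_large:.3f}")
```

The correction shrinks d slightly, and the gap all but disappears as the groups get larger, which is why the two statistics are usually read on the same scale.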

Study Results:

Study 2: 

This paper was titled Training Mispronunciation Correction and Word Meanings Improves Children’s Ability to Learn to Read Words and was written by Dyson, H., Best, W., Solity, J., and Hulme, C., in 2017. This study was an RCT with a sample of 84 students aged 5-7. In the control group, teachers provided their regular instruction. The experimental group was taught the 5 Cueing Method twice a week for four weeks. This study had the highest effect sizes, likely because the control group received its normal teaching rather than current best practices.

Study 3:

This study was titled The importance of flexibility of pronunciation in learning to decode: A training study in set for variability and was written in 2016 by M. Zipke for the journal First Language. The paper was an RCT that looked at 30 students in grades 1 and 2. Students in the experimental group received 5 one-on-one lessons on the 5 Cueing Method, each 20-25 minutes long. The control group received the same amount of additional instruction time; however, their time was spent on guided reading. The author found no significant differences. The paper did not include raw data or effect sizes, so I excluded it from the final results.

Study 4:

This study was titled Teaching Grapheme–Phoneme Correspondences Using a Direct Mapping Approach for At-Risk Second Language Learners: A Randomized Controlled Trial and was written by S. Yeung and R. Savage in 2020 for the Journal of Learning Disabilities. This paper was an RCT with a sample of 253 Chinese ESL students in grades 1 and 2. The control group received instruction according to current best practices (systematic synthetic phonics instruction), and the experimental group received the same instruction with the addition of DMSVI. All instruction was carried out for the same duration, by trained research assistants, in small groups, for a total of 12 hours. Research assistants received 6 hours of training from the lead author.

They found that both groups did well, which makes sense, as both groups were receiving systematic synthetic phonics instruction. However, the control group actually did better than the experimental group. That being said, the authors did not calculate effect sizes for the difference between the control group and the experimental group. Instead, they calculated effect sizes based on the pre-test and post-test results. While these results were significant, I do not think this was the most appropriate way to calculate the effect sizes. In my opinion, the effect sizes should have been calculated on the difference between the control group and the experimental group, so I recalculated their results to reflect this. While this study suggests that systematic synthetic phonics instruction with DMSVI was inferior to systematic synthetic phonics instruction on its own, I do not think it necessarily takes away from the findings of the first two papers, as its sample was ESL students, whereas the other studies were on native English-speaking students. I therefore think the literature so far likely suggests that DMSVI is a high yield strategy for native English-speaking students, but not for ESL students.
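The sketch below illustrates why the two ways of calculating an effect size can tell very different stories. It uses one common formulation of each (a pre-test to post-test gain divided by the pre-test standard deviation, and a post-test difference between groups divided by the pooled standard deviation). These are assumptions for illustration: the exact formulas used in the paper and in my recalculation are not reproduced here, and the scores are hypothetical, not the study’s data.

```python
import math


def pre_post_d(pre_mean, post_mean, pre_sd):
    """Within-group effect size: gain from pre-test to post-test in SD units."""
    return (post_mean - pre_mean) / pre_sd


def between_group_d(post_exp, post_ctrl, sd_exp, sd_ctrl, n_exp, n_ctrl):
    """Between-group effect size: post-test difference over the pooled SD."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_exp + n_ctrl - 2
    )
    return (post_exp - post_ctrl) / math.sqrt(pooled_var)


# Hypothetical scores (NOT the study's data): both groups improve a lot,
# but the control group improves slightly more.
exp_gain = pre_post_d(pre_mean=20, post_mean=30, pre_sd=10)
ctrl_gain = pre_post_d(pre_mean=20, post_mean=32, pre_sd=10)
exp_vs_ctrl = between_group_d(30, 32, 10, 10, 100, 100)

print(f"experimental group pre-post ES: {exp_gain:.2f}")
print(f"control group pre-post ES:      {ctrl_gain:.2f}")
print(f"experimental vs control ES:     {exp_vs_ctrl:.2f}")
```

In this made-up example the experimental group shows a large pre-post effect, yet the comparison against the control group is slightly negative, which is the same pattern described above.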

Written by Nathaniel Hansford

Last Edited, 2021-06-20


Yeung, S. S., & Savage, R. (2020). Teaching Grapheme–Phoneme Correspondences Using a Direct Mapping Approach for At-Risk Second Language Learners: A Randomized Controlled Trial. Journal of Learning Disabilities, 53(2), 131–144.

Zipke, M. (2016). The importance of flexibility of pronunciation in learning to decode: A training study in set for variability. First Language, 36(1), 71–86.

Dyson, H., Best, W., Solity, J., & Hulme, C. (2017). Training Mispronunciation Correction and Word Meanings Improves Children’s Ability to Learn to Read Words. Scientific Studies of Reading, 21(5), 392–407.

Savage, R., Georgiou, G., Parrila, R., & Maiorino, K. (2018). Preventative Reading Interventions Teaching Direct Mapping of Graphemes in Texts and Set-for-Variability Aid At-Risk Learners. Scientific Studies of Reading, 22(3), 225–247. doi:10.1080/10888438.2018.1427753

NRP. (2001). Teaching Children to Read: An Evidence Based Assessment of the Scientific Literature on Reading Instruction. United States Government. Retrieved from <>.

Hattie, J. (2021). Visible Learning MetaX. Retrieved from <>.