The clumsy echoic: Considerations for best practice

As clinicians, it is incumbent on us to use the best tools available for the kinds of problems we are tasked to solve. Similarly, it is our responsibility to recognize the limitations of those tools. This discussion focuses on a common tool, standard echoic training (SET), as used in language instruction for children with autism: its benefits, its limitations, and its place in the conceptual landscape.

SET is most closely associated with what is commonly referred to as Applied Verbal Behavior (AVB) or Verbal Behavior (VB) (Burke, 2011; Carbone, 2001). Concerning SET, Sloane (2016) writes,


“Standard echoic or vocal imitation training involves presenting a vocal model, and providing access to reinforcers if the participant imitates that model within an established amount of time. This is a relatively simply procedure that is easy to implement.”


It looks like this:

Teacher says “Ah”… Child emits “ah” and the child’s emitted response is reinforced.

SET is often useful in establishing early verbal imitative responding. It also works well for ‘tact training’ when there is word-to-world correspondence, such that the ‘echoic’ response is ‘transferred’ to the presence of real-world stimuli. Similarly, the procedure sometimes works in ‘intraverbal training’. However, our experience has shown that its benefits diminish as one moves beyond simple naming or echoing. For example, when a teacher intends to ask the question “Do you want a cookie?” and wishes to prompt an echoic “Yes”, the instructor says, “Do you want a cookie yes”. It is hoped that SET, combined with various stimulus fading and prompt fading procedures, will eventually lead a youngster to say “Yes” in the presence of the truncated antecedent (“Do you want a cookie”). Unfortunately, getting to the desired response is not always easy. Let’s explore this a bit.

When considering the procedure, the first question one might ask is, “How can the child know to echo only ‘yes’? Why wouldn’t the child echo the entire phrase? Why not the last two words?” In fact, children can’t possibly know, and thus we often see what we call ‘hanging echoes’: when we say, “Do you want a cookie yes?” children very often say, “Cookie yes” or “Want cookie yes”. This should not surprise us, since nothing in what we say differentiates the intended ‘prompt’ from the intended ‘question’. There is no demarcation, i.e., nothing that signals “this part is the question you are to answer, and this part is the answer you are to provide”. The verbal antecedent is fused into one continuous, undifferentiated stream of words. No user of English could make sense of what was said because it is not English. It’s VeeBee-ese.

Given this problem, the next question is likely to be, “Why not simply teach a youngster to follow an instruction such as ‘say’, so that children will know what to say and when to say it?” It is a good question, and one that is frequently asked. The common response is that this effort often results in children echoing the instruction in addition to the intended target for imitation. Therefore, the prescription is not to use a “say” instruction. A second reason given for using SET is that, from the perspective of practicing ‘verbal behaviorists’, the procedure leads to some success in establishing some primary verbal operants, i.e., mands, tacts, intraverbals and echoics. This is true. SET is successful in establishing verbal operants. However, is that enough? This raises a bigger question: “Shouldn’t the approach driving ‘language intervention’ be about language?”

You see, ‘verbal behavior’ is not about language. ‘Language’ is the practice of a verbal community “which has become remote from the behavior of the speaker” (Skinner, 1957, p. 2), whereas ‘verbal behavior’ is concerned with the behavior of speakers. Primary operants, as Skinner laid them out, are descriptions of specific behavioral contingencies; nothing more. Language, on the other hand, is very different.

To learn a language is to learn the activities, practices, actions and reactions within characteristic contexts in which the rule-governed use of words is integrated (Hacker, 2013). Words are integrated into activities and practices such as asking, telling, naming, directing, promising, describing, explaining, cajoling, negotiating, refuting, refusing, agreeing, correcting, teasing, comparing, contrasting, tattling, inviting, etc. To learn a language is to be able to manipulate symbols according to the rules for their use; to learn their meanings. To learn a language is to learn to use the constituents of a language within common linguistic domains (pronouns, action words and related tenses, plurals, negations, prepositions, attributes) and to learn how to put them into play.

Therefore, the goals of language intervention for children with autism require intervention that comprehensively addresses these concerns and uses precise tools that lead to those ends. It goes without saying that the tools one uses should not interfere with achieving language goals. The tools need to be sure in purpose and accuracy. With those requirements in mind, how does SET hold up as a tool for language instruction? Not well. First, its use encourages (reinforces) echoing in persons who may already manifest pathological echoing (echolalia).

Second, the imprecision of SET actually interferes with intervention efforts. SET does not establish conditions or rules for when to echo/imitate (and when not to) or for precisely what to echo/imitate. It can’t provide the kind of therapeutic accuracy needed to tackle complex language goals. Examples of the imprecision of SET as a tool are shown below.


1. Teaching greetings:

In this example, a teacher (Mel) attempts to teach a child, Al, to greet him.

When viewing this sequence, one is left to consider whether:


1) These people live in a parallel universe where everything is said in opposites.

2) This is a perfectly acceptable method for teaching someone to offer greetings since it results in the establishment of an echoic relation and comports with “the science”.

3) This is a very peculiar way of teaching someone to offer greetings.


The youngster is Al; the instructor is Mel. Using this strategy to teach someone to greet another person violates the conventions of ordinary language, in which there is an agreed use of the symbols used to refer. But since establishing verbal operants is the primary concern in an AVB-based approach, using SET is considered fine, since its use satisfies that requirement. And since some children do learn to produce the desired response(s), concern about its use is deemed unwarranted. Eventually, however, this strategy ends up getting in our way as language targets become even slightly more complex.

Now imagine that the teacher, Mel, had decided to attempt reciprocal greetings. Mel would have said something like “Hi Al Hi Mel”. This is common practice, and while it may work for some children, it is a confused and confusing way of instructing children in the basic rules for greetings. There is nothing in this effort that clarifies for the child what they are to do. Wouldn’t it be much simpler and clearer to instruct a student using a “say” instruction? The sequence would look like this.

Teacher: “Hi Al, say ‘Hi Mel’.”

With this, all conventions and the need for clarity are satisfied, and the rules for speaking English are maintained.


2. Teaching “Asking questions”:

In illustration 2, the instructor wants to teach a youngster to ask a question. But instead, the teacher is actually asking a question, even though what he says is intended as an echoic prompt.

Now assume that, in this case, the ‘echoic prompt’, “What are you eating?”, is a question that the child has learned to answer, in which case the child is likely to answer the question. Naturally, the child can’t know that the ‘question’ is actually an echoic prompt. Confusing, right?

The same problem exists when we want to teach children to give directions rather than follow them. For example, if I want a child to learn to tell me how to build a block structure, using SET I would say, “Put the red block on the green block”. How could the child possibly know whether what I say is a ‘prompt’ or a direction to be followed? There is no way. But eventually, children will need to learn how to direct others to do things, and we will need to help them learn to do this.

3. Pronouns/giving directions:

If an instructor wants a child to say, “Throw me the ball”, and the instructor says, “Throw me the ball”, intending to prompt the child to say the same thing, one sees immediately that basic deictic relations are disregarded. “Me” always refers to the speaker. In this case, when the instructor says, “Throw me the ball”, the instructor is telling the child to throw the instructor the ball. It can’t be any other way. This is how our language works. It seems only natural that, as instructors of a language, our teaching should conform to the practices within that language: if we intend children to learn a language, our language models should correspond to the practices within that language. Can it be any other way? Would you try to teach chess using the rules for checkers? It’s not conceivable.

SET has its place within a verbal behavior approach. But working within this paradigm beyond establishing the rudiments of vocal imitation defies justification and is just confusing. We need to teach children what to say and when to say it. This is possible by teaching children to respond appropriately to the instruction “say”. We use a “say vs. do” program (Lund and Schnee, 2018), a useful strategy for this purpose. It involves teaching conditional instructions (say the word only if you hear the word “say”; otherwise perform the corresponding action). Thus, children clap when they hear “clap”, and say “clap” when told to say “clap”. Children can learn how to respond to a “say” instruction when careful arrangements are made, just as when teaching any other concept (e.g., jump, clap, refrigerator), albeit this is a little trickier.
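For readers who find it helpful, the conditional rule at the heart of a “say vs. do” arrangement can be written out as a few lines of Python. This is only an illustrative sketch of the rule as stated above, not the teaching program described by Lund and Schnee (2018). The function name `expected_response`, the `mode` and `target` labels, and the assumption that a “say” instruction always begins with the literal word “say” are all hypothetical.

```python
def expected_response(instruction: str) -> dict:
    """Return the response expected from the learner for one instruction.

    Hypothetical rule for illustration only: if the instruction begins
    with the word "say", the learner says the remainder; otherwise the
    learner performs the instruction as an action.
    """
    words = instruction.strip().split()
    if not words:
        raise ValueError("instruction must not be empty")

    # The word "say" marks off what is to be spoken from the instruction itself.
    if words[0].lower() == "say" and len(words) > 1:
        return {"mode": "say", "target": " ".join(words[1:])}

    # No "say" marker (or nothing following it): treat the words as an action.
    return {"mode": "do", "target": " ".join(words)}


if __name__ == "__main__":
    for instruction in ["clap", "say clap", "Say Hi Mel"]:
        print(instruction, "->", expected_response(instruction))
```

The point of the sketch is that the word “say” does the demarcating work: it marks off what is to be spoken from everything else. An SET antecedent such as “Hi Al Hi Mel” carries no such marker, so no rule of this kind could be applied to it.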

And to drive the point home: pronoun use is a fundamental aspect of English, and learning to use pronouns is complicated. Part of the challenge is that their reference is contextually determined. There are many moving parts, and vigilant tracking across shifting speakers (you, they, he, she, George) and listeners (I, we, they, she, he, George) requires ongoing changes in responses. We need to be as precise as possible when instructing children in how pronouns work. At a minimum, a child needs instruction that doesn’t confuse and that operates within the rules of the language one is trying to teach. SET can only fail in this effort.



Our experience has shown that SET is a useful tool for establishing early verbal imitative behavior. It can be used to establish tacts, mands and intraverbals. But its usefulness beyond establishing early verbal imitation diminishes quickly. Moreover, its use may increase pathological echoing in children who already demonstrate those tendencies. Additionally, as a tool for teaching a language, it violates conventions of use, confuses rather than clarifies, and hampers our ability to teach both basic and more advanced linguistic abilities (asking questions, pronoun use, giving directions, etc.) because, quite simply, its home is within a conceptual system that is not about language.

If clinicians are satisfied with establishing only verbal operants, then using SET is fine. If, on the other hand, clinicians hope that children learn a language, SET needs to be put aside early on and replaced with a conditional instruction, “say”, in order to give youngsters a fighting chance at learning a language.


McCully-Rodriguez, K., M.S., CCC-SLP. Texas Speech-Language-Hearing Association Annual Convention, Fri, March 26, 2010.

Burke, Christina.

Barry, L.  Behavioral Assessment, Intervention & Outcomes, Online course. Aug, 15, 2010

Carbone, V.J. (2001). Guiding the Development and Implementation of an Intensive Program for Teaching Language and Basic Learner Skills to Children With Autism: Workbook for Parents and Their Team. Parent Workshop, Revised 3/10/01.

Hacker, P.M.S. (2013). The intellectual powers: A study of human nature. Wiley.


Lund, S.K. and Schnee, A. (2018). Early intervention for children with ASD: Considerations. Infinity Publications

Skinner, B.F. (1957). Verbal Behavior. Copley Publishing Group. Acton, MA.


Stock, R.A., Schulze, K.A., & Mirenda, P. (2008). A comparison of stimulus-stimulus pairing, standard echoic training, and control procedures on the vocal behavior of children with autism. The Analysis of Verbal Behavior, 24(1), 123-133.


Sweeny-Kerwin, E.J, Zecchin-Tirri, G., Carbone, V.J., Janeckey, M., Murrary, D.D.,  & McCarthy, K. Improving the Speech Production of Children with Autism: Carbone Clinic Valley Cottage, NY


Sundberg, M.L., & Michael, J. (2001). The value of Skinner’s analysis of verbal behavior for teaching children with autism. Behavior Modification, 25, 698-724.


Ward, S.J., Osnes, P.J., & Partington, J.W. (2007). The effects of delay of noncontingent reinforcement during a pairing procedure in the development of stimulus control of automatically reinforced vocalizations. The Analysis of Verbal Behavior, 23(1), 103-111.


Acknowledgements: I wish to thank Stein Lund for his early comments on this paper.

Nexus Autism Intervention Services

Copyright © 2019 Nexus Autism Intervention Services. All Rights Reserved.