The clumsy echoic: 

As clinicians, it is incumbent on us to use the best tools available for the kinds of problems we are tasked to solve. Similarly, it is our responsibility to recognize the limitations of those tools. The focus of this discussion is the use of a common tool, standard echoic training (SET), in language instruction for children with autism: its benefits, its limitations and its place in the conceptual landscape.


SET is most closely associated with what is commonly referred to as Applied Verbal Behavior (AVB) or Verbal Behavior (VB) (Burke, 2011; Carbone, 2001). Concerning SET, Shane (2016) writes,


“Standard echoic or vocal imitation training involves presenting a vocal model, and providing access to reinforcers if the participant imitates that model within an established amount of time. This is a relatively simple procedure that is easy to implement.”


It looks like this:

Teacher says “Ah”… Child emits “ah” and the child’s emitted response is reinforced.

It is no great revelation that SET can be useful in establishing early verbal imitative responding. Many use SET for ‘tact training’. The procedure is also often employed in ‘intraverbal training’. However, our experience has shown that its benefits diminish as one moves beyond simple naming or echoing. For example, when a teacher intends to ask the question, “Do you want a cookie?” and wishes to prompt an echoic “Yes”, the instructor says, “Do you want a cookie yes”. It is hoped that by using SET, combined with various stimulus fading and prompt fading procedures, a youngster will eventually say “Yes” in the presence of the truncated antecedent, “Do you want a cookie?”. Unfortunately, getting to the desired response is not always easy. Let’s explore this a bit.

When considering the procedure, the first question one might ask is, “How can a child know to echo only ‘yes’? Why wouldn’t the child echo the entire phrase? Why not the last two words?” In fact, children can’t possibly know, and thus we often see what we call ‘hanging echoes’: when we say, “Do you want a cookie yes?” very often children say, “Cookie yes” or “Want cookie yes”. This should not surprise us, since there is nothing in what we say that differentiates the intended ‘prompt’ from the intended ‘question’. There is no demarcation, i.e., nothing that marks this part of the statement as the question you are to answer and that part as the answer you are to provide. The verbal antecedent is fused as one continuous, undifferentiated stream of words. No user of English could make sense of what was said. It is not English. It’s VeeBee-Ese.

Given this problem, the next question might be, “Why not simply teach a youngster to follow an instruction such as ‘say’, so that children will know what to say and when to say it?” Good question… and one that is frequently asked. The common response is that this effort often results in children echoing the word “say” in addition to the intended target for imitation. Therefore, the prescription is to not use a “say” instruction. A second reason for the use of SET is that, from the perspective of practicing ‘verbal behaviorists’, the procedure leads to some success in establishing primary verbal operants, i.e., mands, tacts, intraverbals and echoics. This is true. However, is that enough? This raises a bigger question: “Shouldn’t the approach driving ‘language intervention’ be about language?”

You see, “verbal behavior” is not about language. It’s about controlling relations. Language, on the other hand, is the practice of a verbal community, which Skinner (1957, p. 2) says “has become remote from the behavior of the speaker.” ‘Verbal behavior’, in contrast to language, is concerned with the behavior of speakers. Primary operants, as Skinner laid them out, are descriptions of specific behavioral contingencies; nothing more. Language is something different.

To learn a language is to learn the activities, practices, actions and reactions within characteristic contexts in which the rule-governed use of words is integrated (Hacker, 2013). It is to learn to do things with words, such as asking, telling, naming, directing, promising, describing, explaining, cajoling, negotiating, refuting, refusing, agreeing, correcting, teasing, comparing, contrasting, tattling, inviting, etc. To learn a language is to be able to manipulate symbols according to the rules for their use; to learn their meanings. To learn a language is to learn to use the constituents of a language, such as pronouns, action words and tenses, plurals, negations, prepositions, attributes, etc., across domains.


To understand a move in language is not to account for controlling relations, but to see how it fits within a practice. To answer a ‘why’ question is to give a reason. There is an indefinite number of reasons one could give in answer to a ‘why’ question, and it is inconceivable that ‘generalized responses’ or ‘histories of reinforcement’ could account for an infinite number of reasons across an infinite number of situations, circumstances or transactions bound up with the query.

Thus, the goals related to language learning for children with autism reflect those aspects. Such goals require the use of tools that are at home within the conceptual scheme of language. To use tools conceived for other purposes will clog up the works. It goes without saying that the tools one uses should not interfere with achieving one’s goals. The tools need to be sure in purpose and accuracy. With those requirements in mind… how does SET hold up as a tool for language instruction?

Not well. First, encouraging (reinforcing) echoing in persons who may already manifest pathological echoing (echolalia) may result in increased echolalia. Second, the use of SET actually confounds efforts to teach a language, as its use runs counter to the conventions of ordinary language. It confuses because it violates the rules and renders efforts incoherent within a language scheme. SET cannot clarify when to echo/imitate (and when not to) and precisely what to echo/imitate, as does the instruction “say”. SET can’t provide the kind of therapeutic accuracy needed to tackle complex language goals. Examples of the imprecision of SET are shown below:


Teaching greetings:

In this example (Figure I), a teacher (Mel) attempts to teach a child, Al, to greet him.

Figure I


When viewing this sequence, one is left to consider whether:


1) These people live in a parallel universe where everything is said in opposites.

2) This is a perfectly acceptable method for teaching someone to offer greetings since it results in the establishment of an echoic relation and comports with “the science”.

3) This is a very peculiar way of teaching someone to offer greetings.


The youngster is Al. The instructor is Mel. Using this strategy to teach someone to greet someone violates the conventions of ordinary language, in which there is an agreed-upon use of symbols to refer. But since establishing verbal operants is the primary concern in an AVB-based approach, using SET is fine, since its use satisfies the requirement of establishing verbal operants. Since some children do learn to produce the desired responses, there would be no concern about its use. But eventually, this strategy ends up getting in our way as language targets become muddled in the mix.

To carry this discussion further, imagine that the teacher, Mel, had decided to attempt reciprocal greetings. Mel would have said something like, “Hi Al Hi Mel”. This is common practice. While it may work for some children, it’s a very confused and confusing way of instructing children in the basic rules for greetings. There is nothing in this effort that clarifies for the child what they are to do, nor does it resemble anything in English. Wouldn’t it be much simpler and clearer to instruct a student using a “say” instruction? The sequence would look like this:

Teacher: “Hi Al, say ‘Hi Mel’”. With this, all conventions and the need for clarity are satisfied… and the rules for speaking English are maintained.


Teaching “Asking questions”:

In Figure II, the instructor wants to teach a youngster to ask a question. But, instead, the teacher is actually asking a question… even though what he says is intended as an echoic prompt.

Figure II

Now, assume that in this case the ‘echoic prompt’, “What are you eating?”, is a question that the child has learned to answer, in which case the child is likely to answer the question. Naturally, the child can’t know that the ‘question’ is actually an echoic prompt. Confusing, right?

The same problem exists when we want to teach children to give directions rather than follow them. For example, if I want a child to learn to tell me how to build a block structure, using SET I would say, “Put the red block on the green block”. How could the child possibly know whether what I say is a ‘prompt’ to imitate or a direction that they are to follow? There is no way. But eventually, children will need to learn how to direct others to do things, and we will need to help them learn to do this.


Pronouns/giving directions:

Figure III

In Figure III, the instructor wants a child to say, “Throw me the ball”, and says, “Throw me the ball”, intending to prompt the child to say the same thing. In this case, one sees immediately that basic deictic relations are disregarded. “Me” always refers to the speaker. When the instructor says, “Throw me the ball”, the instructor is telling the child to throw the instructor the ball. It can’t be any other way. This is how our language works. These words all have a place in grammar (à la Wittgenstein) and mean things. Use outside of their agreed-upon place is just incoherent. It seems only natural that, as instructors of a language, our teaching should conform to the practices within that language; that if we intend children to learn a language, our language models should correspond to the practices within that language. Can it be any other way? Would you try to teach chess expressed in the rules for checkers? It’s not conceivable.

And to drive the point home, pronoun use is a fundamental feature of English. Learning to use pronouns is complicated. Part of the challenge is that their reference is contextually determined. There are many moving parts, and vigilant tracking across shifting speakers (you, they, he, she, Ralph, etc.) and listeners (I, we, they, she, he, Ralph, etc.) requires ongoing changes in responses. We need to be as precise as possible when instructing children in how they work. At a minimum, a child needs instruction that doesn’t confuse and that operates within the rules of the language one is trying to teach. Using SET can only fail in this effort.

The use of SET has its place within a verbal behavior approach. But, working within this paradigm beyond establishing the rudiments of vocal imitation defies justification and is just confusing.  We need to teach children when to say what we ask them to say and what to say. This is possible by teaching children to respond appropriately to the instruction “say”. We use a “say vs. do” program[1] (Lund and Schnee, 2018).

This program is a useful strategy for teaching children to respond appropriately to the instruction “say”. It involves teaching conditional instructions (say the word only if you hear the word “say”; otherwise perform the corresponding action). Thus, children clap when they hear “clap”, and say “clap” when told to say “clap”. Children can learn how to respond to a “say” instruction when careful arrangements are made… just as when teaching any other concept, e.g., jump, clap, refrigerator, albeit this is a little trickier.
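The conditional rule just described can be stated compactly. What follows is a minimal sketch in Python, assuming a hypothetical helper named `expected_response` and a plain-text representation of instructions (neither comes from Lund and Schnee, 2018). It illustrates only the discrimination the child is being taught, not the teaching procedure itself.

```python
from typing import Tuple


def expected_response(instruction: str) -> Tuple[str, str]:
    """Return (kind, content) for the response an instruction calls for.

    Hypothetical illustration: an instruction that begins with the word
    "say" calls for a vocal response; any other instruction calls for
    the corresponding action.
    """
    words = instruction.strip().split()
    if words and words[0].lower() == "say":
        # "say clap" -> the child says "clap" and does not echo "say"
        return ("say", " ".join(words[1:]))
    # "clap" -> the child performs the action
    return ("do", " ".join(words))


print(expected_response("clap"))        # the action is called for
print(expected_response("say clap"))    # the word is called for
print(expected_response("say Hi Mel"))  # the greeting example from earlier
```

Note that the word “say” itself never appears in the expected vocal response, which is exactly the outcome the program aims for.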


[1] The origins of this program, we believe, can be traced back to the UCLA Young Autism Project, but we cannot say for sure. It’s rare to be able to jump right in and run this as a “say vs. do” exercise as described in Lund and Schnee (2018)… some early massaging is required. We’ve discovered a strategy in which we fade in the instruction “say” by using a ‘whispered’ “say” and a ‘lagged’ echoic prompt, such that the volume of the “say” instruction is increased systematically while, at the same time, the duration between the instruction and the echoic prompt is reduced. So it looks something like this: whispered “say” > 2 sec delay > echoic prompt. Then, systematically increase the volume and begin to decrease the duration between the “say” and the word or phrase to be uttered. We’ve also discovered that using nonsense words, i.e., things that children cannot do, point to or touch, assists in the process. Therefore, we might present “Say ‘blinky blinky’” vs. “blinky blinky”.
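The fade-in described in this footnote can be sketched as a simple schedule. This is a rough illustration in Python, not the authors’ protocol: the function name `fade_in_schedule`, the 0 to 1 volume scale, the starting whisper level of 0.2, the linear steps and the final delay of zero are all assumptions made for the example. Only the 2-second starting delay comes from the text.

```python
from typing import List, Tuple


def fade_in_schedule(steps: int, start_delay_s: float = 2.0) -> List[Tuple[float, float]]:
    """Return one (say_volume, delay_seconds) pair per teaching step.

    say_volume rises from a whisper to full voice (1.0) while the delay
    between "say" and the echoic prompt shrinks from start_delay_s to 0.
    The linear progression and numeric scale are illustrative assumptions.
    """
    if steps < 2:
        raise ValueError("need at least two steps")
    whisper = 0.2  # hypothetical starting volume on a 0-1 scale
    schedule = []
    for i in range(steps):
        fraction = i / (steps - 1)  # 0.0 at the first step, 1.0 at the last
        volume = whisper + (1.0 - whisper) * fraction
        delay = start_delay_s * (1.0 - fraction)
        schedule.append((round(volume, 2), round(delay, 2)))
    return schedule


for volume, delay in fade_in_schedule(5):
    print(f'"say" at volume {volume}, then wait {delay} s, then the echoic prompt')
```

In practice the step size would be set by the child’s responding rather than by a fixed formula; the sketch only shows the two dimensions moving in opposite directions.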

Summary and Conclusions:

SET is useful for establishing early verbal imitative behavior. It is often used to establish tacts, mands and intraverbals. But its use beyond establishing early verbal imitation is clumsy and inappropriate for language instruction. First, it may increase pathological echoing in children who already demonstrate those tendencies. Second, as a tool for teaching a language it is misplaced, since its conceptual home is a verbal behavior framework. VB and language are not the same thing. VB concerns itself with controlling relations. Language learning involves learning a practice and the normative rules that allow one to participate in the practice. As such, the use of SET in language instruction violates conventions of use and confuses rather than clarifies what is required of children regarding when and what to echo. Such problems significantly hamper our ability to teach basic linguistic abilities, and certainly more advanced abilities (asking questions, pronoun use, giving directions, etc.).

Establishing a conditional instruction, ‘say what I tell you to say’ (the instruction “say”), will allow language-based instruction efforts to proceed and succeed with far more children. We have found that with careful and creative contingency and prompting arrangements (say vs. do), it is readily possible to establish “say” as a conditional instruction.

Establishing the use of a conditional instruction clarifies for children when to say and when not to say what someone else says. It opens worlds of opportunity to students and eliminates the concern that the “say” will be echoed. The ultimate question is whether clinicians should be satisfied with establishing only verbal operants. If so, then using SET is fine. If, on the other hand, clinicians hope that children learn a language, SET needs to be put aside early on and replaced with a ‘say’ instruction if children are to be given a reasonable chance at learning a language.




Hacker, P. M. S. (2013). The intellectual powers: A study of human nature. Wiley.


Lund, S. K., & Schnee, A. (2018). Early intervention for children with ASD: Considerations. Infinity.


Shane, J. (2016). Increasing vocal behavior and establishing echoic stimulus control in children with autism. Dissertations. 1400.


Skinner, B. F. (1957). Verbal behavior. Copley Publishing Group.

Nexus Autism Intervention Services

609 328 3283

Copyright © 2019 Nexus Autism Intervention Services. All Rights Reserved.