Commun Sci Disord, Volume 20(2), 2015
A Longitudinal Study of English Speech Sound Acquisition in a Minimally Verbal Child with Autism Spectrum Disorder: A Case Study


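Background and Purpose:

Research on the speech sound development of children with autism spectrum disorder (ASD) is lacking in the existing literature. This study examined the English speech sound development of a minimally verbal child with ASD who began speaking at the late age of 7 years. The child who participated in this case study was exposed to Korean and English at home but used English as the primary language both at home and at school.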


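Data were collected monthly in a one-year longitudinal study, and follow-up data were then collected 6 months and 20 months later. To examine the phonetic and phonological characteristics of single-word utterances, the phonetic inventory, percentage of consonants correct (PCC), and typical/atypical error patterns were analyzed.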


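PCC increased by 10% over one year, and the number of speech sounds in the phonetic inventory was relatively large compared with the PCC. The most frequent error patterns were cluster reduction, final consonant deletion, and stopping, and a number of atypical error patterns, such as backing of consonant place, were also found. The 20-month follow-up data showed a 13% increase in PCC and a continued decrease in final consonant deletion and stopping.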

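Discussion and Conclusion:

The results of this study suggest that a minimally verbal child with ASD can continue to develop speech sounds even though the onset of speech occurred at a considerably late age.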



Information regarding the acquisition of speech sounds of children with autism spectrum disorder (ASD) is lacking in the literature. The present study examines the English speech sound acquisition of an 11-year-old child with severe speech impairment secondary to ASD, who was minimally verbal and had a significantly delayed onset of speech as late as seven years; the child was exposed to Korean and English at home but spoke English as a primary language both at home and school.


Data was collected monthly over a one-year period along with two additional sets of follow-up data (6 months and 20 months after the one-year study period). Phonetic and phonological characteristics of single word production were examined through analysis of the phonetic inventory, percentage of consonants correct (PCC), and typical/atypical error patterns.


Results show that PCC increased approximately 10% over a year. The child’s phonetic inventory was relatively large when compared to his low PCC. His most common error patterns were cluster reduction, final consonant deletion, and stopping. The child also produced a number of atypical error patterns (especially backing). The 20-month follow-up data indicated a continuous decrease of final consonant deletion and stopping as well as an additional 13% increase in PCC.


Results of the current study suggest that minimally verbal children with ASD can continue to develop speech sounds despite severe impairment and significantly delayed onset in producing speech sounds.

Autism spectrum disorder (ASD) has become one of the most prevalent developmental disorders. The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association [APA], 2013) uses ASD to include autism, pervasive developmental disorder-not otherwise specified, and Asperger syndrome. Children with ASD exhibit deficits in social communication and interaction as well as restricted/repetitive behavior patterns, interests, and activities (APA, 2013).
Many children with ASD have been reported to be nonverbal by age 5 or older (Tager-Flusberg & Kasari, 2013). The percentage of children reported to be nonverbal ranges from 30% to 75%, with a decreasing trend in more recent years (National Research Council, 2001; Tager-Flusberg & Kasari, 2013). The fact that the percentage of children who are nonverbal has decreased is encouraging given the importance of verbal communication in daily functioning.
In the literature, there have been inconsistent definitions and unclear terms in describing children’s speech production as non-verbal, or minimally verbal (Tager-Flusberg & Kasari, 2013). ‘Minimally verbal’ refers to a range of speech production levels, including producing a few words (fewer than five words), producing ‘fewer than 20 functional words,’ and/or echolalic or scripted phrases (Kasari, Brady, Lord, & Tager-Flusberg, 2013; Tager-Flusberg & Kasari, 2013). Additionally, information regarding the speech sound acquisition of these children who are minimally verbal is limited (Cleland, Gibbon, Peppe, O’Hare, & Rutherford, 2010; McCleery, Tully, Slevc, & Schreibman, 2006; Wolk & Giesen, 2000).
There have been a number of studies on the verbal communication of individuals with ASD (Tager-Flusberg, Paul, & Lord, 2005). Most of the research on communication in ASD has focused on the pragmatics of language in high-functioning individuals with autism who have average cognitive functioning (Bauminger-Zviely, Karin, Kimhi, & Agam-Ben-Artzi, 2014; Paul, Orlovski, Marcinko, & Volkmar, 2009). This is because individuals with high-functioning autism typically display fairly intact morphosyntactic development, but with delays in semantic-pragmatics. Articulation and phonology in children with ASD have been reported to be relative strengths when compared to the other speech and language domains in these individuals who are considered to be verbal (Kjelgaard & Tager-Flusberg, 2001; Rapin & Dunn, 2003). However, children with low-functioning autism tend to be minimally verbal, and produce limited speech output, sometimes characterized by jargon that is unintelligible (Tager-Flusberg et al., 2005). Therefore, the area of articulation and phonology in the field of ASD remains to be explored, particularly since the speech sound acquisition of low-functioning autism has not yet been systematically examined (McCleery et al., 2006).
Pickett, Pullara, O’Grady, and Gordon (2009) carried out an extensive literature review to identify individuals with autism who began to develop speech at age 5 or older and determine the number of children who were successful in achieving speech and language development. Based on 64 published papers from 1951 to 2006, the authors found that a total of 167 individuals with autism developed speech at, or after, age 5. Out of nine studies conducted between 1967 and 1999, the authors found that approximately 22% of children who lacked speech at age 5 or older eventually developed speech in the form of single words, echolalia, and sentences. In general, the studies have shown that most children developed speech between ages 5 and 7 while some acquired speech between ages 8 and 13. It was reported that there had been significant variability in the onset of speech and the rate of subsequent speech and language development.
Issues to examine in the speech sound development of children with ASD also include whether these children acquire speech sounds in the same order as typically developing children, and whether they display the same, or similar speech sound errors as those of typically developing children. Schoen, Paul, and Chawarska (2011) reported that 30 toddlers with ASD produced speech-like sounds that were similar to those of their language-matched peers in terms of the consonant distribution and the order of emergence of consonants. However, the children with ASD produced significantly more atypical non-speech vocalizations when compared to their age and language-matched peers.
McCleery et al. (2006) compared the consonant production of 14 children with autism (between ages 2;1 and 6;11 [year;month]) to that of 10 typically developing children (between ages 13 and 14 months old) based on words understood and words produced. These children were reported to produce an average of seven words with a large portion of non-speech vocalizations. The authors reported that the children with autism exhibited the same consonant production patterns as those of the typically developing children; both groups produced the earlier developing sounds /b, d, h, m, n/ more frequently than the later developing sounds /dʒ, l, r, s, t/. Additionally, the children with autism produced voiced sounds significantly more than voiceless sounds. These same patterns were observed in typically developing children.
Cleland et al. (2010) examined speech sound production errors and phonological error patterns using the Goldman-Fristoe Test of Articulation-2 (GFTA-2) in 69 children with high-functioning autism and Asperger syndrome. On the GFTA-2, 20 out of 28 children who produced errors had standard scores within the normal range; the remaining eight children scored lower than the normal range. The result of the error analysis indicated that both developmental errors (the three most common errors were gliding, cluster reduction, and final consonant deletion) and non-developmental errors (backing, nasal emission, sibilant dentalization) were observed. Of the 20 children with normal GFTA-2 scores, developmental errors only were observed in 11 children, non-developmental errors only in five children, and both developmental and non-developmental errors in four children. Of the eight children with standard scores falling outside the normal range, developmental errors only were observed in three children, non-developmental errors only in one child, and both developmental and non-developmental errors in four children. Therefore, the results of Cleland et al. (2010) are inconclusive regarding the association between phonological error patterns and severity of sound production errors in high-functioning children with autism.
Another study examined four siblings with autism (ages 2;3, 3;9, 5;9, and 9;0) out of a total of eight children in a family and reported atypical patterns of phonological development. These included certain phonological processes that persisted beyond the expected age (final consonant deletion, cluster reduction, labialization), unusual sound substitutions (frication for stops and liquids, velarization of [ŋ] for /n/, coalescence), unusual sequence of sound development (absence of early developing sounds in the presence of late developing sounds), and limited contrast use (Wolk & Giesen, 2000).
There has been extensive research on speech sound development in typically developing children learning English (Grunwell, 1982; Smit, 2007; Stoel-Gammon & Dunn, 1985; Templin, 1957). However, relatively limited literature is available on speech sound development of children with ASD. Furthermore, information on how children with ASD who are minimally verbal acquire speech sounds is lacking. In particular, the speech sound acquisition of children with ASD who demonstrated a significantly delayed onset of speech production (past age 5) is of interest.

Purpose of the study

The purpose of the current study is to examine the English speech sound acquisition of a child with ASD who is minimally verbal, with the onset of speech production past age 5. We identified a child who had an onset of speech sound development as late as 7 years of age and has a severe speech impairment secondary to ASD. Given the scarcity of studies on the speech sound development of children with ASD and the high degree of heterogeneity in the severity and functioning level of ASD, this case study allows us to examine internal change over a one-year period. The examination of internal changes over time is imperative at the current state of the literature on speech sound development of minimally verbal children with ASD. The specific research questions are as follows.
Research question 1. Does an 11-year-old minimally verbal child with ASD, who had an onset of speech as late as seven years and exhibits severe speech impairment, continue to develop English speech sounds over a one-year period?
Research question 2. Does the child exhibit similar developmental patterns of speech sounds (i.e., developmental sequences and speech sound errors) to those of typically developing younger children?



One 11-year-old boy (henceforth referred to as Y) who was diagnosed with autism at the age of three years participated in this case study. The child was recruited from the monthly parent support group that the second author has hosted on a university campus. Y was age 11;1 at the onset of the study, and was 12;1 when the study concluded. He had a one-year-old younger sister who was developing typically. Y received applied behavior analysis (ABA) therapy at home (approximately ten hours per week) during the study period. He attended an English speaking, non-profit private school daily. Y’s mother used both English and Korean when interacting with the child. He has been exposed to English at school and during all interventions including ABA; speech and occupational therapy at school were provided in English. He spoke English as a primary language both at home and in school.
Y’s developmental history is as follows. Y’s motor developmental milestones occurred at the expected age and the parents had no concern about his development until his first birthday. On his first birthday, Y cried for several hours until his mother returned home. Y had extreme difficulty with separation from his mother. He made eye contact with his mother, but not with others, and was not interested in other children (i.e., ignored them or pushed them away). He disliked any physical touch with others. At the age of 3 years, Y was diagnosed with autism using the DSM-IV criteria at a teaching hospital by a child psychiatrist in Seoul, Korea. His hearing was also tested at the hospital and reported to be normal. Y received speech-language intervention in Korean at the age of 5 years, and produced [ʌmma] (‘mom’ in Korean) for the first time. However, he made limited progress in speech/language development. Y used gestures (e.g., pointing) after age 5. At the age of 6, Y began producing vocalizations in order to request his needs.
Y’s family moved to Northern California from Korea when he was 7 years old and to Southern California when he was 10 years old. A Korean-English bilingual child psychiatrist confirmed Y’s diagnosis of autism using the DSM-5 criteria when he was ten years old. Y’s mother perceived that his attempts to vocalize increased notably at the age of ten. At age 7, Y’s memory for sentences also increased so that he could imitate sentences. He was often observed to repeat words. At the onset of the study, his dominant language was English, and his speech was comprised of mostly one-word productions and simple two- to three-word phrases (e.g., “I want water”, “I read a book”) that were taught in speech therapy at the non-profit private school. After the one-year period of data collection, Y began to receive a three-month trial of speech-language therapy (two times weekly for 45 minutes), which was a supplemental private speech-language intervention to improve his speech sound production. This supplemental intervention was provided based on Y’s mother’s perception that clearer speech production could improve his functional communication at school.

Background data

Y’s mother completed the Social Communication Questionnaire (SCQ; Rutter, Bailey, & Lord, 2003) Lifetime form at the beginning of the study. The SCQ is an ASD screening test for children above age 4 years whose mental age is at least 2 years (SCQ manual, p. 1). Bishop and Norbury (2002) reported that there are statistically significant correlations on social (r=.82), communication (r=.73), and repetitive behaviors (r=.89) between the SCQ and the Autism Diagnostic Interview-Revised (ADI-R), a gold standard autism diagnostic test. Therefore, because the administration of the ADI-R was unavailable, the SCQ was used as a means to objectively validate the participant’s diagnosis in addition to the clinical observation by the authors (limited verbal/non-verbal communications and social reciprocity, and repetitive behaviors of hitting on his forehead). The SCQ total score was 33 (the cutoff score of 15 or higher indicates possible autism).
Y’s IQ test results were not available. The second author, who is familiar with the Kaufman Assessment Battery for Children-II (Kaufman & Kaufman, 2004) and had experience using it for research, attempted to measure his IQ for this research. She was able to complete and score one sub-test (Triangle) (scaled score=14). This subtest measures the ability to visually construct spatial relationships. The child was presented with various shapes of colors and asked to copy a model or picture. However, he did not cooperate during subsequent testing on the Pattern Reasoning subtest and therefore, testing was discontinued.
Y’s receptive vocabulary level was evaluated using the Peabody Picture Vocabulary Test 4th edition (PPVT-4; Dunn & Dunn, 2007). Y’s receptive vocabulary was at a 3-year-old level (raw score=41, standard score=21) when he was 11 years 9 months old. His expressive vocabulary at the onset of the study was not measured due to his lack of cooperation. Y’s receptive and expressive vocabulary levels were measured using the PPVT-4 and the Expressive Vocabulary Test-2 (EVT-2; Williams, 2007), respectively, after the one-year period of the study. Y’s raw score on the PPVT-4 was 43 (standard score=20, age equivalent score=3;1). Y’s expressive vocabulary raw score on the EVT-2 was 37 (standard score=22, age equivalent score=3;4).
His speech sound production skills were assessed using the GFTA-2 (Goldman & Fristoe, 2000). This test includes 53 words that contain all English consonants in word-initial, medial, and final positions except for the voiced post-alveolar fricative (i.e., /ʒ/). All of the words on the GFTA-2 were first reviewed with Y in order to probe his familiarity with the words. Subsequently, words such as ‘feather’, ‘pencils’, ‘finger’, and ‘pajamas’ were taught before the test was administered since they seemed to be absent from the child’s expressive English vocabulary or present only as perceptually related words (e.g., leaf for feather). Y obtained a raw score of 53, which indicated a developmental age of less than two years. In order to assess the motor aspects of Y’s speech production abilities, the Kaufman Speech Praxis Test (Kaufman, 1995) was conducted. Due to Y’s lack of cooperation, the test was discontinued after two parts (the oral movement level and simple phonemic/syllabic level) were administered. Y demonstrated appropriate oral movement except for a slightly reduced range of lip movement (spread/pucker/alternating). He accurately produced all individual vowels (Vs) and consonants (Cs), V-V/C-V/V-C-V movement, repetitive syllables, and simple monosyllabic words.

Data collection

Both researchers visited Y’s home once a month over the one-year period, where data was collected over 12 sessions. Y’s word productions were obtained using pictures of a standardized test (GFTA-2) to systematically examine any changes in his sound production over time. Occasionally, Y was quiet and did not produce certain item words. In such cases, a model was provided to the child and efforts to collect delayed productions, rather than imitations, were made as much as possible. When there were multiple productions for the same target, the first spontaneously produced token was selected. A total of 636 word productions (53 words×12 sessions) were obtained and utilized for phonological analyses. Five of them were imitated productions, and 81 tokens were delayed productions, both of which were mostly produced during the first three sessions. As for the examination of Y’s phonetic inventory, on the other hand, all speech productions collected over each session were utilized. The speech productions were recorded using a Sony linear PCM Recorder and Azden WMS-PRO wireless microphone that was attached approximately 10 cm from the child’s mouth.

Additional data collection

Data collection was extended to include two additional sessions: one session six months (follow-up session 1) and another session 20 months (follow-up session 2) after the end of the one-year study period. The rationale for collecting additional data was to further examine whether Y had reached a plateau, or continued to develop in his speech production.

Phonetic transcription and transcription reliability

The speech samples were transcribed phonetically by a primary transcriber, who is a certified and licensed speech-language pathologist with experience in transcribing disordered speech samples. She is a native speaker of Korean and English-Korean bilingual. Broad transcription was used, except in the case of any distortions or non-English speech sounds for which diacritics were used. Sounds perceived as not typical English speech sounds were marked as distortions (e.g., stops that were produced as tense [p*, t*, k*]). The primary transcriber reviewed the transcription and classified distortions into Korean influenced non-English and other non-English or distorted sounds (Table 1).
Inter-transcriber reliability was assessed by two secondary transcribers who were native English-speaking students. They were trained with speech samples from the same child before transcribing the selected data for the reliability measure. The inter-transcriber reliability was obtained using speech samples from two randomly selected data collection sessions, which is approximately 15% of the entire data including follow-up data. Point-to-point reliability was obtained by comparing the transcriptions between the primary and secondary transcriber for all consonants the child produced. There was 88% and 89% agreement for consonant production between the transcribers, respectively.
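The point-to-point comparison described above can be illustrated with a minimal sketch. It assumes the two transcriptions have already been aligned consonant by consonant; the function name and the sample transcriptions are hypothetical and are not data from the study.

```python
def point_to_point_agreement(primary, secondary):
    """Percentage of aligned consonant slots on which two transcribers agree."""
    if len(primary) != len(secondary):
        raise ValueError("transcriptions must be aligned slot by slot")
    if not primary:
        raise ValueError("no consonants to compare")
    matches = sum(1 for a, b in zip(primary, secondary) if a == b)
    return 100.0 * matches / len(primary)


# Hypothetical aligned consonant transcriptions (illustration only).
primary = ["b", "n", "t", "l", "p", "d", "h", "ʔ"]
secondary = ["b", "n", "d", "l", "p", "d", "h", "ʔ"]

agreement = point_to_point_agreement(primary, secondary)
print(f"{agreement:.1f}% agreement")
```

In this made-up sample the transcribers disagree on one of eight consonants. In practice the alignment step is the difficult part when one transcriber hears a consonant that the other omits, and this sketch does not address it.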

Data analysis

Phonetic analysis included identification of the child’s phonetic inventory for English speech sounds, Korean influenced non-English speech sounds, and other non-English or distorted sounds. All of the sounds produced, regardless of the accuracy in word production, were included in the phonetic inventory. As for phonological analysis, the percentage of number of syllables correct (PNSC), percentage of consonants correct (PCC, distortions are considered incorrect), percentage of consonants correct–revised (PCC-R, distortions are considered correct; Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997), and typical/atypical error patterns were examined. Although this study focuses on speech sound production, PNSC was measured to examine an ability to maintain the syllable structure within a word versus speech sound production abilities. As for the accuracy of consonant production, the PCC and the PCC-R were calculated. The PCC calculation was conducted based on the sampling and scoring rules of Shriberg and Kwiatkowski (1982).
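The difference between PCC and PCC-R can be shown with a short sketch. It assumes each target consonant has already been judged as correct, distorted, substituted, or omitted; the label names, the function, and the sample counts are hypothetical, and the sketch does not implement the sampling and scoring rules of Shriberg and Kwiatkowski (1982).

```python
from collections import Counter

VALID_LABELS = {"correct", "distortion", "substitution", "omission"}


def consonant_accuracy(judgments):
    """Return (PCC, PCC-R) in percent for a list of per-consonant judgments.

    PCC counts only 'correct' productions as correct.
    PCC-R additionally counts 'distortion' as correct.
    """
    unknown = set(judgments) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no target consonants were scored")
    pcc = 100.0 * counts["correct"] / total
    pcc_r = 100.0 * (counts["correct"] + counts["distortion"]) / total
    return pcc, pcc_r


# Hypothetical judgments for 20 target consonants (illustration only).
judgments = (
    ["correct"] * 6
    + ["distortion"] * 1
    + ["substitution"] * 8
    + ["omission"] * 5
)
pcc, pcc_r = consonant_accuracy(judgments)
print(f"PCC = {pcc:.1f}%, PCC-R = {pcc_r:.1f}%")
```

The two measures coincide whenever a sample contains no distortions, so any gap between them reflects the share of distorted consonants.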
Lastly, the frequency of occurrence of typical and atypical error patterns (e.g., initial consonant deletion, glottal replacement, and backing) based on words that the child produced from the GFTA-2 were examined. There were a total number of 26 instances where consonant clusters occur in any word position, with two words containing two sets of consonant clusters. There were a total of 42 words that included final consonant(s).


Phonetic inventory

Table 1 shows Y’s phonetic inventory for each session. English speech sounds including English allophones (i.e., a glottal stop [ʔ] and a flap [ɾ]), Korean influenced non-English speech sounds, and other non-English or distorted sounds are presented. At the initial session, Y produced 16 English speech sounds, including a glottal stop. He produced all stops except for /k/, some fricatives such as /f, v, s/, and an affricate /tʃ/. He also produced the liquid /l/. At sessions 2-4, Y did not produce /v/, but produced a post-alveolar fricative /ʃ/. The voiceless affricate /tʃ/ and the fricative /ʃ/ were not produced at sessions 6-11 or sessions 8-10, respectively. Instead, Y produced the palatalized alveolar fricative [sʲ] for the English sound /ʃ/. Y did not produce /z, dʒ/ from sessions 1-11; however, he produced both of them in session 12. At the last session, Y produced all of the target English speech sounds except for the interdental fricatives /θ, ð/ and the liquid /r/.
Y also produced Korean influenced non-English speech sounds including unaspirated tense stops (i.e., /p*, t*, k*/). He produced the alveolo-palatal aspirated affricate [tɕʰ], which belongs to Korean speech sounds. This sound can be perceived as similar to the English sound /tʃ/ without the lip-rounding articulation although the Korean affricate is produced closer to the alveolar position than the English one.
The child also produced other non-English sounds such as voiceless palatal and velar fricatives (e.g., ‘brush’→ [bwɑç], ‘house’→ [aʊx]), and voiceless/voiced bilabial fricatives (e.g., ‘knife’→ [naɪɸ], ‘frog’→ [βʊə]). A glottal stop was produced in the word-initial position during all sessions (e.g., ‘house’→ [ʔahʊ], ‘yellow’→ [ʔɛlo], ‘cup’→ [ʔʌp]). A lateralized /s/ was produced (e.g., ‘glasses’→ [lasli]) in session 5.
In follow-up session 1, Y produced most of the sounds except for /j, z, tʃ, θ, ð/. He continued to produce the alveolo-palatal aspirated affricate [tɕʰ] for the English sound /tʃ/. He did not produce /j, tʃ, dʒ, θ/ while producing /ð/ in follow-up session 2. In both sessions, the liquid /r/ was produced, but /s/ continued to be lateralized in the word ‘glasses.’ It was noted that the production of Korean influenced non-English sounds was reduced with only /t*/ in follow-up session 2.

Phonological analysis

Figure 1 displays PNSC, PCC (distortions are considered incorrect), and PCC-R (distortions are considered correct) over time. With respect to the PNSC, Y produced all the syllables for 88.7% of the target words in the first session. As for the child’s correctness of consonant production, both PCC and PCC-R were 30.5%, which indicates that he did not produce any distortions in the first session.
With regard to the changes over time, the greatest change for PNSC (from 88.7% to 100%) occurred between sessions 1 and 2. Changes in PNSC in later sessions were much smaller. Overall, Y was able to preserve all the syllables for most of the target words produced. As for the correct production of consonants, both PCC and PCC-R increased by approximately 10% over a year from 30.5% at session 1 to 39.7% and 40.4% at session 12, respectively. Only a few distortions, with a range of zero to three instances per session, were found. Sibilant distortions, such as a palatalized alveolar fricative [sʲ] or [ʃ] without the lip-rounding feature for the target /ʃ/, and an alveolo-palatal affricate [tɕʰ] for the target /tʃ/, were most common. A few other distortions were the production of unaspirated tense stops (i.e., [p*, t*, k*]), which exist in Korean, for English voiced stops. The follow-up data indicated that PCC and PCC-R increased from 39.7% and 40.4% to 45.0% and 49.7% after 6 months (follow-up 1), and to 52.3% and 55.0% after 20 months (follow-up 2), respectively.
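The reported gains are differences between percentages, that is, percentage points. The following sketch reproduces that arithmetic from the PCC and PCC-R values quoted in the paragraph above; the dictionary layout and the helper name are illustrative choices, not part of the original analysis.

```python
# (PCC, PCC-R) in percent, as reported in the text for each data point.
reported = {
    "session 1": (30.5, 30.5),
    "session 12": (39.7, 40.4),
    "follow-up 1": (45.0, 49.7),
    "follow-up 2": (52.3, 55.0),
}


def gain(start, end):
    """Percentage-point change in (PCC, PCC-R) between two data points."""
    return tuple(
        round(after - before, 1)
        for before, after in zip(reported[start], reported[end])
    )


one_year = gain("session 1", "session 12")
to_follow_up_2 = gain("session 12", "follow-up 2")
print("One-year gain (PCC, PCC-R):", one_year)
print("Gain from session 12 to follow-up 2 (PCC, PCC-R):", to_follow_up_2)
```

The one-year gains of 9.2 and 9.9 points correspond to the "approximately 10%" increase, and the gains of 12.6 and 14.6 points from session 12 to follow-up 2 correspond to the 13% and 15% increases reported for the follow-up data.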
Table 2 displays the frequency of occurrence of typical error patterns. The most common error pattern was final consonant deletion (FCD), followed by cluster reduction (CR) and stopping. During the first session, the child demonstrated CR (e.g., ‘spoon’→ [bun]) in 17 out of the total of 26 instances (63%). He deleted final consonants (e.g., ‘telephone’ → [tɛlɛpo]) in approximately 64% of the instances (27 out of 42). Stopping occurred in 18 instances (e.g., ‘zipper’ → [dɪpə]). On the other hand, other error patterns including velar fronting, postalveolar fronting, post-vocalic devoicing, and vocalization occurred less frequently.
With respect to the changes over time, there was a decrease in the number of CR from 17 to 13, although the occurrence for session 5 was the same as that for session 1. The number of FCD, on the other hand, decreased from 28 to 21. However, the fluctuation in numbers is notable, with the smallest number of FCD produced in session 5 and the second largest number of FCD produced in session 9. The frequency of occurrences of stopping decreased from 18 to 13 over the one-year period. The change in the number of occurrences for other error patterns, such as gliding and prevocalic voicing, was inconsistent over the sessions. The results of the follow-up data indicate an increase in producing final consonants; FCD decreased from 21 to 13 after six months (follow-up session 1) and to 10 after 20 months (follow-up session 2) from the end of the one-year study period.
The frequency of occurrence of atypical error patterns is shown in Table 3. Y produced the backing process most frequently and a target sound was replaced by [h] in most instances (e.g., ‘scissors’ → [hɪhə], ‘shovel’→ [hɑbo]). Unusual cluster reduction (e.g., ‘plain’ → [leɪn]) and initial consonant deletion (e.g., ‘cup’→ [ʌp]) occurred with a range of zero to four instances per session.
The results of the data analysis are summarized as follows. First, Y’s phonetic inventory collected at each data point includes most of the sounds except for several later developing sounds (i.e., /v, z, tʃ, dʒ, θ, ð, r/). In addition, Y produced non-English speech sounds such as voiceless palatal and velar fricatives and bilabial fricatives, as well as some speech sounds present in Korean (i.e., /p*, t*, k*, tɕʰ/). Distorted sounds for alveolar and post-alveolar fricatives were also found. Secondly, the PCC was very low and remained at approximately 30%-40% from the data collected throughout the one-year period; PCC and PCC-R increased approximately 10% over a year. In contrast to the low PCC, the number of syllables that tended to be preserved was over 90% accurate relative to the target words. Thirdly, CR, FCD, and stopping occurred more frequently than other typical error patterns. In addition, Y produced a number of atypical error patterns, especially backing. And lastly, the follow-up data indicated a continuous decrease in final consonant deletion, as well as an increase of 13% and 15% in PCC and PCC-R, respectively, and a decrease in backing errors as well as no unusual cluster reduction and glottal replacement at follow-up 2.


The current study examined the acquisition of English speech sounds of a minimally verbal child with ASD who demonstrates severe speech impairment. The data from the current study suggests that an 11-year-old child with severe speech impairment secondary to ASD continued to develop English speech sounds over a one-year study period despite a significantly delayed onset of speech production. The results indicate a slow but continued development in speech sound production as shown in phonetic inventory and percent consonants correct at follow-up 1 and 2.
The child with ASD in the current study demonstrated developmental patterns of speech sounds that are different from those in typically developing children. The child produced most of the speech sounds except for several later developing sounds while demonstrating a very low PCC. Among the late developing sounds, the production of /v/ emerged at session 5 and showed stable production throughout. However, the production of /z/ and /dʒ/ emerged only at session 12. Additionally, the child’s preservation of syllables as measured by the PNSC (range, 89%-100%) was notable in comparison to his low consonant production accuracy. While English is a ‘stress-timed’ language, Korean is closer to a ‘syllable-timed’ language (Mok & Lee, 2008), in which each syllable is pronounced with approximately equal prominence. Even if his use of Korean was very limited in his daily life, the child may have been influenced by his mother’s native language when producing multisyllabic words in English. It was noted that the child occasionally produced the Korean high back unrounded vowel /ɨ/ for the unstressed syllable of multisyllabic English words (e.g., ‘pencils’ → [bɛhɨ]). The child also produced sibilant distortions (e.g., a palatalized alveolar fricative [sʲ] for English /ʃ/), as well as quite a few non-English speech sounds that may have been influenced by Korean (e.g., [t*] for English /t/ or /θ/ and [tɕʰ] for English /tʃ/).
Baron-Cohen and Staunton (1994) reported that children with autism tend to be more influenced by the phonology of their non-English-speaking mothers than by that of their English-speaking peers. The authors linked these children’s acquisition of their mothers’ non-English accent to the stronger social connection these children have with their mothers as compared to their peers.
The child not only exhibited speech sound production skills that were significantly delayed for his age, he also demonstrated atypical error patterns that are uncommon in typically developing children. The child exhibited frequent productions of final consonant deletion and cluster reduction, which are reported to be the most common typical error patterns found in children with ASD (Cleland et al., 2010). The atypical error patterns include the frequent occurrence of backing, as well as the occasional use of a glottal stop and occurrence of initial consonant deletion. The atypical backing of a sound to [h] that occurred very frequently in this child was not found in children with high-functioning autism in the study conducted by Cleland et al. (2010); in their study, the nondevelopmental error patterns included phoneme specific nasal emission, dentalization of sibilants, and backing of an alveolar stop to a velar stop.
While the child exhibited error patterns commonly observed in typically developing children (cluster reduction, final consonant deletion, stopping, and gliding), velar fronting was relatively rare. In addition, the child’s error patterns were considerably inconsistent across sessions (e.g., ‘lamp’ → [læp], [læmp], and [læf] in sessions 1-3) and across opportunities within a session (e.g., ‘jumping’ → [bʌnt*i] and [dʌmpi]). This inconsistency (or variability) of speech sound errors corresponds to one of the three types of error inconsistency defined by Betz and Stoel-Gammon (2005), namely inconsistent error patterns across multiple productions of the same word. His inconsistency was not based on word position (e.g., producing the target sound correctly in word-initial position, but incorrectly in word-final position) or on the lexical target (i.e., producing a target sound correctly in certain words, but incorrectly in other words). The inconsistent error patterns can perhaps be interpreted as deficits at either the phonological level or the motoric level. Betz and Stoel-Gammon (2005) proposed that either an ‘incomplete underlying representation with lack of sufficient detail’ or ‘inadequate articulatory abilities’ can cause inconsistent productions in typically developing children. In children with phonological disorders, inconsistent productions are attributed to a deficit in ‘phonological planning,’ which requires the child to engage in new planning each time a particular word is produced (Bradford & Dodd, 1996). Although the underlying causes of inconsistent productions in children with speech sound disorders are unknown, the inconsistent error patterns exhibited by the child in the current study may be explained by a deficit in the phonological system (an underdeveloped phonological representation with limited detail, coupled with reduced motoric practice as a result of being minimally verbal). This should be examined in systematically controlled future research.
The findings of the current study have significant clinical implications: they suggest that children with ASD who are minimally verbal can continue to develop speech sounds in spite of a severe impairment and a significantly delayed onset of speech sound production. The continuous development of speech sounds by the participant in the current study can be attributed to natural development and speech intervention. While the child had been receiving speech intervention focusing on general communication skills at school, he began to receive supplemental private speech-language services focusing on articulation therapy after the one-year study period to improve his verbal communication. Given that most intervention for children with ASD focuses on functional communication skills, it is possible that certain children with ASD who are minimally verbal can benefit from additional intervention focusing on speech sound production skills in order to maximize their communication effectiveness.
The current study has several limitations. First, in order to control the stimuli, the phonetic inventory was based only on productions of the target words from the GFTA-2. This data collection method may have missed additional speech sounds that the child produced in words outside the test. In spite of the child’s limited speech-language output, the addition of spontaneous speech data could give a more comprehensive view of his speech sound production ability. The current study did not include a spontaneous speech sample because the child’s daily speech was reported to be limited mostly to one-word productions, along with occasional simple two- or three-word sentences, which were most often taught in speech-language therapy. Second, an intelligibility measure of spontaneous speech could be added in order to understand the functionality of the child’s speech in a more natural context. Third, the speech samples came from only one child, and the results should be replicated with more participants. Lastly, although this study examined speech sound acquisition in a child with ASD who was exposed to two sound systems (i.e., Korean and English), it focused only on the acquisition of English speech sounds because of the child’s limited production of Korean. Future research should examine the acquisition of Korean and English speech sounds in bilingual children with ASD.

Figure 1.
Phonological analysis of percentage of number of syllables correct (PNSC), percentage of consonants correct (PCC), and percentage of consonants correct-revised (PCC-R).
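As a rough illustration of how two of the measures plotted in Figure 1 relate, the sketch below computes PCC and PCC-R from per-consonant scoring labels. It assumes the standard definitions (PCC counts only fully correct consonants; PCC-R additionally accepts distortions); the label names, the function name, and the ten-consonant sample are hypothetical and are not taken from the study data.

```python
# Hypothetical per-consonant scoring labels; the study's own scoring sheet is not shown here.
CORRECT, DISTORTION, SUBSTITUTION, OMISSION = (
    "correct", "distortion", "substitution", "omission"
)


def percent_consonants_correct(labels, revised=False):
    """Return PCC (or PCC-R when revised=True) as a percentage.

    labels: one scoring label per target consonant in the sample.
    PCC accepts only fully correct consonants; PCC-R also accepts distortions.
    """
    if not labels:
        raise ValueError("at least one target consonant is required")
    accepted = {CORRECT, DISTORTION} if revised else {CORRECT}
    n_accepted = sum(1 for label in labels if label in accepted)
    return 100.0 * n_accepted / len(labels)


# Invented sample of ten target consonants (illustration only, not study data).
sample = [CORRECT] * 4 + [DISTORTION] * 2 + [SUBSTITUTION] * 3 + [OMISSION]
pcc = percent_consonants_correct(sample)                  # 4 of 10 accepted
pcc_r = percent_consonants_correct(sample, revised=True)  # 6 of 10 accepted
```

Under these assumptions, a child who produces many distortions would show a PCC-R noticeably higher than the PCC for the same sample.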
Table 1.
Phonetic inventory by session
| Monthly/follow-up sessions | English speech sounds | Korean-influenced non-English sounds | Other non-English/distorted sounds |
|---|---|---|---|
| 1 | p, b, t, d, g, Ͱ, m, n, ŋ, w, Ͱ, h, f, v, s, Ͱ, Ͱ, tʃ, Ͱ, l, Ͱ, Ͱ, ʔ | t* | |
| 2 | p, b, t, d, g, k, m, n, ŋ, w, Ͱ, h, f, Ͱ, s, Ͱ, ʃ, tʃ, Ͱ, l, Ͱ, Ͱ, ʔ | p* | x |
| 3 | p, b, t, d, g, Ͱ, m, n, ŋ, w, j, h, f, Ͱ, s, Ͱ, ʃ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | t*, tɕh | ç, sj, ɸ, β |
| 4 | p, b, t, d, g, k, m, n, ŋ, w, Ͱ, h, f, Ͱ, s, Ͱ, ʃ, tʃ, Ͱ, l, r, Ͱ, ʔ | p*, t* | sj, ɸ |
| 5 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, Ͱ, ʃ, tʃ, Ͱ, l, Ͱ, Ͱ, ʔ | t*, p*, k*, tɕh | x, sj, sl |
| 6 | p, b, t, d, g, k, m, n, ŋ, w, Ͱ, h, f, v, s, Ͱ, ʃ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, t*, k*, tɕh | x, sj |
| 7 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, Ͱ, ʃ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, k*, tɕh | x, sj |
| 8 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, Ͱ, Ͱ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, t*, tɕh | ç, x, sj |
| 9 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, Ͱ, Ͱ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, t*, tɕh | ç, x, sj |
| 10 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, Ͱ, Ͱ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, t*, tɕh | ç, x |
| 11 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, Ͱ, s, Ͱ, ʃ, Ͱ, Ͱ, l, Ͱ, Ͱ, ʔ | p*, t*, tɕh | ç, x, sj |
| 12 | p, b, t, d, g, k, m, n, ŋ, w, j, h, f, v, s, z, ʃ, tʃ, dʒ, l, Ͱ, Ͱ, ʔ | p*, t*, k*, tɕh | ç, x, sj |
| Follow-up 1 | p, b, t, d, g, k, m, n, ŋ, w, Ͱ, h, f, v, s, Ͱ, ʃ, Ͱ, dʒ, l, r, Ͱ, ɾ | t*, tɕ, tɕh | sj, sl |
| Follow-up 2 | p, b, t, d, g, k, m, n, ŋ, w, Ͱ, h, f, v, s, z, ʃ, Ͱ, Ͱ, l, r, ð | t* | x, sj, sl |

Ͱ: Sound was not produced. The allophones /ʔ, ɾ/ were not considered, /θ/ was not produced in any session, and /ʒ/ was not targeted.

Table 2.
Frequency of occurrence of typical error patterns
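| Monthly/follow-up sessions | Cluster reduction | Final consonant deletion | Velar fronting | Postalveolar fronting | Stopping | Gliding | Vocalization | Derhotacization | Prevocalic voicing | Postvocalic devoicing |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 17 | 28 | 1 | 0 | 18 | 4 | 1 | 4 | 8 | 1 |
| 2 | 12 | 21 | 2 | 2 | 15 | 8 | 1 | 4 | 14 | 1 |
| 3 | 16 | 18 | 7 | 3 | 16 | 6 | 3 | 3 | 10 | 0 |
| 4 | 11 | 16 | 3 | 1 | 12 | 6 | 1 | 3 | 7 | 2 |
| 5 | 17 | 13 | 2 | 0 | 14 | 9 | 4 | 8 | 5 | 2 |
| 6 | 14 | 16 | 2 | 0 | 12 | 10 | 3 | 5 | 7 | 2 |
| 7 | 14 | 15 | 4 | 2 | 15 | 8 | 5 | 7 | 9 | 1 |
| 8 | 11 | 18 | 3 | 1 | 15 | 11 | 3 | 6 | 13 | 5 |
| 9 | 12 | 24 | 4 | 1 | 10 | 10 | 4 | 5 | 7 | 2 |
| 10 | 14 | 20 | 5 | 0 | 11 | 10 | 3 | 4 | 7 | 2 |
| 11 | 13 | 19 | 2 | 1 | 11 | 8 | 2 | 6 | 9 | 5 |
| 12 | 13 | 21 | 4 | 0 | 13 | 9 | 2 | 8 | 9 | 1 |
| Follow-up 1 | 14 | 13 | 3 | 3 | 12 | 7 | 2 | 9 | 7 | 6 |
| Follow-up 2 | 13 | 10 | 2 | 5 | 9 | 7 | 2 | 6 | 8 | 5 |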
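The reported decrease in final consonant deletion and stopping at follow-up can be checked directly against the counts in Table 2. The following minimal sketch compares the mean count over the 12 monthly sessions with each follow-up count. It uses raw frequencies only, since the number of opportunities per session is not given here; the variable and function names are illustrative.

```python
# Counts transcribed from Table 2: sessions 1-12, then follow-ups 1 and 2.
final_consonant_deletion = [28, 21, 18, 16, 13, 16, 15, 18, 24, 20, 19, 21, 13, 10]
stopping = [18, 15, 16, 12, 14, 12, 15, 15, 10, 11, 11, 13, 12, 9]


def summarize(counts):
    """Return (mean over the 12 monthly sessions, follow-up 1, follow-up 2)."""
    monthly = counts[:12]
    return sum(monthly) / len(monthly), counts[12], counts[13]


fcd_mean, fcd_fu1, fcd_fu2 = summarize(final_consonant_deletion)
stop_mean, stop_fu1, stop_fu2 = summarize(stopping)

print(f"Final consonant deletion: monthly mean {fcd_mean:.1f}, "
      f"follow-ups {fcd_fu1} and {fcd_fu2}")
print(f"Stopping: monthly mean {stop_mean:.1f}, "
      f"follow-ups {stop_fu1} and {stop_fu2}")
```

Both follow-up counts fall below the corresponding monthly mean for each pattern, which is consistent with the decrease described in the results.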
Table 3.
Frequency of occurrence of atypical error patterns
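| Monthly/follow-up sessions | Unusual cluster reduction | Initial consonant deletion | Backing^a | Glottal replacement | Denasalization | Sibilant distortions |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 14 (14) | 1 | 2 | 0 |
| 2 | 3 | 2 | 5 (4) | 3 | 1 | 0 |
| 3 | 1 | 1 | 9 (8) | 1 | 1 | 0 |
| 4 | 3 | 2 | 9 (7) | 2 | 1 | 2 |
| 5 | 3 | 4 | 7 (6) | 0 | 2 | 2 |
| 6 | 1 | 4 | 10 (6) | 1 | 1 | 1 |
| 7 | 2 | 3 | 7 (6) | 1 | 1 | 1 |
| 8 | 3 | 1 | 8 (8) | 1 | 1 | 2 |
| 9 | 2 | 2 | 11 (9) | 1 | 1 | 1 |
| 10 | 3 | 2 | 11 (10) | 3 | 2 | 0 |
| 11 | 3 | 2 | 8 (8) | 2 | 0 | 1 |
| 12 | 1 | 2 | 8 (7) | 2 | 1 | 0 |
| Follow-up 1 | 2 | 0 | 10 (9) | 0 | 2 | 3 |
| Follow-up 2 | 0 | 1 | 6 (3) | 0 | 0 | 2 |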

a The number of occurrences of backing of a target sound to [h] is given in parentheses.
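To make the prominence of backing to [h] concrete, the short sketch below computes, from the Table 3 counts, the share of all backing errors in which the target was backed to [h]. The session labels and variable names are illustrative, and the calculation is a simple ratio of the tabulated frequencies rather than an analysis reported in the study.

```python
# Counts transcribed from Table 3: sessions 1-12, then follow-ups 1 and 2.
backing_total = [14, 5, 9, 9, 7, 10, 7, 8, 11, 11, 8, 8, 10, 6]
backing_to_h = [14, 4, 8, 7, 6, 6, 6, 8, 9, 10, 8, 7, 9, 3]

labels = [str(i) for i in range(1, 13)] + ["Follow-up 1", "Follow-up 2"]

# Proportion of backing errors realized as [h], per session and overall.
share_by_session = {
    label: h / total
    for label, h, total in zip(labels, backing_to_h, backing_total)
}
overall_share = sum(backing_to_h) / sum(backing_total)

print(f"Overall: {sum(backing_to_h)} of {sum(backing_total)} backing errors "
      f"went to [h] ({overall_share:.0%})")
```

Across all 14 sessions, 105 of the 123 backing errors (about 85%) involved [h], while the share at the second follow-up was 3 of 6.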


American Psychiatric Association. (2013). Diagnostic and Statistical Manual of Mental Disorders: DSM-5. Washington, DC: Author.

Baron-Cohen, S., & Staunton, R. (1994). Do children with autism acquire the phonology of their peers? An examination of group identification through the window of bilingualism. First Language. 14, 241–248.

Bauminger-Zviely, N., Karin, E., Kimhi, Y., & Agam-Ben-Artzi, G. (2014). Spontaneous peer conversation in preschoolers with high-functioning autism spectrum disorder versus typical development. Journal of Child Psychology and Psychiatry. 55, 363–373.
Betz, SK., & Stoel-Gammon, C. (2005). Measuring articulatory error consistency in children with developmental apraxia of speech. Clinical Linguistics & Phonetics. 19, 53–66.
Bishop, DV., & Norbury, CF. (2002). Exploring the borderlands of autistic disorder and specific language impairment: a study using standardised diagnostic instruments. Journal of Child Psychology and Psychiatry. 43, 917–929.
Bradford, A., & Dodd, B. (1996). Do all speech-disordered children have motor deficits? Clinical Linguistics & Phonetics. 10, 77–101.
Cleland, J., Gibbon, FE., Peppé, SJ., O’Hare, A., & Rutherford, M. (2010). Phonetic and phonological errors in children with high functioning autism and Asperger syndrome. International Journal of Speech-Language Pathology. 12, 69–76.
Dunn, LM., & Dunn, DM. (2007). Peabody Picture Vocabulary Test (PPVT)-4. Minneapolis, MN: Pearson Assessments.

Goldman, R., & Fristoe, M. (2000). Goldman-Fristoe Test of Articulation-2. Circle Pines, MN: American Guidance Service.

Grunwell, P. (1982). Clinical phonology. Rockville, MD: Aspen Publishers.

Kasari, C., Brady, N., Lord, C., & Tager-Flusberg, H. (2013). Assessing the minimally verbal school-aged child with autism spectrum disorder. Autism Research. 6, 479–493.
Kaufman, AS., & Kaufman, NL. (2004). Kaufman Assessment Battery for Children-II. Minneapolis, MN: Pearson Assessments.

Kaufman, NR. (1995). Kaufman Speech Praxis Test for children. Detroit, MI: Wayne State University Press.

Kjelgaard, MM., & Tager-Flusberg, H. (2001). An investigation of language impairment in autism: implications for genetic subgroups. Language and Cognitive Processes. 16, 287–308.
McCleery, JP., Tully, L., Slevc, LR., & Schreibman, L. (2006). Consonant production patterns of young severely language-delayed children with autism. Journal of Communication Disorders. 39, 217–231.
Mok, P., & Lee, SI. (2008). Korean speech rhythm using rhythmic measures. In: Proceedings of the 18th International Congress of Linguists (CIL18); Seoul, Korea.

National Research Council. (2001). Educating children with autism. Washington, DC: National Academy Press.

Paul, R., Orlovski, SM., Marcinko, HC., & Volkmar, F. (2009). Conversational behaviors in youth with high-functioning ASD and Asperger syndrome. Journal of Autism and Developmental Disorders. 39, 115–125.
Pickett, E., Pullara, O., O’Grady, J., & Gordon, B. (2009). Speech acquisition in older nonverbal individuals with autism: a review of features, methods, and prognosis. Cognitive and Behavioral Neurology. 22, 1–21.
Rapin, I., & Dunn, M. (2003). Update on the language disorders of individuals on the autistic spectrum. Brain and Development. 25, 166–172.
Rutter, M., Bailey, A., & Lord, C. (2003). The social communication questionnaire: manual. Los Angeles, CA: Western Psychological Services.

Schoen, E., Paul, R., & Chawarska, K. (2011). Phonology and vocal behavior in toddlers with autism spectrum disorders. Autism Research. 4, 177–188.
Shriberg, LD., & Kwiatkowski, J. (1982). Phonological disorders III: a procedure for assessing severity of involvement. Journal of Speech and Hearing Disorders. 47, 256–270.
Shriberg, LD., Austin, D., Lewis, BA., McSweeny, JL., & Wilson, DL. (1997). The percentage of consonants correct (PCC) metric: extensions and reliability data. Journal of Speech, Language, and Hearing Research. 40, 708–722.
Smit, AB. (2007). General American English speech acquisition. In S. McLeod (Ed.), The international guide to speech acquisition. (pp. 128–147). Clifton Park, NY: Thomson Delmar Learning.

Stoel-Gammon, C., & Dunn, C. (1985). Normal and disordered phonology in children. Baltimore, MD: University Park Press.

Tager-Flusberg, H., Paul, R., & Lord, C. (2005). Language and communication in autism. In FR. Volkmar (Ed.), Handbook of autism and pervasive developmental disorders (3rd ed., pp. 335–364). Hoboken, NJ: John Wiley & Sons.

Tager-Flusberg, H., & Kasari, C. (2013). Minimally verbal school-aged children with autism spectrum disorder: the neglected end of the spectrum. Autism Research. 6, 468–478.
Templin, MC. (1957). Certain language skills in children: their development and interrelationships. Minneapolis, MN: University of Minnesota Press.

Williams, KT. (2007). Expressive Vocabulary Test-2. Minneapolis, MN: Pearson Assessments.

Wolk, L., & Giesen, J. (2000). A phonological investigation of four siblings with childhood autism. Journal of Communication Disorders. 33, 371–389.