1) Acoustics of /l/ versus /r/

These two sounds are both voiced liquids, which are sonorant semivowels.
Liquids are produced with smooth airflow through the oral cavity and quick
formant transitions. Because both sounds have vowel-like waveforms, it is
necessary to study their spectrograms to tell them apart. They have very
similar frequency characteristics for F1 and F2. /l/ has a low F1, due to oral
cavity constriction. Additionally, F2 and F3 of /l/ sit in the midrange and
appear to level out. The most apparent difference between the two liquids is
at F3, where /r/ shows an extremely steep drop. This is because of the tongue
constriction and position in the oral cavity.
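
The F3 cue described above can be sketched as a small rule. This is only an illustration: the function name, the 2000 Hz cutoff, and the example F3 tracks are assumptions made up for the sketch, not measurements from real speech.

```python
def classify_liquid(f3_track_hz, r_threshold_hz=2000.0):
    """Guess /r/ vs /l/ from F3 values (Hz) sampled across the consonant.

    r_threshold_hz is an illustrative cutoff (an assumption), not a
    measured or standard value.
    """
    if not f3_track_hz:
        raise ValueError("need at least one F3 measurement")
    # /r/ is marked by a steep F3 drop, so look at the lowest point reached.
    lowest_f3 = min(f3_track_hz)
    return "r" if lowest_f3 < r_threshold_hz else "l"


# Made-up F3 tracks: one dips steeply, the other stays level in the midrange.
print(classify_liquid([2500, 2200, 1800, 1650]))  # r
print(classify_liquid([2600, 2550, 2580, 2600]))  # l
```

A real analysis would take the F3 track from a formant tracker and would need a cutoff suited to the speaker.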





2) Categorical versus continuous

Categorical perception is a psychological phenomenon that describes what
happens when listeners try to discriminate sounds within the same phonemic
category. Consonant sounds are grouped as the same phoneme because they fall
in the same phonemic category, and the small acoustic variations between them
go unheard. It is harder for listeners to discriminate sounds that are in the
same category, but easier to differentiate between those that are perceived as
belonging to different categories. To picture it, imagine a listener hearing a
series of 8 sounds. The listener would perceive the first half of the sounds
as the same until a crossover point was reached, and would then hear the other
half of the sounds as the same. This shows that speech sounds are perceived as
the same until a boundary is reached. Categorical perception is predominantly
found for VOT and for place of articulation in stops and fricatives.
Continuous perception is the exact opposite: differentiation is not
“categorized” but rather “blurred”. When there is a change in the stimulus,
there is a corresponding perceptual change as well. Continuous perception is
characteristic of vowels. Vowels are considered continuous because we are more
sensitive to minor changes in the formants, and can therefore discriminate
between “ah” and “oo”. A visual that helps in understanding continuous
perception is a rainbow; you can distinguish all the different colors and the color
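
The eight-sound illustration of categorical perception can be sketched in code. This is a toy model under stated assumptions: the continuum is taken to be VOT in 10 ms steps, the labels are "b" and "p", and the 35 ms boundary is chosen only so that the eight steps split evenly. None of these values come from the text.

```python
def label_stimulus(vot_ms, boundary_ms=35.0):
    """Assign a category label; boundary_ms is an illustrative assumption."""
    return "b" if vot_ms < boundary_ms else "p"


def predicted_discriminable(vot_a_ms, vot_b_ms, boundary_ms=35.0):
    """Under strict categorical perception, two stimuli are told apart
    only when they receive different category labels."""
    return label_stimulus(vot_a_ms, boundary_ms) != label_stimulus(vot_b_ms, boundary_ms)


# Hypothetical 8-step continuum, 10 ms apart.
continuum_ms = [0, 10, 20, 30, 40, 50, 60, 70]
labels = [label_stimulus(v) for v in continuum_ms]
print(labels)  # ['b', 'b', 'b', 'b', 'p', 'p', 'p', 'p']

# Same 10 ms physical difference, different predicted outcome:
print(predicted_discriminable(20, 30))  # False (same category)
print(predicted_discriminable(30, 40))  # True (straddles the boundary)
```

Continuous perception would instead predict that every 10 ms step is equally easy to hear, with no special boundary.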

Essay Answer

4) The acoustics of speech sounds

Vowels are considered phonated, high-intensity sounds. Vowels can be
identified by the patterns of their formant frequencies. On a spectrogram, the
formants appear as wide, dark horizontal stripes. Front vowels have a large
gap between F1 and F2, while back vowels have a small gap between F1 and F2.
Diphthongs are made up of two vowels that form one syllable. These sounds
have formant frequencies that change rapidly in direction and extent. Each
diphthong has a unique F1-F2 pattern. Glides are voiced semivowels with
rapidly changing formant frequencies. Glides have formant frequencies similar
to those of their corresponding vowels: /w/ resembles /u/ and /j/ resembles
/i/ (“ee”). Diphthongs have a steady-state portion in their formants, whereas
glides do not: their formant transitions are quicker than those of
diphthongs, which makes them more consonant-like. All diphthongs go from low
to high tongue height. Glides look like very short vowels (very short in
duration).

Vowels show a clear steady interval of formant patterns, but glides do not.
Even so, the two can sometimes be hard to tell apart.
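
The front/back rule based on the gap between F1 and F2 can be sketched as follows. The function name and the 900 Hz gap threshold are assumptions for illustration, and the formant values in the usage lines are rough, illustrative figures for an /i/-like and a /u/-like vowel, not measurements.

```python
def vowel_backness(f1_hz, f2_hz, gap_threshold_hz=900.0):
    """Label a vowel 'front' or 'back' from the F2 - F1 gap.

    gap_threshold_hz is an illustrative cutoff (an assumption).
    """
    if f2_hz <= f1_hz:
        raise ValueError("F2 is expected to be higher than F1")
    gap_hz = f2_hz - f1_hz
    # Large gap: F1 and F2 far apart (front). Small gap: close together (back).
    return "front" if gap_hz > gap_threshold_hz else "back"


print(vowel_backness(270, 2290))  # front (wide F1-F2 gap, /i/-like)
print(vowel_backness(300, 870))   # back (narrow F1-F2 gap, /u/-like)
```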

Obstruents (stops, fricatives, and affricates) are sounds that are made by
obstructing the airflow through the vocal tract. This leads to aperiodic noise
production. Sonorants (nasals, glides, and liquids) are sounds that are made
with a less constricted vocal tract and periodic vocal fold vibration.

Stops have a complete obstruction followed by a brief release of turbulent
noise. They show a period of silence and rapid changes in the formant
frequencies of neighboring vowels. Affricates have a greater duration of
aperiodic noise than stops. There is a silent closure interval with a
transient release burst. Fricatives have an extended duration of aperiodic
noise caused by forcing the airflow through a narrow constriction.
Additionally, fricative noise covers a broad range of frequencies. It is
continuous and looks like static.

Affricates look like a mix of fricatives and stops. Like stops, they have a
burst, and they look like a stop immediately followed by a fricative. The
period of frication can be used to differentiate between stops and
affricates: stops have a short duration of noise, while affricates have a
longer one.
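
The cues above (silent closure, then short or long noise) can be put together as a simple decision rule. This is a sketch only: the function name, its parameters, and the 50 ms cutoff between "short" and "long" noise are assumptions made for illustration.

```python
def classify_obstruent(has_silent_closure, noise_ms, long_noise_ms=50.0):
    """Rough stop / affricate / fricative decision from two spectrogram cues.

    has_silent_closure: True if a silent gap precedes the noise.
    noise_ms: duration of the aperiodic noise in milliseconds.
    long_noise_ms: illustrative cutoff (an assumption), not a standard value.
    """
    if noise_ms < 0:
        raise ValueError("noise duration cannot be negative")
    if not has_silent_closure:
        # Continuous noise with no closure interval.
        return "fricative"
    # Closure plus noise: the duration of the frication separates the two.
    return "affricate" if noise_ms >= long_noise_ms else "stop"


print(classify_obstruent(True, 15))    # stop
print(classify_obstruent(True, 90))    # affricate
print(classify_obstruent(False, 120))  # fricative
```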

Nasals have three acoustic properties: antiresonances (antiformants), nasal
murmur (low-frequency activity), and nasalized vowels. Antiformants are
regions where acoustic energy is dampened because it is absorbed in the nasal
cavity; on a spectrogram this looks like a bubble that has been grayed out.
When differentiating between nasals and vowels, look for the higher intensity
of the vowels’ formants (the nasal formant is low in frequency, 250-300 Hz)
and for the antiformants and murmur of the nasals. Liquids, /l/ and /r/, will
have a lower F3 frequency, especially /r/ because it takes a


5) Assimilation versus coarticulation

Coarticulation is when the individual phonemes actually overlap one another
because of timing and prosody. The articulation of one sound influences the
articulation of other sounds in the same utterance. However, the sound itself
stays the same phoneme. Coarticulation is important for the perception of
sounds like stops, which have weak internal acoustic cues. There are two forms
of coarticulation: anticipatory and carryover. Anticipatory (right to left)
coarticulation is where a speech sound is affected by the following phoneme,
as in “spoon”. In anticipation of the vowel /u/, the lips begin to round at
the /p/. Carryover (left to right) coarticulation is where a speech sound is
affected by an earlier sound, as in the word “please”.

You will be able to identify this by the pattern of the first and second
formant transitions, which depends on the context. Additionally, spectral
peaks will be at slightly different frequency locations because of the
influence of the vowel. It is important to keep in mind that, in the end,
individual phonemes are hard to identify on a spectrogram under coarticulation
because the information is spread across more than one phoneme.

Assimilation is sometimes considered a type of coarticulation. It occurs when
a phoneme changes into a different phoneme because of a sound near it. This
differs from coarticulation, where the phoneme stays the same. Additionally,
in assimilation the same articulator gesture is needed for both sounds. In the
phrase “I miss you”, the tongue tip and blade are needed in a specific place
for the /s/ and the /j/. Assimilation then occurs, changing the /s/ to a “sh”
sound. Another example is when /t/ is followed by /j/ and sounds like “ch”
(“nice to meet you”: /t/ + /j/ = “ch” (T+Y)). The sounds change on a
spectrogram, and it looks as if another phoneme is being used.









6) Prosody

Prosodic features of speech, also called suprasegmentals, are used to help
segment, highlight, and provide cues to the listener by varying and
contrasting pitch, loudness, and length. When examining prosody, waveforms and
spectrograms are useful because of the need to examine aspects such as
frequency, formant pattern, and pitch. Three features of prosody are
intonation, stress, and juncture.

Intonation is when speakers vary their F0 levels in order to signal what type
of utterance is being used (declarative, question, exclamatory). These
variations are called the pitch contour. For example, at the end of a
declarative statement, the pitch contour will fall. At the end of a question,
the pitch will rise. If the question is a simple yes or no, there will be a
rising pattern with a higher fundamental frequency. Fundamental frequency
tracking marks the pitch contour, so when looking at pitch contours you are
also looking at intonation patterns. These rising and falling (declination)
patterns in the F0 contour can signal what type of sentence it is.
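
Reading a rising or falling pattern off an F0 track can be sketched with a straight-line fit over the final stretch of the utterance. The function names and the example F0 values are made up for illustration; a real pitch tracker would supply the F0 samples, and real contours are rarely this clean.

```python
def f0_slope(times_s, f0_hz):
    """Least-squares slope (Hz per second) of an F0 track."""
    n = len(f0_hz)
    if n < 2 or n != len(times_s):
        raise ValueError("need two or more matching time/F0 samples")
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    numerator = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f0_hz))
    denominator = sum((t - mean_t) ** 2 for t in times_s)
    if denominator == 0:
        raise ValueError("time samples must not all be identical")
    return numerator / denominator


def contour_type(times_s, f0_hz):
    """'rising' suggests a yes/no question, 'falling' a declarative."""
    return "rising" if f0_slope(times_s, f0_hz) > 0 else "falling"


# Hypothetical utterance-final F0 samples.
times = [0.0, 0.1, 0.2, 0.3]
print(contour_type(times, [180, 170, 150, 120]))  # falling
print(contour_type(times, [150, 160, 185, 220]))  # rising
```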

Stress also helps indicate whether a sentence is declarative or interrogative
by placing emphasis on specific words or segments, which changes the meaning.
Producing stress requires an increase in subglottal pressure, and therefore
involves increased intensity (loudness), duration, and fundamental frequency
(F0).

Fundamental frequency is the best marker of stress. Other markers include
greater intensity and darkness on stressed segments.

Juncture is when the combination of stress and duration can lead to a change
in the meaning of a word (it concerns the relationship between sounds that
come right after each other). It is a way to break up syllables in order to
mark the difference between stretches of speech that sound similar. An example
is “ice cream” and “I scream”. These two sequences contain the same sounds but
have different meanings. Where you put the pause (after “ice” or after “I”)
changes the meaning. Acoustic cues for juncture include silence, vowel
lengthening, and the absence or presence of voicing and aspiration. When
trying to identify juncture on a waveform or spectrogram, you need to find the
beginning and end points of the