The Acoustics Of Shouting

class: center, middle, inverse, title-slide

# The Acoustics Of Shouting
## A Case Study of English Vowels
### Dine Mamadou
### Rutgers University | Data Sci 4 Ling | 4/24/18

---

#Research Question
###General question

## Do the rise in amplitude and the distortion of the vocal tract (caused by yelling) translate into acoustic changes in the vowel?

###More specifically...
--

## Is there a difference in vowel quality between normally uttered vowels and shouted ones? 
---
#Theoretical background
##According to the Source-Filter theory, speech sounds are distinguished on the basis of both the source and filter properties of the vocal tract (Maddieson 1984, Diehl 2008)
--

##In the present case, this theory predicts that the nature of the source (the glottis, for vowels) and the shape of the filter (oral cavity) will determine the quality of the vowel sounds
--

##Vowels spoken normally are thus expected to have slightly different formant values than those in shouted utterances
---

#Theoretical background/Hypotheses
##Huber et al. (1999) report that "higher vocal intensities are typically produced with an increased jaw opening size with a co-occurring decreased tongue height, thus an increased F1"
--

#Hypotheses:

###(1) F1 will increase as intensity increases; that is, vowels will be lower (more open) when shouted than when uttered normally
--

###(2) Vowel duration will be a good predictor of intensity
---

#Experimental design/participants

##- 2 speakers (1 male and 1 female) from Iowa and Illinois, respectively.
##- 16 different target words containing 10 English monophthongs and 6 diphthongs in /hVd/ contexts (adapted from Yoon et al. 2012)
##- 2 repetitions in each of 2 conditions, **normal** and **shouted**.
---

#Experiment/participants

##- In the **shouted** condition, participants were instructed to shout the target words (in isolation) as though they were addressing someone about 100 meters (330 feet) away.
##- Total of **128 target tokens**
##- F1, F2, Intensity and vowel duration were measured 
##- Only the monophthongs are reported here
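
The token total follows directly from the design. A quick check in Python (the variable names are illustrative and not taken from the original analysis):

```python
# Design factors as stated on the slides.
speakers = 2      # 1 male, 1 female
words = 16        # 10 monophthongs + 6 diphthongs in /hVd/ frames
repetitions = 2
conditions = 2    # normal, shouted

total_tokens = speakers * words * repetitions * conditions
print(total_tokens)  # 128
```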
---

#Descriptive Results

.pull-left[
###The results summary is below:
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Condition </th>
   <th style="text-align:right;"> Mean F1 (Hz) </th>
   <th style="text-align:right;"> Mean intensity </th>
   <th style="text-align:right;"> Mean duration </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Normal </td>
   <td style="text-align:right;"> 598.47 </td>
   <td style="text-align:right;"> 75.46 </td>
   <td style="text-align:right;"> 0.27 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Shouted </td>
   <td style="text-align:right;"> 740.90 </td>
   <td style="text-align:right;"> 80.06 </td>
   <td style="text-align:right;"> 0.43 </td>
  </tr>
</tbody>
</table>

<img src="index_files/figure-html/vowel plot-1.png" width="1008" />
]
--

.pull-right[

* In the shouted condition, vowel F1 is on average *142 Hz* higher than in the normal condition

* Shouted vowels are on average about 1.6 times as long as non-shouted ones (0.43 vs. 0.27). Their intensity is slightly higher too

* The plot further shows that shouted vowels have a higher F1, i.e. they sit lower in the vowel space

* The most extreme high and low vowels are those produced by the female speaker, as expected
]
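
The differences between conditions can be recomputed from the summary table. The short Python sketch below only uses the four-column means shown there; the dictionary keys are illustrative names, and no units are assumed beyond Hz for F1.

```python
# Condition means copied from the summary table.
normal = {"f1": 598.47, "intensity": 75.46, "duration": 0.27}
shouted = {"f1": 740.90, "intensity": 80.06, "duration": 0.43}

f1_diff = shouted["f1"] - normal["f1"]
intensity_diff = shouted["intensity"] - normal["intensity"]
duration_ratio = shouted["duration"] / normal["duration"]

print(f"F1 difference: {f1_diff:.2f} Hz")
print(f"Intensity difference: {intensity_diff:.2f}")
print(f"Duration ratio: {duration_ratio:.2f}")
```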
---

#Descriptive Results
<img src="index_files/figure-html/int_dur plot-1.png" width="1008" />

---

# Statistical analysis: The Models
###- Given our working hypothesis, F1 was set as the **criterion** and intensity, vowel duration, condition and gender were **the predictors**
--

###- A generalized linear model was fitted using the **Gaussian** distribution family with **Identity** as the link. Causal priority was given to intensity
--

###- Main effects and interactions were assessed using nested model comparisons with an alpha level of 0.05
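
As a sketch of how such a comparison works for a Gaussian model: the drop in deviance between two nested models is divided by a dispersion estimate and referred to a chi-square distribution whose degrees of freedom equal the number of added parameters. The Python below assumes the dispersion is the residual deviance of the largest model divided by its residual degrees of freedom. Under that assumption it closely reproduces the p-values in the Analysis of Deviance Table on the stats details slide. `chi2_sf` is a helper written for this illustration, not a library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc term plus a finite sum with half-integer gamma values.
    terms = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms

# Values copied from the Analysis of Deviance Table.
resid_dev_full, resid_df_full = 2135103, 64   # Model 4 (largest model)
dispersion = resid_dev_full / resid_df_full   # assumed scale estimate

p_model2 = chi2_sf(846308 / dispersion, 3)    # Model 2 vs. Model 1
p_model4 = chi2_sf(646984 / dispersion, 4)    # Model 4 vs. Model 3
print(f"{p_model2:.3e}  {p_model4:.3e}")
```

Models 2 and 3 have the same residual degrees of freedom, so they are not nested in each other and the table reports no test for that row.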
---

#The Models
## There was a main effect of intensity on F1; that is, overall, intensity is a better predictor of F1 than any other single variable
--

## As far as interactions are concerned, there is a two-way interaction between intensity and gender, and a three-way interaction between intensity, duration and gender
--

## The three-way interaction in this case means that the predictive power of intensity and duration is modulated by the gender of the speaker
---

#Results Interpretation & Conclusion
## Based on these statistics, we fail to reject our first hypothesis; the results further suggest that duration, intensity and gender jointly predict vowel height.
--

## This also validates our second hypothesis
--

## It is to be noted, however, that the male participant consistently used intensity as a cue for shouting/loudness while the female participant relied on vowel duration
---

#Results Interpretation & Conclusion
## While this seems to suggest a gender divide in the type of cues used in shouting, it also makes predictions for articulatory-based speech models that future research will help shed light on
---

#Stats details
.pull-left[

```
## Analysis of Deviance Table
## 
## Model 1: F1_midpoint ~ 1
## Model 2: F1_midpoint ~ int_c * gender
## Model 3: F1_midpoint ~ int_c * dur_c
## Model 4: F1_midpoint ~ int_c * dur_c * gender
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1        71    3357984                          
## 2        68    2511676  3   846308 1.293e-05 ***
## 3        68    2782086  0  -270411              
## 4        64    2135103  4   646984 0.0006577 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]

.pull-right[

```
##                estimate     std.Err  statistics     p-value
## (Intercept)  6.6289e+02  2.4750e+01  2.6780e+01  2.0000e-16
## int_c       -3.4000e+00  1.0560e+01 -3.2000e-01  7.5000e-01
## dur_c        4.5560e+02  2.2272e+02  2.0500e+00  4.0000e-02
## int_c:dur_c -1.2795e+02  1.2529e+02 -1.0200e+00  3.1000e-01
```
]
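
The coefficient table can be read as a prediction equation. Its terms match Model 3 (`F1_midpoint ~ int_c * dur_c`). The Python sketch below assumes that the `_c` suffix marks mean-centered predictors, so the intercept is the predicted F1 at average intensity and duration. The function names are illustrative.

```python
# Coefficients copied from the estimate column of the table.
INTERCEPT = 662.89
B_INT = -3.40
B_DUR = 455.60
B_INT_DUR = -127.95

def predict_f1(int_c, dur_c):
    """Predicted F1 for centered intensity and centered duration (assumed)."""
    return INTERCEPT + B_INT * int_c + B_DUR * dur_c + B_INT_DUR * int_c * dur_c

def duration_slope(int_c):
    """Change in predicted F1 per unit of dur_c at a given int_c."""
    return B_DUR + B_INT_DUR * int_c

print(predict_f1(0.0, 0.0))
print(duration_slope(1.0))
```

The second function shows what an interaction term does: the effect of duration on F1 is not a single number but depends on the intensity value at which it is evaluated.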