# (PART) Appendices {-}
# Use cases {#app-chap:useCases}
To add to the tutorial of Chapter \@ref(chap:tutorial), this chapter will present a few short use cases extracted and updated from the `emuR_intro` vignette. These use cases are meant as practical guides to answering research questions and are to be viewed as generic template procedures that can be altered and applied to similar research questions. They are meant to give practical examples of what it is like working with the EMU-SDMS to answer research questions common in speech and spoken language research. Every use case will start off by asking a question about the *ae* demo database and will continue by walking through the process of answering this question by using the mechanics the `emuR` package provides. The four questions this chapter will address are:
- Section \@ref(sec:app-chap-useCases-q1): What is the average length of all *n* phonetic segments in the *ae* `emuDB`?
- Section \@ref(sec:app-chap-useCases-q2): What does the F1 and F2 distribution of all phonetic segments that contain the labels *I*, *o:*, *u:*, *V* or *\@* look like?
- Section \@ref(sec:app-chap-useCases-q3): What words do the phonetic segments that carry the labels *s*, *z*, *S* or *Z* in the *ae* `emuDB` occur in and what is their phonetic context?
- Section \@ref(sec:app-chap-useCases-q4): Do the phonetic segments that carry the labels *s*, *z*, *S* or *Z* in the *ae* `emuDB` differ with respect to their first spectral moment?
The R code snippet below shows how the `emuR` demo data used in this chapter is created.
```{r results='hide', message=FALSE}
# load the package
library(emuR)
# create demo data in directory provided by the tempdir() function
create_emuRdemoData(dir = tempdir())
# get the path to emuDB called 'ae' that is part of the demo data
path2directory = file.path(tempdir(), "emuR_demoData", "ae_emuDB")
# load emuDB into current R session
ae = load_emuDB(path2directory)
```
## What is the average length of all *n* phonetic segments in the *ae* `emuDB`? {#sec:app-chap-useCases-q1}
The first thing that has to be done to address this fairly simple question is to query the database for all *n* segments. This can be achieved using the `query()` function as shown in the R code snippet below.
```{r}
# query segments
sl = query(ae, query = "Phonetic == n")
# show first row of sl
head(sl, n = 1)
```
The second argument of the `query()` function contains a string that represents an EQL statement. This fairly simple EQL statement consists of the level name *Phonetic* on the left-hand side, `==`, which is the equality operator of the EQL, and on the right-hand side of the operator the label *n* that we are looking for.
The `query()` function returns an object of the class `emuRsegs` that is a subclass of the well-known `data.frame`. The various columns of this object should be fairly self-explanatory: `labels` displays the extracted labels, `start` and `end` are the start and end times in milliseconds of each segment, and so on. We can now use the information in this object to calculate the mean duration of these segments as shown in the R code snippet below.
```{r}
# calculate durations
d = sl$end - sl$start
# calculate mean
mean(d)
```
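As a minimal sketch of the same computation on made-up data, the snippet below builds a small stand-in for a segment list and derives the durations together with a few summary statistics in base R. The `toySegs` data frame and all of its values are invented for illustration and are not taken from the *ae* `emuDB`.

```{r}
# hypothetical stand-in for a segment list; all values are invented
toySegs = data.frame(labels = c("n", "n", "n", "n"),
                     start = c(100, 450, 900, 1300),
                     end = c(160, 530, 950, 1390))
# durations in milliseconds (end time minus start time)
toyDur = toySegs$end - toySegs$start
# mean, median and standard deviation of the durations
c(mean = mean(toyDur), median = median(toyDur), sd = sd(toyDur))
```

Reporting the median and standard deviation alongside the mean gives a quick impression of how strongly the average is influenced by a few unusually long or short segments.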
## What does the F1 and F2 distribution of all phonetic segments that contain the labels *I*, *o:*, *u:*, *V* or *\@* look like? {#sec:app-chap-useCases-q2}
Once again we will initially query the `emuDB` to retrieve the segments we are interested in as shown in the R code snippet below.
```{r}
# query emuDB
sl = query(ae, query = "Phonetic == I|o:|u:|V|@")
```
Now that the necessary segment information has been extracted, the `get_trackdata()` function can be used to calculate the formant values for these segments as displayed in the R code snippet below.
```{r message=FALSE, results='hide'}
# get formant values for these segments
td = get_trackdata(ae,
                   sl,
                   onTheFlyFunctionName = "forest")
```
In this example, the `get_trackdata()` function uses a formant estimation function called `forest()` to calculate the formant values in real time. This signal processing function is part of the `wrassp` package, which is used by the `emuR` package to perform signal processing duties with the `get_trackdata()` command (see Chapter \@ref(chap:wrassp) for details).
If the `resultType` parameter is not set in the call to `get_trackdata()`, an object of the class `tibble` is returned. The class vector of the `td` object is displayed in the R code snippet below.
```{r}
# show class vector of td
class(td)
# show td
td
```
As the `tibble` class is a subclass of the common `data.frame` class, packages like `ggplot2` can be used to visualize our F1 and F2 distribution as shown in the R code snippet below (see Figure \@ref(fig:usecases-uc2plot) for the resulting plot).
```{r eval=FALSE}
# load package
library(ggplot2)
# scatter plot of F1 and F2 values using ggplot
ggplot(td, aes(x = T2, y = T1, label = labels)) +
  geom_text(aes(colour = factor(labels))) +
  scale_y_reverse() + scale_x_reverse() +
  labs(x = "F2(Hz)", y = "F1(Hz)") +
  guides(colour = "none")
```
```{r usecases-uc2plot, fig.cap = "F1 by F2 distribution for *I*, *o:*, *u:*, *V* and *@*.", echo=FALSE, fig.width=5, fig.height=3, fig.align="center"}
# load package
library(ggplot2)
# scatter plot of F1 and F2 values using ggplot
ggplot(td, aes(x = T2, y = T1, label = labels)) +
  geom_text(aes(colour = factor(labels))) +
  scale_y_reverse() + scale_x_reverse() +
  labs(x = "F2(Hz)", y = "F1(Hz)") +
  guides(colour = "none")
```
## What words do the phonetic segments that carry the labels *s*, *z*, *S* or *Z* in the *ae* `emuDB` occur in and what is their phonetic context? {#sec:app-chap-useCases-q3}
As with the previous use cases, the initial step is to query the database to extract the relevant segments as shown in the R code snippet below.
```{r}
# query segments
sibil = query(ae, "Phonetic==s|z|S|Z")
# show sibil
sibil
```
The `requery_hier()` function can now be used to perform a hierarchical requery using the set resulting from the initial query. This requery follows the hierarchical links of the annotations in the database to find the linked annotation items on a different level. The R code snippet below shows how this can be achieved.
```{r}
# perform requery
words = requery_hier(ae, sibil, level = "Word")
# show words
words
```
As seen in the above R code snippet, the result is not quite what one would expect as it does not contain the orthographic word transcriptions but a classification of the words into content words (*C*) and function words (*F*). Calling the `summary()` function on the `emuDBhandle` object `ae` would show that the *Word* level has multiple attribute definitions, indicating that each annotation item in the *Word* level has multiple parallel labels defined for it. The R code snippet below shows an additional requery that queries the *Text* attribute definition instead.
```{r}
# perform requery
words = requery_hier(ae, sibil, level = "Text")
# show words
words
```
As seen in the above R code snippet, the first segment in `sibil` occurred in the word *amongst*, which starts at 187.475 ms and ends at 674.225 ms. It is worth noting that this two-step querying procedure (`query()` followed by `requery_hier()`) can also be completed in a single hierarchical query using the dominance operator (`^`).
As we have answered the first part of the question, the R code snippet below will extract the context to the left of the extracted sibilants by using the `requery_seq()` function.
```{r}
# get left context by offsetting the
# annotation items in sibil one unit to the left
leftContext = requery_seq(ae, sibil, offset = -1)
# show leftContext
leftContext
```
The R code snippet below attempts to extract the right context in the same manner as the above R code snippet, but in this case we encounter a problem.
```{r error=TRUE}
# get right context by offsetting the
# annotation items in sibil one unit to the right
rightContext = requery_seq(ae, sibil, offset = 1)
```
As can be seen by the error message in the above R code snippet, four of the sibilants occur at the very end of the recording and therefore have no phonetic post-context. The remaining post-contexts can be retrieved by setting the `ignoreOutOfBounds` argument to `TRUE` as displayed in the R code snippet below.
```{r}
rightContext = requery_seq(ae, sibil,
                           offset = 1,
                           ignoreOutOfBounds = TRUE)
# show rightContext
rightContext
```
However, the resulting `rightContext` contains rows that only contain `NA` values. This indicates that no values were found for the corresponding row in the `sibil` segment list.
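One way to deal with such `NA` rows is to drop them together with the corresponding sibilants before pairing the two sets. The snippet below is a minimal base R sketch on invented data: `toySibil` and `toyRight` are hypothetical stand-ins for `sibil` and `rightContext`, and the sketch relies on the row-by-row correspondence between the two objects described above.

```{r}
# hypothetical stand-ins; labels and times are invented for illustration
toySibil = data.frame(labels = c("s", "z", "S", "s"))
toyRight = data.frame(labels = c("t", NA, "i:", NA),
                      start = c(510, NA, 1220, NA),
                      end = c(560, NA, 1300, NA))
# rows of toyRight that hold an actual post-context
hasContext = !is.na(toyRight$labels)
# pair each remaining sibilant with the label that follows it
toyPairs = data.frame(sibilant = toySibil$labels[hasContext],
                      rightContext = toyRight$labels[hasContext])
toyPairs
```

Filtering both objects with the same logical index keeps them aligned, so each remaining row of `toyPairs` still describes one sibilant and its right neighbor.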
## Do the phonetic segments labeled *s*, *z*, *S* or *Z* in the *ae* `emuDB` differ with respect to their first spectral moment?\protect\footnote{The original version of this use case was written by Florian Schiel as part of the `emuR_intro` vignette that is part of the `emuR` package.} {#sec:app-chap-useCases-q4}
### NOTE: See \@ref(recipe:spectralAnalysis) for more up-to-date methods of performing spectral analysis
Once again, the segments of interest are queried first. The R code snippet below shows how this can be achieved, this time using the new regular expression operator of the EQL (see Chapter \@ref(chap:querysys) for details).
```{r}
sibil = query(ae,"Phonetic =~ '[szSZ]'")
```
The R code snippet below shows how the `get_trackdata()` function can be used to calculate the Discrete Fourier Transform values for the extracted segments.
```{r results='hide', message=FALSE}
dftTd = get_trackdata(ae,
                      seglist = sibil,
                      onTheFlyFunctionName = 'dftSpectrum',
                      resultType = "trackdata")
```
As the `resultType` parameter was explicitly set to `"trackdata"`, an object of the class `trackdata` is returned. This object, just like an object of the class `emuRtrackdata`, contains the extracted trackdata information. Compared to the `emuRtrackdata` class, however, the object is not "flat" and in the form of a `data.table` or `data.frame` but has a more nested structure (see `?trackdata` for more details).
Since we want to analyze sibilant spectral data we will now reduce the spectral range of the data to 1000 to 10000 Hz. This is because there is a lot of unwanted noise in the lower bands that is irrelevant for the problem at hand and can even skew the end results. To achieve this we can use a property of a `trackdata` object that also carries the class `spectral`, which means that it is indexed using frequencies. The R code snippet below shows how to use this feature to extract the relevant spectral frequencies of the `trackdata` object.
```{r}
dftTdRelFreq = dftTd[, 1000:10000]
```
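To make the idea of frequency-based indexing more concrete, the snippet below mimics it in base R with a logical index over a vector of bin frequencies. This is only a sketch on invented data and not how the `spectral` class is implemented: `toyFreqs`, `toySpec` and the dB values are hypothetical.

```{r}
# hypothetical spectra: bins from 0 to 10 kHz in 500 Hz steps,
# two frames (rows) with invented dB values
toyFreqs = seq(0, 10000, by = 500)
toySpec = matrix(c(60 - toyFreqs / 500, 40 + toyFreqs / 1000),
                 nrow = 2, byrow = TRUE)
# logical index of the frequency bins inside the band of interest
inBand = toyFreqs >= 1000 & toyFreqs <= 10000
# keep all frames (rows) but only the in-band bins (columns)
toySpecBand = toySpec[, inBand, drop = FALSE]
dim(toySpecBand)
```

The bins at 0 Hz and 500 Hz are dropped while every frame is kept, which is the same kind of reduction that the frequency index performs on the `trackdata` object.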
The R code snippet below shows how the `fapply()` function can be used to apply the `moments()` function to all elements of `dftTdRelFreq`.
```{r}
dftTdRelFreqMom = fapply(dftTdRelFreq, moments, minval = TRUE)
```
The resulting `dftTdRelFreqMom` object is once again a `trackdata` object of the same length as the `dftTdRelFreq` `trackdata` object. It contains the first four spectral moments as shown in the R code snippet below.
```{r}
# show first row of data belonging
# to first element of dftTdRelFreqMom
dftTdRelFreqMom[1]$data[1,]
```
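For readers unfamiliar with spectral moments, the snippet below sketches the textbook definitions of the first two moments on a single invented spectral slice. It is not the code of `moments()`: the vectors `toyHz` and `toyDb` are hypothetical, and the shift to a minimum of zero is an assumption about what `minval = TRUE` is meant to achieve for dB values (see `?moments`).

```{r}
# hypothetical spectral slice: frequencies in Hz and invented dB levels
toyHz = c(1000, 2000, 3000, 4000, 5000)
toyDb = c(10, 20, 40, 20, 10)
# shift the levels so that the minimum is zero and all weights are
# non-negative (assumed to mirror the purpose of minval = TRUE)
toyW = toyDb - min(toyDb)
# first moment: weighted mean frequency (center of gravity)
toyM1 = sum(toyHz * toyW) / sum(toyW)
# second moment: weighted variance around the first moment
toyM2 = sum(toyW * (toyHz - toyM1)^2) / sum(toyW)
c(m1 = toyM1, m2 = toyM2)
```

Because the invented spectrum is symmetric around 3000 Hz, its center of gravity falls exactly on that frequency, which makes the result easy to check by hand.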
The information stored in the `dftTdRelFreqMom` and `sibil` objects can now be used to plot a time-normalized version of the first spectral moment trajectories, color coded by sibilant class, using `emuR`'s `dplot()` function. The R code snippet below produces Figure \@ref(fig:usecases-uc4dplot1).
```{r eval=FALSE, fig.align="center"}
dplot(dftTdRelFreqMom[, 1],
      sibil$labels,
      normalise = TRUE,
      xlab = "Normalized Time [%]",
      ylab = "1st spectral moment [Hz]")
```
```{r usecases-uc4dplot1, fig.cap = "Time-normalized first spectral moment trajectories color coded by sibilant class.", echo=FALSE, fig.width=6, fig.height=5, fig.align="center"}
dplot(dftTdRelFreqMom[, 1],
      sibil$labels,
      normalise = TRUE,
      xlab = "Normalized Time [%]",
      ylab = "1st spectral moment [Hz]",
      ylim = c(4500, 8000))
```
As one might expect, the first spectral moment (the center of gravity) is clearly lower for postalveolar *S* and *Z* (green and blue lines) than for alveolar *s* and *z* (black and red lines).
The R code snippet below shows how to create an alternative plot (see Figure \@ref(fig:usecases-uc4dplot2)) that averages the trajectories into ensemble averages per sibilant class by setting the `average` parameter of `dplot()` to `TRUE`.
```{r eval=FALSE}
dplot(dftTdRelFreqMom[, 1],
      sibil$labels,
      normalise = TRUE,
      average = TRUE,
      xlab = "Normalized Time [%]",
      ylab = "1st spectral moment [Hz]")
```
```{r usecases-uc4dplot2, fig.cap = "Time-normalized first spectral moment ensemble average trajectories per sibilant class.", echo=FALSE, fig.width=6, fig.height=5, fig.align="center"}
dplot(dftTdRelFreqMom[, 1],
      sibil$labels,
      normalise = TRUE,
      average = TRUE,
      xlab = "Normalized Time [%]",
      ylab = "1st spectral moment [Hz]")
```
As can be seen from the previous two plots (Figures \@ref(fig:usecases-uc4dplot1) and \@ref(fig:usecases-uc4dplot2)), transitions to and from a sort of steady state around the temporal midpoint of the sibilants are clearly visible. To focus on this steady-state part of the sibilant we will now extract those spectral moments that fall between the proportional timepoints 0.2 and 0.8 of each segment (i.e., the central 60%) using the `dcut()` function as is shown in the R code snippet below.
```{r}
# cut out the middle 60% portion
dftTdRelFreqMomMid = dcut(dftTdRelFreqMom,
                          left.time = 0.2,
                          right.time = 0.8,
                          prop = TRUE)
```
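The notion of proportional timepoints can be illustrated on a single invented trajectory. The base R sketch below normalizes the frame times of one hypothetical segment to the range 0 to 1 and keeps the frames in the central 60%. It is meant as an illustration of the idea only and is not claimed to reproduce how `dcut()` selects frames internally. The names `toyTimes` and `toyVals` and all values are made up.

```{r}
# hypothetical frame times (ms) and first-moment values of one segment
toyTimes = seq(200, 300, by = 10)
toyVals = c(5200, 5600, 6000, 6300, 6400, 6450,
            6400, 6350, 6100, 5700, 5300)
# normalize time to the range 0..1 within the segment
toyProp = (toyTimes - min(toyTimes)) / (max(toyTimes) - min(toyTimes))
# keep frames whose proportional time lies in the central 60%
keepMid = toyProp >= 0.2 & toyProp <= 0.8
toyMid = toyVals[keepMid]
# average over the retained steady-state portion
mean(toyMid)
```

Cutting on proportional rather than absolute time means that segments of different durations each contribute their own central portion, regardless of how long they are.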
Finally, the R code snippet below shows how to calculate the averages of these trajectories using the `trapply()` function.
```{r}
meanFirstMoments = trapply(dftTdRelFreqMomMid[, 1],
                           fun = mean,
                           simplify = TRUE)
```
As the resulting `meanFirstMoments` vector has the same length as the initial `sibil` segment list, we can now easily visualize these values in the form of a boxplot. The R code snippet below produces Figure \@ref(fig:usecases-uc4boxplot).
```{r eval=FALSE}
boxplot(meanFirstMoments ~ sibil$labels,
        xlab = "Sibilant class labels",
        ylab = "First spectral moment values [Hz]")
```
```{r usecases-uc4boxplot, echo=FALSE, fig.cap="Boxplots of the first spectral moments grouped by their sibilant class.", fig.width=6, fig.height=3, fig.align="center"}
boxplot(meanFirstMoments ~ sibil$labels,
        xlab = "Sibilant class labels",
        ylab = "First spectral moment values [Hz]")
```
As a final remark, it is worth noting that by using the `emuRtrackdata` `resultType` (not the `trackdata` `resultType`) of the `get_trackdata()` function we could have performed a comparable analysis by utilizing packages such as `dplyr` for `data.table` or `data.frame` manipulation and `lattice` or `ggplot2` for data visualization.
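As a rough sketch of what such a flat analysis could look like, the snippet below computes per-segment and per-class means on an invented `data.frame` with one row per frame. It uses base R's `aggregate()` in place of `dplyr` so that it runs without additional packages. The `toyFlat` object, its column names and its values are hypothetical and may differ from what an `emuRtrackdata` object actually contains.

```{r}
# hypothetical flat track data: one row per frame, invented values;
# the column names are placeholders for illustration
toyFlat = data.frame(sl_rowIdx = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                     labels = rep(c("s", "S", "s"), each = 3),
                     firstMoment = c(6000, 6400, 6200,
                                     4800, 5000, 4900,
                                     6100, 6500, 6300))
# mean first moment per segment
toySegMeans = aggregate(firstMoment ~ sl_rowIdx + labels,
                        data = toyFlat, FUN = mean)
# mean of the segment means per sibilant class
toyClassMeans = aggregate(firstMoment ~ labels,
                          data = toySegMeans, FUN = mean)
toyClassMeans
```

Averaging within segments first and across segments second gives every segment the same weight, independent of how many frames it contains.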
```{r echo=FALSE, results='hide', message=FALSE}
# clean up emuR_demoData
unlink(file.path(tempdir(), "emuR_demoData"), recursive = TRUE)
```