<!DOCTYPE html>
<html >
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>overview.utf8</title>
<meta name="description" content="">
<meta name="generator" content="bookdown <!--bookdown:version--> and GitBook 2.6.7">
<meta property="og:title" content="overview.utf8" />
<meta property="og:type" content="book" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="overview.utf8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black">
<!--bookdown:link_prev-->
<!--bookdown:link_next-->
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/lunr.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<!--bookdown:title:start-->
<!--bookdown:title:end-->
<!--bookdown:toc:start-->
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<!--bookdown:toc2:start-->
<ul>
<li><a href="#chap:overview"><span class="toc-section-number">1</span> An overview of the EMU-SDMS </a></li>
<li><a href="#extract-symbolic-information-we-are-interessted-in"><span class="toc-section-number">2</span> Extract symbolic information we are interested in</a></li>
<li><a href="#extract-the-according-sample-values"><span class="toc-section-number">3</span> Extract the corresponding sample values</a></li>
</ul>
<!--bookdown:toc2:end-->
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./"></a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<!--bookdown:toc:end-->
<!--bookdown:body:start-->
<div id="chap:overview" class="section level1">
<h1><span class="header-section-number">1</span> An overview of the EMU-SDMS <a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a></h1>
<p><img src="pics/EMU-webAppIcon-roundCorners.png" width="512" /></p>
<p>The EMU Speech Database Management System (EMU-SDMS) is a collection of software tools which aims to be as close to an all-in-one solution for generating, manipulating, querying, analyzing and managing speech databases as possible. It was developed to fill the void in the landscape of software tools for the speech sciences by providing an integrated system that is centered around the R language and environment for statistical computing and graphics. This manual contains the documentation for the three software components wrassp, emuR and the EMU-webApp. In addition, it provides an in-depth description of the emuDB database format, which is also considered an integral part of the new system. These four components comprise the EMU-SDMS and benefit the speech sciences and spoken language research by providing an integrated system for answering research questions.</p>
<p>This manual is targeted at new users as well as users familiar with the legacy EMU system. In addition, it is aimed at people who are interested in technical details such as data structures/formats and implementation strategies, be it for reimplementation purposes or simply for a better understanding of the inner workings of the new system. To accommodate these different target groups, after initially giving an overview of the system, this manual presents a usage tutorial that walks the user through the entire process of answering a research question. This tutorial starts with a set of audio and Praat annotation files and ends with a statistical analysis that addresses the hypothesis posed by the research question. The following part of this documentation is separated into six chapters that give an in-depth explanation of the various components that comprise the EMU-SDMS and of the integral concepts of the new system. These chapters take a tutorial-like approach and provide multiple examples. To give the reader a synopsis of the main functions and central objects provided by emuR, the main R package of the EMU-SDMS, an overview of these functions is presented in a separate part. A further part focuses on the actual implementation of the components and is geared towards people interested in the technical details. Further examples and file format descriptions are available in various appendices. This structure enables the novice user to skip the technical details and still get an in-depth overview of how to work with the new system and discover what it is capable of.</p>
<p>A prerequisite that is presumed throughout this document is the reader’s familiarity with basic terminology in the speech sciences (e.g., how speech is annotated at a coarse and fine-grained level). Further, we assume the reader has a grasp of the basic concepts of the R language and environment for statistical computing and graphics. For readers new to R, there are multiple freely available R tutorials online. R also has a set of very detailed manuals and tutorials that come preinstalled with R. To access R’s own “An Introduction to R” tutorial, simply type help.start() into the R console and click on the link to the tutorial.</p>
<p>The EMU-SDMS has a number of predecessors that have been continuously developed over a number of years. The components presented here are the completely rewritten and newly designed next incarnation of the EMU system, which we will refer to as the EMU Speech Database Management System (EMU-SDMS). The EMU-SDMS keeps most of the core concepts of the previous system, which we will refer to as the legacy system, in place while improving usability, maintainability, scalability, stability and speed. We feel the redesign and reimplementation elevates the system into a modern set of speech and language tools that enables a workflow adapted to the challenges confronting speech scientists and the ever-growing size of speech databases. The redesign has enabled us to implement several components of the new EMU-SDMS so that they can be used independently of the EMU-SDMS for tasks such as web-based collaborative annotation efforts and speech signal processing in a statistical programming environment. Nevertheless, the main goal of the redesign and reimplementation was to provide a modern set of tools that reduces the complexity of the tool chain needed to answer spoken language research questions down to a few interoperable tools. The tools the EMU-SDMS provides are designed to streamline the process of obtaining usable data, all from within an environment that can also be used to analyze, visualize and statistically evaluate the data.</p>
<p>When developing the new system, it seemed more appropriate to partially reuse the concepts of the legacy system than to start completely from scratch. A major observation at the time was that the R language and environment for statistical computing and graphics was gaining more and more traction for statistical and data visualization purposes in the speech and spoken language research community. However, R was mostly used only towards the end of the data analysis chain, where the data had usually been pre-converted into a comma-separated values or equivalent file format by the user, using other tools to calculate, extract and pre-process the data. While designing the new EMU-SDMS, we brought R to the front of the tool chain, to the point just beyond data acquisition. This allows the entire data annotation, data extraction and analysis process to be completed in R, while keeping the key user requirements in mind. From personal experience gained by using the legacy system for research purposes and in various undergraduate courses, we learned that the key user requirements were data and database portability, a simple installation process, a simplified/streamlined user experience and cross-platform availability. Supplying all of the core functionality of the EMU-SDMS in the form of R packages that do not rely on external software at runtime seemed to meet all of these requirements.</p>
<p>As the early incarnations of the legacy EMU system and its predecessors were conceived either at a time that predated the R system or during the infancy of R’s package ecosystem, the legacy system was implemented as a modular yet composite standalone program with a communication and data exchange interface to the R/Splus systems. Recent developments in the package ecosystem of R, in particular the availability of a number of third-party packages, have made R an attractive sole target platform for the EMU-SDMS. These and other packages provide additional functional power that enabled the core functionality of the EMU-SDMS to be implemented in the form of R packages. The availability of certain R packages had a large impact on the architectural design decisions that we made for the new system.</p>
<p>The R Example below shows the simple installation process which we were able to achieve due to the R package infrastructure. Compared to the legacy EMU and other systems, the installation process of the entire system has been reduced to a single R command. Throughout this documentation we will try to highlight how the EMU-SDMS is also able to meet the rest of the above key user requirements.</p>
<pre><code class="r"># install the entire EMU-SDMS
# by installing the emuR package
install.packages("emuR")
</code></pre>
<p>It is worth noting that throughout this manual code snippets are given in the form of R Examples. These examples represent working R code that allows the reader to follow along in a hands-on manner and gives a feel for what it is like to work with the new EMU-SDMS.</p>
<p>As was previously mentioned, the new EMU-SDMS is made up of four main components. The components are the emuDB format; the R packages wrassp and emuR; and the web application, the EMU-webApp, which is the new GUI component of the EMU-SDMS. Within the architecture of the EMU-SDMS, the emuR package plays a central role as it is the only component that interacts with all of the other components. It performs file and DB handling for the files that comprise an emuDB; it uses the wrassp package for signal processing purposes; and it can serve emuDBs to the EMU-webApp.</p>
<p>Although the system is made of four main components, the user largely only interacts directly with the EMU-webApp and the emuR package. The default workflow that illustrates these interactions consists of seven steps and is summarized in the following paragraph.</p>
<p>Initially the user creates a reference to an emuDB by loading it into their current R session using the load_emuDB() function (see step 1). This database reference can then be used to either serve (serve()) the database to the EMU-webApp or query (query()) the annotations of the emuDB (see steps 2 and 3). The result of a query can then be used to either perform one or more so-called requeries or extract signal values that correspond to the result of a query or requery (see step 4). Finally, the signal data can undergo further preparation (e.g., correction of outliers) and visual inspection before further analysis and statistical processing is carried out (see steps 5, 6 and 7). Although the R packages provided by the EMU-SDMS do provide functions for steps 4, 5 and 6, it is worth noting that the plethora of packages in the R package ecosystem can and should be used to perform these duties. The resulting objects of most of the above functions are derived from standard R classes such as data.frame and can therefore be used as inputs for hundreds if not thousands of other R functions.</p>
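<p>The first four steps of this workflow can be sketched as follows. This is a minimal sketch rather than a definitive recipe: it assumes that the emuR package is installed and it uses the ae demo database that emuR can generate with create_emuRdemoData(); the query strings are the ones used elsewhere in this chapter.</p>

```r
library(emuR)

# create the demo data that ships with emuR (only if it is not there yet)
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())

# step 1: load the emuDB to obtain a database reference
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

# step 2 (interactive sessions only): serve the emuDB to the EMU-webApp
# serve(ae)

# step 3: query the annotations
sl_text = query(ae, "Text == friends")
sl_syllable = query(ae, "[Syllable =~ .* ^ Text == friends]")

# step 4: requery the parent items and extract the matching sample values
sl_parents = requery_hier(ae, sl_syllable, level = "Text")
td = get_trackdata(ae, sl_text, "MEDIAFILE_SAMPLES",
                   resultType = "emuRtrackdata", verbose = FALSE)

# steps 5 to 7: the results are ordinary R objects and can be passed on
summary(td$T1)
```

<p>Because the query and track data results behave like data frames, functions from other R packages can take over from here.</p>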
<p>Besides providing a fully integrated system, the EMU-SDMS has several unique features that set it apart from other current, widely used systems. To our knowledge, the EMU-SDMS is the only system that allows the user to model their annotation structures based on a hybrid model of time-based annotations (such as those offered by Praat’s tier-based annotation mechanics) and hierarchical timeless annotations. An example of such a hybrid annotation structure combines a time-based Phonetic level with hierarchical Phoneme, Syllable and Text levels that are connected by inter-level links. These hybrid annotations benefit the user in multiple ways, as they reduce data redundancy and explicitly allow relationships to be expressed across annotation levels. Hierarchical annotations and how to query these annotation structures are covered in separate chapters of this manual.</p>
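<p>To make the idea of a hybrid annotation more concrete, the following base R sketch models one time-based level, one timeless level and the links between them. It is a toy model under stated assumptions, not the internal representation used by emuR: the data frames, the sample values and the derive_times() helper are all hypothetical and only illustrate how a timeless item can obtain its time span from the time-bearing items it is linked to.</p>

```r
# time-based level: every item carries its own sample positions (made-up values)
phonetic = data.frame(
  id = 1:5,
  label = c("h", "@", "f", "r", "E"),
  sample_start = c(100, 200, 300, 400, 500),
  sample_end = c(199, 299, 399, 499, 599)
)

# timeless level: items only carry labels, no time information
text_level = data.frame(id = c(10, 11), label = c("her", "friends"))

# links: each row connects a parent item to one of its child items
links = data.frame(from_id = c(10, 10, 11, 11, 11), to_id = 1:5)

# hypothetical helper: derive the time span of each parent from its children
derive_times = function(parents, links, children) {
  linked = merge(links, children, by.x = "to_id", by.y = "id")
  spans = data.frame(
    id = sort(unique(linked$from_id)),
    sample_start = as.vector(tapply(linked$sample_start, linked$from_id, min)),
    sample_end = as.vector(tapply(linked$sample_end, linked$from_id, max))
  )
  merge(parents, spans, by = "id")
}

text_with_times = derive_times(text_level, links, phonetic)
print(text_with_times)
```

<p>The time information is stored only once, on the time-based level, which is the reduction in redundancy described above.</p>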
<p>Figure: Example of a hybrid annotation combining time-based (<em>Phonetic</em> level) and hierarchical (<em>Phoneme</em>, <em>Syllable</em>, <em>Text</em> levels including the inter-level links) annotations.</p>
<pre><code class="r">library(ggplot2)
library(emuR)
# the demo data has to exist before it can be loaded
if (!dir.exists(file.path(tempdir(), "emuR_demoData"))) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"), verbose = FALSE)
</code></pre>
</div>
<div id="extract-symbolic-information-we-are-interessted-in" class="section level1">
<h1><span class="header-section-number">2</span> Extract symbolic information we are interested in</h1>
<pre><code class="r">sl_text = query(ae, "Text == friends")
sl_syllable = query(ae, "[Syllable =~ .* ^ Text == friends]")
sl_syllable_parents = requery_hier(ae, sl_syllable, level = "Text")
sl_phoneme = query(ae, "[Phoneme =~ .* ^ Text == friends]")
sl_phoneme_parents = requery_hier(ae, sl_phoneme, level = "Syllable")
sl_phonetic = query(ae, "[Phonetic =~ .* ^ Text == friends]")
sl_phonetic_parents = requery_hier(ae, sl_phonetic, level = "Phoneme")
</code></pre>
</div>
<div id="extract-the-according-sample-values" class="section level1">
<h1><span class="header-section-number">3</span> Extract the corresponding sample values</h1>
<pre><code class="r">td = get_trackdata(ae, sl_text, "MEDIAFILE_SAMPLES", resultType = "emuRtrackdata", verbose = FALSE)
# create sample numbers
sampleNrs = seq(sl_text$sample_start, sl_text$sample_end, 1)

all_sl = list(sl_phonetic, sl_phoneme, sl_syllable, sl_text)
all_parents = list(sl_phonetic_parents, sl_phoneme_parents, sl_syllable_parents)

lab_txt_size = 7
level_txt_size = 7

hierta_plot = ggplot(td, aes(x=sampleNrs, y=T1)) + geom_line(colour="#E7E7E7") + theme_bw()

# remove axes, grid and background
hierta_plot = hierta_plot + theme(axis.line=element_blank(),axis.text.x=element_blank(),
  axis.text.y=element_blank(),axis.ticks=element_blank(),
  axis.title.x=element_blank(),
  axis.title.y=element_blank(),legend.position="none",
  panel.background=element_blank(),panel.grid.major=element_blank(),
  panel.grid.minor=element_blank(),plot.background=element_blank())

for(i in seq_along(all_sl)){
  cur_sl = all_sl[[i]]

  minMaxDist = max(td$T1) - min(td$T1)
  propMinMaxDist = minMaxDist * 1/length(all_sl)

  cur_y_min = min(td$T1) + (i-1)*propMinMaxDist
  # plot with segments
  if(i %% 2 == 0){
    levelNameX = max(sampleNrs)
    levelVjust = 1.5
  }else{
    levelNameX = min(sampleNrs)
    levelVjust = -0.5
  }

  hierta_plot = hierta_plot + annotate("text", x = (cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2), y = cur_y_min + propMinMaxDist/2, label = paste0("\\textit{", cur_sl$labels, "}")) +
    annotate("text", x = levelNameX, y = cur_y_min + propMinMaxDist/2, label = paste0("\\textit{", unique(cur_sl$level), "}"), angle=90, vjust=levelVjust)

  # first level -&gt; draw time lines
  if(i == 1){
    hierta_plot = hierta_plot + annotate("segment", x = cur_sl$sample_start, y = cur_y_min + propMinMaxDist, xend = cur_sl$sample_start, yend = cur_y_min + propMinMaxDist / 2, colour="#38363e") +
      annotate("segment", x = cur_sl$sample_start, y = cur_y_min + propMinMaxDist * 3/4, xend = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, yend = cur_y_min + propMinMaxDist * 3/4, colour="#38363e") +
      annotate("segment", x = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, y = cur_y_min + propMinMaxDist * 3/4, xend = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, yend = cur_y_min + propMinMaxDist * 5/8, colour="#38363e") +
      annotate("segment", x = cur_sl$sample_end, y = cur_y_min + propMinMaxDist/2, xend = cur_sl$sample_end, yend = cur_y_min, colour="#888888") +
      annotate("segment", x = cur_sl$sample_end, y = cur_y_min + propMinMaxDist * 1/4, xend = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, yend = cur_y_min + propMinMaxDist * 1/4, colour="#888888") +
      annotate("segment", x = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, y = cur_y_min + propMinMaxDist * 1/4, xend = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2, yend = cur_y_min + propMinMaxDist * 3/8, colour="#888888")
  }

  # SIC fix this
  # every level except the topmost one is linked to its parent level
  if(i != length(all_sl)){
    cur_parent = all_parents[[i]]
    parent_y_min = min(td$T1) + (i)*propMinMaxDist
    hierta_plot = hierta_plot + annotate("segment", x = (cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2), y = cur_y_min + propMinMaxDist/2 + propMinMaxDist/4, xend = (cur_parent$sample_start + (cur_parent$sample_end - cur_parent$sample_start)/2), yend = parent_y_min + propMinMaxDist/2 - propMinMaxDist/4, linetype="dashed")
  }
}

hierta_plot
</code></pre>
<p>Further, to our knowledge, the EMU-SDMS is the first system that makes use of a web application as its primary GUI for annotating speech. This unique approach enables the GUI component to be used in multiple ways. It can be used as a stand-alone annotation tool, it can be connected to a loaded emuDB via the serve() function of emuR, and it can be used to communicate with other servers. The latter enables it to be used as a collaborative annotation tool. An in-depth explanation of how this component can be used in these three scenarios is given elsewhere in this manual.</p>
<p>As demonstrated in the default workflow, an additional unique feature provided by the EMU-SDMS is the ability to use the result of a query to extract derived signals (e.g., formants and RMS values) and complementary signals that match the segments of a query. This aids the user in answering research questions that relate to derived speech signals. A complete walk-through of how to go about answering such a question using the tools provided by the EMU-SDMS is given elsewhere in this manual.</p>
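<p>As a hedged illustration of extracting a derived signal for the segments of a query, the sketch below asks get_trackdata() to compute formant values on the fly. It assumes that emuR and the ae demo database are available, and that formants are estimated by the forest() function of the wrassp package, which is selected here through the onTheFlyFunctionName argument; the query string is the one used earlier in this chapter.</p>

```r
library(emuR)

# create the demo data that ships with emuR (only if it is not there yet)
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

# all Phonetic segments that are dominated by the word "friends"
sl_phonetic = query(ae, "[Phonetic =~ .* ^ Text == friends]")

# formant values for these segments, calculated on the fly
td_fm = get_trackdata(ae, sl_phonetic, onTheFlyFunctionName = "forest",
                      resultType = "emuRtrackdata", verbose = FALSE)

# the formant tracks end up in columns named T1, T2, ...
head(td_fm)
```

<p>The resulting object can be plotted or modelled like any other data frame.</p>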
<p>The features provided by the EMU-SDMS make it an all-in-one speech database management solution that is centered around R. It enriches the R platform by providing specialized speech signal processing, speech database management, data extraction and speech annotation capabilities. By achieving this without relying on any external software except the web browser, the EMU-SDMS significantly reduces the number of tools the speech and spoken language researcher has to deal with and helps to simplify answering research questions. As the only prerequisite for using the EMU-SDMS is a basic familiarity with the R platform, if the above features would improve your workflow, the EMU-SDMS is indeed for you.</p>
<pre><code class="r"># clean up emuR_demoData
unlink(file.path(tempdir(), "emuR_demoData"), recursive = TRUE)
</code></pre>
</div>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Sections of this chapter have been published in <span class="citation">@winkelmann:2017aa</span><a href="#fnref1" class="footnote-back">↩</a></p></li>
</ol>
</div>
<!--bookdown:body:end-->
</section>
</div>
</div>
</div>
<!--bookdown:link_prev-->
<!--bookdown:link_next-->
</div>
</div>
<!--bookdown:config-->
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
var src = "true";
if (src === "" || src === "true") src = "https://cdn.bootcss.com/mathjax/2.7.1/MathJax.js?config=TeX-MML-AM_CHTML";
if (location.protocol !== "file:" && /^https?:/.test(src))
src = src.replace(/^https?:/, '');
script.src = src;
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>