Commit

Merge branch 'pages-in-progress'
djvill committed Oct 16, 2024
2 parents ecd0564 + cc8817f commit 6100721
Showing 60 changed files with 1,427 additions and 39 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -9,6 +9,9 @@ Gemfile*
# Unisyn dictionary
files/custom-dictionary/apls.unisyn

# Rproj
*.Rproj
.Rproj.user

#### OSs ####

@@ -85,3 +88,4 @@ Temporary Items

# .nfs files are created when an open file is removed but is still being accessed
.nfs*

5 changes: 5 additions & 0 deletions README.md
@@ -84,6 +84,11 @@ fi
Solution courtesy of https://mademistakes.com/notes/adding-last-modified-timestamps-with-git/.


### Sass deprecation patch

As of just-the-docs version 0.10.0, building the site locally will yield deprecation warnings about Sass `darken()`: https://github.com/just-the-docs/just-the-docs/issues/1541.
As a patch for these deprecation warnings, modify `Gemfile.lock` so it pins the `sass-embedded` gem to version 1.78.0.


## Repo contents

2 changes: 1 addition & 1 deletion SPLASH.md
@@ -11,7 +11,7 @@ dashlinks:
url: https://docs.google.com/forms/d/e/1FAIpQLSdFclWfbWZ-aM-h3Givrr4mH9T4MjyWaeQ-TpTMriC5mOcoqw/viewform?usp=sf_link
- text: Documentation
image: fa-book.svg
url: /
url: https://djvill.github.io/APLS/
newtab: yes
# - text: Tutorial
# image: fa-circle-info.svg
7 changes: 7 additions & 0 deletions _config.yml
@@ -10,6 +10,7 @@ baseurl: /APLS
exclude:
  - node_modules
  - script
  - .Rproj.user

sass:
  style: :compressed
@@ -31,6 +32,8 @@ lang: en-US
collections:
  versions:
    sort_by: version
  layers:
    sort_by: name

########################################
#### Config for just-the-docs theme ####
@@ -72,6 +75,10 @@ callouts:
  warning:
    title: Warning
    color: red
  try-it:
    color: green
  under-the-hood:
    color: blue

##Google Analytics
ga_tracking: G-TT6NWBCLVW
2 changes: 1 addition & 1 deletion _includes/footer_custom.html
@@ -3,5 +3,5 @@
<br>
Distributed by a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/" target="_blank">Creative Commons BY-NC-SA 4.0</a> license.
<br>
Built using the <a href="https://github.com/just-the-docs/just-the-docs" target="_blank">Just the Docs</a> theme for <a href="https://jekyllrb.com/docs/">Jekyll</a>.
Built using the <a href="https://github.com/just-the-docs/just-the-docs" target="_blank">Just the Docs</a> theme for <a href="https://jekyllrb.com/docs/" target="_blank">Jekyll</a>.
</p>
3 changes: 3 additions & 0 deletions _includes/head_custom.html
@@ -8,6 +8,9 @@
<!-- Custom JavaScript: non-ghost layouts -->
<script src="{{ '/assets/js/search.js' | relative_url }}"></script>
<script src="{{ '/assets/js/dark-mode-toggle.js' | relative_url }}"></script>
<script src="{{ '/assets/js/collapse-callouts.js' | relative_url }}"></script>
<script src="{{ '/assets/js/keyterm-layer-links.js' | relative_url }}"></script>
<script src="{{ '/assets/js/external-link-new-tab.js' | relative_url }}"></script>
{% else %}
<!-- For some reason, default favicon not working -->
<link rel="icon" href="{{ site.favicon_ico | default: '/favicon.ico' | absolute_url }}" type="image/x-icon">
8 changes: 8 additions & 0 deletions _includes/linklist.html
@@ -0,0 +1,8 @@
[celex]: https://catalog.ldc.upenn.edu/LDC96L14
[elan]: https://archive.mpi.nl/tla/elan
[htk]: https://htk.eng.cam.ac.uk/
[labb-cat]: https://nzilbb.github.io/labbcat-doc/
[labbcat-R]: https://nzilbb.github.io/labbcat-R/
[labbcat-py]: https://nzilbb.github.io/labbcat-py/
[sign up]: https://docs.google.com/forms/d/e/1FAIpQLSdFclWfbWZ-aM-h3Givrr4mH9T4MjyWaeQ-TpTMriC5mOcoqw/viewform
[password reset]: https://docs.google.com/forms/d/e/1FAIpQLSdW9U912VhiZN2sjFk6jQFulhY82YNdqkQQRKVJT2LvAFvqnw/viewform?usp=sf_link
2 changes: 0 additions & 2 deletions _includes/nav_footer_custom.html
@@ -1,7 +1,5 @@
<footer class="site-footer">
<p>
<a href="https://apls.pitt.edu/labbcat" target="_blank">Sign in to APLS</a>
|
<a href="https://docs.google.com/forms/d/e/1FAIpQLSdFclWfbWZ-aM-h3Givrr4mH9T4MjyWaeQ-TpTMriC5mOcoqw/viewform?usp=sf_link" target="_blank">Sign up</a>
</p>
</footer>
16 changes: 16 additions & 0 deletions _keyterms/README.md
@@ -0,0 +1,16 @@
# `_keyterms/`

This directory holds:

- A YAML file (`keyterms.yml`) with a glossary of key terms
- A template YAML file (`keyterm-template.yml`) that is used for new key terms
- `update-keyterm-list.R`, an R script that trawls `doc/` pages for key terms, populates `keyterms.yml` with new terms, and updates the `incontext` lists of back-links in `keyterms.yml`
- `session-info.txt`, output of `sessionInfo()` within `update-keyterm-list.R`

`keyterms.yml`, in turn, will get used to populate the glossary page (`doc/glossary`).

For the meaning of YAML attributes, see `keyterm-template.yml`.

I might want to add a "category" attribute for sorting/separating key terms into glossary sections;
currently I'm thinking "LaBB-CAT" (e.g., _transcript_) vs. "Linguistics" (e.g., _sociolinguistic interview_) vs. "Data science" (e.g., _unique identifier_).
I'll hold off for now, though, since that'd tempt me to create definitions for lots of terms that are low-priority.
16 changes: 16 additions & 0 deletions _keyterms/keyterm-template.yml
@@ -0,0 +1,16 @@
transcript:
  short_definition: A single NP (which may benefit from further context that it doesn't get here).
  # category (MAYBE): Section of the glossary where the word will be found. Either "LaBB-CAT" (e.g., _transcript_), "Linguistics" (e.g., _sociolinguistic interview_), or "Data science" (e.g., _unique identifier_)
  definition: |
    Using multiple paragraphs if necessary.
    Can also include Markdown styling (incl. callouts)
  incontext:
    - links to **auto-generated** anchors
    - on pages
    - where the term appears
    - (once per page)
  related:
    - similar concepts
    - and/or
    - terms that could be easily confused (e.g., _transcript_ vs. _transcription_)
20 changes: 20 additions & 0 deletions _keyterms/keyterms.yml
@@ -0,0 +1,20 @@
Transcript:
  short_definition: A collection of time-aligned annotations across several layers corresponding to a single sound file, plus metadata about the sound file.
  definition: |
    In APLS, each transcript corresponds to part of a sociolinguistic interview with a single interviewee.
    Interviews are split into transcripts according to the original recording files.
    Transcripts are named with the interviewee's speaker code, the interview section, an optional numeric suffix if that interview section took up more than one recording file, and `.eaf` (the [Elan][] transcription file format).
    Transcripts can be viewed on [transcript pages](doc/view-transcript).
  incontext:
    - links to **auto-generated** anchors
    - on pages
    - where the term appears
    - (once per page)
  related:
    - In APLS, each transcript has one [main participant](#main-participant)
    - "[Transcript attributes](#transcript-attributes): Metadata about the sound file"
    - Not to be confused with [transcriptions](#transcription), data files external to APLS
    - "[Layer](#layer)"
    - "[Annotation](#annotation)"
76 changes: 76 additions & 0 deletions _keyterms/update-keyterm-list.R
@@ -0,0 +1,76 @@
library(tidyr)
library(purrr)
library(stringr)
library(dplyr)
library(yaml)

##Parameters: Glossary & template files, whether or not to write updated glossary
file_glossary <- "keyterms.yml"
file_template <- "keyterm-template.yml"
write_glossary <- TRUE

##Get terms from pages
##Get doc/ pages that have 1+ .keyterm
pages <- system("grep -rl 'class=\"keyterm' ../doc", intern = TRUE)
##Construct dataframe of terms and the pages where they appear
keyterms <- tibble(page = pages,
                   pagefile = map(page, readLines),
                   pagelink = map_chr(pagefile,
                                      ~ .x |>
                                        str_subset("^permalink: ") |>
                                        str_remove("^permalink: ")),
                   pagetitle = map_chr(pagefile,
                                       ~ .x |>
                                         str_subset("^title: ") |>
                                         str_remove("^title: ")),
                   term = map(pagefile,
                              ##Get keyterms as a character vector
                              ~ .x |>
                                str_extract_all("(?<=<span class=\"keyterm\">).+?(?=</span>)") |>
                                unlist() |>
                                ##Normalize for case and pluralization
                                str_to_lower() |>
                                str_remove("s$"))) |>
  ##One row per term
  unnest(term) |>
  ##Only unique combinations of page/term
  distinct() |>
  ##Add link
  mutate(link = str_glue("[{pagetitle}]({pagelink}#keyterm-{str_replace_all(term, ' ', '-')})") |>
           as.character())

##Read template & empty out incontext
template <- read_yaml(file_template)[[1]]
template$incontext <- character(0L)

##Add to glossary
##Read current glossary
glossary <- read_yaml(file_glossary)
##Get terms that need to be added
curr_terms <- names(glossary)
new_terms <- setdiff(keyterms$term, str_to_lower(curr_terms))
##Repeat template with new_terms
new_gloss <- new_terms |>
  sort() |>
  set_names() |>
  map(~ template)
##Add new_terms
glossary <- c(glossary, new_gloss)

##Update terms' incontext entries
##Get list of backlinks for each term
backlinks <-
  keyterms |>
  select(term, link) |>
  chop(link) |>
  pull(link, term)
##In glossary order
backlinks <- backlinks[str_to_lower(names(glossary))]
##Update incontext
glossary <- glossary |>
  map2(backlinks, ~ assign_in(.x, "incontext", .y))

##Optionally write glossary
if (write_glossary) {
  write_yaml(glossary, file_glossary)
}
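
The term-extraction and normalization steps in `update-keyterm-list.R` can be tried on a single page in isolation. The following is a minimal sketch that reuses the script's regular expression and normalization calls; the page title, permalink, and sentence in `page_lines` are made up for illustration and do not come from the repository.

```r
library(stringr)

# Hypothetical page content (placeholder title, permalink, and sentence)
page_lines <- c(
  "title: Viewing a transcript",
  "permalink: /doc/view-transcript",
  "Each <span class=\"keyterm\">Transcript</span> has several <span class=\"keyterm\">layers</span>."
)

# Pull out the text between keyterm span tags; the lazy `.+?` keeps
# two spans on the same line from being merged into one match
terms <- page_lines |>
  str_extract_all("(?<=<span class=\"keyterm\">).+?(?=</span>)") |>
  unlist() |>
  # Normalize for case and pluralization, as the script does
  str_to_lower() |>
  str_remove("s$")

# Build anchor names the same way the script's mutate() step does
anchors <- str_c("keyterm-", str_replace_all(terms, " ", "-"))

print(terms)
print(anchors)
```

Because `str_remove("s$")` drops any trailing "s", a singular term that happens to end in "s" would be shortened as well, so such terms may need special handling.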
55 changes: 55 additions & 0 deletions _layers/README.md
@@ -0,0 +1,55 @@
# `_layers`

This directory holds:

- Markdown files with layer definitions (one file per layer), with
  - Long prose description in the body of the file
  - [Attributes in a YAML header](#yaml-attributes), including
    - Attributes from layer definitions saved to APLS (`synced`)
    - Manually-input attributes
- `sync-layers.R`, an R script that creates files and populates/updates files' YAML headers based on layer definitions saved to APLS
- `session-info.txt`, output of `sessionInfo()` within `sync-layers.R`

These files, in turn, will get used to populate the layer reference pages in `doc/`.
(Not _all_ the YAML fields will necessarily go into those pages.)


## YAML attributes

- `name`: Layer name
- `synced`: Attributes from layer definitions saved to APLS, automatically synced by `sync-layers.R`
- `parallel`: Whether there are parallel tags per annotation (e.g., multiple possible phonemic representations)
- `notation`: Notation system used (links to `doc/notation-systems`)
  - `primary`: Main category of notation system (e.g., English, downcased English, Penn Treebank tags, DISC)
  - `additional`: Symbols that augment the primary notation system (e.g., transcription prosody symbols, morpheme marker, DISC syllabification/stress, foll_segment pause symbol)
- `inputs`: Layers and/or other inputs (e.g., APLS custom dictionary) that go into the layer. In a bulleted list where each entry has:
  - `number`: Index for referring to the input in the body of the Markdown file (also sequential input)
  - `input`: Name of input
  - `type`: `layer` or `other`
  - `layer_manager`: If applicable
- `versions`: APLS versions (once versioning begins in earnest), where layer...
  - `first_appeared`
  - `last_modified`
- `last_modified_sync_date`: When the _layer config_ was last modified
- `last_modified_date`: When the _Markdown_ file was last modified (may be after `last_modified_sync_date`). This works the same as `last_modified_date` in the `doc/` Markdown files.
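
To see how these attributes nest once parsed, here is a minimal sketch that loads a header with the R `yaml` package (the same package `update-keyterm-list.R` uses). The header text is a placeholder written for this example: the layer name `example_layer` and the input `orthography` are assumptions, not real layer data.

```r
library(yaml)

# Hypothetical YAML header; all values are placeholders
header <- "
name: example_layer
notation:
  primary: disc
inputs:
  - number: 1
    input: orthography
    type: layer
versions:
  first_appeared: 0.1.0
"

layer <- yaml.load(header)

# Nested attributes become nested list elements
print(layer$notation$primary)
# Each entry under `inputs` is its own list
print(layer$inputs[[1]]$input)
```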


## Rules for use

- **Don't create new Markdown files** for new layers. Instead:
  1. Create the new layer straightaway in APLS. This should include:
     - Any auxiliaries, if applicable
     - A short description suitable for:
       - Tooltip in APLS
       - The "quick reference card" table at `doc/quick-reference-card`
  1. Run `sync-layers.R` to create a Markdown file for the new layer and populate its YAML header
     - If you want to test out a layer config **without the layer showing up in `doc/`, add it to the `testing` project** (you're probably doing that anyway!). While all layers in APLS get a Markdown file, those with `project: testing` get ignored
  1. Fill the following YAML fields manually: `inputs`, `downstream layers`, `notation` (with children `primary`, `additional`)
     - `additional`
  1. Fill the body of the Markdown file with a long description
- If you **change _anything_ about a layer config in APLS**:
  1. Re-run `sync-layers.R` to update that layer's YAML header
  1. It may be necessary to update `last_modified_sync_date` and/or `versions: last_updated` manually, in case it's a change that `sync-layers.R` can't detect
- If you **delete a layer in APLS**, it won't be deleted here...yet
  - I like the idea of having `sync-layers.R` shunt deleted files to a `deleted/` subfolder, or adding a `deleted: yes` flag that tells `doc/` to ignore that Markdown file. But that's not a priority right now

19 changes: 19 additions & 0 deletions _layers/dictionary_phonemes.md
@@ -0,0 +1,19 @@
---
name: dictionary_phonemes
synced:
  <!-- Usual LaBB-CAT layer data, auto-populated by sync-layers.R -->
parallel: yes
notation:
  primary: disc
inputs:
  - input:
    number:
    type:
    layer_manager:
versions:
  first_appeared: 0.1.0
  last_updated: 0.1.0
last_modified_sync_date:
last_modified_date: 2024-10-16T11:47:55-04:00
---
