This repository has been archived by the owner on Nov 19, 2024. It is now read-only.
Showing 40 changed files with 3,112 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
.DS_Store | ||
.Rproj.user | ||
*.Rproj | ||
app_key_dir | ||
app_vault_dir | ||
.deploy_vars | ||
.configs | ||
decode.sh | ||
.devcontainer/ | ||
.bash_history | ||
.local/ | ||
.rstudio/ | ||
.dockerignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
# syntax = docker/dockerfile:1.0-experimental # https://docs.docker.com/develop/develop-images/build_enhancements/ | ||
FROM rocker/r-ver:4.0.3 | ||
|
||
RUN export DEBIAN_FRONTEND=noninteractive && apt-get -y update \ | ||
&& apt-get install -y \ | ||
alien \ | ||
bzip2 \ | ||
cmake \ | ||
curl \ | ||
file \ | ||
gdal-bin \ | ||
gnupg2 \ | ||
libaio1 \ | ||
libapparmor1 \ | ||
libcairo2 \ | ||
libcairo2-dev \ | ||
libcurl4-openssl-dev \ | ||
libedit2 \ | ||
libgdal-dev \ | ||
libglpk-dev \ | ||
libpoppler-cpp-dev \ | ||
libproj-dev \ | ||
libsqliteodbc \ | ||
libssl-dev \ | ||
libudunits2-dev \ | ||
libxml2-dev \ | ||
libxt-dev \ | ||
libxt6 \ | ||
lsb-release \ | ||
odbc-postgresql \ | ||
openjdk-8-jdk \ | ||
openssh-client \ | ||
pandoc \ | ||
pandoc-citeproc \ | ||
postgresql \ | ||
procps \ | ||
psmisc \ | ||
r-cran-cairo \ | ||
swaks \ | ||
tcl-dev \ | ||
tk-dev \ | ||
unixodbc \ | ||
zlib1g-dev \ | ||
&& rm -rf /var/lib/apt/lists/* | ||
|
||
# get from https://packagemanager.rstudio.com/client/#/repos/1/overview | ||
# Freezing packages to April 22, 2021: | ||
RUN echo "options(repos = c(REPO_NAME = 'https://packagemanager.rstudio.com/all/__linux__/focal/2511902'))" >> $R_HOME/etc/Rprofile.site | ||
|
||
RUN R -e "install.packages(c('assertthat', \ | ||
'data.table', \ | ||
'dplyr', \ | ||
'DT', \ | ||
'elastic', \ | ||
'future', \ | ||
'future.callr', \ | ||
'ggplot2', \ | ||
'ggthemes', \ | ||
'httr', \ | ||
'jsonlite', \ | ||
'leaflet', \ | ||
'lubridate', \ | ||
'memoise', \ | ||
'plotly', \ | ||
'promises', \ | ||
'rmarkdown', \ | ||
'rgdal', \ | ||
'shiny', \ | ||
'shinycssloaders', \ | ||
'shinydashboard', \ | ||
'shinyjs', \ | ||
'shinyWidgets', \ | ||
'sf', \ | ||
'stringr', \ | ||
'timetk', \ | ||
'htmltools'))" | ||
|
||
# Add certs for accessing elastic search servers that require them | ||
COPY elasticsearch/certificates/*.crt /usr/local/share/ca-certificates/ | ||
RUN update-ca-certificates | ||
|
||
RUN mkdir /shinyapp | ||
WORKDIR /shinyapp/ | ||
ADD ./ ./ | ||
|
||
RUN useradd shiny -u 5000 -m -b /home | ||
RUN chown -R shiny:shiny /shinyapp | ||
USER shiny | ||
|
||
EXPOSE 3838 | ||
|
||
CMD ["R", "-e", "shiny::runApp('/shinyapp', host = '0.0.0.0', port = 3838)"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
<img src="www/text_depot_icon/TextDepotIcon_TextImage_S.jpg" width="25%"> | ||
|
||
Text Depot is a tool for searching and analyzing topics of interest within a large database of text data. The Text Depot dashboard (this repo) provides a front end to a set of Elasticsearch indexes. To use this repository, you must provide one or more [Elasticsearch](https://www.elastic.co) indexes in a particular format. | ||
|
||
## Local Machine Setup | ||
|
||
1. Clone this repo. | ||
2. Run `cp .configs_sample .configs` and fill in the relevant values. | ||
|
||
### Running Locally | ||
|
||
1. Install any missing libraries, for example with `install.packages("DT")`. The required libraries are listed in the `install.packages` call of the included `Dockerfile`. | ||
2. Run `Rscript run_text_depot_dashboard.R` | ||
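
The package list mentioned in step 1 can be pulled out of the `Dockerfile` mechanically. The following is a minimal sketch, assuming the `Dockerfile` keeps its package names as single-quoted strings inside one `install.packages(c(...))` call that ends in `))`. The helper name `list_r_packages` is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print the R packages
# named in the Dockerfile's install.packages(...) call, one per line.
list_r_packages() {
  dockerfile="${1:-Dockerfile}"
  # Keep only the lines from the install.packages( call through the line
  # that closes it with "))", then extract every single-quoted name.
  sed -n "/install\.packages(/,/))/p" "$dockerfile" \
    | grep -o "'[^']*'" \
    | tr -d "'"
}
```

Running `list_r_packages Dockerfile` from the repository root should print names such as `assertthat` and `DT`, which can then be compared against the libraries already installed locally.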
|
||
### Running via Docker | ||
|
||
1. Optionally, create a `.dockerignore` file to exclude any local files. | ||
2. Use the provided `Dockerfile` to build and run the app: | ||
|
||
``` | ||
$ DOCKER_BUILDKIT=1 docker build -t text_depot_dashboard . | ||
$ docker run -it -p 8080:3838 text_depot_dashboard | ||
``` | ||
|
||
3. Open the dashboard in your browser: [http://localhost:8080](http://localhost:8080) | ||
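
For the optional `.dockerignore` in step 1, a starting point could look like the sketch below. The entries are an assumption: they are the editor and session files from this commit's `.gitignore`, plus `.git`. `.configs` is deliberately left out of the list, on the assumption that the app needs it inside the image. The function name `write_dockerignore` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a starter .dockerignore unless one already exists.
# The entries are assumptions based on the .gitignore in this commit.
write_dockerignore() {
  target="${1:-.dockerignore}"
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it untouched" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
.git
.DS_Store
.Rproj.user
*.Rproj
.rstudio
.local
.bash_history
EOF
}
```

Calling `write_dockerignore` from the repository root before the build in step 2 keeps these local files out of the image that `ADD ./ ./` copies.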
|
||
## Elasticsearch | ||
|
||
Each data source should be stored in its own Elasticsearch index. For more information, see [elasticsearch/](elasticsearch/). | ||
|
||
## Notes | ||
|
||
Our workflow contained the following components: | ||
|
||
![Overall Workflow](workflow.png) | ||
|
||
This repository contains the dashboard code (blue above) for Text Depot. The green components were scheduled with cron jobs and kept the indexes in the Elasticsearch database up to date. We wrote a custom Parser for each data source, and a single Annotator class that adds the fields below to each document before insertion. The orange components were added for authentication and embeddings-based search; to use embeddings-based search with your dashboard, set `embedding_api_host` in `.configs`. | ||
|