This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

Merge pull request #22 from CityofEdmonton/add-gif-readme
short demo gif
reisner authored Nov 16, 2023
2 parents fac6a31 + ecafca6 commit 6a6eb74
Showing 4 changed files with 32 additions and 20 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -9,4 +9,5 @@ decode.sh
.bash_history
.local/
.rstudio/
-.dockerignore
+.dockerignore
+.vscode
34 changes: 15 additions & 19 deletions README.md
@@ -1,38 +1,34 @@
-<img src="www/text_depot_icon/TextDepotIcon_TextImage_S.jpg" width="25%">
-
-Text Depot is a tool to search and analyze topics of interest within a large database of text data. The Text Depot dashboard (this repo) provides a front-end to a set of indexes in ElasticSearch. To use this repository, you must provide one or more [Elastic Search](http://www.elastic.co) indexes in a particular format.
-
-## Local Machine Setup
-
-1. Clone this repo.
-2. Run `cp .configs_sample .configs` and fill in the relevant values.
-
-### Running Locally
-
-1. Install any missing libraries with `install.packages("DT")` (for example). A list of required libraries can be found in the included `Dockerfile`
-2. Run `Rscript run_text_depot_dashboard.R`
-
-### Running via Docker
-
-1. Optionally, create a `.dockerignore` file to exclude any local files.
-2. Use the provided `Dockerfile` to build and run the app:
+![Text Depot in action](www/TD_demo_short.gif)
+
+<hr/>
+
+<img src="www/text_depot_icon/TextDepotIcon_TextBeside_M.jpg" width="60%">
+
+Text Depot is a tool to search and analyze topics of interest within a large database of text data. The Text Depot dashboard (this repo) provides a front-end to a set of indexes in ElasticSearch. To use this repository, you must provide one or more [Elastic Search](http://www.elastic.co) indexes in a particular format.
+
+## Setup
+
+1. Set up an Elastic Search server.
+2. Create one or more indexes using the Text Depot mappings.
+3. Clone this repo.
+4. Run `cp .configs_sample .configs` and fill in the relevant values.
+5. Build and run the Docker container:
```
-$ DOCKER_BUILDKIT=1 docker build -t text_depot_dashboard .
-$ docker run -it -p 8080:3838 text_depot_dashboard
+DOCKER_BUILDKIT=1 docker build -t text_depot_dashboard . && docker run -it -p 8080:3838 text_depot_dashboard
```
+6. Open the dashboard in your browser: [http://localhost:8080](http://localhost:8080)

-3. Open the dashboard on your browser: [http://localhost:8080](http://localhost:8080)
-
-## ElasticSearch
+## Elastic Search

-Each data source should be stored in its own Elastic Search index. For more information, see [elasticsearch/](elasticsearch/)
+Each data source should be stored in its own Elastic Search index. For more information on how to configure your Elastic Search server, see [elasticsearch/](elasticsearch/)

## Notes

Our workflow contained the following components:

![Overall Workflow](workflow.png)

-This repository contains the dashboard code (Blue above) for Text Depot. The green components were scheduled with cron jobs, and keep the indexes up-to-date in the ElasticSearch Database. We wrote a custom Parser for each data source, and a single Annotator class that adds the fields below to each document before insertion. The orange components were added for authentication and embeddings-based search, and are optional components.
+This repository contains the dashboard code (blue above) for Text Depot. The green components were scheduled with cron jobs and keep the indexes up to date in the ElasticSearch database. We wrote a custom Parser for each data source, and a single Annotator class that adds the `[neighbourhoods, sentiment, embeddings]` fields to each document before inserting it. The orange components were added for authentication and embeddings-based search, and are optional.
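As a purely illustrative aside (the parsing and annotation scripts are not part of this repository, and the paths below are hypothetical), the cron-scheduled green components might be driven by an entry such as:

```
# Hypothetical schedule: parse, annotate, and index new documents nightly at 02:00
0 2 * * * /opt/text_depot/update_indexes.sh >> /var/log/text_depot/update.log 2>&1
```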

15 changes: 15 additions & 0 deletions elasticsearch/README.md
@@ -109,3 +109,18 @@ Each data source should be stored in its own Elastic Search index. The index mus
```

Then, add your indexes/aliases to the `default_index_aliases` parameter in `.configs`.
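For illustration, an alias can be attached to an existing index with the standard Elastic Search aliases API; the index and alias names below are placeholders, so substitute your own:

```
curl -X POST "localhost:9200/_aliases" -H "Content-Type: application/json" -d '
{
  "actions": [
    { "add": { "index": "council_reports_v1", "alias": "council_reports" } }
  ]
}'
```

Referencing aliases rather than raw index names in `.configs` makes it possible to rebuild or reindex the underlying data later without touching the dashboard configuration.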

## Inserting Data

Each of these indexes should be filled with documents containing the following fields (an example insert request follows the table):

| Field | Expected Data | Status |
| ------------- | ------------- | ------------- |
| date | 2023-01-01 | Required |
| text | This is the text in a document. | Required |
| source_title | Council Report for January 2023 | Required |
| sentiment | Float in [-1, 1] | Required |
| neighbourhoods | ["Downtown", "Northwest"] | Required |
| source_url | | Optional |
| parent_source_title | Council Agenda 2023. | Optional |
| parent_source_url | | Optional |
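
For illustration, a single document with these fields could be indexed with a plain `_doc` request; the index name and URLs below are placeholders, not part of this repository's tooling:

```
curl -X POST "localhost:9200/council_reports/_doc" -H "Content-Type: application/json" -d '
{
  "date": "2023-01-01",
  "text": "This is the text in a document.",
  "source_title": "Council Report for January 2023",
  "sentiment": 0.4,
  "neighbourhoods": ["Downtown", "Northwest"],
  "source_url": "https://example.org/council-report-jan-2023",
  "parent_source_title": "Council Agenda 2023",
  "parent_source_url": "https://example.org/council-agenda-2023"
}'
```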
Binary file added www/TD_demo_short.gif
