Chroma Search Service

This HTTP service adds search capability to the Chroma ArangoDB database and is packaged as an ArangoDB Foxx application.

Deployment

The service can be deployed automatically to a running local Arango instance with the Foxx CLI, using an npm script:

# For first-time deployment if this service hasn't been installed yet
npm run deploy:install 
# For deployment updates
npm run deploy

Indexer for ElasticSearch

A set of APIs for synchronizing document changes between Arango and ElasticSearch. It uses a poll-to-update model, with the Arango write-ahead log as the source of truth for data changes. Currently it updates the assets index only. All data changes persisted to the Chroma database are monitored, and those that affect asset data update the ElasticSearch index.

Various options that control how the indexer works can be tuned in the Web Console.

/es/index/all: Indexes all tagged assets on ElasticSearch. Note: this can take a while since it is a synchronous API, so use it sparingly.
- The ElasticSearch endpoint is configurable through the elasticsearch_host option.
/es/index/start: Enqueues a job that keeps syncing incoming asset-related changes to ElasticSearch indefinitely.
- The indexing interval is configurable through the elasticsearch_index_interval option.
- The number of times the indexing job is retried after failing is configurable through the elasticsearch_index_max_fails option.
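
The poll-to-update model described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: fetchChanges and indexAssets are injected stand-ins rather than ArangoDB or ElasticSearch APIs, the collection name assets is a guess, and treating the max-fails option as a limit on consecutive failures is one possible reading of it.

```javascript
// Hypothetical sketch of one poll-to-update cycle.
// fetchChanges(sinceTick) -> { changes: [...], tick } and indexAssets(changes)
// are injected placeholders, not real ArangoDB or ElasticSearch calls.
function createPoller({ fetchChanges, indexAssets, maxFails }) {
  let lastTick = 0; // position in the change log that has already been synced
  let fails = 0;    // consecutive failed cycles

  // Runs one polling cycle and returns false once the job should give up.
  return function pollOnce() {
    try {
      const { changes, tick } = fetchChanges(lastTick);
      // Only changes that touch asset data are forwarded to the index
      // (the collection name 'assets' is an assumption).
      const assetChanges = changes.filter((c) => c.collection === 'assets');
      if (assetChanges.length > 0) indexAssets(assetChanges);
      lastTick = tick; // advance only after a successful sync
      fails = 0;
    } catch (err) {
      fails += 1; // the same range is retried on the next cycle
    }
    return fails <= maxFails;
  };
}
```

In this sketch the caller would invoke pollOnce on a timer whose period plays the role of the indexing interval, and stop scheduling it once it returns false.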

Brand Fuzzy Search

The /fuzzy endpoint provides a simple full-text fuzzy search over the Chroma taxonomy collections.

How does it work

Fuzzy search works in two stages:

When the service starts, it creates search indexes over the text in the existing collections. Each text string is tokenized into individual words, and each word is normalized (e.g. punctuation is removed). At the same time, a reverse map is built that points back to the original document.
To support partial matches on a string, a set of substrings is also indexed, all pointing to the document that owns the original string.
When a search query is received on the search endpoint, the query string is matched against the indexed text, and a score is produced using cosine similarity and Levenshtein distance. Results above a defined threshold are then returned.
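
The two stages can be sketched roughly as follows. This is not the service's actual code: the function names, the document shape ({ id, text }), the choice of word prefixes as the indexed substrings, and the 0.7 threshold are all assumptions, and the score uses Levenshtein distance only, leaving out the cosine similarity part.

```javascript
// Normalize a word: lowercase it and strip anything that is not a letter or digit.
function normalize(word) {
  return word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, '');
}

// Stage 1: build a reverse map from every prefix of every normalized word
// to the ids of the documents containing that word.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    const words = doc.text.split(/\s+/).map(normalize).filter(Boolean);
    for (const word of words) {
      for (let len = 1; len <= word.length; len++) {
        const key = word.slice(0, len);
        if (!index.has(key)) index.set(key, new Set());
        index.get(key).add(doc.id);
      }
    }
  }
  return index;
}

// Levenshtein distance, computed row by row with dynamic programming.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Stage 2: score every indexed key against the query and keep the documents
// whose best score reaches the threshold, best first.
function search(index, query, threshold = 0.7) {
  const q = normalize(query);
  const hits = new Map(); // document id -> best score seen so far
  for (const [key, ids] of index) {
    const score = 1 - levenshtein(q, key) / Math.max(q.length, key.length);
    if (score < threshold) continue;
    for (const id of ids) hits.set(id, Math.max(score, hits.get(id) || 0));
  }
  return [...hits]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Scanning every key on each query keeps the sketch short; a real index would likely narrow the candidates first.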

How are the results ordered

When the query matches more than one document, the results are sorted in ascending order of the conceptual distance between the indexed string and the document, which is defined as follows:

If the query matches the nth part of the text, e.g. if the query saint matches the text Central Saint Martins, add a distance of n to the conceptual distance. Matching at the start is better than matching in the middle.
If the query matches a word from its start but leaves m characters unmatched at the end, add a distance of m * 0.1 to the conceptual distance. Matching more of a word is better than matching less of it.
If the query matches a word that is a normalized version of the original, add a distance of 0.5 to the conceptual distance.
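
The three rules above could be combined as in the sketch below. It is an illustration under stated assumptions, not the service's implementation: the function names are hypothetical, n is counted from zero so a match on the first word adds nothing, and a word counts as "normalized" when stripping punctuation changed it.

```javascript
// Hypothetical helper: conceptual distance between a query and one text.
// Returns Infinity when no word of the text starts with the query.
function conceptualDistance(query, text) {
  const q = query.toLowerCase();
  let best = Infinity;
  text.split(/\s+/).filter(Boolean).forEach((word, n) => {
    const lower = word.toLowerCase();
    const normalized = lower.replace(/[^\p{L}\p{N}]/gu, '');
    if (!normalized.startsWith(q)) return;
    let distance = n;                                // rule 1: position of the word (zero-based, assumed)
    distance += (normalized.length - q.length) * 0.1; // rule 2: m unmatched trailing characters
    if (normalized !== lower) distance += 0.5;        // rule 3: matched a normalized form
    best = Math.min(best, distance);
  });
  return best;
}

// Sort matching texts in ascending order of conceptual distance.
function rank(query, texts) {
  return texts
    .map((text) => ({ text, distance: conceptualDistance(query, text) }))
    .filter((r) => Number.isFinite(r.distance))
    .sort((a, b) => a.distance - b.distance);
}
```

With these assumptions, a hypothetical call such as rank('saint', ['Central Saint Martins', 'Saint Laurent']) places Saint Laurent first, because the query matches its first word rather than its second.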

Asset full-text search

The /search endpoint is an experimental implementation that searches the asset collection using Arango 3.4 SearchView.

Pre-requisite

This service only works with Arango version 3.4.
