Releases · llm-tools/embedJs

09 Oct 17:30

adhityan

v0.1.4

dea7e1a

v0.1.4

0.1.4 (2024-10-09)

🚀 Features

new doc website (28d918a)
merged conversations with cache (28d918a)

🩹 Fixes

fixed debug message string for createLoaderFromMimeType (abf5901)
remove changelog generation from github release (87abd2b)
remove changelog generation from github release (4aa3f18)
capitalization on contributing.md (0381453)
Adhityan K V


06 Oct 10:16

adhityan

v0.1.3

5d3b526

v0.1.3

0.1.3 (2024-10-06)

🚀 Features

readded local-path and url loaders (303133c)

🩹 Fixes

commit pinecone example package lock file (3d1051e)
exclude examples from release process (1382185)
downgrade esbuild version to match nx requirements (183308f)

❤️ Thank You

Adhityan K V @adhityan

Contributors

adhityan


06 Oct 00:30

adhityan

v0.1.2

c4d4781

v0.1.2

0.1.2 (2024-10-06)

🚀 Features

readded local-path and url loaders (303133c)

🩹 Fixes

exclude examples from release process (1382185)
downgrade esbuild version to match nx requirements (183308f)

❤️ Thank You

Adhityan K V @adhityan

Contributors

adhityan


04 Oct 08:45

adhityan

v0.1.1

afb0d7f

0.1.1 (2024-10-04)

Temporarily disabled the dynamic, url and local path loaders because they required installing all modules from the monorepo. Also temporarily removed access to the Simple_Models enum. Both will be re-enabled soon.


04 Oct 07:45

adhityan

v0.1.0

ddcea24

v0.1.0

0.1.0 (2024-10-03)

This component has been extracted and is now published as part of a workspace monorepo managed by NX. Many reasons prompted this move, but the most critical was to remove the need to install every dependency for a single use case. As we added (and continue to add) more loaders, databases, caches and models, the number of shared dependencies grew a lot. Most projects will not use all these combinations, and it made no sense to have them all installed for everyone. Further, issues in dependent packages raised vulnerabilities that affected all projects, which is clearly something we did not intend.

Now what? Starting with version 0.1.0, we have switched to a monorepo-based approach. All packages will have the same version number, but changelogs and dependencies will be independent. You only need to install the relevant addons (loaders, models, databases, etc.) specific to your use case. Given the shortage of maintainers, we will not be able to support the non-monorepo version of the library beyond critical bugfixes for the next three months, after which the older version will not receive any security fixes. We strongly recommend upgrading to the newer version as soon as you can.

Adhityan K V


01 Jun 19:08

adhityan

0.0.82

5dbde53

Version 0.0.82

A number of important features and bug fixes make it into this release. Here's a rundown of the top new features -

Loader inference

The library can now infer the type of the loader automatically. You can pass a string and it will use the MIME type (detected using magic numbers) and the file extension, if available, to decide which loader to invoke. For example -

.addLoader('https://tesla-info.com/sitemap.xml') // will use sitemap loader
.addLoader('https://en.wikipedia.org/wiki/Tesla,_Inc.') // will use the web loader
.addLoader('s4pVFLUlx8g') // will detect this is a youtube video id and use the video loader
.addLoader('https://lamport.azurewebsites.net/pubs/paxos-simple.pdf') // will use the pdf loader

.addLoader('local/paxos-simple.pdf') // will also use the pdf loader
.addLoader('local/data.csv') // will also use the CSV loader

You can also pass it a local directory name and it will recursively load all files within it using the most appropriate loader. Note: It will skip files it does not have a loader for.
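To make the idea of inference concrete, here is a minimal sketch of the string-based part only. It is not the library's implementation: the `LoaderKind` names, the `inferLoaderKind` function and the 11-character YouTube id rule are assumptions made for illustration, and the sketch skips the magic-number MIME detection that the library performs.

```typescript
// Hypothetical sketch of string-based loader inference (not the library's code).
type LoaderKind = 'Sitemap' | 'Web' | 'YoutubeVideo' | 'Pdf' | 'Csv' | 'Unknown';

// Illustrative subset of extension-to-loader mappings.
const EXTENSION_TO_LOADER = new Map<string, LoaderKind>([
    ['pdf', 'Pdf'],
    ['csv', 'Csv'],
]);

// Returns the lower-cased extension of the last path segment, if it has one.
function getExtension(path: string): string | undefined {
    const lastSegment = path.split('/').pop() ?? '';
    const dot = lastSegment.lastIndexOf('.');
    return dot > 0 ? lastSegment.slice(dot + 1).toLowerCase() : undefined;
}

function inferLoaderKind(input: string): LoaderKind {
    const isUrl = /^https?:\/\//i.test(input);
    // Drop scheme and host (for URLs) plus any query string or fragment.
    const pathPart = input.replace(/^https?:\/\/[^/]*/i, '').split(/[?#]/)[0] ?? '';

    if (isUrl && /sitemap[^/]*\.xml$/i.test(pathPart)) return 'Sitemap';

    const extension = getExtension(pathPart);
    const byExtension = extension ? EXTENSION_TO_LOADER.get(extension) : undefined;
    if (byExtension) return byExtension;

    if (isUrl) return 'Web';

    // Assumption: a YouTube video id is 11 characters from [A-Za-z0-9_-].
    if (/^[A-Za-z0-9_-]{11}$/.test(input)) return 'YoutubeVideo';

    return 'Unknown';
}
```

In this sketch the extension check runs before the generic web fallback, so a URL ending in `.pdf` goes to the PDF loader rather than the web loader, matching the examples above.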

Alternatively, you can now add loaders by passing in an object with the correct parameters without invoking the loader constructor directly. That is -

//Before
.addLoader(new WebLoader({ urlOrContent: 'https://www.biography.com/business-leaders/steve-jobs' }))

//Now
.addLoader({ type: 'Web', urlOrContent: 'https://www.biography.com/business-leaders/steve-jobs' })

This makes for simpler reading and is very consistent across all loaders.
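One way to picture how an object with a `type` field can stand in for a constructor call is a small factory over a discriminated union. The sketch below is an assumption about the general pattern, not the library's code: `SketchWebLoader`, `SketchCsvLoader`, `createLoader` and `addLoader` are hypothetical stand-ins written only for this illustration.

```typescript
// Hypothetical stand-ins for loader classes (not the library's real classes).
interface Loader {
    describe(): string;
}

class SketchWebLoader implements Loader {
    private readonly urlOrContent: string;
    constructor(params: { urlOrContent: string }) {
        this.urlOrContent = params.urlOrContent;
    }
    describe(): string {
        return `Web:${this.urlOrContent}`;
    }
}

class SketchCsvLoader implements Loader {
    private readonly filePathOrUrl: string;
    constructor(params: { filePathOrUrl: string }) {
        this.filePathOrUrl = params.filePathOrUrl;
    }
    describe(): string {
        return `Csv:${this.filePathOrUrl}`;
    }
}

// The `type` field selects which constructor parameters are expected.
type LoaderParams =
    | { type: 'Web'; urlOrContent: string }
    | { type: 'Csv'; filePathOrUrl: string };

function createLoader(params: LoaderParams): Loader {
    switch (params.type) {
        case 'Web':
            return new SketchWebLoader({ urlOrContent: params.urlOrContent });
        case 'Csv':
            return new SketchCsvLoader({ filePathOrUrl: params.filePathOrUrl });
    }
}

// Accepts either a ready-made loader instance or a plain parameter object.
function addLoader(input: Loader | LoaderParams): Loader {
    return 'type' in input ? createLoader(input) : input;
}
```

Because the union is discriminated on `type`, the compiler can check that each object form carries the parameters its loader needs.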

List of added loaders

The library now keeps the list of previously added loaders in its cache. So, you can now get the list of all loaded content even between restarts. This is useful if you want to internalize the state of the RAG application within the library itself.

You can get the list of loaders by calling -

await ragApplication.getLoaders()

The list of added loaders will include all loaders, even those that were implicitly invoked by another loader. To understand this better, let's look at the LocalPathLoader. This loader uses the file system API to scan files and directories. Once it infers the file type, it internally calls other loaders to add and process PPT, CSV, HTML, etc. files. When this happens, the getLoaders() method will give you the list of all loaders including LocalPathLoader, CsvLoader, WebLoader, etc., with metadata around what each loader worked on.

Note: All the data around this is recorded in the cache attached. Therefore this functionality only works when you have a cache set.
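The dependence on a cache can be illustrated with a small sketch. Everything here is hypothetical: `InMemoryCache`, `LoaderRegistry`, `recordLoader` and the `LoaderRecord` shape are invented for this example, the in-memory map stands in for a persistent cache, and the sketch is synchronous whereas the library call shown above is awaited.

```typescript
// Hypothetical record shape for one added loader.
interface LoaderRecord {
    uniqueId: string;
    type: string;
    metadata: Record<string, string>;
}

// Stand-in for a persistent cache; a real one would survive process restarts.
class InMemoryCache {
    private readonly store = new Map<string, string>();
    get(key: string): string | undefined {
        return this.store.get(key);
    }
    set(key: string, value: string): void {
        this.store.set(key, value);
    }
}

const LOADERS_KEY = 'loaders';

class LoaderRegistry {
    private readonly cache: InMemoryCache | undefined;

    constructor(cache?: InMemoryCache) {
        this.cache = cache;
    }

    // Called for every loader, including ones invoked by another loader.
    recordLoader(entry: LoaderRecord): void {
        if (!this.cache) return; // without a cache nothing is remembered
        const others = this.getLoaders().filter((l) => l.uniqueId !== entry.uniqueId);
        this.cache.set(LOADERS_KEY, JSON.stringify([...others, entry]));
    }

    getLoaders(): LoaderRecord[] {
        const raw = this.cache?.get(LOADERS_KEY);
        return raw ? (JSON.parse(raw) as LoaderRecord[]) : [];
    }
}
```

A second `LoaderRegistry` built on the same cache sees the loaders recorded by the first, which is the behaviour described above for restarts. A registry without a cache always returns an empty list.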

CSV Loader

Now you can add CSV files from both local paths and web URLs using the CSV loader. To add a CSV file (or URL) to your embeddings, use CsvLoader. The library will parse the CSV and add each row to its vector database.

.addLoader(new CsvLoader({ filePathOrUrl: '...' }))

Note: You can control how the CsvLoader parses the file in great detail by passing in the optional csvParseOptions constructor parameter.
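As a rough picture of what "each row becomes an entry" can mean, the sketch below turns CSV text into one chunk per row. It is an assumption-laden illustration, not the CsvLoader implementation: the `Chunk` shape, the `csvToChunks` function and the `header: value` text format are hypothetical, and the naive split does not handle quoted fields the way a real CSV parser does.

```typescript
// Hypothetical chunk shape: one piece of text plus where it came from.
interface Chunk {
    pageContent: string;
    metadata: { source: string; row: number };
}

// Naive CSV handling for illustration only: no support for quoted fields.
function csvToChunks(csv: string, source: string, delimiter = ','): Chunk[] {
    const lines = csv.split(/\r?\n/).filter((line) => line.trim().length > 0);
    const headerLine = lines[0];
    if (headerLine === undefined) return [];

    const headers = headerLine.split(delimiter).map((h) => h.trim());

    // Every data row becomes one chunk, labelled with its 1-based row number.
    return lines.slice(1).map((line, index) => {
        const cells = line.split(delimiter).map((c) => c.trim());
        const pageContent = headers.map((h, i) => `${h}: ${cells[i] ?? ''}`).join(', ');
        return { pageContent, metadata: { source, row: index + 1 } };
    });
}
```

Keeping the column names next to each value gives the embedding model some context about what every cell means, which is one plausible reason to embed rows rather than raw lines.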

GitHub workflow

The library now uses GitHub Actions to verify that each PR compiles and builds on Node versions 18, 20 and 22. This check runs automatically on every PR.


18 May 20:34

adhityan

0.0.77

44ce5e2

Version 0.0.77 (Stable)

The library has now reached a stable release. It now supports several LLMs (including HuggingFace, Ollama, OpenAI's GPT 4o and Anthropic among others), many Embedding Models and even more Vector Databases (most popular databases are now supported).

Several critical features were added in this time -

It can detect that a type of document has already been processed and avoid duplicating it in the vector database.
You can automatically filter the embeddings based on a relevance cut-off. This is possible even with databases that do not return a similarity score.
It has extensive debug logging, including the ability to view the raw LLM response.
It returns the sources used in an LLM query.
You can use a variety of caches, including Redis.
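The release notes do not say how the relevance cut-off is applied when a database returns no score. One possible approach, sketched here purely as an assumption, is to recompute cosine similarity locally from the stored vectors and drop anything below the threshold. The names `StoredChunk`, `cosineSimilarity` and `filterByRelevance` are hypothetical.

```typescript
// Hypothetical shape of a retrieved chunk that still carries its vector.
interface StoredChunk {
    text: string;
    vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
    if (a.length !== b.length) throw new Error('Vectors must have the same length');
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        const x = a[i] ?? 0;
        const y = b[i] ?? 0;
        dot += x * y;
        normA += x * x;
        normB += y * y;
    }
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keeps only the chunks whose similarity to the query meets the cut-off.
function filterByRelevance(query: number[], chunks: StoredChunk[], cutOff: number): StoredChunk[] {
    return chunks.filter((chunk) => cosineSimilarity(query, chunk.vector) >= cutOff);
}
```

With a query of `[1, 0]` and a cut-off of `0.5`, a chunk with vector `[1, 1]` (similarity about 0.71) is kept, while one with vector `[0, 1]` (similarity 0) is dropped.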


01 Jul 20:22

adhityan

0.0.7

8dbb920

First alpha release (Pre-release)

The package is open-sourced under the Apache v2 license as of this version.
