Maximum context length exceeded #95
Actually, it seems like the error is happening when I load resources. Here is how I am loading the resources:

```javascript
const loadResources = async (ragApplication, messages) => {
  console.log("RAG Application:", ragApplication);
  const loaderSummaries = await Promise.all(
    messages.map(async (message) => {
      console.log("Adding loader for:", message.subject);
      const loaderSummary = await ragApplication.addLoader(
        new JsonLoader({ object: message })
      );
      return loaderSummary;
    })
  );
  console.log(
    "\nLoader summaries:\n",
    loaderSummaries.map((summary) => JSON.stringify(summary)).join("\n")
  );
  return loaderSummaries;
};
```

The final console log is never called, so the error must be triggered during the loading step. Adding a stack trace in case that's useful:
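Since `Promise.all` rejects on the first failure without saying which input caused it, one way to narrow this down is to load the messages sequentially and log the failing one. This is only a debugging sketch, not part of the embedJS API; `makeLoader` is a hypothetical factory, e.g. `(message) => new JsonLoader({ object: message })`:

```javascript
// Debugging sketch (assumption: not embedJS API): load messages one at a
// time so the message that trips the context-length error is identified.
async function loadResourcesSequentially(ragApplication, messages, makeLoader) {
  const summaries = [];
  for (const message of messages) {
    try {
      const summary = await ragApplication.addLoader(makeLoader(message));
      summaries.push(summary);
    } catch (err) {
      // Surfaces exactly which email blew past the model's limit.
      console.error(`addLoader failed for subject "${message.subject}":`, err.message);
      throw err;
    }
  }
  return summaries;
}
```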
I think I'm zeroing in on the issue. I'm not exceeding the context limit for the main model, but for the embedding model. This implied to me that the preprocessing step doesn't apply to the embedding process. I'll keep poking.
Hey @ashryanbeats, yes - the preprocessing is not done for the embeddings. For embeddings, it's either all or nothing right now. The library usually breaks the loaded content into smaller chunks, but that is not done for the JSON loader. I am thinking we should have it auto-break JSON into smaller embedding documents if the text is too large. But what chunking strategy to use needs more thought.
I have thought about this and discussed with other library maintainers for similar projects in other languages. I think the best strategy is to break the JSON at the application end outside the library. But if you have more thoughts, let's open a discussion thread on this. |
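As a rough illustration of that application-side approach, here is a minimal sketch of splitting an oversized message before handing it to the library. The `subject`/`body` field names are assumptions about the email shape, and the character budget is only a crude stand-in for a real token count:

```javascript
// Hedged sketch of application-side chunking: split a message's body into
// pieces small enough for the embedding model before creating loaders.
// maxChars is a rough proxy for tokens (~4 chars/token for English text).
function chunkMessage(message, maxChars = 8000) {
  const { subject, body } = message;
  if (!body || body.length <= maxChars) return [message];
  const chunks = [];
  for (let i = 0; i < body.length; i += maxChars) {
    chunks.push({
      subject,
      body: body.slice(i, i + maxChars),
      part: chunks.length + 1, // 1-based chunk index
    });
  }
  return chunks;
}
```

Each chunk could then go through `addLoader` as before, e.g. `messages.flatMap((m) => chunkMessage(m))`.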
Background
From the readme:
Issue
I'm not finding that this preprocessing step is happening. I've run into these context length errors on both GPT 3.5 Turbo and GPT 4o.
I understand I can do these workarounds:
`setSearchResultCount()`
But in order to take advantage of the described preprocessor, is there something specific I need to do?
My setup
I can confirm my setup works when I load less data. Essentially, I'm loading sanitized emails as JSON objects. With 5 emails loaded, it's fine. With 10 emails loaded, I'm hitting the token limit.
My builder setup:
My loader:
My EmbedJS version:
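As a quick sanity check on the 5-vs-10 email observation above, a rough token estimate can show how close a batch gets to the embedding model's limit. This assumes ~4 characters per token, which is only a heuristic for English text; an exact count needs the model's tokenizer (e.g. tiktoken):

```javascript
// Rough estimate only (assumption: ~4 characters per token; this is a
// heuristic, not a real tokenizer count).
function estimateTokens(messages) {
  return messages.reduce(
    (sum, message) => sum + Math.ceil(JSON.stringify(message).length / 4),
    0
  );
}
```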