Troubleshoot 5 #46

Merged
merged 2 commits into from
May 20, 2024

Conversation

granawkins
Member

No description provided.

@mentatai mentatai bot left a comment

The changes mostly bump the version and add detailed comments that clarify the validation process, particularly the handling of invalid parent chunks. Keep code readability and maintainability in mind when adding long comments.

# we'd rather accept the incorrect ones (they'll be linked to BASE later on)
# so we can bypass this check by passing file_chunks=None.
if file_chunks is not None:
"""
When using multiline comments in Python, it's best to handle them as triple-quoted strings for better readability and maintainability.

Suggested change
"""
"""
The LLM sometimes returns invalid parents (i.e. path/to/file.ext:parent.chunk).
There are three reasons they might be invalid:
A) The LLM made a typo here. In that case, return False to try again.
B) The LLM made a typo when parsing the parent in a previous batch. In that case,
go back and redo the previous batch. We distinguish this from Case A) by checking
whether multiple chunks reference the same invalid parent.
C) An edge case where our schema breaks down, e.g. JavaScript event handlers
usually try to set "document" as their parent, but that won't be a node.
Case A) should be resolved by Spice's validator loop, i.e. this function returning
"False". For Case B), raise a special exception and step back one batch in the
chunk_document loop. Any chunks still referencing invalid parents after these two
loops are exhausted (including Case C) will just be accepted and linked to
path/to/file.ext:BASE.
"""
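
For readers skimming the thread, the three cases in the docstring above could be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `check_parents`, `link_invalid_to_base`, `RedoPreviousBatch`, and the `retries_left` counter are hypothetical names, and the real validator in the repository may track the two retry loops separately.

```python
from collections import Counter

BASE = "BASE"  # assumed name of the per-file root node, as in path/to/file.ext:BASE


class RedoPreviousBatch(Exception):
    """Hypothetical signal for Case B: step back one batch and redo it."""


def check_parents(batch, known_ids, retries_left):
    """Return True to accept the batch, False to ask the LLM to try again.

    batch: list of (chunk_id, parent_id) pairs from the current batch.
    known_ids: ids of nodes accepted in earlier batches.
    retries_left: simplified stand-in for the remaining retry budget.
    """
    # A parent is valid if it is an earlier node or a chunk in this batch.
    valid = set(known_ids) | {chunk_id for chunk_id, _ in batch}
    invalid = Counter(parent for _, parent in batch if parent not in valid)
    if not invalid:
        return True
    if retries_left <= 0:
        # Retries exhausted (includes Case C): accept; the caller links to BASE.
        return True
    if max(invalid.values()) > 1:
        # Case B: several chunks share one invalid parent, so the typo
        # probably happened when that parent was parsed in an earlier batch.
        raise RedoPreviousBatch(sorted(invalid))
    # Case A: an isolated typo in this batch; returning False triggers a retry.
    return False


def link_invalid_to_base(batch, known_ids, path):
    """Replace every parent that is still invalid with path:BASE."""
    valid = set(known_ids) | {chunk_id for chunk_id, _ in batch}
    base_id = f"{path}:{BASE}"
    return [(cid, parent if parent in valid else base_id) for cid, parent in batch]
```

In this sketch the caller would catch `RedoPreviousBatch` in its batching loop, and call `link_invalid_to_base` only once `check_parents` has accepted a batch with no retries left.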

@granawkins granawkins merged commit f533b5e into main May 20, 2024
3 checks passed