Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support running ragdaemon in non-git directories and adding ignored files to context #8

Merged
merged 4 commits into from
Apr 17, 2024

Conversation

granawkins
Copy link
Member

No description provided.

from ragdaemon.graph import KnowledgeGraph
from ragdaemon.errors import RagdaemonError
from ragdaemon.utils import get_document, hash_str, parse_path_ref
from ragdaemon.utils import get_document, hash_str, parse_path_ref, truncate
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function could benefit from a brief docstring explaining its purpose, inputs, and outputs for better code maintainability.

@@ -100,8 +76,9 @@ def files_checksum(cwd: Path, ignore_patterns: list[str] = []) -> str:
class Hierarchy(Annotator):
name = "hierarchy"

def __init__(self, *args, ignore_patterns: list[str] = [], **kwargs):
self.ignore_patterns = ignore_patterns
def __init__(self, *args, ignore_patterns: set[Path] = set(), **kwargs):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see the use of type hints for better code clarity. Consider adding a brief docstring to the __init__ method to explain the purpose of ignore_patterns and its expected format.

@@ -44,7 +45,15 @@ def __init__(self, graph: KnowledgeGraph, db: Database, verbose: bool = False):
def _add_path(self, path_str: str):
"""Create a new record in the context for the given path."""
if path_str not in self.graph: # e.g. deleted file
document = ""
try:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider handling the case where get_document might raise an exception other than FileNotFoundError, ensuring the daemon's robustness.

@@ -101,13 +101,13 @@ async def update(self, refresh=False):

async def watch(self, interval=2, debounce=5):
"""Calls self.update interval debounce seconds after a file is modified."""
git_paths = get_non_gitignored_files(self.cwd)
paths = get_paths_for_directory(self.cwd)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The watch method's implementation is efficient. However, adding a comment to explain the debounce logic could improve readability.

return False


def get_paths_for_directory(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is well-implemented for fetching directory paths with specific patterns. Adding a docstring would enhance its readability and usability.

@@ -118,3 +82,17 @@ def get_document(ref: str, cwd: Path, type: str = "file") -> str:
raise RagdaemonError(f"Invalid type: {type}")

return f"{ref}\n{text}"


def truncate(document, model: str, max_tokens: int) -> tuple[str, float]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncate function is a crucial addition for handling large documents. A docstring explaining its parameters (document, model, max_tokens) and return values would be beneficial.

from ragdaemon.get_paths import get_paths_for_directory, get_git_root_for_path


def test_get_paths_for_directory_git(cwd):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests effectively cover the functionality of get_paths_for_directory. Consider adding a test case for the exclude_patterns parameter to ensure it works as expected.

Copy link

mentatbot bot commented Apr 17, 2024

MENTAT CODE REVIEW IN ACTIVE DEVELOPMENT. Only in use on mentat and internal repos.
Please Reply with feedback.

This pull request introduces several key improvements and features, such as supporting non-git directories and handling ignored files in the context. The changes are well-structured, and the new get_paths_for_directory function enhances the project's flexibility. Including docstrings for new functions and handling potential exceptions more broadly could further improve the codebase. Overall, this PR is a solid contribution to the project.

@granawkins granawkins merged commit 31e8f52 into main Apr 17, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant