Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: fuzzy matching completions #671

Merged
merged 7 commits into from
Dec 9, 2024
Merged

Conversation

jspaezp
Copy link
Contributor

@jspaezp jspaezp commented Oct 23, 2024

Closes #601

What are the key elements of this solution?

  1. I am implementing fuzzy matching for auto completions. The two main cases that I focused on were typos and partial matches.
    • Typos: Since right now a substring prefix is required, any typo prevents suggestions from being displayed ("slect" will not suggest "select"), this PR adds support for that.
    • Partial matches: Since right now startswith is used, if I have a column named "team_sales", typing "sales" is not a match, since it does not share a prefix. This adds support for it.

Why did you design your solution this way? Did you assess any alternatives? Are there tradeoffs?

  1. Some discussion here: Add fuzzy-matching for autocomplete #601 (comment)
  2. I also tested various alternatives using a harnessed snippet with the completions for a local database, where I modified the number of possible characters and 2 was a good tradeoff between the worst case scenario performance and real world use.
  3. Time to generate a completion in my laptop+database is still in the microsecond range but it does seem to be 3/4 times slower in the worst case scenarios.

Does this PR require a change to Harlequin's docs?

  • No.
  • Yes, and I have opened a PR at tconbeer/harlequin-web.
  • Yes; I haven't opened a PR, but the gist of the change is: ...

Did you add or update tests for this change?

  • Yes.
  • No, I believe tests aren't necessary.
    • NOTE: Since the current implementation of the matching is not tested I didn't add it. Let me know if you think its needed and I would be happy to write it.
  • No, I need help with testing this change.

Please complete the following checklist:

  • I have added an entry to CHANGELOG.md, under the [Unreleased] section heading. That entry references the issue closed by this PR.
  • I acknowledge Harlequin's MIT license. I do not own my contribution.

@jspaezp jspaezp marked this pull request as ready for review October 23, 2024 07:33
@tconbeer
Copy link
Owner

This looks great - thank you so much!

As you noted, there aren't unit tests for the completers, but there are functional/snapshot tests here:

async def test_word_autocomplete(

It would be great to tweak that test to test fuzzy matches, in addition to exact matches; in particular I want to make sure that we don't just append the match to the buffer, but replace the input characters. The snapshot tests can be a little bit of a pain, so let me know if you can't figure it out or don't want to, and I can take it from here. Little more info here: https://harlequin.sh/docs/contributing/index#inspecting-and-updating-snapshot-tests

test_word_autocomplete is failing (only on Ubuntu b/c that test is skipped on other platforms) because the matches have changed. That snapshot makes me realize that in this case, the matches are quite a bit worse (analyse instead of select for an input of se)
image

I think to address that, fuzzy matches should be ordered better -- right now they are listed alphabetically; would be good to order fuzzy matches by edit distance, or order all matches by:

  • exact matches first
  • startswith matches
  • fuzzy matches last (ordered ideally by edit distance or some heuristic or maybe just alphabetically)

A final optimization related to the ordering -- it might make sense to only compute fuzzy matches if the length of the list of exact and startswith matches is below some threshold. i.e., no reason to fuzzy match on a single character input that already returns dozens of results.

Let me know if you want to contribute the next iteration here. Thanks so much for getting this started!

@jspaezp
Copy link
Contributor Author

jspaezp commented Oct 28, 2024

I am happy to contribute the new version! (will get it done at some point this week)

@jspaezp
Copy link
Contributor Author

jspaezp commented Nov 2, 2024

Hummm oddly enough the tests hang forever after merging main. I was able to run them before the update.

image

@jspaezp
Copy link
Contributor Author

jspaezp commented Nov 2, 2024

Well in despite the tests being broken for me RN ...

I think to address that, fuzzy matches should be ordered better -- right now they are listed alphabetically; would be good to order fuzzy matches by edit distance, or order all matches by:

exact matches first
startswith matches
fuzzy matches last (ordered ideally by edit distance or some heuristic or maybe just alphabetically)
A final optimization related to the ordering -- it might make sense to only compute fuzzy > matches if the length of the list of exact and startswith matches is below some threshold. i.e., no reason to fuzzy match on a single character input that already returns dozens of results.

I implemented this idea with some minor variations ... I is Exact > starswith > fuzzy and I am sorting the fuzzy matches by length. This results in "more fuzzy" matches generally going last (which I feel is a good approximation). I am also skipping the fuzzy matching if there are already more than 20 direct matches.

Also took the freedom to add an unit test for it. I bundled the completions from a dummy database with just the 'iris' dataset added to it and stored the HarlequinCompletions as json, my only reservation on that is that it is a large-ish file and re-generating it entails some source code hooking/monkeypatching which I have not included in the PR.

LMK what you think and if you have any ideas on how to fix the testing issue.

Full traceback after killing a pytest session python -m pytest --full-trace tests/functional_tests/test_app.py (sorry, my python async is not great so I am having a hard time debugging this one ...)


=================================== test session starts ===================================
platform darwin -- Python 3.12.6, pytest-8.3.2, pluggy-1.5.0
rootdir: /Users/myuser/git/harlequin
configfile: pyproject.toml
plugins: syrupy-4.6.1, asyncio-0.21.2, textual-snapshot-0.4.0
asyncio: mode=Mode.STRICT
collected 49 items                                                                        

tests/functional_tests/test_app.py ^C

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

config = <_pytest.config.Config object at 0x101d86240>
doit = <function _main at 0x101e69620>

    def wrap_session(
        config: Config, doit: Callable[[Config, Session], int | ExitCode | None]
    ) -> int | ExitCode:
        """Skeleton command line program."""
        session = Session.from_config(config)
        session.exitstatus = ExitCode.OK
        initstate = 0
        try:
            try:
                config._do_configure()
                initstate = 1
                config.hook.pytest_sessionstart(session=session)
                initstate = 2
>               session.exitstatus = doit(config, session) or 0

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/main.py:283: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

config = <_pytest.config.Config object at 0x101d86240>
session = <Session  exitstatus=<ExitCode.OK: 0> testsfailed=0 testscollected=49>

    def _main(config: Config, session: Session) -> int | ExitCode | None:
        """Default command line protocol for initialization, session,
        running tests and reporting."""
        config.hook.pytest_collection(session=session)
>       config.hook.pytest_runtestloop(session=session)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/main.py:337: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <HookCaller 'pytest_runtestloop'>
kwargs = {'session': <Session  exitstatus=<ExitCode.OK: 0> testsfailed=0 testscollected=49>
}
firstresult = True

    def __call__(self, **kwargs: object) -> Any:
        """Call the hook.
    
        Only accepts keyword arguments, which should match the hook
        specification.
    
        Returns the result(s) of calling all registered plugins, see
        :ref:`calling`.
        """
        assert (
            not self.is_historic()
        ), "Cannot directly call a historic hook - use call_historic instead."
        self._verify_all_args_are_provided(kwargs)
        firstresult = self.spec.opts.get("firstresult", False) if self.spec else False
        # Copy because plugins may register other plugins during iteration (#438).
>       return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_hooks.py:513: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.config.PytestPluginManager object at 0x100f006e0>
hook_name = 'pytest_runtestloop'
methods = [<HookImpl plugin_name='main', plugin=<module '_pytest.main' from '/Users/sebasti
anpaez/Library/Caches/pypoetry/virtua...t 0x109417470>>, <HookImpl plugin_name='logging-plu
gin', plugin=<_pytest.logging.LoggingPlugin object at 0x109416d80>>]
kwargs = {'session': <Session  exitstatus=<ExitCode.OK: 0> testsfailed=0 testscollected=49>
}
firstresult = True

    def _hookexec(
        self,
        hook_name: str,
        methods: Sequence[HookImpl],
        kwargs: Mapping[str, object],
        firstresult: bool,
    ) -> object | list[object]:
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook_name, methods, kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_manager.py:120: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.logging.LoggingPlugin object at 0x109416d80>
session = <Session  exitstatus=<ExitCode.OK: 0> testsfailed=0 testscollected=49>

    @hookimpl(wrapper=True)
    def pytest_runtestloop(self, session: Session) -> Generator[None, object, object]:
        if session.config.option.collectonly:
            return (yield)
    
        if self._log_cli_enabled() and self._config.getoption("verbose") < 1:
            # The verbose flag is needed to avoid messy test progress output.
            self._config.option.verbose = 1
    
        with catching_logs(self.log_cli_handler, level=self.log_cli_level):
            with catching_logs(self.log_file_handler, level=self.log_file_level):
>               return (yield)  # Run all the tests.

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/logging.py:805: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.terminal.TerminalReporter object at 0x109417470>

    @hookimpl(wrapper=True)
    def pytest_runtestloop(self) -> Generator[None, object, object]:
>       result = yield

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/terminal.py:673: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

session = <Session  exitstatus=<ExitCode.OK: 0> testsfailed=0 testscollected=49>

    def pytest_runtestloop(session: Session) -> bool:
        if session.testsfailed and not session.config.option.continue_on_collection_errors:
            raise session.Interrupted(
                "%d error%s during collection"
                % (session.testsfailed, "s" if session.testsfailed != 1 else "")
            )
    
        if session.config.option.collectonly:
            return True
    
        for i, item in enumerate(session.items):
            nextitem = session.items[i + 1] if i + 1 < len(session.items) else None
>           item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/main.py:362: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <HookCaller 'pytest_runtest_protocol'>
kwargs = {'item': <Function test_select_1[duckdb]>, 'nextitem': <Function test_select_1[sql
ite]>}
firstresult = True

    def __call__(self, **kwargs: object) -> Any:
        """Call the hook.
    
        Only accepts keyword arguments, which should match the hook
        specification.
    
        Returns the result(s) of calling all registered plugins, see
        :ref:`calling`.
        """
        assert (
            not self.is_historic()
        ), "Cannot directly call a historic hook - use call_historic instead."
        self._verify_all_args_are_provided(kwargs)
        firstresult = self.spec.opts.get("firstresult", False) if self.spec else False
        # Copy because plugins may register other plugins during iteration (#438).
>       return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_hooks.py:513: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.config.PytestPluginManager object at 0x100f006e0>
hook_name = 'pytest_runtest_protocol'
methods = [<HookImpl plugin_name='runner', plugin=<module '_pytest.runner' from '/Users/seb
astianpaez/Library/Caches/pypoetry/vi...paez/Library/Caches/pypoetry/virtualenvs/harlequin-
FuiG_xwU-py3.12/lib/python3.12/site-packages/_pytest/warnings.py'>>]
kwargs = {'item': <Function test_select_1[duckdb]>, 'nextitem': <Function test_select_1[sql
ite]>}
firstresult = True

    def _hookexec(
        self,
        hook_name: str,
        methods: Sequence[HookImpl],
        kwargs: Mapping[str, object],
        firstresult: bool,
    ) -> object | list[object]:
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook_name, methods, kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_manager.py:120: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    @pytest.hookimpl(wrapper=True, tryfirst=True)
    def pytest_runtest_protocol(item: Item) -> Generator[None, object, object]:
        with catch_warnings_for_item(
            config=item.config, ihook=item.ihook, when="runtest", item=item
        ):
>           return (yield)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/warnings.py:112: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    @hookimpl(wrapper=True, tryfirst=True)
    def pytest_runtest_protocol(item: Item) -> Generator[None, object, object]:
        """Setup the pytest_assertrepr_compare and pytest_assertion_pass hooks.
    
        The rewrite module will use util._reprcompare if it exists to use custom
        reporting via the pytest_assertrepr_compare hook.  This sets up this custom
        comparison for the test.
        """
        ihook = item.ihook
    
        def callbinrepr(op, left: object, right: object) -> str | None:
            """Call the pytest_assertrepr_compare hook and prepare the result.
    
            This uses the first result from the hook and then ensures the
            following:
            * Overly verbose explanations are truncated unless configured otherwise
              (eg. if running in verbose mode).
            * Embedded newlines are escaped to help util.format_explanation()
              later.
            * If the rewrite mode is used embedded %-characters are replaced
              to protect later % formatting.
    
            The result can be formatted by util.format_explanation() for
            pretty printing.
            """
            hook_result = ihook.pytest_assertrepr_compare(
                config=item.config, op=op, left=left, right=right
            )
            for new_expl in hook_result:
                if new_expl:
                    new_expl = truncate.truncate_if_required(new_expl, item)
                    new_expl = [line.replace("\n", "\\n") for line in new_expl]
                    res = "\n~".join(new_expl)
                    if item.config.getvalue("assertmode") == "rewrite":
                        res = res.replace("%", "%%")
                    return res
            return None
    
        saved_assert_hooks = util._reprcompare, util._assertion_pass
        util._reprcompare = callbinrepr
        util._config = item.config
    
        if ihook.pytest_assertion_pass.get_hookimpls():
    
            def call_assertion_pass_hook(lineno: int, orig: str, expl: str) -> None:
                ihook.pytest_assertion_pass(item=item, lineno=lineno, orig=orig, expl=expl)
    
            util._assertion_pass = call_assertion_pass_hook
    
        try:
>           return (yield)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/assertion/__init__.py:176: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    @hookimpl(wrapper=True)
    def pytest_runtest_protocol(item: Item) -> Generator[None, object, object]:
        if isinstance(item, TestCaseFunction) and "twisted.trial.unittest" in sys.modules:
            ut: Any = sys.modules["twisted.python.failure"]
            global classImplements_has_run
            Failure__init__ = ut.Failure.__init__
            if not classImplements_has_run:
                from twisted.trial.itrial import IReporter
                from zope.interface import classImplements
    
                classImplements(TestCaseFunction, IReporter)
                classImplements_has_run = True
    
            def excstore(
                self, exc_value=None, exc_type=None, exc_tb=None, captureVars=None
            ):
                if exc_value is None:
                    self._rawexcinfo = sys.exc_info()
                else:
                    if exc_type is None:
                        exc_type = type(exc_value)
                    self._rawexcinfo = (exc_type, exc_value, exc_tb)
                try:
                    Failure__init__(
                        self, exc_value, exc_type, exc_tb, captureVars=captureVars
                    )
                except TypeError:
                    Failure__init__(self, exc_value, exc_type, exc_tb)
    
            ut.Failure.__init__ = excstore
            try:
                res = yield
            finally:
                ut.Failure.__init__ = Failure__init__
        else:
>           res = yield

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/unittest.py:429: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    @pytest.hookimpl(wrapper=True, trylast=True)
    def pytest_runtest_protocol(item: Item) -> Generator[None, object, object]:
        timeout = get_timeout_config_value(item.config)
        if timeout > 0:
            import faulthandler
    
            stderr = item.config.stash[fault_handler_stderr_fd_key]
            faulthandler.dump_traceback_later(timeout, file=stderr)
            try:
                return (yield)
            finally:
                faulthandler.cancel_dump_traceback_later()
        else:
>           return (yield)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/faulthandler.py:87: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>, nextitem = <Function test_select_1[sqlite]>

    def pytest_runtest_protocol(item: Item, nextitem: Item | None) -> bool:
        ihook = item.ihook
        ihook.pytest_runtest_logstart(nodeid=item.nodeid, location=item.location)
>       runtestprotocol(item, nextitem=nextitem)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:113: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>, log = True
nextitem = <Function test_select_1[sqlite]>

    def runtestprotocol(
        item: Item, log: bool = True, nextitem: Item | None = None
    ) -> list[TestReport]:
        hasrequest = hasattr(item, "_request")
        if hasrequest and not item._request:  # type: ignore[attr-defined]
            # This only happens if the item is re-run, as is done by
            # pytest-rerunfailures.
            item._initrequest()  # type: ignore[attr-defined]
        rep = call_and_report(item, "setup", log)
        reports = [rep]
        if rep.passed:
            if item.config.getoption("setupshow", False):
                show_test_item(item)
            if not item.config.getoption("setuponly", False):
>               reports.append(call_and_report(item, "call", log))

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:132: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>, when = 'call', log = True, kwds = {}
ihook = <_pytest.config.compat.PathAwareHookProxy object at 0x101f1ee10>
reraise = (<class '_pytest.outcomes.Exit'>, <class 'KeyboardInterrupt'>)

    def call_and_report(
        item: Item, when: Literal["setup", "call", "teardown"], log: bool = True, **kwds
    ) -> TestReport:
        ihook = item.ihook
        if when == "setup":
            runtest_hook: Callable[..., None] = ihook.pytest_runtest_setup
        elif when == "call":
            runtest_hook = ihook.pytest_runtest_call
        elif when == "teardown":
            runtest_hook = ihook.pytest_runtest_teardown
        else:
            assert False, f"Unhandled runtest hook case: {when}"
        reraise: tuple[type[BaseException], ...] = (Exit,)
        if not item.config.getoption("usepdb", False):
            reraise += (KeyboardInterrupt,)
>       call = CallInfo.from_call(
            lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
        )

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:241: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class '_pytest.runner.CallInfo'>
func = <function call_and_report.<locals>.<lambda> at 0x109475800>, when = 'call'
reraise = (<class '_pytest.outcomes.Exit'>, <class 'KeyboardInterrupt'>)

    @classmethod
    def from_call(
        cls,
        func: Callable[[], TResult],
        when: Literal["collect", "setup", "call", "teardown"],
        reraise: type[BaseException] | tuple[type[BaseException], ...] | None = None,
    ) -> CallInfo[TResult]:
        """Call func, wrapping the result in a CallInfo.
    
        :param func:
            The function to call. Called without arguments.
        :type func: Callable[[], _pytest.runner.TResult]
        :param when:
            The phase in which the function is called.
        :param reraise:
            Exception or exceptions that shall propagate if raised by the
            function, instead of being wrapped in the CallInfo.
        """
        excinfo = None
        start = timing.time()
        precise_start = timing.perf_counter()
        try:
>           result: TResult | None = func()

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:341: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>       lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
    )

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:242: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <HookCaller 'pytest_runtest_call'>
kwargs = {'item': <Function test_select_1[duckdb]>}, firstresult = False

    def __call__(self, **kwargs: object) -> Any:
        """Call the hook.
    
        Only accepts keyword arguments, which should match the hook
        specification.
    
        Returns the result(s) of calling all registered plugins, see
        :ref:`calling`.
        """
        assert (
            not self.is_historic()
        ), "Cannot directly call a historic hook - use call_historic instead."
        self._verify_all_args_are_provided(kwargs)
        firstresult = self.spec.opts.get("firstresult", False) if self.spec else False
        # Copy because plugins may register other plugins during iteration (#438).
>       return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_hooks.py:513: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.config.PytestPluginManager object at 0x100f006e0>
hook_name = 'pytest_runtest_call'
methods = [<HookImpl plugin_name='runner', plugin=<module '_pytest.runner' from '/Users/seb
astianpaez/Library/Caches/pypoetry/vi...brary/Caches/pypoetry/virtualenvs/harlequin-FuiG_xw
U-py3.12/lib/python3.12/site-packages/_pytest/threadexception.py'>>]
kwargs = {'item': <Function test_select_1[duckdb]>}, firstresult = False

    def _hookexec(
        self,
        hook_name: str,
        methods: Sequence[HookImpl],
        kwargs: Mapping[str, object],
        firstresult: bool,
    ) -> object | list[object]:
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook_name, methods, kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_manager.py:120: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    @pytest.hookimpl(wrapper=True, tryfirst=True)
    def pytest_runtest_call() -> Generator[None, None, None]:
>       yield from thread_exception_runtest_hook()

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/threadexception.py:92: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def thread_exception_runtest_hook() -> Generator[None, None, None]:
        with catch_threading_exception() as cm:
            try:
>               yield

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/threadexception.py:68: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    @pytest.hookimpl(wrapper=True, tryfirst=True)
    def pytest_runtest_call() -> Generator[None, None, None]:
>       yield from unraisable_exception_runtest_hook()

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/unraisableexception.py:95: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def unraisable_exception_runtest_hook() -> Generator[None, None, None]:
        with catch_unraisable_exception() as cm:
            try:
>               yield

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/unraisableexception.py:70: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.logging.LoggingPlugin object at 0x109416d80>
item = <Function test_select_1[duckdb]>

    @hookimpl(wrapper=True)
    def pytest_runtest_call(self, item: nodes.Item) -> Generator[None, None, None]:
        self.log_cli_handler.set_when("call")
    
>       yield from self._runtest_for(item, "call")

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/logging.py:848: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.logging.LoggingPlugin object at 0x109416d80>
item = <Function test_select_1[duckdb]>, when = 'call'

    def _runtest_for(self, item: nodes.Item, when: str) -> Generator[None, None, None]:
        """Implement the internals of the pytest_runtest_xxx() hooks."""
        with catching_logs(
            self.caplog_handler,
            level=self.log_level,
        ) as caplog_handler, catching_logs(
            self.report_handler,
            level=self.log_level,
        ) as report_handler:
            caplog_handler.reset()
            report_handler.reset()
            item.stash[caplog_records_key][when] = caplog_handler.records
            item.stash[caplog_handler_key] = caplog_handler
    
            try:
>               yield

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/logging.py:831: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <CaptureManager _method='fd' _global_capturing=None _capture_fixture=None>
item = <Function test_select_1[duckdb]>

    @hookimpl(wrapper=True)
    def pytest_runtest_call(self, item: Item) -> Generator[None, None, None]:
        with self.item_capture("call", item):
>           return (yield)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/capture.py:879: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    @hookimpl(wrapper=True)
    def pytest_runtest_call(item: Item) -> Generator[None, None, None]:
        xfailed = item.stash.get(xfailed_key, None)
        if xfailed is None:
            item.stash[xfailed_key] = xfailed = evaluate_xfail_marks(item)
    
        if xfailed and not item.config.option.runxfail and not xfailed.run:
            xfail("[NOTRUN] " + xfailed.reason)
    
        try:
>           return (yield)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/skipping.py:257: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

item = <Function test_select_1[duckdb]>

    def pytest_runtest_call(item: Item) -> None:
        _update_current_test_var(item, "call")
        try:
            del sys.last_type
            del sys.last_value
            del sys.last_traceback
            if sys.version_info >= (3, 12, 0):
                del sys.last_exc
        except AttributeError:
            pass
        try:
>           item.runtest()

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/runner.py:174: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Function test_select_1[duckdb]>

    def runtest(self) -> None:
        """Execute the underlying test function."""
>       self.ihook.pytest_pyfunc_call(pyfuncitem=self)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/python.py:1627: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <HookCaller 'pytest_pyfunc_call'>
kwargs = {'pyfuncitem': <Function test_select_1[duckdb]>}, firstresult = True

    def __call__(self, **kwargs: object) -> Any:
        """Call the hook.
    
        Only accepts keyword arguments, which should match the hook
        specification.
    
        Returns the result(s) of calling all registered plugins, see
        :ref:`calling`.
        """
        assert (
            not self.is_historic()
        ), "Cannot directly call a historic hook - use call_historic instead."
        self._verify_all_args_are_provided(kwargs)
        firstresult = self.spec.opts.get("firstresult", False) if self.spec else False
        # Copy because plugins may register other plugins during iteration (#438).
>       return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_hooks.py:513: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_pytest.config.PytestPluginManager object at 0x100f006e0>
hook_name = 'pytest_pyfunc_call'
methods = [<HookImpl plugin_name='python', plugin=<module '_pytest.python' from '/Users/seb
astianpaez/Library/Caches/pypoetry/vi...Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_
xwU-py3.12/lib/python3.12/site-packages/pytest_asyncio/plugin.py'>>]
kwargs = {'pyfuncitem': <Function test_select_1[duckdb]>}, firstresult = True

    def _hookexec(
        self,
        hook_name: str,
        methods: Sequence[HookImpl],
        kwargs: Mapping[str, object],
        firstresult: bool,
    ) -> object | list[object]:
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook_name, methods, kwargs, firstresult)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pluggy/_manager.py:120: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

pyfuncitem = <Function test_select_1[duckdb]>

    @hookimpl(trylast=True)
    def pytest_pyfunc_call(pyfuncitem: Function) -> object | None:
        testfunction = pyfuncitem.obj
        if is_async_function(testfunction):
            async_warn_and_skip(pyfuncitem.nodeid)
        funcargs = pyfuncitem.funcargs
        testargs = {arg: funcargs[arg] for arg in pyfuncitem._fixtureinfo.argnames}
>       result = testfunction(**testargs)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/_pytest/python.py:159: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = ()
kwargs = {'app_all_adapters': Harlequin(title='Harlequin', classes={'-dark-mode'}), 'app_sn
apshot': <function app_snapshot.<loc...are at 0x109544e00>, 'wait_for_workers': <function w
ait_for_workers.<locals>.wait_for_filtered_workers at 0x109544d60>}
coro = <coroutine object test_select_1 at 0x10946c7c0>
task = <Task pending name='Task-1' coro=<test_select_1() running at /Users/myuser/gi
t/harlequin/tests/functional_tests/test_app.py:32> wait_for=<Future pending cb=[Task.task_w
akeup()]>>

    @functools.wraps(func)
    def inner(*args, **kwargs):
        coro = func(*args, **kwargs)
        if not inspect.isawaitable(coro):
            pyfuncitem.warn(
                pytest.PytestWarning(
                    f"The test {pyfuncitem} is marked with '@pytest.mark.asyncio' "
                    "but it is not an async function. "
                    "Please remove asyncio marker. "
                    "If the test is not marked explicitly, "
                    "check for global markers applied via 'pytestmark'."
                )
            )
            return
        task = asyncio.ensure_future(coro, loop=_loop)
        try:
>           _loop.run_until_complete(task)

../../Library/Caches/pypoetry/virtualenvs/harlequin-FuiG_xwU-py3.12/lib/python3.12/site-pac
kages/pytest_asyncio/plugin.py:529: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task pending name='Task-1' coro=<test_select_1() running at /Users/myuser/
git/harlequin/tests/functional_tests/test_app.py:32> wait_for=<Future pending cb=[Task.task
_wakeup()]>>

    def run_until_complete(self, future):
        """Run until the Future is done.
    
        If the argument is a coroutine, it is wrapped in a Task.
    
        WARNING: It would be disastrous to call run_until_complete()
        with the same coroutine twice -- it would wrap it in two
        different Tasks and that can't be good.
    
        Return the Future's result, or raise its exception.
        """
        self._check_closed()
        self._check_running()
    
        new_task = not futures.isfuture(future)
        future = tasks.ensure_future(future, loop=self)
        if new_task:
            # An exception is raised if the future didn't complete, so there
            # is no need to log the "destroy pending task" message
            future._log_destroy_pending = False
    
        future.add_done_callback(_run_until_complete_cb)
        try:
>           self.run_forever()

/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/pytho
n3.12/asyncio/base_events.py:674: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_UnixSelectorEventLoop running=False closed=False debug=False>

    def run_forever(self):
        """Run until stop() is called."""
        self._check_closed()
        self._check_running()
        self._set_coroutine_origin_tracking(self._debug)
    
        old_agen_hooks = sys.get_asyncgen_hooks()
        try:
            self._thread_id = threading.get_ident()
            sys.set_asyncgen_hooks(firstiter=self._asyncgen_firstiter_hook,
                                   finalizer=self._asyncgen_finalizer_hook)
    
            events._set_running_loop(self)
            while True:
>               self._run_once()

/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/pytho
n3.12/asyncio/base_events.py:641: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <_UnixSelectorEventLoop running=False closed=False debug=False>

    def _run_once(self):
        """Run one full iteration of the event loop.
    
        This calls all currently ready callbacks, polls for I/O,
        schedules the resulting callbacks, and finally schedules
        'call_later' callbacks.
        """
    
        sched_count = len(self._scheduled)
        if (sched_count > _MIN_SCHEDULED_TIMER_HANDLES and
            self._timer_cancelled_count / sched_count >
                _MIN_CANCELLED_TIMER_HANDLES_FRACTION):
            # Remove delayed calls that were cancelled if their number
            # is too high
            new_scheduled = []
            for handle in self._scheduled:
                if handle._cancelled:
                    handle._scheduled = False
                else:
                    new_scheduled.append(handle)
    
            heapq.heapify(new_scheduled)
            self._scheduled = new_scheduled
            self._timer_cancelled_count = 0
        else:
            # Remove delayed calls that were cancelled from head of queue.
            while self._scheduled and self._scheduled[0]._cancelled:
                self._timer_cancelled_count -= 1
                handle = heapq.heappop(self._scheduled)
                handle._scheduled = False
    
        timeout = None
        if self._ready or self._stopping:
            timeout = 0
        elif self._scheduled:
            # Compute the desired timeout.
            when = self._scheduled[0]._when
            timeout = min(max(0, when - self.time()), MAXIMUM_SELECT_TIMEOUT)
    
>       event_list = self._selector.select(timeout)

/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/pytho
n3.12/asyncio/base_events.py:1948: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <selectors.KqueueSelector object at 0x1094ad7f0>, timeout = 0.019949874840676785

    def select(self, timeout=None):
        timeout = None if timeout is None else max(timeout, 0)
        # If max_ev is 0, kqueue will ignore the timeout. For consistent
        # behavior with the other selector classes, we prevent that here
        # (using max). See https://bugs.python.org/issue29255
        max_ev = self._max_events or 1
        ready = []
        try:
>           kev_list = self._selector.control(None, max_ev, timeout)
E           KeyboardInterrupt

/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/pytho
n3.12/selectors.py:566: KeyboardInterrupt
================================= no tests ran in 29.92s ==================================


Copy link
Owner

@tconbeer tconbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a typo in your ordering, otherwise this looks good to me.

re: tests, you probably need to run poetry install --sync, since a lot of dependencies were updated with the latest commits.

I'll take the unit test, although it would be better if the .json was smaller (like a dozen or two values) and therefore easier to maintain and tweak to test the exact behavior you're looking for.

# Sort in ascending length.
# I am assuming here that more insertions are less likely to be
# the "right" match.
matches.sort(key=lambda c: len(c.match_val), reverse=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
matches.sort(key=lambda c: len(c.match_val), reverse=True)
matches.sort(key=lambda c: len(c.match_val))

From the snapshot tests, I think adding the reverse causes the longest fuzzy matches to be listed first, which is the opposite of what you want.
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@jspaezp
Copy link
Contributor Author

jspaezp commented Nov 5, 2024

After nuking the env it works fine again!

571ad25 Adds the review suggestions and updates the .ambr files.
Trimmed the json ... left ~100 completions and pretty printed it.

Also noticed it now shows a "Tx: Auto" in several of the snapshots. Which I don't believe are related with my PR.

image

Thanks a lot for the feedback!

@tconbeer
Copy link
Owner

tconbeer commented Nov 5, 2024

The Tx: Auto is added when using python 3.12+ with SQLite. For developing and testing I use 3.8 (soon 3.9)

@jspaezp
Copy link
Contributor Author

jspaezp commented Dec 2, 2024

@tconbeer I Updated the snapshots to be on py 3.10 (<=3.9 does not install in m1 macs ... no version of psycopg-binary available)

@jspaezp jspaezp requested a review from tconbeer December 2, 2024 17:00
@tconbeer
Copy link
Owner

tconbeer commented Dec 3, 2024

the last few failures are from snapshots that aren't running on macs; I can help clean up those final snapshots in the next few days

@tconbeer
Copy link
Owner

tconbeer commented Dec 9, 2024

Remaining failures appear to be due to a sqlite3 version mismatch (causing the completers to return a different number of keywords). ("works on my machine"). Going to ship this. Thanks so much for your contribution, @jspaezp !!

@tconbeer tconbeer merged commit 434775d into tconbeer:main Dec 9, 2024
8 of 12 checks passed
@jspaezp
Copy link
Contributor Author

jspaezp commented Dec 9, 2024

Wohoo! Thank you very much for helping throughout the process (and for the amazing project)! It has been an absolute pleasure @tconbeer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add fuzzy-matching for autocomplete
2 participants