Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup search #28

Open
1 of 3 tasks
azimut opened this issue Mar 29, 2024 · 9 comments
Open
1 of 3 tasks

speedup search #28

azimut opened this issue Mar 29, 2024 · 9 comments

Comments

@azimut
Copy link
Owner

azimut commented Mar 29, 2024

  • try thread pool provided by wasm library
    If you use only a single connection, there is no difference between the two versions.
  • see if I can return a list of promises, instead of a promise of a list, so I can maybe show search results as I receive them
  • add pagination
@azimut
Copy link
Owner Author

azimut commented Mar 30, 2024

Searching "Haskell" today ends up in 2.5Mb downloaded. Which includes in a LOT of non requested fields (content? description?). I guess it does scanning.

@azimut
Copy link
Owner Author

azimut commented Mar 30, 2024

sqlite> SELECT entries.feedid,
       search.entriesid,
       search.title,
       entries.url,
       entries.datemillis
FROM search
JOIN entries ON search.entriesid=entries.id
WHERE search.content MATCH "Haskell"
ORDER BY entries.datemillis ASC;
QUERY PLAN
|--SCAN search VIRTUAL TABLE INDEX 0:M3
|--SEARCH entries USING INTEGER PRIMARY KEY (rowid=?)
`--USE TEMP B-TREE FOR ORDER BY
...

@azimut
Copy link
Owner Author

azimut commented Mar 30, 2024

Dropping the description column from entries reduces a bit the size.

Before

Ocaml 514.05 kB / 594.38 kB transferred
Haskell 2.17 MB / 2.33 MB transferred

After

Ocaml 505.86 kB / 584.33 kB transferred
Haskell 2.07 MB / 2.22 MB transferred

@azimut
Copy link
Owner Author

azimut commented Mar 31, 2024

Split entries.content into a different table should help.

Almost for sure it would help on the load time of "list episodes". Because right now is also fetching the whole records. But, I don't know how it will behave with the multiple index lookup would work.1

Another alternative would be add an index to entries.title. This would create a covering index2 which will limit the search to only the index. Which is what happens when I list the feeds and count the number of entries per feed.

sqlite> SELECT feeds.id, feeds.title, count(*)
            FROM feeds JOIN entries ON feeds.id=entries.feedid
        GROUP BY entries.feedid
          HAVING count(*) > 0;
QUERY PLAN
|--SCAN entries USING COVERING INDEX entriesindex
`--SEARCH feeds USING INTEGER PRIMARY KEY (rowid=?)

@azimut
Copy link
Owner Author

azimut commented Mar 31, 2024

Tried adding 3+ indexes columns to the index of entries. Makes it download 1MB just to show the home page (aka getFeeds). 🙃
No idea why. I see .mp3 urls....oh, might be he gets the whole index. Meaning, it gets the 3 fields.

@azimut azimut mentioned this issue Mar 31, 2024
azimut added a commit that referenced this issue Mar 31, 2024
although I am not 100% sure if it worked
it is sure that it speed up the display of individual feeds
*shrugs*
@azimut
Copy link
Owner Author

azimut commented Mar 31, 2024

well, I think this is done for now. It still reads the content....but size seems to have decreased

along with a for sure speed up of display of entries of a feed

@azimut azimut closed this as completed Mar 31, 2024
@azimut
Copy link
Owner Author

azimut commented Apr 1, 2024

leaving open for 2nd task of list of promises

@azimut azimut reopened this Apr 1, 2024
@azimut
Copy link
Owner Author

azimut commented Sep 13, 2024

ok, changing to use rowid in fts table instead of a separate column improved speed by a lot!

no longer I am receiving the whole search.content on search results

pagination might still be nice, might also show the number of results found while loading?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant