Feature request: Support distinct option for de-duplication #271

Open
m-vdb opened this issue Sep 25, 2018 · 7 comments · May be fixed by #325

Comments

@m-vdb
Contributor

m-vdb commented Sep 25, 2018

Description

Documentation: https://www.algolia.com/doc/guides/ranking/distinct/#distinct-for-de-duplication

I am hitting the 10 KB size limit for records, so I need support for de-duplication. Here is my use case:

  • I have one model that contains a list of locations
  • for some models, this list contains a lot of locations (some thousands)
  • when indexing my model, I want to push one record for each location in the list: this will greatly reduce record size and, at the same time, improve the search experience
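
To make the use case above concrete, here is a minimal sketch in plain Python of the splitting step. It is not the API of algoliasearch-django or of any pull request: `split_record`, `parent_id`, `location`, and the sample data are hypothetical names chosen for illustration. The only Algolia-specific parts are the `attributeForDistinct` and `distinct` index settings described in the documentation linked above.

```python
from copy import deepcopy
from typing import Any, Dict, List


def split_record(
    record: Dict[str, Any],
    list_field: str,
    item_field: str = "location",  # hypothetical name for the per-item attribute
    id_field: str = "objectID",
) -> List[Dict[str, Any]]:
    """Return one small record per item found in record[list_field].

    Every piece carries the same "parent_id" (a hypothetical attribute name),
    so the search engine can collapse the pieces back into a single hit.
    """
    # Copy every attribute except the large list itself.
    base = {key: value for key, value in record.items() if key != list_field}
    parent_id = str(record[id_field])
    items = record.get(list_field) or []

    if not items:
        # Nothing to split: keep a single record, still tagged with parent_id.
        single = deepcopy(base)
        single["parent_id"] = parent_id
        return [single]

    pieces = []
    for position, item in enumerate(items):
        piece = deepcopy(base)
        piece[id_field] = f"{parent_id}-{position}"  # unique ID per piece
        piece["parent_id"] = parent_id               # shared de-duplication key
        piece[item_field] = item
        pieces.append(piece)
    return pieces


# Index settings that would de-duplicate on the shared key. The setting names
# come from the Algolia docs; the attribute value is this sketch's assumption.
DISTINCT_SETTINGS = {"attributeForDistinct": "parent_id", "distinct": True}

# Sample data, invented for illustration.
company = {"objectID": 42, "name": "Acme", "locations": ["Paris", "Berlin", "Lyon"]}
records = split_record(company, "locations")
```

Each piece stays small because it holds a single location, and the shared `parent_id` is what lets `distinct` return only one hit per original model instance.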

Are you willing to accept PRs for this feature? I can draft a spec here and get on with it 💪
Cheers!

@m-vdb
Contributor Author

m-vdb commented Oct 15, 2018

no one?

@clemfromspace
Contributor

Hey @m-vdb,

Sorry for the late answer. We will definitely accept a PR for this case.

I had a quick look at your fork, and the overall logic looks good so far :)

Happy to review your PR when it's ready!

@m-vdb
Contributor Author

m-vdb commented Oct 17, 2018

Thanks @clemfromspace! I'll be adding tests too. I will let you know when it's ready.

@m-vdb
Contributor Author

m-vdb commented Nov 9, 2018

Hey @clemfromspace, I finally got the time to finish my work on the PR. I have written integration tests to validate that everything works. IMO, here is what's missing:

  • validation of my API: is it clear enough for other developers (method names, comments, etc.)?
  • documentation in the README: I can write it once you validate the API
  • real-life tests for the AlgoliaIndexBatch: I haven't tested it with real data yet. In my case, I have records that will "duplicate" into thousands of records. It's next on my list.
  • compat code: I had to keep compat code for Django 1.7, and I don't know if you'll agree with that. Ideally, algoliasearch-django would drop support for 1.7, but that's up to you.

You can see my new pull request on this repo here: #274

@yhoiseth

What's the status on this?

@soulshake

I would like to use this functionality. Is there any plan to merge #274?

@francois-travais

I would also need this one. Is there any plan to do something about it?

@francois-travais francois-travais linked a pull request May 12, 2022 that will close this issue