
dev/core#5433 Add legacydedupefinder #31689

Open
wants to merge 1 commit into
base: master

eileenmcnaughton
Contributor

Overview

This adds a new extension that holds the existing dedupe finder behaviour. That lets us keep supporting the hook that is jammed in the middle of the dedupe finder code without being blocked by it going forwards. (I have an existing PR that cuts many hours off the dedupe query, but it is bending over backwards to work alongside the legacy attempts to allow integration.)

Before

We know how to speed up dedupe queries (#30591), but we are working really hard around the legacy hook, and the existing PR has to rely on crazy methodology as a result. Neither the dupeQuery() hook nor the idea of writing custom improvements for the reserved queries has had much adoption. However, these two things have been long-standing blockers to improving performance.

After

The new extension is enabled on install and on upgrade, and there is no behaviour change unless a site admin disables it. If they choose to do so, then:

  1. the legacy hook & legacy reserved query wrangling will no longer be called

  2. dedupe queries will speed up once the blocked code is also merged: "Fix dedupe query wrangling to combine queries where 2 fields in the same table are always used together" #30591

Technical Details

The code works by adding a new hook:

findExistingDuplicates(array &$duplicates, array $ruleGroupIDs, array $whereClauses, bool $checkPermissions)

The hook is implemented by the new extension legacydedupefinder with a weight of -5 and by core with a weight of -20.

legacydedupefinder calls stopPropagation(), meaning that if it is enabled the core code will not be hit. Any custom extension that gives no specific thought to it will wind up with a weight of 0, meaning it gets called first, which feels like what people would expect. Once finalised I will document this.
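
To make the ordering concrete, here is a toy model of the dispatch order described above. It is a sketch, not CiviCRM's or Symfony's dispatcher: ToyEvent, ToyDispatcher and runToyDedupe are made-up names, and it assumes that listeners with a higher weight run first (which is what "weight 0 gets called first" implies).

```php
<?php
// Toy model only: NOT CiviCRM's or Symfony's dispatcher. Every class and
// function name here is hypothetical and exists purely for illustration.

final class ToyEvent {
  /** @var string[] names of the listeners that ran, in order */
  public array $calledBy = [];
  private bool $stopped = FALSE;

  public function stopPropagation(): void {
    $this->stopped = TRUE;
  }

  public function isPropagationStopped(): bool {
    return $this->stopped;
  }
}

final class ToyDispatcher {
  /** @var array<int, callable[]> listeners grouped by weight */
  private array $listeners = [];

  public function addListener(callable $listener, int $weight = 0): void {
    $this->listeners[$weight][] = $listener;
  }

  public function dispatch(ToyEvent $event): ToyEvent {
    // Assumption: higher weight runs first, so 0, then -5, then -20.
    krsort($this->listeners);
    foreach ($this->listeners as $group) {
      foreach ($group as $listener) {
        if ($event->isPropagationStopped()) {
          return $event;
        }
        $listener($event);
      }
    }
    return $event;
  }
}

function runToyDedupe(bool $legacyEnabled): ToyEvent {
  $dispatcher = new ToyDispatcher();
  // Core finder at -20: only reached if nothing stopped the event earlier.
  $dispatcher->addListener(function (ToyEvent $e): void {
    $e->calledBy[] = 'core';
  }, -20);
  if ($legacyEnabled) {
    // Legacy finder at -5: runs before core and halts propagation.
    $dispatcher->addListener(function (ToyEvent $e): void {
      $e->calledBy[] = 'legacydedupefinder';
      $e->stopPropagation();
    }, -5);
  }
  // A custom extension that registers with the default weight of 0.
  $dispatcher->addListener(function (ToyEvent $e): void {
    $e->calledBy[] = 'custom';
  }, 0);
  return $dispatcher->dispatch(new ToyEvent());
}
```

In this model, runToyDedupe(TRUE) reaches the custom listener and then the legacy one, and never reaches core. runToyDedupe(FALSE) reaches the custom listener and then core.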

The function in the core code and in the extension is currently the same, except the core code is passing a parameter to fillTable to block legacy hooks from being called. Once this is merged the goal is to separate out & re-write the core function.

Before that I want to look at the second code path and attempt to do the same for it. The challenge is that I really want to pass APIv4 params out to the function, not v3: https://lab.civicrm.org/dev/core/-/issues/5433

Comments

The extension is currently not hidden, but if I don't get to where I want to before forking I will hide it so people do not disable it until the new hooks are settled in.

@mattwire @colemanw


civibot bot commented Jan 2, 2025

🤖 Thank you for contributing to CiviCRM! ❤️ We will need to test and review this PR. 👷

Introduction for new contributors...
  • If this is your first PR, an admin will greenlight automated testing with the command ok to test or add to whitelist.
  • A series of tests will automatically run. You can see the results at the bottom of this page (if there are any problems, it will include a link to see what went wrong).
  • A demo site will be built where anyone can try out a version of CiviCRM that includes your changes.
  • If this process needs to be repeated, an admin will issue the command test this please to rerun tests and build a new demo site.
  • Before this PR can be merged, it needs to be reviewed. Please keep in mind that reviewers are volunteers, and their response time can vary from a few hours to a few weeks depending on their availability and their knowledge of this particular part of CiviCRM.
  • A great way to speed up this process is to "trade reviews" with someone: find an open PR that you feel able to review, and leave a comment like "I'm reviewing this now, could you please review mine?" (include a link to yours). You don't have to wait for a response to get started, and you don't have to stop at one! The more you review, the faster this process goes for everyone 😄
  • To ensure that you are credited properly in the final release notes, please add yourself to contributor-key.yml
  • For more information about contributing, see CONTRIBUTING.md.
Quick links for reviewers...

➡️ Online demo of this PR 🔗

@civibot civibot bot added the master label Jan 2, 2025

civibot bot commented Jan 2, 2025

The issue associated with the Pull Request can be viewed at https://lab.civicrm.org/dev/core/-/issues/5433

*/
function legacydedupefinder_civicrm_config(&$config): void {
_legacydedupefinder_civix_civicrm_config($config);
\Civi::dispatcher()->addListener('hook_civicrm_findExistingDuplicates', ['\Civi\LegacyFinder\Finder', 'findExistingDuplicates'], -5);
Contributor Author

I didn't think this was still the right way, but my reading of https://docs.civicrm.org/dev/en/latest/hooks/usage/symfony/ is that unless it's a BAO class or in a very specific location then it still is.


/**
* The CiviCRM duplicate discovery engine is based on an
* algorithm designed by David Strauss <[email protected]>.
*/
-class CRM_Dedupe_BAO_DedupeRuleGroup extends CRM_Dedupe_DAO_DedupeRuleGroup {
+class CRM_Dedupe_BAO_DedupeRuleGroup extends CRM_Dedupe_DAO_DedupeRuleGroup implements \Symfony\Component\EventDispatcher\EventSubscriberInterface {
Contributor

@eileenmcnaughton this should be the \Civi\Core\HookInterface not Event Subscriber Interface

Contributor Author

@seamuslee001 I didn't really figure out which is which based on the docs - but this one seems to allow me to specify the weight

-class CRM_Dedupe_BAO_DedupeRuleGroup extends CRM_Dedupe_DAO_DedupeRuleGroup {
+class CRM_Dedupe_BAO_DedupeRuleGroup extends CRM_Dedupe_DAO_DedupeRuleGroup implements \Symfony\Component\EventDispatcher\EventSubscriberInterface {

public static function getSubscribedEvents(): array {
Contributor

This shouldn't be needed once switched to using the HookInterface

Contributor Author

how do you specify weight with hook interface?

Member

You're right @eileenmcnaughton, if you want to specify a priority level you need to use getSubscribedEvents().
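
For reference, a minimal sketch of how a priority is attached through getSubscribedEvents(). The body of that method in this PR is not shown in the thread, so this is an assumption about its shape rather than the actual code. ExampleDedupeSubscriber is a hypothetical class name, and the -20 is taken from the core weight mentioned in the PR description. The `'event' => ['method', priority]` array shape is the standard Symfony subscriber format.

```php
<?php
// Sketch only. In real code the class would implement
// \Symfony\Component\EventDispatcher\EventSubscriberInterface; that is left
// out here so the snippet runs without Symfony installed.
// ExampleDedupeSubscriber is a hypothetical name, not a class from the PR.

final class ExampleDedupeSubscriber {

  /**
   * Map each event name to [method name, priority].
   */
  public static function getSubscribedEvents(): array {
    return [
      'hook_civicrm_findExistingDuplicates' => ['hook_civicrm_findExistingDuplicates', -20],
    ];
  }

  /**
   * Listener for the event above.
   *
   * In CiviCRM this would receive a \Civi\Core\Event\GenericHookEvent; a
   * plain object type is used so the sketch stays self-contained.
   */
  public static function hook_civicrm_findExistingDuplicates(object $event): void {
    // Placeholder: the real finder logic is out of scope for this sketch.
  }

}
```

The priority sits in the second array element, which is why this route allows a weight to be set.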

*
* @return void
*/
public static function hook_civicrm_findExistingDuplicates(\Civi\Core\Event\GenericHookEvent $event) {
Contributor

this should change to self_hook_civicrm_findExistingDuplicates as per https://github.com/civicrm/civicrm-core/blob/master/CRM/Core/BAO/OptionGroup.php#L83

Member

I think what @eileenmcnaughton did is correct. Since she needs to specify priority this is the way to do it.

=====
Overview
====

This adds a new extension for existing dedupe finder behaviour, allowing us to support the
hook that is jammed in the middle of the dedupe finder code going forwards without
being blocked by it (I have an existing PR that cuts many hours off the
dedupe query but it is bending over backwards to work alongside the legacy
attempts to allow integration).

The new extension is enabled on install and on upgrade and there is no behaviour change unless
a site admin disables it. If they choose to do so then

1) the legacy hook & legacy reserved query wrangling will no longer be called
2) dedupe queries will speed up once the blocked code is also merged
civicrm#30591
@@ -47,7 +47,7 @@ php GenerateData.php

## Prune local data
$MYSQLCMD -e "DROP TABLE IF EXISTS civicrm_install_canary; DELETE FROM civicrm_cache; DELETE FROM civicrm_setting;"
-$MYSQLCMD -e "DELETE FROM civicrm_extension WHERE full_name NOT IN ('sequentialcreditnotes', 'eventcart', 'greenwich', 'org.civicrm.search_kit', 'org.civicrm.afform', 'authx', 'org.civicrm.flexmailer', 'financialacls', 'contributioncancelactions', 'recaptcha', 'ckeditor4', 'legacycustomsearches', 'civiimport', 'message_admin') and full_name NOT LIKE 'civi_%';"
+$MYSQLCMD -e "DELETE FROM civicrm_extension WHERE full_name NOT IN ('sequentialcreditnotes', 'eventcart', 'greenwich', 'org.civicrm.search_kit', 'org.civicrm.afform', 'authx', 'org.civicrm.flexmailer', 'financialacls', 'contributioncancelactions', 'recaptcha', 'ckeditor4', 'legacycustomsearches', 'civiimport', 'message_admin', 'legacydedupefinder) and full_name NOT LIKE 'civi_%';"

Is this missing a quote? It should be 'legacydedupefinder' instead of 'legacydedupefinder, right?

Member

Agree it should be

Suggested change
-$MYSQLCMD -e "DELETE FROM civicrm_extension WHERE full_name NOT IN ('sequentialcreditnotes', 'eventcart', 'greenwich', 'org.civicrm.search_kit', 'org.civicrm.afform', 'authx', 'org.civicrm.flexmailer', 'financialacls', 'contributioncancelactions', 'recaptcha', 'ckeditor4', 'legacycustomsearches', 'civiimport', 'message_admin', 'legacydedupefinder) and full_name NOT LIKE 'civi_%';"
+$MYSQLCMD -e "DELETE FROM civicrm_extension WHERE full_name NOT IN ('sequentialcreditnotes', 'eventcart', 'greenwich', 'org.civicrm.search_kit', 'org.civicrm.afform', 'authx', 'org.civicrm.flexmailer', 'financialacls', 'contributioncancelactions', 'recaptcha', 'ckeditor4', 'legacycustomsearches', 'civiimport', 'message_admin', 'legacydedupefinder') and full_name NOT LIKE 'civi_%';"


4 participants