Proposition to a solution for boosting score DOWN #3

Open
herearemypersonalprojects opened this issue May 17, 2023 · 8 comments
Labels
enhancement New feature or request

Comments

@herearemypersonalprojects

herearemypersonalprojects commented May 17, 2023

Hello,

In Elasticsearch, negative values are not allowed for a function weight. So, if I understand your solution correctly, you want to use must_not in the function filter to boost scores DOWN. However, as you mention in your documentation, DOWN instructions are not yet supported. Here is your current code:

@Override
public Query convertBoostDown(final BoostQueryDefinition<Query> boostQueryDefinition) {
    return new Query(
            new BoolQuery.Builder()
                    .should(createMatchAllQuery())
                    .mustNot(createBoostQuery(boostQueryDefinition))
                    .build()
    );
}

The above code does not work correctly because it generates a query like this:

 Query : {
    "bool": {
      "filter": [],
      "must": [
   ...
      ],
      "should": [
        {
          "bool": {
            "**must_not**": [
              {
                "function_score": {
                  "boost_mode": "sum",
                  "functions": [
                    {
                      "filter": {
                        "match_all": {}
                      },
                      "weight": 100
                    }
                  ],
                  "query": {
                    "bool": {
                      "boost": 1,
                      "minimum_should_match": "0%",
                      "**must**": [
  ...
                            ],
                            "tie_breaker": 0
                          }
                        }
                      ]
                    }
                  }
                }
              }
            ],
            **"should": [
              {
                "match_all": {}
              }
            ]**
          }
        }
      ]
    }
  }

After investigating this case, I would like to propose the following solution for DOWN instructions in ADDITIVE mode:

@Override
public Query convertBoostDown(final BoostQueryDefinition<Query> boostQueryDefinition) {
    return new Query(
            new BoolQuery.Builder()
                    .must(createAdditiveDownScoreQuery(boostQueryDefinition))
                    .build()
    );
}

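// Builds the ADDITIVE down boost: instead of putting a negative weight on
// matching documents, add the weight to every document that does NOT match.
private Query createAdditiveDownScoreQuery(final BoostQueryDefinition<Query> boostQueryDefinition) {
    Query query = boostQueryDefinition.getQuery();
    // Assumes the boost query is a bool query with at least one must clause.
    Query mustQuery = query.bool().must().get(0);

    // Invert the boost query so that only non-matching documents are selected.
    BoolQuery downQuery = QueryBuilders.bool()
            .mustNot(mustQuery)
            .build();

    // Add the (positive) boost weight to the score of the selected documents.
    return new Query(
            new FunctionScoreQuery.Builder()
                    .query(downQuery._toQuery())
                    .functions(createFunctionScoreByWeight(boostQueryDefinition.getBoost()))
                    .boostMode(FunctionBoostMode.Sum)
                    .build()
    );
}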

The new code generates a query like the one below:

 Query : {
    "bool": {
      "filter": [],
      "must": [
   ...
      ],
      "should": [
        {
          "bool": {
            "**must**": [
              {
                "function_score": {
                  "boost_mode": "sum",
                  "functions": [
                    {
                      "filter": {
                        "match_all": {}
                      },
                      "weight": 100
                    }
                  ],
                  "query": {
                    "bool": {
                      "boost": 1,
                      "minimum_should_match": "0%",
                      "**must_not**": [
  ...
                            ],
                            "tie_breaker": 0
                          }
                        }
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
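The reasoning behind the inversion can be sketched in plain Java, without the Elasticsearch client. This is only an illustration under assumptions: the class and method names are hypothetical, and scoring is reduced to a base score plus a weight. It shows that adding the weight to non-matching documents yields scores that differ from the "subtract from matching documents" form by a constant, so the ranking is the same while no negative weight is needed.

```java
class InvertedDownBoost {

    // Hypothetical "ideal" down boost: subtract the weight from matching documents.
    // Elasticsearch does not accept this form because function weights must not be negative.
    static double subtractFromMatches(double baseScore, boolean matchesDownQuery, double weight) {
        return matchesDownQuery ? baseScore - weight : baseScore;
    }

    // Inverted form: add the weight to documents that do NOT match the DOWN query.
    static double addToNonMatches(double baseScore, boolean matchesDownQuery, double weight) {
        return matchesDownQuery ? baseScore : baseScore + weight;
    }

    public static void main(String[] args) {
        double weight = 100.0;
        double[] baseScores = {40.0, 40.0, 12.5};   // made-up base scores
        boolean[] matches = {true, false, true};    // whether each document matches the DOWN query

        for (int i = 0; i < baseScores.length; i++) {
            double ideal = subtractFromMatches(baseScores[i], matches[i], weight);
            double inverted = addToNonMatches(baseScores[i], matches[i], weight);
            // The difference equals the weight for every document, so the order is unchanged.
            System.out.printf("ideal=%.1f inverted=%.1f diff=%.1f%n", ideal, inverted, inverted - ideal);
        }
    }
}
```

The absolute scores are shifted upward by the weight, which matters only if something downstream depends on absolute score values rather than on the ranking.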

Here is a unit test.
Given these indexed documents:

private final List<Product> products = List.of(
        product("1", "iphone", "smartphone", "apple"),
        product("2", "apple", "smartphone", "apple"),
        product("3", "apple smartphone", "smartphone", "apple"),
        product("4", "apple", "case", "apple"),
        product("5", "iphone", "case", "apple"),
        product("6", "samsung", "case", "samsung"),
        product("7", "samsung", "smartphone", "samsung"),
        product("8", "the more you learn; plus you forget", "somewhere", "")
);
final Map<String, Float> fieldScores = Map.of(
        "name", 40.0f,
        "category", 20.0f
);

When searching:

@Test
public void testScoreBoostAdditiveWithDOWN() throws IOException {
    final String queryInput = "apple";
    final String filterRule = "apple => \n DOWN(100): smartphone"; 

    List<FnacProduct> results = querqyService.search(
            getIndexName(), filterRule, "", queryInput, fieldScores);

    assertThat(EsPocTools.toIdAndScoreMaps(results)).containsExactlyInAnyOrder(
            idAndScoreMap("2", 40.0),  // 40: "apple", "smartphone"
            idAndScoreMap("3", 40.0),  // 40: "apple smartphone", "smartphone"
            idAndScoreMap("4", 140.0)  // 40 + 100: "apple", "case"
    );
}
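The expected scores in the test can be reproduced by hand with a small stand-alone sketch. It is not the Querqy or Elasticsearch scoring: it assumes that `product(...)` takes id, name, category and brand in that order, that a field matches when it contains the term as a whole token, that the base score is the highest matching field score from `fieldScores`, and that the DOWN term is checked against name and category. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class DownBoostWorkedExample {

    // Same documents as in the test: {id, name, category, brand} (field order is an assumption).
    static final String[][] PRODUCTS = {
        {"1", "iphone", "smartphone", "apple"},
        {"2", "apple", "smartphone", "apple"},
        {"3", "apple smartphone", "smartphone", "apple"},
        {"4", "apple", "case", "apple"},
        {"5", "iphone", "case", "apple"},
        {"6", "samsung", "case", "samsung"},
        {"7", "samsung", "smartphone", "samsung"},
        {"8", "the more you learn; plus you forget", "somewhere", ""}
    };

    // True when the whitespace-separated tokens of the field contain the term.
    static boolean hasToken(String field, String term) {
        return Arrays.asList(field.split("\\s+")).contains(term);
    }

    // Simplified scoring: 40 for a name match, 20 for a category match (highest wins),
    // plus the weight for every hit that does NOT match the DOWN term.
    static Map<String, Double> search(String[][] products, String query, String downTerm, double weight) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (String[] p : products) {
            String id = p[0], name = p[1], category = p[2];
            double base = 0.0;
            if (hasToken(name, query)) base = Math.max(base, 40.0);
            if (hasToken(category, query)) base = Math.max(base, 20.0);
            if (base == 0.0) continue; // not a hit for the main query
            boolean matchesDown = hasToken(name, downTerm) || hasToken(category, downTerm);
            scores.put(id, matchesDown ? base : base + weight);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Products 2 and 3 match "smartphone" and keep 40; product 4 does not and gets 40 + 100.
        System.out.println(search(PRODUCTS, "apple", "smartphone", 100.0));
    }
}
```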

I hope that is clear enough. I look forward to your reply.

Best regards,
Quoc-Anh

JohannesDaniel added the bug ("Something isn't working") and enhancement ("New feature or request") labels and removed the bug label on May 21, 2023
@JohannesDaniel
Collaborator

Hi, thank you for spotting this issue. In general, we consider three boost modes, PARAM_ONLY, ADDITIVE and MULTIPLICATIVE. What you are suggesting is PARAM_ONLY, as it ignores the score of the boost query and purely considers the score of the parameter.

From my current point of view, this is the only reasonable option for down boosts anyway, as a must_not does not return a score that could be considered. Therefore, I think your proposal makes a lot of sense.

@herearemypersonalprojects
Author

Hello,

That is the ADDITIVE case for down boosts ("... I would like to propose a solution for DOWN instructions with ADDITIVE ...").

@JohannesDaniel
Collaborator

I think the term ADDITIVE is misleading. I will change the naming of the different boost options.

@herearemypersonalprojects
Author

Are you planning to develop the MULTIPLICATIVE case?

In my opinion, the current code structure is suitable for ADDITIVE, but we would need a different code structure for MULTIPLICATIVE.

@JohannesDaniel
Collaborator

JohannesDaniel commented May 22, 2023

Querqy in general supports multiplicative boosting, so there are probably Querqy users who use it. However, I have not implemented support for it so far, as I do not really see the need. All multiplicative solutions I have seen so far in the context of retail were really bad. Anyway, this is open source, so if someone still needs it, such a feature can simply be contributed.

But yes, I guess the Querqy query would need to be embedded in a FunctionScoreQuery, which would definitely change the code structure. Furthermore, I do not know whether this can be done with the existing Solr Query DSL stuff.

@JohannesDaniel
Collaborator

FYI: I will release a fix for the DOWN boosting soon

@JohannesDaniel
Collaborator

But feel free to create an issue for the multiplicative stuff - might become an interesting discussion :)

@herearemypersonalprojects
Author

ok, thanks
