[KYC Match] Scoring #85
Comments
Hi @ToshiWakayama-KDDI I would summarise and propose the below, where 'attribute' below is a field in the existing KYC specification:-
|
Hi @KevScarr |
@HuubAppelboom I would suggest a true equates to an exact match, i.e. =100. For close matches, i.e. when you return a score, allow the consuming service to judge whether it's a close enough match to proceed (their use cases will drive their error tolerance). |
hi @HuubAppelboom , @KevScarr, I understand that a score result (optional) might be added to a boolean attribute (True/False/Not-available) which is mandatory if provided in the request. |
@GillesInnov35 @HuubAppelboom Fair point; purely thinking about when a customer of the service migrates from the previous version to this version so backward compatibility would be important. I'd say the score is only provided when a boolean: false is returned; outside of that condition it offers little value. |
yes sure Kevin, backward compatibility will be an important point, but as KYC Match version 1.0.0 has not been published I wonder if it is a problem. But maybe it is.
Thanks a lot for your active contribution |
Makes sense. So you would return a '-1' when the attribute wasn't available for checking, hence no requirement to have the boolean field in your current response. If no MNO has implemented the current version then it's a fair shout to move towards a score only approach. |
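The score-only idea discussed above can be sketched as follows. This is only an illustration under stated assumptions: the 0 to 100 range with 100 meaning an exact match and -1 meaning "not available" comes from the comments in this thread, while the function names and the consumer-side threshold are hypothetical and not part of any specification.

```python
from typing import Optional

# Sentinel suggested in the thread for "attribute was not available for checking".
NOT_AVAILABLE = -1


def legacy_match_from_score(score: int) -> Optional[bool]:
    """Map a score-only result back to the earlier tri-state result.

    Returns True for an exact match (100), False for any other valid score,
    and None when the attribute was not available.
    """
    if score == NOT_AVAILABLE:
        return None
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return score == 100


def consumer_accepts(score: int, threshold: int) -> bool:
    """Hypothetical consumer-side check: the consumer picks its own tolerance."""
    return score != NOT_AVAILABLE and score >= threshold
```

With this shape, a consumer with a strict use case could call `consumer_accepts(score, 100)`, while a more tolerant one could pass a lower threshold.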
@KevScarr @GillesInnov35 We may need to think of an approach which makes it possible to be extended further. For example, I think it may be a good idea to provide feedback whether the data is unverified or has been verified by the MNO. That way we can provide a larger market reach, by also including unverified attributes, and the CSP can then decide whether to use that attribute or not. |
Hi @GillesInnov35 , @HuubAppelboom , @KevScarr , all, Thank you for your prompt comments/discussion, which I did not expect actually. I should have informed you that there is a KYC Match scoring enhancement proposal in the API Backlog WG, so, once we have received the proposal, we should proceed with our scoring discussion taking it into account. We should wait for it, but I don't think it will take long. I will update the status. Best regards, |
Hi @GillesInnov35 , @HuubAppelboom , @KevScarr, all, Our implementation is based on v0.1.0, and actually we do not need the scoring feature, so we would insist that the KYC Match API should work without scoring. It is the original OGW scope, I understand, and for an OGW global API, it is also important. In addition, as we all know, we have already put our efforts into v0.1.0, so we should use our initial design and consider backward compatibility as much as possible, I believe. Thanks, |
As a suggestion how to add score and other information to the API response, maintain backwards compatibility, and have something that can be expanded, we could add an extra string (when applicable) in the response for attributes where score is relevant. For example the attributeMatch will have values "true", "false", "not_available" (like today) So for example you will get: givenNameMatch : false |
hi @ToshiWakayama-KDDI, all, thanks for your comment. BR |
Hi all, As mentioned in last week's meeting:
• Keep current attributes-> "attributeMatch": true/false/not_available
From the technical perspective, this should keep backwards compatibility because, based on OAS3, there is a parameter called “additionalProperties” which indicates whether the object (our answer in this case) can have additional, undocumented parameters. The default value of “additionalProperties” is true, therefore in CAMARA we assume it is true. So the customer should be ready to receive additional parameters. It would be worth checking this.
• Numeric attributes are not checked: ie birthdate Regards, |
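From the consumer side, the backward-compatibility argument above could look roughly like the sketch below: a client written against the earlier version reads only the fields it knows and ignores anything else in the response. `givenNameMatch` is taken from this thread; `givenNameMatchScore` and the helper name `parse_known_fields` are hypothetical names used only for illustration.

```python
import json

# Fields a client written against the earlier version knows about (illustrative subset).
KNOWN_FIELDS = ("givenNameMatch",)


def parse_known_fields(payload: str) -> dict:
    """Keep only the fields this client understands; ignore undocumented extras."""
    body = json.loads(payload)
    return {name: body[name] for name in KNOWN_FIELDS if name in body}


# A newer response carrying an extra, hypothetical score property.
new_response = '{"givenNameMatch": "false", "givenNameMatchScore": 87}'
old_client_view = parse_known_fields(new_response)
```

A client that instead rejects unknown properties (strict schema validation) would break on the newer response, which is why the assumption about “additionalProperties” is worth checking.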
hi all, thanks Clara for this detailed summary. |
Building on Issue #96 / we should follow the same design convention (define once, use many):-
ScoreMatchResult to appear for all attribute fields, excluding the following fields as they are numeric/enum/ID based:-
When a field is numeric only in a particular country, as per the above summary, the score wouldn't be returned. |
I've taken the attributes from the current version of the specification and, following the rules, given an initial view of which attributes can fully support a 'score' concept. It would be good to reach a common view across as many countries as possible; it will then make updating the YAML spec straightforward.
Some fields will be all numeric in some countries and a mixture in others. @ToshiWakayama-KDDI Should the nameKana*Match attributes also have scores in this next version of the specification (ie will these attributes remain here or be in an extension)? |
Hi @KevScarr From TEF our proposal is mainly focused on not losing backward compatibility, as we are already integrated with clients, so the design could be simpler:
We can document that the ScoreMatch properties will only be returned if the related property is false |
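On the provider side, the rule above (return the score property only when the related match property is false) might be sketched like this. The property naming (`<attribute>Match`, `<attribute>MatchScore`) and the function name are assumptions made for illustration; the thread does not fix the final names.

```python
from typing import Optional


def attribute_result(name: str, match: str, score: Optional[int] = None) -> dict:
    """Build the response fragment for one attribute.

    `match` is one of "true", "false", "not_available" (as in the thread).
    The score property (hypothetical name) is added only for a "false" match.
    """
    result = {f"{name}Match": match}
    if match == "false" and score is not None:
        result[f"{name}MatchScore"] = score
    return result
```

Responses for matching or unavailable attributes keep exactly the shape they had before, so existing integrations only see a new property in the "false" case.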
hi @fernandopradocabrillo, I think that with the allOf keyword it works well.
to be confirmed I suppose |
hi @fernandopradocabrillo, you're right. My proposal below can't be applied.
I agree with yours regarding backward compatibility which is expected. |
Hi @KevScarr , all,
Thank you for asking me about this. We would prefer to have scores for the nameKanaHankakuMatch and the nameKanaZenkakuMatch attributes in this next version. Sorry for the late reply, as I needed to discuss this internally. BR |
Hi @KevScarr , @fernandopradocabrillo , @GillesInnov35 , @claraserranosolsona , all, I have a question to clarify the way of scoring. It seems that the Jaro-Winkler distance algorithm will be used for scoring string-type attributes (after normalisation has been applied); however, I think it should be up to each operator to choose how to calculate the score. The reason is that, even though the Jaro-Winkler distance algorithm could be used as the common way in Europe, it is unclear whether it can be used for other languages, or, if it can be used for another language, whether it is best suited to it. That is my concern, and actually we ourselves are not sure about using the Jaro-Winkler distance algorithm for the Japanese language. So, is it OK for each operator to choose how to calculate the score, or is there any other thought? Thanks, |
hi @ToshiWakayama-KDDI , all, I don't really know if this algorithm works for all languages but it should (to be confirmed). BR |
Hi @gilles, Thanks for your comments.
This is an agreeable sentence; however, as the Jaro-Winkler algorithm has not been proven effective for languages other than European languages, it would not be a good idea to specify Jaro-Winkler as the mandatory algorithm. If specific algorithms are needed in the KYC Match API spec, Jaro-Winkler could, for example, be the recommendation for European languages, but the algorithm for other languages should be TBD. Would this be a possible way forward? BR |
Hi @ToshiWakayama-KDDI , As discussed in last week's meeting, in order to have a standard score as far as possible, would it be OK to proceed with the Jaro-Winkler algorithm, indicating the following? "Unless otherwise captured in the specification, score will use the JaroWinkler distance algorithm for all countries." So far JaroWinkler has proven to be the most effective algorithm when comparing two strings, but if at some point another algorithm works better for a specific language, this would give the option to change it. Many thanks, |
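For readers unfamiliar with the algorithm under discussion, here is a minimal sketch of the standard Jaro-Winkler similarity, scaled to 0 to 100. It is not taken from the KYC Match specification: the normalisation step (trim and lowercase), the rounding, and the function names are assumptions for illustration, and the prefix scale of 0.1 with a 4-character prefix cap is the commonly used default.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def match_score(provided: str, stored: str) -> int:
    """Hypothetical 0-100 score; the normalisation here is an assumption."""
    return round(100 * jaro_winkler(provided.strip().lower(), stored.strip().lower()))
```

For the textbook pair "MARTHA" / "MARHTA" this gives a similarity of about 0.961, i.e. a score of 96. The sketch compares Unicode code points one by one, so how meaningful the result is for a given script (the Japanese-language concern raised in this thread) depends on the normalisation applied before comparison.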
Hi @claraserranosolsona , Thanks for reminding me. Sorry for the delay, due to my sickness (Covid-19 still exists) and so on. I think I can reply by tomorrow. Thank you for your understanding. Regards, |
Hi @claraserranosolsona , It seems the Jaro-Winkler algorithm itself can be used for the Japanese language; however, KDDI does not provide a Match Scoring function at all now, so I am afraid we are not sure whether values calculated by the Jaro-Winkler algorithm are meaningful for the KYC Match service. If you want to use the Jaro-Winkler algorithm commonly for Match Scoring, it is fine with us to add the proposed sentence "Unless otherwise captured in the specification, score will use the JaroWinkler distance algorithm for all countries" to the API description. When KDDI implements a Match Scoring function, we could add something to the description if we have a problem with the Jaro-Winkler algorithm. Just to reiterate our thoughts: we understand that in Europe the Jaro-Winkler algorithm has been used and has been proven effective, so there should be no problem, but we think that this algorithm should not be a barrier for operators in other language areas to implement this API, and that this API should be suitable for common global use. Many thanks, |
Problem description
To consider Scoring feature for KYC Match.
(Spin off from Issue #65, item No.1, as per Action Item #13.03)