KYC Match - Compare specifications #18

Closed
GillesInnov35 opened this issue Nov 16, 2023 · 74 comments
Labels
documentation Improvements or additions to documentation

Comments

@GillesInnov35
Collaborator

CAMARA KYC Match - Specifications


Below is a proposed comparison matrix between the different offers' specifications and the initial CAMARA requirements proposal.
Key points:

  1. The list of required KYC attributes must be reviewed; we should propose a shortlist targeting a first version, based on the existing proposals.
  2. The match result response must be defined: either a match score (matching percentage) or a match result value such as those GSMA returns (“Y”: match is successful; “N-NA”: match failed, data is not available; “N-AV”: match failed, data is available; “N-AD”: match failed, data is available, but access is denied).
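
To make the result values in point 2 concrete, here is a minimal Python sketch. It is not taken from any of the compared specifications: the class name `MatchResult`, its member names and the helper `is_match` are hypothetical, and only the four code strings come from the list above.

```python
from enum import Enum


class MatchResult(Enum):
    """Match result codes as listed in the key points (names are illustrative)."""

    MATCH = "Y"              # match is successful
    NOT_AVAILABLE = "N-NA"   # match failed, data is not available
    NO_MATCH = "N-AV"        # match failed, data is available
    ACCESS_DENIED = "N-AD"   # match failed, data is available, but access is denied


def is_match(value: str) -> bool:
    """Return True only for a successful match; unknown codes raise ValueError."""
    return MatchResult(value) is MatchResult.MATCH
```

Using an enumeration rather than free strings means an unexpected code fails loudly instead of being silently treated as "no match".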

Request Specifications

CAMARA KYC Match requirements GSMA KYC Match KDDI KYC Match Orange KYC Match Proposal
msisdn phone_number subscriber_phone_number_match msisdn phoneNumber
name name user_name_match name name
given name given_name given_name givenName
family name family_name family_name familyName
address subscriber_formatted_match address
street name house_or_housename street_name streetName
region subscriber_region_match
postal code postal_code subscriber_postal_code_match postalCode
town locality locality locality
country country country country
birthdate birthdate subscriber_birthdate_match birthdate birthdate
email address email email


Response Specifications

CAMARA KYC Match requirements GSMA KYC Match KDDI KYC Match Orange KYC Match Proposal
msisdn phone_number subscriber_phone_number_match msisdn
name name user_name_match name_score
given name given_name given_name_score
family name family_name family_name_score
address subscriber_formatted_match
street name house_or_housename street_name_score
region subscriber_region_match
postal code postal_code subscriber_postal_code_match postalCode_score
town locality locality_score
country country country_score
birthdate birthdate subscriber_birthdate_match birthdate_score
email address email_score
@GillesInnov35 GillesInnov35 added the documentation Improvements or additions to documentation label Nov 16, 2023
@StefanoFalsetto-CKHIOD
Collaborator

Hi Gilles,
I have some feedback:
Request Specifications

  1. Why do you want to change the name from MSISDN to phoneNumber? The term MSISDN refers directly to the standard way of representing a phone number.
  2. I would like to avoid using the "address" attribute. This aggregated field depends not only on country rules but also on each MNO's internal BSS implementation. Since we are rebuilding this service from scratch, we can leverage previous experience and ask MNOs to export the individual components of the address, such as:
    street_name --> the name of the street where the end customer resides. Just the street name, nothing else (i.e., no house number, no zip/postal code, etc.)
    town
    province
    region
    house_number --> the number of the building where the end customer resides
    postal_code

We can still include the "address" attribute but discourage its use in some way.

Response Specifications
Since the answer will be Y, N-NA, N-AV, N-AD the term "score" could be misleading.
I am still thinking to a valid alternative to propose, but I can't figure it out now.

@ToshiWakayama-KDDI
Collaborator

Thanks very much, @GillesInnov35, for creating this issue.

May I ask questions for clarification?

  • What do you mean by CAMARA KYC Match requirements?
  • What do you mean by GSMA KYC Match?

I have one comment: we have agreed that calculating matching score is for our future releases, so, it should not be included for our initial release.

Thanks.

@StefanoFalsetto-CKHIOD
Collaborator

Hi @ToshiWakayama-KDDI we agreed to not include the matching score.
But we also agreed that the score is something we need to "take into consideration in some way", since we will work on it as soon as the first version of these specifications is released. Hence, I think it’s important to do something now to enable future improvements.

@GillesInnov35, I figured out my proposal:
In order to find a "middle way" between future developments and Toshi's push for the upcoming first milestone, we can still use the "_match" suffix on response attributes. That way, our future discussions can focus on modifying just the "Y" response. Could it maybe be "Y-nn", where nn is the score? Let's keep the proposals for future discussions.
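
As a rough sketch of why keeping the "_match" suffix leaves room for this, a client-side parser could accept both today's plain "Y" and a possible future "Y-nn" value. Nothing here is agreed: the function name `parse_match_value` is hypothetical, and treating nn as an integer percentage between 0 and 100 is an assumption.

```python
import re
from typing import Optional, Tuple

# Failure codes taken from the discussion; they never carry a score.
FAILURE_CODES = ("N-NA", "N-AV", "N-AD")

# "Y" optionally followed by "-nn"; the 0-100 score range is an assumption.
_Y_PATTERN = re.compile(r"Y(?:-(\d{1,3}))?")


def parse_match_value(value: str) -> Tuple[str, Optional[int]]:
    """Split a *_match value into (result code, optional score)."""
    if value in FAILURE_CODES:
        return value, None
    found = _Y_PATTERN.fullmatch(value)
    if found is None:
        raise ValueError(f"unknown match value: {value!r}")
    if found.group(1) is None:
        return "Y", None
    score = int(found.group(1))
    if score > 100:
        raise ValueError(f"score out of range: {score}")
    return "Y", score
```

A client written this way keeps working if a score is added later, because a plain "Y" simply parses with no score.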

@GillesInnov35
Collaborator Author

Hi @StefanoFalsetto-CKHIOD , @ToshiWakayama-KDDI
Thanks a lot for your comments. I'll try to explain the proposal.
Phone number rather than msisdn

  • As discussed with Ludovic ROBERT, who is involved in a few CAMARA API projects, phone number is commonly used rather than msisdn (see the Number Verify API definition). I think we'll have to be consistent with all the other CAMARA API design projects.

use of address

  • I agree with you, Stefano, regarding address. The address attribute appears in the GSMA Mobile Connect KYC Match API Definition, which is why I mentioned it for discussion. We could limit the attributes to the detailed address components as you propose, which would be more precise.

GSMA Mobile Connect KYC Match

  • Toshi, to answer your question, GSMA has published a Mobile Connect KYC Match Definition and technical requirements (Feb. 2022). I had a look at it, and most of the attributes are similar to the CAMARA requirements. I thought it would be interesting to compare it with the other proposals.

@fernandopradocabrillo
Collaborator

Hi @GillesInnov35
I think that this table is lacking Telefonica's proposal too and some of our fields (like idDocument) and vision for the properties. Can you please update accordingly? Thanks!

Regarding use of address
In our proposal, the address field is composed of the different parts it can have. We consider that having a single field in which the postal address can be included in such a generic way adds complexity and, as @StefanoFalsetto-CKHIOD said, is very country-dependent. So we support having different fields for its representation.

@GillesInnov35
Collaborator Author

Hi @fernandopradocabrillo, yes, sure.

Could you send me the list of attributes Telefonica proposes in its solution?
Thanks

@javier-carrocalabor

Hi,
@GillesInnov35, here is the Telefonica proposal mentioned by @fernandopradocabrillo:

MatchIdentityRequestBody:

which can be summarized in: phoneNumber, idDocument, identity (composed of firstName and lastName), address (composed of postalCode, streetName and streetNumber), and birthdate.
And the responses would be xxxx_response for each of them.

@ToshiWakayama-KDDI
Collaborator

Hi all,

Please find the revised shortlist table below. I have added our proposed parameters/attributes, which are included in our YAML file, to the shortlist table. Also I have added Telefonica's parameters/attributes as well. Hope it is correct. Also I have changed GSMA Match to MobileConnect Match and moved it to the right as MobileConnect is not our proposal.

I have one point to ask you at the moment:
Our company differentiates between Subscriber (who makes the contract with us) and User (who actually uses the phone). For example, a parent is the Subscriber and their child is the User. Do you make the same kind of differentiation?

Match Request Body

CAMARA KYC Match requirements/categories KDDI KYC Match Orange KYC Match Telefonica KYC Match GSMA KYC Match Orange Proposal
Phone Number subscriber_phone_number_match msisdn phoneNumber phone_number phoneNumber
(special phone number) main_subscriber_phone_number_match
ID Document idDocument
Subscriber name user_name_match name identity (composed of firstName and lastName) name name
(name reading) subscriber_name_kana_hankaku_match
(name reading) subscriber_name_kana_zenkaku_match
(given name) given_name (included in identity) given_name givenName
(family name) family_name (included in identity) family_name familyName
Subscriber Postal Code subscriber_postal_code_match postalCode (included in address) postal_code
Subscriber Address subscriber_formatted_match address (composed of postalCode, streetName and streetNumber) address address
(street name) street_name (included in address) house_or_housename streetName
(street number) (included in address)
Subscriber Address-Region subscriber_region_match
Subscriber Address-Town locality locality locality
Subscriber Address-Country country country country
Subscriber Birthdate subscriber_birthdate_match birthdate birthdate birthdate birthdate
Subscriber Email Address email email
User Name user_name_match
(user name reading) user_name_kana_hankaku_match
(user name reading) user_name_kana_zenkaku_match
User Birthdate user_birthdate_match
3rd party ID cp_id
service_id


KYC Match Response

CAMARA KYC Match requirements/categories KDDI KYC Match Orange KYC Match Telefonica KYC Match GSMA KYC Match Proposal
Phone Number subscriber_phone_number_match msisdn phoneNumber_response phone_number
(special phone number) main_subscriber_phone_number_match
ID Document idDocument_response
Subscriber name subscriber_name_match name_score identity_response name
(name reading) subscriber_name_kana_hankaku_match
(name reading) subscriber_name_kana_zenkaku_match
(given name) given_name_score (included in identity) given_name
(family name) family_name_score (included in identity) family_name
Subscriber Postal Code subscriber_postal_code_match postalCode_score (included in address) postal_code
Subscriber Address subscriber_formatted_match address_response address
(street name) street_name_score (included in address) house_or_housename
(street number) (included in address)
Subscriber Address-Region subscriber_region_match
Subscriber Address-Town locality_score locality
Subscriber Address-Country country_score country
Subscriber Birthdate subscriber_birthdate_match birthdate_score birthdate_response birthdate
Subscriber Email Address email_score
User Name user_name_match
(user name reading) user_name_kana_hankaku_match
(user name reading) user_name_kana_zenkaku_match
User Birthdate user_birthdate_match

Many thanks,
Toshi

@ToshiWakayama-KDDI
Collaborator

Hi all,

Also, please find below a shortlist table for KYC Fill-in attributes/parameters based on our Fill-in YAML.

Any comments would be welcome.

Fill-in Request Body

CAMARA KYC Fill-in requirements/categories KDDI KYC Fill-in No other Fill-in proposals Proposal
3rd party ID cp_id


Fill Response

CAMARA KYC Fill-in requirements/categories KDDI KYC Fill-in No other Fill-in proposals Proposal
Phone Number subscriber_mobile_phone
Subscriber name subscriber_name
(family name) subscriber_name_family
(given name) subscriber_name_first
(name reading) subscriber_name_kana_hankaku
(family name reading) subscriber_name_kana_hankaku_family
(given name reading) subscriber_name_kana_hankaku_first
(name reading) subscriber_name_kana_zenkakuku
(family name reading) subscriber_name_kana_zenkaku_family
(given name reading) subscriber_name_kana_zenkaku
Subscriber Postal Code subscriber_postal_code
Subscriber Address subscriber_formatted
Subscriber Address-Region subscriber_region
Subscriber Birthdate subscriber_birthdate
Subscriber Gender subscriber_gender
Subscriber Email Address subscriber_mail_address
User Name user_name
(user family name) user_name_family
(user given name) user_name_first
(name reading) user_name_kana_hankaku
(family name reading) user_name_kana_hankaku_family
(given name reading) user_name_kana_hankaku_first
(name reading) user_name_kana_zenkakuku
(family name reading) user_name_kana_zenkaku_family
(given name reading) user_name_kana_zenkaku
User Birthdate user_birthdate

Many thanks,
Toshi

@GillesInnov35
Collaborator Author

GillesInnov35 commented Nov 21, 2023

@ToshiWakayama-KDDI, Orange's KYC offers also differentiate between subscriber and user. The 3-legged authentication architecture is based on the user, who authenticates and should consent, but the information returned by the service concerns the subscriber who signed the contract.
@fernandopradocabrillo, idDocument is part of the TF API Match design. Is the type of document concerned never mentioned?

@GillesInnov35
Collaborator Author

GillesInnov35 commented Nov 21, 2023

I have a question regarding Toshi's proposal, which includes language-specific information (user_name and user_name_kana_hankaku).
Does it mean we should introduce a dataType attribute with values such as InternationUserClass, JapaneseUserClass, etc.?
This kind of type information has been included, for example, in the DeviceLocation API definition.
Gilles

@ToshiWakayama-KDDI
Collaborator

> @ToshiWakayama-KDDI, Orange's KYC offers also differentiate between subscriber and user. The 3-legged authentication architecture is based on the user, who authenticates and should consent, but the information returned by the service concerns the subscriber who signed the contract.
> @fernandopradocabrillo, idDocument is part of the TF API Match design. Is the type of document concerned never mentioned?

Hi @GillesInnov35,
Thanks. Just to double-check: are the address, name, email etc. currently proposed by Orange all for Subscribers?

Thanks.

@ToshiWakayama-KDDI
Collaborator

> I have a question regarding Toshi's proposal, which includes language-specific information (user_name and user_name_kana_hankaku). Does it mean we should introduce a dataType attribute with values such as InternationUserClass, JapaneseUserClass, etc.? This kind of type information has been included, for example, in the DeviceLocation API definition. Gilles

Hi @GillesInnov35 ,
Thanks for the information! I have just looked at DeviceLocation YAMLs, but I could not find it (dataType). Could you advise me which YAML has it (dataType)?

Thanks

@GillesInnov35
Collaborator Author

Hi @ToshiWakayama-KDDI ,

  • Yes, the information returned or compared by the Orange Match ID API concerns only the subscriber's information.

  • In the DeviceLocation API definition, the attribute that specifies the type of the class is areaType (circle or polygon).

@ToshiWakayama-KDDI
Collaborator

> Hi @ToshiWakayama-KDDI,
>
>   • Yes, the information returned or compared by the Orange Match ID API concerns only the subscriber's information.
>   • In the DeviceLocation API definition, the attribute that specifies the type of the class is areaType (circle or polygon).

Hi Gilles @GillesInnov35,
Thank you very much.
I will look into it quickly, together with my internal colleagues.

@fernandopradocabrillo
Collaborator

> @fernandopradocabrillo, idDocument is part of the TF API Match design. Is the type of document concerned never mentioned?

Hi @GillesInnov35,
That's correct, we decided not to include the idDocument type in the proposal since it added unnecessary complexity. In the end, we want to check whether the idDocument provided matches the one stored by the MNO; the important thing here is the number itself.

@ToshiWakayama-KDDI
On our side, we also match against the subscriber's information only.

@ToshiWakayama-KDDI
Collaborator

Hi @fernandopradocabrillo ,
Thank you.

Hi @GillesInnov35 ,
I have quickly checked 'areaType' in the location-retrieval API and the location-verification API, but I am not yet sure whether we could introduce UserClass attributes in a similar way for our purpose. Anyway, at the moment, we are not considering introducing new attributes like UserClass, as it is better for us to keep our first version simple, with only the required attributes.

Many thanks,

@GillesInnov35
Collaborator Author

Thanks a lot @ToshiWakayama-KDDI
As we are currently discussing which attributes should be mandatory, my question was:
Should the kana-specific attributes be part of the KYC-Match request definition?

  • If yes, a dataType used as a discriminator would be useful to avoid duplicating the attributes concerned.
  • If no, there is no need to differentiate two schemas.

@HuubAppelboom
Collaborator

In the Netherlands, we currently have the following attribute list in use:

  • Given Name Initials (we match either on the first initial only or on all initials when available)
  • Family Name (which is stripped from any prefixes)
  • Postal code
  • House number
  • House number extension (not in the Mobile Connect standard but we need it in NL)
  • Date of Birth
  • E-mail address

We don't use street name, town etc because in the Netherlands postal code + house number + house number extension is very exact already.

We already have relatively high match rates (up to 80% for family name). Nevertheless, I think we can still improve by the following:
Instead of Given Name Initials, use the following attributes in parallel:

  • Initial of the first Given Name
  • All Initials of Given Names
  • The first Given Name
  • All Given Names

Often people only record their first Given Name or Initial (although many have multiple Given Names). The use of initials can help in cases where there are multiple ways to write a given name (for example Steve and Stephen).

In the Netherlands we have a list of prefixes that we usually strip from the family name. The reason we do this is that the prefixes can be abbreviated, which hinders the matching. What we can add is an extra attribute in which you compare these prefixes.
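
To illustrate the prefix handling described above, here is a small sketch of splitting a family name into a prefix and a stripped name, so that the two parts could be compared as separate attributes. The prefixes in `PREFIXES` are a few illustrative examples chosen for this sketch, not the actual list we use, and the function name is hypothetical.

```python
from typing import Tuple

# Illustrative prefixes only; the operational list is maintained separately.
PREFIXES = ("van der", "van den", "van de", "van", "de", "den", "ter")


def split_family_name(family_name: str) -> Tuple[str, str]:
    """Return (prefix, stripped family name), lowercased and whitespace-normalized."""
    normalized = " ".join(family_name.lower().split())
    # Try longer prefixes first so that "van der" wins over "van".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if normalized.startswith(prefix + " "):
            return prefix, normalized[len(prefix) + 1:]
    return "", normalized
```

Both sides would need to apply the same normalization before hashing; otherwise identical names would still mismatch.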

For Family Name, I think we can improve by adding the Family Name at birth as a separate attribute. In the Netherlands, your family name can change when you get married, so it may change during your lifetime. Your Family Name at birth never changes, and when available, it is better for matching because it stays constant.

We do not use street name, because our postal code + house number + house number extension is very exact.

So, we would propose the following list (for NL):

  • Initial of the first Given Name
  • All Initials of Given Names
  • The first Given Name
  • All Given Names
  • Prefixes of the Current Family Name
  • Current Family Name
  • Prefixes of the Family Name at birth
  • Family Name at birth
  • Postal Code
  • House Number
  • House Number Extension
  • Date of Birth
  • E-mail address

@HuubAppelboom
Collaborator

Annex B - MC Product Specification - Match, v1.4.xlsx

Also attached is the list of specs we currently use for Match in NL. It also includes the list of prefixes we strip from the family name.

@GillesInnov35
Collaborator Author

Thanks @HuubAppelboom
I think we should be able to identify a shortlist of attributes common to all designs and propose a first draft.

@javier-carrocalabor

I agree with @GillesInnov35 in the sense that I think we should see it from the perspective of a Service Provider that is asking a user for some contact information, and shows a form to collect several fields of data. Then, IMHO, and recognizing I don't know the habits in the Netherlands, I don't think the Service Provider is going to ask the user for, for example, all potential ways of expressing their name, but will ask for the most common way to express the name in that country.

@HuubAppelboom
Collaborator

HuubAppelboom commented Nov 29, 2023

> I agree with @GillesInnov35 in the sense that I think we should see it from the perspective of a Service Provider that is asking a user for some contact information, and shows a form to collect several fields of data. Then, IMHO, and recognizing I don't know the habits in the Netherlands, I don't think the Service Provider is going to ask the user for, for example, all potential ways of expressing their name, but will ask for the most common way to express the name in that country.

The issue is not that we think Service Providers should ask end users for all the possible variations, but that MNOs and Service Providers have a history and a way of working in collecting the data. For example, in the Netherlands we have a couple of MNOs which have only collected initials. Making Given Name(s) the only option will not work in this case (that's why we have chosen initials-only in the Netherlands, deviating from the Mobile Connect standard).

The other issue arises when you ask for a match on all initials (or given names) and provide that as the only option: you will see that the 2nd and 3rd initials are often missing in current databases (at least we have seen that), which results in a lower match rate than you could have. That's why we propose to make several attribute fields available in the standard, and to match on all fields that you have available. The same principle would apply to family name: if you also have the family name at birth available, you can also provide a match on it. In the end, you can safely get to a higher overall match rate this way, without the need for more complex solutions like a match score based on whether the attributes are similar.

As far as the availability of data is concerned, in case the MNO does not have a specific attribute in their CRM system, you can always answer with "NA".
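
The "match on every field you have, answer not-available for the rest" principle can be sketched as follows. This is only an illustration: the function name is hypothetical, values are compared as plain normalized strings rather than hashes, and the result codes "Y", "N-NA" and "N-AV" are the ones used earlier in this thread.

```python
from typing import Dict, Optional


def match_attributes(
    requested: Dict[str, str], record: Dict[str, Optional[str]]
) -> Dict[str, str]:
    """Compare each requested attribute independently against the stored record."""
    results = {}
    for name, value in requested.items():
        stored = record.get(name)
        if stored is None:
            results[name] = "N-NA"   # the MNO does not hold this attribute
        elif stored == value:
            results[name] = "Y"
        else:
            results[name] = "N-AV"   # data is held, but it differs
    return results
```

For example, with a request of `{"givenNameInitial": "r", "givenName": "robertus"}` against a record that only holds the initial, the initial still yields "Y" while the full given name yields "N-NA" instead of dragging down a single combined result.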

@ToshiWakayama-KDDI
Collaborator

> Thanks a lot @ToshiWakayama-KDDI. As we are currently discussing which attributes should be mandatory, my question was: should the kana-specific attributes be part of the KYC-Match request definition?
>
>   • If yes, a dataType used as a discriminator would be useful to avoid duplicating the attributes concerned.
>   • If no, there is no need to differentiate two schemas.

Hi @GillesInnov35,

Thanks very much. First of all, my understanding is that we are not discussing mandatory attributes, but that all attributes should be optional, as I shared on Tuesday. Of course, we need a mandatory requirement like 'at least one attribute should be included in an API match request'.

So, to answer your question, we would like to have the kana-specific attributes etc. as part of the KYC-Match request definition, as one of the options.

Then I understand your point that dataType used as a discriminator would be useful to avoid duplication of concerned attributes, and I think I need to look into it.

Many thanks,
Toshi

@ToshiWakayama-KDDI
Collaborator

Hi @HuubAppelboom , @javier-carrocalabor , @GillesInnov35 ,

Thank you, all, for your comments. Now I understand that the Netherlands needs some specific attributes. As I shared on Tuesday, I would propose to include all the required attributes in our first version, both commonly used attributes and country/market-specific attributes, if we categorise them. I think that all the attributes should be optional, as there seem to be many ways to use this API/KYC-Match functionality, so it is difficult to identify mandatory ones. Of course, we need a mandatory requirement like 'there should be at least one attribute included in an API request'.
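
The 'at least one attribute' rule could be checked along these lines. This is only a sketch: the function name is hypothetical, and `MATCH_ATTRIBUTES` is an illustrative subset of the attribute names proposed in this thread, not an agreed list.

```python
from typing import Any, Dict, List

# Illustrative subset of the attribute names proposed in this thread.
MATCH_ATTRIBUTES = frozenset(
    {"givenName", "familyName", "postalCode", "streetName", "birthdate", "email"}
)


def attributes_to_match(body: Dict[str, Any]) -> List[str]:
    """Return the supplied match attributes, or raise if none was supplied."""
    supplied = sorted(
        name for name in MATCH_ATTRIBUTES if body.get(name) not in (None, "")
    )
    if not supplied:
        raise ValueError("at least one attribute must be included in a match request")
    return supplied
```

Every attribute stays optional on its own; only a request with no attribute at all is rejected.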

If you think we may need categorisation of Common attributes and Country/Market specific attributes, we could write it down somewhere in YAML or in API documentation.

What do you think?

Many thanks,
Toshi

@HuubAppelboom
Collaborator

Hi @ToshiWakayama-KDDI,

I would indeed support including all attributes, both commonly used and country/market-specific ones. As a rule, I would suggest that, when you can, you support all attributes for which you have data.

For example, for NL we currently do not support streetname (because it is not necessary here), but for the sake of international compatibility we will implement it.

On the customer side, the customer can always choose which attributes will be asked to be matched (with the minimum of one of course). For example, for some cases we only need address verification and nothing else, because the customer is already using a different source for the name, date of birth, email etc.

What should also be prevented is customers submitting data they don't have, because this will give you wrong match rate statistics. For example, we had one customer that did not have Date of Birth data, so instead they always submitted "YYYY-MM-DD" as a hashed string, which of course never matches, or a dummy date like "1900-01-01". You get low match rates, and it really takes some time to find out what is going wrong. So in any case, customers must always submit valid data, not dummy data.

With kind regards
Huub

@GillesInnov35
Collaborator Author

Hi @ToshiWakayama-KDDI, the term "mandatory" was not appropriate because, as you say, all attributes should of course be optional (except phone number). I meant the attributes we'd like to see in the API design (which will be the common attributes). Thanks a lot

@StefanoFalsetto-CKHIOD
Collaborator

As I said in some other comments, I will be happy to discuss deprecating the address attribute. It is far better (for many countries around the world) to use separate attributes for the individual address components.

@StefanoFalsetto-CKHIOD
Collaborator

In order to find the right initial set of attributes, I am sharing the full set of attributes that CKH (and hence all its affiliate operators) offers to its Partners. As you can see, we support all the attributes defined in the GSMA IDY.28 specifications plus some custom ones (e.g., age verification). Some of the address-related attributes, such as houseno_or_housename_hash, are used for historical reasons but will be deprecated in the future. Moreover, some of the custom attributes are calculated on the fly from atomic data obtained from MNOs (e.g., age and age_is_greater_than are calculated using the birthdate).

Requested Attribute Returned value
account_state Active/Inactive
age_hash True/False
age_is_greater_than True/False
address_line1_hash Y/N-NA/N-AV
address_line2_hash Y/N-NA/N-AV
billing_segment PAYM/PAYG
birthdate_hash Y/N-NA/N-AV
city_or_province_hash Y/N-NA/N-AV
country_hash Y/N-NA/N-AV
email_hash Y/N-NA/N-AV
family_name_hash Y/N-NA/N-AV
flat_number_hash Y/N-NA/N-AV
gender_hash Y/N-NA/N-AV
given_name_hash Y/N-NA/N-AV
house_name_hash Y/N-NA/N-AV
house_number_hash Y/N-NA/N-AV
houseno_or_housename_hash Y/N-NA/N-AV
is_adult True/False
is_age_verified True/False
is_email_verified True/False
is_lost_stolen True/False
middle_name_hash Y/N-NA/N-AV
postal_code_hash Y/N-NA/N-AV
title_hash Y/N-NA/N-AV
town_hash Y/N-NA/N-AV
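
As a rough illustration of how a derived attribute such as age_is_greater_than can be computed on the fly from the birthdate, here is a minimal sketch. It is not our production logic: the helper names are hypothetical, and a strict "greater than" comparison is an assumption.

```python
from datetime import date
from typing import Optional


def age_on(birthdate: date, today: date) -> int:
    """Age in completed years on the given day."""
    years = today.year - birthdate.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def age_is_greater_than(
    birthdate: date, threshold: int, today: Optional[date] = None
) -> bool:
    """Return True if the age is strictly above the threshold (assumed semantics)."""
    return age_on(birthdate, today or date.today()) > threshold
```

Only the True/False answer leaves the MNO, so the Partner never sees the birthdate itself.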

@HuubAppelboom
Collaborator

In general, I am not too worried about the attribute list being a bit long; I am more worried about trying to put too many flavours into a single attribute. For example, we tried working with all initials available for the given name, but this resulted in too low a match rate, simply because either side (MNO or Relying Party) did not have all initials at their disposal. The same will be the case if you do this with given names, or for example with an attribute containing all the address details. The more you try to push into a single match result, the higher the chance of a mismatch, and that is why we propose to split the 1st given name from middle names, street name from street number, street number extension from street number, etc.

@GillesInnov35
Collaborator Author

Hello all, this is a very interesting discussion, and it's good that we are converging on a solution.

@HuubAppelboom, could you complete your proposal with some example attribute values, so we can see what kind of information is expected? I don't clearly see how middleNamesInitialsMatch and middleNamesMatch will be valued (array or single value). Thanks a lot
Concerning idDocument, if we are to keep it, I think a structure individualIdentification: {name, value} might be used.
For example [{"national ID card", "124587652"}]. The objective is to be as clear as possible about what the id refers to.

Regards

@ToshiWakayama-KDDI
Collaborator

Hi @javier, Hi Huub, Hi Gilles,

Thank you for your further comments. I share Huub's view: I am not worried about the length of the currently proposed attribute list (mine and Huub's). So, Huub's proposed list (plus cp_id/service_id) would be pretty much fine with me.

I can understand the view of making the attribute list as short and simple as possible, however, currently proposed attributes are required by operators and their customers, so, I think there is no point deleting required attributes in order to make the list simple. (For example, we are providing Matching for the single 'name' attribute and the single/formatted 'address' attribute which our customers need.)

For the API clients, they can use attributes they need and can just ignore attributes they do not need. To avoid their confusion, we can prepare proper description and explanation for each API and further we could prepare some typical examples of attributes set for some typical use cases.

For the operators, they can just ignore requests for attributes they do not have.

So, it is kind of 'the greater embraces the less', and I don't believe Huub's proposed list (plus cp_id/service_id) is too long. Could we accept it for our first version?

Thanks,
Toshi

@HuubAppelboom
Collaborator

Regarding the middleNames attribute, there are two ways we can do this in case there is more than one middle name.

Take for example:
Robertus Mattheus Franciscus Janssen
in this, Robertus is the given name (always the first one)
Mattheus Franciscus are the middle names
Janssen is the family name

For Mattheus Franciscus, we could either choose to make it one long string, with everything lowercase, without spaces etc., and hash the result. So in the end you will receive a hash of "mattheusfranciscus"

The alternative would be to make it a list of middle names, and make a hash of each middle name separately (after making everything lowercase). So then you receive a list of two hashes (of "mattheus" and "franciscus"), and for each hash you will provide a Y/N whether you also have that in your list. (in this I assume the order of the middle names is not that relevant).

Probably the alternative will give a higher match rate, in case only one of the middle names mismatches you still have a partial match. What do you think ?
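
To make the two options concrete, here is a minimal sketch of both. The hash algorithm is not fixed in this discussion, so SHA-256 is only an assumption here, and all function names are hypothetical.

```python
import hashlib
from typing import List


def _sha256_hex(text: str) -> str:
    # SHA-256 is an assumption; the thread does not specify the hash algorithm.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_joined(middle_names: str) -> str:
    """Option 1: lowercase, remove spaces, hash the single joined string."""
    return _sha256_hex("".join(middle_names.lower().split()))


def hash_each(middle_names: str) -> List[str]:
    """Option 2: lowercase, then hash each middle name separately."""
    return [_sha256_hex(name) for name in middle_names.lower().split()]


def match_each(request_hashes: List[str], stored_middle_names: str) -> List[str]:
    """Answer Y/N per requested hash, ignoring the order of the stored names."""
    stored = set(hash_each(stored_middle_names))
    return ["Y" if h in stored else "N" for h in request_hashes]
```

With option 2, a request for "Mattheus Franciscus" against a record holding only "Franciscus" gives one N and one Y, whereas option 1 would give a single mismatch.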

@HuubAppelboom
Collaborator

Adding the type of ID document may be a good idea, but I know this can also become a bit complex, with a long list. For example, in the Netherlands we also have special ID documents like a permit for fugitives, an ID card for embassy staff, etc.
Can we agree on a shortlist of the most common types, plus one category Other?
For example Passport, Driving License, IDCard and Other?

@HuubAppelboom
Collaborator

For the ID Card document, you may also want to include the issue date; otherwise you run the risk of drawing the wrong conclusion when you see the same type of document but no match. The issue date will tell you whether the hashed number must be the same for the same identity or not (or whether one of the parties has data on an old document).

@ToshiWakayama-KDDI
Collaborator

> Hi @ToshiWakayama-KDDI, I have a question on the partner information (cp_id, service_id) I see in the attribute list. In 3-legged or 2-legged authentication, consumer information (partner id) is commonly transmitted in the OAuth token. Could you explain why you think it should be part of the definition? Thanks a lot

Hi @GillesInnov35, all,

Thanks for your question.

Indeed, consumer information (partner id) is commonly transmitted in the OAuth token. Actually, we use this partner information (cp_id, service_id) included in the request body, in addition to the information transmitted in the OAuth token.

As it is our current implementation for commercial service, we would like to have these two included somewhere in the request body (only in the request body, not in the response body), if possible, though they may not be normal KYC attributes. Of course, description of these two items should be properly documented in the YAML and API documentation, e.g. they are only for Japan regional use, how to use them, etc.

Many thanks,
Toshi

@ToshiWakayama-KDDI
Collaborator

Hi all,
Thank you for the further discussion up to last week. As there are no further comments, I think we can conclude the KYC-Match parameters for our initial version as below:

  • phoneNumber
  • mainPhoneNumber
  • idDocument
  • name
  • nameKanaHankaku
  • nameKanaZenkaku
  • givenName
  • middleNames
  • familyName
  • postalCode
  • address
  • streetName
  • streetNumber
  • region
  • locality
  • country
  • birthdate
  • email
  • givenNameInitial
  • middleNamesInitials
  • familyNameAtBirth
  • houseNumberExtension
  • gender
  • cp_id
  • service_id

Please point out if there is anything wrong or missing.

Many thanks,
Toshi

@ToshiWakayama-KDDI
Collaborator

Hi all,

Thank you very much for the discussion yesterday and your compromise. To note, the following is the list of parameters for the initial plain text version of our KYC Match API.

  • phoneNumber
  • idDocument
  • name
  • nameKanaHankaku
  • nameKanaZenkaku
  • givenName
  • middleNames
  • familyName
  • postalCode
  • address
  • streetName
  • streetNumber
  • region
  • locality
  • country
  • birthdate
  • email
  • familyNameAtBirth
  • houseNumberExtension
  • gender

Should there be any attribute that I missed above, please let me know.

Many thanks,
Toshi

@GillesInnov35
Collaborator Author

@ToshiWakayama-KDDI , this is the list of attributes we agreed on yesterday. It's OK.
Thanks

@fernandopradocabrillo
Collaborator

Hi @ToshiWakayama-KDDI

I still have some doubts about the kana parameters. These parameters are very country-dependent, and if more countries add their own particular properties, we can end up with a never-ending list of parameters.
For example, in Spain we sometimes have a second name, or compound names, but we believe that this should be part of the name field.
For a standard, shouldn't we try to find a solution that is normalized for every country?

I have a couple ideas to overcome this topic:

  1. Do you have any way to identify whether the input information is hankaku or zenkaku? Is there any format checker? I was thinking that maybe you can use the name property and, in the backend, as part of the normalization, detect which format it is and then check against the stored info.
  2. Another way could be to use the property name and let the backend check against both fields of the stored info.
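
As a rough illustration of idea 1, the sketch below uses Python's standard `unicodedata` module to classify a kana string by character width and to fold half-width katakana to full-width before comparing. It only addresses character width; whether that is enough for the Japanese use case is an open question, and the function names are hypothetical.

```python
import unicodedata


def kana_width(text: str) -> str:
    """Classify text as 'hankaku', 'zenkaku', 'mixed' or 'unknown' by East Asian width."""
    widths = {unicodedata.east_asian_width(ch) for ch in text if not ch.isspace()}
    if not widths:
        return "unknown"
    if widths == {"H"}:  # only half-width characters
        return "hankaku"
    if widths <= {"W", "F"}:  # only wide / full-width characters
        return "zenkaku"
    return "mixed"


def fold_width(text: str) -> str:
    """NFKC folds half-width katakana (and voiced marks) into standard katakana."""
    return unicodedata.normalize("NFKC", text)


def kana_equal(a: str, b: str) -> bool:
    """Compare two kana strings regardless of half-width or full-width form."""
    return fold_width(a) == fold_width(b)


if __name__ == "__main__":
    hankaku = "\uff94\uff8f\uff80\uff9e"  # half-width katakana for "Yamada"
    zenkaku = "\u30e4\u30de\u30c0"        # full-width katakana for "Yamada"
    print(kana_width(hankaku), kana_width(zenkaku))
    print(kana_equal(hankaku, zenkaku))
```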

In the same way, I think we can reduce the list of parameters a bit more by removing familyNameAtBirth. If I am a service that needs the lastName to make a subscription and I want to check the info provided, how do I make the API request? Do I check against all lastName-related fields? Does the MNO have that much information about its users? Isn't it better to just leave it as lastName and let the country implementation check against what is commonly used there?

For houseNumberExtension I'm not so sure. This one may be useful; do we have an example of its use?

gender ? -> this one might be difficult to match. Do we need to establish some rules or allowed values? male/female, Prefer not to say / Prefer not to disclose, non-binary genders, etc.

So a proposal for the initial list could be:

  1. phoneNumber
  2. idDocument
  3. name/firstName
  4. lastName/familyName
  5. middleName
  6. givenName (joining the previous name values?)
  7. streetName
  8. streetNumber/houseNumber
  9. region
  10. locality
  11. country
  12. postalCode
  13. birthdate
  14. email

In addition, we also think that it could be useful to group related properties under the same object: identity for naming properties, address for properties related to the postal address.
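
A possible shape for such a grouped request, sketched in Python. The `identity` and `address` object names come from the suggestion above; the class names, the placement of the remaining fields at the top level and the `to_request_body` helper are assumptions for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class Identity:
    givenName: Optional[str] = None
    middleName: Optional[str] = None
    familyName: Optional[str] = None


@dataclass
class Address:
    streetName: Optional[str] = None
    streetNumber: Optional[str] = None
    region: Optional[str] = None
    locality: Optional[str] = None
    country: Optional[str] = None
    postalCode: Optional[str] = None


@dataclass
class MatchRequest:
    phoneNumber: Optional[str] = None
    idDocument: Optional[str] = None
    identity: Optional[Identity] = None
    address: Optional[Address] = None
    birthdate: Optional[str] = None
    email: Optional[str] = None


def to_request_body(request: MatchRequest) -> dict:
    """Convert to a plain dict, dropping unset attributes and empty groups."""
    def prune(value: Any) -> Any:
        if isinstance(value, dict):
            pruned = {k: prune(v) for k, v in value.items() if v is not None}
            return {k: v for k, v in pruned.items() if v != {}}
        return value

    return prune(asdict(request))


if __name__ == "__main__":
    request = MatchRequest(
        phoneNumber="+34600000000",  # sample value only
        identity=Identity(givenName="Ana", familyName="Garcia"),
        address=Address(locality="Madrid", country="ES"),
    )
    print(to_request_body(request))
```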

Thanks!

@ToshiWakayama-KDDI
Collaborator

Hi @fernandopradocabrillo ,

Thank you, but we have agreed the parameters for our initial version, and we have to complete our initial YAML by Tuesday 19th. If we restart parameter discussion, for example, for us, idDocument is not needed, it will take more time and never end. I understand your point, but why don't we complete our first version once as a basis, and then we can modify and refine it in an agile fashion to make it better. As you know, if we look at some other APIs in CAMARA, they have had several update versions. This is my current view.

Best regards,
Toshi

@HuubAppelboom
Collaborator

Hi @fernandopradocabrillo ,

The house number extension is used in the Netherlands. It is used, for example, in apartment buildings in which all apartments have the same house number but each apartment has a different extension. Without it, you don't have a complete address.
The reason we keep it separate from the house number is that there are several ways the house number can be written, which can cause problems. For example, you can write the house number extension as I, II, III, IV but also as 1, 2, 3, 4, etc. We also use this field for cases like nursing homes, where you can have room numbers like room 105, etc. Formally, the house number extension is not registered in the Netherlands, so there are different ways of writing it.

When we introduced Match as a service in the Netherlands, we first tried it without the house number extension, but later added it because too many customers complained that it was missing. So, for the Netherlands it is a must-have, and there may be other countries with a similar system.
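
To illustrate why the different spellings matter, here is a small Python sketch that normalizes an extension before comparing. The Roman numeral table is limited to a handful of values and the function names are hypothetical; a real implementation would also have to decide how to treat letters such as "I" or "V" that can be either a numeral or a plain letter, and free-text forms like "room 105" are not handled here.

```python
import re

# Assumed mapping for the Roman numeral spellings mentioned above.
ROMAN_TO_ARABIC = {"I": "1", "II": "2", "III": "3", "IV": "4", "V": "5"}


def normalize_extension(extension: str) -> str:
    """Strip punctuation and spaces, uppercase, and map small Roman numerals to digits."""
    cleaned = re.sub(r"[^0-9A-Za-z]", "", extension).upper()
    return ROMAN_TO_ARABIC.get(cleaned, cleaned)


def extensions_match(requested: str, stored: str) -> bool:
    """Compare two house number extensions after normalization."""
    return normalize_extension(requested) == normalize_extension(stored)


if __name__ == "__main__":
    print(extensions_match("II", "2"))
    print(extensions_match("-a", "A"))
    print(extensions_match("A", "B"))
```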

@HuubAppelboom
Collaborator

Hi @fernandopradocabrillo ,

The family name at birth is commonly used by the telcos in my country to identify someone. The reason we use it is that it never changes over time due to marriage etc. What do you commonly use in Spain?

On your passport your family name at birth is always present; adding the name of your partner is entirely optional and up to you.

In daily use, you can use the family names of you and your partner when you get married in four different ways, and people are free to select how they want to use it:

  1. First your family name at birth, then the family name of your partner
  2. First the family name of your partner, then your family name at birth
  3. Only the family name of your partner
  4. Only your family name at birth

As you can imagine, this can cause a lot of confusion if you don't specify what you are asking for, or it can lead to the wrong conclusion regarding someone's identity.

In our current Match service, we use the family name at birth (because that is the most reliable for us), but it would be a good improvement to offer the option of matching the current family name as well. When the customer has both attributes, they can try to match both (and obtain a better identification).
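
A minimal Python sketch of matching both attributes independently is shown below. The attribute names `familyName` and `familyNameAtBirth` come from this thread; the normalization rule, the sample names and the use of `None` for "not requested or not available" are assumptions.

```python
from typing import Optional


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization rule)."""
    return " ".join(value.lower().split())


def match_attribute(requested: Optional[str], stored: Optional[str]) -> Optional[bool]:
    """True/False when both sides have a value, None when either side lacks it."""
    if requested is None or stored is None:
        return None
    return normalize(requested) == normalize(stored)


def match_family_names(request: dict, stored: dict) -> dict:
    """Match the current family name and the family name at birth separately."""
    return {
        attribute: match_attribute(request.get(attribute), stored.get(attribute))
        for attribute in ("familyName", "familyNameAtBirth")
    }


if __name__ == "__main__":
    stored = {"familyName": "Jansen-de Vries", "familyNameAtBirth": "de Vries"}
    request = {"familyName": "Jansen", "familyNameAtBirth": "De Vries"}
    print(match_family_names(request, stored))
```

Returning one result per attribute lets the API consumer see that the name at birth matches even when the daily-use family name is written differently.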

@ToshiWakayama-KDDI
Collaborator

Hi @fernandopradocabrillo ,
Thank you for your previous comment.
Regarding your comment on a way to identify whether the input information is hankaku or zenkaku, there is no easy way to do that. nameKanaHankaku and nameKanaZenkaku are simplified attribute names; what they actually hold is the reading of the name. They are regarded as separate attributes in Japan. We understand they are country specific, so we can discuss better ways to handle country-specific items to enhance the initial version.
Best regards,

@javier-carrocalabor

Hi all,
Regarding the 'gender' parameter, after a preliminary internal privacy assessment, we have seen that this parameter can raise privacy issues, as it is specially protected data (with similar protection to political opinions, religious beliefs, etc.). All the parameters in this API are personal data, but the 'gender' parameter requires special protection, so we propose to remove it.

@ToshiWakayama-KDDI
Collaborator

Hello @javier-carrocalabor ,

Thank you for your comment on the 'gender' attribute.

I am not a specialist on this topic, but my general feeling is that how sensitive 'gender' is differs per country or market, and that some countries/markets want to have the 'gender' attribute. MNOs (KYC API providers) that do not want to have it, or are not able to handle it, can send a 'not_available' response, so there is no need to remove the 'gender' attribute.

Anyway, we can discuss this after the Christmas Break.

Many thanks,
Toshi

@javier-carrocalabor

javier-carrocalabor commented Jan 18, 2024

Hi @ToshiWakayama-KDDI ,

In case it helps to generate the final list of attributes with their corresponding descriptions, I'm adding here a summary table:

Implementation: "All" means potentially used in all countries; country codes mean it is country specific.

| Name of parameter | Description | Implementation |
|---|---|---|
| phoneNumber | Phone number in E.164 format (starting with country code). Optionally prefixed by '+'. | All |
| idDocument | Id number associated to the official identity document in the country. It may contain alphanumeric characters. | ES, BR, … |
| name | First name or compound first name. | All |
| nameKanaHankaku | Name in KanaHankaku format for JP. | JP |
| nameKanaZenkaku | Name in KanaZenkaku format for JP. | JP |
| middleNames | Middle name/s. | NL, … |
| familyName | Surname or family name. | All |
| familyNameAtBirth | Family name at birth. | NL, … |
| givenName | Complete given name built following the usual concatenation of parameters in a country. It can use name, middleNames, familyName and/or familyNameAtBirth. For example, in ES, name+familyName; in NL, it can be name+middleNames+familyName or name+middleNames+familyNameAtBirth, etc. | All |
| postalCode | Zip code or postal code. | All |
| streetName | Name of the street. It should not include the type of the street. | All |
| streetNumber | Number identifying a specific property on the 'streetName'. | All |
| houseNumberExtension | Specific identifier of the house needed depending on the property type. For example, number of the apartment in an apartment building. | NL, … |
| address | Complete address built following the usual concatenation of parameters in a country. It can use streetName, streetNumber and/or houseNumberExtension. For example, in ES, streetName+streetNumber; in NL, it can be streetName+streetNumber or streetName+streetNumber+houseNumberExtension. | All |
| region | | All |
| locality | | All |
| country | Country id, in ISO 3166 alpha-2 format??? | All |
| birthdate | Birthdate, in ISO 8601 calendar date format (YYYY-MM-DD). | All |
| email | | All |
| gender | | All |
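
The format constraints stated for phoneNumber and birthdate could be checked along these lines. This is only a sketch: the regular expressions reflect my reading of "E.164, optionally prefixed by '+'" and "YYYY-MM-DD", and the function names are hypothetical rather than taken from the YAML.

```python
import re
from datetime import datetime

# E.164: country code first (no leading zero), at most 15 digits, optional '+'.
E164_PATTERN = re.compile(r"\+?[1-9][0-9]{1,14}")
# ISO 8601 calendar date written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_phone_number(value: str) -> bool:
    """Check the phoneNumber format described in the table."""
    return E164_PATTERN.fullmatch(value) is not None


def is_valid_birthdate(value: str) -> bool:
    """Check both the YYYY-MM-DD layout and that the date exists in the calendar."""
    if DATE_PATTERN.fullmatch(value) is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_phone_number("+34600123456"))
    print(is_valid_birthdate("1990-02-30"))
```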

@HuubAppelboom
Collaborator

Hi @javier-carrocalabor I think you may have mixed up Name and Given Name in the above definitions

See for example https://en.wikipedia.org/wiki/Given_name, also for some interesting history and cultural differences

@javier-carrocalabor

Hi @javier-carrocalabor I think you may have mixed up Name and Given Name in the above definitions

See for example https://en.wikipedia.org/wiki/Given_name, also for some interesting history and cultural differences

Thank you, Huub, for the comment. I think I mixed both parameters 'name' and 'givenName'.

So, the description for you all to consider could be:

  • givenName: First name or compound first name.
  • name: Complete name built following the usual concatenation of first/given name, middle name and last/family name in a country. It can use givenName, middleNames, familyName and/or familyNameAtBirth. For example, in ES, name+familyName; in NL, it can be name+middleNames+familyName or name+middleNames+familyNameAtBirth, etc.

@ToshiWakayama-KDDI
Collaborator

Thank you, @javier-carrocalabor , @HuubAppelboom ,

Regarding 'name', the above description suggests the order first/given name -> middle name -> last/family name, but this is not the case in every country. In addition, 'surname' should be added to last/family name.

So, my proposal would be:

Complete name of the customer, usually composed of first/given name and last/family name (surname) in a country. Depending on the country, the order of first/given name and last/family name (surname) varies, and a middle name could be included.
It can use givenName, middleNames, familyName and/or familyNameAtBirth. For example, in ES, name+familyName; in NL, it can be name+middleNames+familyName or name+middleNames+familyNameAtBirth, etc.

Also, I will create a PR for this.

Thanks,
Toshi

@StefanoFalsetto-CKHIOD
Collaborator

Frankly speaking, I don't understand why we need to discuss the use of aggregated fields such as "name". In my humble opinion, all aggregated attributes, like the generic "address", should be deprecated. My concerns are the following:

  1. As we can all see from the last comment, it is not possible to have a single way to manage these kinds of attributes in each country. I think that the hardest task of this working group is to find common solutions and to use country-specific fields only in a few cases. The risk is creating a set of multiple, complex and hard-to-maintain KYC definitions, one per country.
  2. Even if, in the case of KYC Match plain text, it is possible to "do something" with aggregated attributes (like string mangling, or anything else), the match rate is significantly lower than with "atomic" attributes like "first name", "middle name", etc. The problem is certainly worse in the case of KYC Match hashed. You can't do anything in that case.
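
The hashed case in point 2 can be illustrated with a short Python sketch. SHA-256, the normalization rule, the sample names and the function names are all assumptions; in a real hashed flow both sides would exchange hashes only, and the functions hash both inputs here purely to keep the example short.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash after lowercasing and collapsing whitespace (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_aggregated(requested_name: str, stored_name: str) -> bool:
    """Compare hashes of an aggregated 'name' attribute."""
    return sha256_hex(requested_name) == sha256_hex(stored_name)


def match_atomic(requested: dict, stored: dict) -> dict:
    """Compare hashes attribute by attribute for atomic name fields."""
    return {
        attribute: sha256_hex(value) == sha256_hex(stored[attribute])
        for attribute, value in requested.items()
        if attribute in stored
    }


if __name__ == "__main__":
    # Same person, but the two parties build the aggregated name in a different order.
    print(match_aggregated("Taro Yamada", "Yamada Taro"))
    atomic = {"givenName": "Taro", "familyName": "Yamada"}
    print(match_atomic(atomic, {"givenName": "taro", "familyName": "YAMADA"}))
```

Once the aggregated string is hashed, a difference in ordering can no longer be detected or repaired, while the atomic attributes still match one by one.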

@javier-carrocalabor

javier-carrocalabor commented Feb 13, 2024

Thank you @StefanoFalsetto-CKHIOD for the heads up.
Do you mean we should consider this already for the release candidate that is being proposed right now, to be closed in the next few days, or shall we consider it for the next version?
I have been reviewing the history of this issue and, if I am not wrong, I see these comments aligned with your idea:
@HuubAppelboom :
#18 (comment)
@GillesInnov35 :
#18 (comment)
@StefanoFalsetto-CKHIOD :
#18 (comment)
And also from Telefónica ourselves.
But I think at least @ToshiWakayama-KDDI does not completely support that idea:
#18 (comment)
From our side, not really sure about other opinions: @EfthymisIsaakidis-DTCS @rartych

So, to be practical, I suggest going ahead with the current proposal in the PR for the rc unless we really have a good quorum on removing the "composed parameters": 'name' and 'address'.

@GillesInnov35
Collaborator Author

Hi @javier-carrocalabor, all. At Orange we do not use aggregated attributes such as name or address. I agree with @StefanoFalsetto-CKHIOD: in the case of KYC Match Hashed, aggregated attributes won't be useful.
My point of view is that we are very close to the publication date of the release candidate, and I'm not sure those 2 fields make the design that complex. But if we make the decision to remove them, it's OK.
As proposed in a previous issue, country-specific fields or requirements could perhaps be managed by implementing inheritance in the next release, couldn't they?
Thanks
BR
Gilles

@HuubAppelboom
Collaborator

@javier-carrocalabor Hi, in the attribute table you provided, shouldn't Address, Locality, etc. be written without capitals?

@javier-carrocalabor

@javier-carrocalabor Hi, In the attribute table you provided, should Address, Locality, etc not be written without Capitals ?

You are right. My bad. I have edited the comment to correct it. Fortunately, it had already been corrected when it was taken into the YAML file of the release.

@ToshiWakayama-KDDI
Collaborator

ToshiWakayama-KDDI commented May 22, 2024

Hi All,

I think the remaining points in this issue have been included in Issue #87 (and maybe others). Can we close this issue?

BR

@HuubAppelboom
Collaborator

Yes, I think we can close this now

@ToshiWakayama-KDDI
Collaborator

Closed as agreed.
