Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #272: check c_obj_create error code for OBJ_R_OID_EXISTS. #294

Closed
wants to merge 1 commit into from

Conversation

ghost
Copy link

@ghost ghost commented Oct 24, 2023

See #272.

This commit checks that if c_obj_create fails, the error code is OBJ_R_OID_EXISTS
and the OID has been successfully added (using OBJ_txt2nid). If it's the case,
then the error is skipped.

@ghost ghost force-pushed the obj_create_lock branch 17 times, most recently from 5568d9e to 1970d83 Compare October 24, 2023 23:57
@ghost ghost marked this pull request as ready for review October 25, 2023 00:12
@ghost ghost requested a review from baentsch as a code owner October 25, 2023 00:12
oqsprov/oqsprov.c Outdated Show resolved Hide resolved
test/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for trying to resolve this issue! But Windows (CI) doesn't seem to "like" this solution. While we have learned at school that locks should be held for the shortest possible time, I wonder whether it might not be worth while trying in this case to run the whole NID registration within CRYPTO_THREAD_run_once instead (and do away with the lock/unlock pattern proposed and apparently causing all this "cross platform" hassle)?

@ghost
Copy link
Author

ghost commented Oct 25, 2023

CRYPTO_THREAD_run_once

CRYPTO_THREAD_run_once is going to run the routine only… once! But since the teardown function of oqsprovider doesn't destroy the previously created NIDs and OIDs, I guess it's safe to run it that way. I'll update my PR.

@baentsch
Copy link
Member

the teardown function of oqsprovider doesn't destroy the previously created NIDs and OIDs

Correct observation: As per the OpenSSL documentation:

By default OpenSSL will attempt to clean itself up when the process exits via an “atexit” handler.

This IMO applies also to all objects created/registered via an OpenSSL API. We have asan tests successfully running over the whole gamut and they don't find "unfreed" memory, so I think we're good.

@ghost
Copy link
Author

ghost commented Oct 25, 2023

Okay so we can't pass arguments to CRYPTO_THREAD_run_once, so it forces me to move pretty much all the variables from OQS_PROVIDER_ENTRYPOINT_NAME to the static section…
Should I do that @baentsch ?

@baentsch
Copy link
Member

Okay so we can't pass arguments to CRYPTO_THREAD_run_once, so it forces me to move pretty much all the variables from OQS_PROVIDER_ENTRYPOINT_NAME to the static section… Should I do that @baentsch ?

Yikes. That doesn't sound "attractive". Sorry I hadn't considered that (lack of parameters) when making the proposal. If I were doing this (having arbitrary time at my hands :), I'd do two things: 1) Go ahead and make that change and then very carefully observe the results of an asan run and 2) Raise an issue/question in the OpenSSL discussion forum as to whether this is an intended/acceptable way to use CRYPTO_THREAD_run_once.

@ghost ghost force-pushed the obj_create_lock branch from 1970d83 to 8d3d336 Compare November 2, 2023 18:55
@ghost
Copy link
Author

ghost commented Nov 2, 2023

Okay, so I've opted for another solution: in case of an error, we check the last error, if it's OBJ_R_OID_EXISTS, then we check that the OID does exist, and it does, then we skip the error.

@ghost ghost force-pushed the obj_create_lock branch from 8d3d336 to 980d1ec Compare November 2, 2023 19:02
@ghost ghost force-pushed the obj_create_lock branch from 980d1ec to 326cb5c Compare November 2, 2023 19:03
@ghost ghost changed the title Fix #272: run c_obj_create under a lock. Fix #272: check c_obj_create error code for OBJ_R_OID_EXISTS. Nov 2, 2023
goto end_init;
last_error = ERR_peek_last_error();
if ((last_error != ERR_PACK(ERR_LIB_OBJ, 0, OBJ_R_OID_EXISTS))
|| (OBJ_txt2nid(oqs_oid_alg_list[i]) == NID_undef)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than check for equality with NID_undef, what about checking for inequality with oqs_oid_alg_list[i]? Oh, and the parameter for OBJ_txt2nid seems to be indexed incorrectly (i-> i+1), no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're passing oqs_oid_alg_list[i] as second argument of c_obj_create, and according to the implementation, OBJ_txt2nid is called with the second argument:

https://github.com/openssl/openssl/blob/26ecab1fd7fa7f5103ac57ef41ee0dd38fbe2ddc/crypto/provider_core.c#L2219-L2225

static int core_obj_create(const OSSL_CORE_HANDLE *prov, const char *oid,
                           const char *sn, const char *ln)
{
    /* Check if it already exists and create it if not */
    return OBJ_txt2nid(oid) != NID_undef
           || OBJ_create(oid, sn, ln) != NID_undef;
}

Although OBJ_create returns the NID as an integer, c_obj_create only returns 1 or 0. So after adding a new object using c_obj_create, we don't get back its NID as an integer, so we cannot compare it to the returned value of OBJ_txt2nid

@ghost ghost force-pushed the obj_create_lock branch 4 times, most recently from f9f63b2 to 9682c04 Compare November 2, 2023 22:12
@ghost
Copy link
Author

ghost commented Nov 2, 2023

Alright, after spending 2 hours debugging, it seems that my solution that consisted of checking the error code against OBJ_R_OID_EXISTS does not work: c_obj_create may return OBJ_R_OID_EXISTS while OBJ_txt2nid returns NID_undef. It happens when the object is about to be fully created but it's still partially created, and c_obj_create doesn't see it before OBJ_txt2nid

I discarded my solution of using a CRYPTO_RWLOCK along with CRYPTO_THREAD_run_once because the lock couldn't be free (since we don't know when users finish using oqs-provider). It was leading to an ASan report stating that a memory leak occurred (which is true, since the lock wasn't free at all).

EDIT: found a better solution: I used atexit to free the lock.

@ghost ghost force-pushed the obj_create_lock branch 6 times, most recently from 1af69b7 to 564ee11 Compare November 2, 2023 23:32
@baentsch
Copy link
Member

baentsch commented Nov 7, 2023

found a better solution: I used atexit to free the lock.

May I ask about the status of this? Windows (CI) seems to run into a deadlock...

@ghost
Copy link
Author

ghost commented Nov 9, 2023

May I ask about the status of this? Windows (CI) seems to run into a deadlock...

Hi!

For Windows, one of the builds is failing:

C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(487,17): error C2059: syntax error: '/' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]
C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(502,17): error C2059: syntax error: '/' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]
C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(530,17): error C2059: syntax error: '}' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]
C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(531,13): error C2059: syntax error: '}' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]
C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(533,9): error C2059: syntax error: '}' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]
C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\um\oaidl.h(534,5): error C2059: syntax error: '}' [D:\a\oqs-provider\oqs-provider\_build\oqsprov\oqsprovider.vcxproj]

and I really dont' understand why…

About the deadlock, I also have no clue since I'm just using OpenSSL API…

Also, I don't understand why ubuntu focal is failing, but I think it will be easier to fix than windows…

I'll try to find some time before the end of the week to fix this PR properly.

See #272.

This commit checks that if `c_obj_create` fails, the error code is `OBJ_R_OID_EXISTS`
and the OID has been successfully added (using `OBJ_txt2nid`). If it's the case,
then the error is skipped.
@ghost ghost force-pushed the obj_create_lock branch from 564ee11 to 63bc01e Compare December 5, 2023 11:35
@baentsch
Copy link
Member

baentsch commented May 7, 2024

@thb-sb This seems to be pretty stale now. OK to close this PR for now and look into this when time permits again?

@baentsch
Copy link
Member

Closing due to inactivity. Please re-open as you find time, @thb-sb .

@baentsch baentsch closed this Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant