Race condition between keygen and update, resulting in "Key derivation key not available!" #52
Comments
Strike that, I missed the fact that the dependencies have been reworked in commit 7778512
Did you have that commit in your build? It's not part of any release, so it's likely you don't have it applied there.
Yeah, I realized too late that this bug report was based on the latest release. However, it still happens if I include that commit, so it's certainly still an issue. At the moment, though, I'm stuck figuring out what triggers the tangd-update invocation now. It does happen, but why?
Could you try #53 and report back, please?
Will do in a moment, just want to share the analysis I did: Starting with @jwkdir@ and @cachedir@ empty. Requesting an advertisement using
which results in an error:
This is the output of "journalctl -o short-precise", with timestamps of the created files merged, ordered by time.
So update is started ...
... before keygen finished the job. Also, update will ignore the second .jwk (not shown here)
And now the server is started although update is still running.
... hence the 404
The update script needed 180ms to parse the .jwk and create the first .jws. This is the earliest moment where the server may be run. But to be safe, this should rather happen after update finished the job which took ...
... another 2520ms.
So far I'm not convinced that using the systemd semantics to resolve dependencies is the best idea.
Sergio Correia wrote...
Could you try #53 and report back, please?
Doesn't look good on armhf:
```
1/3 adv TIMEOUT 90.13s
--- command ---
22:10:54 SD_ACTIVATE='/usr/bin/systemd-socket-activate --inetd' PATH='/<<PKGBUILDDIR>>/src:/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/src:
/usr/lib/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games' /<<PKGBUILDDIR>>/tests/adv
--- stdout ---
{"payload": "eyJrZXlzIjogW3siYWxnIjogIkVTNTEyIiwgImNydiI6ICJQLTUyMSIsICJrZXlfb3BzIjogWyJ2ZXJpZnkiXSwgImt0eSI6ICJFQyIsICJ4IjogIkFVLWZZc19
HZHd5Q2g0N0tDOEU5U0tkWERxMU9nYU9CcVNiQ1JPMlNYN1JvMnBxWVoyZ0llZzBiRU9COFowTzk3TlJjcjFpLS12dHljTE9TQTNQaFBtY3AiLCAieSI6ICJBQVdjRXVsRlkxTGN
TZzZJUk9xQ25DMkZIWW9tTEVrd2hkU3BhTDlXZTJvUi1RY1l4a1RobFcxb01NX2loR1lVc3Z6cnhXeVdWd2Ewa191LWpSYmdNa2p3In0sIHsiYWxnIjogIkVDTVIiLCAiY3J2Ijo
gIlAtNTIxIiwgImtleV9vcHMiOiBbImRlcml2ZUtleSJdLCAia3R5IjogIkVDIiwgIngiOiAiQUV1Mjd5dm9KUjNxN2xHaDY1VFBNZWxzMy1Rc3hxTmJRajA3SzQxMkJlVjlxRzF
uODhYX1hhSV9wQTMxTU1Ua2prV2pSbW8yMnhOX3YwRi1jVjBvZjk3YSIsICJ5IjogIkFjMkNvRnQ2X19uR1pXam9GamFwdWYyOEdpY29HaTJWaDVOc3JsYjBkQ1BMOW12VlFOMVF
jaWNueUNhd2hGUW9KcGxXUmtVYWstSkZpczdLaVlaRWVENnkifV19", "protected": "eyJhbGciOiJFUzUxMiIsImN0eSI6Imp3ay1zZXQranNvbiJ9", "signature": "A
FmazBHspqbzUUIrSbRA5NiIB1ePk1K69bFcuCbMZ0QHfeGv7GlsWrd6Y78I464pPYuYGTL6StQ_ojTrvZcbR3hGAAf18u7KesoT0efZhbgYqd0UU3ZqmYyLZrbztjPT2Pfua2xDV
PPF-L_rpYxoZXnu9FPyW69a4W1HmKcwaJxzGV_3"}
--- stderr ---
+ trap on_exit EXIT
+ trap exit ERR
++ mktemp -d
+ export TMP=/tmp/tmp.gwHry8iGOS
+ TMP=/tmp/tmp.gwHry8iGOS
+ mkdir -p /tmp/tmp.gwHry8iGOS/db
+ tangd-keygen /tmp/tmp.gwHry8iGOS/db sig exc
+ jose jwk gen -i '{"alg": "ES512"}' -o /tmp/tmp.gwHry8iGOS/db/.sig.jwk
+ jose jwk gen -i '{"alg": "ES512"}' -o /tmp/tmp.gwHry8iGOS/db/.oth.jwk
++ shuf -i 1024-65536 -n 1
+ export PORT=50243
+ PORT=50243
+ export PID=301670
+ PID=301670
+ sleep 0.5
+ /usr/bin/systemd-socket-activate --inetd -l 127.0.0.1:50243 -a tangd /tmp/tmp.gwHry8iGOS/db
Listening on 127.0.0.1:50243 as 3.
+ fetch /
+ curl -sfg http://127.0.0.1:50243/
Communication attempt on fd 3.
Connection from 127.0.0.1:57526 to 127.0.0.1:50243
Spawned tangd (tangd /tmp/tmp.gwHry8iGOS/db) as PID 301678.
Execing tangd (tangd /tmp/tmp.gwHry8iGOS/db)
<unknown> GET / => 404 (../src/http.c:128)
Child 301678 died with code 0
```
Will check further.
ACK, none of this still applies.
TIL: The units named in the two After=... declarations in tangd.socket are started in parallel. So if tangd-update starts checking @jwkdir@ before tangd-keygen has written both files, the .jws in @cachedir@ will be incomplete. This happened here on relatively slow armhf hardware.
In that situation, an attempt to use that tang server with "clevis encrypt tang" will trigger the message "Key derivation key not available!". The Debian bug report is https://bugs.debian.org/975343
As a solution I suggest moving the
from tangd.socket to tangd-update.service; that worked for me.
Related, the entire logic around the keygen script seems a little fragile if operation is interrupted mid-way:
Writing the data to a temporary file first and then atomically moving it to the final location, as seen in the update script, avoids the creation of zero-sized files. Alternatively, that job could already be done by jose, see latchset/jose#88.
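To illustrate the temporary-file approach, here is a minimal sketch. The helper name `write_atomically` is made up for this example and is not part of tang or jose; it assumes the generating command writes its result to stdout and that the temporary file lives in the same directory as the destination, so the final `mv` is a rename within one filesystem.

```shell
# Hypothetical helper (not part of tang): write_atomically DEST COMMAND [ARGS...]
# Runs COMMAND with stdout redirected to a temporary file next to DEST and
# renames it into place only if COMMAND succeeded and produced output.
write_atomically() {
    dest=$1
    shift
    # Create the temporary file in the destination directory so that the
    # later mv is a rename on the same filesystem.
    tmp=$(mktemp "$(dirname "$dest")/.tmp.XXXXXX") || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv -f "$tmp" "$dest"
    else
        # Command failed or wrote nothing: leave no partial file behind.
        rm -f "$tmp"
        return 1
    fi
}
```

In keygen this could wrap the key generation call, provided the generator is told to write to stdout instead of directly to the .jwk file. An interrupted run would then leave at most a stray temporary file, never a zero-sized key.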
Still, in case of an interruption, key generation will not be resumed, since the ConditionDirectoryNotEmpty= in tangd-keygen.service will no longer apply. Perhaps there is a systemd way to deal with that; I'd just touch a "key-created" semaphore in @jwkdir@ and, as a next step, merge keygen and update into a single script, since detecting whether keys need to be created is easy then. But perhaps I missed a use case here.
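The semaphore idea could look roughly like this. It is only a sketch: `ensure_keys` is a hypothetical name, the generator is passed in as a command that receives the key directory as its last argument, and the marker file name "key-created" is the one suggested above.

```shell
# Hypothetical sketch: ensure_keys JWKDIR GENERATOR [ARGS...]
# Runs GENERATOR (with JWKDIR appended as its last argument) unless the
# "key-created" marker already exists, and creates the marker only after
# the generator has finished successfully.
ensure_keys() {
    jwkdir=$1
    shift
    marker="$jwkdir/key-created"
    if [ -e "$marker" ]; then
        # A previous run completed; nothing to do.
        return 0
    fi
    # No marker: either a first run or an interrupted one, so generate again.
    # Leftover partial files from an interrupted run are NOT cleaned up here.
    "$@" "$jwkdir" || return 1
    touch "$marker"
}
```

Because the marker is written last, an interrupted run leaves no marker and generation is retried on the next start, regardless of whether the directory is empty.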