Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make escrow of encrypted storage key more reliable #3770

Merged
merged 5 commits into from
Feb 23, 2024

Conversation

eriknordmark
Copy link
Contributor

@eriknordmark eriknordmark commented Feb 22, 2024

Five related commits but the key fix is the 4th one:

  1. Remove interlock with controlled for EdgeNodeCerts
    In
    f40d6d4
    and modified in
    9ef0f05
    EVE got an interlock to not fetch configuration from the controller
    until after it has sent the EdgeNodeCerts. This was done since some
    controller would send incomplete information when it could not do ECDH
    and hence not do object encryption. That controller has now been fixed
    (and is producing errors of the form "empty device certificate in create
    cipher block method" in this case). Thus we can remove the interlock.

However, we still keep the retry logic from the second commit.

  1. vaultmgr: export IsTpmEnabled with EncryptedVaultKeyFromDevice

To avoid false logging of success in the next commit

  1. zedagent: track publishedAttestEscrow for purposes of logging

  2. Fix 'Skip triggering attest state machine before entering main loop'
    The refactoring in
    5832ab1
    accidentally activated the subscriptions to AttestQuote and
    EncryptedKeyFromDevice before the attest engine was started. As a result
    we get the above log message and the encrypted storage key is not
    reliably escrowed in the controller.

Here we start the attest engine before the subscriptions.

  1. Add more clear log when EdgeNodeCerts not accepted by controller

In
lf-edge@f40d6d4
and modified in
lf-edge@9ef0f05
EVE got an interlock to not fetch configuration from the controller
until after it has sent the EdgeNodeCerts. This was done since some
controller would send incomplete information when it could not do ECDH
and hence not do object encryption. That controller has now been fixed
(and is producing errors of the form "empty device certificate in create
cipher block method" in this case). Thus we can remove the interlock.

However, we still keep the retry logic from the second commit.

Signed-off-by: eriknordmark <erik@zededa.com>
Signed-off-by: eriknordmark <erik@zededa.com>
@eriknordmark eriknordmark added the stable Should be backported to stable release(s) label Feb 22, 2024
Copy link

codecov bot commented Feb 22, 2024

Codecov Report

Attention: Patch coverage is 0% with 19 lines in your changes are missing coverage. Please review.

Project coverage is 19.70%. Comparing base (ec0638e) to head (e13e830).

Files Patch % Lines
pkg/pillar/cmd/zedagent/zedagent.go 0.00% 12 Missing ⚠️
pkg/pillar/cmd/zedagent/attesttask.go 0.00% 4 Missing ⚠️
pkg/pillar/cmd/zedagent/handleconfig.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3770      +/-   ##
==========================================
+ Coverage   19.69%   19.70%   +0.01%     
==========================================
  Files         235      235              
  Lines       51710    51713       +3     
==========================================
+ Hits        10182    10189       +7     
+ Misses      40787    40784       -3     
+ Partials      741      740       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@rouming rouming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is more or less clear to me, but I'm wondering why all these is called "escrow"? Is that a common terminology for key publishing and attestation?

@eriknordmark
Copy link
Contributor Author

The fix is more or less clear to me, but I'm wondering why all these is called "escrow"? Is that a common terminology for key publishing and attestation?

The notion of key escrow is common; a way to keep a key safe even if e.g., an employee resigns or gets hit by a bus. You can then recover the key from the escrow data.

Our setup is different in that we escrow an encrypted key so that only the same instance (the same TPM chip) can recover the key. Hence names like EncryptedKeyFromDevice in the code and encrypted_key in the EVE API.

Signed-off-by: eriknordmark <erik@zededa.com>
The refactoring in
lf-edge@5832ab1
accidentally activated the subscriptions to AttestQuote and
EncryptedKeyFromDevice before the attest engine was started. As a result
we get the above log message and the encrypted storage key is not
reliably escrowed in the controller.

Here we start the attest engine before the subscriptions.

Signed-off-by: eriknordmark <erik@zededa.com>
Signed-off-by: eriknordmark <erik@zededa.com>
@shjala
Copy link
Member

shjala commented Feb 23, 2024

I personally prefer the name "Wrapped Key" but I guess we keep the term escrow for historical reasons... .
Except one question, LGTM.

@eriknordmark eriknordmark merged commit 344853e into lf-edge:master Feb 23, 2024
35 of 37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stable Should be backported to stable release(s)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants