Write up requirements

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
This commit is contained in:
Gilles Peskine 2022-02-14 23:55:59 +01:00
parent eec6b2c6b4
commit 41d0334b4c

View File

@ -1,10 +1,78 @@
Thread safety of the PSA key store
Thread safety of the PSA subsystem
==================================
Analysis of the behavior of the PSA key store as of Mbed TLS 9202ba37b19d3ea25c8451fd8597fce69eaa6867.
## Requirements
### Backward compatibility requirement
Code that is currently working must keep working. There can be an exception for code that uses features that are advertised as experimental; for example, it would be annoying but ok to add extra requirements for drivers.
In particular, if you build with `MBEDTLS_PSA_CRYPTO_C` and `MBEDTLS_THREADING_C` and you either protect all PSA calls with a mutex, or only ever call PSA functions from a single thread, your application works.
As a consequence, we must not add a new platform requirement beyond mutexes for the base case. It would be ok to add new platform requirements if they're only needed for PSA drivers, or if they're only performance improvements.
Tempting platform requirements that we cannot add to the default `MBEDTLS_THREADING_C` include:
* Releasing a mutex from a different thread than the one that acquired it. This isn't even guaranteed to work with pthreads.
* New primitives such as semaphores or condition variables.
### Correctness out of the box
If you build with `MBEDTLS_PSA_CRYPTO_C` and `MBEDTLS_THREADING_C`, the code must be functionally correct: no race conditions, deadlocks or livelocks.
The [PSA Crypto API specification](https://armmbed.github.io/mbed-crypto/html/overview/conventions.html#concurrent-calls) defines minimum expectations for concurrent calls. They must work as if they had been executed one at a time, except that the following cases have undefined behavior:
* Destroying a key while it's in use.
* Concurrent calls using the same operation object. (An operation object may not be used by more than one thread at a time. But it can move from one thread to another between calls.)
* Overlap of an output buffer with an input or output of a concurrent call.
* Modification of an input buffer during a call.
Note that while the specification does not define the behavior in such cases, Mbed TLS can be used as a crypto service. It's acceptable if an application can mess itself up, but it is not acceptable if an application can mess up the crypto service. As a consequence, destroying a key while it's in use may violate the security property that all key material is erased as soon as `psa_destroy_key` returns, but it may not cause data corruption or read-after-free inside the key store.
### No spinning
The code must not spin on a potentially non-blocking task. For example, this is proscribed:
```
lock(m);
while (!its_my_turn) {
unlock(m);
lock(m);
}
```
Rationale: this can cause battery drain, and can even be a livelock (spinning forever), e.g. if the thread that might unblock this one has a lower priority.
### Driver requirements
At the time of writing, the driver interface specification does not consider multithreaded environments.
We need to define clear policies so that driver implementers know what to expect. Here are two possible policies at two ends of the spectrum; what is desirable is probably somewhere in between.
* Driver entry points may be called concurrently from multiple threads, even if they're using the same key, and even including destroying a key while an operation is in progress on it.
* At most one driver entry point is active at any given time.
A more reasonable policy could be:
* By default, each driver only has at most one entry point active at any given time. In other words, each driver has its own exclusive lock.
* Drivers have an optional `"thread_safe"` boolean property. If true, it allows concurrent calls to this driver.
* Even with a thread-safe driver, the core never starts the destruction of a key while there are operations in progress on it, and never performs concurrent calls on the same multipart operation.
### Long-term performance requirements
In the short term, correctness is the important thing. We can start with a global lock.
In the medium to long term, performing a slow or blocking operation (for example, a driver call, or an RSA decryption) should not block other threads, even if they're calling the same driver or using the same key object.
We may want to go directly to a more sophisticated approach because when a system works with a global lock, it's typically hard to get rid of it to get more fine-grained concurrency.
### Key destruction long-term requirements
As noted above in [“Correctness out of the box”](#correctness-out-of-the-box), when a key is destroyed, it's ok if `psa_destroy_key` allows copies of the key to live until ongoing operations using the key return. In the long term, it would be good to guarantee that `psa_destroy_key` wipes all copies of the key material.
## Resources to protect
Analysis of the behavior of the PSA key store as of Mbed TLS 9202ba37b19d3ea25c8451fd8597fce69eaa6867.
### Global variables
* `psa_crypto_slot_management::global_data.key_slots[i]`: see [“Key slots”](#key-slots).