update at_api_keys table to allow for a key_id column #135
adamkorynta wants to merge 2 commits into main
Conversation
This will allow for rapid lookup via CDA when attempting to check the secret_hash. Also rename the apikey column to secret_hash in order to explicitly call out that API keys are hashed, and update the secret hash size to 512 to be well over the bounds of Argon2 hashing. Update to only allow moving EXPIRES backwards, as we do not want to be able to resurrect expired keys.

Commits:
- fix apikey view
- fix unit test on at_api_keys
@MikeNeilson - I don't see any usages of set_session_user_apikey, and my understanding is that CDA is now the only place for managing API keys, so it should be safe to delete?

Yeah, I only see one usage, within the test, but CDA validates the key and then just uses the _direct method.
secret_hash varchar2(512) not null,
created timestamp default current_timestamp not null,
expires timestamp default current_timestamp+1,
constraint at_api_keys_pk primary key (key_id),
What's with key_id? I don't see it used anywhere. Seems like a composite key of userid, key_name should be sufficient as a primary key.
We need to be able to send an id in plain text (or encoded) as part of the API key in order for quick lookup. Using userid would be exposing sensitive data, and there's no protection or guidance on key_name to know whether users are putting sensitive data there either.
Yeah, I just read the CDA PR and that makes sense. I put a thought on the topic there.
Though I wonder if having the key ultimately be provided as key_name:key might be the way to go. As you said, there's no other way, and if we do that, then you can just pull the hashed key down and make use of password4j's update mechanism if parameters change. As you said, we'd need to provide some guidance and constraints if we do that.
If we only do key_name, then we'd have to index on key_name for faster search, and I don't think we really want key_name to be unique. Or, if we're actually storing key_name:key, then we'd index on secret_hash and we'd be doing a where secret_hash like 'abc123.%' to filter the results. Adding a raw(16) key_id column would be faster and would be a smaller index.
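A minimal sketch of the raw(16) idea in Python (the helper names are illustrative, not from the PR): generate 16 random bytes as the key_id, store them in the indexed column, and hand the caller a text encoding of the same bytes, so a later lookup is an exact index match rather than a LIKE scan over secret_hash.

```python
import base64
import secrets

def new_key_id() -> tuple[bytes, str]:
    """Generate a random 16-byte key_id (the raw(16) column value)
    and a URL-safe text encoding of it to embed in the issued API key."""
    raw = secrets.token_bytes(16)
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return raw, encoded

def decode_key_id(encoded: str) -> bytes:
    """Recover the raw 16 bytes for an exact-match index lookup."""
    padding = "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded + padding)
```

The round trip is lossless: `decode_key_id(new_key_id()[1])` returns the original 16 bytes, so the encoded form can travel inside the API key while the raw form stays in the indexed column.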
Indexes don't have to be unique indexes. We could still keep the user_id/keyname index, which keeps the search space smaller.
While it is conceptually possible for two Argon2 hashes to be the same, because of the use of the salt the probability is something like 10^-<rather large number>. And we could put the user_id as a prefix into the stored hash for that chance, and then just extract it from the string before passing it into the checker.
Yeah, there are other options if we force the user to provide more information.
I don't follow - we can't extract the user_id from the hash.
I was originally thinking "combine them", e.g. "<user_id>_", but realized it was nonsensical and never went back to correct the thought.
So thinking about this discussion, I think I've realized why S3 auth is access_key + access_secret.
So if we keep key_id, but randomly generate it and then encode it to text somehow, we can provide that as a "username".
The new Maven Central API does the same thing. To simplify (on the CDA side), perhaps we can output the apikey column as <key_id>: so that everyone still just copy-pastes it and we don't need to update the DTO.
I think we're on the same page now. My intention was to return the "apikey" as ak1_<encoded_key_id>:<actual_key>. Then on read we'd strip out ak1_ as the version, decode <encoded_key_id> to match on the key_id column, and then Argon2-check the provided key against the stored hash. <encoded_key_id> would be randomly generated and fixed-size, just like the apikey currently is.
That makes sense. Otherwise we're just guessing which key is which, even if the collision odds are exceptionally low.
Only allow moving the EXPIRES field backwards, as we do not want to allow resurrection of expired keys.