Skip to content

update at_api_keys table to allow for a key_id column#135

Open
adamkorynta wants to merge 2 commits intomainfrom
feature/update_apikeys_kid
Open

update at_api_keys table to allow for a key_id column#135
adamkorynta wants to merge 2 commits intomainfrom
feature/update_apikeys_kid

Conversation

@adamkorynta
Copy link
Copy Markdown
Contributor

  • this will allow for rapid lookup via CDA when attempting to check secret_hash
  • also rename apikey column to secret_hash in order to explicitly call out apikeys being hashed.
  • update secret hash size to 512 to be well over the bounds of argon2 hashing.
  • Update to only allow moving the EXPIRES field backwards as we do not want to allow resurrection of expired keys
  • remove uniqueness constraint on the apikey hash as collisions may be possible

this will allow for rapid lookup via CDA when attempting to check secret_hash
also rename apikey column to secret_hash in order to explicitly call out apikeys being hashed.
update secret hash size to 512 to be well over the bounds of argon2 hashing.
Update to only allow moving the EXPIRES backwards as we do not want to be able to resurrect expired keys
fix apikey view
fix unit test on at_api_keys
@adamkorynta
Copy link
Copy Markdown
Contributor Author

@MikeNeilson - I don't see any usages of set_session_user_apikey and my understanding is that CDA is now the only place for managing apikeys so it should be safe to delete?

@adamkorynta adamkorynta requested a review from MikeNeilson April 3, 2026 22:58
@MikeNeilson
Copy link
Copy Markdown
Contributor

@MikeNeilson - I don't see any usages of set_session_user_apikey and my understanding is that CDA is now the only place for managing apikeys so it should be safe to delete?

Yeah, I only see one usage within the test, but CDA validates the key then just uses the _direct method.

secret_hash varchar2(512) not null,
created timestamp default current_timestamp not null,
expires timestamp default current_timestamp+1,
constraint at_api_keys_pk primary key (key_id),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's with key_id? I don't see it used anywhere. Seems like composite key of userid, key_name should be sufficient as a primary key.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be able to send an id in plain text (or encoded) as part of the apikey in order for quick lookup. Using userid would be escaping sensitive data, and there's no protection or guidance on key_name to know whether users are escaping sensitive data there either.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just read the cda PR and that makes sense. I put a though on the topic there.

Though I wonder if having the key ultimately be provided as key_name:key might be the way to go. As you said, there's no other way. and if we do that, then you can just pull the hashed key down and make use of password4j's update mechanism if parameters change. As you said, we'd need to provide some guidance and constraints if we do that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only do key_name then we'd have to index on key_name for faster search and I don't think we really want key_name to be unique - or if we're actually storing key_name:key then we'd index on secret_hash and we'd be doing a where secret_hash like 'abc123.%' to filter the results. Adding a raw(16) key_id column would be faster and would be a smaller index.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indexes don't have to be unique indexes. We could still keep the user_id/keyname index which keeps the search space smaller.

While it is conceptually possible for two argon2 hashes to be the same, because of the use of the salt it is like something 10^-<rather large number> probability. and we could put the user_id as a prefix into the stored hash for that chance and then just extract it from the string before passing it into the checker.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there are other options if we force the user to provide more information

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow - we can't extract the user_id from the hash

I was originally thinking "combine them", e.g. "<user_id>_" but realized it was non-sensical and never went back to correct the thought.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So thinking about this discussion, I think I've realized why S3 auth is access_key + access_secret.

So if we keep key_id, but randomly generate it and then encode it to text somehow, and provide that as a "username".

the new maven central api does the same thing. To simplify (on the CDA side) perhaps we can output the apikey column as <key_id>: so that everyone still just copy pastes it and we don't need to update the DTO.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're on the same page now. My intention was to return the "apikey" as ak1_<encoded_key_id>:<actual_key>. Then on read we'd strip out ak1_ as the version, <encoded_key_id> decoded to match on the key_id column, and then argon2 hash check the stored hash against the apikey. <encoded_key_id> would be randomly generated and fixed size just like the apikey currently is.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Otherwise we're just guessing which key is what, even if the collision odds are exceptionally low.

@adamkorynta adamkorynta marked this pull request as ready for review April 7, 2026 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants