feat: add calculate-memory command to actor#980
Conversation
| writeFileSync(joinPath(LOCAL_CONFIG_PATH), JSON.stringify(actorJson, null, '\t'), { flag: 'w' }); | ||
| }; | ||
|
|
||
| describe('apify actor calculate-memory', () => { |
There was a problem hiding this comment.
@vladfrangu when I start it one by one it works perfectly, but when I start it as as a whole set it fails, can you help me to fix that, please? 🙃
It seems there's something wrong with path or existence of the file. I was a bit confused with that useTempPath mocks...
tobice
left a comment
There was a problem hiding this comment.
Nice 👏 Left some suggestions though.
| import { error, info, success } from '../../lib/outputs.js'; | ||
| import { getJsonFileContent } from '../../lib/utils.js'; | ||
|
|
||
| export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> { |
There was a problem hiding this comment.
What is this magic? Why is it basically extending itself? 😅
There was a problem hiding this comment.
It's used for types of args and flags, since they're different per command 😄
maybe @vladfrangu would add more info here
There was a problem hiding this comment.
It's not extending itself but rather referencing itself in the extension class -> allows us to infer the types of the flags and args for this.args/this.flags
| private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json'; | ||
|
|
||
| static override description = | ||
| `Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; |
There was a problem hiding this comment.
| `Calculates the actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; | |
| `Calculates the Actor’s dynamic memory usage based on a memory expression from actor.json, input data, and run options.`; |
| description: 'Actor build version or build tag to evaluate the expression with.', | ||
| required: false, | ||
| }), | ||
| timeoutSecs: Flags.integer({ |
There was a problem hiding this comment.
I can imagine this being slightly confusing (because it is confusing to me 😅 ).
Intuitively, why should restartOnError affect memory?
The only slightly related fields from run options are maxTotalChargeUsd or memoryMbytes but that latter one is not even here. And I forgot what it was supposed to do. Can you remind me? And is it intentional that it is not here?
The bottom line is that I find it really weird to have these top level params that have zero effect and are there simply because they are part of Actor run options.
I thought about dropping the options altogether from the memory calculation (that'd be a bigger change) but I guess maxTotalChargeUsd is useful. And we don't want to get into the business of deciding what is useful and what isn't.
I just wonder, is it possible to provide these also as a JSON file / object? Which would be probably inconsistent with other CLI commands but would make more intuitive sense.
There was a problem hiding this comment.
Makes sense, I was also thinking about that as well. I'd not rather provide options as JSON, probably not many devs would need it.
The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.
Regarding memoryMbytes and diskMbytes I thought it would be very confusing and just removed them from flags, we can do the same for restartOnError.
IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.
There was a problem hiding this comment.
The idea behind providing path to input.json is because new Actors already have default input and by providing defaultMemoryMbytes in actor.json devs can call apify actor calculate-memory without any flags.
Totally agree here 👍
Regarding memoryMbytes
Remind me, what is memoryMbytes doing?
IMHO it's cleaner for me to provide options as flags from DX point of view, but if you feel like it would be better to provide them in additional file I can do that as well.
From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄
There was a problem hiding this comment.
Remind me, what is memoryMbytes doing?
It's defaultRunOptions.memoryMbytes - fixed memory provided by the developer in settings.
From DX, I'd argue that providing a param that doesn't do anything is at least a little weird 😄
Makes sense, I removed restartOnError from flags and left a comment explaining this
| /** | ||
| * Helper to load the `defaultMemoryMbytes` expression from actor.json. | ||
| */ | ||
| private async getExpressionFromConfig(): Promise<string | undefined> { |
There was a problem hiding this comment.
(nit) There seems to be a lot of boilerplate and error handling for what seems to be a rather generic functionality: get Actor config 🤔 There is no existing utility?
There was a problem hiding this comment.
Well we use one functionality useActorConfig, that already exists 😄 I'd wait for review from @vladfrangu here, I just couldn't find it
There was a problem hiding this comment.
This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).
I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top
| } | ||
|
|
||
| if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) { | ||
| throw new Error( |
There was a problem hiding this comment.
Why throwing here but using process.exitCode above? 🤔
I assume it's a CLI thing but I don't follow the reasoning 😄
There was a problem hiding this comment.
Good point, again let's wait for best practices from tooling team. I was not sure about this one as well 🙃
There was a problem hiding this comment.
we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...
RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])
| expect(lastErrorMessage()).toMatch(/Calculated memory: 512 MB/); | ||
| }); | ||
|
|
||
| it('should calculate memory using expression from .actor.json', async () => { |
There was a problem hiding this comment.
| it('should calculate memory using expression from .actor.json', async () => { | |
| it('should calculate memory using expression from actor.json', async () => { |
| it('should calculate memory using defaultMemoryMbytes flag', async () => { | ||
| await testRunCommand(ActorCalculateMemoryCommand, { | ||
| flags_input: 'INPUT.json', | ||
| flags_defaultMemoryMbytes: '512', |
There was a problem hiding this comment.
It's confusing that when providing the expression via a flag, you use just a constant value, but when providing from actor.json, you use an actual expression. That creates the impression that it works differently but as far as I can tell, it doesn't?
I'd just use the same thing in both (actually all, dtto below) cases.
What you can do is something like:
const START_URLS_LENGTH_BASED_MEMORY_EXPRESSION = 'get(input, 'startUrls.length') * 1024';And then reuse it across all cases. That will make it super clear that the expression itself doesn't really matter and remains constant across cases.
|
@danpoletaev Can you also include another person from our team for the sake of our bus factor on the feature? 🙏 |
vladfrangu
left a comment
There was a problem hiding this comment.
I can help out with the tests if they're still causing you lots of issues 🫡, just lmk!
| export class ActorCalculateMemoryCommand extends ApifyCommand<typeof ActorCalculateMemoryCommand> { | ||
| static override name = 'calculate-memory' as const; | ||
|
|
||
| private readonly DEFAULT_INPUT_PATH = 'storage/key_value_stores/default/INPUT.json'; |
There was a problem hiding this comment.
Please use path.join + getLocalKeyValueStorePath
There was a problem hiding this comment.
Also this can be moved to a constant outside the class
| input: Flags.string({ | ||
| description: 'Path to the input JSON file used for the calculation.', | ||
| required: false, | ||
| default: 'storage/key_value_stores/default/INPUT.json', |
There was a problem hiding this comment.
Either reference the path via this. (/ const var name if moved), or don't provide a default at all and set it in code when destructuring the flag (const { input = DEFAULT_INPUT_PATH } = this.flags)
| required: false, | ||
| }), | ||
| restartOnError: Flags.boolean({ | ||
| description: 'Whether to restart the actor on error.', |
There was a problem hiding this comment.
| description: 'Whether to restart the actor on error.', | |
| description: 'Whether the Actor will be restarted on error.', |
| /** | ||
| * Helper to load the `defaultMemoryMbytes` expression from actor.json. | ||
| */ | ||
| private async getExpressionFromConfig(): Promise<string | undefined> { |
There was a problem hiding this comment.
This is the existing utility, and it handles multiple cases (invalid config file, non-existing file or existing file).
I do think you could probably skip the exists == false check tho... It will just return undefined -> bubble up top
| } | ||
|
|
||
| if (localConfigResult.isOkAnd((cfg) => cfg.exists === false)) { | ||
| throw new Error( |
There was a problem hiding this comment.
we set process.exitCode in certain cases for those who use shells and $? to differentiate exit reasons...
RE: why throw -> I honestly want to move away from throws and more towards handling errors in the cmd level (building one generic catch handler for all errors is rough, unless we build custom error classes [but at that point just implement the nicer error msgs where they're thrown])
| beforeEach(async () => { | ||
| await beforeAllCalls(); | ||
| toggleCwdBetweenFullAndParentPath(); | ||
| }); | ||
|
|
||
| afterEach(async () => { | ||
| await afterAllCalls(); | ||
| }); | ||
|
|
There was a problem hiding this comment.
I don't think you should call toggle every time, right?
| input: inputJson, | ||
| runOptions, | ||
| }); | ||
| success({ message: `Calculated memory: ${result} MB` }); |
There was a problem hiding this comment.
provide the stdout: true option (and adjust the test accordingly) 🙏
| }); | ||
|
|
||
| it('should fallback to default input path if input flag is not provided', async () => { | ||
| const defaultInputPath = joinPath('storage/key_value_stores/default'); |
There was a problem hiding this comment.
use multiple args passed to this function, not a full path, for OS compat
| it('should fail when default memory is not provided and actor.json is missing', async () => { | ||
| await testRunCommand(ActorCalculateMemoryCommand, {}); | ||
|
|
||
| expect(lastErrorMessage()).toMatch(/actor.json not found at/); |
There was a problem hiding this comment.
| expect(lastErrorMessage()).toMatch(/actor.json not found at/); | |
| expect(lastErrorMessage()).toMatch(/actor\.json not found at/); |
| }); | ||
|
|
||
| // INPUT.json does not exist, so input.startUrls.length will be undefined, defaulting to 1 | ||
| expect(lastErrorMessage()).toMatch(/Calculated memory: 1024 MB/); |
There was a problem hiding this comment.
I would assert only the memory requested, not the whole msg
| ```sh | ||
| DESCRIPTION | ||
| Creates an Actor project from a template in a new directory. | ||
| Creates an Actor project from a template in a new directory. The command |
There was a problem hiding this comment.
Why these changes? They seem unrelated.
There was a problem hiding this comment.
This was caused by automatic doc generation, so I think it is fine
This PR updates the dynamic run memory calculation (added in #980) to respect the memory limits set in `actor.json`. The final memory value is now clamped between `minMemoryMbytes` and `maxMemoryMbytes`. In case user provides `defaultMemoryMbytes` via flag, we're not checking for min/max memory from config, because it feels confusing.
This PR adds new command to cli to allow developers to test their
defaultMemoryMbytesexpressions. Those expressions are used to dynamically calculate memory of the run based on theinputandrunOptions.E.g. of usage
Full specification in Notion 👇
https://www.notion.so/apify/Dynamic-default-Actor-memory-WIP-293f39950a2280b9aa34fa9cd07f52a4?source=copy_link