Implemented the ability to execute any postprocess provided by the plugin after each instruction #2172

Open
kseniadobrovolskaya wants to merge 3 commits into riscv-software-src:master from kseniadobrovolskaya:kseniadobrovolskaya/instr-postprocess-plugins

@kseniadobrovolskaya

Contributor

Here is the easiest way to add the agnostic behavior discussed in this issue: Implementation of a special case of an agnostic element behavior policy #2061

Added extensions to implement:
1. Mask agnostic behavior of filling with 1s - xspikema1s
2. Tail agnostic behavior of filling with 1s - xspiketa1s
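For context, here is a minimal sketch of what "filling with 1s" means for the two policies. It is not Spike code: the helper names are hypothetical, and it assumes a single destination register modelled as a byte vector (SEW = 8) with vstart = 0. Tail elements are those at index >= vl; inactive elements are body elements whose mask bit is clear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Spike): overwrite tail elements
// (index >= vl) with all 1s, one of the outcomes a tail-agnostic
// policy permits.
inline void fill_tail_with_ones(std::vector<uint8_t>& vd, std::size_t vl) {
  assert(vl <= vd.size());
  for (std::size_t i = vl; i < vd.size(); ++i)
    vd[i] = 0xFF;
}

// Hypothetical helper (not part of Spike): overwrite inactive body elements
// (index < vl, mask bit clear) with all 1s, one of the outcomes a
// mask-agnostic policy permits.
inline void fill_inactive_with_ones(std::vector<uint8_t>& vd,
                                    const std::vector<bool>& mask,
                                    std::size_t vl) {
  assert(vl <= vd.size() && vl <= mask.size());
  for (std::size_t i = 0; i < vl; ++i)
    if (!mask[i])
      vd[i] = 0xFF;
}
```

Active elements are left untouched by both helpers, so they can be applied in either order after the instruction body has run.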

@kseniadobrovolskaya kseniadobrovolskaya force-pushed the kseniadobrovolskaya/instr-postprocess-plugins branch from 732432c to 7ade0f4 Compare December 2, 2025 12:13
@kseniadobrovolskaya kseniadobrovolskaya force-pushed the kseniadobrovolskaya/instr-postprocess-plugins branch from 7ade0f4 to dc0788b Compare December 2, 2025 12:51
@kseniadobrovolskaya

Contributor Author

@aswaterman, take a look please

@arromanoff arromanoff left a comment

Collaborator

Instruction post-processing LGTM. I'd like someone else to take a look as well

@arromanoff

Collaborator

@aswaterman, @nibrunieAtSi5 Could you review this please?

@aswaterman aswaterman left a comment

Collaborator

I have a few issues with this approach. First, it adds several instructions to the critical execution path--even iterating over an empty list on every instruction materially slows down the simulator. Whatever we do shouldn't have a performance effect on instructions that don't use the feature.

Second, although I am sympathetic to the desire to model a wider variety of implementation-defined behaviors, I'm not convinced that the pattern of a postprocessing hook is either necessary or sufficient to meet our needs. It does work OK for this particular case, but it's more general than necessary for this case. And still, there are other cases for which it doesn't suffice.

Defining a custom extension to modify this behavior does make sense to me, but the way I'd do it is to have the custom instructions redefine these instructions, rather than adding a hook. (Recall, custom extensions in Spike can redefine standard opcodes, since custom extensions are always searched first for an opcode match.) This approach also has the advantage of not requiring any Spike modifications.
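The lookup order described above can be illustrated with a toy decoder. This is a sketch under stated assumptions, not Spike's actual decoder or extension API: `InsnDesc` and `ToyDecoder` are hypothetical types, and the only point is that a table searched first can shadow a standard encoding without touching the standard table.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical decoder entry: an instruction matches when
// (bits & mask) == match.
struct InsnDesc {
  uint32_t match;
  uint32_t mask;
  std::function<std::string()> execute;  // returns a label for illustration
};

// Hypothetical decoder with two tables; custom entries are searched first.
struct ToyDecoder {
  std::vector<InsnDesc> custom;
  std::vector<InsnDesc> standard;

  static const InsnDesc* find(const std::vector<InsnDesc>& table,
                              uint32_t bits) {
    for (const auto& d : table)
      if ((bits & d.mask) == d.match)
        return &d;
    return nullptr;
  }

  // A custom entry with the same match/mask shadows the standard one.
  const InsnDesc* decode(uint32_t bits) const {
    if (const InsnDesc* d = find(custom, bits))
      return d;
    return find(standard, bits);
  }
};
```

Registering a custom entry with the same match/mask pair as a standard instruction is enough to take over its execution, which is why this route needs no change to the core decode path.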

@kseniadobrovolskaya

kseniadobrovolskaya commented Dec 11, 2025

Contributor Author

@aswaterman, great idea! Thanks for the recommendation!

I suggest adding the following extensions:

  1. Xspikea - redefines all RVV instructions to add arbitrary post-processing. This extension defines an agnostic_postprocesses vector that can hold any postprocessing functions, so the definition of each instruction looks something like this:
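// V_INST is a placeholder for the name of any vector instruction.
reg_t execute_V_INST(processor_t* p, insn_t insn, reg_t pc)
{
  // Standard instruction body.
  #include "riscv/insns/V_INST.h"
  // Run every registered postprocess after the instruction body.
  for (auto postproc : agnostic_postprocesses)
    postproc(p, insn, pc);
  return pc;
}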

This extension does not modify the behavior of vector instructions on its own (because the agnostic_postprocesses vector is empty), but it allows other custom extensions to add their own postprocessing to the vector.
Extensions that implement a particular agnostic behavior (filling inactive elements and tails) can add their own postprocessing functions to the agnostic_postprocesses vector.

  2. Xspiketa1s - adds a function to agnostic_postprocesses that fills the tail with all 1s.
  3. Xspikema1s - adds a function to agnostic_postprocesses that fills inactive elements with all 1s.

This approach does not require any core-code modification and still allows us to combine different postprocessing behaviors.
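How the postprocessing behaviors combine can be sketched in isolation. Everything below is a hypothetical model rather than the PR's code: `ToyVectorState` stands in for the processor state, `Postprocess` for the hook signature, and the two `register_*` functions for what Xspiketa1s-like and Xspikema1s-like extensions might do when loaded.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the state a postprocess would touch (SEW = 8).
struct ToyVectorState {
  std::vector<uint8_t> vd;
  std::vector<bool> mask;
  std::size_t vl;
};

using Postprocess = std::function<void(ToyVectorState&)>;

// Plays the role of the agnostic_postprocesses vector; empty by default.
inline std::vector<Postprocess>& agnostic_postprocesses() {
  static std::vector<Postprocess> hooks;
  return hooks;
}

// What a tail-filling extension might register (assumption).
inline void register_tail_ones() {
  agnostic_postprocesses().push_back([](ToyVectorState& s) {
    for (std::size_t i = s.vl; i < s.vd.size(); ++i)
      s.vd[i] = 0xFF;
  });
}

// What a mask-filling extension might register (assumption).
inline void register_mask_ones() {
  agnostic_postprocesses().push_back([](ToyVectorState& s) {
    for (std::size_t i = 0; i < s.vl; ++i)
      if (!s.mask[i])
        s.vd[i] = 0xFF;
  });
}

// Redefined instruction: run the base body, then every registered hook.
inline void execute_with_postprocess(
    ToyVectorState& s, const std::function<void(ToyVectorState&)>& base) {
  base(s);
  for (const auto& post : agnostic_postprocesses())
    post(s);
}
```

With no extension registered the wrapper behaves exactly like the base instruction; registering one or both hooks layers the fills on top, independent of registration order because the two hooks touch disjoint elements.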

  Added extensions:
    1. Xspiketama - redefines all rvv instructions by adding arbitrary post-processing (agnostic_postprocesses vector).
    2. Xspiketa1s - adds a function to agnostic_postprocesses that fills the tail with all 1s.
    3. Xspikema1s - adds a function to agnostic_postprocesses that fills inactive elements with all 1s.
@aswaterman

Collaborator

Sorry for the very delayed response. I think we are getting on a better track here, but I'd like to see a simple prototype first. To avoid wasted work, let's restrict it to one instruction, e.g. vadd.vv.

@kseniadobrovolskaya

Contributor Author

> I think we are getting on a better track here, but I'd like to see a simple prototype first. To avoid wasted work, let's restrict it to one instruction, e.g. vadd.vv.

Done.

@kseniadobrovolskaya

Contributor Author

@aswaterman, what do you think about this implementation of your proposal?

@aswaterman

Collaborator

I think this is a valid direction, but now I'm struggling with two different concerns:

  • This will be hard to maintain in the long term. Whenever new vector instructions are added, they need to be added in multiple places.
  • The compile time will increase considerably since we need to build every vector instruction multiple times.

I am playing around with a hybrid idea:

  • Add the mask/tail processing hooks to the various vector macros directly. This avoids the redundant definition of the vector instructions.
  • Use the custom-extension approach we've discussed to provide the implementation of those hooks.

There will be a slight perf reduction when running vector code, but not all that much, given that vector instructions are fairly heavyweight to begin with. The more important thing is that the performance of scalar instructions is unaffected.

I will try to prototype an example, and if we decide that we like it, I will throw it over the wall for you to help me finish it up.
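One possible shape of the hybrid idea, sketched with hypothetical names only (`TOY_VV_LOOP`, `ToyAgnosticHooks`, and `toy_hooks` are not Spike macros or types, and this is not the prototype mentioned above): the shared vector loop macro consults optional hooks for inactive and tail elements, an extension would fill those hooks in, and scalar code contains no hook check at all.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vector state (SEW = 8, vstart = 0).
struct ToyVState {
  std::vector<uint8_t> vd;
  std::vector<bool> mask;
  std::size_t vl;
};

// Hook table an extension could populate; null means "leave undisturbed".
struct ToyAgnosticHooks {
  void (*on_inactive)(ToyVState&, std::size_t) = nullptr;
  void (*on_tail)(ToyVState&, std::size_t) = nullptr;
};

inline ToyAgnosticHooks& toy_hooks() {
  static ToyAgnosticHooks hooks;
  return hooks;
}

// Hypothetical shared loop macro: the hook checks live here once, so each
// vector instruction is still defined in exactly one place.
#define TOY_VV_LOOP(state, BODY)                          \
  do {                                                    \
    ToyVState& s_ = (state);                              \
    const ToyAgnosticHooks& h_ = toy_hooks();             \
    for (std::size_t i = 0; i < s_.vd.size(); ++i) {      \
      if (i >= s_.vl) {                                   \
        if (h_.on_tail) h_.on_tail(s_, i);                \
      } else if (!s_.mask[i]) {                           \
        if (h_.on_inactive) h_.on_inactive(s_, i);        \
      } else {                                            \
        BODY;                                             \
      }                                                   \
    }                                                     \
  } while (0)

// A vector instruction written against the macro.
inline void toy_vadd_vv(ToyVState& s, const std::vector<uint8_t>& vs1,
                        const std::vector<uint8_t>& vs2) {
  TOY_VV_LOOP(s, s.vd[i] = static_cast<uint8_t>(vs1[i] + vs2[i]));
}

// A scalar instruction: no hook lookup anywhere on this path.
inline uint64_t toy_add(uint64_t a, uint64_t b) { return a + b; }
```

In this model the extra cost is a couple of pointer checks per non-active element inside vector loops only, which matches the expectation of a slight slowdown for vector code and none for scalar code.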
