Group A: Implemented optimized matrix multiplication #4

Open
Artorias17 wants to merge 5 commits into AA-parallel-computing:main from Artorias17:abhishek-roy

@Artorias17

Group A:

  • Ha Do (Student ID: 2402703)
  • Abhishek Roy (Student ID: 2502895)

Implemented and optimized the matrix multiplication. Implementation details are in the README.md file. The main speedup came from two changes: reordering the loops from i -> j -> k to i -> k -> j, which makes the memory access pattern row-wise for all three matrices, and adding compiler flags for aggressive optimization, native SIMD, and loop unrolling.
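To illustrate the loop reordering described above, here is a minimal sketch in C. It is not the code from this PR: the function names `matmul_ijk` and `matmul_ikj`, the use of square `n x n` matrices of `double`, and the flat row-major layout are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical baseline, i -> j -> k ordering.
 * The inner loop reads b[k * n + j] with stride n, i.e. it walks down
 * a column of b, which is unfriendly to the cache for row-major data. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Hypothetical reordered version, i -> k -> j ordering.
 * The inner loop now walks row k of b and row i of c with stride 1,
 * and a is read along row i, so all three matrices are accessed row-wise. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    /* c is accumulated into, so it must start at zero. */
    for (size_t idx = 0; idx < n * n; idx++)
        c[idx] = 0.0;

    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            const double a_ik = a[i * n + k]; /* constant across the inner loop */
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += a_ik * b[k * n + j];
        }
    }
}
```

The exact compiler flags used in this PR are listed in the README, not here. With GCC, flags matching the description would be something like `-O3 -march=native -funroll-loops`, but treat that combination as an assumption. The stride-1 inner loop in `matmul_ikj` is also the form that compilers can typically auto-vectorize.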

System used for testing:

  • CPU: AMD Ryzen 7 8845HS
  • Architecture: x86-64
  • Cores: 8
  • Threads: 16
