2 changes: 2 additions & 0 deletions BIBLIOGRAPHY.md
@@ -179,6 +179,7 @@ source code and documentation.
- [dev/aarch64_opt/src/ntt.S](dev/aarch64_opt/src/ntt.S)
- [mldsa/src/native/aarch64/src/intt.S](mldsa/src/native/aarch64/src/intt.S)
- [mldsa/src/native/aarch64/src/ntt.S](mldsa/src/native/aarch64/src/ntt.S)
- [proofs/hol_light/aarch64/mldsa/mldsa_ntt.S](proofs/hol_light/aarch64/mldsa/mldsa_ntt.S)

### `REF`

@@ -284,6 +285,7 @@ source code and documentation.
- [dev/aarch64_opt/src/ntt.S](dev/aarch64_opt/src/ntt.S)
- [mldsa/src/native/aarch64/src/intt.S](mldsa/src/native/aarch64/src/intt.S)
- [mldsa/src/native/aarch64/src/ntt.S](mldsa/src/native/aarch64/src/ntt.S)
- [proofs/hol_light/aarch64/mldsa/mldsa_ntt.S](proofs/hol_light/aarch64/mldsa/mldsa_ntt.S)

### `libmceliece`

6 changes: 3 additions & 3 deletions dev/aarch64_clean/src/polyz_unpack_17_asm.S
@@ -69,9 +69,9 @@ polyz_unpack_17_loop:
// 3-register ld1 would load 48 bytes, but only 36 are
// consumed per iteration. The TBL indices for v2 are
// adjusted to account for v2's load offset.
-ld1 {v0.16b, v1.16b}, [buf]
-add buf, buf, #0x14
-ld1 {v2.16b}, [buf], #0x10
+ld1 {v0.16b, v1.16b}, [buf]
+add buf, buf, #0x14
+ld1 {v2.16b}, [buf], #0x10

tbl v4.16b, {v0.16b}, idx0.16b
tbl v5.16b, {v0.16b - v1.16b}, idx1.16b
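
The comment in this hunk explains why the loop uses a split load instead of a 3-register `ld1`. As a rough sanity check of that arithmetic, here is a minimal Python sketch that models only the three load instructions shown in the hunk as byte ranges relative to `buf`. The helper names `split_load_model` and `v2_lane` are hypothetical and exist only for this illustration. The sketch assumes that a byte's lane inside `v2` is its offset minus the second load's offset, which is one way to read "adjusted to account for v2's load offset".

```python
def split_load_model(second_load_offset, reg_bytes=16):
    """Model one loop iteration's loads as byte ranges relative to buf.

    second_load_offset is the immediate of the `add buf, buf, #imm`
    that sits between the two ld1 instructions (an assumption of this model).
    Returns (bytes_touched, pointer_advance).
    """
    # ld1 {v0.16b, v1.16b}, [buf] reads two 16-byte registers from buf.
    first = range(0, 2 * reg_bytes)
    # After the add, ld1 {v2.16b}, [buf], #0x10 reads 16 bytes
    # starting at second_load_offset and post-increments buf by 16.
    second = range(second_load_offset, second_load_offset + reg_bytes)
    bytes_touched = max(first.stop, second.stop)
    pointer_advance = second_load_offset + reg_bytes
    return bytes_touched, pointer_advance


def v2_lane(byte_offset, second_load_offset, reg_bytes=16):
    """Lane of v2 that holds the input byte at byte_offset (hypothetical helper)."""
    lane = byte_offset - second_load_offset
    if not 0 <= lane < reg_bytes:
        raise ValueError("byte is not covered by the v2 load")
    return lane
```

With an offset of `0x14` the model reports 36 bytes touched and a 36-byte advance, so nothing past the consumed input is read. With `0x18`, the offset used in the `polyz_unpack_19` files, both values are 40.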
6 changes: 3 additions & 3 deletions dev/aarch64_clean/src/polyz_unpack_19_asm.S
@@ -66,9 +66,9 @@ polyz_unpack_19_loop:
// 3-register ld1 would load 48 bytes, but only 40 are
// consumed per iteration. The TBL indices for v2 are
// adjusted to account for v2's load offset.
-ld1 {v0.16b, v1.16b}, [buf]
-add buf, buf, #0x18
-ld1 {v2.16b}, [buf], #0x10
+ld1 {v0.16b, v1.16b}, [buf]
+add buf, buf, #0x18
+ld1 {v2.16b}, [buf], #0x10

tbl v4.16b, {v0.16b}, idx0.16b
tbl v5.16b, {v0.16b - v1.16b}, idx1.16b
6 changes: 3 additions & 3 deletions dev/aarch64_opt/src/polyz_unpack_17_asm.S
@@ -69,9 +69,9 @@ polyz_unpack_17_loop:
// 3-register ld1 would load 48 bytes, but only 36 are
// consumed per iteration. The TBL indices for v2 are
// adjusted to account for v2's load offset.
-ld1 {v0.16b, v1.16b}, [buf]
-add buf, buf, #0x14
-ld1 {v2.16b}, [buf], #0x10
+ld1 {v0.16b, v1.16b}, [buf]
+add buf, buf, #0x14
+ld1 {v2.16b}, [buf], #0x10

tbl v4.16b, {v0.16b}, idx0.16b
tbl v5.16b, {v0.16b - v1.16b}, idx1.16b
6 changes: 3 additions & 3 deletions dev/aarch64_opt/src/polyz_unpack_19_asm.S
@@ -66,9 +66,9 @@ polyz_unpack_19_loop:
// 3-register ld1 would load 48 bytes, but only 40 are
// consumed per iteration. The TBL indices for v2 are
// adjusted to account for v2's load offset.
-ld1 {v0.16b, v1.16b}, [buf]
-add buf, buf, #0x18
-ld1 {v2.16b}, [buf], #0x10
+ld1 {v0.16b, v1.16b}, [buf]
+add buf, buf, #0x18
+ld1 {v2.16b}, [buf], #0x10

tbl v4.16b, {v0.16b}, idx0.16b
tbl v5.16b, {v0.16b - v1.16b}, idx1.16b