Message-Id: <20250925131557.8907-1-pincheng.plct@isrc.iscas.ac.cn>
Date: Thu, 25 Sep 2025 21:15:56 +0800
From: Pincheng Wang <pincheng.plct@...c.iscas.ac.cn>
To: musl@...ts.openwall.com
Cc: pincheng.plct@...c.iscas.ac.cn
Subject: [PATCH 0/1] riscv64: Add RVV optimized memset implementation

Hi all,

This patch introduces a RISC-V Vector (RVV) optimized implementation of
memset.

Key points:

- Use RVV instructions to fill memory in bulk, with a small-size
  head-tail fast path to reduce vsetvli overhead.
- Fall back to a scalar head-tail implementation (like the generic C
  implementation) when RVV is not available.
- Reduce both instruction count and code size: memset.o shrinks by
  about 16.5% compared to the generic C build.

Performance results on RVV-capable hardware show clear improvements:

- On Spacemit X60: up to ~3.1x faster (256B), with consistent gains
  across medium and large sizes.
- On XuanTie C908: up to ~2.1x faster (128B), with modest gains for
  larger sizes.

For very small sizes (<8 bytes), there can be regressions compared to
the generic C version. A more aggressive fast path could remove these
regressions, but at the cost of added code complexity. Feedback on this
trade-off is welcome.

The implementation was tested under QEMU with RVV enabled and on real
hardware. Functional behavior matches the generic memset, with no
changes to the public interface.

Thanks,
Pincheng Wang

Pincheng Wang (1):
  riscv64: optimize memset implementation with vector extension

 src/string/riscv64/memset.S | 101 ++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 src/string/riscv64/memset.S

-- 
2.39.5
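
For readers unfamiliar with the "head-tail" technique mentioned in the key
points, here is a minimal C sketch of the idea. It is not the code from the
patch: the function name headtail_memset is hypothetical, the size thresholds
are illustrative, and the plain byte loop at the end only stands in for
whatever bulk (vector or word-wise) fill a real implementation would use.
Small sizes are handled by storing bytes from both ends of the buffer, with
overlapping stores allowed, so only a few size checks are needed.

```c
#include <stddef.h>

/* Hypothetical illustration of a head-tail fill; not the patch's code. */
void *headtail_memset(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    unsigned char b = (unsigned char)c;

    if (n == 0)
        return dest;

    /* Store from the head and from the tail. For small n the two
     * groups of stores overlap, which is harmless because every
     * store writes the same byte value. */
    s[0] = b;
    s[n - 1] = b;
    if (n <= 2)
        return dest;

    s[1] = b;
    s[2] = b;
    s[n - 2] = b;
    s[n - 3] = b;
    if (n <= 6)
        return dest;

    s[3] = b;
    s[n - 4] = b;
    if (n <= 8)
        return dest;

    /* Bytes 0..3 and n-4..n-1 are already set. A real implementation
     * would fill the middle in bulk; a byte loop stands in here. */
    for (size_t i = 4; i < n - 4; i++)
        s[i] = b;

    return dest;
}
```

The point of the sketch is that sizes up to 8 bytes finish after at most four
size checks and no loop, which is the kind of path that avoids vector setup
cost for short fills.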