K 10
svn:author
V 3
mjg
K 8
svn:date
V 27
2018-10-22T06:44:20.900773Z
K 7
svn:log
V 440
amd64: finish the tail in memset with an overlapping store

Instead of finding the exact size to fit in we can just shift the target
by -8 + tail. Doing a blind write to a previously rep stosq'ed area comes
with a penalty so do it conditionally.

Sample win on EPYC when zeroing a 257 sized buffer (tail = 1) aligned to
16 bytes:
before: 44782846 ops/s
after: 46118614 ops/s

Idea stolen from NetBSD.

Sponsored by: The FreeBSD Foundation
END
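
Illustration only, not part of the commit (which changes the amd64 assembly): a minimal C sketch of the overlapping-store tail described in the log. The bulk loop stands in for rep stosq, and the final 8-byte store is issued at the current pointer shifted by -8 + tail, only when tail is non-zero. The name memset_overlap_tail and the byte-loop fallback for lengths under 8 are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the FreeBSD routine: fill len bytes at dst with c. */
static void *memset_overlap_tail(void *dst, int c, size_t len)
{
    unsigned char *p = dst;
    /* Replicate the fill byte into all eight bytes of a 64-bit word. */
    uint64_t pattern = 0x0101010101010101ULL * (unsigned char)c;

    if (len < 8) {
        /* Too short for an 8-byte store; plain byte loop (assumed fallback). */
        for (size_t i = 0; i < len; i++)
            p[i] = (unsigned char)c;
        return dst;
    }

    /* Bulk phase: whole 8-byte words, standing in for rep stosq. */
    size_t words = len / 8;
    for (size_t i = 0; i < words; i++) {
        memcpy(p, &pattern, 8);
        p += 8;
    }

    /*
     * Tail phase: rather than picking a 1/2/4-byte store that fits exactly,
     * back up so one 8-byte store ends at dst + len. It overlaps bytes that
     * were already written, so skip it when there is no tail.
     */
    size_t tail = len & 7;
    if (tail != 0)
        memcpy(p - 8 + tail, &pattern, 8);

    return dst;
}
```

For the 257-byte case from the log, the bulk phase covers 256 bytes, tail is 1, and the final store lands on bytes 249 through 256, rewriting seven bytes that were already set.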