K 10 svn:author V 6 marius K 8 svn:date V 27 2012-01-28T23:24:03.018638Z K 7 svn:log V 722 MFC: r225889, r228222 In total store which we use for running the kernel and all of the userland atomic operations behave as if they were followed by a CPU memory barrier so there's no need to include ones in the acquire variants of atomic(9) and it's sufficient to just use include compiler memory barriers to satisfy the requirements of atomic(9). Removing the CPU memory barriers results in a small performance improvement, specifically this is sufficient to compensate the performance loss seen in the worldstone benchmark seen when using SCHED_ULE instead of SCHED_4BSD. This change is inspired by Linux even more radically doing the equivalent thing some time ago. Thanks go to Peter Jeremy for additional testing. END