K 10 svn:author V 3 bde K 8 svn:date V 27 2017-04-14T12:03:34.272811Z K 7 svn:log V 907 Further unobfuscate the method of drawing the mouse cursor in vga planar mode. Don't manually unroll the 2 inner loops. On Haswell, doing so gave a speedup of about 0.5% (about 4 cycles per iteration out of 1400), but hard-coded a limit of width 9 and made better optimizations harder to see. gcc-4.2.1 -O does the unrolling anyway, unless tricked with a volatile hack. gcc's unrolling is not very good and gives a speedup of about half as much (about 2 cycles per iteration). (All timing on i386.) Manual unrolling was only feasible because the inner loop only iterates once or twice. Usually twice, but a dynamic check is needed to decide, and was not moved from the second-innermost loop manually or by gcc. This commit basically adds another dynamic check in the inner loop. Cursor widths of 10-17 require 3 iterations in the inner loop and this is not so easy to unroll -- even gcc stops at 2. END