K 10
svn:author
V 6
kevans
K 8
svn:date
V 27
2020-12-30T01:14:10.146353Z
K 7
svn:log
V 2983
libregex: implement GNU extensions

18a1e2e9: libregex: Implement a subset of the GNU extensions

The entire patch-set is not yet mature enough for commit, but this usable
subset is generally enough for googletest to be happy with and mostly map to
some existing concepts, so they're not as invasive.

The specific changes included here are:
- Branching in BREs with \|
- \w and \W for [[:alnum:]] and [^[:alnum:]] respectively
- \s and \S for [[:space:]] and [^[:space:]] respectively
- Additional quantifiers in BREs, \? and \+ (self-explanatory)

There's some #ifdef'd out work for allowing empty branches as a match-all.
This is a feature that's under assessment... future work will determine
how standard this behavior is and act accordingly.

61898cde: libregex: disable some of the unimplemented test cases for now

This should allow the tests to actually pass. Future work will uncomment the
unimplemented tests as they're implemented.

7518fb34: libc: regex: factor out ISBOW/ISEOW macros

These will be reused for \b (word boundary, which matches both sides).

No functional change.

ca53e5ae: libregex: implement \` and \' (begin-of-subj, end-of-subj)

These are GNU extensions, generally equivalent to ^ and $ except that the
new syntax will not match beginning of line after the first in a multi-line
expression or the end of line before absolute last in a multi-line
expression.

6b986646: libregex: implement \b and \B (word boundary, not word boundary)

This is the last of the needed GNU expressions before we can unleash bsdgrep
by default. \b is effectively an agnostic equivalent of \< and \>, while
\B will match every space that isn't making a transition from
nonchar -> char or char -> nonchar.

4afa7dd6: libc: regex: retire internal EMPTBR ("Empty branch present")

It was realized just a little too late that this was a hack that belonged in
individual regex(3)-using applications. It was surrounded in NOTYET and not
implemented in the engine, so remove it.

4f1efa30: libc: regex: partial revert of r368358 (6b986646)

MFC NOTE: Altered to match the legacy behavior of a\bc => abc.

Part of the libregex functionality leaked into the tests it shares with
the standard regex(3). Introduce a P flag to set the REG_POSIX cflag to
indicate that libc regex should effectively do nothing while libregex should
specifically run it in non-extended mode.

This unbreaks the libc/regex test run.

(cherry picked from commit 18a1e2e9b9f109a78c5a9274e4cfb4777801b4fb)
(cherry picked from commit 61898cde69374d5a9994e2074605bc4101aff72d)
(cherry picked from commit 7518fb346fe9603f99d2406a073b30fb8e4a270c)
(cherry picked from commit ca53e5aedfebcc1b4091b68e01b2d5cae923f85e)
(cherry picked from commit 6b986646d434baa21ae3d74d6a662ad206c7ddbd)
(cherry picked from commit 4afa7dd61a3a1454a5b3cf5e6de2029c7e2d9a84)
(cherry picked from commit 4f1efa309ca48a088595dd57969ae6a397dd49d1)

Git Hash:   06e63004abb0abc801e9f8af066ef10095189b10
Git Author: kevans@FreeBSD.org
END