Blame SOURCES/pcre-8.32-Update-POSIX-class-handling-in-UCP-mode.patch

From e74dcd1eec9227fe23c06de2ff109e48695fd879 Mon Sep 17 00:00:00 2001
From: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
Date: Sat, 2 Nov 2013 18:29:05 +0000
Subject: [PATCH 1/2] Update POSIX class handling in UCP mode.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Petr Pisar: Ported to 8.32:

commit fa3832825e3fe0d49f93658882775cdd6c26129e
Author: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
Date:   Sat Nov 2 18:29:05 2013 +0000

    Update POSIX class handling in UCP mode.

    git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1387 2f5784b3-3f2a-0410-8824-cb99058d5e15

It also adjusts some test 7 outputs because 8.32 does not contain
the auto-possessification improvement from

commit 5f42224005b7d9a503903e3342ec7ada75590b07
Author: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15>
Date:   Tue Oct 1 16:54:40 2013 +0000

    Refactored auto-possessification code.

    git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1363 2f5784b3-3f2a-0410-8824-cb99058d5e15

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 doc/pcrepattern.3    |  37 +++++--
 pcre_compile.c       |  75 +++++++++++---
 pcre_internal.h      |  16 ++-
 pcre_printint.c      |  59 ++++++++---
 pcre_xclass.c        |  63 ++++++++++--
 testdata/testinput6  | 146 ++++++++++++++++++++++++++
 testdata/testinput7  |  10 ++
 testdata/testoutput6 | 286 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 testdata/testoutput7 | 117 ++++++++++++++++++++-
 9 files changed, 752 insertions(+), 57 deletions(-)

diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
cb67f2
index c9c7b45..f638846 100644
cb67f2
--- a/doc/pcrepattern.3
cb67f2
+++ b/doc/pcrepattern.3
cb67f2
@@ -861,8 +861,9 @@ the "mark" property always have the "extend" grapheme breaking property.
cb67f2
 .sp
cb67f2
 As well as the standard Unicode properties described above, PCRE supports four
cb67f2
 more that make it possible to convert traditional escape sequences such as \ew
cb67f2
-and \es and POSIX character classes to use Unicode properties. PCRE uses these
cb67f2
-non-standard, non-Perl properties internally when PCRE_UCP is set. They are:
cb67f2
+and \es to use Unicode properties. PCRE uses these non-standard, non-Perl
cb67f2
+properties internally when PCRE_UCP is set. However, they may also be used
cb67f2
+explicitly. These properties are:
cb67f2
 .sp
cb67f2
   Xan   Any alphanumeric character
cb67f2
   Xps   Any POSIX space character
cb67f2
@@ -873,6 +874,7 @@ Xan matches characters that have either the L (letter) or the N (number)
cb67f2
 property. Xps matches the characters tab, linefeed, vertical tab, form feed, or
cb67f2
 carriage return, and any other character that has the Z (separator) property.
cb67f2
 Xsp is the same as Xps, except that vertical tab is excluded. Xwd matches the
cb67f2
+:qa
cb67f2
 same characters as Xan, plus underscore.
cb67f2
 .
cb67f2
 .
cb67f2
@@ -1258,8 +1260,8 @@ supported, and an error is given if they are encountered.
cb67f2
 By default, in UTF modes, characters with values greater than 128 do not match
cb67f2
 any of the POSIX character classes. However, if the PCRE_UCP option is passed
cb67f2
 to \fBpcre_compile()\fP, some of the classes are changed so that Unicode
cb67f2
-character properties are used. This is achieved by replacing the POSIX classes
cb67f2
-by other sequences, as follows:
cb67f2
+character properties are used. This is achieved by replacing certain POSIX
cb67f2
+classes by other sequences, as follows:
cb67f2
 .sp
cb67f2
   [:alnum:]  becomes  \ep{Xan}
cb67f2
   [:alpha:]  becomes  \ep{L}
cb67f2
@@ -1270,9 +1272,30 @@ by other sequences, as follows:
cb67f2
   [:upper:]  becomes  \ep{Lu}
cb67f2
   [:word:]   becomes  \ep{Xwd}
cb67f2
 .sp
cb67f2
-Negated versions, such as [:^alpha:] use \eP instead of \ep. The other POSIX
cb67f2
-classes are unchanged, and match only characters with code points less than
cb67f2
-128.
cb67f2
+Negated versions, such as [:^alpha:] use \eP instead of \ep. Three other POSIX 
cb67f2
+classes are handled specially in UCP mode:
cb67f2
+.TP 10
cb67f2
+[:graph:]
cb67f2
+This matches characters that have glyphs that mark the page when printed. In 
cb67f2
+Unicode property terms, it matches all characters with the L, M, N, P, S, or Cf 
cb67f2
+properties, except for:
cb67f2
+.sp
cb67f2
+  U+061C           Arabic Letter Mark
cb67f2
+  U+180E           Mongolian Vowel Separator 
cb67f2
+  U+2066 - U+2069  Various "isolate"s
cb67f2
+.sp
cb67f2
+.TP 10
cb67f2
+[:print:]
cb67f2
+This matches the same characters as [:graph:] plus space characters that are 
cb67f2
+not controls, that is, characters with the Zs property.
cb67f2
+.TP 10
cb67f2
+[:punct:]
cb67f2
+This matches all characters that have the Unicode P (punctuation) property,
cb67f2
+plus those characters whose code points are less than 128 that have the S
cb67f2
+(Symbol) property.
cb67f2
+.P
cb67f2
+The other POSIX classes are unchanged, and match only characters with code
cb67f2
+points less than 128.
cb67f2
 .
cb67f2
 .
cb67f2
 .SH "VERTICAL BAR"
cb67f2
diff --git a/pcre_compile.c b/pcre_compile.c
cb67f2
index 746dc70..3c75218 100644
cb67f2
--- a/pcre_compile.c
cb67f2
+++ b/pcre_compile.c
cb67f2
@@ -257,7 +257,8 @@ static const int verbcount = sizeof(verbs)/sizeof(verbitem);
cb67f2
 now all in a single string, to reduce the number of relocations when a shared
cb67f2
 library is dynamically loaded. The list of lengths is terminated by a zero
cb67f2
 length entry. The first three must be alpha, lower, upper, as this is assumed
cb67f2
-for handling case independence. */
cb67f2
+for handling case independence. The indices for graph, print, and punct are
cb67f2
+needed, so identify them. */
cb67f2
 
cb67f2
 static const char posix_names[] =
cb67f2
   STRING_alpha0 STRING_lower0 STRING_upper0 STRING_alnum0
cb67f2
@@ -268,6 +269,11 @@ static const char posix_names[] =
cb67f2
 static const pcre_uint8 posix_name_lengths[] = {
cb67f2
   5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 6, 0 };
cb67f2
 
cb67f2
+#define PC_GRAPH  8
cb67f2
+#define PC_PRINT  9
cb67f2
+#define PC_PUNCT 10
cb67f2
+
cb67f2
+
cb67f2
 /* Table of class bit maps for each POSIX class. Each class is formed from a
cb67f2
 base map, with an optional addition or removal of another map. Then, for some
cb67f2
 classes, there is some additional tweaking: for [:blank:] the vertical space
cb67f2
@@ -295,9 +301,8 @@ static const int posix_class_maps[] = {
cb67f2
   cbit_xdigit,-1,          0              /* xdigit */
cb67f2
 };
cb67f2
 
cb67f2
-/* Table of substitutes for \d etc when PCRE_UCP is set. The POSIX class
cb67f2
-substitutes must be in the order of the names, defined above, and there are
cb67f2
-both positive and negative cases. NULL means no substitute. */
cb67f2
+/* Table of substitutes for \d etc when PCRE_UCP is set. They are replaced by
cb67f2
+Unicode property escapes. */
cb67f2
 
cb67f2
 #ifdef SUPPORT_UCP
cb67f2
 static const pcre_uchar string_PNd[]  = {
cb67f2
@@ -322,12 +327,18 @@ static const pcre_uchar string_pXwd[] = {
cb67f2
 static const pcre_uchar *substitutes[] = {
cb67f2
   string_PNd,           /* \D */
cb67f2
   string_pNd,           /* \d */
cb67f2
-  string_PXsp,          /* \S */       /* NOTE: Xsp is Perl space */
cb67f2
-  string_pXsp,          /* \s */
cb67f2
+  string_PXsp,          /* \S */   /* Xsp is Perl space, but from 8.34, Perl */
cb67f2
+  string_pXsp,          /* \s */   /* space and POSIX space are the same. */
cb67f2
   string_PXwd,          /* \W */
cb67f2
   string_pXwd           /* \w */
cb67f2
 };
cb67f2
 
cb67f2
+/* The POSIX class substitutes must be in the order of the POSIX class names,
cb67f2
+defined above, and there are both positive and negative cases. NULL means no
cb67f2
+general substitute of a Unicode property escape (\p or \P). However, for some
cb67f2
+POSIX classes (e.g. graph, print, punct) a special property code is compiled
cb67f2
+directly. */
cb67f2
+
cb67f2
 static const pcre_uchar string_pL[] =   {
cb67f2
   CHAR_BACKSLASH, CHAR_p, CHAR_LEFT_CURLY_BRACKET,
cb67f2
   CHAR_L, CHAR_RIGHT_CURLY_BRACKET, '\0' };
cb67f2
@@ -375,8 +386,8 @@ static const pcre_uchar *posix_substitutes[] = {
cb67f2
   NULL,                 /* graph */
cb67f2
   NULL,                 /* print */
cb67f2
   NULL,                 /* punct */
cb67f2
-  string_pXps,          /* space */    /* NOTE: Xps is POSIX space */
cb67f2
-  string_pXwd,          /* word */
cb67f2
+  string_pXps,          /* space */   /* Xps is POSIX space, but from 8.34 */
cb67f2
+  string_pXwd,          /* word  */   /* Perl and POSIX space are the same */
cb67f2
   NULL,                 /* xdigit */
cb67f2
   /* Negated cases */
cb67f2
   string_PL,            /* ^alpha */
cb67f2
@@ -390,8 +401,8 @@ static const pcre_uchar *posix_substitutes[] = {
cb67f2
   NULL,                 /* ^graph */
cb67f2
   NULL,                 /* ^print */
cb67f2
   NULL,                 /* ^punct */
cb67f2
-  string_PXps,          /* ^space */   /* NOTE: Xps is POSIX space */
cb67f2
-  string_PXwd,          /* ^word */
cb67f2
+  string_PXps,          /* ^space */  /* Xps is POSIX space, but from 8.34 */
cb67f2
+  string_PXwd,          /* ^word */   /* Perl and POSIX space are the same */
cb67f2
   NULL                  /* ^xdigit */
cb67f2
 };
cb67f2
 #define POSIX_SUBSIZE (sizeof(posix_substitutes) / sizeof(pcre_uchar *))
cb67f2
@@ -4258,24 +4269,58 @@ for (;; ptr++)
cb67f2
           posix_class = 0;
cb67f2
 
cb67f2
         /* When PCRE_UCP is set, some of the POSIX classes are converted to
cb67f2
-        different escape sequences that use Unicode properties. */
cb67f2
+        different escape sequences that use Unicode properties \p or \P. Others
cb67f2
+        that are not available via \p or \P generate XCL_PROP/XCL_NOTPROP
cb67f2
+        directly. */
cb67f2
 
cb67f2
 #ifdef SUPPORT_UCP
cb67f2
         if ((options & PCRE_UCP) != 0)
cb67f2
           {
cb67f2
+          unsigned int ptype = 0;
cb67f2
           int pc = posix_class + ((local_negate)? POSIX_SUBSIZE/2 : 0);
cb67f2
+
cb67f2
+          /* The posix_substitutes table specifies which POSIX classes can be 
cb67f2
+          converted to \p or \P items. */
cb67f2
+           
cb67f2
           if (posix_substitutes[pc] != NULL)
cb67f2
             {
cb67f2
             nestptr = tempptr + 1;
cb67f2
             ptr = posix_substitutes[pc] - 1;
cb67f2
             continue;
cb67f2
             }
cb67f2
+            
cb67f2
+          /* There are three other classes that generate special property calls 
cb67f2
+          that are recognized only in an XCLASS. */ 
cb67f2
+
cb67f2
+          else switch(posix_class)
cb67f2
+            {
cb67f2
+            case PC_GRAPH:
cb67f2
+            ptype = PT_PXGRAPH;
cb67f2
+            /* Fall through */
cb67f2
+            case PC_PRINT:
cb67f2
+            if (ptype == 0) ptype = PT_PXPRINT;
cb67f2
+            /* Fall through */
cb67f2
+            case PC_PUNCT:
cb67f2
+            if (ptype == 0) ptype = PT_PXPUNCT;
cb67f2
+            *class_uchardata++ = local_negate? XCL_NOTPROP : XCL_PROP;
cb67f2
+            *class_uchardata++ = ptype;
cb67f2
+            *class_uchardata++ = 0;
cb67f2
+            ptr = tempptr + 1;
cb67f2
+            continue;
cb67f2
+            
cb67f2
+            /* For all other POSIX classes, no special action is taken in UCP
cb67f2
+            mode. Fall through to the non-UCP case. */
cb67f2
+
cb67f2
+            default:
cb67f2
+            break; 
cb67f2
+            }
cb67f2
           }
cb67f2
 #endif
cb67f2
-        /* In the non-UCP case, we build the bit map for the POSIX class in a
cb67f2
-        chunk of local store because we may be adding and subtracting from it,
cb67f2
-        and we don't want to subtract bits that may be in the main map already.
cb67f2
-        At the end we or the result into the bit map that is being built. */
cb67f2
+        /* In the non-UCP case, or when UCP makes no difference, we build the
cb67f2
+        bit map for the POSIX class in a chunk of local store because we may be
cb67f2
+        adding and subtracting from it, and we don't want to subtract bits that
cb67f2
+        may be in the main map already. At the end we or the result into the
cb67f2
+        bit map that is being built. */
cb67f2
 
cb67f2
         posix_class *= 3;
cb67f2
 
cb67f2
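Reviewer note: the fall-through switch added to pcre_compile.c above maps a POSIX class index to a three-unit XCLASS item (XCL_PROP or XCL_NOTPROP, the property type, and a zero value). The following is only a minimal standalone sketch of that mapping, not the PCRE code itself. The PC_* and PT_PX* values are copied from this patch; XCL_PROP = 3 is an assumption, since only XCL_NOTPROP = 4 is visible in the hunks; emit_posix_ucp_item() is a hypothetical helper name.

```c
#include <stddef.h>

/* Values copied from the patch. XCL_PROP = 3 is an assumption: only
   XCL_NOTPROP = 4 appears in the visible hunks. */
enum { PC_GRAPH = 8, PC_PRINT = 9, PC_PUNCT = 10 };
enum { PT_PXGRAPH = 11, PT_PXPRINT = 12, PT_PXPUNCT = 13 };
enum { XCL_PROP = 3, XCL_NOTPROP = 4 };

/* Hypothetical helper: writes the three code units for a POSIX class that
   gets special UCP handling into out[] and returns 3. Returns 0 when the
   class has no special handling, in which case the caller would build the
   ordinary bit map instead. */
static size_t emit_posix_ucp_item(int posix_class, int negate,
                                  unsigned int out[3])
{
  unsigned int ptype;

  switch (posix_class)
    {
    case PC_GRAPH: ptype = PT_PXGRAPH; break;
    case PC_PRINT: ptype = PT_PXPRINT; break;
    case PC_PUNCT: ptype = PT_PXPUNCT; break;
    default: return 0;
    }

  out[0] = negate ? XCL_NOTPROP : XCL_PROP;
  out[1] = ptype;
  out[2] = 0;   /* the property value is unused for these types */
  return 3;
}
```

The sketch uses one case per class with explicit breaks, whereas the patch uses fall-through with "if (ptype == 0)" guards; the resulting item should be the same.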
diff --git a/pcre_internal.h b/pcre_internal.h
cb67f2
index 157de08..389848f 100644
cb67f2
--- a/pcre_internal.h
cb67f2
+++ b/pcre_internal.h
cb67f2
@@ -1836,6 +1836,16 @@ only. */
cb67f2
 #define PT_WORD       8    /* Word - L plus N plus underscore */
cb67f2
 #define PT_CLIST      9    /* Pseudo-property: match character list */
cb67f2
 
cb67f2
+/* The following special properties are used only in XCLASS items, when POSIX 
cb67f2
+classes are specified and PCRE_UCP is set - in other words, for Unicode 
cb67f2
+handling of these classes. They are not available via the \p or \P escapes like 
cb67f2
+those in the above list, and so they do not take part in the autopossessifying 
cb67f2
+table. */
cb67f2
+
cb67f2
+#define PT_PXGRAPH   11    /* [:graph:] - characters that mark the paper */
cb67f2
+#define PT_PXPRINT   12    /* [:print:] - [:graph:] plus non-control spaces */
cb67f2
+#define PT_PXPUNCT   13    /* [:punct:] - punctuation characters */
cb67f2
+
cb67f2
 /* Flag bits and data types for the extended class (OP_XCLASS) for classes that
cb67f2
 contain characters with values greater than 255. */
cb67f2
 
cb67f2
@@ -1849,9 +1859,9 @@ contain characters with values greater than 255. */
cb67f2
 #define XCL_NOTPROP   4    /* Unicode inverted property (ditto) */
cb67f2
 
cb67f2
 /* These are escaped items that aren't just an encoding of a particular data
cb67f2
-value such as \n. They must have non-zero values, as check_escape() returns
cb67f2
-0 for a data character.  Also, they must appear in the same order as in the opcode
cb67f2
-definitions below, up to ESC_z. There's a dummy for OP_ALLANY because it
cb67f2
+value such as \n. They must have non-zero values, as check_escape() returns 0
cb67f2
+for a data character.  Also, they must appear in the same order as in the
cb67f2
+opcode definitions below, up to ESC_z. There's a dummy for OP_ALLANY because it
cb67f2
 corresponds to "." in DOTALL mode rather than an escape sequence. It is also
cb67f2
 used for [^] in JavaScript compatibility mode, and for \C in non-utf mode. In
cb67f2
 non-DOTALL mode, "." behaves like \N.
cb67f2
diff --git a/pcre_printint.c b/pcre_printint.c
cb67f2
index 10b5754..c6dcbe6 100644
cb67f2
--- a/pcre_printint.c
cb67f2
+++ b/pcre_printint.c
cb67f2
@@ -608,9 +608,9 @@ for(;;)
cb67f2
     print_prop(f, code, "    ", "");
cb67f2
     break;
cb67f2
 
cb67f2
-    /* OP_XCLASS can only occur in UTF or PCRE16 modes. However, there's no
cb67f2
-    harm in having this code always here, and it makes it less messy without
cb67f2
-    all those #ifdefs. */
cb67f2
+    /* OP_XCLASS cannot occur in 8-bit, non-UTF mode. However, there's no harm
cb67f2
+    in having this code always here, and it makes it less messy without all
cb67f2
+    those #ifdefs. */
cb67f2
 
cb67f2
     case OP_CLASS:
cb67f2
     case OP_NCLASS:
cb67f2
@@ -671,27 +671,52 @@ for(;;)
cb67f2
         pcre_uchar ch;
cb67f2
         while ((ch = *ccode++) != XCL_END)
cb67f2
           {
cb67f2
-          if (ch == XCL_PROP)
cb67f2
-            {
cb67f2
-            unsigned int ptype = *ccode++;
cb67f2
-            unsigned int pvalue = *ccode++;
cb67f2
-            fprintf(f, "\\p{%s}", get_ucpname(ptype, pvalue));
cb67f2
-            }
cb67f2
-          else if (ch == XCL_NOTPROP)
cb67f2
-            {
cb67f2
-            unsigned int ptype = *ccode++;
cb67f2
-            unsigned int pvalue = *ccode++;
cb67f2
-            fprintf(f, "\\P{%s}", get_ucpname(ptype, pvalue));
cb67f2
-            }
cb67f2
-          else
cb67f2
+          BOOL not = FALSE; 
cb67f2
+          const char *notch = ""; 
cb67f2
+           
cb67f2
+          switch(ch)
cb67f2
             {
cb67f2
+            case XCL_NOTPROP: 
cb67f2
+            not = TRUE;
cb67f2
+            notch = "^"; 
cb67f2
+            /* Fall through */
cb67f2
+               
cb67f2
+            case XCL_PROP:  
cb67f2
+              {
cb67f2
+              unsigned int ptype = *ccode++;
cb67f2
+              unsigned int pvalue = *ccode++;
cb67f2
+              
cb67f2
+              switch(ptype)
cb67f2
+                {
cb67f2
+                case PT_PXGRAPH:
cb67f2
+                fprintf(f, "[:%sgraph:]", notch);
cb67f2
+                break;    
cb67f2
+
cb67f2
+                case PT_PXPRINT:
cb67f2
+                fprintf(f, "[:%sprint:]", notch);
cb67f2
+                break;    
cb67f2
+
cb67f2
+                case PT_PXPUNCT:
cb67f2
+                fprintf(f, "[:%spunct:]", notch);
cb67f2
+                break;    
cb67f2
+
cb67f2
+                default:
cb67f2
+                fprintf(f, "\\%c{%s}", (not? 'P':'p'), 
cb67f2
+                  get_ucpname(ptype, pvalue));
cb67f2
+                break;
cb67f2
+                }    
cb67f2
+              }
cb67f2
+            break;
cb67f2
+             
cb67f2
+            default:
cb67f2
             ccode += 1 + print_char(f, ccode, utf);
cb67f2
             if (ch == XCL_RANGE)
cb67f2
               {
cb67f2
               fprintf(f, "-");
cb67f2
               ccode += 1 + print_char(f, ccode, utf);
cb67f2
               }
cb67f2
-            }
cb67f2
+            break; 
cb67f2
+            } 
cb67f2
           }
cb67f2
         }
cb67f2
 
cb67f2
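Reviewer note: the rewritten XCLASS printer in pcre_printint.c above shows the three new property types in POSIX class syntax and everything else as \p{...} or \P{...}. The following is a minimal sketch of that formatting rule, writing to a buffer instead of a FILE. format_xclass_prop() is a hypothetical helper, and its ucpname argument stands in for the result of get_ucpname(ptype, pvalue).

```c
#include <stdio.h>

/* Values copied from the pcre_internal.h hunk of this patch. */
enum { PT_PXGRAPH = 11, PT_PXPRINT = 12, PT_PXPUNCT = 13 };

/* Hypothetical helper: formats one XCL_PROP/XCL_NOTPROP item into buf and
   returns the snprintf() result. ucpname is used only for property types
   other than the three POSIX-specific ones. */
static int format_xclass_prop(char *buf, size_t size, int negated,
                              unsigned int ptype, const char *ucpname)
{
  const char *notch = negated ? "^" : "";

  switch (ptype)
    {
    case PT_PXGRAPH: return snprintf(buf, size, "[:%sgraph:]", notch);
    case PT_PXPRINT: return snprintf(buf, size, "[:%sprint:]", notch);
    case PT_PXPUNCT: return snprintf(buf, size, "[:%spunct:]", notch);
    default:
      return snprintf(buf, size, "\\%c{%s}", negated ? 'P' : 'p', ucpname);
    }
}
```

Under these assumptions, a negated PT_PXGRAPH item is rendered as [:^graph:], which is the form the debug listings in the test output would be expected to show.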
diff --git a/pcre_xclass.c b/pcre_xclass.c
cb67f2
index fa73cd8..dd7008a 100644
cb67f2
--- a/pcre_xclass.c
cb67f2
+++ b/pcre_xclass.c
cb67f2
@@ -128,57 +128,102 @@ while ((t = *data++) != XCL_END)
cb67f2
   else  /* XCL_PROP & XCL_NOTPROP */
cb67f2
     {
cb67f2
     const ucd_record *prop = GET_UCD(c);
cb67f2
+    BOOL isprop = t == XCL_PROP; 
cb67f2
 
cb67f2
     switch(*data)
cb67f2
       {
cb67f2
       case PT_ANY:
cb67f2
-      if (t == XCL_PROP) return !negated;
cb67f2
+      if (isprop) return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_LAMP:
cb67f2
       if ((prop->chartype == ucp_Lu || prop->chartype == ucp_Ll ||
cb67f2
-           prop->chartype == ucp_Lt) == (t == XCL_PROP)) return !negated;
cb67f2
+           prop->chartype == ucp_Lt) == isprop) return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_GC:
cb67f2
-      if ((data[1] == PRIV(ucp_gentype)[prop->chartype]) == (t == XCL_PROP))
cb67f2
+      if ((data[1] == PRIV(ucp_gentype)[prop->chartype]) == isprop)
cb67f2
         return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_PC:
cb67f2
-      if ((data[1] == prop->chartype) == (t == XCL_PROP)) return !negated;
cb67f2
+      if ((data[1] == prop->chartype) == isprop) return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_SC:
cb67f2
-      if ((data[1] == prop->script) == (t == XCL_PROP)) return !negated;
cb67f2
+      if ((data[1] == prop->script) == isprop) return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_ALNUM:
cb67f2
       if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
cb67f2
-           PRIV(ucp_gentype)[prop->chartype] == ucp_N) == (t == XCL_PROP))
cb67f2
+           PRIV(ucp_gentype)[prop->chartype] == ucp_N) == isprop)
cb67f2
         return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_SPACE:    /* Perl space */
cb67f2
       if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
cb67f2
            c == CHAR_HT || c == CHAR_NL || c == CHAR_FF || c == CHAR_CR)
cb67f2
-             == (t == XCL_PROP))
cb67f2
+             == isprop)
cb67f2
         return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_PXSPACE:  /* POSIX space */
cb67f2
       if ((PRIV(ucp_gentype)[prop->chartype] == ucp_Z ||
cb67f2
            c == CHAR_HT || c == CHAR_NL || c == CHAR_VT ||
cb67f2
-           c == CHAR_FF || c == CHAR_CR) == (t == XCL_PROP))
cb67f2
+           c == CHAR_FF || c == CHAR_CR) == isprop)
cb67f2
         return !negated;
cb67f2
       break;
cb67f2
 
cb67f2
       case PT_WORD:
cb67f2
       if ((PRIV(ucp_gentype)[prop->chartype] == ucp_L ||
cb67f2
            PRIV(ucp_gentype)[prop->chartype] == ucp_N || c == CHAR_UNDERSCORE)
cb67f2
-             == (t == XCL_PROP))
cb67f2
+             == isprop)
cb67f2
         return !negated;
cb67f2
       break;
cb67f2
+      
cb67f2
+      /* The following three properties can occur only in an XCLASS, as there
cb67f2
+      is no \p or \P coding for them. */
cb67f2
+
cb67f2
+      /* Graphic character. Implement this as not Z (space or separator) and 
cb67f2
+      not C (other), except for Cf (format) with a few exceptions. This seems 
cb67f2
+      to be what Perl does. The exceptional characters are:
cb67f2
+       
cb67f2
+      U+061C           Arabic Letter Mark
cb67f2
+      U+180E           Mongolian Vowel Separator 
cb67f2
+      U+2066 - U+2069  Various "isolate"s
cb67f2
+      */ 
cb67f2
+      
cb67f2
+      case PT_PXGRAPH:
cb67f2
+      if ((PRIV(ucp_gentype)[prop->chartype] != ucp_Z &&
cb67f2
+            (PRIV(ucp_gentype)[prop->chartype] != ucp_C ||
cb67f2
+              (prop->chartype == ucp_Cf && 
cb67f2
+                c != 0x061c && c != 0x180e && (c < 0x2066 || c > 0x2069))
cb67f2
+         )) == isprop)
cb67f2
+        return !negated;        
cb67f2
+      break;
cb67f2
+      
cb67f2
+      /* Printable character: same as graphic, with the addition of Zs, i.e. 
cb67f2
+      not Zl and not Zp, and U+180E. */
cb67f2
+
cb67f2
+      case PT_PXPRINT:
cb67f2
+      if ((prop->chartype != ucp_Zl &&
cb67f2
+           prop->chartype != ucp_Zp && 
cb67f2
+            (PRIV(ucp_gentype)[prop->chartype] != ucp_C ||
cb67f2
+              (prop->chartype == ucp_Cf && 
cb67f2
+                c != 0x061c && (c < 0x2066 || c > 0x2069))
cb67f2
+         )) == isprop)
cb67f2
+        return !negated;        
cb67f2
+      break;
cb67f2
+      
cb67f2
+      /* Punctuation: all Unicode punctuation, plus ASCII characters that 
cb67f2
+      Unicode treats as symbols rather than punctuation, for Perl
cb67f2
+      compatibility (these are $+<=>^`|~). */
cb67f2
+
cb67f2
+      case PT_PXPUNCT:
cb67f2
+      if ((PRIV(ucp_gentype)[prop->chartype] == ucp_P ||
cb67f2
+            (c < 256 && PRIV(ucp_gentype)[prop->chartype] == ucp_S)) == isprop)
cb67f2
+        return !negated;
cb67f2
+      break;           
cb67f2
 
cb67f2
       /* This should never occur, but compilers may mutter if there is no
cb67f2
       default. */
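Reviewer note: the three new cases in pcre_xclass.c above can be restated as plain predicates. The following is a minimal sketch under a simplifying assumption: the caller passes the character's general category directly through a hypothetical category enum, instead of looking it up with GET_UCD(). The punct predicate follows the code's "c < 256" bound; the man page text in this same patch says "less than 128".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the Unicode data lookup. */
typedef enum {
  CAT_L, CAT_M, CAT_N, CAT_P, CAT_S,   /* letter, mark, number, punct, symbol */
  CAT_Zs, CAT_Zl, CAT_Zp,              /* space, line, paragraph separators */
  CAT_Cf, CAT_C_OTHER                  /* format; any other C category */
} category;

static bool is_separator(category cat)
{
  return cat == CAT_Zs || cat == CAT_Zl || cat == CAT_Zp;
}

static bool is_other(category cat)
{
  return cat == CAT_Cf || cat == CAT_C_OTHER;
}

/* U+061C and the isolates U+2066..U+2069: Cf characters excluded from both
   [:graph:] and [:print:]. */
static bool is_excluded_format(uint32_t c)
{
  return c == 0x061c || (c >= 0x2066 && c <= 0x2069);
}

/* [:graph:]: not Z and not C, except Cf, minus the exceptions and U+180E. */
static bool ucp_graph(uint32_t c, category cat)
{
  if (is_separator(cat)) return false;
  if (!is_other(cat)) return true;
  return cat == CAT_Cf && !is_excluded_format(c) && c != 0x180e;
}

/* [:print:]: as [:graph:], but Zs and U+180E are also accepted. */
static bool ucp_print(uint32_t c, category cat)
{
  if (cat == CAT_Zl || cat == CAT_Zp) return false;
  if (!is_other(cat)) return true;
  return cat == CAT_Cf && !is_excluded_format(c);
}

/* [:punct:]: any P, plus S for code points below 256 (as in the code). */
static bool ucp_punct(uint32_t c, category cat)
{
  return cat == CAT_P || (c < 256 && cat == CAT_S);
}
```

With these predicates, U+180E passed as Cf is printable but not graphic, which is consistent with its placement in the testinput6 cases added by this patch.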
cb67f2
diff --git a/testdata/testinput6 b/testdata/testinput6
cb67f2
index 219a30e..adafb89 100644
cb67f2
--- a/testdata/testinput6
cb67f2
+++ b/testdata/testinput6
cb67f2
@@ -1319,4 +1319,150 @@
cb67f2
 /^s?c/mi8
cb67f2
     scat
cb67f2
 
cb67f2
+/^[[:graph:]]+$/8W
cb67f2
+    Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+    ** Failers
cb67f2
+    \x{09}
cb67f2
+    \x{0a}
cb67f2
+    \x{1D}
cb67f2
+    \x{20}
cb67f2
+    \x{85}
cb67f2
+    \x{a0}
cb67f2
+    \x{61c}
cb67f2
+    \x{1680}
cb67f2
+    \x{180e}
cb67f2
+    \x{2028}
cb67f2
+    \x{2029}
cb67f2
+    \x{202f}
cb67f2
+    \x{2065}
cb67f2
+    \x{2066}
cb67f2
+    \x{2067}
cb67f2
+    \x{2068}
cb67f2
+    \x{2069}
cb67f2
+    \x{3000}
cb67f2
+    \x{e0002}
cb67f2
+    \x{e001f}
cb67f2
+    \x{e0080} 
cb67f2
+
cb67f2
+/^[[:print:]]+$/8W
cb67f2
+    Space: \x{a0}
cb67f2
+    \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
cb67f2
+    \x{2006}\x{2007}\x{2008}\x{2009}\x{200a} 
cb67f2
+    \x{202f}\x{205f} 
cb67f2
+    \x{3000}
cb67f2
+    Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{180e}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{202f}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+    ** Failers
cb67f2
+    \x{09}
cb67f2
+    \x{1D}
cb67f2
+    \x{85}
cb67f2
+    \x{61c}
cb67f2
+    \x{2028}
cb67f2
+    \x{2029}
cb67f2
+    \x{2065}
cb67f2
+    \x{2066}
cb67f2
+    \x{2067}
cb67f2
+    \x{2068}
cb67f2
+    \x{2069}
cb67f2
+    \x{e0002}
cb67f2
+    \x{e001f}
cb67f2
+    \x{e0080} 
cb67f2
+
cb67f2
+/^[[:punct:]]+$/8W
cb67f2
+    \$+<=>^`|~
cb67f2
+    !\"#%&'()*,-./:;?@[\\]_{}
cb67f2
+    \x{a1}\x{a7}  
cb67f2
+    \x{37e} 
cb67f2
+    ** Failers
cb67f2
+    abcde  
cb67f2
+
cb67f2
+/^[[:^graph:]]+$/8W
cb67f2
+    \x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{61c}\x{1680}\x{180e}
cb67f2
+    \x{2028}\x{2029}\x{202f}\x{2065}\x{2066}\x{2067}\x{2068}\x{2069}
cb67f2
+    \x{3000}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+    ** Failers
cb67f2
+    Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+
cb67f2
+/^[[:^print:]]+$/8W
cb67f2
+    \x{09}\x{1D}\x{85}\x{61c}\x{2028}\x{2029}\x{2065}\x{2066}\x{2067}
cb67f2
+    \x{2068}\x{2069}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+    ** Failers
cb67f2
+    Space: \x{a0}
cb67f2
+    \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
cb67f2
+    \x{2006}\x{2007}\x{2008}\x{2009}\x{200a} 
cb67f2
+    \x{202f}\x{205f} 
cb67f2
+    \x{3000}
cb67f2
+    Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{180e}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{202f}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+
cb67f2
+/^[[:^punct:]]+$/8W
cb67f2
+    abcde  
cb67f2
+    ** Failers
cb67f2
+    \$+<=>^`|~
cb67f2
+    !\"#%&'()*,-./:;?@[\\]_{}
cb67f2
+    \x{a1}\x{a7}  
cb67f2
+    \x{37e} 
cb67f2
+
cb67f2
 /-- End of testinput6 --/
cb67f2
diff --git a/testdata/testinput7 b/testdata/testinput7
cb67f2
index 252d246..bcdcef9 100644
cb67f2
--- a/testdata/testinput7
cb67f2
+++ b/testdata/testinput7
cb67f2
@@ -672,4 +672,14 @@ of case for anything other than the ASCII letters. --/
cb67f2
 /^s?c/mi8I
cb67f2
     scat
cb67f2
 
cb67f2
+/\D+\X \d+\X \S+\X \s+\X \W+\X \w+\X \C+\X \R+\X \H+\X \h+\X \V+\X \v+\X a+\X \n+\X .+\X/BZx
cb67f2
+
cb67f2
+/.+\X/BZxs
cb67f2
+
cb67f2
+/\X+$/BZxm
cb67f2
+
cb67f2
+/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\C \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/BZx
cb67f2
+
cb67f2
+/\d+\s{0,5}=\s*\S?=\w{0,4}\W*/8WBZ
cb67f2
+
cb67f2
 /-- End of testinput7 --/
cb67f2
diff --git a/testdata/testoutput6 b/testdata/testoutput6
cb67f2
index 090d23f..c426efc 100644
cb67f2
--- a/testdata/testoutput6
cb67f2
+++ b/testdata/testoutput6
cb67f2
@@ -1338,15 +1338,15 @@ No match
cb67f2
 
cb67f2
 /^[[:graph:]]*/8W
cb67f2
     A\x{a1}\x{a0}
cb67f2
- 0: A
cb67f2
+ 0: A\x{a1}
cb67f2
 
cb67f2
 /^[[:print:]]*/8W
cb67f2
     A z\x{a0}\x{a1}
cb67f2
- 0: A z
cb67f2
+ 0: A z\x{a0}\x{a1}
cb67f2
 
cb67f2
 /^[[:punct:]]*/8W
cb67f2
     .+\x{a1}\x{a0}
cb67f2
- 0: .+
cb67f2
+ 0: .+\x{a1}
cb67f2
 
cb67f2
 /\p{Zs}*?\R/
cb67f2
     ** Failers
cb67f2
@@ -2138,4 +2138,284 @@ No match
cb67f2
     scat
cb67f2
  0: sc
cb67f2
 
cb67f2
+/^[[:graph:]]+$/8W
cb67f2
+    Letter:ABC
cb67f2
+ 0: Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+ 0: Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+ 0: Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+ 0: Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+ 0: Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+ 0: Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+ 0: \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+ 0: \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+ 0: \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+ 0: \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+ 0: \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+ 0: \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+ 0: \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+ 0: \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+ 0: \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+ 0: \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+    ** Failers
cb67f2
+No match
cb67f2
+    \x{09}
cb67f2
+No match
cb67f2
+    \x{0a}
cb67f2
+No match
cb67f2
+    \x{1D}
cb67f2
+No match
cb67f2
+    \x{20}
cb67f2
+No match
cb67f2
+    \x{85}
cb67f2
+No match
cb67f2
+    \x{a0}
cb67f2
+No match
cb67f2
+    \x{61c}
cb67f2
+No match
cb67f2
+    \x{1680}
cb67f2
+No match
cb67f2
+    \x{180e}
cb67f2
+No match
cb67f2
+    \x{2028}
cb67f2
+No match
cb67f2
+    \x{2029}
cb67f2
+No match
cb67f2
+    \x{202f}
cb67f2
+No match
cb67f2
+    \x{2065}
cb67f2
+No match
cb67f2
+    \x{2066}
cb67f2
+No match
cb67f2
+    \x{2067}
cb67f2
+No match
cb67f2
+    \x{2068}
cb67f2
+No match
cb67f2
+    \x{2069}
cb67f2
+No match
cb67f2
+    \x{3000}
cb67f2
+No match
cb67f2
+    \x{e0002}
cb67f2
+No match
cb67f2
+    \x{e001f}
cb67f2
+No match
cb67f2
+    \x{e0080} 
cb67f2
+No match
cb67f2
+
cb67f2
+/^[[:print:]]+$/8W
cb67f2
+    Space: \x{a0}
cb67f2
+ 0: Space: \x{a0}
cb67f2
+    \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
cb67f2
+ 0: \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
cb67f2
+    \x{2006}\x{2007}\x{2008}\x{2009}\x{200a} 
cb67f2
+ 0: \x{2006}\x{2007}\x{2008}\x{2009}\x{200a}
cb67f2
+    \x{202f}\x{205f} 
cb67f2
+ 0: \x{202f}\x{205f}
cb67f2
+    \x{3000}
cb67f2
+ 0: \x{3000}
cb67f2
+    Letter:ABC
cb67f2
+ 0: Letter:ABC
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+ 0: Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+    Number:9\x{660}
cb67f2
+ 0: Number:9\x{660}
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+ 0: Punctuation:\x{66a},;
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+ 0: Symbol:\x{6de}<>\x{fffc}
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+ 0: Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+    \x{180e}
cb67f2
+ 0: \x{180e}
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+ 0: \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+ 0: \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+    \x{202f}
cb67f2
+ 0: \x{202f}
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+ 0: \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+ 0: \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+    \x{feff}
cb67f2
+ 0: \x{feff}
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+ 0: \x{fff9}\x{fffa}\x{fffb}
cb67f2
+    \x{110bd}
cb67f2
+ 0: \x{110bd}
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+ 0: \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+    \x{e0001}
cb67f2
+ 0: \x{e0001}
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+ 0: \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+    ** Failers
cb67f2
+ 0: ** Failers
cb67f2
+    \x{09}
cb67f2
+No match
cb67f2
+    \x{1D}
cb67f2
+No match
cb67f2
+    \x{85}
cb67f2
+No match
cb67f2
+    \x{61c}
cb67f2
+No match
cb67f2
+    \x{2028}
cb67f2
+No match
cb67f2
+    \x{2029}
cb67f2
+No match
cb67f2
+    \x{2065}
cb67f2
+No match
cb67f2
+    \x{2066}
cb67f2
+No match
cb67f2
+    \x{2067}
cb67f2
+No match
cb67f2
+    \x{2068}
cb67f2
+No match
cb67f2
+    \x{2069}
cb67f2
+No match
cb67f2
+    \x{e0002}
cb67f2
+No match
cb67f2
+    \x{e001f}
cb67f2
+No match
cb67f2
+    \x{e0080} 
cb67f2
+No match
cb67f2
+
cb67f2
+/^[[:punct:]]+$/8W
cb67f2
+    \$+<=>^`|~
cb67f2
+ 0: $+<=>^`|~
cb67f2
+    !\"#%&'()*,-./:;?@[\\]_{}
cb67f2
+ 0: !"#%&'()*,-./:;?@[\]_{}
cb67f2
+    \x{a1}\x{a7}  
cb67f2
+ 0: \x{a1}\x{a7}
cb67f2
+    \x{37e} 
cb67f2
+ 0: \x{37e}
cb67f2
+    ** Failers
cb67f2
+No match
cb67f2
+    abcde  
cb67f2
+No match
cb67f2
+
cb67f2
+/^[[:^graph:]]+$/8W
cb67f2
+    \x{09}\x{0a}\x{1D}\x{20}\x{85}\x{a0}\x{61c}\x{1680}\x{180e}
cb67f2
+ 0: \x{09}\x{0a}\x{1d} \x{85}\x{a0}\x{61c}\x{1680}\x{180e}
cb67f2
+    \x{2028}\x{2029}\x{202f}\x{2065}\x{2066}\x{2067}\x{2068}\x{2069}
cb67f2
+ 0: \x{2028}\x{2029}\x{202f}\x{2065}\x{2066}\x{2067}\x{2068}\x{2069}
cb67f2
+    \x{3000}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+ 0: \x{3000}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+    ** Failers
cb67f2
+No match
cb67f2
+    Letter:ABC
cb67f2
+No match
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+No match
cb67f2
+    Number:9\x{660}
cb67f2
+No match
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+No match
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+No match
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+No match
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+No match
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+No match
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+No match
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+No match
cb67f2
+    \x{feff}
cb67f2
+No match
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+No match
cb67f2
+    \x{110bd}
cb67f2
+No match
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+No match
cb67f2
+    \x{e0001}
cb67f2
+No match
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+No match
cb67f2
+
cb67f2
+/^[[:^print:]]+$/8W
cb67f2
+    \x{09}\x{1D}\x{85}\x{61c}\x{2028}\x{2029}\x{2065}\x{2066}\x{2067}
cb67f2
+ 0: \x{09}\x{1d}\x{85}\x{61c}\x{2028}\x{2029}\x{2065}\x{2066}\x{2067}
cb67f2
+    \x{2068}\x{2069}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+ 0: \x{2068}\x{2069}\x{e0002}\x{e001f}\x{e0080}
cb67f2
+    ** Failers
cb67f2
+No match
cb67f2
+    Space: \x{a0}
cb67f2
+No match
cb67f2
+    \x{1680}\x{2000}\x{2001}\x{2002}\x{2003}\x{2004}\x{2005}
cb67f2
+No match
cb67f2
+    \x{2006}\x{2007}\x{2008}\x{2009}\x{200a} 
cb67f2
+No match
cb67f2
+    \x{202f}\x{205f} 
cb67f2
+No match
cb67f2
+    \x{3000}
cb67f2
+No match
cb67f2
+    Letter:ABC
cb67f2
+No match
cb67f2
+    Mark:\x{300}\x{1d172}\x{1d17b}
cb67f2
+No match
cb67f2
+    Number:9\x{660}
cb67f2
+No match
cb67f2
+    Punctuation:\x{66a},;
cb67f2
+No match
cb67f2
+    Symbol:\x{6de}<>\x{fffc}
cb67f2
+No match
cb67f2
+    Cf-property:\x{ad}\x{600}\x{601}\x{602}\x{603}\x{604}\x{6dd}\x{70f}
cb67f2
+No match
cb67f2
+    \x{180e}
cb67f2
+No match
cb67f2
+    \x{200b}\x{200c}\x{200d}\x{200e}\x{200f}
cb67f2
+No match
cb67f2
+    \x{202a}\x{202b}\x{202c}\x{202d}\x{202e}
cb67f2
+No match
cb67f2
+    \x{202f}
cb67f2
+No match
cb67f2
+    \x{2060}\x{2061}\x{2062}\x{2063}\x{2064}
cb67f2
+No match
cb67f2
+    \x{206a}\x{206b}\x{206c}\x{206d}\x{206e}\x{206f}
cb67f2
+No match
cb67f2
+    \x{feff}
cb67f2
+No match
cb67f2
+    \x{fff9}\x{fffa}\x{fffb}
cb67f2
+No match
cb67f2
+    \x{110bd}
cb67f2
+No match
cb67f2
+    \x{1d173}\x{1d174}\x{1d175}\x{1d176}\x{1d177}\x{1d178}\x{1d179}\x{1d17a}
cb67f2
+No match
cb67f2
+    \x{e0001}
cb67f2
+No match
cb67f2
+    \x{e0020}\x{e0030}\x{e0040}\x{e0050}\x{e0060}\x{e0070}\x{e007f}
cb67f2
+No match
cb67f2
+
cb67f2
+/^[[:^punct:]]+$/8W
cb67f2
+    abcde  
cb67f2
+ 0: abcde
cb67f2
+    ** Failers
cb67f2
+No match
cb67f2
+    \$+<=>^`|~
cb67f2
+No match
cb67f2
+    !\"#%&'()*,-./:;?@[\\]_{}
cb67f2
+No match
cb67f2
+    \x{a1}\x{a7}  
cb67f2
+No match
cb67f2
+    \x{37e} 
cb67f2
+No match
cb67f2
+
cb67f2
 /-- End of testinput6 --/
cb67f2
diff --git a/testdata/testoutput7 b/testdata/testoutput7
cb67f2
index 5f0f546..e3f607c 100644
cb67f2
--- a/testdata/testoutput7
cb67f2
+++ b/testdata/testoutput7
cb67f2
@@ -820,7 +820,7 @@ No match
cb67f2
 /[[:graph:]]/WBZ
cb67f2
 ------------------------------------------------------------------
cb67f2
         Bra
cb67f2
-        [!-~]
cb67f2
+        [[:graph:]]
cb67f2
         Ket
cb67f2
         End
cb67f2
 ------------------------------------------------------------------
cb67f2
@@ -828,7 +828,7 @@ No match
cb67f2
 /[[:print:]]/WBZ
cb67f2
 ------------------------------------------------------------------
cb67f2
         Bra
cb67f2
-        [ -~]
cb67f2
+        [[:print:]]
cb67f2
         Ket
cb67f2
         End
cb67f2
 ------------------------------------------------------------------
cb67f2
@@ -836,7 +836,7 @@ No match
cb67f2
 /[[:punct:]]/WBZ
cb67f2
 ------------------------------------------------------------------
cb67f2
         Bra
cb67f2
-        [!-/:-@[-`{-~]
cb67f2
+        [[:punct:]]
cb67f2
         Ket
cb67f2
         End
cb67f2
 ------------------------------------------------------------------
cb67f2
@@ -1478,4 +1478,115 @@ Need char = 'c' (caseless)
cb67f2
     scat
cb67f2
  0: sc
cb67f2
 
cb67f2
+/\D+\X \d+\X \S+\X \s+\X \W+\X \w+\X \C+\X \R+\X \H+\X \h+\X \V+\X \v+\X a+\X \n+\X .+\X/BZx
cb67f2
+------------------------------------------------------------------
cb67f2
+        Bra
cb67f2
+        \D+
cb67f2
+        extuni
cb67f2
+        \d+
cb67f2
+        extuni
cb67f2
+        \S+
cb67f2
+        extuni
cb67f2
+        \s+
cb67f2
+        extuni
cb67f2
+        \W+
cb67f2
+        extuni
cb67f2
+        \w+
cb67f2
+        extuni
cb67f2
+        AllAny+
cb67f2
+        extuni
cb67f2
+        \R+
cb67f2
+        extuni
cb67f2
+        \H+
cb67f2
+        extuni
cb67f2
+        \h+
cb67f2
+        extuni
cb67f2
+        \V+
cb67f2
+        extuni
cb67f2
+        \v+
cb67f2
+        extuni
cb67f2
+        a+
cb67f2
+        extuni
cb67f2
+        \x0a+
cb67f2
+        extuni
cb67f2
+        Any+
cb67f2
+        extuni
cb67f2
+        Ket
cb67f2
+        End
cb67f2
+------------------------------------------------------------------
cb67f2
+
cb67f2
+/.+\X/BZxs
cb67f2
+------------------------------------------------------------------
cb67f2
+        Bra
cb67f2
+        AllAny+
cb67f2
+        extuni
cb67f2
+        Ket
cb67f2
+        End
cb67f2
+------------------------------------------------------------------
cb67f2
+
cb67f2
+/\X+$/BZxm
cb67f2
+------------------------------------------------------------------
cb67f2
+        Bra
cb67f2
+        extuni+
cb67f2
+     /m $
cb67f2
+        Ket
cb67f2
+        End
cb67f2
+------------------------------------------------------------------
cb67f2
+
cb67f2
+/\X+\D \X+\d \X+\S \X+\s \X+\W \X+\w \X+. \X+\C \X+\R \X+\H \X+\h \X+\V \X+\v \X+\X \X+\Z \X+\z \X+$/BZx
cb67f2
+------------------------------------------------------------------
cb67f2
+        Bra
cb67f2
+        extuni+
cb67f2
+        \D
cb67f2
+        extuni+
cb67f2
+        \d
cb67f2
+        extuni+
cb67f2
+        \S
cb67f2
+        extuni+
cb67f2
+        \s
cb67f2
+        extuni+
cb67f2
+        \W
cb67f2
+        extuni+
cb67f2
+        \w
cb67f2
+        extuni+
cb67f2
+        Any
cb67f2
+        extuni+
cb67f2
+        AllAny
cb67f2
+        extuni+
cb67f2
+        \R
cb67f2
+        extuni+
cb67f2
+        \H
cb67f2
+        extuni+
cb67f2
+        \h
cb67f2
+        extuni+
cb67f2
+        \V
cb67f2
+        extuni+
cb67f2
+        \v
cb67f2
+        extuni+
cb67f2
+        extuni
cb67f2
+        extuni+
cb67f2
+        \Z
cb67f2
+        extuni+
cb67f2
+        \z
cb67f2
+        extuni+
cb67f2
+        $
cb67f2
+        Ket
cb67f2
+        End
cb67f2
+------------------------------------------------------------------
cb67f2
+
cb67f2
+/\d+\s{0,5}=\s*\S?=\w{0,4}\W*/8WBZ
cb67f2
+------------------------------------------------------------------
cb67f2
+        Bra
cb67f2
+        prop Nd +
cb67f2
+        prop Xsp {0,5}
cb67f2
+        =
cb67f2
+        prop Xsp *
cb67f2
+        notprop Xsp ?
cb67f2
+        =
cb67f2
+        prop Xwd {0,4}
cb67f2
+        notprop Xwd *
cb67f2
+        Ket
cb67f2
+        End
cb67f2
+------------------------------------------------------------------
cb67f2
+
cb67f2
 /-- End of testinput7 --/
cb67f2
-- 
cb67f2
2.7.4
cb67f2