Blame SOURCES/perl-5.26.1-utf8.c-Don-t-dump-malformation-past-first-NUL.patch

243a19
From d72ba890c8d8ac800c9d00a1f542deca11551f33 Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Tue, 13 Feb 2018 07:03:43 -0700
Subject: utf8.c: Don't dump malformation past first NUL
When a UTF-8 string contains a malformation, the bytes are dumped out as
a debugging aid.  One should exercise caution, however, and not dump out
bytes that are actually past the end of the string.  Commit 99a765e9e37
from 2016 added the capability to signal to the dumping routines that
we're not sure where the string ends, and to dump the minimal possible.
It occurred to me that an additional safety measure can be easily added,
which this commit does.  And that is, in the dumping routines to stop at
the first NUL.  All strings automatically get a trailing NUL added, even
if they contain embedded NULs.  A NUL can never be part of a
malformation, and so its presence likely signals the end of the string.
---
 utf8.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
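
The NUL cutoff described in the commit message can be sketched outside of Perl's internals as follows. This is only an illustration under stated assumptions: `dump_len_until_nul` is a hypothetical helper, not a function in utf8.c, and it uses plain `unsigned char` and `size_t` where the patch uses `U8` and `STRLEN`. It scans from the offending byte up to the caller's maximum and returns how many bytes would be handed to the dumping routine, including the first NUL found but nothing after it. Unlike the patch, the sketch asserts that the start offset does not exceed the maximum length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of utf8.c): return how many bytes of `s`
 * may safely be dumped.  Scanning starts at s[start] and never goes past
 * s[max_len - 1].  If a NUL is found, the result includes that NUL and
 * stops there; otherwise the result is max_len. */
size_t
dump_len_until_nul(const unsigned char *s, size_t start, size_t max_len)
{
    size_t i;

    assert(s != NULL);
    assert(start <= max_len);   /* assumption made by this sketch only */

    for (i = start; i < max_len; i++) {
        if (s[i] == '\0') {
            return i + 1;       /* output this particular NUL, then stop */
        }
    }

    return max_len;             /* no NUL seen: dump the full allowed length */
}
```

With the bytes `E2 82 00 41 42`, a start offset of 1 and a maximum of 5, the helper returns 3, so the two bytes after the NUL are never dumped.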
diff --git a/utf8.c b/utf8.c
index a3d5f61b64..61346f0cb6 100644
--- a/utf8.c
+++ b/utf8.c
@@ -810,7 +810,7 @@ Perl__byte_dump_string(pTHX_ const U8 * s, const STRLEN len, const bool format)
 PERL_STATIC_INLINE char *
 S_unexpected_non_continuation_text(pTHX_ const U8 * const s,
 
-                                         /* How many bytes to print */
+                                         /* Max number of bytes to print */
                                          STRLEN print_len,
 
                                          /* Which one is the non-continuation */
@@ -826,6 +826,8 @@ S_unexpected_non_continuation_text(pTHX_ const U8 * const s,
                                ? "immediately"
                                : Perl_form(aTHX_ "%d bytes",
                                                  (int) non_cont_byte_pos);
+    const U8 * x = s + non_cont_byte_pos;
+    const U8 * e = s + print_len;
 
     PERL_ARGS_ASSERT_UNEXPECTED_NON_CONTINUATION_TEXT;
 
@@ -833,10 +835,20 @@ S_unexpected_non_continuation_text(pTHX_ const U8 * const s,
      * calculated, it's likely faster to pass it; verify under DEBUGGING */
     assert(expect_len == UTF8SKIP(s));
 
+    /* As a defensive coding measure, don't output anything past a NUL.  Such
+     * bytes shouldn't be in the middle of a malformation, and could mark the
+     * end of the allocated string, and what comes after is undefined */
+    for (; x < e; x++) {
+        if (*x == '\0') {
+            x++;            /* Output this particular NUL */
+            break;
+        }
+    }
+
     return Perl_form(aTHX_ "%s: %s (unexpected non-continuation byte 0x%02x,"
                            " %s after start byte 0x%02x; need %d bytes, got %d)",
                            malformed_text,
-                           _byte_dump_string(s, print_len, 0),
+                           _byte_dump_string(s, x - s, 0),
                            *(s + non_cont_byte_pos),
                            where,
                            *s,
-- 
2.11.0