From 89f69032d6a71f41b96ae6becbf3df4e2f9509a5 Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Sat, 27 Apr 2019 13:56:39 -0600
Subject: [PATCH] S_scan_const() Properly test if need to grow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

As we parse the input, creating a string constant, we may have to grow
the destination if it fills up as we go along.  It allocates space in an
SV and populates the string, but it doesn't update the SvCUR until the
end, so when single-stepping through the code in a debugger, the SV looks
empty until the end.  It turns out that as a result SvEND also doesn't
get updated and still points to the beginning of the string until SvCUR
is finally set.  That means that the test changed by this commit was
always succeeding, because it was using SvEND that didn't get updated,
so it would attempt to grow each time through the loop.  By moving a
couple of statements earlier, and using SvLEN instead, which does always
have the correct value, those extra growth attempts are avoided.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
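Note (not part of the commit): the effect described in the message can
be reproduced with a small stand-alone model. This is only a sketch
under stated assumptions. The toy_sv struct and the toy_end, toy_grow
and count_grow_attempts helpers are hypothetical stand-ins for the
SvPVX/SvCUR/SvLEN slots, SvEND and SvGROW; they are not perl's real
API, and the loop is a simplification of S_scan_const().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the string slots of an SV; NOT the real perl API. */
typedef struct {
    char  *pv;   /* start of the buffer       (think SvPVX) */
    size_t cur;  /* recorded string length    (think SvCUR) */
    size_t len;  /* allocated size in bytes   (think SvLEN) */
} toy_sv;

/* End of the recorded string, pv + cur (think SvEND). */
static char *toy_end(const toy_sv *sv) { return sv->pv + sv->cur; }

/* Make sure at least 'want' bytes are allocated (think SvGROW). */
static char *toy_grow(toy_sv *sv, size_t want) {
    if (want > sv->len) {
        char *p = realloc(sv->pv, want);
        if (!p) abort();
        sv->pv = p;
        sv->len = want;
    }
    return sv->pv;
}

/* Write 2 output bytes for each of n input bytes and count how often
 * the "need to grow" branch is entered.
 * use_len_test == 0: old test, compares against the stale end pointer.
 * use_len_test != 0: new test, compares against the allocated size. */
static size_t count_grow_attempts(size_t n, int use_len_test) {
    toy_sv sv = { NULL, 0, 0 };
    /* Generous initial buffer, so no growth is actually required. */
    char *d = toy_grow(&sv, 4 * n + 8);
    size_t attempts = 0;

    for (size_t i = 0; i < n; i++) {
        const size_t off = (size_t)(d - sv.pv);
        const size_t extra = 2 + (n - i - 1) + 1;
        /* cur is still 0 here, so toy_end() is the buffer start. */
        const int need = use_len_test ? (off + extra > sv.len)
                                      : (d + 2 >= toy_end(&sv));
        if (need) {
            attempts++;
            d = toy_grow(&sv, off + extra) + off;
        }
        assert(off + 2 <= sv.len);  /* the two writes below fit */
        *d++ = 'x';
        *d++ = 'y';
    }
    sv.cur = (size_t)(d - sv.pv);   /* length recorded only at the end */
    free(sv.pv);
    return attempts;
}
```

In this model, count_grow_attempts(10, 0) enters the grow branch on all
10 iterations, because the end pointer never moves off the start of the
buffer, while count_grow_attempts(10, 1) never enters it.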
 toke.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/toke.c b/toke.c
index 68eea0cae6..03c4f2ba26 100644
--- a/toke.c
+++ b/toke.c
@@ -4097,10 +4097,12 @@ S_scan_const(pTHX_ char *start)
             goto default_action; /* Redo, having upgraded so both are UTF-8 */
         }
         else {  /* UTF8ness matters: convert this non-UTF8 source char to
-                   UTF-8 for output.  It will occupy 2 bytes */
-            if (d + 2 >= SvEND(sv)) {
-                const STRLEN extra = 2 + (send - s - 1) + 1;
-		const STRLEN off = d - SvPVX_const(sv);
+                   UTF-8 for output.  It will occupy 2 bytes, but don't include
+                   the input byte since we haven't incremented 's' yet. See
+                   Note on sizing above. */
+            const STRLEN off = d - SvPVX(sv);
+            const STRLEN extra = 2 + (send - s - 1) + 1;
+            if (off + extra > SvLEN(sv)) {
 		d = off + SvGROW(sv, off + extra);
 	    }
             *d++ = UTF8_EIGHT_BIT_HI(*s);
-- 
2.20.1