Blame SOURCES/gdb-rhbz1639242-fix-dwarf2_find_containing_comp_unit-binary-search.patch

ce65b8
From FEDORA_PATCHES Mon Sep 17 00:00:00 2001
ce65b8
From: Sergio Durigan Junior <sergiodj@redhat.com>
ce65b8
Date: Fri, 30 Nov 2018 15:20:27 -0500
ce65b8
Subject: 
ce65b8
 gdb-rhbz1639242-fix-dwarf2_find_containing_comp_unit-binary-search.patch
ce65b8
ce65b8
;; Fix for 'py-bt is broken, results in exception'.
ce65b8
;; RHBZ 1639242
ce65b8
ce65b8
Fix dwarf2read.c:dwarf2_find_containing_comp_unit's binary search
ce65b8
ce65b8
First of all, I would like to express my gratitude to Keith Seitz, Jan
ce65b8
Kratochvil and Tom Tromey, who were really kind and helped a lot with
ce65b8
this bug.  The patch itself was authored by Jan.
ce65b8
ce65b8
This all began with:
ce65b8
ce65b8
  https://bugzilla.redhat.com/show_bug.cgi?id=1639242
ce65b8
  py-bt is broken, results in exception
ce65b8
ce65b8
In summary, the error reported by the bug above is:
ce65b8
ce65b8
  $ gdb -args python3
ce65b8
  GNU gdb (GDB) Fedora 8.1.1-3.fc28
ce65b8
  (...)
ce65b8
  Reading symbols from python3...Reading symbols from /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug...done.
ce65b8
  done.
ce65b8
  Dwarf Error: could not find partial DIE containing offset 0x316 [in module /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug]
ce65b8
ce65b8
After a long investigation, and after thinking that the problem might
ce65b8
actually be on DWZ's side, we were able to determine that there's
ce65b8
something wrong going on when
ce65b8
dwarf2read.c:dwarf2_find_containing_comp_unit performs a binary search
ce65b8
over all of the CUs belonging to an objfile in order to find the CU
ce65b8
which contains a DIE at an specific offset.  The current algorithm is:
ce65b8
ce65b8
  static struct dwarf2_per_cu_data *
ce65b8
  dwarf2_find_containing_comp_unit (sect_offset sect_off,
ce65b8
				    unsigned int offset_in_dwz,
ce65b8
				    struct dwarf2_per_objfile *dwarf2_per_objfile)
ce65b8
  {
ce65b8
    struct dwarf2_per_cu_data *this_cu;
ce65b8
    int low, high;
ce65b8
    const sect_offset *cu_off;
ce65b8
ce65b8
    low = 0;
ce65b8
    high = dwarf2_per_objfile->all_comp_units.size () - 1;
ce65b8
    while (high > low)
ce65b8
      {
ce65b8
	struct dwarf2_per_cu_data *mid_cu;
ce65b8
	int mid = low + (high - low) / 2;
ce65b8
ce65b8
	mid_cu = dwarf2_per_objfile->all_comp_units[mid];
ce65b8
	cu_off = &mid_cu->sect_off;
ce65b8
	if (mid_cu->is_dwz > offset_in_dwz
ce65b8
	    || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off))
ce65b8
	  high = mid;
ce65b8
	else
ce65b8
	  low = mid + 1;
ce65b8
      }
ce65b8
ce65b8
For the sake of this example, let's consider that "sect_off =
ce65b8
0x7d".
ce65b8
ce65b8
There are a few important things going on here.  First,
ce65b8
"dwarf2_per_objfile->all_comp_units ()" will be sorted first by
ce65b8
whether the CU is a DWZ CU, and then by cu->sect_off.  In this
ce65b8
specific bug, "offset_in_dwz" is false, which means that, for the most
ce65b8
part of the loop, we're going to do "high = mid" (i.e, we'll work with
ce65b8
the lower part of the vector).
ce65b8
ce65b8
In our particular case, when we reach the part where "mid_cu->is_dwz
ce65b8
== offset_in_dwz" (i.e, both are false), we end up with "high = 2" and
ce65b8
"mid = 1".  I.e., there are only 2 elements in the vector who are not
ce65b8
DWZ.  The vector looks like this:
ce65b8
ce65b8
  #0: cu->sect_off = 0;   length = 114;  is_dwz = false  <-- low
ce65b8
  #1: cu->sect_off = 114; length = 7796; is_dwz = false  <-- mid
ce65b8
  #2: cu->sect_off = 0;   length = 28;   is_dwz = true   <-- high
ce65b8
  ...
ce65b8
ce65b8
The CU we want is #1, which is exactly where "mid" is.  Also, #1 is
ce65b8
not DWZ, which is also exactly what we want.  So we perform the second
ce65b8
comparison:
ce65b8
ce65b8
  (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off)
ce65b8
                                      ^^^^^^^^^^^^^^^^^^^
ce65b8
ce65b8
Because "*cu_off = 114" and "sect_off = 0x7d", this evaluates to
ce65b8
false, so we end up with "low = mid + 1 = 2", which actually gives us
ce65b8
the wrong CU (i.e., a CU that is DWZ).  Next in the code, GDB does:
ce65b8
ce65b8
    gdb_assert (low == high);
ce65b8
    this_cu = dwarf2_per_objfile->all_comp_units[low];
ce65b8
    cu_off = &this_cu->sect_off;
ce65b8
    if (this_cu->is_dwz != offset_in_dwz || *cu_off > sect_off)
ce65b8
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ce65b8
      {
ce65b8
	if (low == 0 || this_cu->is_dwz != offset_in_dwz)
ce65b8
	  error (_("Dwarf Error: could not find partial DIE containing "
ce65b8
		 "offset %s [in module %s]"),
ce65b8
		 sect_offset_str (sect_off),
ce65b8
		 bfd_get_filename (dwarf2_per_objfile->objfile->obfd));
ce65b8
	...
ce65b8
ce65b8
Triggering the error we saw in the original bug report.
ce65b8
ce65b8
It's important to notice that we see the error message because the
ce65b8
selected CU is a DWZ one, but we're looking for a non-DWZ CU here.
ce65b8
However, even when the selected CU is *not* a DWZ (and we don't see
ce65b8
any error message), we still end up with the wrong CU.  For example,
ce65b8
suppose that the vector had:
ce65b8
ce65b8
  #0: cu->sect_off = 0;    length = 114;  is_dwz = false
ce65b8
  #1: cu->sect_off = 114;  length = 7796; is_dwz = false
ce65b8
  #2: cu->sect_off = 7910; length = 28;   is_dwz = false
ce65b8
  ...
ce65b8
ce65b8
I.e., #2's "is_dwz" is false instead of true.  In this case, we still
ce65b8
want #1, because that's where the DIE is located.  After the loop ends
ce65b8
up in #2, we have "is_dwz" as false, which is what we wanted, so we
ce65b8
compare offsets.  In this case, "7910 >= 0x7d", so we set "mid = high
ce65b8
= 2".  Next iteration, we have "mid = 0 + (2 - 0) / 2 = 1", and thus
ce65b8
we examining #1.  "is_dwz" is still false, but "114 >= 0x7d" also
ce65b8
evaluates to false, so "low = mid + 1 = 2", which makes the loop stop.
ce65b8
Therefore, we end up choosing #2 as our CU, even though #1 is the
ce65b8
right one.
ce65b8
ce65b8
The problem here is happening because we're comparing "sect_off"
ce65b8
directly against "*cu_off", while we should actually be comparing
ce65b8
against "*cu_off + mid_cu->length" (i.e., the end offset):
ce65b8
ce65b8
  ...
ce65b8
  || (mid_cu->is_dwz == offset_in_dwz
ce65b8
      && *cu_off + mid_cu->length >= sect_off))
ce65b8
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ce65b8
  ...
ce65b8
ce65b8
And this is what the patch does.  The idea is that if GDB is searching
ce65b8
for an offset that falls above the *end* of the CU being
ce65b8
analyzed (i.e., "mid"), then the next iteration should try a
ce65b8
higher-offset CU next.  The previous algorithm was using
ce65b8
the *beginning* of the CU.
ce65b8
ce65b8
Unfortunately, I could not devise a testcase for this problem, so I am
ce65b8
proposing a fix with this huge explanation attached to it in the hope
ce65b8
that it is sufficient.  After talking a bit to Keith (our testcase
ce65b8
guru), it seems that one would have to create an objfile with both DWZ
ce65b8
and non-DWZ sections, which may prove very hard to do, I think.
ce65b8
ce65b8
I ran this patch on our BuildBot, and no regressions were detected.
ce65b8
ce65b8
gdb/ChangeLog:
ce65b8
2018-11-30  Jan Kratochvil  <jan.kratochvil@redhat.com>
ce65b8
	    Keith Seitz  <keiths@redhat.com>
ce65b8
	    Tom Tromey  <tom@tromey.com>
ce65b8
	    Sergio Durigan Junior  <sergiodj@redhat.com>
ce65b8
ce65b8
	https://bugzilla.redhat.com/show_bug.cgi?id=1613614
ce65b8
	* dwarf2read.c (dwarf2_find_containing_comp_unit): Add
ce65b8
	'mid_cu->length' to '*cu_off' when checking if 'sect_off' is
ce65b8
	inside the CU.
ce65b8
ce65b8
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
ce65b8
--- a/gdb/dwarf2read.c
ce65b8
+++ b/gdb/dwarf2read.c
ce65b8
@@ -25039,7 +25039,8 @@ dwarf2_find_containing_comp_unit (sect_offset sect_off,
ce65b8
       mid_cu = dwarf2_per_objfile->all_comp_units[mid];
ce65b8
       cu_off = &mid_cu->sect_off;
ce65b8
       if (mid_cu->is_dwz > offset_in_dwz
ce65b8
-	  || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off))
ce65b8
+	  || (mid_cu->is_dwz == offset_in_dwz
ce65b8
+	      && *cu_off + mid_cu->length >= sect_off))
ce65b8
 	high = mid;
ce65b8
       else
ce65b8
 	low = mid + 1;