Blame SOURCES/gdb-rhbz1639242-fix-dwarf2_find_containing_comp_unit-binary-search.patch

689258
From FEDORA_PATCHES Mon Sep 17 00:00:00 2001
689258
From: Sergio Durigan Junior <sergiodj@redhat.com>
689258
Date: Fri, 30 Nov 2018 15:20:27 -0500
689258
Subject: 
689258
 gdb-rhbz1639242-fix-dwarf2_find_containing_comp_unit-binary-search.patch
689258
689258
;; Fix for 'py-bt is broken, results in exception'.
689258
;; RHBZ 1639242
689258
689258
Fix dwarf2read.c:dwarf2_find_containing_comp_unit's binary search
689258
689258
First of all, I would like to express my gratitude to Keith Seitz, Jan
689258
Kratochvil and Tom Tromey, who were really kind and helped a lot with
689258
this bug.  The patch itself was authored by Jan.
689258
689258
This all began with:
689258
689258
  https://bugzilla.redhat.com/show_bug.cgi?id=1639242
689258
  py-bt is broken, results in exception
689258
689258
In summary, the error reported by the bug above is:
689258
689258
  $ gdb -args python3
689258
  GNU gdb (GDB) Fedora 8.1.1-3.fc28
689258
  (...)
689258
  Reading symbols from python3...Reading symbols from /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug...done.
689258
  done.
689258
  Dwarf Error: could not find partial DIE containing offset 0x316 [in module /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug]
689258
689258
After a long investigation, and after thinking that the problem might
689258
actually be on DWZ's side, we were able to determine that there's
689258
something wrong going on when
689258
dwarf2read.c:dwarf2_find_containing_comp_unit performs a binary search
689258
over all of the CUs belonging to an objfile in order to find the CU
689258
which contains a DIE at an specific offset.  The current algorithm is:
689258
689258
  static struct dwarf2_per_cu_data *
689258
  dwarf2_find_containing_comp_unit (sect_offset sect_off,
689258
				    unsigned int offset_in_dwz,
689258
				    struct dwarf2_per_objfile *dwarf2_per_objfile)
689258
  {
689258
    struct dwarf2_per_cu_data *this_cu;
689258
    int low, high;
689258
    const sect_offset *cu_off;
689258
689258
    low = 0;
689258
    high = dwarf2_per_objfile->all_comp_units.size () - 1;
689258
    while (high > low)
689258
      {
689258
	struct dwarf2_per_cu_data *mid_cu;
689258
	int mid = low + (high - low) / 2;
689258
689258
	mid_cu = dwarf2_per_objfile->all_comp_units[mid];
689258
	cu_off = &mid_cu->sect_off;
689258
	if (mid_cu->is_dwz > offset_in_dwz
689258
	    || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off))
689258
	  high = mid;
689258
	else
689258
	  low = mid + 1;
689258
      }
689258
689258
For the sake of this example, let's consider that "sect_off =
689258
0x7d".
689258
689258
There are a few important things going on here.  First,
689258
"dwarf2_per_objfile->all_comp_units ()" will be sorted first by
689258
whether the CU is a DWZ CU, and then by cu->sect_off.  In this
689258
specific bug, "offset_in_dwz" is false, which means that, for the most
689258
part of the loop, we're going to do "high = mid" (i.e, we'll work with
689258
the lower part of the vector).
689258
689258
In our particular case, when we reach the part where "mid_cu->is_dwz
689258
== offset_in_dwz" (i.e, both are false), we end up with "high = 2" and
689258
"mid = 1".  I.e., there are only 2 elements in the vector who are not
689258
DWZ.  The vector looks like this:
689258
689258
  #0: cu->sect_off = 0;   length = 114;  is_dwz = false  <-- low
689258
  #1: cu->sect_off = 114; length = 7796; is_dwz = false  <-- mid
689258
  #2: cu->sect_off = 0;   length = 28;   is_dwz = true   <-- high
689258
  ...
689258
689258
The CU we want is #1, which is exactly where "mid" is.  Also, #1 is
689258
not DWZ, which is also exactly what we want.  So we perform the second
689258
comparison:
689258
689258
  (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off)
689258
                                      ^^^^^^^^^^^^^^^^^^^
689258
689258
Because "*cu_off = 114" and "sect_off = 0x7d", this evaluates to
689258
false, so we end up with "low = mid + 1 = 2", which actually gives us
689258
the wrong CU (i.e., a CU that is DWZ).  Next in the code, GDB does:
689258
689258
    gdb_assert (low == high);
689258
    this_cu = dwarf2_per_objfile->all_comp_units[low];
689258
    cu_off = &this_cu->sect_off;
689258
    if (this_cu->is_dwz != offset_in_dwz || *cu_off > sect_off)
689258
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
689258
      {
689258
	if (low == 0 || this_cu->is_dwz != offset_in_dwz)
689258
	  error (_("Dwarf Error: could not find partial DIE containing "
689258
		 "offset %s [in module %s]"),
689258
		 sect_offset_str (sect_off),
689258
		 bfd_get_filename (dwarf2_per_objfile->objfile->obfd));
689258
	...
689258
689258
Triggering the error we saw in the original bug report.
689258
689258
It's important to notice that we see the error message because the
689258
selected CU is a DWZ one, but we're looking for a non-DWZ CU here.
689258
However, even when the selected CU is *not* a DWZ (and we don't see
689258
any error message), we still end up with the wrong CU.  For example,
689258
suppose that the vector had:
689258
689258
  #0: cu->sect_off = 0;    length = 114;  is_dwz = false
689258
  #1: cu->sect_off = 114;  length = 7796; is_dwz = false
689258
  #2: cu->sect_off = 7910; length = 28;   is_dwz = false
689258
  ...
689258
689258
I.e., #2's "is_dwz" is false instead of true.  In this case, we still
689258
want #1, because that's where the DIE is located.  After the loop ends
689258
up in #2, we have "is_dwz" as false, which is what we wanted, so we
689258
compare offsets.  In this case, "7910 >= 0x7d", so we set "mid = high
689258
= 2".  Next iteration, we have "mid = 0 + (2 - 0) / 2 = 1", and thus
689258
we examining #1.  "is_dwz" is still false, but "114 >= 0x7d" also
689258
evaluates to false, so "low = mid + 1 = 2", which makes the loop stop.
689258
Therefore, we end up choosing #2 as our CU, even though #1 is the
689258
right one.
689258
689258
The problem here is happening because we're comparing "sect_off"
689258
directly against "*cu_off", while we should actually be comparing
689258
against "*cu_off + mid_cu->length" (i.e., the end offset):
689258
689258
  ...
689258
  || (mid_cu->is_dwz == offset_in_dwz
689258
      && *cu_off + mid_cu->length >= sect_off))
689258
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
689258
  ...
689258
689258
And this is what the patch does.  The idea is that if GDB is searching
689258
for an offset that falls above the *end* of the CU being
689258
analyzed (i.e., "mid"), then the next iteration should try a
689258
higher-offset CU next.  The previous algorithm was using
689258
the *beginning* of the CU.
689258
689258
Unfortunately, I could not devise a testcase for this problem, so I am
689258
proposing a fix with this huge explanation attached to it in the hope
689258
that it is sufficient.  After talking a bit to Keith (our testcase
689258
guru), it seems that one would have to create an objfile with both DWZ
689258
and non-DWZ sections, which may prove very hard to do, I think.
689258
689258
I ran this patch on our BuildBot, and no regressions were detected.
689258
689258
gdb/ChangeLog:
689258
2018-11-30  Jan Kratochvil  <jan.kratochvil@redhat.com>
689258
	    Keith Seitz  <keiths@redhat.com>
689258
	    Tom Tromey  <tom@tromey.com>
689258
	    Sergio Durigan Junior  <sergiodj@redhat.com>
689258
689258
	https://bugzilla.redhat.com/show_bug.cgi?id=1613614
689258
	* dwarf2read.c (dwarf2_find_containing_comp_unit): Add
689258
	'mid_cu->length' to '*cu_off' when checking if 'sect_off' is
689258
	inside the CU.
689258
689258
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
689258
--- a/gdb/dwarf2read.c
689258
+++ b/gdb/dwarf2read.c
689258
@@ -25039,7 +25039,8 @@ dwarf2_find_containing_comp_unit (sect_offset sect_off,
689258
       mid_cu = dwarf2_per_objfile->all_comp_units[mid];
689258
       cu_off = &mid_cu->sect_off;
689258
       if (mid_cu->is_dwz > offset_in_dwz
689258
-	  || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off))
689258
+	  || (mid_cu->is_dwz == offset_in_dwz
689258
+	      && *cu_off + mid_cu->length >= sect_off))
689258
 	high = mid;
689258
       else
689258
 	low = mid + 1;