From 23ec99d34770b95569257d3558cbb51ac2b64e20 Mon Sep 17 00:00:00 2001 From: CentOS Sources Date: Oct 30 2019 15:15:19 +0000 Subject: import libpfm-4.7.0-10.el7 --- diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..9f92e3a --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +SOURCES/libpfm-4.7.0.tar.gz diff --git a/.libpfm.metadata b/.libpfm.metadata new file mode 100644 index 0000000..9aa28c4 --- /dev/null +++ b/.libpfm.metadata @@ -0,0 +1 @@ +fdb0f2a485169d4afbaf898c729674664105b9ae SOURCES/libpfm-4.7.0.tar.gz diff --git a/SOURCES/libpfm-bdx_unc.patch b/SOURCES/libpfm-bdx_unc.patch new file mode 100644 index 0000000..61be118 --- /dev/null +++ b/SOURCES/libpfm-bdx_unc.patch @@ -0,0 +1,8909 @@ +commit 488227bf2128e8b80f9b7573869fe33fcbd63342 +Author: Stephane Eranian +Date: Fri Jun 2 12:09:31 2017 -0700 + + add Intel Broadwell server uncore PMUs support + + This patch adds Intel Broadwell Server (model 79, 86) uncore PMU + support. It adds the following PMUs: + + - IMC + - CBOX + - HA + - UBOX + - SBOX + - IRP + - PCU + - QPI + - R2PCIE + - R3QPI + + Based on Broadwell Server Uncore Performance Monitoring Reference Manual + available here: + http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-e7-v4-uncore-performance-monitoring.html + + Signed-off-by: Stephane Eranian + +diff --git a/docs/Makefile b/docs/Makefile +index f8f8838..45f3f16 100644 +--- a/docs/Makefile ++++ b/docs/Makefile +@@ -81,7 +81,18 @@ ARCH_MAN=libpfm_intel_core.3 \ + libpfm_intel_hswep_unc_r2pcie.3 \ + libpfm_intel_hswep_unc_r3qpi.3 \ + libpfm_intel_hswep_unc_sbo.3 \ +- libpfm_intel_hswep_unc_ubo.3 ++ libpfm_intel_hswep_unc_ubo.3 \ ++ libpfm_intel_bdx_unc_cbo.3 \ ++ libpfm_intel_bdx_unc_ha.3 \ ++ libpfm_intel_bdx_unc_imc.3 \ ++ libpfm_intel_bdx_unc_irp.3 \ ++ libpfm_intel_bdx_unc_pcu.3 \ ++ libpfm_intel_bdx_unc_qpi.3 \ ++ libpfm_intel_bdx_unc_r2pcie.3 \ ++ libpfm_intel_bdx_unc_r3qpi.3 \ ++ libpfm_intel_bdx_unc_sbo.3 \ ++ libpfm_intel_bdx_unc_ubo.3 ++ + + ifeq 
($(CONFIG_PFMLIB_ARCH_I386),y) + ARCH_MAN += libpfm_intel_p6.3 libpfm_intel_coreduo.3 +diff --git a/docs/man3/libpfm_intel_bdx_unc_cbo.3 b/docs/man3/libpfm_intel_bdx_unc_cbo.3 +new file mode 100644 +index 0000000..668226b +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_cbo.3 +@@ -0,0 +1,79 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_cbo - support for Intel Broadwell Server C-Box uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_cbo[0-21] ++.B PMU desc: Intel Broadwell Server C-Box uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server C-Box (coherency engine) uncore PMU. ++This PMU model exists on various Broadwell server models (79, 86). There is one C-Box ++PMU per physical core. Therefore there are up to twenty-two identical C-Box PMU instances ++numbered from 0 to 21. On dual-socket systems, the number refers to the C-Box ++PMU on the socket where the program runs. For instance, if running on CPU18, then ++bdx_unc_cbo0 refers to the C-Box for physical core 0 on socket 1. Conversely, ++if running on CPU0, then the same bdx_unc_cbo0 refers to the C-Box for physical ++core 0 but on socket 0. ++ ++Each C-Box PMU implements four generic counters and two filter registers used only ++with certain events and umasks. ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell C-Box uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of C-Box cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:255]. 
++.TP ++.B nf ++Node filter. Certain events, such as UNC_C_LLC_LOOKUP and UNC_C_LLC_VICTIMS, provide a \fBNID\fR umask. ++Sometimes the \fBNID\fR is combined with other filtering capabilities, such as opcodes. ++The node filter is an 8-bit bitmask. A node corresponds to a processor ++socket. The legal values therefore depend on the underlying hardware configuration. For ++dual-socket systems, the bitmask has two valid bits [0:1]. ++.TP ++.B cf ++Core Filter. This is a 5-bit filter which is used to filter based on physical core origin ++of the C-Box request. Possible values are 0-63. If the filter is not specified, then no ++filtering takes place. Bits 0-3 indicate the physical core id and bit 4 filters on non ++thread-related data. ++.TP ++.B tf ++Thread Filter. This is a 1-bit filter which is used to filter C-Box requests based on logical ++processor (hyper-thread) identification. Possible values are 0-1. If the filter is not ++specified, then no filtering takes place. ++.TP ++.B nc ++Non-Coherent. This is a 1-bit filter which is used to filter C-Box requests only for the ++TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not ++specified, then no filtering takes place. ++.TP ++.B isoc ++Isochronous. This is a 1-bit filter which is used to filter C-Box requests only for the ++TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not ++specified, then no filtering takes place. ++ ++.SH Opcode filtering ++ ++Certain events, such as UNC_C_TOR_INSERTS, support opcode matching on the C-Box transaction ++type. To use this feature, first an opcode matching umask must be selected, e.g., MISS_OPCODE. ++Second, the opcode to match on must be selected via a second umask among the OPC_* umasks. ++For instance, UNC_C_TOR_INSERTS:OPCODE:OPC_RFO counts the number of TOR insertions for RFO ++transactions. ++ ++Opcode matching may be combined with node filtering with certain umasks. 
In general, the ++filtering support is encoded into the umask name, e.g., NID_OPCODE supports both ++node and opcode filtering. For instance, UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:nf=1. ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_ha.3 b/docs/man3/libpfm_intel_bdx_unc_ha.3 +new file mode 100644 +index 0000000..b1c0eb2 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_ha.3 +@@ -0,0 +1,35 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_ha - support for Intel Broadwell Server Home Agent (HA) uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_ha0, bdx_unc_ha1 ++.B PMU desc: Intel Broadwell Server HA uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server Home Agent (HA) uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server HA uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of HA cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:255]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. 
This is a boolean modifier ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_imc.3 b/docs/man3/libpfm_intel_bdx_unc_imc.3 +new file mode 100644 +index 0000000..2baa153 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_imc.3 +@@ -0,0 +1,35 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_imc - support for Intel Broadwell Server Integrated Memory Controller (IMC) uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_imc[0-7] ++.B PMU desc: Intel Broadwell Server IMC uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server Integrated Memory Controller (IMC) uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server IMC uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of IMC cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:255]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. 
This is a boolean modifier ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_irp.3 b/docs/man3/libpfm_intel_bdx_unc_irp.3 +new file mode 100644 +index 0000000..d3902c2 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_irp.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_irp - support for Intel Broadwell Server IRP uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_irp ++.B PMU desc: Intel Broadwell Server IRP uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server IRP (IIO coherency) uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server IRP uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:255]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. 
This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_pcu.3 b/docs/man3/libpfm_intel_bdx_unc_pcu.3 +new file mode 100644 +index 0000000..eb7565a +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_pcu.3 +@@ -0,0 +1,50 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_pcu - support for Intel Broadwell Server Power Controller Unit (PCU) uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_pcu ++.B PMU desc: Intel Broadwell Server PCU uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server Power Controller Unit uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server PCU uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of PCU cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:15]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++.TP ++.B ff ++Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3]. ++The modifier expects an integer in the range [0-255]. 
The value is interpreted as a frequency value to be ++multiplied by 100MHz. Thus, if the value is 32, all cycles where the processor is running at 3.2GHz or higher are ++counted. ++ ++.SH Frequency band filtering ++ ++There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, ++UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES. The frequency filter (available via the ff modifier) is stored into ++a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the ++encoding for each event individually because it processes events one at a time. The caller or the underlying kernel ++interface may have to merge the band filter settings to program the filter register properly. ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_qpi.3 b/docs/man3/libpfm_intel_bdx_unc_qpi.3 +new file mode 100644 +index 0000000..ef3f318 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_qpi.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_qpi - support for Intel Broadwell Server QPI uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_qpi0, bdx_unc_qpi1 ++.B PMU desc: Intel Broadwell Server QPI uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell Server QPI uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Broadwell server QPI uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. 
When set to a non-zero value, the counter counts the number ++of QPI cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:255]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_r2pcie.3 b/docs/man3/libpfm_intel_bdx_unc_r2pcie.3 +new file mode 100644 +index 0000000..fe8bac5 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_r2pcie.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_r2pcie - support for Intel Broadwell Server R2 PCIe uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_r2pcie ++.B PMU desc: Intel Broadwell Server R2 PCIe uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell server R2 PCIe uncore PMU. ++This PMU model only exists on Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server R2PCIe uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of R2PCIe cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:15]. 
++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_r3qpi.3 b/docs/man3/libpfm_intel_bdx_unc_r3qpi.3 +new file mode 100644 +index 0000000..6fff0b2 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_r3qpi.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_r3qpi - support for Intel Broadwell Server R3QPI uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_r3qpi[0-2] ++.B PMU desc: Intel Broadwell server R3QPI uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell server R3QPI uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server R3QPI uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of R3QPI cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:15]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. 
When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_sbo.3 b/docs/man3/libpfm_intel_bdx_unc_sbo.3 +new file mode 100644 +index 0000000..262e553 +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_sbo.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_sbo - support for Intel Broadwell Server S-Box uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_sbo[0-3] ++.B PMU desc: Intel Broadwell Server S-Box uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell server Ring Transfer unit (S-Box) uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server S-Box uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of S-Box cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:15]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/docs/man3/libpfm_intel_bdx_unc_ubo.3 b/docs/man3/libpfm_intel_bdx_unc_ubo.3 +new file mode 100644 +index 0000000..d8fc1ca +--- /dev/null ++++ b/docs/man3/libpfm_intel_bdx_unc_ubo.3 +@@ -0,0 +1,36 @@ ++.TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_bdx_unc_ubo - support for Intel Broadwell Server U-Box uncore PMU ++.SH SYNOPSIS ++.nf ++.B #include <perfmon/pfmlib.h> ++.sp ++.B PMU name: bdx_unc_ubo ++.B PMU desc: Intel Broadwell Server U-Box uncore PMU ++.sp ++.SH DESCRIPTION ++The library supports the Intel Broadwell server system configuration unit (U-Box) uncore PMU. ++This PMU model only exists on various Broadwell server models (79, 86). ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Broadwell server U-Box uncore PMU: ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. ++.TP ++.B t ++Set the threshold value. When set to a non-zero value, the counter counts the number ++of U-Box cycles in which the number of occurrences of the event is greater or equal to ++the threshold. This is an integer modifier with values in the range [0:15]. ++.TP ++.B i ++Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less ++than N occurrences occur per cycle if threshold is set to N. 
When invert is set, then threshold ++must be set to non-zero value. If set, the event counts when the event transitions from occurring ++to not occurring (falling edge) when edge detection is set. This is a boolean modifier ++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index 89ab973..61a7d90 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -369,8 +369,58 @@ typedef enum { + PFM_PMU_INTEL_KNL_UNC_M2PCIE, /* Intel KnightLanding M2PCIe uncore */ + + PFM_PMU_POWER9, /* IBM POWER9 */ ++ ++ PFM_PMU_INTEL_BDX_UNC_CB0, /* Intel Broadwell-X C-Box core 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB1, /* Intel Broadwell-X C-Box core 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB2, /* Intel Broadwell-X C-Box core 2 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB3, /* Intel Broadwell-X C-Box core 3 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB4, /* Intel Broadwell-X C-Box core 4 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB5, /* Intel Broadwell-X C-Box core 5 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB6, /* Intel Broadwell-X C-Box core 6 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB7, /* Intel Broadwell-X C-Box core 7 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB8, /* Intel Broadwell-X C-Box core 8 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB9, /* Intel Broadwell-X C-Box core 9 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB10, /* Intel Broadwell-X C-Box core 10 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB11, /* Intel Broadwell-X C-Box core 11 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB12, /* Intel Broadwell-X C-Box core 12 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB13, /* Intel Broadwell-X C-Box core 13 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB14, /* Intel Broadwell-X C-Box core 14 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB15, /* Intel Broadwell-X C-Box core 15 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB16, /* Intel Broadwell-X C-Box core 16 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB17, /* Intel Broadwell-X C-Box core 17 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB18, /* Intel Broadwell-X C-Box core 
18 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB19, /* Intel Broadwell-X C-Box core 19 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB20, /* Intel Broadwell-X C-Box core 20 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB21, /* Intel Broadwell-X C-Box core 21 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB22, /* Intel Broadwell-X C-Box core 22 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_CB23, /* Intel Broadwell-X C-Box core 23 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_HA0, /* Intel Broadwell-X HA 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_HA1, /* Intel Broadwell-X HA 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC0, /* Intel Broadwell-X IMC socket 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC1, /* Intel Broadwell-X IMC socket 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC2, /* Intel Broadwell-X IMC socket 2 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC3, /* Intel Broadwell-X IMC socket 3 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC4, /* Intel Broadwell-X IMC socket 4 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC5, /* Intel Broadwell-X IMC socket 5 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC6, /* Intel Broadwell-X IMC socket 6 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IMC7, /* Intel Broadwell-X IMC socket 7 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_PCU, /* Intel Broadwell-X PCU uncore */ ++ PFM_PMU_INTEL_BDX_UNC_QPI0, /* Intel Broadwell-X QPI link 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_QPI1, /* Intel Broadwell-X QPI link 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_QPI2, /* Intel Broadwell-X QPI link 2 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_UBOX, /* Intel Broadwell-X U-Box uncore */ ++ PFM_PMU_INTEL_BDX_UNC_R2PCIE, /* Intel Broadwell-X R2PCIe uncore */ ++ PFM_PMU_INTEL_BDX_UNC_R3QPI0, /* Intel Broadwell-X R3QPI 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_R3QPI1, /* Intel Broadwell-X R3QPI 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_R3QPI2, /* Intel Broadwell-X R3QPI 2 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_IRP, /* Intel Broadwell-X IRP uncore */ ++ PFM_PMU_INTEL_BDX_UNC_SB0, /* Intel Broadwell-X S-Box 0 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_SB1, /* Intel Broadwell-X S-Box 1 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_SB2, /* 
Intel Broadwell-X S-Box 2 uncore */ ++ PFM_PMU_INTEL_BDX_UNC_SB3, /* Intel Broadwell-X S-Box 3 uncore */ + /* MUST ADD NEW PMU MODELS HERE */ + ++ + PFM_PMU_MAX /* end marker */ + } pfm_pmu_t; + +diff --git a/lib/Makefile b/lib/Makefile +index f532561..aa21ccb 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -91,6 +91,16 @@ SRCS += pfmlib_amd64.c pfmlib_intel_core.c pfmlib_intel_x86.c \ + pfmlib_intel_hswep_unc_r3qpi.c \ + pfmlib_intel_hswep_unc_irp.c \ + pfmlib_intel_hswep_unc_sbo.c \ ++ pfmlib_intel_bdx_unc_cbo.c \ ++ pfmlib_intel_bdx_unc_ubo.c \ ++ pfmlib_intel_bdx_unc_sbo.c \ ++ pfmlib_intel_bdx_unc_ha.c \ ++ pfmlib_intel_bdx_unc_imc.c \ ++ pfmlib_intel_bdx_unc_irp.c \ ++ pfmlib_intel_bdx_unc_pcu.c \ ++ pfmlib_intel_bdx_unc_qpi.c \ ++ pfmlib_intel_bdx_unc_r2pcie.c \ ++ pfmlib_intel_bdx_unc_r3qpi.c \ + pfmlib_intel_knc.c \ + pfmlib_intel_slm.c \ + pfmlib_intel_knl.c \ +@@ -275,6 +285,16 @@ INC_X86= pfmlib_intel_x86_priv.h \ + events/intel_hswep_unc_r2pcie_events.h \ + events/intel_hswep_unc_r3qpi_events.h \ + events/intel_hswep_unc_irp_events.h \ ++ events/intel_bdx_unc_cbo_events.h \ ++ events/intel_bdx_unc_ubo_events.h \ ++ events/intel_bdx_unc_sbo_events.h \ ++ events/intel_bdx_unc_ha_events.h \ ++ events/intel_bdx_unc_imc_events.h \ ++ events/intel_bdx_unc_irp_events.h \ ++ events/intel_bdx_unc_pcu_events.h \ ++ events/intel_bdx_unc_qpi_events.h \ ++ events/intel_bdx_unc_r2pcie_events.h \ ++ events/intel_bdx_unc_r3qpi_events.h \ + events/intel_knl_unc_imc_events.h \ + events/intel_knl_unc_edc_events.h \ + events/intel_knl_unc_cha_events.h \ +diff --git a/lib/events/intel_bdx_unc_cbo_events.h b/lib/events/intel_bdx_unc_cbo_events.h +new file mode 100644 +index 0000000..7aa362c +--- /dev/null ++++ b/lib/events/intel_bdx_unc_cbo_events.h +@@ -0,0 +1,1167 @@ ++/* ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_cbo ++ */ ++ ++#define CBO_FILT_MESIF(a, b, c, d) \ ++ { .uname = "STATE_"#a,\ ++ .udesc = #b" cacheline state",\ ++ .ufilters[0] = 1ULL << (17 + (c)),\ ++ .grpid = d, \ ++ } ++ ++#define CBO_FILT_MESIFS(d) \ ++ CBO_FILT_MESIF(I, Invalid, 0, d), \ ++ CBO_FILT_MESIF(S, Shared, 1, d), \ ++ CBO_FILT_MESIF(E, Exclusive, 2, d), \ ++ CBO_FILT_MESIF(M, Modified, 3, d), \ ++ CBO_FILT_MESIF(F, Forward, 4, d), \ ++ CBO_FILT_MESIF(D, Debug, 5, d), \ ++ { .uname = "STATE_MP",\ ++ .udesc = "Cacheline is modified but never written, was forwarded in modified state",\ ++ .ufilters[0] = 0x1ULL << (17+6),\ ++ .grpid = d, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ }, \ ++ { .uname = "STATE_MESIFD",\ ++ .udesc = "Any cache line state",\ ++ .ufilters[0] = 0x7fULL << 17,\ ++ .grpid = d, \ ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ ++ } ++ ++#define CBO_FILT_OPC(d) \ ++ { .uname = "OPC_RFO",\ ++ .udesc = "Demand data RFO (combine with any OPCODE umask)",\ ++ .ufilters[1] = 0x180ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_CRD",\ ++ .udesc = "Demand code read (combine with any OPCODE umask)",\ ++ .ufilters[1] = 0x181ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_DRD",\ ++ .udesc = "Demand data read (combine with any OPCODE umask)",\ ++ .ufilters[1] = 0x182ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PRD",\ ++ .udesc = "Partial reads (UC) (combine with any OPCODE umask)",\ ++ .ufilters[1] = 0x187ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_WCILF",\ ++ .udesc = "Full Stream store (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x18cULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_WCIL",\ ++ .udesc = "Partial Stream store (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x18dULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, 
\ ++ { .uname = "OPC_WIL",\ ++ .udesc = "Write Invalidate Line (Partial) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x18fULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PF_RFO",\ ++ .udesc = "Prefetch RFO into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x190ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PF_CODE",\ ++ .udesc = "Prefetch code into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x191ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PF_DATA",\ ++ .udesc = "Prefetch data into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x192ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCIWIL",\ ++ .udesc = "PCIe write (partial, non-allocating) - partial line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x193ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCIWIF",\ ++ .udesc = "PCIe write (full, non-allocating) - full line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. 
(combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x194ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCIITOM",\ ++ .udesc = "PCIe write (allocating) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x19cULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCIRDCUR",\ ++ .udesc = "PCIe read current (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x19eULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_WBMTOI",\ ++ .udesc = "Request writeback modified invalidate line (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1c4ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_WBMTOE",\ ++ .udesc = "Request writeback modified set to exclusive (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1c5ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_ITOM",\ ++ .udesc = "Request invalidate line. 
Request exclusive ownership of the line (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1c8ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCINSRD",\ ++ .udesc = "PCIe non-snoop read (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1e4ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCINSWR",\ ++ .udesc = "PCIe non-snoop write (partial) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1e5ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ }, \ ++ { .uname = "OPC_PCINSWRF",\ ++ .udesc = "PCIe non-snoop write (full) (combine with any OPCODE umask)", \ ++ .ufilters[1] = 0x1e6ULL << 20, \ ++ .uflags = INTEL_X86_NCOMBO, \ ++ .grpid = d, \ ++ } ++ ++ ++static intel_x86_umask_t bdx_unc_c_llc_lookup[]={ ++ { .uname = "ANY", ++ .ucode = 0x1100, ++ .udesc = "Cache Lookups -- Any Request", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ }, ++ { .uname = "DATA_READ", ++ .ucode = 0x300, ++ .udesc = "Cache Lookups -- Data Read Request", ++ .grpid = 0, ++ }, ++ { .uname = "NID", ++ .ucode = 0x4100, ++ .udesc = "Cache Lookups -- Lookups that Match NID", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .grpid = 1, ++ .uflags = INTEL_X86_GRP_DFL_NONE ++ }, ++ { .uname = "READ", ++ .ucode = 0x2100, ++ .udesc = "Cache Lookups -- Any Read Request", ++ .grpid = 0, ++ }, ++ { .uname = "REMOTE_SNOOP", ++ .ucode = 0x900, ++ .udesc = "Cache Lookups -- External Snoop Request", ++ .grpid = 0, ++ }, ++ { .uname = "WRITE", ++ .ucode = 0x500, ++ .udesc = "Cache Lookups -- Write Requests", ++ .grpid = 0, ++ }, ++ CBO_FILT_MESIFS(2), ++}; ++ ++static intel_x86_umask_t bdx_unc_c_llc_victims[]={ ++ { .uname = "F_STATE", ++ .ucode = 0x800, ++ .udesc = "Lines in Forward state", ++ .grpid = 0, ++ }, ++ { .uname = "I_STATE", ++ .ucode = 0x400, ++ .udesc = "Lines in I state", ++ .grpid = 0, ++ }, ++ { .uname = "S_STATE", ++ .ucode = 0x400, ++ .udesc = "Lines in S state", ++ 
.grpid = 0, ++ }, ++ { .uname = "E_STATE", ++ .ucode = 0x200, ++ .udesc = "Lines in E state", ++ .grpid = 0, ++ }, ++ { .uname = "M_STATE", ++ .ucode = 0x100, ++ .udesc = "Lines in M state", ++ .grpid = 0, ++ }, ++ { .uname = "MISS", ++ .ucode = 0x1000, ++ .udesc = "Lines Victimized", ++ .grpid = 0, ++ }, ++ { .uname = "NID", ++ .ucode = 0x4000, ++ .udesc = "Lines Victimized -- Victimized Lines that Match NID", ++ .uflags = INTEL_X86_GRP_DFL_NONE, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .grpid = 1, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_misc[]={ ++ { .uname = "CVZERO_PREFETCH_MISS", ++ .ucode = 0x2000, ++ .udesc = "Cbo Misc -- DRd hitting non-M with raw CV=0", ++ }, ++ { .uname = "CVZERO_PREFETCH_VICTIM", ++ .ucode = 0x1000, ++ .udesc = "Cbo Misc -- Clean Victim with raw CV=0", ++ }, ++ { .uname = "RFO_HIT_S", ++ .ucode = 0x800, ++ .udesc = "Cbo Misc -- RFO HitS", ++ }, ++ { .uname = "RSPI_WAS_FSE", ++ .ucode = 0x100, ++ .udesc = "Cbo Misc -- Silent Snoop Eviction", ++ }, ++ { .uname = "STARTED", ++ .ucode = 0x400, ++ .udesc = "Cbo Misc -- ", ++ }, ++ { .uname = "WC_ALIASING", ++ .ucode = 0x200, ++ .udesc = "Cbo Misc -- Write Combining Aliasing", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_ring_ad_used[]={ ++ { .uname = "ALL", ++ .ucode = 0xf00, ++ .udesc = "AD Ring In Use -- All", ++ }, ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "AD Ring In Use -- Down", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "AD Ring In Use -- Up", ++ }, ++ { .uname = "DOWN_EVEN", ++ .ucode = 0x400, ++ .udesc = "AD Ring In Use -- Down and Even", ++ }, ++ { .uname = "DOWN_ODD", ++ .ucode = 0x800, ++ .udesc = "AD Ring In Use -- Down and Odd", ++ }, ++ { .uname = "UP_EVEN", ++ .ucode = 0x100, ++ .udesc = "AD Ring In Use -- Up and Even", ++ }, ++ { .uname = "UP_ODD", ++ .ucode = 0x200, ++ .udesc = "AD Ring In Use -- Up and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_ring_ak_used[]={ ++ { .uname = "ALL", ++ .ucode = 0xf00, ++ .udesc = "AK 
Ring In Use -- All", ++ }, ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "AK Ring In Use -- Down", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "AK Ring In Use -- Up", ++ }, ++ { .uname = "DOWN_EVEN", ++ .ucode = 0x400, ++ .udesc = "AK Ring In Use -- Down and Even", ++ }, ++ { .uname = "DOWN_ODD", ++ .ucode = 0x800, ++ .udesc = "AK Ring In Use -- Down and Odd", ++ }, ++ { .uname = "UP_EVEN", ++ .ucode = 0x100, ++ .udesc = "AK Ring In Use -- Up and Even", ++ }, ++ { .uname = "UP_ODD", ++ .ucode = 0x200, ++ .udesc = "AK Ring In Use -- Up and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_ring_bl_used[]={ ++ { .uname = "ALL", ++ .ucode = 0xf00, ++ .udesc = "BL Ring in Use -- All", ++ }, ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "BL Ring in Use -- Down", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "BL Ring in Use -- Up", ++ }, ++ { .uname = "DOWN_EVEN", ++ .ucode = 0x400, ++ .udesc = "BL Ring in Use -- Down and Even", ++ }, ++ { .uname = "DOWN_ODD", ++ .ucode = 0x800, ++ .udesc = "BL Ring in Use -- Down and Odd", ++ }, ++ { .uname = "UP_EVEN", ++ .ucode = 0x100, ++ .udesc = "BL Ring in Use -- Up and Even", ++ }, ++ { .uname = "UP_ODD", ++ .ucode = 0x200, ++ .udesc = "BL Ring in Use -- Up and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_ring_bounces[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- AD", ++ }, ++ { .uname = "AK", ++ .ucode = 0x200, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- AK", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- BL", ++ }, ++ { .uname = "IV", ++ .ucode = 0x1000, ++ .udesc = "Number of LLC responses that bounced on the Ring. 
-- Snoops of processor's cache.", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_ring_iv_used[]={ ++ { .uname = "ANY", ++ .ucode = 0xf00, ++ .udesc = "IV Ring in Use -- Any", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "DN", ++ .ucode = 0xc00, ++ .udesc = "IV Ring in Use -- Down", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DOWN", ++ .ucode = 0xcc00, ++ .udesc = "IV Ring in Use -- Down", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "UP", ++ .ucode = 0x300, ++ .udesc = "IV Ring in Use -- Up", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_ext_starved[]={ ++ { .uname = "IPQ", ++ .ucode = 0x200, ++ .udesc = "Ingress Arbiter Blocking Cycles -- IPQ", ++ }, ++ { .uname = "IRQ", ++ .ucode = 0x100, ++ .udesc = "Ingress Arbiter Blocking Cycles -- IRQ", ++ }, ++ { .uname = "ISMQ_BIDS", ++ .ucode = 0x800, ++ .udesc = "Ingress Arbiter Blocking Cycles -- ISMQ_BID", ++ }, ++ { .uname = "PRQ", ++ .ucode = 0x400, ++ .udesc = "Ingress Arbiter Blocking Cycles -- PRQ", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_inserts[]={ ++ { .uname = "IPQ", ++ .ucode = 0x400, ++ .udesc = "Ingress Allocations -- IPQ", ++ }, ++ { .uname = "IRQ", ++ .ucode = 0x100, ++ .udesc = "Ingress Allocations -- IRQ", ++ }, ++ { .uname = "IRQ_REJ", ++ .ucode = 0x200, ++ .udesc = "Ingress Allocations -- IRQ Rejected", ++ }, ++ { .uname = "PRQ", ++ .ucode = 0x1000, ++ .udesc = "Ingress Allocations -- PRQ", ++ }, ++ { .uname = "PRQ_REJ", ++ .ucode = 0x2000, ++ .udesc = "Ingress Allocations -- PRQ Rejected", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_ipq_retry[]={ ++ { .uname = "ADDR_CONFLICT", ++ .ucode = 0x400, ++ .udesc = "Probe Queue Retries -- Address Conflict", ++ }, ++ { .uname = "ANY", ++ .ucode = 0x100, ++ .udesc = "Probe Queue Retries -- Any Reject", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "FULL", ++ .ucode = 0x200, ++ .udesc = "Probe Queue Retries -- No Egress Credits", ++ }, ++ { .uname = "QPI_CREDITS", 
++ .ucode = 0x1000, ++ .udesc = "Probe Queue Retries -- No QPI Credits", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_ipq_retry2[]={ ++ { .uname = "AD_SBO", ++ .ucode = 0x100, ++ .udesc = "Probe Queue Retries -- No AD Sbo Credits", ++ }, ++ { .uname = "TARGET", ++ .ucode = 0x4000, ++ .udesc = "Probe Queue Retries -- Target Node Filter", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_irq_retry[]={ ++ { .uname = "ADDR_CONFLICT", ++ .ucode = 0x400, ++ .udesc = "Ingress Request Queue Rejects -- Address Conflict", ++ }, ++ { .uname = "ANY", ++ .ucode = 0x100, ++ .udesc = "Ingress Request Queue Rejects -- Any Reject", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "FULL", ++ .ucode = 0x200, ++ .udesc = "Ingress Request Queue Rejects -- No Egress Credits", ++ }, ++ { .uname = "IIO_CREDITS", ++ .ucode = 0x2000, ++ .udesc = "Ingress Request Queue Rejects -- No IIO Credits", ++ }, ++ { .uname = "NID", ++ .ucode = 0x4000, ++ .udesc = "Ingress Request Queue Rejects -- ", ++ }, ++ { .uname = "QPI_CREDITS", ++ .ucode = 0x1000, ++ .udesc = "Ingress Request Queue Rejects -- No QPI Credits", ++ }, ++ { .uname = "RTID", ++ .ucode = 0x800, ++ .udesc = "Ingress Request Queue Rejects -- No RTIDs", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_irq_retry2[]={ ++ { .uname = "AD_SBO", ++ .ucode = 0x100, ++ .udesc = "Ingress Request Queue Rejects -- No AD Sbo Credits", ++ }, ++ { .uname = "BL_SBO", ++ .ucode = 0x200, ++ .udesc = "Ingress Request Queue Rejects -- No BL Sbo Credits", ++ }, ++ { .uname = "TARGET", ++ .ucode = 0x4000, ++ .udesc = "Ingress Request Queue Rejects -- Target Node Filter", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_ismq_retry[]={ ++ { .uname = "ANY", ++ .ucode = 0x100, ++ .udesc = "ISMQ Retries -- Any Reject", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "FULL", ++ .ucode = 0x200, ++ .udesc = "ISMQ Retries -- No Egress Credits", ++ }, ++ { .uname = "IIO_CREDITS", ++ .ucode = 0x2000, ++ .udesc = "ISMQ Retries -- No IIO 
Credits", ++ }, ++ { .uname = "NID", ++ .ucode = 0x4000, ++ .udesc = "ISMQ Retries -- ", ++ }, ++ { .uname = "QPI_CREDITS", ++ .ucode = 0x1000, ++ .udesc = "ISMQ Retries -- No QPI Credits", ++ }, ++ { .uname = "RTID", ++ .ucode = 0x800, ++ .udesc = "ISMQ Retries -- No RTIDs", ++ }, ++ { .uname = "WB_CREDITS", ++ .ucode = 0x8000, ++ .udesc = "ISMQ Retries -- ", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_ismq_retry2[]={ ++ { .uname = "AD_SBO", ++ .ucode = 0x100, ++ .udesc = "ISMQ Request Queue Rejects -- No AD Sbo Credits", ++ }, ++ { .uname = "BL_SBO", ++ .ucode = 0x200, ++ .udesc = "ISMQ Request Queue Rejects -- No BL Sbo Credits", ++ }, ++ { .uname = "TARGET", ++ .ucode = 0x4000, ++ .udesc = "ISMQ Request Queue Rejects -- Target Node Filter", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_rxr_occupancy[]={ ++ { .uname = "IPQ", ++ .ucode = 0x400, ++ .udesc = "Ingress Occupancy -- IPQ", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IRQ", ++ .ucode = 0x100, ++ .udesc = "Ingress Occupancy -- IRQ", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IRQ_REJ", ++ .ucode = 0x200, ++ .udesc = "Ingress Occupancy -- IRQ Rejected", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ_REJ", ++ .ucode = 0x2000, ++ .udesc = "Ingress Occupancy -- PRQ Rejects", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_sbo_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "SBo Credits Acquired -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "SBo Credits Acquired -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_sbo_credit_occupancy[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "SBo Credits Occupancy -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "SBo Credits Occupancy -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_tor_inserts[]={ ++ { .uname = "ALL", ++ .ucode = 0x800, ++ .udesc = "All", ++ .uflags = 
INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "EVICTION", ++ .ucode = 0x400, ++ .udesc = "Evictions", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "LOCAL", ++ .ucode = 0x2800, ++ .udesc = "Local Memory", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "LOCAL_OPCODE", ++ .ucode = 0x2100, ++ .udesc = "Local Memory - Opcode Matched", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "MISS_LOCAL", ++ .ucode = 0x2a00, ++ .udesc = "Misses to Local Memory", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "MISS_LOCAL_OPCODE", ++ .ucode = 0x2300, ++ .udesc = "Misses to Local Memory - Opcode Matched", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "MISS_OPCODE", ++ .ucode = 0x300, ++ .udesc = "Miss Opcode Match", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "MISS_REMOTE", ++ .ucode = 0x8a00, ++ .udesc = "Misses to Remote Memory", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "MISS_REMOTE_OPCODE", ++ .ucode = 0x8300, ++ .udesc = "Misses to Remote Memory - Opcode Matched", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "NID_ALL", ++ .ucode = 0x4800, ++ .udesc = "NID Matched", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "NID_EVICTION", ++ .ucode = 0x4400, ++ .udesc = "NID Matched Evictions", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "NID_MISS_ALL", ++ .ucode = 0x4a00, ++ .udesc = "NID Matched Miss All", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "NID_MISS_OPCODE", ++ .ucode = 0x4300, ++ .udesc = "NID and Opcode Matched Miss", ++ 
.umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "NID_OPCODE", ++ .ucode = 0x4100, ++ .udesc = "NID and Opcode Matched", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "NID_WB", ++ .ucode = 0x5000, ++ .udesc = "NID Matched Writebacks", ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "OPCODE", ++ .ucode = 0x100, ++ .udesc = "Opcode Match", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x8800, ++ .udesc = "Remote Memory", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "REMOTE_OPCODE", ++ .ucode = 0x8100, ++ .udesc = "Remote Memory - Opcode Matched", ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ }, ++ { .uname = "WB", ++ .ucode = 0x1000, ++ .udesc = "Writebacks", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ CBO_FILT_OPC(1) ++}; ++ ++static intel_x86_umask_t bdx_unc_c_tor_occupancy[]={ ++ { .uname = "ALL", ++ .ucode = 0x800, ++ .udesc = "Any", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 0, ++ }, ++ { .uname = "EVICTION", ++ .ucode = 0x400, ++ .udesc = "Evictions", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "LOCAL", ++ .ucode = 0x2800, ++ .udesc = "Number of transactions in the TOR that are satisfied by locally homed memory", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "LOCAL_OPCODE", ++ .ucode = 0x2100, ++ .udesc = "Local Memory - Opcode Matched", ++ .grpid = 0, ++ }, ++ { .uname = "MISS_ALL", ++ .ucode = 0xa00, ++ .udesc = "Miss All", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "MISS_LOCAL", ++ .ucode = 0x2a00, ++ .udesc = "Number of miss transactions in the TOR that are satisfied by 
locally homed memory", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "MISS_LOCAL_OPCODE", ++ .ucode = 0x2300, ++ .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", ++ .grpid = 0, ++ }, ++ { .uname = "MISS_OPCODE", ++ .ucode = 0x300, ++ .udesc = "Number of miss transactions inserted into the TOR that match an opcode (must provide opc_* umask)", ++ .grpid = 0, ++ }, ++ { .uname = "MISS_REMOTE_OPCODE", ++ .ucode = 0x8300, ++ .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", ++ .grpid = 0, ++ }, ++ { .uname = "NID_ALL", ++ .ucode = 0x4800, ++ .udesc = "Number of NID-matched transactions inserted into the TOR (must provide nf=X modifier)", ++ .grpid = 0, ++ }, ++ { .uname = "NID_EVICTION", ++ .ucode = 0x4400, ++ .udesc = "Number of NID-matched eviction transactions inserted into the TOR (must provide nf=X modifier)", ++ .grpid = 0, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "NID_MISS_ALL", ++ .ucode = 0x4a00, ++ .udesc = "Number of NID-matched miss transactions that were inserted into the TOR (must provide nf=X modifier)", ++ .grpid = 0, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "NID_MISS_OPCODE", ++ .ucode = 0x4300, ++ .udesc = "Number of NID and opcode matched miss transactions inserted into the TOR (must provide opc_* umask and nf=X modifier)", ++ .grpid = 0, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NID_OPCODE", ++ .ucode = 0x4100, ++ .udesc = "Number of transactions inserted into the TOR that match a NID and opcode (must provide opc_* umask and nf=X modifier)", ++ .grpid = 0, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NID_WB", ++ .ucode = 0x5000, 
++ .udesc = "Number of NID-matched write back transactions inserted into the TOR (must provide nf=X modifier)", ++ .grpid = 0, ++ .umodmsk_req = _SNBEP_UNC_ATTR_NF1, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "OPCODE", ++ .ucode = 0x100, ++ .udesc = "Number of transactions inserted into the TOR that match an opcode (must provide opc_* umask)", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x8800, ++ .udesc = "Number of transactions inserted into the TOR that are satisfied by remote caches or memory", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "REMOTE_OPCODE", ++ .ucode = 0x8100, ++ .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WB", ++ .ucode = 0x1000, ++ .udesc = "Number of write transactions inserted into the TOR", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ { .uname = "MISS_REMOTE", ++ .ucode = 0x8a00, ++ .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", ++ .grpid = 0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, ++ }, ++ CBO_FILT_OPC(1) ++}; ++ ++static intel_x86_umask_t bdx_unc_c_txr_ads_used[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "Onto AD Ring", ++ }, ++ { .uname = "AK", ++ .ucode = 0x200, ++ .udesc = "Onto AK Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "Onto BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_c_txr_inserts[]={ ++ { .uname = "AD_CACHE", ++ .ucode = 0x100, ++ .udesc = "Egress Allocations -- AD - Cachebo", ++ }, ++ { .uname = "AD_CORE", ++ .ucode = 0x1000, ++ .udesc = "Egress Allocations -- AD - Corebo", ++ }, ++ { .uname = "AK_CACHE", ++ .ucode = 0x200, ++ .udesc = "Egress Allocations -- AK - Cachebo", ++ }, ++ { .uname = "AK_CORE", ++ .ucode = 
0x2000, ++ .udesc = "Egress Allocations -- AK - Corebo", ++ }, ++ { .uname = "BL_CACHE", ++ .ucode = 0x400, ++ .udesc = "Egress Allocations -- BL - Cachebo", ++ }, ++ { .uname = "BL_CORE", ++ .ucode = 0x4000, ++ .udesc = "Egress Allocations -- BL - Corebo", ++ }, ++ { .uname = "IV_CACHE", ++ .ucode = 0x800, ++ .udesc = "Egress Allocations -- IV - Cachebo", ++ }, ++}; ++ ++ ++static intel_x86_entry_t intel_bdx_unc_c_pe[]={ ++ { .name = "UNC_C_BOUNCE_CONTROL", ++ .code = 0xa, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_C_CLOCKTICKS", ++ .code = 0x0, ++ .desc = "Clock ticks", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_C_COUNTER0_OCCUPANCY", ++ .code = 0x1f, ++ .desc = "Since occupancy counts can only be captured in the Cbo's counter 0, this event allows a user to capture occupancy related information by filtering the Cbo occupancy count captured in Counter 0. The filtering available is found in the control register - threshold, invert and edge detect. E.g. setting threshold to 1 can effectively monitor how many cycles the monitored queue has an entry.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_C_FAST_ASSERTED", ++ .code = 0x9, ++ .desc = "Counts the number of cycles either the local distress or incoming distress signals are asserted. Incoming distress includes both up and dn.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_C_LLC_LOOKUP", ++ .code = 0x34, ++ .desc = "Counts the number of times the LLC was accessed - this includes code, data, prefetches and hints coming from L2. This has numerous filters available. Note the non-standard filtering equation. This event will count requests that lookup the cache multiple times with multiple increments. One must ALWAYS set umask bit 0 and select a state or states to match. Otherwise, the event will count nothing. 
CBoGlCtrl[22:18] bits correspond to [FMESI] state.", ++ .modmsk = BDX_UNC_CBO_NID_ATTRS, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .cntmsk = 0xf, ++ .ngrp = 3, ++ .umasks = bdx_unc_c_llc_lookup, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_llc_lookup), ++ }, ++ { .name = "UNC_C_LLC_VICTIMS", ++ .code = 0x37, ++ .desc = "Counts the number of lines that were victimized on a fill. This can be filtered by the state that the line was in.", ++ .modmsk = BDX_UNC_CBO_NID_ATTRS, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .cntmsk = 0xf, ++ .ngrp = 2, ++ .umasks = bdx_unc_c_llc_victims, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_llc_victims), ++ }, ++ { .name = "UNC_C_MISC", ++ .code = 0x39, ++ .desc = "Miscellaneous events in the Cbo.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_misc, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_misc), ++ }, ++ { .name = "UNC_C_RING_AD_USED", ++ .code = 0x1b, ++ .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_ad_used), ++ }, ++ { .name = "UNC_C_RING_AK_USED", ++ .code = 0x1c, ++ .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_ring_ak_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_ak_used), ++ }, ++ { .name = "UNC_C_RING_BL_USED", ++ .code = 0x1d, ++ .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_ring_bl_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_bl_used), ++ }, ++ { .name = "UNC_C_RING_BOUNCES", ++ .code = 0x5, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_ring_bounces, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_bounces), ++ }, ++ { .name = "UNC_C_RING_IV_USED", ++ .code = 0x1e, ++ .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring in BDX. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_ring_iv_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_iv_used), ++ }, ++ { .name = "UNC_C_RING_SRC_THRTL", ++ .code = 0x7, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_C_RXR_EXT_STARVED", ++ .code = 0x12, ++ .desc = "Counts cycles in external starvation. 
This occurs when one of the ingress queues is being starved by the other queues.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_ext_starved, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ext_starved), ++ }, ++ { .name = "UNC_C_RXR_INSERTS", ++ .code = 0x13, ++ .desc = "Counts number of allocations per cycle into the specified Ingress queue.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_inserts), ++ }, ++ { .name = "UNC_C_RXR_IPQ_RETRY", ++ .code = 0x31, ++ .desc = "Number of times a snoop (probe) request had to retry. Filters exist to cover some of the common retry cases.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_ipq_retry, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ipq_retry), ++ }, ++ { .name = "UNC_C_RXR_IPQ_RETRY2", ++ .code = 0x28, ++ .desc = "Number of times a snoop (probe) request had to retry. Filters exist to cover some of the common retry cases.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_ipq_retry2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ipq_retry2), ++ }, ++ { .name = "UNC_C_RXR_IRQ_RETRY", ++ .code = 0x32, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_irq_retry, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_irq_retry), ++ }, ++ { .name = "UNC_C_RXR_IRQ_RETRY2", ++ .code = 0x29, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_irq_retry2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_irq_retry2), ++ }, ++ { .name = "UNC_C_RXR_ISMQ_RETRY", ++ .code = 0x33, ++ .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo. 
Some examples include: when data is returned or when snoop responses come back from the cores.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_ismq_retry, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ismq_retry), ++ }, ++ { .name = "UNC_C_RXR_ISMQ_RETRY2", ++ .code = 0x2a, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_ismq_retry2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ismq_retry2), ++ }, ++ { .name = "UNC_C_RXR_OCCUPANCY", ++ .code = 0x11, ++ .desc = "Counts number of entries in the specified Ingress queue in each cycle.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_rxr_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_occupancy), ++ }, ++ { .name = "UNC_C_SBO_CREDITS_ACQUIRED", ++ .code = 0x3d, ++ .desc = "Number of Sbo credits acquired in a given cycle, per ring. Each Cbo is assigned an Sbo it can communicate with.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_sbo_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_sbo_credits_acquired), ++ }, ++ { .name = "UNC_C_SBO_CREDIT_OCCUPANCY", ++ .code = 0x3e, ++ .desc = "Number of Sbo credits in use in a given cycle, per ring. Each Cbo is assigned an Sbo it can communicate with.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_sbo_credit_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_sbo_credit_occupancy), ++ }, ++ { .name = "UNC_C_TOR_INSERTS", ++ .code = 0x35, ++ .desc = "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. There are a number of subevent filters but only a subset of the subevent combinations are valid. Subevents that require an opcode or NID match require the Cn_MSR_PMON_BOX_FILTER.{opc, nid} field to be set. 
If, for example, one wanted to count DRD Local Misses, one should select MISS_OPC_MATCH and set Cn_MSR_PMON_BOX_FILTER.opc to DRD (0x182).", ++ .modmsk = BDX_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .cntmsk = 0xf, ++ .ngrp = 2, ++ .umasks = bdx_unc_c_tor_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_tor_inserts), ++ }, ++ { .name = "UNC_C_TOR_OCCUPANCY", ++ .code = 0x36, ++ .desc = "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. There are a number of subevent filters but only a subset of the subevent combinations are valid. Subevents that require an opcode or NID match require the Cn_MSR_PMON_BOX_FILTER.{opc, nid} field to be set. If, for example, one wanted to count DRD Local Misses, one should select MISS_OPC_MATCH and set Cn_MSR_PMON_BOX_FILTER.opc to DRD (0x182)", ++ .modmsk = BDX_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .cntmsk = 0x1, ++ .ngrp = 2, ++ .umasks = bdx_unc_c_tor_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_tor_occupancy), ++ }, ++ { .name = "UNC_C_TXR_ADS_USED", ++ .code = 0x4, ++ .desc = "", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_txr_ads_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_txr_ads_used), ++ }, ++ { .name = "UNC_C_TXR_INSERTS", ++ .code = 0x2, ++ .desc = "Number of allocations into the Cbo Egress. 
The Egress is used to queue up requests destined for the ring.", ++ .modmsk = BDX_UNC_CBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_c_txr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_txr_inserts), ++ }, ++}; ++ +diff --git a/lib/events/intel_bdx_unc_ha_events.h b/lib/events/intel_bdx_unc_ha_events.h +new file mode 100644 +index 0000000..6266bef +--- /dev/null ++++ b/lib/events/intel_bdx_unc_ha_events.h +@@ -0,0 +1,1159 @@ ++/* ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_ha ++ */ ++ ++static intel_x86_umask_t bdx_unc_h_bypass_imc[]={ ++ { .uname = "NOT_TAKEN", ++ .ucode = 0x200, ++ .udesc = "HA to iMC Bypass -- Not Taken", ++ }, ++ { .uname = "TAKEN", ++ .ucode = 0x100, ++ .udesc = "HA to iMC Bypass -- Taken", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_directory_lookup[]={ ++ { .uname = "NO_SNP", ++ .ucode = 0x200, ++ .udesc = "Directory Lookups -- Snoop Not Needed", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x100, ++ .udesc = "Directory Lookups -- Snoop Needed", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_directory_update[]={ ++ { .uname = "ANY", ++ .ucode = 0x300, ++ .udesc = "Directory Updates -- Any Directory Update", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "CLEAR", ++ .ucode = 0x200, ++ .udesc = "Directory Updates -- Directory Clear", ++ }, ++ { .uname = "SET", ++ .ucode = 0x100, ++ .udesc = "Directory Updates -- Directory Set", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_hitme_hit[]={ ++ { .uname = "ACKCNFLTWBI", ++ .ucode = 0x400, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is AckCnfltWbI", ++ }, ++ { .uname = "ALL", ++ .ucode = 0xff00, ++ .udesc = "Counts Number of Hits in HitMe Cache -- All Requests", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "ALLOCS", ++ .ucode = 0x7000, ++ .udesc = "Counts Number of Hits in HitMe Cache -- Allocations", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "EVICTS", ++ .ucode = 0x4200, ++ .udesc = "Counts Number of Hits in HitMe Cache -- Allocations", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM", ++ .ucode = 0xf00, ++ .udesc = "Counts Number of Hits in HitMe Cache -- HOM Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "INVALS", ++ .ucode = 0x2600, ++ .udesc = "Counts Number of Hits in HitMe Cache -- Invalidations", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "READ_OR_INVITOE", ++ .ucode = 0x100, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is 
RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", ++ }, ++ { .uname = "RSP", ++ .ucode = 0x8000, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", ++ }, ++ { .uname = "RSPFWDI_LOCAL", ++ .ucode = 0x2000, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is RspIFwd or RspIFwdWb for a local request", ++ }, ++ { .uname = "RSPFWDI_REMOTE", ++ .ucode = 0x1000, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is RspIFwd or RspIFwdWb for a remote request", ++ }, ++ { .uname = "RSPFWDS", ++ .ucode = 0x4000, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is RspSFwd or RspSFwdWb", ++ }, ++ { .uname = "WBMTOE_OR_S", ++ .ucode = 0x800, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoE or WbMtoS", ++ }, ++ { .uname = "WBMTOI", ++ .ucode = 0x200, ++ .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoI", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_hitme_hit_pv_bits_set[]={ ++ { .uname = "ACKCNFLTWBI", ++ .ucode = 0x400, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is AckCnfltWbI", ++ }, ++ { .uname = "ALL", ++ .ucode = 0xff00, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- All Requests", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "HOM", ++ .ucode = 0xf00, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- HOM Requests", ++ }, ++ { .uname = "READ_OR_INVITOE", ++ .ucode = 0x100, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", ++ }, ++ { .uname = "RSP", ++ .ucode = 0x8000, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", ++ }, ++ { .uname = "RSPFWDI_LOCAL", ++ .ucode = 0x2000, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspIFwd or RspIFwdWb for a local request", ++ 
}, ++ { .uname = "RSPFWDI_REMOTE", ++ .ucode = 0x1000, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspIFwd or RspIFwdWb for a remote request", ++ }, ++ { .uname = "RSPFWDS", ++ .ucode = 0x4000, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspSFwd or RspSFwdWb", ++ }, ++ { .uname = "WBMTOE_OR_S", ++ .ucode = 0x800, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is WbMtoE or WbMtoS", ++ }, ++ { .uname = "WBMTOI", ++ .ucode = 0x200, ++ .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is WbMtoI", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_hitme_lookup[]={ ++ { .uname = "ACKCNFLTWBI", ++ .ucode = 0x400, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is AckCnfltWbI", ++ }, ++ { .uname = "ALL", ++ .ucode = 0xff00, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- All Requests", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "ALLOCS", ++ .ucode = 0x7000, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- Allocations", ++ }, ++ { .uname = "HOM", ++ .ucode = 0xf00, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- HOM Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "INVALS", ++ .ucode = 0x2600, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- Invalidations", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "READ_OR_INVITOE", ++ .ucode = 0x100, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", ++ }, ++ { .uname = "RSP", ++ .ucode = 0x8000, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", ++ }, ++ { .uname = "RSPFWDI_LOCAL", ++ .ucode = 0x2000, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspIFwd or RspIFwdWb for a local request", ++ }, ++ { .uname = "RSPFWDI_REMOTE", ++ .ucode = 
0x1000, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspIFwd or RspIFwdWb for a remote request", ++ }, ++ { .uname = "RSPFWDS", ++ .ucode = 0x4000, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspSFwd or RspSFwdWb", ++ }, ++ { .uname = "WBMTOE_OR_S", ++ .ucode = 0x800, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is WbMtoE or WbMtoS", ++ }, ++ { .uname = "WBMTOI", ++ .ucode = 0x200, ++ .udesc = "Counts Number of times HitMe Cache is accessed -- op is WbMtoI", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_igr_no_credit_cycles[]={ ++ { .uname = "AD_QPI0", ++ .ucode = 0x100, ++ .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 0", ++ }, ++ { .uname = "AD_QPI1", ++ .ucode = 0x200, ++ .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 1", ++ }, ++ { .uname = "AD_QPI2", ++ .ucode = 0x1000, ++ .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 2", ++ }, ++ { .uname = "BL_QPI0", ++ .ucode = 0x400, ++ .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 0", ++ }, ++ { .uname = "BL_QPI1", ++ .ucode = 0x800, ++ .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 1", ++ }, ++ { .uname = "BL_QPI2", ++ .ucode = 0x2000, ++ .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 2", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_imc_reads[]={ ++ { .uname = "NORMAL", ++ .ucode = 0x100, ++ .udesc = "HA to iMC Normal Priority Reads Issued -- Normal Priority", ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_imc_writes[]={ ++ { .uname = "ALL", ++ .ucode = 0xf00, ++ .udesc = "HA to iMC Full Line Writes Issued -- All Writes", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "FULL", ++ .ucode = 0x100, ++ .udesc = "HA to iMC Full Line Writes Issued -- Full Line Non-ISOCH", ++ }, ++ { .uname = "FULL_ISOCH", ++ .ucode = 0x400, ++ .udesc = "HA to iMC Full Line Writes Issued -- ISOCH Full Line", ++ }, ++ { 
.uname = "PARTIAL", ++ .ucode = 0x200, ++ .udesc = "HA to iMC Full Line Writes Issued -- Partial Non-ISOCH", ++ }, ++ { .uname = "PARTIAL_ISOCH", ++ .ucode = 0x800, ++ .udesc = "HA to iMC Full Line Writes Issued -- ISOCH Partial", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_osb[]={ ++ { .uname = "CANCELLED", ++ .ucode = 0x1000, ++ .udesc = "OSB Snoop Broadcast -- Cancelled", ++ }, ++ { .uname = "INVITOE_LOCAL", ++ .ucode = 0x400, ++ .udesc = "OSB Snoop Broadcast -- Local InvItoE", ++ }, ++ { .uname = "READS_LOCAL", ++ .ucode = 0x200, ++ .udesc = "OSB Snoop Broadcast -- Local Reads", ++ }, ++ { .uname = "READS_LOCAL_USEFUL", ++ .ucode = 0x2000, ++ .udesc = "OSB Snoop Broadcast -- Reads Local - Useful", ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x800, ++ .udesc = "OSB Snoop Broadcast -- Remote", ++ }, ++ { .uname = "REMOTE_USEFUL", ++ .ucode = 0x4000, ++ .udesc = "OSB Snoop Broadcast -- Remote - Useful", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_osb_edr[]={ ++ { .uname = "ALL", ++ .ucode = 0x100, ++ .udesc = "OSB Early Data Return -- All", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "READS_LOCAL_I", ++ .ucode = 0x200, ++ .udesc = "OSB Early Data Return -- Reads to Local I", ++ }, ++ { .uname = "READS_LOCAL_S", ++ .ucode = 0x800, ++ .udesc = "OSB Early Data Return -- Reads to Local S", ++ }, ++ { .uname = "READS_REMOTE_I", ++ .ucode = 0x400, ++ .udesc = "OSB Early Data Return -- Reads to Remote I", ++ }, ++ { .uname = "READS_REMOTE_S", ++ .ucode = 0x1000, ++ .udesc = "OSB Early Data Return -- Reads to Remote S", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_requests[]={ ++ { .uname = "INVITOE_LOCAL", ++ .ucode = 0x1000, ++ .udesc = "Read and Write Requests -- Local InvItoEs", ++ }, ++ { .uname = "INVITOE_REMOTE", ++ .ucode = 0x2000, ++ .udesc = "Read and Write Requests -- Remote InvItoEs", ++ }, ++ { .uname = "READS", ++ .ucode = 0x300, ++ .udesc = "Read and Write Requests -- Reads", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { 
.uname = "READS_LOCAL", ++ .ucode = 0x100, ++ .udesc = "Read and Write Requests -- Local Reads", ++ }, ++ { .uname = "READS_REMOTE", ++ .ucode = 0x200, ++ .udesc = "Read and Write Requests -- Remote Reads", ++ }, ++ { .uname = "WRITES", ++ .ucode = 0xc00, ++ .udesc = "Read and Write Requests -- Writes", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WRITES_LOCAL", ++ .ucode = 0x400, ++ .udesc = "Read and Write Requests -- Local Writes", ++ }, ++ { .uname = "WRITES_REMOTE", ++ .ucode = 0x800, ++ .udesc = "Read and Write Requests -- Remote Writes", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_ring_ad_used[]={ ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "Counterclockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CCW_EVEN", ++ .ucode = 0x400, ++ .udesc = "Counterclockwise and Even", ++ }, ++ { .uname = "CCW_ODD", ++ .ucode = 0x800, ++ .udesc = "Counterclockwise and Odd", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "Clockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CW_EVEN", ++ .ucode = 0x100, ++ .udesc = "Clockwise and Even", ++ }, ++ { .uname = "CW_ODD", ++ .ucode = 0x200, ++ .udesc = "Clockwise and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_rpq_cycles_no_reg_credits[]={ ++ { .uname = "CHN0", ++ .ucode = 0x100, ++ .udesc = "iMC RPQ Credits Empty - Regular -- Channel 0", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN1", ++ .ucode = 0x200, ++ .udesc = "iMC RPQ Credits Empty - Regular -- Channel 1", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN2", ++ .ucode = 0x400, ++ .udesc = "iMC RPQ Credits Empty - Regular -- Channel 2", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN3", ++ .ucode = 0x800, ++ .udesc = "iMC RPQ Credits Empty - Regular -- Channel 3", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_sbo0_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ 
.udesc = "For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_snoops_rsp_after_data[]={ ++ { .uname = "LOCAL", ++ .ucode = 0x100, ++ .udesc = "Data beat the Snoop Responses -- Local Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x200, ++ .udesc = "Data beat the Snoop Responses -- Remote Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_snoop_cycles_ne[]={ ++ { .uname = "ALL", ++ .ucode = 0x300, ++ .udesc = "Cycles with Snoops Outstanding -- All Requests", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "LOCAL", ++ .ucode = 0x100, ++ .udesc = "Cycles with Snoops Outstanding -- Local Requests", ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x200, ++ .udesc = "Cycles with Snoops Outstanding -- Remote Requests", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_snoop_occupancy[]={ ++ { .uname = "LOCAL", ++ .ucode = 0x100, ++ .udesc = "Tracker Snoops Outstanding Accumulator -- Local Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x200, ++ .udesc = "Tracker Snoops Outstanding Accumulator -- Remote Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_snoop_resp[]={ ++ { .uname = "RSPCNFLCT", ++ .ucode = 0x4000, ++ .udesc = "Snoop Responses Received -- RSPCNFLCT*", ++ }, ++ { .uname = "RSPI", ++ .ucode = 0x100, ++ .udesc = "Snoop Responses Received -- RspI", ++ }, ++ { .uname = "RSPIFWD", ++ .ucode = 0x400, ++ .udesc = "Snoop Responses Received -- RspIFwd", ++ }, ++ { .uname = "RSPS", ++ .ucode = 0x200, ++ .udesc = "Snoop Responses Received -- RspS", ++ }, ++ { .uname = "RSPSFWD", ++ .ucode = 0x800, ++ .udesc = "Snoop Responses Received -- RspSFwd", ++ }, ++ { .uname = "RSP_FWD_WB", ++ .ucode = 0x2000, ++ .udesc = "Snoop Responses Received -- Rsp*Fwd*WB", ++ }, ++ { .uname = "RSP_WB", ++ .ucode = 0x1000, ++ .udesc = "Snoop Responses Received -- Rsp*WB", ++ }, ++}; ++ ++static 
intel_x86_umask_t bdx_unc_h_snp_resp_recv_local[]={ ++ { .uname = "OTHER", ++ .ucode = 0x8000, ++ .udesc = "Snoop Responses Received Local -- Other", ++ }, ++ { .uname = "RSPCNFLCT", ++ .ucode = 0x4000, ++ .udesc = "Snoop Responses Received Local -- RspCnflct", ++ }, ++ { .uname = "RSPI", ++ .ucode = 0x100, ++ .udesc = "Snoop Responses Received Local -- RspI", ++ }, ++ { .uname = "RSPIFWD", ++ .ucode = 0x400, ++ .udesc = "Snoop Responses Received Local -- RspIFwd", ++ }, ++ { .uname = "RSPS", ++ .ucode = 0x200, ++ .udesc = "Snoop Responses Received Local -- RspS", ++ }, ++ { .uname = "RSPSFWD", ++ .ucode = 0x800, ++ .udesc = "Snoop Responses Received Local -- RspSFwd", ++ }, ++ { .uname = "RSPxFWDxWB", ++ .ucode = 0x2000, ++ .udesc = "Snoop Responses Received Local -- Rsp*FWD*WB", ++ }, ++ { .uname = "RSPxWB", ++ .ucode = 0x1000, ++ .udesc = "Snoop Responses Received Local -- Rsp*WB", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_stall_no_sbo_credit[]={ ++ { .uname = "SBO0_AD", ++ .ucode = 0x100, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", ++ }, ++ { .uname = "SBO0_BL", ++ .ucode = 0x400, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", ++ }, ++ { .uname = "SBO1_AD", ++ .ucode = 0x200, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", ++ }, ++ { .uname = "SBO1_BL", ++ .ucode = 0x800, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tad_requests_g0[]={ ++ { .uname = "REGION0", ++ .ucode = 0x100, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 0", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION1", ++ .ucode = 0x200, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 1", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION2", ++ .ucode = 0x400, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 2", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION3", ++ .ucode = 0x800, ++ .udesc = "HA Requests to 
a TAD Region - Group 0 -- TAD Region 3", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION4", ++ .ucode = 0x1000, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 4", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION5", ++ .ucode = 0x2000, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 5", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION6", ++ .ucode = 0x4000, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 6", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION7", ++ .ucode = 0x8000, ++ .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 7", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tad_requests_g1[]={ ++ { .uname = "REGION10", ++ .ucode = 0x400, ++ .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 10", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION11", ++ .ucode = 0x800, ++ .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 11", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION8", ++ .ucode = 0x100, ++ .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 8", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REGION9", ++ .ucode = 0x200, ++ .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 9", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tracker_cycles_full[]={ ++ { .uname = "ALL", ++ .ucode = 0x200, ++ .udesc = "Tracker Cycles Full -- Cycles Completely Used", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "GP", ++ .ucode = 0x100, ++ .udesc = "Tracker Cycles Full -- Cycles GP Completely Used", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tracker_cycles_ne[]={ ++ { .uname = "ALL", ++ .ucode = 0x300, ++ .udesc = "Tracker Cycles Not Empty -- All Requests", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "LOCAL", ++ .ucode = 0x100, ++ .udesc = "Tracker Cycles Not Empty -- Local Requests", ++ }, ++ { .uname = 
"REMOTE", ++ .ucode = 0x200, ++ .udesc = "Tracker Cycles Not Empty -- Remote Requests", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tracker_occupancy[]={ ++ { .uname = "INVITOE_LOCAL", ++ .ucode = 0x4000, ++ .udesc = "Tracker Occupancy Accumulator -- Local InvItoE Requests", ++ }, ++ { .uname = "INVITOE_REMOTE", ++ .ucode = 0x8000, ++ .udesc = "Tracker Occupancy Accumulator -- Remote InvItoE Requests", ++ }, ++ { .uname = "READS_LOCAL", ++ .ucode = 0x400, ++ .udesc = "Tracker Occupancy Accumulator -- Local Read Requests", ++ }, ++ { .uname = "READS_REMOTE", ++ .ucode = 0x800, ++ .udesc = "Tracker Occupancy Accumulator -- Remote Read Requests", ++ }, ++ { .uname = "WRITES_LOCAL", ++ .ucode = 0x1000, ++ .udesc = "Tracker Occupancy Accumulator -- Local Write Requests", ++ }, ++ { .uname = "WRITES_REMOTE", ++ .ucode = 0x2000, ++ .udesc = "Tracker Occupancy Accumulator -- Remote Write Requests", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_tracker_pending_occupancy[]={ ++ { .uname = "LOCAL", ++ .ucode = 0x100, ++ .udesc = "Data Pending Occupancy Accumulator -- Local Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REMOTE", ++ .ucode = 0x200, ++ .udesc = "Data Pending Occupancy Accumulator -- Remote Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_txr_ad_cycles_full[]={ ++ { .uname = "ALL", ++ .ucode = 0x300, ++ .udesc = "All", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "SCHED0", ++ .ucode = 0x100, ++ .udesc = "Scheduler 0", ++ }, ++ { .uname = "SCHED1", ++ .ucode = 0x200, ++ .udesc = "Scheduler 1", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_txr_bl[]={ ++ { .uname = "DRS_CACHE", ++ .ucode = 0x100, ++ .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Cache", ++ }, ++ { .uname = "DRS_CORE", ++ .ucode = 0x200, ++ .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Core", ++ }, ++ { .uname = "DRS_QPI", ++ .ucode = 0x400, ++ .udesc = "Outbound DRS Ring Transactions to Cache -- Data 
to QPI", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_txr_starved[]={ ++ { .uname = "AK", ++ .ucode = 0x100, ++ .udesc = "Injection Starvation -- For AK Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "Injection Starvation -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_h_wpq_cycles_no_reg_credits[]={ ++ { .uname = "CHN0", ++ .ucode = 0x100, ++ .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 0", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN1", ++ .ucode = 0x200, ++ .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 1", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN2", ++ .ucode = 0x400, ++ .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 2", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CHN3", ++ .ucode = 0x800, ++ .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 3", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_h_pe[]={ ++ /* ADDR_OPC_MATCH not supported (linux kernel has no support for HA OPC yet) */ ++ { .name = "UNC_H_BT_CYCLES_NE", ++ .code = 0x42, ++ .desc = "Cycles the Backup Tracker (BT) is not empty. The BT is the actual HOM tracker in IVT.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_BT_OCCUPANCY", ++ .code = 0x43, ++ .desc = "Accumulates the occupancy of the HA BT pool in every cycle. This can be used with the 'not empty' stat to calculate the average queue occupancy or the 'allocations' stat to calculate average queue latency. HA BTs are allocated as soon as a request enters the HA and are released after the snoop response and data return and the response is returned to the ring", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_BYPASS_IMC", ++ .code = 0x14, ++ .desc = "Counts the number of times an HA-to-iMC bypass was attempted. This is a latency optimization for situations when there is light loading on the memory subsystem. 
This can be filtered by whether the bypass was taken or not.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_bypass_imc, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_bypass_imc), ++ }, ++ { .name = "UNC_H_CONFLICT_CYCLES", ++ .code = 0xb, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ ++ { .name = "UNC_H_CLOCKTICKS", ++ .code = 0x0, ++ .desc = "Counts the number of uclks in the HA. This will be slightly different than the count in the Ubox because of enable/freeze delays. The HA is on the other side of the die from the fixed Ubox uclk counter, so the drift could be somewhat larger than in units that are closer like the QPI Agent.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_DIRECT2CORE_COUNT", ++ .code = 0x11, ++ .desc = "Number of Direct2Core messages sent", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_DIRECT2CORE_CYCLES_DISABLED", ++ .code = 0x12, ++ .desc = "Number of cycles in which Direct2Core was disabled", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_DIRECT2CORE_TXN_OVERRIDE", ++ .code = 0x13, ++ .desc = "Number of Reads where Direct2Core overridden", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_DIRECTORY_LAT_OPT", ++ .code = 0x41, ++ .desc = "Directory Latency Optimization Data Return Path Taken. When directory mode is enabled and the directory returned for a read is Dir=I, then data can be returned using a faster path if certain conditions are met (credits, free pipeline, etc).", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_DIRECTORY_LOOKUP", ++ .code = 0xc, ++ .desc = "Counts the number of transactions that looked up the directory. 
Can be filtered by requests that had to snoop and those that did not have to.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_directory_lookup, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_directory_lookup), ++ }, ++ { .name = "UNC_H_DIRECTORY_UPDATE", ++ .code = 0xd, ++ .desc = "Counts the number of directory updates that were required. These result in writes to the memory controller. This can be filtered by directory sets and directory clears.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_directory_update, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_directory_update), ++ }, ++ { .name = "UNC_H_HITME_HIT", ++ .code = 0x71, ++ .desc = "", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_hitme_hit, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_hit), ++ }, ++ { .name = "UNC_H_HITME_HIT_PV_BITS_SET", ++ .code = 0x72, ++ .desc = "", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_hitme_hit_pv_bits_set, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_hit_pv_bits_set), ++ }, ++ { .name = "UNC_H_HITME_LOOKUP", ++ .code = 0x70, ++ .desc = "", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_hitme_lookup, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_lookup), ++ }, ++ { .name = "UNC_H_IGR_NO_CREDIT_CYCLES", ++ .code = 0x22, ++ .desc = "Counts the number of cycles when the HA does not have credits to send messages to the QPI Agent. This can be filtered by the different credit pools and the different links.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_igr_no_credit_cycles, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_igr_no_credit_cycles), ++ }, ++ { .name = "UNC_H_IMC_READS", ++ .code = 0x17, ++ .desc = "Count of the number of reads issued to any of the memory controller channels. 
This can be filtered by the priority of the reads.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_imc_reads, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_imc_reads), ++ }, ++ { .name = "UNC_H_IMC_RETRY", ++ .code = 0x1e, ++ .desc = "", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_H_IMC_WRITES", ++ .code = 0x1a, ++ .desc = "Counts the total number of full line writes issued from the HA into the memory controller. This counts for all four channels. It can be filtered by full/partial and ISOCH/non-ISOCH.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_imc_writes, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_imc_writes), ++ }, ++ { .name = "UNC_H_OSB", ++ .code = 0x53, ++ .desc = "Count of OSB snoop broadcasts. Counts by 1 per request causing OSB snoops to be broadcast. Does not count all the snoops generated by OSB.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_osb, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_osb), ++ }, ++ { .name = "UNC_H_OSB_EDR", ++ .code = 0x54, ++ .desc = "Counts the number of transactions that broadcast snoop due to OSB, but found clean data in memory and was able to do early data return", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_osb_edr, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_osb_edr), ++ }, ++ { .name = "UNC_H_REQUESTS", ++ .code = 0x1, ++ .desc = "Counts the total number of read requests made into the Home Agent. Reads include all read opcodes (including RFO). Writes include all writes (streaming, evictions, HitM, etc).", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_requests, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_requests), ++ }, ++ { .name = "UNC_H_RING_AD_USED", ++ .code = 0x3e, ++ .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), ++ }, ++ { .name = "UNC_H_RING_AK_USED", ++ .code = 0x3f, ++ .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), ++ }, ++ { .name = "UNC_H_RING_BL_USED", ++ .code = 0x40, ++ .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), ++ }, ++ { .name = "UNC_H_RPQ_CYCLES_NO_REG_CREDITS", ++ .code = 0x15, ++ .desc = "Counts the number of cycles when there are no regular credits available for posting reads from the HA into the iMC. In order to send reads into the memory controller, the HA must first acquire a credit for the iMCs RPQ (read pending queue). This queue is broken into regular credits/buffers that are used by general reads, and special requests such as ISOCH reads. This count only tracks the regular credits. Common high bandwidth workloads should be able to make use of all of the regular buffers, but it will be difficult (and uncommon) to make use of both the regular and special buffers at the same time. One can filter based on the memory controller channel. 
One or more channels can be tracked at a given time.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_rpq_cycles_no_reg_credits, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_rpq_cycles_no_reg_credits), ++ }, ++ { .name = "UNC_H_SBO0_CREDITS_ACQUIRED", ++ .code = 0x68, ++ .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), ++ }, ++ { .name = "UNC_H_SBO0_CREDIT_OCCUPANCY", ++ .code = 0x6a, ++ .desc = "Number of Sbo 0 credits in use in a given cycle, per ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), /* shared */ ++ }, ++ { .name = "UNC_H_SBO1_CREDITS_ACQUIRED", ++ .code = 0x69, ++ .desc = "Number of Sbo 1 credits acquired in a given cycle, per ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), /* shared */ ++ }, ++ { .name = "UNC_H_SBO1_CREDIT_OCCUPANCY", ++ .code = 0x6b, ++ .desc = "Number of Sbo 1 credits in use in a given cycle, per ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), /* shared */ ++ }, ++ { .name = "UNC_H_SNOOPS_RSP_AFTER_DATA", ++ .code = 0xa, ++ .desc = "Counts the number of reads when the snoop was on the critical path to the data return.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_snoops_rsp_after_data, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoops_rsp_after_data), ++ }, ++ { .name = "UNC_H_SNOOP_CYCLES_NE", ++ .code = 0x8, ++ .desc = "Counts cycles when one or more snoops are outstanding.", ++ .modmsk =
BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_snoop_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_cycles_ne), ++ }, ++ { .name = "UNC_H_SNOOP_OCCUPANCY", ++ .code = 0x9, ++ .desc = "Accumulates the occupancy of the local HA tracker pool entries that have snoops pending in every cycle. This can be used in conjunction with the not empty stat to calculate average queue occupancy or the allocations stat in order to calculate average queue latency. HA trackers are allocated as soon as a request enters the HA if an HT (Home Tracker) entry is available and this occupancy is decremented when all the snoop responses have returned.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_snoop_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_occupancy), ++ }, ++ { .name = "UNC_H_SNOOP_RESP", ++ .code = 0x21, ++ .desc = "Counts the total number of RspI snoop responses received. Whenever snoops are issued, one or more snoop responses will be returned depending on the topology of the system. In systems larger than 2 sockets, when multiple snoops are returned this will count all the snoops that are received. For example, if 3 snoops were issued and returned RspI, RspS, and RspSFwd, then each of these sub-events would increment by 1.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_snoop_resp, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_resp), ++ }, ++ { .name = "UNC_H_SNP_RESP_RECV_LOCAL", ++ .code = 0x60, ++ .desc = "Number of snoop responses received for a Local request", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_snp_resp_recv_local, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snp_resp_recv_local), ++ }, ++ { .name = "UNC_H_STALL_NO_SBO_CREDIT", ++ .code = 0x6c, ++ .desc = "Number of cycles Egress is stalled waiting for an Sbo credit to become available.
Per Sbo, per Ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_stall_no_sbo_credit, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_stall_no_sbo_credit), ++ }, ++ { .name = "UNC_H_TAD_REQUESTS_G0", ++ .code = 0x1b, ++ .desc = "Counts the number of HA requests to a given TAD region. There are up to 11 TAD (target address decode) regions in each home agent. All requests destined for the memory controller must first be decoded to determine which TAD region they are in. This event is filtered based on the TAD region ID, and covers regions 0 to 7. This event is useful for understanding how applications are using the memory that is spread across the different memory regions. It is particularly useful for Monroe systems that use the TAD to enable individual channels to enter self-refresh to save power.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tad_requests_g0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tad_requests_g0), ++ }, ++ { .name = "UNC_H_TAD_REQUESTS_G1", ++ .code = 0x1c, ++ .desc = "Counts the number of HA requests to a given TAD region. There are up to 11 TAD (target address decode) regions in each home agent. All requests destined for the memory controller must first be decoded to determine which TAD region they are in. This event is filtered based on the TAD region ID, and covers regions 8 to 10. This event is useful for understanding how applications are using the memory that is spread across the different memory regions. It is particularly useful for Monroe systems that use the TAD to enable individual channels to enter self-refresh to save power.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tad_requests_g1, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tad_requests_g1), ++ }, ++ { .name = "UNC_H_TRACKER_CYCLES_FULL", ++ .code = 0x2, ++ .desc = "Counts the number of cycles when the local HA tracker pool is completely used.
This can be used with edge detect to identify the number of situations when the pool became fully utilized. This should not be confused with RTID credit usage -- which must be tracked inside each cbo individually -- but represents the actual tracker buffer structure. In other words, the system could be starved for RTIDs but not fill up the HA trackers. HA trackers are allocated as soon as a request enters the HA and released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tracker_cycles_full, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_cycles_full), ++ }, ++ { .name = "UNC_H_TRACKER_CYCLES_NE", ++ .code = 0x3, ++ .desc = "Counts the number of cycles when the local HA tracker pool is not empty. This can be used with edge detect to identify the number of situations when the pool became empty. This should not be confused with RTID credit usage -- which must be tracked inside each cbo individually -- but represents the actual tracker buffer structure. In other words, this buffer could be completely empty, but there may still be credits in use by the CBos. This stat can be used in conjunction with the occupancy accumulation stat in order to calculate average queue occupancy. HA trackers are allocated as soon as a request enters the HA if an HT (Home Tracker) entry is available and released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tracker_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_cycles_ne), ++ }, ++ { .name = "UNC_H_TRACKER_OCCUPANCY", ++ .code = 0x4, ++ .desc = "Accumulates the occupancy of the local HA tracker pool in every cycle.
This can be used in conjunction with the not empty stat to calculate average queue occupancy or the allocations stat in order to calculate average queue latency. HA trackers are allocated as soon as a request enters the HA if an HT (Home Tracker) entry is available and released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tracker_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_occupancy), ++ }, ++ { .name = "UNC_H_TRACKER_PENDING_OCCUPANCY", ++ .code = 0x5, ++ .desc = "Accumulates the number of transactions that have data from the memory controller until they get scheduled to the Egress. This can be used to calculate the queuing latency for two things. (1) If the system is waiting for snoops, this will increase. (2) If the system cannot schedule to the Egress because of either (a) Egress Credits or (b) QPI BL IGR credits for remote requests.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_tracker_pending_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_pending_occupancy), ++ }, ++ { .name = "UNC_H_TXR_AD_CYCLES_FULL", ++ .code = 0x2a, ++ .desc = "AD Egress Full", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_txr_ad_cycles_full, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), ++ }, ++ { .name = "UNC_H_TXR_AK_CYCLES_FULL", ++ .code = 0x32, ++ .desc = "AK Egress Full", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_txr_ad_cycles_full, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), /* shared */ ++ }, ++ { .name = "UNC_H_TXR_BL", ++ .code = 0x10, ++ .desc = "Counts the number of DRS messages sent out on the BL ring.
This can be filtered by the destination.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_txr_bl, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_bl), ++ }, ++ { .name = "UNC_H_TXR_BL_CYCLES_FULL", ++ .code = 0x36, ++ .desc = "BL Egress Full", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_txr_ad_cycles_full, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), /* shared */ ++ }, ++ { .name = "UNC_H_TXR_STARVED", ++ .code = 0x6d, ++ .desc = "Counts injection starvation. This starvation is triggered when the Egress cannot send a transaction onto the ring for a long period of time.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_txr_starved, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_starved), ++ }, ++ { .name = "UNC_H_WPQ_CYCLES_NO_REG_CREDITS", ++ .code = 0x18, ++ .desc = "Counts the number of cycles when there are no regular credits available for posting writes from the HA into the iMC. In order to send writes into the memory controller, the HA must first acquire a credit for the iMC's WPQ (write pending queue). This queue is broken into regular credits/buffers that are used by general writes, and special requests such as ISOCH writes. This count only tracks the regular credits. Common high bandwidth workloads should be able to make use of all of the regular buffers, but it will be difficult (and uncommon) to make use of both the regular and special buffers at the same time. One can filter based on the memory controller channel.
One or more channels can be tracked at a given time.", ++ .modmsk = BDX_UNC_HA_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_h_wpq_cycles_no_reg_credits, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_wpq_cycles_no_reg_credits), ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_imc_events.h b/lib/events/intel_bdx_unc_imc_events.h +new file mode 100644 +index 0000000..1a2292e +--- /dev/null ++++ b/lib/events/intel_bdx_unc_imc_events.h +@@ -0,0 +1,733 @@ ++/* ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux.
++ * ++ * PMU: bdx_unc_imc ++ */ ++ ++static intel_x86_umask_t bdx_unc_m_act_count[]={ ++ { .uname = "BYP", ++ .ucode = 0x800, ++ .udesc = "DRAM Activate Count -- Activate due to Bypass", ++ }, ++ { .uname = "RD", ++ .ucode = 0x100, ++ .udesc = "DRAM Activate Count -- Activate due to Read", ++ }, ++ { .uname = "WR", ++ .ucode = 0x200, ++ .udesc = "DRAM Activate Count -- Activate due to Write", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_byp_cmds[]={ ++ { .uname = "ACT", ++ .ucode = 0x100, ++ .udesc = "ACT command issued by 2 cycle bypass", ++ }, ++ { .uname = "CAS", ++ .ucode = 0x200, ++ .udesc = "CAS command issued by 2 cycle bypass", ++ }, ++ { .uname = "PRE", ++ .ucode = 0x400, ++ .udesc = "PRE command issued by 2 cycle bypass", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_cas_count[]={ ++ { .uname = "ALL", ++ .ucode = 0xf00, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM WR_CAS (w/ and w/out auto-pre)", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "RD", ++ .ucode = 0x300, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM Reads (RD_CAS + Underfills)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RD_REG", ++ .ucode = 0x100, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM RD_CAS (w/ and w/out auto-pre)", ++ }, ++ { .uname = "RD_RMM", ++ .ucode = 0x2000, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. Read CAS issued in RMM", ++ }, ++ { .uname = "RD_UNDERFILL", ++ .ucode = 0x200, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. Underfill Read Issued", ++ }, ++ { .uname = "RD_WMM", ++ .ucode = 0x1000, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. Read CAS issued in WMM", ++ }, ++ { .uname = "WR", ++ .ucode = 0xc00, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM WR_CAS (both Modes)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WR_RMM", ++ .ucode = 0x800, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands.
DRAM WR_CAS (w/ and w/out auto-pre) in Read Major Mode", ++ }, ++ { .uname = "WR_WMM", ++ .ucode = 0x400, ++ .udesc = "DRAM RD_CAS and WR_CAS Commands. DRAM WR_CAS (w/ and w/out auto-pre) in Write Major Mode", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_dram_refresh[]={ ++ { .uname = "HIGH", ++ .ucode = 0x400, ++ .udesc = "Number of DRAM Refreshes Issued", ++ }, ++ { .uname = "PANIC", ++ .ucode = 0x200, ++ .udesc = "Number of DRAM Refreshes Issued", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_major_modes[]={ ++ { .uname = "ISOCH", ++ .ucode = 0x800, ++ .udesc = "Cycles in a Major Mode -- Isoch Major Mode", ++ }, ++ { .uname = "PARTIAL", ++ .ucode = 0x400, ++ .udesc = "Cycles in a Major Mode -- Partial Major Mode", ++ }, ++ { .uname = "READ", ++ .ucode = 0x100, ++ .udesc = "Cycles in a Major Mode -- Read Major Mode", ++ }, ++ { .uname = "WRITE", ++ .ucode = 0x200, ++ .udesc = "Cycles in a Major Mode -- Write Major Mode", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_power_cke_cycles[]={ ++ { .uname = "RANK0", ++ .ucode = 0x100, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK1", ++ .ucode = 0x200, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK2", ++ .ucode = 0x400, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK3", ++ .ucode = 0x800, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK4", ++ .ucode = 0x1000, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK5", ++ .ucode = 0x2000, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK6", ++ .ucode = 0x4000, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RANK7", ++ .ucode = 0x8000, ++ .udesc = "CKE_ON_CYCLES by Rank -- DIMM 
ID", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_power_throttle_cycles[]={ ++ { .uname = "RANK0", ++ .ucode = 0x100, ++ .udesc = "Throttle Cycles for Rank 0 -- DIMM ID", ++ }, ++ { .uname = "RANK1", ++ .ucode = 0x200, ++ .udesc = "Throttle Cycles for Rank 1 -- DIMM ID", ++ }, ++ { .uname = "RANK2", ++ .ucode = 0x400, ++ .udesc = "Throttle Cycles for Rank 2 -- DIMM ID", ++ }, ++ { .uname = "RANK3", ++ .ucode = 0x800, ++ .udesc = "Throttle Cycles for Rank 3 -- DIMM ID", ++ }, ++ { .uname = "RANK4", ++ .ucode = 0x1000, ++ .udesc = "Throttle Cycles for Rank 4 -- DIMM ID", ++ }, ++ { .uname = "RANK5", ++ .ucode = 0x2000, ++ .udesc = "Throttle Cycles for Rank 5 -- DIMM ID", ++ }, ++ { .uname = "RANK6", ++ .ucode = 0x4000, ++ .udesc = "Throttle Cycles for Rank 6 -- DIMM ID", ++ }, ++ { .uname = "RANK7", ++ .ucode = 0x8000, ++ .udesc = "Throttle Cycles for Rank 7 -- DIMM ID", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_preemption[]={ ++ { .uname = "RD_PREEMPT_RD", ++ .ucode = 0x100, ++ .udesc = "Read Preemption Count -- Read over Read Preemption", ++ }, ++ { .uname = "RD_PREEMPT_WR", ++ .ucode = 0x200, ++ .udesc = "Read Preemption Count -- Read over Write Preemption", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_pre_count[]={ ++ { .uname = "BYP", ++ .ucode = 0x1000, ++ .udesc = "DRAM Precharge commands. -- Precharge due to bypass", ++ }, ++ { .uname = "PAGE_CLOSE", ++ .ucode = 0x200, ++ .udesc = "DRAM Precharge commands. -- Precharge due to timer expiration", ++ }, ++ { .uname = "PAGE_MISS", ++ .ucode = 0x100, ++ .udesc = "DRAM Precharge commands. -- Precharges due to page miss", ++ }, ++ { .uname = "RD", ++ .ucode = 0x400, ++ .udesc = "DRAM Precharge commands. -- Precharge due to read", ++ }, ++ { .uname = "WR", ++ .ucode = 0x800, ++ .udesc = "DRAM Precharge commands.
-- Precharge due to write", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_rd_cas_prio[]={ ++ { .uname = "HIGH", ++ .ucode = 0x400, ++ .udesc = "Read CAS issued with HIGH priority", ++ }, ++ { .uname = "LOW", ++ .ucode = 0x100, ++ .udesc = "Read CAS issued with LOW priority", ++ }, ++ { .uname = "MED", ++ .ucode = 0x200, ++ .udesc = "Read CAS issued with MEDIUM priority", ++ }, ++ { .uname = "PANIC", ++ .ucode = 0x800, ++ .udesc = "Read CAS issued with PANIC NON ISOCH priority (starved)", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_rd_cas_rank0[]={ ++ { .uname = "ALLBANKS", ++ .ucode = 0x1000, ++ .udesc = "Access to Rank 0 -- All Banks", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK0", ++ .ucode = 0x0, ++ .udesc = "Access to Rank 0 -- Bank 0", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK1", ++ .ucode = 0x100, ++ .udesc = "Access to Rank 0 -- Bank 1", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK10", ++ .ucode = 0xa00, ++ .udesc = "Access to Rank 0 -- Bank 10", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK11", ++ .ucode = 0xb00, ++ .udesc = "Access to Rank 0 -- Bank 11", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK12", ++ .ucode = 0xc00, ++ .udesc = "Access to Rank 0 -- Bank 12", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK13", ++ .ucode = 0xd00, ++ .udesc = "Access to Rank 0 -- Bank 13", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK14", ++ .ucode = 0xe00, ++ .udesc = "Access to Rank 0 -- Bank 14", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK15", ++ .ucode = 0xf00, ++ .udesc = "Access to Rank 0 -- Bank 15", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK2", ++ .ucode = 0x200, ++ .udesc = "Access to Rank 0 -- Bank 2", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK3", ++ .ucode = 0x300, ++ .udesc = "Access to Rank 0 -- Bank 3", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK4", ++ .ucode = 0x400, ++ .udesc = "Access to Rank 0 -- Bank 4", 
++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK5", ++ .ucode = 0x500, ++ .udesc = "Access to Rank 0 -- Bank 5", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK6", ++ .ucode = 0x600, ++ .udesc = "Access to Rank 0 -- Bank 6", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK7", ++ .ucode = 0x700, ++ .udesc = "Access to Rank 0 -- Bank 7", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK8", ++ .ucode = 0x800, ++ .udesc = "Access to Rank 0 -- Bank 8", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANK9", ++ .ucode = 0x900, ++ .udesc = "Access to Rank 0 -- Bank 9", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANKG0", ++ .ucode = 0x1100, ++ .udesc = "Access to Rank 0 -- Bank Group 0 (Banks 0-3)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANKG1", ++ .ucode = 0x1200, ++ .udesc = "Access to Rank 0 -- Bank Group 1 (Banks 4-7)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANKG2", ++ .ucode = 0x1300, ++ .udesc = "Access to Rank 0 -- Bank Group 2 (Banks 8-11)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BANKG3", ++ .ucode = 0x1400, ++ .udesc = "Access to Rank 0 -- Bank Group 3 (Banks 12-15)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_rd_cas_rank2[]={ ++ { .uname = "BANK0", ++ .ucode = 0x0, ++ .udesc = "RD_CAS Access to Rank 2 -- Bank 0", ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_vmse_wr_push[]={ ++ { .uname = "RMM", ++ .ucode = 0x200, ++ .udesc = "VMSE WR PUSH issued -- VMSE write PUSH issued in RMM", ++ }, ++ { .uname = "WMM", ++ .ucode = 0x100, ++ .udesc = "VMSE WR PUSH issued -- VMSE write PUSH issued in WMM", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_m_wmm_to_rmm[]={ ++ { .uname = "LOW_THRESH", ++ .ucode = 0x100, ++ .udesc = "Transition from WMM to RMM because of low threshold -- Transition from WMM to RMM because of starve counter", ++ }, ++ { .uname = "STARVE", ++ .ucode = 0x200, ++ .udesc = "Transition from WMM to 
RMM because of low threshold -- ", ++ }, ++ { .uname = "VMSE_RETRY", ++ .ucode = 0x400, ++ .udesc = "Transition from WMM to RMM because of low threshold -- ", ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_m_pe[]={ ++ { .name = "UNC_M_CLOCKTICKS", ++ .desc = "IMC Uncore clockticks (fixed counter)", ++ .modmsk = 0x0, ++ .cntmsk = 0x100000000ull, ++ .code = 0xff, /* perf pseudo encoding for fixed counter */ ++ .flags = INTEL_X86_FIXED, ++ }, ++ { .name = "UNC_M_ACT_COUNT", ++ .code = 0x1, ++ .desc = "Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_act_count, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_act_count), ++ }, ++ { .name = "UNC_M_BYP_CMDS", ++ .code = 0xa1, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_byp_cmds, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_byp_cmds), ++ }, ++ { .name = "UNC_M_CAS_COUNT", ++ .code = 0x4, ++ .desc = "DRAM RD_CAS and WR_CAS Commands", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_cas_count, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_cas_count), ++ }, ++ { .name = "UNC_M_DCLOCKTICKS", ++ .code = 0x0, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_DRAM_PRE_ALL", ++ .code = 0x6, ++ .desc = "Counts the number of times that the precharge all command was sent.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_DRAM_REFRESH", ++ .code = 0x5, ++ .desc = "Counts the number of refreshes issued.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_dram_refresh, ++ .numasks= 
LIBPFM_ARRAY_SIZE(bdx_unc_m_dram_refresh), ++ }, ++ { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", ++ .code = 0x9, ++ .desc = "Counts the number of ECC errors detected and corrected by the iMC on this channel. This counter is only useful with ECC DRAM devices. This count will increment one time for each correction regardless of the number of bits corrected. The iMC can correct up to 4 bit errors in independent channel mode and 8 bit errors in lockstep mode.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_MAJOR_MODES", ++ .code = 0x7, ++ .desc = "Counts the total number of cycles spent in a major mode (selected by a filter) on the given channel. Major modes are channel-wide, and not a per-rank (or dimm or bank) mode.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_major_modes, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_major_modes), ++ }, ++ { .name = "UNC_M_POWER_CHANNEL_DLLOFF", ++ .code = 0x84, ++ .desc = "Number of cycles when all the ranks in the channel are in CKE Slow (DLLOFF) mode.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_POWER_CHANNEL_PPD", ++ .code = 0x85, ++ .desc = "Number of cycles when all the ranks in the channel are in PPD mode. If IBT=off is enabled, then this can be used to count those cycles. If it is not enabled, then this can count the number of cycles when that could have been taken advantage of.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_POWER_CKE_CYCLES", ++ .code = 0x83, ++ .desc = "Number of cycles spent in CKE ON mode. The filter allows you to select a rank to monitor. If multiple ranks are in CKE ON mode at one time, the counter will ONLY increment by one rather than doing accumulation. Multiple counters will need to be used to track multiple ranks simultaneously. There is no distinction between the different CKE modes (APD, PPDS, PPDF). This can be determined based on the system programming.
These events should commonly be used with Invert to get the number of cycles in power saving mode. Edge Detect is also useful here. Make sure that you do NOT use Invert with Edge Detect (this just confuses the system and is not necessary).", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_power_cke_cycles, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_power_cke_cycles), ++ }, ++ { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", ++ .code = 0x86, ++ .desc = "Counts the number of cycles when the iMC is in critical thermal throttling. When this happens, all traffic is blocked. This should be rare unless something bad is going on in the platform. There is no filtering by rank for this event.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_POWER_PCU_THROTTLING", ++ .code = 0x42, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_POWER_SELF_REFRESH", ++ .code = 0x43, ++ .desc = "Counts the number of cycles when the iMC is in self-refresh and the iMC still has a clock. This happens in some package C-states. For example, the PCU may ask the iMC to enter self-refresh even though some of the cores are still processing. One use of this is for Monroe technology. Self-refresh is required during package C3 and C6, but there is no clock in the iMC at this time, so it is not possible to count these cases.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_POWER_THROTTLE_CYCLES", ++ .code = 0x41, ++ .desc = "Counts the number of cycles while the iMC is being throttled by either thermal constraints or by the PCU throttling. It is not possible to distinguish between the two. This can be filtered by rank. 
If multiple ranks are selected and are being throttled at the same time, the counter will only increment by 1.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_power_throttle_cycles, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_power_throttle_cycles), ++ }, ++ { .name = "UNC_M_PREEMPTION", ++ .code = 0x8, ++ .desc = "Counts the number of times a read in the iMC preempts another read or write. Generally reads to an open page are issued ahead of requests to closed pages. This improves the page hit rate of the system. However, high priority requests can cause pages of active requests to be closed in order to get them out. This will reduce the latency of the high-priority request at the expense of lower bandwidth and increased overall average latency.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_preemption, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_preemption), ++ }, ++ { .name = "UNC_M_PRE_COUNT", ++ .code = 0x2, ++ .desc = "Counts the number of DRAM Precharge commands sent on this channel.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_pre_count, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_pre_count), ++ }, ++ { .name = "UNC_M_RD_CAS_PRIO", ++ .code = 0xa0, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_prio, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_prio), ++ }, ++ { .name = "UNC_M_RD_CAS_RANK0", ++ .code = 0xb0, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), ++ }, ++ { .name = "UNC_M_RD_CAS_RANK1", ++ .code = 0xb1, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_RD_CAS_RANK2", ++ .code = 
0xb2, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank2), ++ }, ++ { .name = "UNC_M_RD_CAS_RANK4", ++ .code = 0xb4, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_RD_CAS_RANK5", ++ .code = 0xb5, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_RD_CAS_RANK6", ++ .code = 0xb6, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_RD_CAS_RANK7", ++ .code = 0xb7, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_RPQ_CYCLES_NE", ++ .code = 0x11, ++ .desc = "Counts the number of cycles that the Read Pending Queue is not empty. This can then be used to calculate the average occupancy (in conjunction with the Read Pending Queue Occupancy count). The RPQ is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. 
This filter is to be used in conjunction with the occupancy filter so that one can correctly track the average occupancies for schedulable entries and scheduled requests.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_RPQ_INSERTS", ++ .code = 0x10, ++ .desc = "Counts the number of allocations into the Read Pending Queue. This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. This includes both ISOCH and non-ISOCH requests.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_VMSE_MXB_WR_OCCUPANCY", ++ .code = 0x91, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_VMSE_WR_PUSH", ++ .code = 0x90, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_vmse_wr_push, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_vmse_wr_push), ++ }, ++ { .name = "UNC_M_WMM_TO_RMM", ++ .code = 0xc0, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_wmm_to_rmm, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_wmm_to_rmm), ++ }, ++ { .name = "UNC_M_WPQ_CYCLES_FULL", ++ .code = 0x22, ++ .desc = "Counts the number of cycles when the Write Pending Queue is full. When the WPQ is full, the HA will not be able to issue any additional read requests into the iMC. This count should be similar to the count in the HA, which tracks the number of cycles that the HA has no WPQ credits, just somewhat smaller to account for the credit return overhead.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_WPQ_CYCLES_NE", ++ .code = 0x21, ++ .desc = "Counts the number of cycles that the Write Pending Queue is not empty.
This can then be used to calculate the average queue occupancy (in conjunction with the WPQ Occupancy Accumulation count). The WPQ is used to schedule writes out to the memory controller and to track the writes. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon as they have posted to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstructing intermediate write latencies.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_WPQ_READ_HIT", ++ .code = 0x23, ++ .desc = "Counts the number of times a request hits in the WPQ (write-pending queue). The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data. Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_WPQ_WRITE_HIT", ++ .code = 0x24, ++ .desc = "Counts the number of times a request hits in the WPQ (write-pending queue). The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data.
Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_WRONG_MM", ++ .code = 0xc1, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_WR_CAS_RANK0", ++ .code = 0xb8, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), ++ }, ++ { .name = "UNC_M_WR_CAS_RANK1", ++ .code = 0xb9, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_WR_CAS_RANK4", ++ .code = 0xbc, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_WR_CAS_RANK5", ++ .code = 0xbd, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_WR_CAS_RANK6", ++ .code = 0xbe, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++ { .name = "UNC_M_WR_CAS_RANK7", ++ .code = 0xbf, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_IMC_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_m_rd_cas_rank0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ ++ }, ++}; ++ +diff --git a/lib/events/intel_bdx_unc_irp_events.h b/lib/events/intel_bdx_unc_irp_events.h +new file mode 100644 +index 0000000..1882a64 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_irp_events.h +@@ -0,0 +1,384 @@ ++/* ++ * Copyright (c) 
2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_irp ++ */ ++ ++static intel_x86_umask_t bdx_unc_i_cache_total_occupancy[]={ ++ { .uname = "ANY", ++ .ucode = 0x100, ++ .udesc = "Total Write Cache Occupancy -- Any Source", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "SOURCE", ++ .ucode = 0x200, ++ .udesc = "Total Write Cache Occupancy -- Select Source", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_i_coherent_ops[]={ ++ { .uname = "CLFLUSH", ++ .ucode = 0x8000, ++ .udesc = "Coherent Ops -- CLFlush", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CRD", ++ .ucode = 0x200, ++ .udesc = "Coherent Ops -- CRd", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DRD", ++ .ucode = 0x400, ++ .udesc = "Coherent Ops -- DRd", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PCIDCAHINT", ++ .ucode = 0x2000, ++ .udesc = "Coherent Ops -- PCIDCAHint", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PCIRDCUR", ++ .ucode = 0x100, ++ .udesc = "Coherent Ops -- PCIRdCur", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PCITOM", ++ .ucode = 0x1000, ++ .udesc = "Coherent Ops -- PCIItoM", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RFO", ++ .ucode = 0x800, ++ .udesc = "Coherent Ops -- RFO", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WBMTOI", ++ .ucode = 0x4000, ++ .udesc = "Coherent Ops -- WbMtoI", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_i_misc0[]={ ++ { .uname = "2ND_ATOMIC_INSERT", ++ .ucode = 0x1000, ++ .udesc = "Misc Events - Set 0 -- Cache Inserts of Atomic Transactions as Secondary", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "2ND_RD_INSERT", ++ .ucode = 0x400, ++ .udesc = "Misc Events - Set 0 -- Cache Inserts of Read Transactions as Secondary", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "2ND_WR_INSERT", ++ .ucode = 0x800, ++ .udesc = "Misc Events - Set 0 -- Cache Inserts of Write Transactions as Secondary", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname =
"FAST_REJ", ++ .ucode = 0x200, ++ .udesc = "Misc Events - Set 0 -- Fastpath Rejects", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "FAST_REQ", ++ .ucode = 0x100, ++ .udesc = "Misc Events - Set 0 -- Fastpath Requests", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "FAST_XFER", ++ .ucode = 0x2000, ++ .udesc = "Misc Events - Set 0 -- Fastpath Transfers From Primary to Secondary", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PF_ACK_HINT", ++ .ucode = 0x4000, ++ .udesc = "Misc Events - Set 0 -- Prefetch Ack Hints From Primary to Secondary", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PF_TIMEOUT", ++ .ucode = 0x8000, ++ .udesc = "Misc Events - Set 0 -- Prefetch TimeOut", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_i_misc1[]={ ++ { .uname = "DATA_THROTTLE", ++ .ucode = 0x8000, ++ .udesc = "Misc Events - Set 1 -- Data Throttled", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "LOST_FWD", ++ .ucode = 0x1000, ++ .udesc = "Misc Events - Set 1 -- ", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SEC_RCVD_INVLD", ++ .ucode = 0x2000, ++ .udesc = "Misc Events - Set 1 -- Received Invalid", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SEC_RCVD_VLD", ++ .ucode = 0x4000, ++ .udesc = "Misc Events - Set 1 -- Received Valid", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SLOW_I", ++ .ucode = 0x100, ++ .udesc = "Misc Events - Set 1 -- Slow Transfer of I Line", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SLOW_S", ++ .ucode = 0x200, ++ .udesc = "Misc Events - Set 1 -- Slow Transfer of S Line", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SLOW_E", ++ .ucode = 0x400, ++ .udesc = "Misc Events - Set 1 -- Slow Transfer of E Line", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SLOW_M", ++ .ucode = 0x800, ++ .udesc = "Misc Events - Set 1 -- Slow Transfer of M Line", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_i_snoop_resp[]={ ++ { .uname = "HIT_ES", ++ 
.ucode = 0x400, ++ .udesc = "Snoop Responses -- Hit E or S", ++ }, ++ { .uname = "HIT_I", ++ .ucode = 0x200, ++ .udesc = "Snoop Responses -- Hit I", ++ }, ++ { .uname = "HIT_M", ++ .ucode = 0x800, ++ .udesc = "Snoop Responses -- Hit M", ++ }, ++ { .uname = "MISS", ++ .ucode = 0x100, ++ .udesc = "Snoop Responses -- Miss", ++ }, ++ { .uname = "SNPCODE", ++ .ucode = 0x1000, ++ .udesc = "Snoop Responses -- SnpCode", ++ }, ++ { .uname = "SNPDATA", ++ .ucode = 0x2000, ++ .udesc = "Snoop Responses -- SnpData", ++ }, ++ { .uname = "SNPINV", ++ .ucode = 0x4000, ++ .udesc = "Snoop Responses -- SnpInv", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_i_transactions[]={ ++ { .uname = "ATOMIC", ++ .ucode = 0x1000, ++ .udesc = "Inbound Transaction Count -- Atomic", ++ }, ++ { .uname = "ORDERINGQ", ++ .ucode = 0x4000, ++ .udesc = "Inbound Transaction Count -- Select Source via IRP orderingQ register", ++ }, ++ { .uname = "OTHER", ++ .ucode = 0x2000, ++ .udesc = "Inbound Transaction Count -- Other", ++ }, ++ { .uname = "RD_PREF", ++ .ucode = 0x400, ++ .udesc = "Inbound Transaction Count -- Read Prefetches", ++ }, ++ { .uname = "READS", ++ .ucode = 0x100, ++ .udesc = "Inbound Transaction Count -- Reads", ++ }, ++ { .uname = "WRITES", ++ .ucode = 0x200, ++ .udesc = "Inbound Transaction Count -- Writes", ++ }, ++ { .uname = "WR_PREF", ++ .ucode = 0x800, ++ .udesc = "Inbound Transaction Count -- Write Prefetches", ++ }, ++}; ++ ++ ++static intel_x86_entry_t intel_bdx_unc_i_pe[]={ ++ { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", ++ .code = 0x12, ++ .desc = "Accumulates the number of reads and writes that are outstanding in the uncore in each cycle. 
This is effectively the sum of the READ_OCCUPANCY and WRITE_OCCUPANCY events.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_cache_total_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_cache_total_occupancy), ++ }, ++ { .name = "UNC_I_CLOCKTICKS", ++ .code = 0x0, ++ .desc = "Number of clocks in the IRP.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_COHERENT_OPS", ++ .code = 0x13, ++ .desc = "Counts the number of coherency related operations serviced by the IRP", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_coherent_ops, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_coherent_ops), ++ }, ++ { .name = "UNC_I_MISC0", ++ .code = 0x14, ++ .desc = "", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_misc0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_misc0), ++ }, ++ { .name = "UNC_I_MISC1", ++ .code = 0x15, ++ .desc = "", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_misc1, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_misc1), ++ }, ++ { .name = "UNC_I_RXR_AK_INSERTS", ++ .code = 0xa, ++ .desc = "Counts the number of allocations into the AK Ingress. This queue is where the IRP receives responses from R2PCIe (the ring).", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_DRS_CYCLES_FULL", ++ .code = 0x4, ++ .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_DRS_INSERTS", ++ .code = 0x1, ++ .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring).
It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_DRS_OCCUPANCY", ++ .code = 0x7, ++ .desc = "Accumulates the occupancy of the BL Ingress in each cycle. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCB_CYCLES_FULL", ++ .code = 0x5, ++ .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCB_INSERTS", ++ .code = 0x2, ++ .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCB_OCCUPANCY", ++ .code = 0x8, ++ .desc = "Accumulates the occupancy of the BL Ingress in each cycle. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCS_CYCLES_FULL", ++ .code = 0x6, ++ .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCS_INSERTS", ++ .code = 0x3, ++ .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring).
It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_RXR_BL_NCS_OCCUPANCY", ++ .code = 0x9, ++ .desc = "Accumulates the occupancy of the BL Ingress in each cycle. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_SNOOP_RESP", ++ .code = 0x17, ++ .desc = "", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_snoop_resp, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_snoop_resp), ++ }, ++ { .name = "UNC_I_TRANSACTIONS", ++ .code = 0x16, ++ .desc = "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_i_transactions, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_transactions), ++ }, ++ { .name = "UNC_I_TXR_AD_STALL_CREDIT_CYCLES", ++ .code = 0x18, ++ .desc = "Counts the number of times when it is not possible to issue a request to the R2PCIe because there are no AD Egress Credits available.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_TXR_BL_STALL_CREDIT_CYCLES", ++ .code = 0x19, ++ .desc = "Counts the number of times when it is not possible to issue data to the R2PCIe because there are no BL Egress Credits available.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_TXR_DATA_INSERTS_NCB", ++ .code = 0xe, ++ .desc = "Counts the number of requests issued to the switch (towards the devices).", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name =
"UNC_I_TXR_DATA_INSERTS_NCS", ++ .code = 0xf, ++ .desc = "Counts the number of requests issued to the switch (towards the devices).", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++ { .name = "UNC_I_TXR_REQUEST_OCCUPANCY", ++ .code = 0xd, ++ .desc = "Accumultes the number of outstanding outbound requests from the IRP to the switch (towards the devices). This can be used in conjuection with the allocations event in order to calculate average latency of outbound requests.", ++ .modmsk = BDX_UNC_IRP_ATTRS, ++ .cntmsk = 0x3, ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_pcu_events.h b/lib/events/intel_bdx_unc_pcu_events.h +new file mode 100644 +index 0000000..24b0bd5 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_pcu_events.h +@@ -0,0 +1,427 @@ ++/* ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: bdx_unc_pcu ++ */ ++ ++static intel_x86_umask_t bdx_unc_p_power_state_occupancy[]={ ++ { .uname = "CORES_C0", ++ .ucode = 0x4000, ++ .udesc = "Number of cores in C-State -- C0 and C1", ++ }, ++ { .uname = "CORES_C3", ++ .ucode = 0x8000, ++ .udesc = "Number of cores in C-State -- C3", ++ }, ++ { .uname = "CORES_C6", ++ .ucode = 0xc000, ++ .udesc = "Number of cores in C-State -- C6 and C7", ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_p_pe[]={ ++ { .name = "UNC_P_CLOCKTICKS", ++ .code = 0x0, ++ .desc = "The PCU runs off a fixed 1 GHz clock. This event counts the number of pclk cycles measured while the counter was enabled. The pclk, like the Memory Controller's dclk, counts at a constant rate making it a good measure of actual wall time.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE0_TRANSITION_CYCLES", ++ .code = 0x60, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE10_TRANSITION_CYCLES", ++ .code = 0x6a, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE11_TRANSITION_CYCLES", ++ .code = 0x6b, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE12_TRANSITION_CYCLES", ++ .code = 0x6c, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE13_TRANSITION_CYCLES", ++ .code = 0x6d, ++ .desc = "Number of cycles spent performing core C state transitions.
There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE14_TRANSITION_CYCLES", ++ .code = 0x6e, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE15_TRANSITION_CYCLES", ++ .code = 0x6f, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE16_TRANSITION_CYCLES", ++ .code = 0x70, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE17_TRANSITION_CYCLES", ++ .code = 0x71, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE1_TRANSITION_CYCLES", ++ .code = 0x61, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE2_TRANSITION_CYCLES", ++ .code = 0x62, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE3_TRANSITION_CYCLES", ++ .code = 0x63, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE4_TRANSITION_CYCLES", ++ .code = 0x64, ++ .desc = "Number of cycles spent performing core C state transitions. 
There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE5_TRANSITION_CYCLES", ++ .code = 0x65, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE6_TRANSITION_CYCLES", ++ .code = 0x66, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE7_TRANSITION_CYCLES", ++ .code = 0x67, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE8_TRANSITION_CYCLES", ++ .code = 0x68, ++ .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_CORE9_TRANSITION_CYCLES", ++ .code = 0x69, ++ .desc = "Number of cycles spent performing core C state transitions. 
There is one event per core.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE0", ++ .code = 0x30, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE1", ++ .code = 0x31, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE10", ++ .code = 0x3a, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE11", ++ .code = 0x3b, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE12", ++ .code = 0x3c, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE13", ++ .code = 0x3d, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE14", ++ .code = 0x3e, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE15", ++ .code = 0x3f, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE16", ++ .code = 0x40, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE17", ++ .code = 0x41, ++ .desc = "Counts the number of times
when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE2", ++ .code = 0x32, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE3", ++ .code = 0x33, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE4", ++ .code = 0x34, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE5", ++ .code = 0x35, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE6", ++ .code = 0x36, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE7", ++ .code = 0x37, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE8", ++ .code = 0x38, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_DEMOTIONS_CORE9", ++ .code = 0x39, ++ .desc = "Counts the number of times when a configurable core had a C-state demotion", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", ++ .code = 0x4, ++ .desc = "Counts the number of cycles when thermal conditions are the upper limit on frequency.
This is related to the THERMAL_THROTTLE CYCLES_ABOVE_TEMP event, which always counts cycles when we are above the thermal temperature. This event (STRONGEST_UPPER_LIMIT) is sampled at the output of the algorithm that determines the actual frequency, while THERMAL_THROTTLE looks at the input.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_MAX_OS_CYCLES", ++ .code = 0x6, ++ .desc = "Counts the number of cycles when the OS is the upper limit on frequency.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", ++ .code = 0x5, ++ .desc = "Counts the number of cycles when power is the upper limit on frequency.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", ++ .code = 0x73, ++ .desc = "Counts the number of cycles when IO P Limit is preventing us from dropping the frequency lower. This algorithm monitors the needs of the IO subsystem on both local and remote sockets and will maintain a frequency high enough to maintain good IO BW. This is necessary for when all the IA cores on a socket are idle but a user still would like to maintain high IO Bandwidth.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_TRANS_CYCLES", ++ .code = 0x74, ++ .desc = "Counts the number of cycles when the system is changing frequency. This cannot be filtered by thread ID. One can also use it with the occupancy counter that monitors number of threads in C0 to estimate the performance impact that frequency transitions had on the system.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", ++ .code = 0x2f, ++ .desc = "Counts the number of cycles that the PCU has triggered memory phase shedding.
This is a mode that can be run in the iMC physicals that saves power at the expense of additional latency.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_POWER_STATE_OCCUPANCY", ++ .code = 0x80, ++ .desc = "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_p_power_state_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_p_power_state_occupancy), ++ }, ++ { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", ++ .code = 0xa, ++ .desc = "Counts the number of cycles that we are in external PROCHOT mode. This mode is triggered when a sensor off the die determines that something off-die (like DRAM) is too hot and must throttle to avoid damaging the chip.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", ++ .code = 0x9, ++ .desc = "Counts the number of cycles that we are in Internal PROCHOT mode.
This mode is triggered when a sensor on the die determines that we are too hot and must throttle to avoid damaging the chip.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", ++ .code = 0x72, ++ .desc = "Number of cycles spent performing core C state transitions across all cores.", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_BANDWIDTH_MAX_RANGE", ++ .code = 0x7e, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_TRANSITIONS_DOWN", ++ .code = 0x7c, ++ .desc = "Ring GV down due to low traffic", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_TRANSITIONS_IO_P_LIMIT", ++ .code = 0x7d, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_TRANSITIONS_NO_CHANGE", ++ .code = 0x79, ++ .desc = "Ring GV with same final and initial frequency", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_TRANSITIONS_UP_RING", ++ .code = 0x7a, ++ .desc = "Ring GV up due to high ring traffic", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_UFS_TRANSITIONS_UP_STALL", ++ .code = 0x7b, ++ .desc = "Ring GV up due to high core stalls", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_VR_HOT_CYCLES", ++ .code = 0x42, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_P_FREQ_BAND0_CYCLES", ++ .desc = "Frequency Residency", ++ .code = 0xb, ++ .cntmsk = 0xf, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .modmsk = BDX_UNC_PCU_BAND_ATTRS, ++ .modmsk_req = _SNBEP_UNC_ATTR_FF, ++ }, ++ { .name = "UNC_P_FREQ_BAND1_CYCLES", ++ .desc = "Frequency Residency", ++ .code = 0xc, ++ .cntmsk = 0xf, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .modmsk = BDX_UNC_PCU_BAND_ATTRS, ++ .modmsk_req = _SNBEP_UNC_ATTR_FF, ++ }, ++ { .name = "UNC_P_FREQ_BAND2_CYCLES", ++ .desc = "Frequency
Residency", ++ .code = 0xd, ++ .cntmsk = 0xf, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .modmsk = BDX_UNC_PCU_BAND_ATTRS, ++ .modmsk_req = _SNBEP_UNC_ATTR_FF, ++ }, ++ { .name = "UNC_P_FREQ_BAND3_CYCLES", ++ .desc = "Frequency Residency", ++ .code = 0xe, ++ .cntmsk = 0xf, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .modmsk = BDX_UNC_PCU_BAND_ATTRS, ++ .modmsk_req = _SNBEP_UNC_ATTR_FF, ++ }, ++ { .name = "UNC_P_FIVR_PS_PS0_CYCLES", ++ .desc = "Cycles spent in phase-shedding power state 0", ++ .code = 0x75, ++ .cntmsk = 0xf, ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ }, ++ { .name = "UNC_P_FIVR_PS_PS1_CYCLES", ++ .desc = "Cycles spent in phase-shedding power state 1", ++ .code = 0x76, ++ .cntmsk = 0xf, ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ }, ++ { .name = "UNC_P_FIVR_PS_PS2_CYCLES", ++ .desc = "Cycles spent in phase-shedding power state 2", ++ .code = 0x77, ++ .cntmsk = 0xf, ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ }, ++ { .name = "UNC_P_FIVR_PS_PS3_CYCLES", ++ .desc = "Cycles spent in phase-shedding power state 3", ++ .code = 0x78, ++ .cntmsk = 0xf, ++ .modmsk = BDX_UNC_PCU_ATTRS, ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_qpi_events.h b/lib/events/intel_bdx_unc_qpi_events.h +new file mode 100644 +index 0000000..18c010a +--- /dev/null ++++ b/lib/events/intel_bdx_unc_qpi_events.h +@@ -0,0 +1,710 @@ ++/* ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_qpi ++ */ ++ ++static intel_x86_umask_t bdx_unc_q_direct2core[]={ ++ { .uname = "FAILURE_CREDITS", ++ .ucode = 0x200, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress Credits", ++ }, ++ { .uname = "FAILURE_CREDITS_MISS", ++ .ucode = 0x2000, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Miss", ++ }, ++ { .uname = "FAILURE_CREDITS_RBT", ++ .ucode = 0x800, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Invalid", ++ }, ++ { .uname = "FAILURE_CREDITS_RBT_MISS", ++ .ucode = 0x8000, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Miss, Invalid", ++ }, ++ { .uname = "FAILURE_MISS", ++ .ucode = 0x1000, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Miss", ++ }, ++ { .uname = "FAILURE_RBT_HIT", ++ .ucode = 0x400, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Invalid", ++ }, ++ { .uname = "FAILURE_RBT_MISS", ++ .ucode = 0x4000, ++ .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Miss and Invalid", ++ }, ++ { .uname = "SUCCESS_RBT_HIT", ++ .ucode = 0x100, ++ .udesc = "Direct 2 Core Spawning -- Spawn Success", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_rxl_credits_consumed_vn0[]={ ++ { .uname = "DRS", ++ .ucode = 0x100, ++ .udesc = "VN0 Credit Consumed -- DRS", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x800, ++ .udesc = "VN0 Credit Consumed -- HOM", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x200, ++ .udesc = "VN0 Credit Consumed -- NCB", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x400, ++ .udesc = "VN0 Credit Consumed -- NCS", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x2000, ++ .udesc = "VN0 Credit Consumed -- NDR", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x1000, ++ .udesc = "VN0 Credit Consumed -- SNP", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_rxl_flits_g1[]={ ++ { .uname = "DRS", ++ .ucode = 0x1800, ++ .udesc = "Flits Received - Group 1 -- DRS Flits (both Header and Data)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = 
"DRS_DATA", ++ .ucode = 0x800, ++ .udesc = "Flits Received - Group 1 -- DRS Data Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DRS_NONDATA", ++ .ucode = 0x1000, ++ .udesc = "Flits Received - Group 1 -- DRS Header Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM", ++ .ucode = 0x600, ++ .udesc = "Flits Received - Group 1 -- HOM Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM_NONREQ", ++ .ucode = 0x400, ++ .udesc = "Flits Received - Group 1 -- HOM Non-Request Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM_REQ", ++ .ucode = 0x200, ++ .udesc = "Flits Received - Group 1 -- HOM Request Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SNP", ++ .ucode = 0x100, ++ .udesc = "Flits Received - Group 1 -- SNP Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_rxl_flits_g2[]={ ++ { .uname = "NCB", ++ .ucode = 0xc00, ++ .udesc = "Flits Received - Group 2 -- Non-Coherent Rx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCB_DATA", ++ .ucode = 0x400, ++ .udesc = "Flits Received - Group 2 -- Non-Coherent data Rx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCB_NONDATA", ++ .ucode = 0x800, ++ .udesc = "Flits Received - Group 2 -- Non-Coherent non-data Rx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCS", ++ .ucode = 0x1000, ++ .udesc = "Flits Received - Group 2 -- Non-Coherent standard Rx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NDR_AD", ++ .ucode = 0x100, ++ .udesc = "Flits Received - Group 2 -- Non-Data Response Rx Flits - AD", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NDR_AK", ++ .ucode = 0x200, ++ .udesc = "Flits Received - Group 2 -- Non-Data Response Rx Flits - AK", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_rxl_inserts_drs[]={ ++ { .uname = "VN0", ++ .ucode = 0x100, ++ .udesc = "for VN0", ++ }, ++ { .uname = "VN1", ++ .ucode = 0x200, ++ .udesc = "for VN1", ++ 
}, ++}; ++ ++static const intel_x86_umask_t bdx_unc_q_rxl_flits_g0[]={ ++ { .uname = "IDLE", ++ .udesc = "Number of data flits over QPI that do not hold payload. When QPI is not in a power saving state, it continuously transmits flits across the link. When there are no protocol flits to send, it will send IDLE and NULL flits across", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DATA", ++ .udesc = "Number of data flits over QPI", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NON_DATA", ++ .udesc = "Number of non-NULL non-data flits over QPI", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_txl_flits_g0[]={ ++ { .uname = "DATA", ++ .ucode = 0x200, ++ .udesc = "Flits Transferred - Group 0 -- Data Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NON_DATA", ++ .ucode = 0x400, ++ .udesc = "Flits Transferred - Group 0 -- Non-Data protocol Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_txl_flits_g1[]={ ++ { .uname = "DRS", ++ .ucode = 0x1800, ++ .udesc = "Flits Transferred - Group 1 -- DRS Flits (both Header and Data)", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DRS_DATA", ++ .ucode = 0x800, ++ .udesc = "Flits Transferred - Group 1 -- DRS Data Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DRS_NONDATA", ++ .ucode = 0x1000, ++ .udesc = "Flits Transferred - Group 1 -- DRS Header Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM", ++ .ucode = 0x600, ++ .udesc = "Flits Transferred - Group 1 -- HOM Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM_NONREQ", ++ .ucode = 0x400, ++ .udesc = "Flits Transferred - Group 1 -- HOM Non-Request Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HOM_REQ", ++ .ucode = 0x200, ++ .udesc = "Flits Transferred - Group 1 -- HOM Request Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SNP", ++ .ucode = 0x100, ++ 
.udesc = "Flits Transferred - Group 1 -- SNP Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_txl_flits_g2[]={ ++ { .uname = "NCB", ++ .ucode = 0xc00, ++ .udesc = "Flits Transferred - Group 2 -- Non-Coherent Bypass Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCB_DATA", ++ .ucode = 0x400, ++ .udesc = "Flits Transferred - Group 2 -- Non-Coherent data Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCB_NONDATA", ++ .ucode = 0x800, ++ .udesc = "Flits Transferred - Group 2 -- Non-Coherent non-data Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NCS", ++ .ucode = 0x1000, ++ .udesc = "Flits Transferred - Group 2 -- Non-Coherent standard Tx Flits", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NDR_AD", ++ .ucode = 0x100, ++ .udesc = "Flits Transferred - Group 2 -- Non-Data Response Tx Flits - AD", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NDR_AK", ++ .ucode = 0x200, ++ .udesc = "Flits Transferred - Group 2 -- Non-Data Response Tx Flits - AK", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_q_txr_bl_drs_credit_acquired[]={ ++ { .uname = "VN0", ++ .ucode = 0x100, ++ .udesc = "R3QPI Egress Credit Occupancy - DRS -- for VN0", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "VN1", ++ .ucode = 0x200, ++ .udesc = "R3QPI Egress Credit Occupancy - DRS -- for VN1", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "VN_SHR", ++ .ucode = 0x400, ++ .udesc = "R3QPI Egress Credit Occupancy - DRS -- for Shared VN", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_q_pe[]={ ++ { .name = "UNC_Q_CLOCKTICKS", ++ .code = 0x14, ++ .desc = "Counts the number of clocks in the QPI LL. This clock runs at 1/4th the GT/s speed of the QPI link. For example, a 4GT/s link will have qfclk of 1GHz.
BDX does not support dynamic link speeds, so this frequency is fixed.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_CTO_COUNT", ++ .code = 0x38 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of CTO (cluster trigger outs) events that were asserted across the two slots. If both slots trigger in a given cycle, the event will increment by 2. You can use edge detect to count the number of cases when both events triggered.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_DIRECT2CORE", ++ .code = 0x13, ++ .desc = "Counts the number of DRS packets that we attempted to do direct2core on. There are 4 mutually exclusive filters. Filter [0] can be used to get successful spawns, while [1:3] provide the different failure cases. Note that this does not count packets that are not candidates for Direct2Core. The only candidates for Direct2Core are DRS packets destined for Cbos.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_direct2core, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_direct2core), ++ }, ++ { .name = "UNC_Q_L1_POWER_CYCLES", ++ .code = 0x12, ++ .desc = "Number of QPI qfclk cycles spent in L1 power mode. L1 is a mode that totally shuts down a QPI link. Use edge detect to count the number of instances when the QPI link entered L1. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. Because L1 totally shuts down the link, it takes a good amount of time to exit this mode.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL0P_POWER_CYCLES", ++ .code = 0x10, ++ .desc = "Number of QPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 1/2 of the QPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth.
This mode can be very useful in NUMA optimized workloads that largely only utilize QPI for snoops and their responses. Use edge detect to count the number of instances when the QPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL0_POWER_CYCLES", ++ .code = 0xf, ++ .desc = "Number of QPI qfclk cycles spent in L0 power mode in the Link Layer. L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. The phy layer sometimes leaves L0 for training, which will not be captured by this event.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_BYPASSED", ++ .code = 0x9, ++ .desc = "Counts the number of times that an incoming flit was able to bypass the flit buffer and pass directly across the BGF and into the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of flits transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN0", ++ .code = 0x1e | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of times that an RxQ VN0 credit was consumed (i.e. message uses a VN0 credit for the Rx Buffer).
This includes packets that went through the RxQ and those that were bypassed.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_credits_consumed_vn0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_credits_consumed_vn0), ++ }, ++ { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN1", ++ .code = 0x39 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of times that an RxQ VN1 credit was consumed (i.e. message uses a VN1 credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_credits_consumed_vn0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_credits_consumed_vn0), ++ }, ++ { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VNA", ++ .code = 0x1d | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of times that an RxQ VNA credit was consumed (i.e. message uses a VNA credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_CYCLES_NE", ++ .code = 0xa, ++ .desc = "Counts the number of cycles that the QPI RxQ was not empty. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency.
This event can be used in conjunction with the Flit Buffer Occupancy Accumulator event to calculate the average occupancy.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_FLITS_G0", ++ .code = 0x1, ++ .desc = "Counts the number of flits received from the QPI Link.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_flits_g0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g0), ++ }, ++ { .name = "UNC_Q_RXL_FLITS_G1", ++ .code = 0x2 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of flits received from the QPI Link. This is one of three groups that allow us to track flits. It includes filters for SNP, HOM, and DRS message classes. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_flits_g1, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g1), ++ }, ++ { .name = "UNC_Q_RXL_FLITS_G2", ++ .code = 0x3 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of flits received from the QPI Link.
This is one of three groups that allow us to track flits. It includes filters for NDR, NCB, and NCS message classes. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_flits_g2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g2), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS", ++ .code = 0x8, ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_DRS", ++ .code = 0x9 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer.
Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only DRS flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_HOM", ++ .code = 0xc | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only HOM flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_NCB", ++ .code = 0xa | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. 
This monitors only NCB flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_NCS", ++ .code = 0xb | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only NCS flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_NDR", ++ .code = 0xe | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only NDR flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_INSERTS_SNP", ++ .code = 0xd | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. 
If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only SNP flits.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY", ++ .code = 0xb, ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_DRS", ++ .code = 0x15 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. 
This monitors DRS flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_HOM", ++ .code = 0x18 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors HOM flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_NCB", ++ .code = 0x16 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors NCB flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_NCS", ++ .code = 0x17 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. 
Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors NCS flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_NDR", ++ .code = 0x1a | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors NDR flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_RXL_OCCUPANCY_SNP", ++ .code = 0x19 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. 
This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors SNP flits only.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXL0P_POWER_CYCLES", ++ .code = 0xd, ++ .desc = "Number of QPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 1/2 of the QPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth. This mode can be very useful in NUMA optimized workloads that largely only utilize QPI for snoops and their responses. Use edge detect to count the number of instances when the QPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXL0_POWER_CYCLES", ++ .code = 0xc, ++ .desc = "Number of QPI qfclk cycles spent in L0 power mode in the Link Layer. L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. The phy layer sometimes leaves L0 for training, which will not be captured by this event.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXL_BYPASSED", ++ .code = 0x5, ++ .desc = "Counts the number of times that an incoming flit was able to bypass the Tx flit buffer and pass directly out the QPI Link. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. 
However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXL_CYCLES_NE", ++ .code = 0x6, ++ .desc = "Counts the number of cycles when the TxQ is not empty. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXL_FLITS_G0", ++ .code = 0x0, ++ .desc = "Counts the number of flits transmitted across the QPI Link. It includes filters for Idle, protocol, and Data Flits. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. 
To calculate data bandwidth, one should therefore do: data flits * 8B / time (for L0) or 4B instead of 8B for L0p.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_txl_flits_g0, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g0), ++ }, ++ { .name = "UNC_Q_TXL_FLITS_G1", ++ .code = 0x0 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of flits transmitted across the QPI Link. It includes filters for Idle, protocol, and Data Flits. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time (for L0) or 4B instead of 8B for L0p.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_txl_flits_g1, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g1), ++ }, ++ { .name = "UNC_Q_TXL_FLITS_G2", ++ .code = 0x1 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Counts the number of flits transmitted across the QPI Link. This is one of three groups that allow us to track flits. It includes filters for NDR, NCB, and NCS message classes. 
Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_txl_flits_g2, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g2), ++ }, ++ { .name = "UNC_Q_TXL_INSERTS", ++ .code = 0x4, ++ .desc = "Number of allocations into the QPI Tx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXL_OCCUPANCY", ++ .code = 0x7, ++ .desc = "Accumulates the number of flits in the TxQ. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. 
This can be used with the cycles not empty event to track average occupancy, or the allocations event to track average lifetime in the TxQ.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXR_AD_HOM_CREDIT_ACQUIRED", ++ .code = 0x26 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for Home messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AD_HOM_CREDIT_OCCUPANCY", ++ .code = 0x22 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. Flow Control FIFO for HOM messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AD_NDR_CREDIT_ACQUIRED", ++ .code = 0x28 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for NDR messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AD_NDR_CREDIT_OCCUPANCY", ++ .code = 0x24 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. 
Flow Control FIFO for NDR messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AD_SNP_CREDIT_ACQUIRED", ++ .code = 0x27 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for Snoop messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AD_SNP_CREDIT_OCCUPANCY", ++ .code = 0x23 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. Flow Control FIFO for Snoop messages on AD.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_AK_NDR_CREDIT_ACQUIRED", ++ .code = 0x29 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. Local NDR message class to AK Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXR_AK_NDR_CREDIT_OCCUPANCY", ++ .code = 0x25 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. Local NDR message class to AK Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_TXR_BL_DRS_CREDIT_ACQUIRED", ++ .code = 0x2a | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. 
DRS message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_txr_bl_drs_credit_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txr_bl_drs_credit_acquired), ++ }, ++ { .name = "UNC_Q_TXR_BL_DRS_CREDIT_OCCUPANCY", ++ .code = 0x1f | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. DRS message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_txr_bl_drs_credit_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txr_bl_drs_credit_acquired), ++ }, ++ { .name = "UNC_Q_TXR_BL_NCB_CREDIT_ACQUIRED", ++ .code = 0x2b | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. NCB message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_BL_NCB_CREDIT_OCCUPANCY", ++ .code = 0x20 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. NCB message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_BL_NCS_CREDIT_ACQUIRED", ++ .code = 0x2c | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. 
NCS message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_TXR_BL_NCS_CREDIT_OCCUPANCY", ++ .code = 0x21 | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. NCS message class to BL Egress.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_q_rxl_inserts_drs, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), ++ }, ++ { .name = "UNC_Q_VNA_CREDIT_RETURNS", ++ .code = 0x1c | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of VNA credits returned.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_Q_VNA_CREDIT_RETURN_OCCUPANCY", ++ .code = 0x1b | (1 << 21), /* extra ev_sel_ext bit set */ ++ .desc = "Number of VNA credits in the Rx side that are waiting to be returned back across the link.", ++ .modmsk = BDX_UNC_QPI_ATTRS, ++ .cntmsk = 0xf, ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_r2pcie_events.h b/lib/events/intel_bdx_unc_r2pcie_events.h +new file mode 100644 +index 0000000..5ce8845 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_r2pcie_events.h +@@ -0,0 +1,344 @@ ++/* ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_r2pcie ++ */ ++ ++static intel_x86_umask_t bdx_unc_r2_iio_credit[]={ ++ { .uname = "ISOCH_QPI0", ++ .ucode = 0x400, ++ .udesc = "TBD", ++ }, ++ { .uname = "ISOCH_QPI1", ++ .ucode = 0x800, ++ .udesc = "TBD", ++ }, ++ { .uname = "PRQ_QPI0", ++ .ucode = 0x100, ++ .udesc = "TBD", ++ }, ++ { .uname = "PRQ_QPI1", ++ .ucode = 0x200, ++ .udesc = "TBD", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_ring_ad_used[]={ ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "Counterclockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CCW_EVEN", ++ .ucode = 0x400, ++ .udesc = "Counterclockwise and Even", ++ }, ++ { .uname = "CCW_ODD", ++ .ucode = 0x800, ++ .udesc = "Counterclockwise and Odd", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "Clockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CW_EVEN", ++ .ucode = 0x100, ++ .udesc = "Clockwise and Even", ++ }, ++ { .uname = "CW_ODD", ++ .ucode = 0x200, ++ .udesc = "Clockwise and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_ring_ak_bounces[]={ ++ { .uname = "DN", ++ .ucode = 0x200, ++ .udesc = "AK Ingress Bounced -- Dn", ++ }, ++ { .uname = "UP", ++ .ucode = 0x100, ++ .udesc = "AK Ingress Bounced -- Up", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_ring_iv_used[]={ ++ { .uname = "ANY", ++ .ucode = 0xf00, ++ .udesc = "Any directions", ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "Counterclockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "Clockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_rxr_cycles_ne[]={ ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "NCB", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "NCS", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_rxr_occupancy[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "Ingress Occupancy Accumulator -- DRS", ++ .uflags = INTEL_X86_DFL, ++ }, 
++}; ++ ++static intel_x86_umask_t bdx_unc_r2_sbo0_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "SBo0 Credits Acquired -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "SBo0 Credits Acquired -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_stall_no_sbo_credit[]={ ++ { .uname = "SBO0_AD", ++ .ucode = 0x100, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", ++ }, ++ { .uname = "SBO0_BL", ++ .ucode = 0x400, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", ++ }, ++ { .uname = "SBO1_AD", ++ .ucode = 0x200, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", ++ }, ++ { .uname = "SBO1_BL", ++ .ucode = 0x800, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_txr_cycles_full[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "Egress Cycles Full -- AD", ++ }, ++ { .uname = "AK", ++ .ucode = 0x200, ++ .udesc = "Egress Cycles Full -- AK", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "Egress Cycles Full -- BL", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_txr_cycles_ne[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "Egress Cycles Not Empty -- AD", ++ }, ++ { .uname = "AK", ++ .ucode = 0x200, ++ .udesc = "Egress Cycles Not Empty -- AK", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "Egress Cycles Not Empty -- BL", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r2_txr_nack_cw[]={ ++ { .uname = "DN_AD", ++ .ucode = 0x100, ++ .udesc = "Egress CCW NACK -- AD CCW", ++ }, ++ { .uname = "DN_AK", ++ .ucode = 0x400, ++ .udesc = "Egress CCW NACK -- AK CCW", ++ }, ++ { .uname = "DN_BL", ++ .ucode = 0x200, ++ .udesc = "Egress CCW NACK -- BL CCW", ++ }, ++ { .uname = "UP_AD", ++ .ucode = 0x800, ++ .udesc = "Egress CCW NACK -- AK CCW", ++ }, ++ { .uname = "UP_AK", ++ .ucode = 0x2000, ++ .udesc = "Egress CCW NACK -- BL CW", ++ }, ++ { .uname = "UP_BL", ++ .ucode = 0x1000, ++ .udesc = 
"Egress CCW NACK -- BL CCW", ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_r2_pe[]={ ++ { .name = "UNC_R2_CLOCKTICKS", ++ .code = 0x1, ++ .desc = "Counts the number of uclks in the R2PCIe uclk domain. This could be slightly different than the count in the Ubox because of enable/freeze delays. However, because the R2PCIe is close to the Ubox, they generally should not diverge by more than a handful of cycles.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_R2_IIO_CREDIT", ++ .code = 0x2d, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_iio_credit, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_iio_credit), ++ }, ++ { .name = "UNC_R2_RING_AD_USED", ++ .code = 0x7, ++ .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), ++ }, ++ { .name = "UNC_R2_RING_AK_BOUNCES", ++ .code = 0x12, ++ .desc = "Counts the number of times when a request destined for the AK ingress bounced.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_ring_ak_bounces, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ak_bounces), ++ }, ++ { .name = "UNC_R2_RING_AK_USED", ++ .code = 0x8, ++ .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), ++ }, ++ { .name = "UNC_R2_RING_BL_USED", ++ .code = 0x9, ++ .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), ++ }, ++ { .name = "UNC_R2_RING_IV_USED", ++ .code = 0xa, ++ .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_ring_iv_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_iv_used), ++ }, ++ { .name = "UNC_R2_RXR_CYCLES_NE", ++ .code = 0x10, ++ .desc = "Counts the number of cycles when the R2PCIe Ingress is not empty. This tracks one of the three rings that are used by the R2PCIe agent. This can be used in conjunction with the R2PCIe Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_rxr_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_cycles_ne), ++ }, ++ { .name = "UNC_R2_RXR_INSERTS", ++ .code = 0x11, ++ .desc = "Counts the number of allocations into the R2PCIe Ingress. 
This tracks one of the three rings that are used by the R2PCIe agent. This can be used in conjunction with the R2PCIe Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_rxr_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_cycles_ne), ++ }, ++ { .name = "UNC_R2_RXR_OCCUPANCY", ++ .code = 0x13, ++ .desc = "Accumulates the occupancy of a given R2PCIe Ingress queue in each cycle. This tracks one of the three ring Ingress buffers. This can be used with the R2PCIe Ingress Not Empty event to calculate average occupancy or the R2PCIe Ingress Allocations event in order to calculate average queuing latency.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_rxr_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_occupancy), ++ }, ++ { .name = "UNC_R2_SBO0_CREDITS_ACQUIRED", ++ .code = 0x28, ++ .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_sbo0_credits_acquired), ++ }, ++ { .name = "UNC_R2_STALL_NO_SBO_CREDIT", ++ .code = 0x2c, ++ .desc = "Number of cycles Egress is stalled waiting for an Sbo credit to become available. 
Per Sbo, per Ring.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_stall_no_sbo_credit, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_stall_no_sbo_credit), ++ }, ++ { .name = "UNC_R2_TXR_CYCLES_FULL", ++ .code = 0x25, ++ .desc = "Counts the number of cycles when the R2PCIe Egress buffer is full.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_txr_cycles_full, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_cycles_full), ++ }, ++ { .name = "UNC_R2_TXR_CYCLES_NE", ++ .code = 0x23, ++ .desc = "Counts the number of cycles when the R2PCIe Egress is not empty. This tracks one of the three rings that are used by the R2PCIe agent. This can be used in conjunction with the R2PCIe Egress Occupancy Accumulator event in order to calculate average queue occupancy. Only a single Egress queue can be tracked at any given time. It is not possible to filter based on direction or polarity.", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_txr_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_cycles_ne), ++ }, ++ { .name = "UNC_R2_TXR_NACK_CW", ++ .code = 0x26, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_R2PCIE_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r2_txr_nack_cw, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_nack_cw), ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_r3qpi_events.h b/lib/events/intel_bdx_unc_r3qpi_events.h +new file mode 100644 +index 0000000..5c0b561 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_r3qpi_events.h +@@ -0,0 +1,752 @@ ++/* ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_r3qpi ++ */ ++ ++static intel_x86_umask_t bdx_unc_r3_c_hi_ad_credits_empty[]={ ++ { .uname = "CBO10", ++ .ucode = 0x400, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO11", ++ .ucode = 0x800, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO12", ++ .ucode = 0x1000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO13", ++ .ucode = 0x2000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO14_16", ++ .ucode = 0x4000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO8", ++ .ucode = 0x100, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO9", ++ .ucode = 0x200, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO_15_17", ++ .ucode = 0x8000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_c_lo_ad_credits_empty[]={ ++ { .uname = "CBO0", ++ .ucode = 0x100, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO1", ++ .ucode = 0x200, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO2", ++ .ucode = 0x400, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO3", ++ .ucode = 0x800, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO4", ++ .ucode = 0x1000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO5", ++ .ucode = 0x2000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO6", ++ .ucode = 0x4000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++ { .uname = "CBO7", ++ .ucode = 0x8000, ++ .udesc = "CBox AD Credits Empty", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_ha_r2_bl_credits_empty[]={ ++ { .uname = "HA0", ++ .ucode = 0x100, ++ .udesc = "HA/R2 AD Credits Empty", ++ }, ++ { .uname = "HA1", ++ .ucode = 0x200, ++ .udesc = "HA/R2 AD Credits Empty", ++ }, ++ { .uname = "R2_NCB", ++ .ucode = 0x400, ++ .udesc = "HA/R2 AD Credits Empty", ++ }, ++ { .uname = "R2_NCS", ++ .ucode = 0x800, ++ .udesc = "HA/R2 AD Credits Empty", ++ }, ++}; ++ ++static intel_x86_umask_t 
bdx_unc_r3_qpi0_ad_credits_empty[]={ ++ { .uname = "VN0_HOM", ++ .ucode = 0x200, ++ .udesc = "VN0 HOM messages", ++ }, ++ { .uname = "VN0_NDR", ++ .ucode = 0x800, ++ .udesc = "VN0 NDR messages", ++ }, ++ { .uname = "VN0_SNP", ++ .ucode = 0x400, ++ .udesc = "VN0 SNP messages", ++ }, ++ { .uname = "VN1_HOM", ++ .ucode = 0x1000, ++ .udesc = "VN1 HOM messages", ++ }, ++ { .uname = "VN1_NDR", ++ .ucode = 0x4000, ++ .udesc = "VN1 NDR messages", ++ }, ++ { .uname = "VN1_SNP", ++ .ucode = 0x2000, ++ .udesc = "VN1 SNP messages", ++ }, ++ { .uname = "VNA", ++ .ucode = 0x100, ++ .udesc = "VNA messages", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_qpi0_bl_credits_empty[]={ ++ { .uname = "VN1_HOM", ++ .ucode = 0x1000, ++ .udesc = "QPIx BL Credits Empty", ++ }, ++ { .uname = "VN1_NDR", ++ .ucode = 0x4000, ++ .udesc = "QPIx BL Credits Empty", ++ }, ++ { .uname = "VN1_SNP", ++ .ucode = 0x2000, ++ .udesc = "QPIx BL Credits Empty", ++ }, ++ { .uname = "VNA", ++ .ucode = 0x100, ++ .udesc = "QPIx BL Credits Empty", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_ring_ad_used[]={ ++ { .uname = "CCW", ++ .ucode = 0xc00, ++ .udesc = "Counterclockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CCW_EVEN", ++ .ucode = 0x400, ++ .udesc = "Counterclockwise and Even", ++ }, ++ { .uname = "CCW_ODD", ++ .ucode = 0x800, ++ .udesc = "Counterclockwise and Odd", ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "Clockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CW_EVEN", ++ .ucode = 0x100, ++ .udesc = "Clockwise and Even", ++ }, ++ { .uname = "CW_ODD", ++ .ucode = 0x200, ++ .udesc = "Clockwise and Odd", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_ring_iv_used[]={ ++ { .uname = "ANY", ++ .ucode = 0xf00, ++ .udesc = "Any", ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "CW", ++ .ucode = 0x300, ++ .udesc = "Clockwise", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_ring_sink_starved[]={ ++ { 
.uname = "AK", ++ .ucode = 0x200, ++ .udesc = "AK", ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_rxr_cycles_ne[]={ ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "Ingress Cycles Not Empty -- HOM", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "Ingress Cycles Not Empty -- NDR", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "Ingress Cycles Not Empty -- SNP", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_rxr_cycles_ne_vn1[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VN1 Ingress Cycles Not Empty -- DRS", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VN1 Ingress Cycles Not Empty -- HOM", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VN1 Ingress Cycles Not Empty -- NCB", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VN1 Ingress Cycles Not Empty -- NCS", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VN1 Ingress Cycles Not Empty -- NDR", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VN1 Ingress Cycles Not Empty -- SNP", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_rxr_inserts[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "Ingress Allocations -- DRS", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "Ingress Allocations -- HOM", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "Ingress Allocations -- NCB", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "Ingress Allocations -- NCS", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "Ingress Allocations -- NDR", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "Ingress Allocations -- SNP", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_sbo0_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "SBo0 Credits Acquired -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "SBo0 Credits Acquired -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t 
bdx_unc_r3_sbo1_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "SBo1 Credits Acquired -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x200, ++ .udesc = "SBo1 Credits Acquired -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_stall_no_sbo_credit[]={ ++ { .uname = "SBO0_AD", ++ .ucode = 0x100, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", ++ }, ++ { .uname = "SBO0_BL", ++ .ucode = 0x400, ++ .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", ++ }, ++ { .uname = "SBO1_AD", ++ .ucode = 0x200, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", ++ }, ++ { .uname = "SBO1_BL", ++ .ucode = 0x800, ++ .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_txr_nack[]={ ++ { .uname = "DN_AD", ++ .ucode = 0x100, ++ .udesc = "Egress CCW NACK -- AD CCW", ++ }, ++ { .uname = "DN_AK", ++ .ucode = 0x400, ++ .udesc = "Egress CCW NACK -- AK CCW", ++ }, ++ { .uname = "DN_BL", ++ .ucode = 0x200, ++ .udesc = "Egress CCW NACK -- BL CCW", ++ }, ++ { .uname = "UP_AD", ++ .ucode = 0x800, ++ .udesc = "Egress CCW NACK -- AK CCW", ++ }, ++ { .uname = "UP_AK", ++ .ucode = 0x2000, ++ .udesc = "Egress CCW NACK -- BL CW", ++ }, ++ { .uname = "UP_BL", ++ .ucode = 0x1000, ++ .udesc = "Egress CCW NACK -- BL CCW", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vn0_credits_reject[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- DRS Message Class", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- HOM Message Class", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- NCB Message Class", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- NCS Message Class", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- NDR Message Class", ++ 
}, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VN0 Credit Acquisition Failed on DRS -- SNP Message Class", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vn0_credits_used[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VN0 Credit Used -- DRS Message Class", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VN0 Credit Used -- HOM Message Class", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VN0 Credit Used -- NCB Message Class", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VN0 Credit Used -- NCS Message Class", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VN0 Credit Used -- NDR Message Class", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VN0 Credit Used -- SNP Message Class", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vn1_credits_reject[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- DRS Message Class", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- HOM Message Class", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- NCB Message Class", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- NCS Message Class", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- NDR Message Class", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VN1 Credit Acquisition Failed on DRS -- SNP Message Class", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vn1_credits_used[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VN1 Credit Used -- DRS Message Class", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VN1 Credit Used -- HOM Message Class", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VN1 Credit Used -- NCB Message Class", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VN1 Credit Used -- 
NCS Message Class", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VN1 Credit Used -- NDR Message Class", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VN1 Credit Used -- SNP Message Class", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vna_credits_acquired[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "VNA credit Acquisitions -- For AD Ring", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "VNA credit Acquisitions -- For BL Ring", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_r3_vna_credits_reject[]={ ++ { .uname = "DRS", ++ .ucode = 0x800, ++ .udesc = "VNA Credit Reject -- DRS Message Class", ++ }, ++ { .uname = "HOM", ++ .ucode = 0x100, ++ .udesc = "VNA Credit Reject -- HOM Message Class", ++ }, ++ { .uname = "NCB", ++ .ucode = 0x1000, ++ .udesc = "VNA Credit Reject -- NCB Message Class", ++ }, ++ { .uname = "NCS", ++ .ucode = 0x2000, ++ .udesc = "VNA Credit Reject -- NCS Message Class", ++ }, ++ { .uname = "NDR", ++ .ucode = 0x400, ++ .udesc = "VNA Credit Reject -- NDR Message Class", ++ }, ++ { .uname = "SNP", ++ .ucode = 0x200, ++ .udesc = "VNA Credit Reject -- SNP Message Class", ++ }, ++}; ++ ++ ++static intel_x86_entry_t intel_bdx_unc_r3_pe[]={ ++ { .name = "UNC_R3_CLOCKTICKS", ++ .code = 0x1, ++ .desc = "Counts the number of uclks in the QPI uclk domain. This could be slightly different than the count in the Ubox because of enable/freeze delays.
However, because the QPI Agent is close to the Ubox, they generally should not diverge by more than a handful of cycles.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ }, ++ { .name = "UNC_R3_C_HI_AD_CREDITS_EMPTY", ++ .code = 0x1f, ++ .desc = "No credits available to send to Cbox on the AD Ring (covers higher CBoxes)", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_c_hi_ad_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_c_hi_ad_credits_empty), ++ }, ++ { .name = "UNC_R3_C_LO_AD_CREDITS_EMPTY", ++ .code = 0x22, ++ .desc = "No credits available to send to Cbox on the AD Ring (covers lower CBoxes)", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_c_lo_ad_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_c_lo_ad_credits_empty), ++ }, ++ { .name = "UNC_R3_HA_R2_BL_CREDITS_EMPTY", ++ .code = 0x2d, ++ .desc = "No credits available to send to either HA or R2 on the BL Ring", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ha_r2_bl_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ha_r2_bl_credits_empty), ++ }, ++ { .name = "UNC_R3_QPI0_AD_CREDITS_EMPTY", ++ .code = 0x20, ++ .desc = "No credits available to send to QPI0 on the AD Ring", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_qpi0_ad_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), ++ }, ++ { .name = "UNC_R3_QPI0_BL_CREDITS_EMPTY", ++ .code = 0x21, ++ .desc = "No credits available to send to QPI0 on the BL Ring", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_qpi0_bl_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_bl_credits_empty), ++ }, ++ { .name = "UNC_R3_QPI1_AD_CREDITS_EMPTY", ++ .code = 0x2e, ++ .desc = "No credits available to send to QPI1 on the AD Ring", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, 
++ .umasks = bdx_unc_r3_qpi0_ad_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), ++ }, ++ { .name = "UNC_R3_QPI1_BL_CREDITS_EMPTY", ++ .code = 0x2f, ++ .desc = "No credits available to send to QPI1 on the BL Ring", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_qpi0_ad_credits_empty, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), ++ }, ++ { .name = "UNC_R3_RING_AD_USED", ++ .code = 0x7, ++ .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), ++ }, ++ { .name = "UNC_R3_RING_AK_USED", ++ .code = 0x8, ++ .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), ++ }, ++ { .name = "UNC_R3_RING_BL_USED", ++ .code = 0x9, ++ .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), ++ }, ++ { .name = "UNC_R3_RING_IV_USED", ++ .code = 0xa, ++ .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ring_iv_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_iv_used), ++ }, ++ { .name = "UNC_R3_RING_SINK_STARVED", ++ .code = 0xe, ++ .desc = "Number of cycles the ringstop is in starvation (per ring)", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x7, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_ring_sink_starved, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_sink_starved), ++ }, ++ { .name = "UNC_R3_RXR_CYCLES_NE", ++ .code = 0x10, ++ .desc = "Counts the number of cycles when the QPI Ingress is not empty. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_rxr_cycles_ne, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_cycles_ne), ++ }, ++ { .name = "UNC_R3_RXR_CYCLES_NE_VN1", ++ .code = 0x14, ++ .desc = "Counts the number of cycles when the QPI VN1 Ingress is not empty. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_rxr_cycles_ne_vn1, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_cycles_ne_vn1), ++ }, ++ { .name = "UNC_R3_RXR_INSERTS", ++ .code = 0x11, ++ .desc = "Counts the number of allocations into the QPI Ingress. This tracks one of the three rings that are used by the QPI agent. 
This can be used in conjunction with the QPI Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_rxr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), ++ }, ++ { .name = "UNC_R3_RXR_INSERTS_VN1", ++ .code = 0x15, ++ .desc = "Counts the number of allocations into the QPI VN1 Ingress. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_rxr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), ++ }, ++ { .name = "UNC_R3_RXR_OCCUPANCY_VN1", ++ .code = 0x13, ++ .desc = "Accumulates the occupancy of a given QPI VN1 Ingress queue in each cycles. This tracks one of the three ring Ingress buffers. 
This can be used with the QPI VN1 Ingress Not Empty event to calculate average occupancy or the QPI VN1 Ingress Allocations event in order to calculate average queuing latency.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x1, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_rxr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), ++ }, ++ { .name = "UNC_R3_SBO0_CREDITS_ACQUIRED", ++ .code = 0x28, ++ .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_sbo0_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_sbo0_credits_acquired), ++ }, ++ { .name = "UNC_R3_SBO1_CREDITS_ACQUIRED", ++ .code = 0x29, ++ .desc = "Number of Sbo 1 credits acquired in a given cycle, per ring.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_sbo1_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_sbo1_credits_acquired), ++ }, ++ { .name = "UNC_R3_STALL_NO_SBO_CREDIT", ++ .code = 0x2c, ++ .desc = "Number of cycles Egress is stalled waiting for an Sbo credit to become available. Per Sbo, per Ring.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_stall_no_sbo_credit, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_stall_no_sbo_credit), ++ }, ++ { .name = "UNC_R3_TXR_NACK", ++ .code = 0x26, ++ .desc = "", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_txr_nack, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_txr_nack), ++ }, ++ { .name = "UNC_R3_VN0_CREDITS_REJECT", ++ .code = 0x37, ++ .desc = "Number of times a request failed to acquire a DRS VN0 credit. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN0. VNA is a shared pool used to achieve high performance. 
The VN0 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN0 if they fail. This therefore counts the number of times when a request failed to acquire either a VNA or VN0 credit and is delayed. This should generally be a rare situation.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vn0_credits_reject, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn0_credits_reject), ++ }, ++ { .name = "UNC_R3_VN0_CREDITS_USED", ++ .code = 0x36, ++ .desc = "Number of times a VN0 credit was used on the DRS message channel. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN0. VNA is a shared pool used to achieve high performance. The VN0 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN0 if they fail. This counts the number of times a VN0 credit was used. Note that a single VN0 credit holds access to potentially multiple flit buffers. For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. A transfer on VN0 will only count a single credit even though it may use multiple buffers.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vn0_credits_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn0_credits_used), ++ }, ++ { .name = "UNC_R3_VN1_CREDITS_REJECT", ++ .code = 0x39, ++ .desc = "Number of times a request failed to acquire a VN1 credit. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN1. VNA is a shared pool used to achieve high performance. 
The VN1 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN1 if they fail. This therefore counts the number of times when a request failed to acquire either a VNA or VN1 credit and is delayed. This should generally be a rare situation.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vn1_credits_reject, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn1_credits_reject), ++ }, ++ { .name = "UNC_R3_VN1_CREDITS_USED", ++ .code = 0x38, ++ .desc = "Number of times a VN1 credit was used on the DRS message channel. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN1. VNA is a shared pool used to achieve high performance. The VN1 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN1 if they fail. This counts the number of times a VN1 credit was used. Note that a single VN1 credit holds access to potentially multiple flit buffers. For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. A transfer on VN1 will only count a single credit even though it may use multiple buffers.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vn1_credits_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn1_credits_used), ++ }, ++ { .name = "UNC_R3_VNA_CREDITS_ACQUIRED", ++ .code = 0x33, ++ .desc = "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credits from the VN0 pool.
Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using a qfclk event.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vna_credits_acquired, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vna_credits_acquired), ++ }, ++ { .name = "UNC_R3_VNA_CREDITS_REJECT", ++ .code = 0x34, ++ .desc = "Number of attempted VNA credit acquisitions that were rejected because the VNA credit pool was full (or almost full). It is possible to filter this event by message class. Some packets use more than one flit buffer, and therefore must acquire multiple credits. Therefore, one could get a reject even if the VNA credits were not fully used up. The VNA pool is generally used to provide the bulk of the QPI bandwidth (as opposed to the VN0 pool which is used to guarantee forward progress). VNA credits can run out if the flit buffer on the receiving side starts to queue up substantially. This can happen if the rest of the uncore is unable to drain the requests fast enough.", ++ .modmsk = BDX_UNC_R3QPI_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_r3_vna_credits_reject, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vna_credits_reject), ++ }, ++}; ++ +diff --git a/lib/events/intel_bdx_unc_sbo_events.h b/lib/events/intel_bdx_unc_sbo_events.h +new file mode 100644 +index 0000000..2f35d95 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_sbo_events.h +@@ -0,0 +1,405 @@ ++/* ++ * Copyright (c) 2017 Google Inc.
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: bdx_unc_sbo ++ */ ++ ++static intel_x86_umask_t bdx_unc_s_ring_ad_used[]={ ++ { .uname = "DOWN_EVEN", ++ .ucode = 0x400, ++ .udesc = "Down and Even", ++ }, ++ { .uname = "DOWN_ODD", ++ .ucode = 0x800, ++ .udesc = "Down and Odd", ++ }, ++ { .uname = "UP_EVEN", ++ .ucode = 0x100, ++ .udesc = "Up and Even", ++ }, ++ { .uname = "UP_ODD", ++ .ucode = 0x200, ++ .udesc = "Up and Odd", ++ }, ++ { .uname = "UP", ++ .ucode = 0x300, ++ .udesc = "Up", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DOWN", ++ .ucode = 0xcc00, ++ .udesc = "Down", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_ring_bounces[]={ ++ { .uname = "AD_CACHE", ++ .ucode = 0x100, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- ", ++ }, ++ { .uname = "AK_CORE", ++ .ucode = 0x200, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- Acknowledgements to core", ++ }, ++ { .uname = "BL_CORE", ++ .ucode = 0x400, ++ .udesc = "Number of LLC responses that bounced on the Ring. -- Data Responses to core", ++ }, ++ { .uname = "IV_CORE", ++ .ucode = 0x800, ++ .udesc = "Number of LLC responses that bounced on the Ring.
-- Snoops of processors cache.", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_ring_iv_used[]={ ++ { .uname = "DN", ++ .ucode = 0xc00, ++ .udesc = "IV Ring in Use -- Down", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "UP", ++ .ucode = 0x300, ++ .udesc = "IV Ring in Use -- Up", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_rxr_bypass[]={ ++ { .uname = "AD_BNC", ++ .ucode = 0x200, ++ .udesc = "Bypass -- AD - Bounces", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_CRD", ++ .ucode = 0x100, ++ .udesc = "Bypass -- AD - Credits", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK", ++ .ucode = 0x1000, ++ .udesc = "Bypass -- AK", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_BNC", ++ .ucode = 0x800, ++ .udesc = "Bypass -- BL - Bounces", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_CRD", ++ .ucode = 0x400, ++ .udesc = "Bypass -- BL - Credits", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV", ++ .ucode = 0x2000, ++ .udesc = "Bypass -- IV", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_rxr_inserts[]={ ++ { .uname = "AD_BNC", ++ .ucode = 0x200, ++ .udesc = "Ingress Allocations -- AD - Bounces", ++ }, ++ { .uname = "AD_CRD", ++ .ucode = 0x100, ++ .udesc = "Ingress Allocations -- AD - Credits", ++ }, ++ { .uname = "AK", ++ .ucode = 0x1000, ++ .udesc = "Ingress Allocations -- AK", ++ }, ++ { .uname = "BL_BNC", ++ .ucode = 0x800, ++ .udesc = "Ingress Allocations -- BL - Bounces", ++ }, ++ { .uname = "BL_CRD", ++ .ucode = 0x400, ++ .udesc = "Ingress Allocations -- BL - Credits", ++ }, ++ { .uname = "IV", ++ .ucode = 0x2000, ++ .udesc = "Ingress Allocations -- IV", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_rxr_occupancy[]={ ++ { .uname = "AD_BNC", ++ .ucode = 0x200, ++ .udesc = "Ingress Occupancy -- AD - Bounces", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_CRD", ++ .ucode = 0x100, ++ .udesc = "Ingress Occupancy -- AD - Credits", ++ .uflags=
INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK", ++ .ucode = 0x1000, ++ .udesc = "Ingress Occupancy -- AK", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_BNC", ++ .ucode = 0x800, ++ .udesc = "Ingress Occupancy -- BL - Bounces", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_CRD", ++ .ucode = 0x400, ++ .udesc = "Ingress Occupancy -- BL - Credits", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV", ++ .ucode = 0x2000, ++ .udesc = "Ingress Occupancy -- IV", ++ .uflags= INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_txr_ads_used[]={ ++ { .uname = "AD", ++ .ucode = 0x100, ++ .udesc = "TBD", ++ }, ++ { .uname = "AK", ++ .ucode = 0x200, ++ .udesc = "TBD", ++ }, ++ { .uname = "BL", ++ .ucode = 0x400, ++ .udesc = "TBD", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_txr_inserts[]={ ++ { .uname = "AD_BNC", ++ .ucode = 0x200, ++ .udesc = "Egress Allocations -- AD - Bounces", ++ }, ++ { .uname = "AD_CRD", ++ .ucode = 0x100, ++ .udesc = "Egress Allocations -- AD - Credits", ++ }, ++ { .uname = "AK", ++ .ucode = 0x1000, ++ .udesc = "Egress Allocations -- AK", ++ }, ++ { .uname = "BL_BNC", ++ .ucode = 0x800, ++ .udesc = "Egress Allocations -- BL - Bounces", ++ }, ++ { .uname = "BL_CRD", ++ .ucode = 0x400, ++ .udesc = "Egress Allocations -- BL - Credits", ++ }, ++ { .uname = "IV", ++ .ucode = 0x2000, ++ .udesc = "Egress Allocations -- IV", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_txr_occupancy[]={ ++ { .uname = "AD_BNC", ++ .ucode = 0x200, ++ .udesc = "Egress Occupancy -- AD - Bounces", ++ }, ++ { .uname = "AD_CRD", ++ .ucode = 0x100, ++ .udesc = "Egress Occupancy -- AD - Credits", ++ }, ++ { .uname = "AK", ++ .ucode = 0x1000, ++ .udesc = "Egress Occupancy -- AK", ++ }, ++ { .uname = "BL_BNC", ++ .ucode = 0x800, ++ .udesc = "Egress Occupancy -- BL - Bounces", ++ }, ++ { .uname = "BL_CRD", ++ .ucode = 0x400, ++ .udesc = "Egress Occupancy -- BL - Credits", ++ }, ++ { .uname = "IV", ++ .ucode = 0x2000, ++ .udesc = "Egress Occupancy 
-- IV", ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_s_txr_ordering[]={ ++ { .uname = "IVSNOOPGO_UP", ++ .ucode = 0x100, ++ .udesc = "TBD", ++ }, ++ { .uname = "IVSNOOP_DN", ++ .ucode = 0x200, ++ .udesc = "TBD", ++ }, ++ { .uname = "AK_U2C_UP_EVEN", ++ .ucode = 0x400, ++ .udesc = "TBD", ++ }, ++ { .uname = "AK_U2C_UP_ODD", ++ .ucode = 0x800, ++ .udesc = "TBD", ++ }, ++ { .uname = "AK_U2C_DN_EVEN", ++ .ucode = 0x1000, ++ .udesc = "TBD", ++ }, ++ { .uname = "AK_U2C_DN_ODD", ++ .ucode = 0x2000, ++ .udesc = "TBD", ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_s_pe[]={ ++ { .name = "UNC_S_BOUNCE_CONTROL", ++ .code = 0xa, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_S_CLOCKTICKS", ++ .code = 0x0, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_S_FAST_ASSERTED", ++ .code = 0x9, ++ .desc = "Counts the number of cycles either the local or incoming distress signals are asserted. Incoming distress includes up, dn and across.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_S_RING_AD_USED", ++ .code = 0x1b, ++ .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), ++ }, ++ { .name = "UNC_S_RING_AK_USED", ++ .code = 0x1c, ++ .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), ++ }, ++ { .name = "UNC_S_RING_BL_USED", ++ .code = 0x1d, ++ .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring.
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_ring_ad_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), ++ }, ++ { .name = "UNC_S_RING_BOUNCES", ++ .code = 0x5, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_ring_bounces, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_bounces), ++ }, ++ { .name = "UNC_S_RING_IV_USED", ++ .code = 0x1e, ++ .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. There is only 1 IV ring in BDX. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_ring_iv_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_iv_used), ++ }, ++ { .name = "UNC_S_RXR_BYPASS", ++ .code = 0x12, ++ .desc = "Bypass the Sbo Ingress.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_rxr_bypass, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_bypass), ++ }, ++ { .name = "UNC_S_RXR_INSERTS", ++ .code = 0x13, ++ .desc = "Number of allocations into the Sbo Ingress. The Ingress is used to queue up requests received from the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_rxr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_inserts), ++ }, ++ { .name = "UNC_S_RXR_OCCUPANCY", ++ .code = 0x11, ++ .desc = "Occupancy event for the Ingress buffers in the Sbo.
The Ingress is used to queue up requests received from the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_rxr_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_occupancy), ++ }, ++ { .name = "UNC_S_TXR_ADS_USED", ++ .code = 0x4, ++ .desc = "", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_txr_ads_used, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_ads_used), ++ }, ++ { .name = "UNC_S_TXR_INSERTS", ++ .code = 0x2, ++ .desc = "Number of allocations into the Sbo Egress. The Egress is used to queue up requests destined for the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_txr_inserts, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_inserts), ++ }, ++ { .name = "UNC_S_TXR_OCCUPANCY", ++ .code = 0x1, ++ .desc = "Occupancy event for the Egress buffers in the Sbo. The egress is used to queue up requests destined for the ring.", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_txr_occupancy, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_occupancy), ++ }, ++ { .name = "UNC_S_TXR_ORDERING", ++ .code = 0x7, ++ .desc = "TBD", ++ .modmsk = BDX_UNC_SBO_ATTRS, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .umasks = bdx_unc_s_txr_ordering, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_ordering), ++ }, ++}; +diff --git a/lib/events/intel_bdx_unc_ubo_events.h b/lib/events/intel_bdx_unc_ubo_events.h +new file mode 100644 +index 0000000..2f4e1f3 +--- /dev/null ++++ b/lib/events/intel_bdx_unc_ubo_events.h +@@ -0,0 +1,70 @@ ++/* ++ * Copyright (c) 2017 Google Inc.
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: bdx_unc_ubo ++ */ ++ ++static intel_x86_umask_t bdx_unc_u_event_msg[]={ ++ { .uname = "DOORBELL_RCVD", ++ .ucode = 0x800, ++ .udesc = "VLW Received", ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static intel_x86_umask_t bdx_unc_u_phold_cycles[]={ ++ { .uname = "ASSERT_TO_ACK", ++ .ucode = 0x100, ++ .udesc = "Cycles PHOLD Assert to Ack. 
Assert to ACK", ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static intel_x86_entry_t intel_bdx_unc_u_pe[]={ ++ { .name = "UNC_U_EVENT_MSG", ++ .code = 0x42, ++ .desc = "Virtual Logical Wire (legacy) message were received from uncore", ++ .modmsk = BDX_UNC_UBO_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_u_event_msg, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_u_event_msg), ++ }, ++ { .name = "UNC_U_PHOLD_CYCLES", ++ .code = 0x45, ++ .desc = "PHOLD cycles. Filter from source CoreID.", ++ .modmsk = BDX_UNC_UBO_ATTRS, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .umasks = bdx_unc_u_phold_cycles, ++ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_u_phold_cycles), ++ }, ++ { .name = "UNC_U_RACU_REQUESTS", ++ .code = 0x46, ++ .desc = "Number outstanding register requests within message channel tracker", ++ .modmsk = BDX_UNC_UBO_ATTRS, ++ .cntmsk = 0x3, ++ }, ++}; ++ +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index bd57078..844667d 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -266,6 +266,54 @@ static pfmlib_pmu_t *pfmlib_pmus[]= + &intel_knl_unc_cha36_support, + &intel_knl_unc_cha37_support, + &intel_knl_unc_m2pcie_support, ++ &intel_bdx_unc_cb0_support, ++ &intel_bdx_unc_cb1_support, ++ &intel_bdx_unc_cb2_support, ++ &intel_bdx_unc_cb3_support, ++ &intel_bdx_unc_cb4_support, ++ &intel_bdx_unc_cb5_support, ++ &intel_bdx_unc_cb6_support, ++ &intel_bdx_unc_cb7_support, ++ &intel_bdx_unc_cb8_support, ++ &intel_bdx_unc_cb9_support, ++ &intel_bdx_unc_cb10_support, ++ &intel_bdx_unc_cb11_support, ++ &intel_bdx_unc_cb12_support, ++ &intel_bdx_unc_cb13_support, ++ &intel_bdx_unc_cb14_support, ++ &intel_bdx_unc_cb15_support, ++ &intel_bdx_unc_cb16_support, ++ &intel_bdx_unc_cb17_support, ++ &intel_bdx_unc_cb18_support, ++ &intel_bdx_unc_cb19_support, ++ &intel_bdx_unc_cb20_support, ++ &intel_bdx_unc_cb21_support, ++ &intel_bdx_unc_cb22_support, ++ &intel_bdx_unc_cb23_support, ++ &intel_bdx_unc_ubo_support, ++ &intel_bdx_unc_sbo0_support, ++ 
&intel_bdx_unc_sbo1_support, ++ &intel_bdx_unc_sbo2_support, ++ &intel_bdx_unc_sbo3_support, ++ &intel_bdx_unc_ha0_support, ++ &intel_bdx_unc_ha1_support, ++ &intel_bdx_unc_imc0_support, ++ &intel_bdx_unc_imc1_support, ++ &intel_bdx_unc_imc2_support, ++ &intel_bdx_unc_imc3_support, ++ &intel_bdx_unc_imc4_support, ++ &intel_bdx_unc_imc5_support, ++ &intel_bdx_unc_imc6_support, ++ &intel_bdx_unc_imc7_support, ++ &intel_bdx_unc_irp_support, ++ &intel_bdx_unc_pcu_support, ++ &intel_bdx_unc_qpi0_support, ++ &intel_bdx_unc_qpi1_support, ++ &intel_bdx_unc_qpi2_support, ++ &intel_bdx_unc_r2pcie_support, ++ &intel_bdx_unc_r3qpi0_support, ++ &intel_bdx_unc_r3qpi1_support, ++ &intel_bdx_unc_r3qpi2_support, + &intel_x86_arch_support, /* must always be last for x86 */ + #endif + +diff --git a/lib/pfmlib_intel_bdx_unc_cbo.c b/lib/pfmlib_intel_bdx_unc_cbo.c +new file mode 100644 +index 0000000..d1ff970 +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_cbo.c +@@ -0,0 +1,134 @@ ++/* ++ * pfmlib_intel_bdx_unc_cbo.c : Intel BDX C-Box uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_cbo_events.h" ++ ++static void ++display_cbo(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ pfm_snbep_unc_reg_t f; ++ ++ __pfm_vbprintf("[UNC_CBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d tid_en=%d] %s\n", ++ reg->val, ++ reg->cbo.unc_event, ++ reg->cbo.unc_umask, ++ reg->cbo.unc_en, ++ reg->cbo.unc_inv, ++ reg->cbo.unc_edge, ++ reg->cbo.unc_thres, ++ reg->cbo.unc_tid, ++ pe[e->event].name); ++ ++ if (e->count == 1) ++ return; ++ ++ f.val = e->codes[1]; ++ ++ __pfm_vbprintf("[UNC_CBOX_FILTER0=0x%"PRIx64" tid=%d core=0x%x" ++ " state=0x%x]\n", ++ f.val, ++ f.ivbep_cbo_filt0.tid, ++ f.ivbep_cbo_filt0.cid, ++ f.ivbep_cbo_filt0.state); ++ ++ if (e->count == 2) ++ return; ++ ++ f.val = e->codes[2]; ++ ++ __pfm_vbprintf("[UNC_CBOX_FILTER1=0x%"PRIx64" nid=%d opc=0x%x" ++ " nc=0x%x isoc=0x%x]\n", ++ f.val, ++ f.ivbep_cbo_filt1.nid, ++ f.ivbep_cbo_filt1.opc, ++ f.ivbep_cbo_filt1.nc, ++ f.ivbep_cbo_filt1.isoc); ++} ++ ++#define DEFINE_C_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_cb##n##_support = {\ ++ .desc = "Intel BroadwellX C-Box "#n" uncore",\ ++ .name = "bdx_unc_cbo"#n,\ ++ .perf_name = "uncore_cbox_"#n,\ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_CB##n,\ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_c_pe),\ ++ .type = PFM_PMU_TYPE_UNCORE,\ ++ .num_cntrs = 4,\ ++ .num_fixed_cntrs = 0,\ ++ .max_encoding = 2,\ ++ .pe = intel_bdx_unc_c_pe,\ ++ .atdesc = snbep_unc_mods,\ ++
.flags = PFMLIB_PMU_FL_RAW_UMASK|INTEL_PMU_FL_UNC_CBO,\ ++ .pmu_detect = pfm_intel_bdx_unc_detect,\ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first,\ ++ .get_event_next = pfm_intel_x86_get_event_next,\ ++ .event_is_valid = pfm_intel_x86_event_is_valid,\ ++ .validate_table = pfm_intel_x86_validate_table,\ ++ .get_event_info = pfm_intel_x86_get_event_info,\ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ ++ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ ++ .display_reg = display_cbo,\ ++} ++ ++DEFINE_C_BOX(0); ++DEFINE_C_BOX(1); ++DEFINE_C_BOX(2); ++DEFINE_C_BOX(3); ++DEFINE_C_BOX(4); ++DEFINE_C_BOX(5); ++DEFINE_C_BOX(6); ++DEFINE_C_BOX(7); ++DEFINE_C_BOX(8); ++DEFINE_C_BOX(9); ++DEFINE_C_BOX(10); ++DEFINE_C_BOX(11); ++DEFINE_C_BOX(12); ++DEFINE_C_BOX(13); ++DEFINE_C_BOX(14); ++DEFINE_C_BOX(15); ++DEFINE_C_BOX(16); ++DEFINE_C_BOX(17); ++DEFINE_C_BOX(18); ++DEFINE_C_BOX(19); ++DEFINE_C_BOX(20); ++DEFINE_C_BOX(21); ++DEFINE_C_BOX(22); ++DEFINE_C_BOX(23); +diff --git a/lib/pfmlib_intel_bdx_unc_ha.c b/lib/pfmlib_intel_bdx_unc_ha.c +new file mode 100644 +index 0000000..d928ba0 +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_ha.c +@@ -0,0 +1,97 @@ ++/* ++ * pfmlib_intel_bdx_unc_ha.c : Intel BroadwellX Home Agent (HA) uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_ha_events.h" ++ ++static void ++display_ha(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ pfm_snbep_unc_reg_t f; ++ ++ __pfm_vbprintf("[UNC_HA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++ ++ if (e->count == 1) ++ return; ++ ++ f.val = e->codes[1]; ++ __pfm_vbprintf("[UNC_HA_ADDR=0x%"PRIx64" lo_addr=0x%x hi_addr=0x%x]\n", ++ f.val, ++ f.ha_addr.lo_addr, ++ f.ha_addr.hi_addr); ++ ++ f.val = e->codes[2]; ++ __pfm_vbprintf("[UNC_HA_OPC=0x%"PRIx64" opc=0x%x]\n", f.val, f.ha_opc.opc); ++} ++ ++#define DEFINE_HA_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_ha##n##_support = {\ ++ .desc = "Intel BroadwellX HA "#n" uncore",\ ++ .name = "bdx_unc_ha"#n,\ ++ .perf_name = "uncore_ha_"#n,\ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_HA##n,\ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_h_pe),\ ++ .type = PFM_PMU_TYPE_UNCORE,\ ++ .num_cntrs = 4,\ ++ .num_fixed_cntrs = 0,\ ++ .max_encoding = 3, /* address matchers */\ ++ .pe = intel_bdx_unc_h_pe,\ ++ .atdesc = snbep_unc_mods,\ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ ++ .pmu_detect = pfm_intel_bdx_unc_detect,\ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first,\ ++ .get_event_next = pfm_intel_x86_get_event_next,\ ++ .event_is_valid = pfm_intel_x86_event_is_valid,\ ++ .validate_table = pfm_intel_x86_validate_table,\ ++ .get_event_info = 
pfm_intel_x86_get_event_info,\ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ ++ .display_reg = display_ha,\ ++} ++ ++DEFINE_HA_BOX(0); ++DEFINE_HA_BOX(1); +diff --git a/lib/pfmlib_intel_bdx_unc_imc.c b/lib/pfmlib_intel_bdx_unc_imc.c +new file mode 100644 +index 0000000..462f547 +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_imc.c +@@ -0,0 +1,71 @@ ++/* ++ * pfmlib_intel_bdx_unc_imc.c : Intel BroadwellX Integrated Memory Controller (IMC) uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_imc_events.h" ++ ++#define DEFINE_IMC_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_imc##n##_support = { \ ++ .desc = "Intel BroadwellX IMC"#n" uncore", \ ++ .name = "bdx_unc_imc"#n, \ ++ .perf_name = "uncore_imc_"#n, \ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_IMC##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_m_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 1, \ ++ .max_encoding = 1, \ ++ .pe = intel_bdx_unc_m_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_bdx_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_IMC_BOX(0); ++DEFINE_IMC_BOX(1); ++DEFINE_IMC_BOX(2); ++DEFINE_IMC_BOX(3); ++DEFINE_IMC_BOX(4); ++DEFINE_IMC_BOX(5); ++DEFINE_IMC_BOX(6); ++DEFINE_IMC_BOX(7); +diff --git a/lib/pfmlib_intel_bdx_unc_irp.c b/lib/pfmlib_intel_bdx_unc_irp.c +new file mode 100644 +index 0000000..14e010b +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_irp.c +@@ -0,0 +1,79 @@ ++/* ++ * pfmlib_intel_bdx_irp.c : Intel BroadwellX IRP uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. 
All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_irp_events.h" ++ ++static void ++display_irp(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->irp.unc_event, ++ reg->irp.unc_umask, ++ reg->irp.unc_en, ++ reg->irp.unc_edge, ++ reg->irp.unc_thres, ++ pe[e->event].name); ++} ++ ++pfmlib_pmu_t intel_bdx_unc_irp_support = { ++ .desc = "Intel BroadwellX IRP uncore", ++ .name = "bdx_unc_irp", ++ .perf_name = "uncore_irp", ++ .pmu = PFM_PMU_INTEL_BDX_UNC_IRP, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_i_pe), ++ .type = PFM_PMU_TYPE_UNCORE, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 0, ++ .max_encoding = 3, ++ .pe = intel_bdx_unc_i_pe, ++ .atdesc = snbep_unc_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .pmu_detect = pfm_intel_bdx_unc_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .display_reg = display_irp, ++}; +diff --git a/lib/pfmlib_intel_bdx_unc_pcu.c b/lib/pfmlib_intel_bdx_unc_pcu.c +new file mode 100644 +index 0000000..435f280 +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_pcu.c +@@ -0,0 +1,97 @@ ++/* ++ 
* pfmlib_intel_bdx_unc_pcu.c : Intel BroadwellX Power Control Unit (PCU) uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_pcu_events.h" ++ ++static void ++display_pcu(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ pfm_snbep_unc_reg_t f; ++ ++ __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x sel_ext=%d occ_sel=0x%x en=%d " ++ "edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n", ++ reg->val, ++ reg->ivbep_pcu.unc_event, ++ reg->ivbep_pcu.unc_sel_ext, ++ reg->ivbep_pcu.unc_occ, ++ reg->ivbep_pcu.unc_en, ++ reg->ivbep_pcu.unc_edge, ++ reg->ivbep_pcu.unc_thres, ++ reg->ivbep_pcu.unc_occ_inv, ++ reg->ivbep_pcu.unc_occ_edge, ++ pe[e->event].name); ++ ++ if (e->count == 1) ++ return; ++ ++ f.val = e->codes[1]; ++ ++ __pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n", ++ f.val, ++ f.pcu_filt.filt0, ++ f.pcu_filt.filt1, ++ f.pcu_filt.filt2, ++ f.pcu_filt.filt3); ++} ++ ++ ++pfmlib_pmu_t intel_bdx_unc_pcu_support = { ++ .desc = "Intel BroadwellX PCU uncore", ++ .name = "bdx_unc_pcu", ++ .perf_name = "uncore_pcu", ++ .pmu = PFM_PMU_INTEL_BDX_UNC_PCU, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_p_pe), ++ .type = PFM_PMU_TYPE_UNCORE, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 0, ++ .max_encoding = 2, ++ .pe = intel_bdx_unc_p_pe, ++ .atdesc = snbep_unc_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .pmu_detect = pfm_intel_bdx_unc_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = 
pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, ++ .display_reg = display_pcu, ++}; +diff --git a/lib/pfmlib_intel_bdx_unc_qpi.c b/lib/pfmlib_intel_bdx_unc_qpi.c +new file mode 100644 +index 0000000..e3fba3d +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_qpi.c +@@ -0,0 +1,85 @@ ++/* ++ * pfmlib_intel_bdx_qpi.c : Intel BroadwellX QPI uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_qpi_events.h" ++ ++static void ++display_qpi(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_QPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->qpi.unc_event, ++ reg->qpi.unc_event_ext, ++ reg->qpi.unc_umask, ++ reg->qpi.unc_en, ++ reg->qpi.unc_inv, ++ reg->qpi.unc_edge, ++ reg->qpi.unc_thres, ++ pe[e->event].name); ++} ++ ++#define DEFINE_QPI_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_qpi##n##_support = {\ ++ .desc = "Intel BroadwellX QPI"#n" uncore",\ ++ .name = "bdx_unc_qpi"#n,\ ++ .perf_name = "uncore_qpi_"#n,\ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_QPI##n,\ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_q_pe),\ ++ .type = PFM_PMU_TYPE_UNCORE,\ ++ .num_cntrs = 4,\ ++ .num_fixed_cntrs = 0,\ ++ .max_encoding = 3,\ ++ .pe = intel_bdx_unc_q_pe,\ ++ .atdesc = snbep_unc_mods,\ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ ++ .pmu_detect = pfm_intel_bdx_unc_detect,\ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first,\ ++ .get_event_next = pfm_intel_x86_get_event_next,\ ++ .event_is_valid = pfm_intel_x86_event_is_valid,\ ++ .validate_table = pfm_intel_x86_validate_table,\ ++ .get_event_info = pfm_intel_x86_get_event_info,\ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ ++ .display_reg = display_qpi,\ ++} ++DEFINE_QPI_BOX(0); ++DEFINE_QPI_BOX(1); ++DEFINE_QPI_BOX(2); 
+diff --git a/lib/pfmlib_intel_bdx_unc_r2pcie.c b/lib/pfmlib_intel_bdx_unc_r2pcie.c +new file mode 100644 +index 0000000..18bed6c +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_r2pcie.c +@@ -0,0 +1,80 @@ ++/* ++ * pfmlib_intel_bdx_r2pcie.c : Intel BroadwellX R2PCIe uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_r2pcie_events.h" ++ ++static void ++display_r2(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_R2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++} ++ ++pfmlib_pmu_t intel_bdx_unc_r2pcie_support = { ++ .desc = "Intel BroadwellX R2PCIe uncore", ++ .name = "bdx_unc_r2pcie", ++ .perf_name = "uncore_r2pcie", ++ .pmu = PFM_PMU_INTEL_BDX_UNC_R2PCIE, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_r2_pe), ++ .type = PFM_PMU_TYPE_UNCORE, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 0, ++ .max_encoding = 1, ++ .pe = intel_bdx_unc_r2_pe, ++ .atdesc = snbep_unc_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .pmu_detect = pfm_intel_bdx_unc_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .display_reg = display_r2, ++}; +diff --git a/lib/pfmlib_intel_bdx_unc_r3qpi.c b/lib/pfmlib_intel_bdx_unc_r3qpi.c +new file mode 100644 +index 0000000..89ba498 +--- /dev/null ++++ 
b/lib/pfmlib_intel_bdx_unc_r3qpi.c +@@ -0,0 +1,84 @@ ++/* ++ * pfmlib_intel_bdx_r3qpi.c : Intel BroadwellX R3QPI uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_r3qpi_events.h" ++ ++static void ++display_r3(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_R3QPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++} ++ ++#define DEFINE_R3QPI_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_r3qpi##n##_support = {\ ++ .desc = "Intel BroadwellX R3QPI"#n" uncore", \ ++ .name = "bdx_unc_r3qpi"#n,\ ++ .perf_name = "uncore_r3qpi_"#n, \ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_R3QPI##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_r3_pe),\ ++ .type = PFM_PMU_TYPE_UNCORE,\ ++ .num_cntrs = 3,\ ++ .num_fixed_cntrs = 0,\ ++ .max_encoding = 1,\ ++ .pe = intel_bdx_unc_r3_pe,\ ++ .atdesc = snbep_unc_mods,\ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ ++ .pmu_detect = pfm_intel_bdx_unc_detect,\ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first,\ ++ .get_event_next = pfm_intel_x86_get_event_next,\ ++ .event_is_valid = pfm_intel_x86_event_is_valid,\ ++ .validate_table = pfm_intel_x86_validate_table,\ ++ .get_event_info = pfm_intel_x86_get_event_info,\ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ ++ .display_reg = display_r3,\ ++} ++DEFINE_R3QPI_BOX(0); ++DEFINE_R3QPI_BOX(1); ++DEFINE_R3QPI_BOX(2); +diff --git 
a/lib/pfmlib_intel_bdx_unc_sbo.c b/lib/pfmlib_intel_bdx_unc_sbo.c +new file mode 100644 +index 0000000..a2be8bc +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_sbo.c +@@ -0,0 +1,86 @@ ++/* ++ * pfmlib_intel_bdx_unc_sbo.c : Intel BroadwellX S-Box uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_sbo_events.h" ++ ++static void ++display_sbo(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_SBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++} ++ ++#define DEFINE_S_BOX(n) \ ++pfmlib_pmu_t intel_bdx_unc_sbo##n##_support = {\ ++ .desc = "Intel BroadwellX S-BOX"#n" uncore",\ ++ .name = "bdx_unc_sbo"#n,\ ++ .perf_name = "uncore_sbox_"#n,\ ++ .pmu = PFM_PMU_INTEL_BDX_UNC_SB##n,\ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_s_pe),\ ++ .type = PFM_PMU_TYPE_UNCORE,\ ++ .num_cntrs = 4,\ ++ .num_fixed_cntrs = 0,\ ++ .max_encoding = 3,\ ++ .pe = intel_bdx_unc_s_pe,\ ++ .atdesc = snbep_unc_mods,\ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ ++ .pmu_detect = pfm_intel_bdx_unc_detect,\ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first,\ ++ .get_event_next = pfm_intel_x86_get_event_next,\ ++ .event_is_valid = pfm_intel_x86_event_is_valid,\ ++ .validate_table = pfm_intel_x86_validate_table,\ ++ .get_event_info = pfm_intel_x86_get_event_info,\ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ ++ .display_reg = display_sbo,\ ++} ++ ++DEFINE_S_BOX(0); ++DEFINE_S_BOX(1); ++DEFINE_S_BOX(2); ++DEFINE_S_BOX(3); +diff --git
a/lib/pfmlib_intel_bdx_unc_ubo.c b/lib/pfmlib_intel_bdx_unc_ubo.c +new file mode 100644 +index 0000000..f0d058a +--- /dev/null ++++ b/lib/pfmlib_intel_bdx_unc_ubo.c +@@ -0,0 +1,81 @@ ++/* ++ * pfmlib_intel_bdx_unc_ubo.c : Intel BroadwellX U-Box uncore PMU ++ * ++ * Copyright (c) 2017 Google Inc. All rights reserved ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_bdx_unc_ubo_events.h" ++ ++static void ++display_ubo(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_UBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++} ++ ++ ++pfmlib_pmu_t intel_bdx_unc_ubo_support = { ++ .desc = "Intel BroadwellX U-Box uncore", ++ .name = "bdx_unc_ubo", ++ .perf_name = "uncore_ubox", ++ .pmu = PFM_PMU_INTEL_BDX_UNC_UBOX, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_u_pe), ++ .type = PFM_PMU_TYPE_UNCORE, ++ .num_cntrs = 2, ++ .num_fixed_cntrs = 1, ++ .max_encoding = 1, ++ .pe = intel_bdx_unc_u_pe, ++ .atdesc = snbep_unc_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .pmu_detect = pfm_intel_bdx_unc_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .display_reg = display_ubo, ++}; +diff --git a/lib/pfmlib_intel_snbep_unc.c b/lib/pfmlib_intel_snbep_unc.c +index 1e80147..c44d92a 100644 +--- a/lib/pfmlib_intel_snbep_unc.c ++++
b/lib/pfmlib_intel_snbep_unc.c +@@ -129,7 +129,27 @@ pfm_intel_knl_unc_detect(void *this) + return PFM_SUCCESS; + } + ++int ++pfm_intel_bdx_unc_detect(void *this) ++{ ++ int ret; ++ ++ ret = pfm_intel_x86_detect(); ++ if (ret != PFM_SUCCESS) ++ return ret; ++ ++ if (pfm_intel_x86_cfg.family != 6) ++ return PFM_ERR_NOTSUPP; + ++ switch(pfm_intel_x86_cfg.model) { ++ case 79: /* Broadwell X */ ++ case 86: /* Broadwell X */ ++ break; ++ default: ++ return PFM_ERR_NOTSUPP; ++ } ++ return PFM_SUCCESS; ++} + + static void + display_com(void *this, pfmlib_event_desc_t *e, void *val) +@@ -255,7 +275,7 @@ snbep_unc_add_defaults(void *this, pfmlib_event_desc_t *e, + } + } + if (!added && !skip) { +- DPRINT("no default found for event %s unit mask group %d (max_grpid=%d)\n", ent->name, i, max_grpid); ++ DPRINT("no default found for event %s unit mask group %d (max_grpid=%d, i=%d)\n", ent->name, i, max_grpid, i); + return PFM_ERR_UMASK; + } + } +diff --git a/lib/pfmlib_intel_snbep_unc_priv.h b/lib/pfmlib_intel_snbep_unc_priv.h +index 4984242..898a460 100644 +--- a/lib/pfmlib_intel_snbep_unc_priv.h ++++ b/lib/pfmlib_intel_snbep_unc_priv.h +@@ -65,10 +65,13 @@ + #define HSWEP_UNC_IRP_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) + ++#define BDX_UNC_IRP_ATTRS HSWEP_UNC_IRP_ATTRS ++ + #define SNBEP_UNC_R3QPI_ATTRS \ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) + + #define HSWEP_UNC_R3QPI_ATTRS SNBEP_UNC_R3QPI_ATTRS ++#define BDX_UNC_R3QPI_ATTRS SNBEP_UNC_R3QPI_ATTRS + + #define IVBEP_UNC_R3QPI_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) +@@ -77,6 +80,7 @@ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) + + #define HSWEP_UNC_R2PCIE_ATTRS SNBEP_UNC_R2PCIE_ATTRS ++#define BDX_UNC_R2PCIE_ATTRS SNBEP_UNC_R2PCIE_ATTRS + + #define IVBEP_UNC_R2PCIE_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) +@@ -88,6 +92,7 @@ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) + + #define HSWEP_UNC_QPI_ATTRS SNBEP_UNC_QPI_ATTRS ++#define BDX_UNC_QPI_ATTRS SNBEP_UNC_QPI_ATTRS
+ + #define SNBEP_UNC_UBO_ATTRS \ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) +@@ -96,6 +101,7 @@ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) + + #define HSWEP_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS ++#define BDX_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS + + #define SNBEP_UNC_PCU_ATTRS \ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T5) +@@ -105,6 +111,9 @@ + + #define HSWEP_UNC_PCU_ATTRS SNBEP_UNC_PCU_ATTRS + ++#define BDX_UNC_PCU_ATTRS \ ++ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) ++ + #define SNBEP_UNC_PCU_BAND_ATTRS \ + (SNBEP_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_FF) + +@@ -112,6 +121,7 @@ + (IVBEP_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_FF) + + #define HSWEP_UNC_PCU_BAND_ATTRS SNBEP_UNC_PCU_BAND_ATTRS ++#define BDX_UNC_PCU_BAND_ATTRS SNBEP_UNC_PCU_BAND_ATTRS + + #define SNBEP_UNC_IMC_ATTRS \ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) +@@ -121,6 +131,8 @@ + + #define HSWEP_UNC_IMC_ATTRS SNBEP_UNC_IMC_ATTRS + ++#define BDX_UNC_IMC_ATTRS SNBEP_UNC_IMC_ATTRS ++ + #define SNBEP_UNC_CBO_ATTRS \ + (_SNBEP_UNC_ATTR_I |\ + _SNBEP_UNC_ATTR_E |\ +@@ -140,6 +152,8 @@ + _SNBEP_UNC_ATTR_CF1 |\ + _SNBEP_UNC_ATTR_TF) + ++#define BDX_UNC_CBO_ATTRS HSWEP_UNC_CBO_ATTRS ++ + #define SNBEP_UNC_CBO_NID_ATTRS \ + (SNBEP_UNC_CBO_ATTRS|_SNBEP_UNC_ATTR_NF) + +@@ -149,6 +163,8 @@ + #define HSWEP_UNC_CBO_NID_ATTRS \ + (HSWEP_UNC_CBO_ATTRS | _SNBEP_UNC_ATTR_NF1) + ++#define BDX_UNC_CBO_NID_ATTRS HSWEP_UNC_CBO_NID_ATTRS ++ + #define SNBEP_UNC_HA_ATTRS \ + (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) + +@@ -158,12 +174,18 @@ + #define HSWEP_UNC_HA_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) + ++#define BDX_UNC_HA_ATTRS \ ++ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) ++ + #define SNBEP_UNC_HA_OPC_ATTRS \ + (SNBEP_UNC_HA_ATTRS|_SNBEP_UNC_ATTR_A) + + #define HSWEP_UNC_SBO_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) + ++#define BDX_UNC_SBO_ATTRS \ ++ 
(_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) ++ + #define KNL_UNC_CHA_TOR_ATTRS _SNBEP_UNC_ATTR_NF1 + + typedef union { +@@ -327,6 +349,7 @@ extern int pfm_intel_snbep_unc_detect(void *this); + extern int pfm_intel_ivbep_unc_detect(void *this); + extern int pfm_intel_hswep_unc_detect(void *this); + extern int pfm_intel_knl_unc_detect(void *this); ++extern int pfm_intel_bdx_unc_detect(void *this); + extern int pfm_intel_snbep_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); + extern int pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx); + extern int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 1f80571..1f2d030 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -439,6 +439,54 @@ extern pfmlib_pmu_t intel_knl_unc_cha35_support; + extern pfmlib_pmu_t intel_knl_unc_cha36_support; + extern pfmlib_pmu_t intel_knl_unc_cha37_support; + extern pfmlib_pmu_t intel_knl_unc_m2pcie_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb0_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb1_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb2_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb3_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb4_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb5_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb6_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb7_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb8_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb9_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb10_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb11_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb12_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb13_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb14_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb15_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb16_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb17_support; ++extern pfmlib_pmu_t 
intel_bdx_unc_cb18_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb19_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb20_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb21_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb22_support; ++extern pfmlib_pmu_t intel_bdx_unc_cb23_support; ++extern pfmlib_pmu_t intel_bdx_unc_ha0_support; ++extern pfmlib_pmu_t intel_bdx_unc_ha1_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc0_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc1_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc2_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc3_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc4_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc5_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc6_support; ++extern pfmlib_pmu_t intel_bdx_unc_imc7_support; ++extern pfmlib_pmu_t intel_bdx_unc_pcu_support; ++extern pfmlib_pmu_t intel_bdx_unc_qpi0_support; ++extern pfmlib_pmu_t intel_bdx_unc_qpi1_support; ++extern pfmlib_pmu_t intel_bdx_unc_qpi2_support; ++extern pfmlib_pmu_t intel_bdx_unc_sbo0_support; ++extern pfmlib_pmu_t intel_bdx_unc_sbo1_support; ++extern pfmlib_pmu_t intel_bdx_unc_sbo2_support; ++extern pfmlib_pmu_t intel_bdx_unc_sbo3_support; ++extern pfmlib_pmu_t intel_bdx_unc_ubo_support; ++extern pfmlib_pmu_t intel_bdx_unc_r2pcie_support; ++extern pfmlib_pmu_t intel_bdx_unc_r3qpi0_support; ++extern pfmlib_pmu_t intel_bdx_unc_r3qpi1_support; ++extern pfmlib_pmu_t intel_bdx_unc_r3qpi2_support; ++extern pfmlib_pmu_t intel_bdx_unc_irp_support; + extern pfmlib_pmu_t intel_glm_support; + extern pfmlib_pmu_t power4_support; + extern pfmlib_pmu_t ppc970_support; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index aa0aaa1..3edc8a8 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4755,6 +4755,836 @@ static const test_event_t x86_test_events[]={ + .name = "skl::offcore_response_1:0x7fffffffff", + .ret = PFM_ERR_ATTR, + }, ++ ++ { SRC_LINE, ++ .name = "bdx_unc_cbo1::UNC_C_CLOCKTICKS:u", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { 
SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo0::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo1::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo1::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo2::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo2::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo3::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo3::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo4::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo4::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo5::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo5::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo6::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo6::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo7::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo7::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo8::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo8::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo9::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo9::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo10::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ 
.fstr = "bdx_unc_cbo10::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo11::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo11::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo12::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo12::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo13::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo13::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo14::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo14::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo15::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo15::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo16::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo16::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo17::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo17::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo18::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo18::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo19::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo19::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo20::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo20::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = 
"bdx_unc_cbo21::UNC_C_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_cbo21::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x334, ++ .codes[1] = 0xfe0000, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_MESIFD:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:nf=1", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x1134, ++ .codes[1] = 0xfe0000, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:STATE_MESIFD:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x5134, ++ .codes[1] = 0xfe0000, ++ .codes[2] = 0x3, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:NID:STATE_MESIFD:e=0:t=0:tf=0:nf=3:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M:tid=1", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_ring_iv_used:DN:UP", ++ .ret = PFM_ERR_FEATCOMB, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:nf=3:tf=1:e:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x10c4534, ++ .codes[1] = 0xfe0001, ++ .codes[2] = 0x3, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:WRITE:STATE_MESIFD:e=1:t=1:tf=1:nf=3:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS", ++ .ret = PFM_ERR_UMASK, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:NID", ++ .ret = PFM_ERR_UMASK, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:NID:nf=1", ++ .ret = PFM_ERR_UMASK, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE", ++ .ret = 
PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x137, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:S_STATE", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x537, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:S_STATE:M_STATE:e=0:t=0:tf=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:S_STATE:NID:nf=1", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x4537, ++ .codes[1] = 0x0, ++ .codes[2] = 0x1, ++ .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:S_STATE:M_STATE:NID:e=0:t=0:tf=0:nf=1:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE", ++ .ret = PFM_ERR_UMASK, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:WB", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1035, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:WB:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x135, ++ .codes[1] = 0x0, ++ .codes[2] = 0x1c800000ull, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:isoc=1", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x135, ++ .codes[1] = 0x0, ++ .codes[2] = 0x9c800000ull, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=1:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:nf=1", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:nf=1", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x4135, ++ .codes[1] = 0x0, ++ .codes[2] = 0x19e00001ull, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = 
"bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPC_RFO:NID_OPCODE:nf=1", ++ .ret = PFM_SUCCESS, ++ .count = 3, ++ .codes[0] = 0x4135, ++ .codes[1] = 0x0, ++ .codes[2] = 0x18000001ull, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x8a36, ++ .fstr = "bdx_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_irp::unc_i_clockticks", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_irp::UNC_I_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_irp::unc_i_coherent_ops:RFO", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x813, ++ .fstr = "bdx_unc_irp::UNC_I_COHERENT_OPS:RFO:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_irp::unc_i_transactions:reads", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x116, ++ .fstr = "bdx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_irp::unc_i_transactions:reads:c=1:i", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_irp::unc_i_transactions:reads:t=6", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x6000116, ++ .fstr = "bdx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=6", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo0::unc_s_clockticks", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo1::unc_s_clockticks", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo1::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo2::unc_s_clockticks", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo2::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo3::unc_s_clockticks", ++ .ret = 
PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_CLOCKTICKS:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1000000, ++ .fstr = "bdx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x60, ++ .fstr = "bdx_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x71, ++ .fstr = "bdx_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0xb, ++ .codes[1] = 0x20, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=0:ff=32", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0xc, ++ .codes[1] = 0x1000, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:i=0:t=0:ff=16", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0xd, ++ .codes[1] = 0x80000, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:i=0:t=0:ff=8", ++ }, ++ { SRC_LINE, ++ .name = 
"bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0xe, ++ .codes[1] = 0x28000000, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:i=0:t=0:ff=40", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x4000b, ++ .codes[1] = 0x20, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=0:ff=32", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:t=24", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x1800000b, ++ .codes[1] = 0x20, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=24:ff=32", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:t=4", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x404000b, ++ .codes[1] = 0x20, ++ .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=4:ff=32", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x4080, ++ .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=0" ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x8080, ++ .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xc080, ++ .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0" ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x6804080, ++ .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=1:t=6" ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x6004080, ++ .fstr = 
"bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=6" ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE10", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x3a, ++ .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE10:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE14", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x3e, ++ .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE14:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE17", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x41, ++ .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE17:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ha0::UNC_H_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_ha0::UNC_H_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ha1::UNC_H_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_ha1::UNC_H_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ha1::UNC_H_REQUESTS:READS:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1000301, ++ .fstr = "bdx_unc_ha1::UNC_H_REQUESTS:READS:e=0:i=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ha0::UNC_H_IMC_WRITES:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1000f1a, ++ .fstr = "bdx_unc_ha0::UNC_H_IMC_WRITES:ALL:e=0:i=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ha0::UNC_H_IMC_READS:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1000117, ++ .fstr = "bdx_unc_ha0::UNC_H_IMC_READS:NORMAL:e=0:i=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc0::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc1::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc1::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc2::UNC_M_CLOCKTICKS", ++ .ret = 
PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc2::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc3::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc3::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc4::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc4::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc5::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc5::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc6::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc6::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc7::UNC_M_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xff, ++ .fstr = "bdx_unc_imc7::UNC_M_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_CLOCKTICKS:t=1", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_DCLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_imc0::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc4::UNC_M_DCLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "bdx_unc_imc4::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0304, ++ .fstr = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_PRE_COUNT:WR", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0802, ++ .fstr = "bdx_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x183, ++ .fstr = "bdx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:i=0:t=0", ++ }, ++ { 
SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_CAS_COUNT:WR", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xc04, ++ .fstr = "bdx_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xb0, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x10b0, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x11b0, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x7b4, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x10007b4, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:t=1:i", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x18007b7, ++ .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:e=0:i=1:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo0::UNC_S_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo0::UNC_S_FAST_ASSERTED:t=1:i", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1800009, ++ .fstr = "bdx_unc_sbo0::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_sbo3::UNC_S_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0, ++ .fstr = "bdx_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = 
"bdx_unc_sbo3::UNC_S_FAST_ASSERTED:t=1:i", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1800009, ++ .fstr = "bdx_unc_sbo3::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_ubo::UNC_U_EVENT_MSG", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x842, ++ .fstr = "bdx_unc_ubo::UNC_U_EVENT_MSG:DOORBELL_RCVD:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x14, ++ .fstr = "bdx_unc_qpi0::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi1::UNC_Q_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x14, ++ .fstr = "bdx_unc_qpi1::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi2::UNC_Q_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x14, ++ .fstr = "bdx_unc_qpi2::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x113, ++ .fstr = "bdx_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:i:t=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1a01802, ++ .fstr = "bdx_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:e=0:i=1:t=1", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x200101, ++ .fstr = "bdx_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_RXL_OCCUPANCY", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0xb, ++ .fstr = "bdx_unc_qpi0::UNC_Q_RXL_OCCUPANCY:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_qpi0::UNC_Q_TXL_INSERTS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x4, ++ .fstr = "bdx_unc_qpi0::UNC_Q_TXL_INSERTS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r2pcie::UNC_R2_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ 
.count = 1, ++ .codes[0] = 0x1, ++ .fstr = "bdx_unc_r2pcie::UNC_R2_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r2pcie::UNC_R2_RING_AD_USED:CW", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x307, ++ .fstr = "bdx_unc_r2pcie::UNC_R2_RING_AD_USED:CW:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r3qpi0::UNC_R3_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1, ++ .fstr = "bdx_unc_r3qpi0::UNC_R3_CLOCKTICKS:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:t=0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x210, ++ .fstr = "bdx_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r3qpi1::UNC_R3_RING_SINK_STARVED", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x20e, ++ .fstr = "bdx_unc_r3qpi1::UNC_R3_RING_SINK_STARVED:AK:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "bdx_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:i:t=2", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x280022d, ++ .fstr = "bdx_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:e=0:i=1:t=2", ++ }, + }; + + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) diff --git a/SOURCES/libpfm-intel_1port.patch b/SOURCES/libpfm-intel_1port.patch new file mode 100644 index 0000000..c52d1df --- /dev/null +++ b/SOURCES/libpfm-intel_1port.patch @@ -0,0 +1,23 @@ +commit 2e98642dd331b15382256caa380834d01b63bef8 +Author: Stephane Eranian +Date: Thu Oct 19 11:23:44 2017 -0700 + + Fix Intel Skylake EXE_ACTIVITY.1_PORTS_UTIL event + + Was missing a umask name. 
+ + Signed-off-by: Stephane Eranian + +diff --git a/lib/events/intel_skl_events.h b/lib/events/intel_skl_events.h +index 8021403..38d2aa5 100644 +--- a/lib/events/intel_skl_events.h ++++ b/lib/events/intel_skl_events.h +@@ -1973,7 +1973,7 @@ static const intel_x86_umask_t skl_fp_arith[]={ + }; + + static const intel_x86_umask_t skl_exe_activity[]={ +- { .uname = "", ++ { .uname = "1_PORTS_UTIL", + .udesc = "Cycles with 1 uop executing across all ports and Reservation Station is not empty", + .ucode = 0x0200, + .uflags= INTEL_X86_NCOMBO, diff --git a/SOURCES/libpfm-p9_alt.patch b/SOURCES/libpfm-p9_alt.patch new file mode 100644 index 0000000..2203dcb --- /dev/null +++ b/SOURCES/libpfm-p9_alt.patch @@ -0,0 +1,210 @@ +From ed3f51c4690685675cf2766edb90acbc0c1cdb67 Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Sun, 3 Dec 2017 09:42:44 -0800 +Subject: [PATCH] Add alternate event numbers for power9. + +I had previously missed adding the _ALT entries, which allow some +events to be specified on different counters. This patch fills +those in. + +This patch also adds a few validation tests for the ALT events. + +Signed-off-by: Will Schmidt +--- + lib/events/power9_events.h | 93 +++++++++++++++++++++++++++++++++++++++++++++- + tests/validate_power.c | 21 +++++++++++ + 2 files changed, 113 insertions(+), 1 deletion(-) + +diff --git a/lib/events/power9_events.h b/lib/events/power9_events.h +index 72c481b..d77bab3 100644 +--- a/lib/events/power9_events.h ++++ b/lib/events/power9_events.h +@@ -9,6 +9,7 @@ + * Mods: + * Initial content generated by Will Schmidt. (Jan 31, 2017). + * Refresh/update generated Jun 06, 2017 by Will Schmidt. ++* missing _ALT events added, Nov 16, 2017 by Will Schmidt. + * + * Contributed by + * (C) Copyright IBM Corporation, 2017. All Rights Reserved. 
+@@ -969,6 +970,18 @@ + #define POWER9_PME_PM_XLATE_HPT_MODE 943 + #define POWER9_PME_PM_XLATE_MISS 944 + #define POWER9_PME_PM_XLATE_RADIX_MODE 945 ++#define POWER9_PME_PM_BR_2PATH_ALT 946 ++#define POWER9_PME_PM_CYC_ALT 947 ++#define POWER9_PME_PM_CYC_ALT2 948 ++#define POWER9_PME_PM_CYC_ALT3 949 ++#define POWER9_PME_PM_INST_CMPL_ALT 950 ++#define POWER9_PME_PM_INST_CMPL_ALT2 951 ++#define POWER9_PME_PM_INST_CMPL_ALT3 952 ++#define POWER9_PME_PM_INST_DISP_ALT 953 ++#define POWER9_PME_PM_LD_MISS_L1_ALT 954 ++#define POWER9_PME_PM_SUSPENDED_ALT 955 ++#define POWER9_PME_PM_SUSPENDED_ALT2 956 ++#define POWER9_PME_PM_SUSPENDED_ALT3 957 + + static const pme_power_entry_t power9_pe[] = { + [ POWER9_PME_PM_1FLOP_CMPL ] = { +@@ -1031,6 +1044,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "Cycles in which all 4 Binary Floating Point units are busy.", + .pme_long_desc = "Cycles in which all 4 Binary Floating Point units are busy. The BFU is running at capacity", + }, ++/* See also alternate entries for 0000020036 / POWER9_PME_PM_BR_2PATH with code(s) 0000040036 at the bottom of this table. \n */ + [ POWER9_PME_PM_BR_2PATH ] = { + .pme_name = "PM_BR_2PATH", + .pme_code = 0x0000020036, +@@ -1559,6 +1573,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each CO machine busy.", + .pme_long_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", + }, ++/* See also alternate entries for 000001001E / POWER9_PME_PM_CYC with code(s) 000002001E 000003001E 000004001E at the bottom of this table. 
\n */ + [ POWER9_PME_PM_CYC ] = { + .pme_name = "PM_CYC", + .pme_code = 0x000001001E, +@@ -2669,12 +2684,14 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", + .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", + }, ++/* See also alternate entries for 0000010002 / POWER9_PME_PM_INST_CMPL with code(s) 0000020002 0000030002 0000040002 at the bottom of this table. \n */ + [ POWER9_PME_PM_INST_CMPL ] = { + .pme_name = "PM_INST_CMPL", + .pme_code = 0x0000010002, + .pme_short_desc = "Number of PowerPC Instructions that completed.", + .pme_long_desc = "Number of PowerPC Instructions that completed.", + }, ++/* See also alternate entries for 00000200F2 / POWER9_PME_PM_INST_DISP with code(s) 00000300F2 at the bottom of this table. \n */ + [ POWER9_PME_PM_INST_DISP ] = { + .pme_name = "PM_INST_DISP", + .pme_code = 0x00000200F2, +@@ -4007,6 +4024,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "Number of load instructions that finished with an L1 miss.", + .pme_long_desc = "Number of load instructions that finished with an L1 miss. Note that even if a load spans multiple slices this event will increment only once per load op.", + }, ++/* See also alternate entries for 000003E054 / POWER9_PME_PM_LD_MISS_L1 with code(s) 00000400F0 at the bottom of this table. \n */ + [ POWER9_PME_PM_LD_MISS_L1 ] = { + .pme_name = "PM_LD_MISS_L1", + .pme_code = 0x000003E054, +@@ -6149,6 +6167,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", + .pme_long_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", + }, ++/* See also alternate entries for 0000010000 / POWER9_PME_PM_SUSPENDED with code(s) 0000020000 0000030000 0000040000 at the bottom of this table. 
\n */ + [ POWER9_PME_PM_SUSPENDED ] = { + .pme_name = "PM_SUSPENDED", + .pme_code = 0x0000010000, +@@ -6647,6 +6666,78 @@ static const pme_power_entry_t power9_pe[] = { + .pme_short_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", + .pme_long_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", + }, +-/* total 945 */ ++[ POWER9_PME_PM_BR_2PATH_ALT ] = { ++ .pme_name = "PM_BR_2PATH_ALT", ++ .pme_code = 0x0000040036, ++ .pme_short_desc = "Branches that are not strongly biased", ++ .pme_long_desc = "Branches that are not strongly biased", ++}, ++[ POWER9_PME_PM_CYC_ALT ] = { ++ .pme_name = "PM_CYC_ALT", ++ .pme_code = 0x000002001E, ++ .pme_short_desc = "Processor cycles", ++ .pme_long_desc = "Processor cycles", ++}, ++[ POWER9_PME_PM_CYC_ALT2 ] = { ++ .pme_name = "PM_CYC_ALT2", ++ .pme_code = 0x000003001E, ++ .pme_short_desc = "Processor cycles", ++ .pme_long_desc = "Processor cycles", ++}, ++[ POWER9_PME_PM_CYC_ALT3 ] = { ++ .pme_name = "PM_CYC_ALT3", ++ .pme_code = 0x000004001E, ++ .pme_short_desc = "Processor cycles", ++ .pme_long_desc = "Processor cycles", ++}, ++[ POWER9_PME_PM_INST_CMPL_ALT ] = { ++ .pme_name = "PM_INST_CMPL_ALT", ++ .pme_code = 0x0000020002, ++ .pme_short_desc = "Number of PowerPC Instructions that completed.", ++ .pme_long_desc = "Number of PowerPC Instructions that completed.", ++}, ++[ POWER9_PME_PM_INST_CMPL_ALT2 ] = { ++ .pme_name = "PM_INST_CMPL_ALT2", ++ .pme_code = 0x0000030002, ++ .pme_short_desc = "Number of PowerPC Instructions that completed.", ++ .pme_long_desc = "Number of PowerPC Instructions that completed.", ++}, ++[ POWER9_PME_PM_INST_CMPL_ALT3 ] = { ++ .pme_name = "PM_INST_CMPL_ALT3", ++ .pme_code = 0x0000040002, ++ .pme_short_desc = "Number of PowerPC Instructions that completed.", ++ .pme_long_desc = "Number of PowerPC Instructions that completed.", ++}, ++[ POWER9_PME_PM_INST_DISP_ALT ] = { ++ .pme_name = "PM_INST_DISP_ALT", ++ 
.pme_code = 0x00000300F2, ++ .pme_short_desc = "# PPC Dispatched", ++ .pme_long_desc = "# PPC Dispatched", ++}, ++[ POWER9_PME_PM_LD_MISS_L1_ALT ] = { ++ .pme_name = "PM_LD_MISS_L1_ALT", ++ .pme_code = 0x00000400F0, ++ .pme_short_desc = "Load Missed L1, counted at execution time (can be greater than loads finished).", ++ .pme_long_desc = "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", ++}, ++[ POWER9_PME_PM_SUSPENDED_ALT ] = { ++ .pme_name = "PM_SUSPENDED_ALT", ++ .pme_code = 0x0000020000, ++ .pme_short_desc = "Counter OFF", ++ .pme_long_desc = "Counter OFF", ++}, ++[ POWER9_PME_PM_SUSPENDED_ALT2 ] = { ++ .pme_name = "PM_SUSPENDED_ALT2", ++ .pme_code = 0x0000030000, ++ .pme_short_desc = "Counter OFF", ++ .pme_long_desc = "Counter OFF", ++}, ++[ POWER9_PME_PM_SUSPENDED_ALT3 ] = { ++ .pme_name = "PM_SUSPENDED_ALT3", ++ .pme_code = 0x0000040000, ++ .pme_short_desc = "Counter OFF", ++ .pme_long_desc = "Counter OFF", ++}, ++/* total 957 */ + }; + #endif +diff --git a/tests/validate_power.c b/tests/validate_power.c +index 617efca..2e38f32 100644 +--- a/tests/validate_power.c ++++ b/tests/validate_power.c +@@ -171,6 +171,27 @@ static const test_event_t ppc_test_events[]={ + .codes[0] = 0x200f2, + .fstr = "power9::PM_INST_DISP", + }, ++ { SRC_LINE, ++ .name = "power9::PM_CYC_ALT", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x2001e, ++ .fstr = "power9::PM_CYC_ALT", ++ }, ++ { SRC_LINE, ++ .name = "power9::PM_CYC_ALT2", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x3001e, ++ .fstr = "power9::PM_CYC_ALT2", ++ }, ++ { SRC_LINE, ++ .name = "power9::PM_INST_CMPL_ALT", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 
0x20002, ++ .fstr = "power9::PM_INST_CMPL_ALT", ++ }, + { SRC_LINE, + .name = "powerpc_nest_mcs_read::MCS_00", + .ret = PFM_SUCCESS, +-- +2.13.6 + diff --git a/SOURCES/libpfm-p9_uniq.patch b/SOURCES/libpfm-p9_uniq.patch new file mode 100644 index 0000000..82c2277 --- /dev/null +++ b/SOURCES/libpfm-p9_uniq.patch @@ -0,0 +1,220 @@ +diff --git a/lib/events/power9_events.h b/lib/events/power9_events.h +index d77bab3..f352ace 100644 +--- a/lib/events/power9_events.h ++++ b/lib/events/power9_events.h +@@ -1550,7 +1550,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "CO mach 0 Busy. Used by PMU to sample ave CO lifetime (mach0 used as sample point)", + }, + [ POWER9_PME_PM_CO0_BUSY_ALT ] = { +- .pme_name = "PM_CO0_BUSY", ++ .pme_name = "PM_CO0_BUSY_ALT", + .pme_code = 0x000004608C, + .pme_short_desc = "CO mach 0 Busy.", + .pme_long_desc = "CO mach 0 Busy. Used by PMU to sample ave CO lifetime (mach0 used as sample point)", +@@ -2277,7 +2277,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Data SLB Miss - Total of all segment sizes", + }, + [ POWER9_PME_PM_DSLB_MISS_ALT ] = { +- .pme_name = "PM_DSLB_MISS", ++ .pme_name = "PM_DSLB_MISS_ALT", + .pme_code = 0x0000010016, + .pme_short_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", + .pme_long_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", +@@ -3155,7 +3155,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Instruction SLB Miss - Total of all segment sizes", + }, + [ POWER9_PME_PM_ISLB_MISS_ALT ] = { +- .pme_name = "PM_ISLB_MISS", ++ .pme_name = "PM_ISLB_MISS_ALT", + .pme_code = 0x0000040006, + .pme_short_desc = "Number of ISLB misses for this thread", + 
.pme_long_desc = "Number of ISLB misses for this thread", +@@ -3323,7 +3323,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", + }, + [ POWER9_PME_PM_L2_INST_MISS_ALT ] = { +- .pme_name = "PM_L2_INST_MISS", ++ .pme_name = "PM_L2_INST_MISS_ALT", + .pme_code = 0x000004609E, + .pme_short_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", + .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", +@@ -3335,7 +3335,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", + }, + [ POWER9_PME_PM_L2_INST_ALT ] = { +- .pme_name = "PM_L2_INST", ++ .pme_name = "PM_L2_INST_ALT", + .pme_code = 0x000003609E, + .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", + .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", +@@ -3347,7 +3347,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful D-side load dispatches for this thread (L2 miss + L2 hits)", + }, + [ POWER9_PME_PM_L2_LD_DISP_ALT ] = { +- .pme_name = "PM_L2_LD_DISP", ++ .pme_name = "PM_L2_LD_DISP_ALT", + .pme_code = 0x0000036082, + .pme_short_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", + .pme_long_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", +@@ -3359,7 +3359,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful D-side load dispatches that were L2 hits for this thread", + }, + [ POWER9_PME_PM_L2_LD_HIT_ALT ] = { +- .pme_name = "PM_L2_LD_HIT", ++ .pme_name = "PM_L2_LD_HIT_ALT", + .pme_code = 0x0000036882, + .pme_short_desc = "All successful 
I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", + .pme_long_desc = "All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", +@@ -3449,7 +3449,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", + }, + [ POWER9_PME_PM_L2_RTY_LD_ALT ] = { +- .pme_name = "PM_L2_RTY_LD", ++ .pme_name = "PM_L2_RTY_LD_ALT", + .pme_code = 0x000003689E, + .pme_short_desc = "RC retries on PB for any load from core (excludes DCBFs)", + .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", +@@ -3461,7 +3461,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", + }, + [ POWER9_PME_PM_L2_RTY_ST_ALT ] = { +- .pme_name = "PM_L2_RTY_ST", ++ .pme_name = "PM_L2_RTY_ST_ALT", + .pme_code = 0x000004689E, + .pme_short_desc = "RC retries on PB for any store from core (excludes DCBFs)", + .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", +@@ -3479,7 +3479,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", + }, + [ POWER9_PME_PM_L2_SN_M_WR_DONE_ALT ] = { +- .pme_name = "PM_L2_SN_M_WR_DONE", ++ .pme_name = "PM_L2_SN_M_WR_DONE_ALT", + .pme_code = 0x0000046886, + .pme_short_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", + .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", +@@ -3497,7 +3497,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful D-side store dispatches for this thread", + }, + [ 
POWER9_PME_PM_L2_ST_DISP_ALT ] = { +- .pme_name = "PM_L2_ST_DISP", ++ .pme_name = "PM_L2_ST_DISP_ALT", + .pme_code = 0x000001689E, + .pme_short_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", + .pme_long_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", +@@ -3509,7 +3509,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "All successful D-side store dispatches for this thread that were L2 hits", + }, + [ POWER9_PME_PM_L2_ST_HIT_ALT ] = { +- .pme_name = "PM_L2_ST_HIT", ++ .pme_name = "PM_L2_ST_HIT_ALT", + .pme_code = 0x000002689E, + .pme_short_desc = "All successful D-side store dispatches that were L2 hits for this thread", + .pme_long_desc = "All successful D-side store dispatches that were L2 hits for this thread", +@@ -3587,7 +3587,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Lifetime, sample of CO machine 0 valid", + }, + [ POWER9_PME_PM_L3_CO0_BUSY_ALT ] = { +- .pme_name = "PM_L3_CO0_BUSY", ++ .pme_name = "PM_L3_CO0_BUSY_ALT", + .pme_code = 0x00000468AC, + .pme_short_desc = "Lifetime, sample of CO machine 0 valid", + .pme_long_desc = "Lifetime, sample of CO machine 0 valid", +@@ -3617,7 +3617,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "L3 castouts in Mepf state for this thread", + }, + [ POWER9_PME_PM_L3_CO_MEPF_ALT ] = { +- .pme_name = "PM_L3_CO_MEPF", ++ .pme_name = "PM_L3_CO_MEPF_ALT", + .pme_code = 0x00000168A0, + .pme_short_desc = "L3 CO of line in Mep state (includes casthrough to memory).", + .pme_long_desc = "L3 CO of line in Mep state (includes casthrough to memory). 
The Mepf state indicates that a line was brought in to satisfy an L3 prefetch request", +@@ -3731,7 +3731,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "L3 CO received retry port 0 (memory only), every retry counted", + }, + [ POWER9_PME_PM_L3_P0_CO_RTY_ALT ] = { +- .pme_name = "PM_L3_P0_CO_RTY", ++ .pme_name = "PM_L3_P0_CO_RTY_ALT", + .pme_code = 0x00000460AE, + .pme_short_desc = "L3 CO received retry port 2 (memory only), every retry counted", + .pme_long_desc = "L3 CO received retry port 2 (memory only), every retry counted", +@@ -3773,7 +3773,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "L3 PF received retry port 0, every retry counted", + }, + [ POWER9_PME_PM_L3_P0_PF_RTY_ALT ] = { +- .pme_name = "PM_L3_P0_PF_RTY", ++ .pme_name = "PM_L3_P0_PF_RTY_ALT", + .pme_code = 0x00000260AE, + .pme_short_desc = "L3 PF received retry port 2, every retry counted", + .pme_long_desc = "L3 PF received retry port 2, every retry counted", +@@ -3803,7 +3803,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "L3 CO received retry port 1 (memory only), every retry counted", + }, + [ POWER9_PME_PM_L3_P1_CO_RTY_ALT ] = { +- .pme_name = "PM_L3_P1_CO_RTY", ++ .pme_name = "PM_L3_P1_CO_RTY_ALT", + .pme_code = 0x00000468AE, + .pme_short_desc = "L3 CO received retry port 3 (memory only), every retry counted", + .pme_long_desc = "L3 CO received retry port 3 (memory only), every retry counted", +@@ -3845,7 +3845,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "L3 PF received retry port 1, every retry counted", + }, + [ POWER9_PME_PM_L3_P1_PF_RTY_ALT ] = { +- .pme_name = "PM_L3_P1_PF_RTY", ++ .pme_name = "PM_L3_P1_PF_RTY_ALT", + .pme_code = 0x00000268AE, + .pme_short_desc = "L3 PF received retry port 3, every retry counted", + .pme_long_desc = "L3 PF received retry port 3, every retry counted", +@@ -3875,7 +3875,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Lifetime, sample 
of PF machine 0 valid", + }, + [ POWER9_PME_PM_L3_PF0_BUSY_ALT ] = { +- .pme_name = "PM_L3_PF0_BUSY", ++ .pme_name = "PM_L3_PF0_BUSY_ALT", + .pme_code = 0x00000460B4, + .pme_short_desc = "Lifetime, sample of PF machine 0 valid", + .pme_long_desc = "Lifetime, sample of PF machine 0 valid", +@@ -3929,7 +3929,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Lifetime, sample of RD machine 0 valid", + }, + [ POWER9_PME_PM_L3_RD0_BUSY_ALT ] = { +- .pme_name = "PM_L3_RD0_BUSY", ++ .pme_name = "PM_L3_RD0_BUSY_ALT", + .pme_code = 0x00000468B4, + .pme_short_desc = "Lifetime, sample of RD machine 0 valid", + .pme_long_desc = "Lifetime, sample of RD machine 0 valid", +@@ -3947,7 +3947,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", + }, + [ POWER9_PME_PM_L3_SN0_BUSY_ALT ] = { +- .pme_name = "PM_L3_SN0_BUSY", ++ .pme_name = "PM_L3_SN0_BUSY_ALT", + .pme_code = 0x00000460AC, + .pme_short_desc = "Lifetime, sample of snooper machine 0 valid", + .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", +@@ -3989,7 +3989,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "Rotating sample of 8 WI valid", + }, + [ POWER9_PME_PM_L3_WI0_BUSY_ALT ] = { +- .pme_name = "PM_L3_WI0_BUSY", ++ .pme_name = "PM_L3_WI0_BUSY_ALT", + .pme_code = 0x00000260B6, + .pme_short_desc = "Rotating sample of 8 WI valid (duplicate)", + .pme_long_desc = "Rotating sample of 8 WI valid (duplicate)", +@@ -5928,7 +5928,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "RC mach 0 Busy. Used by PMU to sample ave RC lifetime (mach0 used as sample point)", + }, + [ POWER9_PME_PM_RC0_BUSY_ALT ] = { +- .pme_name = "PM_RC0_BUSY", ++ .pme_name = "PM_RC0_BUSY_ALT", + .pme_code = 0x000002608C, + .pme_short_desc = "RC mach 0 Busy.", + .pme_long_desc = "RC mach 0 Busy. 
Used by PMU to sample ave RC lifetime (mach0 used as sample point)", +@@ -6042,7 +6042,7 @@ static const pme_power_entry_t power9_pe[] = { + .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave SN lifetime (mach0 used as sample point)", + }, + [ POWER9_PME_PM_SN0_BUSY_ALT ] = { +- .pme_name = "PM_SN0_BUSY", ++ .pme_name = "PM_SN0_BUSY_ALT", + .pme_code = 0x0000026090, + .pme_short_desc = "SN mach 0 Busy.", + .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave SN lifetime (mach0 used as sample point)", diff --git a/SOURCES/libpfm-power9.patch b/SOURCES/libpfm-power9.patch new file mode 100644 index 0000000..9a7fbe6 --- /dev/null +++ b/SOURCES/libpfm-power9.patch @@ -0,0 +1,18877 @@ +From ae1a66c16313ea1d96d99a293c2b7dab095b9880 Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Fri, 21 Apr 2017 17:25:45 -0700 +Subject: [PATCH 1/6] Enable IBM Power9 core PMU support (beta) + +This patch is build tested only, thus the [RFC] tag. :-) +The event list itself is untested and thus subject to change. + +Add POWER9 support. + +Signed-off-by: Will Schmidt +--- + README | 1 + + include/perfmon/pfmlib.h | 2 + + lib/Makefile | 3 +- + lib/events/power9_events.h | 6460 ++++++++++++++++++++++++++++++++++++++++++++ + lib/pfmlib_common.c | 1 + + lib/pfmlib_power9.c | 58 + + lib/pfmlib_power_priv.h | 2 + + lib/pfmlib_priv.h | 1 + + 8 files changed, 6527 insertions(+), 1 deletion(-) + create mode 100644 lib/events/power9_events.h + create mode 100644 lib/pfmlib_power9.c + +diff --git a/README b/README +index 6a49591..92d9950 100644 +--- a/README ++++ b/README +@@ -79,6 +79,7 @@ The library supports many PMUs. 
The current version can handle: + Power 7 + Power 8 + Power 8 Nest ++ Power 9 + PPC970 + Torrent + System z (s390x) +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index 6904c1c..89ab973 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -367,6 +367,8 @@ typedef enum { + + PFM_PMU_INTEL_KNL_UNC_UBOX, /* Intel KnightLanding Ubox uncore */ + PFM_PMU_INTEL_KNL_UNC_M2PCIE, /* Intel KnightLanding M2PCIe uncore */ ++ ++ PFM_PMU_POWER9, /* IBM POWER9 */ + /* MUST ADD NEW PMU MODELS HERE */ + + PFM_PMU_MAX /* end marker */ +diff --git a/lib/Makefile b/lib/Makefile +index 72f26d7..f532561 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -124,7 +124,7 @@ SRCS += pfmlib_powerpc_perf_event.c + endif + + INCARCH = $(INC_POWERPC) +-SRCS += pfmlib_powerpc.c pfmlib_power4.c pfmlib_ppc970.c pfmlib_power5.c pfmlib_power6.c pfmlib_power7.c pfmlib_torrent.c pfmlib_power8.c pfmlib_powerpc_nest.c ++SRCS += pfmlib_powerpc.c pfmlib_power4.c pfmlib_ppc970.c pfmlib_power5.c pfmlib_power6.c pfmlib_power7.c pfmlib_torrent.c pfmlib_power8.c pfmlib_power9.c pfmlib_powerpc_nest.c + CFLAGS += -DCONFIG_PFMLIB_ARCH_POWERPC + endif + +@@ -291,6 +291,7 @@ INC_POWERPC=events/ppc970_events.h \ + events/power6_events.h \ + events/power7_events.h \ + events/power8_events.h \ ++ events/power9_events.h \ + events/torrent_events.h \ + events/powerpc_nest_events.h + +diff --git a/lib/events/power9_events.h b/lib/events/power9_events.h +new file mode 100644 +index 0000000..7414687 +--- /dev/null ++++ b/lib/events/power9_events.h +@@ -0,0 +1,6460 @@ ++/* ++* File: power9_events.h ++* CVS: ++* Author: Will Schmidt ++* will_schmidt@vnet.ibm.com ++* Author: Carl Love ++* cel@us.ibm.com ++* ++* Mods: ++* Initial content generated by Will Schmidt. (Jan 31, 2017). ++* ++* Contributed by ++* (C) Copyright IBM Corporation, 2017. All Rights Reserved. ++* ++* Note: This code was automatically generated and should not be modified by ++* hand. 
++* ++* Documentation on the PMU events will be published at: ++* ... ++*/ ++ ++#ifndef __POWER9_EVENTS_H__ ++#define __POWER9_EVENTS_H__ ++ ++#define POWER9_PME_PM_IERAT_RELOAD 0 ++#define POWER9_PME_PM_TM_OUTER_TEND 1 ++#define POWER9_PME_PM_IPTEG_FROM_L3 2 ++#define POWER9_PME_PM_DPTEG_FROM_L3_1_MOD 3 ++#define POWER9_PME_PM_PMC2_SAVED 4 ++#define POWER9_PME_PM_LSU_FLUSH_SAO 5 ++#define POWER9_PME_PM_CMPLU_STALL_DFU 6 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS 7 ++#define POWER9_PME_PM_SP_FLOP_CMPL 8 ++#define POWER9_PME_PM_IC_RELOAD_PRIVATE 9 ++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 10 ++#define POWER9_PME_PM_INST_PUMP_CPRED 11 ++#define POWER9_PME_PM_INST_FROM_L2_1_MOD 12 ++#define POWER9_PME_PM_MRK_ST_CMPL 13 ++#define POWER9_PME_PM_MRK_LSU_DERAT_MISS 14 ++#define POWER9_PME_PM_L2_ST_DISP 15 ++#define POWER9_PME_PM_LSU0_FALSE_LHS 16 ++#define POWER9_PME_PM_L2_CASTOUT_MOD 17 ++#define POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR 18 ++#define POWER9_PME_PM_MRK_INST_TIMEO 19 ++#define POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH 20 ++#define POWER9_PME_PM_INST_FROM_L2_1_SHR 21 ++#define POWER9_PME_PM_LS1_DC_COLLISIONS 22 ++#define POWER9_PME_PM_LSU2_FALSE_LHS 23 ++#define POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC 24 ++#define POWER9_PME_PM_MRK_DTLB_MISS_16M 25 ++#define POWER9_PME_PM_L2_GROUP_PUMP 26 ++#define POWER9_PME_PM_LSU2_VECTOR_ST_FIN 27 ++#define POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB 28 ++#define POWER9_PME_PM_L3_CO_LCO 29 ++#define POWER9_PME_PM_INST_GRP_PUMP_CPRED 30 ++#define POWER9_PME_PM_THRD_PRIO_4_5_CYC 31 ++#define POWER9_PME_PM_BR_PRED_TA 32 ++#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS 33 ++#define POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT 34 ++#define POWER9_PME_PM_CMPLU_STALL_FXU 35 ++#define POWER9_PME_PM_VSU_FSQRT_FDIV 36 ++#define POWER9_PME_PM_EXT_INT 37 ++#define POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC 38 ++#define POWER9_PME_PM_S2Q_FULL 39 ++#define POWER9_PME_PM_RUN_CYC_SMT2_MODE 40 ++#define POWER9_PME_PM_DECODE_LANES_NOT_AVAIL 41
++#define POWER9_PME_PM_TM_FAIL_TLBIE 42
++#define POWER9_PME_PM_DISP_CLB_HELD_BAL 43
++#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC 44
++#define POWER9_PME_PM_MRK_ST_FWD 45
++#define POWER9_PME_PM_FXU_FIN 46
++#define POWER9_PME_PM_SYNC_MRK_BR_MPRED 47
++#define POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB 48
++#define POWER9_PME_PM_DSLB_MISS 49
++#define POWER9_PME_PM_L3_MISS 50
++#define POWER9_PME_PM_DUMMY2_REMOVE_ME 51
++#define POWER9_PME_PM_MRK_DERAT_MISS_1G 52
++#define POWER9_PME_PM_MATH_FLOP_CMPL 53
++#define POWER9_PME_PM_L2_INST 54
++#define POWER9_PME_PM_FLUSH_DISP 55
++#define POWER9_PME_PM_DISP_HELD_ISSQ_FULL 56
++#define POWER9_PME_PM_MEM_READ 57
++#define POWER9_PME_PM_DATA_PUMP_MPRED 58
++#define POWER9_PME_PM_DATA_CHIP_PUMP_CPRED 59
++#define POWER9_PME_PM_MRK_DATA_FROM_DMEM 60
++#define POWER9_PME_PM_CMPLU_STALL_LSU 61
++#define POWER9_PME_PM_DATA_FROM_L3_1_MOD 62
++#define POWER9_PME_PM_MRK_DERAT_MISS_16M 63
++#define POWER9_PME_PM_TM_TRANS_RUN_CYC 64
++#define POWER9_PME_PM_THRD_ALL_RUN_CYC 65
++#define POWER9_PME_PM_DATA_FROM_DL2L3_MOD 66
++#define POWER9_PME_PM_MRK_BR_MPRED_CMPL 67
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ 68
++#define POWER9_PME_PM_MRK_INST 69
++#define POWER9_PME_PM_TABLEWALK_CYC_PREF 70
++#define POWER9_PME_PM_LSU1_ERAT_HIT 71
++#define POWER9_PME_PM_NTC_ISSUE_HELD_OTHER 72
++#define POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT 73
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2 74
++#define POWER9_PME_PM_LS1_TM_DISALLOW 75
++#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST 76
++#define POWER9_PME_PM_BR_PRED_PCACHE 77
++#define POWER9_PME_PM_MRK_BACK_BR_CMPL 78
++#define POWER9_PME_PM_RD_CLEARING_SC 79
++#define POWER9_PME_PM_PMC1_OVERFLOW 80
++#define POWER9_PME_PM_L2_RTY_ST 81
++#define POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT 82
++#define POWER9_PME_PM_LSU1_FALSE_LHS 83
++#define POWER9_PME_PM_LSU0_VECTOR_ST_FIN 84
++#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH 85
++#define POWER9_PME_PM_LS2_UNALIGNED_LD 86
++#define POWER9_PME_PM_BR_TAKEN_CMPL 87
++#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED 88
++#define POWER9_PME_PM_ISQ_36_44_ENTRIES 89
++#define POWER9_PME_PM_LSU1_VECTOR_LD_FIN 90
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER 91
++#define POWER9_PME_PM_ICT_NOSLOT_IC_MISS 92
++#define POWER9_PME_PM_LSU3_TM_L1_HIT 93
++#define POWER9_PME_PM_MRK_INST_DISP 94
++#define POWER9_PME_PM_VECTOR_FLOP_CMPL 95
++#define POWER9_PME_PM_FXU_IDLE 96
++#define POWER9_PME_PM_INST_CMPL 97
++#define POWER9_PME_PM_EAT_FORCE_MISPRED 98
++#define POWER9_PME_PM_CMPLU_STALL_LRQ_FULL 99
++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD 100
++#define POWER9_PME_PM_BACK_BR_CMPL 101
++#define POWER9_PME_PM_NEST_REF_CLK 102
++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR 103
++#define POWER9_PME_PM_RC_USAGE 104
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_MOD 105
++#define POWER9_PME_PM_BR_CMPL 106
++#define POWER9_PME_PM_INST_FROM_RL2L3_MOD 107
++#define POWER9_PME_PM_SHL_CREATED 108
++#define POWER9_PME_PM_CMPLU_STALL_PASTE 109
++#define POWER9_PME_PM_LSU3_LDMX_FIN 110
++#define POWER9_PME_PM_SN_USAGE 111
++#define POWER9_PME_PM_L2_ST_HIT 112
++#define POWER9_PME_PM_DATA_FROM_DMEM 113
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE 114
++#define POWER9_PME_PM_LSU2_LDMX_FIN 115
++#define POWER9_PME_PM_L3_LD_MISS 116
++#define POWER9_PME_PM_DPTEG_FROM_RL4 117
++#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 118
++#define POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC 119
++#define POWER9_PME_PM_TM_SC_CO 120
++#define POWER9_PME_PM_L2_SN_SX_I_DONE 121
++#define POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT 122
++#define POWER9_PME_PM_ISIDE_L2MEMACC 123
++#define POWER9_PME_PM_L3_P0_GRP_PUMP 124
++#define POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR 125
++#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 126
++#define POWER9_PME_PM_THRESH_MET 127
++#define POWER9_PME_PM_DATA_FROM_L2_MEPF 128
++#define POWER9_PME_PM_DISP_STARVED 129
++#define POWER9_PME_PM_L3_P0_LCO_RTY 130
++#define POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL 131
++#define POWER9_PME_PM_L3_RD_USAGE 132
++#define POWER9_PME_PM_TLBIE_FIN 133
++#define POWER9_PME_PM_DPTEG_FROM_LL4 134
++#define POWER9_PME_PM_CMPLU_STALL_TLBIE 135
++#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC 136
++#define POWER9_PME_PM_LS3_DC_COLLISIONS 137
++#define POWER9_PME_PM_L1_ICACHE_MISS 138
++#define POWER9_PME_PM_LSU_REJECT_ERAT_MISS 139
++#define POWER9_PME_PM_DATA_SYS_PUMP_CPRED 140
++#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC 141
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR_CYC 142
++#define POWER9_PME_PM_LSU_FLUSH_UE 143
++#define POWER9_PME_PM_BR_PRED_TAKEN_CR 144
++#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER 145
++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR 146
++#define POWER9_PME_PM_DATA_FROM_L2_1_MOD 147
++#define POWER9_PME_PM_LSU_FLUSH_LHL_SHL 148
++#define POWER9_PME_PM_L3_P1_PF_RTY 149
++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD 150
++#define POWER9_PME_PM_DFU_BUSY 151
++#define POWER9_PME_PM_LSU1_TM_L1_MISS 152
++#define POWER9_PME_PM_FREQ_UP 153
++#define POWER9_PME_PM_DATA_FROM_LMEM 154
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF 155
++#define POWER9_PME_PM_ISIDE_DISP 156
++#define POWER9_PME_PM_TM_OUTER_TBEGIN 157
++#define POWER9_PME_PM_PMC3_OVERFLOW 158
++#define POWER9_PME_PM_LSU0_SET_MPRED 159
++#define POWER9_PME_PM_INST_FROM_L2_MEPF 160
++#define POWER9_PME_PM_L3_P0_NODE_PUMP 161
++#define POWER9_PME_PM_IPTEG_FROM_L3_1_MOD 162
++#define POWER9_PME_PM_L3_PF_USAGE 163
++#define POWER9_PME_PM_CMPLU_STALL_BRU 164
++#define POWER9_PME_PM_ISLB_MISS 165
++#define POWER9_PME_PM_CYC 166
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR 167
++#define POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD 168
++#define POWER9_PME_PM_DARQ_10_12_ENTRIES 169
++#define POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC 170
++#define POWER9_PME_PM_DECODE_FUSION_OP_PRESERV 171
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF 172
++#define POWER9_PME_PM_MRK_L1_RELOAD_VALID 173
++#define POWER9_PME_PM_LSU2_SET_MPRED 174
++#define POWER9_PME_PM_1PLUS_PPC_CMPL 175
++#define POWER9_PME_PM_DATA_FROM_LL4 176
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS 177
++#define POWER9_PME_PM_TM_CAP_OVERFLOW 178
++#define POWER9_PME_PM_MRK_DPTEG_FROM_LMEM 179
++#define POWER9_PME_PM_LSU3_FALSE_LHS 180
++#define POWER9_PME_PM_THRESH_EXC_512 181
++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 182
++#define POWER9_PME_PM_HWSYNC 183
++#define POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW 184
++#define POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY 185
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL 186
++#define POWER9_PME_PM_DC_DEALLOC_NO_CONF 187
++#define POWER9_PME_PM_CMPLU_STALL_VFXLONG 188
++#define POWER9_PME_PM_MEM_LOC_THRESH_IFU 189
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_CYC 190
++#define POWER9_PME_PM_PTE_PREFETCH 191
++#define POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB 192
++#define POWER9_PME_PM_CMPLU_STALL_SLB 193
++#define POWER9_PME_PM_MRK_DERAT_MISS_4K 194
++#define POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR 195
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_SHR 196
++#define POWER9_PME_PM_VSU_DP_FSQRT_FDIV 197
++#define POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_SHR 198
++#define POWER9_PME_PM_L3_P0_LCO_DATA 199
++#define POWER9_PME_PM_RUN_INST_CMPL 200
++#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE 201
++#define POWER9_PME_PM_MRK_TEND_FAIL 202
++#define POWER9_PME_PM_MRK_VSU_FIN 203
++#define POWER9_PME_PM_DATA_FROM_L3_1_ECO_MOD 204
++#define POWER9_PME_PM_RUN_SPURR 205
++#define POWER9_PME_PM_ST_CAUSED_FAIL 206
++#define POWER9_PME_PM_SNOOP_TLBIE 207
++#define POWER9_PME_PM_PMC1_SAVED 208
++#define POWER9_PME_PM_DATA_FROM_L3MISS 209
++#define POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE 210
++#define POWER9_PME_PM_DTLB_MISS_16G 211
++#define POWER9_PME_PM_MRK_DPTEG_FROM_DMEM 212
++#define POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS 213
++#define POWER9_PME_PM_FLUSH 214
++#define POWER9_PME_PM_LSU_FLUSH_OTHER 215
++#define POWER9_PME_PM_LS1_LAUNCH_HELD_PREF 216
++#define POWER9_PME_PM_L2_LD_HIT 217
++#define POWER9_PME_PM_LSU2_VECTOR_LD_FIN 218
++#define POWER9_PME_PM_LSU_FLUSH_EMSH 219
++#define POWER9_PME_PM_IC_PREF_REQ 220
++#define POWER9_PME_PM_DPTEG_FROM_L2_1_SHR 221
++#define POWER9_PME_PM_XLATE_RADIX_MODE 222
++#define POWER9_PME_PM_L3_LD_HIT 223
++#define POWER9_PME_PM_DARQ_7_9_ENTRIES 224
++#define POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT 225
++#define POWER9_PME_PM_DISP_HELD 226
++#define POWER9_PME_PM_TM_FAIL_CONF_TM 227
++#define POWER9_PME_PM_LS0_DC_COLLISIONS 228
++#define POWER9_PME_PM_L2_LD 229
++#define POWER9_PME_PM_BTAC_GOOD_RESULT 230
++#define POWER9_PME_PM_TEND_PEND_CYC 231
++#define POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV 232
++#define POWER9_PME_PM_DISP_HELD_HB_FULL 233
++#define POWER9_PME_PM_TM_TRESUME 234
++#define POWER9_PME_PM_MRK_LSU_FLUSH_SAO 235
++#define POWER9_PME_PM_LS0_TM_DISALLOW 236
++#define POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE 237
++#define POWER9_PME_PM_RC0_BUSY 238
++#define POWER9_PME_PM_LSU1_TM_L1_HIT 239
++#define POWER9_PME_PM_TB_BIT_TRANS 240
++#define POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT 241
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_MOD 242
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT 243
++#define POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC 244
++#define POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE 245
++#define POWER9_PME_PM_L3_CO_L31 246
++#define POWER9_PME_PM_CMPLU_STALL_CRYPTO 247
++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 248
++#define POWER9_PME_PM_ICT_EMPTY_CYC 249
++#define POWER9_PME_PM_BR_UNCOND 250
++#define POWER9_PME_PM_DERAT_MISS_2M 251
++#define POWER9_PME_PM_PMC4_REWIND 252
++#define POWER9_PME_PM_L2_RCLD_DISP 253
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT 254
++#define POWER9_PME_PM_TAKEN_BR_MPRED_CMPL 255
++#define POWER9_PME_PM_THRD_PRIO_2_3_CYC 256
++#define POWER9_PME_PM_DATA_FROM_DL4 257
++#define POWER9_PME_PM_CMPLU_STALL_DPLONG 258
++#define POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 259
++#define POWER9_PME_PM_MRK_FAB_RSP_BKILL 260
++#define POWER9_PME_PM_LSU_DERAT_MISS 261
++#define POWER9_PME_PM_IC_PREF_CANCEL_L2 262
++#define POWER9_PME_PM_MRK_NTC_CYC 263
++#define POWER9_PME_PM_STCX_FIN 264
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF 265
++#define POWER9_PME_PM_DC_PREF_FUZZY_CONF 266
++#define POWER9_PME_PM_MULT_MRK 267
++#define POWER9_PME_PM_LSU_FLUSH_LARX_STCX 268
++#define POWER9_PME_PM_L3_P1_LCO_NO_DATA 269
++#define POWER9_PME_PM_TM_TABORT_TRECLAIM 270
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC 271
++#define POWER9_PME_PM_BR_PRED_CCACHE 272
++#define POWER9_PME_PM_L3_P1_LCO_DATA 273
++#define POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED 274
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3 275
++#define POWER9_PME_PM_MRK_ST_CMPL_INT 276
++#define POWER9_PME_PM_FLUSH_HB_RESTORE_CYC 277
++#define POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC 278
++#define POWER9_PME_PM_L3_CI_USAGE 279
++#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS 280
++#define POWER9_PME_PM_DPTEG_FROM_DL4 281
++#define POWER9_PME_PM_MRK_STCX_FIN 282
++#define POWER9_PME_PM_MRK_LSU_FLUSH_UE 283
++#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY 284
++#define POWER9_PME_PM_GRP_PUMP_MPRED_RTY 285
++#define POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_SHR 286
++#define POWER9_PME_PM_FLUSH_DISP_TLBIE 287
++#define POWER9_PME_PM_DPTEG_FROM_L3MISS 288
++#define POWER9_PME_PM_L3_GRP_GUESS_CORRECT 289
++#define POWER9_PME_PM_IC_INVALIDATE 290
++#define POWER9_PME_PM_DERAT_MISS_16G 291
++#define POWER9_PME_PM_SYS_PUMP_MPRED_RTY 292
++#define POWER9_PME_PM_LMQ_MERGE 293
++#define POWER9_PME_PM_IPTEG_FROM_LMEM 294
++#define POWER9_PME_PM_L3_LAT_CI_HIT 295
++#define POWER9_PME_PM_LSU1_VECTOR_ST_FIN 296
++#define POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT 297
++#define POWER9_PME_PM_INST_FROM_LMEM 298
++#define POWER9_PME_PM_MRK_DATA_FROM_RL4 299
++#define POWER9_PME_PM_MRK_DTLB_MISS_4K 300
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT 301
++#define POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH 302
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC 303
++#define POWER9_PME_PM_DARQ_0_3_ENTRIES 304
++#define POWER9_PME_PM_DATA_FROM_L3MISS_MOD 305
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR_CYC 306
++#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG 307
++#define POWER9_PME_PM_L2_LD_MISS 308
++#define POWER9_PME_PM_EAT_FULL_CYC 309
++#define POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH 310
++#define POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX 311
++#define POWER9_PME_PM_THRESH_EXC_128 312
++#define POWER9_PME_PM_LMQ_EMPTY_CYC 313
++#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 314
++#define POWER9_PME_PM_MRK_IC_MISS 315
++#define POWER9_PME_PM_L3_P1_GRP_PUMP 316
++#define POWER9_PME_PM_CMPLU_STALL_TEND 317
++#define POWER9_PME_PM_PUMP_MPRED 318
++#define POWER9_PME_PM_INST_GRP_PUMP_MPRED 319
++#define POWER9_PME_PM_L1_PREF 320
++#define POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC 321
++#define POWER9_PME_PM_LSU_FLUSH_ATOMIC 322
++#define POWER9_PME_PM_L2_DISP_ALL_L2MISS 323
++#define POWER9_PME_PM_DATA_FROM_MEMORY 324
++#define POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_MOD 325
++#define POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR 326
++#define POWER9_PME_PM_CMPLU_STALL_HWSYNC 327
++#define POWER9_PME_PM_DATA_FROM_L3 328
++#define POWER9_PME_PM_PMC2_OVERFLOW 329
++#define POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC 330
++#define POWER9_PME_PM_DPTEG_FROM_LMEM 331
++#define POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE 332
++#define POWER9_PME_PM_LSU1_SET_MPRED 333
++#define POWER9_PME_PM_DATA_FROM_L3_1_ECO_SHR 334
++#define POWER9_PME_PM_INST_FROM_MEMORY 335
++#define POWER9_PME_PM_L3_P1_LCO_RTY 336
++#define POWER9_PME_PM_DATA_FROM_L2_1_SHR 337
++#define POWER9_PME_PM_FLUSH_LSU 338
++#define POWER9_PME_PM_CMPLU_STALL_FXLONG 339
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM 340
++#define POWER9_PME_PM_SNP_TM_HIT_M 341
++#define POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY 342
++#define POWER9_PME_PM_L2_INST_MISS 343
++#define POWER9_PME_PM_CMPLU_STALL_ERAT_MISS 344
++#define POWER9_PME_PM_MRK_L2_RC_DONE 345
++#define POWER9_PME_PM_INST_FROM_L3_1_SHR 346
++#define POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L2 347
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD 348
++#define POWER9_PME_PM_CO0_BUSY 349
++#define POWER9_PME_PM_CMPLU_STALL_STORE_DATA 350
++#define POWER9_PME_PM_INST_FROM_RMEM 351
++#define POWER9_PME_PM_SYNC_MRK_BR_LINK 352
++#define POWER9_PME_PM_L3_LD_PREF 353
++#define POWER9_PME_PM_DISP_CLB_HELD_TLBIE 354
++#define POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE 355
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC 356
++#define POWER9_PME_PM_LS0_UNALIGNED_LD 357
++#define POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC 358
++#define POWER9_PME_PM_SN_HIT 359
++#define POWER9_PME_PM_L3_LOC_GUESS_CORRECT 360
++#define POWER9_PME_PM_MRK_INST_FROM_L3MISS 361
++#define POWER9_PME_PM_DECODE_FUSION_EXT_ADD 362
++#define POWER9_PME_PM_INST_FROM_DL4 363
++#define POWER9_PME_PM_DC_PREF_XCONS_ALLOC 364
++#define POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY 365
++#define POWER9_PME_PM_IC_PREF_CANCEL_PAGE 366
++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 367
++#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW 368
++#define POWER9_PME_PM_TM_FAIL_SELF 369
++#define POWER9_PME_PM_L3_P1_SYS_PUMP 370
++#define POWER9_PME_PM_CMPLU_STALL_RFID 371
++#define POWER9_PME_PM_BR_2PATH 372
++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS 373
++#define POWER9_PME_PM_DPTEG_FROM_L2MISS 374
++#define POWER9_PME_PM_TM_TX_PASS_RUN_INST 375
++#define POWER9_PME_PM_L1_ICACHE_RELOADED_PREF 376
++#define POWER9_PME_PM_THRESH_EXC_4096 377
++#define POWER9_PME_PM_IERAT_RELOAD_64K 378
++#define POWER9_PME_PM_LSU0_TM_L1_MISS 379
++#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED 380
++#define POWER9_PME_PM_PMC3_REWIND 381
++#define POWER9_PME_PM_ST_FWD 382
++#define POWER9_PME_PM_TM_FAIL_TX_CONFLICT 383
++#define POWER9_PME_PM_SYNC_MRK_L2MISS 384
++#define POWER9_PME_PM_ISU0_ISS_HOLD_ALL 385
++#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC 386
++#define POWER9_PME_PM_DATA_FROM_L2 387
++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD 388
++#define POWER9_PME_PM_ISQ_0_8_ENTRIES 389
++#define POWER9_PME_PM_L3_CO_MEPF 390
++#define POWER9_PME_PM_LINK_STACK_INVALID_PTR 391
++#define POWER9_PME_PM_IPTEG_FROM_L2_1_MOD 392
++#define POWER9_PME_PM_TM_ST_CAUSED_FAIL 393
++#define POWER9_PME_PM_LD_REF_L1 394
++#define POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT 395
++#define POWER9_PME_PM_GRP_PUMP_CPRED 396
++#define POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT 397
++#define POWER9_PME_PM_DC_PREF_STRIDED_CONF 398
++#define POWER9_PME_PM_THRD_PRIO_6_7_CYC 399
++#define POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L3 400
++#define POWER9_PME_PM_L3_PF_OFF_CHIP_MEM 401
++#define POWER9_PME_PM_L3_CO_MEM 402
++#define POWER9_PME_PM_DECODE_HOLD_ICT_FULL 403
++#define POWER9_PME_PM_CMPLU_STALL_DFLONG 404
++#define POWER9_PME_PM_LD_MISS_L1 405
++#define POWER9_PME_PM_DATA_FROM_RL2L3_MOD 406
++#define POWER9_PME_PM_L3_WI0_BUSY 407
++#define POWER9_PME_PM_LSU_SRQ_FULL_CYC 408
++#define POWER9_PME_PM_TABLEWALK_CYC 409
++#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC 410
++#define POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE 411
++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS 412
++#define POWER9_PME_PM_CMPLU_STALL_SYS_CALL 413
++#define POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS 414
++#define POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_MOD 415
++#define POWER9_PME_PM_PMC5_OVERFLOW 416
++#define POWER9_PME_PM_LS1_UNALIGNED_ST 417
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC 418
++#define POWER9_PME_PM_CMPLU_STALL_THRD 419
++#define POWER9_PME_PM_PMC3_SAVED 420
++#define POWER9_PME_PM_MRK_DERAT_MISS 421
++#define POWER9_PME_PM_RADIX_PWC_L3_HIT 422
++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS 423
++#define POWER9_PME_PM_RUN_CYC_SMT4_MODE 424
++#define POWER9_PME_PM_DATA_FROM_RMEM 425
++#define POWER9_PME_PM_BR_MPRED_LSTACK 426
++#define POWER9_PME_PM_PROBE_NOP_DISP 427
++#define POWER9_PME_PM_DPTEG_FROM_L3_MEPF 428
++#define POWER9_PME_PM_INST_FROM_L3MISS_MOD 429
++#define POWER9_PME_PM_DUMMY1_REMOVE_ME 430
++#define POWER9_PME_PM_MRK_DATA_FROM_DL4 431
++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 432
++#define POWER9_PME_PM_IPTEG_FROM_L3_1_SHR 433
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR 434
++#define POWER9_PME_PM_DTLB_MISS_2M 435
++#define POWER9_PME_PM_TM_RST_SC 436
++#define POWER9_PME_PM_LSU_NCST 437
++#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY 438
++#define POWER9_PME_PM_THRESH_ACC 439
++#define POWER9_PME_PM_ISU3_ISS_HOLD_ALL 440
++#define POWER9_PME_PM_LSU0_L1_CAM_CANCEL 441
++#define POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC 442
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF 443
++#define POWER9_PME_PM_DARQ_STORE_REJECT 444
++#define POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT 445
++#define POWER9_PME_PM_TM_TX_PASS_RUN_CYC 446
++#define POWER9_PME_PM_DTLB_MISS_4K 447
++#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC 448
++#define POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC 449
++#define POWER9_PME_PM_PMC4_SAVED 450
++#define POWER9_PME_PM_SNP_TM_HIT_T 451
++#define POWER9_PME_PM_MRK_BR_2PATH 452
++#define POWER9_PME_PM_LSU_FLUSH_CI 453
++#define POWER9_PME_PM_FLUSH_MPRED 454
++#define POWER9_PME_PM_CMPLU_STALL_ST_FWD 455
++#define POWER9_PME_PM_DTLB_MISS 456
++#define POWER9_PME_PM_MRK_L2_TM_REQ_ABORT 457
++#define POWER9_PME_PM_TM_NESTED_TEND 458
++#define POWER9_PME_PM_CMPLU_STALL_PM 459
++#define POWER9_PME_PM_CMPLU_STALL_ISYNC 460
++#define POWER9_PME_PM_MRK_DTLB_MISS_1G 461
++#define POWER9_PME_PM_L3_SYS_GUESS_CORRECT 462
++#define POWER9_PME_PM_L2_CASTOUT_SHR 463
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 464
++#define POWER9_PME_PM_LS2_UNALIGNED_ST 465
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS 466
++#define POWER9_PME_PM_THRESH_EXC_32 467
++#define POWER9_PME_PM_TM_TSUSPEND 468
++#define POWER9_PME_PM_DATA_FROM_DL2L3_SHR 469
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT 470
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC 471
++#define POWER9_PME_PM_THRESH_EXC_1024 472
++#define POWER9_PME_PM_ST_FIN 473
++#define POWER9_PME_PM_TM_LD_CAUSED_FAIL 474
++#define POWER9_PME_PM_SRQ_SYNC_CYC 475
++#define POWER9_PME_PM_IFETCH_THROTTLE 476
++#define POWER9_PME_PM_L3_SW_PREF 477
++#define POWER9_PME_PM_LSU0_LDMX_FIN 478
++#define POWER9_PME_PM_L2_LOC_GUESS_WRONG 479
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC 480
++#define POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE 481
++#define POWER9_PME_PM_L3_P1_CO_RTY 482
++#define POWER9_PME_PM_MRK_STCX_FAIL 483
++#define POWER9_PME_PM_LARX_FIN 484
++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 485
++#define POWER9_PME_PM_LSU3_L1_CAM_CANCEL 486
++#define POWER9_PME_PM_IC_PREF_CANCEL_HIT 487
++#define POWER9_PME_PM_CMPLU_STALL_EIEIO 488
++#define POWER9_PME_PM_CMPLU_STALL_VDP 489
++#define POWER9_PME_PM_DERAT_MISS_1G 490
++#define POWER9_PME_PM_DATA_PUMP_CPRED 491
++#define POWER9_PME_PM_DPTEG_FROM_L2_MEPF 492
++#define POWER9_PME_PM_BR_MPRED_TAKEN_CR 493
++#define POWER9_PME_PM_MRK_BRU_FIN 494
++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL4 495
++#define POWER9_PME_PM_SHL_ST_DEP_CREATED 496
++#define POWER9_PME_PM_DPTEG_FROM_L3_1_SHR 497
++#define POWER9_PME_PM_DATA_FROM_RL4 498
++#define POWER9_PME_PM_XLATE_MISS 499
++#define POWER9_PME_PM_CMPLU_STALL_SRQ_FULL 500
++#define POWER9_PME_PM_SN0_BUSY 501
++#define POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN 502
++#define POWER9_PME_PM_ST_CMPL 503
++#define POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR 504
++#define POWER9_PME_PM_DECODE_FUSION_CONST_GEN 505
++#define POWER9_PME_PM_L2_LOC_GUESS_CORRECT 506
++#define POWER9_PME_PM_INST_FROM_L3_1_ECO_SHR 507
++#define POWER9_PME_PM_XLATE_HPT_MODE 508
++#define POWER9_PME_PM_CMPLU_STALL_LSU_FIN 509
++#define POWER9_PME_PM_THRESH_EXC_64 510
++#define POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC 511
++#define POWER9_PME_PM_DARQ_STORE_XMIT 512
++#define POWER9_PME_PM_DATA_TABLEWALK_CYC 513
++#define POWER9_PME_PM_L2_RC_ST_DONE 514
++#define POWER9_PME_PM_TMA_REQ_L2 515
++#define POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE 516
++#define POWER9_PME_PM_SLB_TABLEWALK_CYC 517
++#define POWER9_PME_PM_MRK_DATA_FROM_RMEM 518
++#define POWER9_PME_PM_L3_PF_MISS_L3 519
++#define POWER9_PME_PM_L3_CI_MISS 520
++#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR 521
++#define POWER9_PME_PM_DERAT_MISS_4K 522
++#define POWER9_PME_PM_ISIDE_MRU_TOUCH 523
++#define POWER9_PME_PM_MRK_RUN_CYC 524
++#define POWER9_PME_PM_L3_P0_CO_RTY 525
++#define POWER9_PME_PM_BR_MPRED_CMPL 526
++#define POWER9_PME_PM_BR_MPRED_TAKEN_TA 527
++#define POWER9_PME_PM_DISP_HELD_TBEGIN 528
++#define POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD 529
++#define POWER9_PME_PM_FLUSH_DISP_SB 530
++#define POWER9_PME_PM_L2_CHIP_PUMP 531
++#define POWER9_PME_PM_L2_DC_INV 532
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC 533
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_SHR 534
++#define POWER9_PME_PM_MRK_DERAT_MISS_2M 535
++#define POWER9_PME_PM_MRK_ST_DONE_L2 536
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD 537
++#define POWER9_PME_PM_IPTEG_FROM_RMEM 538
++#define POWER9_PME_PM_MRK_LSU_FLUSH_EMSH 539
++#define POWER9_PME_PM_BR_PRED_LSTACK 540
++#define POWER9_PME_PM_L3_P0_CO_MEM 541
++#define POWER9_PME_PM_IPTEG_FROM_L2_MEPF 542
++#define POWER9_PME_PM_LS0_ERAT_MISS_PREF 543
++#define POWER9_PME_PM_RD_HIT_PF 544
++#define POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP 545
++#define POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN 546
++#define POWER9_PME_PM_ICT_NOSLOT_CYC 547
++#define POWER9_PME_PM_DERAT_MISS_16M 548
++#define POWER9_PME_PM_IC_MISS_ICBI 549
++#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC 550
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN 551
++#define POWER9_PME_PM_MRK_BR_TAKEN_CMPL 552
++#define POWER9_PME_PM_CMPLU_STALL_VFXU 553
++#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY 554
++#define POWER9_PME_PM_INST_FROM_L3 555
++#define POWER9_PME_PM_ITLB_MISS 556
++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD 557
++#define POWER9_PME_PM_LSU2_TM_L1_MISS 558
++#define POWER9_PME_PM_L3_WI_USAGE 559
++#define POWER9_PME_PM_L2_SN_M_WR_DONE 560
++#define POWER9_PME_PM_DISP_HELD_SYNC_HOLD 561
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_SHR 562
++#define POWER9_PME_PM_MEM_PREF 563
++#define POWER9_PME_PM_L2_SN_M_RD_DONE 564
++#define POWER9_PME_PM_LS0_UNALIGNED_ST 565
++#define POWER9_PME_PM_DC_PREF_CONS_ALLOC 566
++#define POWER9_PME_PM_MRK_DERAT_MISS_16G 567
++#define POWER9_PME_PM_IPTEG_FROM_L2 568
++#define POWER9_PME_PM_ANY_THRD_RUN_CYC 569
++#define POWER9_PME_PM_MRK_PROBE_NOP_CMPL 570
++#define POWER9_PME_PM_BANK_CONFLICT 571
++#define POWER9_PME_PM_INST_SYS_PUMP_MPRED 572
++#define POWER9_PME_PM_NON_DATA_STORE 573
++#define POWER9_PME_PM_DC_PREF_CONF 574
++#define POWER9_PME_PM_BTAC_BAD_RESULT 575
++#define POWER9_PME_PM_LSU_LMQ_FULL_CYC 576
++#define POWER9_PME_PM_NON_MATH_FLOP_CMPL 577
++#define POWER9_PME_PM_MRK_LD_MISS_L1_CYC 578
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_CYC 579
++#define POWER9_PME_PM_FXU_1PLUS_BUSY 580
++#define POWER9_PME_PM_CMPLU_STALL_DP 581
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD_CYC 582
++#define POWER9_PME_PM_SYNC_MRK_L2HIT 583
++#define POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC 584
++#define POWER9_PME_PM_ISU1_ISS_HOLD_ALL 585
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT 586
++#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY 587
++#define POWER9_PME_PM_L3_P3_LCO_RTY 588
++#define POWER9_PME_PM_PUMP_CPRED 589
++#define POWER9_PME_PM_LS3_TM_DISALLOW 590
++#define POWER9_PME_PM_SN_INVL 591
++#define POWER9_PME_PM_TM_LD_CONF 592
++#define POWER9_PME_PM_LD_MISS_L1_FIN 593
++#define POWER9_PME_PM_SYNC_MRK_PROBE_NOP 594
++#define POWER9_PME_PM_RUN_CYC 595
++#define POWER9_PME_PM_SYS_PUMP_MPRED 596
++#define POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE 597
++#define POWER9_PME_PM_TM_NESTED_TBEGIN 598
++#define POWER9_PME_PM_FLUSH_COMPLETION 599
++#define POWER9_PME_PM_ST_MISS_L1 600
++#define POWER9_PME_PM_IPTEG_FROM_L2MISS 601
++#define POWER9_PME_PM_LSU3_TM_L1_MISS 602
++#define POWER9_PME_PM_L3_CO 603
++#define POWER9_PME_PM_MRK_STALL_CMPLU_CYC 604
++#define POWER9_PME_PM_INST_FROM_DL2L3_SHR 605
++#define POWER9_PME_PM_SCALAR_FLOP_CMPL 606
++#define POWER9_PME_PM_LRQ_REJECT 607
++#define POWER9_PME_PM_4FLOP_CMPL 608
++#define POWER9_PME_PM_MRK_DPTEG_FROM_RMEM 609
++#define POWER9_PME_PM_LD_CMPL 610
++#define POWER9_PME_PM_DATA_FROM_L3_MEPF 611
++#define POWER9_PME_PM_L1PF_L2MEMACC 612
++#define POWER9_PME_PM_INST_FROM_L3MISS 613
++#define POWER9_PME_PM_MRK_LSU_FLUSH_LHS 614
++#define POWER9_PME_PM_EE_OFF_EXT_INT 615
++#define POWER9_PME_PM_TM_ST_CONF 616
++#define POWER9_PME_PM_PMC6_OVERFLOW 617
++#define POWER9_PME_PM_INST_FROM_DL2L3_MOD 618
++#define POWER9_PME_PM_MRK_INST_CMPL 619
++#define POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL 620
++#define POWER9_PME_PM_MRK_L1_ICACHE_MISS 621
++#define POWER9_PME_PM_TLB_MISS 622
++#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER 623
++#define POWER9_PME_PM_FXU_BUSY 624
++#define POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT 625
++#define POWER9_PME_PM_INST_FROM_L3_1_MOD 626
++#define POWER9_PME_PM_LSU_REJECT_LMQ_FULL 627
++#define POWER9_PME_PM_CO_DISP_FAIL 628
++#define POWER9_PME_PM_L3_TRANS_PF 629
++#define POWER9_PME_PM_MRK_ST_NEST 630
++#define POWER9_PME_PM_LSU1_L1_CAM_CANCEL 631
++#define POWER9_PME_PM_INST_CHIP_PUMP_CPRED 632
++#define POWER9_PME_PM_LSU3_VECTOR_ST_FIN 633
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_MOD 634
++#define POWER9_PME_PM_IBUF_FULL_CYC 635
++#define POWER9_PME_PM_8FLOP_CMPL 636
++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 637
++#define POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE 638
++#define POWER9_PME_PM_ICT_NOSLOT_IC_L3 639
++#define POWER9_PME_PM_CMPLU_STALL_LWSYNC 640
++#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 641
++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 642
++#define POWER9_PME_PM_L3_SN0_BUSY 643
++#define POWER9_PME_PM_TM_OUTER_TBEGIN_DISP 644
++#define POWER9_PME_PM_GRP_PUMP_MPRED 645
++#define POWER9_PME_PM_SRQ_EMPTY_CYC 646
++#define POWER9_PME_PM_LSU_REJECT_LHS 647
++#define POWER9_PME_PM_IPTEG_FROM_L3_MEPF 648
++#define POWER9_PME_PM_MRK_DATA_FROM_LMEM 649
++#define POWER9_PME_PM_L3_P1_CO_MEM 650
++#define POWER9_PME_PM_FREQ_DOWN 651
++#define POWER9_PME_PM_L3_CINJ 652
++#define POWER9_PME_PM_L3_P0_PF_RTY 653
++#define POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD 654
++#define POWER9_PME_PM_MRK_INST_ISSUED 655
++#define POWER9_PME_PM_INST_FROM_RL2L3_SHR 656
++#define POWER9_PME_PM_LSU_STCX_FAIL 657
++#define POWER9_PME_PM_L3_P1_NODE_PUMP 658
++#define POWER9_PME_PM_MEM_RWITM 659
++#define POWER9_PME_PM_DP_QP_FLOP_CMPL 660
++#define POWER9_PME_PM_RUN_PURR 661
++#define POWER9_PME_PM_CMPLU_STALL_LMQ_FULL 662
++#define POWER9_PME_PM_CMPLU_STALL_VDPLONG 663
++#define POWER9_PME_PM_LSU2_TM_L1_HIT 664
++#define POWER9_PME_PM_MRK_DATA_FROM_L3 665
++#define POWER9_PME_PM_CMPLU_STALL_MTFPSCR 666
++#define POWER9_PME_PM_STALL_END_ICT_EMPTY 667
++#define POWER9_PME_PM_L3_P1_CO_L31 668
++#define POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS 669
++#define POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD 670
++#define POWER9_PME_PM_INST_FROM_L3_MEPF 671
++#define POWER9_PME_PM_L1_DCACHE_RELOADED_ALL 672
++#define POWER9_PME_PM_DATA_GRP_PUMP_CPRED 673
++#define POWER9_PME_PM_MRK_DERAT_MISS_64K 674
++#define POWER9_PME_PM_L2_ST_MISS 675
++#define POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE 676
++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS 677
++#define POWER9_PME_PM_LWSYNC 678
++#define POWER9_PME_PM_LS3_UNALIGNED_LD 679
++#define POWER9_PME_PM_L3_RD0_BUSY 680
++#define POWER9_PME_PM_LINK_STACK_CORRECT 681
++#define POWER9_PME_PM_MRK_DTLB_MISS 682
++#define POWER9_PME_PM_INST_IMC_MATCH_CMPL 683
++#define POWER9_PME_PM_LS1_ERAT_MISS_PREF 684
++#define POWER9_PME_PM_L3_CO0_BUSY 685
++#define POWER9_PME_PM_BFU_BUSY 686
++#define POWER9_PME_PM_L2_SYS_GUESS_CORRECT 687
++#define POWER9_PME_PM_L1_SW_PREF 688
++#define POWER9_PME_PM_MRK_DATA_FROM_LL4 689
++#define POWER9_PME_PM_MRK_INST_FIN 690
++#define POWER9_PME_PM_SYNC_MRK_L3MISS 691
++#define POWER9_PME_PM_LSU1_STORE_REJECT 692
++#define POWER9_PME_PM_CHIP_PUMP_CPRED 693
++#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC 694
++#define POWER9_PME_PM_DATA_STORE 695
++#define POWER9_PME_PM_LS1_UNALIGNED_LD 696
++#define POWER9_PME_PM_TM_TRANS_RUN_INST 697
++#define POWER9_PME_PM_IC_MISS_CMPL 698
++#define POWER9_PME_PM_THRESH_NOT_MET 699
++#define POWER9_PME_PM_DPTEG_FROM_L2 700
++#define POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR 701
++#define POWER9_PME_PM_DPTEG_FROM_RMEM 702
++#define POWER9_PME_PM_L3_L2_CO_MISS 703
++#define POWER9_PME_PM_IPTEG_FROM_DMEM 704
++#define POWER9_PME_PM_MRK_DTLB_MISS_64K 705
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC 706
++#define POWER9_PME_PM_LSU_FIN 707
++#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER 708
++#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE 709
++#define POWER9_PME_PM_LSU_STCX 710
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD 711
++#define POWER9_PME_PM_VSU_NON_FLOP_CMPL 712
++#define POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT 713
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR 714
++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 715
++#define POWER9_PME_PM_TAGE_CORRECT 716
++#define POWER9_PME_PM_TM_FAV_CAUSED_FAIL 717
++#define POWER9_PME_PM_RADIX_PWC_L1_HIT 718
++#define POWER9_PME_PM_LSU0_LMQ_S0_VALID 719
++#define POWER9_PME_PM_BR_MPRED_CCACHE 720
++#define POWER9_PME_PM_L1_DEMAND_WRITE 721
++#define POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD 722
++#define POWER9_PME_PM_IPTEG_FROM_L3MISS 723
++#define POWER9_PME_PM_MRK_DTLB_MISS_16G 724
++#define POWER9_PME_PM_IPTEG_FROM_RL4 725
++#define POWER9_PME_PM_L2_RCST_DISP 726
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC 727
++#define POWER9_PME_PM_CMPLU_STALL 728
++#define POWER9_PME_PM_DISP_CLB_HELD_SB 729
++#define POWER9_PME_PM_L3_SN_USAGE 730
++#define POWER9_PME_PM_FLOP_CMPL 731
++#define POWER9_PME_PM_MRK_L2_RC_DISP 732
++#define POWER9_PME_PM_L3_PF_ON_CHIP_CACHE 733
++#define POWER9_PME_PM_IC_DEMAND_CYC 734
++#define POWER9_PME_PM_CO_USAGE 735
++#define POWER9_PME_PM_ISYNC 736
++#define POWER9_PME_PM_MEM_CO 737
++#define POWER9_PME_PM_NTC_ALL_FIN 738
++#define POWER9_PME_PM_CMPLU_STALL_EXCEPTION 739
++#define POWER9_PME_PM_LS0_LAUNCH_HELD_PREF 740
++#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED 741
++#define POWER9_PME_PM_MRK_BR_CMPL 742
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD 743
++#define POWER9_PME_PM_IC_PREF_WRITE 744
++#define POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL 745
++#define POWER9_PME_PM_DTLB_MISS_1G 746
++#define POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT 747
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS 748
++#define POWER9_PME_PM_BR_PRED 749
++#define POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL 750
++#define POWER9_PME_PM_INST_FROM_DMEM 751
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT 752
++#define POWER9_PME_PM_DC_PREF_SW_ALLOC 753
++#define POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER 754
++#define POWER9_PME_PM_CMPLU_STALL_EMQ_FULL 755
++#define POWER9_PME_PM_MRK_INST_DECODED 756
++#define POWER9_PME_PM_IERAT_RELOAD_4K 757
++#define POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER 758
++#define POWER9_PME_PM_INST_FROM_L3_1_ECO_MOD 759
++#define POWER9_PME_PM_L3_P0_CO_L31 760
++#define POWER9_PME_PM_NON_TM_RST_SC 761
++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 762
++#define POWER9_PME_PM_INST_SYS_PUMP_CPRED 763
++#define POWER9_PME_PM_DPTEG_FROM_DMEM 764
++#define POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 765
++#define POWER9_PME_PM_SYS_PUMP_CPRED 766
++#define POWER9_PME_PM_DTLB_MISS_64K 767
++#define POWER9_PME_PM_CMPLU_STALL_STCX 768
++#define POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY 769
++#define POWER9_PME_PM_PARTIAL_ST_FIN 770
++#define POWER9_PME_PM_THRD_CONC_RUN_INST 771
++#define POWER9_PME_PM_CO_TM_SC_FOOTPRINT 772
++#define POWER9_PME_PM_MRK_LARX_FIN 773
++#define POWER9_PME_PM_L3_LOC_GUESS_WRONG 774
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 775
++#define POWER9_PME_PM_SHL_ST_DISABLE 776
++#define POWER9_PME_PM_VSU_FIN 777
++#define POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC 778
++#define POWER9_PME_PM_L3_CI_HIT 779
++#define POWER9_PME_PM_CMPLU_STALL_DARQ 780
++#define POWER9_PME_PM_L3_PF_ON_CHIP_MEM 781
++#define POWER9_PME_PM_THRD_PRIO_0_1_CYC 782
++#define POWER9_PME_PM_DERAT_MISS_64K 783
++#define POWER9_PME_PM_PMC2_REWIND 784
++#define POWER9_PME_PM_INST_FROM_L2 785
++#define POWER9_PME_PM_MRK_NTF_FIN 786
++#define POWER9_PME_PM_ALL_SRQ_FULL 787
++#define POWER9_PME_PM_INST_DISP 788
++#define POWER9_PME_PM_LS3_ERAT_MISS_PREF 789
++#define POWER9_PME_PM_STOP_FETCH_PENDING_CYC 790
++#define POWER9_PME_PM_L1_DCACHE_RELOAD_VALID 791
++#define POWER9_PME_PM_L3_P0_LCO_NO_DATA 792
++#define POWER9_PME_PM_LSU3_VECTOR_LD_FIN 793
++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT 794
++#define POWER9_PME_PM_MRK_FXU_FIN 795
++#define POWER9_PME_PM_LS3_UNALIGNED_ST 796
++#define POWER9_PME_PM_DPTEG_FROM_MEMORY 797
++#define POWER9_PME_PM_RUN_CYC_ST_MODE 798
++#define POWER9_PME_PM_PMC4_OVERFLOW 799
++#define POWER9_PME_PM_THRESH_EXC_256 800
++#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC 801
++#define POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC 802
++#define POWER9_PME_PM_INST_FROM_L2MISS 803
++#define POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER 804
++#define POWER9_PME_PM_L2_ST 805
++#define POWER9_PME_PM_RADIX_PWC_MISS 806
++#define POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC 807
++#define POWER9_PME_PM_LSU1_LDMX_FIN 808
++#define POWER9_PME_PM_L3_P2_LCO_RTY 809
++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR 810
++#define POWER9_PME_PM_L2_GRP_GUESS_CORRECT 811
++#define POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC 812
++#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED 813
++#define POWER9_PME_PM_LSU3_ERAT_HIT 814
++#define POWER9_PME_PM_FORCED_NOP 815
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST 816
++#define POWER9_PME_PM_CMPLU_STALL_LARX 817
++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL4 818
++#define POWER9_PME_PM_MRK_DATA_FROM_L2 819
++#define POWER9_PME_PM_TM_FAIL_CONF_NON_TM 820
++#define POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR 821
++#define POWER9_PME_PM_DARQ_4_6_ENTRIES 822
++#define POWER9_PME_PM_L2_SYS_PUMP 823
++#define POWER9_PME_PM_IOPS_CMPL 824
++#define POWER9_PME_PM_LSU_FLUSH_LHS 825
++#define POWER9_PME_PM_DATA_FROM_L3_1_SHR 826
++#define POWER9_PME_PM_NTC_FIN 827
++#define POWER9_PME_PM_LS2_DC_COLLISIONS 828
++#define POWER9_PME_PM_FMA_CMPL 829
++#define POWER9_PME_PM_IPTEG_FROM_MEMORY 830
++#define POWER9_PME_PM_TM_NON_FAV_TBEGIN 831
++#define POWER9_PME_PM_PMC1_REWIND 832
++#define POWER9_PME_PM_ISU2_ISS_HOLD_ALL 833
++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 834
++#define POWER9_PME_PM_PTESYNC 835
++#define POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER 836
++#define POWER9_PME_PM_L2_IC_INV 837
++#define POWER9_PME_PM_DPTEG_FROM_L3 838
++#define POWER9_PME_PM_RADIX_PWC_L2_HIT 839
++#define POWER9_PME_PM_DC_PREF_HW_ALLOC 840
++#define POWER9_PME_PM_LSU0_VECTOR_LD_FIN 841
++#define POWER9_PME_PM_1PLUS_PPC_DISP 842
++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 843
++#define POWER9_PME_PM_DATA_FROM_L2MISS 844
++#define POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV 845
++#define POWER9_PME_PM_NTC_ISSUE_HELD_ARB 846
++#define POWER9_PME_PM_LSU2_L1_CAM_CANCEL 847
++#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH 848
++#define POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT 849
++#define POWER9_PME_PM_SUSPENDED 850
++#define POWER9_PME_PM_L3_SYS_GUESS_WRONG 851
++#define POWER9_PME_PM_L3_L2_CO_HIT 852
++#define POWER9_PME_PM_LSU0_TM_L1_HIT 853
++#define POWER9_PME_PM_BR_MPRED_PCACHE 854
++#define POWER9_PME_PM_STCX_FAIL 855
++#define POWER9_PME_PM_LSU_FLUSH_NEXT 856
++#define POWER9_PME_PM_DSIDE_MRU_TOUCH 857
++#define POWER9_PME_PM_SN_MISS 858
++#define POWER9_PME_PM_BR_PRED_TAKEN_CMPL 859
++#define POWER9_PME_PM_L3_P0_SYS_PUMP 860
++#define POWER9_PME_PM_L3_HIT 861
++#define POWER9_PME_PM_MRK_DFU_FIN 862
++#define POWER9_PME_PM_CMPLU_STALL_NESTED_TEND 863
++#define POWER9_PME_PM_INST_FROM_L1 864
++#define POWER9_PME_PM_IC_DEMAND_REQ 865
++#define POWER9_PME_PM_BRU_FIN 866
++#define POWER9_PME_PM_L1_ICACHE_RELOADED_ALL 867
++#define POWER9_PME_PM_IERAT_RELOAD_16M 868
++#define POWER9_PME_PM_DATA_FROM_L2MISS_MOD 869
++#define POWER9_PME_PM_LSU0_ERAT_HIT 870
++#define POWER9_PME_PM_L3_PF0_BUSY 871
++#define POWER9_PME_PM_MRK_DPTEG_FROM_LL4 872
++#define POWER9_PME_PM_LSU3_SET_MPRED 873
++#define POWER9_PME_PM_TM_CAM_OVERFLOW 874
++#define POWER9_PME_PM_SYNC_MRK_FX_DIVIDE 875
++#define POWER9_PME_PM_IPTEG_FROM_L2_1_SHR 876
++#define POWER9_PME_PM_MRK_LD_MISS_L1 877
++#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM 878
++#define POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT 879
++#define POWER9_PME_PM_NON_FMA_FLOP_CMPL 880
++#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS 881
++#define POWER9_PME_PM_L2_SYS_GUESS_WRONG 882
++#define POWER9_PME_PM_THRESH_EXC_2048 883
++#define POWER9_PME_PM_INST_FROM_LL4 884
++#define POWER9_PME_PM_DATA_FROM_RL2L3_SHR 885
++#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST 886
++#define POWER9_PME_PM_LSU_FLUSH_WRK_ARND 887
++#define POWER9_PME_PM_L3_PF_HIT_L3 888
++#define POWER9_PME_PM_RD_FORMING_SC 889
++#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD_CYC 890
++#define POWER9_PME_PM_IPTEG_FROM_DL4 891
++#define POWER9_PME_PM_CMPLU_STALL_STORE_FINISH 892
++#define POWER9_PME_PM_IPTEG_FROM_LL4 893
++#define POWER9_PME_PM_1FLOP_CMPL 894
++#define POWER9_PME_PM_L2_GRP_GUESS_WRONG 895
++#define POWER9_PME_PM_TM_FAV_TBEGIN 896
++#define POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT 897
++#define POWER9_PME_PM_2FLOP_CMPL 898
++#define POWER9_PME_PM_LS2_TM_DISALLOW 899
++#define POWER9_PME_PM_L2_LD_DISP 900
++#define POWER9_PME_PM_CMPLU_STALL_LHS 901
++#define POWER9_PME_PM_TLB_HIT 902
++#define POWER9_PME_PM_HV_CYC 903
++#define POWER9_PME_PM_L2_RTY_LD 904
++#define POWER9_PME_PM_STCX_SUCCESS_CMPL 905
++#define POWER9_PME_PM_INST_PUMP_MPRED 906
++#define POWER9_PME_PM_LSU2_ERAT_HIT 907
++#define POWER9_PME_PM_INST_FROM_RL4 908
++#define POWER9_PME_PM_LD_L3MISS_PEND_CYC 909
++#define POWER9_PME_PM_L3_LAT_CI_MISS 910
++#define POWER9_PME_PM_MRK_FAB_RSP_RD_RTY 911
++#define POWER9_PME_PM_DTLB_MISS_16M 912
++#define POWER9_PME_PM_DPTEG_FROM_L2_1_MOD 913
++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR 914
++#define 
POWER9_PME_PM_MRK_LSU_FIN 915 ++#define POWER9_PME_PM_LSU0_STORE_REJECT 916 ++#define POWER9_PME_PM_CLB_HELD 917 ++#define POWER9_PME_PM_LS2_ERAT_MISS_PREF 918 ++static const pme_power_entry_t power9_pe[] = { ++[ POWER9_PME_PM_IERAT_RELOAD ] = { /* 0 */ ++ .pme_name = "PM_IERAT_RELOAD", ++ .pme_code = 0x00000100F6, ++ .pme_short_desc = "Number of I-ERAT reloads", ++ .pme_long_desc = "Number of I-ERAT reloads", ++}, ++[ POWER9_PME_PM_TM_OUTER_TEND ] = { /* 1 */ ++ .pme_name = "PM_TM_OUTER_TEND", ++ .pme_code = 0x0000002894, ++ .pme_short_desc = "Completion time outer tend", ++ .pme_long_desc = "Completion time outer tend", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3 ] = { /* 2 */ ++ .pme_name = "PM_IPTEG_FROM_L3", ++ .pme_code = 0x0000045042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_1_MOD ] = { /* 3 */ ++ .pme_name = "PM_DPTEG_FROM_L3_1_MOD", ++ .pme_code = 0x000002E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_PMC2_SAVED ] = { /* 4 */ ++ .pme_name = "PM_PMC2_SAVED", ++ .pme_code = 0x0000010022, ++ .pme_short_desc = "PMC2 Rewind Value saved", ++ .pme_long_desc = "PMC2 Rewind Value saved", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_SAO ] = { /* 5 */ ++ .pme_name = "PM_LSU_FLUSH_SAO", ++ .pme_code = 0x000000C0B8, ++ .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++ .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DFU ] = { /* 6 */ ++ .pme_name = "PM_CMPLU_STALL_DFU", ++ .pme_code = 0x000002D012, ++ .pme_short_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. 
Not qualified by multicycle", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS ] = { /* 7 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_RELAUNCH_MISS", ++ .pme_code = 0x000000D09C, ++ .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++ .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++}, ++[ POWER9_PME_PM_SP_FLOP_CMPL ] = { /* 8 */ ++ .pme_name = "PM_SP_FLOP_CMPL", ++ .pme_code = 0x000001505E, ++ .pme_short_desc = "Single-precision flop count", ++ .pme_long_desc = "Single-precision flop count", ++}, ++[ POWER9_PME_PM_IC_RELOAD_PRIVATE ] = { /* 9 */ ++ .pme_name = "PM_IC_RELOAD_PRIVATE", ++ .pme_code = 0x0000004894, ++ .pme_short_desc = "Reloading line was brought in private for a specific thread.", ++ .pme_long_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread. In P7 line brought in private, then line was invalidat", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 ] = { /* 10 */ ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L2", ++ .pme_code = 0x000001F058, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation", ++}, ++[ POWER9_PME_PM_INST_PUMP_CPRED ] = { /* 11 */ ++ .pme_name = "PM_INST_PUMP_CPRED", ++ .pme_code = 0x0000014054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for an instruction fetch", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_1_MOD ] = { /* 12 */ ++ .pme_name = "PM_INST_FROM_L2_1_MOD", ++ .pme_code = 0x0000044046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_ST_CMPL ] = { /* 13 */ ++ .pme_name = "PM_MRK_ST_CMPL", ++ .pme_code = 0x00000301E2, ++ .pme_short_desc = "Marked store completed and sent to nest", ++ .pme_long_desc = "Marked store completed and sent to nest", ++}, ++[ POWER9_PME_PM_MRK_LSU_DERAT_MISS ] = { /* 14 */ ++ .pme_name = "PM_MRK_LSU_DERAT_MISS", ++ .pme_code = 0x0000030162, ++ .pme_short_desc = "Marked derat reload (miss) for any page size", ++ .pme_long_desc = "Marked derat reload (miss) for any page size", ++}, ++[ POWER9_PME_PM_L2_ST_DISP ] = { /* 15 */ ++ .pme_name = "PM_L2_ST_DISP", ++ .pme_code = 0x000001689E, ++ .pme_short_desc = "All successful store dispatches", ++ .pme_long_desc = "All successful store dispatches", ++}, ++[ POWER9_PME_PM_LSU0_FALSE_LHS ] = { /* 16 */ ++ .pme_name = "PM_LSU0_FALSE_LHS", ++ .pme_code = 0x000000C0A0, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", ++}, ++[ POWER9_PME_PM_L2_CASTOUT_MOD ] = { /* 17 */ ++ .pme_name = "PM_L2_CASTOUT_MOD", ++ .pme_code = 0x0000016082, ++ .pme_short_desc = "L2 Castouts - Modified (M, Mu, Me)", ++ .pme_long_desc = "L2 Castouts - Modified (M, Mu, Me)", ++}, ++[ POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { /* 18 */ ++ .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", ++ .pme_code = 0x0000036884, ++ .pme_short_desc = "L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", ++ .pme_long_desc = "L2 RC store dispatch attempt 
failed due to address collision with RC/CO/SN/SQ", ++}, ++[ POWER9_PME_PM_MRK_INST_TIMEO ] = { /* 19 */ ++ .pme_name = "PM_MRK_INST_TIMEO", ++ .pme_code = 0x0000040134, ++ .pme_short_desc = "marked Instruction finish timeout (instruction lost)", ++ .pme_long_desc = "marked Instruction finish timeout (instruction lost)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH ] = { /* 20 */ ++ .pme_name = "PM_CMPLU_STALL_LOAD_FINISH", ++ .pme_code = 0x000004D014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_1_SHR ] = { /* 21 */ ++ .pme_name = "PM_INST_FROM_L2_1_SHR", ++ .pme_code = 0x0000034046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_LS1_DC_COLLISIONS ] = { /* 22 */ ++ .pme_name = "PM_LS1_DC_COLLISIONS", ++ .pme_code = 0x000000D890, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", ++}, ++[ POWER9_PME_PM_LSU2_FALSE_LHS ] = { /* 23 */ ++ .pme_name = "PM_LSU2_FALSE_LHS", ++ .pme_code = 0x000000C0A4, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", ++}, ++[ POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC ] = { /* 24 */ ++ .pme_name = "PM_MRK_ST_DRAIN_TO_L2DISP_CYC", ++ .pme_code = 0x000003F150, ++ .pme_short_desc = "cycles to drain st from core to L2", ++ .pme_long_desc = "cycles to drain st from core to L2", ++}, ++[ 
POWER9_PME_PM_MRK_DTLB_MISS_16M ] = { /* 25 */ ++ .pme_name = "PM_MRK_DTLB_MISS_16M", ++ .pme_code = 0x000004C15E, ++ .pme_short_desc = "Marked Data TLB Miss page size 16M", ++ .pme_long_desc = "Marked Data TLB Miss page size 16M", ++}, ++[ POWER9_PME_PM_L2_GROUP_PUMP ] = { /* 26 */ ++ .pme_name = "PM_L2_GROUP_PUMP", ++ .pme_code = 0x0000046888, ++ .pme_short_desc = "RC requests that were on Node Pump attempts", ++ .pme_long_desc = "RC requests that were on Node Pump attempts", ++}, ++[ POWER9_PME_PM_LSU2_VECTOR_ST_FIN ] = { /* 27 */ ++ .pme_name = "PM_LSU2_VECTOR_ST_FIN", ++ .pme_code = 0x000000C08C, ++ .pme_short_desc = "A vector store instruction finished.", ++ .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB ] = { /* 28 */ ++ .pme_name = "PM_CMPLU_STALL_LSAQ_ARB", ++ .pme_code = 0x000004E016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", ++}, ++[ POWER9_PME_PM_L3_CO_LCO ] = { /* 29 */ ++ .pme_name = "PM_L3_CO_LCO", ++ .pme_code = 0x00000360A4, ++ .pme_short_desc = "Total L3 castouts occurred on LCO", ++ .pme_long_desc = "Total L3 castouts occurred on LCO", ++}, ++[ POWER9_PME_PM_INST_GRP_PUMP_CPRED ] = { /* 30 */ ++ .pme_name = "PM_INST_GRP_PUMP_CPRED", ++ .pme_code = 0x000002C05C, ++ .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", ++ .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", ++}, ++[ 
POWER9_PME_PM_THRD_PRIO_4_5_CYC ] = { /* 31 */ ++ .pme_name = "PM_THRD_PRIO_4_5_CYC", ++ .pme_code = 0x0000005080, ++ .pme_short_desc = "Cycles thread running at priority level 4 or 5", ++ .pme_long_desc = "Cycles thread running at priority level 4 or 5", ++}, ++[ POWER9_PME_PM_BR_PRED_TA ] = { /* 32 */ ++ .pme_name = "PM_BR_PRED_TA", ++ .pme_code = 0x00000040B4, ++ .pme_short_desc = "Conditional Branch Completed that had its target address predicted.", ++ .pme_long_desc = "Conditional Branch Completed that had its target address predicted. Only XL-form branches set this event. This equals the sum of CCACHE, LSTACK, and PCACHE", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS ] = { /* 33 */ ++ .pme_name = "PM_ICT_NOSLOT_BR_MPRED_ICMISS", ++ .pme_code = 0x0000034058, ++ .pme_short_desc = "Ict empty for this thread due to Icache Miss and branch mispred", ++ .pme_long_desc = "Ict empty for this thread due to Icache Miss and branch mispred", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT ] = { /* 34 */ ++ .pme_name = "PM_IPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x0000015044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_FXU ] = { /* 35 */ ++ .pme_name = "PM_CMPLU_STALL_FXU", ++ .pme_code = 0x000002D016, ++ .pme_short_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline.", ++ .pme_long_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline. 
These instructions get routed to the ALU, ALU2, and DIV pipes", ++}, ++[ POWER9_PME_PM_VSU_FSQRT_FDIV ] = { /* 36 */ ++ .pme_name = "PM_VSU_FSQRT_FDIV", ++ .pme_code = 0x000004D04E, ++ .pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", ++ .pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", ++}, ++[ POWER9_PME_PM_EXT_INT ] = { /* 37 */ ++ .pme_name = "PM_EXT_INT", ++ .pme_code = 0x00000200F8, ++ .pme_short_desc = "external interrupt", ++ .pme_long_desc = "external interrupt", ++}, ++[ POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { /* 38 */ ++ .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", ++ .pme_code = 0x000001013E, ++ .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", ++ .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", ++}, ++[ POWER9_PME_PM_S2Q_FULL ] = { /* 39 */ ++ .pme_name = "PM_S2Q_FULL", ++ .pme_code = 0x000000E080, ++ .pme_short_desc = "Cycles during which the S2Q is full", ++ .pme_long_desc = "Cycles during which the S2Q is full", ++}, ++[ POWER9_PME_PM_RUN_CYC_SMT2_MODE ] = { /* 40 */ ++ .pme_name = "PM_RUN_CYC_SMT2_MODE", ++ .pme_code = 0x000003006C, ++ .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", ++ .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", ++}, ++[ POWER9_PME_PM_DECODE_LANES_NOT_AVAIL ] = { /* 41 */ ++ .pme_name = "PM_DECODE_LANES_NOT_AVAIL", ++ .pme_code = 0x0000005884, ++ .pme_short_desc = "Decode has something to transmit but dispatch lanes are not available", ++ .pme_long_desc = "Decode has something to transmit but dispatch lanes are not available", ++}, ++[ POWER9_PME_PM_TM_FAIL_TLBIE ] = { /* 42 */ ++ .pme_name = "PM_TM_FAIL_TLBIE", ++ .pme_code = 0x000000E0AC, ++ .pme_short_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", ++ .pme_long_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", 
++}, ++[ POWER9_PME_PM_DISP_CLB_HELD_BAL ] = { /* 43 */ ++ .pme_name = "PM_DISP_CLB_HELD_BAL", ++ .pme_code = 0x000000288C, ++ .pme_short_desc = "Dispatch/CLB Hold: Balance Flush", ++ .pme_long_desc = "Dispatch/CLB Hold: Balance Flush", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC ] = { /* 44 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3MISS_CYC", ++ .pme_code = 0x000001415E, ++ .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_ST_FWD ] = { /* 45 */ ++ .pme_name = "PM_MRK_ST_FWD", ++ .pme_code = 0x000003012C, ++ .pme_short_desc = "Marked st forwards", ++ .pme_long_desc = "Marked st forwards", ++}, ++[ POWER9_PME_PM_FXU_FIN ] = { /* 46 */ ++ .pme_name = "PM_FXU_FIN", ++ .pme_code = 0x0000040004, ++ .pme_short_desc = "The fixed point unit finished an instruction.", ++ .pme_long_desc = "The fixed point unit finished an instruction. Instructions that finish may not necessarily complete.", ++}, ++[ POWER9_PME_PM_SYNC_MRK_BR_MPRED ] = { /* 47 */ ++ .pme_name = "PM_SYNC_MRK_BR_MPRED", ++ .pme_code = 0x000001515C, ++ .pme_short_desc = "Marked Branch mispredict that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked Branch mispredict that can cause a synchronous interrupt", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB ] = { /* 48 */ ++ .pme_name = "PM_CMPLU_STALL_STORE_FIN_ARB", ++ .pme_code = 0x0000030014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe. 
This means the instruction is ready to finish but there are instructions ahead of it, using the finish pipe", ++}, ++[ POWER9_PME_PM_DSLB_MISS ] = { /* 49 */ ++ .pme_name = "PM_DSLB_MISS", ++ .pme_code = 0x000000D0A8, ++ .pme_short_desc = "Data SLB Miss - Total of all segment sizes", ++ .pme_long_desc = "Data SLB Miss - Total of all segment sizes", ++}, ++[ POWER9_PME_PM_L3_MISS ] = { /* 50 */ ++ .pme_name = "PM_L3_MISS", ++ .pme_code = 0x00000168A4, ++ .pme_short_desc = "L3 Misses", ++ .pme_long_desc = "L3 Misses", ++}, ++[ POWER9_PME_PM_DUMMY2_REMOVE_ME ] = { /* 51 */ ++ .pme_name = "PM_DUMMY2_REMOVE_ME", ++ .pme_code = 0x0000040064, ++ .pme_short_desc = "Space holder for LS_PC_RELOAD_RA", ++ .pme_long_desc = "Space holder for LS_PC_RELOAD_RA", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_1G ] = { /* 52 */ ++ .pme_name = "PM_MRK_DERAT_MISS_1G", ++ .pme_code = 0x000003D152, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G.", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", ++}, ++[ POWER9_PME_PM_MATH_FLOP_CMPL ] = { /* 53 */ ++ .pme_name = "PM_MATH_FLOP_CMPL", ++ .pme_code = 0x0000010066, ++ .pme_short_desc = "", ++ .pme_long_desc = "", ++}, ++[ POWER9_PME_PM_L2_INST ] = { /* 54 */ ++ .pme_name = "PM_L2_INST", ++ .pme_code = 0x000003609E, ++ .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", ++}, ++[ POWER9_PME_PM_FLUSH_DISP ] = { /* 55 */ ++ .pme_name = "PM_FLUSH_DISP", ++ .pme_code = 0x0000002880, ++ .pme_short_desc = "Dispatch flush", ++ .pme_long_desc = "Dispatch flush", ++}, ++[ POWER9_PME_PM_DISP_HELD_ISSQ_FULL ] = { /* 56 */ ++ .pme_name = "PM_DISP_HELD_ISSQ_FULL", ++ .pme_code = 0x0000020006, ++ .pme_short_desc = "Dispatch held due to Issue q full.", ++ .pme_long_desc = "Dispatch held due to Issue q full. 
Includes issue queue and branch queue", ++}, ++[ POWER9_PME_PM_MEM_READ ] = { /* 57 */ ++ .pme_name = "PM_MEM_READ", ++ .pme_code = 0x0000010056, ++ .pme_short_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch).", ++ .pme_long_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). Includes L4", ++}, ++[ POWER9_PME_PM_DATA_PUMP_MPRED ] = { /* 58 */ ++ .pme_name = "PM_DATA_PUMP_MPRED", ++ .pme_code = 0x000004C052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump misprediction. Counts across all types of pumps for a demand load", ++}, ++[ POWER9_PME_PM_DATA_CHIP_PUMP_CPRED ] = { /* 59 */ ++ .pme_name = "PM_DATA_CHIP_PUMP_CPRED", ++ .pme_code = 0x000001C050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DMEM ] = { /* 60 */ ++ .pme_name = "PM_MRK_DATA_FROM_DMEM", ++ .pme_code = 0x000003D14C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU ] = { /* 61 */ ++ .pme_name = "PM_CMPLU_STALL_LSU", ++ .pme_code = 0x000002C010, ++ .pme_short_desc = "Completion stall by LSU instruction", ++ .pme_long_desc = "Completion stall by LSU instruction", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_1_MOD ] = { /* 62 */ ++ .pme_name = "PM_DATA_FROM_L3_1_MOD", ++ .pme_code = 0x000002C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data 
from another core's L3 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_16M ] = { /* 63 */ ++ .pme_name = "PM_MRK_DERAT_MISS_16M", ++ .pme_code = 0x000003D154, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", ++}, ++[ POWER9_PME_PM_TM_TRANS_RUN_CYC ] = { /* 64 */ ++ .pme_name = "PM_TM_TRANS_RUN_CYC", ++ .pme_code = 0x0000010060, ++ .pme_short_desc = "run cycles in transactional state", ++ .pme_long_desc = "run cycles in transactional state", ++}, ++[ POWER9_PME_PM_THRD_ALL_RUN_CYC ] = { /* 65 */ ++ .pme_name = "PM_THRD_ALL_RUN_CYC", ++ .pme_code = 0x0000020008, ++ .pme_short_desc = "Cycles in which all the threads have the run latch set", ++ .pme_long_desc = "Cycles in which all the threads have the run latch set", ++}, ++[ POWER9_PME_PM_DATA_FROM_DL2L3_MOD ] = { /* 66 */ ++ .pme_name = "PM_DATA_FROM_DL2L3_MOD", ++ .pme_code = 0x000004C048, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_BR_MPRED_CMPL ] = { /* 67 */ ++ .pme_name = "PM_MRK_BR_MPRED_CMPL", ++ .pme_code = 0x00000301E4, ++ .pme_short_desc = "Marked Branch Mispredicted", ++ .pme_long_desc = "Marked Branch Mispredicted", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ ] = { /* 68 */ ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_ISSQ", ++ .pme_code = 0x000002D01E, ++ .pme_short_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", ++ .pme_long_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ 
full, XVCF Full, Count cache, Link, Tar full", ++}, ++[ POWER9_PME_PM_MRK_INST ] = { /* 69 */ ++ .pme_name = "PM_MRK_INST", ++ .pme_code = 0x0000024058, ++ .pme_short_desc = "An instruction was marked.", ++ .pme_long_desc = "An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens", ++}, ++[ POWER9_PME_PM_TABLEWALK_CYC_PREF ] = { /* 70 */ ++ .pme_name = "PM_TABLEWALK_CYC_PREF", ++ .pme_code = 0x000000F884, ++ .pme_short_desc = "tablewalk qualified for pte prefetches", ++ .pme_long_desc = "tablewalk qualified for pte prefetches", ++}, ++[ POWER9_PME_PM_LSU1_ERAT_HIT ] = { /* 71 */ ++ .pme_name = "PM_LSU1_ERAT_HIT", ++ .pme_code = 0x000000E88C, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", ++}, ++[ POWER9_PME_PM_NTC_ISSUE_HELD_OTHER ] = { /* 72 */ ++ .pme_name = "PM_NTC_ISSUE_HELD_OTHER", ++ .pme_code = 0x000003D05A, ++ .pme_short_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", ++ .pme_long_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT ] = { /* 73 */ ++ .pme_name = "PM_CMPLU_STALL_LSU_FLUSH_NEXT", ++ .pme_code = 0x000002E01A, ++ .pme_short_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence.", ++ .pme_long_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence. 
It takes 1 cycle for the ISU to process this request before the LSU instruction is allowed to complete", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2 ] = { /* 74 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2", ++ .pme_code = 0x000001F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_LS1_TM_DISALLOW ] = { /* 75 */ ++ .pme_name = "PM_LS1_TM_DISALLOW", ++ .pme_code = 0x000000E8B4, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 76 */ ++ .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x0000034040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_BR_PRED_PCACHE ] = { /* 77 */ ++ .pme_name = "PM_BR_PRED_PCACHE", ++ .pme_code = 0x00000048A0, ++ .pme_short_desc = "Conditional branch completed that used pattern cache prediction", ++ .pme_long_desc = "Conditional branch completed that used pattern cache prediction", ++}, ++[ POWER9_PME_PM_MRK_BACK_BR_CMPL ] = { /* 78 */ ++ .pme_name = "PM_MRK_BACK_BR_CMPL", ++ .pme_code = 0x000003515E, ++ .pme_short_desc = "Marked branch instruction completed with a target address less than current instruction address", ++ .pme_long_desc = "Marked branch instruction completed 
with a target address less than current instruction address", ++}, ++[ POWER9_PME_PM_RD_CLEARING_SC ] = { /* 79 */ ++ .pme_name = "PM_RD_CLEARING_SC", ++ .pme_code = 0x00000468A6, ++ .pme_short_desc = "rd clearing sc", ++ .pme_long_desc = "rd clearing sc", ++}, ++[ POWER9_PME_PM_PMC1_OVERFLOW ] = { /* 80 */ ++ .pme_name = "PM_PMC1_OVERFLOW", ++ .pme_code = 0x0000020010, ++ .pme_short_desc = "Overflow from counter 1", ++ .pme_long_desc = "Overflow from counter 1", ++}, ++[ POWER9_PME_PM_L2_RTY_ST ] = { /* 81 */ ++ .pme_name = "PM_L2_RTY_ST", ++ .pme_code = 0x000004689E, ++ .pme_short_desc = "RC retries on PB for any store from core", ++ .pme_long_desc = "RC retries on PB for any store from core", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT ] = { /* 82 */ ++ .pme_name = "PM_IPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x0000015040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", ++}, ++[ POWER9_PME_PM_LSU1_FALSE_LHS ] = { /* 83 */ ++ .pme_name = "PM_LSU1_FALSE_LHS", ++ .pme_code = 0x000000C8A0, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", ++}, ++[ POWER9_PME_PM_LSU0_VECTOR_ST_FIN ] = { /* 84 */ ++ .pme_name = "PM_LSU0_VECTOR_ST_FIN", ++ .pme_code = 0x000000C088, ++ .pme_short_desc = "A vector store instruction finished.", ++ .pme_long_desc = "A vector store instruction finished. 
The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++}, ++[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH ] = { /* 85 */ ++ .pme_name = "PM_MEM_LOC_THRESH_LSU_HIGH", ++ .pme_code = 0x0000040056, ++ .pme_short_desc = "Local memory above threshold for LSU medium", ++ .pme_long_desc = "Local memory above threshold for LSU medium", ++}, ++[ POWER9_PME_PM_LS2_UNALIGNED_LD ] = { /* 86 */ ++ .pme_name = "PM_LS2_UNALIGNED_LD", ++ .pme_code = 0x000000C098, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_BR_TAKEN_CMPL ] = { /* 87 */ ++ .pme_name = "PM_BR_TAKEN_CMPL", ++ .pme_code = 0x00000200FA, ++ .pme_short_desc = "New event for Branch Taken", ++ .pme_long_desc = "New event for Branch Taken", ++}, ++[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED ] = { /* 88 */ ++ .pme_name = "PM_DATA_SYS_PUMP_MPRED", ++ .pme_code = 0x000003C052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load", ++}, ++[ POWER9_PME_PM_ISQ_36_44_ENTRIES ] = { /* 89 */ ++ .pme_name = "PM_ISQ_36_44_ENTRIES", ++ .pme_code = 0x000004000A, ++ .pme_short_desc = "Cycles in which 36 or more Issue Queue entries are in use.", ++ .pme_long_desc = "Cycles in which 36 or more Issue Queue entries are in use. This is a shared event, not per thread. 
There are 44 issue queue entries across 4 slices in the whole core", ++}, ++[ POWER9_PME_PM_LSU1_VECTOR_LD_FIN ] = { /* 90 */ ++ .pme_name = "PM_LSU1_VECTOR_LD_FIN", ++ .pme_code = 0x000000C880, ++ .pme_short_desc = "A vector load instruction finished.", ++ .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 91 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x000002C124, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_IC_MISS ] = { /* 92 */ ++ .pme_name = "PM_ICT_NOSLOT_IC_MISS", ++ .pme_code = 0x000002D01A, ++ .pme_short_desc = "Ict empty for this thread due to Icache Miss", ++ .pme_long_desc = "Ict empty for this thread due to Icache Miss", ++}, ++[ POWER9_PME_PM_LSU3_TM_L1_HIT ] = { /* 93 */ ++ .pme_name = "PM_LSU3_TM_L1_HIT", ++ .pme_code = 0x000000E898, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", ++}, ++[ POWER9_PME_PM_MRK_INST_DISP ] = { /* 94 */ ++ .pme_name = "PM_MRK_INST_DISP", ++ .pme_code = 0x00000101E0, ++ .pme_short_desc = "The thread has dispatched a randomly sampled marked instruction", ++ .pme_long_desc = "The thread has dispatched a randomly sampled marked instruction", ++}, ++[ POWER9_PME_PM_VECTOR_FLOP_CMPL ] = { /* 95 */ ++ .pme_name = "PM_VECTOR_FLOP_CMPL", ++ .pme_code = 0x000004D058, ++ .pme_short_desc = "Vector flop instruction completed", ++ .pme_long_desc = "Vector flop instruction completed", ++}, ++[ POWER9_PME_PM_FXU_IDLE ] = { /* 96 */ ++ .pme_name = "PM_FXU_IDLE", ++ .pme_code = 0x0000024052, ++ .pme_short_desc = "Cycles in which 
FXU0, FXU1, FXU2, and FXU3 are all idle",
++ .pme_long_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle",
++},
++[ POWER9_PME_PM_INST_CMPL ] = { /* 97 */
++ .pme_name = "PM_INST_CMPL",
++ .pme_code = 0x0000010002,
++ .pme_short_desc = "# PPC instructions completed",
++ .pme_long_desc = "# PPC instructions completed",
++},
++[ POWER9_PME_PM_EAT_FORCE_MISPRED ] = { /* 98 */
++ .pme_name = "PM_EAT_FORCE_MISPRED",
++ .pme_code = 0x00000050A8,
++ .pme_short_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT.",
++ .pme_long_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate. This is a rare case that may occur when the EAT is full and a branch is issued",
++},
++[ POWER9_PME_PM_CMPLU_STALL_LRQ_FULL ] = { /* 99 */
++ .pme_name = "PM_CMPLU_STALL_LRQ_FULL",
++ .pme_code = 0x000002D014,
++ .pme_short_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ because the LRQ was full",
++ .pme_long_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ because the LRQ was full",
++},
++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { /* 100 */
++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD",
++ .pme_code = 0x000003D14E,
++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load",
++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load",
++},
++[ POWER9_PME_PM_BACK_BR_CMPL ] = { /* 101 */
++ .pme_name = "PM_BACK_BR_CMPL",
++ .pme_code = 0x000002505E,
++ .pme_short_desc = "Branch instruction completed with a target address less than current instruction address",
++ .pme_long_desc = "Branch 
instruction completed with a target address less than current instruction address", ++}, ++[ POWER9_PME_PM_NEST_REF_CLK ] = { /* 102 */ ++ .pme_name = "PM_NEST_REF_CLK", ++ .pme_code = 0x000003006E, ++ .pme_short_desc = "Multiply by 4 to obtain the number of PB cycles", ++ .pme_long_desc = "Multiply by 4 to obtain the number of PB cycles", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR ] = { /* 103 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_RC_USAGE ] = { /* 104 */ ++ .pme_name = "PM_RC_USAGE", ++ .pme_code = 0x000001688E, ++ .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++ .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_MOD ] = { /* 105 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x000004F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_BR_CMPL ] = { /* 106 */ ++ .pme_name = "PM_BR_CMPL", ++ .pme_code = 0x0000010012, ++ .pme_short_desc = "Branch Instruction completed", ++ .pme_long_desc = "Branch Instruction completed", ++}, ++[ POWER9_PME_PM_INST_FROM_RL2L3_MOD ] = { /* 107 */ ++ .pme_name = "PM_INST_FROM_RL2L3_MOD", ++ .pme_code = 0x0000024046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_SHL_CREATED ] = { /* 108 */ ++ .pme_name = "PM_SHL_CREATED", ++ .pme_code = 0x000000508C, ++ .pme_short_desc = "Store-Hit-Load Table Entry Created", ++ .pme_long_desc = "Store-Hit-Load Table Entry Created", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_PASTE ] = { /* 109 */ ++ .pme_name = "PM_CMPLU_STALL_PASTE", ++ .pme_code = 0x000002C016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", ++}, ++[ POWER9_PME_PM_LSU3_LDMX_FIN ] = { /* 110 */ ++ .pme_name = "PM_LSU3_LDMX_FIN", ++ .pme_code = 0x000000D88C, ++ .pme_short_desc = " New P9 instruction LDMX.", ++ .pme_long_desc = " New P9 instruction LDMX.", ++}, ++[ POWER9_PME_PM_SN_USAGE ] = { /* 111 */ ++ .pme_name = "PM_SN_USAGE", ++ .pme_code = 0x000003688E, ++ .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++ .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++}, ++[ POWER9_PME_PM_L2_ST_HIT ] = { /* 112 */ ++ .pme_name = "PM_L2_ST_HIT", ++ 
.pme_code = 0x000002689E, ++ .pme_short_desc = "All successful store dispatches that were L2Hits", ++ .pme_long_desc = "All successful store dispatches that were L2Hits", ++}, ++[ POWER9_PME_PM_DATA_FROM_DMEM ] = { /* 113 */ ++ .pme_name = "PM_DATA_FROM_DMEM", ++ .pme_code = 0x000004C04C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE ] = { /* 114 */ ++ .pme_name = "PM_CMPLU_STALL_DMISS_REMOTE", ++ .pme_code = 0x000002C01C, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", ++}, ++[ POWER9_PME_PM_LSU2_LDMX_FIN ] = { /* 115 */ ++ .pme_name = "PM_LSU2_LDMX_FIN", ++ .pme_code = 0x000000D08C, ++ .pme_short_desc = " New P9 instruction LDMX.", ++ .pme_long_desc = " New P9 instruction LDMX.", ++}, ++[ POWER9_PME_PM_L3_LD_MISS ] = { /* 116 */ ++ .pme_name = "PM_L3_LD_MISS", ++ .pme_code = 0x00000268A4, ++ .pme_short_desc = "L3 demand LD Miss", ++ .pme_long_desc = "L3 demand LD Miss", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_RL4 ] = { /* 117 */ ++ .pme_name = "PM_DPTEG_FROM_RL4", ++ .pme_code = 0x000002E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 ] = { /* 118 */ ++ .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L2", ++ .pme_code = 0x000002D02A, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC ] = { /* 119 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL4_CYC", ++ .pme_code = 0x000004D12A, ++ .pme_short_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++}, ++[ POWER9_PME_PM_TM_SC_CO ] = { /* 120 */ ++ .pme_name = "PM_TM_SC_CO", ++ .pme_code = 0x00000160A6, ++ .pme_short_desc = "l3 castout tm Sc line", ++ .pme_long_desc = "l3 castout tm Sc line", ++}, ++[ POWER9_PME_PM_L2_SN_SX_I_DONE ] = { /* 121 */ ++ .pme_name = "PM_L2_SN_SX_I_DONE", ++ .pme_code = 0x0000036886, ++ .pme_short_desc = "SNP dispatched and went from Sx or Tx to Ix", ++ .pme_long_desc = "SNP dispatched and went from Sx or Tx to Ix", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT ] = { /* 122 */ ++ .pme_name = "PM_DPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_ISIDE_L2MEMACC ] = { /* 123 */ ++ .pme_name = "PM_ISIDE_L2MEMACC", ++ .pme_code = 0x0000026890, ++ .pme_short_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", ++ .pme_long_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", ++}, ++[ POWER9_PME_PM_L3_P0_GRP_PUMP ] = { /* 124 */ ++ .pme_name = "PM_L3_P0_GRP_PUMP", ++ .pme_code = 0x00000260B0, ++ .pme_short_desc = "L3 pf sent with grp scope port 0", ++ .pme_long_desc = "L3 pf sent with grp scope port 0", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR ] = { /* 125 */ ++ .pme_name = "PM_IPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x0000035048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 ] = { /* 126 */ ++ .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L3", ++ .pme_code = 0x000001F15C, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", ++}, ++[ POWER9_PME_PM_THRESH_MET ] = { /* 127 */ ++ .pme_name = "PM_THRESH_MET", ++ .pme_code = 0x00000101EC, ++ .pme_short_desc = "threshold exceeded", ++ .pme_long_desc = "threshold exceeded", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_MEPF ] = { /* 128 */ ++ .pme_name = "PM_DATA_FROM_L2_MEPF", ++ .pme_code = 0x000002C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", 
++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", ++}, ++[ POWER9_PME_PM_DISP_STARVED ] = { /* 129 */ ++ .pme_name = "PM_DISP_STARVED", ++ .pme_code = 0x0000030008, ++ .pme_short_desc = "Dispatched Starved", ++ .pme_long_desc = "Dispatched Starved", ++}, ++[ POWER9_PME_PM_L3_P0_LCO_RTY ] = { /* 130 */ ++ .pme_name = "PM_L3_P0_LCO_RTY", ++ .pme_code = 0x00000160B4, ++ .pme_short_desc = "L3 lateral cast out received retry on port 0", ++ .pme_long_desc = "L3 lateral cast out received retry on port 0", ++}, ++[ POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL ] = { /* 131 */ ++ .pme_name = "PM_NTC_ISSUE_HELD_DARQ_FULL", ++ .pme_code = 0x000001006A, ++ .pme_short_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", ++ .pme_long_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", ++}, ++[ POWER9_PME_PM_L3_RD_USAGE ] = { /* 132 */ ++ .pme_name = "PM_L3_RD_USAGE", ++ .pme_code = 0x00000268AC, ++ .pme_short_desc = "rotating sample of 16 RD actives", ++ .pme_long_desc = "rotating sample of 16 RD actives", ++}, ++[ POWER9_PME_PM_TLBIE_FIN ] = { /* 133 */ ++ .pme_name = "PM_TLBIE_FIN", ++ .pme_code = 0x0000030058, ++ .pme_short_desc = "tlbie finished", ++ .pme_long_desc = "tlbie finished", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_LL4 ] = { /* 134 */ ++ .pme_name = "PM_DPTEG_FROM_LL4", ++ .pme_code = 0x000001E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included",
++},
++[ POWER9_PME_PM_CMPLU_STALL_TLBIE ] = { /* 135 */
++ .pme_name = "PM_CMPLU_STALL_TLBIE",
++ .pme_code = 0x000002E01C,
++ .pme_short_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2",
++ .pme_long_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2",
++},
++[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC ] = { /* 136 */
++ .pme_name = "PM_MRK_DATA_FROM_L2MISS_CYC",
++ .pme_code = 0x0000035152,
++ .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load",
++ .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load",
++},
++[ POWER9_PME_PM_LS3_DC_COLLISIONS ] = { /* 137 */
++ .pme_name = "PM_LS3_DC_COLLISIONS",
++ .pme_code = 0x000000D894,
++ .pme_short_desc = "Read-write data cache collisions",
++ .pme_long_desc = "Read-write data cache collisions",
++},
++[ POWER9_PME_PM_L1_ICACHE_MISS ] = { /* 138 */
++ .pme_name = "PM_L1_ICACHE_MISS",
++ .pme_code = 0x00000200FD,
++ .pme_short_desc = "Demand iCache Miss",
++ .pme_long_desc = "Demand iCache Miss",
++},
++[ POWER9_PME_PM_LSU_REJECT_ERAT_MISS ] = { /* 139 */
++ .pme_name = "PM_LSU_REJECT_ERAT_MISS",
++ .pme_code = 0x000002E05C,
++ .pme_short_desc = "LSU Reject due to ERAT (up to 4 per cycle)",
++ .pme_long_desc = "LSU Reject due to ERAT (up to 4 per cycle)",
++},
++[ POWER9_PME_PM_DATA_SYS_PUMP_CPRED ] = { /* 140 */
++ .pme_name = "PM_DATA_SYS_PUMP_CPRED",
++ .pme_code = 0x000003C050,
++ .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load",
++ .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load",
++},
++[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC ] = { /* 141 */
++ .pme_name = "PM_MRK_FAB_RSP_RWITM_CYC",
++ .pme_code = 0x000004F150,
++ .pme_short_desc = "cycles L2 RC took for a rwitm",
++ 
.pme_long_desc = "cycles L2 RC took for a rwitm", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR_CYC ] = { /* 142 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_SHR_CYC", ++ .pme_code = 0x0000035156, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_UE ] = { /* 143 */ ++ .pme_name = "PM_LSU_FLUSH_UE", ++ .pme_code = 0x000000C0B4, ++ .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++ .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++}, ++[ POWER9_PME_PM_BR_PRED_TAKEN_CR ] = { /* 144 */ ++ .pme_name = "PM_BR_PRED_TAKEN_CR", ++ .pme_code = 0x00000040B0, ++ .pme_short_desc = "Conditional Branch that had its direction predicted.", ++ .pme_long_desc = "Conditional Branch that had its direction predicted. I-form branches do not set this event. 
In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 145 */ ++ .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x0000044040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR ] = { /* 146 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x000003F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_1_MOD ] = { /* 147 */ ++ .pme_name = "PM_DATA_FROM_L2_1_MOD", ++ .pme_code = 0x000004C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_LHL_SHL ] = { /* 148 */ ++ .pme_name = "PM_LSU_FLUSH_LHL_SHL", ++ .pme_code = 0x000000C8B4, ++ .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", ++ .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", ++}, ++[ POWER9_PME_PM_L3_P1_PF_RTY ] = { /* 149 */ ++ .pme_name = "PM_L3_P1_PF_RTY", ++ .pme_code = 0x00000268AE, ++ .pme_short_desc = "L3 PF received retry port 3", ++ .pme_long_desc = "L3 PF received retry port 3", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD ] = { /* 150 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x000004F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DFU_BUSY ] = { /* 151 */ ++ .pme_name = "PM_DFU_BUSY", ++ .pme_code = 0x000004D04C, ++ .pme_short_desc = "Cycles in which all 4 Decimal Floating Point units are busy.", ++ .pme_long_desc = "Cycles in which all 4 Decimal Floating Point units are busy. The DFU is running at capacity", ++}, ++[ POWER9_PME_PM_LSU1_TM_L1_MISS ] = { /* 152 */ ++ .pme_name = "PM_LSU1_TM_L1_MISS", ++ .pme_code = 0x000000E89C, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", ++}, ++[ POWER9_PME_PM_FREQ_UP ] = { /* 153 */ ++ .pme_name = "PM_FREQ_UP", ++ .pme_code = 0x000004000C, ++ .pme_short_desc = "Power Management: Above Threshold A", ++ .pme_long_desc = "Power Management: Above Threshold A", ++}, ++[ POWER9_PME_PM_DATA_FROM_LMEM ] = { /* 154 */ ++ .pme_name = "PM_DATA_FROM_LMEM", ++ .pme_code = 0x000002C048, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF ] = { /* 155 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_MEPF", ++ .pme_code = 0x000004C120, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked load", ++}, ++[ POWER9_PME_PM_ISIDE_DISP ] = { /* 156 */ ++ .pme_name = "PM_ISIDE_DISP", ++ .pme_code = 0x000001688A, ++ .pme_short_desc = "All i-side dispatch attempts", ++ .pme_long_desc = "All i-side dispatch attempts", ++}, ++[ POWER9_PME_PM_TM_OUTER_TBEGIN ] = { /* 157 */ ++ .pme_name = "PM_TM_OUTER_TBEGIN", ++ .pme_code = 0x0000002094, ++ .pme_short_desc = "Completion time outer tbegin", ++ .pme_long_desc = "Completion time outer tbegin", ++}, ++[ POWER9_PME_PM_PMC3_OVERFLOW ] = { /* 158 */ ++ .pme_name = "PM_PMC3_OVERFLOW", ++ .pme_code = 0x0000040010, ++ .pme_short_desc = "Overflow from counter 3", ++ .pme_long_desc = "Overflow from counter 3", ++}, ++[ POWER9_PME_PM_LSU0_SET_MPRED ] = { /* 159 */ ++ .pme_name = "PM_LSU0_SET_MPRED", ++ .pme_code = 0x000000D080, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_MEPF ] = { /* 160 */ ++ .pme_name = "PM_INST_FROM_L2_MEPF", ++ .pme_code = 0x0000024040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_L3_P0_NODE_PUMP ] = { /* 161 */ ++ .pme_name = "PM_L3_P0_NODE_PUMP", ++ .pme_code = 0x00000160B0, ++ .pme_short_desc = "L3 pf sent with nodal scope port 0", ++ .pme_long_desc = "L3 pf sent with nodal scope port 0", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_1_MOD ] = { /* 162 */ ++ .pme_name = "PM_IPTEG_FROM_L3_1_MOD", ++ .pme_code = 0x0000025044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", ++}, ++[ POWER9_PME_PM_L3_PF_USAGE ] = { /* 163 */ ++ .pme_name = "PM_L3_PF_USAGE", ++ .pme_code = 0x00000260AC, ++ .pme_short_desc = "rotating sample of 32 PF actives", ++ .pme_long_desc = "rotating sample of 32 PF actives", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_BRU ] = { /* 164 */ ++ .pme_name = "PM_CMPLU_STALL_BRU", ++ .pme_code = 0x000004D018, ++ .pme_short_desc = "Completion stall due to a Branch Unit", ++ .pme_long_desc = "Completion stall due to a Branch Unit", ++}, ++[ POWER9_PME_PM_ISLB_MISS ] = { /* 165 */ ++ .pme_name = "PM_ISLB_MISS", ++ .pme_code = 0x000000D8A8, ++ .pme_short_desc = "Instruction SLB Miss - Total of all segment sizes", ++ .pme_long_desc = "Instruction SLB Miss - Total of all segment sizes", ++}, ++[ POWER9_PME_PM_CYC ] = { /* 166 */ ++ .pme_name = "PM_CYC", ++ .pme_code = 0x000001001E, ++ .pme_short_desc = "Cycles", ++ .pme_long_desc = "Cycles", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR ] = { /* 167 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_SHR", ++ .pme_code = 0x000004D124, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another 
core's L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD ] = { /* 168 */ ++ .pme_name = "PM_IPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x0000025046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++}, ++[ POWER9_PME_PM_DARQ_10_12_ENTRIES ] = { /* 169 */ ++ .pme_name = "PM_DARQ_10_12_ENTRIES", ++ .pme_code = 0x000001D058, ++ .pme_short_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC ] = { /* 170 */ ++ .pme_name = "PM_LSU2_3_LRQF_FULL_CYC", ++ .pme_code = 0x000000D8BC, ++ .pme_short_desc = "Counts the number of cycles the LRQF is full.", ++ .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", ++}, ++[ POWER9_PME_PM_DECODE_FUSION_OP_PRESERV ] = { /* 171 */ ++ .pme_name = "PM_DECODE_FUSION_OP_PRESERV", ++ .pme_code = 0x0000005088, ++ .pme_short_desc = "Destructive op operand preservation", ++ .pme_long_desc = "Destructive op operand preservation", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF ] = { /* 172 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_MEPF", ++ .pme_code = 0x000002F140, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included",
++},
++[ POWER9_PME_PM_MRK_L1_RELOAD_VALID ] = { /* 173 */
++ .pme_name = "PM_MRK_L1_RELOAD_VALID",
++ .pme_code = 0x00000101EA,
++ .pme_short_desc = "Marked demand reload",
++ .pme_long_desc = "Marked demand reload",
++},
++[ POWER9_PME_PM_LSU2_SET_MPRED ] = { /* 174 */
++ .pme_name = "PM_LSU2_SET_MPRED",
++ .pme_code = 0x000000D084,
++ .pme_short_desc = "Set prediction(set-p) miss.",
++ .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table",
++},
++[ POWER9_PME_PM_1PLUS_PPC_CMPL ] = { /* 175 */
++ .pme_name = "PM_1PLUS_PPC_CMPL",
++ .pme_code = 0x00000100F2,
++ .pme_short_desc = "1 or more ppc insts finished",
++ .pme_long_desc = "1 or more ppc insts finished",
++},
++[ POWER9_PME_PM_DATA_FROM_LL4 ] = { /* 176 */
++ .pme_name = "PM_DATA_FROM_LL4",
++ .pme_code = 0x000001C04C,
++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load",
++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load",
++},
++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS ] = { /* 177 */
++ .pme_name = "PM_CMPLU_STALL_DMISS_L3MISS",
++ .pme_code = 0x000004C01A,
++ .pme_short_desc = "Completion stall due to cache miss resolving missed the L3",
++ .pme_long_desc = "Completion stall due to cache miss resolving missed the L3",
++},
++[ POWER9_PME_PM_TM_CAP_OVERFLOW ] = { /* 178 */
++ .pme_name = "PM_TM_CAP_OVERFLOW",
++ .pme_code = 0x000004608C,
++ .pme_short_desc = "TM Footprint Capacity Overflow",
++ .pme_long_desc = "TM Footprint Capacity Overflow",
++},
++[ POWER9_PME_PM_MRK_DPTEG_FROM_LMEM ] = { /* 179 */
++ .pme_name = "PM_MRK_DPTEG_FROM_LMEM",
++ .pme_code = 0x000002F148,
++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.",
++ 
.pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included",
++},
++[ POWER9_PME_PM_LSU3_FALSE_LHS ] = { /* 180 */
++ .pme_name = "PM_LSU3_FALSE_LHS",
++ .pme_code = 0x000000C8A4,
++ .pme_short_desc = "False LHS match detected",
++ .pme_long_desc = "False LHS match detected",
++},
++[ POWER9_PME_PM_THRESH_EXC_512 ] = { /* 181 */
++ .pme_name = "PM_THRESH_EXC_512",
++ .pme_code = 0x00000201E8,
++ .pme_short_desc = "Threshold counter exceeded a value of 512",
++ .pme_long_desc = "Threshold counter exceeded a value of 512",
++},
++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 ] = { /* 182 */
++ .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L2",
++ .pme_code = 0x000002D026,
++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache",
++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache",
++},
++[ POWER9_PME_PM_HWSYNC ] = { /* 183 */
++ .pme_name = "PM_HWSYNC",
++ .pme_code = 0x00000050A0,
++ .pme_short_desc = "Hwsync instruction decoded and transferred",
++ .pme_long_desc = "Hwsync instruction decoded and transferred",
++},
++[ POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW ] = { /* 184 */
++ .pme_name = "PM_TM_FAIL_FOOTPRINT_OVERFLOW",
++ .pme_code = 0x00000020A8,
++ .pme_short_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.",
++ .pme_long_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded. 
Asynchronous", ++}, ++[ POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY ] = { /* 185 */ ++ .pme_name = "PM_INST_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x0000044050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL ] = { /* 186 */ ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_HB_FULL", ++ .pme_code = 0x0000030018, ++ .pme_short_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full.", ++ .pme_long_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full. Could be GPR/VSR/VMR/FPR/CR/XVF", ++}, ++[ POWER9_PME_PM_DC_DEALLOC_NO_CONF ] = { /* 187 */ ++ .pme_name = "PM_DC_DEALLOC_NO_CONF", ++ .pme_code = 0x000000F8AC, ++ .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_VFXLONG ] = { /* 188 */ ++ .pme_name = "PM_CMPLU_STALL_VFXLONG", ++ .pme_code = 0x000002E018, ++ .pme_short_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", ++ .pme_long_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", ++}, ++[ POWER9_PME_PM_MEM_LOC_THRESH_IFU ] = { /* 189 */ ++ .pme_name = "PM_MEM_LOC_THRESH_IFU", ++ .pme_code = 0x0000010058, ++ .pme_short_desc = "Local Memory above threshold for IFU speculation control", ++ .pme_long_desc = "Local Memory above threshold for IFU speculation control", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_CYC ] = { /* 190 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_CYC", ++ .pme_code = 0x0000035154, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 due to a marked load", ++}, ++[ POWER9_PME_PM_PTE_PREFETCH ] = { /* 191 */ ++ .pme_name = "PM_PTE_PREFETCH", ++ .pme_code = 0x000000F084, ++ .pme_short_desc = "PTE prefetches", ++ .pme_long_desc = "PTE prefetches", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB ] = { /* 192 */ ++ .pme_name = "PM_CMPLU_STALL_STORE_PIPE_ARB", ++ .pme_code = 0x000004C010, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject. 
This means the instruction is ready to relaunch and tried once but lost arbitration", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_SLB ] = { /* 193 */ ++ .pme_name = "PM_CMPLU_STALL_SLB", ++ .pme_code = 0x000001E052, ++ .pme_short_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", ++ .pme_long_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_4K ] = { /* 194 */ ++ .pme_name = "PM_MRK_DERAT_MISS_4K", ++ .pme_code = 0x000002D150, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR ] = { /* 195 */ ++ .pme_name = "PM_CMPLU_STALL_LSU_MFSPR", ++ .pme_code = 0x0000034056, ++ .pme_short_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", ++ .pme_long_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_SHR ] = { /* 196 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x000003F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_VSU_DP_FSQRT_FDIV ] = { /* 197 */ ++ .pme_name = "PM_VSU_DP_FSQRT_FDIV", ++ .pme_code = 0x000003D058, ++ .pme_short_desc = "vector versions of fdiv,fsqrt", ++ .pme_long_desc = "vector versions of fdiv,fsqrt", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_SHR ] = { /* 198 */ ++ .pme_name = "PM_IPTEG_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x0000035044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_L3_P0_LCO_DATA ] = { /* 199 */ ++ .pme_name = "PM_L3_P0_LCO_DATA", ++ .pme_code = 0x00000260AA, ++ .pme_short_desc = "lco sent with data port 0", ++ .pme_long_desc = "lco sent with data port 0", ++}, ++[ POWER9_PME_PM_RUN_INST_CMPL ] = { /* 200 */ ++ .pme_name = "PM_RUN_INST_CMPL", ++ .pme_code = 0x00000400FA, ++ .pme_short_desc = "Run_Instructions", ++ .pme_long_desc = "Run_Instructions", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE ] = { /* 201 */ ++ .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000002D120, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_TEND_FAIL ] = { /* 202 */ ++ .pme_name = "PM_MRK_TEND_FAIL", ++ .pme_code = 0x00000028A4, ++ .pme_short_desc = "Nested or not nested tend failed for a marked tend instruction", ++ .pme_long_desc = "Nested or not nested tend failed for a marked tend instruction", ++}, ++[ POWER9_PME_PM_MRK_VSU_FIN ] = { /* 203 */ 
++ .pme_name = "PM_MRK_VSU_FIN", ++ .pme_code = 0x0000030132, ++ .pme_short_desc = "VSU marked instr finish", ++ .pme_long_desc = "VSU marked instr finish", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_1_ECO_MOD ] = { /* 204 */ ++ .pme_name = "PM_DATA_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x000004C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_RUN_SPURR ] = { /* 205 */ ++ .pme_name = "PM_RUN_SPURR", ++ .pme_code = 0x0000010008, ++ .pme_short_desc = "Run SPURR", ++ .pme_long_desc = "Run SPURR", ++}, ++[ POWER9_PME_PM_ST_CAUSED_FAIL ] = { /* 206 */ ++ .pme_name = "PM_ST_CAUSED_FAIL", ++ .pme_code = 0x000001608C, ++ .pme_short_desc = "Non TM St caused any thread to fail", ++ .pme_long_desc = "Non TM St caused any thread to fail", ++}, ++[ POWER9_PME_PM_SNOOP_TLBIE ] = { /* 207 */ ++ .pme_name = "PM_SNOOP_TLBIE", ++ .pme_code = 0x000000F880, ++ .pme_short_desc = "TLBIE snoop", ++ .pme_long_desc = "TLBIE snoop", ++}, ++[ POWER9_PME_PM_PMC1_SAVED ] = { /* 208 */ ++ .pme_name = "PM_PMC1_SAVED", ++ .pme_code = 0x000004D010, ++ .pme_short_desc = "PMC1 Rewind Value saved", ++ .pme_long_desc = "PMC1 Rewind Value saved", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3MISS ] = { /* 209 */ ++ .pme_name = "PM_DATA_FROM_L3MISS", ++ .pme_code = 0x00000300FE, ++ .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", ++ .pme_long_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", ++}, ++[ POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE ] = { /* 210 */ ++ .pme_name = "PM_DATA_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001C048, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's 
data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_DTLB_MISS_16G ] = { /* 211 */ ++ .pme_name = "PM_DTLB_MISS_16G", ++ .pme_code = 0x000001C058, ++ .pme_short_desc = "Data TLB Miss page size 16G", ++ .pme_long_desc = "Data TLB Miss page size 16G", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DMEM ] = { /* 212 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_DMEM", ++ .pme_code = 0x000004F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS ] = { /* 213 */ ++ .pme_name = "PM_ICT_NOSLOT_IC_L3MISS", ++ .pme_code = 0x000004E010, ++ .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3.", ++ .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3. 
The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_FLUSH ] = { /* 214 */ ++ .pme_name = "PM_FLUSH", ++ .pme_code = 0x00000400F8, ++ .pme_short_desc = "Flush (any type)", ++ .pme_long_desc = "Flush (any type)", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_OTHER ] = { /* 215 */ ++ .pme_name = "PM_LSU_FLUSH_OTHER", ++ .pme_code = 0x000000C0BC, ++ .pme_short_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped load or a Flush Next PPC)", ++ .pme_long_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped load or a Flush Next PPC)", ++}, ++[ POWER9_PME_PM_LS1_LAUNCH_HELD_PREF ] = { /* 216 */ ++ .pme_name = "PM_LS1_LAUNCH_HELD_PREF", ++ .pme_code = 0x000000C89C, ++ .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++ .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++}, ++[ POWER9_PME_PM_L2_LD_HIT ] = { /* 217 */ ++ .pme_name = "PM_L2_LD_HIT", ++ .pme_code = 0x000002609E, ++ .pme_short_desc = "All successful load dispatches that were L2 hits", ++ .pme_long_desc = "All successful load dispatches that were L2 hits", ++}, ++[ POWER9_PME_PM_LSU2_VECTOR_LD_FIN ] = { /* 218 */ ++ .pme_name = "PM_LSU2_VECTOR_LD_FIN", ++ .pme_code = 0x000000C084, ++ .pme_short_desc = "A vector load instruction finished.", ++ .pme_long_desc = "A vector load instruction finished. 
The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_EMSH ] = { /* 219 */ ++ .pme_name = "PM_LSU_FLUSH_EMSH", ++ .pme_code = 0x000000C0B0, ++ .pme_short_desc = "An ERAT miss was detected after a set-p hit.", ++ .pme_long_desc = "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", ++}, ++[ POWER9_PME_PM_IC_PREF_REQ ] = { /* 220 */ ++ .pme_name = "PM_IC_PREF_REQ", ++ .pme_code = 0x0000004888, ++ .pme_short_desc = "Instruction prefetch requests", ++ .pme_long_desc = "Instruction prefetch requests", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2_1_SHR ] = { /* 221 */ ++ .pme_name = "PM_DPTEG_FROM_L2_1_SHR", ++ .pme_code = 0x000003E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_XLATE_RADIX_MODE ] = { /* 222 */ ++ .pme_name = "PM_XLATE_RADIX_MODE", ++ .pme_code = 0x000000F898, ++ .pme_short_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", ++ .pme_long_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", ++}, ++[ POWER9_PME_PM_L3_LD_HIT ] = { /* 223 */ ++ .pme_name = "PM_L3_LD_HIT", ++ .pme_code = 0x00000260A4, ++ .pme_short_desc = "L3 demand LD Hits", ++ .pme_long_desc = "L3 demand LD Hits", ++}, ++[ POWER9_PME_PM_DARQ_7_9_ENTRIES ] = { /* 224 */ ++ .pme_name = "PM_DARQ_7_9_ENTRIES", ++ .pme_code = 0x000002E050, ++ .pme_short_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT ] = { /* 225 */ ++ .pme_name = "PM_CMPLU_STALL_EXEC_UNIT", ++ .pme_code = 0x000002D018, ++ .pme_short_desc = "Completion stall due to execution units (FXU/VSU/CRU)", ++ .pme_long_desc = "Completion stall due to execution units (FXU/VSU/CRU)", ++}, ++[ POWER9_PME_PM_DISP_HELD ] = { /* 226 */ ++ .pme_name = "PM_DISP_HELD", ++ .pme_code = 0x0000010006, ++ .pme_short_desc = "Dispatch Held", ++ .pme_long_desc = "Dispatch Held", ++}, ++[ POWER9_PME_PM_TM_FAIL_CONF_TM ] = { /* 227 */ ++ .pme_name = "PM_TM_FAIL_CONF_TM", ++ .pme_code = 0x00000020AC, ++ .pme_short_desc = "TM aborted because a conflict occurred with another transaction.", ++ .pme_long_desc = "TM aborted because a conflict occurred with another transaction.", ++}, ++[ POWER9_PME_PM_LS0_DC_COLLISIONS ] = { /* 228 */ ++ .pme_name = "PM_LS0_DC_COLLISIONS", ++ .pme_code = 0x000000D090, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", ++}, ++[ POWER9_PME_PM_L2_LD ] = { /* 229 */ ++ .pme_name = "PM_L2_LD", ++ .pme_code = 0x0000016080, ++ .pme_short_desc = 
"All successful D-side Load dispatches for this thread", ++ .pme_long_desc = "All successful D-side Load dispatches for this thread", ++}, ++[ POWER9_PME_PM_BTAC_GOOD_RESULT ] = { /* 230 */ ++ .pme_name = "PM_BTAC_GOOD_RESULT", ++ .pme_code = 0x00000058B0, ++ .pme_short_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", ++ .pme_long_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", ++}, ++[ POWER9_PME_PM_TEND_PEND_CYC ] = { /* 231 */ ++ .pme_name = "PM_TEND_PEND_CYC", ++ .pme_code = 0x000000E8B0, ++ .pme_short_desc = "TEND latency per thread", ++ .pme_long_desc = "TEND latency per thread", ++}, ++[ POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV ] = { /* 232 */ ++ .pme_name = "PM_MRK_DCACHE_RELOAD_INTV", ++ .pme_code = 0x0000040118, ++ .pme_short_desc = "Combined Intervention event", ++ .pme_long_desc = "Combined Intervention event", ++}, ++[ POWER9_PME_PM_DISP_HELD_HB_FULL ] = { /* 233 */ ++ .pme_name = "PM_DISP_HELD_HB_FULL", ++ .pme_code = 0x000003D05C, ++ .pme_short_desc = "Dispatch held due to History Buffer full.", ++ .pme_long_desc = "Dispatch held due to History Buffer full. 
Could be GPR/VSR/VMR/FPR/CR/XVF", ++}, ++[ POWER9_PME_PM_TM_TRESUME ] = { /* 234 */ ++ .pme_name = "PM_TM_TRESUME", ++ .pme_code = 0x00000020A4, ++ .pme_short_desc = "TM resume instruction completed", ++ .pme_long_desc = "TM resume instruction completed", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_SAO ] = { /* 235 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_SAO", ++ .pme_code = 0x000000D0A4, ++ .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++ .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++}, ++[ POWER9_PME_PM_LS0_TM_DISALLOW ] = { /* 236 */ ++ .pme_name = "PM_LS0_TM_DISALLOW", ++ .pme_code = 0x000000E0B4, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE ] = { /* 237 */ ++ .pme_name = "PM_DPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_RC0_BUSY ] = { /* 238 */ ++ .pme_name = "PM_RC0_BUSY", ++ .pme_code = 0x000002608E, ++ .pme_short_desc = "RC mach 0 Busy.", ++ .pme_long_desc = "RC mach 0 Busy. 
Used by PMU to sample average RC lifetime (mach0 used as sample point)", ++}, ++[ POWER9_PME_PM_LSU1_TM_L1_HIT ] = { /* 239 */ ++ .pme_name = "PM_LSU1_TM_L1_HIT", ++ .pme_code = 0x000000E894, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", ++}, ++[ POWER9_PME_PM_TB_BIT_TRANS ] = { /* 240 */ ++ .pme_name = "PM_TB_BIT_TRANS", ++ .pme_code = 0x00000300F8, ++ .pme_short_desc = "timebase event", ++ .pme_long_desc = "timebase event", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT ] = { /* 241 */ ++ .pme_name = "PM_DPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001E040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_MOD ] = { /* 242 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_1_MOD", ++ .pme_code = 0x000002F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT ] = { /* 243 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000002C120, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC ] = { /* 244 */ ++ .pme_name = "PM_MRK_DATA_FROM_LL4_CYC", ++ .pme_code = 0x000002C12E, ++ .pme_short_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", ++}, ++[ POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE ] = { /* 245 */ ++ .pme_name = "PM_INST_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_L3_CO_L31 ] = { /* 246 */ ++ .pme_name = "PM_L3_CO_L31", ++ .pme_code = 0x00000268A0, ++ .pme_short_desc = "L3 CO to L3.", ++ .pme_long_desc = "L3 CO to L3.1 OR of port 0 and 1 ( lossy)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_CRYPTO ] = { /* 247 */ ++ .pme_name = "PM_CMPLU_STALL_CRYPTO", ++ .pme_code = 0x000004C01E, ++ .pme_short_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 ] = { /* 248 */ ++ 
.pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3", ++ .pme_code = 0x000003F058, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", ++}, ++[ POWER9_PME_PM_ICT_EMPTY_CYC ] = { /* 249 */ ++ .pme_name = "PM_ICT_EMPTY_CYC", ++ .pme_code = 0x0000020004, ++ .pme_short_desc = "Cycles in which the ICT is completely empty.", ++ .pme_long_desc = "Cycles in which the ICT is completely empty. No itags are assigned to any thread", ++}, ++[ POWER9_PME_PM_BR_UNCOND ] = { /* 250 */ ++ .pme_name = "PM_BR_UNCOND", ++ .pme_code = 0x00000040A0, ++ .pme_short_desc = "Unconditional Branch Completed.", ++ .pme_long_desc = "Unconditional Branch Completed. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.", ++}, ++[ POWER9_PME_PM_DERAT_MISS_2M ] = { /* 251 */ ++ .pme_name = "PM_DERAT_MISS_2M", ++ .pme_code = 0x000001C05A, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 2M.", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", ++}, ++[ POWER9_PME_PM_PMC4_REWIND ] = { /* 252 */ ++ .pme_name = "PM_PMC4_REWIND", ++ .pme_code = 0x0000010020, ++ .pme_short_desc = "PMC4 Rewind Event", ++ .pme_long_desc = "PMC4 Rewind Event", ++}, ++[ POWER9_PME_PM_L2_RCLD_DISP ] = { /* 253 */ ++ .pme_name = "PM_L2_RCLD_DISP", ++ .pme_code = 0x0000016084, ++ .pme_short_desc = "L2 RC load dispatch attempt", ++ .pme_long_desc = "L2 RC load dispatch attempt", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT ] = { /* 254 */ ++ .pme_name = "PM_CMPLU_STALL_DMISS_L2L3_CONFLICT", ++ .pme_code = 0x000004C016, ++ .pme_short_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", ++ .pme_long_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", ++}, ++[ POWER9_PME_PM_TAKEN_BR_MPRED_CMPL ] = { /* 255 */ ++ .pme_name = "PM_TAKEN_BR_MPRED_CMPL", ++ .pme_code = 0x0000020056, ++ .pme_short_desc = "Total number of taken branches that were incorrectly predicted as not-taken.", ++ .pme_long_desc = "Total number of taken branches that were incorrectly predicted as not-taken. 
This event counts branches completed and does not include speculative instructions", ++}, ++[ POWER9_PME_PM_THRD_PRIO_2_3_CYC ] = { /* 256 */ ++ .pme_name = "PM_THRD_PRIO_2_3_CYC", ++ .pme_code = 0x00000048BC, ++ .pme_short_desc = "Cycles thread running at priority level 2 or 3", ++ .pme_long_desc = "Cycles thread running at priority level 2 or 3", ++}, ++[ POWER9_PME_PM_DATA_FROM_DL4 ] = { /* 257 */ ++ .pme_name = "PM_DATA_FROM_DL4", ++ .pme_code = 0x000003C04C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DPLONG ] = { /* 258 */ ++ .pme_name = "PM_CMPLU_STALL_DPLONG", ++ .pme_code = 0x000003405C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. 
Qualified by NOT vector AND multicycle", ++}, ++[ POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { /* 259 */ ++ .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", ++ .pme_code = 0x0000004098, ++ .pme_short_desc = "L2 I cache demand request due to BHT redirect, branch redirect ( 2 bubbles 3 cycles)", ++ .pme_long_desc = "L2 I cache demand request due to BHT redirect, branch redirect ( 2 bubbles 3 cycles)", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_BKILL ] = { /* 260 */ ++ .pme_name = "PM_MRK_FAB_RSP_BKILL", ++ .pme_code = 0x0000040154, ++ .pme_short_desc = "Marked store had to do a bkill", ++ .pme_long_desc = "Marked store had to do a bkill", ++}, ++[ POWER9_PME_PM_LSU_DERAT_MISS ] = { /* 261 */ ++ .pme_name = "PM_LSU_DERAT_MISS", ++ .pme_code = 0x00000200F6, ++ .pme_short_desc = "DERAT Reloaded due to a DERAT miss", ++ .pme_long_desc = "DERAT Reloaded due to a DERAT miss", ++}, ++[ POWER9_PME_PM_IC_PREF_CANCEL_L2 ] = { /* 262 */ ++ .pme_name = "PM_IC_PREF_CANCEL_L2", ++ .pme_code = 0x0000004094, ++ .pme_short_desc = "L2 Squashed a demand or prefetch request", ++ .pme_long_desc = "L2 Squashed a demand or prefetch request", ++}, ++[ POWER9_PME_PM_MRK_NTC_CYC ] = { /* 263 */ ++ .pme_name = "PM_MRK_NTC_CYC", ++ .pme_code = 0x000002011C, ++ .pme_short_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", ++ .pme_long_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", ++}, ++[ POWER9_PME_PM_STCX_FIN ] = { /* 264 */ ++ .pme_name = "PM_STCX_FIN", ++ .pme_code = 0x000002E014, ++ .pme_short_desc = "Number of stcx instructions finished.", ++ .pme_long_desc = "Number of stcx instructions finished. 
This includes instructions in the speculative path of a branch that may be flushed", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF ] = { /* 265 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_MEPF", ++ .pme_code = 0x000002D142, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", ++}, ++[ POWER9_PME_PM_DC_PREF_FUZZY_CONF ] = { /* 266 */ ++ .pme_name = "PM_DC_PREF_FUZZY_CONF", ++ .pme_code = 0x000000F8A8, ++ .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up)", ++}, ++[ POWER9_PME_PM_MULT_MRK ] = { /* 267 */ ++ .pme_name = "PM_MULT_MRK", ++ .pme_code = 0x000003D15E, ++ .pme_short_desc = "mult marked instr", ++ .pme_long_desc = "mult marked instr", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_LARX_STCX ] = { /* 268 */ ++ .pme_name = "PM_LSU_FLUSH_LARX_STCX", ++ .pme_code = 0x000000C8B8, ++ .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", ++ .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. 
The flush happens when the older larx/stcx relaunches", ++}, ++[ POWER9_PME_PM_L3_P1_LCO_NO_DATA ] = { /* 269 */ ++ .pme_name = "PM_L3_P1_LCO_NO_DATA", ++ .pme_code = 0x00000168AA, ++ .pme_short_desc = "dataless l3 lco sent port 1", ++ .pme_long_desc = "dataless l3 lco sent port 1", ++}, ++[ POWER9_PME_PM_TM_TABORT_TRECLAIM ] = { /* 270 */ ++ .pme_name = "PM_TM_TABORT_TRECLAIM", ++ .pme_code = 0x0000002898, ++ .pme_short_desc = "Completion time tabortnoncd, tabortcd, treclaim", ++ .pme_long_desc = "Completion time tabortnoncd, tabortcd, treclaim", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC ] = { /* 271 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", ++ .pme_code = 0x000003D144, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load", ++}, ++[ POWER9_PME_PM_BR_PRED_CCACHE ] = { /* 272 */ ++ .pme_name = "PM_BR_PRED_CCACHE", ++ .pme_code = 0x00000040A4, ++ .pme_short_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", ++}, ++[ POWER9_PME_PM_L3_P1_LCO_DATA ] = { /* 273 */ ++ .pme_name = "PM_L3_P1_LCO_DATA", ++ .pme_code = 0x00000268AA, ++ .pme_short_desc = "lco sent with data port 1", ++ .pme_long_desc = "lco sent with data port 1", ++}, ++[ POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED ] = { /* 274 */ ++ .pme_name = "PM_LINK_STACK_WRONG_ADD_PRED", ++ .pme_code = 0x0000005098, ++ .pme_short_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", ++ .pme_long_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3 ] = { /* 275 */ ++ .pme_name = 
"PM_MRK_DPTEG_FROM_L3", ++ .pme_code = 0x000004F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_ST_CMPL_INT ] = { /* 276 */ ++ .pme_name = "PM_MRK_ST_CMPL_INT", ++ .pme_code = 0x0000030134, ++ .pme_short_desc = "marked store finished with intervention", ++ .pme_long_desc = "marked store finished with intervention", ++}, ++[ POWER9_PME_PM_FLUSH_HB_RESTORE_CYC ] = { /* 277 */ ++ .pme_name = "PM_FLUSH_HB_RESTORE_CYC", ++ .pme_code = 0x0000002084, ++ .pme_short_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush.", ++ .pme_long_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush. History buffer recovery", ++}, ++[ POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC ] = { /* 278 */ ++ .pme_name = "PM_LS1_PTE_TABLEWALK_CYC", ++ .pme_code = 0x000000E8BC, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 1", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 1", ++}, ++[ POWER9_PME_PM_L3_CI_USAGE ] = { /* 279 */ ++ .pme_name = "PM_L3_CI_USAGE", ++ .pme_code = 0x00000168AC, ++ .pme_short_desc = "rotating sample of 16 CI or CO actives", ++ .pme_long_desc = "rotating sample of 16 CI or CO actives", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3MISS ] = { /* 280 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3MISS", ++ .pme_code = 0x00000201E4, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_DL4 ] = { /* 281 */ ++ 
.pme_name = "PM_DPTEG_FROM_DL4", ++ .pme_code = 0x000003E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_STCX_FIN ] = { /* 282 */ ++ .pme_name = "PM_MRK_STCX_FIN", ++ .pme_code = 0x0000024056, ++ .pme_short_desc = "Number of marked stcx instructions finished.", ++ .pme_long_desc = "Number of marked stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_UE ] = { /* 283 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_UE", ++ .pme_code = 0x000000D89C, ++ .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++ .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY ] = { /* 284 */ ++ .pme_name = "PM_MRK_DATA_FROM_MEMORY", ++ .pme_code = 0x00000201E0, ++ .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", ++}, ++[ POWER9_PME_PM_GRP_PUMP_MPRED_RTY ] = { /* 285 */ ++ .pme_name = "PM_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x0000010052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data 
prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_SHR ] = { /* 286 */ ++ .pme_name = "PM_DPTEG_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x000003E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_FLUSH_DISP_TLBIE ] = { /* 287 */ ++ .pme_name = "PM_FLUSH_DISP_TLBIE", ++ .pme_code = 0x0000002888, ++ .pme_short_desc = "Dispatch Flush: TLBIE", ++ .pme_long_desc = "Dispatch Flush: TLBIE", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3MISS ] = { /* 288 */ ++ .pme_name = "PM_DPTEG_FROM_L3MISS", ++ .pme_code = 0x000004E04E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_L3_GRP_GUESS_CORRECT ] = { /* 289 */ ++ .pme_name = "PM_L3_GRP_GUESS_CORRECT", ++ .pme_code = 0x00000168B2, ++ .pme_short_desc = "Initial scope=group and data from same group (near) (pred successful)", ++ .pme_long_desc = "Initial scope=group and data from same group (near) (pred successful)", ++}, ++[ POWER9_PME_PM_IC_INVALIDATE ] = { /* 290 */ ++ .pme_name = "PM_IC_INVALIDATE", ++ .pme_code = 0x0000005888, ++ .pme_short_desc = "Ic line invalidated", ++ .pme_long_desc = "Ic line invalidated", ++}, ++[ POWER9_PME_PM_DERAT_MISS_16G ] = { /* 291 */ ++ .pme_name = "PM_DERAT_MISS_16G", ++ .pme_code = 0x000004C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16G", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16G", ++}, ++[ POWER9_PME_PM_SYS_PUMP_MPRED_RTY ] = { /* 292 */ ++ .pme_name = "PM_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x0000040050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_LMQ_MERGE ] = { /* 293 */ ++ .pme_name = "PM_LMQ_MERGE", ++ .pme_code = 0x000001002E, ++ .pme_short_desc = "A demand miss collides with a prefetch for the same line", ++ .pme_long_desc = "A demand miss collides with a prefetch for the same line", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_LMEM ] = { /* 294 */ ++ .pme_name = "PM_IPTEG_FROM_LMEM", ++ .pme_code = 0x0000025048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request", ++}, ++[ 
POWER9_PME_PM_L3_LAT_CI_HIT ] = { /* 295 */ ++ .pme_name = "PM_L3_LAT_CI_HIT", ++ .pme_code = 0x00000460A2, ++ .pme_short_desc = "L3 Lateral Castins Hit", ++ .pme_long_desc = "L3 Lateral Castins Hit", ++}, ++[ POWER9_PME_PM_LSU1_VECTOR_ST_FIN ] = { /* 296 */ ++ .pme_name = "PM_LSU1_VECTOR_ST_FIN", ++ .pme_code = 0x000000C888, ++ .pme_short_desc = "A vector store instruction finished.", ++ .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++}, ++[ POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { /* 297 */ ++ .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", ++ .pme_code = 0x0000004898, ++ .pme_short_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", ++ .pme_long_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", ++}, ++[ POWER9_PME_PM_INST_FROM_LMEM ] = { /* 298 */ ++ .pme_name = "PM_INST_FROM_LMEM", ++ .pme_code = 0x0000024048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL4 ] = { /* 299 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL4", ++ .pme_code = 0x000003515C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DTLB_MISS_4K ] = { /* 300 */ ++ .pme_name = "PM_MRK_DTLB_MISS_4K", ++ .pme_code = 0x000002D156, ++ .pme_short_desc = "Marked Data TLB Miss page size 4k", ++ .pme_long_desc = "Marked Data TLB Miss page size 4k", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT ] = { /* 301 */ ++ .pme_name = 
"PM_MRK_DATA_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000003D146, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH ] = { /* 302 */ ++ .pme_name = "PM_CMPLU_STALL_NTC_FLUSH", ++ .pme_code = 0x000002E01E, ++ .pme_short_desc = "Completion stall due to ntc flush", ++ .pme_long_desc = "Completion stall due to ntc flush", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC ] = { /* 303 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC", ++ .pme_code = 0x000004C124, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", ++}, ++[ POWER9_PME_PM_DARQ_0_3_ENTRIES ] = { /* 304 */ ++ .pme_name = "PM_DARQ_0_3_ENTRIES", ++ .pme_code = 0x000004D04A, ++ .pme_short_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3MISS_MOD ] = { /* 305 */ ++ .pme_name = "PM_DATA_FROM_L3MISS_MOD", ++ .pme_code = 0x000004C04E, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR_CYC ] = { /* 306 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_1_SHR_CYC", ++ .pme_code = 0x000001D154, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on 
the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG ] = { /* 307 */ ++ .pme_name = "PM_TAGE_OVERRIDE_WRONG", ++ .pme_code = 0x00000050B8, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction but it was incorrect.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction but it was incorrect. Counted at completion for taken branches only", ++}, ++[ POWER9_PME_PM_L2_LD_MISS ] = { /* 308 */ ++ .pme_name = "PM_L2_LD_MISS", ++ .pme_code = 0x0000026080, ++ .pme_short_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", ++ .pme_long_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", ++}, ++[ POWER9_PME_PM_EAT_FULL_CYC ] = { /* 309 */ ++ .pme_name = "PM_EAT_FULL_CYC", ++ .pme_code = 0x0000004084, ++ .pme_short_desc = "Cycles No room in EAT", ++ .pme_long_desc = "Cycles No room in EAT", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH ] = { /* 310 */ ++ .pme_name = "PM_CMPLU_STALL_SPEC_FINISH", ++ .pme_code = 0x0000030028, ++ .pme_short_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", ++ .pme_long_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX ] = { /* 311 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_LARX_STCX", ++ .pme_code = 0x000000D8A4, ++ .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", ++ .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. 
The flush happens when the older larx/stcx relaunches", ++}, ++[ POWER9_PME_PM_THRESH_EXC_128 ] = { /* 312 */ ++ .pme_name = "PM_THRESH_EXC_128", ++ .pme_code = 0x00000401EA, ++ .pme_short_desc = "Threshold counter exceeded a value of 128", ++ .pme_long_desc = "Threshold counter exceeded a value of 128", ++}, ++[ POWER9_PME_PM_LMQ_EMPTY_CYC ] = { /* 313 */ ++ .pme_name = "PM_LMQ_EMPTY_CYC", ++ .pme_code = 0x000002E05E, ++ .pme_short_desc = "Cycles in which the LMQ has no pending load misses for this thread", ++ .pme_long_desc = "Cycles in which the LMQ has no pending load misses for this thread", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 ] = { /* 314 */ ++ .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L3", ++ .pme_code = 0x000003F05A, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++}, ++[ POWER9_PME_PM_MRK_IC_MISS ] = { /* 315 */ ++ .pme_name = "PM_MRK_IC_MISS", ++ .pme_code = 0x000004013A, ++ .pme_short_desc = "Marked instruction experienced I cache miss", ++ .pme_long_desc = "Marked instruction experienced I cache miss", ++}, ++[ POWER9_PME_PM_L3_P1_GRP_PUMP ] = { /* 316 */ ++ .pme_name = "PM_L3_P1_GRP_PUMP", ++ .pme_code = 0x00000268B0, ++ .pme_short_desc = "L3 pf sent with grp scope port 1", ++ .pme_long_desc = "L3 pf sent with grp scope port 1", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_TEND ] = { /* 317 */ ++ .pme_name = "PM_CMPLU_STALL_TEND", ++ .pme_code = 0x000001E050, ++ .pme_short_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", ++}, ++[ POWER9_PME_PM_PUMP_MPRED ] = { /* 318 */ ++ .pme_name = "PM_PUMP_MPRED", ++ .pme_code = 0x0000040052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump 
misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_INST_GRP_PUMP_MPRED ] = { /* 319 */ ++ .pme_name = "PM_INST_GRP_PUMP_MPRED", ++ .pme_code = 0x000002C05E, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", ++}, ++[ POWER9_PME_PM_L1_PREF ] = { /* 320 */ ++ .pme_name = "PM_L1_PREF", ++ .pme_code = 0x0000020054, ++ .pme_short_desc = "A data line was written to the L1 due to a hardware or software prefetch", ++ .pme_long_desc = "A data line was written to the L1 due to a hardware or software prefetch", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { /* 321 */ ++ .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", ++ .pme_code = 0x000004D128, ++ .pme_short_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_ATOMIC ] = { /* 322 */ ++ .pme_name = "PM_LSU_FLUSH_ATOMIC", ++ .pme_code = 0x000000C8A8, ++ .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", ++ .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", ++}, ++[ POWER9_PME_PM_L2_DISP_ALL_L2MISS ] = { /* 323 */ ++ .pme_name = "PM_L2_DISP_ALL_L2MISS", ++ .pme_code = 0x0000046080, ++ .pme_short_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", ++ .pme_long_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", ++}, ++[ POWER9_PME_PM_DATA_FROM_MEMORY ] = { /* 324 */ ++ .pme_name = "PM_DATA_FROM_MEMORY", ++ .pme_code = 0x00000400FE, ++ .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_MOD ] = { /* 325 */ ++ .pme_name = "PM_IPTEG_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x0000045044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR ] = { /* 326 */ ++ .pme_name = "PM_ISIDE_DISP_FAIL_ADDR", ++ .pme_code = 0x000002608A, ++ .pme_short_desc = "All i-side dispatch attempts that failed due to an addr collision with another machine", ++ .pme_long_desc = "All i-side dispatch attempts that failed due to an addr collision with another machine", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_HWSYNC ] = { /* 327 */ ++ .pme_name = "PM_CMPLU_STALL_HWSYNC", ++ .pme_code = 0x0000030036, ++ .pme_short_desc = "completion stall due to hwsync", ++ .pme_long_desc = "completion stall due to hwsync", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3 ] = { /* 328 */ ++ 
.pme_name = "PM_DATA_FROM_L3", ++ .pme_code = 0x000004C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", ++}, ++[ POWER9_PME_PM_PMC2_OVERFLOW ] = { /* 329 */ ++ .pme_name = "PM_PMC2_OVERFLOW", ++ .pme_code = 0x0000030010, ++ .pme_short_desc = "Overflow from counter 2", ++ .pme_long_desc = "Overflow from counter 2", ++}, ++[ POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC ] = { /* 330 */ ++ .pme_name = "PM_LSU0_SRQ_S0_VALID_CYC", ++ .pme_code = 0x000000D0B4, ++ .pme_short_desc = "Slot 0 of SRQ valid", ++ .pme_long_desc = "Slot 0 of SRQ valid", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_LMEM ] = { /* 331 */ ++ .pme_name = "PM_DPTEG_FROM_LMEM", ++ .pme_code = 0x000002E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE ] = { /* 332 */ ++ .pme_name = "PM_IPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x0000015048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_LSU1_SET_MPRED ] = { /* 333 */ ++ .pme_name = "PM_LSU1_SET_MPRED", ++ .pme_code = 0x000000D880, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_1_ECO_SHR ] = { /* 334 */ ++ .pme_name = "PM_DATA_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x000003C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_INST_FROM_MEMORY ] = { /* 335 */ ++ .pme_name = "PM_INST_FROM_MEMORY", ++ .pme_code = 0x000002404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_L3_P1_LCO_RTY ] = { /* 336 */ ++ .pme_name = "PM_L3_P1_LCO_RTY", ++ .pme_code = 0x00000168B4, ++ .pme_short_desc = "L3 lateral cast out received retry on port 1", ++ .pme_long_desc = "L3 lateral cast out received retry on port 1", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_1_SHR ] = { /* 337 */ ++ .pme_name = "PM_DATA_FROM_L2_1_SHR", ++ .pme_code = 0x000003C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_FLUSH_LSU ] = { /* 338 */ ++ .pme_name = "PM_FLUSH_LSU", ++ .pme_code = 0x00000058A4, ++ .pme_short_desc = "LSU flushes.", ++ .pme_long_desc = "LSU flushes. 
Includes all lsu flushes", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_FXLONG ] = { /* 339 */ ++ .pme_name = "PM_CMPLU_STALL_FXLONG", ++ .pme_code = 0x000004D016, ++ .pme_short_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", ++ .pme_long_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM ] = { /* 340 */ ++ .pme_name = "PM_CMPLU_STALL_DMISS_LMEM", ++ .pme_code = 0x0000030038, ++ .pme_short_desc = "Completion stall due to cache miss that resolves in local memory", ++ .pme_long_desc = "Completion stall due to cache miss that resolves in local memory", ++}, ++[ POWER9_PME_PM_SNP_TM_HIT_M ] = { /* 341 */ ++ .pme_name = "PM_SNP_TM_HIT_M", ++ .pme_code = 0x00000360A6, ++ .pme_short_desc = "snp tm st hit m mu", ++ .pme_long_desc = "snp tm st hit m mu", ++}, ++[ POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY ] = { /* 342 */ ++ .pme_name = "PM_INST_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x0000014052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", ++}, ++[ POWER9_PME_PM_L2_INST_MISS ] = { /* 343 */ ++ .pme_name = "PM_L2_INST_MISS", ++ .pme_code = 0x000004609E, ++ .pme_short_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_ERAT_MISS ] = { /* 344 */ ++ .pme_name = "PM_CMPLU_STALL_ERAT_MISS", ++ .pme_code = 0x000004C012, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation 
miss", ++}, ++[ POWER9_PME_PM_MRK_L2_RC_DONE ] = { /* 345 */ ++ .pme_name = "PM_MRK_L2_RC_DONE", ++ .pme_code = 0x000003012A, ++ .pme_short_desc = "Marked RC done", ++ .pme_long_desc = "Marked RC done", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_1_SHR ] = { /* 346 */ ++ .pme_name = "PM_INST_FROM_L3_1_SHR", ++ .pme_code = 0x0000014046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L2 ] = { /* 347 */ ++ .pme_name = "PM_RADIX_PWC_L4_PDE_FROM_L2", ++ .pme_code = 0x000002D02C, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 4 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 4 page walk cache from the core's L2 data cache", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD ] = { /* 348 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_MOD", ++ .pme_code = 0x000002D144, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_CO0_BUSY ] = { /* 349 */ ++ .pme_name = "PM_CO0_BUSY", ++ .pme_code = 0x000004608E, ++ .pme_short_desc = "CO mach 0 Busy.", ++ .pme_long_desc = "CO mach 0 Busy. 
Used by PMU to sample ave RC livetime(mach0 used as sample point)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_STORE_DATA ] = { /* 350 */ ++ .pme_name = "PM_CMPLU_STALL_STORE_DATA", ++ .pme_code = 0x0000030026, ++ .pme_short_desc = "Finish stall because the next to finish instruction was a store waiting on data", ++ .pme_long_desc = "Finish stall because the next to finish instruction was a store waiting on data", ++}, ++[ POWER9_PME_PM_INST_FROM_RMEM ] = { /* 351 */ ++ .pme_name = "PM_INST_FROM_RMEM", ++ .pme_code = 0x000003404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_SYNC_MRK_BR_LINK ] = { /* 352 */ ++ .pme_name = "PM_SYNC_MRK_BR_LINK", ++ .pme_code = 0x0000015152, ++ .pme_short_desc = "Marked Branch and link branch that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked Branch and link branch that can cause a synchronous interrupt", ++}, ++[ POWER9_PME_PM_L3_LD_PREF ] = { /* 353 */ ++ .pme_name = "PM_L3_LD_PREF", ++ .pme_code = 0x000000F0B0, ++ .pme_short_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", ++ .pme_long_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", ++}, ++[ POWER9_PME_PM_DISP_CLB_HELD_TLBIE ] = { /* 354 */ ++ .pme_name = "PM_DISP_CLB_HELD_TLBIE", ++ .pme_code = 0x0000002890, ++ .pme_short_desc = "Dispatch Hold: Due to TLBIE", ++ .pme_long_desc = "Dispatch Hold: Due to TLBIE", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE ] = { /* 355 */ ++ .pme_name = "PM_DPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another 
core's L2/L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC ] = { /* 356 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", ++ .pme_code = 0x000001415C, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", ++}, ++[ POWER9_PME_PM_LS0_UNALIGNED_LD ] = { /* 357 */ ++ .pme_name = "PM_LS0_UNALIGNED_LD", ++ .pme_code = 0x000000C094, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { /* 358 */ ++ .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", ++ .pme_code = 0x000004E11E, ++ .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", ++}, ++[ POWER9_PME_PM_SN_HIT ] = { /* 359 */ ++ .pme_name = "PM_SN_HIT", ++ .pme_code = 0x00000460A8, ++ .pme_short_desc = "Any port snooper hit.", ++ .pme_long_desc = "Any port snooper hit. 
Up to 4 can happen in a cycle but we only count 1", ++}, ++[ POWER9_PME_PM_L3_LOC_GUESS_CORRECT ] = { /* 360 */ ++ .pme_name = "PM_L3_LOC_GUESS_CORRECT", ++ .pme_code = 0x00000160B2, ++ .pme_short_desc = "initial scope=node/chip and data from local node (local) (pred successful)", ++ .pme_long_desc = "initial scope=node/chip and data from local node (local) (pred successful)", ++}, ++[ POWER9_PME_PM_MRK_INST_FROM_L3MISS ] = { /* 361 */ ++ .pme_name = "PM_MRK_INST_FROM_L3MISS", ++ .pme_code = 0x00000401E6, ++ .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++ .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++}, ++[ POWER9_PME_PM_DECODE_FUSION_EXT_ADD ] = { /* 362 */ ++ .pme_name = "PM_DECODE_FUSION_EXT_ADD", ++ .pme_code = 0x0000005084, ++ .pme_short_desc = "32-bit extended addition", ++ .pme_long_desc = "32-bit extended addition", ++}, ++[ POWER9_PME_PM_INST_FROM_DL4 ] = { /* 363 */ ++ .pme_name = "PM_INST_FROM_DL4", ++ .pme_code = 0x000003404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_DC_PREF_XCONS_ALLOC ] = { /* 364 */ ++ .pme_name = "PM_DC_PREF_XCONS_ALLOC", ++ .pme_code = 0x000000F8B4, ++ .pme_short_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", ++ .pme_long_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY ] = { /* 365 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_MEMORY", ++ .pme_code = 0x000002F14C, ++ .pme_short_desc = "A Page Table Entry was 
loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_IC_PREF_CANCEL_PAGE ] = { /* 366 */ ++ .pme_name = "PM_IC_PREF_CANCEL_PAGE", ++ .pme_code = 0x0000004090, ++ .pme_short_desc = "Prefetch Canceled due to page boundary", ++ .pme_long_desc = "Prefetch Canceled due to page boundary", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 ] = { /* 367 */ ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3", ++ .pme_code = 0x000003F05E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation", ++}, ++[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW ] = { /* 368 */ ++ .pme_name = "PM_L3_GRP_GUESS_WRONG_LOW", ++ .pme_code = 0x00000360B2, ++ .pme_short_desc = "Initial scope=group but data from outside group (far or rem).", ++ .pme_long_desc = "Initial scope=group but data from outside group (far or rem). 
Prediction too Low", ++}, ++[ POWER9_PME_PM_TM_FAIL_SELF ] = { /* 369 */ ++ .pme_name = "PM_TM_FAIL_SELF", ++ .pme_code = 0x00000028AC, ++ .pme_short_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally", ++ .pme_long_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally", ++}, ++[ POWER9_PME_PM_L3_P1_SYS_PUMP ] = { /* 370 */ ++ .pme_name = "PM_L3_P1_SYS_PUMP", ++ .pme_code = 0x00000368B0, ++ .pme_short_desc = "L3 pf sent with sys scope port 1", ++ .pme_long_desc = "L3 pf sent with sys scope port 1", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_RFID ] = { /* 371 */ ++ .pme_name = "PM_CMPLU_STALL_RFID", ++ .pme_code = 0x000002C01E, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by an RFID exception, which has to be serviced before the instruction can complete", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by an RFID exception, which has to be serviced before the instruction can complete", ++}, ++[ POWER9_PME_PM_BR_2PATH ] = { /* 372 */ ++ .pme_name = "PM_BR_2PATH", ++ .pme_code = 0x0000020036, ++ .pme_short_desc = "two path branch", ++ .pme_long_desc = "two path branch", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS ] = { /* 373 */ ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3MISS", ++ .pme_code = 0x000003F054, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache. This is the deepest level of PWC possible for a translation. 
The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2MISS ] = { /* 374 */ ++ .pme_name = "PM_DPTEG_FROM_L2MISS", ++ .pme_code = 0x000001E04E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_TM_TX_PASS_RUN_INST ] = { /* 375 */ ++ .pme_name = "PM_TM_TX_PASS_RUN_INST", ++ .pme_code = 0x000004E014, ++ .pme_short_desc = "Run instructions spent in successful transactions", ++ .pme_long_desc = "Run instructions spent in successful transactions", ++}, ++[ POWER9_PME_PM_L1_ICACHE_RELOADED_PREF ] = { /* 376 */ ++ .pme_name = "PM_L1_ICACHE_RELOADED_PREF", ++ .pme_code = 0x0000030068, ++ .pme_short_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", ++ .pme_long_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", ++}, ++[ POWER9_PME_PM_THRESH_EXC_4096 ] = { /* 377 */ ++ .pme_name = "PM_THRESH_EXC_4096", ++ .pme_code = 0x00000101E6, ++ .pme_short_desc = "Threshold counter exceed a count of 4096", ++ .pme_long_desc = "Threshold counter exceed a count of 4096", ++}, ++[ POWER9_PME_PM_IERAT_RELOAD_64K ] = { /* 378 */ ++ .pme_name = "PM_IERAT_RELOAD_64K", ++ .pme_code = 0x000003006A, ++ .pme_short_desc = "IERAT Reloaded (Miss) for a 64k page", ++ .pme_long_desc = "IERAT Reloaded (Miss) for a 64k page", ++}, ++[ POWER9_PME_PM_LSU0_TM_L1_MISS ] = { /* 379 */ ++ .pme_name = "PM_LSU0_TM_L1_MISS", ++ .pme_code = 0x000000E09C, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", ++}, ++[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED ] = { /* 380 */ ++ .pme_name = "PM_MEM_LOC_THRESH_LSU_MED", ++ 
.pme_code = 0x000001C05E, ++ .pme_short_desc = "Local memory above threshold for data prefetch", ++ .pme_long_desc = "Local memory above threshold for data prefetch", ++}, ++[ POWER9_PME_PM_PMC3_REWIND ] = { /* 381 */ ++ .pme_name = "PM_PMC3_REWIND", ++ .pme_code = 0x000001000A, ++ .pme_short_desc = "PMC3 rewind event.", ++ .pme_long_desc = "PMC3 rewind event. A rewind happens when a speculative event (such as latency or CPI stack) is selected on PMC3 and the stall reason or reload source did not match the one programmed in PMC3. When this occurs, the count in PMC3 will not change.", ++}, ++[ POWER9_PME_PM_ST_FWD ] = { /* 382 */ ++ .pme_name = "PM_ST_FWD", ++ .pme_code = 0x0000020018, ++ .pme_short_desc = "Store forwards that finished", ++ .pme_long_desc = "Store forwards that finished", ++}, ++[ POWER9_PME_PM_TM_FAIL_TX_CONFLICT ] = { /* 383 */ ++ .pme_name = "PM_TM_FAIL_TX_CONFLICT", ++ .pme_code = 0x000000E8AC, ++ .pme_short_desc = "Transactional conflict from LSU, whatever gets reported to texas", ++ .pme_long_desc = "Transactional conflict from LSU, whatever gets reported to texas", ++}, ++[ POWER9_PME_PM_SYNC_MRK_L2MISS ] = { /* 384 */ ++ .pme_name = "PM_SYNC_MRK_L2MISS", ++ .pme_code = 0x000001515A, ++ .pme_short_desc = "Marked L2 Miss that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L2 Miss that can throw a synchronous interrupt", ++}, ++[ POWER9_PME_PM_ISU0_ISS_HOLD_ALL ] = { /* 385 */ ++ .pme_name = "PM_ISU0_ISS_HOLD_ALL", ++ .pme_code = 0x0000003080, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC ] = { /* 386 */ ++ .pme_name = "PM_MRK_FAB_RSP_DCLAIM_CYC", ++ .pme_code = 0x000002F152, ++ .pme_short_desc = "cycles L2 RC took for a dclaim", ++ .pme_long_desc = "cycles L2 RC took for a dclaim", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2 ] = { /* 387 */ ++ .pme_name = "PM_DATA_FROM_L2", ++ .pme_code = 0x000001C042, ++ .pme_short_desc = "The processor's data cache was
reloaded from local core's L2 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { /* 388 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", ++ .pme_code = 0x000001D14A, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_ISQ_0_8_ENTRIES ] = { /* 389 */ ++ .pme_name = "PM_ISQ_0_8_ENTRIES", ++ .pme_code = 0x000003005A, ++ .pme_short_desc = "Cycles in which 8 or less Issue Queue entries are in use.", ++ .pme_long_desc = "Cycles in which 8 or less Issue Queue entries are in use. This is a shared event, not per thread", ++}, ++[ POWER9_PME_PM_L3_CO_MEPF ] = { /* 390 */ ++ .pme_name = "PM_L3_CO_MEPF", ++ .pme_code = 0x00000168A0, ++ .pme_short_desc = "L3 CO of line in Mep state (includes casthrough)", ++ .pme_long_desc = "L3 CO of line in Mep state (includes casthrough)", ++}, ++[ POWER9_PME_PM_LINK_STACK_INVALID_PTR ] = { /* 391 */ ++ .pme_name = "PM_LINK_STACK_INVALID_PTR", ++ .pme_code = 0x0000005898, ++ .pme_short_desc = "It is most often caused by certain types of flush where the pointer is not available.", ++ .pme_long_desc = "It is most often caused by certain types of flush where the pointer is not available.
Can result in the data in the link stack becoming unusable.", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2_1_MOD ] = { /* 392 */ ++ .pme_name = "PM_IPTEG_FROM_L2_1_MOD", ++ .pme_code = 0x0000045046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_TM_ST_CAUSED_FAIL ] = { /* 393 */ ++ .pme_name = "PM_TM_ST_CAUSED_FAIL", ++ .pme_code = 0x000003688C, ++ .pme_short_desc = "TM Store (fav or non-fav) caused another thread to fail", ++ .pme_long_desc = "TM Store (fav or non-fav) caused another thread to fail", ++}, ++[ POWER9_PME_PM_LD_REF_L1 ] = { /* 394 */ ++ .pme_name = "PM_LD_REF_L1", ++ .pme_code = 0x00000100FC, ++ .pme_short_desc = "All L1 D cache load references counted at finish, gated by reject", ++ .pme_long_desc = "All L1 D cache load references counted at finish, gated by reject", ++}, ++[ POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT ] = { /* 395 */ ++ .pme_name = "PM_TM_FAIL_NON_TX_CONFLICT", ++ .pme_code = 0x000000E0B0, ++ .pme_short_desc = "Non transactional conflict from LSU, whatever gets reported to texas", ++ .pme_long_desc = "Non transactional conflict from LSU, whatever gets reported to texas", ++}, ++[ POWER9_PME_PM_GRP_PUMP_CPRED ] = { /* 396 */ ++ .pme_name = "PM_GRP_PUMP_CPRED", ++ .pme_code = 0x0000020050, ++ .pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT ] = { /* 397 */ ++ .pme_name =
"PM_INST_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x0000014044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_DC_PREF_STRIDED_CONF ] = { /* 398 */ ++ .pme_name = "PM_DC_PREF_STRIDED_CONF", ++ .pme_code = 0x000000F0AC, ++ .pme_short_desc = "A demand load referenced a line in an active strided prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.", ++}, ++[ POWER9_PME_PM_THRD_PRIO_6_7_CYC ] = { /* 399 */ ++ .pme_name = "PM_THRD_PRIO_6_7_CYC", ++ .pme_code = 0x0000005880, ++ .pme_short_desc = "Cycles thread running at priority level 6 or 7", ++ .pme_long_desc = "Cycles thread running at priority level 6 or 7", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L3 ] = { /* 400 */ ++ .pme_name = "PM_RADIX_PWC_L4_PDE_FROM_L3", ++ .pme_code = 0x000003F05C, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++}, ++[ POWER9_PME_PM_L3_PF_OFF_CHIP_MEM ] = { /* 401 */ ++ .pme_name = "PM_L3_PF_OFF_CHIP_MEM", ++ .pme_code = 0x00000468A0, ++ .pme_short_desc = "L3 Prefetch from Off chip memory", ++ .pme_long_desc = "L3 Prefetch from Off chip memory", ++}, ++[ POWER9_PME_PM_L3_CO_MEM ] = { /* 402 */ ++ .pme_name = "PM_L3_CO_MEM", ++ .pme_code = 0x00000260A0, ++ .pme_short_desc = "L3 CO to memory OR of port 0 and 1 ( lossy)", ++ .pme_long_desc = "L3 CO to memory OR of port 0 and 1 ( lossy)", ++}, ++[ POWER9_PME_PM_DECODE_HOLD_ICT_FULL ] = { /* 403 */ ++ .pme_name = 
"PM_DECODE_HOLD_ICT_FULL", ++ .pme_code = 0x00000058A8, ++ .pme_short_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use.", ++ .pme_long_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use. This means the ICT is full for this thread", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DFLONG ] = { /* 404 */ ++ .pme_name = "PM_CMPLU_STALL_DFLONG", ++ .pme_code = 0x000001005A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Qualified by multicycle", ++}, ++[ POWER9_PME_PM_LD_MISS_L1 ] = { /* 405 */ ++ .pme_name = "PM_LD_MISS_L1", ++ .pme_code = 0x000003E054, ++ .pme_short_desc = "Load Missed L1, at execution time (not gated by finish, which means this counter can be greater than loads finished)", ++ .pme_long_desc = "Load Missed L1, at execution time (not gated by finish, which means this counter can be greater than loads finished)", ++}, ++[ POWER9_PME_PM_DATA_FROM_RL2L3_MOD ] = { /* 406 */ ++ .pme_name = "PM_DATA_FROM_RL2L3_MOD", ++ .pme_code = 0x000002C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++}, ++[ POWER9_PME_PM_L3_WI0_BUSY ] = { /* 407 */ ++ .pme_name = "PM_L3_WI0_BUSY", ++ .pme_code = 
0x00000260B6, ++ .pme_short_desc = "lifetime, sample of Write Inject machine 0 valid", ++ .pme_long_desc = "lifetime, sample of Write Inject machine 0 valid", ++}, ++[ POWER9_PME_PM_LSU_SRQ_FULL_CYC ] = { /* 408 */ ++ .pme_name = "PM_LSU_SRQ_FULL_CYC", ++ .pme_code = 0x000001001A, ++ .pme_short_desc = "Cycles in which the Store Queue is full on all 4 slices.", ++ .pme_long_desc = "Cycles in which the Store Queue is full on all 4 slices. This event is not per thread. All the threads will see the same count for this core resource", ++}, ++[ POWER9_PME_PM_TABLEWALK_CYC ] = { /* 409 */ ++ .pme_name = "PM_TABLEWALK_CYC", ++ .pme_code = 0x0000010026, ++ .pme_short_desc = "Cycles when a tablewalk (I or D) is active", ++ .pme_long_desc = "Cycles when a tablewalk (I or D) is active", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC ] = { /* 410 */ ++ .pme_name = "PM_MRK_DATA_FROM_MEMORY_CYC", ++ .pme_code = 0x000001D146, ++ .pme_short_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE ] = { /* 411 */ ++ .pme_name = "PM_IPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS ] = { /* 412 */ ++ .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3MISS", ++ .pme_code = 0x000004F056, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's
L3 data cache.", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache. The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_SYS_CALL ] = { /* 413 */ ++ .pme_name = "PM_CMPLU_STALL_SYS_CALL", ++ .pme_code = 0x000001E05A, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by a system call exception, which has to be serviced before the instruction can complete", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by a system call exception, which has to be serviced before the instruction can complete", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS ] = { /* 414 */ ++ .pme_name = "PM_LSU_FLUSH_RELAUNCH_MISS", ++ .pme_code = 0x000000C8B0, ++ .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++ .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_MOD ] = { /* 415 */ ++ .pme_name = "PM_DPTEG_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x000004E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_PMC5_OVERFLOW ] = { /* 416 */ ++ .pme_name = "PM_PMC5_OVERFLOW", ++ .pme_code = 0x0000010024, ++ .pme_short_desc = "Overflow from counter 5", ++ .pme_long_desc = "Overflow from counter 5", ++}, ++[ POWER9_PME_PM_LS1_UNALIGNED_ST ] = { /* 417 */ ++ .pme_name = "PM_LS1_UNALIGNED_ST", ++ .pme_code = 0x000000F8B8, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC ] = { /* 418 */ ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_SYNC", ++ .pme_code = 0x000004D01C, ++ .pme_short_desc = "Dispatch held due to a synchronizing instruction at dispatch", ++ .pme_long_desc = "Dispatch held due to a synchronizing instruction at dispatch", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_THRD ] = { /* 419 */ ++ .pme_name = "PM_CMPLU_STALL_THRD", ++ .pme_code = 0x000001001C, ++ .pme_short_desc = "Completion Stalled because the thread was blocked", ++ .pme_long_desc = "Completion Stalled because the thread was blocked", ++}, ++[ POWER9_PME_PM_PMC3_SAVED ] = { /* 420 */ ++ .pme_name = "PM_PMC3_SAVED", ++ .pme_code = 0x000004D012, ++ .pme_short_desc = "PMC3 Rewind Value saved", ++ .pme_long_desc = "PMC3 Rewind Value saved", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS ] = { /* 421 */ ++ .pme_name = "PM_MRK_DERAT_MISS", ++ .pme_code = 0x00000301E6, ++ .pme_short_desc = "Erat Miss (TLB Access) All page sizes", ++ .pme_long_desc = "Erat Miss (TLB Access) All page sizes", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_HIT ] = { /* 422 */ ++ .pme_name = "PM_RADIX_PWC_L3_HIT", ++ .pme_code =
0x000003F056, ++ .pme_short_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS ] = { /* 423 */ ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3MISS", ++ .pme_code = 0x000004F05C, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation. The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_RUN_CYC_SMT4_MODE ] = { /* 424 */ ++ .pme_name = "PM_RUN_CYC_SMT4_MODE", ++ .pme_code = 0x000002006C, ++ .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", ++ .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", ++}, ++[ POWER9_PME_PM_DATA_FROM_RMEM ] = { /* 425 */ ++ .pme_name = "PM_DATA_FROM_RMEM", ++ .pme_code = 0x000003C04A, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a demand load", ++}, ++[ POWER9_PME_PM_BR_MPRED_LSTACK ] = { /* 426 */ ++ .pme_name = "PM_BR_MPRED_LSTACK", ++ .pme_code = 0x00000048AC, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", ++}, ++[ POWER9_PME_PM_PROBE_NOP_DISP ] = { /* 427 */ ++ .pme_name = 
"PM_PROBE_NOP_DISP", ++ .pme_code = 0x0000040014, ++ .pme_short_desc = "ProbeNops dispatched", ++ .pme_long_desc = "ProbeNops dispatched", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_MEPF ] = { /* 428 */ ++ .pme_name = "PM_DPTEG_FROM_L3_MEPF", ++ .pme_code = 0x000002E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_INST_FROM_L3MISS_MOD ] = { /* 429 */ ++ .pme_name = "PM_INST_FROM_L3MISS_MOD", ++ .pme_code = 0x000004404E, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to an instruction fetch", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to an instruction fetch", ++}, ++[ POWER9_PME_PM_DUMMY1_REMOVE_ME ] = { /* 430 */ ++ .pme_name = "PM_DUMMY1_REMOVE_ME", ++ .pme_code = 0x0000040062, ++ .pme_short_desc = "Space holder for l2_pc_pm_mk_ldst_scope_pred_status", ++ .pme_long_desc = "Space holder for l2_pc_pm_mk_ldst_scope_pred_status", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DL4 ] = { /* 431 */ ++ .pme_name = "PM_MRK_DATA_FROM_DL4", ++ .pme_code = 0x000001D152, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { /* 432 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", ++ .pme_code = 0x000002D14A, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from
another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_1_SHR ] = { /* 433 */ ++ .pme_name = "PM_IPTEG_FROM_L3_1_SHR", ++ .pme_code = 0x0000015046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR ] = { /* 434 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x000002D14C, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_DTLB_MISS_2M ] = { /* 435 */ ++ .pme_name = "PM_DTLB_MISS_2M", ++ .pme_code = 0x000001C05C, ++ .pme_short_desc = "Data TLB reload (after a miss) page size 2M.", ++ .pme_long_desc = "Data TLB reload (after a miss) page size 2M. Implies radix translation was used", ++}, ++[ POWER9_PME_PM_TM_RST_SC ] = { /* 436 */ ++ .pme_name = "PM_TM_RST_SC", ++ .pme_code = 0x00000268A6, ++ .pme_short_desc = "tm snp rst tm sc", ++ .pme_long_desc = "tm snp rst tm sc", ++}, ++[ POWER9_PME_PM_LSU_NCST ] = { /* 437 */ ++ .pme_name = "PM_LSU_NCST", ++ .pme_code = 0x000000C890, ++ .pme_short_desc = "Asserts when an i=1 store op is sent to the nest.", ++ .pme_long_desc = "Asserts when an i=1 store op is sent to the nest. No record of issue pipe (LS0/LS1) is maintained so this is for both pipes.
Probably don't need separate LS0 and LS1", ++}, ++[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY ] = { /* 438 */ ++ .pme_name = "PM_DATA_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x000004C050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", ++}, ++[ POWER9_PME_PM_THRESH_ACC ] = { /* 439 */ ++ .pme_name = "PM_THRESH_ACC", ++ .pme_code = 0x0000024154, ++ .pme_short_desc = "This event increments every time the threshold event counter ticks.", ++ .pme_long_desc = "This event increments every time the threshold event counter ticks. Thresholding must be enabled (via MMCRA) and the thresholding start event must occur for this counter to increment. It will stop incrementing when the thresholding stop event occurs or when thresholding is disabled, until the next time a configured thresholding start event occurs.", ++}, ++[ POWER9_PME_PM_ISU3_ISS_HOLD_ALL ] = { /* 440 */ ++ .pme_name = "PM_ISU3_ISS_HOLD_ALL", ++ .pme_code = 0x0000003884, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", ++}, ++[ POWER9_PME_PM_LSU0_L1_CAM_CANCEL ] = { /* 441 */ ++ .pme_name = "PM_LSU0_L1_CAM_CANCEL", ++ .pme_code = 0x000000F090, ++ .pme_short_desc = "ls0 l1 tm cam cancel", ++ .pme_long_desc = "ls0 l1 tm cam cancel", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC ] = { /* 442 */ ++ .pme_name = "PM_MRK_FAB_RSP_BKILL_CYC", ++ .pme_code = 0x000001F152, ++ .pme_short_desc = "cycles L2 RC took for a bkill", ++ .pme_long_desc = "cycles L2 RC took for a bkill", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF ] = { /* 443 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_MEPF", ++ .pme_code = 0x000002F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from 
local core's L3 without dispatch conflicts hit on Mepf state due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DARQ_STORE_REJECT ] = { /* 444 */ ++ .pme_name = "PM_DARQ_STORE_REJECT", ++ .pme_code = 0x000004405E, ++ .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but it was rejected.", ++ .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but it was rejected. Divide by pm_darq_store_xmit to get reject ratio", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT ] = { /* 445 */ ++ .pme_name = "PM_DPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads.
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_TM_TX_PASS_RUN_CYC ] = { /* 446 */ ++ .pme_name = "PM_TM_TX_PASS_RUN_CYC", ++ .pme_code = 0x000002E012, ++ .pme_short_desc = "cycles spent in successful transactions", ++ .pme_long_desc = "cycles spent in successful transactions", ++}, ++[ POWER9_PME_PM_DTLB_MISS_4K ] = { /* 447 */ ++ .pme_name = "PM_DTLB_MISS_4K", ++ .pme_code = 0x000002C056, ++ .pme_short_desc = "Data TLB Miss page size 4k", ++ .pme_long_desc = "Data TLB Miss page size 4k", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC ] = { /* 448 */ ++ .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC", ++ .pme_code = 0x000003515A, ++ .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC ] = { /* 449 */ ++ .pme_name = "PM_LS0_PTE_TABLEWALK_CYC", ++ .pme_code = 0x000000E0BC, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 0", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 0", ++}, ++[ POWER9_PME_PM_PMC4_SAVED ] = { /* 450 */ ++ .pme_name = "PM_PMC4_SAVED", ++ .pme_code = 0x0000030022, ++ .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", ++ .pme_long_desc = "PMC4 Rewind Value saved (matched condition)", ++}, ++[ POWER9_PME_PM_SNP_TM_HIT_T ] = { /* 451 */ ++ .pme_name = "PM_SNP_TM_HIT_T", ++ .pme_code = 0x00000368A6, ++ .pme_short_desc = "snp tm_st_hit t tn te", ++ .pme_long_desc = "snp tm_st_hit t tn te", ++}, ++[ POWER9_PME_PM_MRK_BR_2PATH ] = { /* 452 */ ++ .pme_name = "PM_MRK_BR_2PATH", ++ .pme_code = 0x0000010138, ++ .pme_short_desc = "marked two path branch", ++ .pme_long_desc = "marked two path branch", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_CI ] = { /* 453 */ ++ .pme_name = "PM_LSU_FLUSH_CI", 
++ .pme_code = 0x000000C0A8, ++ .pme_short_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", ++ .pme_long_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", ++}, ++[ POWER9_PME_PM_FLUSH_MPRED ] = { /* 454 */ ++ .pme_name = "PM_FLUSH_MPRED", ++ .pme_code = 0x00000050A4, ++ .pme_short_desc = "Branch mispredict flushes.", ++ .pme_long_desc = "Branch mispredict flushes. Includes target and address misprediction", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_ST_FWD ] = { /* 455 */ ++ .pme_name = "PM_CMPLU_STALL_ST_FWD", ++ .pme_code = 0x000004C01C, ++ .pme_short_desc = "Completion stall due to store forward", ++ .pme_long_desc = "Completion stall due to store forward", ++}, ++[ POWER9_PME_PM_DTLB_MISS ] = { /* 456 */ ++ .pme_name = "PM_DTLB_MISS", ++ .pme_code = 0x00000300FC, ++ .pme_short_desc = "Data PTEG reload", ++ .pme_long_desc = "Data PTEG reload", ++}, ++[ POWER9_PME_PM_MRK_L2_TM_REQ_ABORT ] = { /* 457 */ ++ .pme_name = "PM_MRK_L2_TM_REQ_ABORT", ++ .pme_code = 0x000001E15E, ++ .pme_short_desc = "TM abort", ++ .pme_long_desc = "TM abort", ++}, ++[ POWER9_PME_PM_TM_NESTED_TEND ] = { /* 458 */ ++ .pme_name = "PM_TM_NESTED_TEND", ++ .pme_code = 0x0000002098, ++ .pme_short_desc = "Completion time nested tend", ++ .pme_long_desc = "Completion time nested tend", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_PM ] = { /* 459 */ ++ .pme_name = "PM_CMPLU_STALL_PM", ++ .pme_code = 0x000003000A, ++ .pme_short_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish. Includes permute and decimal fixpoint instructions (128 bit BCD arithmetic) + a few 128 bit fixpoint add/subtract instructions with carry.
Not qualified by vector or multicycle", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_ISYNC ] = { /* 460 */ ++ .pme_name = "PM_CMPLU_STALL_ISYNC", ++ .pme_code = 0x000003002E, ++ .pme_short_desc = "Completion stall because the ISU is checking the scoreboard for whether the isync instruction requires a flush or not", ++ .pme_long_desc = "Completion stall because the ISU is checking the scoreboard for whether the isync instruction requires a flush or not", ++}, ++[ POWER9_PME_PM_MRK_DTLB_MISS_1G ] = { /* 461 */ ++ .pme_name = "PM_MRK_DTLB_MISS_1G", ++ .pme_code = 0x000001D15C, ++ .pme_short_desc = "Marked Data TLB reload (after a miss) page size 1G.", ++ .pme_long_desc = "Marked Data TLB reload (after a miss) page size 1G. Implies radix translation was used", ++}, ++[ POWER9_PME_PM_L3_SYS_GUESS_CORRECT ] = { /* 462 */ ++ .pme_name = "PM_L3_SYS_GUESS_CORRECT", ++ .pme_code = 0x00000260B2, ++ .pme_short_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", ++ .pme_long_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", ++}, ++[ POWER9_PME_PM_L2_CASTOUT_SHR ] = { /* 463 */ ++ .pme_name = "PM_L2_CASTOUT_SHR", ++ .pme_code = 0x0000016882, ++ .pme_short_desc = "L2 Castouts - Shared (T, Te, Si, S)", ++ .pme_long_desc = "L2 Castouts - Shared (T, Te, Si, S)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 ] = { /* 464 */ ++ .pme_name = "PM_CMPLU_STALL_DMISS_L2L3", ++ .pme_code = 0x000001003C, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved in L2/L3", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved in L2/L3", ++}, ++[ POWER9_PME_PM_LS2_UNALIGNED_ST ] = { /* 465 */ ++ .pme_name = "PM_LS2_UNALIGNED_ST", ++ .pme_code = 0x000000F0BC, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose
data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS ] = { /* 466 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2MISS", ++ .pme_code = 0x000001F14E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_THRESH_EXC_32 ] = { /* 467 */ ++ .pme_name = "PM_THRESH_EXC_32", ++ .pme_code = 0x00000201E6, ++ .pme_short_desc = "Threshold counter exceeded a value of 32", ++ .pme_long_desc = "Threshold counter exceeded a value of 32", ++}, ++[ POWER9_PME_PM_TM_TSUSPEND ] = { /* 468 */ ++ .pme_name = "PM_TM_TSUSPEND", ++ .pme_code = 0x00000028A0, ++ .pme_short_desc = "TM suspend instruction completed", ++ .pme_long_desc = "TM suspend instruction completed", ++}, ++[ POWER9_PME_PM_DATA_FROM_DL2L3_SHR ] = { /* 469 */ ++ .pme_name = "PM_DATA_FROM_DL2L3_SHR", ++ .pme_code = 0x000003C048, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT ] = { /* 470 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000001D144, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch
conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC ] = { /* 471 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC", ++ .pme_code = 0x000001D142, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_THRESH_EXC_1024 ] = { /* 472 */ ++ .pme_name = "PM_THRESH_EXC_1024", ++ .pme_code = 0x00000301EA, ++ .pme_short_desc = "Threshold counter exceeded a value of 1024", ++ .pme_long_desc = "Threshold counter exceeded a value of 1024", ++}, ++[ POWER9_PME_PM_ST_FIN ] = { /* 473 */ ++ .pme_name = "PM_ST_FIN", ++ .pme_code = 0x0000020016, ++ .pme_short_desc = "Store finish count.", ++ .pme_long_desc = "Store finish count. 
Includes speculative activity", ++}, ++[ POWER9_PME_PM_TM_LD_CAUSED_FAIL ] = { /* 474 */ ++ .pme_name = "PM_TM_LD_CAUSED_FAIL", ++ .pme_code = 0x000001688C, ++ .pme_short_desc = "Non TM Ld caused any thread to fail", ++ .pme_long_desc = "Non TM Ld caused any thread to fail", ++}, ++[ POWER9_PME_PM_SRQ_SYNC_CYC ] = { /* 475 */ ++ .pme_name = "PM_SRQ_SYNC_CYC", ++ .pme_code = 0x000000D0AC, ++ .pme_short_desc = "A sync is in the S2Q (edge detect to count)", ++ .pme_long_desc = "A sync is in the S2Q (edge detect to count)", ++}, ++[ POWER9_PME_PM_IFETCH_THROTTLE ] = { /* 476 */ ++ .pme_name = "PM_IFETCH_THROTTLE", ++ .pme_code = 0x000003405E, ++ .pme_short_desc = "Cycles in which Instruction fetch throttle was active.", ++ .pme_long_desc = "Cycles in which Instruction fetch throttle was active.", ++}, ++[ POWER9_PME_PM_L3_SW_PREF ] = { /* 477 */ ++ .pme_name = "PM_L3_SW_PREF", ++ .pme_code = 0x000000F8B0, ++ .pme_short_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", ++ .pme_long_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", ++}, ++[ POWER9_PME_PM_LSU0_LDMX_FIN ] = { /* 478 */ ++ .pme_name = "PM_LSU0_LDMX_FIN", ++ .pme_code = 0x000000D088, ++ .pme_short_desc = " New P9 instruction LDMX.", ++ .pme_long_desc = " New P9 instruction LDMX.", ++}, ++[ POWER9_PME_PM_L2_LOC_GUESS_WRONG ] = { /* 479 */ ++ .pme_name = "PM_L2_LOC_GUESS_WRONG", ++ .pme_code = 0x0000016888, ++ .pme_short_desc = "L2 guess loc and guess was not correct (ie data not on chip)", ++ .pme_long_desc = "L2 guess loc and guess was not correct (ie data not on chip)", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC ] = { /* 480 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC", ++ .pme_code = 0x0000014158, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 without conflict due to a 
marked load", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE ] = { /* 481 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_L3_P1_CO_RTY ] = { /* 482 */ ++ .pme_name = "PM_L3_P1_CO_RTY", ++ .pme_code = 0x00000468AE, ++ .pme_short_desc = "L3 CO received retry port 3", ++ .pme_long_desc = "L3 CO received retry port 3", ++}, ++[ POWER9_PME_PM_MRK_STCX_FAIL ] = { /* 483 */ ++ .pme_name = "PM_MRK_STCX_FAIL", ++ .pme_code = 0x000003E158, ++ .pme_short_desc = "marked stcx failed", ++ .pme_long_desc = "marked stcx failed", ++}, ++[ POWER9_PME_PM_LARX_FIN ] = { /* 484 */ ++ .pme_name = "PM_LARX_FIN", ++ .pme_code = 0x000003C058, ++ .pme_short_desc = "Larx finished", ++ .pme_long_desc = "Larx finished", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 ] = { /* 485 */ ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3", ++ .pme_code = 0x000004F058, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache. 
This implies that level 3 and level 4 PWC accesses were not necessary for this translation", ++}, ++[ POWER9_PME_PM_LSU3_L1_CAM_CANCEL ] = { /* 486 */ ++ .pme_name = "PM_LSU3_L1_CAM_CANCEL", ++ .pme_code = 0x000000F894, ++ .pme_short_desc = "ls3 l1 tm cam cancel", ++ .pme_long_desc = "ls3 l1 tm cam cancel", ++}, ++[ POWER9_PME_PM_IC_PREF_CANCEL_HIT ] = { /* 487 */ ++ .pme_name = "PM_IC_PREF_CANCEL_HIT", ++ .pme_code = 0x0000004890, ++ .pme_short_desc = "Prefetch Canceled due to icache hit", ++ .pme_long_desc = "Prefetch Canceled due to icache hit", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_EIEIO ] = { /* 488 */ ++ .pme_name = "PM_CMPLU_STALL_EIEIO", ++ .pme_code = 0x000004D01A, ++ .pme_short_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_VDP ] = { /* 489 */ ++ .pme_name = "PM_CMPLU_STALL_VDP", ++ .pme_code = 0x000004405C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. Qualified by vector", ++}, ++[ POWER9_PME_PM_DERAT_MISS_1G ] = { /* 490 */ ++ .pme_name = "PM_DERAT_MISS_1G", ++ .pme_code = 0x000002C05A, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 1G.", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", ++}, ++[ POWER9_PME_PM_DATA_PUMP_CPRED ] = { /* 491 */ ++ .pme_name = "PM_DATA_PUMP_CPRED", ++ .pme_code = 0x000001C054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for a demand load", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2_MEPF ] = { /* 492 */ ++ .pme_name = "PM_DPTEG_FROM_L2_MEPF", ++ .pme_code = 0x000002E040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_BR_MPRED_TAKEN_CR ] = { /* 493 */ ++ .pme_name = "PM_BR_MPRED_TAKEN_CR", ++ .pme_code = 0x00000040B8, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).", ++}, ++[ POWER9_PME_PM_MRK_BRU_FIN ] = { /* 494 */ ++ .pme_name = "PM_MRK_BRU_FIN", ++ .pme_code = 0x000002013A, ++ .pme_short_desc = "bru marked instr finish", ++ .pme_long_desc = "bru marked instr finish", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL4 ] = { /* 495 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_DL4", ++ .pme_code = 0x000003F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_SHL_ST_DEP_CREATED ] = { /* 496 */ ++ .pme_name = "PM_SHL_ST_DEP_CREATED", ++ .pme_code = 0x000000588C, ++ .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Enabled", ++ .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Enabled", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3_1_SHR ] = { /* 497 */ ++ .pme_name = "PM_DPTEG_FROM_L3_1_SHR", ++ .pme_code = 0x000001E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DATA_FROM_RL4 ] = { /* 498 */ ++ .pme_name = "PM_DATA_FROM_RL4", ++ .pme_code = 0x000002C04A, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", ++}, ++[ POWER9_PME_PM_XLATE_MISS ] = { /* 499 */ ++ .pme_name = "PM_XLATE_MISS", ++ .pme_code = 0x000000F89C, ++ .pme_short_desc = "The LSU requested a line from L2 for translation.", ++ .pme_long_desc = "The LSU requested a line from L2 for translation. It may be satisfied from any source beyond L2. 
Includes speculative instructions", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_SRQ_FULL ] = { /* 500 */ ++ .pme_name = "PM_CMPLU_STALL_SRQ_FULL", ++ .pme_code = 0x0000030016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", ++}, ++[ POWER9_PME_PM_SN0_BUSY ] = { /* 501 */ ++ .pme_name = "PM_SN0_BUSY", ++ .pme_code = 0x0000026090, ++ .pme_short_desc = "SN mach 0 Busy.", ++ .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN ] = { /* 502 */ ++ .pme_name = "PM_CMPLU_STALL_NESTED_TBEGIN", ++ .pme_code = 0x000001E05C, ++ .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin.", ++ .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. This is a short delay, and it includes ROT", ++}, ++[ POWER9_PME_PM_ST_CMPL ] = { /* 503 */ ++ .pme_name = "PM_ST_CMPL", ++ .pme_code = 0x00000200F0, ++ .pme_short_desc = "Store Instructions Completed", ++ .pme_long_desc = "Store Instructions Completed", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR ] = { /* 504 */ ++ .pme_name = "PM_DPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x000003E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DECODE_FUSION_CONST_GEN ] = { /* 505 */ ++ .pme_name = "PM_DECODE_FUSION_CONST_GEN", ++ .pme_code = 0x00000048B4, ++ .pme_short_desc = "32-bit constant generation", ++ .pme_long_desc = "32-bit constant generation", ++}, ++[ POWER9_PME_PM_L2_LOC_GUESS_CORRECT ] = { /* 506 */ ++ .pme_name = "PM_L2_LOC_GUESS_CORRECT", ++ .pme_code = 0x0000016088, ++ .pme_short_desc = "L2 guess loc and guess was correct (ie data local)", ++ .pme_long_desc = "L2 guess loc and guess was correct (ie data local)", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_1_ECO_SHR ] = { /* 507 */ ++ .pme_name = "PM_INST_FROM_L3_1_ECO_SHR", ++ .pme_code = 0x0000034044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_XLATE_HPT_MODE ] = { /* 508 */ ++ .pme_name = "PM_XLATE_HPT_MODE", ++ .pme_code = 0x000000F098, ++ .pme_short_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", ++ .pme_long_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_FIN ] = { /* 509 */ ++ .pme_name = "PM_CMPLU_STALL_LSU_FIN", ++ .pme_code = 0x000001003A, ++ .pme_short_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", ++}, ++[ POWER9_PME_PM_THRESH_EXC_64 ] = { /* 510 */ ++ .pme_name = "PM_THRESH_EXC_64", ++ .pme_code = 
0x00000301E8, ++ .pme_short_desc = "Threshold counter exceeded a value of 64", ++ .pme_long_desc = "Threshold counter exceeded a value of 64", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC ] = { /* 511 */ ++ .pme_name = "PM_MRK_DATA_FROM_DL4_CYC", ++ .pme_code = 0x000002C12C, ++ .pme_short_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++}, ++[ POWER9_PME_PM_DARQ_STORE_XMIT ] = { /* 512 */ ++ .pme_name = "PM_DARQ_STORE_XMIT", ++ .pme_code = 0x0000030064, ++ .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry.", ++ .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry. Includes rejects. Not qualified by thread, so it includes counts for the whole core", ++}, ++[ POWER9_PME_PM_DATA_TABLEWALK_CYC ] = { /* 513 */ ++ .pme_name = "PM_DATA_TABLEWALK_CYC", ++ .pme_code = 0x000003001A, ++ .pme_short_desc = "Tablwalk Cycles (could be 1 or 2 active tablewalks)", ++ .pme_long_desc = "Tablwalk Cycles (could be 1 or 2 active tablewalks)", ++}, ++[ POWER9_PME_PM_L2_RC_ST_DONE ] = { /* 514 */ ++ .pme_name = "PM_L2_RC_ST_DONE", ++ .pme_code = 0x0000036086, ++ .pme_short_desc = "RC did st to line that was Tx or Sx", ++ .pme_long_desc = "RC did st to line that was Tx or Sx", ++}, ++[ POWER9_PME_PM_TMA_REQ_L2 ] = { /* 515 */ ++ .pme_name = "PM_TMA_REQ_L2", ++ .pme_code = 0x000000E0A4, ++ .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", ++ .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", ++}, ++[ POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE ] = { /* 516 */ ++ .pme_name = "PM_INST_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x0000014048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded either 
shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_SLB_TABLEWALK_CYC ] = { /* 517 */ ++ .pme_name = "PM_SLB_TABLEWALK_CYC", ++ .pme_code = 0x000000F09C, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RMEM ] = { /* 518 */ ++ .pme_name = "PM_MRK_DATA_FROM_RMEM", ++ .pme_code = 0x000001D148, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++}, ++[ POWER9_PME_PM_L3_PF_MISS_L3 ] = { /* 519 */ ++ .pme_name = "PM_L3_PF_MISS_L3", ++ .pme_code = 0x00000160A0, ++ .pme_short_desc = "L3 Prefetch missed in L3", ++ .pme_long_desc = "L3 Prefetch missed in L3", ++}, ++[ POWER9_PME_PM_L3_CI_MISS ] = { /* 520 */ ++ .pme_name = "PM_L3_CI_MISS", ++ .pme_code = 0x00000268A2, ++ .pme_short_desc = "L3 castins miss (total count", ++ .pme_long_desc = "L3 castins miss (total count", ++}, ++[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { /* 521 */ ++ .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", ++ .pme_code = 0x0000016884, ++ .pme_short_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", ++ .pme_long_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", ++}, ++[ POWER9_PME_PM_DERAT_MISS_4K ] = { /* 522 */ ++ .pme_name = "PM_DERAT_MISS_4K", ++ .pme_code = 0x000001C056, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 4K", ++ .pme_long_desc = 
"Data ERAT Miss (Data TLB Access) page size 4K", ++}, ++[ POWER9_PME_PM_ISIDE_MRU_TOUCH ] = { /* 523 */ ++ .pme_name = "PM_ISIDE_MRU_TOUCH", ++ .pme_code = 0x0000046880, ++ .pme_short_desc = "Iside L2 MRU touch", ++ .pme_long_desc = "Iside L2 MRU touch", ++}, ++[ POWER9_PME_PM_MRK_RUN_CYC ] = { /* 524 */ ++ .pme_name = "PM_MRK_RUN_CYC", ++ .pme_code = 0x000001D15E, ++ .pme_short_desc = "Run cycles in which a marked instruction is in the pipeline", ++ .pme_long_desc = "Run cycles in which a marked instruction is in the pipeline", ++}, ++[ POWER9_PME_PM_L3_P0_CO_RTY ] = { /* 525 */ ++ .pme_name = "PM_L3_P0_CO_RTY", ++ .pme_code = 0x00000460AE, ++ .pme_short_desc = "L3 CO received retry port 2", ++ .pme_long_desc = "L3 CO received retry port 2", ++}, ++[ POWER9_PME_PM_BR_MPRED_CMPL ] = { /* 526 */ ++ .pme_name = "PM_BR_MPRED_CMPL", ++ .pme_code = 0x00000400F6, ++ .pme_short_desc = "Number of Branch Mispredicts", ++ .pme_long_desc = "Number of Branch Mispredicts", ++}, ++[ POWER9_PME_PM_BR_MPRED_TAKEN_TA ] = { /* 527 */ ++ .pme_name = "PM_BR_MPRED_TAKEN_TA", ++ .pme_code = 0x00000048B8, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack.", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. 
Only XL-form branches that resolved Taken set this event.", ++}, ++[ POWER9_PME_PM_DISP_HELD_TBEGIN ] = { /* 528 */ ++ .pme_name = "PM_DISP_HELD_TBEGIN", ++ .pme_code = 0x00000028B0, ++ .pme_short_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", ++ .pme_long_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD ] = { /* 529 */ ++ .pme_name = "PM_DPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x000002E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_FLUSH_DISP_SB ] = { /* 530 */ ++ .pme_name = "PM_FLUSH_DISP_SB", ++ .pme_code = 0x0000002088, ++ .pme_short_desc = "Dispatch Flush: Scoreboard", ++ .pme_long_desc = "Dispatch Flush: Scoreboard", ++}, ++[ POWER9_PME_PM_L2_CHIP_PUMP ] = { /* 531 */ ++ .pme_name = "PM_L2_CHIP_PUMP", ++ .pme_code = 0x0000046088, ++ .pme_short_desc = "RC requests that were local on chip pump attempts", ++ .pme_long_desc = "RC requests that were local on chip pump attempts", ++}, ++[ POWER9_PME_PM_L2_DC_INV ] = { /* 532 */ ++ .pme_name = "PM_L2_DC_INV", ++ .pme_code = 0x0000026882, ++ .pme_short_desc = "Dcache invalidates from L2", ++ .pme_long_desc = "Dcache invalidates from L2", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC ] = { /* 533 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC", ++ .pme_code = 0x000001415A, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_SHR ] = { /* 534 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_1_SHR", ++ .pme_code = 0x000001F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_2M ] = { /* 535 */ ++ .pme_name = "PM_MRK_DERAT_MISS_2M", ++ .pme_code = 0x000002D152, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M.", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", ++}, ++[ POWER9_PME_PM_MRK_ST_DONE_L2 ] = { /* 536 */ ++ .pme_name = "PM_MRK_ST_DONE_L2", ++ .pme_code = 0x0000010134, ++ .pme_short_desc = "marked store completed in L2 ( RC machine done)", ++ .pme_long_desc = "marked store completed in L2 ( RC machine done)", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD ] = { /* 537 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x000004D144, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_RMEM ] = { /* 538 */ ++ .pme_name = "PM_IPTEG_FROM_RMEM", ++ .pme_code = 0x000003504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_EMSH ] = { /* 539 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_EMSH", ++ .pme_code = 0x000000D898, ++ .pme_short_desc = "An ERAT miss was detected after a set-p hit.", ++ .pme_long_desc = "An ERAT miss was detected after a set-p hit. 
Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", ++}, ++[ POWER9_PME_PM_BR_PRED_LSTACK ] = { /* 540 */ ++ .pme_name = "PM_BR_PRED_LSTACK", ++ .pme_code = 0x00000040A8, ++ .pme_short_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", ++}, ++[ POWER9_PME_PM_L3_P0_CO_MEM ] = { /* 541 */ ++ .pme_name = "PM_L3_P0_CO_MEM", ++ .pme_code = 0x00000360AA, ++ .pme_short_desc = "l3 CO to memory port 0", ++ .pme_long_desc = "l3 CO to memory port 0", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2_MEPF ] = { /* 542 */ ++ .pme_name = "PM_IPTEG_FROM_L2_MEPF", ++ .pme_code = 0x0000025040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a instruction side request", ++}, ++[ POWER9_PME_PM_LS0_ERAT_MISS_PREF ] = { /* 543 */ ++ .pme_name = "PM_LS0_ERAT_MISS_PREF", ++ .pme_code = 0x000000E084, ++ .pme_short_desc = "LS0 Erat miss due to prefetch", ++ .pme_long_desc = "LS0 Erat miss due to prefetch", ++}, ++[ POWER9_PME_PM_RD_HIT_PF ] = { /* 544 */ ++ .pme_name = "PM_RD_HIT_PF", ++ .pme_code = 0x00000268A8, ++ .pme_short_desc = "rd machine hit l3 pf machine", ++ .pme_long_desc = "rd machine hit l3 pf machine", ++}, ++[ POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP ] = { /* 545 */ ++ .pme_name = "PM_DECODE_FUSION_LD_ST_DISP", ++ .pme_code = 0x00000048A8, ++ .pme_short_desc = "32-bit displacement D-form and 16-bit displacement X-form", ++ .pme_long_desc = "32-bit displacement D-form and 16-bit displacement X-form", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN ] = { /* 546 */ ++ .pme_name = "PM_CMPLU_STALL_NTC_DISP_FIN", ++ .pme_code = 0x000004E018, ++ .pme_short_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", ++ .pme_long_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_CYC ] = { /* 547 */ ++ .pme_name = "PM_ICT_NOSLOT_CYC", ++ .pme_code = 0x00000100F8, ++ .pme_short_desc = "Number of cycles the ICT has no itags assigned to this thread", ++ .pme_long_desc = "Number of cycles the ICT has no itags assigned to this thread", ++}, ++[ POWER9_PME_PM_DERAT_MISS_16M ] = { /* 548 */ ++ .pme_name = "PM_DERAT_MISS_16M", ++ .pme_code = 0x000003C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16M", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16M", ++}, ++[ POWER9_PME_PM_IC_MISS_ICBI ] = { /* 549 */ ++ .pme_name = "PM_IC_MISS_ICBI", ++ .pme_code = 0x0000005094, ++ .pme_short_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on.", ++ .pme_long_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were 
on. ICBI took line out", ++}, ++[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC ] = { /* 550 */ ++ .pme_name = "PM_TAGE_OVERRIDE_WRONG_SPEC", ++ .pme_code = 0x00000058B8, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN ] = { /* 551 */ ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_TBEGIN", ++ .pme_code = 0x0000010064, ++ .pme_short_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", ++ .pme_long_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", ++}, ++[ POWER9_PME_PM_MRK_BR_TAKEN_CMPL ] = { /* 552 */ ++ .pme_name = "PM_MRK_BR_TAKEN_CMPL", ++ .pme_code = 0x00000101E2, ++ .pme_short_desc = "Marked Branch Taken completed", ++ .pme_long_desc = "Marked Branch Taken completed", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_VFXU ] = { /* 553 */ ++ .pme_name = "PM_CMPLU_STALL_VFXU", ++ .pme_code = 0x000003C05C, ++ .pme_short_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline.", ++ .pme_long_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline. 
These instructions get routed to the ALU, ALU2, and DIV pipes", ++}, ++[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY ] = { /* 554 */ ++ .pme_name = "PM_DATA_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x000001C052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++}, ++[ POWER9_PME_PM_INST_FROM_L3 ] = { /* 555 */ ++ .pme_name = "PM_INST_FROM_L3", ++ .pme_code = 0x0000044042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_ITLB_MISS ] = { /* 556 */ ++ .pme_name = "PM_ITLB_MISS", ++ .pme_code = 0x00000400FC, ++ .pme_short_desc = "ITLB Reloaded (always zero on POWER6)", ++ .pme_long_desc = "ITLB Reloaded (always zero on POWER6)", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD ] = { /* 557 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x000002F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_LSU2_TM_L1_MISS ] = { /* 558 */ ++ .pme_name = "PM_LSU2_TM_L1_MISS", ++ .pme_code = 0x000000E0A0, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", ++}, ++[ POWER9_PME_PM_L3_WI_USAGE ] = { /* 559 */ ++ .pme_name = "PM_L3_WI_USAGE", ++ .pme_code = 0x00000168A8, ++ .pme_short_desc = "rotating sample of 8 WI actives", ++ .pme_long_desc = "rotating sample of 8 WI actives", ++}, ++[ POWER9_PME_PM_L2_SN_M_WR_DONE ] = { /* 560 */ ++ .pme_name = "PM_L2_SN_M_WR_DONE", ++ .pme_code = 0x0000046886, ++ .pme_short_desc = "SNP dispatched for a write and was M", ++ .pme_long_desc = "SNP dispatched for a write and was M", ++}, ++[ POWER9_PME_PM_DISP_HELD_SYNC_HOLD ] = { /* 561 */ ++ .pme_name = "PM_DISP_HELD_SYNC_HOLD", ++ .pme_code = 0x000004003C, ++ .pme_short_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", ++ .pme_long_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_SHR ] = { /* 562 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_1_SHR", ++ .pme_code = 0x000003F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MEM_PREF ] = { /* 563 */ ++ .pme_name = "PM_MEM_PREF", ++ .pme_code = 0x000002C058, ++ .pme_short_desc = "Memory prefetch for this thread.", ++ .pme_long_desc = "Memory prefetch for this thread. 
Includes L4", ++}, ++[ POWER9_PME_PM_L2_SN_M_RD_DONE ] = { /* 564 */ ++ .pme_name = "PM_L2_SN_M_RD_DONE", ++ .pme_code = 0x0000046086, ++ .pme_short_desc = "SNP dispatched for a read and was M", ++ .pme_long_desc = "SNP dispatched for a read and was M", ++}, ++[ POWER9_PME_PM_LS0_UNALIGNED_ST ] = { /* 565 */ ++ .pme_name = "PM_LS0_UNALIGNED_ST", ++ .pme_code = 0x000000F0B8, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_DC_PREF_CONS_ALLOC ] = { /* 566 */ ++ .pme_name = "PM_DC_PREF_CONS_ALLOC", ++ .pme_code = 0x000000F0B4, ++ .pme_short_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", ++ .pme_long_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_16G ] = { /* 567 */ ++ .pme_name = "PM_MRK_DERAT_MISS_16G", ++ .pme_code = 0x000004C15C, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2 ] = { /* 568 */ ++ .pme_name = "PM_IPTEG_FROM_L2", ++ .pme_code = 0x0000015042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", ++}, ++[ POWER9_PME_PM_ANY_THRD_RUN_CYC ] = { /* 569 */ ++ .pme_name = 
"PM_ANY_THRD_RUN_CYC", ++ .pme_code = 0x00000100FA, ++ .pme_short_desc = "Cycles in which at least one thread has the run latch set", ++ .pme_long_desc = "Cycles in which at least one thread has the run latch set", ++}, ++[ POWER9_PME_PM_MRK_PROBE_NOP_CMPL ] = { /* 570 */ ++ .pme_name = "PM_MRK_PROBE_NOP_CMPL", ++ .pme_code = 0x000001F05E, ++ .pme_short_desc = "Marked probeNops completed", ++ .pme_long_desc = "Marked probeNops completed", ++}, ++[ POWER9_PME_PM_BANK_CONFLICT ] = { /* 571 */ ++ .pme_name = "PM_BANK_CONFLICT", ++ .pme_code = 0x0000004880, ++ .pme_short_desc = "Read blocked due to interleave conflict.", ++ .pme_long_desc = "Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.", ++}, ++[ POWER9_PME_PM_INST_SYS_PUMP_MPRED ] = { /* 572 */ ++ .pme_name = "PM_INST_SYS_PUMP_MPRED", ++ .pme_code = 0x0000034052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for an instruction fetch", ++}, ++[ POWER9_PME_PM_NON_DATA_STORE ] = { /* 573 */ ++ .pme_name = "PM_NON_DATA_STORE", ++ .pme_code = 0x000000F8A0, ++ .pme_short_desc = "All ops that drain from s2q to L2 and contain no data", ++ .pme_long_desc = "All ops that drain from s2q to L2 and contain no data", ++}, ++[ POWER9_PME_PM_DC_PREF_CONF ] = { /* 574 */ ++ .pme_name = "PM_DC_PREF_CONF", ++ .pme_code = 0x000000F0A8, ++ .pme_short_desc = "A demand load referenced a line in an active prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. 
Includes forwards and backwards streams", ++}, ++[ POWER9_PME_PM_BTAC_BAD_RESULT ] = { /* 575 */ ++ .pme_name = "PM_BTAC_BAD_RESULT", ++ .pme_code = 0x00000050B0, ++ .pme_short_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common).", ++ .pme_long_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common). In both cases, a redirect will happen", ++}, ++[ POWER9_PME_PM_LSU_LMQ_FULL_CYC ] = { /* 576 */ ++ .pme_name = "PM_LSU_LMQ_FULL_CYC", ++ .pme_code = 0x000000D0B8, ++ .pme_short_desc = "Counts the number of cycles the LMQ is full", ++ .pme_long_desc = "Counts the number of cycles the LMQ is full", ++}, ++[ POWER9_PME_PM_NON_MATH_FLOP_CMPL ] = { /* 577 */ ++ .pme_name = "PM_NON_MATH_FLOP_CMPL", ++ .pme_code = 0x000004D05A, ++ .pme_short_desc = "Non-math flop instruction completed", ++ .pme_long_desc = "Non-math flop instruction completed", ++}, ++[ POWER9_PME_PM_MRK_LD_MISS_L1_CYC ] = { /* 578 */ ++ .pme_name = "PM_MRK_LD_MISS_L1_CYC", ++ .pme_code = 0x000001D056, ++ .pme_short_desc = "Marked ld latency", ++ .pme_long_desc = "Marked ld latency", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_CYC ] = { /* 579 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_CYC", ++ .pme_code = 0x0000014156, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 due to a marked load", ++}, ++[ POWER9_PME_PM_FXU_1PLUS_BUSY ] = { /* 580 */ ++ .pme_name = "PM_FXU_1PLUS_BUSY", ++ .pme_code = 0x000003000E, ++ .pme_short_desc = "At least one of the 4 FXU units is busy", ++ .pme_long_desc = "At least one of the 4 FXU units is busy", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DP ] = { /* 581 */ ++ .pme_name = "PM_CMPLU_STALL_DP", ++ .pme_code = 0x000001005C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar instruction issued 
to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. Qualified by NOT vector", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD_CYC ] = { /* 582 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_MOD_CYC", ++ .pme_code = 0x000001D140, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_SYNC_MRK_L2HIT ] = { /* 583 */ ++ .pme_name = "PM_SYNC_MRK_L2HIT", ++ .pme_code = 0x0000015158, ++ .pme_short_desc = "Marked L2 Hits that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L2 Hits that can throw a synchronous interrupt", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { /* 584 */ ++ .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", ++ .pme_code = 0x000002C12A, ++ .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++}, ++[ POWER9_PME_PM_ISU1_ISS_HOLD_ALL ] = { /* 585 */ ++ .pme_name = "PM_ISU1_ISS_HOLD_ALL", ++ .pme_code = 0x0000003084, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT ] = { /* 586 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table 
Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY ] = { /* 587 */ ++ .pme_name = "PM_MRK_FAB_RSP_RWITM_RTY", ++ .pme_code = 0x000002015E, ++ .pme_short_desc = "Sampled store did a rwitm and got a rty", ++ .pme_long_desc = "Sampled store did a rwitm and got a rty", ++}, ++[ POWER9_PME_PM_L3_P3_LCO_RTY ] = { /* 588 */ ++ .pme_name = "PM_L3_P3_LCO_RTY", ++ .pme_code = 0x00000268B4, ++ .pme_short_desc = "L3 lateral cast out received retry on port 3", ++ .pme_long_desc = "L3 lateral cast out received retry on port 3", ++}, ++[ POWER9_PME_PM_PUMP_CPRED ] = { /* 589 */ ++ .pme_name = "PM_PUMP_CPRED", ++ .pme_code = 0x0000010054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_LS3_TM_DISALLOW ] = { /* 590 */ ++ .pme_name = "PM_LS3_TM_DISALLOW", ++ .pme_code = 0x000000E8B8, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++}, ++[ POWER9_PME_PM_SN_INVL ] = { /* 591 */ ++ .pme_name = "PM_SN_INVL", ++ .pme_code = 0x00000368A8, ++ .pme_short_desc = "Any port snooper detects a store to a line that's in the Sx state and invalidates the line.", ++ .pme_long_desc = "Any port snooper detects a store to a line that's in the Sx state and invalidates the line. 
Up to 4 can happen in a cycle but we only count 1", ++}, ++[ POWER9_PME_PM_TM_LD_CONF ] = { /* 592 */ ++ .pme_name = "PM_TM_LD_CONF", ++ .pme_code = 0x000002608C, ++ .pme_short_desc = "TM Load (fav or non-fav) ran into conflict (failed)", ++ .pme_long_desc = "TM Load (fav or non-fav) ran into conflict (failed)", ++}, ++[ POWER9_PME_PM_LD_MISS_L1_FIN ] = { /* 593 */ ++ .pme_name = "PM_LD_MISS_L1_FIN", ++ .pme_code = 0x000002C04E, ++ .pme_short_desc = "Number of load instructions that finished with an L1 miss.", ++ .pme_long_desc = "Number of load instructions that finished with an L1 miss. Note that even if a load spans multiple slices this event will increment only once per load op.", ++}, ++[ POWER9_PME_PM_SYNC_MRK_PROBE_NOP ] = { /* 594 */ ++ .pme_name = "PM_SYNC_MRK_PROBE_NOP", ++ .pme_code = 0x0000015150, ++ .pme_short_desc = "Marked probeNops which can cause synchronous interrupts", ++ .pme_long_desc = "Marked probeNops which can cause synchronous interrupts", ++}, ++[ POWER9_PME_PM_RUN_CYC ] = { /* 595 */ ++ .pme_name = "PM_RUN_CYC", ++ .pme_code = 0x00000200F4, ++ .pme_short_desc = "Run_cycles", ++ .pme_long_desc = "Run_cycles", ++}, ++[ POWER9_PME_PM_SYS_PUMP_MPRED ] = { /* 596 */ ++ .pme_name = "PM_SYS_PUMP_MPRED", ++ .pme_code = 0x0000030052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE ] = { /* 597 */ ++ .pme_name = "PM_DATA_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004C04A, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", ++}, ++[ POWER9_PME_PM_TM_NESTED_TBEGIN ] = { /* 598 */ ++ .pme_name = "PM_TM_NESTED_TBEGIN", ++ .pme_code = 0x00000020A0, ++ .pme_short_desc = "Completion Tm nested tbegin", ++ .pme_long_desc = "Completion Tm nested tbegin", ++}, ++[ POWER9_PME_PM_FLUSH_COMPLETION ] = { /* 599 */ ++ .pme_name = "PM_FLUSH_COMPLETION", ++ .pme_code = 0x0000030012, ++ .pme_short_desc = "The instruction that was next to complete did not complete because it suffered a flush", ++ .pme_long_desc = "The instruction that was next to complete did not complete because it suffered a flush", ++}, ++[ POWER9_PME_PM_ST_MISS_L1 ] = { /* 600 */ ++ .pme_name = "PM_ST_MISS_L1", ++ .pme_code = 0x00000300F0, ++ .pme_short_desc = "Store Missed L1", ++ .pme_long_desc = "Store Missed L1", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2MISS ] = { /* 601 */ ++ .pme_name = "PM_IPTEG_FROM_L2MISS", ++ .pme_code = 0x000001504E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request", ++}, ++[ POWER9_PME_PM_LSU3_TM_L1_MISS ] = { /* 602 */ ++ .pme_name = "PM_LSU3_TM_L1_MISS", ++ .pme_code = 0x000000E8A0, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", ++}, ++[ 
POWER9_PME_PM_L3_CO ] = { /* 603 */ ++ .pme_name = "PM_L3_CO", ++ .pme_code = 0x00000360A8, ++ .pme_short_desc = "l3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", ++ .pme_long_desc = "l3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", ++}, ++[ POWER9_PME_PM_MRK_STALL_CMPLU_CYC ] = { /* 604 */ ++ .pme_name = "PM_MRK_STALL_CMPLU_CYC", ++ .pme_code = 0x000003013E, ++ .pme_short_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", ++ .pme_long_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", ++}, ++[ POWER9_PME_PM_INST_FROM_DL2L3_SHR ] = { /* 605 */ ++ .pme_name = "PM_INST_FROM_DL2L3_SHR", ++ .pme_code = 0x0000034048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_SCALAR_FLOP_CMPL ] = { /* 606 */ ++ .pme_name = "PM_SCALAR_FLOP_CMPL", ++ .pme_code = 0x0000010130, ++ .pme_short_desc = "Scalar flop events", ++ .pme_long_desc = "Scalar flop events", ++}, ++[ POWER9_PME_PM_LRQ_REJECT ] = { /* 607 */ ++ .pme_name = "PM_LRQ_REJECT", ++ .pme_code = 0x000002E05A, ++ .pme_short_desc = "Internal LSU reject from LRQ.", ++ .pme_long_desc = "Internal LSU reject from LRQ. Rejects cause the load to go back to LRQ, but it stays contained within the LSU once it gets issued. This event counts the number of times the LRQ attempts to relaunch an instruction after a reject. 
Any load can suffer multiple rejects", ++}, ++[ POWER9_PME_PM_4FLOP_CMPL ] = { /* 608 */ ++ .pme_name = "PM_4FLOP_CMPL", ++ .pme_code = 0x000001000E, ++ .pme_short_desc = "four flop events", ++ .pme_long_desc = "four flop events", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RMEM ] = { /* 609 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_RMEM", ++ .pme_code = 0x000003F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_LD_CMPL ] = { /* 610 */ ++ .pme_name = "PM_LD_CMPL", ++ .pme_code = 0x000004003E, ++ .pme_short_desc = "count of Loads completed", ++ .pme_long_desc = "count of Loads completed", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_MEPF ] = { /* 611 */ ++ .pme_name = "PM_DATA_FROM_L3_MEPF", ++ .pme_code = 0x000002C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", ++}, ++[ POWER9_PME_PM_L1PF_L2MEMACC ] = { /* 612 */ ++ .pme_name = "PM_L1PF_L2MEMACC", ++ .pme_code = 0x0000016890, ++ .pme_short_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", ++ .pme_long_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", ++}, ++[ POWER9_PME_PM_INST_FROM_L3MISS ] = { /* 613 */ ++ .pme_name = "PM_INST_FROM_L3MISS", ++ .pme_code = 0x00000300FA, ++ .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++ .pme_long_desc = "Marked 
instruction was reloaded from a location beyond the local chiplet", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LHS ] = { /* 614 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_LHS", ++ .pme_code = 0x000000D0A0, ++ .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", ++ .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", ++}, ++[ POWER9_PME_PM_EE_OFF_EXT_INT ] = { /* 615 */ ++ .pme_name = "PM_EE_OFF_EXT_INT", ++ .pme_code = 0x0000002080, ++ .pme_short_desc = "Cycles MSR[EE] is off and external interrupts are active", ++ .pme_long_desc = "Cycles MSR[EE] is off and external interrupts are active", ++}, ++[ POWER9_PME_PM_TM_ST_CONF ] = { /* 616 */ ++ .pme_name = "PM_TM_ST_CONF", ++ .pme_code = 0x000003608C, ++ .pme_short_desc = "TM Store (fav or non-fav) ran into conflict (failed)", ++ .pme_long_desc = "TM Store (fav or non-fav) ran into conflict (failed)", ++}, ++[ POWER9_PME_PM_PMC6_OVERFLOW ] = { /* 617 */ ++ .pme_name = "PM_PMC6_OVERFLOW", ++ .pme_code = 0x0000030024, ++ .pme_short_desc = "Overflow from counter 6", ++ .pme_long_desc = "Overflow from counter 6", ++}, ++[ POWER9_PME_PM_INST_FROM_DL2L3_MOD ] = { /* 618 */ ++ .pme_name = "PM_INST_FROM_DL2L3_MOD", ++ .pme_code = 0x0000044048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_INST_CMPL ] = { /* 619 */ ++ .pme_name = "PM_MRK_INST_CMPL", ++ .pme_code = 0x00000401E0, ++ .pme_short_desc = "marked instruction 
completed", ++ .pme_long_desc = "marked instruction completed", ++}, ++[ POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL ] = { /* 620 */ ++ .pme_name = "PM_TAGE_CORRECT_TAKEN_CMPL", ++ .pme_code = 0x00000050B4, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Counted at completion for taken branches only", ++}, ++[ POWER9_PME_PM_MRK_L1_ICACHE_MISS ] = { /* 621 */ ++ .pme_name = "PM_MRK_L1_ICACHE_MISS", ++ .pme_code = 0x00000101E4, ++ .pme_short_desc = "sampled Instruction suffered an icache Miss", ++ .pme_long_desc = "sampled Instruction suffered an icache Miss", ++}, ++[ POWER9_PME_PM_TLB_MISS ] = { /* 622 */ ++ .pme_name = "PM_TLB_MISS", ++ .pme_code = 0x0000020066, ++ .pme_short_desc = "TLB Miss (I + D)", ++ .pme_long_desc = "TLB Miss (I + D)", ++}, ++[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { /* 623 */ ++ .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", ++ .pme_code = 0x0000026084, ++ .pme_short_desc = "L2 RC load dispatch attempt failed due to other reasons", ++ .pme_long_desc = "L2 RC load dispatch attempt failed due to other reasons", ++}, ++[ POWER9_PME_PM_FXU_BUSY ] = { /* 624 */ ++ .pme_name = "PM_FXU_BUSY", ++ .pme_code = 0x000002000A, ++ .pme_short_desc = "Cycles in which all 4 FXUs are busy.", ++ .pme_long_desc = "Cycles in which all 4 FXUs are busy. 
The FXU is running at capacity", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT ] = { /* 625 */ ++ .pme_name = "PM_DATA_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_1_MOD ] = { /* 626 */ ++ .pme_name = "PM_INST_FROM_L3_1_MOD", ++ .pme_code = 0x0000024044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_LSU_REJECT_LMQ_FULL ] = { /* 627 */ ++ .pme_name = "PM_LSU_REJECT_LMQ_FULL", ++ .pme_code = 0x000003001C, ++ .pme_short_desc = "LSU Reject due to LMQ full (up to 4 per cycle)", ++ .pme_long_desc = "LSU Reject due to LMQ full (up to 4 per cycle)", ++}, ++[ POWER9_PME_PM_CO_DISP_FAIL ] = { /* 628 */ ++ .pme_name = "PM_CO_DISP_FAIL", ++ .pme_code = 0x0000016886, ++ .pme_short_desc = "CO dispatch failed due to all CO machines being busy", ++ .pme_long_desc = "CO dispatch failed due to all CO machines being busy", ++}, ++[ POWER9_PME_PM_L3_TRANS_PF ] = { /* 629 */ ++ .pme_name = "PM_L3_TRANS_PF", ++ .pme_code = 0x00000468A4, ++ .pme_short_desc = "L3 Transient prefetch", ++ .pme_long_desc = "L3 Transient prefetch", ++}, ++[ POWER9_PME_PM_MRK_ST_NEST ] = { /* 630 */ ++ .pme_name = "PM_MRK_ST_NEST", ++ .pme_code = 0x0000020138, ++ .pme_short_desc = "Marked store sent to nest", ++ .pme_long_desc = "Marked store sent to nest", ++}, ++[ POWER9_PME_PM_LSU1_L1_CAM_CANCEL ] = { /* 631 */ ++ .pme_name = "PM_LSU1_L1_CAM_CANCEL", ++ .pme_code = 0x000000F890, ++ 
.pme_short_desc = "ls1 l1 tm cam cancel", ++ .pme_long_desc = "ls1 l1 tm cam cancel", ++}, ++[ POWER9_PME_PM_INST_CHIP_PUMP_CPRED ] = { /* 632 */ ++ .pme_name = "PM_INST_CHIP_PUMP_CPRED", ++ .pme_code = 0x0000014050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", ++}, ++[ POWER9_PME_PM_LSU3_VECTOR_ST_FIN ] = { /* 633 */ ++ .pme_name = "PM_LSU3_VECTOR_ST_FIN", ++ .pme_code = 0x000000C88C, ++ .pme_short_desc = "A vector store instruction finished.", ++ .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_MOD ] = { /* 634 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_1_MOD", ++ .pme_code = 0x000004F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_IBUF_FULL_CYC ] = { /* 635 */ ++ .pme_name = "PM_IBUF_FULL_CYC", ++ .pme_code = 0x0000004884, ++ .pme_short_desc = "Cycles No room in ibuff", ++ .pme_long_desc = "Cycles No room in ibuff", ++}, ++[ POWER9_PME_PM_8FLOP_CMPL ] = { /* 636 */ ++ .pme_name = "PM_8FLOP_CMPL", ++ .pme_code = 0x000004D054, ++ .pme_short_desc = "", ++ .pme_long_desc = "", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { /* 637 */ ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", ++ .pme_code = 0x000002C128, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE ] = { /* 638 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_IC_L3 ] = { /* 639 */ ++ .pme_name = "PM_ICT_NOSLOT_IC_L3", ++ .pme_code = 0x000003E052, ++ .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", ++ .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LWSYNC ] = { /* 640 */ ++ .pme_name = "PM_CMPLU_STALL_LWSYNC", ++ .pme_code = 0x0000010036, ++ .pme_short_desc = "completion stall due to lwsync", ++ .pme_long_desc = "completion stall due to lwsync", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 ] = { /* 641 */ ++ .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L2", ++ .pme_code = 0x000002D028, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { /* 642 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", ++ .pme_code = 0x000004C12A, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_L3_SN0_BUSY ] = { /* 643 */ ++ .pme_name = "PM_L3_SN0_BUSY", ++ .pme_code = 0x00000460AC, ++ .pme_short_desc = "lifetime, sample of snooper machine 0 valid", ++ .pme_long_desc = "lifetime, sample of snooper machine 0 valid", ++}, ++[ POWER9_PME_PM_TM_OUTER_TBEGIN_DISP ] = { /* 644 */ ++ .pme_name = "PM_TM_OUTER_TBEGIN_DISP", ++ .pme_code = 0x000004E05E, ++ .pme_short_desc = "Number of outer tbegin instructions dispatched.", ++ .pme_long_desc = "Number of outer tbegin instructions dispatched. 
The dispatch unit determines whether the tbegin instruction is outer or nested. This is a speculative count, which includes flushed instructions", ++}, ++[ POWER9_PME_PM_GRP_PUMP_MPRED ] = { /* 645 */ ++ .pme_name = "PM_GRP_PUMP_MPRED", ++ .pme_code = 0x0000020052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_SRQ_EMPTY_CYC ] = { /* 646 */ ++ .pme_name = "PM_SRQ_EMPTY_CYC", ++ .pme_code = 0x0000040008, ++ .pme_short_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", ++ .pme_long_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", ++}, ++[ POWER9_PME_PM_LSU_REJECT_LHS ] = { /* 647 */ ++ .pme_name = "PM_LSU_REJECT_LHS", ++ .pme_code = 0x000004E05C, ++ .pme_short_desc = "LSU Reject due to LHS (up to 4 per cycle)", ++ .pme_long_desc = "LSU Reject due to LHS (up to 4 per cycle)", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_MEPF ] = { /* 648 */ ++ .pme_name = "PM_IPTEG_FROM_L3_MEPF", ++ .pme_code = 0x0000025042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. 
due to an instruction side request", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_LMEM ] = { /* 649 */ ++ .pme_name = "PM_MRK_DATA_FROM_LMEM", ++ .pme_code = 0x000003D142, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", ++}, ++[ POWER9_PME_PM_L3_P1_CO_MEM ] = { /* 650 */ ++ .pme_name = "PM_L3_P1_CO_MEM", ++ .pme_code = 0x00000368AA, ++ .pme_short_desc = "l3 CO to memory port 1", ++ .pme_long_desc = "l3 CO to memory port 1", ++}, ++[ POWER9_PME_PM_FREQ_DOWN ] = { /* 651 */ ++ .pme_name = "PM_FREQ_DOWN", ++ .pme_code = 0x000003000C, ++ .pme_short_desc = "Power Management: Below Threshold B", ++ .pme_long_desc = "Power Management: Below Threshold B", ++}, ++[ POWER9_PME_PM_L3_CINJ ] = { /* 652 */ ++ .pme_name = "PM_L3_CINJ", ++ .pme_code = 0x00000368A4, ++ .pme_short_desc = "l3 ci of cache inject", ++ .pme_long_desc = "l3 ci of cache inject", ++}, ++[ POWER9_PME_PM_L3_P0_PF_RTY ] = { /* 653 */ ++ .pme_name = "PM_L3_P0_PF_RTY", ++ .pme_code = 0x00000260AE, ++ .pme_short_desc = "L3 PF received retry port 2", ++ .pme_long_desc = "L3 PF received retry port 2", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD ] = { /* 654 */ ++ .pme_name = "PM_IPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x0000045048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request", ++}, ++[ POWER9_PME_PM_MRK_INST_ISSUED ] = { /* 655 */ ++ .pme_name = "PM_MRK_INST_ISSUED", ++ .pme_code = 0x0000010132, ++ .pme_short_desc = "Marked instruction issued", ++ .pme_long_desc = "Marked instruction 
issued", ++}, ++[ POWER9_PME_PM_INST_FROM_RL2L3_SHR ] = { /* 656 */ ++ .pme_name = "PM_INST_FROM_RL2L3_SHR", ++ .pme_code = 0x000001404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_LSU_STCX_FAIL ] = { /* 657 */ ++ .pme_name = "PM_LSU_STCX_FAIL", ++ .pme_code = 0x000000F080, ++ .pme_short_desc = "stcx failed", ++ .pme_long_desc = "stcx failed", ++}, ++[ POWER9_PME_PM_L3_P1_NODE_PUMP ] = { /* 658 */ ++ .pme_name = "PM_L3_P1_NODE_PUMP", ++ .pme_code = 0x00000168B0, ++ .pme_short_desc = "L3 pf sent with nodal scope port 1", ++ .pme_long_desc = "L3 pf sent with nodal scope port 1", ++}, ++[ POWER9_PME_PM_MEM_RWITM ] = { /* 659 */ ++ .pme_name = "PM_MEM_RWITM", ++ .pme_code = 0x000003C05E, ++ .pme_short_desc = "Memory Read With Intent to Modify for this thread", ++ .pme_long_desc = "Memory Read With Intent to Modify for this thread", ++}, ++[ POWER9_PME_PM_DP_QP_FLOP_CMPL ] = { /* 660 */ ++ .pme_name = "PM_DP_QP_FLOP_CMPL", ++ .pme_code = 0x000004D05C, ++ .pme_short_desc = "Double-precision flop instruction completed", ++ .pme_long_desc = "Double-precision flop instruction completed", ++}, ++[ POWER9_PME_PM_RUN_PURR ] = { /* 661 */ ++ .pme_name = "PM_RUN_PURR", ++ .pme_code = 0x00000400F4, ++ .pme_short_desc = "Run_PURR", ++ .pme_long_desc = "Run_PURR", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LMQ_FULL ] = { /* 662 */ ++ .pme_name = "PM_CMPLU_STALL_LMQ_FULL", ++ .pme_code = 0x000004C014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", ++ .pme_long_desc = 
"Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_VDPLONG ] = { /* 663 */ ++ .pme_name = "PM_CMPLU_STALL_VDPLONG", ++ .pme_code = 0x000003C05A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by NOT vector AND multicycle", ++}, ++[ POWER9_PME_PM_LSU2_TM_L1_HIT ] = { /* 664 */ ++ .pme_name = "PM_LSU2_TM_L1_HIT", ++ .pme_code = 0x000000E098, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3 ] = { /* 665 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3", ++ .pme_code = 0x000004D142, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_MTFPSCR ] = { /* 666 */ ++ .pme_name = "PM_CMPLU_STALL_MTFPSCR", ++ .pme_code = 0x000004E012, ++ .pme_short_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", ++ .pme_long_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", ++}, ++[ POWER9_PME_PM_STALL_END_ICT_EMPTY ] = { /* 667 */ ++ .pme_name = "PM_STALL_END_ICT_EMPTY", ++ .pme_code = 0x0000010028, ++ .pme_short_desc = "The number of times the core transitioned from a stall to ICT-empty for this thread", ++ .pme_long_desc = "The number of times the core transitioned from a stall to 
ICT-empty for this thread", ++}, ++[ POWER9_PME_PM_L3_P1_CO_L31 ] = { /* 668 */ ++ .pme_name = "PM_L3_P1_CO_L31", ++ .pme_code = 0x00000468AA, ++ .pme_short_desc = "l3 CO to L3.", ++ .pme_long_desc = "l3 CO to L3.1 (lco) port 1", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { /* 669 */ ++ .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", ++ .pme_code = 0x000002C012, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD ] = { /* 670 */ ++ .pme_name = "PM_DPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x000004E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_MEPF ] = { /* 671 */ ++ .pme_name = "PM_INST_FROM_L3_MEPF", ++ .pme_code = 0x0000024042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_L1_DCACHE_RELOADED_ALL ] = { /* 672 */ ++ .pme_name = "PM_L1_DCACHE_RELOADED_ALL", ++ .pme_code = 0x000001002C, ++ .pme_short_desc = "L1 data cache reloaded for demand.", ++ .pme_long_desc = "L1 data cache reloaded for demand. 
If MMCR1[16] is 1, prefetches will be included as well", ++}, ++[ POWER9_PME_PM_DATA_GRP_PUMP_CPRED ] = { /* 673 */ ++ .pme_name = "PM_DATA_GRP_PUMP_CPRED", ++ .pme_code = 0x000002C050, ++ .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", ++ .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", ++}, ++[ POWER9_PME_PM_MRK_DERAT_MISS_64K ] = { /* 674 */ ++ .pme_name = "PM_MRK_DERAT_MISS_64K", ++ .pme_code = 0x000002D154, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", ++}, ++[ POWER9_PME_PM_L2_ST_MISS ] = { /* 675 */ ++ .pme_name = "PM_L2_ST_MISS", ++ .pme_code = 0x0000026880, ++ .pme_short_desc = "All successful D-Side Store dispatches that were an L2miss for this thread", ++ .pme_long_desc = "All successful D-Side Store dispatches that were an L2miss for this thread", ++}, ++[ POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE ] = { /* 676 */ ++ .pme_name = "PM_L3_PF_OFF_CHIP_CACHE", ++ .pme_code = 0x00000368A0, ++ .pme_short_desc = "L3 Prefetch from Off chip cache", ++ .pme_long_desc = "L3 Prefetch from Off chip cache", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS ] = { /* 677 */ ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3MISS", ++ .pme_code = 0x000004F05E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation. 
The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_LWSYNC ] = { /* 678 */ ++ .pme_name = "PM_LWSYNC", ++ .pme_code = 0x0000005894, ++ .pme_short_desc = "Lwsync instruction decoded and transferred", ++ .pme_long_desc = "Lwsync instruction decoded and transferred", ++}, ++[ POWER9_PME_PM_LS3_UNALIGNED_LD ] = { /* 679 */ ++ .pme_name = "PM_LS3_UNALIGNED_LD", ++ .pme_code = 0x000000C898, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_L3_RD0_BUSY ] = { /* 680 */ ++ .pme_name = "PM_L3_RD0_BUSY", ++ .pme_code = 0x00000468B4, ++ .pme_short_desc = "lifetime, sample of RD machine 0 valid", ++ .pme_long_desc = "lifetime, sample of RD machine 0 valid", ++}, ++[ POWER9_PME_PM_LINK_STACK_CORRECT ] = { /* 681 */ ++ .pme_name = "PM_LINK_STACK_CORRECT", ++ .pme_code = 0x00000058A0, ++ .pme_short_desc = "Link stack predicts right address", ++ .pme_long_desc = "Link stack predicts right address", ++}, ++[ POWER9_PME_PM_MRK_DTLB_MISS ] = { /* 682 */ ++ .pme_name = "PM_MRK_DTLB_MISS", ++ .pme_code = 0x00000401E4, ++ .pme_short_desc = "Marked dtlb miss", ++ .pme_long_desc = "Marked dtlb miss", ++}, ++[ POWER9_PME_PM_INST_IMC_MATCH_CMPL ] = { /* 683 */ ++ .pme_name = "PM_INST_IMC_MATCH_CMPL", ++ .pme_code = 0x000004001C, ++ .pme_short_desc = "IMC Match Count", ++ .pme_long_desc = "IMC Match Count", ++}, ++[ POWER9_PME_PM_LS1_ERAT_MISS_PREF ] = { /* 684 */ ++ .pme_name = "PM_LS1_ERAT_MISS_PREF", ++ .pme_code = 0x000000E884, ++ .pme_short_desc = "LS1 Erat miss due to prefetch", ++ .pme_long_desc = 
"LS1 Erat miss due to prefetch", ++}, ++[ POWER9_PME_PM_L3_CO0_BUSY ] = { /* 685 */ ++ .pme_name = "PM_L3_CO0_BUSY", ++ .pme_code = 0x00000468AC, ++ .pme_short_desc = "lifetime, sample of CO machine 0 valid", ++ .pme_long_desc = "lifetime, sample of CO machine 0 valid", ++}, ++[ POWER9_PME_PM_BFU_BUSY ] = { /* 686 */ ++ .pme_name = "PM_BFU_BUSY", ++ .pme_code = 0x000003005C, ++ .pme_short_desc = "Cycles in which all 4 Binary Floating Point units are busy.", ++ .pme_long_desc = "Cycles in which all 4 Binary Floating Point units are busy. The BFU is running at capacity", ++}, ++[ POWER9_PME_PM_L2_SYS_GUESS_CORRECT ] = { /* 687 */ ++ .pme_name = "PM_L2_SYS_GUESS_CORRECT", ++ .pme_code = 0x0000036088, ++ .pme_short_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", ++ .pme_long_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", ++}, ++[ POWER9_PME_PM_L1_SW_PREF ] = { /* 688 */ ++ .pme_name = "PM_L1_SW_PREF", ++ .pme_code = 0x000000E880, ++ .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", ++ .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_LL4 ] = { /* 689 */ ++ .pme_name = "PM_MRK_DATA_FROM_LL4", ++ .pme_code = 0x000001D14C, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_INST_FIN ] = { /* 690 */ ++ .pme_name = "PM_MRK_INST_FIN", ++ .pme_code = 0x0000030130, ++ .pme_short_desc = "marked instruction finished", ++ .pme_long_desc = "marked instruction finished", ++}, ++[ POWER9_PME_PM_SYNC_MRK_L3MISS ] = { /* 691 */ ++ .pme_name = "PM_SYNC_MRK_L3MISS", ++ .pme_code = 0x0000015154, ++ .pme_short_desc = "Marked L3 misses that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L3 misses that can throw a synchronous interrupt", 
++}, ++[ POWER9_PME_PM_LSU1_STORE_REJECT ] = { /* 692 */ ++ .pme_name = "PM_LSU1_STORE_REJECT", ++ .pme_code = 0x000000F88C, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++}, ++[ POWER9_PME_PM_CHIP_PUMP_CPRED ] = { /* 693 */ ++ .pme_name = "PM_CHIP_PUMP_CPRED", ++ .pme_code = 0x0000010050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC ] = { /* 694 */ ++ .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC", ++ .pme_code = 0x000001D14E, ++ .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++}, ++[ POWER9_PME_PM_DATA_STORE ] = { /* 695 */ ++ .pme_name = "PM_DATA_STORE", ++ .pme_code = 0x000000F0A0, ++ .pme_short_desc = "All ops that drain from s2q to L2 containing data", ++ .pme_long_desc = "All ops that drain from s2q to L2 containing data", ++}, ++[ POWER9_PME_PM_LS1_UNALIGNED_LD ] = { /* 696 */ ++ .pme_name = "PM_LS1_UNALIGNED_LD", ++ .pme_code = 0x000000C894, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the 
load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_TM_TRANS_RUN_INST ] = { /* 697 */ ++ .pme_name = "PM_TM_TRANS_RUN_INST", ++ .pme_code = 0x0000030060, ++ .pme_short_desc = "Run instructions completed in transactional state (gated by the run latch)", ++ .pme_long_desc = "Run instructions completed in transactional state (gated by the run latch)", ++}, ++[ POWER9_PME_PM_IC_MISS_CMPL ] = { /* 698 */ ++ .pme_name = "PM_IC_MISS_CMPL", ++ .pme_code = 0x000001D15A, ++ .pme_short_desc = "Non-speculative icache miss, counted at completion", ++ .pme_long_desc = "Non-speculative icache miss, counted at completion", ++}, ++[ POWER9_PME_PM_THRESH_NOT_MET ] = { /* 699 */ ++ .pme_name = "PM_THRESH_NOT_MET", ++ .pme_code = 0x000004016E, ++ .pme_short_desc = "Threshold counter did not meet threshold", ++ .pme_long_desc = "Threshold counter did not meet threshold", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2 ] = { /* 700 */ ++ .pme_name = "PM_DPTEG_FROM_L2", ++ .pme_code = 0x000001E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR ] = { /* 701 */ ++ .pme_name = "PM_IPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_RMEM ] = { /* 702 */ ++ .pme_name = "PM_DPTEG_FROM_RMEM", ++ .pme_code = 0x000003E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_L3_L2_CO_MISS ] = { /* 703 */ ++ .pme_name = "PM_L3_L2_CO_MISS", ++ .pme_code = 0x00000368A2, ++ .pme_short_desc = "L2 castout miss", ++ .pme_long_desc = "L2 castout miss", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_DMEM ] = { /* 704 */ ++ .pme_name = "PM_IPTEG_FROM_DMEM", ++ .pme_code = 0x000004504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", ++}, ++[ POWER9_PME_PM_MRK_DTLB_MISS_64K ] = { /* 705 */ ++ .pme_name = "PM_MRK_DTLB_MISS_64K", ++ .pme_code = 0x000003D156, ++ .pme_short_desc = "Marked Data TLB Miss page size 64K", ++ .pme_long_desc = "Marked Data TLB Miss page size 64K", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC ] = { /* 706 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC", ++ .pme_code = 0x000002C122, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", ++}, ++[ POWER9_PME_PM_LSU_FIN ] = { /* 707 */ ++ .pme_name = "PM_LSU_FIN", ++ .pme_code = 0x0000030066, ++ .pme_short_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", ++ .pme_long_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 708 */ ++ .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x000004C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", ++}, ++[ 
POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE ] = { /* 709 */ ++ .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000004D140, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_LSU_STCX ] = { /* 710 */ ++ .pme_name = "PM_LSU_STCX", ++ .pme_code = 0x000000C090, ++ .pme_short_desc = "STCX sent to nest, i.", ++ .pme_long_desc = "STCX sent to nest, i.e. total", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD ] = { /* 711 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_1_MOD", ++ .pme_code = 0x000004D146, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_VSU_NON_FLOP_CMPL ] = { /* 712 */ ++ .pme_name = "PM_VSU_NON_FLOP_CMPL", ++ .pme_code = 0x000004D050, ++ .pme_short_desc = "", ++ .pme_long_desc = "", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT ] = { /* 713 */ ++ .pme_name = "PM_INST_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x0000034042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR ] = { /* 714 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_1_SHR", ++ .pme_code = 0x000002D14E, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked 
load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 ] = { /* 715 */ ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3", ++ .pme_code = 0x000004F05A, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache. This is the deepest level of PWC possible for a translation", ++}, ++[ POWER9_PME_PM_TAGE_CORRECT ] = { /* 716 */ ++ .pme_name = "PM_TAGE_CORRECT", ++ .pme_code = 0x00000058B4, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", ++}, ++[ POWER9_PME_PM_TM_FAV_CAUSED_FAIL ] = { /* 717 */ ++ .pme_name = "PM_TM_FAV_CAUSED_FAIL", ++ .pme_code = 0x000002688C, ++ .pme_short_desc = "TM Load (fav) caused another thread to fail", ++ .pme_long_desc = "TM Load (fav) caused another thread to fail", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L1_HIT ] = { /* 718 */ ++ .pme_name = "PM_RADIX_PWC_L1_HIT", ++ .pme_code = 0x000001F056, ++ .pme_short_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", ++}, ++[ POWER9_PME_PM_LSU0_LMQ_S0_VALID ] = { /* 719 */ ++ .pme_name = "PM_LSU0_LMQ_S0_VALID", ++ .pme_code = 0x000000D8B8, ++ .pme_short_desc = "Slot 0 of LMQ valid", ++ .pme_long_desc = "Slot 0 of LMQ valid", ++}, ++[ POWER9_PME_PM_BR_MPRED_CCACHE ] = { /* 720 */ ++ .pme_name = "PM_BR_MPRED_CCACHE", ++ .pme_code = 0x00000040AC, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target 
Prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", ++}, ++[ POWER9_PME_PM_L1_DEMAND_WRITE ] = { /* 721 */ ++ .pme_name = "PM_L1_DEMAND_WRITE", ++ .pme_code = 0x000000408C, ++ .pme_short_desc = "Instruction Demand sectors wriittent into IL1", ++ .pme_long_desc = "Instruction Demand sectors wriittent into IL1", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD ] = { /* 722 */ ++ .pme_name = "PM_CMPLU_STALL_FLUSH_ANY_THREAD", ++ .pme_code = 0x000001E056, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3MISS ] = { /* 723 */ ++ .pme_name = "PM_IPTEG_FROM_L3MISS", ++ .pme_code = 0x000004504E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a instruction side request", ++}, ++[ POWER9_PME_PM_MRK_DTLB_MISS_16G ] = { /* 724 */ ++ .pme_name = "PM_MRK_DTLB_MISS_16G", ++ .pme_code = 0x000002D15E, ++ .pme_short_desc = "Marked Data TLB Miss page size 16G", ++ .pme_long_desc = "Marked Data TLB Miss page size 16G", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_RL4 ] = { /* 725 */ ++ .pme_name = "PM_IPTEG_FROM_RL4", ++ .pme_code = 0x000002504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", ++}, ++[ 
POWER9_PME_PM_L2_RCST_DISP ] = { /* 726 */ ++ .pme_name = "PM_L2_RCST_DISP", ++ .pme_code = 0x0000036084, ++ .pme_short_desc = "L2 RC store dispatch attempt", ++ .pme_long_desc = "L2 RC store dispatch attempt", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC ] = { /* 727 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC", ++ .pme_code = 0x000003D140, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL ] = { /* 728 */ ++ .pme_name = "PM_CMPLU_STALL", ++ .pme_code = 0x000001E054, ++ .pme_short_desc = "Nothing completed and ICT not empty", ++ .pme_long_desc = "Nothing completed and ICT not empty", ++}, ++[ POWER9_PME_PM_DISP_CLB_HELD_SB ] = { /* 729 */ ++ .pme_name = "PM_DISP_CLB_HELD_SB", ++ .pme_code = 0x0000002090, ++ .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", ++ .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", ++}, ++[ POWER9_PME_PM_L3_SN_USAGE ] = { /* 730 */ ++ .pme_name = "PM_L3_SN_USAGE", ++ .pme_code = 0x00000160AC, ++ .pme_short_desc = "rotating sample of 8 snoop valids", ++ .pme_long_desc = "rotating sample of 8 snoop valids", ++}, ++[ POWER9_PME_PM_FLOP_CMPL ] = { /* 731 */ ++ .pme_name = "PM_FLOP_CMPL", ++ .pme_code = 0x00000100F4, ++ .pme_short_desc = "Floating Point Operation Finished", ++ .pme_long_desc = "Floating Point Operation Finished", ++}, ++[ POWER9_PME_PM_MRK_L2_RC_DISP ] = { /* 732 */ ++ .pme_name = "PM_MRK_L2_RC_DISP", ++ .pme_code = 0x0000020114, ++ .pme_short_desc = "Marked Instruction RC dispatched in L2", ++ .pme_long_desc = "Marked Instruction RC dispatched in L2", ++}, ++[ POWER9_PME_PM_L3_PF_ON_CHIP_CACHE ] = { /* 733 */ ++ .pme_name = "PM_L3_PF_ON_CHIP_CACHE", ++ .pme_code = 0x00000360A0, ++ .pme_short_desc = "L3 Prefetch from On chip cache", ++ .pme_long_desc = "L3 Prefetch from On 
chip cache", ++}, ++[ POWER9_PME_PM_IC_DEMAND_CYC ] = { /* 734 */ ++ .pme_name = "PM_IC_DEMAND_CYC", ++ .pme_code = 0x0000010018, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++}, ++[ POWER9_PME_PM_CO_USAGE ] = { /* 735 */ ++ .pme_name = "PM_CO_USAGE", ++ .pme_code = 0x000002688E, ++ .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++ .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++}, ++[ POWER9_PME_PM_ISYNC ] = { /* 736 */ ++ .pme_name = "PM_ISYNC", ++ .pme_code = 0x0000002884, ++ .pme_short_desc = "Isync completion count per thread", ++ .pme_long_desc = "Isync completion count per thread", ++}, ++[ POWER9_PME_PM_MEM_CO ] = { /* 737 */ ++ .pme_name = "PM_MEM_CO", ++ .pme_code = 0x000004C058, ++ .pme_short_desc = "Memory castouts from this thread", ++ .pme_long_desc = "Memory castouts from this thread", ++}, ++[ POWER9_PME_PM_NTC_ALL_FIN ] = { /* 738 */ ++ .pme_name = "PM_NTC_ALL_FIN", ++ .pme_code = 0x000002001A, ++ .pme_short_desc = "Cycles after all instructions have finished to group completed", ++ .pme_long_desc = "Cycles after all instructions have finished to group completed", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_EXCEPTION ] = { /* 739 */ ++ .pme_name = "PM_CMPLU_STALL_EXCEPTION", ++ .pme_code = 0x000003003A, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", ++}, ++[ POWER9_PME_PM_LS0_LAUNCH_HELD_PREF ] = { /* 740 */ ++ .pme_name = 
"PM_LS0_LAUNCH_HELD_PREF", ++ .pme_code = 0x000000C09C, ++ .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++ .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED ] = { /* 741 */ ++ .pme_name = "PM_ICT_NOSLOT_BR_MPRED", ++ .pme_code = 0x000004D01E, ++ .pme_short_desc = "Ict empty for this thread due to branch mispred", ++ .pme_long_desc = "Ict empty for this thread due to branch mispred", ++}, ++[ POWER9_PME_PM_MRK_BR_CMPL ] = { /* 742 */ ++ .pme_name = "PM_MRK_BR_CMPL", ++ .pme_code = 0x000001016E, ++ .pme_short_desc = "Branch Instruction completed", ++ .pme_long_desc = "Branch Instruction completed", ++}, ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD ] = { /* 743 */ ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD", ++ .pme_code = 0x000004E01A, ++ .pme_short_desc = "Cycles in which the NTC instruciton is held at dispatch for any reason", ++ .pme_long_desc = "Cycles in which the NTC instruciton is held at dispatch for any reason", ++}, ++[ POWER9_PME_PM_IC_PREF_WRITE ] = { /* 744 */ ++ .pme_name = "PM_IC_PREF_WRITE", ++ .pme_code = 0x000000488C, ++ .pme_short_desc = "Instruction prefetch written into IL1", ++ .pme_long_desc = "Instruction prefetch written into IL1", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL ] = { /* 745 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_LHL_SHL", ++ .pme_code = 0x000000D8A0, ++ .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", ++ .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. 
If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", ++}, ++[ POWER9_PME_PM_DTLB_MISS_1G ] = { /* 746 */ ++ .pme_name = "PM_DTLB_MISS_1G", ++ .pme_code = 0x000004C05A, ++ .pme_short_desc = "Data TLB reload (after a miss) page size 1G.", ++ .pme_long_desc = "Data TLB reload (after a miss) page size 1G. Implies radix translation was used", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT ] = { /* 747 */ ++ .pme_name = "PM_DATA_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS ] = { /* 748 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3MISS", ++ .pme_code = 0x000004F14E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_BR_PRED ] = { /* 749 */ ++ .pme_name = "PM_BR_PRED", ++ .pme_code = 0x000000409C, ++ .pme_short_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target.", ++ .pme_long_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target. 
Includes taken and not taken and is counted at execution time", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL ] = { /* 750 */ ++ .pme_name = "PM_CMPLU_STALL_OTHER_CMPL", ++ .pme_code = 0x0000030006, ++ .pme_short_desc = "Instructions the core completed while this tread was stalled", ++ .pme_long_desc = "Instructions the core completed while this tread was stalled", ++}, ++[ POWER9_PME_PM_INST_FROM_DMEM ] = { /* 751 */ ++ .pme_name = "PM_INST_FROM_DMEM", ++ .pme_code = 0x000004404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT ] = { /* 752 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001F140, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DC_PREF_SW_ALLOC ] = { /* 753 */ ++ .pme_name = "PM_DC_PREF_SW_ALLOC", ++ .pme_code = 0x000000F8A4, ++ .pme_short_desc = "Prefetch stream allocated by software prefetching", ++ .pme_long_desc = "Prefetch stream allocated by software prefetching", ++}, ++[ POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { /* 754 */ ++ .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", ++ .pme_code = 0x0000046084, ++ .pme_short_desc = "L2 RC store dispatch attempt failed due to other reasons", ++ .pme_long_desc = "L2 RC store dispatch attempt failed due to other reasons", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_EMQ_FULL ] = { /* 755 */ ++ .pme_name = "PM_CMPLU_STALL_EMQ_FULL", ++ .pme_code = 0x0000030004, ++ .pme_short_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", ++ .pme_long_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", ++}, ++[ POWER9_PME_PM_MRK_INST_DECODED ] = { /* 756 */ ++ .pme_name = "PM_MRK_INST_DECODED", ++ .pme_code = 0x0000020130, ++ .pme_short_desc = "An instruction was marked at decode time.", ++ .pme_long_desc = "An instruction was marked at decode time. 
Random Instruction Sampling (RIS) only", ++}, ++[ POWER9_PME_PM_IERAT_RELOAD_4K ] = { /* 757 */ ++ .pme_name = "PM_IERAT_RELOAD_4K", ++ .pme_code = 0x0000020064, ++ .pme_short_desc = "IERAT reloaded (after a miss) for 4K pages", ++ .pme_long_desc = "IERAT reloaded (after a miss) for 4K pages", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER ] = { /* 758 */ ++ .pme_name = "PM_CMPLU_STALL_LRQ_OTHER", ++ .pme_code = 0x0000010004, ++ .pme_short_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", ++ .pme_long_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_1_ECO_MOD ] = { /* 759 */ ++ .pme_name = "PM_INST_FROM_L3_1_ECO_MOD", ++ .pme_code = 0x0000044044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_L3_P0_CO_L31 ] = { /* 760 */ ++ .pme_name = "PM_L3_P0_CO_L31", ++ .pme_code = 0x00000460AA, ++ .pme_short_desc = "l3 CO to L3.", ++ .pme_long_desc = "l3 CO to L3.1 (lco) port 0", ++}, ++[ POWER9_PME_PM_NON_TM_RST_SC ] = { /* 761 */ ++ .pme_name = "PM_NON_TM_RST_SC", ++ .pme_code = 0x00000260A6, ++ .pme_short_desc = "non tm snp rst tm sc", ++ .pme_long_desc = "non tm snp rst tm sc", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 ] = { /* 762 */ ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L2", ++ .pme_code = 0x000001F05A, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a 
level 4 page walk cache from the core's L2 data cache. This is the deepest level of PWC possible for a translation", ++}, ++[ POWER9_PME_PM_INST_SYS_PUMP_CPRED ] = { /* 763 */ ++ .pme_name = "PM_INST_SYS_PUMP_CPRED", ++ .pme_code = 0x0000034050, ++ .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", ++ .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_DMEM ] = { /* 764 */ ++ .pme_name = "PM_DPTEG_FROM_DMEM", ++ .pme_code = 0x000004E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { /* 765 */ ++ .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", ++ .pme_code = 0x000002003E, ++ .pme_short_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", ++ .pme_long_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", ++}, ++[ POWER9_PME_PM_SYS_PUMP_CPRED ] = { /* 766 */ ++ .pme_name = "PM_SYS_PUMP_CPRED", ++ .pme_code = 0x0000030050, ++ .pme_short_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++}, ++[ POWER9_PME_PM_DTLB_MISS_64K ] = { /* 767 */ ++ .pme_name = "PM_DTLB_MISS_64K", ++ .pme_code = 0x000003C056, ++ .pme_short_desc = "Data TLB Miss page size 64K", ++ .pme_long_desc = "Data TLB Miss page 
size 64K", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_STCX ] = { /* 768 */ ++ .pme_name = "PM_CMPLU_STALL_STCX", ++ .pme_code = 0x000002D01C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY ] = { /* 769 */ ++ .pme_name = "PM_MRK_FAB_RSP_CLAIM_RTY", ++ .pme_code = 0x000003015E, ++ .pme_short_desc = "Sampled store did a rwitm and got a rty", ++ .pme_long_desc = "Sampled store did a rwitm and got a rty", ++}, ++[ POWER9_PME_PM_PARTIAL_ST_FIN ] = { /* 770 */ ++ .pme_name = "PM_PARTIAL_ST_FIN", ++ .pme_code = 0x0000034054, ++ .pme_short_desc = "Any store finished by an LSU slice", ++ .pme_long_desc = "Any store finished by an LSU slice", ++}, ++[ POWER9_PME_PM_THRD_CONC_RUN_INST ] = { /* 771 */ ++ .pme_name = "PM_THRD_CONC_RUN_INST", ++ .pme_code = 0x00000300F4, ++ .pme_short_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", ++ .pme_long_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", ++}, ++[ POWER9_PME_PM_CO_TM_SC_FOOTPRINT ] = { /* 772 */ ++ .pme_name = "PM_CO_TM_SC_FOOTPRINT", ++ .pme_code = 0x0000026086, ++ .pme_short_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", ++ .pme_long_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", ++}, ++[ POWER9_PME_PM_MRK_LARX_FIN ] = { /* 773 */ ++ .pme_name = "PM_MRK_LARX_FIN", ++ .pme_code = 0x0000040116, ++ .pme_short_desc = "Larx finished", ++ .pme_long_desc = "Larx finished", ++}, ++[ POWER9_PME_PM_L3_LOC_GUESS_WRONG ] = { /* 774 */ ++ .pme_name = "PM_L3_LOC_GUESS_WRONG", ++ .pme_code = 0x00000268B2, ++ .pme_short_desc = "Initial scope=node but data from outside local node (near or far or rem).", ++ .pme_long_desc = "Initial scope=node but data from outside local node 
(near or far or rem). Prediction too Low", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 ] = { /* 775 */ ++ .pme_name = "PM_CMPLU_STALL_DMISS_L21_L31", ++ .pme_code = 0x000002C018, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", ++}, ++[ POWER9_PME_PM_SHL_ST_DISABLE ] = { /* 776 */ ++ .pme_name = "PM_SHL_ST_DISABLE", ++ .pme_code = 0x0000005090, ++ .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", ++ .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", ++}, ++[ POWER9_PME_PM_VSU_FIN ] = { /* 777 */ ++ .pme_name = "PM_VSU_FIN", ++ .pme_code = 0x000002505C, ++ .pme_short_desc = "VSU instruction finished.", ++ .pme_long_desc = "VSU instruction finished. Up to 4 per cycle", ++}, ++[ POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC ] = { /* 778 */ ++ .pme_name = "PM_MRK_LSU_FLUSH_ATOMIC", ++ .pme_code = 0x000000D098, ++ .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", ++ .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", ++}, ++[ POWER9_PME_PM_L3_CI_HIT ] = { /* 779 */ ++ .pme_name = "PM_L3_CI_HIT", ++ .pme_code = 0x00000260A2, ++ .pme_short_desc = "L3 Castins Hit (total count)", ++ .pme_long_desc = "L3 Castins Hit (total count)", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_DARQ ] = { /* 780 */ ++ .pme_name = "PM_CMPLU_STALL_DARQ", ++ .pme_code = 0x000003405A, ++ .pme_short_desc = "Finish stall because the next to finish instruction was spending cycles in the DARQ.", ++ .pme_long_desc = "Finish stall because the next to finish instruction was spending cycles in the DARQ. If this count is large, it is likely because the LSAQ had less than 4 slots available", ++}, ++[ POWER9_PME_PM_L3_PF_ON_CHIP_MEM ] = { /* 781 */ ++ .pme_name = "PM_L3_PF_ON_CHIP_MEM", ++ .pme_code = 0x00000460A0, ++ .pme_short_desc = "L3 Prefetch from On chip memory", ++ .pme_long_desc = "L3 Prefetch from On chip memory", ++}, ++[ POWER9_PME_PM_THRD_PRIO_0_1_CYC ] = { /* 782 */ ++ .pme_name = "PM_THRD_PRIO_0_1_CYC", ++ .pme_code = 0x00000040BC, ++ .pme_short_desc = "Cycles thread running at priority level 0 or 1", ++ .pme_long_desc = "Cycles thread running at priority level 0 or 1", ++}, ++[ POWER9_PME_PM_DERAT_MISS_64K ] = { /* 783 */ ++ .pme_name = "PM_DERAT_MISS_64K", ++ .pme_code = 0x000002C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 64K", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 64K", ++}, ++[ POWER9_PME_PM_PMC2_REWIND ] = { /* 784 */ ++ .pme_name = "PM_PMC2_REWIND", ++ .pme_code = 0x0000030020, ++ .pme_short_desc = "PMC2 Rewind Event (did not match condition)", ++ .pme_long_desc = "PMC2 Rewind Event (did not match condition)", ++}, ++[ POWER9_PME_PM_INST_FROM_L2 ] = { /* 785 */ ++ .pme_name = "PM_INST_FROM_L2", ++ .pme_code = 0x0000014042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's 
L2 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_NTF_FIN ] = { /* 786 */ ++ .pme_name = "PM_MRK_NTF_FIN", ++ .pme_code = 0x0000020112, ++ .pme_short_desc = "Marked next to finish instruction finished", ++ .pme_long_desc = "Marked next to finish instruction finished", ++}, ++[ POWER9_PME_PM_ALL_SRQ_FULL ] = { /* 787 */ ++ .pme_name = "PM_ALL_SRQ_FULL", ++ .pme_code = 0x0000020004, ++ .pme_short_desc = "Number of cycles the SRQ is completely out of srq entries.", ++ .pme_long_desc = "Number of cycles the SRQ is completely out of srq entries. This event is not per thread, all threads will get the same count for this core resource", ++}, ++[ POWER9_PME_PM_INST_DISP ] = { /* 788 */ ++ .pme_name = "PM_INST_DISP", ++ .pme_code = 0x00000200F2, ++ .pme_short_desc = "# PPC Dispatched", ++ .pme_long_desc = "# PPC Dispatched", ++}, ++[ POWER9_PME_PM_LS3_ERAT_MISS_PREF ] = { /* 789 */ ++ .pme_name = "PM_LS3_ERAT_MISS_PREF", ++ .pme_code = 0x000000E888, ++ .pme_short_desc = "LS1 Erat miss due to prefetch", ++ .pme_long_desc = "LS1 Erat miss due to prefetch", ++}, ++[ POWER9_PME_PM_STOP_FETCH_PENDING_CYC ] = { /* 790 */ ++ .pme_name = "PM_STOP_FETCH_PENDING_CYC", ++ .pme_code = 0x00000048A4, ++ .pme_short_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", ++ .pme_long_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", ++}, ++[ POWER9_PME_PM_L1_DCACHE_RELOAD_VALID ] = { /* 791 */ ++ .pme_name = "PM_L1_DCACHE_RELOAD_VALID", ++ .pme_code = 0x00000300F6, ++ .pme_short_desc = "DL1 reloaded due to Demand Load", ++ .pme_long_desc = "DL1 reloaded due to Demand Load", ++}, ++[ POWER9_PME_PM_L3_P0_LCO_NO_DATA ] = { /* 792 */ ++ .pme_name = "PM_L3_P0_LCO_NO_DATA", ++ .pme_code = 0x00000160AA, ++ .pme_short_desc = "dataless l3 lco sent port 0", ++ 
.pme_long_desc = "dataless l3 lco sent port 0", ++}, ++[ POWER9_PME_PM_LSU3_VECTOR_LD_FIN ] = { /* 793 */ ++ .pme_name = "PM_LSU3_VECTOR_LD_FIN", ++ .pme_code = 0x000000C884, ++ .pme_short_desc = "A vector load instruction finished.", ++ .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT ] = { /* 794 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_FXU_FIN ] = { /* 795 */ ++ .pme_name = "PM_MRK_FXU_FIN", ++ .pme_code = 0x0000020134, ++ .pme_short_desc = "fxu marked instr finish", ++ .pme_long_desc = "fxu marked instr finish", ++}, ++[ POWER9_PME_PM_LS3_UNALIGNED_ST ] = { /* 796 */ ++ .pme_name = "PM_LS3_UNALIGNED_ST", ++ .pme_code = 0x000000F8BC, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than what normally would be required of the Store of that size. 
If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_MEMORY ] = { /* 797 */ ++ .pme_name = "PM_DPTEG_FROM_MEMORY", ++ .pme_code = 0x000002E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_RUN_CYC_ST_MODE ] = { /* 798 */ ++ .pme_name = "PM_RUN_CYC_ST_MODE", ++ .pme_code = 0x000001006C, ++ .pme_short_desc = "Cycles run latch is set and core is in ST mode", ++ .pme_long_desc = "Cycles run latch is set and core is in ST mode", ++}, ++[ POWER9_PME_PM_PMC4_OVERFLOW ] = { /* 799 */ ++ .pme_name = "PM_PMC4_OVERFLOW", ++ .pme_code = 0x0000010010, ++ .pme_short_desc = "Overflow from counter 4", ++ .pme_long_desc = "Overflow from counter 4", ++}, ++[ POWER9_PME_PM_THRESH_EXC_256 ] = { /* 800 */ ++ .pme_name = "PM_THRESH_EXC_256", ++ .pme_code = 0x00000101E8, ++ .pme_short_desc = "Threshold counter exceed a count of 256", ++ .pme_long_desc = "Threshold counter exceed a count of 256", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC ] = { /* 801 */ ++ .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC", ++ .pme_code = 0x0000035158, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC ] = { /* 802 */ ++ .pme_name = "PM_LSU0_LRQ_S0_VALID_CYC", ++ .pme_code = 0x000000D8B4, ++ .pme_short_desc = "Slot 0 of LRQ valid", ++ .pme_long_desc = "Slot 0 of LRQ valid", 
++}, ++[ POWER9_PME_PM_INST_FROM_L2MISS ] = { /* 803 */ ++ .pme_name = "PM_INST_FROM_L2MISS", ++ .pme_code = 0x000001404E, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER ] = { /* 804 */ ++ .pme_name = "PM_MRK_L2_TM_ST_ABORT_SISTER", ++ .pme_code = 0x000003E15C, ++ .pme_short_desc = "TM marked store abort for this thread", ++ .pme_long_desc = "TM marked store abort for this thread", ++}, ++[ POWER9_PME_PM_L2_ST ] = { /* 805 */ ++ .pme_name = "PM_L2_ST", ++ .pme_code = 0x0000016880, ++ .pme_short_desc = "All successful D-side store dispatches for this thread", ++ .pme_long_desc = "All successful D-side store dispatches for this thread", ++}, ++[ POWER9_PME_PM_RADIX_PWC_MISS ] = { /* 806 */ ++ .pme_name = "PM_RADIX_PWC_MISS", ++ .pme_code = 0x000004F054, ++ .pme_short_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", ++}, ++[ POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC ] = { /* 807 */ ++ .pme_name = "PM_MRK_ST_L2DISP_TO_CMPL_CYC", ++ .pme_code = 0x000001F150, ++ .pme_short_desc = "cycles from L2 rc disp to l2 rc completion", ++ .pme_long_desc = "cycles from L2 rc disp to l2 rc completion", ++}, ++[ POWER9_PME_PM_LSU1_LDMX_FIN ] = { /* 808 */ ++ .pme_name = "PM_LSU1_LDMX_FIN", ++ .pme_code = 0x000000D888, ++ .pme_short_desc = " New P9 instruction LDMX.", ++ .pme_long_desc = " New P9 instruction LDMX.", ++}, ++[ POWER9_PME_PM_L3_P2_LCO_RTY ] = { /* 809 */ ++ .pme_name = "PM_L3_P2_LCO_RTY", ++ .pme_code = 0x00000260B4, ++ .pme_short_desc = "L3 lateral cast out received retry on port 2", ++ .pme_long_desc = "L3 
lateral cast out received retry on port 2", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { /* 810 */ ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", ++ .pme_code = 0x000001D150, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_L2_GRP_GUESS_CORRECT ] = { /* 811 */ ++ .pme_name = "PM_L2_GRP_GUESS_CORRECT", ++ .pme_code = 0x0000026088, ++ .pme_short_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", ++ .pme_long_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", ++}, ++[ POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC ] = { /* 812 */ ++ .pme_name = "PM_LSU0_1_LRQF_FULL_CYC", ++ .pme_code = 0x000000D0BC, ++ .pme_short_desc = "Counts the number of cycles the LRQF is full.", ++ .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", ++}, ++[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED ] = { /* 813 */ ++ .pme_name = "PM_DATA_GRP_PUMP_MPRED", ++ .pme_code = 0x000002C052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", ++}, ++[ POWER9_PME_PM_LSU3_ERAT_HIT ] = { /* 814 */ ++ .pme_name = "PM_LSU3_ERAT_HIT", ++ .pme_code = 0x000000E890, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", ++}, ++[ POWER9_PME_PM_FORCED_NOP ] = { /* 815 */ ++ .pme_name = "PM_FORCED_NOP", ++ .pme_code = 0x000000509C, ++ .pme_short_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", ++ .pme_long_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 816 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x000002D148, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LARX ] = { /* 817 */ ++ .pme_name = "PM_CMPLU_STALL_LARX", ++ .pme_code = 0x000001002A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", ++ .pme_long_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL4 ] = { /* 818 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_RL4", ++ .pme_code = 0x000002F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2 ] = { /* 819 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2", ++ .pme_code = 0x000002C126, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", ++}, ++[ POWER9_PME_PM_TM_FAIL_CONF_NON_TM ] = { /* 820 */ ++ .pme_name = "PM_TM_FAIL_CONF_NON_TM", ++ .pme_code = 0x00000028A8, ++ .pme_short_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", ++ .pme_long_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR ] = { /* 821 */ ++ .pme_name = "PM_DPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_DARQ_4_6_ENTRIES ] = { /* 822 */ ++ .pme_name = "PM_DARQ_4_6_ENTRIES", ++ .pme_code = 0x000003504E, ++ .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_L2_SYS_PUMP ] = { /* 823 */ ++ .pme_name = "PM_L2_SYS_PUMP", ++ .pme_code = 0x000004688A, ++ .pme_short_desc = "RC requests that were system pump attempts", ++ .pme_long_desc = "RC requests that were system pump attempts", ++}, ++[ POWER9_PME_PM_IOPS_CMPL ] = { /* 824 */ ++ .pme_name = "PM_IOPS_CMPL", ++ .pme_code = 0x0000024050, ++ .pme_short_desc = "Internal Operations completed", ++ .pme_long_desc = "Internal Operations completed", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_LHS ] = { /* 825 */ ++ .pme_name = "PM_LSU_FLUSH_LHS", ++ .pme_code = 0x000000C8B4, ++ .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", ++ .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_1_SHR ] = { /* 826 */ ++ .pme_name = "PM_DATA_FROM_L3_1_SHR", ++ .pme_code = 0x000001C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", ++}, ++[ POWER9_PME_PM_NTC_FIN ] = { /* 827 */ ++ .pme_name = "PM_NTC_FIN", ++ .pme_code = 0x000002405A, ++ .pme_short_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes.", ++ .pme_long_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes. 
This event is used to account for cycles in which work is being completed in the CPI stack", ++}, ++[ POWER9_PME_PM_LS2_DC_COLLISIONS ] = { /* 828 */ ++ .pme_name = "PM_LS2_DC_COLLISIONS", ++ .pme_code = 0x000000D094, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", ++}, ++[ POWER9_PME_PM_FMA_CMPL ] = { /* 829 */ ++ .pme_name = "PM_FMA_CMPL", ++ .pme_code = 0x0000010014, ++ .pme_short_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.", ++ .pme_long_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_MEMORY ] = { /* 830 */ ++ .pme_name = "PM_IPTEG_FROM_MEMORY", ++ .pme_code = 0x000002504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request", ++}, ++[ POWER9_PME_PM_TM_NON_FAV_TBEGIN ] = { /* 831 */ ++ .pme_name = "PM_TM_NON_FAV_TBEGIN", ++ .pme_code = 0x000000289C, ++ .pme_short_desc = "Dispatch time non favored tbegin", ++ .pme_long_desc = "Dispatch time non favored tbegin", ++}, ++[ POWER9_PME_PM_PMC1_REWIND ] = { /* 832 */ ++ .pme_name = "PM_PMC1_REWIND", ++ .pme_code = 0x000004D02C, ++ .pme_short_desc = "", ++ .pme_long_desc = "", ++}, ++[ POWER9_PME_PM_ISU2_ISS_HOLD_ALL ] = { /* 833 */ ++ .pme_name = "PM_ISU2_ISS_HOLD_ALL", ++ .pme_code = 0x0000003880, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { /* 834 */ ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD_CYC", ++ .pme_code = 0x000004D12E, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group 
(Distant), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_PTESYNC ] = { /* 835 */ ++ .pme_name = "PM_PTESYNC", ++ .pme_code = 0x000000589C, ++ .pme_short_desc = "ptesync instruction counted when the instruction is decoded and transmitted", ++ .pme_long_desc = "ptesync instruction counted when the instruction is decoded and transmitted", ++}, ++[ POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER ] = { /* 836 */ ++ .pme_name = "PM_ISIDE_DISP_FAIL_OTHER", ++ .pme_code = 0x000002688A, ++ .pme_short_desc = "All i-side dispatch attempts that failed due to a reason other than addrs collision", ++ .pme_long_desc = "All i-side dispatch attempts that failed due to a reason other than addrs collision", ++}, ++[ POWER9_PME_PM_L2_IC_INV ] = { /* 837 */ ++ .pme_name = "PM_L2_IC_INV", ++ .pme_code = 0x0000026082, ++ .pme_short_desc = "Icache Invalidates from L2", ++ .pme_long_desc = "Icache Invalidates from L2", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L3 ] = { /* 838 */ ++ .pme_name = "PM_DPTEG_FROM_L3", ++ .pme_code = 0x000004E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L2_HIT ] = { /* 839 */ ++ .pme_name = "PM_RADIX_PWC_L2_HIT", ++ .pme_code = 0x000002D024, ++ .pme_short_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", ++}, ++[ POWER9_PME_PM_DC_PREF_HW_ALLOC ] = { /* 840 */ ++ .pme_name = "PM_DC_PREF_HW_ALLOC", ++ .pme_code = 0x000000F0A4, ++ .pme_short_desc = "Prefetch stream allocated by the hardware prefetch mechanism", ++ .pme_long_desc = "Prefetch stream allocated by the hardware prefetch mechanism", ++}, ++[ POWER9_PME_PM_LSU0_VECTOR_LD_FIN ] = { /* 841 */ ++ .pme_name = "PM_LSU0_VECTOR_LD_FIN", ++ .pme_code = 0x000000C080, ++ .pme_short_desc = "A vector load instruction finished.", ++ .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++}, ++[ POWER9_PME_PM_1PLUS_PPC_DISP ] = { /* 842 */ ++ .pme_name = "PM_1PLUS_PPC_DISP", ++ .pme_code = 0x00000400F2, ++ .pme_short_desc = "Cycles at least one Instr Dispatched", ++ .pme_long_desc = "Cycles at least one Instr Dispatched", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 ] = { /* 843 */ ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L2", ++ .pme_code = 0x000002D02E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache. 
This implies that a level 4 PWC access was not necessary for this translation", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2MISS ] = { /* 844 */ ++ .pme_name = "PM_DATA_FROM_L2MISS", ++ .pme_code = 0x00000200FE, ++ .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", ++ .pme_long_desc = "Demand LD - L2 Miss (not L2 hit)", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV ] = { /* 845 */ ++ .pme_name = "PM_MRK_FAB_RSP_RD_T_INTV", ++ .pme_code = 0x000001015E, ++ .pme_short_desc = "Sampled Read got a T intervention", ++ .pme_long_desc = "Sampled Read got a T intervention", ++}, ++[ POWER9_PME_PM_NTC_ISSUE_HELD_ARB ] = { /* 846 */ ++ .pme_name = "PM_NTC_ISSUE_HELD_ARB", ++ .pme_code = 0x000002E016, ++ .pme_short_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", ++ .pme_long_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", ++}, ++[ POWER9_PME_PM_LSU2_L1_CAM_CANCEL ] = { /* 847 */ ++ .pme_name = "PM_LSU2_L1_CAM_CANCEL", ++ .pme_code = 0x000000F094, ++ .pme_short_desc = "ls2 l1 tm cam cancel", ++ .pme_long_desc = "ls2 l1 tm cam cancel", ++}, ++[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH ] = { /* 848 */ ++ .pme_name = "PM_L3_GRP_GUESS_WRONG_HIGH", ++ .pme_code = 0x00000368B2, ++ .pme_short_desc = "Initial scope=group but data from local node.", ++ .pme_long_desc = "Initial scope=group but data from local node. 
Prediction too high", ++}, ++[ POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT ] = { /* 849 */ ++ .pme_name = "PM_DATA_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001C044, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", ++}, ++[ POWER9_PME_PM_SUSPENDED ] = { /* 850 */ ++ .pme_name = "PM_SUSPENDED", ++ .pme_code = 0x0000010000, ++ .pme_short_desc = "Counter OFF", ++ .pme_long_desc = "Counter OFF", ++}, ++[ POWER9_PME_PM_L3_SYS_GUESS_WRONG ] = { /* 851 */ ++ .pme_name = "PM_L3_SYS_GUESS_WRONG", ++ .pme_code = 0x00000460B2, ++ .pme_short_desc = "Initial scope=system but data from local or near.", ++ .pme_long_desc = "Initial scope=system but data from local or near. Prediction too high", ++}, ++[ POWER9_PME_PM_L3_L2_CO_HIT ] = { /* 852 */ ++ .pme_name = "PM_L3_L2_CO_HIT", ++ .pme_code = 0x00000360A2, ++ .pme_short_desc = "L2 castout hits", ++ .pme_long_desc = "L2 castout hits", ++}, ++[ POWER9_PME_PM_LSU0_TM_L1_HIT ] = { /* 853 */ ++ .pme_name = "PM_LSU0_TM_L1_HIT", ++ .pme_code = 0x000000E094, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", ++}, ++[ POWER9_PME_PM_BR_MPRED_PCACHE ] = { /* 854 */ ++ .pme_name = "PM_BR_MPRED_PCACHE", ++ .pme_code = 0x00000048B0, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", ++}, ++[ POWER9_PME_PM_STCX_FAIL ] = { /* 855 */ ++ .pme_name = "PM_STCX_FAIL", ++ .pme_code = 0x000001E058, ++ .pme_short_desc = "stcx failed", ++ .pme_long_desc = "stcx failed", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_NEXT ] = { /* 856 */ ++ .pme_name = "PM_LSU_FLUSH_NEXT", ++ .pme_code = 0x00000020B0, ++ .pme_short_desc = "LSU flush next reported at flush time.", ++ .pme_long_desc = "LSU 
flush next reported at flush time. Sometimes these also come with an exception", ++}, ++[ POWER9_PME_PM_DSIDE_MRU_TOUCH ] = { /* 857 */ ++ .pme_name = "PM_DSIDE_MRU_TOUCH", ++ .pme_code = 0x0000026884, ++ .pme_short_desc = "dside L2 MRU touch", ++ .pme_long_desc = "dside L2 MRU touch", ++}, ++[ POWER9_PME_PM_SN_MISS ] = { /* 858 */ ++ .pme_name = "PM_SN_MISS", ++ .pme_code = 0x00000468A8, ++ .pme_short_desc = "Any port snooper miss.", ++ .pme_long_desc = "Any port snooper miss. Up to 4 can happen in a cycle but we only count 1", ++}, ++[ POWER9_PME_PM_BR_PRED_TAKEN_CMPL ] = { /* 859 */ ++ .pme_name = "PM_BR_PRED_TAKEN_CMPL", ++ .pme_code = 0x000000489C, ++ .pme_short_desc = "Conditional Branch Completed in which the HW predicted the Direction or Target and the branch was resolved taken.", ++ .pme_long_desc = "Conditional Branch Completed in which the HW predicted the Direction or Target and the branch was resolved taken. Counted at completion time", ++}, ++[ POWER9_PME_PM_L3_P0_SYS_PUMP ] = { /* 860 */ ++ .pme_name = "PM_L3_P0_SYS_PUMP", ++ .pme_code = 0x00000360B0, ++ .pme_short_desc = "L3 pf sent with sys scope port 0", ++ .pme_long_desc = "L3 pf sent with sys scope port 0", ++}, ++[ POWER9_PME_PM_L3_HIT ] = { /* 861 */ ++ .pme_name = "PM_L3_HIT", ++ .pme_code = 0x00000160A4, ++ .pme_short_desc = "L3 Hits", ++ .pme_long_desc = "L3 Hits", ++}, ++[ POWER9_PME_PM_MRK_DFU_FIN ] = { /* 862 */ ++ .pme_name = "PM_MRK_DFU_FIN", ++ .pme_code = 0x0000020132, ++ .pme_short_desc = "Decimal Unit marked Instruction Finish", ++ .pme_long_desc = "Decimal Unit marked Instruction Finish", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_NESTED_TEND ] = { /* 863 */ ++ .pme_name = "PM_CMPLU_STALL_NESTED_TEND", ++ .pme_code = 0x000003003C, ++ .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level.", ++ .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend 
and decrement the TEXASR nested level. This is a short delay", ++}, ++[ POWER9_PME_PM_INST_FROM_L1 ] = { /* 864 */ ++ .pme_name = "PM_INST_FROM_L1", ++ .pme_code = 0x0000004080, ++ .pme_short_desc = "Instruction fetches from L1.", ++ .pme_long_desc = "Instruction fetches from L1. L1 instruction hit", ++}, ++[ POWER9_PME_PM_IC_DEMAND_REQ ] = { /* 865 */ ++ .pme_name = "PM_IC_DEMAND_REQ", ++ .pme_code = 0x0000004088, ++ .pme_short_desc = "Demand Instruction fetch request", ++ .pme_long_desc = "Demand Instruction fetch request", ++}, ++[ POWER9_PME_PM_BRU_FIN ] = { /* 866 */ ++ .pme_name = "PM_BRU_FIN", ++ .pme_code = 0x0000010068, ++ .pme_short_desc = "Branch Instruction Finished", ++ .pme_long_desc = "Branch Instruction Finished", ++}, ++[ POWER9_PME_PM_L1_ICACHE_RELOADED_ALL ] = { /* 867 */ ++ .pme_name = "PM_L1_ICACHE_RELOADED_ALL", ++ .pme_code = 0x0000040012, ++ .pme_short_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", ++ .pme_long_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", ++}, ++[ POWER9_PME_PM_IERAT_RELOAD_16M ] = { /* 868 */ ++ .pme_name = "PM_IERAT_RELOAD_16M", ++ .pme_code = 0x000004006A, ++ .pme_short_desc = "IERAT Reloaded (Miss) for a 16M page", ++ .pme_long_desc = "IERAT Reloaded (Miss) for a 16M page", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2MISS_MOD ] = { /* 869 */ ++ .pme_name = "PM_DATA_FROM_L2MISS_MOD", ++ .pme_code = 0x000001C04E, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", ++}, ++[ POWER9_PME_PM_LSU0_ERAT_HIT ] = { /* 870 */ ++ .pme_name = "PM_LSU0_ERAT_HIT", ++ .pme_code = 0x000000E08C, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", ++}, ++[ POWER9_PME_PM_L3_PF0_BUSY ] = { /* 871 */ ++ .pme_name = "PM_L3_PF0_BUSY", ++ .pme_code = 0x00000460B4, ++ .pme_short_desc = "lifetime, sample of PF machine 0 valid", ++ .pme_long_desc = "lifetime, sample of PF machine 0 valid", ++}, ++[ POWER9_PME_PM_MRK_DPTEG_FROM_LL4 ] = { /* 872 */ ++ .pme_name = "PM_MRK_DPTEG_FROM_LL4", ++ .pme_code = 0x000001F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_LSU3_SET_MPRED ] = { /* 873 */ ++ .pme_name = "PM_LSU3_SET_MPRED", ++ .pme_code = 0x000000D884, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", ++}, ++[ POWER9_PME_PM_TM_CAM_OVERFLOW ] = { /* 874 */ ++ .pme_name = "PM_TM_CAM_OVERFLOW", ++ .pme_code = 0x00000168A6, ++ .pme_short_desc = "l3 tm cam overflow during L2 co of SC", ++ .pme_long_desc = "l3 tm cam overflow during L2 co of SC", ++}, ++[ POWER9_PME_PM_SYNC_MRK_FX_DIVIDE ] = { /* 875 */ ++ .pme_name = "PM_SYNC_MRK_FX_DIVIDE", ++ .pme_code = 0x0000015156, ++ .pme_short_desc = "Marked fixed point divide that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked fixed point divide that can cause a synchronous interrupt", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L2_1_SHR ] = { /* 876 */ ++ .pme_name = "PM_IPTEG_FROM_L2_1_SHR", ++ .pme_code = 0x0000035046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due 
to a instruction side request", ++}, ++[ POWER9_PME_PM_MRK_LD_MISS_L1 ] = { /* 877 */ ++ .pme_name = "PM_MRK_LD_MISS_L1", ++ .pme_code = 0x00000201E2, ++ .pme_short_desc = "Marked DL1 Demand Miss counted at exec time.", ++ .pme_long_desc = "Marked DL1 Demand Miss counted at exec time. Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM ] = { /* 878 */ ++ .pme_name = "PM_MRK_FAB_RSP_DCLAIM", ++ .pme_code = 0x0000030154, ++ .pme_short_desc = "Marked store had to do a dclaim", ++ .pme_long_desc = "Marked store had to do a dclaim", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT ] = { /* 879 */ ++ .pme_name = "PM_IPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x0000035042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", ++}, ++[ POWER9_PME_PM_NON_FMA_FLOP_CMPL ] = { /* 880 */ ++ .pme_name = "PM_NON_FMA_FLOP_CMPL", ++ .pme_code = 0x000004D056, ++ .pme_short_desc = "Non fma flop instruction completed", ++ .pme_long_desc = "Non fma flop instruction completed", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS ] = { /* 881 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2MISS", ++ .pme_code = 0x00000401E8, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", ++}, ++[ POWER9_PME_PM_L2_SYS_GUESS_WRONG ] = { /* 882 */ ++ .pme_name = "PM_L2_SYS_GUESS_WRONG", ++ .pme_code = 0x0000036888, ++ .pme_short_desc = "L2 guess sys and guess was not correct (ie data ^beyond-6chip)", ++ .pme_long_desc = "L2 guess sys and guess was not 
correct (ie data ^beyond-6chip)", ++}, ++[ POWER9_PME_PM_THRESH_EXC_2048 ] = { /* 883 */ ++ .pme_name = "PM_THRESH_EXC_2048", ++ .pme_code = 0x00000401EC, ++ .pme_short_desc = "Threshold counter exceeded a value of 2048", ++ .pme_long_desc = "Threshold counter exceeded a value of 2048", ++}, ++[ POWER9_PME_PM_INST_FROM_LL4 ] = { /* 884 */ ++ .pme_name = "PM_INST_FROM_LL4", ++ .pme_code = 0x000001404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_DATA_FROM_RL2L3_SHR ] = { /* 885 */ ++ .pme_name = "PM_DATA_FROM_RL2L3_SHR", ++ .pme_code = 0x000001C04A, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++}, ++[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 886 */ ++ .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x000003C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_WRK_ARND ] = { /* 887 */ ++ .pme_name = "PM_LSU_FLUSH_WRK_ARND", ++ .pme_code = 0x000000C0B4, ++ .pme_short_desc = "LSU workaround flush.", ++ .pme_long_desc = "LSU workaround flush. These flushes are setup with programmable scan only latches to perform various actions when the flsh macro receives a trigger from the dbg macros. 
These actions include things like flushing the next op encountered for a particular thread or flushing the next op that is NTC op that is encountered on a particular slice. The kind of flush that the workaround is setup to perform is highly variable.", ++}, ++[ POWER9_PME_PM_L3_PF_HIT_L3 ] = { /* 888 */ ++ .pme_name = "PM_L3_PF_HIT_L3", ++ .pme_code = 0x00000260A8, ++ .pme_short_desc = "l3 pf hit in l3", ++ .pme_long_desc = "l3 pf hit in l3", ++}, ++[ POWER9_PME_PM_RD_FORMING_SC ] = { /* 889 */ ++ .pme_name = "PM_RD_FORMING_SC", ++ .pme_code = 0x00000460A6, ++ .pme_short_desc = "rd forming sc", ++ .pme_long_desc = "rd forming sc", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD_CYC ] = { /* 890 */ ++ .pme_name = "PM_MRK_DATA_FROM_L2_1_MOD_CYC", ++ .pme_code = 0x000003D148, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_DL4 ] = { /* 891 */ ++ .pme_name = "PM_IPTEG_FROM_DL4", ++ .pme_code = 0x000003504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_STORE_FINISH ] = { /* 892 */ ++ .pme_name = "PM_CMPLU_STALL_STORE_FINISH", ++ .pme_code = 0x000002C014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_LL4 ] = { /* 
893 */ ++ .pme_name = "PM_IPTEG_FROM_LL4", ++ .pme_code = 0x000001504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", ++}, ++[ POWER9_PME_PM_1FLOP_CMPL ] = { /* 894 */ ++ .pme_name = "PM_1FLOP_CMPL", ++ .pme_code = 0x000001000C, ++ .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", ++ .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", ++}, ++[ POWER9_PME_PM_L2_GRP_GUESS_WRONG ] = { /* 895 */ ++ .pme_name = "PM_L2_GRP_GUESS_WRONG", ++ .pme_code = 0x0000026888, ++ .pme_short_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", ++ .pme_long_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", ++}, ++[ POWER9_PME_PM_TM_FAV_TBEGIN ] = { /* 896 */ ++ .pme_name = "PM_TM_FAV_TBEGIN", ++ .pme_code = 0x000000209C, ++ .pme_short_desc = "Dispatch time Favored tbegin", ++ .pme_long_desc = "Dispatch time Favored tbegin", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT ] = { /* 897 */ ++ .pme_name = "PM_INST_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x0000014040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_2FLOP_CMPL ] = { /* 898 */ ++ .pme_name = "PM_2FLOP_CMPL", ++ .pme_code = 0x000004D052, ++ .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg?", ++ .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg?", ++}, ++[ 
POWER9_PME_PM_LS2_TM_DISALLOW ] = { /* 899 */ ++ .pme_name = "PM_LS2_TM_DISALLOW", ++ .pme_code = 0x000000E0B8, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++}, ++[ POWER9_PME_PM_L2_LD_DISP ] = { /* 900 */ ++ .pme_name = "PM_L2_LD_DISP", ++ .pme_code = 0x000001609E, ++ .pme_short_desc = "All successful load dispatches", ++ .pme_long_desc = "All successful load dispatches", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LHS ] = { /* 901 */ ++ .pme_name = "PM_CMPLU_STALL_LHS", ++ .pme_code = 0x000002C01A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", ++}, ++[ POWER9_PME_PM_TLB_HIT ] = { /* 902 */ ++ .pme_name = "PM_TLB_HIT", ++ .pme_code = 0x000001F054, ++ .pme_short_desc = "Number of times the TLB had the data required by the instruction.", ++ .pme_long_desc = "Number of times the TLB had the data required by the instruction. Applies to both HPT and RPT", ++}, ++[ POWER9_PME_PM_HV_CYC ] = { /* 903 */ ++ .pme_name = "PM_HV_CYC", ++ .pme_code = 0x0000020006, ++ .pme_short_desc = "Cycles in which msr_hv is high.", ++ .pme_long_desc = "Cycles in which msr_hv is high. 
Note that this event does not take msr_pr into consideration", ++}, ++[ POWER9_PME_PM_L2_RTY_LD ] = { /* 904 */ ++ .pme_name = "PM_L2_RTY_LD", ++ .pme_code = 0x000003689E, ++ .pme_short_desc = "RC retries on PB for any load from core", ++ .pme_long_desc = "RC retries on PB for any load from core", ++}, ++[ POWER9_PME_PM_STCX_SUCCESS_CMPL ] = { /* 905 */ ++ .pme_name = "PM_STCX_SUCCESS_CMPL", ++ .pme_code = 0x000000C8BC, ++ .pme_short_desc = "Number of stcx instructions that completed successfully", ++ .pme_long_desc = "Number of stcx instructions that completed successfully", ++}, ++[ POWER9_PME_PM_INST_PUMP_MPRED ] = { /* 906 */ ++ .pme_name = "PM_INST_PUMP_MPRED", ++ .pme_code = 0x0000044052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump misprediction. Counts across all types of pumps for an instruction fetch", ++}, ++[ POWER9_PME_PM_LSU2_ERAT_HIT ] = { /* 907 */ ++ .pme_name = "PM_LSU2_ERAT_HIT", ++ .pme_code = 0x000000E090, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", ++}, ++[ POWER9_PME_PM_INST_FROM_RL4 ] = { /* 908 */ ++ .pme_name = "PM_INST_FROM_RL4", ++ .pme_code = 0x000002404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_LD_L3MISS_PEND_CYC ] = { /* 909 */ ++ .pme_name = "PM_LD_L3MISS_PEND_CYC", ++ .pme_code = 0x0000010062, ++ .pme_short_desc = "Cycles L3 miss was pending for this thread", ++ .pme_long_desc = "Cycles L3 miss was pending for this thread", ++}, ++[ POWER9_PME_PM_L3_LAT_CI_MISS ] = { /* 910 */ ++ .pme_name = "PM_L3_LAT_CI_MISS", ++ .pme_code = 0x00000468A2, ++ .pme_short_desc = "L3 Lateral Castins Miss", ++ .pme_long_desc = "L3 Lateral Castins Miss", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_RD_RTY ] = { /* 911 */ ++ .pme_name = "PM_MRK_FAB_RSP_RD_RTY", ++ .pme_code = 0x000004015E, ++ .pme_short_desc = "Sampled L2 reads retry count", ++ .pme_long_desc = "Sampled L2 reads retry count", ++}, ++[ POWER9_PME_PM_DTLB_MISS_16M ] = { /* 912 */ ++ .pme_name = "PM_DTLB_MISS_16M", ++ .pme_code = 0x000004C056, ++ .pme_short_desc = "Data TLB Miss page size 16M", ++ .pme_long_desc = "Data TLB Miss page size 16M", ++}, ++[ POWER9_PME_PM_DPTEG_FROM_L2_1_MOD ] = { /* 913 */ ++ .pme_name = "PM_DPTEG_FROM_L2_1_MOD", ++ .pme_code = 0x000004E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { /* 914 */ ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", ++ .pme_code = 0x0000035150, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_LSU_FIN ] = { /* 915 */ ++ .pme_name = "PM_MRK_LSU_FIN", ++ .pme_code = 0x0000040132, ++ .pme_short_desc = "lsu marked instr PPC finish", ++ .pme_long_desc = "lsu marked instr PPC finish", ++}, ++[ POWER9_PME_PM_LSU0_STORE_REJECT ] = { /* 916 */ ++ .pme_name = "PM_LSU0_STORE_REJECT", ++ .pme_code = 0x000000F08C, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++}, ++[ POWER9_PME_PM_CLB_HELD ] = { /* 917 */ ++ .pme_name = "PM_CLB_HELD", ++ .pme_code = 0x000000208C, ++ .pme_short_desc = "CLB Hold: Any Reason", ++ .pme_long_desc = "CLB Hold: Any Reason", ++}, ++[ POWER9_PME_PM_LS2_ERAT_MISS_PREF ] = { /* 918 */ ++ .pme_name = "PM_LS2_ERAT_MISS_PREF", ++ .pme_code = 0x000000E088, ++ .pme_short_desc = "LS0 Erat miss due to prefetch", ++ .pme_long_desc = "LS0 Erat miss due to prefetch", ++}, ++}; ++#endif +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index 6ff4499..bd57078 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -286,6 +286,7 @@ static pfmlib_pmu_t *pfmlib_pmus[]= + &power6_support, + &power7_support, + &power8_support, ++ &power9_support, + &torrent_support, + &powerpc_nest_mcs_read_support, + 
&powerpc_nest_mcs_write_support, +diff --git a/lib/pfmlib_power9.c b/lib/pfmlib_power9.c +new file mode 100644 +index 0000000..b3807da +--- /dev/null ++++ b/lib/pfmlib_power9.c +@@ -0,0 +1,58 @@ ++/* ++ * pfmlib_power9.c : IBM Power9 support ++ * ++ * Copyright (C) IBM Corporation, 2017. All rights reserved. ++ * Contributed by Will Schmidt (will_schmidt@vnet.ibm.com) ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_power_priv.h" ++#include "events/power9_events.h" ++ ++static int ++pfm_power9_detect(void* this) ++{ ++ if (__is_processor(PV_POWER9)) ++ return PFM_SUCCESS; ++ return PFM_ERR_NOTSUPP; ++} ++ ++pfmlib_pmu_t power9_support={ ++ .desc = "POWER9", ++ .name = "power9", ++ .pmu = PFM_PMU_POWER9, ++ .pme_count = LIBPFM_ARRAY_SIZE(power9_pe), ++ .type = PFM_PMU_TYPE_CORE, ++ .supported_plm = POWER9_PLM, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 2, ++ .max_encoding = 1, ++ .pe = power9_pe, ++ .pmu_detect = pfm_power9_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), ++ PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), ++ .get_event_first = pfm_gen_powerpc_get_event_first, ++ .get_event_next = pfm_gen_powerpc_get_event_next, ++ .event_is_valid = pfm_gen_powerpc_event_is_valid, ++ .validate_table = pfm_gen_powerpc_validate_table, ++ .get_event_info = pfm_gen_powerpc_get_event_info, ++ .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, ++}; +diff --git a/lib/pfmlib_power_priv.h b/lib/pfmlib_power_priv.h +index 3b72d32..6fce092 100644 +--- a/lib/pfmlib_power_priv.h ++++ b/lib/pfmlib_power_priv.h +@@ -96,9 +96,11 @@ typedef struct { + #define PV_POWER8E 0x004b + #define PV_POWER8NVL 0x004c + #define PV_POWER8 0x004d ++#define PV_POWER9 0x004e + + #define POWER_PLM (PFM_PLM0|PFM_PLM3) + #define POWER8_PLM (POWER_PLM|PFM_PLMH) ++#define POWER9_PLM (POWER_PLM|PFM_PLMH) + + extern int pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info); + extern int pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info); +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index b7503a7..1f80571 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -448,6 +448,7 @@ extern pfmlib_pmu_t power5p_support; + extern pfmlib_pmu_t power6_support; 
+ extern pfmlib_pmu_t power7_support; + extern pfmlib_pmu_t power8_support; ++extern pfmlib_pmu_t power9_support; + extern pfmlib_pmu_t torrent_support; + extern pfmlib_pmu_t powerpc_nest_mcs_read_support; + extern pfmlib_pmu_t powerpc_nest_mcs_write_support; +-- +2.9.4 + +From 930ef5dbcc5d0d663979e16079aac12a7872d44d Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 2 Jun 2017 12:10:17 -0700 +Subject: [PATCH 3/6] fix Power9 event file header + +Had __POWER8_EVENTS_H__ instead of __POWER9_EVENTS_H__ + +Signed-off-by: Stephane Eranian +--- + lib/events/power9_events.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/lib/events/power9_events.h b/lib/events/power9_events.h +index 7414687..765e9bd 100644 +--- a/lib/events/power9_events.h ++++ b/lib/events/power9_events.h +@@ -20,7 +20,7 @@ + */ + + #ifndef __POWER9_EVENTS_H__ +-#define __POWER8_EVENTS_H__ ++#define __POWER9_EVENTS_H__ + + #define POWER9_PME_PM_IERAT_RELOAD 0 + #define POWER9_PME_PM_TM_OUTER_TEND 1 +-- +2.9.4 + +From 10d8044873d1aefb2428a9a4640a33920552bfcc Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Tue, 6 Jun 2017 10:36:55 -0500 +Subject: [PATCH 5/6] power9 event list update for perfmon2 + +Updated ppc64/Power9 event list. + +Updated event list. (Jun 2017). Significant updates +here with respect to the preliminary list that was posted +back in January. I've made this sorted alphabetically, +with the expectation future updates (if any) will be +less noisy. + +Signed-off-by: Will Schmidt +--- + lib/events/power9_events.h | 11160 ++++++++++++++++++++++--------------------- + 1 file changed, 5676 insertions(+), 5484 deletions(-) + +diff --git a/lib/events/power9_events.h b/lib/events/power9_events.h +index 765e9bd..72c481b 100644 +--- a/lib/events/power9_events.h ++++ b/lib/events/power9_events.h +@@ -8,6 +8,7 @@ + * + * Mods: + * Initial content generated by Will Schmidt. (Jan 31, 2017). ++* Refresh/update generated Jun 06, 2017 by Will Schmidt. 
+ * + * Contributed by + * (C) Copyright IBM Corporation, 2017. All Rights Reserved. +@@ -22,6439 +23,6630 @@ + #ifndef __POWER9_EVENTS_H__ + #define __POWER9_EVENTS_H__ + +-#define POWER9_PME_PM_IERAT_RELOAD 0 +-#define POWER9_PME_PM_TM_OUTER_TEND 1 +-#define POWER9_PME_PM_IPTEG_FROM_L3 2 +-#define POWER9_PME_PM_DPTEG_FROM_L3_1_MOD 3 +-#define POWER9_PME_PM_PMC2_SAVED 4 +-#define POWER9_PME_PM_LSU_FLUSH_SAO 5 +-#define POWER9_PME_PM_CMPLU_STALL_DFU 6 +-#define POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS 7 +-#define POWER9_PME_PM_SP_FLOP_CMPL 8 +-#define POWER9_PME_PM_IC_RELOAD_PRIVATE 9 +-#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 10 +-#define POWER9_PME_PM_INST_PUMP_CPRED 11 +-#define POWER9_PME_PM_INST_FROM_L2_1_MOD 12 +-#define POWER9_PME_PM_MRK_ST_CMPL 13 +-#define POWER9_PME_PM_MRK_LSU_DERAT_MISS 14 +-#define POWER9_PME_PM_L2_ST_DISP 15 +-#define POWER9_PME_PM_LSU0_FALSE_LHS 16 +-#define POWER9_PME_PM_L2_CASTOUT_MOD 17 +-#define POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR 18 +-#define POWER9_PME_PM_MRK_INST_TIMEO 19 +-#define POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH 20 +-#define POWER9_PME_PM_INST_FROM_L2_1_SHR 21 +-#define POWER9_PME_PM_LS1_DC_COLLISIONS 22 +-#define POWER9_PME_PM_LSU2_FALSE_LHS 23 +-#define POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC 24 +-#define POWER9_PME_PM_MRK_DTLB_MISS_16M 25 +-#define POWER9_PME_PM_L2_GROUP_PUMP 26 +-#define POWER9_PME_PM_LSU2_VECTOR_ST_FIN 27 +-#define POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB 28 +-#define POWER9_PME_PM_L3_CO_LCO 29 +-#define POWER9_PME_PM_INST_GRP_PUMP_CPRED 30 +-#define POWER9_PME_PM_THRD_PRIO_4_5_CYC 31 +-#define POWER9_PME_PM_BR_PRED_TA 32 +-#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS 33 +-#define POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT 34 +-#define POWER9_PME_PM_CMPLU_STALL_FXU 35 +-#define POWER9_PME_PM_VSU_FSQRT_FDIV 36 +-#define POWER9_PME_PM_EXT_INT 37 +-#define POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC 38 +-#define POWER9_PME_PM_S2Q_FULL 39 +-#define POWER9_PME_PM_RUN_CYC_SMT2_MODE 40 +-#define 
POWER9_PME_PM_DECODE_LANES_NOT_AVAIL 41 +-#define POWER9_PME_PM_TM_FAIL_TLBIE 42 +-#define POWER9_PME_PM_DISP_CLB_HELD_BAL 43 +-#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC 44 +-#define POWER9_PME_PM_MRK_ST_FWD 45 +-#define POWER9_PME_PM_FXU_FIN 46 +-#define POWER9_PME_PM_SYNC_MRK_BR_MPRED 47 +-#define POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB 48 +-#define POWER9_PME_PM_DSLB_MISS 49 +-#define POWER9_PME_PM_L3_MISS 50 +-#define POWER9_PME_PM_DUMMY2_REMOVE_ME 51 +-#define POWER9_PME_PM_MRK_DERAT_MISS_1G 52 +-#define POWER9_PME_PM_MATH_FLOP_CMPL 53 +-#define POWER9_PME_PM_L2_INST 54 +-#define POWER9_PME_PM_FLUSH_DISP 55 +-#define POWER9_PME_PM_DISP_HELD_ISSQ_FULL 56 +-#define POWER9_PME_PM_MEM_READ 57 +-#define POWER9_PME_PM_DATA_PUMP_MPRED 58 +-#define POWER9_PME_PM_DATA_CHIP_PUMP_CPRED 59 +-#define POWER9_PME_PM_MRK_DATA_FROM_DMEM 60 +-#define POWER9_PME_PM_CMPLU_STALL_LSU 61 +-#define POWER9_PME_PM_DATA_FROM_L3_1_MOD 62 +-#define POWER9_PME_PM_MRK_DERAT_MISS_16M 63 +-#define POWER9_PME_PM_TM_TRANS_RUN_CYC 64 +-#define POWER9_PME_PM_THRD_ALL_RUN_CYC 65 +-#define POWER9_PME_PM_DATA_FROM_DL2L3_MOD 66 +-#define POWER9_PME_PM_MRK_BR_MPRED_CMPL 67 +-#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ 68 +-#define POWER9_PME_PM_MRK_INST 69 +-#define POWER9_PME_PM_TABLEWALK_CYC_PREF 70 +-#define POWER9_PME_PM_LSU1_ERAT_HIT 71 +-#define POWER9_PME_PM_NTC_ISSUE_HELD_OTHER 72 +-#define POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT 73 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2 74 +-#define POWER9_PME_PM_LS1_TM_DISALLOW 75 +-#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST 76 +-#define POWER9_PME_PM_BR_PRED_PCACHE 77 +-#define POWER9_PME_PM_MRK_BACK_BR_CMPL 78 +-#define POWER9_PME_PM_RD_CLEARING_SC 79 +-#define POWER9_PME_PM_PMC1_OVERFLOW 80 +-#define POWER9_PME_PM_L2_RTY_ST 81 +-#define POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT 82 +-#define POWER9_PME_PM_LSU1_FALSE_LHS 83 +-#define POWER9_PME_PM_LSU0_VECTOR_ST_FIN 84 +-#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH 85 +-#define 
POWER9_PME_PM_LS2_UNALIGNED_LD 86 +-#define POWER9_PME_PM_BR_TAKEN_CMPL 87 +-#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED 88 +-#define POWER9_PME_PM_ISQ_36_44_ENTRIES 89 +-#define POWER9_PME_PM_LSU1_VECTOR_LD_FIN 90 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER 91 +-#define POWER9_PME_PM_ICT_NOSLOT_IC_MISS 92 +-#define POWER9_PME_PM_LSU3_TM_L1_HIT 93 +-#define POWER9_PME_PM_MRK_INST_DISP 94 +-#define POWER9_PME_PM_VECTOR_FLOP_CMPL 95 +-#define POWER9_PME_PM_FXU_IDLE 96 +-#define POWER9_PME_PM_INST_CMPL 97 +-#define POWER9_PME_PM_EAT_FORCE_MISPRED 98 +-#define POWER9_PME_PM_CMPLU_STALL_LRQ_FULL 99 +-#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD 100 +-#define POWER9_PME_PM_BACK_BR_CMPL 101 +-#define POWER9_PME_PM_NEST_REF_CLK 102 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR 103 +-#define POWER9_PME_PM_RC_USAGE 104 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_MOD 105 +-#define POWER9_PME_PM_BR_CMPL 106 +-#define POWER9_PME_PM_INST_FROM_RL2L3_MOD 107 +-#define POWER9_PME_PM_SHL_CREATED 108 +-#define POWER9_PME_PM_CMPLU_STALL_PASTE 109 +-#define POWER9_PME_PM_LSU3_LDMX_FIN 110 +-#define POWER9_PME_PM_SN_USAGE 111 +-#define POWER9_PME_PM_L2_ST_HIT 112 ++#define POWER9_PME_PM_1FLOP_CMPL 0 ++#define POWER9_PME_PM_1PLUS_PPC_CMPL 1 ++#define POWER9_PME_PM_1PLUS_PPC_DISP 2 ++#define POWER9_PME_PM_2FLOP_CMPL 3 ++#define POWER9_PME_PM_4FLOP_CMPL 4 ++#define POWER9_PME_PM_8FLOP_CMPL 5 ++#define POWER9_PME_PM_ANY_THRD_RUN_CYC 6 ++#define POWER9_PME_PM_BACK_BR_CMPL 7 ++#define POWER9_PME_PM_BANK_CONFLICT 8 ++#define POWER9_PME_PM_BFU_BUSY 9 ++#define POWER9_PME_PM_BR_2PATH 10 ++#define POWER9_PME_PM_BR_CMPL 11 ++#define POWER9_PME_PM_BR_CORECT_PRED_TAKEN_CMPL 12 ++#define POWER9_PME_PM_BR_MPRED_CCACHE 13 ++#define POWER9_PME_PM_BR_MPRED_CMPL 14 ++#define POWER9_PME_PM_BR_MPRED_LSTACK 15 ++#define POWER9_PME_PM_BR_MPRED_PCACHE 16 ++#define POWER9_PME_PM_BR_MPRED_TAKEN_CR 17 ++#define POWER9_PME_PM_BR_MPRED_TAKEN_TA 18 ++#define POWER9_PME_PM_BR_PRED_CCACHE 19 
++#define POWER9_PME_PM_BR_PRED_LSTACK 20
++#define POWER9_PME_PM_BR_PRED_PCACHE 21
++#define POWER9_PME_PM_BR_PRED_TAKEN_CR 22
++#define POWER9_PME_PM_BR_PRED_TA 23
++#define POWER9_PME_PM_BR_PRED 24
++#define POWER9_PME_PM_BR_TAKEN_CMPL 25
++#define POWER9_PME_PM_BRU_FIN 26
++#define POWER9_PME_PM_BR_UNCOND 27
++#define POWER9_PME_PM_BTAC_BAD_RESULT 28
++#define POWER9_PME_PM_BTAC_GOOD_RESULT 29
++#define POWER9_PME_PM_CHIP_PUMP_CPRED 30
++#define POWER9_PME_PM_CLB_HELD 31
++#define POWER9_PME_PM_CMPLU_STALL_ANY_SYNC 32
++#define POWER9_PME_PM_CMPLU_STALL_BRU 33
++#define POWER9_PME_PM_CMPLU_STALL_CRYPTO 34
++#define POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS 35
++#define POWER9_PME_PM_CMPLU_STALL_DFLONG 36
++#define POWER9_PME_PM_CMPLU_STALL_DFU 37
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 38
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT 39
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 40
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS 41
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM 42
++#define POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE 43
++#define POWER9_PME_PM_CMPLU_STALL_DPLONG 44
++#define POWER9_PME_PM_CMPLU_STALL_DP 45
++#define POWER9_PME_PM_CMPLU_STALL_EIEIO 46
++#define POWER9_PME_PM_CMPLU_STALL_EMQ_FULL 47
++#define POWER9_PME_PM_CMPLU_STALL_ERAT_MISS 48
++#define POWER9_PME_PM_CMPLU_STALL_EXCEPTION 49
++#define POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT 50
++#define POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD 51
++#define POWER9_PME_PM_CMPLU_STALL_FXLONG 52
++#define POWER9_PME_PM_CMPLU_STALL_FXU 53
++#define POWER9_PME_PM_CMPLU_STALL_HWSYNC 54
++#define POWER9_PME_PM_CMPLU_STALL_LARX 55
++#define POWER9_PME_PM_CMPLU_STALL_LHS 56
++#define POWER9_PME_PM_CMPLU_STALL_LMQ_FULL 57
++#define POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH 58
++#define POWER9_PME_PM_CMPLU_STALL_LRQ_FULL 59
++#define POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER 60
++#define POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB 61
++#define POWER9_PME_PM_CMPLU_STALL_LSU_FIN 62
++#define POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT 63
++#define POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR 64
++#define POWER9_PME_PM_CMPLU_STALL_LSU 65
++#define POWER9_PME_PM_CMPLU_STALL_LWSYNC 66
++#define POWER9_PME_PM_CMPLU_STALL_MTFPSCR 67
++#define POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN 68
++#define POWER9_PME_PM_CMPLU_STALL_NESTED_TEND 69
++#define POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN 70
++#define POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH 71
++#define POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL 72
++#define POWER9_PME_PM_CMPLU_STALL_PASTE 73
++#define POWER9_PME_PM_CMPLU_STALL_PM 74
++#define POWER9_PME_PM_CMPLU_STALL_SLB 75
++#define POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH 76
++#define POWER9_PME_PM_CMPLU_STALL_SRQ_FULL 77
++#define POWER9_PME_PM_CMPLU_STALL_STCX 78
++#define POWER9_PME_PM_CMPLU_STALL_ST_FWD 79
++#define POWER9_PME_PM_CMPLU_STALL_STORE_DATA 80
++#define POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB 81
++#define POWER9_PME_PM_CMPLU_STALL_STORE_FINISH 82
++#define POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB 83
++#define POWER9_PME_PM_CMPLU_STALL_SYNC_PMU_INT 84
++#define POWER9_PME_PM_CMPLU_STALL_TEND 85
++#define POWER9_PME_PM_CMPLU_STALL_THRD 86
++#define POWER9_PME_PM_CMPLU_STALL_TLBIE 87
++#define POWER9_PME_PM_CMPLU_STALL 88
++#define POWER9_PME_PM_CMPLU_STALL_VDPLONG 89
++#define POWER9_PME_PM_CMPLU_STALL_VDP 90
++#define POWER9_PME_PM_CMPLU_STALL_VFXLONG 91
++#define POWER9_PME_PM_CMPLU_STALL_VFXU 92
++#define POWER9_PME_PM_CO0_BUSY 93
++#define POWER9_PME_PM_CO0_BUSY_ALT 94
++#define POWER9_PME_PM_CO_DISP_FAIL 95
++#define POWER9_PME_PM_CO_TM_SC_FOOTPRINT 96
++#define POWER9_PME_PM_CO_USAGE 97
++#define POWER9_PME_PM_CYC 98
++#define POWER9_PME_PM_DARQ0_0_3_ENTRIES 99
++#define POWER9_PME_PM_DARQ0_10_12_ENTRIES 100
++#define POWER9_PME_PM_DARQ0_4_6_ENTRIES 101
++#define POWER9_PME_PM_DARQ0_7_9_ENTRIES 102
++#define POWER9_PME_PM_DARQ1_0_3_ENTRIES 103
++#define POWER9_PME_PM_DARQ1_10_12_ENTRIES 104
++#define POWER9_PME_PM_DARQ1_4_6_ENTRIES 105
++#define POWER9_PME_PM_DARQ1_7_9_ENTRIES 106
++#define POWER9_PME_PM_DARQ_STORE_REJECT 107
++#define POWER9_PME_PM_DARQ_STORE_XMIT 108
++#define POWER9_PME_PM_DATA_CHIP_PUMP_CPRED 109
++#define POWER9_PME_PM_DATA_FROM_DL2L3_MOD 110
++#define POWER9_PME_PM_DATA_FROM_DL2L3_SHR 111
++#define POWER9_PME_PM_DATA_FROM_DL4 112
+ #define POWER9_PME_PM_DATA_FROM_DMEM 113
+-#define POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE 114
+-#define POWER9_PME_PM_LSU2_LDMX_FIN 115
+-#define POWER9_PME_PM_L3_LD_MISS 116
+-#define POWER9_PME_PM_DPTEG_FROM_RL4 117
+-#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 118
+-#define POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC 119
+-#define POWER9_PME_PM_TM_SC_CO 120
+-#define POWER9_PME_PM_L2_SN_SX_I_DONE 121
+-#define POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT 122
+-#define POWER9_PME_PM_ISIDE_L2MEMACC 123
+-#define POWER9_PME_PM_L3_P0_GRP_PUMP 124
+-#define POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR 125
+-#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 126
+-#define POWER9_PME_PM_THRESH_MET 127
+-#define POWER9_PME_PM_DATA_FROM_L2_MEPF 128
+-#define POWER9_PME_PM_DISP_STARVED 129
+-#define POWER9_PME_PM_L3_P0_LCO_RTY 130
+-#define POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL 131
+-#define POWER9_PME_PM_L3_RD_USAGE 132
+-#define POWER9_PME_PM_TLBIE_FIN 133
+-#define POWER9_PME_PM_DPTEG_FROM_LL4 134
+-#define POWER9_PME_PM_CMPLU_STALL_TLBIE 135
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC 136
+-#define POWER9_PME_PM_LS3_DC_COLLISIONS 137
+-#define POWER9_PME_PM_L1_ICACHE_MISS 138
+-#define POWER9_PME_PM_LSU_REJECT_ERAT_MISS 139
+-#define POWER9_PME_PM_DATA_SYS_PUMP_CPRED 140
+-#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC 141
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR_CYC 142
+-#define POWER9_PME_PM_LSU_FLUSH_UE 143
+-#define POWER9_PME_PM_BR_PRED_TAKEN_CR 144
+-#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER 145
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR 146
+-#define POWER9_PME_PM_DATA_FROM_L2_1_MOD 147
+-#define POWER9_PME_PM_LSU_FLUSH_LHL_SHL 148
+-#define POWER9_PME_PM_L3_P1_PF_RTY 149
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD 150
+-#define POWER9_PME_PM_DFU_BUSY 151
+-#define POWER9_PME_PM_LSU1_TM_L1_MISS 152
+-#define POWER9_PME_PM_FREQ_UP 153
+-#define POWER9_PME_PM_DATA_FROM_LMEM 154
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF 155
+-#define POWER9_PME_PM_ISIDE_DISP 156
+-#define POWER9_PME_PM_TM_OUTER_TBEGIN 157
+-#define POWER9_PME_PM_PMC3_OVERFLOW 158
+-#define POWER9_PME_PM_LSU0_SET_MPRED 159
+-#define POWER9_PME_PM_INST_FROM_L2_MEPF 160
+-#define POWER9_PME_PM_L3_P0_NODE_PUMP 161
+-#define POWER9_PME_PM_IPTEG_FROM_L3_1_MOD 162
+-#define POWER9_PME_PM_L3_PF_USAGE 163
+-#define POWER9_PME_PM_CMPLU_STALL_BRU 164
+-#define POWER9_PME_PM_ISLB_MISS 165
+-#define POWER9_PME_PM_CYC 166
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR 167
+-#define POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD 168
+-#define POWER9_PME_PM_DARQ_10_12_ENTRIES 169
+-#define POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC 170
+-#define POWER9_PME_PM_DECODE_FUSION_OP_PRESERV 171
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF 172
+-#define POWER9_PME_PM_MRK_L1_RELOAD_VALID 173
+-#define POWER9_PME_PM_LSU2_SET_MPRED 174
+-#define POWER9_PME_PM_1PLUS_PPC_CMPL 175
+-#define POWER9_PME_PM_DATA_FROM_LL4 176
+-#define POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS 177
+-#define POWER9_PME_PM_TM_CAP_OVERFLOW 178
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_LMEM 179
+-#define POWER9_PME_PM_LSU3_FALSE_LHS 180
+-#define POWER9_PME_PM_THRESH_EXC_512 181
+-#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 182
+-#define POWER9_PME_PM_HWSYNC 183
+-#define POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW 184
+-#define POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY 185
+-#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL 186
+-#define POWER9_PME_PM_DC_DEALLOC_NO_CONF 187
+-#define POWER9_PME_PM_CMPLU_STALL_VFXLONG 188
+-#define POWER9_PME_PM_MEM_LOC_THRESH_IFU 189
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_CYC 190
+-#define POWER9_PME_PM_PTE_PREFETCH 191
+-#define POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB 192
+-#define POWER9_PME_PM_CMPLU_STALL_SLB 193
+-#define POWER9_PME_PM_MRK_DERAT_MISS_4K 194
+-#define POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR 195
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_SHR 196
+-#define POWER9_PME_PM_VSU_DP_FSQRT_FDIV 197
+-#define POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_SHR 198
+-#define POWER9_PME_PM_L3_P0_LCO_DATA 199
+-#define POWER9_PME_PM_RUN_INST_CMPL 200
+-#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE 201
+-#define POWER9_PME_PM_MRK_TEND_FAIL 202
+-#define POWER9_PME_PM_MRK_VSU_FIN 203
+-#define POWER9_PME_PM_DATA_FROM_L3_1_ECO_MOD 204
+-#define POWER9_PME_PM_RUN_SPURR 205
+-#define POWER9_PME_PM_ST_CAUSED_FAIL 206
+-#define POWER9_PME_PM_SNOOP_TLBIE 207
+-#define POWER9_PME_PM_PMC1_SAVED 208
+-#define POWER9_PME_PM_DATA_FROM_L3MISS 209
+-#define POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE 210
+-#define POWER9_PME_PM_DTLB_MISS_16G 211
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_DMEM 212
+-#define POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS 213
+-#define POWER9_PME_PM_FLUSH 214
+-#define POWER9_PME_PM_LSU_FLUSH_OTHER 215
+-#define POWER9_PME_PM_LS1_LAUNCH_HELD_PREF 216
+-#define POWER9_PME_PM_L2_LD_HIT 217
+-#define POWER9_PME_PM_LSU2_VECTOR_LD_FIN 218
+-#define POWER9_PME_PM_LSU_FLUSH_EMSH 219
+-#define POWER9_PME_PM_IC_PREF_REQ 220
+-#define POWER9_PME_PM_DPTEG_FROM_L2_1_SHR 221
+-#define POWER9_PME_PM_XLATE_RADIX_MODE 222
+-#define POWER9_PME_PM_L3_LD_HIT 223
+-#define POWER9_PME_PM_DARQ_7_9_ENTRIES 224
+-#define POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT 225
+-#define POWER9_PME_PM_DISP_HELD 226
+-#define POWER9_PME_PM_TM_FAIL_CONF_TM 227
+-#define POWER9_PME_PM_LS0_DC_COLLISIONS 228
+-#define POWER9_PME_PM_L2_LD 229
+-#define POWER9_PME_PM_BTAC_GOOD_RESULT 230
+-#define POWER9_PME_PM_TEND_PEND_CYC 231
+-#define POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV 232
+-#define POWER9_PME_PM_DISP_HELD_HB_FULL 233
+-#define POWER9_PME_PM_TM_TRESUME 234
+-#define POWER9_PME_PM_MRK_LSU_FLUSH_SAO 235
+-#define POWER9_PME_PM_LS0_TM_DISALLOW 236
+-#define POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE 237
+-#define POWER9_PME_PM_RC0_BUSY 238
+-#define POWER9_PME_PM_LSU1_TM_L1_HIT 239
+-#define POWER9_PME_PM_TB_BIT_TRANS 240
+-#define POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT 241
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_MOD 242
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT 243
+-#define POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC 244
+-#define POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE 245
+-#define POWER9_PME_PM_L3_CO_L31 246
+-#define POWER9_PME_PM_CMPLU_STALL_CRYPTO 247
+-#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 248
+-#define POWER9_PME_PM_ICT_EMPTY_CYC 249
+-#define POWER9_PME_PM_BR_UNCOND 250
+-#define POWER9_PME_PM_DERAT_MISS_2M 251
+-#define POWER9_PME_PM_PMC4_REWIND 252
+-#define POWER9_PME_PM_L2_RCLD_DISP 253
+-#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT 254
+-#define POWER9_PME_PM_TAKEN_BR_MPRED_CMPL 255
+-#define POWER9_PME_PM_THRD_PRIO_2_3_CYC 256
+-#define POWER9_PME_PM_DATA_FROM_DL4 257
+-#define POWER9_PME_PM_CMPLU_STALL_DPLONG 258
+-#define POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 259
+-#define POWER9_PME_PM_MRK_FAB_RSP_BKILL 260
+-#define POWER9_PME_PM_LSU_DERAT_MISS 261
+-#define POWER9_PME_PM_IC_PREF_CANCEL_L2 262
+-#define POWER9_PME_PM_MRK_NTC_CYC 263
+-#define POWER9_PME_PM_STCX_FIN 264
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF 265
+-#define POWER9_PME_PM_DC_PREF_FUZZY_CONF 266
+-#define POWER9_PME_PM_MULT_MRK 267
+-#define POWER9_PME_PM_LSU_FLUSH_LARX_STCX 268
+-#define POWER9_PME_PM_L3_P1_LCO_NO_DATA 269
+-#define POWER9_PME_PM_TM_TABORT_TRECLAIM 270
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC 271
+-#define POWER9_PME_PM_BR_PRED_CCACHE 272
+-#define POWER9_PME_PM_L3_P1_LCO_DATA 273
+-#define POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED 274
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3 275
+-#define POWER9_PME_PM_MRK_ST_CMPL_INT 276
+-#define POWER9_PME_PM_FLUSH_HB_RESTORE_CYC 277
+-#define POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC 278
+-#define POWER9_PME_PM_L3_CI_USAGE 279
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS 280
+-#define POWER9_PME_PM_DPTEG_FROM_DL4 281
+-#define POWER9_PME_PM_MRK_STCX_FIN 282
+-#define POWER9_PME_PM_MRK_LSU_FLUSH_UE 283
+-#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY 284
+-#define POWER9_PME_PM_GRP_PUMP_MPRED_RTY 285
+-#define POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_SHR 286
+-#define POWER9_PME_PM_FLUSH_DISP_TLBIE 287
+-#define POWER9_PME_PM_DPTEG_FROM_L3MISS 288
+-#define POWER9_PME_PM_L3_GRP_GUESS_CORRECT 289
+-#define POWER9_PME_PM_IC_INVALIDATE 290
+-#define POWER9_PME_PM_DERAT_MISS_16G 291
+-#define POWER9_PME_PM_SYS_PUMP_MPRED_RTY 292
+-#define POWER9_PME_PM_LMQ_MERGE 293
+-#define POWER9_PME_PM_IPTEG_FROM_LMEM 294
+-#define POWER9_PME_PM_L3_LAT_CI_HIT 295
+-#define POWER9_PME_PM_LSU1_VECTOR_ST_FIN 296
+-#define POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT 297
+-#define POWER9_PME_PM_INST_FROM_LMEM 298
+-#define POWER9_PME_PM_MRK_DATA_FROM_RL4 299
+-#define POWER9_PME_PM_MRK_DTLB_MISS_4K 300
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT 301
+-#define POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH 302
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC 303
+-#define POWER9_PME_PM_DARQ_0_3_ENTRIES 304
+-#define POWER9_PME_PM_DATA_FROM_L3MISS_MOD 305
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR_CYC 306
+-#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG 307
+-#define POWER9_PME_PM_L2_LD_MISS 308
+-#define POWER9_PME_PM_EAT_FULL_CYC 309
+-#define POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH 310
+-#define POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX 311
+-#define POWER9_PME_PM_THRESH_EXC_128 312
+-#define POWER9_PME_PM_LMQ_EMPTY_CYC 313
+-#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 314
+-#define POWER9_PME_PM_MRK_IC_MISS 315
+-#define POWER9_PME_PM_L3_P1_GRP_PUMP 316
+-#define POWER9_PME_PM_CMPLU_STALL_TEND 317
+-#define POWER9_PME_PM_PUMP_MPRED 318
++#define POWER9_PME_PM_DATA_FROM_L21_MOD 114
++#define POWER9_PME_PM_DATA_FROM_L21_SHR 115
++#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST 116
++#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER 117
++#define POWER9_PME_PM_DATA_FROM_L2_MEPF 118
++#define POWER9_PME_PM_DATA_FROM_L2MISS_MOD 119
++#define POWER9_PME_PM_DATA_FROM_L2MISS 120
++#define POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT 121
++#define POWER9_PME_PM_DATA_FROM_L2 122
++#define POWER9_PME_PM_DATA_FROM_L31_ECO_MOD 123
++#define POWER9_PME_PM_DATA_FROM_L31_ECO_SHR 124
++#define POWER9_PME_PM_DATA_FROM_L31_MOD 125
++#define POWER9_PME_PM_DATA_FROM_L31_SHR 126
++#define POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT 127
++#define POWER9_PME_PM_DATA_FROM_L3_MEPF 128
++#define POWER9_PME_PM_DATA_FROM_L3MISS_MOD 129
++#define POWER9_PME_PM_DATA_FROM_L3MISS 130
++#define POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT 131
++#define POWER9_PME_PM_DATA_FROM_L3 132
++#define POWER9_PME_PM_DATA_FROM_LL4 133
++#define POWER9_PME_PM_DATA_FROM_LMEM 134
++#define POWER9_PME_PM_DATA_FROM_MEMORY 135
++#define POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE 136
++#define POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE 137
++#define POWER9_PME_PM_DATA_FROM_RL2L3_MOD 138
++#define POWER9_PME_PM_DATA_FROM_RL2L3_SHR 139
++#define POWER9_PME_PM_DATA_FROM_RL4 140
++#define POWER9_PME_PM_DATA_FROM_RMEM 141
++#define POWER9_PME_PM_DATA_GRP_PUMP_CPRED 142
++#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY 143
++#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED 144
++#define POWER9_PME_PM_DATA_PUMP_CPRED 145
++#define POWER9_PME_PM_DATA_PUMP_MPRED 146
++#define POWER9_PME_PM_DATA_STORE 147
++#define POWER9_PME_PM_DATA_SYS_PUMP_CPRED 148
++#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY 149
++#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED 150
++#define POWER9_PME_PM_DATA_TABLEWALK_CYC 151
++#define POWER9_PME_PM_DC_DEALLOC_NO_CONF 152
++#define POWER9_PME_PM_DC_PREF_CONF 153
++#define POWER9_PME_PM_DC_PREF_CONS_ALLOC 154
++#define POWER9_PME_PM_DC_PREF_FUZZY_CONF 155
++#define POWER9_PME_PM_DC_PREF_HW_ALLOC 156
++#define POWER9_PME_PM_DC_PREF_STRIDED_CONF 157
++#define POWER9_PME_PM_DC_PREF_SW_ALLOC 158
++#define POWER9_PME_PM_DC_PREF_XCONS_ALLOC 159
++#define POWER9_PME_PM_DECODE_FUSION_CONST_GEN 160
++#define POWER9_PME_PM_DECODE_FUSION_EXT_ADD 161
++#define POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP 162
++#define POWER9_PME_PM_DECODE_FUSION_OP_PRESERV 163
++#define POWER9_PME_PM_DECODE_HOLD_ICT_FULL 164
++#define POWER9_PME_PM_DECODE_LANES_NOT_AVAIL 165
++#define POWER9_PME_PM_DERAT_MISS_16G 166
++#define POWER9_PME_PM_DERAT_MISS_16M 167
++#define POWER9_PME_PM_DERAT_MISS_1G 168
++#define POWER9_PME_PM_DERAT_MISS_2M 169
++#define POWER9_PME_PM_DERAT_MISS_4K 170
++#define POWER9_PME_PM_DERAT_MISS_64K 171
++#define POWER9_PME_PM_DFU_BUSY 172
++#define POWER9_PME_PM_DISP_CLB_HELD_BAL 173
++#define POWER9_PME_PM_DISP_CLB_HELD_SB 174
++#define POWER9_PME_PM_DISP_CLB_HELD_TLBIE 175
++#define POWER9_PME_PM_DISP_HELD_HB_FULL 176
++#define POWER9_PME_PM_DISP_HELD_ISSQ_FULL 177
++#define POWER9_PME_PM_DISP_HELD_SYNC_HOLD 178
++#define POWER9_PME_PM_DISP_HELD_TBEGIN 179
++#define POWER9_PME_PM_DISP_HELD 180
++#define POWER9_PME_PM_DISP_STARVED 181
++#define POWER9_PME_PM_DP_QP_FLOP_CMPL 182
++#define POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD 183
++#define POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR 184
++#define POWER9_PME_PM_DPTEG_FROM_DL4 185
++#define POWER9_PME_PM_DPTEG_FROM_DMEM 186
++#define POWER9_PME_PM_DPTEG_FROM_L21_MOD 187
++#define POWER9_PME_PM_DPTEG_FROM_L21_SHR 188
++#define POWER9_PME_PM_DPTEG_FROM_L2_MEPF 189
++#define POWER9_PME_PM_DPTEG_FROM_L2MISS 190
++#define POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT 191
++#define POWER9_PME_PM_DPTEG_FROM_L2 192
++#define POWER9_PME_PM_DPTEG_FROM_L31_ECO_MOD 193
++#define POWER9_PME_PM_DPTEG_FROM_L31_ECO_SHR 194
++#define POWER9_PME_PM_DPTEG_FROM_L31_MOD 195
++#define POWER9_PME_PM_DPTEG_FROM_L31_SHR 196
++#define POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT 197
++#define POWER9_PME_PM_DPTEG_FROM_L3_MEPF 198
++#define POWER9_PME_PM_DPTEG_FROM_L3MISS 199
++#define POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT 200
++#define POWER9_PME_PM_DPTEG_FROM_L3 201
++#define POWER9_PME_PM_DPTEG_FROM_LL4 202
++#define POWER9_PME_PM_DPTEG_FROM_LMEM 203
++#define POWER9_PME_PM_DPTEG_FROM_MEMORY 204
++#define POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE 205
++#define POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE 206
++#define POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD 207
++#define POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR 208
++#define POWER9_PME_PM_DPTEG_FROM_RL4 209
++#define POWER9_PME_PM_DPTEG_FROM_RMEM 210
++#define POWER9_PME_PM_DSIDE_L2MEMACC 211
++#define POWER9_PME_PM_DSIDE_MRU_TOUCH 212
++#define POWER9_PME_PM_DSIDE_OTHER_64B_L2MEMACC 213
++#define POWER9_PME_PM_DSLB_MISS 214
++#define POWER9_PME_PM_DSLB_MISS_ALT 215
++#define POWER9_PME_PM_DTLB_MISS_16G 216
++#define POWER9_PME_PM_DTLB_MISS_16M 217
++#define POWER9_PME_PM_DTLB_MISS_1G 218
++#define POWER9_PME_PM_DTLB_MISS_2M 219
++#define POWER9_PME_PM_DTLB_MISS_4K 220
++#define POWER9_PME_PM_DTLB_MISS_64K 221
++#define POWER9_PME_PM_DTLB_MISS 222
++#define POWER9_PME_PM_SPACEHOLDER_0000040062 223
++#define POWER9_PME_PM_SPACEHOLDER_0000040064 224
++#define POWER9_PME_PM_EAT_FORCE_MISPRED 225
++#define POWER9_PME_PM_EAT_FULL_CYC 226
++#define POWER9_PME_PM_EE_OFF_EXT_INT 227
++#define POWER9_PME_PM_EXT_INT 228
++#define POWER9_PME_PM_FLOP_CMPL 229
++#define POWER9_PME_PM_FLUSH_COMPLETION 230
++#define POWER9_PME_PM_FLUSH_DISP_SB 231
++#define POWER9_PME_PM_FLUSH_DISP_TLBIE 232
++#define POWER9_PME_PM_FLUSH_DISP 233
++#define POWER9_PME_PM_FLUSH_HB_RESTORE_CYC 234
++#define POWER9_PME_PM_FLUSH_LSU 235
++#define POWER9_PME_PM_FLUSH_MPRED 236
++#define POWER9_PME_PM_FLUSH 237
++#define POWER9_PME_PM_FMA_CMPL 238
++#define POWER9_PME_PM_FORCED_NOP 239
++#define POWER9_PME_PM_FREQ_DOWN 240
++#define POWER9_PME_PM_FREQ_UP 241
++#define POWER9_PME_PM_FXU_1PLUS_BUSY 242
++#define POWER9_PME_PM_FXU_BUSY 243
++#define POWER9_PME_PM_FXU_FIN 244
++#define POWER9_PME_PM_FXU_IDLE 245
++#define POWER9_PME_PM_GRP_PUMP_CPRED 246
++#define POWER9_PME_PM_GRP_PUMP_MPRED_RTY 247
++#define POWER9_PME_PM_GRP_PUMP_MPRED 248
++#define POWER9_PME_PM_HV_CYC 249
++#define POWER9_PME_PM_HWSYNC 250
++#define POWER9_PME_PM_IBUF_FULL_CYC 251
++#define POWER9_PME_PM_IC_DEMAND_CYC 252
++#define POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 253
++#define POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT 254
++#define POWER9_PME_PM_IC_DEMAND_REQ 255
++#define POWER9_PME_PM_IC_INVALIDATE 256
++#define POWER9_PME_PM_IC_MISS_CMPL 257
++#define POWER9_PME_PM_IC_MISS_ICBI 258
++#define POWER9_PME_PM_IC_PREF_CANCEL_HIT 259
++#define POWER9_PME_PM_IC_PREF_CANCEL_L2 260
++#define POWER9_PME_PM_IC_PREF_CANCEL_PAGE 261
++#define POWER9_PME_PM_IC_PREF_REQ 262
++#define POWER9_PME_PM_IC_PREF_WRITE 263
++#define POWER9_PME_PM_IC_RELOAD_PRIVATE 264
++#define POWER9_PME_PM_ICT_EMPTY_CYC 265
++#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS 266
++#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED 267
++#define POWER9_PME_PM_ICT_NOSLOT_CYC 268
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL 269
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ 270
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC 271
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN 272
++#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD 273
++#define POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS 274
++#define POWER9_PME_PM_ICT_NOSLOT_IC_L3 275
++#define POWER9_PME_PM_ICT_NOSLOT_IC_MISS 276
++#define POWER9_PME_PM_IERAT_RELOAD_16M 277
++#define POWER9_PME_PM_IERAT_RELOAD_4K 278
++#define POWER9_PME_PM_IERAT_RELOAD_64K 279
++#define POWER9_PME_PM_IERAT_RELOAD 280
++#define POWER9_PME_PM_IFETCH_THROTTLE 281
++#define POWER9_PME_PM_INST_CHIP_PUMP_CPRED 282
++#define POWER9_PME_PM_INST_CMPL 283
++#define POWER9_PME_PM_INST_DISP 284
++#define POWER9_PME_PM_INST_FROM_DL2L3_MOD 285
++#define POWER9_PME_PM_INST_FROM_DL2L3_SHR 286
++#define POWER9_PME_PM_INST_FROM_DL4 287
++#define POWER9_PME_PM_INST_FROM_DMEM 288
++#define POWER9_PME_PM_INST_FROM_L1 289
++#define POWER9_PME_PM_INST_FROM_L21_MOD 290
++#define POWER9_PME_PM_INST_FROM_L21_SHR 291
++#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST 292
++#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER 293
++#define POWER9_PME_PM_INST_FROM_L2_MEPF 294
++#define POWER9_PME_PM_INST_FROM_L2MISS 295
++#define POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT 296
++#define POWER9_PME_PM_INST_FROM_L2 297
++#define POWER9_PME_PM_INST_FROM_L31_ECO_MOD 298
++#define POWER9_PME_PM_INST_FROM_L31_ECO_SHR 299
++#define POWER9_PME_PM_INST_FROM_L31_MOD 300
++#define POWER9_PME_PM_INST_FROM_L31_SHR 301
++#define POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT 302
++#define POWER9_PME_PM_INST_FROM_L3_MEPF 303
++#define POWER9_PME_PM_INST_FROM_L3MISS_MOD 304
++#define POWER9_PME_PM_INST_FROM_L3MISS 305
++#define POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT 306
++#define POWER9_PME_PM_INST_FROM_L3 307
++#define POWER9_PME_PM_INST_FROM_LL4 308
++#define POWER9_PME_PM_INST_FROM_LMEM 309
++#define POWER9_PME_PM_INST_FROM_MEMORY 310
++#define POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE 311
++#define POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE 312
++#define POWER9_PME_PM_INST_FROM_RL2L3_MOD 313
++#define POWER9_PME_PM_INST_FROM_RL2L3_SHR 314
++#define POWER9_PME_PM_INST_FROM_RL4 315
++#define POWER9_PME_PM_INST_FROM_RMEM 316
++#define POWER9_PME_PM_INST_GRP_PUMP_CPRED 317
++#define POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY 318
+ #define POWER9_PME_PM_INST_GRP_PUMP_MPRED 319
+-#define POWER9_PME_PM_L1_PREF 320
+-#define POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC 321
+-#define POWER9_PME_PM_LSU_FLUSH_ATOMIC 322
+-#define POWER9_PME_PM_L2_DISP_ALL_L2MISS 323
+-#define POWER9_PME_PM_DATA_FROM_MEMORY 324
+-#define POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_MOD 325
+-#define POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR 326
+-#define POWER9_PME_PM_CMPLU_STALL_HWSYNC 327
+-#define POWER9_PME_PM_DATA_FROM_L3 328
+-#define POWER9_PME_PM_PMC2_OVERFLOW 329
+-#define POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC 330
+-#define POWER9_PME_PM_DPTEG_FROM_LMEM 331
+-#define POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE 332
+-#define POWER9_PME_PM_LSU1_SET_MPRED 333
+-#define POWER9_PME_PM_DATA_FROM_L3_1_ECO_SHR 334
+-#define POWER9_PME_PM_INST_FROM_MEMORY 335
+-#define POWER9_PME_PM_L3_P1_LCO_RTY 336
+-#define POWER9_PME_PM_DATA_FROM_L2_1_SHR 337
+-#define POWER9_PME_PM_FLUSH_LSU 338
+-#define POWER9_PME_PM_CMPLU_STALL_FXLONG 339
+-#define POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM 340
+-#define POWER9_PME_PM_SNP_TM_HIT_M 341
+-#define POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY 342
+-#define POWER9_PME_PM_L2_INST_MISS 343
+-#define POWER9_PME_PM_CMPLU_STALL_ERAT_MISS 344
+-#define POWER9_PME_PM_MRK_L2_RC_DONE 345
+-#define POWER9_PME_PM_INST_FROM_L3_1_SHR 346
+-#define POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L2 347
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD 348
+-#define POWER9_PME_PM_CO0_BUSY 349
+-#define POWER9_PME_PM_CMPLU_STALL_STORE_DATA 350
+-#define POWER9_PME_PM_INST_FROM_RMEM 351
+-#define POWER9_PME_PM_SYNC_MRK_BR_LINK 352
+-#define POWER9_PME_PM_L3_LD_PREF 353
+-#define POWER9_PME_PM_DISP_CLB_HELD_TLBIE 354
+-#define POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE 355
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC 356
+-#define POWER9_PME_PM_LS0_UNALIGNED_LD 357
+-#define POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC 358
+-#define POWER9_PME_PM_SN_HIT 359
+-#define POWER9_PME_PM_L3_LOC_GUESS_CORRECT 360
+-#define POWER9_PME_PM_MRK_INST_FROM_L3MISS 361
+-#define POWER9_PME_PM_DECODE_FUSION_EXT_ADD 362
+-#define POWER9_PME_PM_INST_FROM_DL4 363
+-#define POWER9_PME_PM_DC_PREF_XCONS_ALLOC 364
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY 365
+-#define POWER9_PME_PM_IC_PREF_CANCEL_PAGE 366
+-#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 367
+-#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW 368
+-#define POWER9_PME_PM_TM_FAIL_SELF 369
+-#define POWER9_PME_PM_L3_P1_SYS_PUMP 370
+-#define POWER9_PME_PM_CMPLU_STALL_RFID 371
+-#define POWER9_PME_PM_BR_2PATH 372
+-#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS 373
+-#define POWER9_PME_PM_DPTEG_FROM_L2MISS 374
+-#define POWER9_PME_PM_TM_TX_PASS_RUN_INST 375
+-#define POWER9_PME_PM_L1_ICACHE_RELOADED_PREF 376
+-#define POWER9_PME_PM_THRESH_EXC_4096 377
+-#define POWER9_PME_PM_IERAT_RELOAD_64K 378
+-#define POWER9_PME_PM_LSU0_TM_L1_MISS 379
+-#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED 380
+-#define POWER9_PME_PM_PMC3_REWIND 381
+-#define POWER9_PME_PM_ST_FWD 382
+-#define POWER9_PME_PM_TM_FAIL_TX_CONFLICT 383
+-#define POWER9_PME_PM_SYNC_MRK_L2MISS 384
+-#define POWER9_PME_PM_ISU0_ISS_HOLD_ALL 385
+-#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC 386
+-#define POWER9_PME_PM_DATA_FROM_L2 387
+-#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD 388
+-#define POWER9_PME_PM_ISQ_0_8_ENTRIES 389
+-#define POWER9_PME_PM_L3_CO_MEPF 390
+-#define POWER9_PME_PM_LINK_STACK_INVALID_PTR 391
+-#define POWER9_PME_PM_IPTEG_FROM_L2_1_MOD 392
+-#define POWER9_PME_PM_TM_ST_CAUSED_FAIL 393
+-#define POWER9_PME_PM_LD_REF_L1 394
+-#define POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT 395
+-#define POWER9_PME_PM_GRP_PUMP_CPRED 396
+-#define POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT 397
+-#define POWER9_PME_PM_DC_PREF_STRIDED_CONF 398
+-#define POWER9_PME_PM_THRD_PRIO_6_7_CYC 399
+-#define POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L3 400
+-#define POWER9_PME_PM_L3_PF_OFF_CHIP_MEM 401
+-#define POWER9_PME_PM_L3_CO_MEM 402
+-#define POWER9_PME_PM_DECODE_HOLD_ICT_FULL 403
+-#define POWER9_PME_PM_CMPLU_STALL_DFLONG 404
+-#define POWER9_PME_PM_LD_MISS_L1 405
+-#define POWER9_PME_PM_DATA_FROM_RL2L3_MOD 406
+-#define POWER9_PME_PM_L3_WI0_BUSY 407
+-#define POWER9_PME_PM_LSU_SRQ_FULL_CYC 408
+-#define POWER9_PME_PM_TABLEWALK_CYC 409
+-#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC 410
+-#define POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE 411
+-#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS 412
+-#define POWER9_PME_PM_CMPLU_STALL_SYS_CALL 413
+-#define POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS 414
+-#define POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_MOD 415
+-#define POWER9_PME_PM_PMC5_OVERFLOW 416
+-#define POWER9_PME_PM_LS1_UNALIGNED_ST 417
+-#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC 418
+-#define POWER9_PME_PM_CMPLU_STALL_THRD 419
+-#define POWER9_PME_PM_PMC3_SAVED 420
+-#define POWER9_PME_PM_MRK_DERAT_MISS 421
+-#define POWER9_PME_PM_RADIX_PWC_L3_HIT 422
+-#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS 423
+-#define POWER9_PME_PM_RUN_CYC_SMT4_MODE 424
+-#define POWER9_PME_PM_DATA_FROM_RMEM 425
+-#define POWER9_PME_PM_BR_MPRED_LSTACK 426
+-#define POWER9_PME_PM_PROBE_NOP_DISP 427
+-#define POWER9_PME_PM_DPTEG_FROM_L3_MEPF 428
+-#define POWER9_PME_PM_INST_FROM_L3MISS_MOD 429
+-#define POWER9_PME_PM_DUMMY1_REMOVE_ME 430
+-#define POWER9_PME_PM_MRK_DATA_FROM_DL4 431
+-#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 432
+-#define POWER9_PME_PM_IPTEG_FROM_L3_1_SHR 433
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR 434
+-#define POWER9_PME_PM_DTLB_MISS_2M 435
+-#define POWER9_PME_PM_TM_RST_SC 436
+-#define POWER9_PME_PM_LSU_NCST 437
+-#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY 438
+-#define POWER9_PME_PM_THRESH_ACC 439
+-#define POWER9_PME_PM_ISU3_ISS_HOLD_ALL 440
+-#define POWER9_PME_PM_LSU0_L1_CAM_CANCEL 441
+-#define POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC 442
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF 443
+-#define POWER9_PME_PM_DARQ_STORE_REJECT 444
+-#define POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT 445
+-#define POWER9_PME_PM_TM_TX_PASS_RUN_CYC 446
+-#define POWER9_PME_PM_DTLB_MISS_4K 447
+-#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC 448
+-#define POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC 449
+-#define POWER9_PME_PM_PMC4_SAVED 450
+-#define POWER9_PME_PM_SNP_TM_HIT_T 451
+-#define POWER9_PME_PM_MRK_BR_2PATH 452
+-#define POWER9_PME_PM_LSU_FLUSH_CI 453
+-#define POWER9_PME_PM_FLUSH_MPRED 454
+-#define POWER9_PME_PM_CMPLU_STALL_ST_FWD 455
+-#define POWER9_PME_PM_DTLB_MISS 456
+-#define POWER9_PME_PM_MRK_L2_TM_REQ_ABORT 457
+-#define POWER9_PME_PM_TM_NESTED_TEND 458
+-#define POWER9_PME_PM_CMPLU_STALL_PM 459
+-#define POWER9_PME_PM_CMPLU_STALL_ISYNC 460
+-#define POWER9_PME_PM_MRK_DTLB_MISS_1G 461
+-#define POWER9_PME_PM_L3_SYS_GUESS_CORRECT 462
+-#define POWER9_PME_PM_L2_CASTOUT_SHR 463
+-#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 464
+-#define POWER9_PME_PM_LS2_UNALIGNED_ST 465
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS 466
+-#define POWER9_PME_PM_THRESH_EXC_32 467
+-#define POWER9_PME_PM_TM_TSUSPEND 468
+-#define POWER9_PME_PM_DATA_FROM_DL2L3_SHR 469
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT 470
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC 471
+-#define POWER9_PME_PM_THRESH_EXC_1024 472
+-#define POWER9_PME_PM_ST_FIN 473
+-#define POWER9_PME_PM_TM_LD_CAUSED_FAIL 474
+-#define POWER9_PME_PM_SRQ_SYNC_CYC 475
+-#define POWER9_PME_PM_IFETCH_THROTTLE 476
+-#define POWER9_PME_PM_L3_SW_PREF 477
+-#define POWER9_PME_PM_LSU0_LDMX_FIN 478
+-#define POWER9_PME_PM_L2_LOC_GUESS_WRONG 479
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC 480
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE 481
+-#define POWER9_PME_PM_L3_P1_CO_RTY 482
+-#define POWER9_PME_PM_MRK_STCX_FAIL 483
+-#define POWER9_PME_PM_LARX_FIN 484
+-#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 485
+-#define POWER9_PME_PM_LSU3_L1_CAM_CANCEL 486
+-#define POWER9_PME_PM_IC_PREF_CANCEL_HIT 487
+-#define POWER9_PME_PM_CMPLU_STALL_EIEIO 488
+-#define POWER9_PME_PM_CMPLU_STALL_VDP 489
+-#define POWER9_PME_PM_DERAT_MISS_1G 490
+-#define POWER9_PME_PM_DATA_PUMP_CPRED 491
+-#define POWER9_PME_PM_DPTEG_FROM_L2_MEPF 492
+-#define POWER9_PME_PM_BR_MPRED_TAKEN_CR 493
+-#define POWER9_PME_PM_MRK_BRU_FIN 494
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_DL4 495
+-#define POWER9_PME_PM_SHL_ST_DEP_CREATED 496
+-#define POWER9_PME_PM_DPTEG_FROM_L3_1_SHR 497
+-#define POWER9_PME_PM_DATA_FROM_RL4 498
+-#define POWER9_PME_PM_XLATE_MISS 499
+-#define POWER9_PME_PM_CMPLU_STALL_SRQ_FULL 500
+-#define POWER9_PME_PM_SN0_BUSY 501
+-#define POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN 502
+-#define POWER9_PME_PM_ST_CMPL 503
+-#define POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR 504
+-#define POWER9_PME_PM_DECODE_FUSION_CONST_GEN 505
+-#define POWER9_PME_PM_L2_LOC_GUESS_CORRECT 506
+-#define POWER9_PME_PM_INST_FROM_L3_1_ECO_SHR 507
+-#define POWER9_PME_PM_XLATE_HPT_MODE 508
+-#define POWER9_PME_PM_CMPLU_STALL_LSU_FIN 509
+-#define POWER9_PME_PM_THRESH_EXC_64 510
+-#define POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC 511
+-#define POWER9_PME_PM_DARQ_STORE_XMIT 512
+-#define POWER9_PME_PM_DATA_TABLEWALK_CYC 513
+-#define POWER9_PME_PM_L2_RC_ST_DONE 514
+-#define POWER9_PME_PM_TMA_REQ_L2 515
+-#define POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE 516
+-#define POWER9_PME_PM_SLB_TABLEWALK_CYC 517
+-#define POWER9_PME_PM_MRK_DATA_FROM_RMEM 518
+-#define POWER9_PME_PM_L3_PF_MISS_L3 519
+-#define POWER9_PME_PM_L3_CI_MISS 520
+-#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR 521
+-#define POWER9_PME_PM_DERAT_MISS_4K 522
+-#define POWER9_PME_PM_ISIDE_MRU_TOUCH 523
+-#define POWER9_PME_PM_MRK_RUN_CYC 524
+-#define POWER9_PME_PM_L3_P0_CO_RTY 525
+-#define POWER9_PME_PM_BR_MPRED_CMPL 526
+-#define POWER9_PME_PM_BR_MPRED_TAKEN_TA 527
+-#define POWER9_PME_PM_DISP_HELD_TBEGIN 528
+-#define POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD 529
+-#define POWER9_PME_PM_FLUSH_DISP_SB 530
+-#define POWER9_PME_PM_L2_CHIP_PUMP 531
+-#define POWER9_PME_PM_L2_DC_INV 532
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC 533
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_SHR 534
+-#define POWER9_PME_PM_MRK_DERAT_MISS_2M 535
+-#define POWER9_PME_PM_MRK_ST_DONE_L2 536
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD 537
+-#define POWER9_PME_PM_IPTEG_FROM_RMEM 538
+-#define POWER9_PME_PM_MRK_LSU_FLUSH_EMSH 539
+-#define POWER9_PME_PM_BR_PRED_LSTACK 540
+-#define POWER9_PME_PM_L3_P0_CO_MEM 541
+-#define POWER9_PME_PM_IPTEG_FROM_L2_MEPF 542
+-#define POWER9_PME_PM_LS0_ERAT_MISS_PREF 543
+-#define POWER9_PME_PM_RD_HIT_PF 544
+-#define POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP 545
+-#define POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN 546
+-#define POWER9_PME_PM_ICT_NOSLOT_CYC 547
+-#define POWER9_PME_PM_DERAT_MISS_16M 548
+-#define POWER9_PME_PM_IC_MISS_ICBI 549
+-#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC 550
+-#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN 551
+-#define POWER9_PME_PM_MRK_BR_TAKEN_CMPL 552
+-#define POWER9_PME_PM_CMPLU_STALL_VFXU 553
+-#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY 554
+-#define POWER9_PME_PM_INST_FROM_L3 555
+-#define POWER9_PME_PM_ITLB_MISS 556
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD 557
+-#define POWER9_PME_PM_LSU2_TM_L1_MISS 558
+-#define POWER9_PME_PM_L3_WI_USAGE 559
+-#define POWER9_PME_PM_L2_SN_M_WR_DONE 560
+-#define POWER9_PME_PM_DISP_HELD_SYNC_HOLD 561
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_SHR 562
+-#define POWER9_PME_PM_MEM_PREF 563
+-#define POWER9_PME_PM_L2_SN_M_RD_DONE 564
+-#define POWER9_PME_PM_LS0_UNALIGNED_ST 565
+-#define POWER9_PME_PM_DC_PREF_CONS_ALLOC 566
+-#define POWER9_PME_PM_MRK_DERAT_MISS_16G 567
+-#define POWER9_PME_PM_IPTEG_FROM_L2 568
+-#define POWER9_PME_PM_ANY_THRD_RUN_CYC 569
+-#define POWER9_PME_PM_MRK_PROBE_NOP_CMPL 570
+-#define POWER9_PME_PM_BANK_CONFLICT 571
+-#define POWER9_PME_PM_INST_SYS_PUMP_MPRED 572
+-#define POWER9_PME_PM_NON_DATA_STORE 573
+-#define POWER9_PME_PM_DC_PREF_CONF 574
+-#define POWER9_PME_PM_BTAC_BAD_RESULT 575
+-#define POWER9_PME_PM_LSU_LMQ_FULL_CYC 576
+-#define POWER9_PME_PM_NON_MATH_FLOP_CMPL 577
+-#define POWER9_PME_PM_MRK_LD_MISS_L1_CYC 578
+-#define POWER9_PME_PM_MRK_DATA_FROM_L2_CYC 579
+-#define POWER9_PME_PM_FXU_1PLUS_BUSY 580
+-#define POWER9_PME_PM_CMPLU_STALL_DP 581
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD_CYC 582
+-#define POWER9_PME_PM_SYNC_MRK_L2HIT 583
+-#define POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC 584
+-#define POWER9_PME_PM_ISU1_ISS_HOLD_ALL 585
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT 586
+-#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY 587
+-#define POWER9_PME_PM_L3_P3_LCO_RTY 588
+-#define POWER9_PME_PM_PUMP_CPRED 589
+-#define POWER9_PME_PM_LS3_TM_DISALLOW 590
+-#define POWER9_PME_PM_SN_INVL 591
+-#define POWER9_PME_PM_TM_LD_CONF 592
+-#define POWER9_PME_PM_LD_MISS_L1_FIN 593
+-#define POWER9_PME_PM_SYNC_MRK_PROBE_NOP 594
+-#define POWER9_PME_PM_RUN_CYC 595
+-#define POWER9_PME_PM_SYS_PUMP_MPRED 596
+-#define POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE 597
+-#define POWER9_PME_PM_TM_NESTED_TBEGIN 598
+-#define POWER9_PME_PM_FLUSH_COMPLETION 599
+-#define POWER9_PME_PM_ST_MISS_L1 600
+-#define POWER9_PME_PM_IPTEG_FROM_L2MISS 601
+-#define POWER9_PME_PM_LSU3_TM_L1_MISS 602
+-#define POWER9_PME_PM_L3_CO 603
+-#define POWER9_PME_PM_MRK_STALL_CMPLU_CYC 604
+-#define POWER9_PME_PM_INST_FROM_DL2L3_SHR 605
+-#define POWER9_PME_PM_SCALAR_FLOP_CMPL 606
+-#define POWER9_PME_PM_LRQ_REJECT 607
+-#define POWER9_PME_PM_4FLOP_CMPL 608
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_RMEM 609
+-#define POWER9_PME_PM_LD_CMPL 610
+-#define POWER9_PME_PM_DATA_FROM_L3_MEPF 611
+-#define POWER9_PME_PM_L1PF_L2MEMACC 612
+-#define POWER9_PME_PM_INST_FROM_L3MISS 613
+-#define POWER9_PME_PM_MRK_LSU_FLUSH_LHS 614
+-#define POWER9_PME_PM_EE_OFF_EXT_INT 615
+-#define POWER9_PME_PM_TM_ST_CONF 616
+-#define POWER9_PME_PM_PMC6_OVERFLOW 617
+-#define POWER9_PME_PM_INST_FROM_DL2L3_MOD 618
+-#define POWER9_PME_PM_MRK_INST_CMPL 619
+-#define POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL 620
+-#define POWER9_PME_PM_MRK_L1_ICACHE_MISS 621
+-#define POWER9_PME_PM_TLB_MISS 622
+-#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER 623
+-#define POWER9_PME_PM_FXU_BUSY 624
+-#define POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT 625
+-#define POWER9_PME_PM_INST_FROM_L3_1_MOD 626
+-#define POWER9_PME_PM_LSU_REJECT_LMQ_FULL 627
+-#define POWER9_PME_PM_CO_DISP_FAIL 628
+-#define POWER9_PME_PM_L3_TRANS_PF 629
+-#define POWER9_PME_PM_MRK_ST_NEST 630
+-#define POWER9_PME_PM_LSU1_L1_CAM_CANCEL 631
+-#define POWER9_PME_PM_INST_CHIP_PUMP_CPRED 632
+-#define POWER9_PME_PM_LSU3_VECTOR_ST_FIN 633
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_MOD 634
+-#define POWER9_PME_PM_IBUF_FULL_CYC 635
+-#define POWER9_PME_PM_8FLOP_CMPL 636
+-#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 637
+-#define POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE 638
+-#define POWER9_PME_PM_ICT_NOSLOT_IC_L3 639
+-#define POWER9_PME_PM_CMPLU_STALL_LWSYNC 640
+-#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 641
+-#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 642
+-#define POWER9_PME_PM_L3_SN0_BUSY 643
+-#define POWER9_PME_PM_TM_OUTER_TBEGIN_DISP 644
+-#define POWER9_PME_PM_GRP_PUMP_MPRED 645
+-#define POWER9_PME_PM_SRQ_EMPTY_CYC 646
+-#define POWER9_PME_PM_LSU_REJECT_LHS 647
+-#define POWER9_PME_PM_IPTEG_FROM_L3_MEPF 648
+-#define POWER9_PME_PM_MRK_DATA_FROM_LMEM 649
+-#define POWER9_PME_PM_L3_P1_CO_MEM 650
+-#define POWER9_PME_PM_FREQ_DOWN 651
+-#define POWER9_PME_PM_L3_CINJ 652
+-#define POWER9_PME_PM_L3_P0_PF_RTY 653
+-#define POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD 654
+-#define POWER9_PME_PM_MRK_INST_ISSUED 655
+-#define POWER9_PME_PM_INST_FROM_RL2L3_SHR 656
+-#define POWER9_PME_PM_LSU_STCX_FAIL 657
+-#define POWER9_PME_PM_L3_P1_NODE_PUMP 658
+-#define POWER9_PME_PM_MEM_RWITM 659
+-#define POWER9_PME_PM_DP_QP_FLOP_CMPL 660
+-#define POWER9_PME_PM_RUN_PURR 661
+-#define POWER9_PME_PM_CMPLU_STALL_LMQ_FULL 662
+-#define POWER9_PME_PM_CMPLU_STALL_VDPLONG 663
+-#define POWER9_PME_PM_LSU2_TM_L1_HIT 664
+-#define POWER9_PME_PM_MRK_DATA_FROM_L3 665
+-#define POWER9_PME_PM_CMPLU_STALL_MTFPSCR 666
+-#define POWER9_PME_PM_STALL_END_ICT_EMPTY 667
+-#define POWER9_PME_PM_L3_P1_CO_L31 668
+-#define POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS 669
+-#define POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD 670
+-#define POWER9_PME_PM_INST_FROM_L3_MEPF 671
+-#define POWER9_PME_PM_L1_DCACHE_RELOADED_ALL 672
+-#define POWER9_PME_PM_DATA_GRP_PUMP_CPRED 673
+-#define POWER9_PME_PM_MRK_DERAT_MISS_64K 674
+-#define POWER9_PME_PM_L2_ST_MISS 675
+-#define POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE 676
+-#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS 677
+-#define POWER9_PME_PM_LWSYNC 678
+-#define POWER9_PME_PM_LS3_UNALIGNED_LD 679
+-#define POWER9_PME_PM_L3_RD0_BUSY 680
+-#define POWER9_PME_PM_LINK_STACK_CORRECT 681
+-#define 
POWER9_PME_PM_MRK_DTLB_MISS 682 +-#define POWER9_PME_PM_INST_IMC_MATCH_CMPL 683 +-#define POWER9_PME_PM_LS1_ERAT_MISS_PREF 684 +-#define POWER9_PME_PM_L3_CO0_BUSY 685 +-#define POWER9_PME_PM_BFU_BUSY 686 +-#define POWER9_PME_PM_L2_SYS_GUESS_CORRECT 687 +-#define POWER9_PME_PM_L1_SW_PREF 688 +-#define POWER9_PME_PM_MRK_DATA_FROM_LL4 689 +-#define POWER9_PME_PM_MRK_INST_FIN 690 +-#define POWER9_PME_PM_SYNC_MRK_L3MISS 691 +-#define POWER9_PME_PM_LSU1_STORE_REJECT 692 +-#define POWER9_PME_PM_CHIP_PUMP_CPRED 693 +-#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC 694 +-#define POWER9_PME_PM_DATA_STORE 695 +-#define POWER9_PME_PM_LS1_UNALIGNED_LD 696 +-#define POWER9_PME_PM_TM_TRANS_RUN_INST 697 +-#define POWER9_PME_PM_IC_MISS_CMPL 698 +-#define POWER9_PME_PM_THRESH_NOT_MET 699 +-#define POWER9_PME_PM_DPTEG_FROM_L2 700 +-#define POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR 701 +-#define POWER9_PME_PM_DPTEG_FROM_RMEM 702 +-#define POWER9_PME_PM_L3_L2_CO_MISS 703 +-#define POWER9_PME_PM_IPTEG_FROM_DMEM 704 +-#define POWER9_PME_PM_MRK_DTLB_MISS_64K 705 +-#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC 706 +-#define POWER9_PME_PM_LSU_FIN 707 +-#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER 708 +-#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE 709 +-#define POWER9_PME_PM_LSU_STCX 710 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD 711 +-#define POWER9_PME_PM_VSU_NON_FLOP_CMPL 712 +-#define POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT 713 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR 714 +-#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 715 +-#define POWER9_PME_PM_TAGE_CORRECT 716 +-#define POWER9_PME_PM_TM_FAV_CAUSED_FAIL 717 +-#define POWER9_PME_PM_RADIX_PWC_L1_HIT 718 +-#define POWER9_PME_PM_LSU0_LMQ_S0_VALID 719 +-#define POWER9_PME_PM_BR_MPRED_CCACHE 720 +-#define POWER9_PME_PM_L1_DEMAND_WRITE 721 +-#define POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD 722 +-#define POWER9_PME_PM_IPTEG_FROM_L3MISS 723 +-#define POWER9_PME_PM_MRK_DTLB_MISS_16G 724 +-#define 
POWER9_PME_PM_IPTEG_FROM_RL4 725 +-#define POWER9_PME_PM_L2_RCST_DISP 726 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC 727 +-#define POWER9_PME_PM_CMPLU_STALL 728 +-#define POWER9_PME_PM_DISP_CLB_HELD_SB 729 +-#define POWER9_PME_PM_L3_SN_USAGE 730 +-#define POWER9_PME_PM_FLOP_CMPL 731 +-#define POWER9_PME_PM_MRK_L2_RC_DISP 732 +-#define POWER9_PME_PM_L3_PF_ON_CHIP_CACHE 733 +-#define POWER9_PME_PM_IC_DEMAND_CYC 734 +-#define POWER9_PME_PM_CO_USAGE 735 +-#define POWER9_PME_PM_ISYNC 736 +-#define POWER9_PME_PM_MEM_CO 737 +-#define POWER9_PME_PM_NTC_ALL_FIN 738 +-#define POWER9_PME_PM_CMPLU_STALL_EXCEPTION 739 +-#define POWER9_PME_PM_LS0_LAUNCH_HELD_PREF 740 +-#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED 741 +-#define POWER9_PME_PM_MRK_BR_CMPL 742 +-#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD 743 +-#define POWER9_PME_PM_IC_PREF_WRITE 744 +-#define POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL 745 +-#define POWER9_PME_PM_DTLB_MISS_1G 746 +-#define POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT 747 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS 748 +-#define POWER9_PME_PM_BR_PRED 749 +-#define POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL 750 +-#define POWER9_PME_PM_INST_FROM_DMEM 751 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT 752 +-#define POWER9_PME_PM_DC_PREF_SW_ALLOC 753 +-#define POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER 754 +-#define POWER9_PME_PM_CMPLU_STALL_EMQ_FULL 755 +-#define POWER9_PME_PM_MRK_INST_DECODED 756 +-#define POWER9_PME_PM_IERAT_RELOAD_4K 757 +-#define POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER 758 +-#define POWER9_PME_PM_INST_FROM_L3_1_ECO_MOD 759 +-#define POWER9_PME_PM_L3_P0_CO_L31 760 +-#define POWER9_PME_PM_NON_TM_RST_SC 761 +-#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 762 +-#define POWER9_PME_PM_INST_SYS_PUMP_CPRED 763 +-#define POWER9_PME_PM_DPTEG_FROM_DMEM 764 +-#define POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 765 +-#define POWER9_PME_PM_SYS_PUMP_CPRED 766 +-#define POWER9_PME_PM_DTLB_MISS_64K 767 +-#define POWER9_PME_PM_CMPLU_STALL_STCX 768 +-#define 
POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY 769 +-#define POWER9_PME_PM_PARTIAL_ST_FIN 770 +-#define POWER9_PME_PM_THRD_CONC_RUN_INST 771 +-#define POWER9_PME_PM_CO_TM_SC_FOOTPRINT 772 +-#define POWER9_PME_PM_MRK_LARX_FIN 773 +-#define POWER9_PME_PM_L3_LOC_GUESS_WRONG 774 +-#define POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 775 +-#define POWER9_PME_PM_SHL_ST_DISABLE 776 +-#define POWER9_PME_PM_VSU_FIN 777 +-#define POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC 778 +-#define POWER9_PME_PM_L3_CI_HIT 779 +-#define POWER9_PME_PM_CMPLU_STALL_DARQ 780 +-#define POWER9_PME_PM_L3_PF_ON_CHIP_MEM 781 +-#define POWER9_PME_PM_THRD_PRIO_0_1_CYC 782 +-#define POWER9_PME_PM_DERAT_MISS_64K 783 +-#define POWER9_PME_PM_PMC2_REWIND 784 +-#define POWER9_PME_PM_INST_FROM_L2 785 +-#define POWER9_PME_PM_MRK_NTF_FIN 786 +-#define POWER9_PME_PM_ALL_SRQ_FULL 787 +-#define POWER9_PME_PM_INST_DISP 788 +-#define POWER9_PME_PM_LS3_ERAT_MISS_PREF 789 +-#define POWER9_PME_PM_STOP_FETCH_PENDING_CYC 790 +-#define POWER9_PME_PM_L1_DCACHE_RELOAD_VALID 791 +-#define POWER9_PME_PM_L3_P0_LCO_NO_DATA 792 +-#define POWER9_PME_PM_LSU3_VECTOR_LD_FIN 793 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT 794 +-#define POWER9_PME_PM_MRK_FXU_FIN 795 +-#define POWER9_PME_PM_LS3_UNALIGNED_ST 796 +-#define POWER9_PME_PM_DPTEG_FROM_MEMORY 797 +-#define POWER9_PME_PM_RUN_CYC_ST_MODE 798 +-#define POWER9_PME_PM_PMC4_OVERFLOW 799 +-#define POWER9_PME_PM_THRESH_EXC_256 800 +-#define POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC 801 +-#define POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC 802 +-#define POWER9_PME_PM_INST_FROM_L2MISS 803 +-#define POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER 804 +-#define POWER9_PME_PM_L2_ST 805 +-#define POWER9_PME_PM_RADIX_PWC_MISS 806 +-#define POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC 807 +-#define POWER9_PME_PM_LSU1_LDMX_FIN 808 +-#define POWER9_PME_PM_L3_P2_LCO_RTY 809 +-#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR 810 +-#define POWER9_PME_PM_L2_GRP_GUESS_CORRECT 811 +-#define POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC 812 
+-#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED 813 +-#define POWER9_PME_PM_LSU3_ERAT_HIT 814 +-#define POWER9_PME_PM_FORCED_NOP 815 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST 816 +-#define POWER9_PME_PM_CMPLU_STALL_LARX 817 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_RL4 818 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2 819 +-#define POWER9_PME_PM_TM_FAIL_CONF_NON_TM 820 +-#define POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR 821 +-#define POWER9_PME_PM_DARQ_4_6_ENTRIES 822 +-#define POWER9_PME_PM_L2_SYS_PUMP 823 +-#define POWER9_PME_PM_IOPS_CMPL 824 +-#define POWER9_PME_PM_LSU_FLUSH_LHS 825 +-#define POWER9_PME_PM_DATA_FROM_L3_1_SHR 826 +-#define POWER9_PME_PM_NTC_FIN 827 +-#define POWER9_PME_PM_LS2_DC_COLLISIONS 828 +-#define POWER9_PME_PM_FMA_CMPL 829 +-#define POWER9_PME_PM_IPTEG_FROM_MEMORY 830 +-#define POWER9_PME_PM_TM_NON_FAV_TBEGIN 831 +-#define POWER9_PME_PM_PMC1_REWIND 832 +-#define POWER9_PME_PM_ISU2_ISS_HOLD_ALL 833 +-#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 834 +-#define POWER9_PME_PM_PTESYNC 835 +-#define POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER 836 +-#define POWER9_PME_PM_L2_IC_INV 837 +-#define POWER9_PME_PM_DPTEG_FROM_L3 838 +-#define POWER9_PME_PM_RADIX_PWC_L2_HIT 839 +-#define POWER9_PME_PM_DC_PREF_HW_ALLOC 840 +-#define POWER9_PME_PM_LSU0_VECTOR_LD_FIN 841 +-#define POWER9_PME_PM_1PLUS_PPC_DISP 842 +-#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 843 +-#define POWER9_PME_PM_DATA_FROM_L2MISS 844 +-#define POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV 845 +-#define POWER9_PME_PM_NTC_ISSUE_HELD_ARB 846 +-#define POWER9_PME_PM_LSU2_L1_CAM_CANCEL 847 +-#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH 848 +-#define POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT 849 +-#define POWER9_PME_PM_SUSPENDED 850 +-#define POWER9_PME_PM_L3_SYS_GUESS_WRONG 851 +-#define POWER9_PME_PM_L3_L2_CO_HIT 852 +-#define POWER9_PME_PM_LSU0_TM_L1_HIT 853 +-#define POWER9_PME_PM_BR_MPRED_PCACHE 854 +-#define POWER9_PME_PM_STCX_FAIL 855 +-#define POWER9_PME_PM_LSU_FLUSH_NEXT 856 +-#define 
POWER9_PME_PM_DSIDE_MRU_TOUCH 857 +-#define POWER9_PME_PM_SN_MISS 858 +-#define POWER9_PME_PM_BR_PRED_TAKEN_CMPL 859 +-#define POWER9_PME_PM_L3_P0_SYS_PUMP 860 +-#define POWER9_PME_PM_L3_HIT 861 +-#define POWER9_PME_PM_MRK_DFU_FIN 862 +-#define POWER9_PME_PM_CMPLU_STALL_NESTED_TEND 863 +-#define POWER9_PME_PM_INST_FROM_L1 864 +-#define POWER9_PME_PM_IC_DEMAND_REQ 865 +-#define POWER9_PME_PM_BRU_FIN 866 +-#define POWER9_PME_PM_L1_ICACHE_RELOADED_ALL 867 +-#define POWER9_PME_PM_IERAT_RELOAD_16M 868 +-#define POWER9_PME_PM_DATA_FROM_L2MISS_MOD 869 +-#define POWER9_PME_PM_LSU0_ERAT_HIT 870 +-#define POWER9_PME_PM_L3_PF0_BUSY 871 +-#define POWER9_PME_PM_MRK_DPTEG_FROM_LL4 872 +-#define POWER9_PME_PM_LSU3_SET_MPRED 873 +-#define POWER9_PME_PM_TM_CAM_OVERFLOW 874 +-#define POWER9_PME_PM_SYNC_MRK_FX_DIVIDE 875 +-#define POWER9_PME_PM_IPTEG_FROM_L2_1_SHR 876 +-#define POWER9_PME_PM_MRK_LD_MISS_L1 877 +-#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM 878 +-#define POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT 879 +-#define POWER9_PME_PM_NON_FMA_FLOP_CMPL 880 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS 881 +-#define POWER9_PME_PM_L2_SYS_GUESS_WRONG 882 +-#define POWER9_PME_PM_THRESH_EXC_2048 883 +-#define POWER9_PME_PM_INST_FROM_LL4 884 +-#define POWER9_PME_PM_DATA_FROM_RL2L3_SHR 885 +-#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST 886 +-#define POWER9_PME_PM_LSU_FLUSH_WRK_ARND 887 +-#define POWER9_PME_PM_L3_PF_HIT_L3 888 +-#define POWER9_PME_PM_RD_FORMING_SC 889 +-#define POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD_CYC 890 +-#define POWER9_PME_PM_IPTEG_FROM_DL4 891 +-#define POWER9_PME_PM_CMPLU_STALL_STORE_FINISH 892 +-#define POWER9_PME_PM_IPTEG_FROM_LL4 893 +-#define POWER9_PME_PM_1FLOP_CMPL 894 +-#define POWER9_PME_PM_L2_GRP_GUESS_WRONG 895 +-#define POWER9_PME_PM_TM_FAV_TBEGIN 896 +-#define POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT 897 +-#define POWER9_PME_PM_2FLOP_CMPL 898 +-#define POWER9_PME_PM_LS2_TM_DISALLOW 899 +-#define POWER9_PME_PM_L2_LD_DISP 900 +-#define 
POWER9_PME_PM_CMPLU_STALL_LHS 901 +-#define POWER9_PME_PM_TLB_HIT 902 +-#define POWER9_PME_PM_HV_CYC 903 +-#define POWER9_PME_PM_L2_RTY_LD 904 +-#define POWER9_PME_PM_STCX_SUCCESS_CMPL 905 +-#define POWER9_PME_PM_INST_PUMP_MPRED 906 +-#define POWER9_PME_PM_LSU2_ERAT_HIT 907 +-#define POWER9_PME_PM_INST_FROM_RL4 908 +-#define POWER9_PME_PM_LD_L3MISS_PEND_CYC 909 +-#define POWER9_PME_PM_L3_LAT_CI_MISS 910 +-#define POWER9_PME_PM_MRK_FAB_RSP_RD_RTY 911 +-#define POWER9_PME_PM_DTLB_MISS_16M 912 +-#define POWER9_PME_PM_DPTEG_FROM_L2_1_MOD 913 +-#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR 914 +-#define POWER9_PME_PM_MRK_LSU_FIN 915 +-#define POWER9_PME_PM_LSU0_STORE_REJECT 916 +-#define POWER9_PME_PM_CLB_HELD 917 +-#define POWER9_PME_PM_LS2_ERAT_MISS_PREF 918 ++#define POWER9_PME_PM_INST_IMC_MATCH_CMPL 320 ++#define POWER9_PME_PM_INST_PUMP_CPRED 321 ++#define POWER9_PME_PM_INST_PUMP_MPRED 322 ++#define POWER9_PME_PM_INST_SYS_PUMP_CPRED 323 ++#define POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY 324 ++#define POWER9_PME_PM_INST_SYS_PUMP_MPRED 325 ++#define POWER9_PME_PM_IOPS_CMPL 326 ++#define POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD 327 ++#define POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR 328 ++#define POWER9_PME_PM_IPTEG_FROM_DL4 329 ++#define POWER9_PME_PM_IPTEG_FROM_DMEM 330 ++#define POWER9_PME_PM_IPTEG_FROM_L21_MOD 331 ++#define POWER9_PME_PM_IPTEG_FROM_L21_SHR 332 ++#define POWER9_PME_PM_IPTEG_FROM_L2_MEPF 333 ++#define POWER9_PME_PM_IPTEG_FROM_L2MISS 334 ++#define POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT 335 ++#define POWER9_PME_PM_IPTEG_FROM_L2 336 ++#define POWER9_PME_PM_IPTEG_FROM_L31_ECO_MOD 337 ++#define POWER9_PME_PM_IPTEG_FROM_L31_ECO_SHR 338 ++#define POWER9_PME_PM_IPTEG_FROM_L31_MOD 339 ++#define POWER9_PME_PM_IPTEG_FROM_L31_SHR 340 ++#define POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT 341 ++#define POWER9_PME_PM_IPTEG_FROM_L3_MEPF 342 ++#define POWER9_PME_PM_IPTEG_FROM_L3MISS 343 ++#define POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT 344 ++#define POWER9_PME_PM_IPTEG_FROM_L3 345 
++#define POWER9_PME_PM_IPTEG_FROM_LL4 346 ++#define POWER9_PME_PM_IPTEG_FROM_LMEM 347 ++#define POWER9_PME_PM_IPTEG_FROM_MEMORY 348 ++#define POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE 349 ++#define POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE 350 ++#define POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD 351 ++#define POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR 352 ++#define POWER9_PME_PM_IPTEG_FROM_RL4 353 ++#define POWER9_PME_PM_IPTEG_FROM_RMEM 354 ++#define POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR 355 ++#define POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER 356 ++#define POWER9_PME_PM_ISIDE_DISP 357 ++#define POWER9_PME_PM_ISIDE_L2MEMACC 358 ++#define POWER9_PME_PM_ISIDE_MRU_TOUCH 359 ++#define POWER9_PME_PM_ISLB_MISS 360 ++#define POWER9_PME_PM_ISLB_MISS_ALT 361 ++#define POWER9_PME_PM_ISQ_0_8_ENTRIES 362 ++#define POWER9_PME_PM_ISQ_36_44_ENTRIES 363 ++#define POWER9_PME_PM_ISU0_ISS_HOLD_ALL 364 ++#define POWER9_PME_PM_ISU1_ISS_HOLD_ALL 365 ++#define POWER9_PME_PM_ISU2_ISS_HOLD_ALL 366 ++#define POWER9_PME_PM_ISU3_ISS_HOLD_ALL 367 ++#define POWER9_PME_PM_ISYNC 368 ++#define POWER9_PME_PM_ITLB_MISS 369 ++#define POWER9_PME_PM_L1_DCACHE_RELOADED_ALL 370 ++#define POWER9_PME_PM_L1_DCACHE_RELOAD_VALID 371 ++#define POWER9_PME_PM_L1_DEMAND_WRITE 372 ++#define POWER9_PME_PM_L1_ICACHE_MISS 373 ++#define POWER9_PME_PM_L1_ICACHE_RELOADED_ALL 374 ++#define POWER9_PME_PM_L1_ICACHE_RELOADED_PREF 375 ++#define POWER9_PME_PM_L1PF_L2MEMACC 376 ++#define POWER9_PME_PM_L1_PREF 377 ++#define POWER9_PME_PM_L1_SW_PREF 378 ++#define POWER9_PME_PM_L2_CASTOUT_MOD 379 ++#define POWER9_PME_PM_L2_CASTOUT_SHR 380 ++#define POWER9_PME_PM_L2_CHIP_PUMP 381 ++#define POWER9_PME_PM_L2_DC_INV 382 ++#define POWER9_PME_PM_L2_DISP_ALL_L2MISS 383 ++#define POWER9_PME_PM_L2_GROUP_PUMP 384 ++#define POWER9_PME_PM_L2_GRP_GUESS_CORRECT 385 ++#define POWER9_PME_PM_L2_GRP_GUESS_WRONG 386 ++#define POWER9_PME_PM_L2_IC_INV 387 ++#define POWER9_PME_PM_L2_INST_MISS 388 ++#define POWER9_PME_PM_L2_INST_MISS_ALT 389 ++#define POWER9_PME_PM_L2_INST 390 
++#define POWER9_PME_PM_L2_INST_ALT 391 ++#define POWER9_PME_PM_L2_LD_DISP 392 ++#define POWER9_PME_PM_L2_LD_DISP_ALT 393 ++#define POWER9_PME_PM_L2_LD_HIT 394 ++#define POWER9_PME_PM_L2_LD_HIT_ALT 395 ++#define POWER9_PME_PM_L2_LD_MISS_128B 396 ++#define POWER9_PME_PM_L2_LD_MISS_64B 397 ++#define POWER9_PME_PM_L2_LD_MISS 398 ++#define POWER9_PME_PM_L2_LD 399 ++#define POWER9_PME_PM_L2_LOC_GUESS_CORRECT 400 ++#define POWER9_PME_PM_L2_LOC_GUESS_WRONG 401 ++#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR 402 ++#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER 403 ++#define POWER9_PME_PM_L2_RCLD_DISP 404 ++#define POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR 405 ++#define POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER 406 ++#define POWER9_PME_PM_L2_RCST_DISP 407 ++#define POWER9_PME_PM_L2_RC_ST_DONE 408 ++#define POWER9_PME_PM_L2_RTY_LD 409 ++#define POWER9_PME_PM_L2_RTY_LD_ALT 410 ++#define POWER9_PME_PM_L2_RTY_ST 411 ++#define POWER9_PME_PM_L2_RTY_ST_ALT 412 ++#define POWER9_PME_PM_L2_SN_M_RD_DONE 413 ++#define POWER9_PME_PM_L2_SN_M_WR_DONE 414 ++#define POWER9_PME_PM_L2_SN_M_WR_DONE_ALT 415 ++#define POWER9_PME_PM_L2_SN_SX_I_DONE 416 ++#define POWER9_PME_PM_L2_ST_DISP 417 ++#define POWER9_PME_PM_L2_ST_DISP_ALT 418 ++#define POWER9_PME_PM_L2_ST_HIT 419 ++#define POWER9_PME_PM_L2_ST_HIT_ALT 420 ++#define POWER9_PME_PM_L2_ST_MISS_128B 421 ++#define POWER9_PME_PM_L2_ST_MISS_64B 422 ++#define POWER9_PME_PM_L2_ST_MISS 423 ++#define POWER9_PME_PM_L2_ST 424 ++#define POWER9_PME_PM_L2_SYS_GUESS_CORRECT 425 ++#define POWER9_PME_PM_L2_SYS_GUESS_WRONG 426 ++#define POWER9_PME_PM_L2_SYS_PUMP 427 ++#define POWER9_PME_PM_L3_CI_HIT 428 ++#define POWER9_PME_PM_L3_CI_MISS 429 ++#define POWER9_PME_PM_L3_CINJ 430 ++#define POWER9_PME_PM_L3_CI_USAGE 431 ++#define POWER9_PME_PM_L3_CO0_BUSY 432 ++#define POWER9_PME_PM_L3_CO0_BUSY_ALT 433 ++#define POWER9_PME_PM_L3_CO_L31 434 ++#define POWER9_PME_PM_L3_CO_LCO 435 ++#define POWER9_PME_PM_L3_CO_MEM 436 ++#define POWER9_PME_PM_L3_CO_MEPF 437 ++#define 
POWER9_PME_PM_L3_CO_MEPF_ALT 438 ++#define POWER9_PME_PM_L3_CO 439 ++#define POWER9_PME_PM_L3_GRP_GUESS_CORRECT 440 ++#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH 441 ++#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW 442 ++#define POWER9_PME_PM_L3_HIT 443 ++#define POWER9_PME_PM_L3_L2_CO_HIT 444 ++#define POWER9_PME_PM_L3_L2_CO_MISS 445 ++#define POWER9_PME_PM_L3_LAT_CI_HIT 446 ++#define POWER9_PME_PM_L3_LAT_CI_MISS 447 ++#define POWER9_PME_PM_L3_LD_HIT 448 ++#define POWER9_PME_PM_L3_LD_MISS 449 ++#define POWER9_PME_PM_L3_LD_PREF 450 ++#define POWER9_PME_PM_L3_LOC_GUESS_CORRECT 451 ++#define POWER9_PME_PM_L3_LOC_GUESS_WRONG 452 ++#define POWER9_PME_PM_L3_MISS 453 ++#define POWER9_PME_PM_L3_P0_CO_L31 454 ++#define POWER9_PME_PM_L3_P0_CO_MEM 455 ++#define POWER9_PME_PM_L3_P0_CO_RTY 456 ++#define POWER9_PME_PM_L3_P0_CO_RTY_ALT 457 ++#define POWER9_PME_PM_L3_P0_GRP_PUMP 458 ++#define POWER9_PME_PM_L3_P0_LCO_DATA 459 ++#define POWER9_PME_PM_L3_P0_LCO_NO_DATA 460 ++#define POWER9_PME_PM_L3_P0_LCO_RTY 461 ++#define POWER9_PME_PM_L3_P0_NODE_PUMP 462 ++#define POWER9_PME_PM_L3_P0_PF_RTY 463 ++#define POWER9_PME_PM_L3_P0_PF_RTY_ALT 464 ++#define POWER9_PME_PM_L3_P0_SYS_PUMP 465 ++#define POWER9_PME_PM_L3_P1_CO_L31 466 ++#define POWER9_PME_PM_L3_P1_CO_MEM 467 ++#define POWER9_PME_PM_L3_P1_CO_RTY 468 ++#define POWER9_PME_PM_L3_P1_CO_RTY_ALT 469 ++#define POWER9_PME_PM_L3_P1_GRP_PUMP 470 ++#define POWER9_PME_PM_L3_P1_LCO_DATA 471 ++#define POWER9_PME_PM_L3_P1_LCO_NO_DATA 472 ++#define POWER9_PME_PM_L3_P1_LCO_RTY 473 ++#define POWER9_PME_PM_L3_P1_NODE_PUMP 474 ++#define POWER9_PME_PM_L3_P1_PF_RTY 475 ++#define POWER9_PME_PM_L3_P1_PF_RTY_ALT 476 ++#define POWER9_PME_PM_L3_P1_SYS_PUMP 477 ++#define POWER9_PME_PM_L3_P2_LCO_RTY 478 ++#define POWER9_PME_PM_L3_P3_LCO_RTY 479 ++#define POWER9_PME_PM_L3_PF0_BUSY 480 ++#define POWER9_PME_PM_L3_PF0_BUSY_ALT 481 ++#define POWER9_PME_PM_L3_PF_HIT_L3 482 ++#define POWER9_PME_PM_L3_PF_MISS_L3 483 ++#define POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE 484 
++#define POWER9_PME_PM_L3_PF_OFF_CHIP_MEM 485 ++#define POWER9_PME_PM_L3_PF_ON_CHIP_CACHE 486 ++#define POWER9_PME_PM_L3_PF_ON_CHIP_MEM 487 ++#define POWER9_PME_PM_L3_PF_USAGE 488 ++#define POWER9_PME_PM_L3_RD0_BUSY 489 ++#define POWER9_PME_PM_L3_RD0_BUSY_ALT 490 ++#define POWER9_PME_PM_L3_RD_USAGE 491 ++#define POWER9_PME_PM_L3_SN0_BUSY 492 ++#define POWER9_PME_PM_L3_SN0_BUSY_ALT 493 ++#define POWER9_PME_PM_L3_SN_USAGE 494 ++#define POWER9_PME_PM_L3_SW_PREF 495 ++#define POWER9_PME_PM_L3_SYS_GUESS_CORRECT 496 ++#define POWER9_PME_PM_L3_SYS_GUESS_WRONG 497 ++#define POWER9_PME_PM_L3_TRANS_PF 498 ++#define POWER9_PME_PM_L3_WI0_BUSY 499 ++#define POWER9_PME_PM_L3_WI0_BUSY_ALT 500 ++#define POWER9_PME_PM_L3_WI_USAGE 501 ++#define POWER9_PME_PM_LARX_FIN 502 ++#define POWER9_PME_PM_LD_CMPL 503 ++#define POWER9_PME_PM_LD_L3MISS_PEND_CYC 504 ++#define POWER9_PME_PM_LD_MISS_L1_FIN 505 ++#define POWER9_PME_PM_LD_MISS_L1 506 ++#define POWER9_PME_PM_LD_REF_L1 507 ++#define POWER9_PME_PM_LINK_STACK_CORRECT 508 ++#define POWER9_PME_PM_LINK_STACK_INVALID_PTR 509 ++#define POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED 510 ++#define POWER9_PME_PM_LMQ_EMPTY_CYC 511 ++#define POWER9_PME_PM_LMQ_MERGE 512 ++#define POWER9_PME_PM_LRQ_REJECT 513 ++#define POWER9_PME_PM_LS0_DC_COLLISIONS 514 ++#define POWER9_PME_PM_LS0_ERAT_MISS_PREF 515 ++#define POWER9_PME_PM_LS0_LAUNCH_HELD_PREF 516 ++#define POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC 517 ++#define POWER9_PME_PM_LS0_TM_DISALLOW 518 ++#define POWER9_PME_PM_LS0_UNALIGNED_LD 519 ++#define POWER9_PME_PM_LS0_UNALIGNED_ST 520 ++#define POWER9_PME_PM_LS1_DC_COLLISIONS 521 ++#define POWER9_PME_PM_LS1_ERAT_MISS_PREF 522 ++#define POWER9_PME_PM_LS1_LAUNCH_HELD_PREF 523 ++#define POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC 524 ++#define POWER9_PME_PM_LS1_TM_DISALLOW 525 ++#define POWER9_PME_PM_LS1_UNALIGNED_LD 526 ++#define POWER9_PME_PM_LS1_UNALIGNED_ST 527 ++#define POWER9_PME_PM_LS2_DC_COLLISIONS 528 ++#define POWER9_PME_PM_LS2_ERAT_MISS_PREF 529 ++#define 
POWER9_PME_PM_LS2_TM_DISALLOW 530 ++#define POWER9_PME_PM_LS2_UNALIGNED_LD 531 ++#define POWER9_PME_PM_LS2_UNALIGNED_ST 532 ++#define POWER9_PME_PM_LS3_DC_COLLISIONS 533 ++#define POWER9_PME_PM_LS3_ERAT_MISS_PREF 534 ++#define POWER9_PME_PM_LS3_TM_DISALLOW 535 ++#define POWER9_PME_PM_LS3_UNALIGNED_LD 536 ++#define POWER9_PME_PM_LS3_UNALIGNED_ST 537 ++#define POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC 538 ++#define POWER9_PME_PM_LSU0_ERAT_HIT 539 ++#define POWER9_PME_PM_LSU0_FALSE_LHS 540 ++#define POWER9_PME_PM_LSU0_L1_CAM_CANCEL 541 ++#define POWER9_PME_PM_LSU0_LDMX_FIN 542 ++#define POWER9_PME_PM_LSU0_LMQ_S0_VALID 543 ++#define POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC 544 ++#define POWER9_PME_PM_LSU0_SET_MPRED 545 ++#define POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC 546 ++#define POWER9_PME_PM_LSU0_STORE_REJECT 547 ++#define POWER9_PME_PM_LSU0_TM_L1_HIT 548 ++#define POWER9_PME_PM_LSU0_TM_L1_MISS 549 ++#define POWER9_PME_PM_LSU1_ERAT_HIT 550 ++#define POWER9_PME_PM_LSU1_FALSE_LHS 551 ++#define POWER9_PME_PM_LSU1_L1_CAM_CANCEL 552 ++#define POWER9_PME_PM_LSU1_LDMX_FIN 553 ++#define POWER9_PME_PM_LSU1_SET_MPRED 554 ++#define POWER9_PME_PM_LSU1_STORE_REJECT 555 ++#define POWER9_PME_PM_LSU1_TM_L1_HIT 556 ++#define POWER9_PME_PM_LSU1_TM_L1_MISS 557 ++#define POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC 558 ++#define POWER9_PME_PM_LSU2_ERAT_HIT 559 ++#define POWER9_PME_PM_LSU2_FALSE_LHS 560 ++#define POWER9_PME_PM_LSU2_L1_CAM_CANCEL 561 ++#define POWER9_PME_PM_LSU2_LDMX_FIN 562 ++#define POWER9_PME_PM_LSU2_SET_MPRED 563 ++#define POWER9_PME_PM_LSU2_STORE_REJECT 564 ++#define POWER9_PME_PM_LSU2_TM_L1_HIT 565 ++#define POWER9_PME_PM_LSU2_TM_L1_MISS 566 ++#define POWER9_PME_PM_LSU3_ERAT_HIT 567 ++#define POWER9_PME_PM_LSU3_FALSE_LHS 568 ++#define POWER9_PME_PM_LSU3_L1_CAM_CANCEL 569 ++#define POWER9_PME_PM_LSU3_LDMX_FIN 570 ++#define POWER9_PME_PM_LSU3_SET_MPRED 571 ++#define POWER9_PME_PM_LSU3_STORE_REJECT 572 ++#define POWER9_PME_PM_LSU3_TM_L1_HIT 573 ++#define POWER9_PME_PM_LSU3_TM_L1_MISS 574 
++#define POWER9_PME_PM_LSU_DERAT_MISS 575 ++#define POWER9_PME_PM_LSU_FIN 576 ++#define POWER9_PME_PM_LSU_FLUSH_ATOMIC 577 ++#define POWER9_PME_PM_LSU_FLUSH_CI 578 ++#define POWER9_PME_PM_LSU_FLUSH_EMSH 579 ++#define POWER9_PME_PM_LSU_FLUSH_LARX_STCX 580 ++#define POWER9_PME_PM_LSU_FLUSH_LHL_SHL 581 ++#define POWER9_PME_PM_LSU_FLUSH_LHS 582 ++#define POWER9_PME_PM_LSU_FLUSH_NEXT 583 ++#define POWER9_PME_PM_LSU_FLUSH_OTHER 584 ++#define POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS 585 ++#define POWER9_PME_PM_LSU_FLUSH_SAO 586 ++#define POWER9_PME_PM_LSU_FLUSH_UE 587 ++#define POWER9_PME_PM_LSU_FLUSH_WRK_ARND 588 ++#define POWER9_PME_PM_LSU_LMQ_FULL_CYC 589 ++#define POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 590 ++#define POWER9_PME_PM_LSU_NCST 591 ++#define POWER9_PME_PM_LSU_REJECT_ERAT_MISS 592 ++#define POWER9_PME_PM_LSU_REJECT_LHS 593 ++#define POWER9_PME_PM_LSU_REJECT_LMQ_FULL 594 ++#define POWER9_PME_PM_LSU_SRQ_FULL_CYC 595 ++#define POWER9_PME_PM_LSU_STCX_FAIL 596 ++#define POWER9_PME_PM_LSU_STCX 597 ++#define POWER9_PME_PM_LWSYNC 598 ++#define POWER9_PME_PM_MATH_FLOP_CMPL 599 ++#define POWER9_PME_PM_MEM_CO 600 ++#define POWER9_PME_PM_MEM_LOC_THRESH_IFU 601 ++#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH 602 ++#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED 603 ++#define POWER9_PME_PM_MEM_PREF 604 ++#define POWER9_PME_PM_MEM_READ 605 ++#define POWER9_PME_PM_MEM_RWITM 606 ++#define POWER9_PME_PM_MRK_BACK_BR_CMPL 607 ++#define POWER9_PME_PM_MRK_BR_2PATH 608 ++#define POWER9_PME_PM_MRK_BR_CMPL 609 ++#define POWER9_PME_PM_MRK_BR_MPRED_CMPL 610 ++#define POWER9_PME_PM_MRK_BR_TAKEN_CMPL 611 ++#define POWER9_PME_PM_MRK_BRU_FIN 612 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 613 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD 614 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 615 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR 616 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC 617 ++#define POWER9_PME_PM_MRK_DATA_FROM_DL4 618 ++#define 
POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC 619 ++#define POWER9_PME_PM_MRK_DATA_FROM_DMEM 620 ++#define POWER9_PME_PM_MRK_DATA_FROM_L21_MOD_CYC 621 ++#define POWER9_PME_PM_MRK_DATA_FROM_L21_MOD 622 ++#define POWER9_PME_PM_MRK_DATA_FROM_L21_SHR_CYC 623 ++#define POWER9_PME_PM_MRK_DATA_FROM_L21_SHR 624 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_CYC 625 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC 626 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST 627 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC 628 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER 629 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC 630 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF 631 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC 632 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS 633 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC 634 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT 635 ++#define POWER9_PME_PM_MRK_DATA_FROM_L2 636 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC 637 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD 638 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC 639 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR 640 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_MOD_CYC 641 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_MOD 642 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_SHR_CYC 643 ++#define POWER9_PME_PM_MRK_DATA_FROM_L31_SHR 644 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_CYC 645 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC 646 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT 647 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC 648 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF 649 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC 650 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS 651 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC 652 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT 653 ++#define POWER9_PME_PM_MRK_DATA_FROM_L3 654 ++#define 
POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC 655 ++#define POWER9_PME_PM_MRK_DATA_FROM_LL4 656 ++#define POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC 657 ++#define POWER9_PME_PM_MRK_DATA_FROM_LMEM 658 ++#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC 659 ++#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY 660 ++#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC 661 ++#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE 662 ++#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC 663 ++#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE 664 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 665 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD 666 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 667 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR 668 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC 669 ++#define POWER9_PME_PM_MRK_DATA_FROM_RL4 670 ++#define POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC 671 ++#define POWER9_PME_PM_MRK_DATA_FROM_RMEM 672 ++#define POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV 673 ++#define POWER9_PME_PM_MRK_DERAT_MISS_16G 674 ++#define POWER9_PME_PM_MRK_DERAT_MISS_16M 675 ++#define POWER9_PME_PM_MRK_DERAT_MISS_1G 676 ++#define POWER9_PME_PM_MRK_DERAT_MISS_2M 677 ++#define POWER9_PME_PM_MRK_DERAT_MISS_4K 678 ++#define POWER9_PME_PM_MRK_DERAT_MISS_64K 679 ++#define POWER9_PME_PM_MRK_DERAT_MISS 680 ++#define POWER9_PME_PM_MRK_DFU_FIN 681 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD 682 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR 683 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_DL4 684 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_DMEM 685 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L21_MOD 686 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L21_SHR 687 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF 688 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS 689 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT 690 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L2 691 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD 692 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR 693 ++#define 
POWER9_PME_PM_MRK_DPTEG_FROM_L31_MOD 694 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_SHR 695 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT 696 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF 697 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS 698 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT 699 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_L3 700 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_LL4 701 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_LMEM 702 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY 703 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE 704 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE 705 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD 706 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR 707 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_RL4 708 ++#define POWER9_PME_PM_MRK_DPTEG_FROM_RMEM 709 ++#define POWER9_PME_PM_MRK_DTLB_MISS_16G 710 ++#define POWER9_PME_PM_MRK_DTLB_MISS_16M 711 ++#define POWER9_PME_PM_MRK_DTLB_MISS_1G 712 ++#define POWER9_PME_PM_MRK_DTLB_MISS_4K 713 ++#define POWER9_PME_PM_MRK_DTLB_MISS_64K 714 ++#define POWER9_PME_PM_MRK_DTLB_MISS 715 ++#define POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC 716 ++#define POWER9_PME_PM_MRK_FAB_RSP_BKILL 717 ++#define POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY 718 ++#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC 719 ++#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM 720 ++#define POWER9_PME_PM_MRK_FAB_RSP_RD_RTY 721 ++#define POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV 722 ++#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC 723 ++#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY 724 ++#define POWER9_PME_PM_MRK_FXU_FIN 725 ++#define POWER9_PME_PM_MRK_IC_MISS 726 ++#define POWER9_PME_PM_MRK_INST_CMPL 727 ++#define POWER9_PME_PM_MRK_INST_DECODED 728 ++#define POWER9_PME_PM_MRK_INST_DISP 729 ++#define POWER9_PME_PM_MRK_INST_FIN 730 ++#define POWER9_PME_PM_MRK_INST_FROM_L3MISS 731 ++#define POWER9_PME_PM_MRK_INST_ISSUED 732 ++#define POWER9_PME_PM_MRK_INST_TIMEO 733 ++#define POWER9_PME_PM_MRK_INST 734 ++#define POWER9_PME_PM_MRK_L1_ICACHE_MISS 735 
++#define POWER9_PME_PM_MRK_L1_RELOAD_VALID 736 ++#define POWER9_PME_PM_MRK_L2_RC_DISP 737 ++#define POWER9_PME_PM_MRK_L2_RC_DONE 738 ++#define POWER9_PME_PM_MRK_L2_TM_REQ_ABORT 739 ++#define POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER 740 ++#define POWER9_PME_PM_MRK_LARX_FIN 741 ++#define POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC 742 ++#define POWER9_PME_PM_MRK_LD_MISS_L1_CYC 743 ++#define POWER9_PME_PM_MRK_LD_MISS_L1 744 ++#define POWER9_PME_PM_MRK_LSU_DERAT_MISS 745 ++#define POWER9_PME_PM_MRK_LSU_FIN 746 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC 747 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_EMSH 748 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX 749 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL 750 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_LHS 751 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS 752 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_SAO 753 ++#define POWER9_PME_PM_MRK_LSU_FLUSH_UE 754 ++#define POWER9_PME_PM_MRK_NTC_CYC 755 ++#define POWER9_PME_PM_MRK_NTF_FIN 756 ++#define POWER9_PME_PM_MRK_PROBE_NOP_CMPL 757 ++#define POWER9_PME_PM_MRK_RUN_CYC 758 ++#define POWER9_PME_PM_MRK_STALL_CMPLU_CYC 759 ++#define POWER9_PME_PM_MRK_ST_CMPL_INT 760 ++#define POWER9_PME_PM_MRK_ST_CMPL 761 ++#define POWER9_PME_PM_MRK_STCX_FAIL 762 ++#define POWER9_PME_PM_MRK_STCX_FIN 763 ++#define POWER9_PME_PM_MRK_ST_DONE_L2 764 ++#define POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC 765 ++#define POWER9_PME_PM_MRK_ST_FWD 766 ++#define POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC 767 ++#define POWER9_PME_PM_MRK_ST_NEST 768 ++#define POWER9_PME_PM_MRK_TEND_FAIL 769 ++#define POWER9_PME_PM_MRK_VSU_FIN 770 ++#define POWER9_PME_PM_MULT_MRK 771 ++#define POWER9_PME_PM_NEST_REF_CLK 772 ++#define POWER9_PME_PM_NON_DATA_STORE 773 ++#define POWER9_PME_PM_NON_FMA_FLOP_CMPL 774 ++#define POWER9_PME_PM_NON_MATH_FLOP_CMPL 775 ++#define POWER9_PME_PM_NON_TM_RST_SC 776 ++#define POWER9_PME_PM_NTC_ALL_FIN 777 ++#define POWER9_PME_PM_NTC_FIN 778 ++#define POWER9_PME_PM_NTC_ISSUE_HELD_ARB 779 ++#define 
POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL 780 ++#define POWER9_PME_PM_NTC_ISSUE_HELD_OTHER 781 ++#define POWER9_PME_PM_PARTIAL_ST_FIN 782 ++#define POWER9_PME_PM_PMC1_OVERFLOW 783 ++#define POWER9_PME_PM_PMC1_REWIND 784 ++#define POWER9_PME_PM_PMC1_SAVED 785 ++#define POWER9_PME_PM_PMC2_OVERFLOW 786 ++#define POWER9_PME_PM_PMC2_REWIND 787 ++#define POWER9_PME_PM_PMC2_SAVED 788 ++#define POWER9_PME_PM_PMC3_OVERFLOW 789 ++#define POWER9_PME_PM_PMC3_REWIND 790 ++#define POWER9_PME_PM_PMC3_SAVED 791 ++#define POWER9_PME_PM_PMC4_OVERFLOW 792 ++#define POWER9_PME_PM_PMC4_REWIND 793 ++#define POWER9_PME_PM_PMC4_SAVED 794 ++#define POWER9_PME_PM_PMC5_OVERFLOW 795 ++#define POWER9_PME_PM_PMC6_OVERFLOW 796 ++#define POWER9_PME_PM_PROBE_NOP_DISP 797 ++#define POWER9_PME_PM_PTE_PREFETCH 798 ++#define POWER9_PME_PM_PTESYNC 799 ++#define POWER9_PME_PM_PUMP_CPRED 800 ++#define POWER9_PME_PM_PUMP_MPRED 801 ++#define POWER9_PME_PM_RADIX_PWC_L1_HIT 802 ++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 803 ++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS 804 ++#define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 805 ++#define POWER9_PME_PM_RADIX_PWC_L2_HIT 806 ++#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 807 ++#define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 808 ++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 809 ++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS 810 ++#define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 811 ++#define POWER9_PME_PM_RADIX_PWC_L3_HIT 812 ++#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 813 ++#define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 814 ++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 815 ++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS 816 ++#define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 817 ++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 818 ++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS 819 ++#define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 820 ++#define POWER9_PME_PM_RADIX_PWC_MISS 821 ++#define POWER9_PME_PM_RC0_BUSY 822 ++#define 
POWER9_PME_PM_RC0_BUSY_ALT 823 ++#define POWER9_PME_PM_RC_USAGE 824 ++#define POWER9_PME_PM_RD_CLEARING_SC 825 ++#define POWER9_PME_PM_RD_FORMING_SC 826 ++#define POWER9_PME_PM_RD_HIT_PF 827 ++#define POWER9_PME_PM_RUN_CYC_SMT2_MODE 828 ++#define POWER9_PME_PM_RUN_CYC_SMT4_MODE 829 ++#define POWER9_PME_PM_RUN_CYC_ST_MODE 830 ++#define POWER9_PME_PM_RUN_CYC 831 ++#define POWER9_PME_PM_RUN_INST_CMPL 832 ++#define POWER9_PME_PM_RUN_PURR 833 ++#define POWER9_PME_PM_RUN_SPURR 834 ++#define POWER9_PME_PM_S2Q_FULL 835 ++#define POWER9_PME_PM_SCALAR_FLOP_CMPL 836 ++#define POWER9_PME_PM_SHL_CREATED 837 ++#define POWER9_PME_PM_SHL_ST_DEP_CREATED 838 ++#define POWER9_PME_PM_SHL_ST_DISABLE 839 ++#define POWER9_PME_PM_SLB_TABLEWALK_CYC 840 ++#define POWER9_PME_PM_SN0_BUSY 841 ++#define POWER9_PME_PM_SN0_BUSY_ALT 842 ++#define POWER9_PME_PM_SN_HIT 843 ++#define POWER9_PME_PM_SN_INVL 844 ++#define POWER9_PME_PM_SN_MISS 845 ++#define POWER9_PME_PM_SNOOP_TLBIE 846 ++#define POWER9_PME_PM_SNP_TM_HIT_M 847 ++#define POWER9_PME_PM_SNP_TM_HIT_T 848 ++#define POWER9_PME_PM_SN_USAGE 849 ++#define POWER9_PME_PM_SP_FLOP_CMPL 850 ++#define POWER9_PME_PM_SRQ_EMPTY_CYC 851 ++#define POWER9_PME_PM_SRQ_SYNC_CYC 852 ++#define POWER9_PME_PM_STALL_END_ICT_EMPTY 853 ++#define POWER9_PME_PM_ST_CAUSED_FAIL 854 ++#define POWER9_PME_PM_ST_CMPL 855 ++#define POWER9_PME_PM_STCX_FAIL 856 ++#define POWER9_PME_PM_STCX_FIN 857 ++#define POWER9_PME_PM_STCX_SUCCESS_CMPL 858 ++#define POWER9_PME_PM_ST_FIN 859 ++#define POWER9_PME_PM_ST_FWD 860 ++#define POWER9_PME_PM_ST_MISS_L1 861 ++#define POWER9_PME_PM_STOP_FETCH_PENDING_CYC 862 ++#define POWER9_PME_PM_SUSPENDED 863 ++#define POWER9_PME_PM_SYNC_MRK_BR_LINK 864 ++#define POWER9_PME_PM_SYNC_MRK_BR_MPRED 865 ++#define POWER9_PME_PM_SYNC_MRK_FX_DIVIDE 866 ++#define POWER9_PME_PM_SYNC_MRK_L2HIT 867 ++#define POWER9_PME_PM_SYNC_MRK_L2MISS 868 ++#define POWER9_PME_PM_SYNC_MRK_L3MISS 869 ++#define POWER9_PME_PM_SYNC_MRK_PROBE_NOP 870 ++#define 
POWER9_PME_PM_SYS_PUMP_CPRED 871 ++#define POWER9_PME_PM_SYS_PUMP_MPRED_RTY 872 ++#define POWER9_PME_PM_SYS_PUMP_MPRED 873 ++#define POWER9_PME_PM_TABLEWALK_CYC_PREF 874 ++#define POWER9_PME_PM_TABLEWALK_CYC 875 ++#define POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL 876 ++#define POWER9_PME_PM_TAGE_CORRECT 877 ++#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC 878 ++#define POWER9_PME_PM_TAGE_OVERRIDE_WRONG 879 ++#define POWER9_PME_PM_TAKEN_BR_MPRED_CMPL 880 ++#define POWER9_PME_PM_TB_BIT_TRANS 881 ++#define POWER9_PME_PM_TEND_PEND_CYC 882 ++#define POWER9_PME_PM_THRD_ALL_RUN_CYC 883 ++#define POWER9_PME_PM_THRD_CONC_RUN_INST 884 ++#define POWER9_PME_PM_THRD_PRIO_0_1_CYC 885 ++#define POWER9_PME_PM_THRD_PRIO_2_3_CYC 886 ++#define POWER9_PME_PM_THRD_PRIO_4_5_CYC 887 ++#define POWER9_PME_PM_THRD_PRIO_6_7_CYC 888 ++#define POWER9_PME_PM_THRESH_ACC 889 ++#define POWER9_PME_PM_THRESH_EXC_1024 890 ++#define POWER9_PME_PM_THRESH_EXC_128 891 ++#define POWER9_PME_PM_THRESH_EXC_2048 892 ++#define POWER9_PME_PM_THRESH_EXC_256 893 ++#define POWER9_PME_PM_THRESH_EXC_32 894 ++#define POWER9_PME_PM_THRESH_EXC_4096 895 ++#define POWER9_PME_PM_THRESH_EXC_512 896 ++#define POWER9_PME_PM_THRESH_EXC_64 897 ++#define POWER9_PME_PM_THRESH_MET 898 ++#define POWER9_PME_PM_THRESH_NOT_MET 899 ++#define POWER9_PME_PM_TLB_HIT 900 ++#define POWER9_PME_PM_TLBIE_FIN 901 ++#define POWER9_PME_PM_TLB_MISS 902 ++#define POWER9_PME_PM_TM_ABORTS 903 ++#define POWER9_PME_PM_TMA_REQ_L2 904 ++#define POWER9_PME_PM_TM_CAM_OVERFLOW 905 ++#define POWER9_PME_PM_TM_CAP_OVERFLOW 906 ++#define POWER9_PME_PM_TM_FAIL_CONF_NON_TM 907 ++#define POWER9_PME_PM_TM_FAIL_CONF_TM 908 ++#define POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW 909 ++#define POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT 910 ++#define POWER9_PME_PM_TM_FAIL_SELF 911 ++#define POWER9_PME_PM_TM_FAIL_TLBIE 912 ++#define POWER9_PME_PM_TM_FAIL_TX_CONFLICT 913 ++#define POWER9_PME_PM_TM_FAV_CAUSED_FAIL 914 ++#define POWER9_PME_PM_TM_FAV_TBEGIN 915 ++#define 
POWER9_PME_PM_TM_LD_CAUSED_FAIL 916 ++#define POWER9_PME_PM_TM_LD_CONF 917 ++#define POWER9_PME_PM_TM_NESTED_TBEGIN 918 ++#define POWER9_PME_PM_TM_NESTED_TEND 919 ++#define POWER9_PME_PM_TM_NON_FAV_TBEGIN 920 ++#define POWER9_PME_PM_TM_OUTER_TBEGIN_DISP 921 ++#define POWER9_PME_PM_TM_OUTER_TBEGIN 922 ++#define POWER9_PME_PM_TM_OUTER_TEND 923 ++#define POWER9_PME_PM_TM_PASSED 924 ++#define POWER9_PME_PM_TM_RST_SC 925 ++#define POWER9_PME_PM_TM_SC_CO 926 ++#define POWER9_PME_PM_TM_ST_CAUSED_FAIL 927 ++#define POWER9_PME_PM_TM_ST_CONF 928 ++#define POWER9_PME_PM_TM_TABORT_TRECLAIM 929 ++#define POWER9_PME_PM_TM_TRANS_RUN_CYC 930 ++#define POWER9_PME_PM_TM_TRANS_RUN_INST 931 ++#define POWER9_PME_PM_TM_TRESUME 932 ++#define POWER9_PME_PM_TM_TSUSPEND 933 ++#define POWER9_PME_PM_TM_TX_PASS_RUN_CYC 934 ++#define POWER9_PME_PM_TM_TX_PASS_RUN_INST 935 ++#define POWER9_PME_PM_VECTOR_FLOP_CMPL 936 ++#define POWER9_PME_PM_VECTOR_LD_CMPL 937 ++#define POWER9_PME_PM_VECTOR_ST_CMPL 938 ++#define POWER9_PME_PM_VSU_DP_FSQRT_FDIV 939 ++#define POWER9_PME_PM_VSU_FIN 940 ++#define POWER9_PME_PM_VSU_FSQRT_FDIV 941 ++#define POWER9_PME_PM_VSU_NON_FLOP_CMPL 942 ++#define POWER9_PME_PM_XLATE_HPT_MODE 943 ++#define POWER9_PME_PM_XLATE_MISS 944 ++#define POWER9_PME_PM_XLATE_RADIX_MODE 945 ++ + static const pme_power_entry_t power9_pe[] = { +-[ POWER9_PME_PM_IERAT_RELOAD ] = { /* 0 */ +- .pme_name = "PM_IERAT_RELOAD", +- .pme_code = 0x00000100F6, +- .pme_short_desc = "Number of I-ERAT reloads", +- .pme_long_desc = "Number of I-ERAT reloads", ++[ POWER9_PME_PM_1FLOP_CMPL ] = { ++ .pme_name = "PM_1FLOP_CMPL", ++ .pme_code = 0x0000045050, ++ .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", ++ .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", + }, +-[ POWER9_PME_PM_TM_OUTER_TEND ] = { /* 1 */ +- .pme_name = "PM_TM_OUTER_TEND", +- .pme_code = 0x0000002894, +- 
.pme_short_desc = "Completion time outer tend", +- .pme_long_desc = "Completion time outer tend", ++[ POWER9_PME_PM_1PLUS_PPC_CMPL ] = { ++ .pme_name = "PM_1PLUS_PPC_CMPL", ++ .pme_code = 0x00000100F2, ++ .pme_short_desc = "1 or more ppc insts finished", ++ .pme_long_desc = "1 or more ppc insts finished", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3 ] = { /* 2 */ +- .pme_name = "PM_IPTEG_FROM_L3", +- .pme_code = 0x0000045042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", ++[ POWER9_PME_PM_1PLUS_PPC_DISP ] = { ++ .pme_name = "PM_1PLUS_PPC_DISP", ++ .pme_code = 0x00000400F2, ++ .pme_short_desc = "Cycles at least one Instr Dispatched", ++ .pme_long_desc = "Cycles at least one Instr Dispatched", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_1_MOD ] = { /* 3 */ +- .pme_name = "PM_DPTEG_FROM_L3_1_MOD", +- .pme_code = 0x000002E044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_2FLOP_CMPL ] = { ++ .pme_name = "PM_2FLOP_CMPL", ++ .pme_code = 0x000004D052, ++ .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg ", ++ .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg ", + }, +-[ POWER9_PME_PM_PMC2_SAVED ] = { /* 4 */ +- .pme_name = "PM_PMC2_SAVED", +- .pme_code = 0x0000010022, +- .pme_short_desc = "PMC2 Rewind Value saved", +- .pme_long_desc = "PMC2 Rewind Value saved", ++[ POWER9_PME_PM_4FLOP_CMPL ] = { ++ .pme_name = "PM_4FLOP_CMPL", ++ .pme_code = 0x0000045052, ++ .pme_short_desc = "4 FLOP instruction completed", ++ .pme_long_desc = "4 FLOP instruction completed", + }, +-[ POWER9_PME_PM_LSU_FLUSH_SAO ] = { /* 5 */ +- .pme_name = "PM_LSU_FLUSH_SAO", +- .pme_code = 0x000000C0B8, +- .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", +- .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++[ POWER9_PME_PM_8FLOP_CMPL ] = { ++ .pme_name = "PM_8FLOP_CMPL", ++ .pme_code = 0x000004D054, ++ .pme_short_desc = "8 FLOP instruction completed", ++ .pme_long_desc = "8 FLOP instruction completed", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DFU ] = { /* 6 */ +- .pme_name = "PM_CMPLU_STALL_DFU", +- .pme_code = 0x000002D012, +- .pme_short_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. 
Not qualified by multicycle", ++[ POWER9_PME_PM_ANY_THRD_RUN_CYC ] = { ++ .pme_name = "PM_ANY_THRD_RUN_CYC", ++ .pme_code = 0x00000100FA, ++ .pme_short_desc = "Cycles in which at least one thread has the run latch set", ++ .pme_long_desc = "Cycles in which at least one thread has the run latch set", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS ] = { /* 7 */ +- .pme_name = "PM_MRK_LSU_FLUSH_RELAUNCH_MISS", +- .pme_code = 0x000000D09C, +- .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", +- .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++[ POWER9_PME_PM_BACK_BR_CMPL ] = { ++ .pme_name = "PM_BACK_BR_CMPL", ++ .pme_code = 0x000002505E, ++ .pme_short_desc = "Branch instruction completed with a target address less than current instruction address", ++ .pme_long_desc = "Branch instruction completed with a target address less than current instruction address", + }, +-[ POWER9_PME_PM_SP_FLOP_CMPL ] = { /* 8 */ +- .pme_name = "PM_SP_FLOP_CMPL", +- .pme_code = 0x000001505E, +- .pme_short_desc = "Single-precision flop count", +- .pme_long_desc = "Single-precision flop count", ++[ POWER9_PME_PM_BANK_CONFLICT ] = { ++ .pme_name = "PM_BANK_CONFLICT", ++ .pme_code = 0x0000004880, ++ .pme_short_desc = "Read blocked due to interleave conflict.", ++ .pme_long_desc = "Read blocked due to interleave conflict. 
The ifar logic will detect an interleave conflict and kill the data that was read that cycle.", + }, +-[ POWER9_PME_PM_IC_RELOAD_PRIVATE ] = { /* 9 */ +- .pme_name = "PM_IC_RELOAD_PRIVATE", +- .pme_code = 0x0000004894, +- .pme_short_desc = "Reloading line was brought in private for a specific thread.", +- .pme_long_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight thrreads. If RA does not match then invalidates and then brings it shared to other thread. In P7 line brought in private , then line was invalidat", ++[ POWER9_PME_PM_BFU_BUSY ] = { ++ .pme_name = "PM_BFU_BUSY", ++ .pme_code = 0x000003005C, ++ .pme_short_desc = "Cycles in which all 4 Binary Floating Point units are busy.", ++ .pme_long_desc = "Cycles in which all 4 Binary Floating Point units are busy. The BFU is running at capacity", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 ] = { /* 10 */ +- .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L2", +- .pme_code = 0x000001F058, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation", ++[ POWER9_PME_PM_BR_2PATH ] = { ++ .pme_name = "PM_BR_2PATH", ++ .pme_code = 0x0000020036, ++ .pme_short_desc = "Branches that are not strongly biased", ++ .pme_long_desc = "Branches that are not strongly biased", + }, +-[ POWER9_PME_PM_INST_PUMP_CPRED ] = { /* 11 */ +- .pme_name = "PM_INST_PUMP_CPRED", +- .pme_code = 0x0000014054, +- .pme_short_desc = "Pump prediction correct.", +- .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for an instruction fetch", ++[ POWER9_PME_PM_BR_CMPL ] = { ++ .pme_name = "PM_BR_CMPL", ++ .pme_code = 0x000004D05E, ++ .pme_short_desc = "Any Branch instruction completed", ++ .pme_long_desc = "Any Branch instruction completed", + }, +-[ POWER9_PME_PM_INST_FROM_L2_1_MOD ] = { /* 12 */ +- .pme_name = "PM_INST_FROM_L2_1_MOD", +- .pme_code = 0x0000044046, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_BR_CORECT_PRED_TAKEN_CMPL ] = { ++ .pme_name = "PM_BR_CORECT_PRED_TAKEN_CMPL", ++ .pme_code = 0x000000489C, ++ .pme_short_desc = "Conditional Branch Completed in which the HW correctly predicted the direction as taken.", ++ .pme_long_desc = "Conditional Branch Completed in which the HW correctly predicted the direction as taken. 
Counted at completion time", + }, +-[ POWER9_PME_PM_MRK_ST_CMPL ] = { /* 13 */ +- .pme_name = "PM_MRK_ST_CMPL", +- .pme_code = 0x00000301E2, +- .pme_short_desc = "Marked store completed and sent to nest", +- .pme_long_desc = "Marked store completed and sent to nest", ++[ POWER9_PME_PM_BR_MPRED_CCACHE ] = { ++ .pme_name = "PM_BR_MPRED_CCACHE", ++ .pme_code = 0x00000040AC, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", + }, +-[ POWER9_PME_PM_MRK_LSU_DERAT_MISS ] = { /* 14 */ +- .pme_name = "PM_MRK_LSU_DERAT_MISS", +- .pme_code = 0x0000030162, +- .pme_short_desc = "Marked derat reload (miss) for any page size", +- .pme_long_desc = "Marked derat reload (miss) for any page size", ++[ POWER9_PME_PM_BR_MPRED_CMPL ] = { ++ .pme_name = "PM_BR_MPRED_CMPL", ++ .pme_code = 0x00000400F6, ++ .pme_short_desc = "Number of Branch Mispredicts", ++ .pme_long_desc = "Number of Branch Mispredicts", + }, +-[ POWER9_PME_PM_L2_ST_DISP ] = { /* 15 */ +- .pme_name = "PM_L2_ST_DISP", +- .pme_code = 0x000001689E, +- .pme_short_desc = "All successful store dispatches", +- .pme_long_desc = "All successful store dispatches", ++[ POWER9_PME_PM_BR_MPRED_LSTACK ] = { ++ .pme_name = "PM_BR_MPRED_LSTACK", ++ .pme_code = 0x00000048AC, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", + }, +-[ POWER9_PME_PM_LSU0_FALSE_LHS ] = { /* 16 */ +- .pme_name = "PM_LSU0_FALSE_LHS", +- .pme_code = 0x000000C0A0, +- .pme_short_desc = "False LHS match detected", +- .pme_long_desc = "False LHS match detected", ++[ POWER9_PME_PM_BR_MPRED_PCACHE ] = { ++ .pme_name = "PM_BR_MPRED_PCACHE", ++ .pme_code = 0x00000048B0, ++ .pme_short_desc = "Conditional Branch 
Completed that was Mispredicted due to pattern cache prediction", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", + }, +-[ POWER9_PME_PM_L2_CASTOUT_MOD ] = { /* 17 */ +- .pme_name = "PM_L2_CASTOUT_MOD", +- .pme_code = 0x0000016082, +- .pme_short_desc = "L2 Castouts - Modified (M, Mu, Me)", +- .pme_long_desc = "L2 Castouts - Modified (M, Mu, Me)", ++[ POWER9_PME_PM_BR_MPRED_TAKEN_CR ] = { ++ .pme_name = "PM_BR_MPRED_TAKEN_CR", ++ .pme_code = 0x00000040B8, ++ .pme_short_desc = "A Conditional Branch that resolved to taken was mispredicted as not taken (due to the BHT Direction Prediction).", ++ .pme_long_desc = "A Conditional Branch that resolved to taken was mispredicted as not taken (due to the BHT Direction Prediction).", + }, +-[ POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { /* 18 */ +- .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", +- .pme_code = 0x0000036884, +- .pme_short_desc = "L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", +- .pme_long_desc = "L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", ++[ POWER9_PME_PM_BR_MPRED_TAKEN_TA ] = { ++ .pme_name = "PM_BR_MPRED_TAKEN_TA", ++ .pme_code = 0x00000048B8, ++ .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack.", ++ .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. 
Only XL-form branches that resolved Taken set this event.", + }, +-[ POWER9_PME_PM_MRK_INST_TIMEO ] = { /* 19 */ +- .pme_name = "PM_MRK_INST_TIMEO", +- .pme_code = 0x0000040134, +- .pme_short_desc = "marked Instruction finish timeout (instruction lost)", +- .pme_long_desc = "marked Instruction finish timeout (instruction lost)", ++[ POWER9_PME_PM_BR_PRED_CCACHE ] = { ++ .pme_name = "PM_BR_PRED_CCACHE", ++ .pme_code = 0x00000040A4, ++ .pme_short_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH ] = { /* 20 */ +- .pme_name = "PM_CMPLU_STALL_LOAD_FINISH", +- .pme_code = 0x000004D014, +- .pme_short_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", +- .pme_long_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", ++[ POWER9_PME_PM_BR_PRED_LSTACK ] = { ++ .pme_name = "PM_BR_PRED_LSTACK", ++ .pme_code = 0x00000040A8, ++ .pme_short_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", ++ .pme_long_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", + }, +-[ POWER9_PME_PM_INST_FROM_L2_1_SHR ] = { /* 21 */ +- .pme_name = "PM_INST_FROM_L2_1_SHR", +- .pme_code = 0x0000034046, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_BR_PRED_PCACHE ] = { ++ .pme_name = "PM_BR_PRED_PCACHE", ++ .pme_code = 0x00000048A0, ++ 
.pme_short_desc = "Conditional branch completed that used pattern cache prediction", ++ .pme_long_desc = "Conditional branch completed that used pattern cache prediction", + }, +-[ POWER9_PME_PM_LS1_DC_COLLISIONS ] = { /* 22 */ +- .pme_name = "PM_LS1_DC_COLLISIONS", +- .pme_code = 0x000000D890, +- .pme_short_desc = "Read-write data cache collisions", +- .pme_long_desc = "Read-write data cache collisions", ++[ POWER9_PME_PM_BR_PRED_TAKEN_CR ] = { ++ .pme_name = "PM_BR_PRED_TAKEN_CR", ++ .pme_code = 0x00000040B0, ++ .pme_short_desc = "Conditional Branch that had its direction predicted.", ++ .pme_long_desc = "Conditional Branch that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches", + }, +-[ POWER9_PME_PM_LSU2_FALSE_LHS ] = { /* 23 */ +- .pme_name = "PM_LSU2_FALSE_LHS", +- .pme_code = 0x000000C0A4, +- .pme_short_desc = "False LHS match detected", +- .pme_long_desc = "False LHS match detected", ++[ POWER9_PME_PM_BR_PRED_TA ] = { ++ .pme_name = "PM_BR_PRED_TA", ++ .pme_code = 0x00000040B4, ++ .pme_short_desc = "Conditional Branch Completed that had its target address predicted.", ++ .pme_long_desc = "Conditional Branch Completed that had its target address predicted. Only XL-form branches set this event. This equal the sum of CCACHE, LSTACK, and PCACHE", + }, +-[ POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC ] = { /* 24 */ +- .pme_name = "PM_MRK_ST_DRAIN_TO_L2DISP_CYC", +- .pme_code = 0x000003F150, +- .pme_short_desc = "cycles to drain st from core to L2", +- .pme_long_desc = "cycles to drain st from core to L2", ++[ POWER9_PME_PM_BR_PRED ] = { ++ .pme_name = "PM_BR_PRED", ++ .pme_code = 0x000000409C, ++ .pme_short_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target.", ++ .pme_long_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target. 
Includes taken and not taken and is counted at execution time", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS_16M ] = { /* 25 */ +- .pme_name = "PM_MRK_DTLB_MISS_16M", +- .pme_code = 0x000004C15E, +- .pme_short_desc = "Marked Data TLB Miss page size 16M", +- .pme_long_desc = "Marked Data TLB Miss page size 16M", ++[ POWER9_PME_PM_BR_TAKEN_CMPL ] = { ++ .pme_name = "PM_BR_TAKEN_CMPL", ++ .pme_code = 0x00000200FA, ++ .pme_short_desc = "New event for Branch Taken", ++ .pme_long_desc = "New event for Branch Taken", + }, +-[ POWER9_PME_PM_L2_GROUP_PUMP ] = { /* 26 */ +- .pme_name = "PM_L2_GROUP_PUMP", +- .pme_code = 0x0000046888, +- .pme_short_desc = "RC requests that were on Node Pump attempts", +- .pme_long_desc = "RC requests that were on Node Pump attempts", ++[ POWER9_PME_PM_BRU_FIN ] = { ++ .pme_name = "PM_BRU_FIN", ++ .pme_code = 0x0000010068, ++ .pme_short_desc = "Branch Instruction Finished", ++ .pme_long_desc = "Branch Instruction Finished", + }, +-[ POWER9_PME_PM_LSU2_VECTOR_ST_FIN ] = { /* 27 */ +- .pme_name = "PM_LSU2_VECTOR_ST_FIN", +- .pme_code = 0x000000C08C, +- .pme_short_desc = "A vector store instruction finished.", +- .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++[ POWER9_PME_PM_BR_UNCOND ] = { ++ .pme_name = "PM_BR_UNCOND", ++ .pme_code = 0x00000040A0, ++ .pme_short_desc = "Unconditional Branch Completed.", ++ .pme_long_desc = "Unconditional Branch Completed. HW branch prediction was not used for this branch. 
This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was covenrted to a Resolve.", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB ] = { /* 28 */ +- .pme_name = "PM_CMPLU_STALL_LSAQ_ARB", +- .pme_code = 0x000004E016, +- .pme_short_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", +- .pme_long_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", ++[ POWER9_PME_PM_BTAC_BAD_RESULT ] = { ++ .pme_name = "PM_BTAC_BAD_RESULT", ++ .pme_code = 0x00000050B0, ++ .pme_short_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common).", ++ .pme_long_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common). 
In both cases, a redirect will happen", + }, +-[ POWER9_PME_PM_L3_CO_LCO ] = { /* 29 */ +- .pme_name = "PM_L3_CO_LCO", +- .pme_code = 0x00000360A4, +- .pme_short_desc = "Total L3 castouts occurred on LCO", +- .pme_long_desc = "Total L3 castouts occurred on LCO", ++[ POWER9_PME_PM_BTAC_GOOD_RESULT ] = { ++ .pme_name = "PM_BTAC_GOOD_RESULT", ++ .pme_code = 0x00000058B0, ++ .pme_short_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", ++ .pme_long_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", + }, +-[ POWER9_PME_PM_INST_GRP_PUMP_CPRED ] = { /* 30 */ +- .pme_name = "PM_INST_GRP_PUMP_CPRED", +- .pme_code = 0x000002C05C, +- .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", +- .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", ++[ POWER9_PME_PM_CHIP_PUMP_CPRED ] = { ++ .pme_name = "PM_CHIP_PUMP_CPRED", ++ .pme_code = 0x0000010050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_THRD_PRIO_4_5_CYC ] = { /* 31 */ +- .pme_name = "PM_THRD_PRIO_4_5_CYC", +- .pme_code = 0x0000005080, +- .pme_short_desc = "Cycles thread running at priority level 4 or 5", +- .pme_long_desc = "Cycles thread running at priority level 4 or 5", ++[ POWER9_PME_PM_CLB_HELD ] = { ++ .pme_name = "PM_CLB_HELD", ++ .pme_code = 0x000000208C, ++ .pme_short_desc = "CLB (control logic block - indicates quadword fetch block) Hold: Any Reason", ++ .pme_long_desc = "CLB (control logic block - indicates quadword fetch block) Hold: Any Reason", + }, +-[ 
POWER9_PME_PM_BR_PRED_TA ] = { /* 32 */ +- .pme_name = "PM_BR_PRED_TA", +- .pme_code = 0x00000040B4, +- .pme_short_desc = "Conditional Branch Completed that had its target address predicted.", +- .pme_long_desc = "Conditional Branch Completed that had its target address predicted. Only XL-form branches set this event. This equal the sum of CCACHE, LSTACK, and PCACHE", ++[ POWER9_PME_PM_CMPLU_STALL_ANY_SYNC ] = { ++ .pme_name = "PM_CMPLU_STALL_ANY_SYNC", ++ .pme_code = 0x000001E05A, ++ .pme_short_desc = "Cycles in which the NTC sync instruction (isync, lwsync or hwsync) is not allowed to complete", ++ .pme_long_desc = "Cycles in which the NTC sync instruction (isync, lwsync or hwsync) is not allowed to complete", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS ] = { /* 33 */ +- .pme_name = "PM_ICT_NOSLOT_BR_MPRED_ICMISS", +- .pme_code = 0x0000034058, +- .pme_short_desc = "Ict empty for this thread due to Icache Miss and branch mispred", +- .pme_long_desc = "Ict empty for this thread due to Icache Miss and branch mispred", ++[ POWER9_PME_PM_CMPLU_STALL_BRU ] = { ++ .pme_name = "PM_CMPLU_STALL_BRU", ++ .pme_code = 0x000004D018, ++ .pme_short_desc = "Completion stall due to a Branch Unit", ++ .pme_long_desc = "Completion stall due to a Branch Unit", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT ] = { /* 34 */ +- .pme_name = "PM_IPTEG_FROM_L3_NO_CONFLICT", +- .pme_code = 0x0000015044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", ++[ POWER9_PME_PM_CMPLU_STALL_CRYPTO ] = { ++ .pme_name = "PM_CMPLU_STALL_CRYPTO", ++ .pme_code = 0x000004C01E, ++ .pme_short_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was 
routed to the crypto execution pipe and was waiting to finish", + }, +-[ POWER9_PME_PM_CMPLU_STALL_FXU ] = { /* 35 */ +- .pme_name = "PM_CMPLU_STALL_FXU", +- .pme_code = 0x000002D016, +- .pme_short_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline.", +- .pme_long_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline. These instructions get routed to the ALU, ALU2, and DIV pipes", ++[ POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { ++ .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", ++ .pme_code = 0x000002C012, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", + }, +-[ POWER9_PME_PM_VSU_FSQRT_FDIV ] = { /* 36 */ +- .pme_name = "PM_VSU_FSQRT_FDIV", +- .pme_code = 0x000004D04E, +- .pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", +- .pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", ++[ POWER9_PME_PM_CMPLU_STALL_DFLONG ] = { ++ .pme_name = "PM_CMPLU_STALL_DFLONG", ++ .pme_code = 0x000001005A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. 
Qualified by multicycle", + }, +-[ POWER9_PME_PM_EXT_INT ] = { /* 37 */ +- .pme_name = "PM_EXT_INT", +- .pme_code = 0x00000200F8, +- .pme_short_desc = "external interrupt", +- .pme_long_desc = "external interrupt", ++[ POWER9_PME_PM_CMPLU_STALL_DFU ] = { ++ .pme_name = "PM_CMPLU_STALL_DFU", ++ .pme_code = 0x000002D012, ++ .pme_short_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Not qualified by multicycle", + }, +-[ POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { /* 38 */ +- .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", +- .pme_code = 0x000001013E, +- .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", +- .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_L21_L31", ++ .pme_code = 0x000002C018, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", + }, +-[ POWER9_PME_PM_S2Q_FULL ] = { /* 39 */ +- .pme_name = "PM_S2Q_FULL", +- .pme_code = 0x000000E080, +- .pme_short_desc = "Cycles during which the S2Q is full", +- .pme_long_desc = "Cycles during which the S2Q is full", ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_L2L3_CONFLICT", ++ .pme_code = 0x000004C016, ++ .pme_short_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", ++ .pme_long_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", + }, +-[ POWER9_PME_PM_RUN_CYC_SMT2_MODE ] = { /* 40 */ +- .pme_name = 
"PM_RUN_CYC_SMT2_MODE", +- .pme_code = 0x000003006C, +- .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", +- .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_L2L3", ++ .pme_code = 0x000001003C, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved in L2/L3", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved in L2/L3", + }, +-[ POWER9_PME_PM_DECODE_LANES_NOT_AVAIL ] = { /* 41 */ +- .pme_name = "PM_DECODE_LANES_NOT_AVAIL", +- .pme_code = 0x0000005884, +- .pme_short_desc = "Decode has something to transmit but dispatch lanes are not available", +- .pme_long_desc = "Decode has something to transmit but dispatch lanes are not available", ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_L3MISS", ++ .pme_code = 0x000004C01A, ++ .pme_short_desc = "Completion stall due to cache miss resolving missed the L3", ++ .pme_long_desc = "Completion stall due to cache miss resolving missed the L3", + }, +-[ POWER9_PME_PM_TM_FAIL_TLBIE ] = { /* 42 */ +- .pme_name = "PM_TM_FAIL_TLBIE", +- .pme_code = 0x000000E0AC, +- .pme_short_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", +- .pme_long_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", ++[ POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_LMEM", ++ .pme_code = 0x0000030038, ++ .pme_short_desc = "Completion stall due to cache miss that resolves in local memory", ++ .pme_long_desc = "Completion stall due to cache miss that resolves in local memory", + }, +-[ POWER9_PME_PM_DISP_CLB_HELD_BAL ] = { /* 43 */ +- .pme_name = "PM_DISP_CLB_HELD_BAL", +- .pme_code = 0x000000288C, +- .pme_short_desc = "Dispatch/CLB Hold: Balance Flush", +- .pme_long_desc = "Dispatch/CLB Hold: Balance Flush", ++[ 
POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE ] = { ++ .pme_name = "PM_CMPLU_STALL_DMISS_REMOTE", ++ .pme_code = 0x000002C01C, ++ .pme_short_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", ++ .pme_long_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC ] = { /* 44 */ +- .pme_name = "PM_MRK_DATA_FROM_L3MISS_CYC", +- .pme_code = 0x000001415E, +- .pme_short_desc = "Duration in cycles to reload from a localtion other than the local core's L3 due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from a localtion other than the local core's L3 due to a marked load", ++[ POWER9_PME_PM_CMPLU_STALL_DPLONG ] = { ++ .pme_name = "PM_CMPLU_STALL_DPLONG", ++ .pme_code = 0x000003405C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by NOT vector AND multicycle", + }, +-[ POWER9_PME_PM_MRK_ST_FWD ] = { /* 45 */ +- .pme_name = "PM_MRK_ST_FWD", +- .pme_code = 0x000003012C, +- .pme_short_desc = "Marked st forwards", +- .pme_long_desc = "Marked st forwards", ++[ POWER9_PME_PM_CMPLU_STALL_DP ] = { ++ .pme_name = "PM_CMPLU_STALL_DP", ++ .pme_code = 0x000001005C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. 
Not qualified multicycle. Qualified by NOT vector", + }, +-[ POWER9_PME_PM_FXU_FIN ] = { /* 46 */ +- .pme_name = "PM_FXU_FIN", +- .pme_code = 0x0000040004, +- .pme_short_desc = "The fixed point unit Unit finished an instruction.", +- .pme_long_desc = "The fixed point unit Unit finished an instruction. Instructions that finish may not necessary complete.", ++[ POWER9_PME_PM_CMPLU_STALL_EIEIO ] = { ++ .pme_name = "PM_CMPLU_STALL_EIEIO", ++ .pme_code = 0x000004D01A, ++ .pme_short_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", + }, +-[ POWER9_PME_PM_SYNC_MRK_BR_MPRED ] = { /* 47 */ +- .pme_name = "PM_SYNC_MRK_BR_MPRED", +- .pme_code = 0x000001515C, +- .pme_short_desc = "Marked Branch mispredict that can cause a synchronous interrupt", +- .pme_long_desc = "Marked Branch mispredict that can cause a synchronous interrupt", ++[ POWER9_PME_PM_CMPLU_STALL_EMQ_FULL ] = { ++ .pme_name = "PM_CMPLU_STALL_EMQ_FULL", ++ .pme_code = 0x0000030004, ++ .pme_short_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", ++ .pme_long_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", + }, +-[ POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB ] = { /* 48 */ +- .pme_name = "PM_CMPLU_STALL_STORE_FIN_ARB", +- .pme_code = 0x0000030014, +- .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe.", +- .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe. 
This means the instruction is ready to finish but there are instructions ahead of it, using the finish pipe", ++[ POWER9_PME_PM_CMPLU_STALL_ERAT_MISS ] = { ++ .pme_name = "PM_CMPLU_STALL_ERAT_MISS", ++ .pme_code = 0x000004C012, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", + }, +-[ POWER9_PME_PM_DSLB_MISS ] = { /* 49 */ +- .pme_name = "PM_DSLB_MISS", +- .pme_code = 0x000000D0A8, +- .pme_short_desc = "Data SLB Miss - Total of all segment sizes", +- .pme_long_desc = "Data SLB Miss - Total of all segment sizes", ++[ POWER9_PME_PM_CMPLU_STALL_EXCEPTION ] = { ++ .pme_name = "PM_CMPLU_STALL_EXCEPTION", ++ .pme_code = 0x000003003A, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", + }, +-[ POWER9_PME_PM_L3_MISS ] = { /* 50 */ +- .pme_name = "PM_L3_MISS", +- .pme_code = 0x00000168A4, +- .pme_short_desc = "L3 Misses", +- .pme_long_desc = "L3 Misses", ++[ POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT ] = { ++ .pme_name = "PM_CMPLU_STALL_EXEC_UNIT", ++ .pme_code = 0x000002D018, ++ .pme_short_desc = "Completion stall due to execution units (FXU/VSU/CRU)", ++ .pme_long_desc = "Completion stall due to execution units (FXU/VSU/CRU)", + }, +-[ POWER9_PME_PM_DUMMY2_REMOVE_ME ] = { /* 51 */ +- .pme_name = "PM_DUMMY2_REMOVE_ME", +- .pme_code = 0x0000040064, +- .pme_short_desc = "Space holder for LS_PC_RELOAD_RA", +- .pme_long_desc = "Space holder for LS_PC_RELOAD_RA", ++[ POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD ] = { ++ .pme_name = "PM_CMPLU_STALL_FLUSH_ANY_THREAD", ++ 
.pme_code = 0x000001E056, ++ .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", ++ .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_1G ] = { /* 52 */ +- .pme_name = "PM_MRK_DERAT_MISS_1G", +- .pme_code = 0x000003D152, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G.", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", ++[ POWER9_PME_PM_CMPLU_STALL_FXLONG ] = { ++ .pme_name = "PM_CMPLU_STALL_FXLONG", ++ .pme_code = 0x000004D016, ++ .pme_short_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", ++ .pme_long_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", + }, +-[ POWER9_PME_PM_MATH_FLOP_CMPL ] = { /* 53 */ +- .pme_name = "PM_MATH_FLOP_CMPL", +- .pme_code = 0x0000010066, +- .pme_short_desc = "", +- .pme_long_desc = "", ++[ POWER9_PME_PM_CMPLU_STALL_FXU ] = { ++ .pme_name = "PM_CMPLU_STALL_FXU", ++ .pme_code = 0x000002D016, ++ .pme_short_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline.", ++ .pme_long_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline. 
These instructions get routed to the ALU, ALU2, and DIV pipes", + }, +-[ POWER9_PME_PM_L2_INST ] = { /* 54 */ +- .pme_name = "PM_L2_INST", +- .pme_code = 0x000003609E, +- .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", +- .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", ++[ POWER9_PME_PM_CMPLU_STALL_HWSYNC ] = { ++ .pme_name = "PM_CMPLU_STALL_HWSYNC", ++ .pme_code = 0x0000030036, ++ .pme_short_desc = "completion stall due to hwsync", ++ .pme_long_desc = "completion stall due to hwsync", + }, +-[ POWER9_PME_PM_FLUSH_DISP ] = { /* 55 */ +- .pme_name = "PM_FLUSH_DISP", +- .pme_code = 0x0000002880, +- .pme_short_desc = "Dispatch flush", +- .pme_long_desc = "Dispatch flush", ++[ POWER9_PME_PM_CMPLU_STALL_LARX ] = { ++ .pme_name = "PM_CMPLU_STALL_LARX", ++ .pme_code = 0x000001002A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", ++ .pme_long_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", + }, +-[ POWER9_PME_PM_DISP_HELD_ISSQ_FULL ] = { /* 56 */ +- .pme_name = "PM_DISP_HELD_ISSQ_FULL", +- .pme_code = 0x0000020006, +- .pme_short_desc = "Dispatch held due to Issue q full.", +- .pme_long_desc = "Dispatch held due to Issue q full. 
Includes issue queue and branch queue", ++[ POWER9_PME_PM_CMPLU_STALL_LHS ] = { ++ .pme_name = "PM_CMPLU_STALL_LHS", ++ .pme_code = 0x000002C01A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", + }, +-[ POWER9_PME_PM_MEM_READ ] = { /* 57 */ +- .pme_name = "PM_MEM_READ", +- .pme_code = 0x0000010056, +- .pme_short_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch).", +- .pme_long_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). Includes L4", ++[ POWER9_PME_PM_CMPLU_STALL_LMQ_FULL ] = { ++ .pme_name = "PM_CMPLU_STALL_LMQ_FULL", ++ .pme_code = 0x000004C014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", + }, +-[ POWER9_PME_PM_DATA_PUMP_MPRED ] = { /* 58 */ +- .pme_name = "PM_DATA_PUMP_MPRED", +- .pme_code = 0x000004C052, +- .pme_short_desc = "Pump misprediction.", +- .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for a demand load", ++[ POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH ] = { ++ .pme_name = "PM_CMPLU_STALL_LOAD_FINISH", ++ .pme_code = 0x000004D014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", + }, +-[ POWER9_PME_PM_DATA_CHIP_PUMP_CPRED ] = { /* 59 */ +- .pme_name = "PM_DATA_CHIP_PUMP_CPRED", +- .pme_code = 0x000001C050, +- .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", +- .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", ++[ POWER9_PME_PM_CMPLU_STALL_LRQ_FULL ] = { ++ .pme_name = "PM_CMPLU_STALL_LRQ_FULL", ++ .pme_code = 0x000002D014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ (load-store address queue) because the LRQ (load-reorder queue) was full", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ (load-store address queue) because the LRQ (load-reorder queue) was full", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DMEM ] = { /* 60 */ +- .pme_name = "PM_MRK_DATA_FROM_DMEM", +- .pme_code = 0x000003D14C, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", ++[ POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER ] = { ++ .pme_name = "PM_CMPLU_STALL_LRQ_OTHER", ++ .pme_code = 0x0000010004, ++ .pme_short_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction 
multihit and others", ++ .pme_long_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB ] = { ++ .pme_name = "PM_CMPLU_STALL_LSAQ_ARB", ++ .pme_code = 0x000004E016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", ++ .pme_long_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_FIN ] = { ++ .pme_name = "PM_CMPLU_STALL_LSU_FIN", ++ .pme_code = 0x000001003A, ++ .pme_short_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT ] = { ++ .pme_name = "PM_CMPLU_STALL_LSU_FLUSH_NEXT", ++ .pme_code = 0x000002E01A, ++ .pme_short_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence.", ++ .pme_long_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence. 
It takes 1 cycle for the ISU to process this request before the LSU instruction is allowed to complete", ++}, ++[ POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR ] = { ++ .pme_name = "PM_CMPLU_STALL_LSU_MFSPR", ++ .pme_code = 0x0000034056, ++ .pme_short_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", ++ .pme_long_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LSU ] = { /* 61 */ ++[ POWER9_PME_PM_CMPLU_STALL_LSU ] = { + .pme_name = "PM_CMPLU_STALL_LSU", + .pme_code = 0x000002C010, + .pme_short_desc = "Completion stall by LSU instruction", + .pme_long_desc = "Completion stall by LSU instruction", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_1_MOD ] = { /* 62 */ +- .pme_name = "PM_DATA_FROM_L3_1_MOD", +- .pme_code = 0x000002C044, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", ++[ POWER9_PME_PM_CMPLU_STALL_LWSYNC ] = { ++ .pme_name = "PM_CMPLU_STALL_LWSYNC", ++ .pme_code = 0x0000010036, ++ .pme_short_desc = "completion stall due to lwsync", ++ .pme_long_desc = "completion stall due to lwsync", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_16M ] = { /* 63 */ +- .pme_name = "PM_MRK_DERAT_MISS_16M", +- .pme_code = 0x000003D154, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", ++[ POWER9_PME_PM_CMPLU_STALL_MTFPSCR ] = { ++ .pme_name = "PM_CMPLU_STALL_MTFPSCR", ++ .pme_code = 0x000004E012, ++ .pme_short_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table 
(EAT)", ++ .pme_long_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", + }, +-[ POWER9_PME_PM_TM_TRANS_RUN_CYC ] = { /* 64 */ +- .pme_name = "PM_TM_TRANS_RUN_CYC", +- .pme_code = 0x0000010060, +- .pme_short_desc = "run cycles in transactional state", +- .pme_long_desc = "run cycles in transactional state", ++[ POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN ] = { ++ .pme_name = "PM_CMPLU_STALL_NESTED_TBEGIN", ++ .pme_code = 0x000001E05C, ++ .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin.", ++ .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. This is a short delay, and it includes ROT", + }, +-[ POWER9_PME_PM_THRD_ALL_RUN_CYC ] = { /* 65 */ +- .pme_name = "PM_THRD_ALL_RUN_CYC", +- .pme_code = 0x0000020008, +- .pme_short_desc = "Cycles in which all the threads have the run latch set", +- .pme_long_desc = "Cycles in which all the threads have the run latch set", ++[ POWER9_PME_PM_CMPLU_STALL_NESTED_TEND ] = { ++ .pme_name = "PM_CMPLU_STALL_NESTED_TEND", ++ .pme_code = 0x000003003C, ++ .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level.", ++ .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level. 
This is a short delay", + }, +-[ POWER9_PME_PM_DATA_FROM_DL2L3_MOD ] = { /* 66 */ +- .pme_name = "PM_DATA_FROM_DL2L3_MOD", +- .pme_code = 0x000004C048, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++[ POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN ] = { ++ .pme_name = "PM_CMPLU_STALL_NTC_DISP_FIN", ++ .pme_code = 0x000004E018, ++ .pme_short_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", ++ .pme_long_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", + }, +-[ POWER9_PME_PM_MRK_BR_MPRED_CMPL ] = { /* 67 */ +- .pme_name = "PM_MRK_BR_MPRED_CMPL", +- .pme_code = 0x00000301E4, +- .pme_short_desc = "Marked Branch Mispredicted", +- .pme_long_desc = "Marked Branch Mispredicted", ++[ POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH ] = { ++ .pme_name = "PM_CMPLU_STALL_NTC_FLUSH", ++ .pme_code = 0x000002E01E, ++ .pme_short_desc = "Completion stall due to ntc flush", ++ .pme_long_desc = "Completion stall due to ntc flush", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ ] = { /* 68 */ +- .pme_name = "PM_ICT_NOSLOT_DISP_HELD_ISSQ", +- .pme_code = 0x000002D01E, +- .pme_short_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", +- .pme_long_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", ++[ POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL ] = { ++ .pme_name = "PM_CMPLU_STALL_OTHER_CMPL", ++ .pme_code = 0x0000030006, ++ .pme_short_desc = "Instructions the core completed while this thread was stalled", ++ .pme_long_desc = 
"Instructions the core completed while this thread was stalled", + }, +-[ POWER9_PME_PM_MRK_INST ] = { /* 69 */ +- .pme_name = "PM_MRK_INST", +- .pme_code = 0x0000024058, +- .pme_short_desc = "An instruction was marked.", +- .pme_long_desc = "An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Samping (RES) at the time the configured event happens", ++[ POWER9_PME_PM_CMPLU_STALL_PASTE ] = { ++ .pme_name = "PM_CMPLU_STALL_PASTE", ++ .pme_code = 0x000002C016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", + }, +-[ POWER9_PME_PM_TABLEWALK_CYC_PREF ] = { /* 70 */ +- .pme_name = "PM_TABLEWALK_CYC_PREF", +- .pme_code = 0x000000F884, +- .pme_short_desc = "tablewalk qualified for pte prefetches", +- .pme_long_desc = "tablewalk qualified for pte prefetches", ++[ POWER9_PME_PM_CMPLU_STALL_PM ] = { ++ .pme_name = "PM_CMPLU_STALL_PM", ++ .pme_code = 0x000003000A, ++ .pme_short_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish. Includes permute and decimal fixed point instructions (128 bit BCD arithmetic) + a few 128 bit fixpoint add/subtract instructions with carry. Not qualified by vector or multicycle", + }, +-[ POWER9_PME_PM_LSU1_ERAT_HIT ] = { /* 71 */ +- .pme_name = "PM_LSU1_ERAT_HIT", +- .pme_code = 0x000000E88C, +- .pme_short_desc = "Primary ERAT hit.", +- .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", ++[ POWER9_PME_PM_CMPLU_STALL_SLB ] = { ++ .pme_name = "PM_CMPLU_STALL_SLB", ++ .pme_code = 0x000001E052, ++ .pme_short_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", ++ .pme_long_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", + }, +-[ POWER9_PME_PM_NTC_ISSUE_HELD_OTHER ] = { /* 72 */ +- .pme_name = "PM_NTC_ISSUE_HELD_OTHER", +- .pme_code = 0x000003D05A, +- .pme_short_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", +- .pme_long_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", ++[ POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH ] = { ++ .pme_name = "PM_CMPLU_STALL_SPEC_FINISH", ++ .pme_code = 0x0000030028, ++ .pme_short_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", ++ .pme_long_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT ] = { /* 73 */ +- .pme_name = "PM_CMPLU_STALL_LSU_FLUSH_NEXT", +- .pme_code = 0x000002E01A, +- .pme_short_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence.", +- .pme_long_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence. 
It takes 1 cycle for the ISU to process this request before the LSU instruction is allowed to complete", ++[ POWER9_PME_PM_CMPLU_STALL_SRQ_FULL ] = { ++ .pme_name = "PM_CMPLU_STALL_SRQ_FULL", ++ .pme_code = 0x0000030016, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2 ] = { /* 74 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2", +- .pme_code = 0x000001F142, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_CMPLU_STALL_STCX ] = { ++ .pme_name = "PM_CMPLU_STALL_STCX", ++ .pme_code = 0x000002D01C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", + }, +-[ POWER9_PME_PM_LS1_TM_DISALLOW ] = { /* 75 */ +- .pme_name = "PM_LS1_TM_DISALLOW", +- .pme_code = 0x000000E8B4, +- .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", +- .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++[ POWER9_PME_PM_CMPLU_STALL_ST_FWD ] = { ++ .pme_name = "PM_CMPLU_STALL_ST_FWD", ++ .pme_code = 0x000004C01C, ++ .pme_short_desc = "Completion stall due to store forward", ++ .pme_long_desc = "Completion stall due to store forward", + }, +-[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 76 */ +- .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_LDHITST", +- .pme_code = 0x0000034040, +- 
.pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_CMPLU_STALL_STORE_DATA ] = { ++ .pme_name = "PM_CMPLU_STALL_STORE_DATA", ++ .pme_code = 0x0000030026, ++ .pme_short_desc = "Finish stall because the next to finish instruction was a store waiting on data", ++ .pme_long_desc = "Finish stall because the next to finish instruction was a store waiting on data", + }, +-[ POWER9_PME_PM_BR_PRED_PCACHE ] = { /* 77 */ +- .pme_name = "PM_BR_PRED_PCACHE", +- .pme_code = 0x00000048A0, +- .pme_short_desc = "Conditional branch completed that used pattern cache prediction", +- .pme_long_desc = "Conditional branch completed that used pattern cache prediction", ++[ POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB ] = { ++ .pme_name = "PM_CMPLU_STALL_STORE_FIN_ARB", ++ .pme_code = 0x0000030014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe. 
This means the instruction is ready to finish but there are instructions ahead of it, using the finish pipe", + }, +-[ POWER9_PME_PM_MRK_BACK_BR_CMPL ] = { /* 78 */ +- .pme_name = "PM_MRK_BACK_BR_CMPL", +- .pme_code = 0x000003515E, +- .pme_short_desc = "Marked branch instruction completed with a target address less than current instruction address", +- .pme_long_desc = "Marked branch instruction completed with a target address less than current instruction address", ++[ POWER9_PME_PM_CMPLU_STALL_STORE_FINISH ] = { ++ .pme_name = "PM_CMPLU_STALL_STORE_FINISH", ++ .pme_code = 0x000002C014, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", + }, +-[ POWER9_PME_PM_RD_CLEARING_SC ] = { /* 79 */ +- .pme_name = "PM_RD_CLEARING_SC", +- .pme_code = 0x00000468A6, +- .pme_short_desc = "rd clearing sc", +- .pme_long_desc = "rd clearing sc", ++[ POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB ] = { ++ .pme_name = "PM_CMPLU_STALL_STORE_PIPE_ARB", ++ .pme_code = 0x000004C010, ++ .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject. 
This means the instruction is ready to relaunch and tried once but lost arbitration", + }, +-[ POWER9_PME_PM_PMC1_OVERFLOW ] = { /* 80 */ +- .pme_name = "PM_PMC1_OVERFLOW", +- .pme_code = 0x0000020010, +- .pme_short_desc = "Overflow from counter 1", +- .pme_long_desc = "Overflow from counter 1", ++[ POWER9_PME_PM_CMPLU_STALL_SYNC_PMU_INT ] = { ++ .pme_name = "PM_CMPLU_STALL_SYNC_PMU_INT", ++ .pme_code = 0x000002C01E, ++ .pme_short_desc = "Cycles in which the NTC instruction is waiting for a synchronous PMU interrupt", ++ .pme_long_desc = "Cycles in which the NTC instruction is waiting for a synchronous PMU interrupt", + }, +-[ POWER9_PME_PM_L2_RTY_ST ] = { /* 81 */ +- .pme_name = "PM_L2_RTY_ST", +- .pme_code = 0x000004689E, +- .pme_short_desc = "RC retries on PB for any store from core", +- .pme_long_desc = "RC retries on PB for any store from core", ++[ POWER9_PME_PM_CMPLU_STALL_TEND ] = { ++ .pme_name = "PM_CMPLU_STALL_TEND", ++ .pme_code = 0x000001E050, ++ .pme_short_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT ] = { /* 82 */ +- .pme_name = "PM_IPTEG_FROM_L2_NO_CONFLICT", +- .pme_code = 0x0000015040, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", ++[ POWER9_PME_PM_CMPLU_STALL_THRD ] = { ++ .pme_name = "PM_CMPLU_STALL_THRD", ++ .pme_code = 0x000001001C, ++ .pme_short_desc = "Completion Stalled because the thread was blocked", ++ .pme_long_desc = "Completion Stalled because the thread was blocked", + }, +-[ POWER9_PME_PM_LSU1_FALSE_LHS ] = { /* 83 */ +- .pme_name = "PM_LSU1_FALSE_LHS", +- .pme_code = 0x000000C8A0, +- 
.pme_short_desc = "False LHS match detected", +- .pme_long_desc = "False LHS match detected", ++[ POWER9_PME_PM_CMPLU_STALL_TLBIE ] = { ++ .pme_name = "PM_CMPLU_STALL_TLBIE", ++ .pme_code = 0x000002E01C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", ++ .pme_long_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", + }, +-[ POWER9_PME_PM_LSU0_VECTOR_ST_FIN ] = { /* 84 */ +- .pme_name = "PM_LSU0_VECTOR_ST_FIN", +- .pme_code = 0x000000C088, +- .pme_short_desc = "A vector store instruction finished.", +- .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++[ POWER9_PME_PM_CMPLU_STALL ] = { ++ .pme_name = "PM_CMPLU_STALL", ++ .pme_code = 0x000001E054, ++ .pme_short_desc = "Nothing completed and ICT not empty", ++ .pme_long_desc = "Nothing completed and ICT not empty", + }, +-[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH ] = { /* 85 */ +- .pme_name = "PM_MEM_LOC_THRESH_LSU_HIGH", +- .pme_code = 0x0000040056, +- .pme_short_desc = "Local memory above threshold for LSU medium", +- .pme_long_desc = "Local memory above threshold for LSU medium", ++[ POWER9_PME_PM_CMPLU_STALL_VDPLONG ] = { ++ .pme_name = "PM_CMPLU_STALL_VDPLONG", ++ .pme_code = 0x000003C05A, ++ .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. 
Qualified by NOT vector AND multicycle", + }, +-[ POWER9_PME_PM_LS2_UNALIGNED_LD ] = { /* 86 */ +- .pme_name = "PM_LS2_UNALIGNED_LD", +- .pme_code = 0x000000C098, +- .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", +- .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_CMPLU_STALL_VDP ] = { ++ .pme_name = "PM_CMPLU_STALL_VDP", ++ .pme_code = 0x000004405C, ++ .pme_short_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish.", ++ .pme_long_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. 
Qualified by vector", + }, +-[ POWER9_PME_PM_BR_TAKEN_CMPL ] = { /* 87 */ +- .pme_name = "PM_BR_TAKEN_CMPL", +- .pme_code = 0x00000200FA, +- .pme_short_desc = "New event for Branch Taken", +- .pme_long_desc = "New event for Branch Taken", ++[ POWER9_PME_PM_CMPLU_STALL_VFXLONG ] = { ++ .pme_name = "PM_CMPLU_STALL_VFXLONG", ++ .pme_code = 0x000002E018, ++ .pme_short_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", ++ .pme_long_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", + }, +-[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED ] = { /* 88 */ +- .pme_name = "PM_DATA_SYS_PUMP_MPRED", +- .pme_code = 0x000003C052, +- .pme_short_desc = "Final Pump Scope (system) mispredicted.", +- .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load", ++[ POWER9_PME_PM_CMPLU_STALL_VFXU ] = { ++ .pme_name = "PM_CMPLU_STALL_VFXU", ++ .pme_code = 0x000003C05C, ++ .pme_short_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline.", ++ .pme_long_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline. These instructions get routed to the ALU, ALU2, and DIV pipes", + }, +-[ POWER9_PME_PM_ISQ_36_44_ENTRIES ] = { /* 89 */ +- .pme_name = "PM_ISQ_36_44_ENTRIES", +- .pme_code = 0x000004000A, +- .pme_short_desc = "Cycles in which 36 or more Issue Queue entries are in use.", +- .pme_long_desc = "Cycles in which 36 or more Issue Queue entries are in use. This is a shared event, not per thread. There are 44 issue queue entries across 4 slices in the whole core", ++[ POWER9_PME_PM_CO0_BUSY ] = { ++ .pme_name = "PM_CO0_BUSY", ++ .pme_code = 0x000003608C, ++ .pme_short_desc = "CO mach 0 Busy.", ++ .pme_long_desc = "CO mach 0 Busy. 
Used by PMU to sample ave CO lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_LSU1_VECTOR_LD_FIN ] = { /* 90 */ +- .pme_name = "PM_LSU1_VECTOR_LD_FIN", +- .pme_code = 0x000000C880, +- .pme_short_desc = "A vector load instruction finished.", +- .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++[ POWER9_PME_PM_CO0_BUSY_ALT ] = { ++ .pme_name = "PM_CO0_BUSY", ++ .pme_code = 0x000004608C, ++ .pme_short_desc = "CO mach 0 Busy.", ++ .pme_long_desc = "CO mach 0 Busy. Used by PMU to sample ave CO lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 91 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER", +- .pme_code = 0x000002C124, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", ++[ POWER9_PME_PM_CO_DISP_FAIL ] = { ++ .pme_name = "PM_CO_DISP_FAIL", ++ .pme_code = 0x0000016886, ++ .pme_short_desc = "CO dispatch failed due to all CO machines being busy", ++ .pme_long_desc = "CO dispatch failed due to all CO machines being busy", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_IC_MISS ] = { /* 92 */ +- .pme_name = "PM_ICT_NOSLOT_IC_MISS", +- .pme_code = 0x000002D01A, +- .pme_short_desc = "Ict empty for this thread due to Icache Miss", +- .pme_long_desc = "Ict empty for this thread due to Icache Miss", ++[ POWER9_PME_PM_CO_TM_SC_FOOTPRINT ] = { ++ .pme_name = "PM_CO_TM_SC_FOOTPRINT", ++ .pme_code = 0x0000026086, ++ .pme_short_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3) OR L2 TM_store hit dirty HPC line and L3 indicated SC line formed in L3 on RDR bus", ++ .pme_long_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3) OR 
L2 TM_store hit dirty HPC line and L3 indicated SC line formed in L3 on RDR bus", + }, +-[ POWER9_PME_PM_LSU3_TM_L1_HIT ] = { /* 93 */ +- .pme_name = "PM_LSU3_TM_L1_HIT", +- .pme_code = 0x000000E898, +- .pme_short_desc = "Load tm hit in L1", +- .pme_long_desc = "Load tm hit in L1", ++[ POWER9_PME_PM_CO_USAGE ] = { ++ .pme_name = "PM_CO_USAGE", ++ .pme_code = 0x000002688C, ++ .pme_short_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each CO machine busy.", ++ .pme_long_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", + }, +-[ POWER9_PME_PM_MRK_INST_DISP ] = { /* 94 */ +- .pme_name = "PM_MRK_INST_DISP", +- .pme_code = 0x00000101E0, +- .pme_short_desc = "The thread has dispatched a randomly sampled marked instruction", +- .pme_long_desc = "The thread has dispatched a randomly sampled marked instruction", ++[ POWER9_PME_PM_CYC ] = { ++ .pme_name = "PM_CYC", ++ .pme_code = 0x000001001E, ++ .pme_short_desc = "Processor cycles", ++ .pme_long_desc = "Processor cycles", + }, +-[ POWER9_PME_PM_VECTOR_FLOP_CMPL ] = { /* 95 */ +- .pme_name = "PM_VECTOR_FLOP_CMPL", +- .pme_code = 0x000004D058, +- .pme_short_desc = "Vector flop instruction completed", +- .pme_long_desc = "Vector flop instruction completed", ++[ POWER9_PME_PM_DARQ0_0_3_ENTRIES ] = { ++ .pme_name = "PM_DARQ0_0_3_ENTRIES", ++ .pme_code = 0x000004D04A, ++ .pme_short_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", + }, +-[ POWER9_PME_PM_FXU_IDLE ] = { /* 96 */ +- .pme_name = "PM_FXU_IDLE", +- .pme_code = 0x0000024052, +- .pme_short_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", +- .pme_long_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", ++[ POWER9_PME_PM_DARQ0_10_12_ENTRIES ] = { ++ .pme_name = 
"PM_DARQ0_10_12_ENTRIES", ++ .pme_code = 0x000001D058, ++ .pme_short_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", + }, +-[ POWER9_PME_PM_INST_CMPL ] = { /* 97 */ +- .pme_name = "PM_INST_CMPL", +- .pme_code = 0x0000010002, +- .pme_short_desc = "# PPC instructions completed", +- .pme_long_desc = "# PPC instructions completed", ++[ POWER9_PME_PM_DARQ0_4_6_ENTRIES ] = { ++ .pme_name = "PM_DARQ0_4_6_ENTRIES", ++ .pme_code = 0x000003504E, ++ .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", + }, +-[ POWER9_PME_PM_EAT_FORCE_MISPRED ] = { /* 98 */ +- .pme_name = "PM_EAT_FORCE_MISPRED", +- .pme_code = 0x00000050A8, +- .pme_short_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT.", +- .pme_long_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicated target to validate. 
This is a rare case that may occur when the EAT is full and a branch is issued", ++[ POWER9_PME_PM_DARQ0_7_9_ENTRIES ] = { ++ .pme_name = "PM_DARQ0_7_9_ENTRIES", ++ .pme_code = 0x000002E050, ++ .pme_short_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LRQ_FULL ] = { /* 99 */ +- .pme_name = "PM_CMPLU_STALL_LRQ_FULL", +- .pme_code = 0x000002D014, +- .pme_short_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ because the LRQ was full", +- .pme_long_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ because the LRQ was full", ++[ POWER9_PME_PM_DARQ1_0_3_ENTRIES ] = { ++ .pme_name = "PM_DARQ1_0_3_ENTRIES", ++ .pme_code = 0x000004C122, ++ .pme_short_desc = "Cycles in which 3 or fewer DARQ1 entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 3 or fewer DARQ1 entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_DARQ1_10_12_ENTRIES ] = { ++ .pme_name = "PM_DARQ1_10_12_ENTRIES", ++ .pme_code = 0x0000020058, ++ .pme_short_desc = "Cycles in which 10 or more DARQ1 entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 10 or more DARQ1 entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_DARQ1_4_6_ENTRIES ] = { ++ .pme_name = "PM_DARQ1_4_6_ENTRIES", ++ .pme_code = 0x000003E050, ++ .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ1 entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 4, 5, or 6 DARQ1 entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_DARQ1_7_9_ENTRIES ] = { ++ .pme_name = "PM_DARQ1_7_9_ENTRIES", ++ .pme_code = 0x000002005A, ++ .pme_short_desc = "Cycles in which 7 to 9 DARQ1 entries (out of 12) are in use", ++ .pme_long_desc = "Cycles in which 7 to 9 DARQ1 entries (out of 12) are in use", ++}, ++[ POWER9_PME_PM_DARQ_STORE_REJECT ] = { ++ .pme_name = "PM_DARQ_STORE_REJECT", ++ .pme_code = 
0x000004405E, ++ .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but It was rejected.", ++ .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but It was rejected. Divide by PM_DARQ_STORE_XMIT to get reject ratio", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { /* 100 */ +- .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", +- .pme_code = 0x000003D14E, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++[ POWER9_PME_PM_DARQ_STORE_XMIT ] = { ++ .pme_name = "PM_DARQ_STORE_XMIT", ++ .pme_code = 0x0000030064, ++ .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry.", ++ .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry. Includes rejects. 
Not qualified by thread, so it includes counts for the whole core", + }, +-[ POWER9_PME_PM_BACK_BR_CMPL ] = { /* 101 */ +- .pme_name = "PM_BACK_BR_CMPL", +- .pme_code = 0x000002505E, +- .pme_short_desc = "Branch instruction completed with a target address less than current instruction address", +- .pme_long_desc = "Branch instruction completed with a target address less than current instruction address", ++[ POWER9_PME_PM_DATA_CHIP_PUMP_CPRED ] = { ++ .pme_name = "PM_DATA_CHIP_PUMP_CPRED", ++ .pme_code = 0x000001C050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", + }, +-[ POWER9_PME_PM_NEST_REF_CLK ] = { /* 102 */ +- .pme_name = "PM_NEST_REF_CLK", +- .pme_code = 0x000003006E, +- .pme_short_desc = "Multiply by 4 to obtain the number of PB cycles", +- .pme_long_desc = "Multiply by 4 to obtain the number of PB cycles", ++[ POWER9_PME_PM_DATA_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_DATA_FROM_DL2L3_MOD", ++ .pme_code = 0x000004C048, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR ] = { /* 103 */ +- .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_SHR", +- .pme_code = 0x000001F14A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a 
marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DATA_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_DATA_FROM_DL2L3_SHR", ++ .pme_code = 0x000003C048, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", + }, +-[ POWER9_PME_PM_RC_USAGE ] = { /* 104 */ +- .pme_name = "PM_RC_USAGE", +- .pme_code = 0x000001688E, +- .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", +- .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++[ POWER9_PME_PM_DATA_FROM_DL4 ] = { ++ .pme_name = "PM_DATA_FROM_DL4", ++ .pme_code = 0x000003C04C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_MOD ] = { /* 105 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_1_ECO_MOD", +- .pme_code = 0x000004F144, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_DATA_FROM_DMEM ] = { ++ .pme_name = "PM_DATA_FROM_DMEM", ++ .pme_code = 0x000004C04C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", + }, +-[ POWER9_PME_PM_BR_CMPL ] = { /* 106 */ +- .pme_name = "PM_BR_CMPL", +- .pme_code = 0x0000010012, +- .pme_short_desc = "Branch Instruction completed", +- .pme_long_desc = "Branch Instruction completed", ++[ POWER9_PME_PM_DATA_FROM_L21_MOD ] = { ++ .pme_name = "PM_DATA_FROM_L21_MOD", ++ .pme_code = 0x000004C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_INST_FROM_RL2L3_MOD ] = { /* 107 */ +- .pme_name = "PM_INST_FROM_RL2L3_MOD", +- .pme_code = 0x0000024046, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_DATA_FROM_L21_SHR ] = { ++ .pme_name = "PM_DATA_FROM_L21_SHR", ++ .pme_code = 0x000003C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand 
load", + }, +-[ POWER9_PME_PM_SHL_CREATED ] = { /* 108 */ +- .pme_name = "PM_SHL_CREATED", +- .pme_code = 0x000000508C, +- .pme_short_desc = "Store-Hit-Load Table Entry Created", +- .pme_long_desc = "Store-Hit-Load Table Entry Created", ++[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { ++ .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x000003C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", + }, +-[ POWER9_PME_PM_CMPLU_STALL_PASTE ] = { /* 109 */ +- .pme_name = "PM_CMPLU_STALL_PASTE", +- .pme_code = 0x000002C016, +- .pme_short_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", +- .pme_long_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", ++[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { ++ .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x000004C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", + }, +-[ POWER9_PME_PM_LSU3_LDMX_FIN ] = { /* 110 */ +- .pme_name = "PM_LSU3_LDMX_FIN", +- .pme_code = 0x000000D88C, +- .pme_short_desc = " New P9 instruction LDMX.", +- .pme_long_desc = " New P9 instruction LDMX.", ++[ POWER9_PME_PM_DATA_FROM_L2_MEPF ] = { ++ .pme_name = "PM_DATA_FROM_L2_MEPF", ++ .pme_code = 0x000002C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a 
demand load", + }, +-[ POWER9_PME_PM_SN_USAGE ] = { /* 111 */ +- .pme_name = "PM_SN_USAGE", +- .pme_code = 0x000003688E, +- .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", +- .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++[ POWER9_PME_PM_DATA_FROM_L2MISS_MOD ] = { ++ .pme_name = "PM_DATA_FROM_L2MISS_MOD", ++ .pme_code = 0x000001C04E, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", + }, +-[ POWER9_PME_PM_L2_ST_HIT ] = { /* 112 */ +- .pme_name = "PM_L2_ST_HIT", +- .pme_code = 0x000002689E, +- .pme_short_desc = "All successful store dispatches that were L2Hits", +- .pme_long_desc = "All successful store dispatches that were L2Hits", ++[ POWER9_PME_PM_DATA_FROM_L2MISS ] = { ++ .pme_name = "PM_DATA_FROM_L2MISS", ++ .pme_code = 0x00000200FE, ++ .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", ++ .pme_long_desc = "Demand LD - L2 Miss (not L2 hit)", + }, +-[ POWER9_PME_PM_DATA_FROM_DMEM ] = { /* 113 */ +- .pme_name = "PM_DATA_FROM_DMEM", +- .pme_code = 0x000004C04C, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", ++[ POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_DATA_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001C040, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", + }, +-[ 
POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE ] = { /* 114 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_REMOTE", +- .pme_code = 0x000002C01C, +- .pme_short_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", +- .pme_long_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", ++[ POWER9_PME_PM_DATA_FROM_L2 ] = { ++ .pme_name = "PM_DATA_FROM_L2", ++ .pme_code = 0x000001C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", + }, +-[ POWER9_PME_PM_LSU2_LDMX_FIN ] = { /* 115 */ +- .pme_name = "PM_LSU2_LDMX_FIN", +- .pme_code = 0x000000D08C, +- .pme_short_desc = " New P9 instruction LDMX.", +- .pme_long_desc = " New P9 instruction LDMX.", ++[ POWER9_PME_PM_DATA_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_DATA_FROM_L31_ECO_MOD", ++ .pme_code = 0x000004C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_L3_LD_MISS ] = { /* 116 */ +- .pme_name = "PM_L3_LD_MISS", +- .pme_code = 0x00000268A4, +- .pme_short_desc = "L3 demand LD Miss", +- .pme_long_desc = "L3 demand LD Miss", ++[ POWER9_PME_PM_DATA_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_DATA_FROM_L31_ECO_SHR", ++ .pme_code = 0x000003C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_DPTEG_FROM_RL4 ] = { /* 117 */ +- .pme_name = "PM_DPTEG_FROM_RL4", +- 
.pme_code = 0x000002E04A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DATA_FROM_L31_MOD ] = { ++ .pme_name = "PM_DATA_FROM_L31_MOD", ++ .pme_code = 0x000002C044, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 ] = { /* 118 */ +- .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L2", +- .pme_code = 0x000002D02A, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", ++[ POWER9_PME_PM_DATA_FROM_L31_SHR ] = { ++ .pme_name = "PM_DATA_FROM_L31_SHR", ++ .pme_code = 0x000001C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC ] = { /* 119 */ +- .pme_name = "PM_MRK_DATA_FROM_RL4_CYC", +- .pme_code = 0x000004D12A, +- .pme_short_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a 
marked load", ++[ POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_DATA_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", + }, +-[ POWER9_PME_PM_TM_SC_CO ] = { /* 120 */ +- .pme_name = "PM_TM_SC_CO", +- .pme_code = 0x00000160A6, +- .pme_short_desc = "l3 castout tm Sc line", +- .pme_long_desc = "l3 castout tm Sc line", ++[ POWER9_PME_PM_DATA_FROM_L3_MEPF ] = { ++ .pme_name = "PM_DATA_FROM_L3_MEPF", ++ .pme_code = 0x000002C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", + }, +-[ POWER9_PME_PM_L2_SN_SX_I_DONE ] = { /* 121 */ +- .pme_name = "PM_L2_SN_SX_I_DONE", +- .pme_code = 0x0000036886, +- .pme_short_desc = "SNP dispatched and went from Sx or Tx to Ix", +- .pme_long_desc = "SNP dispatched and went from Sx or Tx to Ix", ++[ POWER9_PME_PM_DATA_FROM_L3MISS_MOD ] = { ++ .pme_name = "PM_DATA_FROM_L3MISS_MOD", ++ .pme_code = 0x000004C04E, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT ] = { /* 122 */ +- .pme_name = "PM_DPTEG_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x000003E042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from 
local core's L3 with dispatch conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DATA_FROM_L3MISS ] = { ++ .pme_name = "PM_DATA_FROM_L3MISS", ++ .pme_code = 0x00000300FE, ++ .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", ++ .pme_long_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", + }, +-[ POWER9_PME_PM_ISIDE_L2MEMACC ] = { /* 123 */ +- .pme_name = "PM_ISIDE_L2MEMACC", +- .pme_code = 0x0000026890, +- .pme_short_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", +- .pme_long_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", ++[ POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_DATA_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001C044, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", + }, +-[ POWER9_PME_PM_L3_P0_GRP_PUMP ] = { /* 124 */ +- .pme_name = "PM_L3_P0_GRP_PUMP", +- .pme_code = 0x00000260B0, +- .pme_short_desc = "L3 pf sent with grp scope port 0", +- .pme_long_desc = "L3 pf sent with grp scope port 0", ++[ POWER9_PME_PM_DATA_FROM_L3 ] = { ++ .pme_name = "PM_DATA_FROM_L3", ++ .pme_code = 0x000004C042, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", + }, +-[ POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR ] = { /* 125 */ +- .pme_name = "PM_IPTEG_FROM_DL2L3_SHR", +- .pme_code = 0x0000035048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction 
side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++[ POWER9_PME_PM_DATA_FROM_LL4 ] = { ++ .pme_name = "PM_DATA_FROM_LL4", ++ .pme_code = 0x000001C04C, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 ] = { /* 126 */ +- .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L3", +- .pme_code = 0x000001F15C, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", ++[ POWER9_PME_PM_DATA_FROM_LMEM ] = { ++ .pme_name = "PM_DATA_FROM_LMEM", ++ .pme_code = 0x000002C048, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", + }, +-[ POWER9_PME_PM_THRESH_MET ] = { /* 127 */ +- .pme_name = "PM_THRESH_MET", +- .pme_code = 0x00000101EC, +- .pme_short_desc = "threshold exceeded", +- .pme_long_desc = "threshold exceeded", ++[ POWER9_PME_PM_DATA_FROM_MEMORY ] = { ++ .pme_name = "PM_DATA_FROM_MEMORY", ++ .pme_code = 0x00000400FE, ++ .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", + }, +-[ POWER9_PME_PM_DATA_FROM_L2_MEPF ] = { /* 128 */ +- .pme_name = "PM_DATA_FROM_L2_MEPF", +- .pme_code = 0x000002C040, +- 
.pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", ++[ POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_DATA_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004C04A, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", + }, +-[ POWER9_PME_PM_DISP_STARVED ] = { /* 129 */ +- .pme_name = "PM_DISP_STARVED", +- .pme_code = 0x0000030008, +- .pme_short_desc = "Dispatched Starved", +- .pme_long_desc = "Dispatched Starved", ++[ POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_DATA_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001C048, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", + }, +-[ POWER9_PME_PM_L3_P0_LCO_RTY ] = { /* 130 */ +- .pme_name = "PM_L3_P0_LCO_RTY", +- .pme_code = 0x00000160B4, +- .pme_short_desc = "L3 lateral cast out received retry on port 0", +- .pme_long_desc = "L3 lateral cast out received retry on port 0", ++[ POWER9_PME_PM_DATA_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_DATA_FROM_RL2L3_MOD", ++ .pme_code = 0x000002C046, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++ .pme_long_desc = 
"The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", + }, +-[ POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL ] = { /* 131 */ +- .pme_name = "PM_NTC_ISSUE_HELD_DARQ_FULL", +- .pme_code = 0x000001006A, +- .pme_short_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", +- .pme_long_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", ++[ POWER9_PME_PM_DATA_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_DATA_FROM_RL2L3_SHR", ++ .pme_code = 0x000001C04A, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", + }, +-[ POWER9_PME_PM_L3_RD_USAGE ] = { /* 132 */ +- .pme_name = "PM_L3_RD_USAGE", +- .pme_code = 0x00000268AC, +- .pme_short_desc = "rotating sample of 16 RD actives", +- .pme_long_desc = "rotating sample of 16 RD actives", ++[ POWER9_PME_PM_DATA_FROM_RL4 ] = { ++ .pme_name = "PM_DATA_FROM_RL4", ++ .pme_code = 0x000002C04A, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", + }, +-[ POWER9_PME_PM_TLBIE_FIN ] = { /* 133 */ +- .pme_name = "PM_TLBIE_FIN", +- .pme_code = 0x0000030058, +- .pme_short_desc = "tlbie finished", +- .pme_long_desc = "tlbie finished", ++[ POWER9_PME_PM_DATA_FROM_RMEM ] = { ++ .pme_name = "PM_DATA_FROM_RMEM", ++ .pme_code = 0x000003C04A, ++ .pme_short_desc = "The processor's data cache was reloaded from another 
chip's memory on the same Node or Group ( Remote) due to a demand load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a demand load", + }, +-[ POWER9_PME_PM_DPTEG_FROM_LL4 ] = { /* 134 */ +- .pme_name = "PM_DPTEG_FROM_LL4", +- .pme_code = 0x000001E04C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DATA_GRP_PUMP_CPRED ] = { ++ .pme_name = "PM_DATA_GRP_PUMP_CPRED", ++ .pme_code = 0x000002C050, ++ .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", ++ .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", + }, +-[ POWER9_PME_PM_CMPLU_STALL_TLBIE ] = { /* 135 */ +- .pme_name = "PM_CMPLU_STALL_TLBIE", +- .pme_code = 0x000002E01C, +- .pme_short_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", +- .pme_long_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", ++[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_DATA_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x000001C052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC ] = { /* 136 */ +- .pme_name = "PM_MRK_DATA_FROM_L2MISS_CYC", +- .pme_code = 0x0000035152, +- .pme_short_desc = "Duration in cycles to reload from a localtion other than the local core's L2 due to a marked load", +- .pme_long_desc = "Duration in 
cycles to reload from a localtion other than the local core's L2 due to a marked load", ++[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED ] = { ++ .pme_name = "PM_DATA_GRP_PUMP_MPRED", ++ .pme_code = 0x000002C052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", + }, +-[ POWER9_PME_PM_LS3_DC_COLLISIONS ] = { /* 137 */ +- .pme_name = "PM_LS3_DC_COLLISIONS", +- .pme_code = 0x000000D894, +- .pme_short_desc = "Read-write data cache collisions", +- .pme_long_desc = "Read-write data cache collisions", ++[ POWER9_PME_PM_DATA_PUMP_CPRED ] = { ++ .pme_name = "PM_DATA_PUMP_CPRED", ++ .pme_code = 0x000001C054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for a demand load", + }, +-[ POWER9_PME_PM_L1_ICACHE_MISS ] = { /* 138 */ +- .pme_name = "PM_L1_ICACHE_MISS", +- .pme_code = 0x00000200FD, +- .pme_short_desc = "Demand iCache Miss", +- .pme_long_desc = "Demand iCache Miss", ++[ POWER9_PME_PM_DATA_PUMP_MPRED ] = { ++ .pme_name = "PM_DATA_PUMP_MPRED", ++ .pme_code = 0x000004C052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for a demand load", + }, +-[ POWER9_PME_PM_LSU_REJECT_ERAT_MISS ] = { /* 139 */ +- .pme_name = "PM_LSU_REJECT_ERAT_MISS", +- .pme_code = 0x000002E05C, +- .pme_short_desc = "LSU Reject due to ERAT (up to 4 per cycles)", +- .pme_long_desc = "LSU Reject due to ERAT (up to 4 per cycles)", ++[ POWER9_PME_PM_DATA_STORE ] = { ++ .pme_name = "PM_DATA_STORE", ++ .pme_code = 0x000000F0A0, ++ .pme_short_desc = "All ops that drain from s2q to L2 containing data", ++ .pme_long_desc = "All ops that drain from s2q to L2 containing data", + }, +-[ POWER9_PME_PM_DATA_SYS_PUMP_CPRED ] = { /* 140 */ ++[ POWER9_PME_PM_DATA_SYS_PUMP_CPRED ] = { + .pme_name = "PM_DATA_SYS_PUMP_CPRED", + .pme_code = 0x000003C050, + .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load", + .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC ] = { /* 141 */ +- .pme_name = "PM_MRK_FAB_RSP_RWITM_CYC", +- .pme_code = 0x000004F150, +- .pme_short_desc = "cycles L2 RC took for a rwitm", +- .pme_long_desc = "cycles L2 RC took for a rwitm", ++[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_DATA_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x000004C050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR_CYC ] = { /* 142 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_SHR_CYC", +- .pme_code = 0x0000035156, +- .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED 
] = { ++ .pme_name = "PM_DATA_SYS_PUMP_MPRED", ++ .pme_code = 0x000003C052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load", + }, +-[ POWER9_PME_PM_LSU_FLUSH_UE ] = { /* 143 */ +- .pme_name = "PM_LSU_FLUSH_UE", +- .pme_code = 0x000000C0B4, +- .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", +- .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++[ POWER9_PME_PM_DATA_TABLEWALK_CYC ] = { ++ .pme_name = "PM_DATA_TABLEWALK_CYC", ++ .pme_code = 0x000003001A, ++ .pme_short_desc = "Data Tablewalk Cycles.", ++ .pme_long_desc = "Data Tablewalk Cycles. Could be 1 or 2 active tablewalks. Includes data prefetches.", + }, +-[ POWER9_PME_PM_BR_PRED_TAKEN_CR ] = { /* 144 */ +- .pme_name = "PM_BR_PRED_TAKEN_CR", +- .pme_code = 0x00000040B0, +- .pme_short_desc = "Conditional Branch that had its direction predicted.", +- .pme_long_desc = "Conditional Branch that had its direction predicted. I-form branches do not set this event. 
In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches", +-}, +-[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 145 */ +- .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_OTHER", +- .pme_code = 0x0000044040, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", +-}, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR ] = { /* 146 */ +- .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_SHR", +- .pme_code = 0x000003F148, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", +-}, +-[ POWER9_PME_PM_DATA_FROM_L2_1_MOD ] = { /* 147 */ +- .pme_name = "PM_DATA_FROM_L2_1_MOD", +- .pme_code = 0x000004C046, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", +-}, +-[ POWER9_PME_PM_LSU_FLUSH_LHL_SHL ] = { /* 148 */ +- .pme_name = "PM_LSU_FLUSH_LHL_SHL", +- .pme_code = 0x000000C8B4, +- .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", +- .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", +-}, +-[ POWER9_PME_PM_L3_P1_PF_RTY ] = { /* 149 */ +- .pme_name = "PM_L3_P1_PF_RTY", +- .pme_code = 0x00000268AE, +- .pme_short_desc = "L3 PF received retry port 3", +- .pme_long_desc = "L3 PF received retry port 3", +-}, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD ] = { /* 150 */ +- .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_MOD", +- .pme_code = 0x000004F148, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", +-}, +-[ POWER9_PME_PM_DFU_BUSY ] = { /* 151 */ +- .pme_name = "PM_DFU_BUSY", +- .pme_code = 0x000004D04C, +- .pme_short_desc = "Cycles in which all 4 Decimal Floating Point units are busy.", +- .pme_long_desc = "Cycles in which all 4 Decimal Floating Point units are busy. The DFU is running at capacity", +-}, +-[ POWER9_PME_PM_LSU1_TM_L1_MISS ] = { /* 152 */ +- .pme_name = "PM_LSU1_TM_L1_MISS", +- .pme_code = 0x000000E89C, +- .pme_short_desc = "Load tm L1 miss", +- .pme_long_desc = "Load tm L1 miss", +-}, +-[ POWER9_PME_PM_FREQ_UP ] = { /* 153 */ +- .pme_name = "PM_FREQ_UP", +- .pme_code = 0x000004000C, +- .pme_short_desc = "Power Management: Above Threshold A", +- .pme_long_desc = "Power Management: Above Threshold A", +-}, +-[ POWER9_PME_PM_DATA_FROM_LMEM ] = { /* 154 */ +- .pme_name = "PM_DATA_FROM_LMEM", +- .pme_code = 0x000002C048, +- .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", +-}, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF ] = { /* 155 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_MEPF", +- .pme_code = 0x000004C120, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked load", +-}, +-[ POWER9_PME_PM_ISIDE_DISP ] = { /* 156 */ +- .pme_name = "PM_ISIDE_DISP", +- .pme_code = 0x000001688A, +- .pme_short_desc = "All i-side dispatch attempts", +- .pme_long_desc = "All i-side dispatch attempts", +-}, +-[ POWER9_PME_PM_TM_OUTER_TBEGIN ] = { /* 157 */ +- .pme_name = "PM_TM_OUTER_TBEGIN", +- .pme_code = 0x0000002094, +- .pme_short_desc = "Completion time outer tbegin", +- .pme_long_desc = "Completion time outer tbegin", +-}, +-[ POWER9_PME_PM_PMC3_OVERFLOW ] = { /* 158 */ +- .pme_name = "PM_PMC3_OVERFLOW", +- .pme_code = 0x0000040010, +- .pme_short_desc = "Overflow from counter 3", +- .pme_long_desc = "Overflow from counter 3", +-}, +-[ POWER9_PME_PM_LSU0_SET_MPRED ] = { /* 159 */ +- .pme_name = "PM_LSU0_SET_MPRED", +- .pme_code = 0x000000D080, +- .pme_short_desc = "Set prediction(set-p) miss.", +- .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", +-}, +-[ POWER9_PME_PM_INST_FROM_L2_MEPF ] = { /* 160 */ +- .pme_name = "PM_INST_FROM_L2_MEPF", +- .pme_code = 0x0000024040, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_DC_DEALLOC_NO_CONF ] = { ++ .pme_name = "PM_DC_DEALLOC_NO_CONF", ++ .pme_code = 0x000000F8AC, ++ .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software.Fuzzy stream confirm (out of order effects, or pf cant keep up)", + }, +-[ POWER9_PME_PM_L3_P0_NODE_PUMP ] = { /* 161 */ +- .pme_name = "PM_L3_P0_NODE_PUMP", +- .pme_code = 0x00000160B0, +- .pme_short_desc = "L3 pf sent with nodal scope port 0", +- .pme_long_desc = "L3 pf sent with nodal scope port 0", ++[ POWER9_PME_PM_DC_PREF_CONF ] = { ++ .pme_name = "PM_DC_PREF_CONF", ++ .pme_code = 0x000000F0A8, ++ .pme_short_desc = "A demand load referenced a line in an active prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Includes forwards and backwards streams", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_1_MOD ] = { /* 162 */ +- .pme_name = "PM_IPTEG_FROM_L3_1_MOD", +- .pme_code = 0x0000025044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_DC_PREF_CONS_ALLOC ] = { ++ .pme_name = "PM_DC_PREF_CONS_ALLOC", ++ .pme_code = 0x000000F0B4, ++ .pme_short_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", ++ .pme_long_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", + }, +-[ POWER9_PME_PM_L3_PF_USAGE ] = { /* 163 */ +- .pme_name = "PM_L3_PF_USAGE", +- .pme_code = 0x00000260AC, +- .pme_short_desc = "rotating sample of 32 PF actives", +- .pme_long_desc = "rotating sample of 32 PF actives", ++[ POWER9_PME_PM_DC_PREF_FUZZY_CONF ] = { ++ .pme_name = "PM_DC_PREF_FUZZY_CONF", ++ .pme_code = 0x000000F8A8, ++ 
.pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.Fuzzy stream confirm (out of order effects, or pf cant keep up)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_BRU ] = { /* 164 */ +- .pme_name = "PM_CMPLU_STALL_BRU", +- .pme_code = 0x000004D018, +- .pme_short_desc = "Completion stall due to a Branch Unit", +- .pme_long_desc = "Completion stall due to a Branch Unit", ++[ POWER9_PME_PM_DC_PREF_HW_ALLOC ] = { ++ .pme_name = "PM_DC_PREF_HW_ALLOC", ++ .pme_code = 0x000000F0A4, ++ .pme_short_desc = "Prefetch stream allocated by the hardware prefetch mechanism", ++ .pme_long_desc = "Prefetch stream allocated by the hardware prefetch mechanism", + }, +-[ POWER9_PME_PM_ISLB_MISS ] = { /* 165 */ +- .pme_name = "PM_ISLB_MISS", +- .pme_code = 0x000000D8A8, +- .pme_short_desc = "Instruction SLB Miss - Total of all segment sizes", +- .pme_long_desc = "Instruction SLB Miss - Total of all segment sizes", ++[ POWER9_PME_PM_DC_PREF_STRIDED_CONF ] = { ++ .pme_name = "PM_DC_PREF_STRIDED_CONF", ++ .pme_code = 0x000000F0AC, ++ .pme_short_desc = "A demand load referenced a line in an active strided prefetch stream.", ++ .pme_long_desc = "A demand load referenced a line in an active strided prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software.", + }, +-[ POWER9_PME_PM_CYC ] = { /* 166 */ +- .pme_name = "PM_CYC", +- .pme_code = 0x000001001E, +- .pme_short_desc = "Cycles", +- .pme_long_desc = "Cycles", ++[ POWER9_PME_PM_DC_PREF_SW_ALLOC ] = { ++ .pme_name = "PM_DC_PREF_SW_ALLOC", ++ .pme_code = 0x000000F8A4, ++ .pme_short_desc = "Prefetch stream allocated by software prefetching", ++ .pme_long_desc = "Prefetch stream allocated by software prefetching", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_SHR ] = { /* 167 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_SHR", +- .pme_code = 0x000004D124, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_DC_PREF_XCONS_ALLOC ] = { ++ .pme_name = "PM_DC_PREF_XCONS_ALLOC", ++ .pme_code = 0x000000F8B4, ++ .pme_short_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", ++ .pme_long_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", + }, +-[ POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD ] = { /* 168 */ +- .pme_name = "PM_IPTEG_FROM_RL2L3_MOD", +- .pme_code = 0x0000025046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++[ POWER9_PME_PM_DECODE_FUSION_CONST_GEN ] = { ++ .pme_name = "PM_DECODE_FUSION_CONST_GEN", ++ .pme_code = 0x00000048B4, 
++ .pme_short_desc = "32-bit constant generation", ++ .pme_long_desc = "32-bit constant generation", + }, +-[ POWER9_PME_PM_DARQ_10_12_ENTRIES ] = { /* 169 */ +- .pme_name = "PM_DARQ_10_12_ENTRIES", +- .pme_code = 0x000001D058, +- .pme_short_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", +- .pme_long_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", ++[ POWER9_PME_PM_DECODE_FUSION_EXT_ADD ] = { ++ .pme_name = "PM_DECODE_FUSION_EXT_ADD", ++ .pme_code = 0x0000005084, ++ .pme_short_desc = "32-bit extended addition", ++ .pme_long_desc = "32-bit extended addition", + }, +-[ POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC ] = { /* 170 */ +- .pme_name = "PM_LSU2_3_LRQF_FULL_CYC", +- .pme_code = 0x000000D8BC, +- .pme_short_desc = "Counts the number of cycles the LRQF is full.", +- .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", ++[ POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP ] = { ++ .pme_name = "PM_DECODE_FUSION_LD_ST_DISP", ++ .pme_code = 0x00000048A8, ++ .pme_short_desc = "32-bit displacement D-form and 16-bit displacement X-form", ++ .pme_long_desc = "32-bit displacement D-form and 16-bit displacement X-form", + }, +-[ POWER9_PME_PM_DECODE_FUSION_OP_PRESERV ] = { /* 171 */ ++[ POWER9_PME_PM_DECODE_FUSION_OP_PRESERV ] = { + .pme_name = "PM_DECODE_FUSION_OP_PRESERV", + .pme_code = 0x0000005088, + .pme_short_desc = "Destructive op operand preservation", + .pme_long_desc = "Destructive op operand preservation", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF ] = { /* 172 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2_MEPF", +- .pme_code = 0x000002F140, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without 
dispatch conflicts on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DECODE_HOLD_ICT_FULL ] = { ++ .pme_name = "PM_DECODE_HOLD_ICT_FULL", ++ .pme_code = 0x00000058A8, ++ .pme_short_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use.", ++ .pme_long_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use. This means the ICT is full for this thread", + }, +-[ POWER9_PME_PM_MRK_L1_RELOAD_VALID ] = { /* 173 */ +- .pme_name = "PM_MRK_L1_RELOAD_VALID", +- .pme_code = 0x00000101EA, +- .pme_short_desc = "Marked demand reload", +- .pme_long_desc = "Marked demand reload", ++[ POWER9_PME_PM_DECODE_LANES_NOT_AVAIL ] = { ++ .pme_name = "PM_DECODE_LANES_NOT_AVAIL", ++ .pme_code = 0x0000005884, ++ .pme_short_desc = "Decode has something to transmit but dispatch lanes are not available", ++ .pme_long_desc = "Decode has something to transmit but dispatch lanes are not available", + }, +-[ POWER9_PME_PM_LSU2_SET_MPRED ] = { /* 174 */ +- .pme_name = "PM_LSU2_SET_MPRED", +- .pme_code = 0x000000D084, +- .pme_short_desc = "Set prediction(set-p) miss.", +- .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", ++[ POWER9_PME_PM_DERAT_MISS_16G ] = { ++ .pme_name = "PM_DERAT_MISS_16G", ++ .pme_code = 0x000004C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16G", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16G", + }, +-[ POWER9_PME_PM_1PLUS_PPC_CMPL ] = { /* 175 */ +- .pme_name = "PM_1PLUS_PPC_CMPL", +- .pme_code = 0x00000100F2, +- .pme_short_desc = "1 or more ppc insts finished", +- .pme_long_desc = "1 or more ppc insts finished", ++[ POWER9_PME_PM_DERAT_MISS_16M ] = { ++ .pme_name = "PM_DERAT_MISS_16M", ++ .pme_code = 0x000003C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16M", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16M", + }, +-[ POWER9_PME_PM_DATA_FROM_LL4 ] = { /* 176 */ +- .pme_name = "PM_DATA_FROM_LL4", +- .pme_code = 0x000001C04C, +- .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", ++[ POWER9_PME_PM_DERAT_MISS_1G ] = { ++ .pme_name = "PM_DERAT_MISS_1G", ++ .pme_code = 0x000002C05A, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 1G.", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS ] = { /* 177 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_L3MISS", +- .pme_code = 0x000004C01A, +- .pme_short_desc = "Completion stall due to cache miss resolving missed the L3", +- .pme_long_desc = "Completion stall due to cache miss resolving missed the L3", ++[ POWER9_PME_PM_DERAT_MISS_2M ] = { ++ .pme_name = "PM_DERAT_MISS_2M", ++ .pme_code = 0x000001C05A, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 2M.", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", + }, +-[ POWER9_PME_PM_TM_CAP_OVERFLOW ] = { /* 178 */ +- .pme_name = "PM_TM_CAP_OVERFLOW", +- .pme_code = 0x000004608C, +- .pme_short_desc = "TM Footprint Capactiy Overflow", +- .pme_long_desc = "TM Footprint Capactiy Overflow", ++[ POWER9_PME_PM_DERAT_MISS_4K ] = { ++ .pme_name = "PM_DERAT_MISS_4K", ++ .pme_code = 0x000001C056, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 4K", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 4K", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_LMEM ] = { /* 179 */ +- .pme_name = "PM_MRK_DPTEG_FROM_LMEM", +- .pme_code = 0x000002F148, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DERAT_MISS_64K ] = { ++ .pme_name = "PM_DERAT_MISS_64K", ++ .pme_code = 0x000002C054, ++ .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 64K", ++ .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 64K", + }, +-[ POWER9_PME_PM_LSU3_FALSE_LHS ] = { /* 180 */ +- .pme_name = "PM_LSU3_FALSE_LHS", +- .pme_code = 0x000000C8A4, +- .pme_short_desc = "False LHS match detected", +- .pme_long_desc = "False LHS match detected", ++[ POWER9_PME_PM_DFU_BUSY ] = { ++ .pme_name = "PM_DFU_BUSY", ++ .pme_code = 0x000004D04C, ++ .pme_short_desc = "Cycles in which all 4 Decimal Floating Point units are busy.", ++ .pme_long_desc = "Cycles in which all 4 Decimal Floating Point units are busy. 
The DFU is running at capacity", + }, +-[ POWER9_PME_PM_THRESH_EXC_512 ] = { /* 181 */ +- .pme_name = "PM_THRESH_EXC_512", +- .pme_code = 0x00000201E8, +- .pme_short_desc = "Threshold counter exceeded a value of 512", +- .pme_long_desc = "Threshold counter exceeded a value of 512", ++[ POWER9_PME_PM_DISP_CLB_HELD_BAL ] = { ++ .pme_name = "PM_DISP_CLB_HELD_BAL", ++ .pme_code = 0x000000288C, ++ .pme_short_desc = "Dispatch/CLB Hold: Balance Flush", ++ .pme_long_desc = "Dispatch/CLB Hold: Balance Flush", + }, +-[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 ] = { /* 182 */ +- .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L2", +- .pme_code = 0x000002D026, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", ++[ POWER9_PME_PM_DISP_CLB_HELD_SB ] = { ++ .pme_name = "PM_DISP_CLB_HELD_SB", ++ .pme_code = 0x0000002090, ++ .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", ++ .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", + }, +-[ POWER9_PME_PM_HWSYNC ] = { /* 183 */ +- .pme_name = "PM_HWSYNC", +- .pme_code = 0x00000050A0, +- .pme_short_desc = "Hwsync instruction decoded and transferred", +- .pme_long_desc = "Hwsync instruction decoded and transferred", ++[ POWER9_PME_PM_DISP_CLB_HELD_TLBIE ] = { ++ .pme_name = "PM_DISP_CLB_HELD_TLBIE", ++ .pme_code = 0x0000002890, ++ .pme_short_desc = "Dispatch Hold: Due to TLBIE", ++ .pme_long_desc = "Dispatch Hold: Due to TLBIE", + }, +-[ POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW ] = { /* 184 */ +- .pme_name = "PM_TM_FAIL_FOOTPRINT_OVERFLOW", +- .pme_code = 0x00000020A8, +- .pme_short_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.", +- .pme_long_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.. 
Asynchronous", ++[ POWER9_PME_PM_DISP_HELD_HB_FULL ] = { ++ .pme_name = "PM_DISP_HELD_HB_FULL", ++ .pme_code = 0x000003D05C, ++ .pme_short_desc = "Dispatch held due to History Buffer full.", ++ .pme_long_desc = "Dispatch held due to History Buffer full. Could be GPR/VSR/VMR/FPR/CR/XVF; CR; XVF (XER/VSCR/FPSCR)", + }, +-[ POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY ] = { /* 185 */ +- .pme_name = "PM_INST_SYS_PUMP_MPRED_RTY", +- .pme_code = 0x0000044050, +- .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", +- .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", ++[ POWER9_PME_PM_DISP_HELD_ISSQ_FULL ] = { ++ .pme_name = "PM_DISP_HELD_ISSQ_FULL", ++ .pme_code = 0x0000020006, ++ .pme_short_desc = "Dispatch held due to Issue q full.", ++ .pme_long_desc = "Dispatch held due to Issue q full. Includes issue queue and branch queue", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL ] = { /* 186 */ +- .pme_name = "PM_ICT_NOSLOT_DISP_HELD_HB_FULL", +- .pme_code = 0x0000030018, +- .pme_short_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full.", +- .pme_long_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full. 
Could be GPR/VSR/VMR/FPR/CR/XVF", ++[ POWER9_PME_PM_DISP_HELD_SYNC_HOLD ] = { ++ .pme_name = "PM_DISP_HELD_SYNC_HOLD", ++ .pme_code = 0x000004003C, ++ .pme_short_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", ++ .pme_long_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", + }, +-[ POWER9_PME_PM_DC_DEALLOC_NO_CONF ] = { /* 187 */ +- .pme_name = "PM_DC_DEALLOC_NO_CONF", +- .pme_code = 0x000000F8AC, +- .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", +- .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.Fuzzy stream confirm (out of order effects, or pf cant keep up)", ++[ POWER9_PME_PM_DISP_HELD_TBEGIN ] = { ++ .pme_name = "PM_DISP_HELD_TBEGIN", ++ .pme_code = 0x00000028B0, ++ .pme_short_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", ++ .pme_long_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", + }, +-[ POWER9_PME_PM_CMPLU_STALL_VFXLONG ] = { /* 188 */ +- .pme_name = "PM_CMPLU_STALL_VFXLONG", +- .pme_code = 0x000002E018, +- .pme_short_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", +- .pme_long_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", ++[ POWER9_PME_PM_DISP_HELD ] = { ++ .pme_name = "PM_DISP_HELD", ++ .pme_code = 0x0000010006, ++ .pme_short_desc = "Dispatch Held", ++ .pme_long_desc = "Dispatch Held", + }, +-[ POWER9_PME_PM_MEM_LOC_THRESH_IFU ] = { /* 189 */ +- .pme_name = "PM_MEM_LOC_THRESH_IFU", +- .pme_code = 0x0000010058, +- .pme_short_desc = "Local Memory above threshold for IFU speculation control", +- .pme_long_desc = "Local Memory above threshold for IFU 
speculation control", ++[ POWER9_PME_PM_DISP_STARVED ] = { ++ .pme_name = "PM_DISP_STARVED", ++ .pme_code = 0x0000030008, ++ .pme_short_desc = "Dispatched Starved", ++ .pme_long_desc = "Dispatched Starved", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_CYC ] = { /* 190 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_CYC", +- .pme_code = 0x0000035154, +- .pme_short_desc = "Duration in cycles to reload from local core's L3 due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L3 due to a marked load", ++[ POWER9_PME_PM_DP_QP_FLOP_CMPL ] = { ++ .pme_name = "PM_DP_QP_FLOP_CMPL", ++ .pme_code = 0x000004D05C, ++ .pme_short_desc = "Double-Precision or Quad-Precision instruction completed", ++ .pme_long_desc = "Double-Precision or Quad-Precision instruction completed", + }, +-[ POWER9_PME_PM_PTE_PREFETCH ] = { /* 191 */ +- .pme_name = "PM_PTE_PREFETCH", +- .pme_code = 0x000000F084, +- .pme_short_desc = "PTE prefetches", +- .pme_long_desc = "PTE prefetches", ++[ POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_DPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x000004E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB ] = { /* 192 */ +- .pme_name = "PM_CMPLU_STALL_STORE_PIPE_ARB", +- .pme_code = 0x000004C010, +- .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject.", +- .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject. This means the instruction is ready to relaunch and tried once but lost arbitration", ++[ POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_DPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x000003E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_CMPLU_STALL_SLB ] = { /* 193 */ +- .pme_name = "PM_CMPLU_STALL_SLB", +- .pme_code = 0x000001E052, +- .pme_short_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", +- .pme_long_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", ++[ POWER9_PME_PM_DPTEG_FROM_DL4 ] = { ++ .pme_name = "PM_DPTEG_FROM_DL4", ++ .pme_code = 0x000003E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_4K ] = { /* 194 */ +- .pme_name = "PM_MRK_DERAT_MISS_4K", +- .pme_code = 0x000002D150, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", ++[ POWER9_PME_PM_DPTEG_FROM_DMEM ] = { ++ .pme_name = "PM_DPTEG_FROM_DMEM", ++ .pme_code = 0x000004E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR ] = { /* 195 */ +- .pme_name = "PM_CMPLU_STALL_LSU_MFSPR", +- .pme_code = 0x0000034056, +- .pme_short_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", +- .pme_long_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", ++[ POWER9_PME_PM_DPTEG_FROM_L21_MOD ] = { ++ .pme_name = "PM_DPTEG_FROM_L21_MOD", ++ .pme_code = 0x000004E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_ECO_SHR ] = { /* 196 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_1_ECO_SHR", +- .pme_code = 0x000003F144, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DPTEG_FROM_L21_SHR ] = { ++ .pme_name = "PM_DPTEG_FROM_L21_SHR", ++ .pme_code = 0x000003E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_VSU_DP_FSQRT_FDIV ] = { /* 197 */ +- .pme_name = "PM_VSU_DP_FSQRT_FDIV", +- .pme_code = 0x000003D058, +- .pme_short_desc = "vector versions of fdiv,fsqrt", +- .pme_long_desc = "vector versions of fdiv,fsqrt", ++[ POWER9_PME_PM_DPTEG_FROM_L2_MEPF ] = { ++ .pme_name = "PM_DPTEG_FROM_L2_MEPF", ++ .pme_code = 0x000002E040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_SHR ] = { /* 198 */ +- .pme_name = "PM_IPTEG_FROM_L3_1_ECO_SHR", +- .pme_code = 0x0000035044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_DPTEG_FROM_L2MISS ] = { ++ .pme_name = "PM_DPTEG_FROM_L2MISS", ++ .pme_code = 0x000001E04E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L3_P0_LCO_DATA ] = { /* 199 */ +- .pme_name = "PM_L3_P0_LCO_DATA", +- .pme_code = 0x00000260AA, +- .pme_short_desc = "lco sent with data port 0", +- .pme_long_desc = "lco sent with data port 0", ++[ POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_DPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001E040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_RUN_INST_CMPL ] = { /* 200 */ +- .pme_name = "PM_RUN_INST_CMPL", +- .pme_code = 0x00000400FA, +- .pme_short_desc = "Run_Instructions", +- .pme_long_desc = "Run_Instructions", ++[ POWER9_PME_PM_DPTEG_FROM_L2 ] = { ++ .pme_name = "PM_DPTEG_FROM_L2", ++ .pme_code = 0x000001E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE ] = { /* 201 */ +- .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000002D120, +- .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++[ POWER9_PME_PM_DPTEG_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_DPTEG_FROM_L31_ECO_MOD", ++ .pme_code = 0x000004E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_TEND_FAIL ] = { /* 202 */ +- .pme_name = "PM_MRK_TEND_FAIL", +- .pme_code = 0x00000028A4, +- .pme_short_desc = "Nested or not nested tend failed for a marked tend instruction", +- .pme_long_desc = "Nested or not nested tend failed for a marked tend instruction", ++[ POWER9_PME_PM_DPTEG_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_DPTEG_FROM_L31_ECO_SHR", ++ .pme_code = 0x000003E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_VSU_FIN ] = { /* 203 */ +- .pme_name = "PM_MRK_VSU_FIN", +- .pme_code = 0x0000030132, +- .pme_short_desc = "VSU marked instr finish", +- .pme_long_desc = "VSU marked instr finish", ++[ POWER9_PME_PM_DPTEG_FROM_L31_MOD ] = { ++ .pme_name = "PM_DPTEG_FROM_L31_MOD", ++ .pme_code = 0x000002E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_1_ECO_MOD ] = { /* 204 */ +- .pme_name = "PM_DATA_FROM_L3_1_ECO_MOD", +- .pme_code = 0x000004C044, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", ++[ POWER9_PME_PM_DPTEG_FROM_L31_SHR ] = { ++ .pme_name = "PM_DPTEG_FROM_L31_SHR", ++ .pme_code = 0x000001E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_RUN_SPURR ] = { /* 205 */ +- .pme_name = "PM_RUN_SPURR", +- .pme_code = 0x0000010008, +- .pme_short_desc = "Run SPURR", +- .pme_long_desc = "Run SPURR", ++[ POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_DPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_ST_CAUSED_FAIL ] = { /* 206 */ +- .pme_name = "PM_ST_CAUSED_FAIL", +- .pme_code = 0x000001608C, +- .pme_short_desc = "Non TM St caused any thread to fail", +- .pme_long_desc = "Non TM St caused any thread to fail", ++[ POWER9_PME_PM_DPTEG_FROM_L3_MEPF ] = { ++ .pme_name = "PM_DPTEG_FROM_L3_MEPF", ++ .pme_code = 0x000002E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_SNOOP_TLBIE ] = { /* 207 */ +- .pme_name = "PM_SNOOP_TLBIE", +- .pme_code = 0x000000F880, +- .pme_short_desc = "TLBIE snoop", +- .pme_long_desc = "TLBIE snoop", ++[ POWER9_PME_PM_DPTEG_FROM_L3MISS ] = { ++ .pme_name = "PM_DPTEG_FROM_L3MISS", ++ .pme_code = 0x000004E04E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_PMC1_SAVED ] = { /* 208 */ +- .pme_name = "PM_PMC1_SAVED", +- .pme_code = 0x000004D010, +- .pme_short_desc = "PMC1 Rewind Value saved", +- .pme_long_desc = "PMC1 Rewind Value saved", ++[ POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_DPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001E044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DATA_FROM_L3MISS ] = { /* 209 */ +- .pme_name = "PM_DATA_FROM_L3MISS", +- .pme_code = 0x00000300FE, +- .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", +- .pme_long_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", ++[ POWER9_PME_PM_DPTEG_FROM_L3 ] = { ++ .pme_name = "PM_DPTEG_FROM_L3", ++ .pme_code = 0x000004E042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE ] = { /* 210 */ +- .pme_name = "PM_DATA_FROM_ON_CHIP_CACHE", +- .pme_code = 0x000001C048, +- .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", ++[ POWER9_PME_PM_DPTEG_FROM_LL4 ] = { ++ .pme_name = "PM_DPTEG_FROM_LL4", ++ .pme_code = 0x000001E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DTLB_MISS_16G ] = { /* 211 */ +- .pme_name = "PM_DTLB_MISS_16G", +- .pme_code = 0x000001C058, +- .pme_short_desc = "Data TLB Miss page size 16G", +- .pme_long_desc = "Data TLB Miss page size 16G", ++[ POWER9_PME_PM_DPTEG_FROM_LMEM ] = { ++ .pme_name = "PM_DPTEG_FROM_LMEM", ++ .pme_code = 0x000002E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_DMEM ] = { /* 212 */ +- .pme_name = "PM_MRK_DPTEG_FROM_DMEM", +- .pme_code = 0x000004F14C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_DPTEG_FROM_MEMORY ] = { ++ .pme_name = "PM_DPTEG_FROM_MEMORY", ++ .pme_code = 0x000002E04C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS ] = { /* 213 */ +- .pme_name = "PM_ICT_NOSLOT_IC_L3MISS", +- .pme_code = 0x000004E010, +- .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3.", +- .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3. 
The source could be local/remote/distant memory or another core's cache", ++[ POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_DPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_FLUSH ] = { /* 214 */ +- .pme_name = "PM_FLUSH", +- .pme_code = 0x00000400F8, +- .pme_short_desc = "Flush (any type)", +- .pme_long_desc = "Flush (any type)", ++[ POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_DPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001E048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LSU_FLUSH_OTHER ] = { /* 215 */ +- .pme_name = "PM_LSU_FLUSH_OTHER", +- .pme_code = 0x000000C0BC, +- .pme_short_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped loa or a Flush Next PPC)", +- .pme_long_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped loa or a Flush Next PPC)", ++[ POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_DPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x000002E046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LS1_LAUNCH_HELD_PREF ] = { /* 216 */ +- .pme_name = "PM_LS1_LAUNCH_HELD_PREF", +- .pme_code = 0x000000C89C, +- .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", +- .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++[ POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_DPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L2_LD_HIT ] = { /* 217 */ +- .pme_name = "PM_L2_LD_HIT", +- .pme_code = 0x000002609E, +- .pme_short_desc = "All successful load dispatches that were L2 hits", +- .pme_long_desc = "All successful load dispatches that were L2 hits", ++[ POWER9_PME_PM_DPTEG_FROM_RL4 ] = { ++ .pme_name = "PM_DPTEG_FROM_RL4", ++ .pme_code = 0x000002E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LSU2_VECTOR_LD_FIN ] = { /* 218 */ +- .pme_name = "PM_LSU2_VECTOR_LD_FIN", +- .pme_code = 0x000000C084, +- .pme_short_desc = "A vector load instruction finished.", +- .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++[ POWER9_PME_PM_DPTEG_FROM_RMEM ] = { ++ .pme_name = "PM_DPTEG_FROM_RMEM", ++ .pme_code = 0x000003E04A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LSU_FLUSH_EMSH ] = { /* 219 */ +- .pme_name = "PM_LSU_FLUSH_EMSH", +- .pme_code = 0x000000C0B0, +- .pme_short_desc = "An ERAT miss was detected after a set-p hit.", +- .pme_long_desc = "An ERAT miss was detected after a set-p hit. 
Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", ++[ POWER9_PME_PM_DSIDE_L2MEMACC ] = { ++ .pme_name = "PM_DSIDE_L2MEMACC", ++ .pme_code = 0x0000036092, ++ .pme_short_desc = "Valid when first beat of data comes in for an D-side fetch where data came EXCLUSIVELY from memory (excluding hpcread64 accesses), i.", ++ .pme_long_desc = "Valid when first beat of data comes in for an D-side fetch where data came EXCLUSIVELY from memory (excluding hpcread64 accesses), i.e., total memory accesses by RCs", + }, +-[ POWER9_PME_PM_IC_PREF_REQ ] = { /* 220 */ +- .pme_name = "PM_IC_PREF_REQ", +- .pme_code = 0x0000004888, +- .pme_short_desc = "Instruction prefetch requests", +- .pme_long_desc = "Instruction prefetch requests", ++[ POWER9_PME_PM_DSIDE_MRU_TOUCH ] = { ++ .pme_name = "PM_DSIDE_MRU_TOUCH", ++ .pme_code = 0x0000026884, ++ .pme_short_desc = "D-side L2 MRU touch sent to L2", ++ .pme_long_desc = "D-side L2 MRU touch sent to L2", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L2_1_SHR ] = { /* 221 */ +- .pme_name = "PM_DPTEG_FROM_L2_1_SHR", +- .pme_code = 0x000003E046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_DSIDE_OTHER_64B_L2MEMACC ] = { ++ .pme_name = "PM_DSIDE_OTHER_64B_L2MEMACC", ++ .pme_code = 0x0000036892, ++ .pme_short_desc = "Valid when first beat of data comes in for an D-side fetch where data came EXCLUSIVELY from memory that was for hpc_read64, (RC had to fetch other 64B of a line from MC) i.", ++ .pme_long_desc = "Valid when first beat of data comes in for an D-side fetch where data came EXCLUSIVELY from memory that was for hpc_read64, (RC had to fetch other 64B of a line from MC) i.e., number of times RC had to go to memory to get 'missing' 64B", ++}, ++[ POWER9_PME_PM_DSLB_MISS ] = { ++ .pme_name = "PM_DSLB_MISS", ++ .pme_code = 0x000000D0A8, ++ .pme_short_desc = "Data SLB Miss - Total of all segment sizes", ++ .pme_long_desc = "Data SLB Miss - Total of all segment sizes", + }, +-[ POWER9_PME_PM_XLATE_RADIX_MODE ] = { /* 222 */ +- .pme_name = "PM_XLATE_RADIX_MODE", +- .pme_code = 0x000000F898, +- .pme_short_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", +- .pme_long_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", ++[ POWER9_PME_PM_DSLB_MISS_ALT ] = { ++ .pme_name = "PM_DSLB_MISS", ++ .pme_code = 0x0000010016, ++ .pme_short_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", ++ .pme_long_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", + }, +-[ POWER9_PME_PM_L3_LD_HIT ] = { /* 223 */ +- .pme_name = "PM_L3_LD_HIT", +- .pme_code = 0x00000260A4, +- .pme_short_desc = "L3 demand LD Hits", +- .pme_long_desc = "L3 demand LD Hits", ++[ POWER9_PME_PM_DTLB_MISS_16G ] = { ++ .pme_name = "PM_DTLB_MISS_16G", ++ .pme_code = 
0x000001C058, ++ .pme_short_desc = "Data TLB Miss page size 16G", ++ .pme_long_desc = "Data TLB Miss page size 16G", + }, +-[ POWER9_PME_PM_DARQ_7_9_ENTRIES ] = { /* 224 */ +- .pme_name = "PM_DARQ_7_9_ENTRIES", +- .pme_code = 0x000002E050, +- .pme_short_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", +- .pme_long_desc = "Cycles in which 7,8, or 9 DARQ entries (out of 12) are in use", ++[ POWER9_PME_PM_DTLB_MISS_16M ] = { ++ .pme_name = "PM_DTLB_MISS_16M", ++ .pme_code = 0x000004C056, ++ .pme_short_desc = "Data TLB Miss page size 16M", ++ .pme_long_desc = "Data TLB Miss page size 16M", + }, +-[ POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT ] = { /* 225 */ +- .pme_name = "PM_CMPLU_STALL_EXEC_UNIT", +- .pme_code = 0x000002D018, +- .pme_short_desc = "Completion stall due to execution units (FXU/VSU/CRU)", +- .pme_long_desc = "Completion stall due to execution units (FXU/VSU/CRU)", ++[ POWER9_PME_PM_DTLB_MISS_1G ] = { ++ .pme_name = "PM_DTLB_MISS_1G", ++ .pme_code = 0x000004C05A, ++ .pme_short_desc = "Data TLB reload (after a miss) page size 1G.", ++ .pme_long_desc = "Data TLB reload (after a miss) page size 1G. Implies radix translation was used", + }, +-[ POWER9_PME_PM_DISP_HELD ] = { /* 226 */ +- .pme_name = "PM_DISP_HELD", +- .pme_code = 0x0000010006, +- .pme_short_desc = "Dispatch Held", +- .pme_long_desc = "Dispatch Held", ++[ POWER9_PME_PM_DTLB_MISS_2M ] = { ++ .pme_name = "PM_DTLB_MISS_2M", ++ .pme_code = 0x000001C05C, ++ .pme_short_desc = "Data TLB reload (after a miss) page size 2M.", ++ .pme_long_desc = "Data TLB reload (after a miss) page size 2M. 
Implies radix translation was used", + }, +-[ POWER9_PME_PM_TM_FAIL_CONF_TM ] = { /* 227 */ +- .pme_name = "PM_TM_FAIL_CONF_TM", +- .pme_code = 0x00000020AC, +- .pme_short_desc = "TM aborted because a conflict occurred with another transaction.", +- .pme_long_desc = "TM aborted because a conflict occurred with another transaction.", ++[ POWER9_PME_PM_DTLB_MISS_4K ] = { ++ .pme_name = "PM_DTLB_MISS_4K", ++ .pme_code = 0x000002C056, ++ .pme_short_desc = "Data TLB Miss page size 4k", ++ .pme_long_desc = "Data TLB Miss page size 4k", + }, +-[ POWER9_PME_PM_LS0_DC_COLLISIONS ] = { /* 228 */ +- .pme_name = "PM_LS0_DC_COLLISIONS", +- .pme_code = 0x000000D090, +- .pme_short_desc = "Read-write data cache collisions", +- .pme_long_desc = "Read-write data cache collisions", ++[ POWER9_PME_PM_DTLB_MISS_64K ] = { ++ .pme_name = "PM_DTLB_MISS_64K", ++ .pme_code = 0x000003C056, ++ .pme_short_desc = "Data TLB Miss page size 64K", ++ .pme_long_desc = "Data TLB Miss page size 64K", + }, +-[ POWER9_PME_PM_L2_LD ] = { /* 229 */ +- .pme_name = "PM_L2_LD", +- .pme_code = 0x0000016080, +- .pme_short_desc = "All successful D-side Load dispatches for this thread", +- .pme_long_desc = "All successful D-side Load dispatches for this thread", ++[ POWER9_PME_PM_DTLB_MISS ] = { ++ .pme_name = "PM_DTLB_MISS", ++ .pme_code = 0x00000300FC, ++ .pme_short_desc = "Data PTEG reload", ++ .pme_long_desc = "Data PTEG reload", + }, +-[ POWER9_PME_PM_BTAC_GOOD_RESULT ] = { /* 230 */ +- .pme_name = "PM_BTAC_GOOD_RESULT", +- .pme_code = 0x00000058B0, +- .pme_short_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", +- .pme_long_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", ++[ POWER9_PME_PM_SPACEHOLDER_0000040062 ] = { ++ .pme_name = "PM_SPACEHOLDER_0000040062", ++ .pme_code = 0x0000040062, ++ .pme_short_desc = "SPACE_HOLDER for event 0000040062", ++ .pme_long_desc = "SPACE_HOLDER for event 0000040062", + }, +-[ 
POWER9_PME_PM_TEND_PEND_CYC ] = { /* 231 */ +- .pme_name = "PM_TEND_PEND_CYC", +- .pme_code = 0x000000E8B0, +- .pme_short_desc = "TEND latency per thread", +- .pme_long_desc = "TEND latency per thread", ++[ POWER9_PME_PM_SPACEHOLDER_0000040064 ] = { ++ .pme_name = "PM_SPACEHOLDER_0000040064", ++ .pme_code = 0x0000040064, ++ .pme_short_desc = "SPACE_HOLDER for event 0000040064", ++ .pme_long_desc = "SPACE_HOLDER for event 0000040064", + }, +-[ POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV ] = { /* 232 */ +- .pme_name = "PM_MRK_DCACHE_RELOAD_INTV", +- .pme_code = 0x0000040118, +- .pme_short_desc = "Combined Intervention event", +- .pme_long_desc = "Combined Intervention event", ++[ POWER9_PME_PM_EAT_FORCE_MISPRED ] = { ++ .pme_name = "PM_EAT_FORCE_MISPRED", ++ .pme_code = 0x00000050A8, ++ .pme_short_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT.", ++ .pme_long_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicated target to validate. This is a rare case that may occur when the EAT is full and a branch is issued", + }, +-[ POWER9_PME_PM_DISP_HELD_HB_FULL ] = { /* 233 */ +- .pme_name = "PM_DISP_HELD_HB_FULL", +- .pme_code = 0x000003D05C, +- .pme_short_desc = "Dispatch held due to History Buffer full.", +- .pme_long_desc = "Dispatch held due to History Buffer full. 
Could be GPR/VSR/VMR/FPR/CR/XVF", ++[ POWER9_PME_PM_EAT_FULL_CYC ] = { ++ .pme_name = "PM_EAT_FULL_CYC", ++ .pme_code = 0x0000004084, ++ .pme_short_desc = "Cycles No room in EAT", ++ .pme_long_desc = "Cycles No room in EAT", + }, +-[ POWER9_PME_PM_TM_TRESUME ] = { /* 234 */ +- .pme_name = "PM_TM_TRESUME", +- .pme_code = 0x00000020A4, +- .pme_short_desc = "TM resume instruction completed", +- .pme_long_desc = "TM resume instruction completed", ++[ POWER9_PME_PM_EE_OFF_EXT_INT ] = { ++ .pme_name = "PM_EE_OFF_EXT_INT", ++ .pme_code = 0x0000002080, ++ .pme_short_desc = "CyclesMSR[EE] is off and external interrupts are active", ++ .pme_long_desc = "CyclesMSR[EE] is off and external interrupts are active", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_SAO ] = { /* 235 */ +- .pme_name = "PM_MRK_LSU_FLUSH_SAO", +- .pme_code = 0x000000D0A4, +- .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", +- .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++[ POWER9_PME_PM_EXT_INT ] = { ++ .pme_name = "PM_EXT_INT", ++ .pme_code = 0x00000200F8, ++ .pme_short_desc = "external interrupt", ++ .pme_long_desc = "external interrupt", + }, +-[ POWER9_PME_PM_LS0_TM_DISALLOW ] = { /* 236 */ +- .pme_name = "PM_LS0_TM_DISALLOW", +- .pme_code = 0x000000E0B4, +- .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", +- .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++[ POWER9_PME_PM_FLOP_CMPL ] = { ++ .pme_name = "PM_FLOP_CMPL", ++ .pme_code = 0x000004505E, ++ .pme_short_desc = "Floating Point Operation Finished", ++ .pme_long_desc = "Floating Point Operation Finished", + }, +-[ POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE ] = { /* 237 */ +- .pme_name = "PM_DPTEG_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000004E04A, +- .pme_short_desc = "A Page Table Entry 
was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_FLUSH_COMPLETION ] = { ++ .pme_name = "PM_FLUSH_COMPLETION", ++ .pme_code = 0x0000030012, ++ .pme_short_desc = "The instruction that was next to complete did not complete because it suffered a flush", ++ .pme_long_desc = "The instruction that was next to complete did not complete because it suffered a flush", + }, +-[ POWER9_PME_PM_RC0_BUSY ] = { /* 238 */ +- .pme_name = "PM_RC0_BUSY", +- .pme_code = 0x000002608E, +- .pme_short_desc = "RC mach 0 Busy.", +- .pme_long_desc = "RC mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", ++[ POWER9_PME_PM_FLUSH_DISP_SB ] = { ++ .pme_name = "PM_FLUSH_DISP_SB", ++ .pme_code = 0x0000002088, ++ .pme_short_desc = "Dispatch Flush: Scoreboard", ++ .pme_long_desc = "Dispatch Flush: Scoreboard", + }, +-[ POWER9_PME_PM_LSU1_TM_L1_HIT ] = { /* 239 */ +- .pme_name = "PM_LSU1_TM_L1_HIT", +- .pme_code = 0x000000E894, +- .pme_short_desc = "Load tm hit in L1", +- .pme_long_desc = "Load tm hit in L1", ++[ POWER9_PME_PM_FLUSH_DISP_TLBIE ] = { ++ .pme_name = "PM_FLUSH_DISP_TLBIE", ++ .pme_code = 0x0000002888, ++ .pme_short_desc = "Dispatch Flush: TLBIE", ++ .pme_long_desc = "Dispatch Flush: TLBIE", + }, +-[ POWER9_PME_PM_TB_BIT_TRANS ] = { /* 240 */ +- .pme_name = "PM_TB_BIT_TRANS", +- .pme_code = 0x00000300F8, +- .pme_short_desc = "timebase event", +- .pme_long_desc = "timebase event", ++[ POWER9_PME_PM_FLUSH_DISP ] = { ++ .pme_name = "PM_FLUSH_DISP", ++ .pme_code = 0x0000002880, ++ .pme_short_desc = "Dispatch flush", ++ .pme_long_desc = "Dispatch flush", + }, +-[ 
POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT ] = { /* 241 */ +- .pme_name = "PM_DPTEG_FROM_L2_NO_CONFLICT", +- .pme_code = 0x000001E040, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_FLUSH_HB_RESTORE_CYC ] = { ++ .pme_name = "PM_FLUSH_HB_RESTORE_CYC", ++ .pme_code = 0x0000002084, ++ .pme_short_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush.", ++ .pme_long_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush. History buffer recovery", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_MOD ] = { /* 242 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_1_MOD", +- .pme_code = 0x000002F144, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_FLUSH_LSU ] = { ++ .pme_name = "PM_FLUSH_LSU", ++ .pme_code = 0x00000058A4, ++ .pme_short_desc = "LSU flushes.", ++ .pme_long_desc = "LSU flushes. 
Includes all lsu flushes", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT ] = { /* 243 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT", +- .pme_code = 0x000002C120, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", ++[ POWER9_PME_PM_FLUSH_MPRED ] = { ++ .pme_name = "PM_FLUSH_MPRED", ++ .pme_code = 0x00000050A4, ++ .pme_short_desc = "Branch mispredict flushes.", ++ .pme_long_desc = "Branch mispredict flushes. Includes target and address misprecition", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC ] = { /* 244 */ +- .pme_name = "PM_MRK_DATA_FROM_LL4_CYC", +- .pme_code = 0x000002C12E, +- .pme_short_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", ++[ POWER9_PME_PM_FLUSH ] = { ++ .pme_name = "PM_FLUSH", ++ .pme_code = 0x00000400F8, ++ .pme_short_desc = "Flush (any type)", ++ .pme_long_desc = "Flush (any type)", + }, +-[ POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE ] = { /* 245 */ +- .pme_name = "PM_INST_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000004404A, +- .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_FMA_CMPL ] = { ++ .pme_name = "PM_FMA_CMPL", ++ .pme_code = 0x0000045054, ++ .pme_short_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.", ++ .pme_long_desc = "two flops operation completed (fmadd, fnmadd, 
fmsub, fnmsub) Scalar instructions only. ", + }, +-[ POWER9_PME_PM_L3_CO_L31 ] = { /* 246 */ +- .pme_name = "PM_L3_CO_L31", +- .pme_code = 0x00000268A0, +- .pme_short_desc = "L3 CO to L3.", +- .pme_long_desc = "L3 CO to L3.1 OR of port 0 and 1 ( lossy)", ++[ POWER9_PME_PM_FORCED_NOP ] = { ++ .pme_name = "PM_FORCED_NOP", ++ .pme_code = 0x000000509C, ++ .pme_short_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", ++ .pme_long_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", + }, +-[ POWER9_PME_PM_CMPLU_STALL_CRYPTO ] = { /* 247 */ +- .pme_name = "PM_CMPLU_STALL_CRYPTO", +- .pme_code = 0x000004C01E, +- .pme_short_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", +- .pme_long_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", ++[ POWER9_PME_PM_FREQ_DOWN ] = { ++ .pme_name = "PM_FREQ_DOWN", ++ .pme_code = 0x000003000C, ++ .pme_short_desc = "Power Management: Below Threshold B", ++ .pme_long_desc = "Power Management: Below Threshold B", + }, +-[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 ] = { /* 248 */ +- .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3", +- .pme_code = 0x000003F058, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", ++[ POWER9_PME_PM_FREQ_UP ] = { ++ .pme_name = "PM_FREQ_UP", ++ .pme_code = 0x000004000C, ++ .pme_short_desc = "Power Management: Above Threshold A", ++ .pme_long_desc = "Power Management: Above Threshold A", + }, +-[ POWER9_PME_PM_ICT_EMPTY_CYC ] = { /* 249 */ +- .pme_name = "PM_ICT_EMPTY_CYC", +- .pme_code = 0x0000020004, +- .pme_short_desc = "Cycles in which the ICT is completely 
empty.", +- .pme_long_desc = "Cycles in which the ICT is completely empty. No itags are assigned to any thread", ++[ POWER9_PME_PM_FXU_1PLUS_BUSY ] = { ++ .pme_name = "PM_FXU_1PLUS_BUSY", ++ .pme_code = 0x000003000E, ++ .pme_short_desc = "At least one of the 4 FXU units is busy", ++ .pme_long_desc = "At least one of the 4 FXU units is busy", + }, +-[ POWER9_PME_PM_BR_UNCOND ] = { /* 250 */ +- .pme_name = "PM_BR_UNCOND", +- .pme_code = 0x00000040A0, +- .pme_short_desc = "Unconditional Branch Completed.", +- .pme_long_desc = "Unconditional Branch Completed. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was coverted to a Resolve.", ++[ POWER9_PME_PM_FXU_BUSY ] = { ++ .pme_name = "PM_FXU_BUSY", ++ .pme_code = 0x000002000E, ++ .pme_short_desc = "Cycles in which all 4 FXUs are busy.", ++ .pme_long_desc = "Cycles in which all 4 FXUs are busy. The FXU is running at capacity", + }, +-[ POWER9_PME_PM_DERAT_MISS_2M ] = { /* 251 */ +- .pme_name = "PM_DERAT_MISS_2M", +- .pme_code = 0x000001C05A, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 2M.", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation", ++[ POWER9_PME_PM_FXU_FIN ] = { ++ .pme_name = "PM_FXU_FIN", ++ .pme_code = 0x0000040004, ++ .pme_short_desc = "The fixed point unit Unit finished an instruction.", ++ .pme_long_desc = "The fixed point unit Unit finished an instruction. 
Instructions that finish may not necessary complete.", + }, +-[ POWER9_PME_PM_PMC4_REWIND ] = { /* 252 */ +- .pme_name = "PM_PMC4_REWIND", +- .pme_code = 0x0000010020, +- .pme_short_desc = "PMC4 Rewind Event", +- .pme_long_desc = "PMC4 Rewind Event", ++[ POWER9_PME_PM_FXU_IDLE ] = { ++ .pme_name = "PM_FXU_IDLE", ++ .pme_code = 0x0000024052, ++ .pme_short_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", ++ .pme_long_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", + }, +-[ POWER9_PME_PM_L2_RCLD_DISP ] = { /* 253 */ +- .pme_name = "PM_L2_RCLD_DISP", +- .pme_code = 0x0000016084, +- .pme_short_desc = "L2 RC load dispatch attempt", +- .pme_long_desc = "L2 RC load dispatch attempt", ++[ POWER9_PME_PM_GRP_PUMP_CPRED ] = { ++ .pme_name = "PM_GRP_PUMP_CPRED", ++ .pme_code = 0x0000020050, ++ .pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT ] = { /* 254 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_L2L3_CONFLICT", +- .pme_code = 0x000004C016, +- .pme_short_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", +- .pme_long_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", ++[ POWER9_PME_PM_GRP_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x0000010052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch 
(demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_TAKEN_BR_MPRED_CMPL ] = { /* 255 */ +- .pme_name = "PM_TAKEN_BR_MPRED_CMPL", +- .pme_code = 0x0000020056, +- .pme_short_desc = "Total number of taken branches that were incorrectly predicted as not-taken.", +- .pme_long_desc = "Total number of taken branches that were incorrectly predicted as not-taken. This event counts branches completed and does not include speculative instructions", ++[ POWER9_PME_PM_GRP_PUMP_MPRED ] = { ++ .pme_name = "PM_GRP_PUMP_MPRED", ++ .pme_code = 0x0000020052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_THRD_PRIO_2_3_CYC ] = { /* 256 */ +- .pme_name = "PM_THRD_PRIO_2_3_CYC", +- .pme_code = 0x00000048BC, +- .pme_short_desc = "Cycles thread running at priority level 2 or 3", +- .pme_long_desc = "Cycles thread running at priority level 2 or 3", ++[ POWER9_PME_PM_HV_CYC ] = { ++ .pme_name = "PM_HV_CYC", ++ .pme_code = 0x000002000A, ++ .pme_short_desc = "Cycles in which msr_hv is high.", ++ .pme_long_desc = "Cycles in which msr_hv is high. 
Note that this event does not take msr_pr into consideration", ++}, ++[ POWER9_PME_PM_HWSYNC ] = { ++ .pme_name = "PM_HWSYNC", ++ .pme_code = 0x00000050A0, ++ .pme_short_desc = "Hwsync instruction decoded and transferred", ++ .pme_long_desc = "Hwsync instruction decoded and transferred", + }, +-[ POWER9_PME_PM_DATA_FROM_DL4 ] = { /* 257 */ +- .pme_name = "PM_DATA_FROM_DL4", +- .pme_code = 0x000003C04C, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", ++[ POWER9_PME_PM_IBUF_FULL_CYC ] = { ++ .pme_name = "PM_IBUF_FULL_CYC", ++ .pme_code = 0x0000004884, ++ .pme_short_desc = "Cycles No room in ibuff", ++ .pme_long_desc = "Cycles No room in ibuff", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DPLONG ] = { /* 258 */ +- .pme_name = "PM_CMPLU_STALL_DPLONG", +- .pme_code = 0x000003405C, +- .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. 
Qualified by NOT vector AND multicycle", ++[ POWER9_PME_PM_IC_DEMAND_CYC ] = { ++ .pme_name = "PM_IC_DEMAND_CYC", ++ .pme_code = 0x0000010018, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", + }, +-[ POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { /* 259 */ ++[ POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { + .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", + .pme_code = 0x0000004098, + .pme_short_desc = "L2 I cache demand request due to BHT redirect, branch redirect ( 2 bubbles 3 cycles)", + .pme_long_desc = "L2 I cache demand request due to BHT redirect, branch redirect ( 2 bubbles 3 cycles)", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_BKILL ] = { /* 260 */ +- .pme_name = "PM_MRK_FAB_RSP_BKILL", +- .pme_code = 0x0000040154, +- .pme_short_desc = "Marked store had to do a bkill", +- .pme_long_desc = "Marked store had to do a bkill", ++[ POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { ++ .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", ++ .pme_code = 0x0000004898, ++ .pme_short_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", ++ .pme_long_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", + }, +-[ POWER9_PME_PM_LSU_DERAT_MISS ] = { /* 261 */ +- .pme_name = "PM_LSU_DERAT_MISS", +- .pme_code = 0x00000200F6, +- .pme_short_desc = "DERAT Reloaded due to a DERAT miss", +- .pme_long_desc = "DERAT Reloaded due to a DERAT miss", ++[ POWER9_PME_PM_IC_DEMAND_REQ ] = { ++ .pme_name = "PM_IC_DEMAND_REQ", ++ .pme_code = 0x0000004088, ++ .pme_short_desc = "Demand Instruction fetch request", ++ .pme_long_desc = "Demand Instruction fetch request", ++}, ++[ POWER9_PME_PM_IC_INVALIDATE ] = { ++ .pme_name = "PM_IC_INVALIDATE", ++ .pme_code = 0x0000005888, ++ .pme_short_desc = "Ic line invalidated", ++ .pme_long_desc = "Ic line invalidated", ++}, ++[ POWER9_PME_PM_IC_MISS_CMPL ] = 
{ ++ .pme_name = "PM_IC_MISS_CMPL", ++ .pme_code = 0x0000045058, ++ .pme_short_desc = "Non-speculative icache miss, counted at completion", ++ .pme_long_desc = "Non-speculative icache miss, counted at completion", ++}, ++[ POWER9_PME_PM_IC_MISS_ICBI ] = { ++ .pme_name = "PM_IC_MISS_ICBI", ++ .pme_code = 0x0000005094, ++ .pme_short_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on.", ++ .pme_long_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on. ICBI took line out", ++}, ++[ POWER9_PME_PM_IC_PREF_CANCEL_HIT ] = { ++ .pme_name = "PM_IC_PREF_CANCEL_HIT", ++ .pme_code = 0x0000004890, ++ .pme_short_desc = "Prefetch Canceled due to icache hit", ++ .pme_long_desc = "Prefetch Canceled due to icache hit", + }, +-[ POWER9_PME_PM_IC_PREF_CANCEL_L2 ] = { /* 262 */ ++[ POWER9_PME_PM_IC_PREF_CANCEL_L2 ] = { + .pme_name = "PM_IC_PREF_CANCEL_L2", + .pme_code = 0x0000004094, + .pme_short_desc = "L2 Squashed a demand or prefetch request", + .pme_long_desc = "L2 Squashed a demand or prefetch request", + }, +-[ POWER9_PME_PM_MRK_NTC_CYC ] = { /* 263 */ +- .pme_name = "PM_MRK_NTC_CYC", +- .pme_code = 0x000002011C, +- .pme_short_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", +- .pme_long_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", ++[ POWER9_PME_PM_IC_PREF_CANCEL_PAGE ] = { ++ .pme_name = "PM_IC_PREF_CANCEL_PAGE", ++ .pme_code = 0x0000004090, ++ .pme_short_desc = "Prefetch Canceled due to page boundary", ++ .pme_long_desc = "Prefetch Canceled due to page boundary", + }, +-[ POWER9_PME_PM_STCX_FIN ] = { /* 264 */ +- .pme_name = "PM_STCX_FIN", +- .pme_code = 0x000002E014, +- .pme_short_desc = "Number of stcx instructions finished.", +- .pme_long_desc = "Number of stcx instructions finished. 
This includes instructions in the speculative path of a branch that may be flushed", ++[ POWER9_PME_PM_IC_PREF_REQ ] = { ++ .pme_name = "PM_IC_PREF_REQ", ++ .pme_code = 0x0000004888, ++ .pme_short_desc = "Instruction prefetch requests", ++ .pme_long_desc = "Instruction prefetch requests", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF ] = { /* 265 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_MEPF", +- .pme_code = 0x000002D142, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", ++[ POWER9_PME_PM_IC_PREF_WRITE ] = { ++ .pme_name = "PM_IC_PREF_WRITE", ++ .pme_code = 0x000000488C, ++ .pme_short_desc = "Instruction prefetch written into IL1", ++ .pme_long_desc = "Instruction prefetch written into IL1", + }, +-[ POWER9_PME_PM_DC_PREF_FUZZY_CONF ] = { /* 266 */ +- .pme_name = "PM_DC_PREF_FUZZY_CONF", +- .pme_code = 0x000000F8A8, +- .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", +- .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.Fuzzy stream confirm (out of order effects, or pf cant keep up)", ++[ POWER9_PME_PM_IC_RELOAD_PRIVATE ] = { ++ .pme_name = "PM_IC_RELOAD_PRIVATE", ++ .pme_code = 0x0000004894, ++ .pme_short_desc = "Reloading line was brought in private for a specific thread.", ++ .pme_long_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread. 
In P7 line brought in private , then line was invalidat", + }, +-[ POWER9_PME_PM_MULT_MRK ] = { /* 267 */ +- .pme_name = "PM_MULT_MRK", +- .pme_code = 0x000003D15E, +- .pme_short_desc = "mult marked instr", +- .pme_long_desc = "mult marked instr", ++[ POWER9_PME_PM_ICT_EMPTY_CYC ] = { ++ .pme_name = "PM_ICT_EMPTY_CYC", ++ .pme_code = 0x0000020008, ++ .pme_short_desc = "Cycles in which the ICT is completely empty.", ++ .pme_long_desc = "Cycles in which the ICT is completely empty. No itags are assigned to any thread", + }, +-[ POWER9_PME_PM_LSU_FLUSH_LARX_STCX ] = { /* 268 */ +- .pme_name = "PM_LSU_FLUSH_LARX_STCX", +- .pme_code = 0x000000C8B8, +- .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", +- .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches", ++[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS ] = { ++ .pme_name = "PM_ICT_NOSLOT_BR_MPRED_ICMISS", ++ .pme_code = 0x0000034058, ++ .pme_short_desc = "Ict empty for this thread due to Icache Miss and branch mispred", ++ .pme_long_desc = "Ict empty for this thread due to Icache Miss and branch mispred", + }, +-[ POWER9_PME_PM_L3_P1_LCO_NO_DATA ] = { /* 269 */ +- .pme_name = "PM_L3_P1_LCO_NO_DATA", +- .pme_code = 0x00000168AA, +- .pme_short_desc = "dataless l3 lco sent port 1", +- .pme_long_desc = "dataless l3 lco sent port 1", ++[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED ] = { ++ .pme_name = "PM_ICT_NOSLOT_BR_MPRED", ++ .pme_code = 0x000004D01E, ++ .pme_short_desc = "Ict empty for this thread due to branch mispred", ++ .pme_long_desc = "Ict empty for this thread due to branch mispred", + }, +-[ POWER9_PME_PM_TM_TABORT_TRECLAIM ] = { /* 270 */ +- .pme_name = "PM_TM_TABORT_TRECLAIM", +- .pme_code = 0x0000002898, +- .pme_short_desc = "Completion time tabortnoncd, tabortcd, treclaim", +- .pme_long_desc = "Completion time 
tabortnoncd, tabortcd, treclaim", ++[ POWER9_PME_PM_ICT_NOSLOT_CYC ] = { ++ .pme_name = "PM_ICT_NOSLOT_CYC", ++ .pme_code = 0x00000100F8, ++ .pme_short_desc = "Number of cycles the ICT has no itags assigned to this thread", ++ .pme_long_desc = "Number of cycles the ICT has no itags assigned to this thread", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC ] = { /* 271 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", +- .pme_code = 0x000003D144, +- .pme_short_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load", ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL ] = { ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_HB_FULL", ++ .pme_code = 0x0000030018, ++ .pme_short_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full.", ++ .pme_long_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full. 
Could be GPR/VSR/VMR/FPR/CR/XVF; CR; XVF (XER/VSCR/FPSCR)", + }, +-[ POWER9_PME_PM_BR_PRED_CCACHE ] = { /* 272 */ +- .pme_name = "PM_BR_PRED_CCACHE", +- .pme_code = 0x00000040A4, +- .pme_short_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", +- .pme_long_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ ] = { ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_ISSQ", ++ .pme_code = 0x000002D01E, ++ .pme_short_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", ++ .pme_long_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", + }, +-[ POWER9_PME_PM_L3_P1_LCO_DATA ] = { /* 273 */ +- .pme_name = "PM_L3_P1_LCO_DATA", +- .pme_code = 0x00000268AA, +- .pme_short_desc = "lco sent with data port 1", +- .pme_long_desc = "lco sent with data port 1", ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC ] = { ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_SYNC", ++ .pme_code = 0x000004D01C, ++ .pme_short_desc = "Dispatch held due to a synchronizing instruction at dispatch", ++ .pme_long_desc = "Dispatch held due to a synchronizing instruction at dispatch", + }, +-[ POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED ] = { /* 274 */ +- .pme_name = "PM_LINK_STACK_WRONG_ADD_PRED", +- .pme_code = 0x0000005098, +- .pme_short_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", +- .pme_long_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN ] = { ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD_TBEGIN", ++ .pme_code = 0x0000010064, ++ .pme_short_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and 
there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", ++ .pme_long_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3 ] = { /* 275 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3", +- .pme_code = 0x000004F142, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD ] = { ++ .pme_name = "PM_ICT_NOSLOT_DISP_HELD", ++ .pme_code = 0x000004E01A, ++ .pme_short_desc = "Cycles in which the NTC instruction is held at dispatch for any reason", ++ .pme_long_desc = "Cycles in which the NTC instruction is held at dispatch for any reason", + }, +-[ POWER9_PME_PM_MRK_ST_CMPL_INT ] = { /* 276 */ +- .pme_name = "PM_MRK_ST_CMPL_INT", +- .pme_code = 0x0000030134, +- .pme_short_desc = "marked store finished with intervention", +- .pme_long_desc = "marked store finished with intervention", ++[ POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS ] = { ++ .pme_name = "PM_ICT_NOSLOT_IC_L3MISS", ++ .pme_code = 0x000004E010, ++ .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3.", ++ .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3. 
The source could be local/remote/distant memory or another core's cache", + }, +-[ POWER9_PME_PM_FLUSH_HB_RESTORE_CYC ] = { /* 277 */ +- .pme_name = "PM_FLUSH_HB_RESTORE_CYC", +- .pme_code = 0x0000002084, +- .pme_short_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush.", +- .pme_long_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush. History buffer recovery", ++[ POWER9_PME_PM_ICT_NOSLOT_IC_L3 ] = { ++ .pme_name = "PM_ICT_NOSLOT_IC_L3", ++ .pme_code = 0x000003E052, ++ .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", ++ .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", + }, +-[ POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC ] = { /* 278 */ +- .pme_name = "PM_LS1_PTE_TABLEWALK_CYC", +- .pme_code = 0x000000E8BC, +- .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 1", +- .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 1", ++[ POWER9_PME_PM_ICT_NOSLOT_IC_MISS ] = { ++ .pme_name = "PM_ICT_NOSLOT_IC_MISS", ++ .pme_code = 0x000002D01A, ++ .pme_short_desc = "Ict empty for this thread due to Icache Miss", ++ .pme_long_desc = "Ict empty for this thread due to Icache Miss", + }, +-[ POWER9_PME_PM_L3_CI_USAGE ] = { /* 279 */ +- .pme_name = "PM_L3_CI_USAGE", +- .pme_code = 0x00000168AC, +- .pme_short_desc = "rotating sample of 16 CI or CO actives", +- .pme_long_desc = "rotating sample of 16 CI or CO actives", ++[ POWER9_PME_PM_IERAT_RELOAD_16M ] = { ++ .pme_name = "PM_IERAT_RELOAD_16M", ++ .pme_code = 0x000004006A, ++ .pme_short_desc = "IERAT Reloaded (Miss) for a 16M page", ++ .pme_long_desc = "IERAT Reloaded (Miss) for a 16M page", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3MISS ] = { /* 280 */ +- .pme_name = "PM_MRK_DATA_FROM_L3MISS", +- .pme_code = 0x00000201E4, +- .pme_short_desc = "The processor's data cache was reloaded from a localtion other than the 
local core's L3 due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from a localtion other than the local core's L3 due to a marked load", ++[ POWER9_PME_PM_IERAT_RELOAD_4K ] = { ++ .pme_name = "PM_IERAT_RELOAD_4K", ++ .pme_code = 0x0000020064, ++ .pme_short_desc = "IERAT reloaded (after a miss) for 4K pages", ++ .pme_long_desc = "IERAT reloaded (after a miss) for 4K pages", + }, +-[ POWER9_PME_PM_DPTEG_FROM_DL4 ] = { /* 281 */ +- .pme_name = "PM_DPTEG_FROM_DL4", +- .pme_code = 0x000003E04C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_IERAT_RELOAD_64K ] = { ++ .pme_name = "PM_IERAT_RELOAD_64K", ++ .pme_code = 0x000003006A, ++ .pme_short_desc = "IERAT Reloaded (Miss) for a 64k page", ++ .pme_long_desc = "IERAT Reloaded (Miss) for a 64k page", + }, +-[ POWER9_PME_PM_MRK_STCX_FIN ] = { /* 282 */ +- .pme_name = "PM_MRK_STCX_FIN", +- .pme_code = 0x0000024056, +- .pme_short_desc = "Number of marked stcx instructions finished.", +- .pme_long_desc = "Number of marked stcx instructions finished. 
This includes instructions in the speculative path of a branch that may be flushed", ++[ POWER9_PME_PM_IERAT_RELOAD ] = { ++ .pme_name = "PM_IERAT_RELOAD", ++ .pme_code = 0x00000100F6, ++ .pme_short_desc = "Number of I-ERAT reloads", ++ .pme_long_desc = "Number of I-ERAT reloads", ++}, ++[ POWER9_PME_PM_IFETCH_THROTTLE ] = { ++ .pme_name = "PM_IFETCH_THROTTLE", ++ .pme_code = 0x000003405E, ++ .pme_short_desc = "Cycles in which Instruction fetch throttle was active.", ++ .pme_long_desc = "Cycles in which Instruction fetch throttle was active.", ++}, ++[ POWER9_PME_PM_INST_CHIP_PUMP_CPRED ] = { ++ .pme_name = "PM_INST_CHIP_PUMP_CPRED", ++ .pme_code = 0x0000014050, ++ .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", ++ .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", ++}, ++[ POWER9_PME_PM_INST_CMPL ] = { ++ .pme_name = "PM_INST_CMPL", ++ .pme_code = 0x0000010002, ++ .pme_short_desc = "Number of PowerPC Instructions that completed.", ++ .pme_long_desc = "Number of PowerPC Instructions that completed.", ++}, ++[ POWER9_PME_PM_INST_DISP ] = { ++ .pme_name = "PM_INST_DISP", ++ .pme_code = 0x00000200F2, ++ .pme_short_desc = "# PPC Dispatched", ++ .pme_long_desc = "# PPC Dispatched", ++}, ++[ POWER9_PME_PM_INST_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_INST_FROM_DL2L3_MOD", ++ .pme_code = 0x0000044048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_INST_FROM_DL2L3_SHR", ++ .pme_code = 0x0000034048, ++ 
.pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_DL4 ] = { ++ .pme_name = "PM_INST_FROM_DL4", ++ .pme_code = 0x000003404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_DMEM ] = { ++ .pme_name = "PM_INST_FROM_DMEM", ++ .pme_code = 0x000004404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_L1 ] = { ++ .pme_name = "PM_INST_FROM_L1", ++ .pme_code = 0x0000004080, ++ .pme_short_desc = "Instruction fetches from L1.", ++ .pme_long_desc = "Instruction fetches from L1. 
L1 instruction hit", ++}, ++[ POWER9_PME_PM_INST_FROM_L21_MOD ] = { ++ .pme_name = "PM_INST_FROM_L21_MOD", ++ .pme_code = 0x0000044046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_L21_SHR ] = { ++ .pme_name = "PM_INST_FROM_L21_SHR", ++ .pme_code = 0x0000034046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", ++}, ++[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST ] = { ++ .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x0000034040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_UE ] = { /* 283 */ +- .pme_name = "PM_MRK_LSU_FLUSH_UE", +- .pme_code = 0x000000D89C, +- .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", +- .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++[ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER ] = { ++ .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x0000044040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch 
conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY ] = { /* 284 */ +- .pme_name = "PM_MRK_DATA_FROM_MEMORY", +- .pme_code = 0x00000201E0, +- .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", ++[ POWER9_PME_PM_INST_FROM_L2_MEPF ] = { ++ .pme_name = "PM_INST_FROM_L2_MEPF", ++ .pme_code = 0x0000024040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_GRP_PUMP_MPRED_RTY ] = { /* 285 */ +- .pme_name = "PM_GRP_PUMP_MPRED_RTY", +- .pme_code = 0x0000010052, +- .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_INST_FROM_L2MISS ] = { ++ .pme_name = "PM_INST_FROM_L2MISS", ++ .pme_code = 0x000001404E, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_SHR ] = { /* 286 */ +- .pme_name = "PM_DPTEG_FROM_L3_1_ECO_SHR", +- .pme_code = 0x000003E044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_INST_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x0000014040, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_FLUSH_DISP_TLBIE ] = { /* 287 */ +- .pme_name = "PM_FLUSH_DISP_TLBIE", +- .pme_code = 0x0000002888, +- .pme_short_desc = "Dispatch Flush: TLBIE", +- .pme_long_desc = "Dispatch Flush: TLBIE", ++[ POWER9_PME_PM_INST_FROM_L2 ] = { ++ .pme_name = "PM_INST_FROM_L2", ++ .pme_code = 0x0000014042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3MISS ] = { /* 288 */ +- .pme_name = "PM_DPTEG_FROM_L3MISS", +- .pme_code = 0x000004E04E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_INST_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_INST_FROM_L31_ECO_MOD", ++ .pme_code = 0x0000044044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_L3_GRP_GUESS_CORRECT ] = { /* 289 */ +- .pme_name = "PM_L3_GRP_GUESS_CORRECT", +- .pme_code = 0x00000168B2, +- .pme_short_desc = "Initial scope=group and data from same group (near) (pred successful)", +- .pme_long_desc = "Initial scope=group and data from same group (near) (pred successful)", ++[ POWER9_PME_PM_INST_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_INST_FROM_L31_ECO_SHR", ++ .pme_code = 0x0000034044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_IC_INVALIDATE ] = { /* 290 */ +- .pme_name = "PM_IC_INVALIDATE", +- .pme_code = 0x0000005888, +- .pme_short_desc = "Ic line invalidated", +- .pme_long_desc = "Ic line invalidated", ++[ POWER9_PME_PM_INST_FROM_L31_MOD ] = { ++ .pme_name = "PM_INST_FROM_L31_MOD", ++ .pme_code = 0x0000024044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", + }, +-[ 
POWER9_PME_PM_DERAT_MISS_16G ] = { /* 291 */ +- .pme_name = "PM_DERAT_MISS_16G", +- .pme_code = 0x000004C054, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16G", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16G", ++[ POWER9_PME_PM_INST_FROM_L31_SHR ] = { ++ .pme_name = "PM_INST_FROM_L31_SHR", ++ .pme_code = 0x0000014046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_SYS_PUMP_MPRED_RTY ] = { /* 292 */ +- .pme_name = "PM_SYS_PUMP_MPRED_RTY", +- .pme_code = 0x0000040050, +- .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_INST_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x0000034042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_LMQ_MERGE ] = { /* 293 */ +- .pme_name = "PM_LMQ_MERGE", +- .pme_code = 0x000001002E, +- .pme_short_desc = "A demand miss collides with a prefetch for the same line", +- .pme_long_desc = "A demand miss collides with a prefetch for the same line", ++[ POWER9_PME_PM_INST_FROM_L3_MEPF ] = { ++ .pme_name = 
"PM_INST_FROM_L3_MEPF", ++ .pme_code = 0x0000024042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_IPTEG_FROM_LMEM ] = { /* 294 */ +- .pme_name = "PM_IPTEG_FROM_LMEM", +- .pme_code = 0x0000025048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a instruction side request", ++[ POWER9_PME_PM_INST_FROM_L3MISS_MOD ] = { ++ .pme_name = "PM_INST_FROM_L3MISS_MOD", ++ .pme_code = 0x000004404E, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to a instruction fetch", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to a instruction fetch", + }, +-[ POWER9_PME_PM_L3_LAT_CI_HIT ] = { /* 295 */ +- .pme_name = "PM_L3_LAT_CI_HIT", +- .pme_code = 0x00000460A2, +- .pme_short_desc = "L3 Lateral Castins Hit", +- .pme_long_desc = "L3 Lateral Castins Hit", ++[ POWER9_PME_PM_INST_FROM_L3MISS ] = { ++ .pme_name = "PM_INST_FROM_L3MISS", ++ .pme_code = 0x00000300FA, ++ .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++ .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++}, ++[ POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_INST_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x0000014044, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction 
cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_LSU1_VECTOR_ST_FIN ] = { /* 296 */ +- .pme_name = "PM_LSU1_VECTOR_ST_FIN", +- .pme_code = 0x000000C888, +- .pme_short_desc = "A vector store instruction finished.", +- .pme_long_desc = "A vector store instruction finished. The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++[ POWER9_PME_PM_INST_FROM_L3 ] = { ++ .pme_name = "PM_INST_FROM_L3", ++ .pme_code = 0x0000044042, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { /* 297 */ +- .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", +- .pme_code = 0x0000004898, +- .pme_short_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", +- .pme_long_desc = "L2 I cache demand request due to branch Mispredict ( 15 cycle path)", ++[ POWER9_PME_PM_INST_FROM_LL4 ] = { ++ .pme_name = "PM_INST_FROM_LL4", ++ .pme_code = 0x000001404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_INST_FROM_LMEM ] = { /* 298 */ ++[ POWER9_PME_PM_INST_FROM_LMEM ] = { + .pme_name = "PM_INST_FROM_LMEM", + .pme_code = 0x0000024048, + .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", + .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL4 ] = { 
/* 299 */ +- .pme_name = "PM_MRK_DATA_FROM_RL4", +- .pme_code = 0x000003515C, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++[ POWER9_PME_PM_INST_FROM_MEMORY ] = { ++ .pme_name = "PM_INST_FROM_MEMORY", ++ .pme_code = 0x000002404C, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS_4K ] = { /* 300 */ +- .pme_name = "PM_MRK_DTLB_MISS_4K", +- .pme_code = 0x000002D156, +- .pme_short_desc = "Marked Data TLB Miss page size 4k", +- .pme_long_desc = "Marked Data TLB Miss page size 4k", ++[ POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_INST_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT ] = { /* 301 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT", +- .pme_code = 0x000003D146, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked 
load", ++[ POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_INST_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x0000014048, ++ .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH ] = { /* 302 */ +- .pme_name = "PM_CMPLU_STALL_NTC_FLUSH", +- .pme_code = 0x000002E01E, +- .pme_short_desc = "Completion stall due to ntc flush", +- .pme_long_desc = "Completion stall due to ntc flush", ++[ POWER9_PME_PM_INST_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_INST_FROM_RL2L3_MOD", ++ .pme_code = 0x0000024046, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC ] = { /* 303 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC", +- .pme_code = 0x000004C124, +- .pme_short_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", ++[ POWER9_PME_PM_INST_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_INST_FROM_RL2L3_SHR", ++ .pme_code = 0x000001404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", 
++ .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_DARQ_0_3_ENTRIES ] = { /* 304 */ +- .pme_name = "PM_DARQ_0_3_ENTRIES", +- .pme_code = 0x000004D04A, +- .pme_short_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", +- .pme_long_desc = "Cycles in which 3 or less DARQ entries (out of 12) are in use", ++[ POWER9_PME_PM_INST_FROM_RL4 ] = { ++ .pme_name = "PM_INST_FROM_RL4", ++ .pme_code = 0x000002404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_DATA_FROM_L3MISS_MOD ] = { /* 305 */ +- .pme_name = "PM_DATA_FROM_L3MISS_MOD", +- .pme_code = 0x000004C04E, +- .pme_short_desc = "The processor's data cache was reloaded from a localtion other than the local core's L3 due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from a localtion other than the local core's L3 due to a demand load", ++[ POWER9_PME_PM_INST_FROM_RMEM ] = { ++ .pme_name = "PM_INST_FROM_RMEM", ++ .pme_code = 0x000003404A, ++ .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++ .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR_CYC ] = { /* 306 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_1_SHR_CYC", +- .pme_code = 0x000001D154, +- .pme_short_desc = "Duration in cycles to reload 
with Shared (S) data from another core's L2 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", ++[ POWER9_PME_PM_INST_GRP_PUMP_CPRED ] = { ++ .pme_name = "PM_INST_GRP_PUMP_CPRED", ++ .pme_code = 0x000002C05C, ++ .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", ++ .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", + }, +-[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG ] = { /* 307 */ +- .pme_name = "PM_TAGE_OVERRIDE_WRONG", +- .pme_code = 0x00000050B8, +- .pme_short_desc = "The TAGE overrode BHT direction prediction but it was incorrect.", +- .pme_long_desc = "The TAGE overrode BHT direction prediction but it was incorrect. Counted at completion for taken branches only", ++[ POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_INST_GRP_PUMP_MPRED_RTY", ++ .pme_code = 0x0000014052, ++ .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", ++ .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", + }, +-[ POWER9_PME_PM_L2_LD_MISS ] = { /* 308 */ +- .pme_name = "PM_L2_LD_MISS", +- .pme_code = 0x0000026080, +- .pme_short_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", +- .pme_long_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", ++[ POWER9_PME_PM_INST_GRP_PUMP_MPRED ] = { ++ .pme_name = "PM_INST_GRP_PUMP_MPRED", ++ .pme_code = 0x000002C05E, ++ .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", ++ .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch 
(demand only)", + }, +-[ POWER9_PME_PM_EAT_FULL_CYC ] = { /* 309 */ +- .pme_name = "PM_EAT_FULL_CYC", +- .pme_code = 0x0000004084, +- .pme_short_desc = "Cycles No room in EAT", +- .pme_long_desc = "Cycles No room in EAT", ++[ POWER9_PME_PM_INST_IMC_MATCH_CMPL ] = { ++ .pme_name = "PM_INST_IMC_MATCH_CMPL", ++ .pme_code = 0x000004001C, ++ .pme_short_desc = "IMC Match Count", ++ .pme_long_desc = "IMC Match Count", + }, +-[ POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH ] = { /* 310 */ +- .pme_name = "PM_CMPLU_STALL_SPEC_FINISH", +- .pme_code = 0x0000030028, +- .pme_short_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", +- .pme_long_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", ++[ POWER9_PME_PM_INST_PUMP_CPRED ] = { ++ .pme_name = "PM_INST_PUMP_CPRED", ++ .pme_code = 0x0000014054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for an instruction fetch", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX ] = { /* 311 */ +- .pme_name = "PM_MRK_LSU_FLUSH_LARX_STCX", +- .pme_code = 0x000000D8A4, +- .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", +- .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches", ++[ POWER9_PME_PM_INST_PUMP_MPRED ] = { ++ .pme_name = "PM_INST_PUMP_MPRED", ++ .pme_code = 0x0000044052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for an instruction fetch", + }, +-[ POWER9_PME_PM_THRESH_EXC_128 ] = { /* 312 */ +- .pme_name = "PM_THRESH_EXC_128", +- .pme_code = 0x00000401EA, +- .pme_short_desc = "Threshold counter exceeded a value of 128", +- .pme_long_desc = "Threshold counter exceeded a value of 128", ++[ POWER9_PME_PM_INST_SYS_PUMP_CPRED ] = { ++ .pme_name = "PM_INST_SYS_PUMP_CPRED", ++ .pme_code = 0x0000034050, ++ .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", ++ .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", + }, +-[ POWER9_PME_PM_LMQ_EMPTY_CYC ] = { /* 313 */ +- .pme_name = "PM_LMQ_EMPTY_CYC", +- .pme_code = 0x000002E05E, +- .pme_short_desc = "Cycles in which the LMQ has no pending load misses for this thread", +- .pme_long_desc = "Cycles in which the LMQ has no pending load misses for this thread", ++[ POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_INST_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x0000044050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 ] = { /* 314 */ +- .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L3", +- .pme_code = 0x000003F05A, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++[ POWER9_PME_PM_INST_SYS_PUMP_MPRED ] = { ++ .pme_name = "PM_INST_SYS_PUMP_MPRED", ++ .pme_code = 0x0000034052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. 
Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for an instruction fetch", + }, +-[ POWER9_PME_PM_MRK_IC_MISS ] = { /* 315 */ +- .pme_name = "PM_MRK_IC_MISS", +- .pme_code = 0x000004013A, +- .pme_short_desc = "Marked instruction experienced I cache miss", +- .pme_long_desc = "Marked instruction experienced I cache miss", ++[ POWER9_PME_PM_IOPS_CMPL ] = { ++ .pme_name = "PM_IOPS_CMPL", ++ .pme_code = 0x0000024050, ++ .pme_short_desc = "Internal Operations completed", ++ .pme_long_desc = "Internal Operations completed", + }, +-[ POWER9_PME_PM_L3_P1_GRP_PUMP ] = { /* 316 */ +- .pme_name = "PM_L3_P1_GRP_PUMP", +- .pme_code = 0x00000268B0, +- .pme_short_desc = "L3 pf sent with grp scope port 1", +- .pme_long_desc = "L3 pf sent with grp scope port 1", ++[ POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_IPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x0000045048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", + }, +-[ POWER9_PME_PM_CMPLU_STALL_TEND ] = { /* 317 */ +- .pme_name = "PM_CMPLU_STALL_TEND", +- .pme_code = 0x000001E050, +- .pme_short_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", +- .pme_long_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", ++[ POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_IPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x0000035048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due 
to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++}, ++[ POWER9_PME_PM_IPTEG_FROM_DL4 ] = { ++ .pme_name = "PM_IPTEG_FROM_DL4", ++ .pme_code = 0x000003504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", + }, +-[ POWER9_PME_PM_PUMP_MPRED ] = { /* 318 */ +- .pme_name = "PM_PUMP_MPRED", +- .pme_code = 0x0000040052, +- .pme_short_desc = "Pump misprediction.", +- .pme_long_desc = "Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_IPTEG_FROM_DMEM ] = { ++ .pme_name = "PM_IPTEG_FROM_DMEM", ++ .pme_code = 0x000004504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", + }, +-[ POWER9_PME_PM_INST_GRP_PUMP_MPRED ] = { /* 319 */ +- .pme_name = "PM_INST_GRP_PUMP_MPRED", +- .pme_code = 0x000002C05E, +- .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", +- .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", ++[ POWER9_PME_PM_IPTEG_FROM_L21_MOD ] = { ++ .pme_name = "PM_IPTEG_FROM_L21_MOD", ++ .pme_code = 0x0000045046, ++ .pme_short_desc = "A Page Table Entry was 
loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_L1_PREF ] = { /* 320 */ +- .pme_name = "PM_L1_PREF", +- .pme_code = 0x0000020054, +- .pme_short_desc = "A data line was written to the L1 due to a hardware or software prefetch", +- .pme_long_desc = "A data line was written to the L1 due to a hardware or software prefetch", ++[ POWER9_PME_PM_IPTEG_FROM_L21_SHR ] = { ++ .pme_name = "PM_IPTEG_FROM_L21_SHR", ++ .pme_code = 0x0000035046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { /* 321 */ +- .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", +- .pme_code = 0x000004D128, +- .pme_short_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", ++[ POWER9_PME_PM_IPTEG_FROM_L2_MEPF ] = { ++ .pme_name = "PM_IPTEG_FROM_L2_MEPF", ++ .pme_code = 0x0000025040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a instruction side request", + }, +-[ POWER9_PME_PM_LSU_FLUSH_ATOMIC ] = { /* 322 */ +- .pme_name = "PM_LSU_FLUSH_ATOMIC", +- .pme_code = 0x000000C8A8, +- .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", +- .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", ++[ POWER9_PME_PM_IPTEG_FROM_L2MISS ] = { ++ .pme_name = "PM_IPTEG_FROM_L2MISS", ++ .pme_code = 0x000001504E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a instruction side request", + }, +-[ POWER9_PME_PM_L2_DISP_ALL_L2MISS ] = { /* 323 */ +- .pme_name = "PM_L2_DISP_ALL_L2MISS", +- .pme_code = 0x0000046080, +- .pme_short_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", +- .pme_long_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", ++[ POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_IPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x0000015040, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request", + }, +-[ POWER9_PME_PM_DATA_FROM_MEMORY ] = { /* 324 */ +- .pme_name = "PM_DATA_FROM_MEMORY", +- .pme_code = 0x00000400FE, +- .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from a memory 
location including L4 from local remote or distant due to a demand load", ++[ POWER9_PME_PM_IPTEG_FROM_L2 ] = { ++ .pme_name = "PM_IPTEG_FROM_L2", ++ .pme_code = 0x0000015042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_1_ECO_MOD ] = { /* 325 */ +- .pme_name = "PM_IPTEG_FROM_L3_1_ECO_MOD", ++[ POWER9_PME_PM_IPTEG_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_IPTEG_FROM_L31_ECO_MOD", + .pme_code = 0x0000045044, + .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a instruction side request", + .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR ] = { /* 326 */ +- .pme_name = "PM_ISIDE_DISP_FAIL_ADDR", +- .pme_code = 0x000002608A, +- .pme_short_desc = "All i-side dispatch attempts that failed due to a addr collision with another machine", +- .pme_long_desc = "All i-side dispatch attempts that failed due to a addr collision with another machine", +-}, +-[ POWER9_PME_PM_CMPLU_STALL_HWSYNC ] = { /* 327 */ +- .pme_name = "PM_CMPLU_STALL_HWSYNC", +- .pme_code = 0x0000030036, +- .pme_short_desc = "completion stall due to hwsync", +- .pme_long_desc = "completion stall due to hwsync", +-}, +-[ POWER9_PME_PM_DATA_FROM_L3 ] = { /* 328 */ +- .pme_name = "PM_DATA_FROM_L3", +- .pme_code = 0x000004C042, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", ++[ POWER9_PME_PM_IPTEG_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_IPTEG_FROM_L31_ECO_SHR", ++ .pme_code = 
0x0000035044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_PMC2_OVERFLOW ] = { /* 329 */ +- .pme_name = "PM_PMC2_OVERFLOW", +- .pme_code = 0x0000030010, +- .pme_short_desc = "Overflow from counter 2", +- .pme_long_desc = "Overflow from counter 2", ++[ POWER9_PME_PM_IPTEG_FROM_L31_MOD ] = { ++ .pme_name = "PM_IPTEG_FROM_L31_MOD", ++ .pme_code = 0x0000025044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC ] = { /* 330 */ +- .pme_name = "PM_LSU0_SRQ_S0_VALID_CYC", +- .pme_code = 0x000000D0B4, +- .pme_short_desc = "Slot 0 of SRQ valid", +- .pme_long_desc = "Slot 0 of SRQ valid", ++[ POWER9_PME_PM_IPTEG_FROM_L31_SHR ] = { ++ .pme_name = "PM_IPTEG_FROM_L31_SHR", ++ .pme_code = 0x0000015046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_DPTEG_FROM_LMEM ] = { /* 331 */ +- .pme_name = "PM_DPTEG_FROM_LMEM", +- .pme_code = 0x000002E048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a 
data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_IPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x0000035042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", + }, +-[ POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE ] = { /* 332 */ +- .pme_name = "PM_IPTEG_FROM_ON_CHIP_CACHE", +- .pme_code = 0x0000015048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_IPTEG_FROM_L3_MEPF ] = { ++ .pme_name = "PM_IPTEG_FROM_L3_MEPF", ++ .pme_code = 0x0000025042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a instruction side request", + }, +-[ POWER9_PME_PM_LSU1_SET_MPRED ] = { /* 333 */ +- .pme_name = "PM_LSU1_SET_MPRED", +- .pme_code = 0x000000D880, +- .pme_short_desc = "Set prediction(set-p) miss.", +- .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", ++[ POWER9_PME_PM_IPTEG_FROM_L3MISS ] = { ++ .pme_name = "PM_IPTEG_FROM_L3MISS", ++ .pme_code = 0x000004504E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a instruction side request", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_1_ECO_SHR ] = { /* 334 */ +- .pme_name = "PM_DATA_FROM_L3_1_ECO_SHR", +- .pme_code = 0x000003C044, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", ++[ POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_IPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x0000015044, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a instruction side request", + }, +-[ POWER9_PME_PM_INST_FROM_MEMORY ] = { /* 335 */ +- .pme_name = "PM_INST_FROM_MEMORY", +- .pme_code = 0x000002404C, +- .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_IPTEG_FROM_L3 ] = { ++ .pme_name = "PM_IPTEG_FROM_L3", ++ .pme_code = 0x0000045042, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", 
++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a instruction side request", + }, +-[ POWER9_PME_PM_L3_P1_LCO_RTY ] = { /* 336 */ +- .pme_name = "PM_L3_P1_LCO_RTY", +- .pme_code = 0x00000168B4, +- .pme_short_desc = "L3 lateral cast out received retry on port 1", +- .pme_long_desc = "L3 lateral cast out received retry on port 1", ++[ POWER9_PME_PM_IPTEG_FROM_LL4 ] = { ++ .pme_name = "PM_IPTEG_FROM_LL4", ++ .pme_code = 0x000001504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", + }, +-[ POWER9_PME_PM_DATA_FROM_L2_1_SHR ] = { /* 337 */ +- .pme_name = "PM_DATA_FROM_L2_1_SHR", +- .pme_code = 0x000003C046, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", ++[ POWER9_PME_PM_IPTEG_FROM_LMEM ] = { ++ .pme_name = "PM_IPTEG_FROM_LMEM", ++ .pme_code = 0x0000025048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a instruction side request", + }, +-[ POWER9_PME_PM_FLUSH_LSU ] = { /* 338 */ +- .pme_name = "PM_FLUSH_LSU", +- .pme_code = 0x00000058A4, +- .pme_short_desc = "LSU flushes.", +- .pme_long_desc = "LSU flushes. 
Includes all lsu flushes", ++[ POWER9_PME_PM_IPTEG_FROM_MEMORY ] = { ++ .pme_name = "PM_IPTEG_FROM_MEMORY", ++ .pme_code = 0x000002504C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a instruction side request", + }, +-[ POWER9_PME_PM_CMPLU_STALL_FXLONG ] = { /* 339 */ +- .pme_name = "PM_CMPLU_STALL_FXLONG", +- .pme_code = 0x000004D016, +- .pme_short_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", +- .pme_long_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", ++[ POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_IPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a instruction side request", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM ] = { /* 340 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_LMEM", +- .pme_code = 0x0000030038, +- .pme_short_desc = "Completion stall due to cache miss that resolves in local memory", +- .pme_long_desc = "Completion stall due to cache miss that resolves in local memory", ++[ POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_IPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x0000015048, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a instruction side request", ++ .pme_long_desc = "A Page Table 
Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a instruction side request", + }, +-[ POWER9_PME_PM_SNP_TM_HIT_M ] = { /* 341 */ +- .pme_name = "PM_SNP_TM_HIT_M", +- .pme_code = 0x00000360A6, +- .pme_short_desc = "snp tm st hit m mu", +- .pme_long_desc = "snp tm st hit m mu", ++[ POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_IPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x0000025046, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", + }, +-[ POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY ] = { /* 342 */ +- .pme_name = "PM_INST_GRP_PUMP_MPRED_RTY", +- .pme_code = 0x0000014052, +- .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", +- .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", ++[ POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_IPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", + }, +-[ POWER9_PME_PM_L2_INST_MISS ] = { /* 343 */ +- .pme_name = "PM_L2_INST_MISS", +- .pme_code = 0x000004609E, +- .pme_short_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", +- 
.pme_long_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", ++[ POWER9_PME_PM_IPTEG_FROM_RL4 ] = { ++ .pme_name = "PM_IPTEG_FROM_RL4", ++ .pme_code = 0x000002504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", + }, +-[ POWER9_PME_PM_CMPLU_STALL_ERAT_MISS ] = { /* 344 */ +- .pme_name = "PM_CMPLU_STALL_ERAT_MISS", +- .pme_code = 0x000004C012, +- .pme_short_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", +- .pme_long_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", ++[ POWER9_PME_PM_IPTEG_FROM_RMEM ] = { ++ .pme_name = "PM_IPTEG_FROM_RMEM", ++ .pme_code = 0x000003504A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", + }, +-[ POWER9_PME_PM_MRK_L2_RC_DONE ] = { /* 345 */ +- .pme_name = "PM_MRK_L2_RC_DONE", +- .pme_code = 0x000003012A, +- .pme_short_desc = "Marked RC done", +- .pme_long_desc = "Marked RC done", ++[ POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR ] = { ++ .pme_name = "PM_ISIDE_DISP_FAIL_ADDR", ++ .pme_code = 0x000002608A, ++ .pme_short_desc = "All I-side dispatch attempts for this thread that failed due to a addr collision with another machine (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-side dispatch attempts for this thread that failed due to a addr collision with another machine (excludes i_l2mru_tch_reqs)", + }, +-[ 
POWER9_PME_PM_INST_FROM_L3_1_SHR ] = { /* 346 */ +- .pme_name = "PM_INST_FROM_L3_1_SHR", +- .pme_code = 0x0000014046, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER ] = { ++ .pme_name = "PM_ISIDE_DISP_FAIL_OTHER", ++ .pme_code = 0x000002688A, ++ .pme_short_desc = "All I-side dispatch attempts for this thread that failed due to a reason other than addrs collision (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-side dispatch attempts for this thread that failed due to a reason other than addrs collision (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L2 ] = { /* 347 */ +- .pme_name = "PM_RADIX_PWC_L4_PDE_FROM_L2", +- .pme_code = 0x000002D02C, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 4 page walk cache from the core's L2 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 4 page walk cache from the core's L2 data cache", ++[ POWER9_PME_PM_ISIDE_DISP ] = { ++ .pme_name = "PM_ISIDE_DISP", ++ .pme_code = 0x000001688A, ++ .pme_short_desc = "All I-side dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-side dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD ] = { /* 348 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_MOD", +- .pme_code = 0x000002D144, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", ++[ 
POWER9_PME_PM_ISIDE_L2MEMACC ] = { ++ .pme_name = "PM_ISIDE_L2MEMACC", ++ .pme_code = 0x0000026890, ++ .pme_short_desc = "Valid when first beat of data comes in for an I-side fetch where data came from memory", ++ .pme_long_desc = "Valid when first beat of data comes in for an I-side fetch where data came from memory", + }, +-[ POWER9_PME_PM_CO0_BUSY ] = { /* 349 */ +- .pme_name = "PM_CO0_BUSY", +- .pme_code = 0x000004608E, +- .pme_short_desc = "CO mach 0 Busy.", +- .pme_long_desc = "CO mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", ++[ POWER9_PME_PM_ISIDE_MRU_TOUCH ] = { ++ .pme_name = "PM_ISIDE_MRU_TOUCH", ++ .pme_code = 0x0000046880, ++ .pme_short_desc = "I-side L2 MRU touch sent to L2 for this thread", ++ .pme_long_desc = "I-side L2 MRU touch sent to L2 for this thread", + }, +-[ POWER9_PME_PM_CMPLU_STALL_STORE_DATA ] = { /* 350 */ +- .pme_name = "PM_CMPLU_STALL_STORE_DATA", +- .pme_code = 0x0000030026, +- .pme_short_desc = "Finish stall because the next to finish instruction was a store waiting on data", +- .pme_long_desc = "Finish stall because the next to finish instruction was a store waiting on data", ++[ POWER9_PME_PM_ISLB_MISS ] = { ++ .pme_name = "PM_ISLB_MISS", ++ .pme_code = 0x000000D8A8, ++ .pme_short_desc = "Instruction SLB Miss - Total of all segment sizes", ++ .pme_long_desc = "Instruction SLB Miss - Total of all segment sizes", + }, +-[ POWER9_PME_PM_INST_FROM_RMEM ] = { /* 351 */ +- .pme_name = "PM_INST_FROM_RMEM", +- .pme_code = 0x000003404A, +- .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_ISLB_MISS_ALT ] = { ++ .pme_name = "PM_ISLB_MISS", ++ .pme_code = 0x0000040006, ++ .pme_short_desc = 
"Number of ISLB misses for this thread", ++ .pme_long_desc = "Number of ISLB misses for this thread", + }, +-[ POWER9_PME_PM_SYNC_MRK_BR_LINK ] = { /* 352 */ +- .pme_name = "PM_SYNC_MRK_BR_LINK", +- .pme_code = 0x0000015152, +- .pme_short_desc = "Marked Branch and link branch that can cause a synchronous interrupt", +- .pme_long_desc = "Marked Branch and link branch that can cause a synchronous interrupt", ++[ POWER9_PME_PM_ISQ_0_8_ENTRIES ] = { ++ .pme_name = "PM_ISQ_0_8_ENTRIES", ++ .pme_code = 0x000003005A, ++ .pme_short_desc = "Cycles in which 8 or less Issue Queue entries are in use.", ++ .pme_long_desc = "Cycles in which 8 or less Issue Queue entries are in use. This is a shared event, not per thread", + }, +-[ POWER9_PME_PM_L3_LD_PREF ] = { /* 353 */ +- .pme_name = "PM_L3_LD_PREF", +- .pme_code = 0x000000F0B0, +- .pme_short_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", +- .pme_long_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", ++[ POWER9_PME_PM_ISQ_36_44_ENTRIES ] = { ++ .pme_name = "PM_ISQ_36_44_ENTRIES", ++ .pme_code = 0x000004000A, ++ .pme_short_desc = "Cycles in which 36 or more Issue Queue entries are in use.", ++ .pme_long_desc = "Cycles in which 36 or more Issue Queue entries are in use. This is a shared event, not per thread. 
There are 44 issue queue entries across 4 slices in the whole core", + }, +-[ POWER9_PME_PM_DISP_CLB_HELD_TLBIE ] = { /* 354 */ +- .pme_name = "PM_DISP_CLB_HELD_TLBIE", +- .pme_code = 0x0000002890, +- .pme_short_desc = "Dispatch Hold: Due to TLBIE", +- .pme_long_desc = "Dispatch Hold: Due to TLBIE", ++[ POWER9_PME_PM_ISU0_ISS_HOLD_ALL ] = { ++ .pme_name = "PM_ISU0_ISS_HOLD_ALL", ++ .pme_code = 0x0000003080, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", + }, +-[ POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE ] = { /* 355 */ +- .pme_name = "PM_DPTEG_FROM_ON_CHIP_CACHE", +- .pme_code = 0x000001E048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_ISU1_ISS_HOLD_ALL ] = { ++ .pme_name = "PM_ISU1_ISS_HOLD_ALL", ++ .pme_code = 0x0000003084, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC ] = { /* 356 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", +- .pme_code = 0x000001415C, +- .pme_short_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state. 
due to a marked load", ++[ POWER9_PME_PM_ISU2_ISS_HOLD_ALL ] = { ++ .pme_name = "PM_ISU2_ISS_HOLD_ALL", ++ .pme_code = 0x0000003880, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", + }, +-[ POWER9_PME_PM_LS0_UNALIGNED_LD ] = { /* 357 */ +- .pme_name = "PM_LS0_UNALIGNED_LD", +- .pme_code = 0x000000C094, +- .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", +- .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_ISU3_ISS_HOLD_ALL ] = { ++ .pme_name = "PM_ISU3_ISS_HOLD_ALL", ++ .pme_code = 0x0000003884, ++ .pme_short_desc = "All ISU rejects", ++ .pme_long_desc = "All ISU rejects", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { /* 358 */ +- .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", +- .pme_code = 0x000004E11E, +- .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", ++[ POWER9_PME_PM_ISYNC ] = { ++ .pme_name = "PM_ISYNC", ++ .pme_code = 0x0000002884, ++ .pme_short_desc = "Isync completion count per thread", ++ .pme_long_desc = "Isync completion count per thread", + }, +-[ POWER9_PME_PM_SN_HIT ] = { /* 359 */ +- .pme_name = "PM_SN_HIT", +- .pme_code = 0x00000460A8, +- .pme_short_desc = "Any port snooper hit.", +- .pme_long_desc = "Any port snooper hit. 
Up to 4 can happen in a cycle but we only count 1", ++[ POWER9_PME_PM_ITLB_MISS ] = { ++ .pme_name = "PM_ITLB_MISS", ++ .pme_code = 0x00000400FC, ++ .pme_short_desc = "ITLB Reloaded.", ++ .pme_long_desc = "ITLB Reloaded. Counts 1 per ITLB miss for HPT but multiple for radix depending on number of levels traveresed", + }, +-[ POWER9_PME_PM_L3_LOC_GUESS_CORRECT ] = { /* 360 */ +- .pme_name = "PM_L3_LOC_GUESS_CORRECT", +- .pme_code = 0x00000160B2, +- .pme_short_desc = "initial scope=node/chip and data from local node (local) (pred successful)", +- .pme_long_desc = "initial scope=node/chip and data from local node (local) (pred successful)", ++[ POWER9_PME_PM_L1_DCACHE_RELOADED_ALL ] = { ++ .pme_name = "PM_L1_DCACHE_RELOADED_ALL", ++ .pme_code = 0x000001002C, ++ .pme_short_desc = "L1 data cache reloaded for demand.", ++ .pme_long_desc = "L1 data cache reloaded for demand. If MMCR1[16] is 1, prefetches will be included as well", + }, +-[ POWER9_PME_PM_MRK_INST_FROM_L3MISS ] = { /* 361 */ +- .pme_name = "PM_MRK_INST_FROM_L3MISS", +- .pme_code = 0x00000401E6, +- .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", +- .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++[ POWER9_PME_PM_L1_DCACHE_RELOAD_VALID ] = { ++ .pme_name = "PM_L1_DCACHE_RELOAD_VALID", ++ .pme_code = 0x00000300F6, ++ .pme_short_desc = "DL1 reloaded due to Demand Load", ++ .pme_long_desc = "DL1 reloaded due to Demand Load", + }, +-[ POWER9_PME_PM_DECODE_FUSION_EXT_ADD ] = { /* 362 */ +- .pme_name = "PM_DECODE_FUSION_EXT_ADD", +- .pme_code = 0x0000005084, +- .pme_short_desc = "32-bit extended addition", +- .pme_long_desc = "32-bit extended addition", ++[ POWER9_PME_PM_L1_DEMAND_WRITE ] = { ++ .pme_name = "PM_L1_DEMAND_WRITE", ++ .pme_code = 0x000000408C, ++ .pme_short_desc = "Instruction Demand sectors written into IL1", ++ .pme_long_desc = "Instruction Demand sectors written into IL1", + }, +-[ 
POWER9_PME_PM_INST_FROM_DL4 ] = { /* 363 */ +- .pme_name = "PM_INST_FROM_DL4", +- .pme_code = 0x000003404C, +- .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_L1_ICACHE_MISS ] = { ++ .pme_name = "PM_L1_ICACHE_MISS", ++ .pme_code = 0x00000200FD, ++ .pme_short_desc = "Demand iCache Miss", ++ .pme_long_desc = "Demand iCache Miss", + }, +-[ POWER9_PME_PM_DC_PREF_XCONS_ALLOC ] = { /* 364 */ +- .pme_name = "PM_DC_PREF_XCONS_ALLOC", +- .pme_code = 0x000000F8B4, +- .pme_short_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", +- .pme_long_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", ++[ POWER9_PME_PM_L1_ICACHE_RELOADED_ALL ] = { ++ .pme_name = "PM_L1_ICACHE_RELOADED_ALL", ++ .pme_code = 0x0000040012, ++ .pme_short_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", ++ .pme_long_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY ] = { /* 365 */ +- .pme_name = "PM_MRK_DPTEG_FROM_MEMORY", +- .pme_code = 0x000002F14C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_L1_ICACHE_RELOADED_PREF ] = { ++ .pme_name = "PM_L1_ICACHE_RELOADED_PREF", ++ .pme_code = 0x0000030068, ++ .pme_short_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", ++ .pme_long_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", + }, +-[ POWER9_PME_PM_IC_PREF_CANCEL_PAGE ] = { /* 366 */ +- .pme_name = "PM_IC_PREF_CANCEL_PAGE", +- .pme_code = 0x0000004090, +- .pme_short_desc = "Prefetch Canceled due to page boundary", +- .pme_long_desc = "Prefetch Canceled due to page boundary", ++[ POWER9_PME_PM_L1PF_L2MEMACC ] = { ++ .pme_name = "PM_L1PF_L2MEMACC", ++ .pme_code = 0x0000016890, ++ .pme_short_desc = "Valid when first beat of data comes in for an L1PF where data came from memory", ++ .pme_long_desc = "Valid when first beat of data comes in for an L1PF where data came from memory", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 ] = { /* 367 */ +- .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3", +- .pme_code = 0x000003F05E, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation", ++[ POWER9_PME_PM_L1_PREF ] = { ++ .pme_name = "PM_L1_PREF", ++ .pme_code = 0x0000020054, ++ .pme_short_desc = "A data line was written to the L1 due to a hardware or software prefetch", ++ .pme_long_desc = "A data line was written to the L1 due to a hardware or software prefetch", + }, +-[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW ] = { /* 368 */ +- .pme_name = "PM_L3_GRP_GUESS_WRONG_LOW", +- .pme_code = 0x00000360B2, +- .pme_short_desc = "Initial scope=group but data from outside group (far or rem).", +- .pme_long_desc = "Initial scope=group but data from outside group (far or rem). 
Prediction too Low", ++[ POWER9_PME_PM_L1_SW_PREF ] = { ++ .pme_name = "PM_L1_SW_PREF", ++ .pme_code = 0x000000E880, ++ .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", ++ .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches", + }, +-[ POWER9_PME_PM_TM_FAIL_SELF ] = { /* 369 */ +- .pme_name = "PM_TM_FAIL_SELF", +- .pme_code = 0x00000028AC, +- .pme_short_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally", +- .pme_long_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally", ++[ POWER9_PME_PM_L2_CASTOUT_MOD ] = { ++ .pme_name = "PM_L2_CASTOUT_MOD", ++ .pme_code = 0x0000016082, ++ .pme_short_desc = "L2 Castouts - Modified (M,Mu,Me)", ++ .pme_long_desc = "L2 Castouts - Modified (M,Mu,Me)", + }, +-[ POWER9_PME_PM_L3_P1_SYS_PUMP ] = { /* 370 */ +- .pme_name = "PM_L3_P1_SYS_PUMP", +- .pme_code = 0x00000368B0, +- .pme_short_desc = "L3 pf sent with sys scope port 1", +- .pme_long_desc = "L3 pf sent with sys scope port 1", ++[ POWER9_PME_PM_L2_CASTOUT_SHR ] = { ++ .pme_name = "PM_L2_CASTOUT_SHR", ++ .pme_code = 0x0000016882, ++ .pme_short_desc = "L2 Castouts - Shared (Tx,Sx)", ++ .pme_long_desc = "L2 Castouts - Shared (Tx,Sx)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_RFID ] = { /* 371 */ +- .pme_name = "PM_CMPLU_STALL_RFID", +- .pme_code = 0x000002C01E, +- .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by an RFID exception, which has to be serviced before the instruction can complete", +- .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by an RFID exception, which has to be serviced before the instruction can complete", ++[ POWER9_PME_PM_L2_CHIP_PUMP 
] = { ++ .pme_name = "PM_L2_CHIP_PUMP", ++ .pme_code = 0x0000046088, ++ .pme_short_desc = "RC requests that were local (aka chip) pump attempts", ++ .pme_long_desc = "RC requests that were local (aka chip) pump attempts", + }, +-[ POWER9_PME_PM_BR_2PATH ] = { /* 372 */ +- .pme_name = "PM_BR_2PATH", +- .pme_code = 0x0000020036, +- .pme_short_desc = "two path branch", +- .pme_long_desc = "two path branch", ++[ POWER9_PME_PM_L2_DC_INV ] = { ++ .pme_name = "PM_L2_DC_INV", ++ .pme_code = 0x0000026882, ++ .pme_short_desc = "D-cache invalidates sent over the reload bus to the core", ++ .pme_long_desc = "D-cache invalidates sent over the reload bus to the core", + }, +-[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS ] = { /* 373 */ +- .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3MISS", +- .pme_code = 0x000003F054, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache. This is the deepest level of PWC possible for a translation. The source could be local/remote/distant memory or another core's cache", ++[ POWER9_PME_PM_L2_DISP_ALL_L2MISS ] = { ++ .pme_name = "PM_L2_DISP_ALL_L2MISS", ++ .pme_code = 0x0000046080, ++ .pme_short_desc = "All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L2MISS ] = { /* 374 */ +- .pme_name = "PM_DPTEG_FROM_L2MISS", +- .pme_code = 0x000001E04E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a data side request. 
When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_L2_GROUP_PUMP ] = { ++ .pme_name = "PM_L2_GROUP_PUMP", ++ .pme_code = 0x0000046888, ++ .pme_short_desc = "RC requests that were on group (aka node) pump attempts", ++ .pme_long_desc = "RC requests that were on group (aka node) pump attempts", + }, +-[ POWER9_PME_PM_TM_TX_PASS_RUN_INST ] = { /* 375 */ +- .pme_name = "PM_TM_TX_PASS_RUN_INST", +- .pme_code = 0x000004E014, +- .pme_short_desc = "Run instructions spent in successful transactions", +- .pme_long_desc = "Run instructions spent in successful transactions", ++[ POWER9_PME_PM_L2_GRP_GUESS_CORRECT ] = { ++ .pme_name = "PM_L2_GRP_GUESS_CORRECT", ++ .pme_code = 0x0000026088, ++ .pme_short_desc = "L2 guess grp (GS or NNS) and guess was correct (data intra-group AND ^on-chip)", ++ .pme_long_desc = "L2 guess grp (GS or NNS) and guess was correct (data intra-group AND ^on-chip)", + }, +-[ POWER9_PME_PM_L1_ICACHE_RELOADED_PREF ] = { /* 376 */ +- .pme_name = "PM_L1_ICACHE_RELOADED_PREF", +- .pme_code = 0x0000030068, +- .pme_short_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", +- .pme_long_desc = "Counts all Icache prefetch reloads ( includes demand turned into prefetch)", ++[ POWER9_PME_PM_L2_GRP_GUESS_WRONG ] = { ++ .pme_name = "PM_L2_GRP_GUESS_WRONG", ++ .pme_code = 0x0000026888, ++ .pme_short_desc = "L2 guess grp (GS or NNS) and guess was not correct (ie data on-chip OR beyond-group)", ++ .pme_long_desc = "L2 guess grp (GS or NNS) and guess was not correct (ie data on-chip OR beyond-group)", + }, +-[ POWER9_PME_PM_THRESH_EXC_4096 ] = { /* 377 */ +- .pme_name = "PM_THRESH_EXC_4096", +- .pme_code = 0x00000101E6, +- .pme_short_desc = "Threshold counter exceed a count of 4096", +- .pme_long_desc = "Threshold counter exceed a count of 4096", ++[ POWER9_PME_PM_L2_IC_INV ] = { ++ .pme_name = "PM_L2_IC_INV", ++ .pme_code = 0x0000026082, ++ .pme_short_desc = "I-cache 
Invalidates sent over the reload bus to the core", ++ .pme_long_desc = "I-cache Invalidates sent over the reload bus to the core", + }, +-[ POWER9_PME_PM_IERAT_RELOAD_64K ] = { /* 378 */ +- .pme_name = "PM_IERAT_RELOAD_64K", +- .pme_code = 0x000003006A, +- .pme_short_desc = "IERAT Reloaded (Miss) for a 64k page", +- .pme_long_desc = "IERAT Reloaded (Miss) for a 64k page", ++[ POWER9_PME_PM_L2_INST_MISS ] = { ++ .pme_name = "PM_L2_INST_MISS", ++ .pme_code = 0x0000036880, ++ .pme_short_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", + }, +-[ POWER9_PME_PM_LSU0_TM_L1_MISS ] = { /* 379 */ +- .pme_name = "PM_LSU0_TM_L1_MISS", +- .pme_code = 0x000000E09C, +- .pme_short_desc = "Load tm L1 miss", +- .pme_long_desc = "Load tm L1 miss", ++[ POWER9_PME_PM_L2_INST_MISS_ALT ] = { ++ .pme_name = "PM_L2_INST_MISS", ++ .pme_code = 0x000004609E, ++ .pme_short_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", + }, +-[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED ] = { /* 380 */ +- .pme_name = "PM_MEM_LOC_THRESH_LSU_MED", +- .pme_code = 0x000001C05E, +- .pme_short_desc = "Local memory above theshold for data prefetch", +- .pme_long_desc = "Local memory above theshold for data prefetch", ++[ POWER9_PME_PM_L2_INST ] = { ++ .pme_name = "PM_L2_INST", ++ .pme_code = 0x0000036080, ++ .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", + }, +-[ POWER9_PME_PM_PMC3_REWIND ] = { /* 381 */ +- .pme_name = "PM_PMC3_REWIND", +- .pme_code = 0x000001000A, +- .pme_short_desc = "PMC3 rewind event.", +- 
.pme_long_desc = "PMC3 rewind event. A rewind happens when a speculative event (such as latency or CPI stack) is selected on PMC3 and the stall reason or reload source did not match the one programmed in PMC3. When this occurs, the count in PMC3 will not change.", ++[ POWER9_PME_PM_L2_INST_ALT ] = { ++ .pme_name = "PM_L2_INST", ++ .pme_code = 0x000003609E, ++ .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", ++ .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", + }, +-[ POWER9_PME_PM_ST_FWD ] = { /* 382 */ +- .pme_name = "PM_ST_FWD", +- .pme_code = 0x0000020018, +- .pme_short_desc = "Store forwards that finished", +- .pme_long_desc = "Store forwards that finished", ++[ POWER9_PME_PM_L2_LD_DISP ] = { ++ .pme_name = "PM_L2_LD_DISP", ++ .pme_code = 0x000001609E, ++ .pme_short_desc = "All successful D-side load dispatches for this thread (L2 miss + L2 hits)", ++ .pme_long_desc = "All successful D-side load dispatches for this thread (L2 miss + L2 hits)", + }, +-[ POWER9_PME_PM_TM_FAIL_TX_CONFLICT ] = { /* 383 */ +- .pme_name = "PM_TM_FAIL_TX_CONFLICT", +- .pme_code = 0x000000E8AC, +- .pme_short_desc = "Transactional conflict from LSU, whatever gets reported to texas", +- .pme_long_desc = "Transactional conflict from LSU, whatever gets reported to texas", ++[ POWER9_PME_PM_L2_LD_DISP_ALT ] = { ++ .pme_name = "PM_L2_LD_DISP", ++ .pme_code = 0x0000036082, ++ .pme_short_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_SYNC_MRK_L2MISS ] = { /* 384 */ +- .pme_name = "PM_SYNC_MRK_L2MISS", +- .pme_code = 0x000001515A, +- .pme_short_desc = "Marked L2 Miss that can throw a synchronous interrupt", +- .pme_long_desc = "Marked L2 Miss that can throw a synchronous interrupt", ++[ POWER9_PME_PM_L2_LD_HIT ] = { ++ 
.pme_name = "PM_L2_LD_HIT", ++ .pme_code = 0x000002609E, ++ .pme_short_desc = "All successful D-side load dispatches that were L2 hits for this thread", ++ .pme_long_desc = "All successful D-side load dispatches that were L2 hits for this thread", + }, +-[ POWER9_PME_PM_ISU0_ISS_HOLD_ALL ] = { /* 385 */ +- .pme_name = "PM_ISU0_ISS_HOLD_ALL", +- .pme_code = 0x0000003080, +- .pme_short_desc = "All ISU rejects", +- .pme_long_desc = "All ISU rejects", ++[ POWER9_PME_PM_L2_LD_HIT_ALT ] = { ++ .pme_name = "PM_L2_LD_HIT", ++ .pme_code = 0x0000036882, ++ .pme_short_desc = "All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", ++}, ++[ POWER9_PME_PM_L2_LD_MISS_128B ] = { ++ .pme_name = "PM_L2_LD_MISS_128B", ++ .pme_code = 0x0000016092, ++ .pme_short_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.", ++ .pme_long_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0)", ++}, ++[ POWER9_PME_PM_L2_LD_MISS_64B ] = { ++ .pme_name = "PM_L2_LD_MISS_64B", ++ .pme_code = 0x0000026092, ++ .pme_short_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B(i.", ++ .pme_long_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B(i.e., M=1)", ++}, ++[ POWER9_PME_PM_L2_LD_MISS ] = { ++ .pme_name = "PM_L2_LD_MISS", ++ .pme_code = 0x0000026080, ++ .pme_short_desc = "All successful D-Side Load dispatches that were an L2 miss for this thread", ++ .pme_long_desc = "All successful D-Side Load dispatches that were an L2 
miss for this thread", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC ] = { /* 386 */ +- .pme_name = "PM_MRK_FAB_RSP_DCLAIM_CYC", +- .pme_code = 0x000002F152, +- .pme_short_desc = "cycles L2 RC took for a dclaim", +- .pme_long_desc = "cycles L2 RC took for a dclaim", ++[ POWER9_PME_PM_L2_LD ] = { ++ .pme_name = "PM_L2_LD", ++ .pme_code = 0x0000016080, ++ .pme_short_desc = "All successful D-side Load dispatches for this thread (L2 miss + L2 hits)", ++ .pme_long_desc = "All successful D-side Load dispatches for this thread (L2 miss + L2 hits)", + }, +-[ POWER9_PME_PM_DATA_FROM_L2 ] = { /* 387 */ +- .pme_name = "PM_DATA_FROM_L2", +- .pme_code = 0x000001C042, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", ++[ POWER9_PME_PM_L2_LOC_GUESS_CORRECT ] = { ++ .pme_name = "PM_L2_LOC_GUESS_CORRECT", ++ .pme_code = 0x0000016088, ++ .pme_short_desc = "L2 guess local (LNS) and guess was correct (ie data local)", ++ .pme_long_desc = "L2 guess local (LNS) and guess was correct (ie data local)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { /* 388 */ +- .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", +- .pme_code = 0x000001D14A, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++[ POWER9_PME_PM_L2_LOC_GUESS_WRONG ] = { ++ .pme_name = "PM_L2_LOC_GUESS_WRONG", ++ .pme_code = 0x0000016888, ++ .pme_short_desc = "L2 guess local (LNS) and guess was not correct (ie data not on chip)", ++ .pme_long_desc = "L2 guess local (LNS) and guess was not correct (ie data not on chip)", + }, +-[ POWER9_PME_PM_ISQ_0_8_ENTRIES 
] = { /* 389 */ +- .pme_name = "PM_ISQ_0_8_ENTRIES", +- .pme_code = 0x000003005A, +- .pme_short_desc = "Cycles in which 8 or less Issue Queue entries are in use.", +- .pme_long_desc = "Cycles in which 8 or less Issue Queue entries are in use. This is a shared event, not per thread", ++[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { ++ .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", ++ .pme_code = 0x0000016884, ++ .pme_short_desc = "All I-or-D side load dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ machine (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-or-D side load dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ machine (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_L3_CO_MEPF ] = { /* 390 */ +- .pme_name = "PM_L3_CO_MEPF", +- .pme_code = 0x00000168A0, +- .pme_short_desc = "L3 CO of line in Mep state ( includes casthrough", +- .pme_long_desc = "L3 CO of line in Mep state ( includes casthrough", ++[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { ++ .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", ++ .pme_code = 0x0000026084, ++ .pme_short_desc = "All I-or-D side load dispatch attempts for this thread that failed due to reason other than address collision (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-or-D side load dispatch attempts for this thread that failed due to reason other than address collision (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_LINK_STACK_INVALID_PTR ] = { /* 391 */ +- .pme_name = "PM_LINK_STACK_INVALID_PTR", +- .pme_code = 0x0000005898, +- .pme_short_desc = "It is most often caused by certain types of flush where the pointer is not available.", +- .pme_long_desc = "It is most often caused by certain types of flush where the pointer is not available. 
Can result in the data in the link stack becoming unusable.", ++[ POWER9_PME_PM_L2_RCLD_DISP ] = { ++ .pme_name = "PM_L2_RCLD_DISP", ++ .pme_code = 0x0000016084, ++ .pme_short_desc = "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", ++ .pme_long_desc = "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2_1_MOD ] = { /* 392 */ +- .pme_name = "PM_IPTEG_FROM_L2_1_MOD", +- .pme_code = 0x0000045046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { ++ .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", ++ .pme_code = 0x0000036884, ++ .pme_short_desc = "All D-side store dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ", ++ .pme_long_desc = "All D-side store dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ", + }, +-[ POWER9_PME_PM_TM_ST_CAUSED_FAIL ] = { /* 393 */ +- .pme_name = "PM_TM_ST_CAUSED_FAIL", +- .pme_code = 0x000003688C, +- .pme_short_desc = "TM Store (fav or non-fav) caused another thread to fail", +- .pme_long_desc = "TM Store (fav or non-fav) caused another thread to fail", ++[ POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { ++ .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", ++ .pme_code = 0x0000046084, ++ .pme_short_desc = "All D-side store dispatch attempts for this thread that failed due to reason other than address collision", ++ .pme_long_desc = "All D-side store dispatch attempts for this thread that failed due to reason other than address collision", + }, +-[ POWER9_PME_PM_LD_REF_L1 ] = { /* 394 */ +- .pme_name = "PM_LD_REF_L1", +- .pme_code = 0x00000100FC, +- 
.pme_short_desc = "All L1 D cache load references counted at finish, gated by reject", +- .pme_long_desc = "All L1 D cache load references counted at finish, gated by reject", ++[ POWER9_PME_PM_L2_RCST_DISP ] = { ++ .pme_name = "PM_L2_RCST_DISP", ++ .pme_code = 0x0000036084, ++ .pme_short_desc = "All D-side store dispatch attempts for this thread", ++ .pme_long_desc = "All D-side store dispatch attempts for this thread", + }, +-[ POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT ] = { /* 395 */ +- .pme_name = "PM_TM_FAIL_NON_TX_CONFLICT", +- .pme_code = 0x000000E0B0, +- .pme_short_desc = "Non transactional conflict from LSU whtver gets repoted to texas", +- .pme_long_desc = "Non transactional conflict from LSU whtver gets repoted to texas", ++[ POWER9_PME_PM_L2_RC_ST_DONE ] = { ++ .pme_name = "PM_L2_RC_ST_DONE", ++ .pme_code = 0x0000036086, ++ .pme_short_desc = "RC did store to line that was Tx or Sx", ++ .pme_long_desc = "RC did store to line that was Tx or Sx", + }, +-[ POWER9_PME_PM_GRP_PUMP_CPRED ] = { /* 396 */ +- .pme_name = "PM_GRP_PUMP_CPRED", +- .pme_code = 0x0000020050, +- .pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_L2_RTY_LD ] = { ++ .pme_name = "PM_L2_RTY_LD", ++ .pme_code = 0x000003688A, ++ .pme_short_desc = "RC retries on PB for any load from core (excludes DCBFs)", ++ .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", + }, +-[ POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT ] = { /* 397 */ +- .pme_name = "PM_INST_FROM_L3_NO_CONFLICT", +- .pme_code = 0x0000014044, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not 
prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_L2_RTY_LD_ALT ] = { ++ .pme_name = "PM_L2_RTY_LD", ++ .pme_code = 0x000003689E, ++ .pme_short_desc = "RC retries on PB for any load from core (excludes DCBFs)", ++ .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", + }, +-[ POWER9_PME_PM_DC_PREF_STRIDED_CONF ] = { /* 398 */ +- .pme_name = "PM_DC_PREF_STRIDED_CONF", +- .pme_code = 0x000000F0AC, +- .pme_short_desc = "A demand load referenced a line in an active strided prefetch stream.", +- .pme_long_desc = "A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.", ++[ POWER9_PME_PM_L2_RTY_ST ] = { ++ .pme_name = "PM_L2_RTY_ST", ++ .pme_code = 0x000003608A, ++ .pme_short_desc = "RC retries on PB for any store from core (excludes DCBFs)", ++ .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", + }, +-[ POWER9_PME_PM_THRD_PRIO_6_7_CYC ] = { /* 399 */ +- .pme_name = "PM_THRD_PRIO_6_7_CYC", +- .pme_code = 0x0000005880, +- .pme_short_desc = "Cycles thread running at priority level 6 or 7", +- .pme_long_desc = "Cycles thread running at priority level 6 or 7", ++[ POWER9_PME_PM_L2_RTY_ST_ALT ] = { ++ .pme_name = "PM_L2_RTY_ST", ++ .pme_code = 0x000004689E, ++ .pme_short_desc = "RC retries on PB for any store from core (excludes DCBFs)", ++ .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", + }, +-[ POWER9_PME_PM_RADIX_PWC_L4_PDE_FROM_L3 ] = { /* 400 */ +- .pme_name = "PM_RADIX_PWC_L4_PDE_FROM_L3", +- .pme_code = 0x000003F05C, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", 
++[ POWER9_PME_PM_L2_SN_M_RD_DONE ] = { ++ .pme_name = "PM_L2_SN_M_RD_DONE", ++ .pme_code = 0x0000046086, ++ .pme_short_desc = "SNP dispatched for a read and was M (true M)", ++ .pme_long_desc = "SNP dispatched for a read and was M (true M)", + }, +-[ POWER9_PME_PM_L3_PF_OFF_CHIP_MEM ] = { /* 401 */ +- .pme_name = "PM_L3_PF_OFF_CHIP_MEM", +- .pme_code = 0x00000468A0, +- .pme_short_desc = "L3 Prefetch from Off chip memory", +- .pme_long_desc = "L3 Prefetch from Off chip memory", ++[ POWER9_PME_PM_L2_SN_M_WR_DONE ] = { ++ .pme_name = "PM_L2_SN_M_WR_DONE", ++ .pme_code = 0x0000016086, ++ .pme_short_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", ++ .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", + }, +-[ POWER9_PME_PM_L3_CO_MEM ] = { /* 402 */ +- .pme_name = "PM_L3_CO_MEM", +- .pme_code = 0x00000260A0, +- .pme_short_desc = "L3 CO to memory OR of port 0 and 1 ( lossy)", +- .pme_long_desc = "L3 CO to memory OR of port 0 and 1 ( lossy)", ++[ POWER9_PME_PM_L2_SN_M_WR_DONE_ALT ] = { ++ .pme_name = "PM_L2_SN_M_WR_DONE", ++ .pme_code = 0x0000046886, ++ .pme_short_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", ++ .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", + }, +-[ POWER9_PME_PM_DECODE_HOLD_ICT_FULL ] = { /* 403 */ +- .pme_name = "PM_DECODE_HOLD_ICT_FULL", +- .pme_code = 0x00000058A8, +- .pme_short_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use.", +- .pme_long_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit 
one or more instructions because all itags were in use. This means the ICT is full for this thread", ++[ POWER9_PME_PM_L2_SN_SX_I_DONE ] = { ++ .pme_name = "PM_L2_SN_SX_I_DONE", ++ .pme_code = 0x0000036886, ++ .pme_short_desc = "SNP dispatched and went from Sx to Ix", ++ .pme_long_desc = "SNP dispatched and went from Sx to Ix", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DFLONG ] = { /* 404 */ +- .pme_name = "PM_CMPLU_STALL_DFLONG", +- .pme_code = 0x000001005A, +- .pme_short_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Qualified by multicycle", ++[ POWER9_PME_PM_L2_ST_DISP ] = { ++ .pme_name = "PM_L2_ST_DISP", ++ .pme_code = 0x0000046082, ++ .pme_short_desc = "All successful D-side store dispatches for this thread", ++ .pme_long_desc = "All successful D-side store dispatches for this thread", + }, +-[ POWER9_PME_PM_LD_MISS_L1 ] = { /* 405 */ +- .pme_name = "PM_LD_MISS_L1", +- .pme_code = 0x000003E054, +- .pme_short_desc = "Load Missed L1, at execution time (not gated by finish, which means this counter can be greater than loads finished)", +- .pme_long_desc = "Load Missed L1, at execution time (not gated by finish, which means this counter can be greater than loads finished)", ++[ POWER9_PME_PM_L2_ST_DISP_ALT ] = { ++ .pme_name = "PM_L2_ST_DISP", ++ .pme_code = 0x000001689E, ++ .pme_short_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", ++ .pme_long_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", + }, +-[ POWER9_PME_PM_DATA_FROM_RL2L3_MOD ] = { /* 406 */ +- .pme_name = "PM_DATA_FROM_RL2L3_MOD", +- .pme_code = 0x000002C046, +- 
.pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++[ POWER9_PME_PM_L2_ST_HIT ] = { ++ .pme_name = "PM_L2_ST_HIT", ++ .pme_code = 0x0000046882, ++ .pme_short_desc = "All successful D-side store dispatches for this thread that were L2 hits", ++ .pme_long_desc = "All successful D-side store dispatches for this thread that were L2 hits", + }, +-[ POWER9_PME_PM_L3_WI0_BUSY ] = { /* 407 */ +- .pme_name = "PM_L3_WI0_BUSY", +- .pme_code = 0x00000260B6, +- .pme_short_desc = "lifetime, sample of Write Inject machine 0 valid", +- .pme_long_desc = "lifetime, sample of Write Inject machine 0 valid", ++[ POWER9_PME_PM_L2_ST_HIT_ALT ] = { ++ .pme_name = "PM_L2_ST_HIT", ++ .pme_code = 0x000002689E, ++ .pme_short_desc = "All successful D-side store dispatches that were L2 hits for this thread", ++ .pme_long_desc = "All successful D-side store dispatches that were L2 hits for this thread", ++}, ++[ POWER9_PME_PM_L2_ST_MISS_128B ] = { ++ .pme_name = "PM_L2_ST_MISS_128B", ++ .pme_code = 0x0000016892, ++ .pme_short_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.", ++ .pme_long_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0)", ++}, ++[ POWER9_PME_PM_L2_ST_MISS_64B ] = { ++ .pme_name = "PM_L2_ST_MISS_64B", ++ .pme_code = 0x0000026892, ++ .pme_short_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.", ++ .pme_long_desc = "All successful D-side store 
dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.e., M=1)", ++}, ++[ POWER9_PME_PM_L2_ST_MISS ] = { ++ .pme_name = "PM_L2_ST_MISS", ++ .pme_code = 0x0000026880, ++ .pme_short_desc = "All successful D-Side Store dispatches that were an L2 miss for this thread", ++ .pme_long_desc = "All successful D-Side Store dispatches that were an L2 miss for this thread", + }, +-[ POWER9_PME_PM_LSU_SRQ_FULL_CYC ] = { /* 408 */ +- .pme_name = "PM_LSU_SRQ_FULL_CYC", +- .pme_code = 0x000001001A, +- .pme_short_desc = "Cycles in which the Store Queue is full on all 4 slices.", +- .pme_long_desc = "Cycles in which the Store Queue is full on all 4 slices. This is event is not per thread. All the threads will see the same count for this core resource", ++[ POWER9_PME_PM_L2_ST ] = { ++ .pme_name = "PM_L2_ST", ++ .pme_code = 0x0000016880, ++ .pme_short_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", ++ .pme_long_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", + }, +-[ POWER9_PME_PM_TABLEWALK_CYC ] = { /* 409 */ +- .pme_name = "PM_TABLEWALK_CYC", +- .pme_code = 0x0000010026, +- .pme_short_desc = "Cycles when a tablewalk (I or D) is active", +- .pme_long_desc = "Cycles when a tablewalk (I or D) is active", ++[ POWER9_PME_PM_L2_SYS_GUESS_CORRECT ] = { ++ .pme_name = "PM_L2_SYS_GUESS_CORRECT", ++ .pme_code = 0x0000036088, ++ .pme_short_desc = "L2 guess system (VGS or RNS) and guess was correct (ie data beyond-group)", ++ .pme_long_desc = "L2 guess system (VGS or RNS) and guess was correct (ie data beyond-group)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC ] = { /* 410 */ +- .pme_name = "PM_MRK_DATA_FROM_MEMORY_CYC", +- .pme_code = 0x000001D146, +- .pme_short_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from a memory 
location including L4 from local remote or distant due to a marked load", ++[ POWER9_PME_PM_L2_SYS_GUESS_WRONG ] = { ++ .pme_name = "PM_L2_SYS_GUESS_WRONG", ++ .pme_code = 0x0000036888, ++ .pme_short_desc = "L2 guess system (VGS or RNS) and guess was not correct (ie data ^beyond-group)", ++ .pme_long_desc = "L2 guess system (VGS or RNS) and guess was not correct (ie data ^beyond-group)", + }, +-[ POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE ] = { /* 411 */ +- .pme_name = "PM_IPTEG_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000004504A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a instruction side request", ++[ POWER9_PME_PM_L2_SYS_PUMP ] = { ++ .pme_name = "PM_L2_SYS_PUMP", ++ .pme_code = 0x000004688A, ++ .pme_short_desc = "RC requests that were system pump attempts", ++ .pme_long_desc = "RC requests that were system pump attempts", + }, +-[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS ] = { /* 412 */ +- .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3MISS", +- .pme_code = 0x000004F056, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache.", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache. 
The source could be local/remote/distant memory or another core's cache", ++[ POWER9_PME_PM_L3_CI_HIT ] = { ++ .pme_name = "PM_L3_CI_HIT", ++ .pme_code = 0x00000260A2, ++ .pme_short_desc = "L3 Castins Hit (total count)", ++ .pme_long_desc = "L3 Castins Hit (total count)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_SYS_CALL ] = { /* 413 */ +- .pme_name = "PM_CMPLU_STALL_SYS_CALL", +- .pme_code = 0x000001E05A, +- .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by a system call exception, which has to be serviced before the instruction can complete", +- .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by a system call exception, which has to be serviced before the instruction can complete", ++[ POWER9_PME_PM_L3_CI_MISS ] = { ++ .pme_name = "PM_L3_CI_MISS", ++ .pme_code = 0x00000268A2, ++ .pme_short_desc = "L3 castins miss (total count)", ++ .pme_long_desc = "L3 castins miss (total count)", + }, +-[ POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS ] = { /* 414 */ +- .pme_name = "PM_LSU_FLUSH_RELAUNCH_MISS", +- .pme_code = 0x000000C8B0, +- .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", +- .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++[ POWER9_PME_PM_L3_CINJ ] = { ++ .pme_name = "PM_L3_CINJ", ++ .pme_code = 0x00000368A4, ++ .pme_short_desc = "L3 castin of cache inject", ++ .pme_long_desc = "L3 castin of cache inject", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_1_ECO_MOD ] = { /* 415 */ +- .pme_name = "PM_DPTEG_FROM_L3_1_ECO_MOD", +- .pme_code = 0x000004E044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with 
Modified (M) data from another core's ECO L3 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_L3_CI_USAGE ] = { ++ .pme_name = "PM_L3_CI_USAGE", ++ .pme_code = 0x00000168AC, ++ .pme_short_desc = "Rotating sample of 16 CI or CO actives", ++ .pme_long_desc = "Rotating sample of 16 CI or CO actives", + }, +-[ POWER9_PME_PM_PMC5_OVERFLOW ] = { /* 416 */ +- .pme_name = "PM_PMC5_OVERFLOW", +- .pme_code = 0x0000010024, +- .pme_short_desc = "Overflow from counter 5", +- .pme_long_desc = "Overflow from counter 5", ++[ POWER9_PME_PM_L3_CO0_BUSY ] = { ++ .pme_name = "PM_L3_CO0_BUSY", ++ .pme_code = 0x00000368AC, ++ .pme_short_desc = "Lifetime, sample of CO machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of CO machine 0 valid", + }, +-[ POWER9_PME_PM_LS1_UNALIGNED_ST ] = { /* 417 */ +- .pme_name = "PM_LS1_UNALIGNED_ST", +- .pme_code = 0x000000F8B8, +- .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", +- .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. 
If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_L3_CO0_BUSY_ALT ] = { ++ .pme_name = "PM_L3_CO0_BUSY", ++ .pme_code = 0x00000468AC, ++ .pme_short_desc = "Lifetime, sample of CO machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of CO machine 0 valid", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC ] = { /* 418 */ +- .pme_name = "PM_ICT_NOSLOT_DISP_HELD_SYNC", +- .pme_code = 0x000004D01C, +- .pme_short_desc = "Dispatch held due to a synchronizing instruction at dispatch", +- .pme_long_desc = "Dispatch held due to a synchronizing instruction at dispatch", ++[ POWER9_PME_PM_L3_CO_L31 ] = { ++ .pme_name = "PM_L3_CO_L31", ++ .pme_code = 0x00000268A0, ++ .pme_short_desc = "L3 CO to L3.", ++ .pme_long_desc = "L3 CO to L3.1 OR of port 0 and 1 (lossy = may undercount if two cresps come in the same cyc)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_THRD ] = { /* 419 */ +- .pme_name = "PM_CMPLU_STALL_THRD", +- .pme_code = 0x000001001C, +- .pme_short_desc = "Completion Stalled because the thread was blocked", +- .pme_long_desc = "Completion Stalled because the thread was blocked", ++[ POWER9_PME_PM_L3_CO_LCO ] = { ++ .pme_name = "PM_L3_CO_LCO", ++ .pme_code = 0x00000360A4, ++ .pme_short_desc = "Total L3 COs occurred on LCO L3.", ++ .pme_long_desc = "Total L3 COs occurred on LCO L3.1 (good cresp, may end up in mem on a retry)", + }, +-[ POWER9_PME_PM_PMC3_SAVED ] = { /* 420 */ +- .pme_name = "PM_PMC3_SAVED", +- .pme_code = 0x000004D012, +- .pme_short_desc = "PMC3 Rewind Value saved", +- .pme_long_desc = "PMC3 Rewind Value saved", ++[ POWER9_PME_PM_L3_CO_MEM ] = { ++ .pme_name = "PM_L3_CO_MEM", ++ .pme_code = 0x00000260A0, ++ .pme_short_desc = "L3 CO to memory OR of port 0 and 1 (lossy = may undercount if two cresp come in the same cyc)", ++ .pme_long_desc = "L3 CO to memory OR of port 0 and 1 (lossy = may undercount if two cresp come in the same cyc)", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS ] = { /* 421 */ +- .pme_name = 
"PM_MRK_DERAT_MISS", +- .pme_code = 0x00000301E6, +- .pme_short_desc = "Erat Miss (TLB Access) All page sizes", +- .pme_long_desc = "Erat Miss (TLB Access) All page sizes", ++[ POWER9_PME_PM_L3_CO_MEPF ] = { ++ .pme_name = "PM_L3_CO_MEPF", ++ .pme_code = 0x000003E05E, ++ .pme_short_desc = "L3 castouts in Mepf state for this thread", ++ .pme_long_desc = "L3 castouts in Mepf state for this thread", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_HIT ] = { /* 422 */ +- .pme_name = "PM_RADIX_PWC_L3_HIT", +- .pme_code = 0x000003F056, +- .pme_short_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", +- .pme_long_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", ++[ POWER9_PME_PM_L3_CO_MEPF_ALT ] = { ++ .pme_name = "PM_L3_CO_MEPF", ++ .pme_code = 0x00000168A0, ++ .pme_short_desc = "L3 CO of line in Mep state (includes casthrough to memory).", ++ .pme_long_desc = "L3 CO of line in Mep state (includes casthrough to memory). The Mepf state indicates that a line was brought in to satisfy an L3 prefetch request", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS ] = { /* 423 */ +- .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3MISS", +- .pme_code = 0x000004F05C, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation. 
The source could be local/remote/distant memory or another core's cache", ++[ POWER9_PME_PM_L3_CO ] = { ++ .pme_name = "PM_L3_CO", ++ .pme_code = 0x00000360A8, ++ .pme_short_desc = "L3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", ++ .pme_long_desc = "L3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", + }, +-[ POWER9_PME_PM_RUN_CYC_SMT4_MODE ] = { /* 424 */ +- .pme_name = "PM_RUN_CYC_SMT4_MODE", +- .pme_code = 0x000002006C, +- .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", +- .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", ++[ POWER9_PME_PM_L3_GRP_GUESS_CORRECT ] = { ++ .pme_name = "PM_L3_GRP_GUESS_CORRECT", ++ .pme_code = 0x00000168B2, ++ .pme_short_desc = "Initial scope=group (GS or NNS) and data from same group (near) (pred successful)", ++ .pme_long_desc = "Initial scope=group (GS or NNS) and data from same group (near) (pred successful)", + }, +-[ POWER9_PME_PM_DATA_FROM_RMEM ] = { /* 425 */ +- .pme_name = "PM_DATA_FROM_RMEM", +- .pme_code = 0x000003C04A, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a demand load", ++[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH ] = { ++ .pme_name = "PM_L3_GRP_GUESS_WRONG_HIGH", ++ .pme_code = 0x00000368B2, ++ .pme_short_desc = "Initial scope=group (GS or NNS) but data from local node.", ++ .pme_long_desc = "Initial scope=group (GS or NNS) but data from local node. 
Prediction too high", + }, +-[ POWER9_PME_PM_BR_MPRED_LSTACK ] = { /* 426 */ +- .pme_name = "PM_BR_MPRED_LSTACK", +- .pme_code = 0x00000048AC, +- .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", +- .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", ++[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW ] = { ++ .pme_name = "PM_L3_GRP_GUESS_WRONG_LOW", ++ .pme_code = 0x00000360B2, ++ .pme_short_desc = "Initial scope=group (GS or NNS) but data from outside group (far or rem).", ++ .pme_long_desc = "Initial scope=group (GS or NNS) but data from outside group (far or rem). Prediction too Low", + }, +-[ POWER9_PME_PM_PROBE_NOP_DISP ] = { /* 427 */ +- .pme_name = "PM_PROBE_NOP_DISP", +- .pme_code = 0x0000040014, +- .pme_short_desc = "ProbeNops dispatched", +- .pme_long_desc = "ProbeNops dispatched", ++[ POWER9_PME_PM_L3_HIT ] = { ++ .pme_name = "PM_L3_HIT", ++ .pme_code = 0x00000160A4, ++ .pme_short_desc = "L3 Hits (L2 miss hitting L3, including data/instrn/xlate)", ++ .pme_long_desc = "L3 Hits (L2 miss hitting L3, including data/instrn/xlate)", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_MEPF ] = { /* 428 */ +- .pme_name = "PM_DPTEG_FROM_L3_MEPF", +- .pme_code = 0x000002E042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_L3_L2_CO_HIT ] = { ++ .pme_name = "PM_L3_L2_CO_HIT", ++ .pme_code = 0x00000360A2, ++ .pme_short_desc = "L2 CO hits", ++ .pme_long_desc = "L2 CO hits", + }, +-[ POWER9_PME_PM_INST_FROM_L3MISS_MOD ] = { /* 429 */ +- .pme_name = "PM_INST_FROM_L3MISS_MOD", +- .pme_code = 0x000004404E, +- .pme_short_desc = "The processor's Instruction cache was reloaded from a localtion other than the local core's L3 due to a instruction fetch", +- .pme_long_desc = "The processor's Instruction cache was reloaded from a localtion other than the local core's L3 due to a instruction fetch", ++[ POWER9_PME_PM_L3_L2_CO_MISS ] = { ++ .pme_name = "PM_L3_L2_CO_MISS", ++ .pme_code = 0x00000368A2, ++ .pme_short_desc = "L2 CO miss", ++ .pme_long_desc = "L2 CO miss", + }, +-[ POWER9_PME_PM_DUMMY1_REMOVE_ME ] = { /* 430 */ +- .pme_name = "PM_DUMMY1_REMOVE_ME", +- .pme_code = 0x0000040062, +- .pme_short_desc = "Space holder for l2_pc_pm_mk_ldst_scope_pred_status", +- .pme_long_desc = "Space holder for l2_pc_pm_mk_ldst_scope_pred_status", ++[ POWER9_PME_PM_L3_LAT_CI_HIT ] = { ++ .pme_name = "PM_L3_LAT_CI_HIT", ++ .pme_code = 0x00000460A2, ++ .pme_short_desc = "L3 Lateral Castins Hit", ++ .pme_long_desc = "L3 Lateral Castins Hit", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL4 ] = { /* 431 */ +- .pme_name = "PM_MRK_DATA_FROM_DL4", +- .pme_code = 0x000001D152, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++[ POWER9_PME_PM_L3_LAT_CI_MISS ] = { ++ .pme_name = "PM_L3_LAT_CI_MISS", ++ .pme_code = 0x00000468A2, ++ .pme_short_desc = "L3 Lateral Castins Miss", ++ .pme_long_desc = "L3 Lateral Castins Miss", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { /* 432 */ +- .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", +- 
.pme_code = 0x000002D14A, +- .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++[ POWER9_PME_PM_L3_LD_HIT ] = { ++ .pme_name = "PM_L3_LD_HIT", ++ .pme_code = 0x00000260A4, ++ .pme_short_desc = "L3 Hits for demand LDs", ++ .pme_long_desc = "L3 Hits for demand LDs", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_1_SHR ] = { /* 433 */ +- .pme_name = "PM_IPTEG_FROM_L3_1_SHR", +- .pme_code = 0x0000015046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_L3_LD_MISS ] = { ++ .pme_name = "PM_L3_LD_MISS", ++ .pme_code = 0x00000268A4, ++ .pme_short_desc = "L3 Misses for demand LDs", ++ .pme_long_desc = "L3 Misses for demand LDs", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR ] = { /* 434 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_SHR", +- .pme_code = 0x000002D14C, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_L3_LD_PREF ] = { ++ .pme_name = "PM_L3_LD_PREF", ++ .pme_code = 0x000000F0B0, ++ .pme_short_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", ++ .pme_long_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", + }, +-[ POWER9_PME_PM_DTLB_MISS_2M ] = { 
/* 435 */ +- .pme_name = "PM_DTLB_MISS_2M", +- .pme_code = 0x000001C05C, +- .pme_short_desc = "Data TLB reload (after a miss) page size 2M.", +- .pme_long_desc = "Data TLB reload (after a miss) page size 2M. Implies radix translation was used", ++[ POWER9_PME_PM_L3_LOC_GUESS_CORRECT ] = { ++ .pme_name = "PM_L3_LOC_GUESS_CORRECT", ++ .pme_code = 0x00000160B2, ++ .pme_short_desc = "initial scope=node/chip (LNS) and data from local node (local) (pred successful) - always PFs only", ++ .pme_long_desc = "initial scope=node/chip (LNS) and data from local node (local) (pred successful) - always PFs only", + }, +-[ POWER9_PME_PM_TM_RST_SC ] = { /* 436 */ +- .pme_name = "PM_TM_RST_SC", +- .pme_code = 0x00000268A6, +- .pme_short_desc = "tm snp rst tm sc", +- .pme_long_desc = "tm snp rst tm sc", ++[ POWER9_PME_PM_L3_LOC_GUESS_WRONG ] = { ++ .pme_name = "PM_L3_LOC_GUESS_WRONG", ++ .pme_code = 0x00000268B2, ++ .pme_short_desc = "Initial scope=node (LNS) but data from out side local node (near or far or rem).", ++ .pme_long_desc = "Initial scope=node (LNS) but data from out side local node (near or far or rem). Prediction too Low", + }, +-[ POWER9_PME_PM_LSU_NCST ] = { /* 437 */ +- .pme_name = "PM_LSU_NCST", +- .pme_code = 0x000000C890, +- .pme_short_desc = "Asserts when a i=1 store op is sent to the nest.", +- .pme_long_desc = "Asserts when a i=1 store op is sent to the nest. No record of issue pipe (LS0/LS1) is maintained so this is for both pipes. 
Probably don't need separate LS0 and LS1", ++[ POWER9_PME_PM_L3_MISS ] = { ++ .pme_name = "PM_L3_MISS", ++ .pme_code = 0x00000168A4, ++ .pme_short_desc = "L3 Misses (L2 miss also missing L3, including data/instrn/xlate)", ++ .pme_long_desc = "L3 Misses (L2 miss also missing L3, including data/instrn/xlate)", + }, +-[ POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY ] = { /* 438 */ +- .pme_name = "PM_DATA_SYS_PUMP_MPRED_RTY", +- .pme_code = 0x000004C050, +- .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", +- .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", ++[ POWER9_PME_PM_L3_P0_CO_L31 ] = { ++ .pme_name = "PM_L3_P0_CO_L31", ++ .pme_code = 0x00000460AA, ++ .pme_short_desc = "L3 CO to L3.", ++ .pme_long_desc = "L3 CO to L3.1 (LCO) port 0 with or without data", + }, +-[ POWER9_PME_PM_THRESH_ACC ] = { /* 439 */ +- .pme_name = "PM_THRESH_ACC", +- .pme_code = 0x0000024154, +- .pme_short_desc = "This event increments every time the threshold event counter ticks.", +- .pme_long_desc = "This event increments every time the threshold event counter ticks. Thresholding must be enabled (via MMCRA) and the thresholding start event must occur for this counter to increment. 
It will stop incrementing when the thresholding stop event occurs or when thresholding is disabled, until the next time a configured thresholding start event occurs.", ++[ POWER9_PME_PM_L3_P0_CO_MEM ] = { ++ .pme_name = "PM_L3_P0_CO_MEM", ++ .pme_code = 0x00000360AA, ++ .pme_short_desc = "L3 CO to memory port 0 with or without data", ++ .pme_long_desc = "L3 CO to memory port 0 with or without data", + }, +-[ POWER9_PME_PM_ISU3_ISS_HOLD_ALL ] = { /* 440 */ +- .pme_name = "PM_ISU3_ISS_HOLD_ALL", +- .pme_code = 0x0000003884, +- .pme_short_desc = "All ISU rejects", +- .pme_long_desc = "All ISU rejects", ++[ POWER9_PME_PM_L3_P0_CO_RTY ] = { ++ .pme_name = "PM_L3_P0_CO_RTY", ++ .pme_code = 0x00000360AE, ++ .pme_short_desc = "L3 CO received retry port 0 (memory only), every retry counted", ++ .pme_long_desc = "L3 CO received retry port 0 (memory only), every retry counted", + }, +-[ POWER9_PME_PM_LSU0_L1_CAM_CANCEL ] = { /* 441 */ +- .pme_name = "PM_LSU0_L1_CAM_CANCEL", +- .pme_code = 0x000000F090, +- .pme_short_desc = "ls0 l1 tm cam cancel", +- .pme_long_desc = "ls0 l1 tm cam cancel", ++[ POWER9_PME_PM_L3_P0_CO_RTY_ALT ] = { ++ .pme_name = "PM_L3_P0_CO_RTY", ++ .pme_code = 0x00000460AE, ++ .pme_short_desc = "L3 CO received retry port 2 (memory only), every retry counted", ++ .pme_long_desc = "L3 CO received retry port 2 (memory only), every retry counted", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC ] = { /* 442 */ +- .pme_name = "PM_MRK_FAB_RSP_BKILL_CYC", +- .pme_code = 0x000001F152, +- .pme_short_desc = "cycles L2 RC took for a bkill", +- .pme_long_desc = "cycles L2 RC took for a bkill", ++[ POWER9_PME_PM_L3_P0_GRP_PUMP ] = { ++ .pme_name = "PM_L3_P0_GRP_PUMP", ++ .pme_code = 0x00000260B0, ++ .pme_short_desc = "L3 PF sent with grp scope port 0, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with grp scope port 0, counts even retried requests", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF ] = { /* 443 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_MEPF", +- 
.pme_code = 0x000002F142, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_L3_P0_LCO_DATA ] = { ++ .pme_name = "PM_L3_P0_LCO_DATA", ++ .pme_code = 0x00000260AA, ++ .pme_short_desc = "LCO sent with data port 0", ++ .pme_long_desc = "LCO sent with data port 0", + }, +-[ POWER9_PME_PM_DARQ_STORE_REJECT ] = { /* 444 */ +- .pme_name = "PM_DARQ_STORE_REJECT", +- .pme_code = 0x000004405E, +- .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but It was rejected.", +- .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but It was rejected. Divide by pm_darq_store_xmit to get reject ratio", ++[ POWER9_PME_PM_L3_P0_LCO_NO_DATA ] = { ++ .pme_name = "PM_L3_P0_LCO_NO_DATA", ++ .pme_code = 0x00000160AA, ++ .pme_short_desc = "Dataless L3 LCO sent port 0", ++ .pme_long_desc = "Dataless L3 LCO sent port 0", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT ] = { /* 445 */ +- .pme_name = "PM_DPTEG_FROM_L3_NO_CONFLICT", +- .pme_code = 0x000001E044, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_L3_P0_LCO_RTY ] = { ++ .pme_name = "PM_L3_P0_LCO_RTY", ++ .pme_code = 0x00000160B4, ++ .pme_short_desc = "L3 initiated LCO received retry on port 0 (can try 4 times)", ++ .pme_long_desc = "L3 initiated LCO received retry on port 0 (can try 4 times)", ++}, ++[ POWER9_PME_PM_L3_P0_NODE_PUMP ] = { ++ .pme_name = "PM_L3_P0_NODE_PUMP", ++ .pme_code = 0x00000160B0, ++ .pme_short_desc = "L3 PF sent with nodal scope port 0, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with nodal scope port 0, counts even retried requests", + }, +-[ POWER9_PME_PM_TM_TX_PASS_RUN_CYC ] = { /* 446 */ +- .pme_name = "PM_TM_TX_PASS_RUN_CYC", +- .pme_code = 0x000002E012, +- .pme_short_desc = "cycles spent in successful transactions", +- .pme_long_desc = "cycles spent in successful transactions", ++[ POWER9_PME_PM_L3_P0_PF_RTY ] = { ++ .pme_name = "PM_L3_P0_PF_RTY", ++ .pme_code = 0x00000160AE, ++ .pme_short_desc = "L3 PF received retry port 0, every retry counted", ++ .pme_long_desc = "L3 PF received retry port 0, every retry counted", + }, +-[ POWER9_PME_PM_DTLB_MISS_4K ] = { /* 447 */ +- .pme_name = "PM_DTLB_MISS_4K", +- .pme_code = 0x000002C056, +- .pme_short_desc = "Data TLB Miss page size 4k", +- .pme_long_desc = "Data TLB Miss page size 4k", ++[ POWER9_PME_PM_L3_P0_PF_RTY_ALT ] = { ++ .pme_name = "PM_L3_P0_PF_RTY", ++ .pme_code = 0x00000260AE, ++ .pme_short_desc = "L3 PF received retry port 2, every retry counted", ++ .pme_long_desc = "L3 PF received retry port 2, every retry counted", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC ] = { /* 448 */ +- .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC", +- .pme_code = 0x000003515A, +- .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked 
load", ++[ POWER9_PME_PM_L3_P0_SYS_PUMP ] = { ++ .pme_name = "PM_L3_P0_SYS_PUMP", ++ .pme_code = 0x00000360B0, ++ .pme_short_desc = "L3 PF sent with sys scope port 0, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with sys scope port 0, counts even retried requests", + }, +-[ POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC ] = { /* 449 */ +- .pme_name = "PM_LS0_PTE_TABLEWALK_CYC", +- .pme_code = 0x000000E0BC, +- .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 0", +- .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 0", ++[ POWER9_PME_PM_L3_P1_CO_L31 ] = { ++ .pme_name = "PM_L3_P1_CO_L31", ++ .pme_code = 0x00000468AA, ++ .pme_short_desc = "L3 CO to L3.", ++ .pme_long_desc = "L3 CO to L3.1 (LCO) port 1 with or without data", + }, +-[ POWER9_PME_PM_PMC4_SAVED ] = { /* 450 */ +- .pme_name = "PM_PMC4_SAVED", +- .pme_code = 0x0000030022, +- .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", +- .pme_long_desc = "PMC4 Rewind Value saved (matched condition)", ++[ POWER9_PME_PM_L3_P1_CO_MEM ] = { ++ .pme_name = "PM_L3_P1_CO_MEM", ++ .pme_code = 0x00000368AA, ++ .pme_short_desc = "L3 CO to memory port 1 with or without data", ++ .pme_long_desc = "L3 CO to memory port 1 with or without data", + }, +-[ POWER9_PME_PM_SNP_TM_HIT_T ] = { /* 451 */ +- .pme_name = "PM_SNP_TM_HIT_T", +- .pme_code = 0x00000368A6, +- .pme_short_desc = "snp tm_st_hit t tn te", +- .pme_long_desc = "snp tm_st_hit t tn te", ++[ POWER9_PME_PM_L3_P1_CO_RTY ] = { ++ .pme_name = "PM_L3_P1_CO_RTY", ++ .pme_code = 0x00000368AE, ++ .pme_short_desc = "L3 CO received retry port 1 (memory only), every retry counted", ++ .pme_long_desc = "L3 CO received retry port 1 (memory only), every retry counted", + }, +-[ POWER9_PME_PM_MRK_BR_2PATH ] = { /* 452 */ +- .pme_name = "PM_MRK_BR_2PATH", +- .pme_code = 0x0000010138, +- .pme_short_desc = "marked two path branch", +- .pme_long_desc = "marked two path branch", ++[ POWER9_PME_PM_L3_P1_CO_RTY_ALT ] 
= { ++ .pme_name = "PM_L3_P1_CO_RTY", ++ .pme_code = 0x00000468AE, ++ .pme_short_desc = "L3 CO received retry port 3 (memory only), every retry counted", ++ .pme_long_desc = "L3 CO received retry port 3 (memory only), every retry counted", + }, +-[ POWER9_PME_PM_LSU_FLUSH_CI ] = { /* 453 */ +- .pme_name = "PM_LSU_FLUSH_CI", +- .pme_code = 0x000000C0A8, +- .pme_short_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", +- .pme_long_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", ++[ POWER9_PME_PM_L3_P1_GRP_PUMP ] = { ++ .pme_name = "PM_L3_P1_GRP_PUMP", ++ .pme_code = 0x00000268B0, ++ .pme_short_desc = "L3 PF sent with grp scope port 1, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with grp scope port 1, counts even retried requests", + }, +-[ POWER9_PME_PM_FLUSH_MPRED ] = { /* 454 */ +- .pme_name = "PM_FLUSH_MPRED", +- .pme_code = 0x00000050A4, +- .pme_short_desc = "Branch mispredict flushes.", +- .pme_long_desc = "Branch mispredict flushes. 
Includes target and address misprecition", ++[ POWER9_PME_PM_L3_P1_LCO_DATA ] = { ++ .pme_name = "PM_L3_P1_LCO_DATA", ++ .pme_code = 0x00000268AA, ++ .pme_short_desc = "LCO sent with data port 1", ++ .pme_long_desc = "LCO sent with data port 1", + }, +-[ POWER9_PME_PM_CMPLU_STALL_ST_FWD ] = { /* 455 */ +- .pme_name = "PM_CMPLU_STALL_ST_FWD", +- .pme_code = 0x000004C01C, +- .pme_short_desc = "Completion stall due to store forward", +- .pme_long_desc = "Completion stall due to store forward", ++[ POWER9_PME_PM_L3_P1_LCO_NO_DATA ] = { ++ .pme_name = "PM_L3_P1_LCO_NO_DATA", ++ .pme_code = 0x00000168AA, ++ .pme_short_desc = "Dataless L3 LCO sent port 1", ++ .pme_long_desc = "Dataless L3 LCO sent port 1", + }, +-[ POWER9_PME_PM_DTLB_MISS ] = { /* 456 */ +- .pme_name = "PM_DTLB_MISS", +- .pme_code = 0x00000300FC, +- .pme_short_desc = "Data PTEG reload", +- .pme_long_desc = "Data PTEG reload", ++[ POWER9_PME_PM_L3_P1_LCO_RTY ] = { ++ .pme_name = "PM_L3_P1_LCO_RTY", ++ .pme_code = 0x00000168B4, ++ .pme_short_desc = "L3 initiated LCO received retry on port 1 (can try 4 times)", ++ .pme_long_desc = "L3 initiated LCO received retry on port 1 (can try 4 times)", + }, +-[ POWER9_PME_PM_MRK_L2_TM_REQ_ABORT ] = { /* 457 */ +- .pme_name = "PM_MRK_L2_TM_REQ_ABORT", +- .pme_code = 0x000001E15E, +- .pme_short_desc = "TM abort", +- .pme_long_desc = "TM abort", ++[ POWER9_PME_PM_L3_P1_NODE_PUMP ] = { ++ .pme_name = "PM_L3_P1_NODE_PUMP", ++ .pme_code = 0x00000168B0, ++ .pme_short_desc = "L3 PF sent with nodal scope port 1, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with nodal scope port 1, counts even retried requests", + }, +-[ POWER9_PME_PM_TM_NESTED_TEND ] = { /* 458 */ +- .pme_name = "PM_TM_NESTED_TEND", +- .pme_code = 0x0000002098, +- .pme_short_desc = "Completion time nested tend", +- .pme_long_desc = "Completion time nested tend", ++[ POWER9_PME_PM_L3_P1_PF_RTY ] = { ++ .pme_name = "PM_L3_P1_PF_RTY", ++ .pme_code = 0x00000168AE, ++ .pme_short_desc = "L3 PF 
received retry port 1, every retry counted", ++ .pme_long_desc = "L3 PF received retry port 1, every retry counted", + }, +-[ POWER9_PME_PM_CMPLU_STALL_PM ] = { /* 459 */ +- .pme_name = "PM_CMPLU_STALL_PM", +- .pme_code = 0x000003000A, +- .pme_short_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish. Includes permute and decimal fixpoint instructions (128 bit BCD arithmetic) + a few 128 bit fixpoint add/subtract instructions with carry. Not qualified by vector or multicycle", ++[ POWER9_PME_PM_L3_P1_PF_RTY_ALT ] = { ++ .pme_name = "PM_L3_P1_PF_RTY", ++ .pme_code = 0x00000268AE, ++ .pme_short_desc = "L3 PF received retry port 3, every retry counted", ++ .pme_long_desc = "L3 PF received retry port 3, every retry counted", + }, +-[ POWER9_PME_PM_CMPLU_STALL_ISYNC ] = { /* 460 */ +- .pme_name = "PM_CMPLU_STALL_ISYNC", +- .pme_code = 0x000003002E, +- .pme_short_desc = "Completion stall because the ISU is checking the scoreboard for whether the isync instruction requires a flush or not", +- .pme_long_desc = "Completion stall because the ISU is checking the scoreboard for whether the isync instruction requires a flush or not", ++[ POWER9_PME_PM_L3_P1_SYS_PUMP ] = { ++ .pme_name = "PM_L3_P1_SYS_PUMP", ++ .pme_code = 0x00000368B0, ++ .pme_short_desc = "L3 PF sent with sys scope port 1, counts even retried requests", ++ .pme_long_desc = "L3 PF sent with sys scope port 1, counts even retried requests", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS_1G ] = { /* 461 */ +- .pme_name = "PM_MRK_DTLB_MISS_1G", +- .pme_code = 0x000001D15C, +- .pme_short_desc = "Marked Data TLB reload (after a miss) page size 2M.", +- .pme_long_desc = "Marked Data TLB reload (after a miss) page size 2M. 
Implies radix translation was used", ++[ POWER9_PME_PM_L3_P2_LCO_RTY ] = { ++ .pme_name = "PM_L3_P2_LCO_RTY", ++ .pme_code = 0x00000260B4, ++ .pme_short_desc = "L3 initiated LCO received retry on port 2 (can try 4 times)", ++ .pme_long_desc = "L3 initiated LCO received retry on port 2 (can try 4 times)", + }, +-[ POWER9_PME_PM_L3_SYS_GUESS_CORRECT ] = { /* 462 */ +- .pme_name = "PM_L3_SYS_GUESS_CORRECT", +- .pme_code = 0x00000260B2, +- .pme_short_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", +- .pme_long_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", ++[ POWER9_PME_PM_L3_P3_LCO_RTY ] = { ++ .pme_name = "PM_L3_P3_LCO_RTY", ++ .pme_code = 0x00000268B4, ++ .pme_short_desc = "L3 initiated LCO received retry on port 3 (can try 4 times)", ++ .pme_long_desc = "L3 initiated LCO received retry on port 3 (can try 4 times)", + }, +-[ POWER9_PME_PM_L2_CASTOUT_SHR ] = { /* 463 */ +- .pme_name = "PM_L2_CASTOUT_SHR", +- .pme_code = 0x0000016882, +- .pme_short_desc = "L2 Castouts - Shared (T, Te, Si, S)", +- .pme_long_desc = "L2 Castouts - Shared (T, Te, Si, S)", ++[ POWER9_PME_PM_L3_PF0_BUSY ] = { ++ .pme_name = "PM_L3_PF0_BUSY", ++ .pme_code = 0x00000360B4, ++ .pme_short_desc = "Lifetime, sample of PF machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of PF machine 0 valid", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 ] = { /* 464 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_L2L3", +- .pme_code = 0x000001003C, +- .pme_short_desc = "Completion stall by Dcache miss which resolved in L2/L3", +- .pme_long_desc = "Completion stall by Dcache miss which resolved in L2/L3", ++[ POWER9_PME_PM_L3_PF0_BUSY_ALT ] = { ++ .pme_name = "PM_L3_PF0_BUSY", ++ .pme_code = 0x00000460B4, ++ .pme_short_desc = "Lifetime, sample of PF machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of PF machine 0 valid", + }, +-[ POWER9_PME_PM_LS2_UNALIGNED_ST ] = { /* 465 */ +- .pme_name = "PM_LS2_UNALIGNED_ST", +- .pme_code 
= 0x000000F0BC, +- .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", +- .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_L3_PF_HIT_L3 ] = { ++ .pme_name = "PM_L3_PF_HIT_L3", ++ .pme_code = 0x00000260A8, ++ .pme_short_desc = "L3 PF hit in L3 (abandoned)", ++ .pme_long_desc = "L3 PF hit in L3 (abandoned)", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS ] = { /* 466 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2MISS", +- .pme_code = 0x000001F14E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_L3_PF_MISS_L3 ] = { ++ .pme_name = "PM_L3_PF_MISS_L3", ++ .pme_code = 0x00000160A0, ++ .pme_short_desc = "L3 PF missed in L3", ++ .pme_long_desc = "L3 PF missed in L3", + }, +-[ POWER9_PME_PM_THRESH_EXC_32 ] = { /* 467 */ +- .pme_name = "PM_THRESH_EXC_32", +- .pme_code = 0x00000201E6, +- .pme_short_desc = "Threshold counter exceeded a value of 32", +- .pme_long_desc = "Threshold counter exceeded a value of 32", ++[ POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_L3_PF_OFF_CHIP_CACHE", ++ .pme_code = 0x00000368A0, ++ .pme_short_desc = "L3 PF from Off chip cache", ++ .pme_long_desc = "L3 PF from Off chip cache", + }, +-[ POWER9_PME_PM_TM_TSUSPEND ] = { /* 468 */ +- .pme_name = "PM_TM_TSUSPEND", +- .pme_code = 0x00000028A0, +- .pme_short_desc = "TM suspend instruction completed", +- .pme_long_desc = "TM suspend instruction completed", ++[ POWER9_PME_PM_L3_PF_OFF_CHIP_MEM ] = { ++ .pme_name = "PM_L3_PF_OFF_CHIP_MEM", ++ .pme_code = 0x00000468A0, ++ .pme_short_desc = "L3 PF from Off chip memory", ++ .pme_long_desc = "L3 PF from Off chip memory", + }, +-[ POWER9_PME_PM_DATA_FROM_DL2L3_SHR ] = { /* 469 */ +- .pme_name = "PM_DATA_FROM_DL2L3_SHR", +- .pme_code = 0x000003C048, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", ++[ POWER9_PME_PM_L3_PF_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_L3_PF_ON_CHIP_CACHE", ++ .pme_code = 0x00000360A0, ++ .pme_short_desc = "L3 PF from On chip cache", ++ .pme_long_desc = "L3 PF from On chip cache", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT ] = { /* 470 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x000001D144, +- 
.pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", ++[ POWER9_PME_PM_L3_PF_ON_CHIP_MEM ] = { ++ .pme_name = "PM_L3_PF_ON_CHIP_MEM", ++ .pme_code = 0x00000460A0, ++ .pme_short_desc = "L3 PF from On chip memory", ++ .pme_long_desc = "L3 PF from On chip memory", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC ] = { /* 471 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_SHR_CYC", +- .pme_code = 0x000001D142, +- .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_L3_PF_USAGE ] = { ++ .pme_name = "PM_L3_PF_USAGE", ++ .pme_code = 0x00000260AC, ++ .pme_short_desc = "Rotating sample of 32 PF actives", ++ .pme_long_desc = "Rotating sample of 32 PF actives", + }, +-[ POWER9_PME_PM_THRESH_EXC_1024 ] = { /* 472 */ +- .pme_name = "PM_THRESH_EXC_1024", +- .pme_code = 0x00000301EA, +- .pme_short_desc = "Threshold counter exceeded a value of 1024", +- .pme_long_desc = "Threshold counter exceeded a value of 1024", ++[ POWER9_PME_PM_L3_RD0_BUSY ] = { ++ .pme_name = "PM_L3_RD0_BUSY", ++ .pme_code = 0x00000368B4, ++ .pme_short_desc = "Lifetime, sample of RD machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of RD machine 0 valid", + }, +-[ POWER9_PME_PM_ST_FIN ] = { /* 473 */ +- .pme_name = "PM_ST_FIN", +- .pme_code = 0x0000020016, +- .pme_short_desc = "Store finish count.", +- .pme_long_desc = "Store finish count. 
Includes speculative activity", ++[ POWER9_PME_PM_L3_RD0_BUSY_ALT ] = { ++ .pme_name = "PM_L3_RD0_BUSY", ++ .pme_code = 0x00000468B4, ++ .pme_short_desc = "Lifetime, sample of RD machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of RD machine 0 valid", + }, +-[ POWER9_PME_PM_TM_LD_CAUSED_FAIL ] = { /* 474 */ +- .pme_name = "PM_TM_LD_CAUSED_FAIL", +- .pme_code = 0x000001688C, +- .pme_short_desc = "Non TM Ld caused any thread to fail", +- .pme_long_desc = "Non TM Ld caused any thread to fail", ++[ POWER9_PME_PM_L3_RD_USAGE ] = { ++ .pme_name = "PM_L3_RD_USAGE", ++ .pme_code = 0x00000268AC, ++ .pme_short_desc = "Rotating sample of 16 RD actives", ++ .pme_long_desc = "Rotating sample of 16 RD actives", + }, +-[ POWER9_PME_PM_SRQ_SYNC_CYC ] = { /* 475 */ +- .pme_name = "PM_SRQ_SYNC_CYC", +- .pme_code = 0x000000D0AC, +- .pme_short_desc = "A sync is in the S2Q (edge detect to count)", +- .pme_long_desc = "A sync is in the S2Q (edge detect to count)", ++[ POWER9_PME_PM_L3_SN0_BUSY ] = { ++ .pme_name = "PM_L3_SN0_BUSY", ++ .pme_code = 0x00000360AC, ++ .pme_short_desc = "Lifetime, sample of snooper machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", + }, +-[ POWER9_PME_PM_IFETCH_THROTTLE ] = { /* 476 */ +- .pme_name = "PM_IFETCH_THROTTLE", +- .pme_code = 0x000003405E, +- .pme_short_desc = "Cycles in which Instruction fetch throttle was active.", +- .pme_long_desc = "Cycles in which Instruction fetch throttle was active.", ++[ POWER9_PME_PM_L3_SN0_BUSY_ALT ] = { ++ .pme_name = "PM_L3_SN0_BUSY", ++ .pme_code = 0x00000460AC, ++ .pme_short_desc = "Lifetime, sample of snooper machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", + }, +-[ POWER9_PME_PM_L3_SW_PREF ] = { /* 477 */ ++[ POWER9_PME_PM_L3_SN_USAGE ] = { ++ .pme_name = "PM_L3_SN_USAGE", ++ .pme_code = 0x00000160AC, ++ .pme_short_desc = "Rotating sample of 16 snoop valids", ++ .pme_long_desc = "Rotating sample of 16 snoop valids", ++}, ++[ 
POWER9_PME_PM_L3_SW_PREF ] = { + .pme_name = "PM_L3_SW_PREF", + .pme_code = 0x000000F8B0, + .pme_short_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", + .pme_long_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", + }, +-[ POWER9_PME_PM_LSU0_LDMX_FIN ] = { /* 478 */ +- .pme_name = "PM_LSU0_LDMX_FIN", +- .pme_code = 0x000000D088, +- .pme_short_desc = " New P9 instruction LDMX.", +- .pme_long_desc = " New P9 instruction LDMX.", ++[ POWER9_PME_PM_L3_SYS_GUESS_CORRECT ] = { ++ .pme_name = "PM_L3_SYS_GUESS_CORRECT", ++ .pme_code = 0x00000260B2, ++ .pme_short_desc = "Initial scope=system (VGS or RNS) and data from outside group (far or rem)(pred successful)", ++ .pme_long_desc = "Initial scope=system (VGS or RNS) and data from outside group (far or rem)(pred successful)", + }, +-[ POWER9_PME_PM_L2_LOC_GUESS_WRONG ] = { /* 479 */ +- .pme_name = "PM_L2_LOC_GUESS_WRONG", +- .pme_code = 0x0000016888, +- .pme_short_desc = "L2 guess loc and guess was not correct (ie data not on chip)", +- .pme_long_desc = "L2 guess loc and guess was not correct (ie data not on chip)", ++[ POWER9_PME_PM_L3_SYS_GUESS_WRONG ] = { ++ .pme_name = "PM_L3_SYS_GUESS_WRONG", ++ .pme_code = 0x00000460B2, ++ .pme_short_desc = "Initial scope=system (VGS or RNS) but data from local or near.", ++ .pme_long_desc = "Initial scope=system (VGS or RNS) but data from local or near. 
Prediction too high", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC ] = { /* 480 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC", +- .pme_code = 0x0000014158, +- .pme_short_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", ++[ POWER9_PME_PM_L3_TRANS_PF ] = { ++ .pme_name = "PM_L3_TRANS_PF", ++ .pme_code = 0x00000468A4, ++ .pme_short_desc = "L3 Transient prefetch received from L2", ++ .pme_long_desc = "L3 Transient prefetch received from L2", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE ] = { /* 481 */ +- .pme_name = "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE", +- .pme_code = 0x000001F148, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_L3_WI0_BUSY ] = { ++ .pme_name = "PM_L3_WI0_BUSY", ++ .pme_code = 0x00000160B6, ++ .pme_short_desc = "Rotating sample of 8 WI valid", ++ .pme_long_desc = "Rotating sample of 8 WI valid", + }, +-[ POWER9_PME_PM_L3_P1_CO_RTY ] = { /* 482 */ +- .pme_name = "PM_L3_P1_CO_RTY", +- .pme_code = 0x00000468AE, +- .pme_short_desc = "L3 CO received retry port 3", +- .pme_long_desc = "L3 CO received retry port 3", ++[ POWER9_PME_PM_L3_WI0_BUSY_ALT ] = { ++ .pme_name = "PM_L3_WI0_BUSY", ++ .pme_code = 0x00000260B6, ++ .pme_short_desc = "Rotating sample of 8 WI valid (duplicate)", ++ .pme_long_desc = "Rotating sample of 8 WI valid (duplicate)", + }, +-[ POWER9_PME_PM_MRK_STCX_FAIL ] = { /* 483 */ +- .pme_name = "PM_MRK_STCX_FAIL", +- .pme_code = 0x000003E158, +- .pme_short_desc = "marked stcx failed", +- .pme_long_desc = "marked stcx failed", ++[ POWER9_PME_PM_L3_WI_USAGE ] = { ++ .pme_name = "PM_L3_WI_USAGE", ++ .pme_code = 0x00000168A8, ++ .pme_short_desc = "Lifetime, sample of Write Inject machine 0 valid", ++ .pme_long_desc = "Lifetime, sample of Write Inject machine 0 valid", + }, +-[ POWER9_PME_PM_LARX_FIN ] = { /* 484 */ ++[ POWER9_PME_PM_LARX_FIN ] = { + .pme_name = "PM_LARX_FIN", + .pme_code = 0x000003C058, + .pme_short_desc = "Larx finished", + .pme_long_desc = "Larx finished", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 ] = { /* 485 */ +- .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3", +- .pme_code = 0x000004F058, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache. 
This implies that level 3 and level 4 PWC accesses were not necessary for this translation", +-}, +-[ POWER9_PME_PM_LSU3_L1_CAM_CANCEL ] = { /* 486 */ +- .pme_name = "PM_LSU3_L1_CAM_CANCEL", +- .pme_code = 0x000000F894, +- .pme_short_desc = "ls3 l1 tm cam cancel", +- .pme_long_desc = "ls3 l1 tm cam cancel", +-}, +-[ POWER9_PME_PM_IC_PREF_CANCEL_HIT ] = { /* 487 */ +- .pme_name = "PM_IC_PREF_CANCEL_HIT", +- .pme_code = 0x0000004890, +- .pme_short_desc = "Prefetch Canceled due to icache hit", +- .pme_long_desc = "Prefetch Canceled due to icache hit", +-}, +-[ POWER9_PME_PM_CMPLU_STALL_EIEIO ] = { /* 488 */ +- .pme_name = "PM_CMPLU_STALL_EIEIO", +- .pme_code = 0x000004D01A, +- .pme_short_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", +- .pme_long_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", +-}, +-[ POWER9_PME_PM_CMPLU_STALL_VDP ] = { /* 489 */ +- .pme_name = "PM_CMPLU_STALL_VDP", +- .pme_code = 0x000004405C, +- .pme_short_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. Qualified by vector", +-}, +-[ POWER9_PME_PM_DERAT_MISS_1G ] = { /* 490 */ +- .pme_name = "PM_DERAT_MISS_1G", +- .pme_code = 0x000002C05A, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 1G.", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", +-}, +-[ POWER9_PME_PM_DATA_PUMP_CPRED ] = { /* 491 */ +- .pme_name = "PM_DATA_PUMP_CPRED", +- .pme_code = 0x000001C054, +- .pme_short_desc = "Pump prediction correct.", +- .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for a demand load", +-}, +-[ POWER9_PME_PM_DPTEG_FROM_L2_MEPF ] = { /* 492 */ +- .pme_name = "PM_DPTEG_FROM_L2_MEPF", +- .pme_code = 0x000002E040, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", +-}, +-[ POWER9_PME_PM_BR_MPRED_TAKEN_CR ] = { /* 493 */ +- .pme_name = "PM_BR_MPRED_TAKEN_CR", +- .pme_code = 0x00000040B8, +- .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).", +- .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).", +-}, +-[ POWER9_PME_PM_MRK_BRU_FIN ] = { /* 494 */ +- .pme_name = "PM_MRK_BRU_FIN", +- .pme_code = 0x000002013A, +- .pme_short_desc = "bru marked instr finish", +- .pme_long_desc = "bru marked instr finish", +-}, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_DL4 ] = { /* 495 */ +- .pme_name = "PM_MRK_DPTEG_FROM_DL4", +- .pme_code = 0x000003F14C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_LD_CMPL ] = { ++ .pme_name = "PM_LD_CMPL", ++ .pme_code = 0x000004003E, ++ .pme_short_desc = "count of Loads completed", ++ .pme_long_desc = "count of Loads completed", + }, +-[ POWER9_PME_PM_SHL_ST_DEP_CREATED ] = { /* 496 */ +- .pme_name = "PM_SHL_ST_DEP_CREATED", +- .pme_code = 0x000000588C, +- .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Enabled", +- .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Enabled", ++[ POWER9_PME_PM_LD_L3MISS_PEND_CYC ] = { ++ .pme_name = "PM_LD_L3MISS_PEND_CYC", ++ .pme_code = 0x0000010062, ++ .pme_short_desc = "Cycles L3 miss was pending for this thread", ++ .pme_long_desc = "Cycles L3 miss was pending for this thread", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3_1_SHR ] = { /* 497 */ +- .pme_name = "PM_DPTEG_FROM_L3_1_SHR", +- .pme_code = 0x000001E046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_LD_MISS_L1_FIN ] = { ++ .pme_name = "PM_LD_MISS_L1_FIN", ++ .pme_code = 0x000002C04E, ++ .pme_short_desc = "Number of load instructions that finished with an L1 miss.", ++ .pme_long_desc = "Number of load instructions that finished with an L1 miss. 
Note that even if a load spans multiple slices this event will increment only once per load op.", + }, +-[ POWER9_PME_PM_DATA_FROM_RL4 ] = { /* 498 */ +- .pme_name = "PM_DATA_FROM_RL4", +- .pme_code = 0x000002C04A, +- .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a demand load", ++[ POWER9_PME_PM_LD_MISS_L1 ] = { ++ .pme_name = "PM_LD_MISS_L1", ++ .pme_code = 0x000003E054, ++ .pme_short_desc = "Load Missed L1, counted at execution time (can be greater than loads finished).", ++ .pme_long_desc = "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", + }, +-[ POWER9_PME_PM_XLATE_MISS ] = { /* 499 */ +- .pme_name = "PM_XLATE_MISS", +- .pme_code = 0x000000F89C, +- .pme_short_desc = "The LSU requested a line from L2 for translation.", +- .pme_long_desc = "The LSU requested a line from L2 for translation. It may be satisfied from any source beyond L2. 
Includes speculative instructions", ++[ POWER9_PME_PM_LD_REF_L1 ] = { ++ .pme_name = "PM_LD_REF_L1", ++ .pme_code = 0x00000100FC, ++ .pme_short_desc = "All L1 D cache load references counted at finish, gated by reject", ++ .pme_long_desc = "All L1 D cache load references counted at finish, gated by reject", + }, +-[ POWER9_PME_PM_CMPLU_STALL_SRQ_FULL ] = { /* 500 */ +- .pme_name = "PM_CMPLU_STALL_SRQ_FULL", +- .pme_code = 0x0000030016, +- .pme_short_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", +- .pme_long_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", ++[ POWER9_PME_PM_LINK_STACK_CORRECT ] = { ++ .pme_name = "PM_LINK_STACK_CORRECT", ++ .pme_code = 0x00000058A0, ++ .pme_short_desc = "Link stack predicts right address", ++ .pme_long_desc = "Link stack predicts right address", + }, +-[ POWER9_PME_PM_SN0_BUSY ] = { /* 501 */ +- .pme_name = "PM_SN0_BUSY", +- .pme_code = 0x0000026090, +- .pme_short_desc = "SN mach 0 Busy.", +- .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", ++[ POWER9_PME_PM_LINK_STACK_INVALID_PTR ] = { ++ .pme_name = "PM_LINK_STACK_INVALID_PTR", ++ .pme_code = 0x0000005898, ++ .pme_short_desc = "It is most often caused by certain types of flush where the pointer is not available.", ++ .pme_long_desc = "It is most often caused by certain types of flush where the pointer is not available. Can result in the data in the link stack becoming unusable.", + }, +-[ POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN ] = { /* 502 */ +- .pme_name = "PM_CMPLU_STALL_NESTED_TBEGIN", +- .pme_code = 0x000001E05C, +- .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin.", +- .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. 
This is a short delay, and it includes ROT", ++[ POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED ] = { ++ .pme_name = "PM_LINK_STACK_WRONG_ADD_PRED", ++ .pme_code = 0x0000005098, ++ .pme_short_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", ++ .pme_long_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", + }, +-[ POWER9_PME_PM_ST_CMPL ] = { /* 503 */ +- .pme_name = "PM_ST_CMPL", +- .pme_code = 0x00000200F0, +- .pme_short_desc = "Store Instructions Completed", +- .pme_long_desc = "Store Instructions Completed", ++[ POWER9_PME_PM_LMQ_EMPTY_CYC ] = { ++ .pme_name = "PM_LMQ_EMPTY_CYC", ++ .pme_code = 0x000002E05E, ++ .pme_short_desc = "Cycles in which the LMQ has no pending load misses for this thread", ++ .pme_long_desc = "Cycles in which the LMQ has no pending load misses for this thread", + }, +-[ POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR ] = { /* 504 */ +- .pme_name = "PM_DPTEG_FROM_DL2L3_SHR", +- .pme_code = 0x000003E048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_LMQ_MERGE ] = { ++ .pme_name = "PM_LMQ_MERGE", ++ .pme_code = 0x000001002E, ++ .pme_short_desc = "A demand miss collides with a prefetch for the same line", ++ .pme_long_desc = "A demand miss collides with a prefetch for the same line", + }, +-[ POWER9_PME_PM_DECODE_FUSION_CONST_GEN ] = { /* 505 */ +- .pme_name = "PM_DECODE_FUSION_CONST_GEN", +- .pme_code = 0x00000048B4, +- .pme_short_desc = "32-bit constant generation", +- .pme_long_desc = "32-bit constant generation", ++[ POWER9_PME_PM_LRQ_REJECT ] = { ++ .pme_name = "PM_LRQ_REJECT", ++ .pme_code = 0x000002E05A, ++ .pme_short_desc = "Internal LSU reject from LRQ.", ++ .pme_long_desc = "Internal LSU reject from LRQ. Rejects cause the load to go back to LRQ, but it stays contained within the LSU once it gets issued. This event counts the number of times the LRQ attempts to relaunch an instruction after a reject. Any load can suffer multiple rejects", + }, +-[ POWER9_PME_PM_L2_LOC_GUESS_CORRECT ] = { /* 506 */ +- .pme_name = "PM_L2_LOC_GUESS_CORRECT", +- .pme_code = 0x0000016088, +- .pme_short_desc = "L2 guess loc and guess was correct (ie data local)", +- .pme_long_desc = "L2 guess loc and guess was correct (ie data local)", ++[ POWER9_PME_PM_LS0_DC_COLLISIONS ] = { ++ .pme_name = "PM_LS0_DC_COLLISIONS", ++ .pme_code = 0x000000D090, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", + }, +-[ POWER9_PME_PM_INST_FROM_L3_1_ECO_SHR ] = { /* 507 */ +- .pme_name = "PM_INST_FROM_L3_1_ECO_SHR", +- .pme_code = 0x0000034044, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_LS0_ERAT_MISS_PREF ] = 
{ ++ .pme_name = "PM_LS0_ERAT_MISS_PREF", ++ .pme_code = 0x000000E084, ++ .pme_short_desc = "LS0 Erat miss due to prefetch", ++ .pme_long_desc = "LS0 Erat miss due to prefetch", + }, +-[ POWER9_PME_PM_XLATE_HPT_MODE ] = { /* 508 */ +- .pme_name = "PM_XLATE_HPT_MODE", +- .pme_code = 0x000000F098, +- .pme_short_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", +- .pme_long_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", ++[ POWER9_PME_PM_LS0_LAUNCH_HELD_PREF ] = { ++ .pme_name = "PM_LS0_LAUNCH_HELD_PREF", ++ .pme_code = 0x000000C09C, ++ .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++ .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LSU_FIN ] = { /* 509 */ +- .pme_name = "PM_CMPLU_STALL_LSU_FIN", +- .pme_code = 0x000001003A, +- .pme_short_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", +- .pme_long_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", ++[ POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC ] = { ++ .pme_name = "PM_LS0_PTE_TABLEWALK_CYC", ++ .pme_code = 0x000000E0BC, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 0", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 0", + }, +-[ POWER9_PME_PM_THRESH_EXC_64 ] = { /* 510 */ +- .pme_name = "PM_THRESH_EXC_64", +- .pme_code = 0x00000301E8, +- .pme_short_desc = "Threshold counter exceeded a value of 64", +- .pme_long_desc = "Threshold counter exceeded a value of 64", ++[ 
POWER9_PME_PM_LS0_TM_DISALLOW ] = { ++ .pme_name = "PM_LS0_TM_DISALLOW", ++ .pme_code = 0x000000E0B4, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC ] = { /* 511 */ +- .pme_name = "PM_MRK_DATA_FROM_DL4_CYC", +- .pme_code = 0x000002C12C, +- .pme_short_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++[ POWER9_PME_PM_LS0_UNALIGNED_LD ] = { ++ .pme_name = "PM_LS0_UNALIGNED_LD", ++ .pme_code = 0x000000C094, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_DARQ_STORE_XMIT ] = { /* 512 */ +- .pme_name = "PM_DARQ_STORE_XMIT", +- .pme_code = 0x0000030064, +- .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry.", +- .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry. Includes rejects. 
Not qualified by thread, so it includes counts for the whole core", ++[ POWER9_PME_PM_LS0_UNALIGNED_ST ] = { ++ .pme_name = "PM_LS0_UNALIGNED_ST", ++ .pme_code = 0x000000F0B8, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_DATA_TABLEWALK_CYC ] = { /* 513 */ +- .pme_name = "PM_DATA_TABLEWALK_CYC", +- .pme_code = 0x000003001A, +- .pme_short_desc = "Tablwalk Cycles (could be 1 or 2 active tablewalks)", +- .pme_long_desc = "Tablwalk Cycles (could be 1 or 2 active tablewalks)", ++[ POWER9_PME_PM_LS1_DC_COLLISIONS ] = { ++ .pme_name = "PM_LS1_DC_COLLISIONS", ++ .pme_code = 0x000000D890, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", + }, +-[ POWER9_PME_PM_L2_RC_ST_DONE ] = { /* 514 */ +- .pme_name = "PM_L2_RC_ST_DONE", +- .pme_code = 0x0000036086, +- .pme_short_desc = "RC did st to line that was Tx or Sx", +- .pme_long_desc = "RC did st to line that was Tx or Sx", ++[ POWER9_PME_PM_LS1_ERAT_MISS_PREF ] = { ++ .pme_name = "PM_LS1_ERAT_MISS_PREF", ++ .pme_code = 0x000000E884, ++ .pme_short_desc = "LS1 Erat miss due to prefetch", ++ .pme_long_desc = "LS1 Erat miss due to prefetch", + }, +-[ POWER9_PME_PM_TMA_REQ_L2 ] = { /* 515 */ +- .pme_name = "PM_TMA_REQ_L2", +- .pme_code = 0x000000E0A4, +- .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", +- .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", ++[ POWER9_PME_PM_LS1_LAUNCH_HELD_PREF ] = 
{ ++ .pme_name = "PM_LS1_LAUNCH_HELD_PREF", ++ .pme_code = 0x000000C89C, ++ .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++ .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", + }, +-[ POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE ] = { /* 516 */ +- .pme_name = "PM_INST_FROM_ON_CHIP_CACHE", +- .pme_code = 0x0000014048, +- .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC ] = { ++ .pme_name = "PM_LS1_PTE_TABLEWALK_CYC", ++ .pme_code = 0x000000E8BC, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 1", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 1", + }, +-[ POWER9_PME_PM_SLB_TABLEWALK_CYC ] = { /* 517 */ +- .pme_name = "PM_SLB_TABLEWALK_CYC", +- .pme_code = 0x000000F09C, +- .pme_short_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", +- .pme_long_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", ++[ POWER9_PME_PM_LS1_TM_DISALLOW ] = { ++ .pme_name = "PM_LS1_TM_DISALLOW", ++ .pme_code = 0x000000E8B4, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RMEM ] = { /* 518 */ +- .pme_name = "PM_MRK_DATA_FROM_RMEM", +- .pme_code = 0x000001D148, +- .pme_short_desc = "The processor's data cache 
was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++[ POWER9_PME_PM_LS1_UNALIGNED_LD ] = { ++ .pme_name = "PM_LS1_UNALIGNED_LD", ++ .pme_code = 0x000000C894, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_L3_PF_MISS_L3 ] = { /* 519 */ +- .pme_name = "PM_L3_PF_MISS_L3", +- .pme_code = 0x00000160A0, +- .pme_short_desc = "L3 Prefetch missed in L3", +- .pme_long_desc = "L3 Prefetch missed in L3", ++[ POWER9_PME_PM_LS1_UNALIGNED_ST ] = { ++ .pme_name = "PM_LS1_UNALIGNED_ST", ++ .pme_code = 0x000000F8B8, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. 
If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_L3_CI_MISS ] = { /* 520 */ +- .pme_name = "PM_L3_CI_MISS", +- .pme_code = 0x00000268A2, +- .pme_short_desc = "L3 castins miss (total count", +- .pme_long_desc = "L3 castins miss (total count", ++[ POWER9_PME_PM_LS2_DC_COLLISIONS ] = { ++ .pme_name = "PM_LS2_DC_COLLISIONS", ++ .pme_code = 0x000000D094, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", + }, +-[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { /* 521 */ +- .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", +- .pme_code = 0x0000016884, +- .pme_short_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", +- .pme_long_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", ++[ POWER9_PME_PM_LS2_ERAT_MISS_PREF ] = { ++ .pme_name = "PM_LS2_ERAT_MISS_PREF", ++ .pme_code = 0x000000E088, ++ .pme_short_desc = "LS0 Erat miss due to prefetch", ++ .pme_long_desc = "LS0 Erat miss due to prefetch", + }, +-[ POWER9_PME_PM_DERAT_MISS_4K ] = { /* 522 */ +- .pme_name = "PM_DERAT_MISS_4K", +- .pme_code = 0x000001C056, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 4K", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 4K", ++[ POWER9_PME_PM_LS2_TM_DISALLOW ] = { ++ .pme_name = "PM_LS2_TM_DISALLOW", ++ .pme_code = 0x000000E0B8, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", + }, +-[ POWER9_PME_PM_ISIDE_MRU_TOUCH ] = { /* 523 */ +- .pme_name = "PM_ISIDE_MRU_TOUCH", +- .pme_code = 0x0000046880, +- .pme_short_desc = "Iside L2 MRU touch", +- .pme_long_desc = "Iside L2 MRU touch", ++[ POWER9_PME_PM_LS2_UNALIGNED_LD ] = { ++ .pme_name = "PM_LS2_UNALIGNED_LD", ++ .pme_code = 0x000000C098, ++ 
.pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_MRK_RUN_CYC ] = { /* 524 */ +- .pme_name = "PM_MRK_RUN_CYC", +- .pme_code = 0x000001D15E, +- .pme_short_desc = "Run cycles in which a marked instruction is in the pipeline", +- .pme_long_desc = "Run cycles in which a marked instruction is in the pipeline", ++[ POWER9_PME_PM_LS2_UNALIGNED_ST ] = { ++ .pme_name = "PM_LS2_UNALIGNED_ST", ++ .pme_code = 0x000000F0BC, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. 
If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_L3_P0_CO_RTY ] = { /* 525 */ +- .pme_name = "PM_L3_P0_CO_RTY", +- .pme_code = 0x00000460AE, +- .pme_short_desc = "L3 CO received retry port 2", +- .pme_long_desc = "L3 CO received retry port 2", ++[ POWER9_PME_PM_LS3_DC_COLLISIONS ] = { ++ .pme_name = "PM_LS3_DC_COLLISIONS", ++ .pme_code = 0x000000D894, ++ .pme_short_desc = "Read-write data cache collisions", ++ .pme_long_desc = "Read-write data cache collisions", + }, +-[ POWER9_PME_PM_BR_MPRED_CMPL ] = { /* 526 */ +- .pme_name = "PM_BR_MPRED_CMPL", +- .pme_code = 0x00000400F6, +- .pme_short_desc = "Number of Branch Mispredicts", +- .pme_long_desc = "Number of Branch Mispredicts", ++[ POWER9_PME_PM_LS3_ERAT_MISS_PREF ] = { ++ .pme_name = "PM_LS3_ERAT_MISS_PREF", ++ .pme_code = 0x000000E888, ++ .pme_short_desc = "LS1 Erat miss due to prefetch", ++ .pme_long_desc = "LS1 Erat miss due to prefetch", + }, +-[ POWER9_PME_PM_BR_MPRED_TAKEN_TA ] = { /* 527 */ +- .pme_name = "PM_BR_MPRED_TAKEN_TA", +- .pme_code = 0x00000048B8, +- .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack.", +- .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. 
Only XL-form branches that resolved Taken set this event.", ++[ POWER9_PME_PM_LS3_TM_DISALLOW ] = { ++ .pme_name = "PM_LS3_TM_DISALLOW", ++ .pme_code = 0x000000E8B8, ++ .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++ .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", + }, +-[ POWER9_PME_PM_DISP_HELD_TBEGIN ] = { /* 528 */ +- .pme_name = "PM_DISP_HELD_TBEGIN", +- .pme_code = 0x00000028B0, +- .pme_short_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", +- .pme_long_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", ++[ POWER9_PME_PM_LS3_UNALIGNED_LD ] = { ++ .pme_name = "PM_LS3_UNALIGNED_LD", ++ .pme_code = 0x000000C898, ++ .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", ++ .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD ] = { /* 529 */ +- .pme_name = "PM_DPTEG_FROM_RL2L3_MOD", +- .pme_code = 0x000002E046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_LS3_UNALIGNED_ST ] = { ++ .pme_name = "PM_LS3_UNALIGNED_ST", ++ .pme_code = 0x000000F8BC, ++ .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", ++ .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", + }, +-[ POWER9_PME_PM_FLUSH_DISP_SB ] = { /* 530 */ +- .pme_name = "PM_FLUSH_DISP_SB", +- .pme_code = 0x0000002088, +- .pme_short_desc = "Dispatch Flush: Scoreboard", +- .pme_long_desc = "Dispatch Flush: Scoreboard", ++[ POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC ] = { ++ .pme_name = "PM_LSU0_1_LRQF_FULL_CYC", ++ .pme_code = 0x000000D0BC, ++ .pme_short_desc = "Counts the number of cycles the LRQF is full.", ++ .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", + }, +-[ POWER9_PME_PM_L2_CHIP_PUMP ] = { /* 531 */ +- .pme_name = "PM_L2_CHIP_PUMP", +- .pme_code = 0x0000046088, +- .pme_short_desc = "RC requests that were local on chip pump attempts", +- .pme_long_desc = "RC requests that were local on chip pump attempts", ++[ POWER9_PME_PM_LSU0_ERAT_HIT ] = { ++ .pme_name = "PM_LSU0_ERAT_HIT", ++ .pme_code = 0x000000E08C, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", + }, +-[ POWER9_PME_PM_L2_DC_INV ] = { /* 532 */ +- .pme_name = "PM_L2_DC_INV", +- .pme_code = 0x0000026882, +- .pme_short_desc = "Dcache invalidates from L2", +- .pme_long_desc = "Dcache invalidates from L2", ++[ POWER9_PME_PM_LSU0_FALSE_LHS ] = { ++ .pme_name = "PM_LSU0_FALSE_LHS", ++ .pme_code = 0x000000C0A0, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC ] = { /* 533 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC", +- .pme_code = 0x000001415A, +- .pme_short_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", ++[ POWER9_PME_PM_LSU0_L1_CAM_CANCEL ] = { ++ .pme_name = "PM_LSU0_L1_CAM_CANCEL", ++ .pme_code = 0x000000F090, ++ .pme_short_desc = "ls0 l1 tm cam cancel", ++ .pme_long_desc = "ls0 l1 tm cam cancel", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_1_SHR ] = { /* 534 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_1_SHR", +- .pme_code = 0x000001F146, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_LSU0_LDMX_FIN ] = { ++ .pme_name = "PM_LSU0_LDMX_FIN", ++ .pme_code = 0x000000D088, ++ .pme_short_desc = "New P9 instruction LDMX.", ++ .pme_long_desc = "New P9 instruction LDMX. 
The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_2M ] = { /* 535 */ +- .pme_name = "PM_MRK_DERAT_MISS_2M", +- .pme_code = 0x000002D152, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M.", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation", ++[ POWER9_PME_PM_LSU0_LMQ_S0_VALID ] = { ++ .pme_name = "PM_LSU0_LMQ_S0_VALID", ++ .pme_code = 0x000000D8B8, ++ .pme_short_desc = "Slot 0 of LMQ valid", ++ .pme_long_desc = "Slot 0 of LMQ valid", + }, +-[ POWER9_PME_PM_MRK_ST_DONE_L2 ] = { /* 536 */ +- .pme_name = "PM_MRK_ST_DONE_L2", +- .pme_code = 0x0000010134, +- .pme_short_desc = "marked store completed in L2 ( RC machine done)", +- .pme_long_desc = "marked store completed in L2 ( RC machine done)", ++[ POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC ] = { ++ .pme_name = "PM_LSU0_LRQ_S0_VALID_CYC", ++ .pme_code = 0x000000D8B4, ++ .pme_short_desc = "Slot 0 of LRQ valid", ++ .pme_long_desc = "Slot 0 of LRQ valid", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD ] = { /* 537 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_MOD", +- .pme_code = 0x000004D144, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_LSU0_SET_MPRED ] = { ++ .pme_name = "PM_LSU0_SET_MPRED", ++ .pme_code = 0x000000D080, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", + }, +-[ POWER9_PME_PM_IPTEG_FROM_RMEM ] = { /* 538 */ +- .pme_name = "PM_IPTEG_FROM_RMEM", +- .pme_code = 0x000003504A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a instruction side request", ++[ POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC ] = { ++ .pme_name = "PM_LSU0_SRQ_S0_VALID_CYC", ++ .pme_code = 0x000000D0B4, ++ .pme_short_desc = "Slot 0 of SRQ valid", ++ .pme_long_desc = "Slot 0 of SRQ valid", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_EMSH ] = { /* 539 */ +- .pme_name = "PM_MRK_LSU_FLUSH_EMSH", +- .pme_code = 0x000000D898, +- .pme_short_desc = "An ERAT miss was detected after a set-p hit.", +- .pme_long_desc = "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", ++[ POWER9_PME_PM_LSU0_STORE_REJECT ] = { ++ .pme_name = "PM_LSU0_STORE_REJECT", ++ .pme_code = 0x000000F088, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", + }, +-[ POWER9_PME_PM_BR_PRED_LSTACK ] = { /* 540 */ +- .pme_name = "PM_BR_PRED_LSTACK", +- .pme_code = 0x00000040A8, +- .pme_short_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", +- .pme_long_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", ++[ POWER9_PME_PM_LSU0_TM_L1_HIT ] = { ++ .pme_name = "PM_LSU0_TM_L1_HIT", ++ .pme_code = 0x000000E094, ++ .pme_short_desc = 
"Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", + }, +-[ POWER9_PME_PM_L3_P0_CO_MEM ] = { /* 541 */ +- .pme_name = "PM_L3_P0_CO_MEM", +- .pme_code = 0x00000360AA, +- .pme_short_desc = "l3 CO to memory port 0", +- .pme_long_desc = "l3 CO to memory port 0", ++[ POWER9_PME_PM_LSU0_TM_L1_MISS ] = { ++ .pme_name = "PM_LSU0_TM_L1_MISS", ++ .pme_code = 0x000000E09C, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2_MEPF ] = { /* 542 */ +- .pme_name = "PM_IPTEG_FROM_L2_MEPF", +- .pme_code = 0x0000025040, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a instruction side request", ++[ POWER9_PME_PM_LSU1_ERAT_HIT ] = { ++ .pme_name = "PM_LSU1_ERAT_HIT", ++ .pme_code = 0x000000E88C, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", + }, +-[ POWER9_PME_PM_LS0_ERAT_MISS_PREF ] = { /* 543 */ +- .pme_name = "PM_LS0_ERAT_MISS_PREF", +- .pme_code = 0x000000E084, +- .pme_short_desc = "LS0 Erat miss due to prefetch", +- .pme_long_desc = "LS0 Erat miss due to prefetch", ++[ POWER9_PME_PM_LSU1_FALSE_LHS ] = { ++ .pme_name = "PM_LSU1_FALSE_LHS", ++ .pme_code = 0x000000C8A0, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", + }, +-[ POWER9_PME_PM_RD_HIT_PF ] = { /* 544 */ +- .pme_name = "PM_RD_HIT_PF", +- .pme_code = 0x00000268A8, +- .pme_short_desc = "rd machine hit l3 pf machine", +- .pme_long_desc = "rd machine hit l3 pf machine", ++[ POWER9_PME_PM_LSU1_L1_CAM_CANCEL ] = { ++ .pme_name = "PM_LSU1_L1_CAM_CANCEL", ++ .pme_code = 0x000000F890, ++ .pme_short_desc = "ls1 l1 tm cam cancel", ++ .pme_long_desc = "ls1 l1 tm cam cancel", + }, +-[ POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP ] = { /* 545 */ +- .pme_name = "PM_DECODE_FUSION_LD_ST_DISP", +- .pme_code = 0x00000048A8, +- .pme_short_desc = "32-bit displacement D-form and 16-bit displacement X-form", +- .pme_long_desc = "32-bit displacement D-form and 16-bit displacement X-form", ++[ POWER9_PME_PM_LSU1_LDMX_FIN ] = { ++ .pme_name = "PM_LSU1_LDMX_FIN", ++ .pme_code = 0x000000D888, ++ .pme_short_desc = "New P9 instruction LDMX.", ++ .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. 
This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", + }, +-[ POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN ] = { /* 546 */ +- .pme_name = "PM_CMPLU_STALL_NTC_DISP_FIN", +- .pme_code = 0x000004E018, +- .pme_short_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", +- .pme_long_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", ++[ POWER9_PME_PM_LSU1_SET_MPRED ] = { ++ .pme_name = "PM_LSU1_SET_MPRED", ++ .pme_code = 0x000000D880, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_CYC ] = { /* 547 */ +- .pme_name = "PM_ICT_NOSLOT_CYC", +- .pme_code = 0x00000100F8, +- .pme_short_desc = "Number of cycles the ICT has no itags assigned to this thread", +- .pme_long_desc = "Number of cycles the ICT has no itags assigned to this thread", ++[ POWER9_PME_PM_LSU1_STORE_REJECT ] = { ++ .pme_name = "PM_LSU1_STORE_REJECT", ++ .pme_code = 0x000000F888, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", + }, +-[ POWER9_PME_PM_DERAT_MISS_16M ] = { /* 548 */ +- .pme_name = "PM_DERAT_MISS_16M", +- .pme_code = 0x000003C054, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16M", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16M", ++[ POWER9_PME_PM_LSU1_TM_L1_HIT ] = { ++ .pme_name = "PM_LSU1_TM_L1_HIT", ++ .pme_code = 0x000000E894, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", + }, +-[ POWER9_PME_PM_IC_MISS_ICBI ] = { /* 549 */ +- 
.pme_name = "PM_IC_MISS_ICBI", +- .pme_code = 0x0000005094, +- .pme_short_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on.", +- .pme_long_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on. ICBI took line out", ++[ POWER9_PME_PM_LSU1_TM_L1_MISS ] = { ++ .pme_name = "PM_LSU1_TM_L1_MISS", ++ .pme_code = 0x000000E89C, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", + }, +-[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC ] = { /* 550 */ +- .pme_name = "PM_TAGE_OVERRIDE_WRONG_SPEC", +- .pme_code = 0x00000058B8, +- .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", +- .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", ++[ POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC ] = { ++ .pme_name = "PM_LSU2_3_LRQF_FULL_CYC", ++ .pme_code = 0x000000D8BC, ++ .pme_short_desc = "Counts the number of cycles the LRQF is full.", ++ .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. 
If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN ] = { /* 551 */ +- .pme_name = "PM_ICT_NOSLOT_DISP_HELD_TBEGIN", +- .pme_code = 0x0000010064, +- .pme_short_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", +- .pme_long_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", ++[ POWER9_PME_PM_LSU2_ERAT_HIT ] = { ++ .pme_name = "PM_LSU2_ERAT_HIT", ++ .pme_code = 0x000000E090, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", + }, +-[ POWER9_PME_PM_MRK_BR_TAKEN_CMPL ] = { /* 552 */ +- .pme_name = "PM_MRK_BR_TAKEN_CMPL", +- .pme_code = 0x00000101E2, +- .pme_short_desc = "Marked Branch Taken completed", +- .pme_long_desc = "Marked Branch Taken completed", ++[ POWER9_PME_PM_LSU2_FALSE_LHS ] = { ++ .pme_name = "PM_LSU2_FALSE_LHS", ++ .pme_code = 0x000000C0A4, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", + }, +-[ POWER9_PME_PM_CMPLU_STALL_VFXU ] = { /* 553 */ +- .pme_name = "PM_CMPLU_STALL_VFXU", +- .pme_code = 0x000003C05C, +- .pme_short_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline.", +- .pme_long_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline. 
These instructions get routed to the ALU, ALU2, and DIV pipes", ++[ POWER9_PME_PM_LSU2_L1_CAM_CANCEL ] = { ++ .pme_name = "PM_LSU2_L1_CAM_CANCEL", ++ .pme_code = 0x000000F094, ++ .pme_short_desc = "ls2 l1 tm cam cancel", ++ .pme_long_desc = "ls2 l1 tm cam cancel", + }, +-[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY ] = { /* 554 */ +- .pme_name = "PM_DATA_GRP_PUMP_MPRED_RTY", +- .pme_code = 0x000001C052, +- .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", +- .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++[ POWER9_PME_PM_LSU2_LDMX_FIN ] = { ++ .pme_name = "PM_LSU2_LDMX_FIN", ++ .pme_code = 0x000000D08C, ++ .pme_short_desc = "New P9 instruction LDMX.", ++ .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", + }, +-[ POWER9_PME_PM_INST_FROM_L3 ] = { /* 555 */ +- .pme_name = "PM_INST_FROM_L3", +- .pme_code = 0x0000044042, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_LSU2_SET_MPRED ] = { ++ .pme_name = "PM_LSU2_SET_MPRED", ++ .pme_code = 0x000000D084, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", + }, +-[ POWER9_PME_PM_ITLB_MISS ] = { /* 556 */ +- .pme_name = "PM_ITLB_MISS", +- .pme_code = 0x00000400FC, +- .pme_short_desc = "ITLB Reloaded (always zero on POWER6)", +- .pme_long_desc = "ITLB Reloaded (always zero on POWER6)", ++[ POWER9_PME_PM_LSU2_STORE_REJECT ] = { ++ .pme_name = "PM_LSU2_STORE_REJECT", ++ .pme_code = 0x000000F08C, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD ] = { /* 557 */ +- .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_MOD", +- .pme_code = 0x000002F146, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_LSU2_TM_L1_HIT ] = { ++ .pme_name = "PM_LSU2_TM_L1_HIT", ++ .pme_code = 0x000000E098, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", + }, +-[ POWER9_PME_PM_LSU2_TM_L1_MISS ] = { /* 558 */ ++[ POWER9_PME_PM_LSU2_TM_L1_MISS ] = { + .pme_name = "PM_LSU2_TM_L1_MISS", + .pme_code = 0x000000E0A0, + .pme_short_desc = "Load tm L1 miss", + .pme_long_desc = "Load tm L1 miss", + }, +-[ POWER9_PME_PM_L3_WI_USAGE ] = { /* 559 */ +- .pme_name = "PM_L3_WI_USAGE", +- .pme_code = 0x00000168A8, +- .pme_short_desc = "rotating sample of 8 WI actives", +- .pme_long_desc = "rotating sample of 8 WI actives", ++[ POWER9_PME_PM_LSU3_ERAT_HIT ] = { ++ .pme_name = "PM_LSU3_ERAT_HIT", ++ .pme_code = 0x000000E890, ++ .pme_short_desc = "Primary ERAT hit.", ++ .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", + }, +-[ POWER9_PME_PM_L2_SN_M_WR_DONE ] = { /* 560 */ +- .pme_name = "PM_L2_SN_M_WR_DONE", +- .pme_code = 0x0000046886, +- .pme_short_desc = "SNP dispatched for a write and was M", +- .pme_long_desc = "SNP dispatched for a write and was M", ++[ POWER9_PME_PM_LSU3_FALSE_LHS ] = { ++ .pme_name = "PM_LSU3_FALSE_LHS", ++ .pme_code = 0x000000C8A4, ++ .pme_short_desc = "False LHS match detected", ++ .pme_long_desc = "False LHS match detected", + }, +-[ POWER9_PME_PM_DISP_HELD_SYNC_HOLD ] = { /* 561 */ +- .pme_name = "PM_DISP_HELD_SYNC_HOLD", +- .pme_code = 0x000004003C, +- .pme_short_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", +- .pme_long_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", ++[ POWER9_PME_PM_LSU3_L1_CAM_CANCEL ] = { ++ .pme_name = "PM_LSU3_L1_CAM_CANCEL", ++ .pme_code = 0x000000F894, ++ .pme_short_desc = "ls3 l1 tm cam cancel", ++ .pme_long_desc = "ls3 l1 tm cam cancel", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_SHR ] = { /* 562 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2_1_SHR", +- 
.pme_code = 0x000003F146, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_LSU3_LDMX_FIN ] = { ++ .pme_name = "PM_LSU3_LDMX_FIN", ++ .pme_code = 0x000000D88C, ++ .pme_short_desc = "New P9 instruction LDMX.", ++ .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", + }, +-[ POWER9_PME_PM_MEM_PREF ] = { /* 563 */ +- .pme_name = "PM_MEM_PREF", +- .pme_code = 0x000002C058, +- .pme_short_desc = "Memory prefetch for this thread.", +- .pme_long_desc = "Memory prefetch for this thread. Includes L4", ++[ POWER9_PME_PM_LSU3_SET_MPRED ] = { ++ .pme_name = "PM_LSU3_SET_MPRED", ++ .pme_code = 0x000000D884, ++ .pme_short_desc = "Set prediction(set-p) miss.", ++ .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", + }, +-[ POWER9_PME_PM_L2_SN_M_RD_DONE ] = { /* 564 */ +- .pme_name = "PM_L2_SN_M_RD_DONE", +- .pme_code = 0x0000046086, +- .pme_short_desc = "SNP dispatched for a read and was M", +- .pme_long_desc = "SNP dispatched for a read and was M", ++[ POWER9_PME_PM_LSU3_STORE_REJECT ] = { ++ .pme_name = "PM_LSU3_STORE_REJECT", ++ .pme_code = 0x000000F88C, ++ .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++ .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", + }, +-[ POWER9_PME_PM_LS0_UNALIGNED_ST ] = { /* 565 */ +- .pme_name = "PM_LS0_UNALIGNED_ST", +- .pme_code = 0x000000F0B8, +- .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", +- .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. 
If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_LSU3_TM_L1_HIT ] = { ++ .pme_name = "PM_LSU3_TM_L1_HIT", ++ .pme_code = 0x000000E898, ++ .pme_short_desc = "Load tm hit in L1", ++ .pme_long_desc = "Load tm hit in L1", ++}, ++[ POWER9_PME_PM_LSU3_TM_L1_MISS ] = { ++ .pme_name = "PM_LSU3_TM_L1_MISS", ++ .pme_code = 0x000000E8A0, ++ .pme_short_desc = "Load tm L1 miss", ++ .pme_long_desc = "Load tm L1 miss", ++}, ++[ POWER9_PME_PM_LSU_DERAT_MISS ] = { ++ .pme_name = "PM_LSU_DERAT_MISS", ++ .pme_code = 0x00000200F6, ++ .pme_short_desc = "DERAT Reloaded due to a DERAT miss", ++ .pme_long_desc = "DERAT Reloaded due to a DERAT miss", ++}, ++[ POWER9_PME_PM_LSU_FIN ] = { ++ .pme_name = "PM_LSU_FIN", ++ .pme_code = 0x0000030066, ++ .pme_short_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", ++ .pme_long_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_ATOMIC ] = { ++ .pme_name = "PM_LSU_FLUSH_ATOMIC", ++ .pme_code = 0x000000C8A8, ++ .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", ++ .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", + }, +-[ POWER9_PME_PM_DC_PREF_CONS_ALLOC ] = { /* 566 */ +- .pme_name = "PM_DC_PREF_CONS_ALLOC", +- .pme_code = 0x000000F0B4, +- .pme_short_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", +- .pme_long_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", ++[ POWER9_PME_PM_LSU_FLUSH_CI ] = { ++ .pme_name = "PM_LSU_FLUSH_CI", ++ .pme_code = 0x000000C0A8, ++ .pme_short_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", ++ .pme_long_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_16G ] = { /* 567 */ +- .pme_name = "PM_MRK_DERAT_MISS_16G", +- .pme_code = 0x000004C15C, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", ++[ POWER9_PME_PM_LSU_FLUSH_EMSH ] = { ++ .pme_name = "PM_LSU_FLUSH_EMSH", ++ .pme_code = 0x000000C0AC, ++ .pme_short_desc = "An ERAT miss was detected after a set-p hit.", ++ .pme_long_desc = "An ERAT miss was detected after a set-p hit. 
Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2 ] = { /* 568 */ +- .pme_name = "PM_IPTEG_FROM_L2", +- .pme_code = 0x0000015042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a instruction side request", ++[ POWER9_PME_PM_LSU_FLUSH_LARX_STCX ] = { ++ .pme_name = "PM_LSU_FLUSH_LARX_STCX", ++ .pme_code = 0x000000C8B8, ++ .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", ++ .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches", + }, +-[ POWER9_PME_PM_ANY_THRD_RUN_CYC ] = { /* 569 */ +- .pme_name = "PM_ANY_THRD_RUN_CYC", +- .pme_code = 0x00000100FA, +- .pme_short_desc = "Cycles in which at least one thread has the run latch set", +- .pme_long_desc = "Cycles in which at least one thread has the run latch set", ++[ POWER9_PME_PM_LSU_FLUSH_LHL_SHL ] = { ++ .pme_name = "PM_LSU_FLUSH_LHL_SHL", ++ .pme_code = 0x000000C8B4, ++ .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", ++ .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. 
If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", + }, +-[ POWER9_PME_PM_MRK_PROBE_NOP_CMPL ] = { /* 570 */ +- .pme_name = "PM_MRK_PROBE_NOP_CMPL", +- .pme_code = 0x000001F05E, +- .pme_short_desc = "Marked probeNops completed", +- .pme_long_desc = "Marked probeNops completed", ++[ POWER9_PME_PM_LSU_FLUSH_LHS ] = { ++ .pme_name = "PM_LSU_FLUSH_LHS", ++ .pme_code = 0x000000C8B0, ++ .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", ++ .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", + }, +-[ POWER9_PME_PM_BANK_CONFLICT ] = { /* 571 */ +- .pme_name = "PM_BANK_CONFLICT", +- .pme_code = 0x0000004880, +- .pme_short_desc = "Read blocked due to interleave conflict.", +- .pme_long_desc = "Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.", ++[ POWER9_PME_PM_LSU_FLUSH_NEXT ] = { ++ .pme_name = "PM_LSU_FLUSH_NEXT", ++ .pme_code = 0x00000020B0, ++ .pme_short_desc = "LSU flush next reported at flush time.", ++ .pme_long_desc = "LSU flush next reported at flush time. Sometimes these also come with an exception", + }, +-[ POWER9_PME_PM_INST_SYS_PUMP_MPRED ] = { /* 572 */ +- .pme_name = "PM_INST_SYS_PUMP_MPRED", +- .pme_code = 0x0000034052, +- .pme_short_desc = "Final Pump Scope (system) mispredicted.", +- .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for an instruction fetch", ++[ POWER9_PME_PM_LSU_FLUSH_OTHER ] = { ++ .pme_name = "PM_LSU_FLUSH_OTHER", ++ .pme_code = 0x000000C0BC, ++ .pme_short_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped loa or a Flush Next PPC); Data Valid Flush Next (several cases of this, one example is store and reload are lined up such that a store-hit-reload scenario exists and the CDF has already launched and has gotten bad/stale data); Bad Data Valid Flush Next (might be a few cases of this, one example is a larxa (D$ hit) return data and dval but can't allocate to LMQ (LMQ full or other reason).", ++ .pme_long_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped loa or a Flush Next PPC); Data Valid Flush Next (several cases of this, one example is store and reload are lined up such that a store-hit-reload scenario exists and the CDF has already launched and has gotten bad/stale data); Bad Data Valid Flush Next (might be a few cases of this, one example is a larxa (D$ hit) return data and dval but can't allocate to LMQ (LMQ full or other reason). Already gave dval but can't watch it for snoop_hit_larx. 
Need to take the “bad dval” back and flush all younger ops)", + }, +-[ POWER9_PME_PM_NON_DATA_STORE ] = { /* 573 */ +- .pme_name = "PM_NON_DATA_STORE", +- .pme_code = 0x000000F8A0, +- .pme_short_desc = "All ops that drain from s2q to L2 and contain no data", +- .pme_long_desc = "All ops that drain from s2q to L2 and contain no data", ++[ POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS ] = { ++ .pme_name = "PM_LSU_FLUSH_RELAUNCH_MISS", ++ .pme_code = 0x000000C8AC, ++ .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++ .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", + }, +-[ POWER9_PME_PM_DC_PREF_CONF ] = { /* 574 */ +- .pme_name = "PM_DC_PREF_CONF", +- .pme_code = 0x000000F0A8, +- .pme_short_desc = "A demand load referenced a line in an active prefetch stream.", +- .pme_long_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. 
Includes forwards and backwards streams", ++[ POWER9_PME_PM_LSU_FLUSH_SAO ] = { ++ .pme_name = "PM_LSU_FLUSH_SAO", ++ .pme_code = 0x000000C0B8, ++ .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++ .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", + }, +-[ POWER9_PME_PM_BTAC_BAD_RESULT ] = { /* 575 */ +- .pme_name = "PM_BTAC_BAD_RESULT", +- .pme_code = 0x00000050B0, +- .pme_short_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common).", +- .pme_long_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common). In both cases, a redirect will happen", ++[ POWER9_PME_PM_LSU_FLUSH_UE ] = { ++ .pme_name = "PM_LSU_FLUSH_UE", ++ .pme_code = 0x000000C0B0, ++ .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++ .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++}, ++[ POWER9_PME_PM_LSU_FLUSH_WRK_ARND ] = { ++ .pme_name = "PM_LSU_FLUSH_WRK_ARND", ++ .pme_code = 0x000000C0B4, ++ .pme_short_desc = "LSU workaround flush.", ++ .pme_long_desc = "LSU workaround flush. These flushes are setup with programmable scan only latches to perform various actions when the flush macro receives a trigger from the dbg macros. These actions include things like flushing the next op encountered for a particular thread or flushing the next op that is NTC op that is encountered on a particular slice. 
The kind of flush that the workaround is setup to perform is highly variable.", + }, +-[ POWER9_PME_PM_LSU_LMQ_FULL_CYC ] = { /* 576 */ ++[ POWER9_PME_PM_LSU_LMQ_FULL_CYC ] = { + .pme_name = "PM_LSU_LMQ_FULL_CYC", + .pme_code = 0x000000D0B8, + .pme_short_desc = "Counts the number of cycles the LMQ is full", + .pme_long_desc = "Counts the number of cycles the LMQ is full", + }, +-[ POWER9_PME_PM_NON_MATH_FLOP_CMPL ] = { /* 577 */ +- .pme_name = "PM_NON_MATH_FLOP_CMPL", +- .pme_code = 0x000004D05A, +- .pme_short_desc = "Non-math flop instruction completed", +- .pme_long_desc = "Non-math flop instruction completed", +-}, +-[ POWER9_PME_PM_MRK_LD_MISS_L1_CYC ] = { /* 578 */ +- .pme_name = "PM_MRK_LD_MISS_L1_CYC", +- .pme_code = 0x000001D056, +- .pme_short_desc = "Marked ld latency", +- .pme_long_desc = "Marked ld latency", +-}, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_CYC ] = { /* 579 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_CYC", +- .pme_code = 0x0000014156, +- .pme_short_desc = "Duration in cycles to reload from local core's L2 due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L2 due to a marked load", +-}, +-[ POWER9_PME_PM_FXU_1PLUS_BUSY ] = { /* 580 */ +- .pme_name = "PM_FXU_1PLUS_BUSY", +- .pme_code = 0x000003000E, +- .pme_short_desc = "At least one of the 4 FXU units is busy", +- .pme_long_desc = "At least one of the 4 FXU units is busy", +-}, +-[ POWER9_PME_PM_CMPLU_STALL_DP ] = { /* 581 */ +- .pme_name = "PM_CMPLU_STALL_DP", +- .pme_code = 0x000001005C, +- .pme_short_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. 
Qualified by NOT vector", +-}, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_MOD_CYC ] = { /* 582 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_MOD_CYC", +- .pme_code = 0x000001D140, +- .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", +-}, +-[ POWER9_PME_PM_SYNC_MRK_L2HIT ] = { /* 583 */ +- .pme_name = "PM_SYNC_MRK_L2HIT", +- .pme_code = 0x0000015158, +- .pme_short_desc = "Marked L2 Hits that can throw a synchronous interrupt", +- .pme_long_desc = "Marked L2 Hits that can throw a synchronous interrupt", ++[ POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { ++ .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", ++ .pme_code = 0x000002003E, ++ .pme_short_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", ++ .pme_long_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { /* 584 */ +- .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", +- .pme_code = 0x000002C12A, +- .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++[ POWER9_PME_PM_LSU_NCST ] = { ++ .pme_name = "PM_LSU_NCST", ++ .pme_code = 0x000000C890, ++ .pme_short_desc = "Asserts when a i=1 store op is sent to the nest.", ++ .pme_long_desc = "Asserts when a i=1 store op is sent to the nest. No record of issue pipe (LS0/LS1) is maintained so this is for both pipes. 
Probably don't need separate LS0 and LS1", + }, +-[ POWER9_PME_PM_ISU1_ISS_HOLD_ALL ] = { /* 585 */ +- .pme_name = "PM_ISU1_ISS_HOLD_ALL", +- .pme_code = 0x0000003084, +- .pme_short_desc = "All ISU rejects", +- .pme_long_desc = "All ISU rejects", ++[ POWER9_PME_PM_LSU_REJECT_ERAT_MISS ] = { ++ .pme_name = "PM_LSU_REJECT_ERAT_MISS", ++ .pme_code = 0x000002E05C, ++ .pme_short_desc = "LSU Reject due to ERAT (up to 4 per cycles)", ++ .pme_long_desc = "LSU Reject due to ERAT (up to 4 per cycles)", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT ] = { /* 586 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x000003F142, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_LSU_REJECT_LHS ] = { ++ .pme_name = "PM_LSU_REJECT_LHS", ++ .pme_code = 0x000004E05C, ++ .pme_short_desc = "LSU Reject due to LHS (up to 4 per cycle)", ++ .pme_long_desc = "LSU Reject due to LHS (up to 4 per cycle)", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY ] = { /* 587 */ +- .pme_name = "PM_MRK_FAB_RSP_RWITM_RTY", +- .pme_code = 0x000002015E, +- .pme_short_desc = "Sampled store did a rwitm and got a rty", +- .pme_long_desc = "Sampled store did a rwitm and got a rty", ++[ POWER9_PME_PM_LSU_REJECT_LMQ_FULL ] = { ++ .pme_name = "PM_LSU_REJECT_LMQ_FULL", ++ .pme_code = 0x000003001C, ++ .pme_short_desc = "LSU Reject due to LMQ full (up to 4 per cycles)", ++ .pme_long_desc = "LSU Reject due to LMQ full (up to 4 per cycles)", + }, +-[ POWER9_PME_PM_L3_P3_LCO_RTY ] = { /* 588 */ +- .pme_name = "PM_L3_P3_LCO_RTY", +- .pme_code = 0x00000268B4, +- .pme_short_desc = "L3 lateral cast out received retry on port 3", +- .pme_long_desc = "L3 lateral cast out received retry on port 3", ++[ POWER9_PME_PM_LSU_SRQ_FULL_CYC ] = { ++ .pme_name = "PM_LSU_SRQ_FULL_CYC", ++ .pme_code = 0x000001001A, ++ .pme_short_desc = "Cycles in which the Store Queue is full on all 4 slices.", ++ .pme_long_desc = "Cycles in which the Store Queue is full on all 4 slices. This is event is not per thread. All the threads will see the same count for this core resource", + }, +-[ POWER9_PME_PM_PUMP_CPRED ] = { /* 589 */ +- .pme_name = "PM_PUMP_CPRED", +- .pme_code = 0x0000010054, +- .pme_short_desc = "Pump prediction correct.", +- .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_LSU_STCX_FAIL ] = { ++ .pme_name = "PM_LSU_STCX_FAIL", ++ .pme_code = 0x000000F080, ++ .pme_short_desc = "", ++ .pme_long_desc = "", + }, +-[ POWER9_PME_PM_LS3_TM_DISALLOW ] = { /* 590 */ +- .pme_name = "PM_LS3_TM_DISALLOW", +- .pme_code = 0x000000E8B8, +- .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", +- .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++[ POWER9_PME_PM_LSU_STCX ] = { ++ .pme_name = "PM_LSU_STCX", ++ .pme_code = 0x000000C090, ++ .pme_short_desc = "STCX sent to nest, i.", ++ .pme_long_desc = "STCX sent to nest, i.e. total", + }, +-[ POWER9_PME_PM_SN_INVL ] = { /* 591 */ +- .pme_name = "PM_SN_INVL", +- .pme_code = 0x00000368A8, +- .pme_short_desc = "Any port snooper detects a store to a line that?s in the Sx state and invalidates the line.", +- .pme_long_desc = "Any port snooper detects a store to a line that?s in the Sx state and invalidates the line. 
Up to 4 can happen in a cycle but we only count 1", ++[ POWER9_PME_PM_LWSYNC ] = { ++ .pme_name = "PM_LWSYNC", ++ .pme_code = 0x0000005894, ++ .pme_short_desc = "Lwsync instruction decoded and transferred", ++ .pme_long_desc = "Lwsync instruction decoded and transferred", + }, +-[ POWER9_PME_PM_TM_LD_CONF ] = { /* 592 */ +- .pme_name = "PM_TM_LD_CONF", +- .pme_code = 0x000002608C, +- .pme_short_desc = "TM Load (fav or non-fav) ran into conflict (failed)", +- .pme_long_desc = "TM Load (fav or non-fav) ran into conflict (failed)", ++[ POWER9_PME_PM_MATH_FLOP_CMPL ] = { ++ .pme_name = "PM_MATH_FLOP_CMPL", ++ .pme_code = 0x000004505C, ++ .pme_short_desc = "Math flop instruction completed", ++ .pme_long_desc = "Math flop instruction completed", + }, +-[ POWER9_PME_PM_LD_MISS_L1_FIN ] = { /* 593 */ +- .pme_name = "PM_LD_MISS_L1_FIN", +- .pme_code = 0x000002C04E, +- .pme_short_desc = "Number of load instructions that finished with an L1 miss.", +- .pme_long_desc = "Number of load instructions that finished with an L1 miss. 
Note that even if a load spans multiple slices this event will increment only once per load op.", ++[ POWER9_PME_PM_MEM_CO ] = { ++ .pme_name = "PM_MEM_CO", ++ .pme_code = 0x000004C058, ++ .pme_short_desc = "Memory castouts from this thread", ++ .pme_long_desc = "Memory castouts from this thread", + }, +-[ POWER9_PME_PM_SYNC_MRK_PROBE_NOP ] = { /* 594 */ +- .pme_name = "PM_SYNC_MRK_PROBE_NOP", +- .pme_code = 0x0000015150, +- .pme_short_desc = "Marked probeNops which can cause synchronous interrupts", +- .pme_long_desc = "Marked probeNops which can cause synchronous interrupts", ++[ POWER9_PME_PM_MEM_LOC_THRESH_IFU ] = { ++ .pme_name = "PM_MEM_LOC_THRESH_IFU", ++ .pme_code = 0x0000010058, ++ .pme_short_desc = "Local Memory above threshold for IFU speculation control", ++ .pme_long_desc = "Local Memory above threshold for IFU speculation control", + }, +-[ POWER9_PME_PM_RUN_CYC ] = { /* 595 */ +- .pme_name = "PM_RUN_CYC", +- .pme_code = 0x00000200F4, +- .pme_short_desc = "Run_cycles", +- .pme_long_desc = "Run_cycles", ++[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH ] = { ++ .pme_name = "PM_MEM_LOC_THRESH_LSU_HIGH", ++ .pme_code = 0x0000040056, ++ .pme_short_desc = "Local memory above threshold for LSU medium", ++ .pme_long_desc = "Local memory above threshold for LSU medium", + }, +-[ POWER9_PME_PM_SYS_PUMP_MPRED ] = { /* 596 */ +- .pme_name = "PM_SYS_PUMP_MPRED", +- .pme_code = 0x0000030052, +- .pme_short_desc = "Final Pump Scope (system) mispredicted.", +- .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED ] = { ++ .pme_name = "PM_MEM_LOC_THRESH_LSU_MED", ++ .pme_code = 0x000001C05E, ++ .pme_short_desc = "Local memory above threshold for data prefetch", ++ .pme_long_desc = "Local memory above threshold for data prefetch", + }, +-[ POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE ] = { /* 597 */ +- .pme_name = "PM_DATA_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000004C04A, +- .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", ++[ POWER9_PME_PM_MEM_PREF ] = { ++ .pme_name = "PM_MEM_PREF", ++ .pme_code = 0x000002C058, ++ .pme_short_desc = "Memory prefetch for this thread.", ++ .pme_long_desc = "Memory prefetch for this thread. Includes L4", + }, +-[ POWER9_PME_PM_TM_NESTED_TBEGIN ] = { /* 598 */ +- .pme_name = "PM_TM_NESTED_TBEGIN", +- .pme_code = 0x00000020A0, +- .pme_short_desc = "Completion Tm nested tbegin", +- .pme_long_desc = "Completion Tm nested tbegin", ++[ POWER9_PME_PM_MEM_READ ] = { ++ .pme_name = "PM_MEM_READ", ++ .pme_code = 0x0000010056, ++ .pme_short_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch).", ++ .pme_long_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). 
Includes L4", + }, +-[ POWER9_PME_PM_FLUSH_COMPLETION ] = { /* 599 */ +- .pme_name = "PM_FLUSH_COMPLETION", +- .pme_code = 0x0000030012, +- .pme_short_desc = "The instruction that was next to complete did not complete because it suffered a flush", +- .pme_long_desc = "The instruction that was next to complete did not complete because it suffered a flush", ++[ POWER9_PME_PM_MEM_RWITM ] = { ++ .pme_name = "PM_MEM_RWITM", ++ .pme_code = 0x000003C05E, ++ .pme_short_desc = "Memory Read With Intent to Modify for this thread", ++ .pme_long_desc = "Memory Read With Intent to Modify for this thread", + }, +-[ POWER9_PME_PM_ST_MISS_L1 ] = { /* 600 */ +- .pme_name = "PM_ST_MISS_L1", +- .pme_code = 0x00000300F0, +- .pme_short_desc = "Store Missed L1", +- .pme_long_desc = "Store Missed L1", ++[ POWER9_PME_PM_MRK_BACK_BR_CMPL ] = { ++ .pme_name = "PM_MRK_BACK_BR_CMPL", ++ .pme_code = 0x000003515E, ++ .pme_short_desc = "Marked branch instruction completed with a target address less than current instruction address", ++ .pme_long_desc = "Marked branch instruction completed with a target address less than current instruction address", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2MISS ] = { /* 601 */ +- .pme_name = "PM_IPTEG_FROM_L2MISS", +- .pme_code = 0x000001504E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L2 due to a instruction side request", ++[ POWER9_PME_PM_MRK_BR_2PATH ] = { ++ .pme_name = "PM_MRK_BR_2PATH", ++ .pme_code = 0x0000010138, ++ .pme_short_desc = "marked branches which are not strongly biased", ++ .pme_long_desc = "marked branches which are not strongly biased", + }, +-[ POWER9_PME_PM_LSU3_TM_L1_MISS ] = { /* 602 */ +- .pme_name = "PM_LSU3_TM_L1_MISS", +- .pme_code = 0x000000E8A0, +- .pme_short_desc = "Load tm L1 miss", +- .pme_long_desc = "Load tm L1 miss", 
++[ POWER9_PME_PM_MRK_BR_CMPL ] = { ++ .pme_name = "PM_MRK_BR_CMPL", ++ .pme_code = 0x000001016E, ++ .pme_short_desc = "Branch Instruction completed", ++ .pme_long_desc = "Branch Instruction completed", + }, +-[ POWER9_PME_PM_L3_CO ] = { /* 603 */ +- .pme_name = "PM_L3_CO", +- .pme_code = 0x00000360A8, +- .pme_short_desc = "l3 castout occuring ( does not include casthrough or log writes (cinj/dmaw)", +- .pme_long_desc = "l3 castout occuring ( does not include casthrough or log writes (cinj/dmaw)", ++[ POWER9_PME_PM_MRK_BR_MPRED_CMPL ] = { ++ .pme_name = "PM_MRK_BR_MPRED_CMPL", ++ .pme_code = 0x00000301E4, ++ .pme_short_desc = "Marked Branch Mispredicted", ++ .pme_long_desc = "Marked Branch Mispredicted", + }, +-[ POWER9_PME_PM_MRK_STALL_CMPLU_CYC ] = { /* 604 */ +- .pme_name = "PM_MRK_STALL_CMPLU_CYC", +- .pme_code = 0x000003013E, +- .pme_short_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", +- .pme_long_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", ++[ POWER9_PME_PM_MRK_BR_TAKEN_CMPL ] = { ++ .pme_name = "PM_MRK_BR_TAKEN_CMPL", ++ .pme_code = 0x00000101E2, ++ .pme_short_desc = "Marked Branch Taken completed", ++ .pme_long_desc = "Marked Branch Taken completed", + }, +-[ POWER9_PME_PM_INST_FROM_DL2L3_SHR ] = { /* 605 */ +- .pme_name = "PM_INST_FROM_DL2L3_SHR", +- .pme_code = 0x0000034048, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_BRU_FIN ] = { ++ .pme_name = "PM_MRK_BRU_FIN", ++ .pme_code = 0x000002013A, ++ .pme_short_desc = "bru marked 
instr finish", ++ .pme_long_desc = "bru marked instr finish", + }, +-[ POWER9_PME_PM_SCALAR_FLOP_CMPL ] = { /* 606 */ +- .pme_name = "PM_SCALAR_FLOP_CMPL", +- .pme_code = 0x0000010130, +- .pme_short_desc = "Scalar flop events", +- .pme_long_desc = "Scalar flop events", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD_CYC", ++ .pme_code = 0x000004D12E, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_LRQ_REJECT ] = { /* 607 */ +- .pme_name = "PM_LRQ_REJECT", +- .pme_code = 0x000002E05A, +- .pme_short_desc = "Internal LSU reject from LRQ.", +- .pme_long_desc = "Internal LSU reject from LRQ. Rejects cause the load to go back to LRQ, but it stays contained within the LSU once it gets issued. This event counts the number of times the LRQ attempts to relaunch an instruction after a reject. 
Any load can suffer multiple rejects", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", ++ .pme_code = 0x000003D14E, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_4FLOP_CMPL ] = { /* 608 */ +- .pme_name = "PM_4FLOP_CMPL", +- .pme_code = 0x000001000E, +- .pme_short_desc = "four flop events", +- .pme_long_desc = "four flop events", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", ++ .pme_code = 0x000002C128, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_RMEM ] = { /* 609 */ +- .pme_name = "PM_MRK_DPTEG_FROM_RMEM", +- .pme_code = 0x000003F14A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", ++ .pme_code = 0x000001D150, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_LD_CMPL ] = { /* 610 */ +- .pme_name = "PM_LD_CMPL", +- .pme_code = 0x000004003E, +- .pme_short_desc = "count of Loads completed", +- .pme_long_desc = "count of Loads completed", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL4_CYC", ++ .pme_code = 0x000002C12C, ++ .pme_short_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_MEPF ] = { /* 611 */ +- .pme_name = "PM_DATA_FROM_L3_MEPF", +- .pme_code = 0x000002C042, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", ++[ POWER9_PME_PM_MRK_DATA_FROM_DL4 ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DL4", ++ .pme_code = 0x000001D152, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", + }, +-[ POWER9_PME_PM_L1PF_L2MEMACC ] = { /* 612 
*/ +- .pme_name = "PM_L1PF_L2MEMACC", +- .pme_code = 0x0000016890, +- .pme_short_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", +- .pme_long_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", ++[ POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", ++ .pme_code = 0x000004E11E, ++ .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", + }, +-[ POWER9_PME_PM_INST_FROM_L3MISS ] = { /* 613 */ +- .pme_name = "PM_INST_FROM_L3MISS", +- .pme_code = 0x00000300FA, +- .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", +- .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++[ POWER9_PME_PM_MRK_DATA_FROM_DMEM ] = { ++ .pme_name = "PM_MRK_DATA_FROM_DMEM", ++ .pme_code = 0x000003D14C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_LHS ] = { /* 614 */ +- .pme_name = "PM_MRK_LSU_FLUSH_LHS", +- .pme_code = 0x000000D0A0, +- .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", +- .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. 
If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", ++[ POWER9_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L21_MOD_CYC", ++ .pme_code = 0x000003D148, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_EE_OFF_EXT_INT ] = { /* 615 */ +- .pme_name = "PM_EE_OFF_EXT_INT", +- .pme_code = 0x0000002080, +- .pme_short_desc = "CyclesMSR[EE] is off and external interrupts are active", +- .pme_long_desc = "CyclesMSR[EE] is off and external interrupts are active", ++[ POWER9_PME_PM_MRK_DATA_FROM_L21_MOD ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L21_MOD", ++ .pme_code = 0x000004D146, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_TM_ST_CONF ] = { /* 616 */ +- .pme_name = "PM_TM_ST_CONF", +- .pme_code = 0x000003608C, +- .pme_short_desc = "TM Store (fav or non-fav) ran into conflict (failed)", +- .pme_long_desc = "TM Store (fav or non-fav) ran into conflict (failed)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L21_SHR_CYC", ++ .pme_code = 0x000001D154, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_PMC6_OVERFLOW ] = { /* 617 */ +- .pme_name = "PM_PMC6_OVERFLOW", +- .pme_code = 
0x0000030024, +- .pme_short_desc = "Overflow from counter 6", +- .pme_long_desc = "Overflow from counter 6", ++[ POWER9_PME_PM_MRK_DATA_FROM_L21_SHR ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L21_SHR", ++ .pme_code = 0x000002D14E, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_INST_FROM_DL2L3_MOD ] = { /* 618 */ +- .pme_name = "PM_INST_FROM_DL2L3_MOD", +- .pme_code = 0x0000044048, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_CYC", ++ .pme_code = 0x0000014156, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 due to a marked load", + }, +-[ POWER9_PME_PM_MRK_INST_CMPL ] = { /* 619 */ +- .pme_name = "PM_MRK_INST_CMPL", +- .pme_code = 0x00000401E0, +- .pme_short_desc = "marked instruction completed", +- .pme_long_desc = "marked instruction completed", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC", ++ .pme_code = 0x000001415A, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due 
to a marked load", + }, +-[ POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL ] = { /* 620 */ +- .pme_name = "PM_TAGE_CORRECT_TAKEN_CMPL", +- .pme_code = 0x00000050B4, +- .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", +- .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Counted at completion for taken branches only", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST", ++ .pme_code = 0x000002D148, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", + }, +-[ POWER9_PME_PM_MRK_L1_ICACHE_MISS ] = { /* 621 */ +- .pme_name = "PM_MRK_L1_ICACHE_MISS", +- .pme_code = 0x00000101E4, +- .pme_short_desc = "sampled Instruction suffered an icache Miss", +- .pme_long_desc = "sampled Instruction suffered an icache Miss", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC", ++ .pme_code = 0x000003D140, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", + }, +-[ POWER9_PME_PM_TLB_MISS ] = { /* 622 */ +- .pme_name = "PM_TLB_MISS", +- .pme_code = 0x0000020066, +- .pme_short_desc = "TLB Miss (I + D)", +- .pme_long_desc = "TLB Miss (I + D)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER", ++ .pme_code = 0x000002C124, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 
with dispatch conflict due to a marked load", + }, +-[ POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { /* 623 */ +- .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", +- .pme_code = 0x0000026084, +- .pme_short_desc = "L2 RC load dispatch attempt failed due to other reasons", +- .pme_long_desc = "L2 RC load dispatch attempt failed due to other reasons", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", ++ .pme_code = 0x000003D144, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load", + }, +-[ POWER9_PME_PM_FXU_BUSY ] = { /* 624 */ +- .pme_name = "PM_FXU_BUSY", +- .pme_code = 0x000002000A, +- .pme_short_desc = "Cycles in which all 4 FXUs are busy.", +- .pme_long_desc = "Cycles in which all 4 FXUs are busy. The FXU is running at capacity", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_MEPF", ++ .pme_code = 0x000004C120, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked load", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT ] = { /* 625 */ +- .pme_name = "PM_DATA_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x000003C042, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2MISS_CYC", ++ .pme_code = 0x0000035152, ++ .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load", + }, +-[ POWER9_PME_PM_INST_FROM_L3_1_MOD ] = { /* 626 */ +- .pme_name = "PM_INST_FROM_L3_1_MOD", +- .pme_code = 0x0000024044, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2MISS", ++ .pme_code = 0x00000401E8, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", + }, +-[ POWER9_PME_PM_LSU_REJECT_LMQ_FULL ] = { /* 627 */ +- .pme_name = "PM_LSU_REJECT_LMQ_FULL", +- .pme_code = 0x000003001C, +- .pme_short_desc = "LSU Reject due to LMQ full (up to 4 per cycles)", +- .pme_long_desc = "LSU Reject due to LMQ full (up to 4 per cycles)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC ] = { ++ .pme_name = 
"PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC", ++ .pme_code = 0x0000014158, ++ .pme_short_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", + }, +-[ POWER9_PME_PM_CO_DISP_FAIL ] = { /* 628 */ +- .pme_name = "PM_CO_DISP_FAIL", +- .pme_code = 0x0000016886, +- .pme_short_desc = "CO dispatch failed due to all CO machines being busy", +- .pme_long_desc = "CO dispatch failed due to all CO machines being busy", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000002C120, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", + }, +-[ POWER9_PME_PM_L3_TRANS_PF ] = { /* 629 */ +- .pme_name = "PM_L3_TRANS_PF", +- .pme_code = 0x00000468A4, +- .pme_short_desc = "L3 Transient prefetch", +- .pme_long_desc = "L3 Transient prefetch", ++[ POWER9_PME_PM_MRK_DATA_FROM_L2 ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L2", ++ .pme_code = 0x000002C126, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", + }, +-[ POWER9_PME_PM_MRK_ST_NEST ] = { /* 630 */ +- .pme_name = "PM_MRK_ST_NEST", +- .pme_code = 0x0000020138, +- .pme_short_desc = "Marked store sent to nest", +- .pme_long_desc = "Marked store sent to nest", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD_CYC", ++ .pme_code = 0x0000035158, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload 
with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_LSU1_L1_CAM_CANCEL ] = { /* 631 */ +- .pme_name = "PM_LSU1_L1_CAM_CANCEL", +- .pme_code = 0x000000F890, +- .pme_short_desc = "ls1 l1 tm cam cancel", +- .pme_long_desc = "ls1 l1 tm cam cancel", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD", ++ .pme_code = 0x000004D144, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_INST_CHIP_PUMP_CPRED ] = { /* 632 */ +- .pme_name = "PM_INST_CHIP_PUMP_CPRED", +- .pme_code = 0x0000014050, +- .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", +- .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR_CYC", ++ .pme_code = 0x000001D142, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_LSU3_VECTOR_ST_FIN ] = { /* 633 */ +- .pme_name = "PM_LSU3_VECTOR_ST_FIN", +- .pme_code = 0x000000C88C, +- .pme_short_desc = "A vector store instruction finished.", +- .pme_long_desc = "A vector store instruction finished. 
The ops considered in this category are stv*, stxv*, stxsi*x, stxsd, and stxssp", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR", ++ .pme_code = 0x000002D14C, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_1_MOD ] = { /* 634 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2_1_MOD", +- .pme_code = 0x000004F146, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_MOD_CYC", ++ .pme_code = 0x000001D140, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_IBUF_FULL_CYC ] = { /* 635 */ +- .pme_name = "PM_IBUF_FULL_CYC", +- .pme_code = 0x0000004884, +- .pme_short_desc = "Cycles No room in ibuff", +- .pme_long_desc = "Cycles No room in ibuff", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_MOD ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_MOD", ++ .pme_code = 0x000002D144, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_8FLOP_CMPL ] = { /* 636 */ +- .pme_name = "PM_8FLOP_CMPL", +- .pme_code = 0x000004D054, +- .pme_short_desc = "", +- .pme_long_desc = "", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_SHR_CYC", ++ .pme_code = 0x0000035156, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { /* 637 */ +- .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", +- .pme_code = 0x000002C128, +- .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Shared (S) 
data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++[ POWER9_PME_PM_MRK_DATA_FROM_L31_SHR ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L31_SHR", ++ .pme_code = 0x000004D124, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE ] = { /* 638 */ +- .pme_name = "PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE", +- .pme_code = 0x000004F14A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_CYC", ++ .pme_code = 0x0000035154, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC", ++ .pme_code = 0x000002C122, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_IC_L3 ] = { /* 639 */ +- .pme_name = "PM_ICT_NOSLOT_IC_L3", +- .pme_code = 0x000003E052, +- .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", +- .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000001D144, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LWSYNC ] = { /* 640 */ +- .pme_name = "PM_CMPLU_STALL_LWSYNC", +- .pme_code = 0x0000010036, +- .pme_short_desc = "completion stall due to lwsync", +- .pme_long_desc = "completion stall due to lwsync", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", ++ .pme_code = 0x000001415C, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load", ++ .pme_long_desc = "Duration in cycles to 
reload from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 ] = { /* 641 */ +- .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L2", +- .pme_code = 0x000002D028, +- .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", +- .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_MEPF", ++ .pme_code = 0x000002D142, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { /* 642 */ +- .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", +- .pme_code = 0x000004C12A, +- .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3MISS_CYC", ++ .pme_code = 0x000001415E, ++ .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", + }, +-[ POWER9_PME_PM_L3_SN0_BUSY ] = { /* 643 */ +- .pme_name = "PM_L3_SN0_BUSY", +- .pme_code = 0x00000460AC, +- .pme_short_desc = "lifetime, sample of snooper machine 0 valid", +- .pme_long_desc = "lifetime, sample of snooper machine 0 valid", ++[ 
POWER9_PME_PM_MRK_DATA_FROM_L3MISS ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3MISS", ++ .pme_code = 0x00000201E4, ++ .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", + }, +-[ POWER9_PME_PM_TM_OUTER_TBEGIN_DISP ] = { /* 644 */ +- .pme_name = "PM_TM_OUTER_TBEGIN_DISP", +- .pme_code = 0x000004E05E, +- .pme_short_desc = "Number of outer tbegin instructions dispatched.", +- .pme_long_desc = "Number of outer tbegin instructions dispatched. The dispatch unit determines whether the tbegin instruction is outer or nested. This is a speculative count, which includes flushed instructions", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC", ++ .pme_code = 0x000004C124, ++ .pme_short_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", + }, +-[ POWER9_PME_PM_GRP_PUMP_MPRED ] = { /* 645 */ +- .pme_name = "PM_GRP_PUMP_MPRED", +- .pme_code = 0x0000020052, +- .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000003D146, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's 
L3 without conflict due to a marked load", + }, +-[ POWER9_PME_PM_SRQ_EMPTY_CYC ] = { /* 646 */ +- .pme_name = "PM_SRQ_EMPTY_CYC", +- .pme_code = 0x0000040008, +- .pme_short_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", +- .pme_long_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", ++[ POWER9_PME_PM_MRK_DATA_FROM_L3 ] = { ++ .pme_name = "PM_MRK_DATA_FROM_L3", ++ .pme_code = 0x000004D142, ++ .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", + }, +-[ POWER9_PME_PM_LSU_REJECT_LHS ] = { /* 647 */ +- .pme_name = "PM_LSU_REJECT_LHS", +- .pme_code = 0x000004E05C, +- .pme_short_desc = "LSU Reject due to LHS (up to 4 per cycle)", +- .pme_long_desc = "LSU Reject due to LHS (up to 4 per cycle)", ++[ POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_LL4_CYC", ++ .pme_code = 0x000002C12E, ++ .pme_short_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_MEPF ] = { /* 648 */ +- .pme_name = "PM_IPTEG_FROM_L3_MEPF", +- .pme_code = 0x0000025042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. 
due to a instruction side request", ++[ POWER9_PME_PM_MRK_DATA_FROM_LL4 ] = { ++ .pme_name = "PM_MRK_DATA_FROM_LL4", ++ .pme_code = 0x000001D14C, ++ .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", ++}, ++[ POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", ++ .pme_code = 0x000004D128, ++ .pme_short_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_LMEM ] = { /* 649 */ ++[ POWER9_PME_PM_MRK_DATA_FROM_LMEM ] = { + .pme_name = "PM_MRK_DATA_FROM_LMEM", + .pme_code = 0x000003D142, + .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", + .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", + }, +-[ POWER9_PME_PM_L3_P1_CO_MEM ] = { /* 650 */ +- .pme_name = "PM_L3_P1_CO_MEM", +- .pme_code = 0x00000368AA, +- .pme_short_desc = "l3 CO to memory port 1", +- .pme_long_desc = "l3 CO to memory port 1", ++[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_MEMORY_CYC", ++ .pme_code = 0x000001D146, ++ .pme_short_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", + }, +-[ POWER9_PME_PM_FREQ_DOWN ] = { /* 651 */ +- .pme_name = "PM_FREQ_DOWN", +- .pme_code = 0x000003000C, +- .pme_short_desc = "Power Management: Below Threshold B", +- .pme_long_desc = "Power Management: Below Threshold B", ++[ POWER9_PME_PM_MRK_DATA_FROM_MEMORY ] = { ++ .pme_name = 
"PM_MRK_DATA_FROM_MEMORY", ++ .pme_code = 0x00000201E0, ++ .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", + }, +-[ POWER9_PME_PM_L3_CINJ ] = { /* 652 */ +- .pme_name = "PM_L3_CINJ", +- .pme_code = 0x00000368A4, +- .pme_short_desc = "l3 ci of cache inject", +- .pme_long_desc = "l3 ci of cache inject", ++[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC", ++ .pme_code = 0x000001D14E, ++ .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", + }, +-[ POWER9_PME_PM_L3_P0_PF_RTY ] = { /* 653 */ +- .pme_name = "PM_L3_P0_PF_RTY", +- .pme_code = 0x00000260AE, +- .pme_short_desc = "L3 PF received retry port 2", +- .pme_long_desc = "L3 PF received retry port 2", ++[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000002D120, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", + }, +-[ POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD ] = { /* 654 */ +- .pme_name = "PM_IPTEG_FROM_DL2L3_MOD", +- .pme_code = 0x0000045048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a 
different Node or Group (Distant), as this chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", ++[ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC", ++ .pme_code = 0x000003515A, ++ .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_MRK_INST_ISSUED ] = { /* 655 */ +- .pme_name = "PM_MRK_INST_ISSUED", +- .pme_code = 0x0000010132, +- .pme_short_desc = "Marked instruction issued", +- .pme_long_desc = "Marked instruction issued", ++[ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000004D140, ++ .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", + }, +-[ POWER9_PME_PM_INST_FROM_RL2L3_SHR ] = { /* 656 */ +- .pme_name = "PM_INST_FROM_RL2L3_SHR", +- .pme_code = 0x000001404A, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { ++ 
.pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", ++ .pme_code = 0x000002D14A, ++ .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_LSU_STCX_FAIL ] = { /* 657 */ +- .pme_name = "PM_LSU_STCX_FAIL", +- .pme_code = 0x000000F080, +- .pme_short_desc = "stcx failed", +- .pme_long_desc = "stcx failed", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", ++ .pme_code = 0x000001D14A, ++ .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_L3_P1_NODE_PUMP ] = { /* 658 */ +- .pme_name = "PM_L3_P1_NODE_PUMP", +- .pme_code = 0x00000168B0, +- .pme_short_desc = "L3 pf sent with nodal scope port 1", +- .pme_long_desc = "L3 pf sent with nodal scope port 1", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", ++ .pme_code = 0x000004C12A, ++ .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_MEM_RWITM ] = { /* 659 */ +- .pme_name = "PM_MEM_RWITM", +- .pme_code = 0x000003C05E, +- .pme_short_desc = "Memory Read With Intent to Modify for this 
thread", +- .pme_long_desc = "Memory Read With Intent to Modify for this thread", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", ++ .pme_code = 0x0000035150, ++ .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", + }, +-[ POWER9_PME_PM_DP_QP_FLOP_CMPL ] = { /* 660 */ +- .pme_name = "PM_DP_QP_FLOP_CMPL", +- .pme_code = 0x000004D05C, +- .pme_short_desc = "Double-precision flop instruction completed", +- .pme_long_desc = "Double-precision flop instruction completed", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RL4_CYC", ++ .pme_code = 0x000004D12A, ++ .pme_short_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group ( Remote) due to a marked load", + }, +-[ POWER9_PME_PM_RUN_PURR ] = { /* 661 */ +- .pme_name = "PM_RUN_PURR", +- .pme_code = 0x00000400F4, +- .pme_short_desc = "Run_PURR", +- .pme_long_desc = "Run_PURR", ++[ POWER9_PME_PM_MRK_DATA_FROM_RL4 ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RL4", ++ .pme_code = 0x000003515C, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to a marked load", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LMQ_FULL ] = { /* 662 */ +- .pme_name = "PM_CMPLU_STALL_LMQ_FULL", +- .pme_code = 0x000004C014, +- .pme_short_desc = "Finish stall because the NTF instruction was a load that missed in 
the L1 and the LMQ was unable to accept this load miss request because it was full", +- .pme_long_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", ++[ POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", ++ .pme_code = 0x000002C12A, ++ .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group ( Remote) due to a marked load", + }, +-[ POWER9_PME_PM_CMPLU_STALL_VDPLONG ] = { /* 663 */ +- .pme_name = "PM_CMPLU_STALL_VDPLONG", +- .pme_code = 0x000003C05A, +- .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", +- .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. 
Qualified by NOT vector AND multicycle", ++[ POWER9_PME_PM_MRK_DATA_FROM_RMEM ] = { ++ .pme_name = "PM_MRK_DATA_FROM_RMEM", ++ .pme_code = 0x000001D148, ++ .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", ++ .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group ( Remote) due to a marked load", + }, +-[ POWER9_PME_PM_LSU2_TM_L1_HIT ] = { /* 664 */ +- .pme_name = "PM_LSU2_TM_L1_HIT", +- .pme_code = 0x000000E098, +- .pme_short_desc = "Load tm hit in L1", +- .pme_long_desc = "Load tm hit in L1", ++[ POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV ] = { ++ .pme_name = "PM_MRK_DCACHE_RELOAD_INTV", ++ .pme_code = 0x0000040118, ++ .pme_short_desc = "Combined Intervention event", ++ .pme_long_desc = "Combined Intervention event", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3 ] = { /* 665 */ +- .pme_name = "PM_MRK_DATA_FROM_L3", +- .pme_code = 0x000004D142, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", ++[ POWER9_PME_PM_MRK_DERAT_MISS_16G ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_16G", ++ .pme_code = 0x000004C15C, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", + }, +-[ POWER9_PME_PM_CMPLU_STALL_MTFPSCR ] = { /* 666 */ +- .pme_name = "PM_CMPLU_STALL_MTFPSCR", +- .pme_code = 0x000004E012, +- .pme_short_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", +- .pme_long_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", ++[ POWER9_PME_PM_MRK_DERAT_MISS_16M ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_16M", ++ .pme_code = 0x000003D154, ++ .pme_short_desc = 
"Marked Data ERAT Miss (Data TLB Access) page size 16M", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", + }, +-[ POWER9_PME_PM_STALL_END_ICT_EMPTY ] = { /* 667 */ +- .pme_name = "PM_STALL_END_ICT_EMPTY", +- .pme_code = 0x0000010028, +- .pme_short_desc = "The number a times the core transitioned from a stall to ICT-empty for this thread", +- .pme_long_desc = "The number a times the core transitioned from a stall to ICT-empty for this thread", ++[ POWER9_PME_PM_MRK_DERAT_MISS_1G ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_1G", ++ .pme_code = 0x000003D152, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G.", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", + }, +-[ POWER9_PME_PM_L3_P1_CO_L31 ] = { /* 668 */ +- .pme_name = "PM_L3_P1_CO_L31", +- .pme_code = 0x00000468AA, +- .pme_short_desc = "l3 CO to L3.", +- .pme_long_desc = "l3 CO to L3.1 (lco) port 1", ++[ POWER9_PME_PM_MRK_DERAT_MISS_2M ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_2M", ++ .pme_code = 0x000002D152, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M.", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { /* 669 */ +- .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", +- .pme_code = 0x000002C012, +- .pme_short_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", +- .pme_long_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", ++[ POWER9_PME_PM_MRK_DERAT_MISS_4K ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_4K", ++ .pme_code = 0x000002D150, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", + }, +-[ POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD ] = { /* 670 */ +- .pme_name = "PM_DPTEG_FROM_DL2L3_MOD", +- .pme_code = 0x000004E048, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DERAT_MISS_64K ] = { ++ .pme_name = "PM_MRK_DERAT_MISS_64K", ++ .pme_code = 0x000002D154, ++ .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", ++ .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", + }, +-[ POWER9_PME_PM_INST_FROM_L3_MEPF ] = { /* 671 */ +- .pme_name = "PM_INST_FROM_L3_MEPF", +- .pme_code = 0x0000024042, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_DERAT_MISS ] = { ++ .pme_name = "PM_MRK_DERAT_MISS", ++ .pme_code = 0x00000301E6, ++ .pme_short_desc = "Erat Miss (TLB Access) All page sizes", ++ .pme_long_desc = "Erat Miss (TLB Access) All page sizes", + }, +-[ POWER9_PME_PM_L1_DCACHE_RELOADED_ALL ] = { /* 672 */ +- .pme_name = "PM_L1_DCACHE_RELOADED_ALL", +- .pme_code = 0x000001002C, +- .pme_short_desc = "L1 data cache reloaded for demand.", +- .pme_long_desc = "L1 data cache reloaded for demand. 
If MMCR1[16] is 1, prefetches will be included as well", ++[ POWER9_PME_PM_MRK_DFU_FIN ] = { ++ .pme_name = "PM_MRK_DFU_FIN", ++ .pme_code = 0x0000020132, ++ .pme_short_desc = "Decimal Unit marked Instruction Finish", ++ .pme_long_desc = "Decimal Unit marked Instruction Finish", + }, +-[ POWER9_PME_PM_DATA_GRP_PUMP_CPRED ] = { /* 673 */ +- .pme_name = "PM_DATA_GRP_PUMP_CPRED", +- .pme_code = 0x000002C050, +- .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", +- .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_MOD", ++ .pme_code = 0x000004F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DERAT_MISS_64K ] = { /* 674 */ +- .pme_name = "PM_MRK_DERAT_MISS_64K", +- .pme_code = 0x000002D154, +- .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", +- .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_SHR", ++ .pme_code = 0x000003F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L2_ST_MISS ] = { /* 675 */ +- .pme_name = "PM_L2_ST_MISS", +- .pme_code = 0x0000026880, +- .pme_short_desc = "All successful D-Side Store dispatches that were an L2miss for this thread", +- .pme_long_desc = "All successful D-Side Store dispatches that were an L2miss for this thread", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DL4 ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_DL4", ++ .pme_code = 0x000003F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE ] = { /* 676 */ +- .pme_name = "PM_L3_PF_OFF_CHIP_CACHE", +- .pme_code = 0x00000368A0, +- .pme_short_desc = "L3 Prefetch from Off chip cache", +- .pme_long_desc = "L3 Prefetch from Off chip cache", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_DMEM ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_DMEM", ++ .pme_code = 0x000004F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS ] = { /* 677 */ +- .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3MISS", +- .pme_code = 0x000004F05E, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation. The source could be local/remote/distant memory or another core's cache", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L21_MOD ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L21_MOD", ++ .pme_code = 0x000004F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LWSYNC ] = { /* 678 */ +- .pme_name = "PM_LWSYNC", +- .pme_code = 0x0000005894, +- .pme_short_desc = "Lwsync instruction decoded and transferred", +- .pme_long_desc = "Lwsync instruction decoded and transferred", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L21_SHR ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L21_SHR", ++ .pme_code = 0x000003F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LS3_UNALIGNED_LD ] = { /* 679 */ +- .pme_name = "PM_LS3_UNALIGNED_LD", +- .pme_code = 0x000000C898, +- .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", +- .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_MEPF", ++ .pme_code = 0x000002F140, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L3_RD0_BUSY ] = { /* 680 */ +- .pme_name = "PM_L3_RD0_BUSY", +- .pme_code = 0x00000468B4, +- .pme_short_desc = "lifetime, sample of RD machine 0 valid", +- .pme_long_desc = "lifetime, sample of RD machine 0 valid", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L2MISS", ++ .pme_code = 0x000001F14E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LINK_STACK_CORRECT ] = { /* 681 */ +- .pme_name = "PM_LINK_STACK_CORRECT", +- .pme_code = 0x00000058A0, +- .pme_short_desc = "Link stack predicts right address", +- .pme_long_desc = "Link stack predicts right address", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L2_NO_CONFLICT", ++ .pme_code = 0x000001F140, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS ] = { /* 682 */ +- .pme_name = "PM_MRK_DTLB_MISS", +- .pme_code = 0x00000401E4, +- .pme_short_desc = "Marked dtlb miss", +- .pme_long_desc = "Marked dtlb miss", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L2 ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L2", ++ .pme_code = 0x000001F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_INST_IMC_MATCH_CMPL ] = { /* 683 */ +- .pme_name = "PM_INST_IMC_MATCH_CMPL", +- .pme_code = 0x000004001C, +- .pme_short_desc = "IMC Match Count", +- .pme_long_desc = "IMC Match Count", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_MOD", ++ .pme_code = 0x000004F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LS1_ERAT_MISS_PREF ] = { /* 684 */ +- .pme_name = "PM_LS1_ERAT_MISS_PREF", +- .pme_code = 0x000000E884, +- .pme_short_desc = "LS1 Erat miss due to prefetch", +- .pme_long_desc = "LS1 Erat miss due to prefetch", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_SHR", ++ .pme_code = 0x000003F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L3_CO0_BUSY ] = { /* 685 */ +- .pme_name = "PM_L3_CO0_BUSY", +- .pme_code = 0x00000468AC, +- .pme_short_desc = "lifetime, sample of CO machine 0 valid", +- .pme_long_desc = "lifetime, sample of CO machine 0 valid", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L31_MOD ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L31_MOD", ++ .pme_code = 0x000002F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_BFU_BUSY ] = { /* 686 */ +- .pme_name = "PM_BFU_BUSY", +- .pme_code = 0x000003005C, +- .pme_short_desc = "Cycles in which all 4 Binary Floating Point units are busy.", +- .pme_long_desc = "Cycles in which all 4 Binary Floating Point units are busy. 
The BFU is running at capacity", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L31_SHR ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L31_SHR", ++ .pme_code = 0x000001F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L2_SYS_GUESS_CORRECT ] = { /* 687 */ +- .pme_name = "PM_L2_SYS_GUESS_CORRECT", +- .pme_code = 0x0000036088, +- .pme_short_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", +- .pme_long_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT", ++ .pme_code = 0x000003F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_L1_SW_PREF ] = { /* 688 */ +- .pme_name = "PM_L1_SW_PREF", +- .pme_code = 0x000000E880, +- .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", +- .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_MEPF", ++ .pme_code = 0x000002F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_LL4 ] = { /* 689 */ +- .pme_name = "PM_MRK_DATA_FROM_LL4", +- .pme_code = 0x000001D14C, +- .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L3MISS", ++ .pme_code = 0x000004F14E, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_INST_FIN ] = { /* 690 */ +- .pme_name = "PM_MRK_INST_FIN", +- .pme_code = 0x0000030130, +- .pme_short_desc = "marked instruction finished", +- .pme_long_desc = "marked instruction finished", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L3_NO_CONFLICT", ++ .pme_code = 0x000001F144, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_SYNC_MRK_L3MISS ] = { /* 691 */ +- .pme_name = "PM_SYNC_MRK_L3MISS", +- .pme_code = 0x0000015154, +- .pme_short_desc = "Marked L3 misses that can throw a synchronous interrupt", +- .pme_long_desc = "Marked L3 misses that can throw a synchronous interrupt", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_L3 ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_L3", ++ .pme_code = 0x000004F142, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LSU1_STORE_REJECT ] = { /* 692 */ +- .pme_name = "PM_LSU1_STORE_REJECT", +- .pme_code = 0x000000F88C, +- .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", +- .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_LL4 ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_LL4", ++ .pme_code = 0x000001F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_CHIP_PUMP_CPRED ] = { /* 693 */ +- .pme_name = "PM_CHIP_PUMP_CPRED", +- .pme_code = 0x0000010050, +- .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_LMEM ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_LMEM", ++ .pme_code = 0x000002F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC ] = { /* 694 */ +- .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC", +- .pme_code = 0x000001D14E, +- .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", +- .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_MEMORY", ++ .pme_code = 0x000002F14C, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DATA_STORE ] = { /* 695 */ +- .pme_name = "PM_DATA_STORE", +- .pme_code = 0x000000F0A0, +- .pme_short_desc = "All ops that drain from s2q to L2 containing data", +- .pme_long_desc = "All ops that drain from s2q to L2 containing data", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE", ++ .pme_code = 0x000004F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_LS1_UNALIGNED_LD ] = { /* 696 */ +- .pme_name = "PM_LS1_UNALIGNED_LD", +- .pme_code = 0x000000C894, +- .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.", +- .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE", ++ .pme_code = 0x000001F148, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_TM_TRANS_RUN_INST ] = { /* 697 */ +- .pme_name = "PM_TM_TRANS_RUN_INST", +- .pme_code = 0x0000030060, +- .pme_short_desc = "Run instructions completed in transactional state (gated by the run latch)", +- .pme_long_desc = "Run instructions completed in transactional state (gated by the run latch)", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_MOD", ++ .pme_code = 0x000002F146, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_IC_MISS_CMPL ] = { /* 698 */ +- .pme_name = "PM_IC_MISS_CMPL", +- .pme_code = 0x000001D15A, +- .pme_short_desc = "Non-speculative icache miss, counted at completion", +- .pme_long_desc = "Non-speculative icache miss, counted at completion", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_SHR", ++ .pme_code = 0x000001F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_THRESH_NOT_MET ] = { /* 699 */ +- .pme_name = "PM_THRESH_NOT_MET", +- .pme_code = 0x000004016E, +- .pme_short_desc = "Threshold counter did not meet threshold", +- .pme_long_desc = "Threshold counter did not meet threshold", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RL4 ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_RL4", ++ .pme_code = 0x000002F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L2 ] = { /* 700 */ +- .pme_name = "PM_DPTEG_FROM_L2", +- .pme_code = 0x000001E042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DPTEG_FROM_RMEM ] = { ++ .pme_name = "PM_MRK_DPTEG_FROM_RMEM", ++ .pme_code = 0x000003F14A, ++ .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request.", ++ .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", + }, +-[ POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR ] = { /* 701 */ +- .pme_name = "PM_IPTEG_FROM_RL2L3_SHR", +- .pme_code = 0x000001504A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a instruction side request", ++[ POWER9_PME_PM_MRK_DTLB_MISS_16G ] = { ++ .pme_name = "PM_MRK_DTLB_MISS_16G", ++ .pme_code = 0x000002D15E, ++ .pme_short_desc = "Marked Data TLB Miss page size 16G", ++ .pme_long_desc = "Marked Data TLB Miss page size 16G", + }, +-[ POWER9_PME_PM_DPTEG_FROM_RMEM ] = { /* 702 */ +- .pme_name = "PM_DPTEG_FROM_RMEM", +- .pme_code = 0x000003E04A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group ( Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_DTLB_MISS_16M ] = { ++ .pme_name = "PM_MRK_DTLB_MISS_16M", ++ .pme_code = 0x000004C15E, ++ .pme_short_desc = "Marked Data TLB Miss page size 16M", ++ .pme_long_desc = "Marked Data TLB Miss page size 16M", + }, +-[ POWER9_PME_PM_L3_L2_CO_MISS ] = { /* 703 */ +- .pme_name = "PM_L3_L2_CO_MISS", +- .pme_code = 0x00000368A2, +- .pme_short_desc = "L2 castout miss", +- .pme_long_desc = "L2 castout miss", ++[ POWER9_PME_PM_MRK_DTLB_MISS_1G ] = { ++ .pme_name = "PM_MRK_DTLB_MISS_1G", ++ .pme_code = 0x000001D15C, ++ .pme_short_desc = "Marked Data TLB reload (after a miss) page size 2M.", ++ .pme_long_desc = "Marked Data TLB reload (after a miss) page size 2M. Implies radix translation was used", + }, +-[ POWER9_PME_PM_IPTEG_FROM_DMEM ] = { /* 704 */ +- .pme_name = "PM_IPTEG_FROM_DMEM", +- .pme_code = 0x000004504C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", ++[ POWER9_PME_PM_MRK_DTLB_MISS_4K ] = { ++ .pme_name = "PM_MRK_DTLB_MISS_4K", ++ .pme_code = 0x000002D156, ++ .pme_short_desc = "Marked Data TLB Miss page size 4k", ++ .pme_long_desc = "Marked Data TLB Miss page size 4k", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS_64K ] = { /* 705 */ ++[ POWER9_PME_PM_MRK_DTLB_MISS_64K ] = { + .pme_name = "PM_MRK_DTLB_MISS_64K", + .pme_code = 0x000003D156, + .pme_short_desc = "Marked Data TLB Miss page size 64K", + .pme_long_desc = "Marked Data TLB Miss page size 64K", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC ] = { /* 706 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC", +- .pme_code = 0x000002C122, +- .pme_short_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", +- .pme_long_desc = 
"Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", ++[ POWER9_PME_PM_MRK_DTLB_MISS ] = { ++ .pme_name = "PM_MRK_DTLB_MISS", ++ .pme_code = 0x00000401E4, ++ .pme_short_desc = "Marked dtlb miss", ++ .pme_long_desc = "Marked dtlb miss", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC ] = { ++ .pme_name = "PM_MRK_FAB_RSP_BKILL_CYC", ++ .pme_code = 0x000001F152, ++ .pme_short_desc = "cycles L2 RC took for a bkill", ++ .pme_long_desc = "cycles L2 RC took for a bkill", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_BKILL ] = { ++ .pme_name = "PM_MRK_FAB_RSP_BKILL", ++ .pme_code = 0x0000040154, ++ .pme_short_desc = "Marked store had to do a bkill", ++ .pme_long_desc = "Marked store had to do a bkill", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY ] = { ++ .pme_name = "PM_MRK_FAB_RSP_CLAIM_RTY", ++ .pme_code = 0x000003015E, ++ .pme_short_desc = "Sampled store did a rwitm and got a rty", ++ .pme_long_desc = "Sampled store did a rwitm and got a rty", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC ] = { ++ .pme_name = "PM_MRK_FAB_RSP_DCLAIM_CYC", ++ .pme_code = 0x000002F152, ++ .pme_short_desc = "cycles L2 RC took for a dclaim", ++ .pme_long_desc = "cycles L2 RC took for a dclaim", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM ] = { ++ .pme_name = "PM_MRK_FAB_RSP_DCLAIM", ++ .pme_code = 0x0000030154, ++ .pme_short_desc = "Marked store had to do a dclaim", ++ .pme_long_desc = "Marked store had to do a dclaim", ++}, ++[ POWER9_PME_PM_MRK_FAB_RSP_RD_RTY ] = { ++ .pme_name = "PM_MRK_FAB_RSP_RD_RTY", ++ .pme_code = 0x000004015E, ++ .pme_short_desc = "Sampled L2 reads retry count", ++ .pme_long_desc = "Sampled L2 reads retry count", + }, +-[ POWER9_PME_PM_LSU_FIN ] = { /* 707 */ +- .pme_name = "PM_LSU_FIN", +- .pme_code = 0x0000030066, +- .pme_short_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", +- .pme_long_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", ++[ POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV ] = { ++ .pme_name = 
"PM_MRK_FAB_RSP_RD_T_INTV", ++ .pme_code = 0x000001015E, ++ .pme_short_desc = "Sampled Read got a T intervention", ++ .pme_long_desc = "Sampled Read got a T intervention", + }, +-[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { /* 708 */ +- .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_OTHER", +- .pme_code = 0x000004C040, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", ++[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC ] = { ++ .pme_name = "PM_MRK_FAB_RSP_RWITM_CYC", ++ .pme_code = 0x000004F150, ++ .pme_short_desc = "cycles L2 RC took for a rwitm", ++ .pme_long_desc = "cycles L2 RC took for a rwitm", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE ] = { /* 709 */ +- .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE", +- .pme_code = 0x000004D140, +- .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY ] = { ++ .pme_name = "PM_MRK_FAB_RSP_RWITM_RTY", ++ .pme_code = 0x000002015E, ++ .pme_short_desc = "Sampled store did a rwitm and got a rty", ++ .pme_long_desc = "Sampled store did a rwitm and got a rty", + }, +-[ POWER9_PME_PM_LSU_STCX ] = { /* 710 */ +- .pme_name = "PM_LSU_STCX", +- .pme_code = 0x000000C090, +- .pme_short_desc = "STCX sent to nest, i.", +- .pme_long_desc = "STCX sent to nest, i.e. 
total", ++[ POWER9_PME_PM_MRK_FXU_FIN ] = { ++ .pme_name = "PM_MRK_FXU_FIN", ++ .pme_code = 0x0000020134, ++ .pme_short_desc = "fxu marked instr finish", ++ .pme_long_desc = "fxu marked instr finish", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD ] = { /* 711 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_1_MOD", +- .pme_code = 0x000004D146, +- .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", ++[ POWER9_PME_PM_MRK_IC_MISS ] = { ++ .pme_name = "PM_MRK_IC_MISS", ++ .pme_code = 0x000004013A, ++ .pme_short_desc = "Marked instruction experienced I cache miss", ++ .pme_long_desc = "Marked instruction experienced I cache miss", + }, +-[ POWER9_PME_PM_VSU_NON_FLOP_CMPL ] = { /* 712 */ +- .pme_name = "PM_VSU_NON_FLOP_CMPL", +- .pme_code = 0x000004D050, +- .pme_short_desc = "", +- .pme_long_desc = "", ++[ POWER9_PME_PM_MRK_INST_CMPL ] = { ++ .pme_name = "PM_MRK_INST_CMPL", ++ .pme_code = 0x00000401E0, ++ .pme_short_desc = "marked instruction completed", ++ .pme_long_desc = "marked instruction completed", + }, +-[ POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT ] = { /* 713 */ +- .pme_name = "PM_INST_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x0000034042, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_MRK_INST_DECODED ] = { ++ .pme_name = "PM_MRK_INST_DECODED", ++ .pme_code = 0x0000020130, ++ .pme_short_desc = "An instruction was marked at decode time.", ++ .pme_long_desc = "An instruction was marked at decode time. 
Random Instruction Sampling (RIS) only", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_SHR ] = { /* 714 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_1_SHR", +- .pme_code = 0x000002D14E, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", ++[ POWER9_PME_PM_MRK_INST_DISP ] = { ++ .pme_name = "PM_MRK_INST_DISP", ++ .pme_code = 0x00000101E0, ++ .pme_short_desc = "The thread has dispatched a randomly sampled marked instruction", ++ .pme_long_desc = "The thread has dispatched a randomly sampled marked instruction", + }, +-[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 ] = { /* 715 */ +- .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3", +- .pme_code = 0x000004F05A, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache. This is the deepest level of PWC possible for a translation", ++[ POWER9_PME_PM_MRK_INST_FIN ] = { ++ .pme_name = "PM_MRK_INST_FIN", ++ .pme_code = 0x0000030130, ++ .pme_short_desc = "marked instruction finished", ++ .pme_long_desc = "marked instruction finished", + }, +-[ POWER9_PME_PM_TAGE_CORRECT ] = { /* 716 */ +- .pme_name = "PM_TAGE_CORRECT", +- .pme_code = 0x00000058B4, +- .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", +- .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. 
Includes taken and not taken and is counted at execution time", ++[ POWER9_PME_PM_MRK_INST_FROM_L3MISS ] = { ++ .pme_name = "PM_MRK_INST_FROM_L3MISS", ++ .pme_code = 0x00000401E6, ++ .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", ++ .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", + }, +-[ POWER9_PME_PM_TM_FAV_CAUSED_FAIL ] = { /* 717 */ +- .pme_name = "PM_TM_FAV_CAUSED_FAIL", +- .pme_code = 0x000002688C, +- .pme_short_desc = "TM Load (fav) caused another thread to fail", +- .pme_long_desc = "TM Load (fav) caused another thread to fail", ++[ POWER9_PME_PM_MRK_INST_ISSUED ] = { ++ .pme_name = "PM_MRK_INST_ISSUED", ++ .pme_code = 0x0000010132, ++ .pme_short_desc = "Marked instruction issued", ++ .pme_long_desc = "Marked instruction issued", + }, +-[ POWER9_PME_PM_RADIX_PWC_L1_HIT ] = { /* 718 */ +- .pme_name = "PM_RADIX_PWC_L1_HIT", +- .pme_code = 0x000001F056, +- .pme_short_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", +- .pme_long_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", ++[ POWER9_PME_PM_MRK_INST_TIMEO ] = { ++ .pme_name = "PM_MRK_INST_TIMEO", ++ .pme_code = 0x0000040134, ++ .pme_short_desc = "marked Instruction finish timeout (instruction lost)", ++ .pme_long_desc = "marked Instruction finish timeout (instruction lost)", + }, +-[ POWER9_PME_PM_LSU0_LMQ_S0_VALID ] = { /* 719 */ +- .pme_name = "PM_LSU0_LMQ_S0_VALID", +- .pme_code = 0x000000D8B8, +- .pme_short_desc = "Slot 0 of LMQ valid", +- .pme_long_desc = "Slot 0 of LMQ valid", ++[ POWER9_PME_PM_MRK_INST ] = { ++ .pme_name = "PM_MRK_INST", ++ .pme_code = 0x0000024058, ++ .pme_short_desc = "An instruction was marked.", ++ .pme_long_desc = "An instruction was marked. 
Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens", + }, +-[ POWER9_PME_PM_BR_MPRED_CCACHE ] = { /* 720 */ +- .pme_name = "PM_BR_MPRED_CCACHE", +- .pme_code = 0x00000040AC, +- .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", +- .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", ++[ POWER9_PME_PM_MRK_L1_ICACHE_MISS ] = { ++ .pme_name = "PM_MRK_L1_ICACHE_MISS", ++ .pme_code = 0x00000101E4, ++ .pme_short_desc = "sampled Instruction suffered an icache Miss", ++ .pme_long_desc = "sampled Instruction suffered an icache Miss", + }, +-[ POWER9_PME_PM_L1_DEMAND_WRITE ] = { /* 721 */ +- .pme_name = "PM_L1_DEMAND_WRITE", +- .pme_code = 0x000000408C, +- .pme_short_desc = "Instruction Demand sectors wriittent into IL1", +- .pme_long_desc = "Instruction Demand sectors wriittent into IL1", ++[ POWER9_PME_PM_MRK_L1_RELOAD_VALID ] = { ++ .pme_name = "PM_MRK_L1_RELOAD_VALID", ++ .pme_code = 0x00000101EA, ++ .pme_short_desc = "Marked demand reload", ++ .pme_long_desc = "Marked demand reload", + }, +-[ POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD ] = { /* 722 */ +- .pme_name = "PM_CMPLU_STALL_FLUSH_ANY_THREAD", +- .pme_code = 0x000001E056, +- .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", +- .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", ++[ POWER9_PME_PM_MRK_L2_RC_DISP ] = { ++ .pme_name = "PM_MRK_L2_RC_DISP", ++ .pme_code = 0x0000020114, ++ .pme_short_desc = "Marked Instruction RC dispatched in L2", ++ .pme_long_desc = "Marked Instruction RC dispatched in L2", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3MISS ] = { /* 723 */ +- .pme_name = 
"PM_IPTEG_FROM_L3MISS", +- .pme_code = 0x000004504E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a instruction side request", ++[ POWER9_PME_PM_MRK_L2_RC_DONE ] = { ++ .pme_name = "PM_MRK_L2_RC_DONE", ++ .pme_code = 0x000003012A, ++ .pme_short_desc = "Marked RC done", ++ .pme_long_desc = "Marked RC done", + }, +-[ POWER9_PME_PM_MRK_DTLB_MISS_16G ] = { /* 724 */ +- .pme_name = "PM_MRK_DTLB_MISS_16G", +- .pme_code = 0x000002D15E, +- .pme_short_desc = "Marked Data TLB Miss page size 16G", +- .pme_long_desc = "Marked Data TLB Miss page size 16G", ++[ POWER9_PME_PM_MRK_L2_TM_REQ_ABORT ] = { ++ .pme_name = "PM_MRK_L2_TM_REQ_ABORT", ++ .pme_code = 0x000001E15E, ++ .pme_short_desc = "TM abort", ++ .pme_long_desc = "TM abort", + }, +-[ POWER9_PME_PM_IPTEG_FROM_RL4 ] = { /* 725 */ +- .pme_name = "PM_IPTEG_FROM_RL4", +- .pme_code = 0x000002504A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a instruction side request", ++[ POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER ] = { ++ .pme_name = "PM_MRK_L2_TM_ST_ABORT_SISTER", ++ .pme_code = 0x000003E15C, ++ .pme_short_desc = "TM marked store abort for this thread", ++ .pme_long_desc = "TM marked store abort for this thread", + }, +-[ POWER9_PME_PM_L2_RCST_DISP ] = { /* 726 */ +- .pme_name = "PM_L2_RCST_DISP", +- .pme_code = 0x0000036084, +- .pme_short_desc = "L2 RC store dispatch attempt", +- .pme_long_desc = "L2 RC store dispatch attempt", ++[ POWER9_PME_PM_MRK_LARX_FIN ] = { ++ .pme_name = "PM_MRK_LARX_FIN", ++ .pme_code = 0x0000040116, ++ .pme_short_desc = "Larx finished", ++ 
.pme_long_desc = "Larx finished", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC ] = { /* 727 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC", +- .pme_code = 0x000003D140, +- .pme_short_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", +- .pme_long_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", ++[ POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { ++ .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", ++ .pme_code = 0x000001013E, ++ .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", ++ .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", + }, +-[ POWER9_PME_PM_CMPLU_STALL ] = { /* 728 */ +- .pme_name = "PM_CMPLU_STALL", +- .pme_code = 0x000001E054, +- .pme_short_desc = "Nothing completed and ICT not empty", +- .pme_long_desc = "Nothing completed and ICT not empty", ++[ POWER9_PME_PM_MRK_LD_MISS_L1_CYC ] = { ++ .pme_name = "PM_MRK_LD_MISS_L1_CYC", ++ .pme_code = 0x000001D056, ++ .pme_short_desc = "Marked ld latency", ++ .pme_long_desc = "Marked ld latency", + }, +-[ POWER9_PME_PM_DISP_CLB_HELD_SB ] = { /* 729 */ +- .pme_name = "PM_DISP_CLB_HELD_SB", +- .pme_code = 0x0000002090, +- .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", +- .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", ++[ POWER9_PME_PM_MRK_LD_MISS_L1 ] = { ++ .pme_name = "PM_MRK_LD_MISS_L1", ++ .pme_code = 0x00000201E2, ++ .pme_short_desc = "Marked DL1 Demand Miss counted at exec time.", ++ .pme_long_desc = "Marked DL1 Demand Miss counted at exec time. 
Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", + }, +-[ POWER9_PME_PM_L3_SN_USAGE ] = { /* 730 */ +- .pme_name = "PM_L3_SN_USAGE", +- .pme_code = 0x00000160AC, +- .pme_short_desc = "rotating sample of 8 snoop valids", +- .pme_long_desc = "rotating sample of 8 snoop valids", ++[ POWER9_PME_PM_MRK_LSU_DERAT_MISS ] = { ++ .pme_name = "PM_MRK_LSU_DERAT_MISS", ++ .pme_code = 0x0000030162, ++ .pme_short_desc = "Marked derat reload (miss) for any page size", ++ .pme_long_desc = "Marked derat reload (miss) for any page size", + }, +-[ POWER9_PME_PM_FLOP_CMPL ] = { /* 731 */ +- .pme_name = "PM_FLOP_CMPL", +- .pme_code = 0x00000100F4, +- .pme_short_desc = "Floating Point Operation Finished", +- .pme_long_desc = "Floating Point Operation Finished", ++[ POWER9_PME_PM_MRK_LSU_FIN ] = { ++ .pme_name = "PM_MRK_LSU_FIN", ++ .pme_code = 0x0000040132, ++ .pme_short_desc = "lsu marked instr PPC finish", ++ .pme_long_desc = "lsu marked instr PPC finish", + }, +-[ POWER9_PME_PM_MRK_L2_RC_DISP ] = { /* 732 */ +- .pme_name = "PM_MRK_L2_RC_DISP", +- .pme_code = 0x0000020114, +- .pme_short_desc = "Marked Instruction RC dispatched in L2", +- .pme_long_desc = "Marked Instruction RC dispatched in L2", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_ATOMIC", ++ .pme_code = 0x000000D098, ++ .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", ++ .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", + }, +-[ POWER9_PME_PM_L3_PF_ON_CHIP_CACHE ] = { /* 733 */ +- .pme_name = "PM_L3_PF_ON_CHIP_CACHE", +- .pme_code = 0x00000360A0, +- .pme_short_desc = "L3 Prefetch from On chip cache", +- .pme_long_desc = "L3 Prefetch from On chip cache", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_EMSH ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_EMSH", ++ .pme_code = 0x000000D898, ++ .pme_short_desc = "An ERAT miss was detected after a set-p hit.", ++ .pme_long_desc = "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", + }, +-[ POWER9_PME_PM_IC_DEMAND_CYC ] = { /* 734 */ +- .pme_name = "PM_IC_DEMAND_CYC", +- .pme_code = 0x0000010018, +- .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", +- .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_LARX_STCX", ++ .pme_code = 0x000000D8A4, ++ .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", ++ .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. 
The flush happens when the older larx/stcx relaunches", + }, +-[ POWER9_PME_PM_CO_USAGE ] = { /* 735 */ +- .pme_name = "PM_CO_USAGE", +- .pme_code = 0x000002688E, +- .pme_short_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", +- .pme_long_desc = " continuous 16 cycle(2to1) window where this signals rotates thru sampling each machine", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_LHL_SHL", ++ .pme_code = 0x000000D8A0, ++ .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", ++ .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", + }, +-[ POWER9_PME_PM_ISYNC ] = { /* 736 */ +- .pme_name = "PM_ISYNC", +- .pme_code = 0x0000002884, +- .pme_short_desc = "Isync completion count per thread", +- .pme_long_desc = "Isync completion count per thread", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_LHS ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_LHS", ++ .pme_code = 0x000000D0A0, ++ .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", ++ .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. 
If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", + }, +-[ POWER9_PME_PM_MEM_CO ] = { /* 737 */ +- .pme_name = "PM_MEM_CO", +- .pme_code = 0x000004C058, +- .pme_short_desc = "Memory castouts from this thread", +- .pme_long_desc = "Memory castouts from this thread", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_RELAUNCH_MISS", ++ .pme_code = 0x000000D09C, ++ .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", ++ .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", + }, +-[ POWER9_PME_PM_NTC_ALL_FIN ] = { /* 738 */ +- .pme_name = "PM_NTC_ALL_FIN", +- .pme_code = 0x000002001A, +- .pme_short_desc = "Cycles after all instructions have finished to group completed", +- .pme_long_desc = "Cycles after all instructions have finished to group completed", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_SAO ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_SAO", ++ .pme_code = 0x000000D0A4, ++ .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", ++ .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", + }, +-[ POWER9_PME_PM_CMPLU_STALL_EXCEPTION ] = { /* 739 */ +- .pme_name = "PM_CMPLU_STALL_EXCEPTION", +- .pme_code = 0x000003003A, +- .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", +- .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it 
was interrupted by ANY exception, which has to be serviced before the instruction can complete", ++[ POWER9_PME_PM_MRK_LSU_FLUSH_UE ] = { ++ .pme_name = "PM_MRK_LSU_FLUSH_UE", ++ .pme_code = 0x000000D89C, ++ .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", ++ .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", + }, +-[ POWER9_PME_PM_LS0_LAUNCH_HELD_PREF ] = { /* 740 */ +- .pme_name = "PM_LS0_LAUNCH_HELD_PREF", +- .pme_code = 0x000000C09C, +- .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", +- .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", ++[ POWER9_PME_PM_MRK_NTC_CYC ] = { ++ .pme_name = "PM_MRK_NTC_CYC", ++ .pme_code = 0x000002011C, ++ .pme_short_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", ++ .pme_long_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED ] = { /* 741 */ +- .pme_name = "PM_ICT_NOSLOT_BR_MPRED", +- .pme_code = 0x000004D01E, +- .pme_short_desc = "Ict empty for this thread due to branch mispred", +- .pme_long_desc = "Ict empty for this thread due to branch mispred", ++[ POWER9_PME_PM_MRK_NTF_FIN ] = { ++ .pme_name = "PM_MRK_NTF_FIN", ++ .pme_code = 0x0000020112, ++ .pme_short_desc = "Marked next to finish instruction finished", ++ .pme_long_desc = "Marked next to finish instruction finished", + }, +-[ POWER9_PME_PM_MRK_BR_CMPL ] = { /* 742 */ +- .pme_name = "PM_MRK_BR_CMPL", +- .pme_code = 0x000001016E, +- .pme_short_desc = "Branch Instruction completed", +- .pme_long_desc = "Branch Instruction completed", ++[ 
POWER9_PME_PM_MRK_PROBE_NOP_CMPL ] = { ++ .pme_name = "PM_MRK_PROBE_NOP_CMPL", ++ .pme_code = 0x000001F05E, ++ .pme_short_desc = "Marked probeNops completed", ++ .pme_long_desc = "Marked probeNops completed", + }, +-[ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD ] = { /* 743 */ +- .pme_name = "PM_ICT_NOSLOT_DISP_HELD", +- .pme_code = 0x000004E01A, +- .pme_short_desc = "Cycles in which the NTC instruciton is held at dispatch for any reason", +- .pme_long_desc = "Cycles in which the NTC instruciton is held at dispatch for any reason", ++[ POWER9_PME_PM_MRK_RUN_CYC ] = { ++ .pme_name = "PM_MRK_RUN_CYC", ++ .pme_code = 0x000001D15E, ++ .pme_short_desc = "Run cycles in which a marked instruction is in the pipeline", ++ .pme_long_desc = "Run cycles in which a marked instruction is in the pipeline", + }, +-[ POWER9_PME_PM_IC_PREF_WRITE ] = { /* 744 */ +- .pme_name = "PM_IC_PREF_WRITE", +- .pme_code = 0x000000488C, +- .pme_short_desc = "Instruction prefetch written into IL1", +- .pme_long_desc = "Instruction prefetch written into IL1", ++[ POWER9_PME_PM_MRK_STALL_CMPLU_CYC ] = { ++ .pme_name = "PM_MRK_STALL_CMPLU_CYC", ++ .pme_code = 0x000003013E, ++ .pme_short_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", ++ .pme_long_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL ] = { /* 745 */ +- .pme_name = "PM_MRK_LSU_FLUSH_LHL_SHL", +- .pme_code = 0x000000D8A0, +- .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", +- .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. 
If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", ++[ POWER9_PME_PM_MRK_ST_CMPL_INT ] = { ++ .pme_name = "PM_MRK_ST_CMPL_INT", ++ .pme_code = 0x0000030134, ++ .pme_short_desc = "marked store finished with intervention", ++ .pme_long_desc = "marked store finished with intervention", + }, +-[ POWER9_PME_PM_DTLB_MISS_1G ] = { /* 746 */ +- .pme_name = "PM_DTLB_MISS_1G", +- .pme_code = 0x000004C05A, +- .pme_short_desc = "Data TLB reload (after a miss) page size 1G.", +- .pme_long_desc = "Data TLB reload (after a miss) page size 1G. Implies radix translation was used", ++[ POWER9_PME_PM_MRK_ST_CMPL ] = { ++ .pme_name = "PM_MRK_ST_CMPL", ++ .pme_code = 0x00000301E2, ++ .pme_short_desc = "Marked store completed and sent to nest", ++ .pme_long_desc = "Marked store completed and sent to nest", + }, +-[ POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT ] = { /* 747 */ +- .pme_name = "PM_DATA_FROM_L2_NO_CONFLICT", +- .pme_code = 0x000001C040, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", ++[ POWER9_PME_PM_MRK_STCX_FAIL ] = { ++ .pme_name = "PM_MRK_STCX_FAIL", ++ .pme_code = 0x000003E158, ++ .pme_short_desc = "marked stcx failed", ++ .pme_long_desc = "marked stcx failed", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS ] = { /* 748 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3MISS", +- .pme_code = 0x000004F14E, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a localtion other than the local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_STCX_FIN ] = { ++ .pme_name = "PM_MRK_STCX_FIN", ++ .pme_code = 0x0000024056, ++ .pme_short_desc = "Number of marked stcx instructions finished.", ++ .pme_long_desc = "Number of marked stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed", + }, +-[ POWER9_PME_PM_BR_PRED ] = { /* 749 */ +- .pme_name = "PM_BR_PRED", +- .pme_code = 0x000000409C, +- .pme_short_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target.", +- .pme_long_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target. Includes taken and not taken and is counted at execution time", ++[ POWER9_PME_PM_MRK_ST_DONE_L2 ] = { ++ .pme_name = "PM_MRK_ST_DONE_L2", ++ .pme_code = 0x0000010134, ++ .pme_short_desc = "marked store completed in L2 ( RC machine done)", ++ .pme_long_desc = "marked store completed in L2 ( RC machine done)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL ] = { /* 750 */ +- .pme_name = "PM_CMPLU_STALL_OTHER_CMPL", +- .pme_code = 0x0000030006, +- .pme_short_desc = "Instructions the core completed while this tread was stalled", +- .pme_long_desc = "Instructions the core completed while this tread was stalled", ++[ POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC ] = { ++ .pme_name = "PM_MRK_ST_DRAIN_TO_L2DISP_CYC", ++ .pme_code = 0x000003F150, ++ .pme_short_desc = "cycles to drain st from core to L2", ++ .pme_long_desc = "cycles to drain st from core to L2", + }, +-[ POWER9_PME_PM_INST_FROM_DMEM ] = { /* 751 */ +- .pme_name = "PM_INST_FROM_DMEM", +- .pme_code = 0x000004404C, +- .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", ++[ 
POWER9_PME_PM_MRK_ST_FWD ] = { ++ .pme_name = "PM_MRK_ST_FWD", ++ .pme_code = 0x000003012C, ++ .pme_short_desc = "Marked st forwards", ++ .pme_long_desc = "Marked st forwards", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT ] = { /* 752 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L2_NO_CONFLICT", +- .pme_code = 0x000001F140, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC ] = { ++ .pme_name = "PM_MRK_ST_L2DISP_TO_CMPL_CYC", ++ .pme_code = 0x000001F150, ++ .pme_short_desc = "cycles from L2 rc disp to l2 rc completion", ++ .pme_long_desc = "cycles from L2 rc disp to l2 rc completion", + }, +-[ POWER9_PME_PM_DC_PREF_SW_ALLOC ] = { /* 753 */ +- .pme_name = "PM_DC_PREF_SW_ALLOC", +- .pme_code = 0x000000F8A4, +- .pme_short_desc = "Prefetch stream allocated by software prefetching", +- .pme_long_desc = "Prefetch stream allocated by software prefetching", ++[ POWER9_PME_PM_MRK_ST_NEST ] = { ++ .pme_name = "PM_MRK_ST_NEST", ++ .pme_code = 0x0000020138, ++ .pme_short_desc = "Marked store sent to nest", ++ .pme_long_desc = "Marked store sent to nest", + }, +-[ POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { /* 754 */ +- .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", +- .pme_code = 0x0000046084, +- .pme_short_desc = "L2 RC store dispatch attempt failed due to other reasons", +- .pme_long_desc = "L2 RC store dispatch attempt failed due to other reasons", ++[ POWER9_PME_PM_MRK_TEND_FAIL ] = { ++ .pme_name = "PM_MRK_TEND_FAIL", ++ .pme_code = 0x00000028A4, ++ .pme_short_desc = "Nested or not nested tend failed for a marked tend instruction", ++ .pme_long_desc = "Nested or not nested tend failed for a marked tend 
instruction", + }, +-[ POWER9_PME_PM_CMPLU_STALL_EMQ_FULL ] = { /* 755 */ +- .pme_name = "PM_CMPLU_STALL_EMQ_FULL", +- .pme_code = 0x0000030004, +- .pme_short_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", +- .pme_long_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", ++[ POWER9_PME_PM_MRK_VSU_FIN ] = { ++ .pme_name = "PM_MRK_VSU_FIN", ++ .pme_code = 0x0000030132, ++ .pme_short_desc = "VSU marked instr finish", ++ .pme_long_desc = "VSU marked instr finish", + }, +-[ POWER9_PME_PM_MRK_INST_DECODED ] = { /* 756 */ +- .pme_name = "PM_MRK_INST_DECODED", +- .pme_code = 0x0000020130, +- .pme_short_desc = "An instruction was marked at decode time.", +- .pme_long_desc = "An instruction was marked at decode time. Random Instruction Sampling (RIS) only", ++[ POWER9_PME_PM_MULT_MRK ] = { ++ .pme_name = "PM_MULT_MRK", ++ .pme_code = 0x000003D15E, ++ .pme_short_desc = "mult marked instr", ++ .pme_long_desc = "mult marked instr", + }, +-[ POWER9_PME_PM_IERAT_RELOAD_4K ] = { /* 757 */ +- .pme_name = "PM_IERAT_RELOAD_4K", +- .pme_code = 0x0000020064, +- .pme_short_desc = "IERAT reloaded (after a miss) for 4K pages", +- .pme_long_desc = "IERAT reloaded (after a miss) for 4K pages", ++[ POWER9_PME_PM_NEST_REF_CLK ] = { ++ .pme_name = "PM_NEST_REF_CLK", ++ .pme_code = 0x000003006E, ++ .pme_short_desc = "Multiply by 4 to obtain the number of PB cycles", ++ .pme_long_desc = "Multiply by 4 to obtain the number of PB cycles", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER ] = { /* 758 */ +- .pme_name = "PM_CMPLU_STALL_LRQ_OTHER", +- .pme_code = 0x0000010004, +- .pme_short_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", +- .pme_long_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set 
prediction multihit and others", ++[ POWER9_PME_PM_NON_DATA_STORE ] = { ++ .pme_name = "PM_NON_DATA_STORE", ++ .pme_code = 0x000000F8A0, ++ .pme_short_desc = "All ops that drain from s2q to L2 and contain no data", ++ .pme_long_desc = "All ops that drain from s2q to L2 and contain no data", + }, +-[ POWER9_PME_PM_INST_FROM_L3_1_ECO_MOD ] = { /* 759 */ +- .pme_name = "PM_INST_FROM_L3_1_ECO_MOD", +- .pme_code = 0x0000044044, +- .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_NON_FMA_FLOP_CMPL ] = { ++ .pme_name = "PM_NON_FMA_FLOP_CMPL", ++ .pme_code = 0x000004D056, ++ .pme_short_desc = "Non FMA instruction completed", ++ .pme_long_desc = "Non FMA instruction completed", + }, +-[ POWER9_PME_PM_L3_P0_CO_L31 ] = { /* 760 */ +- .pme_name = "PM_L3_P0_CO_L31", +- .pme_code = 0x00000460AA, +- .pme_short_desc = "l3 CO to L3.", +- .pme_long_desc = "l3 CO to L3.1 (lco) port 0", ++[ POWER9_PME_PM_NON_MATH_FLOP_CMPL ] = { ++ .pme_name = "PM_NON_MATH_FLOP_CMPL", ++ .pme_code = 0x000004D05A, ++ .pme_short_desc = "Non FLOP operation completed", ++ .pme_long_desc = "Non FLOP operation completed", + }, +-[ POWER9_PME_PM_NON_TM_RST_SC ] = { /* 761 */ ++[ POWER9_PME_PM_NON_TM_RST_SC ] = { + .pme_name = "PM_NON_TM_RST_SC", + .pme_code = 0x00000260A6, +- .pme_short_desc = "non tm snp rst tm sc", +- .pme_long_desc = "non tm snp rst tm sc", +-}, +-[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 ] = { /* 762 */ +- .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L2", +- .pme_code = 0x000001F05A, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk 
cache from the core's L2 data cache. This is the deepest level of PWC possible for a translation", +-}, +-[ POWER9_PME_PM_INST_SYS_PUMP_CPRED ] = { /* 763 */ +- .pme_name = "PM_INST_SYS_PUMP_CPRED", +- .pme_code = 0x0000034050, +- .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", +- .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", +-}, +-[ POWER9_PME_PM_DPTEG_FROM_DMEM ] = { /* 764 */ +- .pme_name = "PM_DPTEG_FROM_DMEM", +- .pme_code = 0x000004E04C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++ .pme_short_desc = "Non-TM snp rst TM SC", ++ .pme_long_desc = "Non-TM snp rst TM SC", + }, +-[ POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { /* 765 */ +- .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", +- .pme_code = 0x000002003E, +- .pme_short_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", +- .pme_long_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", ++[ POWER9_PME_PM_NTC_ALL_FIN ] = { ++ .pme_name = "PM_NTC_ALL_FIN", ++ .pme_code = 0x000002001A, ++ .pme_short_desc = "Cycles after all instructions have finished to group completed", ++ .pme_long_desc = "Cycles after all instructions have finished to group completed", + }, +-[ POWER9_PME_PM_SYS_PUMP_CPRED ] = { /* 766 */ +- .pme_name = "PM_SYS_PUMP_CPRED", +- .pme_code = 0x0000030050, +- .pme_short_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", +- .pme_long_desc = 
"Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++[ POWER9_PME_PM_NTC_FIN ] = { ++ .pme_name = "PM_NTC_FIN", ++ .pme_code = 0x000002405A, ++ .pme_short_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes.", ++ .pme_long_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes. This event is used to account for cycles in which work is being completed in the CPI stack", + }, +-[ POWER9_PME_PM_DTLB_MISS_64K ] = { /* 767 */ +- .pme_name = "PM_DTLB_MISS_64K", +- .pme_code = 0x000003C056, +- .pme_short_desc = "Data TLB Miss page size 64K", +- .pme_long_desc = "Data TLB Miss page size 64K", ++[ POWER9_PME_PM_NTC_ISSUE_HELD_ARB ] = { ++ .pme_name = "PM_NTC_ISSUE_HELD_ARB", ++ .pme_code = 0x000002E016, ++ .pme_short_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", ++ .pme_long_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", + }, +-[ POWER9_PME_PM_CMPLU_STALL_STCX ] = { /* 768 */ +- .pme_name = "PM_CMPLU_STALL_STCX", +- .pme_code = 0x000002D01C, +- .pme_short_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", +- .pme_long_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", ++[ POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL ] = { ++ .pme_name = "PM_NTC_ISSUE_HELD_DARQ_FULL", ++ .pme_code = 0x000001006A, ++ .pme_short_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", ++ .pme_long_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY ] = { /* 769 */ +- .pme_name = "PM_MRK_FAB_RSP_CLAIM_RTY", 
+- .pme_code = 0x000003015E, +- .pme_short_desc = "Sampled store did a rwitm and got a rty", +- .pme_long_desc = "Sampled store did a rwitm and got a rty", ++[ POWER9_PME_PM_NTC_ISSUE_HELD_OTHER ] = { ++ .pme_name = "PM_NTC_ISSUE_HELD_OTHER", ++ .pme_code = 0x000003D05A, ++ .pme_short_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", ++ .pme_long_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", + }, +-[ POWER9_PME_PM_PARTIAL_ST_FIN ] = { /* 770 */ ++[ POWER9_PME_PM_PARTIAL_ST_FIN ] = { + .pme_name = "PM_PARTIAL_ST_FIN", + .pme_code = 0x0000034054, + .pme_short_desc = "Any store finished by an LSU slice", + .pme_long_desc = "Any store finished by an LSU slice", + }, +-[ POWER9_PME_PM_THRD_CONC_RUN_INST ] = { /* 771 */ +- .pme_name = "PM_THRD_CONC_RUN_INST", +- .pme_code = 0x00000300F4, +- .pme_short_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", +- .pme_long_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", ++[ POWER9_PME_PM_PMC1_OVERFLOW ] = { ++ .pme_name = "PM_PMC1_OVERFLOW", ++ .pme_code = 0x0000020010, ++ .pme_short_desc = "Overflow from counter 1", ++ .pme_long_desc = "Overflow from counter 1", + }, +-[ POWER9_PME_PM_CO_TM_SC_FOOTPRINT ] = { /* 772 */ +- .pme_name = "PM_CO_TM_SC_FOOTPRINT", +- .pme_code = 0x0000026086, +- .pme_short_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", +- .pme_long_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", ++[ POWER9_PME_PM_PMC1_REWIND ] = { ++ .pme_name = "PM_PMC1_REWIND", ++ .pme_code = 0x000004D02C, ++ .pme_short_desc = "", ++ .pme_long_desc = "", + }, +-[ 
POWER9_PME_PM_MRK_LARX_FIN ] = { /* 773 */ +- .pme_name = "PM_MRK_LARX_FIN", +- .pme_code = 0x0000040116, +- .pme_short_desc = "Larx finished", +- .pme_long_desc = "Larx finished", ++[ POWER9_PME_PM_PMC1_SAVED ] = { ++ .pme_name = "PM_PMC1_SAVED", ++ .pme_code = 0x000004D010, ++ .pme_short_desc = "PMC1 Rewind Value saved", ++ .pme_long_desc = "PMC1 Rewind Value saved", + }, +-[ POWER9_PME_PM_L3_LOC_GUESS_WRONG ] = { /* 774 */ +- .pme_name = "PM_L3_LOC_GUESS_WRONG", +- .pme_code = 0x00000268B2, +- .pme_short_desc = "Initial scope=node but data from out side local node (near or far or rem).", +- .pme_long_desc = "Initial scope=node but data from out side local node (near or far or rem). Prediction too Low", ++[ POWER9_PME_PM_PMC2_OVERFLOW ] = { ++ .pme_name = "PM_PMC2_OVERFLOW", ++ .pme_code = 0x0000030010, ++ .pme_short_desc = "Overflow from counter 2", ++ .pme_long_desc = "Overflow from counter 2", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 ] = { /* 775 */ +- .pme_name = "PM_CMPLU_STALL_DMISS_L21_L31", +- .pme_code = 0x000002C018, +- .pme_short_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", +- .pme_long_desc = "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)", ++[ POWER9_PME_PM_PMC2_REWIND ] = { ++ .pme_name = "PM_PMC2_REWIND", ++ .pme_code = 0x0000030020, ++ .pme_short_desc = "PMC2 Rewind Event (did not match condition)", ++ .pme_long_desc = "PMC2 Rewind Event (did not match condition)", ++}, ++[ POWER9_PME_PM_PMC2_SAVED ] = { ++ .pme_name = "PM_PMC2_SAVED", ++ .pme_code = 0x0000010022, ++ .pme_short_desc = "PMC2 Rewind Value saved", ++ .pme_long_desc = "PMC2 Rewind Value saved", ++}, ++[ POWER9_PME_PM_PMC3_OVERFLOW ] = { ++ .pme_name = "PM_PMC3_OVERFLOW", ++ .pme_code = 0x0000040010, ++ .pme_short_desc = "Overflow from counter 3", ++ .pme_long_desc = "Overflow from counter 3", + }, +-[ POWER9_PME_PM_SHL_ST_DISABLE ] = { /* 776 */ +- .pme_name = "PM_SHL_ST_DISABLE", +- .pme_code = 
0x0000005090, +- .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", +- .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", ++[ POWER9_PME_PM_PMC3_REWIND ] = { ++ .pme_name = "PM_PMC3_REWIND", ++ .pme_code = 0x000001000A, ++ .pme_short_desc = "PMC3 rewind event.", ++ .pme_long_desc = "PMC3 rewind event. A rewind happens when a speculative event (such as latency or CPI stack) is selected on PMC3 and the stall reason or reload source did not match the one programmed in PMC3. When this occurs, the count in PMC3 will not change.", + }, +-[ POWER9_PME_PM_VSU_FIN ] = { /* 777 */ +- .pme_name = "PM_VSU_FIN", +- .pme_code = 0x000002505C, +- .pme_short_desc = "VSU instruction finished.", +- .pme_long_desc = "VSU instruction finished. Up to 4 per cycle", ++[ POWER9_PME_PM_PMC3_SAVED ] = { ++ .pme_name = "PM_PMC3_SAVED", ++ .pme_code = 0x000004D012, ++ .pme_short_desc = "PMC3 Rewind Value saved", ++ .pme_long_desc = "PMC3 Rewind Value saved", + }, +-[ POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC ] = { /* 778 */ +- .pme_name = "PM_MRK_LSU_FLUSH_ATOMIC", +- .pme_code = 0x000000D098, +- .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", +- .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", ++[ POWER9_PME_PM_PMC4_OVERFLOW ] = { ++ .pme_name = "PM_PMC4_OVERFLOW", ++ .pme_code = 0x0000010010, ++ .pme_short_desc = "Overflow from counter 4", ++ .pme_long_desc = "Overflow from counter 4", + }, +-[ POWER9_PME_PM_L3_CI_HIT ] = { /* 779 */ +- .pme_name = "PM_L3_CI_HIT", +- .pme_code = 0x00000260A2, +- .pme_short_desc = "L3 Castins Hit (total count", +- .pme_long_desc = "L3 Castins Hit (total count", ++[ POWER9_PME_PM_PMC4_REWIND ] = { ++ .pme_name = "PM_PMC4_REWIND", ++ .pme_code = 0x0000010020, ++ .pme_short_desc = "PMC4 Rewind Event", ++ .pme_long_desc = "PMC4 Rewind Event", + }, +-[ POWER9_PME_PM_CMPLU_STALL_DARQ ] = { /* 780 */ +- .pme_name = "PM_CMPLU_STALL_DARQ", +- .pme_code = 0x000003405A, +- .pme_short_desc = "Finish stall because the next to finish instruction was spending cycles in the DARQ.", +- .pme_long_desc = "Finish stall because the next to finish instruction was spending cycles in the DARQ. 
If this count is large is likely because the LSAQ had less than 4 slots available", ++[ POWER9_PME_PM_PMC4_SAVED ] = { ++ .pme_name = "PM_PMC4_SAVED", ++ .pme_code = 0x0000030022, ++ .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", ++ .pme_long_desc = "PMC4 Rewind Value saved (matched condition)", + }, +-[ POWER9_PME_PM_L3_PF_ON_CHIP_MEM ] = { /* 781 */ +- .pme_name = "PM_L3_PF_ON_CHIP_MEM", +- .pme_code = 0x00000460A0, +- .pme_short_desc = "L3 Prefetch from On chip memory", +- .pme_long_desc = "L3 Prefetch from On chip memory", ++[ POWER9_PME_PM_PMC5_OVERFLOW ] = { ++ .pme_name = "PM_PMC5_OVERFLOW", ++ .pme_code = 0x0000010024, ++ .pme_short_desc = "Overflow from counter 5", ++ .pme_long_desc = "Overflow from counter 5", + }, +-[ POWER9_PME_PM_THRD_PRIO_0_1_CYC ] = { /* 782 */ +- .pme_name = "PM_THRD_PRIO_0_1_CYC", +- .pme_code = 0x00000040BC, +- .pme_short_desc = "Cycles thread running at priority level 0 or 1", +- .pme_long_desc = "Cycles thread running at priority level 0 or 1", ++[ POWER9_PME_PM_PMC6_OVERFLOW ] = { ++ .pme_name = "PM_PMC6_OVERFLOW", ++ .pme_code = 0x0000030024, ++ .pme_short_desc = "Overflow from counter 6", ++ .pme_long_desc = "Overflow from counter 6", + }, +-[ POWER9_PME_PM_DERAT_MISS_64K ] = { /* 783 */ +- .pme_name = "PM_DERAT_MISS_64K", +- .pme_code = 0x000002C054, +- .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 64K", +- .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 64K", ++[ POWER9_PME_PM_PROBE_NOP_DISP ] = { ++ .pme_name = "PM_PROBE_NOP_DISP", ++ .pme_code = 0x0000040014, ++ .pme_short_desc = "ProbeNops dispatched", ++ .pme_long_desc = "ProbeNops dispatched", + }, +-[ POWER9_PME_PM_PMC2_REWIND ] = { /* 784 */ +- .pme_name = "PM_PMC2_REWIND", +- .pme_code = 0x0000030020, +- .pme_short_desc = "PMC2 Rewind Event (did not match condition)", +- .pme_long_desc = "PMC2 Rewind Event (did not match condition)", ++[ POWER9_PME_PM_PTE_PREFETCH ] = { ++ .pme_name = "PM_PTE_PREFETCH", ++ .pme_code = 
0x000000F084, ++ .pme_short_desc = "PTE prefetches", ++ .pme_long_desc = "PTE prefetches", + }, +-[ POWER9_PME_PM_INST_FROM_L2 ] = { /* 785 */ +- .pme_name = "PM_INST_FROM_L2", +- .pme_code = 0x0000014042, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_PTESYNC ] = { ++ .pme_name = "PM_PTESYNC", ++ .pme_code = 0x000000589C, ++ .pme_short_desc = "ptesync instruction counted when the instruction is decoded and transmitted", ++ .pme_long_desc = "ptesync instruction counted when the instruction is decoded and transmitted", + }, +-[ POWER9_PME_PM_MRK_NTF_FIN ] = { /* 786 */ +- .pme_name = "PM_MRK_NTF_FIN", +- .pme_code = 0x0000020112, +- .pme_short_desc = "Marked next to finish instruction finished", +- .pme_long_desc = "Marked next to finish instruction finished", ++[ POWER9_PME_PM_PUMP_CPRED ] = { ++ .pme_name = "PM_PUMP_CPRED", ++ .pme_code = 0x0000010054, ++ .pme_short_desc = "Pump prediction correct.", ++ .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_ALL_SRQ_FULL ] = { /* 787 */ +- .pme_name = "PM_ALL_SRQ_FULL", +- .pme_code = 0x0000020004, +- .pme_short_desc = "Number of cycles the SRQ is completely out of srq entries.", +- .pme_long_desc = "Number of cycles the SRQ is completely out of srq entries. This event is not per thread, all threads will get the same count for this core resource", ++[ POWER9_PME_PM_PUMP_MPRED ] = { ++ .pme_name = "PM_PUMP_MPRED", ++ .pme_code = 0x0000040052, ++ .pme_short_desc = "Pump misprediction.", ++ .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_INST_DISP ] = { /* 788 */ +- .pme_name = "PM_INST_DISP", +- .pme_code = 0x00000200F2, +- .pme_short_desc = "# PPC Dispatched", +- .pme_long_desc = "# PPC Dispatched", ++[ POWER9_PME_PM_RADIX_PWC_L1_HIT ] = { ++ .pme_name = "PM_RADIX_PWC_L1_HIT", ++ .pme_code = 0x000001F056, ++ .pme_short_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", + }, +-[ POWER9_PME_PM_LS3_ERAT_MISS_PREF ] = { /* 789 */ +- .pme_name = "PM_LS3_ERAT_MISS_PREF", +- .pme_code = 0x000000E888, +- .pme_short_desc = "LS1 Erat miss due to prefetch", +- .pme_long_desc = "LS1 Erat miss due to prefetch", ++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L2", ++ .pme_code = 0x000002D026, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", + }, +-[ POWER9_PME_PM_STOP_FETCH_PENDING_CYC ] = { /* 790 */ +- .pme_name = "PM_STOP_FETCH_PENDING_CYC", +- .pme_code = 0x00000048A4, +- .pme_short_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", +- .pme_long_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", ++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS ] = { ++ .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3MISS", ++ .pme_code = 0x000004F056, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache. 
The source could be local/remote/distant memory or another core's cache", + }, +-[ POWER9_PME_PM_L1_DCACHE_RELOAD_VALID ] = { /* 791 */ +- .pme_name = "PM_L1_DCACHE_RELOAD_VALID", +- .pme_code = 0x00000300F6, +- .pme_short_desc = "DL1 reloaded due to Demand Load", +- .pme_long_desc = "DL1 reloaded due to Demand Load", ++[ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 ] = { ++ .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3", ++ .pme_code = 0x000003F058, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", + }, +-[ POWER9_PME_PM_L3_P0_LCO_NO_DATA ] = { /* 792 */ +- .pme_name = "PM_L3_P0_LCO_NO_DATA", +- .pme_code = 0x00000160AA, +- .pme_short_desc = "dataless l3 lco sent port 0", +- .pme_long_desc = "dataless l3 lco sent port 0", ++[ POWER9_PME_PM_RADIX_PWC_L2_HIT ] = { ++ .pme_name = "PM_RADIX_PWC_L2_HIT", ++ .pme_code = 0x000002D024, ++ .pme_short_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", + }, +-[ POWER9_PME_PM_LSU3_VECTOR_LD_FIN ] = { /* 793 */ +- .pme_name = "PM_LSU3_VECTOR_LD_FIN", +- .pme_code = 0x000000C884, +- .pme_short_desc = "A vector load instruction finished.", +- .pme_long_desc = "A vector load instruction finished. 
The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L2", ++ .pme_code = 0x000002D028, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT ] = { /* 794 */ +- .pme_name = "PM_MRK_DPTEG_FROM_L3_NO_CONFLICT", +- .pme_code = 0x000001F144, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 ] = { ++ .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L3", ++ .pme_code = 0x000003F05A, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", + }, +-[ POWER9_PME_PM_MRK_FXU_FIN ] = { /* 795 */ +- .pme_name = "PM_MRK_FXU_FIN", +- .pme_code = 0x0000020134, +- .pme_short_desc = "fxu marked instr finish", +- .pme_long_desc = "fxu marked instr finish", ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L2", ++ .pme_code = 0x000001F058, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache. 
This implies that level 3 and level 4 PWC accesses were not necessary for this translation", + }, +-[ POWER9_PME_PM_LS3_UNALIGNED_ST ] = { /* 796 */ +- .pme_name = "PM_LS3_UNALIGNED_ST", +- .pme_code = 0x000000F8BC, +- .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.", +- .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty", ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS ] = { ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3MISS", ++ .pme_code = 0x000004F05C, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation. The source could be local/remote/distant memory or another core's cache", + }, +-[ POWER9_PME_PM_DPTEG_FROM_MEMORY ] = { /* 797 */ +- .pme_name = "PM_DPTEG_FROM_MEMORY", +- .pme_code = 0x000002E04C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 ] = { ++ .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3", ++ .pme_code = 0x000004F058, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation", + }, +-[ POWER9_PME_PM_RUN_CYC_ST_MODE ] = { /* 798 */ +- .pme_name = "PM_RUN_CYC_ST_MODE", +- .pme_code = 0x000001006C, +- .pme_short_desc = "Cycles run latch is set and core is in ST mode", +- .pme_long_desc = "Cycles run latch is set and core is in ST mode", ++[ POWER9_PME_PM_RADIX_PWC_L3_HIT ] = { ++ .pme_name = "PM_RADIX_PWC_L3_HIT", ++ .pme_code = 0x000003F056, ++ .pme_short_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", ++ .pme_long_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", + }, +-[ POWER9_PME_PM_PMC4_OVERFLOW ] = { /* 799 */ +- .pme_name = "PM_PMC4_OVERFLOW", +- .pme_code = 0x0000010010, +- .pme_short_desc = "Overflow from counter 4", +- .pme_long_desc = "Overflow from counter 4", ++[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L2", ++ .pme_code = 0x000002D02A, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", + }, +-[ POWER9_PME_PM_THRESH_EXC_256 ] = { /* 800 */ +- .pme_name = "PM_THRESH_EXC_256", +- .pme_code = 0x00000101E8, +- .pme_short_desc = "Threshold counter exceed a count of 256", +- .pme_long_desc = "Threshold counter exceed a count of 256", ++[ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 ] = { ++ 
.pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L3", ++ .pme_code = 0x000001F15C, ++ .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", ++ .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC ] = { /* 801 */ +- .pme_name = "PM_MRK_DATA_FROM_L3_1_ECO_MOD_CYC", +- .pme_code = 0x0000035158, +- .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", +- .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L2", ++ .pme_code = 0x000002D02E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache. This implies that a level 4 PWC access was not necessary for this translation", + }, +-[ POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC ] = { /* 802 */ +- .pme_name = "PM_LSU0_LRQ_S0_VALID_CYC", +- .pme_code = 0x000000D8B4, +- .pme_short_desc = "Slot 0 of LRQ valid", +- .pme_long_desc = "Slot 0 of LRQ valid", ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS ] = { ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3MISS", ++ .pme_code = 0x000004F05E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation. 
The source could be local/remote/distant memory or another core's cache", + }, +-[ POWER9_PME_PM_INST_FROM_L2MISS ] = { /* 803 */ +- .pme_name = "PM_INST_FROM_L2MISS", +- .pme_code = 0x000001404E, +- .pme_short_desc = "The processor's Instruction cache was reloaded from a localtion other than the local core's L2 due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from a localtion other than the local core's L2 due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 ] = { ++ .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3", ++ .pme_code = 0x000003F05E, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation", + }, +-[ POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER ] = { /* 804 */ +- .pme_name = "PM_MRK_L2_TM_ST_ABORT_SISTER", +- .pme_code = 0x000003E15C, +- .pme_short_desc = "TM marked store abort for this thread", +- .pme_long_desc = "TM marked store abort for this thread", ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 ] = { ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L2", ++ .pme_code = 0x000001F05A, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache. 
This is the deepest level of PWC possible for a translation", + }, +-[ POWER9_PME_PM_L2_ST ] = { /* 805 */ +- .pme_name = "PM_L2_ST", +- .pme_code = 0x0000016880, +- .pme_short_desc = "All successful D-side store dispatches for this thread", +- .pme_long_desc = "All successful D-side store dispatches for this thread", ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS ] = { ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3MISS", ++ .pme_code = 0x000003F054, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache. This is the deepest level of PWC possible for a translation. The source could be local/remote/distant memory or another core's cache", ++}, ++[ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 ] = { ++ .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3", ++ .pme_code = 0x000004F05A, ++ .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache.", ++ .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache. 
This is the deepest level of PWC possible for a translation", + }, +-[ POWER9_PME_PM_RADIX_PWC_MISS ] = { /* 806 */ ++[ POWER9_PME_PM_RADIX_PWC_MISS ] = { + .pme_name = "PM_RADIX_PWC_MISS", + .pme_code = 0x000004F054, + .pme_short_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", + .pme_long_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", + }, +-[ POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC ] = { /* 807 */ +- .pme_name = "PM_MRK_ST_L2DISP_TO_CMPL_CYC", +- .pme_code = 0x000001F150, +- .pme_short_desc = "cycles from L2 rc disp to l2 rc completion", +- .pme_long_desc = "cycles from L2 rc disp to l2 rc completion", ++[ POWER9_PME_PM_RC0_BUSY ] = { ++ .pme_name = "PM_RC0_BUSY", ++ .pme_code = 0x000001608C, ++ .pme_short_desc = "RC mach 0 Busy.", ++ .pme_long_desc = "RC mach 0 Busy. Used by PMU to sample ave RC lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_LSU1_LDMX_FIN ] = { /* 808 */ +- .pme_name = "PM_LSU1_LDMX_FIN", +- .pme_code = 0x000000D888, +- .pme_short_desc = " New P9 instruction LDMX.", +- .pme_long_desc = " New P9 instruction LDMX.", ++[ POWER9_PME_PM_RC0_BUSY_ALT ] = { ++ .pme_name = "PM_RC0_BUSY", ++ .pme_code = 0x000002608C, ++ .pme_short_desc = "RC mach 0 Busy.", ++ .pme_long_desc = "RC mach 0 Busy. Used by PMU to sample ave RC lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_L3_P2_LCO_RTY ] = { /* 809 */ +- .pme_name = "PM_L3_P2_LCO_RTY", +- .pme_code = 0x00000260B4, +- .pme_short_desc = "L3 lateral cast out received retry on port 2", +- .pme_long_desc = "L3 lateral cast out received retry on port 2", ++[ POWER9_PME_PM_RC_USAGE ] = { ++ .pme_name = "PM_RC_USAGE", ++ .pme_code = 0x000001688C, ++ .pme_short_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each RC machine busy.", ++ .pme_long_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each RC machine busy. 
PMU uses this wave to then do 16 cyc count to sample total number of machs running", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { /* 810 */ +- .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", +- .pme_code = 0x000001D150, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", ++[ POWER9_PME_PM_RD_CLEARING_SC ] = { ++ .pme_name = "PM_RD_CLEARING_SC", ++ .pme_code = 0x00000468A6, ++ .pme_short_desc = "Read clearing SC", ++ .pme_long_desc = "Read clearing SC", ++}, ++[ POWER9_PME_PM_RD_FORMING_SC ] = { ++ .pme_name = "PM_RD_FORMING_SC", ++ .pme_code = 0x00000460A6, ++ .pme_short_desc = "Read forming SC", ++ .pme_long_desc = "Read forming SC", ++}, ++[ POWER9_PME_PM_RD_HIT_PF ] = { ++ .pme_name = "PM_RD_HIT_PF", ++ .pme_code = 0x00000268A8, ++ .pme_short_desc = "RD machine hit L3 PF machine", ++ .pme_long_desc = "RD machine hit L3 PF machine", ++}, ++[ POWER9_PME_PM_RUN_CYC_SMT2_MODE ] = { ++ .pme_name = "PM_RUN_CYC_SMT2_MODE", ++ .pme_code = 0x000003006C, ++ .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", ++ .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", + }, +-[ POWER9_PME_PM_L2_GRP_GUESS_CORRECT ] = { /* 811 */ +- .pme_name = "PM_L2_GRP_GUESS_CORRECT", +- .pme_code = 0x0000026088, +- .pme_short_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", +- .pme_long_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", ++[ POWER9_PME_PM_RUN_CYC_SMT4_MODE ] = { ++ .pme_name = "PM_RUN_CYC_SMT4_MODE", ++ .pme_code = 0x000002006C, ++ .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in 
SMT4 mode", ++ .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", + }, +-[ POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC ] = { /* 812 */ +- .pme_name = "PM_LSU0_1_LRQF_FULL_CYC", +- .pme_code = 0x000000D0BC, +- .pme_short_desc = "Counts the number of cycles the LRQF is full.", +- .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", ++[ POWER9_PME_PM_RUN_CYC_ST_MODE ] = { ++ .pme_name = "PM_RUN_CYC_ST_MODE", ++ .pme_code = 0x000001006C, ++ .pme_short_desc = "Cycles run latch is set and core is in ST mode", ++ .pme_long_desc = "Cycles run latch is set and core is in ST mode", + }, +-[ POWER9_PME_PM_DATA_GRP_PUMP_MPRED ] = { /* 813 */ +- .pme_name = "PM_DATA_GRP_PUMP_MPRED", +- .pme_code = 0x000002C052, +- .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", +- .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", ++[ POWER9_PME_PM_RUN_CYC ] = { ++ .pme_name = "PM_RUN_CYC", ++ .pme_code = 0x00000200F4, ++ .pme_short_desc = "Run_cycles", ++ .pme_long_desc = "Run_cycles", + }, +-[ POWER9_PME_PM_LSU3_ERAT_HIT ] = { /* 814 */ +- .pme_name = "PM_LSU3_ERAT_HIT", +- .pme_code = 0x000000E890, +- .pme_short_desc = "Primary ERAT hit.", +- .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", ++[ POWER9_PME_PM_RUN_INST_CMPL ] = { ++ .pme_name = "PM_RUN_INST_CMPL", ++ .pme_code = 0x00000400FA, ++ .pme_short_desc = "Run_Instructions", ++ .pme_long_desc = "Run_Instructions", + }, +-[ POWER9_PME_PM_FORCED_NOP ] = { /* 815 */ +- .pme_name = "PM_FORCED_NOP", +- .pme_code = 0x000000509C, +- .pme_short_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", +- .pme_long_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", ++[ POWER9_PME_PM_RUN_PURR ] = { ++ .pme_name = "PM_RUN_PURR", ++ .pme_code = 0x00000400F4, ++ .pme_short_desc = "Run_PURR", ++ .pme_long_desc = "Run_PURR", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 816 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST", +- .pme_code = 0x000002D148, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", ++[ POWER9_PME_PM_RUN_SPURR ] = { ++ .pme_name = "PM_RUN_SPURR", ++ .pme_code = 0x0000010008, ++ .pme_short_desc = "Run SPURR", ++ .pme_long_desc = "Run SPURR", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LARX ] = { /* 817 */ +- .pme_name = "PM_CMPLU_STALL_LARX", +- .pme_code = 0x000001002A, +- .pme_short_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", +- .pme_long_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", ++[ POWER9_PME_PM_S2Q_FULL ] = { ++ .pme_name = "PM_S2Q_FULL", ++ .pme_code = 0x000000E080, ++ .pme_short_desc = "Cycles during which the S2Q is full", ++ .pme_long_desc = "Cycles during which the S2Q is full", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_RL4 ] = { /* 818 */ +- .pme_name = "PM_MRK_DPTEG_FROM_RL4", +- 
.pme_code = 0x000002F14A, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group ( Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_SCALAR_FLOP_CMPL ] = { ++ .pme_name = "PM_SCALAR_FLOP_CMPL", ++ .pme_code = 0x0000045056, ++ .pme_short_desc = "Scalar flop operation completed", ++ .pme_long_desc = "Scalar flop operation completed", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2 ] = { /* 819 */ +- .pme_name = "PM_MRK_DATA_FROM_L2", +- .pme_code = 0x000002C126, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", ++[ POWER9_PME_PM_SHL_CREATED ] = { ++ .pme_name = "PM_SHL_CREATED", ++ .pme_code = 0x000000508C, ++ .pme_short_desc = "Store-Hit-Load Table Entry Created", ++ .pme_long_desc = "Store-Hit-Load Table Entry Created", + }, +-[ POWER9_PME_PM_TM_FAIL_CONF_NON_TM ] = { /* 820 */ +- .pme_name = "PM_TM_FAIL_CONF_NON_TM", +- .pme_code = 0x00000028A8, +- .pme_short_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", +- .pme_long_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", ++[ POWER9_PME_PM_SHL_ST_DEP_CREATED ] = { ++ .pme_name = "PM_SHL_ST_DEP_CREATED", ++ .pme_code = 0x000000588C, ++ .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Enabled", ++ .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Enabled", + }, +-[ POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR ] = { /* 821 */ +- .pme_name = "PM_DPTEG_FROM_RL2L3_SHR", +- .pme_code = 0x000001E04A, +- .pme_short_desc = "A Page Table 
Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_SHL_ST_DISABLE ] = { ++ .pme_name = "PM_SHL_ST_DISABLE", ++ .pme_code = 0x0000005090, ++ .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", ++ .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", + }, +-[ POWER9_PME_PM_DARQ_4_6_ENTRIES ] = { /* 822 */ +- .pme_name = "PM_DARQ_4_6_ENTRIES", +- .pme_code = 0x000003504E, +- .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", +- .pme_long_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", ++[ POWER9_PME_PM_SLB_TABLEWALK_CYC ] = { ++ .pme_name = "PM_SLB_TABLEWALK_CYC", ++ .pme_code = 0x000000F09C, ++ .pme_short_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", ++ .pme_long_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", + }, +-[ POWER9_PME_PM_L2_SYS_PUMP ] = { /* 823 */ +- .pme_name = "PM_L2_SYS_PUMP", +- .pme_code = 0x000004688A, +- .pme_short_desc = "RC requests that were system pump attempts", +- .pme_long_desc = "RC requests that were system pump attempts", ++[ POWER9_PME_PM_SN0_BUSY ] = { ++ .pme_name = "PM_SN0_BUSY", ++ .pme_code = 0x0000016090, ++ .pme_short_desc = "SN mach 0 Busy.", ++ .pme_long_desc = "SN mach 0 Busy. 
Used by PMU to sample ave SN lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_IOPS_CMPL ] = { /* 824 */ +- .pme_name = "PM_IOPS_CMPL", +- .pme_code = 0x0000024050, +- .pme_short_desc = "Internal Operations completed", +- .pme_long_desc = "Internal Operations completed", ++[ POWER9_PME_PM_SN0_BUSY_ALT ] = { ++ .pme_name = "PM_SN0_BUSY", ++ .pme_code = 0x0000026090, ++ .pme_short_desc = "SN mach 0 Busy.", ++ .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave SN lifetime (mach0 used as sample point)", + }, +-[ POWER9_PME_PM_LSU_FLUSH_LHS ] = { /* 825 */ +- .pme_name = "PM_LSU_FLUSH_LHS", +- .pme_code = 0x000000C8B4, +- .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", +- .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", ++[ POWER9_PME_PM_SN_HIT ] = { ++ .pme_name = "PM_SN_HIT", ++ .pme_code = 0x00000460A8, ++ .pme_short_desc = "Any port snooper hit L3.", ++ .pme_long_desc = "Any port snooper hit L3. Up to 4 can happen in a cycle but we only count 1", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_1_SHR ] = { /* 826 */ +- .pme_name = "PM_DATA_FROM_L3_1_SHR", +- .pme_code = 0x000001C046, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", ++[ POWER9_PME_PM_SN_INVL ] = { ++ .pme_name = "PM_SN_INVL", ++ .pme_code = 0x00000368A8, ++ .pme_short_desc = "Any port snooper detects a store to a line in the Sx state and invalidates the line.", ++ .pme_long_desc = "Any port snooper detects a store to a line in the Sx state and invalidates the line. 
Up to 4 can happen in a cycle but we only count 1", + }, +-[ POWER9_PME_PM_NTC_FIN ] = { /* 827 */ +- .pme_name = "PM_NTC_FIN", +- .pme_code = 0x000002405A, +- .pme_short_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes.", +- .pme_long_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes. This event is used to account for cycles in which work is being completed in the CPI stack", ++[ POWER9_PME_PM_SN_MISS ] = { ++ .pme_name = "PM_SN_MISS", ++ .pme_code = 0x00000468A8, ++ .pme_short_desc = "Any port snooper L3 miss or collision.", ++ .pme_long_desc = "Any port snooper L3 miss or collision. Up to 4 can happen in a cycle but we only count 1", + }, +-[ POWER9_PME_PM_LS2_DC_COLLISIONS ] = { /* 828 */ +- .pme_name = "PM_LS2_DC_COLLISIONS", +- .pme_code = 0x000000D094, +- .pme_short_desc = "Read-write data cache collisions", +- .pme_long_desc = "Read-write data cache collisions", ++[ POWER9_PME_PM_SNOOP_TLBIE ] = { ++ .pme_name = "PM_SNOOP_TLBIE", ++ .pme_code = 0x000000F880, ++ .pme_short_desc = "TLBIE snoop", ++ .pme_long_desc = "TLBIE snoop", + }, +-[ POWER9_PME_PM_FMA_CMPL ] = { /* 829 */ +- .pme_name = "PM_FMA_CMPL", +- .pme_code = 0x0000010014, +- .pme_short_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.", +- .pme_long_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.?", ++[ POWER9_PME_PM_SNP_TM_HIT_M ] = { ++ .pme_name = "PM_SNP_TM_HIT_M", ++ .pme_code = 0x00000360A6, ++ .pme_short_desc = "Snp TM st hit M/Mu", ++ .pme_long_desc = "Snp TM st hit M/Mu", + }, +-[ POWER9_PME_PM_IPTEG_FROM_MEMORY ] = { /* 830 */ +- .pme_name = "PM_IPTEG_FROM_MEMORY", +- .pme_code = 0x000002504C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory 
location including L4 from local remote or distant due to a instruction side request", ++[ POWER9_PME_PM_SNP_TM_HIT_T ] = { ++ .pme_name = "PM_SNP_TM_HIT_T", ++ .pme_code = 0x00000368A6, ++ .pme_short_desc = "Snp TM sthit T/Tn/Te", ++ .pme_long_desc = "Snp TM sthit T/Tn/Te", + }, +-[ POWER9_PME_PM_TM_NON_FAV_TBEGIN ] = { /* 831 */ +- .pme_name = "PM_TM_NON_FAV_TBEGIN", +- .pme_code = 0x000000289C, +- .pme_short_desc = "Dispatch time non favored tbegin", +- .pme_long_desc = "Dispatch time non favored tbegin", ++[ POWER9_PME_PM_SN_USAGE ] = { ++ .pme_name = "PM_SN_USAGE", ++ .pme_code = 0x000003688C, ++ .pme_short_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each SN machine busy.", ++ .pme_long_desc = "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each SN machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", + }, +-[ POWER9_PME_PM_PMC1_REWIND ] = { /* 832 */ +- .pme_name = "PM_PMC1_REWIND", +- .pme_code = 0x000004D02C, +- .pme_short_desc = "", +- .pme_long_desc = "", ++[ POWER9_PME_PM_SP_FLOP_CMPL ] = { ++ .pme_name = "PM_SP_FLOP_CMPL", ++ .pme_code = 0x000004505A, ++ .pme_short_desc = "SP instruction completed", ++ .pme_long_desc = "SP instruction completed", + }, +-[ POWER9_PME_PM_ISU2_ISS_HOLD_ALL ] = { /* 833 */ +- .pme_name = "PM_ISU2_ISS_HOLD_ALL", +- .pme_code = 0x0000003880, +- .pme_short_desc = "All ISU rejects", +- .pme_long_desc = "All ISU rejects", ++[ POWER9_PME_PM_SRQ_EMPTY_CYC ] = { ++ .pme_name = "PM_SRQ_EMPTY_CYC", ++ .pme_code = 0x0000040008, ++ .pme_short_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", ++ .pme_long_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { /* 834 */ +- .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD_CYC", +- .pme_code = 0x000004D12E, +- .pme_short_desc = "Duration in cycles to reload with Modified (M) data from 
another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load",
+- .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load",
++[ POWER9_PME_PM_SRQ_SYNC_CYC ] = {
++ .pme_name = "PM_SRQ_SYNC_CYC",
++ .pme_code = 0x000000D0AC,
++ .pme_short_desc = "A sync is in the S2Q (edge detect to count)",
++ .pme_long_desc = "A sync is in the S2Q (edge detect to count)",
+ },
+-[ POWER9_PME_PM_PTESYNC ] = { /* 835 */
+- .pme_name = "PM_PTESYNC",
+- .pme_code = 0x000000589C,
+- .pme_short_desc = "ptesync instruction counted when the instructio is decoded and transmitted",
+- .pme_long_desc = "ptesync instruction counted when the instructio is decoded and transmitted",
++[ POWER9_PME_PM_STALL_END_ICT_EMPTY ] = {
++ .pme_name = "PM_STALL_END_ICT_EMPTY",
++ .pme_code = 0x0000010028,
++ .pme_short_desc = "The number of times the core transitioned from a stall to ICT-empty for this thread",
++ .pme_long_desc = "The number of times the core transitioned from a stall to ICT-empty for this thread",
+ },
+-[ POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER ] = { /* 836 */
+- .pme_name = "PM_ISIDE_DISP_FAIL_OTHER",
+- .pme_code = 0x000002688A,
+- .pme_short_desc = "All i-side dispatch attempts that failed due to a reason other than addrs collision",
+- .pme_long_desc = "All i-side dispatch attempts that failed due to a reason other than addrs collision",
++[ POWER9_PME_PM_ST_CAUSED_FAIL ] = {
++ .pme_name = "PM_ST_CAUSED_FAIL",
++ .pme_code = 0x000001608E,
++ .pme_short_desc = "Non-TM Store caused any thread to fail",
++ .pme_long_desc = "Non-TM Store caused any thread to fail",
+ },
+-[ POWER9_PME_PM_L2_IC_INV ] = { /* 837 */
+- .pme_name = "PM_L2_IC_INV",
+- .pme_code = 0x0000026082,
+- .pme_short_desc = "Icache Invalidates from L2",
+- .pme_long_desc = "Icache Invalidates from L2",
++[ POWER9_PME_PM_ST_CMPL ] = {
++ .pme_name = "PM_ST_CMPL",
++ 
.pme_code = 0x00000200F0, ++ .pme_short_desc = "Stores completed from S2Q (2nd-level store queue).", ++ .pme_long_desc = "Stores completed from S2Q (2nd-level store queue).", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L3 ] = { /* 838 */ +- .pme_name = "PM_DPTEG_FROM_L3", +- .pme_code = 0x000004E042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_STCX_FAIL ] = { ++ .pme_name = "PM_STCX_FAIL", ++ .pme_code = 0x000001E058, ++ .pme_short_desc = "stcx failed", ++ .pme_long_desc = "stcx failed", + }, +-[ POWER9_PME_PM_RADIX_PWC_L2_HIT ] = { /* 839 */ +- .pme_name = "PM_RADIX_PWC_L2_HIT", +- .pme_code = 0x000002D024, +- .pme_short_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", +- .pme_long_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", ++[ POWER9_PME_PM_STCX_FIN ] = { ++ .pme_name = "PM_STCX_FIN", ++ .pme_code = 0x000002E014, ++ .pme_short_desc = "Number of stcx instructions finished.", ++ .pme_long_desc = "Number of stcx instructions finished. 
This includes instructions in the speculative path of a branch that may be flushed", + }, +-[ POWER9_PME_PM_DC_PREF_HW_ALLOC ] = { /* 840 */ +- .pme_name = "PM_DC_PREF_HW_ALLOC", +- .pme_code = 0x000000F0A4, +- .pme_short_desc = "Prefetch stream allocated by the hardware prefetch mechanism", +- .pme_long_desc = "Prefetch stream allocated by the hardware prefetch mechanism", ++[ POWER9_PME_PM_STCX_SUCCESS_CMPL ] = { ++ .pme_name = "PM_STCX_SUCCESS_CMPL", ++ .pme_code = 0x000000C8BC, ++ .pme_short_desc = "Number of stcx instructions that completed successfully", ++ .pme_long_desc = "Number of stcx instructions that completed successfully", + }, +-[ POWER9_PME_PM_LSU0_VECTOR_LD_FIN ] = { /* 841 */ +- .pme_name = "PM_LSU0_VECTOR_LD_FIN", +- .pme_code = 0x000000C080, +- .pme_short_desc = "A vector load instruction finished.", +- .pme_long_desc = "A vector load instruction finished. The ops considered in this category are lxv*, lvx*, lve*, lxsi*zx, lxvwsx, lxsd, lxssp, lxvl, lxvll, lxvb16x, lxvh8x, lxv, lxvx", ++[ POWER9_PME_PM_ST_FIN ] = { ++ .pme_name = "PM_ST_FIN", ++ .pme_code = 0x0000020016, ++ .pme_short_desc = "Store finish count.", ++ .pme_long_desc = "Store finish count. 
Includes speculative activity", + }, +-[ POWER9_PME_PM_1PLUS_PPC_DISP ] = { /* 842 */ +- .pme_name = "PM_1PLUS_PPC_DISP", +- .pme_code = 0x00000400F2, +- .pme_short_desc = "Cycles at least one Instr Dispatched", +- .pme_long_desc = "Cycles at least one Instr Dispatched", ++[ POWER9_PME_PM_ST_FWD ] = { ++ .pme_name = "PM_ST_FWD", ++ .pme_code = 0x0000020018, ++ .pme_short_desc = "Store forwards that finished", ++ .pme_long_desc = "Store forwards that finished", + }, +-[ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 ] = { /* 843 */ +- .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L2", +- .pme_code = 0x000002D02E, +- .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache.", +- .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache. This implies that a level 4 PWC access was not necessary for this translation", ++[ POWER9_PME_PM_ST_MISS_L1 ] = { ++ .pme_name = "PM_ST_MISS_L1", ++ .pme_code = 0x00000300F0, ++ .pme_short_desc = "Store Missed L1", ++ .pme_long_desc = "Store Missed L1", + }, +-[ POWER9_PME_PM_DATA_FROM_L2MISS ] = { /* 844 */ +- .pme_name = "PM_DATA_FROM_L2MISS", +- .pme_code = 0x00000200FE, +- .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", +- .pme_long_desc = "Demand LD - L2 Miss (not L2 hit)", ++[ POWER9_PME_PM_STOP_FETCH_PENDING_CYC ] = { ++ .pme_name = "PM_STOP_FETCH_PENDING_CYC", ++ .pme_code = 0x00000048A4, ++ .pme_short_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", ++ .pme_long_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV ] = { /* 845 */ +- .pme_name = "PM_MRK_FAB_RSP_RD_T_INTV", +- .pme_code = 0x000001015E, +- .pme_short_desc = "Sampled Read got a T intervention", +- .pme_long_desc = "Sampled Read got a T intervention", ++[ POWER9_PME_PM_SUSPENDED ] = { ++ .pme_name = "PM_SUSPENDED", ++ .pme_code = 0x0000010000, ++ 
.pme_short_desc = "Counter OFF", ++ .pme_long_desc = "Counter OFF", + }, +-[ POWER9_PME_PM_NTC_ISSUE_HELD_ARB ] = { /* 846 */ +- .pme_name = "PM_NTC_ISSUE_HELD_ARB", +- .pme_code = 0x000002E016, +- .pme_short_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", +- .pme_long_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", ++[ POWER9_PME_PM_SYNC_MRK_BR_LINK ] = { ++ .pme_name = "PM_SYNC_MRK_BR_LINK", ++ .pme_code = 0x0000015152, ++ .pme_short_desc = "Marked Branch and link branch that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked Branch and link branch that can cause a synchronous interrupt", + }, +-[ POWER9_PME_PM_LSU2_L1_CAM_CANCEL ] = { /* 847 */ +- .pme_name = "PM_LSU2_L1_CAM_CANCEL", +- .pme_code = 0x000000F094, +- .pme_short_desc = "ls2 l1 tm cam cancel", +- .pme_long_desc = "ls2 l1 tm cam cancel", ++[ POWER9_PME_PM_SYNC_MRK_BR_MPRED ] = { ++ .pme_name = "PM_SYNC_MRK_BR_MPRED", ++ .pme_code = 0x000001515C, ++ .pme_short_desc = "Marked Branch mispredict that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked Branch mispredict that can cause a synchronous interrupt", ++}, ++[ POWER9_PME_PM_SYNC_MRK_FX_DIVIDE ] = { ++ .pme_name = "PM_SYNC_MRK_FX_DIVIDE", ++ .pme_code = 0x0000015156, ++ .pme_short_desc = "Marked fixed point divide that can cause a synchronous interrupt", ++ .pme_long_desc = "Marked fixed point divide that can cause a synchronous interrupt", + }, +-[ POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH ] = { /* 848 */ +- .pme_name = "PM_L3_GRP_GUESS_WRONG_HIGH", +- .pme_code = 0x00000368B2, +- .pme_short_desc = "Initial scope=group but data from local node.", +- .pme_long_desc = "Initial scope=group but data from local node. 
Predition too high", ++[ POWER9_PME_PM_SYNC_MRK_L2HIT ] = { ++ .pme_name = "PM_SYNC_MRK_L2HIT", ++ .pme_code = 0x0000015158, ++ .pme_short_desc = "Marked L2 Hits that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L2 Hits that can throw a synchronous interrupt", + }, +-[ POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT ] = { /* 849 */ +- .pme_name = "PM_DATA_FROM_L3_NO_CONFLICT", +- .pme_code = 0x000001C044, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", ++[ POWER9_PME_PM_SYNC_MRK_L2MISS ] = { ++ .pme_name = "PM_SYNC_MRK_L2MISS", ++ .pme_code = 0x000001515A, ++ .pme_short_desc = "Marked L2 Miss that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L2 Miss that can throw a synchronous interrupt", + }, +-[ POWER9_PME_PM_SUSPENDED ] = { /* 850 */ +- .pme_name = "PM_SUSPENDED", +- .pme_code = 0x0000010000, +- .pme_short_desc = "Counter OFF", +- .pme_long_desc = "Counter OFF", ++[ POWER9_PME_PM_SYNC_MRK_L3MISS ] = { ++ .pme_name = "PM_SYNC_MRK_L3MISS", ++ .pme_code = 0x0000015154, ++ .pme_short_desc = "Marked L3 misses that can throw a synchronous interrupt", ++ .pme_long_desc = "Marked L3 misses that can throw a synchronous interrupt", + }, +-[ POWER9_PME_PM_L3_SYS_GUESS_WRONG ] = { /* 851 */ +- .pme_name = "PM_L3_SYS_GUESS_WRONG", +- .pme_code = 0x00000460B2, +- .pme_short_desc = "Initial scope=system but data from local or near.", +- .pme_long_desc = "Initial scope=system but data from local or near. 
Predction too high", ++[ POWER9_PME_PM_SYNC_MRK_PROBE_NOP ] = { ++ .pme_name = "PM_SYNC_MRK_PROBE_NOP", ++ .pme_code = 0x0000015150, ++ .pme_short_desc = "Marked probeNops which can cause synchronous interrupts", ++ .pme_long_desc = "Marked probeNops which can cause synchronous interrupts", + }, +-[ POWER9_PME_PM_L3_L2_CO_HIT ] = { /* 852 */ +- .pme_name = "PM_L3_L2_CO_HIT", +- .pme_code = 0x00000360A2, +- .pme_short_desc = "L2 castout hits", +- .pme_long_desc = "L2 castout hits", ++[ POWER9_PME_PM_SYS_PUMP_CPRED ] = { ++ .pme_name = "PM_SYS_PUMP_CPRED", ++ .pme_code = 0x0000030050, ++ .pme_short_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_LSU0_TM_L1_HIT ] = { /* 853 */ +- .pme_name = "PM_LSU0_TM_L1_HIT", +- .pme_code = 0x000000E094, +- .pme_short_desc = "Load tm hit in L1", +- .pme_long_desc = "Load tm hit in L1", ++[ POWER9_PME_PM_SYS_PUMP_MPRED_RTY ] = { ++ .pme_name = "PM_SYS_PUMP_MPRED_RTY", ++ .pme_code = 0x0000040050, ++ .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", ++ .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_BR_MPRED_PCACHE ] = { /* 854 */ +- .pme_name = "PM_BR_MPRED_PCACHE", +- .pme_code = 0x00000048B0, +- .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", +- .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", ++[ POWER9_PME_PM_SYS_PUMP_MPRED ] = { ++ .pme_name = 
"PM_SYS_PUMP_MPRED", ++ .pme_code = 0x0000030052, ++ .pme_short_desc = "Final Pump Scope (system) mispredicted.", ++ .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", + }, +-[ POWER9_PME_PM_STCX_FAIL ] = { /* 855 */ +- .pme_name = "PM_STCX_FAIL", +- .pme_code = 0x000001E058, +- .pme_short_desc = "stcx failed", +- .pme_long_desc = "stcx failed", ++[ POWER9_PME_PM_TABLEWALK_CYC_PREF ] = { ++ .pme_name = "PM_TABLEWALK_CYC_PREF", ++ .pme_code = 0x000000F884, ++ .pme_short_desc = "tablewalk qualified for pte prefetches", ++ .pme_long_desc = "tablewalk qualified for pte prefetches", + }, +-[ POWER9_PME_PM_LSU_FLUSH_NEXT ] = { /* 856 */ +- .pme_name = "PM_LSU_FLUSH_NEXT", +- .pme_code = 0x00000020B0, +- .pme_short_desc = "LSU flush next reported at flush time.", +- .pme_long_desc = "LSU flush next reported at flush time. Sometimes these also come with an exception", ++[ POWER9_PME_PM_TABLEWALK_CYC ] = { ++ .pme_name = "PM_TABLEWALK_CYC", ++ .pme_code = 0x0000010026, ++ .pme_short_desc = "Cycles when an instruction tablewalk is active", ++ .pme_long_desc = "Cycles when an instruction tablewalk is active", + }, +-[ POWER9_PME_PM_DSIDE_MRU_TOUCH ] = { /* 857 */ +- .pme_name = "PM_DSIDE_MRU_TOUCH", +- .pme_code = 0x0000026884, +- .pme_short_desc = "dside L2 MRU touch", +- .pme_long_desc = "dside L2 MRU touch", ++[ POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL ] = { ++ .pme_name = "PM_TAGE_CORRECT_TAKEN_CMPL", ++ .pme_code = 0x00000050B4, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. 
Counted at completion for taken branches only", + }, +-[ POWER9_PME_PM_SN_MISS ] = { /* 858 */ +- .pme_name = "PM_SN_MISS", +- .pme_code = 0x00000468A8, +- .pme_short_desc = "Any port snooper miss.", +- .pme_long_desc = "Any port snooper miss. Up to 4 can happen in a cycle but we only count 1", ++[ POWER9_PME_PM_TAGE_CORRECT ] = { ++ .pme_name = "PM_TAGE_CORRECT", ++ .pme_code = 0x00000058B4, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", + }, +-[ POWER9_PME_PM_BR_PRED_TAKEN_CMPL ] = { /* 859 */ +- .pme_name = "PM_BR_PRED_TAKEN_CMPL", +- .pme_code = 0x000000489C, +- .pme_short_desc = "Conditional Branch Completed in which the HW predicted the Direction or Target and the branch was resolved taken.", +- .pme_long_desc = "Conditional Branch Completed in which the HW predicted the Direction or Target and the branch was resolved taken. Counted at completion time", ++[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC ] = { ++ .pme_name = "PM_TAGE_OVERRIDE_WRONG_SPEC", ++ .pme_code = 0x00000058B8, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", + }, +-[ POWER9_PME_PM_L3_P0_SYS_PUMP ] = { /* 860 */ +- .pme_name = "PM_L3_P0_SYS_PUMP", +- .pme_code = 0x00000360B0, +- .pme_short_desc = "L3 pf sent with sys scope port 0", +- .pme_long_desc = "L3 pf sent with sys scope port 0", ++[ POWER9_PME_PM_TAGE_OVERRIDE_WRONG ] = { ++ .pme_name = "PM_TAGE_OVERRIDE_WRONG", ++ .pme_code = 0x00000050B8, ++ .pme_short_desc = "The TAGE overrode BHT direction prediction but it was incorrect.", ++ .pme_long_desc = "The TAGE overrode BHT direction prediction but it was incorrect. 
Counted at completion for taken branches only", + }, +-[ POWER9_PME_PM_L3_HIT ] = { /* 861 */ +- .pme_name = "PM_L3_HIT", +- .pme_code = 0x00000160A4, +- .pme_short_desc = "L3 Hits", +- .pme_long_desc = "L3 Hits", ++[ POWER9_PME_PM_TAKEN_BR_MPRED_CMPL ] = { ++ .pme_name = "PM_TAKEN_BR_MPRED_CMPL", ++ .pme_code = 0x0000020056, ++ .pme_short_desc = "Total number of taken branches that were incorrectly predicted as not-taken.", ++ .pme_long_desc = "Total number of taken branches that were incorrectly predicted as not-taken. This event counts branches completed and does not include speculative instructions", + }, +-[ POWER9_PME_PM_MRK_DFU_FIN ] = { /* 862 */ +- .pme_name = "PM_MRK_DFU_FIN", +- .pme_code = 0x0000020132, +- .pme_short_desc = "Decimal Unit marked Instruction Finish", +- .pme_long_desc = "Decimal Unit marked Instruction Finish", ++[ POWER9_PME_PM_TB_BIT_TRANS ] = { ++ .pme_name = "PM_TB_BIT_TRANS", ++ .pme_code = 0x00000300F8, ++ .pme_short_desc = "timebase event", ++ .pme_long_desc = "timebase event", + }, +-[ POWER9_PME_PM_CMPLU_STALL_NESTED_TEND ] = { /* 863 */ +- .pme_name = "PM_CMPLU_STALL_NESTED_TEND", +- .pme_code = 0x000003003C, +- .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level.", +- .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level. This is a short delay", ++[ POWER9_PME_PM_TEND_PEND_CYC ] = { ++ .pme_name = "PM_TEND_PEND_CYC", ++ .pme_code = 0x000000E8B0, ++ .pme_short_desc = "TEND latency per thread", ++ .pme_long_desc = "TEND latency per thread", + }, +-[ POWER9_PME_PM_INST_FROM_L1 ] = { /* 864 */ +- .pme_name = "PM_INST_FROM_L1", +- .pme_code = 0x0000004080, +- .pme_short_desc = "Instruction fetches from L1.", +- .pme_long_desc = "Instruction fetches from L1. 
L1 instruction hit", ++[ POWER9_PME_PM_THRD_ALL_RUN_CYC ] = { ++ .pme_name = "PM_THRD_ALL_RUN_CYC", ++ .pme_code = 0x000002000C, ++ .pme_short_desc = "Cycles in which all the threads have the run latch set", ++ .pme_long_desc = "Cycles in which all the threads have the run latch set", + }, +-[ POWER9_PME_PM_IC_DEMAND_REQ ] = { /* 865 */ +- .pme_name = "PM_IC_DEMAND_REQ", +- .pme_code = 0x0000004088, +- .pme_short_desc = "Demand Instruction fetch request", +- .pme_long_desc = "Demand Instruction fetch request", ++[ POWER9_PME_PM_THRD_CONC_RUN_INST ] = { ++ .pme_name = "PM_THRD_CONC_RUN_INST", ++ .pme_code = 0x00000300F4, ++ .pme_short_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", ++ .pme_long_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", + }, +-[ POWER9_PME_PM_BRU_FIN ] = { /* 866 */ +- .pme_name = "PM_BRU_FIN", +- .pme_code = 0x0000010068, +- .pme_short_desc = "Branch Instruction Finished", +- .pme_long_desc = "Branch Instruction Finished", ++[ POWER9_PME_PM_THRD_PRIO_0_1_CYC ] = { ++ .pme_name = "PM_THRD_PRIO_0_1_CYC", ++ .pme_code = 0x00000040BC, ++ .pme_short_desc = "Cycles thread running at priority level 0 or 1", ++ .pme_long_desc = "Cycles thread running at priority level 0 or 1", + }, +-[ POWER9_PME_PM_L1_ICACHE_RELOADED_ALL ] = { /* 867 */ +- .pme_name = "PM_L1_ICACHE_RELOADED_ALL", +- .pme_code = 0x0000040012, +- .pme_short_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", +- .pme_long_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", ++[ POWER9_PME_PM_THRD_PRIO_2_3_CYC ] = { ++ .pme_name = "PM_THRD_PRIO_2_3_CYC", ++ .pme_code = 0x00000048BC, ++ .pme_short_desc = "Cycles thread running at priority level 2 or 3", ++ .pme_long_desc = "Cycles thread running at priority level 2 or 3", + }, +-[ 
POWER9_PME_PM_IERAT_RELOAD_16M ] = { /* 868 */ +- .pme_name = "PM_IERAT_RELOAD_16M", +- .pme_code = 0x000004006A, +- .pme_short_desc = "IERAT Reloaded (Miss) for a 16M page", +- .pme_long_desc = "IERAT Reloaded (Miss) for a 16M page", ++[ POWER9_PME_PM_THRD_PRIO_4_5_CYC ] = { ++ .pme_name = "PM_THRD_PRIO_4_5_CYC", ++ .pme_code = 0x0000005080, ++ .pme_short_desc = "Cycles thread running at priority level 4 or 5", ++ .pme_long_desc = "Cycles thread running at priority level 4 or 5", + }, +-[ POWER9_PME_PM_DATA_FROM_L2MISS_MOD ] = { /* 869 */ +- .pme_name = "PM_DATA_FROM_L2MISS_MOD", +- .pme_code = 0x000001C04E, +- .pme_short_desc = "The processor's data cache was reloaded from a localtion other than the local core's L2 due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from a localtion other than the local core's L2 due to a demand load", ++[ POWER9_PME_PM_THRD_PRIO_6_7_CYC ] = { ++ .pme_name = "PM_THRD_PRIO_6_7_CYC", ++ .pme_code = 0x0000005880, ++ .pme_short_desc = "Cycles thread running at priority level 6 or 7", ++ .pme_long_desc = "Cycles thread running at priority level 6 or 7", + }, +-[ POWER9_PME_PM_LSU0_ERAT_HIT ] = { /* 870 */ +- .pme_name = "PM_LSU0_ERAT_HIT", +- .pme_code = 0x000000E08C, +- .pme_short_desc = "Primary ERAT hit.", +- .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", ++[ POWER9_PME_PM_THRESH_ACC ] = { ++ .pme_name = "PM_THRESH_ACC", ++ .pme_code = 0x0000024154, ++ .pme_short_desc = "This event increments every time the threshold event counter ticks.", ++ .pme_long_desc = "This event increments every time the threshold event counter ticks. Thresholding must be enabled (via MMCRA) and the thresholding start event must occur for this counter to increment. 
It will stop incrementing when the thresholding stop event occurs or when thresholding is disabled, until the next time a configured thresholding start event occurs.", + }, +-[ POWER9_PME_PM_L3_PF0_BUSY ] = { /* 871 */ +- .pme_name = "PM_L3_PF0_BUSY", +- .pme_code = 0x00000460B4, +- .pme_short_desc = "lifetime, sample of PF machine 0 valid", +- .pme_long_desc = "lifetime, sample of PF machine 0 valid", ++[ POWER9_PME_PM_THRESH_EXC_1024 ] = { ++ .pme_name = "PM_THRESH_EXC_1024", ++ .pme_code = 0x00000301EA, ++ .pme_short_desc = "Threshold counter exceeded a value of 1024", ++ .pme_long_desc = "Threshold counter exceeded a value of 1024", + }, +-[ POWER9_PME_PM_MRK_DPTEG_FROM_LL4 ] = { /* 872 */ +- .pme_name = "PM_MRK_DPTEG_FROM_LL4", +- .pme_code = 0x000001F14C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", ++[ POWER9_PME_PM_THRESH_EXC_128 ] = { ++ .pme_name = "PM_THRESH_EXC_128", ++ .pme_code = 0x00000401EA, ++ .pme_short_desc = "Threshold counter exceeded a value of 128", ++ .pme_long_desc = "Threshold counter exceeded a value of 128", + }, +-[ POWER9_PME_PM_LSU3_SET_MPRED ] = { /* 873 */ +- .pme_name = "PM_LSU3_SET_MPRED", +- .pme_code = 0x000000D884, +- .pme_short_desc = "Set prediction(set-p) miss.", +- .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", ++[ POWER9_PME_PM_THRESH_EXC_2048 ] = { ++ .pme_name = "PM_THRESH_EXC_2048", ++ .pme_code = 0x00000401EC, ++ .pme_short_desc = "Threshold counter exceeded a value of 2048", ++ .pme_long_desc = "Threshold counter exceeded a value of 2048", + }, +-[ POWER9_PME_PM_TM_CAM_OVERFLOW ] = { /* 874 */ +- .pme_name = "PM_TM_CAM_OVERFLOW", +- .pme_code = 0x00000168A6, +- .pme_short_desc = "l3 tm cam overflow during L2 co of SC", +- .pme_long_desc = "l3 tm cam overflow during L2 co of SC", ++[ POWER9_PME_PM_THRESH_EXC_256 ] = { ++ .pme_name = "PM_THRESH_EXC_256", ++ .pme_code = 0x00000101E8, ++ .pme_short_desc = "Threshold counter exceed a count of 256", ++ .pme_long_desc = "Threshold counter exceed a count of 256", + }, +-[ POWER9_PME_PM_SYNC_MRK_FX_DIVIDE ] = { /* 875 */ +- .pme_name = "PM_SYNC_MRK_FX_DIVIDE", +- .pme_code = 0x0000015156, +- .pme_short_desc = "Marked fixed point divide that can cause a synchronous interrupt", +- .pme_long_desc = "Marked fixed point divide that can cause a synchronous interrupt", ++[ POWER9_PME_PM_THRESH_EXC_32 ] = { ++ .pme_name = "PM_THRESH_EXC_32", ++ .pme_code = 0x00000201E6, ++ .pme_short_desc = "Threshold counter exceeded a value of 32", ++ .pme_long_desc = "Threshold counter exceeded a value of 32", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L2_1_SHR ] = { /* 876 */ +- .pme_name = "PM_IPTEG_FROM_L2_1_SHR", +- .pme_code = 0x0000035046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", ++[ POWER9_PME_PM_THRESH_EXC_4096 ] = { ++ .pme_name = "PM_THRESH_EXC_4096", ++ .pme_code = 0x00000101E6, ++ .pme_short_desc = "Threshold counter exceed a count of 4096", ++ .pme_long_desc = "Threshold counter exceed a count of 4096", + }, +-[ 
POWER9_PME_PM_MRK_LD_MISS_L1 ] = { /* 877 */ +- .pme_name = "PM_MRK_LD_MISS_L1", +- .pme_code = 0x00000201E2, +- .pme_short_desc = "Marked DL1 Demand Miss counted at exec time.", +- .pme_long_desc = "Marked DL1 Demand Miss counted at exec time. Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", ++[ POWER9_PME_PM_THRESH_EXC_512 ] = { ++ .pme_name = "PM_THRESH_EXC_512", ++ .pme_code = 0x00000201E8, ++ .pme_short_desc = "Threshold counter exceeded a value of 512", ++ .pme_long_desc = "Threshold counter exceeded a value of 512", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM ] = { /* 878 */ +- .pme_name = "PM_MRK_FAB_RSP_DCLAIM", +- .pme_code = 0x0000030154, +- .pme_short_desc = "Marked store had to do a dclaim", +- .pme_long_desc = "Marked store had to do a dclaim", ++[ POWER9_PME_PM_THRESH_EXC_64 ] = { ++ .pme_name = "PM_THRESH_EXC_64", ++ .pme_code = 0x00000301E8, ++ .pme_short_desc = "Threshold counter exceeded a value of 64", ++ .pme_long_desc = "Threshold counter exceeded a value of 64", + }, +-[ POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT ] = { /* 879 */ +- .pme_name = "PM_IPTEG_FROM_L3_DISP_CONFLICT", +- .pme_code = 0x0000035042, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request", ++[ POWER9_PME_PM_THRESH_MET ] = { ++ .pme_name = "PM_THRESH_MET", ++ .pme_code = 0x00000101EC, ++ .pme_short_desc = "threshold exceeded", ++ .pme_long_desc = "threshold exceeded", + }, +-[ POWER9_PME_PM_NON_FMA_FLOP_CMPL ] = { /* 880 */ +- .pme_name = "PM_NON_FMA_FLOP_CMPL", +- .pme_code = 0x000004D056, +- .pme_short_desc = "Non fma flop instruction completed", +- .pme_long_desc = "Non fma flop instruction completed", ++[ POWER9_PME_PM_THRESH_NOT_MET ] = { ++ .pme_name = 
"PM_THRESH_NOT_MET", ++ .pme_code = 0x000004016E, ++ .pme_short_desc = "Threshold counter did not meet threshold", ++ .pme_long_desc = "Threshold counter did not meet threshold", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2MISS ] = { /* 881 */ +- .pme_name = "PM_MRK_DATA_FROM_L2MISS", +- .pme_code = 0x00000401E8, +- .pme_short_desc = "The processor's data cache was reloaded from a localtion other than the local core's L2 due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded from a localtion other than the local core's L2 due to a marked load", ++[ POWER9_PME_PM_TLB_HIT ] = { ++ .pme_name = "PM_TLB_HIT", ++ .pme_code = 0x000001F054, ++ .pme_short_desc = "Number of times the TLB had the data required by the instruction.", ++ .pme_long_desc = "Number of times the TLB had the data required by the instruction. Applies to both HPT and RPT", + }, +-[ POWER9_PME_PM_L2_SYS_GUESS_WRONG ] = { /* 882 */ +- .pme_name = "PM_L2_SYS_GUESS_WRONG", +- .pme_code = 0x0000036888, +- .pme_short_desc = "L2 guess sys and guess was not correct (ie data ^beyond-6chip)", +- .pme_long_desc = "L2 guess sys and guess was not correct (ie data ^beyond-6chip)", ++[ POWER9_PME_PM_TLBIE_FIN ] = { ++ .pme_name = "PM_TLBIE_FIN", ++ .pme_code = 0x0000030058, ++ .pme_short_desc = "tlbie finished", ++ .pme_long_desc = "tlbie finished", + }, +-[ POWER9_PME_PM_THRESH_EXC_2048 ] = { /* 883 */ +- .pme_name = "PM_THRESH_EXC_2048", +- .pme_code = 0x00000401EC, +- .pme_short_desc = "Threshold counter exceeded a value of 2048", +- .pme_long_desc = "Threshold counter exceeded a value of 2048", ++[ POWER9_PME_PM_TLB_MISS ] = { ++ .pme_name = "PM_TLB_MISS", ++ .pme_code = 0x0000020066, ++ .pme_short_desc = "TLB Miss (I + D)", ++ .pme_long_desc = "TLB Miss (I + D)", + }, +-[ POWER9_PME_PM_INST_FROM_LL4 ] = { /* 884 */ +- .pme_name = "PM_INST_FROM_LL4", +- .pme_code = 0x000001404C, +- .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an 
instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_TM_ABORTS ] = { ++ .pme_name = "PM_TM_ABORTS", ++ .pme_code = 0x0000030056, ++ .pme_short_desc = "Number of TM transactions aborted", ++ .pme_long_desc = "Number of TM transactions aborted", + }, +-[ POWER9_PME_PM_DATA_FROM_RL2L3_SHR ] = { /* 885 */ +- .pme_name = "PM_DATA_FROM_RL2L3_SHR", +- .pme_code = 0x000001C04A, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", ++[ POWER9_PME_PM_TMA_REQ_L2 ] = { ++ .pme_name = "PM_TMA_REQ_L2", ++ .pme_code = 0x000000E0A4, ++ .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", ++ .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", + }, +-[ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { /* 886 */ +- .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST", +- .pme_code = 0x000003C040, +- .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", +- .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", ++[ POWER9_PME_PM_TM_CAM_OVERFLOW ] = { ++ .pme_name = "PM_TM_CAM_OVERFLOW", ++ .pme_code = 0x00000168A6, ++ .pme_short_desc = "L3 TM cam overflow during L2 co of SC", ++ .pme_long_desc = "L3 TM cam overflow during L2 co of SC", + }, +-[ POWER9_PME_PM_LSU_FLUSH_WRK_ARND ] = { /* 887 */ +- .pme_name = "PM_LSU_FLUSH_WRK_ARND", +- .pme_code = 0x000000C0B4, +- 
.pme_short_desc = "LSU workaround flush.", +- .pme_long_desc = "LSU workaround flush. These flushes are setup with programmable scan only latches to perform various actions when the flsh macro receives a trigger from the dbg macros. These actions include things like flushing the next op encountered for a particular thread or flushing the next op that is NTC op that is encountered on a particular slice. The kind of flush that the workaround is setup to perform is highly variable.", ++[ POWER9_PME_PM_TM_CAP_OVERFLOW ] = { ++ .pme_name = "PM_TM_CAP_OVERFLOW", ++ .pme_code = 0x000004608E, ++ .pme_short_desc = "TM Footprint Capacity Overflow", ++ .pme_long_desc = "TM Footprint Capacity Overflow", + }, +-[ POWER9_PME_PM_L3_PF_HIT_L3 ] = { /* 888 */ +- .pme_name = "PM_L3_PF_HIT_L3", +- .pme_code = 0x00000260A8, +- .pme_short_desc = "l3 pf hit in l3", +- .pme_long_desc = "l3 pf hit in l3", ++[ POWER9_PME_PM_TM_FAIL_CONF_NON_TM ] = { ++ .pme_name = "PM_TM_FAIL_CONF_NON_TM", ++ .pme_code = 0x00000028A8, ++ .pme_short_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", ++ .pme_long_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", + }, +-[ POWER9_PME_PM_RD_FORMING_SC ] = { /* 889 */ +- .pme_name = "PM_RD_FORMING_SC", +- .pme_code = 0x00000460A6, +- .pme_short_desc = "rd forming sc", +- .pme_long_desc = "rd forming sc", ++[ POWER9_PME_PM_TM_FAIL_CONF_TM ] = { ++ .pme_name = "PM_TM_FAIL_CONF_TM", ++ .pme_code = 0x00000020AC, ++ .pme_short_desc = "TM aborted because a conflict occurred with another transaction.", ++ .pme_long_desc = "TM aborted because a conflict occurred with another transaction.", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_L2_1_MOD_CYC ] = { /* 890 */ +- .pme_name = "PM_MRK_DATA_FROM_L2_1_MOD_CYC", +- .pme_code = 0x000003D148, +- .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", +- 
.pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", ++[ POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW ] = { ++ .pme_name = "PM_TM_FAIL_FOOTPRINT_OVERFLOW", ++ .pme_code = 0x00000020A8, ++ .pme_short_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.", ++ .pme_long_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.. Asynchronous", + }, +-[ POWER9_PME_PM_IPTEG_FROM_DL4 ] = { /* 891 */ +- .pme_name = "PM_IPTEG_FROM_DL4", +- .pme_code = 0x000003504C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", ++[ POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT ] = { ++ .pme_name = "PM_TM_FAIL_NON_TX_CONFLICT", ++ .pme_code = 0x000000E0B0, ++ .pme_short_desc = "Non transactional conflict from LSU, gets reported to TEXASR", ++ .pme_long_desc = "Non transactional conflict from LSU, gets reported to TEXASR", + }, +-[ POWER9_PME_PM_CMPLU_STALL_STORE_FINISH ] = { /* 892 */ +- .pme_name = "PM_CMPLU_STALL_STORE_FINISH", +- .pme_code = 0x000002C014, +- .pme_short_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", +- .pme_long_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", ++[ POWER9_PME_PM_TM_FAIL_SELF ] = { ++ .pme_name = "PM_TM_FAIL_SELF", ++ .pme_code = 0x00000028AC, ++ .pme_short_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally; a dcbf, dcbi, 
or icbi specify- ing a block that was previously accessed transactionally; a dcbst specifying a block that was previously written transactionally; or a tlbie that specifies a translation that was pre- viously used transactionally", ++ .pme_long_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally; a dcbf, dcbi, or icbi specify- ing a block that was previously accessed transactionally; a dcbst specifying a block that was previously written transactionally; or a tlbie that specifies a translation that was pre- viously used transactionally", + }, +-[ POWER9_PME_PM_IPTEG_FROM_LL4 ] = { /* 893 */ +- .pme_name = "PM_IPTEG_FROM_LL4", +- .pme_code = 0x000001504C, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a instruction side request", ++[ POWER9_PME_PM_TM_FAIL_TLBIE ] = { ++ .pme_name = "PM_TM_FAIL_TLBIE", ++ .pme_code = 0x000000E0AC, ++ .pme_short_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", ++ .pme_long_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", + }, +-[ POWER9_PME_PM_1FLOP_CMPL ] = { /* 894 */ +- .pme_name = "PM_1FLOP_CMPL", +- .pme_code = 0x000001000C, +- .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", +- .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", ++[ POWER9_PME_PM_TM_FAIL_TX_CONFLICT ] = { ++ .pme_name = "PM_TM_FAIL_TX_CONFLICT", ++ .pme_code = 0x000000E8AC, ++ .pme_short_desc = "Transactional conflict from LSU, gets reported to TEXASR", ++ .pme_long_desc = "Transactional conflict from LSU, gets reported to TEXASR", + }, +-[ 
POWER9_PME_PM_L2_GRP_GUESS_WRONG ] = { /* 895 */ +- .pme_name = "PM_L2_GRP_GUESS_WRONG", +- .pme_code = 0x0000026888, +- .pme_short_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", +- .pme_long_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", ++[ POWER9_PME_PM_TM_FAV_CAUSED_FAIL ] = { ++ .pme_name = "PM_TM_FAV_CAUSED_FAIL", ++ .pme_code = 0x000002688E, ++ .pme_short_desc = "TM Load (fav) caused another thread to fail", ++ .pme_long_desc = "TM Load (fav) caused another thread to fail", + }, +-[ POWER9_PME_PM_TM_FAV_TBEGIN ] = { /* 896 */ ++[ POWER9_PME_PM_TM_FAV_TBEGIN ] = { + .pme_name = "PM_TM_FAV_TBEGIN", + .pme_code = 0x000000209C, + .pme_short_desc = "Dispatch time Favored tbegin", + .pme_long_desc = "Dispatch time Favored tbegin", + }, +-[ POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT ] = { /* 897 */ +- .pme_name = "PM_INST_FROM_L2_NO_CONFLICT", +- .pme_code = 0x0000014040, +- .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_TM_LD_CAUSED_FAIL ] = { ++ .pme_name = "PM_TM_LD_CAUSED_FAIL", ++ .pme_code = 0x000001688E, ++ .pme_short_desc = "Non-TM Load caused any thread to fail", ++ .pme_long_desc = "Non-TM Load caused any thread to fail", + }, +-[ POWER9_PME_PM_2FLOP_CMPL ] = { /* 898 */ +- .pme_name = "PM_2FLOP_CMPL", +- .pme_code = 0x000004D052, +- .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg?", +- .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg?", ++[ POWER9_PME_PM_TM_LD_CONF ] = { ++ .pme_name = "PM_TM_LD_CONF", ++ .pme_code = 0x000002608E, ++ .pme_short_desc = "TM Load (fav or non-fav) ran into conflict (failed)", ++ .pme_long_desc = "TM 
Load (fav or non-fav) ran into conflict (failed)", + }, +-[ POWER9_PME_PM_LS2_TM_DISALLOW ] = { /* 899 */ +- .pme_name = "PM_LS2_TM_DISALLOW", +- .pme_code = 0x000000E0B8, +- .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", +- .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", ++[ POWER9_PME_PM_TM_NESTED_TBEGIN ] = { ++ .pme_name = "PM_TM_NESTED_TBEGIN", ++ .pme_code = 0x00000020A0, ++ .pme_short_desc = "Completion Tm nested tbegin", ++ .pme_long_desc = "Completion Tm nested tbegin", + }, +-[ POWER9_PME_PM_L2_LD_DISP ] = { /* 900 */ +- .pme_name = "PM_L2_LD_DISP", +- .pme_code = 0x000001609E, +- .pme_short_desc = "All successful load dispatches", +- .pme_long_desc = "All successful load dispatches", ++[ POWER9_PME_PM_TM_NESTED_TEND ] = { ++ .pme_name = "PM_TM_NESTED_TEND", ++ .pme_code = 0x0000002098, ++ .pme_short_desc = "Completion time nested tend", ++ .pme_long_desc = "Completion time nested tend", + }, +-[ POWER9_PME_PM_CMPLU_STALL_LHS ] = { /* 901 */ +- .pme_name = "PM_CMPLU_STALL_LHS", +- .pme_code = 0x000002C01A, +- .pme_short_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", +- .pme_long_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", ++[ POWER9_PME_PM_TM_NON_FAV_TBEGIN ] = { ++ .pme_name = "PM_TM_NON_FAV_TBEGIN", ++ .pme_code = 0x000000289C, ++ .pme_short_desc = "Dispatch time non favored tbegin", ++ .pme_long_desc = "Dispatch time non favored tbegin", + }, +-[ POWER9_PME_PM_TLB_HIT ] = { /* 902 */ +- .pme_name = "PM_TLB_HIT", +- .pme_code = 0x000001F054, +- .pme_short_desc = "Number of times the TLB had the data required by the instruction.", +- .pme_long_desc = "Number of times the TLB had the data required by the instruction. 
Applies to both HPT and RPT", ++[ POWER9_PME_PM_TM_OUTER_TBEGIN_DISP ] = { ++ .pme_name = "PM_TM_OUTER_TBEGIN_DISP", ++ .pme_code = 0x000004E05E, ++ .pme_short_desc = "Number of outer tbegin instructions dispatched.", ++ .pme_long_desc = "Number of outer tbegin instructions dispatched. The dispatch unit determines whether the tbegin instruction is outer or nested. This is a speculative count, which includes flushed instructions", + }, +-[ POWER9_PME_PM_HV_CYC ] = { /* 903 */ +- .pme_name = "PM_HV_CYC", +- .pme_code = 0x0000020006, +- .pme_short_desc = "Cycles in which msr_hv is high.", +- .pme_long_desc = "Cycles in which msr_hv is high. Note that this event does not take msr_pr into consideration", ++[ POWER9_PME_PM_TM_OUTER_TBEGIN ] = { ++ .pme_name = "PM_TM_OUTER_TBEGIN", ++ .pme_code = 0x0000002094, ++ .pme_short_desc = "Completion time outer tbegin", ++ .pme_long_desc = "Completion time outer tbegin", + }, +-[ POWER9_PME_PM_L2_RTY_LD ] = { /* 904 */ +- .pme_name = "PM_L2_RTY_LD", +- .pme_code = 0x000003689E, +- .pme_short_desc = "RC retries on PB for any load from core", +- .pme_long_desc = "RC retries on PB for any load from core", ++[ POWER9_PME_PM_TM_OUTER_TEND ] = { ++ .pme_name = "PM_TM_OUTER_TEND", ++ .pme_code = 0x0000002894, ++ .pme_short_desc = "Completion time outer tend", ++ .pme_long_desc = "Completion time outer tend", + }, +-[ POWER9_PME_PM_STCX_SUCCESS_CMPL ] = { /* 905 */ +- .pme_name = "PM_STCX_SUCCESS_CMPL", +- .pme_code = 0x000000C8BC, +- .pme_short_desc = "Number of stcx instructions that completed successfully", +- .pme_long_desc = "Number of stcx instructions that completed successfully", ++[ POWER9_PME_PM_TM_PASSED ] = { ++ .pme_name = "PM_TM_PASSED", ++ .pme_code = 0x000002E052, ++ .pme_short_desc = "Number of TM transactions that passed", ++ .pme_long_desc = "Number of TM transactions that passed", + }, +-[ POWER9_PME_PM_INST_PUMP_MPRED ] = { /* 906 */ +- .pme_name = "PM_INST_PUMP_MPRED", +- .pme_code = 0x0000044052, +- .pme_short_desc 
= "Pump misprediction.", +- .pme_long_desc = "Pump misprediction. Counts across all types of pumps for an instruction fetch", ++[ POWER9_PME_PM_TM_RST_SC ] = { ++ .pme_name = "PM_TM_RST_SC", ++ .pme_code = 0x00000268A6, ++ .pme_short_desc = "TM-snp rst RM SC", ++ .pme_long_desc = "TM-snp rst RM SC", + }, +-[ POWER9_PME_PM_LSU2_ERAT_HIT ] = { /* 907 */ +- .pme_name = "PM_LSU2_ERAT_HIT", +- .pme_code = 0x000000E090, +- .pme_short_desc = "Primary ERAT hit.", +- .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", ++[ POWER9_PME_PM_TM_SC_CO ] = { ++ .pme_name = "PM_TM_SC_CO", ++ .pme_code = 0x00000160A6, ++ .pme_short_desc = "L3 castout TM SC line", ++ .pme_long_desc = "L3 castout TM SC line", + }, +-[ POWER9_PME_PM_INST_FROM_RL4 ] = { /* 908 */ +- .pme_name = "PM_INST_FROM_RL4", +- .pme_code = 0x000002404A, +- .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", +- .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group ( Remote) due to an instruction fetch (not prefetch)", ++[ POWER9_PME_PM_TM_ST_CAUSED_FAIL ] = { ++ .pme_name = "PM_TM_ST_CAUSED_FAIL", ++ .pme_code = 0x000003688E, ++ .pme_short_desc = "TM Store (fav or non-fav) caused another thread to fail", ++ .pme_long_desc = "TM Store (fav or non-fav) caused another thread to fail", + }, +-[ POWER9_PME_PM_LD_L3MISS_PEND_CYC ] = { /* 909 */ +- .pme_name = "PM_LD_L3MISS_PEND_CYC", +- .pme_code = 0x0000010062, +- .pme_short_desc = "Cycles L3 miss was pending for this thread", +- .pme_long_desc = "Cycles L3 miss was pending for this thread", ++[ POWER9_PME_PM_TM_ST_CONF ] = { ++ .pme_name = "PM_TM_ST_CONF", ++ .pme_code = 0x000003608E, ++ .pme_short_desc = "TM Store (fav or non-fav) ran into conflict (failed)", ++ .pme_long_desc = "TM Store (fav or non-fav) ran into conflict (failed)", + }, +-[ POWER9_PME_PM_L3_LAT_CI_MISS ] = { /* 
910 */ +- .pme_name = "PM_L3_LAT_CI_MISS", +- .pme_code = 0x00000468A2, +- .pme_short_desc = "L3 Lateral Castins Miss", +- .pme_long_desc = "L3 Lateral Castins Miss", ++[ POWER9_PME_PM_TM_TABORT_TRECLAIM ] = { ++ .pme_name = "PM_TM_TABORT_TRECLAIM", ++ .pme_code = 0x0000002898, ++ .pme_short_desc = "Completion time tabortnoncd, tabortcd, treclaim", ++ .pme_long_desc = "Completion time tabortnoncd, tabortcd, treclaim", + }, +-[ POWER9_PME_PM_MRK_FAB_RSP_RD_RTY ] = { /* 911 */ +- .pme_name = "PM_MRK_FAB_RSP_RD_RTY", +- .pme_code = 0x000004015E, +- .pme_short_desc = "Sampled L2 reads retry count", +- .pme_long_desc = "Sampled L2 reads retry count", ++[ POWER9_PME_PM_TM_TRANS_RUN_CYC ] = { ++ .pme_name = "PM_TM_TRANS_RUN_CYC", ++ .pme_code = 0x0000010060, ++ .pme_short_desc = "run cycles in transactional state", ++ .pme_long_desc = "run cycles in transactional state", + }, +-[ POWER9_PME_PM_DTLB_MISS_16M ] = { /* 912 */ +- .pme_name = "PM_DTLB_MISS_16M", +- .pme_code = 0x000004C056, +- .pme_short_desc = "Data TLB Miss page size 16M", +- .pme_long_desc = "Data TLB Miss page size 16M", ++[ POWER9_PME_PM_TM_TRANS_RUN_INST ] = { ++ .pme_name = "PM_TM_TRANS_RUN_INST", ++ .pme_code = 0x0000030060, ++ .pme_short_desc = "Run instructions completed in transactional state (gated by the run latch)", ++ .pme_long_desc = "Run instructions completed in transactional state (gated by the run latch)", + }, +-[ POWER9_PME_PM_DPTEG_FROM_L2_1_MOD ] = { /* 913 */ +- .pme_name = "PM_DPTEG_FROM_L2_1_MOD", +- .pme_code = 0x000004E046, +- .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.", +- .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", ++[ POWER9_PME_PM_TM_TRESUME ] = { ++ .pme_name = "PM_TM_TRESUME", ++ .pme_code = 0x00000020A4, ++ .pme_short_desc = "TM resume instruction completed", ++ .pme_long_desc = "TM resume instruction completed", + }, +-[ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { /* 914 */ +- .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", +- .pme_code = 0x0000035150, +- .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", +- .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", ++[ POWER9_PME_PM_TM_TSUSPEND ] = { ++ .pme_name = "PM_TM_TSUSPEND", ++ .pme_code = 0x00000028A0, ++ .pme_short_desc = "TM suspend instruction completed", ++ .pme_long_desc = "TM suspend instruction completed", + }, +-[ POWER9_PME_PM_MRK_LSU_FIN ] = { /* 915 */ +- .pme_name = "PM_MRK_LSU_FIN", +- .pme_code = 0x0000040132, +- .pme_short_desc = "lsu marked instr PPC finish", +- .pme_long_desc = "lsu marked instr PPC finish", ++[ POWER9_PME_PM_TM_TX_PASS_RUN_CYC ] = { ++ .pme_name = "PM_TM_TX_PASS_RUN_CYC", ++ .pme_code = 0x000002E012, ++ .pme_short_desc = "cycles spent in successful transactions", ++ .pme_long_desc = "cycles spent in successful transactions", + }, +-[ POWER9_PME_PM_LSU0_STORE_REJECT ] = { /* 916 */ +- .pme_name = "PM_LSU0_STORE_REJECT", +- .pme_code = 0x000000F08C, +- .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", +- .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", ++[ POWER9_PME_PM_TM_TX_PASS_RUN_INST ] = { ++ .pme_name = "PM_TM_TX_PASS_RUN_INST", ++ .pme_code = 0x000004E014, ++ 
.pme_short_desc = "Run instructions spent in successful transactions", ++ .pme_long_desc = "Run instructions spent in successful transactions", + }, +-[ POWER9_PME_PM_CLB_HELD ] = { /* 917 */ +- .pme_name = "PM_CLB_HELD", +- .pme_code = 0x000000208C, +- .pme_short_desc = "CLB Hold: Any Reason", +- .pme_long_desc = "CLB Hold: Any Reason", ++[ POWER9_PME_PM_VECTOR_FLOP_CMPL ] = { ++ .pme_name = "PM_VECTOR_FLOP_CMPL", ++ .pme_code = 0x000004D058, ++ .pme_short_desc = "Vector FP instruction completed", ++ .pme_long_desc = "Vector FP instruction completed", ++}, ++[ POWER9_PME_PM_VECTOR_LD_CMPL ] = { ++ .pme_name = "PM_VECTOR_LD_CMPL", ++ .pme_code = 0x0000044054, ++ .pme_short_desc = "Number of vector load instructions completed", ++ .pme_long_desc = "Number of vector load instructions completed", ++}, ++[ POWER9_PME_PM_VECTOR_ST_CMPL ] = { ++ .pme_name = "PM_VECTOR_ST_CMPL", ++ .pme_code = 0x0000044056, ++ .pme_short_desc = "Number of vector store instructions completed", ++ .pme_long_desc = "Number of vector store instructions completed", ++}, ++[ POWER9_PME_PM_VSU_DP_FSQRT_FDIV ] = { ++ .pme_name = "PM_VSU_DP_FSQRT_FDIV", ++ .pme_code = 0x000003D058, ++ .pme_short_desc = "vector versions of fdiv,fsqrt", ++ .pme_long_desc = "vector versions of fdiv,fsqrt", + }, +-[ POWER9_PME_PM_LS2_ERAT_MISS_PREF ] = { /* 918 */ +- .pme_name = "PM_LS2_ERAT_MISS_PREF", +- .pme_code = 0x000000E088, +- .pme_short_desc = "LS0 Erat miss due to prefetch", +- .pme_long_desc = "LS0 Erat miss due to prefetch", ++[ POWER9_PME_PM_VSU_FIN ] = { ++ .pme_name = "PM_VSU_FIN", ++ .pme_code = 0x000002505C, ++ .pme_short_desc = "VSU instruction finished.", ++ .pme_long_desc = "VSU instruction finished. 
Up to 4 per cycle", ++}, ++[ POWER9_PME_PM_VSU_FSQRT_FDIV ] = { ++ .pme_name = "PM_VSU_FSQRT_FDIV", ++ .pme_code = 0x000004D04E, ++ .pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", ++ .pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", ++}, ++[ POWER9_PME_PM_VSU_NON_FLOP_CMPL ] = { ++ .pme_name = "PM_VSU_NON_FLOP_CMPL", ++ .pme_code = 0x000004D050, ++ .pme_short_desc = "Non FLOP operation completed", ++ .pme_long_desc = "Non FLOP operation completed", ++}, ++[ POWER9_PME_PM_XLATE_HPT_MODE ] = { ++ .pme_name = "PM_XLATE_HPT_MODE", ++ .pme_code = 0x000000F098, ++ .pme_short_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", ++ .pme_long_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", ++}, ++[ POWER9_PME_PM_XLATE_MISS ] = { ++ .pme_name = "PM_XLATE_MISS", ++ .pme_code = 0x000000F89C, ++ .pme_short_desc = "The LSU requested a line from L2 for translation.", ++ .pme_long_desc = "The LSU requested a line from L2 for translation. It may be satisfied from any source beyond L2. Includes speculative instructions", ++}, ++[ POWER9_PME_PM_XLATE_RADIX_MODE ] = { ++ .pme_name = "PM_XLATE_RADIX_MODE", ++ .pme_code = 0x000000F898, ++ .pme_short_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", ++ .pme_long_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", + }, ++/* total 945 */ + }; + #endif +-- +2.9.4 + +From ce5b320031f75f9a9881333c13902d5541f91cc8 Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Tue, 6 Jun 2017 11:09:17 -0500 +Subject: [PATCH 6/6] add power9 entries to validate_power.c + +Hi, + +Update the validate_power test to include power9 entries. 
+ +sniff-test run output: +$ ./validate +Libpfm structure tests: + libpfm ABI version : 0 + pfm_pmu_info_t : Passed + pfm_event_info_t : Passed + pfm_event_attr_info_t : Passed + pfm_pmu_encode_arg_t : Passed + pfm_perf_encode_arg_t : Passed +Libpfm internal table tests: + + checking power9 (946 events): Passed +Architecture specific tests: + 20 PowerPC events: 0 errors +All tests passed + +Signed-off-by: Will Schmidt +--- + tests/validate_power.c | 14 ++++++++++++++ + 1 file changed, 14 insertions(+) + +diff --git a/tests/validate_power.c b/tests/validate_power.c +index 74ab30c..617efca 100644 +--- a/tests/validate_power.c ++++ b/tests/validate_power.c +@@ -157,6 +157,20 @@ static const test_event_t ppc_test_events[]={ + .codes[0] = 0xde200201e6ull, + .fstr = "power8::PM_RC_LIFETIME_EXC_32", + }, ++ { SRC_LINE, ++ .name = "power9::PM_CYC", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1001e, ++ .fstr = "power9::PM_CYC", ++ }, ++ { SRC_LINE, ++ .name = "power9::PM_INST_DISP", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x200f2, ++ .fstr = "power9::PM_INST_DISP", ++ }, + { SRC_LINE, + .name = "powerpc_nest_mcs_read::MCS_00", + .ret = PFM_SUCCESS, +-- +2.9.4 + diff --git a/SOURCES/libpfm-rhbz1440249.patch b/SOURCES/libpfm-rhbz1440249.patch new file mode 100644 index 0000000..9326b02 --- /dev/null +++ b/SOURCES/libpfm-rhbz1440249.patch @@ -0,0 +1,1421 @@ +From b9709a7866498a84dc4ab60fb006631569bedbf0 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Mon, 3 Apr 2017 22:48:31 -0700 +Subject: [PATCH 1/7] Revert "fix struct validation for pfm_event_attr_info_t" + +This reverts commit 06b296c72838be44d8950dc03227fe0dc8ca1fb1. + +Break ABI compatibility from 4.7 to 4.8. 
+ +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 5 ++--- + tests/validate.c | 3 +-- + 2 files changed, 3 insertions(+), 5 deletions(-) + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index 0e370ba50318..d9be4453accf 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -490,7 +490,6 @@ typedef struct { + size_t size; /* struct sizeof */ + uint64_t code; /* attribute code */ + pfm_attr_t type; /* attribute type */ +- int pad; /* padding */ + uint64_t idx; /* attribute opaque index */ + pfm_attr_ctrl_t ctrl; /* what is providing attr */ + struct { +@@ -520,13 +519,13 @@ typedef struct { + #if __WORDSIZE == 64 + #define PFM_PMU_INFO_ABI0 56 + #define PFM_EVENT_INFO_ABI0 64 +-#define PFM_ATTR_INFO_ABI0 72 ++#define PFM_ATTR_INFO_ABI0 64 + + #define PFM_RAW_ENCODE_ABI0 32 + #else + #define PFM_PMU_INFO_ABI0 44 + #define PFM_EVENT_INFO_ABI0 48 +-#define PFM_ATTR_INFO_ABI0 56 ++#define PFM_ATTR_INFO_ABI0 48 + + #define PFM_RAW_ENCODE_ABI0 20 + #endif +diff --git a/tests/validate.c b/tests/validate.c +index 0da0adc4995a..522a6ab7140d 100644 +--- a/tests/validate.c ++++ b/tests/validate.c +@@ -201,7 +201,6 @@ static const struct_desc_t pfmlib_structs[]={ + FIELD(code, pfm_event_attr_info_t), + FIELD(type, pfm_event_attr_info_t), + FIELD(idx, pfm_event_attr_info_t), +- FIELD(pad, pfm_event_attr_info_t), /* padding */ + FIELD(ctrl, pfm_event_attr_info_t), + LAST_FIELD + }, +@@ -271,7 +270,7 @@ validate_structs(void) + } + + if (sz != d->sz) { +- printf("Failed (invisible padding of %zu bytes, total struct size %zu bytes)\n", d->sz - sz, d->sz); ++ printf("Failed (invisible padding of %zu bytes)\n", d->sz - sz); + errors++; + continue; + } +-- +2.7.4 + +From 01c24ef2c781c614544eeb5ce3922313118e3053 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Mon, 3 Apr 2017 22:49:18 -0700 +Subject: [PATCH 2/7] Revert "Fix pfmlib_parse_event_attr() parsing of raw + umask for 32-bit" + +This reverts commit 
bfb9baf1c8a9533fde271d0436ecd465934dfa17. + +support for 32-bit umask as implemented breaks ABI between 4.7 and 4.8. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_common.c | 8 ++++---- + 1 file changed, 4 insertions(+), 4 deletions(-) + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index cff4d2ecbd2c..c88e2aaae274 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -1011,10 +1011,10 @@ pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) + ainfo->name = "RAW_UMASK"; + ainfo->type = PFM_ATTR_RAW_UMASK; + ainfo->ctrl = PFM_ATTR_CTRL_PMU; +- ainfo->idx = strtoull(s, &endptr, 0); ++ ainfo->idx = strtoul(s, &endptr, 0); + ainfo->equiv= NULL; + if (*endptr) { +- DPRINT("raw umask (%s) is not a number\n", s); ++ DPRINT("raw umask (%s) is not a number\n"); + return PFM_ERR_ATTR; + } + +@@ -1368,9 +1368,9 @@ pfmlib_parse_event(const char *event, pfmlib_event_desc_t *d) + for (i = 0; i < d->nattrs; i++) { + pfm_event_attr_info_t *a = attr(d, i); + if (a->type != PFM_ATTR_RAW_UMASK) +- DPRINT("%d %d %"PRIu64" %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); ++ DPRINT("%d %d %d %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); + else +- DPRINT("%d %d RAW_UMASK (0x%"PRIx64")\n", d->event, i, a->idx); ++ DPRINT("%d %d RAW_UMASK (0x%x)\n", d->event, i, a->idx); + } + error: + free(str); +-- +2.7.4 + +From e206315c36e39409b7fc1e4cdd72caa5040b45c4 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Mon, 3 Apr 2017 22:52:22 -0700 +Subject: [PATCH 3/7] Revert "Allow raw umask for OFFCORE_RESPONSE on Intel + core PMUs" + +This reverts commit 4dc4c6ada254f30eee8cd2ae27bb0869a111b613. + +32-bit raw umask support break ABI between 4.7 and 4.8, so remove +for now. 
+ +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 4 +- + lib/pfmlib_intel_x86.c | 16 ++-- + tests/validate_x86.c | 232 ----------------------------------------------- + 3 files changed, 9 insertions(+), 243 deletions(-) + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index d9be4453accf..6904c1c79b68 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -490,8 +490,8 @@ typedef struct { + size_t size; /* struct sizeof */ + uint64_t code; /* attribute code */ + pfm_attr_t type; /* attribute type */ +- uint64_t idx; /* attribute opaque index */ +- pfm_attr_ctrl_t ctrl; /* what is providing attr */ ++ int idx; /* attribute opaque index */ ++ pfm_attr_ctrl_t ctrl; /* what is providing attr */ + struct { + unsigned int is_dfl:1; /* is default umask */ + unsigned int is_precise:1; /* Intel X86: supports PEBS */ +diff --git a/lib/pfmlib_intel_x86.c b/lib/pfmlib_intel_x86.c +index b698144f1da4..497cf1b9246a 100644 +--- a/lib/pfmlib_intel_x86.c ++++ b/lib/pfmlib_intel_x86.c +@@ -481,18 +481,16 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + reg.sel_event_select = last_ucode; + } + } else if (a->type == PFM_ATTR_RAW_UMASK) { +- uint64_t rmask; ++ + /* there can only be one RAW_UMASK per event */ +- if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE)) { +- rmask = (1ULL << 38) - 1; +- } else { +- rmask = 0xff; +- } +- if (a->idx & ~rmask) { +- DPRINT("raw umask is too wide\n"); ++ ++ /* sanity check */ ++ if (a->idx & ~0xff) { ++ DPRINT("raw umask is 8-bit wide\n"); + return PFM_ERR_ATTR; + } +- umask2 = a->idx & rmask; ++ /* override umask */ ++ umask2 = a->idx & 0xff; + ugrpmsk = grpmsk; + } else { + uint64_t ival = e->attrs[k].ival; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 790ba585d8e7..906afba636e1 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4057,238 +4057,6 @@ static const test_event_t x86_test_events[]={ + .fstr = 
"hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +- .name = "wsm::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "wsm::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "wsm::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "wsm::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "wsm::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "snb::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "snb::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "hsw::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "hsw::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = 
"hsw::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "hsw::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "hsw::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "bdw_ep::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "bdw_ep::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "bdw_ep::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "skl::offcore_response_0:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xf, +- .fstr = "skl::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "skl::offcore_response_0:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffffull, +- .fstr = "skl::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "skl::offcore_response_0:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "wsm::offcore_response_1:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffffull, +- .fstr = "wsm::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "wsm::offcore_response_1:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_1:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xf, +- .fstr = 
"snb::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_1:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffffull, +- .fstr = "snb::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "snb::offcore_response_1:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_1:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xf, +- .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_1:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffffull, +- .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", +- }, +- { SRC_LINE, +- .name = "ivb_ep::offcore_response_1:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "hsw::offcore_response_1:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xf, +- .fstr = "hsw::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "hsw::offcore_response_1:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffffull, +- .fstr = "hsw::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "hsw::offcore_response_1:0x7fffffffff", +- .ret = PFM_ERR_ATTR, +- }, +- { SRC_LINE, +- .name = "bdw_ep::offcore_response_1:0xf", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xf, +- .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", +- }, +- { SRC_LINE, +- .name = "bdw_ep::offcore_response_1:0xfffffffff", +- .ret = PFM_SUCCESS, +- .count = 2, +- .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffffull, +- .fstr = 
"bdw_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0",
+- },
+- { SRC_LINE,
+- .name = "bdw_ep::offcore_response_1:0x7fffffffff",
+- .ret = PFM_ERR_ATTR,
+- },
+- { SRC_LINE,
+- .name = "skl::offcore_response_1:0xf",
+- .ret = PFM_SUCCESS,
+- .count = 2,
+- .codes[0] = 0x5301bb,
+- .codes[1] = 0xf,
+- .fstr = "skl::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0",
+- },
+- { SRC_LINE,
+- .name = "skl::offcore_response_1:0xfffffffff",
+- .ret = PFM_SUCCESS,
+- .count = 2,
+- .codes[0] = 0x5301bb,
+- .codes[1] = 0xfffffffffull,
+- .fstr = "skl::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0",
+- },
+- { SRC_LINE,
+- .name = "skl::offcore_response_1:0x7fffffffff",
+- .ret = PFM_ERR_ATTR,
+- },
+- { SRC_LINE,
+ .name = "glm::offcore_response_1:any_request",
+ .ret = PFM_SUCCESS,
+ .count = 2,
+--
+2.7.4
+
+From 1e01aa2112461ecb67ddc58750316cadd19a8612 Mon Sep 17 00:00:00 2001
+From: Stephane Eranian
+Date: Mon, 3 Apr 2017 22:55:16 -0700
+Subject: [PATCH 4/7] improve error message in validate.c
+
+Add more detailed info in case of size mismatch.
+
+Signed-off-by: Stephane Eranian
+---
+ tests/validate.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/tests/validate.c b/tests/validate.c
+index 522a6ab7140d..e4a8025f3f14 100644
+--- a/tests/validate.c
++++ b/tests/validate.c
+@@ -270,7 +270,7 @@ validate_structs(void)
+ }
+
+ if (sz != d->sz) {
+- printf("Failed (invisible padding of %zu bytes)\n", d->sz - sz);
++ printf("Failed (invisible padding of %zu bytes, total struct size %zu bytes)\n", d->sz - sz, d->sz);
+ errors++;
+ continue;
+ }
+--
+2.7.4
+
+From 321133e1486084ea2b1494bc67b38ee085b31f71 Mon Sep 17 00:00:00 2001
+From: Stephane Eranian
+Date: Mon, 3 Apr 2017 23:32:50 -0700
+Subject: [PATCH 5/7] create internal type for perf_event_attr_info_t
+
+This patch creates an internal version of the ABI
+pfm_event_attr_info structure called pfmlib_event_attr_info_t.
+The advantage is that we can change the internal version without +ABI changes. The new struct is just a clone of the external version. +But it can be customized for internal needs. + +The pfm_get_event_attr_info() converts the internal version into +the external version. + +This patch changes internal interface to use pfmlib_event_attr_info_t +for all architectures. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_amd64.c | 4 ++-- + lib/pfmlib_amd64_priv.h | 2 +- + lib/pfmlib_arm.c | 4 ++-- + lib/pfmlib_arm_priv.h | 2 +- + lib/pfmlib_common.c | 32 ++++++++++++++++++++------------ + lib/pfmlib_intel_netburst.c | 4 ++-- + lib/pfmlib_intel_nhm_unc.c | 2 +- + lib/pfmlib_intel_snbep_unc.c | 4 ++-- + lib/pfmlib_intel_snbep_unc_priv.h | 2 +- + lib/pfmlib_intel_x86.c | 10 +++++----- + lib/pfmlib_intel_x86_perf_event.c | 6 +++--- + lib/pfmlib_intel_x86_priv.h | 2 +- + lib/pfmlib_mips.c | 4 ++-- + lib/pfmlib_mips_priv.h | 2 +- + lib/pfmlib_perf_event.c | 4 ++-- + lib/pfmlib_perf_event_pmu.c | 6 +++--- + lib/pfmlib_perf_event_raw.c | 2 +- + lib/pfmlib_power_priv.h | 2 +- + lib/pfmlib_powerpc.c | 2 +- + lib/pfmlib_priv.h | 26 ++++++++++++++++++++++++-- + lib/pfmlib_sparc.c | 4 ++-- + lib/pfmlib_sparc_priv.h | 2 +- + lib/pfmlib_torrent.c | 2 +- + 23 files changed, 80 insertions(+), 50 deletions(-) + +diff --git a/lib/pfmlib_amd64.c b/lib/pfmlib_amd64.c +index 13838040b55a..be2a4ef86faf 100644 +--- a/lib/pfmlib_amd64.c ++++ b/lib/pfmlib_amd64.c +@@ -426,7 +426,7 @@ pfm_amd64_get_encoding(void *this, pfmlib_event_desc_t *e) + { + const amd64_entry_t *pe = this_pe(this); + pfm_amd64_reg_t reg; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + uint64_t umask = 0; + unsigned int plmmsk = 0; + int k, ret, grpid; +@@ -661,7 +661,7 @@ pfm_amd64_event_is_valid(void *this, int pidx) + } + + int +-pfm_amd64_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_amd64_get_event_attr_info(void *this, int pidx, int attr_idx, 
pfmlib_event_attr_info_t *info) + { + const amd64_entry_t *pe = this_pe(this); + int numasks, idx; +diff --git a/lib/pfmlib_amd64_priv.h b/lib/pfmlib_amd64_priv.h +index 66ca49ef1b1d..c3caae514f52 100644 +--- a/lib/pfmlib_amd64_priv.h ++++ b/lib/pfmlib_amd64_priv.h +@@ -202,7 +202,7 @@ extern int pfm_amd64_get_encoding(void *this, pfmlib_event_desc_t *e); + extern int pfm_amd64_get_event_first(void *this); + extern int pfm_amd64_get_event_next(void *this, int idx); + extern int pfm_amd64_event_is_valid(void *this, int idx); +-extern int pfm_amd64_get_event_attr_info(void *this, int idx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_amd64_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info); + extern int pfm_amd64_get_event_info(void *this, int idx, pfm_event_info_t *info); + extern int pfm_amd64_validate_table(void *this, FILE *fp); + extern int pfm_amd64_detect(void *this); +diff --git a/lib/pfmlib_arm.c b/lib/pfmlib_arm.c +index a49ca4504644..91c35c670ebe 100644 +--- a/lib/pfmlib_arm.c ++++ b/lib/pfmlib_arm.c +@@ -180,7 +180,7 @@ pfm_arm_get_encoding(void *this, pfmlib_event_desc_t *e) + { + + const arm_entry_t *pe = this_pe(this); +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + pfm_arm_reg_t reg; + unsigned int plm = 0; + int i, idx, has_plm = 0; +@@ -305,7 +305,7 @@ pfm_arm_validate_table(void *this, FILE *fp) + } + + int +-pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + int idx; + +diff --git a/lib/pfmlib_arm_priv.h b/lib/pfmlib_arm_priv.h +index 81a9df9afdc7..4fc2e74955e4 100644 +--- a/lib/pfmlib_arm_priv.h ++++ b/lib/pfmlib_arm_priv.h +@@ -66,7 +66,7 @@ extern int pfm_arm_get_event_first(void *this); + extern int pfm_arm_get_event_next(void *this, int idx); + extern int pfm_arm_event_is_valid(void *this, int pidx); + extern int 
pfm_arm_validate_table(void *this, FILE *fp); +-extern int pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); + extern int pfm_arm_get_event_info(void *this, int idx, pfm_event_info_t *info); + extern unsigned int pfm_arm_get_event_nattrs(void *this, int pidx); + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index c88e2aaae274..f3c6dfa23e55 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -504,7 +504,7 @@ pfmlib_compact_attrs(pfmlib_event_desc_t *e, int i) + static inline int + pfmlib_same_attr(pfmlib_event_desc_t *d, int i, int j) + { +- pfm_event_attr_info_t *a1, *a2; ++ pfmlib_event_attr_info_t *a1, *a2; + pfmlib_attr_t *b1, *b2; + + a1 = attr(d, i); +@@ -967,7 +967,7 @@ pfmlib_sanitize_event(pfmlib_event_desc_t *d) + static int + pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) + { +- pfm_event_attr_info_t *ainfo; ++ pfmlib_event_attr_info_t *ainfo; + char *s, *p, *q, *endptr; + char yes[2] = "y"; + pfm_attr_t type; +@@ -1366,7 +1366,7 @@ pfmlib_parse_event(const char *event, pfmlib_event_desc_t *d) + ret = pfmlib_sanitize_event(d); + + for (i = 0; i < d->nattrs; i++) { +- pfm_event_attr_info_t *a = attr(d, i); ++ pfmlib_event_attr_info_t *a = attr(d, i); + if (a->type != PFM_ATTR_RAW_UMASK) + DPRINT("%d %d %d %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); + else +@@ -1549,7 +1549,7 @@ static int + pfmlib_pmu_validate_encoding(pfmlib_pmu_t *pmu, FILE *fp) + { + pfm_event_info_t einfo; +- pfm_event_attr_info_t ainfo; ++ pfmlib_event_attr_info_t ainfo; + char *buf; + size_t maxlen = 0, len; + int i, u, n = 0, um; +@@ -1811,7 +1811,7 @@ pfm_get_event_info(int idx, pfm_os_t os, pfm_event_info_t *uinfo) + int + pfm_get_event_attr_info(int idx, int attr_idx, pfm_os_t os, pfm_event_attr_info_t *uinfo) + { +- pfm_event_attr_info_t info; ++ pfmlib_event_attr_info_t 
info; + pfmlib_event_desc_t e; + pfmlib_pmu_t *pmu; + size_t sz = sizeof(info); +@@ -1857,17 +1857,25 @@ pfm_get_event_attr_info(int idx, int attr_idx, pfm_os_t os, pfm_event_attr_info_ + info = e.pattrs[attr_idx]; + + /* +- * rewrite size to reflect what we are returning +- */ +- info.size = sz; +- /* + * info.idx = private, namespace specific index, + * should not be visible externally, so override + * with public index ++ * ++ * cannot memcpy() info into uinfo as they do not ++ * have the same size, cf. idx field (uint64 vs, uint32) + */ +- info.idx = attr_idx; +- +- memcpy(uinfo, &info, sz); ++ uinfo->name = info.name; ++ uinfo->desc = info.desc; ++ uinfo->equiv = info.equiv; ++ uinfo->size = sz; ++ uinfo->code = info.code; ++ uinfo->type = info.type; ++ uinfo->idx = attr_idx; ++ uinfo->ctrl = info.ctrl; ++ uinfo->is_dfl= info.is_dfl; ++ uinfo->is_precise = info.is_precise; ++ uinfo->reserved_bits = 0; ++ uinfo->dfl_val64 = info.dfl_val64; + + ret = PFM_SUCCESS; + error: +diff --git a/lib/pfmlib_intel_netburst.c b/lib/pfmlib_intel_netburst.c +index 9d8f22b7705d..9b4960583523 100644 +--- a/lib/pfmlib_intel_netburst.c ++++ b/lib/pfmlib_intel_netburst.c +@@ -110,7 +110,7 @@ netburst_add_defaults(pfmlib_event_desc_t *e, unsigned int *evmask) + int + pfm_netburst_get_encoding(void *this, pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + netburst_escr_value_t escr; + netburst_cccr_value_t cccr; + unsigned int evmask = 0; +@@ -322,7 +322,7 @@ pfm_netburst_event_is_valid(void *this, int pidx) + } + + static int +-pfm_netburst_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_netburst_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + const netburst_entry_t *pe = this_pe(this); + int numasks, idx; +diff --git a/lib/pfmlib_intel_nhm_unc.c b/lib/pfmlib_intel_nhm_unc.c +index 4c27b070f2d6..6731f4045332 100644 +--- a/lib/pfmlib_intel_nhm_unc.c ++++ 
b/lib/pfmlib_intel_nhm_unc.c +@@ -82,7 +82,7 @@ static int + pfm_nhm_unc_get_encoding(void *this, pfmlib_event_desc_t *e) + { + pfm_intel_x86_reg_t reg; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + const intel_x86_entry_t *pe = this_pe(this); + unsigned int grpmsk, ugrpmsk = 0; + int umodmsk = 0, modmsk_r = 0; +diff --git a/lib/pfmlib_intel_snbep_unc.c b/lib/pfmlib_intel_snbep_unc.c +index 075ae33b3a57..1e80147fc1a3 100644 +--- a/lib/pfmlib_intel_snbep_unc.c ++++ b/lib/pfmlib_intel_snbep_unc.c +@@ -281,7 +281,7 @@ pfm_intel_snbep_unc_get_encoding(void *this, pfmlib_event_desc_t *e) + pfm_snbep_unc_reg_t reg; + pfm_snbep_unc_reg_t filters[INTEL_X86_MAX_FILTERS]; + pfm_snbep_unc_reg_t addr; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + uint64_t val, umask1, umask2; + int k, ret; + int has_cbo_tid = 0; +@@ -641,7 +641,7 @@ pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx) + } + + int +-pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + const intel_x86_entry_t *pe = this_pe(this); + const pfmlib_attr_desc_t *atdesc = this_atdesc(this); +diff --git a/lib/pfmlib_intel_snbep_unc_priv.h b/lib/pfmlib_intel_snbep_unc_priv.h +index 500ff84cc123..4984242c35bb 100644 +--- a/lib/pfmlib_intel_snbep_unc_priv.h ++++ b/lib/pfmlib_intel_snbep_unc_priv.h +@@ -329,7 +329,7 @@ extern int pfm_intel_hswep_unc_detect(void *this); + extern int pfm_intel_knl_unc_detect(void *this); + extern int pfm_intel_snbep_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); + extern int pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx); +-extern int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, 
pfmlib_event_attr_info_t *info); + + static inline int + is_cbo_filt_event(void *this, pfm_intel_x86_reg_t reg) +diff --git a/lib/pfmlib_intel_x86.c b/lib/pfmlib_intel_x86.c +index 497cf1b9246a..09a0f50a3a4e 100644 +--- a/lib/pfmlib_intel_x86.c ++++ b/lib/pfmlib_intel_x86.c +@@ -296,7 +296,7 @@ static int + intel_x86_check_pebs(void *this, pfmlib_event_desc_t *e) + { + const intel_x86_entry_t *pe = this_pe(this); +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int numasks = 0, pebs = 0; + int i; + +@@ -340,7 +340,7 @@ static int + intel_x86_check_max_grpid(void *this, pfmlib_event_desc_t *e, int max_grpid) + { + const intel_x86_entry_t *pe; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i, grpid; + + DPRINT("check: max_grpid=%d\n", max_grpid); +@@ -366,7 +366,7 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + + { + pfmlib_pmu_t *pmu = this; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + const intel_x86_entry_t *pe; + pfm_intel_x86_reg_t reg, reg2; + unsigned int grpmsk, ugrpmsk = 0; +@@ -964,7 +964,7 @@ pfm_intel_x86_validate_table(void *this, FILE *fp) + } + + int +-pfm_intel_x86_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_intel_x86_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + const intel_x86_entry_t *pe = this_pe(this); + const pfmlib_attr_desc_t *atdesc = this_atdesc(this); +@@ -1029,7 +1029,7 @@ pfm_intel_x86_get_event_info(void *this, int idx, pfm_event_info_t *info) + int + pfm_intel_x86_valid_pebs(pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i, npebs = 0, numasks = 0; + + /* first check at the event level */ +diff --git a/lib/pfmlib_intel_x86_perf_event.c b/lib/pfmlib_intel_x86_perf_event.c +index f346d4f92be5..0735ef9d88c1 100644 +--- a/lib/pfmlib_intel_x86_perf_event.c ++++ b/lib/pfmlib_intel_x86_perf_event.c +@@ -60,7 +60,7 @@ 
find_pmu_type_by_name(const char *name) + static int + has_ldlat(void *this, pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i; + + for (i = 0; i < e->nattrs; i++) { +@@ -217,7 +217,7 @@ pfm_intel_nhm_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e) + int + pfm_intel_x86_requesting_pebs(pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i; + + for (i = 0; i < e->nattrs; i++) { +@@ -233,7 +233,7 @@ pfm_intel_x86_requesting_pebs(pfmlib_event_desc_t *e) + static int + intel_x86_event_has_pebs(void *this, pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i; + + /* first check at the event level */ +diff --git a/lib/pfmlib_intel_x86_priv.h b/lib/pfmlib_intel_x86_priv.h +index 963b41a8a766..e2dfbf3d9b40 100644 +--- a/lib/pfmlib_intel_x86_priv.h ++++ b/lib/pfmlib_intel_x86_priv.h +@@ -335,7 +335,7 @@ extern int pfm_intel_x86_get_event_next(void *this, int idx); + extern int pfm_intel_x86_get_event_umask_first(void *this, int idx); + extern int pfm_intel_x86_get_event_umask_next(void *this, int idx, int attr); + extern int pfm_intel_x86_validate_table(void *this, FILE *fp); +-extern int pfm_intel_x86_get_event_attr_info(void *this, int idx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_intel_x86_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info); + extern int pfm_intel_x86_get_event_info(void *this, int idx, pfm_event_info_t *info); + extern int pfm_intel_x86_valid_pebs(pfmlib_event_desc_t *e); + extern int pfm_intel_x86_perf_event_encoding(pfmlib_event_desc_t *e, void *data); +diff --git a/lib/pfmlib_mips.c b/lib/pfmlib_mips.c +index 8357ea515045..61db613be433 100644 +--- a/lib/pfmlib_mips.c ++++ b/lib/pfmlib_mips.c +@@ -174,7 +174,7 @@ pfm_mips_get_encoding(void *this, pfmlib_event_desc_t *e) + + pfmlib_pmu_t *pmu = this; + const mips_entry_t *pe = this_pe(this); +- 
pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + pfm_mips_sel_reg_t reg; + uint64_t ival, cntmask = 0; + int plmmsk = 0, code; +@@ -333,7 +333,7 @@ pfm_mips_get_event_nattrs(void *this, int pidx) + } + + int +-pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + /* no umasks, so all attrs are modifiers */ + +diff --git a/lib/pfmlib_mips_priv.h b/lib/pfmlib_mips_priv.h +index c5112f510acf..1ed2bcba28c8 100644 +--- a/lib/pfmlib_mips_priv.h ++++ b/lib/pfmlib_mips_priv.h +@@ -107,7 +107,7 @@ extern int pfm_mips_get_event_first(void *this); + extern int pfm_mips_get_event_next(void *this, int idx); + extern int pfm_mips_event_is_valid(void *this, int pidx); + extern int pfm_mips_validate_table(void *this, FILE *fp); +-extern int pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); + extern int pfm_mips_get_event_info(void *this, int idx, pfm_event_info_t *info); + extern unsigned int pfm_mips_get_event_nattrs(void *this, int pidx); + +diff --git a/lib/pfmlib_perf_event.c b/lib/pfmlib_perf_event.c +index 8618d6070968..df18821a540d 100644 +--- a/lib/pfmlib_perf_event.c ++++ b/lib/pfmlib_perf_event.c +@@ -82,7 +82,7 @@ pfmlib_perf_event_encode(void *this, const char *str, int dfl_plm, void *data) + struct perf_event_attr my_attr, *attr; + pfmlib_pmu_t *pmu; + pfmlib_event_desc_t e; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + size_t orig_sz, asz, sz = sizeof(arg); + uint64_t ival; + int has_plm = 0, has_vmx_plm = 0; +@@ -357,7 +357,7 @@ static int + perf_get_os_attr_info(void *this, pfmlib_event_desc_t *e) + { + pfmlib_os_t *os = this; +- pfm_event_attr_info_t *info; ++ pfmlib_event_attr_info_t *info; + int i, k, j = e->npattrs; + + for (i = k = 0; 
os->atdesc[i].name; i++) { +diff --git a/lib/pfmlib_perf_event_pmu.c b/lib/pfmlib_perf_event_pmu.c +index 5b2d8104696a..5c81552da71e 100644 +--- a/lib/pfmlib_perf_event_pmu.c ++++ b/lib/pfmlib_perf_event_pmu.c +@@ -569,7 +569,7 @@ static int + pfmlib_perf_encode_tp(pfmlib_event_desc_t *e) + { + perf_umask_t *um; +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + int i, nu = 0; + + e->fstr[0] = '\0'; +@@ -607,7 +607,7 @@ pfmlib_perf_encode_tp(pfmlib_event_desc_t *e) + static int + pfmlib_perf_encode_hw_cache(pfmlib_event_desc_t *e) + { +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + perf_event_t *ent; + unsigned int msk, grpmsk; + uint64_t umask = 0; +@@ -733,7 +733,7 @@ pfm_perf_event_is_valid(void *this, int idx) + } + + static int +-pfm_perf_get_event_attr_info(void *this, int idx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_perf_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info) + { + perf_umask_t *um; + +diff --git a/lib/pfmlib_perf_event_raw.c b/lib/pfmlib_perf_event_raw.c +index e10d215912ea..71d944334876 100644 +--- a/lib/pfmlib_perf_event_raw.c ++++ b/lib/pfmlib_perf_event_raw.c +@@ -91,7 +91,7 @@ pfm_perf_raw_event_is_valid(void *this, int idx) + } + + static int +-pfm_perf_raw_get_event_attr_info(void *this, int idx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_perf_raw_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info) + { + return PFM_ERR_ATTR; + } +diff --git a/lib/pfmlib_power_priv.h b/lib/pfmlib_power_priv.h +index 8b5c3ac0ffcf..3b72d326e3bb 100644 +--- a/lib/pfmlib_power_priv.h ++++ b/lib/pfmlib_power_priv.h +@@ -101,7 +101,7 @@ typedef struct { + #define POWER8_PLM (POWER_PLM|PFM_PLMH) + + extern int pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info); +-extern int pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfm_event_attr_info_t *info); ++extern int 
pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info); + extern int pfm_gen_powerpc_get_encoding(void *this, pfmlib_event_desc_t *e); + extern int pfm_gen_powerpc_get_event_first(void *this); + extern int pfm_gen_powerpc_get_event_next(void *this, int idx); +diff --git a/lib/pfmlib_powerpc.c b/lib/pfmlib_powerpc.c +index f025dede599d..f32080d63b5e 100644 +--- a/lib/pfmlib_powerpc.c ++++ b/lib/pfmlib_powerpc.c +@@ -56,7 +56,7 @@ pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info) + } + + int +-pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfm_event_attr_info_t *info) ++pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info) + { + /* No attributes are supported */ + return PFM_ERR_ATTR; +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 33d7fdf2013d..2f4d2b9d494b 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -56,6 +56,28 @@ typedef struct { + pfm_attr_t type; /* used to validate value (if any) */ + } pfmlib_attr_desc_t; + ++typedef struct { ++ const char *name; /* attribute symbolic name */ ++ const char *desc; /* attribute description */ ++ const char *equiv; /* attribute is equivalent to */ ++ size_t size; /* struct sizeof */ ++ uint64_t code; /* attribute code */ ++ pfm_attr_t type; /* attribute type */ ++ int idx; /* attribute opaque index */ ++ pfm_attr_ctrl_t ctrl; /* what is providing attr */ ++ struct { ++ unsigned int is_dfl:1; /* is default umask */ ++ unsigned int is_precise:1; /* Intel X86: supports PEBS */ ++ unsigned int reserved_bits:30; ++ }; ++ union { ++ uint64_t dfl_val64; /* default 64-bit value */ ++ const char *dfl_str; /* default string value */ ++ int dfl_bool; /* default boolean value */ ++ int dfl_int; /* default integer value */ ++ }; ++} pfmlib_event_attr_info_t; ++ + /* + * attribute description passed to model-specific layer + */ +@@ -90,7 +112,7 @@ typedef struct { 
+ int count; /* number of entries in codes[] */ + pfmlib_attr_t attrs[PFMLIB_MAX_ATTRS]; /* list of requested attributes */ + +- pfm_event_attr_info_t *pattrs; /* list of possible attributes */ ++ pfmlib_event_attr_info_t *pattrs; /* list of possible attributes */ + char fstr[PFMLIB_EVT_MAX_NAME_LEN]; /* fully qualified event string */ + uint64_t codes[PFMLIB_MAX_ENCODING]; /* event encoding */ + void *os_data; +@@ -129,7 +151,7 @@ typedef struct pfmlib_pmu { + int (*event_is_valid)(void *this, int pidx); + int (*can_auto_encode)(void *this, int pidx, int uidx); + +- int (*get_event_attr_info)(void *this, int pidx, int umask_idx, pfm_event_attr_info_t *info); ++ int (*get_event_attr_info)(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info); + int (*get_event_encoding[PFM_OS_MAX])(void *this, pfmlib_event_desc_t *e); + + void (*validate_pattrs[PFM_OS_MAX])(void *this, pfmlib_event_desc_t *e); +diff --git a/lib/pfmlib_sparc.c b/lib/pfmlib_sparc.c +index f88b5512a5f4..fe8da0618d31 100644 +--- a/lib/pfmlib_sparc.c ++++ b/lib/pfmlib_sparc.c +@@ -165,7 +165,7 @@ int + pfm_sparc_get_encoding(void *this, pfmlib_event_desc_t *e) + { + const sparc_entry_t *pe = this_pe(this); +- pfm_event_attr_info_t *a; ++ pfmlib_event_attr_info_t *a; + pfm_sparc_reg_t reg; + int i; + +@@ -260,7 +260,7 @@ pfm_sparc_validate_table(void *this, FILE *fp) + } + + int +-pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info) ++pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) + { + const sparc_entry_t *pe = this_pe(this); + int idx; +diff --git a/lib/pfmlib_sparc_priv.h b/lib/pfmlib_sparc_priv.h +index 7de9b3dc327a..332651ff051e 100644 +--- a/lib/pfmlib_sparc_priv.h ++++ b/lib/pfmlib_sparc_priv.h +@@ -45,7 +45,7 @@ extern int pfm_sparc_get_event_first(void *this); + extern int pfm_sparc_get_event_next(void *this, int idx); + extern int pfm_sparc_event_is_valid(void *this, int pidx); + extern 
int pfm_sparc_validate_table(void *this, FILE *fp); +-extern int pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfm_event_attr_info_t *info); ++extern int pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); + extern int pfm_sparc_get_event_info(void *this, int idx, pfm_event_info_t *info); + extern unsigned int pfm_sparc_get_event_nattrs(void *this, int pidx); + +diff --git a/lib/pfmlib_torrent.c b/lib/pfmlib_torrent.c +index b8d697aa27ac..72991e7ec98a 100644 +--- a/lib/pfmlib_torrent.c ++++ b/lib/pfmlib_torrent.c +@@ -104,7 +104,7 @@ pfm_torrent_get_event_info(void *this, int pidx, pfm_event_info_t *info) + + static int + pfm_torrent_get_event_attr_info(void *this, int idx, int attr_idx, +- pfm_event_attr_info_t *info) ++ pfmlib_event_attr_info_t *info) + { + int m; + +-- +2.7.4 + +From 39d4b76fa96825ec65724eb94939a3b534a62fd0 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Mon, 3 Apr 2017 23:41:10 -0700 +Subject: [PATCH 6/7] enable generic support for 64-bit raw umask + +This patch modifies the generic code to handle 64-bit raw umasks +passed by users. 
+ +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_common.c | 3 ++- + lib/pfmlib_priv.h | 4 ++-- + 2 files changed, 4 insertions(+), 3 deletions(-) + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index f3c6dfa23e55..6ff44994203b 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -1011,7 +1011,8 @@ pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) + ainfo->name = "RAW_UMASK"; + ainfo->type = PFM_ATTR_RAW_UMASK; + ainfo->ctrl = PFM_ATTR_CTRL_PMU; +- ainfo->idx = strtoul(s, &endptr, 0); ++ /* can handle up to 64-bit raw umask */ ++ ainfo->idx = strtoull(s, &endptr, 0); + ainfo->equiv= NULL; + if (*endptr) { + DPRINT("raw umask (%s) is not a number\n"); +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 2f4d2b9d494b..b7503a76de01 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -63,8 +63,8 @@ typedef struct { + size_t size; /* struct sizeof */ + uint64_t code; /* attribute code */ + pfm_attr_t type; /* attribute type */ +- int idx; /* attribute opaque index */ +- pfm_attr_ctrl_t ctrl; /* what is providing attr */ ++ pfm_attr_ctrl_t ctrl; /* what is providing attr */ ++ uint64_t idx; /* attribute opaque index */ + struct { + unsigned int is_dfl:1; /* is default umask */ + unsigned int is_precise:1; /* Intel X86: supports PEBS */ +-- +2.7.4 + +From 088a1806676382e1a0324ba4c2d59b9d07a96caf Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 4 Apr 2017 09:42:25 -0700 +Subject: [PATCH 7/7] enable 38-bit raw umask for Intel offcore_response event + +This patch enables support for passing and encoding of 38-bit +offcore_response matrix umask. Without the patch, the raw umask +was limited to 32-bit which is not enough to cover all the possible +bits of the offcore_response event available since Intel SandyBridge. 
+ +$ examples/check_events offcore_response_0:0xffffff +Requested Event: offcore_response_0:0xffffff +Actual Event: ivb::OFFCORE_RESPONSE_0:0xffffff:k=1:u=1:e=0:i=0:c=0:t=0 +PMU : Intel Ivy Bridge +IDX : 155189325 +Codes : 0x5301b7 0xffffff + +The patch also adds tests to the validation code. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_intel_x86.c | 20 +++-- + tests/validate_x86.c | 232 +++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 246 insertions(+), 6 deletions(-) + +diff --git a/lib/pfmlib_intel_x86.c b/lib/pfmlib_intel_x86.c +index 09a0f50a3a4e..8fe93115dfa9 100644 +--- a/lib/pfmlib_intel_x86.c ++++ b/lib/pfmlib_intel_x86.c +@@ -481,16 +481,24 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + reg.sel_event_select = last_ucode; + } + } else if (a->type == PFM_ATTR_RAW_UMASK) { ++ int ofr_bits = 8; ++ uint64_t rmask; ++ ++ /* set limit on width of raw umask */ ++ if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE)) { ++ ofr_bits = 38; ++ if (e->pmu->pmu == PFM_PMU_INTEL_WSM || e->pmu->pmu == PFM_PMU_INTEL_WSM_DP) ++ ofr_bits = 16; ++ } ++ rmask = (1ULL << ofr_bits) - 1; + +- /* there can only be one RAW_UMASK per event */ +- +- /* sanity check */ +- if (a->idx & ~0xff) { +- DPRINT("raw umask is 8-bit wide\n"); ++ if (a->idx & ~rmask) { ++ DPRINT("raw umask is too wide max %d bits\n", ofr_bits); + return PFM_ERR_ATTR; + } ++ + /* override umask */ +- umask2 = a->idx & 0xff; ++ umask2 = a->idx & rmask; + ugrpmsk = grpmsk; + } else { + uint64_t ival = e->attrs[k].ival; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 906afba636e1..aa0aaa114d0d 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4523,6 +4523,238 @@ static const test_event_t x86_test_events[]={ + .codes[0] = 0x0825, + .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", + }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 
0xf, ++ .fstr = "wsm::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0xffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xffff, ++ .fstr = "wsm::OFFCORE_RESPONSE_0:0xffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "snb::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "snb::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "hsw::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "hsw::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = 
"hsw::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "skl::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "skl::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_1:0xfff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfff, ++ .fstr = "wsm::OFFCORE_RESPONSE_1:0xfff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "snb::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "snb::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ 
{ SRC_LINE, ++ .name = "snb::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "hsw::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "hsw::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = 
"skl::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "skl::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, + }; + + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) +-- +2.7.4 + +From 1eac17750c99cc29156d3cf2815b4bf0cdf1a1be Mon Sep 17 00:00:00 2001 +From: William Cohen +Date: Tue, 11 Apr 2017 11:22:59 -0400 +Subject: [PATCH] Also convert s390 to use the internal + pfmlib_event_attr_info_t + +Commit 321133e converted most of the architectures to use the internal +perflib_event_attr_info_t type. However, the s390 was missed in that +previous commit. This patch corrects the issue so libpfm compiles on +s390. +--- + lib/pfmlib_s390x_cpumf.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/lib/pfmlib_s390x_cpumf.c b/lib/pfmlib_s390x_cpumf.c +index db2a215..b5444ef 100644 +--- a/lib/pfmlib_s390x_cpumf.c ++++ b/lib/pfmlib_s390x_cpumf.c +@@ -254,7 +254,7 @@ static int pfm_cpumf_get_event_info(void *this, int idx, + } + + static int pfm_cpumf_get_event_attr_info(void *this, int idx, int umask_idx, +- pfm_event_attr_info_t *info) ++ pfmlib_event_attr_info_t *info) + { + /* Attributes are not supported */ + return PFM_ERR_ATTR; +-- +2.9.3 + diff --git a/SOURCES/libpfm-s390.patch b/SOURCES/libpfm-s390.patch new file mode 100644 index 0000000..ee9084a --- /dev/null +++ b/SOURCES/libpfm-s390.patch @@ -0,0 +1,1218 @@ +commit 9b38b5435de72ae4253bd8a6d6558e50fff618e7 +Author: Hendrik Brueckner +Date: Fri Feb 17 15:50:14 2017 +0100 + + s390/cpumf: add support for IBM z13/z13s counters + + This commit adds the counter definitions for the IBM z13/z13s + specific counters. 
These counters are available in the extended + and the new MT-diagnostic counter sets. + + Signed-off-by: Hendrik Brueckner + +diff --git a/lib/events/s390x_cpumf_events.h b/lib/events/s390x_cpumf_events.h +index e00b088..be9d7d9 100644 +--- a/lib/events/s390x_cpumf_events.h ++++ b/lib/events/s390x_cpumf_events.h +@@ -11,6 +11,7 @@ + #define CPUMF_CTRSET_PROBLEM_STATE 4 + #define CPUMF_CTRSET_CRYPTO 8 + #define CPUMF_CTRSET_EXTENDED 1 ++#define CPUMF_CTRSET_MT_DIAG 32 + + + static const pme_cpumf_ctr_t cpumcf_generic_counters[] = { +@@ -840,6 +841,458 @@ static const pme_cpumf_ctr_t cpumcf_zec12_counters[] = { + }, + }; + ++static const pme_cpumf_ctr_t cpumcf_z13_counters[] = { ++ { ++ .ctrnum = 128, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_WRITES_RO_EXCL", ++ .desc = "Counter:128 Name:L1D_WRITES_RO_EXCL A directory" ++ " write to the Level-1 Data cache where the line was" ++ " originally in a Read-Only state in the cache but" ++ " has been updated to be in the Exclusive state that" ++ " allows stores to the cache line.", ++ }, ++ { ++ .ctrnum = 129, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB1_WRITES", ++ .desc = "A translation entry has been written to the Level-1" ++ " Data Translation Lookaside Buffer", ++ }, ++ { ++ .ctrnum = 130, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB1_MISSES", ++ .desc = "Level-1 Data TLB miss in progress. 
Incremented by" ++ " one for every cycle a DTLB1 miss is in progress.", ++ }, ++ { ++ .ctrnum = 131, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB1_HPAGE_WRITES", ++ .desc = "A translation entry has been written to the Level-1" ++ " Data Translation Lookaside Buffer for a one-" ++ " megabyte page", ++ }, ++ { ++ .ctrnum = 132, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB1_GPAGE_WRITES", ++ .desc = "Counter:132 Name:DTLB1_GPAGE_WRITES A translation" ++ " entry has been written to the Level-1 Data" ++ " Translation Lookaside Buffer for a two-gigabyte" ++ " page.", ++ }, ++ { ++ .ctrnum = 133, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_L2D_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from the Level-2 Data cache", ++ }, ++ { ++ .ctrnum = 134, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "ITLB1_WRITES", ++ .desc = "A translation entry has been written to the Level-1" ++ " Instruction Translation Lookaside Buffer", ++ }, ++ { ++ .ctrnum = 135, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "ITLB1_MISSES", ++ .desc = "Level-1 Instruction TLB miss in progress." 
++ " Incremented by one for every cycle an ITLB1 miss is" ++ " in progress", ++ }, ++ { ++ .ctrnum = 136, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_L2I_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from the Level-2 Instruction cache", ++ }, ++ { ++ .ctrnum = 137, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_PTE_WRITES", ++ .desc = "A translation entry has been written to the Level-2" ++ " TLB Page Table Entry arrays", ++ }, ++ { ++ .ctrnum = 138, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_CRSTE_HPAGE_WRITES", ++ .desc = "A translation entry has been written to the Level-2" ++ " TLB Combined Region Segment Table Entry arrays for" ++ " a one-megabyte large page translation", ++ }, ++ { ++ .ctrnum = 139, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_CRSTE_WRITES", ++ .desc = "A translation entry has been written to the Level-2" ++ " TLB Combined Region Segment Table Entry arrays", ++ }, ++ { ++ .ctrnum = 140, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TEND", ++ .desc = "A TEND instruction has completed in a constrained" ++ " transactional-execution mode", ++ }, ++ { ++ .ctrnum = 141, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_NC_TEND", ++ .desc = "A TEND instruction has completed in a non-" ++ " constrained transactional-execution mode", ++ }, ++ { ++ .ctrnum = 143, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1C_TLB1_MISSES", ++ .desc = "Increments by one for any cycle where a Level-1" ++ " cache or Level-1 TLB miss is in progress.", ++ }, ++ { ++ .ctrnum = 144, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 145, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = 
"L1D_ONCHIP_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 146, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONNODE_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Node Level-4 cache", ++ }, ++ { ++ .ctrnum = 147, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONNODE_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Node Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 148, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONNODE_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Node Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 149, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 150, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONDRAWER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 151, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONDRAWER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 152, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES", ++ .desc = "A 
directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-4 cache", ++ }, ++ { ++ .ctrnum = 153, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-3 cache with" ++ " intervention", ++ }, ++ { ++ .ctrnum = 154, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-3 cache" ++ " without intervention", ++ }, ++ { ++ .ctrnum = 155, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-4 cache", ++ }, ++ { ++ .ctrnum = 156, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-3 cache with" ++ " intervention", ++ }, ++ { ++ .ctrnum = 157, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 158, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONNODE_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Node memory", ++ }, ++ { ++ .ctrnum = 159, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = 
"L1D_ONDRAWER_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer memory", ++ }, ++ { ++ .ctrnum = 160, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer memory", ++ }, ++ { ++ .ctrnum = 161, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Chip memory", ++ }, ++ { ++ .ctrnum = 162, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 163, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On Chip Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 164, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONNODE_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Node Level-4 cache", ++ }, ++ { ++ .ctrnum = 165, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONNODE_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Node Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 166, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONNODE_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " 
directory where the returned cache line was sourced" ++ " from an On-Node Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 167, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 168, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONDRAWER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 169, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONDRAWER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Drawer Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 170, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-4 cache", ++ }, ++ { ++ .ctrnum = 171, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-3 cache with" ++ " intervention", ++ }, ++ { ++ .ctrnum = 172, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Same-Column Level-3 cache" ++ " without intervention", ++ }, ++ { ++ .ctrnum = 173, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = 
"L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-4 cache", ++ }, ++ { ++ .ctrnum = 174, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-3 cache with" ++ " intervention", ++ }, ++ { ++ .ctrnum = 175, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Far-Column Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 176, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONNODE_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Node memory", ++ }, ++ { ++ .ctrnum = 177, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONDRAWER_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer memory", ++ }, ++ { ++ .ctrnum = 178, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer memory", ++ }, ++ { ++ .ctrnum = 179, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_MEM_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Chip memory", ++ }, ++ { ++ .ctrnum = 218, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_NC_TABORT", ++ .desc = "A 
transaction abort has occurred in a non-" ++ " constrained transactional-execution mode", ++ }, ++ { ++ .ctrnum = 219, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TABORT_NO_SPECIAL", ++ .desc = "A transaction abort has occurred in a constrained" ++ " transactional-execution mode and the CPU is not" ++ " using any special logic to allow the transaction to" ++ " complete", ++ }, ++ { ++ .ctrnum = 220, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TABORT_SPECIAL", ++ .desc = "A transaction abort has occurred in a constrained" ++ " transactional-execution mode and the CPU is using" ++ " special logic to allow the transaction to complete", ++ }, ++ { ++ .ctrnum = 448, ++ .ctrset = CPUMF_CTRSET_MT_DIAG, ++ .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", ++ .desc = "Cycle count with one thread active", ++ }, ++ { ++ .ctrnum = 449, ++ .ctrset = CPUMF_CTRSET_MT_DIAG, ++ .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", ++ .desc = "Cycle count with two threads active", ++ }, ++}; ++ + static const pme_cpumf_ctr_t cpumsf_counters[] = { + { + .ctrnum = 720896, +diff --git a/lib/pfmlib_s390x_cpumf.c b/lib/pfmlib_s390x_cpumf.c +index b5444ef..7273962 100644 +--- a/lib/pfmlib_s390x_cpumf.c ++++ b/lib/pfmlib_s390x_cpumf.c +@@ -128,6 +128,11 @@ static int pfm_cpumcf_init(void *this) + ext_set = cpumcf_zec12_counters; + ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_zec12_counters); + break; ++ case 2964: /* IBM z13 */ ++ case 2965: /* IBM z13s */ ++ ext_set = cpumcf_z13_counters; ++ ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z13_counters); ++ break; + default: + /* No extended counter set for this machine type or there + * was an error retrieving the machine type */ +commit 8f2653b8e2e18bad44ba1acc7f92c825f226ef71 +Author: Hendrik Brueckner +Date: Fri Oct 13 16:57:32 2017 +0200 + + s390/cpumf: add support for IBM z14 counters + + Add counter definitions for the IBM z14 hardware model. 
With z14, + the counters in the problem-state set are reduced and the counter + first version number is increased accordingly. Now, the counters + are processed depending on the counter facility versions. + + Signed-off-by: Hendrik Brueckner + +diff --git a/lib/events/s390x_cpumf_events.h b/lib/events/s390x_cpumf_events.h +index be9d7d9..c843bc3 100644 +--- a/lib/events/s390x_cpumf_events.h ++++ b/lib/events/s390x_cpumf_events.h +@@ -14,7 +14,7 @@ + #define CPUMF_CTRSET_MT_DIAG 32 + + +-static const pme_cpumf_ctr_t cpumcf_generic_counters[] = { ++static const pme_cpumf_ctr_t cpumcf_fvn1_counters[] = { + { + .ctrnum = 0, + .ctrset = CPUMF_CTRSET_BASIC, +@@ -87,6 +87,60 @@ static const pme_cpumf_ctr_t cpumcf_generic_counters[] = { + .name = "PROBLEM_STATE_L1D_PENALTY_CYCLES", + .desc = "Problem-State Level-1 D-Cache Penalty Cycle Count", + }, ++}; ++ ++static const pme_cpumf_ctr_t cpumcf_fvn3_counters[] = { ++ { ++ .ctrnum = 0, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "CPU_CYCLES", ++ .desc = "Cycle Count", ++ }, ++ { ++ .ctrnum = 1, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "INSTRUCTIONS", ++ .desc = "Instruction Count", ++ }, ++ { ++ .ctrnum = 2, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "L1I_DIR_WRITES", ++ .desc = "Level-1 I-Cache Directory Write Count", ++ }, ++ { ++ .ctrnum = 3, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "L1I_PENALTY_CYCLES", ++ .desc = "Level-1 I-Cache Penalty Cycle Count", ++ }, ++ { ++ .ctrnum = 4, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "L1D_DIR_WRITES", ++ .desc = "Level-1 D-Cache Directory Write Count", ++ }, ++ { ++ .ctrnum = 5, ++ .ctrset = CPUMF_CTRSET_BASIC, ++ .name = "L1D_PENALTY_CYCLES", ++ .desc = "Level-1 D-Cache Penalty Cycle Count", ++ }, ++ { ++ .ctrnum = 32, ++ .ctrset = CPUMF_CTRSET_PROBLEM_STATE, ++ .name = "PROBLEM_STATE_CPU_CYCLES", ++ .desc = "Problem-State Cycle Count", ++ }, ++ { ++ .ctrnum = 33, ++ .ctrset = CPUMF_CTRSET_PROBLEM_STATE, ++ .name = "PROBLEM_STATE_INSTRUCTIONS", ++ .desc = "Problem-State
Instruction Count", ++ }, ++}; ++ ++static const pme_cpumf_ctr_t cpumcf_svn_generic_counters[] = { + { + .ctrnum = 64, + .ctrset = CPUMF_CTRSET_CRYPTO, +@@ -1293,6 +1347,434 @@ static const pme_cpumf_ctr_t cpumcf_z13_counters[] = { + }, + }; + ++static const pme_cpumf_ctr_t cpumcf_z14_counters[] = { ++ { ++ .ctrnum = 128, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_WRITES_RO_EXCL", ++ .desc = "Counter:128 Name:L1D_WRITES_RO_EXCL A directory" ++ " write to the Level-1 Data cache where the line was" ++ " originally in a Read-Only state in the cache but" ++ " has been updated to be in the Exclusive state that" ++ " allows stores to the cache line", ++ }, ++ { ++ .ctrnum = 129, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB2_WRITES", ++ .desc = "A translation has been written into The Translation" ++ " Lookaside Buffer 2 (TLB2) and the request was made" ++ " by the data cache", ++ }, ++ { ++ .ctrnum = 130, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB2_MISSES", ++ .desc = "A TLB2 miss is in progress for a request made by" ++ " the data cache. 
Incremented by one for every TLB2" ++ " miss in progress for the Level-1 Data cache on this" ++ " cycle", ++ }, ++ { ++ .ctrnum = 131, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB2_HPAGE_WRITES", ++ .desc = "A translation entry was written into the Combined" ++ " Region and Segment Table Entry array in the Level-2" ++ " TLB for a one-megabyte page or a Last Host" ++ " Translation was done", ++ }, ++ { ++ .ctrnum = 132, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DTLB2_GPAGE_WRITES", ++ .desc = "A translation entry for a two-gigabyte page was" ++ " written into the Level-2 TLB", ++ }, ++ { ++ .ctrnum = 133, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_L2D_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from the Level-2 Data cache", ++ }, ++ { ++ .ctrnum = 134, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "ITLB2_WRITES", ++ .desc = "A translation entry has been written into the" ++ " Translation Lookaside Buffer 2 (TLB2) and the" ++ " request was made by the instruction cache", ++ }, ++ { ++ .ctrnum = 135, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "ITLB2_MISSES", ++ .desc = "A TLB2 miss is in progress for a request made by" ++ " the instruction cache. 
Incremented by one for every" ++ " TLB2 miss in progress for the Level-1 Instruction" ++ " cache in a cycle", ++ }, ++ { ++ .ctrnum = 136, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_L2I_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from the Level-2 Instruction cache", ++ }, ++ { ++ .ctrnum = 137, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_PTE_WRITES", ++ .desc = "A translation entry was written into the Page Table" ++ " Entry array in the Level-2 TLB", ++ }, ++ { ++ .ctrnum = 138, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_CRSTE_WRITES", ++ .desc = "Translation entries were written into the Combined" ++ " Region and Segment Table Entry array and the Page" ++ " Table Entry array in the Level-2 TLB", ++ }, ++ { ++ .ctrnum = 139, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TLB2_ENGINES_BUSY", ++ .desc = "The number of Level-2 TLB translation engines busy" ++ " in a cycle", ++ }, ++ { ++ .ctrnum = 140, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TEND", ++ .desc = "A TEND instruction has completed in a constrained" ++ " transactional-execution mode", ++ }, ++ { ++ .ctrnum = 141, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_NC_TEND", ++ .desc = "A TEND instruction has completed in a non-" ++ " constrained transactional-execution mode", ++ }, ++ { ++ .ctrnum = 143, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1C_TLB2_MISSES", ++ .desc = "Increments by one for any cycle where a level-1" ++ " cache or level-2 TLB miss is in progress", ++ }, ++ { ++ .ctrnum = 144, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 145, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_MEMORY_SOURCED_WRITES", ++ 
.desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Chip memory", ++ }, ++ { ++ .ctrnum = 146, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 147, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Cluster Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 148, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCLUSTER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Cluster memory", ++ }, ++ { ++ .ctrnum = 149, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Cluster Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 150, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Cluster Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 151, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from Off-Cluster memory", ++ }, ++ { ++ .ctrnum = 152, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory
where the returned cache line was sourced" ++ " from an Off-Cluster Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 153, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 154, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from Off-Drawer memory", ++ }, ++ { ++ .ctrnum = 155, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 156, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 157, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_OFFDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from Off-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 158, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1D_ONCHIP_L3_SOURCED_WRITES_RO", ++ .desc = "A directory write to the Level-1 Data cache" ++ " directory where the returned cache line was sourced" ++ " from On-Chip L3 but a read-only invalidate was done" ++ " to remove other copies of the cache line", ++ }, ++ { ++ .ctrnum = 162, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the 
returned cache line was sourced" ++ " from an On-Chip Level-3 cache without intervention", ++ }, ++ { ++ .ctrnum = 163, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Chip memory", ++ }, ++ { ++ .ctrnum = 164, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Chip Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 165, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an On-Cluster Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 166, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCLUSTER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Cluster memory", ++ }, ++ { ++ .ctrnum = 167, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Cluster Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 168, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Cluster Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 169, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory
where the returned cache line was sourced" ++ " from Off-Cluster memory", ++ }, ++ { ++ .ctrnum = 170, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Cluster Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 171, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Level-3 cache without" ++ " intervention", ++ }, ++ { ++ .ctrnum = 172, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_MEMORY_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from Off-Drawer memory", ++ }, ++ { ++ .ctrnum = 173, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES_IV", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from an Off-Drawer Level-3 cache with intervention", ++ }, ++ { ++ .ctrnum = 174, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_ONDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from On-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 175, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "L1I_OFFDRAWER_L4_SOURCED_WRITES", ++ .desc = "A directory write to the Level-1 Instruction cache" ++ " directory where the returned cache line was sourced" ++ " from Off-Drawer Level-4 cache", ++ }, ++ { ++ .ctrnum = 224, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "BCD_DFP_EXECUTION_SLOTS", ++ .desc = "Count of floating point execution slots used for" ++ " finished Binary Coded Decimal to Decimal Floating" ++ " 
Point conversions. Instructions: CDZT, CXZT, CZDT," ++ " CZXT", ++ }, ++ { ++ .ctrnum = 225, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "VX_BCD_EXECUTION_SLOTS", ++ .desc = "Count of floating point execution slots used for" ++ " finished vector arithmetic Binary Coded Decimal" ++ " instructions. Instructions: VAP, VSP, VMP, VMSP, VDP," ++ " VSDP, VRP, VLIP, VSRP, VPSOP, VCP, VTP, VPKZ, VUPKZ," ++ " VCVB, VCVBG, VCVD, VCVDG", ++ }, ++ { ++ .ctrnum = 226, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "DECIMAL_INSTRUCTIONS", ++ .desc = "Decimal instructions dispatched. Instructions: CVB," ++ " CVD, AP, CP, DP, ED, EDMK, MP, SRP, SP, ZAP", ++ }, ++ { ++ .ctrnum = 232, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "LAST_HOST_TRANSLATIONS", ++ .desc = "Last Host Translation done", ++ }, ++ { ++ .ctrnum = 243, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_NC_TABORT", ++ .desc = "A transaction abort has occurred in a non-" ++ " constrained transactional-execution mode", ++ }, ++ { ++ .ctrnum = 244, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TABORT_NO_SPECIAL", ++ .desc = "A transaction abort has occurred in a constrained" ++ " transactional-execution mode and the CPU is not" ++ " using any special logic to allow the transaction to" ++ " complete", ++ }, ++ { ++ .ctrnum = 245, ++ .ctrset = CPUMF_CTRSET_EXTENDED, ++ .name = "TX_C_TABORT_SPECIAL", ++ .desc = "A transaction abort has occurred in a constrained" ++ " transactional-execution mode and the CPU is using" ++ " special logic to allow the transaction to complete", ++ }, ++ { ++ .ctrnum = 448, ++ .ctrset = CPUMF_CTRSET_MT_DIAG, ++ .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", ++ .desc = "Cycle count with one thread active", ++ }, ++ { ++ .ctrnum = 449, ++ .ctrset = CPUMF_CTRSET_MT_DIAG, ++ .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", ++ .desc = "Cycle count with two threads active", ++ }, ++}; ++ + static const pme_cpumf_ctr_t cpumsf_counters[] = { + { + .ctrnum = 720896, +diff --git a/lib/pfmlib_s390x_cpumf.c
b/lib/pfmlib_s390x_cpumf.c +index 7273962..62b1457 100644 +--- a/lib/pfmlib_s390x_cpumf.c ++++ b/lib/pfmlib_s390x_cpumf.c +@@ -37,6 +37,8 @@ + #define CPUM_CF_DEVICE_DIR "/sys/bus/event_source/devices/cpum_cf" + #define CPUM_SF_DEVICE_DIR "/sys/bus/event_source/devices/cpum_sf" + #define SYS_INFO "/proc/sysinfo" ++#define SERVICE_LEVEL "/proc/service_levels" ++#define CF_VERSION_STR "CPU-MF: Counter facility: version=" + + + /* CPU-measurement counter list (pmu events) */ +@@ -99,6 +101,37 @@ out: + return machine_type; + } + ++static void get_cf_version(unsigned int *cfvn, unsigned int *csvn) ++{ ++ int rc; ++ FILE *fp; ++ char *buffer; ++ size_t buflen; ++ ++ *cfvn = *csvn = 0; ++ fp = fopen(SERVICE_LEVEL, "r"); ++ if (fp == NULL) ++ return; ++ ++ buffer = NULL; ++ while (pfmlib_getl(&buffer, &buflen, fp) != -1) { ++ /* skip empty lines */ ++ if (*buffer == '\n') ++ continue; ++ ++ /* look for 'CPU-MF: Counter facility: version=' entry */ ++ if (!strncmp(CF_VERSION_STR, buffer, strlen(CF_VERSION_STR))) { ++ rc = sscanf(buffer + strlen(CF_VERSION_STR), "%u.%u", ++ cfvn, csvn); ++ if (rc != 2) ++ *cfvn = *csvn = 0; ++ break; ++ } ++ } ++ fclose(fp); ++ free(buffer); ++} ++ + /* Initialize the PMU representation for CPUMF. 
+ * + * Set up the PMU events array based on +@@ -108,8 +141,33 @@ out: + static int pfm_cpumcf_init(void *this) + { + pfmlib_pmu_t *pmu = this; +- const pme_cpumf_ctr_t *ext_set; +- size_t generic_count, ext_set_count; ++ unsigned int cfvn, csvn; ++ const pme_cpumf_ctr_t *cfvn_set, *csvn_set, *ext_set; ++ size_t cfvn_set_count, csvn_set_count, ext_set_count, pme_count; ++ ++ /* obtain counter first/second version number */ ++ get_cf_version(&cfvn, &csvn); ++ ++ /* counters based on first version number */ ++ switch (cfvn) ++ { ++ case 1: ++ cfvn_set = cpumcf_fvn1_counters; ++ cfvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_fvn1_counters); ++ break; ++ case 3: ++ cfvn_set = cpumcf_fvn3_counters; ++ cfvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_fvn3_counters); ++ break; ++ default: ++ cfvn_set = NULL; ++ cfvn_set_count = 0; ++ break; ++ } ++ ++ /* counters based on second version number */ ++ csvn_set = cpumcf_svn_generic_counters; ++ csvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_svn_generic_counters); + + /* check and assign a machine-specific extended counter set */ + switch (get_machine_type()) { +@@ -133,6 +191,10 @@ static int pfm_cpumcf_init(void *this) + ext_set = cpumcf_z13_counters; + ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z13_counters); + break; ++ case 3906: /* IBM z14 */ ++ ext_set = cpumcf_z14_counters; ++ ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z14_counters); ++ break; + default: + /* No extended counter set for this machine type or there + * was an error retrieving the machine type */ +@@ -141,20 +203,30 @@ static int pfm_cpumcf_init(void *this) + break; + } + +- generic_count = LIBPFM_ARRAY_SIZE(cpumcf_generic_counters); +- +- cpumcf_pe = calloc(sizeof(*cpumcf_pe), generic_count + ext_set_count); ++ cpumcf_pe = calloc(sizeof(*cpumcf_pe), ++ cfvn_set_count + csvn_set_count + ext_set_count); + if (cpumcf_pe == NULL) + return PFM_ERR_NOMEM; + +- memcpy(cpumcf_pe, cpumcf_generic_counters, +- sizeof(*cpumcf_pe) * generic_count); ++ pme_count = 0; ++ 
memcpy(cpumcf_pe, cfvn_set, sizeof(*cpumcf_pe) * cfvn_set_count); ++ pme_count += cfvn_set_count; ++ memcpy((void *) (cpumcf_pe + pme_count), csvn_set, ++ sizeof(*cpumcf_pe) * csvn_set_count); ++ pme_count += csvn_set_count; + if (ext_set_count) +- memcpy((void *) (cpumcf_pe + generic_count), ++ memcpy((void *) (cpumcf_pe + pme_count), + ext_set, sizeof(*cpumcf_pe) * ext_set_count); ++ pme_count += ext_set_count; + + pmu->pe = cpumcf_pe; +- pmu->pme_count = generic_count + ext_set_count; ++ pmu->pme_count = pme_count; ++ ++ /* CPUM-CF provides fixed counters only. The number of installed ++ * counters depends on the version and hardware model up to ++ * CPUMF_COUNTER_MAX. ++ */ ++ pmu->num_fixed_cntrs = pme_count; + + return PFM_SUCCESS; + } +@@ -276,8 +348,8 @@ pfmlib_pmu_t s390x_cpum_cf_support = { + .num_fixed_cntrs = CPUMF_COUNTER_MAX, /* fixed counters only */ + .max_encoding = 1, + +- .pe = cpumcf_generic_counters, +- .pme_count = LIBPFM_ARRAY_SIZE(cpumcf_generic_counters), ++ .pe = NULL, ++ .pme_count = 0, + + .pmu_detect = pfm_cpumcf_detect, + .pmu_init = pfm_cpumcf_init, +diff --git a/lib/pfmlib_s390x_priv.h b/lib/pfmlib_s390x_priv.h +index 22c775a..48a96c3 100644 +--- a/lib/pfmlib_s390x_priv.h ++++ b/lib/pfmlib_s390x_priv.h +@@ -1,7 +1,7 @@ + #ifndef __PFMLIB_S390X_PRIV_H__ + #define __PFMLIB_S390X_PRIV_H__ + +-#define CPUMF_COUNTER_MAX 256 ++#define CPUMF_COUNTER_MAX 0xffff + typedef struct { + uint64_t ctrnum; /* counter number */ + unsigned int ctrset; /* counter set */ +commit 31ab4b33773750fbd13a1824e485805b70fc0bff +Author: Hendrik Brueckner +Date: Fri Feb 9 09:42:07 2018 +0100 + + s390/cpumf: check for counter facility availability + + If the counter facility is not available, counter information + are not being set up. Introduce checks to protect against + access to counter information in that case. 
+ + Signed-off-by: Hendrik Brueckner + +diff --git a/lib/pfmlib_s390x_cpumf.c b/lib/pfmlib_s390x_cpumf.c +index 62b1457..4e03fc4 100644 +--- a/lib/pfmlib_s390x_cpumf.c ++++ b/lib/pfmlib_s390x_cpumf.c +@@ -254,7 +254,9 @@ static int pfm_cpumf_get_encoding(void *this, pfmlib_event_desc_t *e) + + static int pfm_cpumf_get_event_first(void *this) + { +- return 0; ++ pfmlib_pmu_t *pmu = this; ++ ++ return !!pmu->pme_count ? 0 : -1; + } + + static int pfm_cpumf_get_event_next(void *this, int idx) +@@ -317,6 +319,9 @@ static int pfm_cpumf_get_event_info(void *this, int idx, + pfmlib_pmu_t *pmu = this; + const pme_cpumf_ctr_t *pe = this_pe(this); + ++ if (idx >= pmu->pme_count) ++ return PFM_ERR_INVAL; ++ + info->name = pe[idx].name; + info->desc = pe[idx].desc; + info->code = pe[idx].ctrnum; diff --git a/SOURCES/libpfm-updates.patch b/SOURCES/libpfm-updates.patch new file mode 100644 index 0000000..95ed8b2 --- /dev/null +++ b/SOURCES/libpfm-updates.patch @@ -0,0 +1,10351 @@ +From 756658bff2e346b72d54ae569a68ae4028cf541e Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 19 Feb 2016 20:12:23 +0100 +Subject: [PATCH] fix encoding of UNC_M_PRE_COUNT for HSW-EP and IVB-EP + +The encodings of the RD, WR, BYP umasks were wrong. +Added a couple of tests to check the encodings of this event.
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_hswep_unc_imc_events.h | 6 +++--- + lib/events/intel_ivbep_unc_imc_events.h | 6 +++--- + tests/validate_x86.c | 14 ++++++++++++++ + 3 files changed, 20 insertions(+), 6 deletions(-) + +diff --git a/lib/events/intel_hswep_unc_imc_events.h b/lib/events/intel_hswep_unc_imc_events.h +index 8b52be4..7f77615 100644 +--- a/lib/events/intel_hswep_unc_imc_events.h ++++ b/lib/events/intel_hswep_unc_imc_events.h +@@ -162,15 +162,15 @@ static const intel_x86_umask_t hswep_unc_m_pre_count[]={ + }, + { .uname = "RD", + .udesc = "Precharge due to read", +- .ucode = 0x100, ++ .ucode = 0x400, + }, + { .uname = "WR", + .udesc = "Precharhe due to write", +- .ucode = 0x200, ++ .ucode = 0x800, + }, + { .uname = "BYP", + .udesc = "Precharge due to bypass", +- .ucode = 0x800, ++ .ucode = 0x1000, + }, + }; + +diff --git a/lib/events/intel_ivbep_unc_imc_events.h b/lib/events/intel_ivbep_unc_imc_events.h +index 473afc4..ba60c7e 100644 +--- a/lib/events/intel_ivbep_unc_imc_events.h ++++ b/lib/events/intel_ivbep_unc_imc_events.h +@@ -162,15 +162,15 @@ static const intel_x86_umask_t ivbep_unc_m_pre_count[]={ + }, + { .uname = "RD", + .udesc = "Precharge due to read", +- .ucode = 0x100, ++ .ucode = 0x400, + }, + { .uname = "WR", + .udesc = "Precharhe due to write", +- .ucode = 0x200, ++ .ucode = 0x800, + }, + { .uname = "BYP", + .udesc = "Precharge due to bypass", +- .ucode = 0x800, ++ .ucode = 0x1000, + }, + }; + +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index a29b031..4bf8604 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -2664,6 +2664,13 @@ static const test_event_t x86_test_events[]={ + .fstr = "ivbep_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:t=0", + }, + { SRC_LINE, ++ .name = "ivbep_unc_imc0::UNC_M_PRE_COUNT:WR", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0802, ++ .fstr = "ivbep_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:t=0", ++ }, ++ { SRC_LINE, + .name = 
"ivbep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", + .ret = PFM_SUCCESS, + .count = 1, +@@ -3607,6 +3614,13 @@ static const test_event_t x86_test_events[]={ + .fstr = "hswep_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", + }, + { SRC_LINE, ++ .name = "hswep_unc_imc0::UNC_M_PRE_COUNT:WR", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0802, ++ .fstr = "hswep_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:i=0:t=0", ++ }, ++ { SRC_LINE, + .name = "hswep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", + .ret = PFM_SUCCESS, + .count = 1, +-- +2.9.3 + + +From 1fc70406adb18233251c31848a6fc372813599b2 Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Thu, 10 Mar 2016 13:43:58 -0600 +Subject: [PATCH] Update the POWER8 PVR values + +Update the POWER8 PVR values to include additional flavors +of the POWER8 processor. +The existing POWER8 entry is now POWER8E; this is to be +consistent with changes made on the kernel side. +(arch/powerpc/kernel/cputable.c) + +Signed-off-by: Will Schmidt +--- + lib/pfmlib_power8.c | 6 ++++-- + lib/pfmlib_power_priv.h | 4 +++- + 2 files changed, 7 insertions(+), 3 deletions(-) + +diff --git a/lib/pfmlib_power8.c b/lib/pfmlib_power8.c +index d274b59..ea964b7 100644 +--- a/lib/pfmlib_power8.c ++++ b/lib/pfmlib_power8.c +@@ -1,7 +1,7 @@ + /* + * pfmlib_power8.c : IBM Power8 support + * +- * Copyright (C) IBM Corporation, 2013. All rights reserved. ++ * Copyright (C) IBM Corporation, 2013-2016. All rights reserved.
+ * Contributed by Carl Love (carll@us.ibm.com) + * + * Permission is hereby granted, free of charge, to any person obtaining a copy +@@ -29,7 +29,9 @@ + static int + pfm_power8_detect(void* this) + { +- if (__is_processor(PV_POWER8)) ++ if (__is_processor(PV_POWER8) || ++ __is_processor(PV_POWER8E) || ++ __is_processor(PV_POWER8NVL)) + return PFM_SUCCESS; + return PFM_ERR_NOTSUPP; + } +diff --git a/lib/pfmlib_power_priv.h b/lib/pfmlib_power_priv.h +index 0d8b473..e66e7e9 100644 +--- a/lib/pfmlib_power_priv.h ++++ b/lib/pfmlib_power_priv.h +@@ -93,7 +93,9 @@ typedef struct { + #define PV_POWER7p 0x004a + #define PV_970MP 0x0044 + #define PV_970GX 0x0045 +-#define PV_POWER8 0x004b ++#define PV_POWER8E 0x004b ++#define PV_POWER8NVL 0x004c ++#define PV_POWER8 0x004d + + extern int pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info); + extern int pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfm_event_attr_info_t *info); +-- +2.9.3 + + +From c6f8a4db1b83eb6ad6dee81e33124d259e37c2c5 Mon Sep 17 00:00:00 2001 +From: Will Schmidt +Date: Thu, 10 Mar 2016 13:44:02 -0600 +Subject: [PATCH] A small assortment of cosmetic touch-ups. + +A small assortment of cosmetic touch-ups. 
+ +Signed-off-by: Will Schmidt +--- + lib/events/power8_events.h | 2 +- + lib/pfmlib_power7.c | 2 +- + lib/pfmlib_power_priv.h | 2 +- + 3 files changed, 3 insertions(+), 3 deletions(-) + +diff --git a/lib/events/power8_events.h b/lib/events/power8_events.h +index 2aee218..92337f8 100644 +--- a/lib/events/power8_events.h ++++ b/lib/events/power8_events.h +@@ -8,7 +8,7 @@ + /* + * File: power8_events.h + * CVS: +-Author: Carl Love ++* Author: Carl Love + * carll.ibm.com + * Mods: + * +diff --git a/lib/pfmlib_power7.c b/lib/pfmlib_power7.c +index ceab517..a32977c 100644 +--- a/lib/pfmlib_power7.c ++++ b/lib/pfmlib_power7.c +@@ -1,5 +1,5 @@ + /* +- * pfmlib_power7.c : IBM Power6 support ++ * pfmlib_power7.c : IBM Power7 support + * + * Copyright (C) IBM Corporation, 2009. All rights reserved. + * Contributed by Corey Ashford (cjashfor@us.ibm.com) +diff --git a/lib/pfmlib_power_priv.h b/lib/pfmlib_power_priv.h +index e66e7e9..04f1437 100644 +--- a/lib/pfmlib_power_priv.h ++++ b/lib/pfmlib_power_priv.h +@@ -77,7 +77,7 @@ typedef struct { + /* Processor Version Register (PVR) field extraction */ + + #define PVR_VER(pvr) (((pvr) >> 16) & 0xFFFF) /* Version field */ +-#define PVR_REV(pvr) (((pvr) >> 0) & 0xFFFF) /* Revison field */ ++#define PVR_REV(pvr) (((pvr) >> 0) & 0xFFFF) /* Revision field */ + + #define __is_processor(pv) (PVR_VER(mfspr(SPRN_PVR)) == (pv)) + +-- +2.9.3 + + +From f191f9048a3adb191bbde3dac1bddec5436250dc Mon Sep 17 00:00:00 2001 +From: Will Cohen +Date: Thu, 24 Mar 2016 07:11:35 +0100 +Subject: [PATCH] Limit functions visibility in libpfm + +Limiting functions and data structures visibility in libpfm.so +so they are hidden from other code linked with the library can allow +the compiler to generate better quality code and reduce linking +overhead on startup. Hiding the internal functions and data +structures also allows more flexibility in changing internal +implementation while keeping compatibility with previous versions of +the library.
+ +This patch limits libpfm to making visible the functions listed in the +header files it provides. The llvm clang compiler honors the gcc +visibility option and pragmas. According to the libabigail tool +abidiff, 59 functions and 154 variables were hidden as a result of this +change. The patch reduces the size of the shared library by about 14KB +(0.8%) on x86_64. + +Signed-off-by: William Cohen +--- + include/perfmon/perf_event.h | 4 ++++ + include/perfmon/pfmlib.h | 4 ++++ + include/perfmon/pfmlib_perf_event.h | 4 ++++ + lib/Makefile | 2 +- + 4 files changed, 13 insertions(+), 1 deletion(-) + +diff --git a/include/perfmon/perf_event.h b/include/perfmon/perf_event.h +index cadcec7..a11a8cd 100644 +--- a/include/perfmon/perf_event.h ++++ b/include/perfmon/perf_event.h +@@ -22,6 +22,8 @@ + #ifndef __PERFMON_PERF_EVENT_H__ + #define __PERFMON_PERF_EVENT_H__ + ++#pragma GCC visibility push(default) ++ + #include + #include /* for syscall numbers */ + #include +@@ -588,4 +590,6 @@ union perf_mem_data_src { + } + #endif + ++#pragma GCC visibility pop ++ + #endif /* __PERFMON_PERF_EVENT_H__ */ +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index a548be2..b05754b 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -26,6 +26,8 @@ + #ifndef __PFMLIB_H__ + #define __PFMLIB_H__ + ++#pragma GCC visibility push(default) ++ + #ifdef __cplusplus + extern "C" { + #endif +@@ -534,4 +536,6 @@ extern pfm_err_t pfm_get_event_encoding(const char *str, int dfl_plm, char **fst + } + #endif + ++#pragma GCC visibility pop ++ + #endif /* __PFMLIB_H__ */ +diff --git a/include/perfmon/pfmlib_perf_event.h b/include/perfmon/pfmlib_perf_event.h +index 8b3dae2..0516277 100644 +--- a/include/perfmon/pfmlib_perf_event.h ++++ b/include/perfmon/pfmlib_perf_event.h +@@ -25,6 +25,8 @@ + #include + #include + ++#pragma GCC visibility push(default) ++ + #ifdef __cplusplus + extern "C" { + #endif +@@ -61,4 +63,6 @@ extern pfm_err_t
pfm_get_perf_event_encoding(const char *str, + } + #endif + ++#pragma GCC visibility pop ++ + #endif /* __PFMLIB_PERF_EVENT_H__ */ +diff --git a/lib/Makefile b/lib/Makefile +index a2c5818..f035307 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -33,7 +33,7 @@ ifeq ($(SYS),Linux) + SRCS += pfmlib_perf_event_pmu.c pfmlib_perf_event.c pfmlib_perf_event_raw.c + endif + +-CFLAGS+=-D_REENTRANT -I. ++CFLAGS+=-D_REENTRANT -I. -fvisibility=hidden + + # + # list all library support modules +-- +2.9.3 + + +From 4f9fc8b50b761807b12b739372af48b22a46ad28 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Thu, 24 Mar 2016 07:35:31 +0100 +Subject: [PATCH] update Intel Skylake event table + +To match V18 published on download-01.org. +Basically adding missing: + ITLB_MISSES.WALK_COMPLETED_4K + ITLB_MISSES.WALK_COMPLETED_2M_4M + + DTLB_LOAD_MISSES.WALK_COMPLETED_4K + DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M + + DTLB_STORE_MISSES.WALK_COMPLETED_4K + DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_skl_events.h | 20 ++++++++++++++++++++ + tests/validate_x86.c | 12 ++++++++++++ + 2 files changed, 32 insertions(+) + +diff --git a/lib/events/intel_skl_events.h b/lib/events/intel_skl_events.h +index d48e87e..4980164 100644 +--- a/lib/events/intel_skl_events.h ++++ b/lib/events/intel_skl_events.h +@@ -223,6 +223,16 @@ static const intel_x86_umask_t skl_dtlb_load_misses[]={ + .ucode = 0xe00, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_4K", ++ .udesc = "Misses in all TLB levels causes a page walk of 4KB page size that completes", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WALK_COMPLETED_2M_4M", ++ .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page size that completes", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_ACTIVE", + .udesc = "Cycles with at least one hardware walker active for a load", + .ucode = 0x1000 | (0x1 << INTEL_X86_CMASK_BIT),
+@@ -257,6 +267,16 @@ static const intel_x86_umask_t skl_itlb_misses[]={ + .ucode = 0xe00, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_4K", ++ .udesc = "Misses in all TLB levels causes a page walk of 4KB page size that completes", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WALK_COMPLETED_2M_4M", ++ .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page size that completes", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_DURATION", + .udesc = "Cycles when PMH is busy with page walks", + .ucode = 0x1000, +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 4bf8604..84e08b2 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -3921,6 +3921,18 @@ static const test_event_t x86_test_events[]={ + .fstr = "skl::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "skl::dtlb_store_misses:walk_completed_2m_4m:c=1", ++ .count = 1, ++ .codes[0] = 0x1530449, ++ .fstr = "skl::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::rob_misc_events:lbr_inserts", ++ .count = 1, ++ .codes[0] = 0x5320cc, ++ .fstr = "skl::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, + .name = "skl::cycle_activity:stalls_mem_any:c=6", + .ret = PFM_ERR_ATTR_SET, + }, +-- +2.9.3 + + +From ec6289ddde0a8826f16158e00fb45636e25f0d06 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Thu, 24 Mar 2016 07:48:30 +0100 +Subject: [PATCH] update Intel Broadwell event table + +Based on V13 from download.01.org + +Following events added: + + ITLB_MISSES.WALK_COMPLETED_4K + ITLB_MISSES.WALK_COMPLETED_2M_4M + ITLB_MISSES.WALK_COMPLETED_1G + ITLB_MISSES.STLB_HIT_2M + + DTLB_LOAD_MISSES.WALK_COMPLETED_4K + DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M + DTLB_LOAD_MISSES.WALK_COMPLETED_1G + DTLB_LOAD_MISSES.STLB_HIT_2M + + DTLB_STORE_MISSES.WALK_COMPLETED_4K + 
DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M + DTLB_STORE_MISSES.WALK_COMPLETED_1G + DTLB_STORE_MISSES.STLB_HIT_2M + LOAD_HIT_PRE.SW_PREF + + BR_MISP_EXEC:TAKEN_RETURN_NEAR + + Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 46 +++++++++++++++++++++++++++++++++++++++---- + 1 file changed, 42 insertions(+), 4 deletions(-) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index 3d21a04..e59d61a 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -223,6 +223,11 @@ static const intel_x86_umask_t bdw_br_misp_exec[]={ + .ucode = 0xa000, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "TAKEN_RETURN_NEAR", ++ .udesc = "Taken speculative and retired mispredicted direct returns", ++ .ucode = 0x8800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + }; + + static const intel_x86_umask_t bdw_br_misp_retired[]={ +@@ -381,6 +386,16 @@ static const intel_x86_umask_t bdw_dtlb_load_misses[]={ + .ucode = 0x200, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_2M_4M", ++ .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page sizes that completes", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WALK_COMPLETED_1G", ++ .udesc = "Misses in all TLB levels causes a page walk of 1GB page sizes that completes", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_COMPLETED", + .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", + .ucode = 0xe00, +@@ -392,10 +407,15 @@ static const intel_x86_umask_t bdw_dtlb_load_misses[]={ + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "STLB_HIT_4K", +- .udesc = "Misses that miss the DTLB and hit the STLB (4K)", ++ .udesc = "Misses that miss the DTLB and hit the STLB (4KB)", + .ucode = 0x2000, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "STLB_HIT_2M", ++ .udesc = "Misses that miss the DTLB and hit the STLB (2MB)", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { 
.uname = "STLB_HIT", + .udesc = "Number of cache load STLB hits. No page walk", + .ucode = 0x6000, +@@ -410,10 +430,20 @@ static const intel_x86_umask_t bdw_itlb_misses[]={ + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED_4K", +- .udesc = "Misses in all TLB levels causes a page walk that completes (4K)", ++ .udesc = "Misses in all TLB levels causes a page walk that completes (4KB)", + .ucode = 0x200, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_2M_4M", ++ .udesc = "Misses in all TLB levels causes a page walk that completes (2MB/4MB)", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WALK_COMPLETED_1G", ++ .udesc = "Misses in all TLB levels causes a page walk that completes (1GB)", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_COMPLETED", + .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", + .ucode = 0xe00, +@@ -425,10 +455,15 @@ static const intel_x86_umask_t bdw_itlb_misses[]={ + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "STLB_HIT_4K", +- .udesc = "Misses that miss the DTLB and hit the STLB (4K)", ++ .udesc = "Misses that miss the DTLB and hit the STLB (4KB)", + .ucode = 0x2000, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "STLB_HIT_2M", ++ .udesc = "Misses that miss the DTLB and hit the STLB (2MB)", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "STLB_HIT", + .udesc = "Number of cache load STLB hits. 
No page walk", + .ucode = 0x6000, +@@ -969,7 +1004,10 @@ static const intel_x86_umask_t bdw_load_hit_pre[]={ + { .uname = "HW_PF", + .udesc = "Non software-prefetch load dispatches that hit FB allocated for hardware prefetch", + .ucode = 0x200, +- .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "SW_PF", ++ .udesc = "Non software-prefetch load dispatches that hit FB allocated for software prefetch", ++ .ucode = 0x100, + }, + }; + +-- +2.9.3 + + +From 9603a098df47994a03ffb0c4fdaed5a94fbf1c6f Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sat, 16 Apr 2016 05:26:11 +0200 +Subject: [PATCH] enable Intel Broadwell EP core PMU support + +This file enables full support for Intel Broadwell EP core PMU. +Prior, it was based on Broadwell desktop. This patch adds the +remote memory events. + +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 3 +- + lib/events/intel_bdw_events.h | 159 ++++++++++++++++++++++++++++-------------- + lib/pfmlib_common.c | 1 + + lib/pfmlib_intel_bdw.c | 37 +++++++++- + lib/pfmlib_priv.h | 1 + + tests/validate_x86.c | 26 ++++++- + 6 files changed, 171 insertions(+), 56 deletions(-) + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index b05754b..24a2a60 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -245,7 +245,7 @@ typedef enum { + PFM_PMU_ARM_CORTEX_A7, /* ARM Cortex A7 */ + + PFM_PMU_INTEL_HSW_EP, /* Intel Haswell EP */ +- PFM_PMU_INTEL_BDW, /* Intel Broadwell EP */ ++ PFM_PMU_INTEL_BDW, /* Intel Broadwell */ + + PFM_PMU_ARM_XGENE, /* Applied Micro X-Gene (ARMv8) */ + +@@ -296,6 +296,7 @@ typedef enum { + + PFM_PMU_INTEL_SKL, /* Intel Skylake */ + ++ PFM_PMU_INTEL_BDW_EP, /* Intel Broadwell EP */ + /* MUST ADD NEW PMU MODELS HERE */ + + PFM_PMU_MAX /* end marker */ +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index e59d61a..439d3c6 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -1092,10 +1092,28 @@ static const 
intel_x86_umask_t bdw_mem_load_uops_l3_hit_retired[]={ + + static const intel_x86_umask_t bdw_mem_load_uops_l3_miss_retired[]={ + { .uname = "LOCAL_DRAM", +- .udesc = "Retired load uops missing L3 cache but hitting local memory", ++ .udesc = "Retired load uops missing L3 cache but hitting local memory (Precise Event)", + .ucode = 0x100, +- .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS , + }, ++ { .uname = "REMOTE_DRAM", ++ .udesc = "Number of retired load uops that missed L3 but were service by remote RAM, snoop not needed, snoop miss, snoop hit data not forwarded (Precise Event)", ++ .ucode = 0x400, ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "REMOTE_HITM", ++ .udesc = "Number of retired load uops whose data sources was remote HITM (Precise Event)", ++ .ucode = 0x1000, ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "REMOTE_FWD", ++ .udesc = "Load uops that miss in the L3 whose data source was forwarded from a remote cache (Precise Event)", ++ .ucode = 0x2000, ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, + }; + + static const intel_x86_umask_t bdw_mem_load_uops_retired[]={ +@@ -1785,96 +1803,135 @@ static const intel_x86_umask_t bdw_offcore_response[]={ + .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, +- { .uname = "L4_HIT_LOCAL_L4", +- .udesc = "Supplier: L4 local hit", +- .ucode = 0x1ULL << (22+8), ++ { .uname = "L3_MISS_LOCAL", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (26+8), + .umodel = PFM_PMU_INTEL_BDW, +- .grpid = 1, ++ .grpid = 1, + }, +- { .uname = "L4_HIT_REMOTE_HOP0_L4", +- .udesc = "Supplier: L4 hit on L4 from same socket (hop0)", +- .ucode = 0x1ULL << (23+8), ++ { .uname = "LLC_MISS_LOCAL", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (26+8), ++ .uequiv = "L3_MISS_LOCAL", 
+ .umodel = PFM_PMU_INTEL_BDW, +- .grpid = 1, ++ .grpid = 1, + }, +- { .uname = "L4_HIT_REMOTE_HOP1_L4", +- .udesc = "Supplier: L4 hit on remote L4 with 1 hop", +- .ucode = 0x1ULL << (24+8), ++ { .uname = "LLC_MISS_LOCAL_DRAM", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (26+8), ++ .uequiv = "L3_MISS_LOCAL", + .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, +- { .uname = "L4_HIT_REMOTE_HOP2P_L4", +- .udesc = "Supplier: L4 hit on remote L4 with 2 hops", +- .ucode = 0x1ULL << (25+8), +- .umodel = PFM_PMU_INTEL_BDW, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses", ++ .ucode = 0xfULL << (26+8), + .grpid = 1, + }, +- { .uname = "L4_HIT", +- .udesc = "Supplier: L4 hits (covers all L4 hits)", +- .ucode = 0xfULL << (22+8), +- .umodel = PFM_PMU_INTEL_BDW, ++ { .uname = "L3_MISS_REMOTE_HOP0", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", ++ .ucode = 0x1ULL << (27+8), ++ .umodel = PFM_PMU_INTEL_BDW_EP, + .grpid = 1, + }, +- { .uname = "L3_MISS_LOCAL", +- .udesc = "Supplier: counts L3 misses to local DRAM", +- .ucode = 1ULL << (26+8), +- .umodel = PFM_PMU_INTEL_BDW, +- .grpid = 1, ++ { .uname = "L3_MISS_REMOTE_HOP0_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", ++ .ucode = 0x1ULL << (27+8), ++ .uequiv = "L3_MISS_REMOTE_HOP0", ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, + }, +- { .uname = "LLC_MISS_LOCAL", +- .udesc = "Supplier: counts L3 misses to local DRAM", +- .ucode = 1ULL << (26+8), +- .uequiv = "L3_MISS_LOCAL", +- .umodel = PFM_PMU_INTEL_BDW, +- .grpid = 1, ++ { .uname = "L3_MISS_REMOTE_HOP1", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", ++ .ucode = 0x1ULL << (28+8), ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS_REMOTE_HOP1_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", ++ .ucode = 0x1ULL << (28+8), ++ .uequiv = "L3_MISS_REMOTE_HOP1", ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, 
++ }, ++ { .uname = "L3_MISS_REMOTE_HOP2P", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 2P hops", ++ .ucode = 0x1ULL << (29+8), ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS_REMOTE_HOP2P_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote DRAM with 2P hops", ++ .ucode = 0x1ULL << (29+8), ++ .uequiv = "L3_MISS_REMOTE_HOP2P", ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS_REMOTE", ++ .udesc = "Supplier: counts L3 misses to remote node", ++ .ucode = 0x7ULL << (26+8), ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS_REMOTE_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote node", ++ .ucode = 0x7ULL << (26+8), ++ .uequiv = "L3_MISS_REMOTE", ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "SPL_HIT", ++ .udesc = "Supplier: counts L3 supplier hit", ++ .ucode = 0x1ULL << (30+8), ++ .grpid = 1, + }, +- { .uname = "SNP_NONE", +- .udesc = "Snoop: counts number of times no snoop-related information is available", ++ { .uname = "SNP_NONE", ++ .udesc = "Snoop: counts number of times no snoop-related information is available", + .ucode = 1ULL << (31+8), + .grpid = 2, + }, +- { .uname = "SNP_NOT_NEEDED", +- .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", ++ { .uname = "SNP_NOT_NEEDED", ++ .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", + .ucode = 1ULL << (32+8), + .grpid = 2, + }, +- { .uname = "SNP_MISS", +- .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", ++ { .uname = "SNP_MISS", ++ .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", + .ucode = 1ULL << (33+8), + .grpid = 2, + }, +- { .uname = "SNP_NO_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and it hit in at leas one snooped cache", ++ { .uname = "SNP_NO_FWD", ++ .udesc = "Snoop: counts number of 
times a snoop was needed and it hit in at leas one snooped cache", + .ucode = 1ULL << (34+8), + .grpid = 2, + }, +- { .uname = "SNP_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", ++ { .uname = "SNP_FWD", ++ .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", + .ucode = 1ULL << (35+8), + .grpid = 2, + }, + { .uname = "HITM", + .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", +- .ucode = 1ULL << (36+8), ++ .ucode = 1ULL << (36+8), + .uequiv = "SNP_HITM", +- .grpid = 2, ++ .grpid = 2, + }, + { .uname = "SNP_HITM", + .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", +- .ucode = 1ULL << (36+8), +- .grpid = 2, ++ .ucode = 1ULL << (36+8), ++ .grpid = 2, + }, + { .uname = "NON_DRAM", + .udesc = "Snoop: counts number of times target was a non-DRAM system address. This includes MMIO transactions", +- .ucode = 1ULL << (37+8), +- .grpid = 2, ++ .ucode = 1ULL << (37+8), ++ .grpid = 2, + }, + { .uname = "SNP_ANY", + .udesc = "Snoop: any snoop reason", +- .ucode = 0x7fULL << (31+8), ++ .ucode = 0x7fULL << (31+8), + .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:HITM:NON_DRAM", +- .uflags= INTEL_X86_DFL, +- .grpid = 2, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 2, + }, + }; + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index f9012fe..50d48fb 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -100,6 +100,7 @@ static pfmlib_pmu_t *pfmlib_pmus[]= + &intel_hsw_support, + &intel_hsw_ep_support, + &intel_bdw_support, ++ &intel_bdw_ep_support, + &intel_skl_support, + &intel_rapl_support, + &intel_snbep_unc_cb0_support, +diff --git a/lib/pfmlib_intel_bdw.c b/lib/pfmlib_intel_bdw.c +index ea3b7be..1de8438 100644 +--- a/lib/pfmlib_intel_bdw.c ++++ b/lib/pfmlib_intel_bdw.c +@@ -28,9 +28,13 @@ + + static const int bdw_models[] = { + 
61, /* Broadwell Core-M */ ++ 71, /* Broadwell + GT3e (Iris Pro graphics) */ ++ 0 ++}; ++ ++static const int bdwep_models[] = { + 79, /* Broadwell-EP, Xeon */ + 86, /* Broadwell-EP, Xeon D */ +- 71, /* Broadwell + GT3e (Iris Pro graphics) */ + 0 + }; + +@@ -71,3 +75,34 @@ pfmlib_pmu_t intel_bdw_support={ + .get_event_nattrs = pfm_intel_x86_get_event_nattrs, + .can_auto_encode = pfm_intel_x86_can_auto_encode, + }; ++ ++pfmlib_pmu_t intel_bdw_ep_support={ ++ .desc = "Intel Broadwell EP", ++ .name = "bdw_ep", ++ .pmu = PFM_PMU_INTEL_BDW_EP, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdw_pe), ++ .type = PFM_PMU_TYPE_CORE, ++ .supported_plm = INTEL_X86_PLM, ++ .num_cntrs = 8, /* consider with HT off by default */ ++ .num_fixed_cntrs = 3, ++ .max_encoding = 2, /* offcore_response */ ++ .pe = intel_bdw_pe, ++ .atdesc = intel_x86_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK ++ | INTEL_X86_PMU_FL_ECMASK, ++ .cpu_family = 6, ++ .cpu_models = bdwep_models, ++ .pmu_detect = pfm_intel_x86_model_detect, ++ .pmu_init = pfm_bdw_init, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .can_auto_encode = pfm_intel_x86_can_auto_encode, ++}; +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 4c075a2..2c760ea 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -252,6 +252,7 @@ extern pfmlib_pmu_t intel_ivb_ep_support; + extern pfmlib_pmu_t intel_hsw_support; + extern pfmlib_pmu_t intel_hsw_ep_support; + extern pfmlib_pmu_t intel_bdw_support; ++extern pfmlib_pmu_t 
intel_bdw_ep_support; + extern pfmlib_pmu_t intel_skl_support; + extern pfmlib_pmu_t intel_rapl_support; + extern pfmlib_pmu_t intel_snbep_unc_cb0_support; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 84e08b2..57d2ce0 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -3042,12 +3042,12 @@ static const test_event_t x86_test_events[]={ + .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +- .name = "bdw::offcore_response_0:l4_hit", ++ .name = "bdw::offcore_response_0:l3_miss", + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] =0x5301b7, +- .codes[1] = 0x3f83c08fffull, +- .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L4_HIT:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ .codes[1] = 0x3fbc008fffull, ++ .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, + .name = "bdw::offcore_response_1:any_data", +@@ -3058,6 +3058,26 @@ static const test_event_t x86_test_events[]={ + .fstr = "bdw::OFFCORE_RESPONSE_1:DMND_DATA_RD:PF_DATA_RD:PF_LLC_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3fbc008fffull, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3f88008fffull, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_REMOTE_HOP0:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, + .name = "hswep_unc_cbo1::UNC_C_CLOCKTICKS:u", + .ret = PFM_ERR_ATTR, + }, +-- +2.9.3 + + +From 
f009e5b7e06c611321c553aed3c0864d59536f32 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Mon, 25 Apr 2016 17:22:54 +0200 +Subject: [PATCH] add Intel Broadwell/Skylake RAPL support + +Add model numbers for Intel Broadwell and Skylake processors. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_intel_rapl.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/lib/pfmlib_intel_rapl.c b/lib/pfmlib_intel_rapl.c +index cdbf178..1413b5f 100644 +--- a/lib/pfmlib_intel_rapl.c ++++ b/lib/pfmlib_intel_rapl.c +@@ -102,7 +102,10 @@ pfm_rapl_detect(void *this) + case 60: /* Haswell */ + case 69: /* Haswell */ + case 70: /* Haswell */ +- case 71: /* Haswell */ ++ case 61: /* Broadwell */ ++ case 71: /* Broadwell */ ++ case 78: /* Skylake */ ++ case 94: /* Skylake H/S */ + /* already setup by default */ + break; + case 45: /* Sandy Bridg-EP */ +@@ -111,6 +114,8 @@ pfm_rapl_detect(void *this) + intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_srv_pe); + break; + case 63: /* Haswell-EP */ ++ case 79: /* Broadwell-EP */ ++ case 86: /* Broadwell D */ + intel_rapl_support.pe = intel_rapl_hswep_pe; + intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_hswep_pe); + break; +-- +2.9.3 + + +From 4dc4c6ada254f30eee8cd2ae27bb0869a111b613 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sat, 28 May 2016 03:49:04 +0200 +Subject: [PATCH] Allow raw umask for OFFCORE_RESPONSE on Intel core PMUs + +This patch makes it possible to specify the raw umask as +hexadecimal for the Intel core PMU OFFCORE_RESPONSE_* event. +This makes it possible to encode a umask which could have been +omitted by mistake from the library or not yet supported. + +$ examples/check_events offcore_response_0:0xffff + +Added validation tests for this new support. 
+ +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 4 +- + lib/pfmlib_intel_x86.c | 16 ++-- + tests/validate_x86.c | 232 +++++++++++++++++++++++++++++++++++++++++++++++ + 3 files changed, 243 insertions(+), 9 deletions(-) + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index 24a2a60..8921164 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -420,8 +420,8 @@ typedef struct { + size_t size; /* struct sizeof */ + uint64_t code; /* attribute code */ + pfm_attr_t type; /* attribute type */ +- int idx; /* attribute opaque index */ +- pfm_attr_ctrl_t ctrl; /* what is providing attr */ ++ uint64_t idx; /* attribute opaque index */ ++ pfm_attr_ctrl_t ctrl; /* what is providing attr */ + struct { + unsigned int is_dfl:1; /* is default umask */ + unsigned int is_precise:1; /* Intel X86: supports PEBS */ +diff --git a/lib/pfmlib_intel_x86.c b/lib/pfmlib_intel_x86.c +index bb671bd..031de0d 100644 +--- a/lib/pfmlib_intel_x86.c ++++ b/lib/pfmlib_intel_x86.c +@@ -471,16 +471,18 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + reg.sel_event_select = last_ucode; + } + } else if (a->type == PFM_ATTR_RAW_UMASK) { +- ++ uint64_t rmask; + /* there can only be one RAW_UMASK per event */ +- +- /* sanity check */ +- if (a->idx & ~0xff) { +- DPRINT("raw umask is 8-bit wide\n"); ++ if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE)) { ++ rmask = (1ULL << 38) - 1; ++ } else { ++ rmask = 0xff; ++ } ++ if (a->idx & ~rmask) { ++ DPRINT("raw umask is too wide\n"); + return PFM_ERR_ATTR; + } +- /* override umask */ +- umask2 = a->idx & 0xff; ++ umask2 = a->idx & rmask; + ugrpmsk = grpmsk; + } else { + uint64_t ival = e->attrs[k].ival; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 57d2ce0..0fce00c 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -3970,6 +3970,238 @@ static const test_event_t x86_test_events[]={ + .codes[0] = 0x15301a3, + .fstr = 
"hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", + }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "wsm::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "wsm::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "snb::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "snb::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "hsw::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = 
"hsw::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "hsw::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xf, ++ .fstr = "skl::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0xfffffffff, ++ .fstr = "skl::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_0:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "wsm::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "wsm::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = 
"snb::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "snb::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "snb::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, ++ .name = "ivb_ep::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "hsw::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "hsw::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "hsw::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = 
"bdw_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0xf", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xf, ++ .fstr = "skl::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0xfffffffff", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0xfffffffff, ++ .fstr = "skl::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::offcore_response_1:0x7fffffffff", ++ .ret = PFM_ERR_ATTR, ++ }, + }; + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) + +-- +2.9.3 + + +From 36a34982dafcf784e7d5636c8c4186fca6457c3d Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sat, 28 May 2016 04:08:36 +0200 +Subject: [PATCH] Fix offcore_response raw umask encodings for 32-bit in test + suite + +Constants with more than 32 bits must have the ull suffix in 32-bit mode. 
+ +Signed-off-by: Stephane Eranian +--- + tests/validate_x86.c | 24 ++++++++++++------------ + 1 file changed, 12 insertions(+), 12 deletions(-) + +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 0fce00c..09152f7 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -3983,7 +3983,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "wsm::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4003,7 +4003,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "snb::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4023,7 +4023,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4043,7 +4043,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "hsw::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +@@ -4063,7 +4063,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +@@ -4083,7 +4083,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301b7, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = 
"skl::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +@@ -4095,7 +4095,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "wsm::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4115,7 +4115,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "snb::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4135,7 +4135,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, +@@ -4155,7 +4155,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "hsw::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +@@ -4175,7 +4175,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +@@ -4195,7 +4195,7 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] = 0x5301bb, +- .codes[1] = 0xfffffffff, ++ .codes[1] = 0xfffffffffull, + .fstr = "skl::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, +-- +2.9.3 + + +From bfb9baf1c8a9533fde271d0436ecd465934dfa17 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sat, 28 May 2016 04:20:14 +0200 +Subject: [PATCH] Fix 
pfmlib_parse_event_attr() parsing of raw umask for 32-bit + +This function was using strtoul() instead of strtoull() now +that a->idx is uint64_t. That was causing bogus encodings +in 32-bit mode when the raw umask was larger than 32-bit. + +Also fix a few other bugs in debug prints related to this. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_common.c | 8 ++++---- + 1 file changed, 4 insertions(+), 4 deletions(-) + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index 50d48fb..05ce1c0 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -937,10 +937,10 @@ pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) + ainfo->name = "RAW_UMASK"; + ainfo->type = PFM_ATTR_RAW_UMASK; + ainfo->ctrl = PFM_ATTR_CTRL_PMU; +- ainfo->idx = strtoul(s, &endptr, 0); ++ ainfo->idx = strtoull(s, &endptr, 0); + ainfo->equiv= NULL; + if (*endptr) { +- DPRINT("raw umask (%s) is not a number\n"); ++ DPRINT("raw umask (%s) is not a number\n", s); + return PFM_ERR_ATTR; + } + +@@ -1291,9 +1291,9 @@ found: + for (i = 0; i < d->nattrs; i++) { + pfm_event_attr_info_t *a = attr(d, i); + if (a->type != PFM_ATTR_RAW_UMASK) +- DPRINT("%d %d %d %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); ++ DPRINT("%d %d %"PRIu64" %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); + else +- DPRINT("%d %d RAW_UMASK (0x%x)\n", d->event, i, a->idx); ++ DPRINT("%d %d RAW_UMASK (0x%"PRIx64")\n", d->event, i, a->idx); + } + error: + free(str); +-- +2.9.3 + + +From 487937da54654c699c932c6938484ddcdb91a297 Mon Sep 17 00:00:00 2001 +From: Phil Mucci +Date: Tue, 21 Jun 2016 09:20:42 -0700 +Subject: [PATCH] IBM Power8 add missing supported_plm mask initialization + +Without this patch, there was no way to encode privilege levels +at the perf_events OS level. They would come out as zero +because libpfm4 did not know the hardware supports filtering.
+ +Signed-off-by: Phil Mucci +--- + lib/pfmlib_power8.c | 1 + + lib/pfmlib_power_priv.h | 3 +++ + 2 files changed, 4 insertions(+) + +diff --git a/lib/pfmlib_power8.c b/lib/pfmlib_power8.c +index ea964b7..d30f036 100644 +--- a/lib/pfmlib_power8.c ++++ b/lib/pfmlib_power8.c +@@ -42,6 +42,7 @@ pfmlib_pmu_t power8_support={ + .pmu = PFM_PMU_POWER8, + .pme_count = LIBPFM_ARRAY_SIZE(power8_pe), + .type = PFM_PMU_TYPE_CORE, ++ .supported_plm = POWER8_PLM, + .num_cntrs = 4, + .num_fixed_cntrs = 2, + .max_encoding = 1, +diff --git a/lib/pfmlib_power_priv.h b/lib/pfmlib_power_priv.h +index 04f1437..8b5c3ac 100644 +--- a/lib/pfmlib_power_priv.h ++++ b/lib/pfmlib_power_priv.h +@@ -97,6 +97,9 @@ typedef struct { + #define PV_POWER8NVL 0x004c + #define PV_POWER8 0x004d + ++#define POWER_PLM (PFM_PLM0|PFM_PLM3) ++#define POWER8_PLM (POWER_PLM|PFM_PLMH) ++ + extern int pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info); + extern int pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfm_event_attr_info_t *info); + extern int pfm_gen_powerpc_get_encoding(void *this, pfmlib_event_desc_t *e); +-- +2.9.3 + + +From a31c90ed0aecdc3da5b47611d0068448cac38e5b Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 8 Jul 2016 15:05:32 -0700 +Subject: [PATCH] fix/add offcore_response:l3_miss alias for Intel + SNB/IVB/HSW/BDW/SKL + +This patch adds a L3_MISS alias for Intel Snb/IVB/HSW/BDW/SKL processors. +L3_MISS counts local and remote misses (if any). + +Adds the corresponding validation tests. 
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 16 +++++++---- + lib/events/intel_hsw_events.h | 11 ++++++-- + lib/events/intel_ivb_events.h | 14 ++++++++++ + lib/events/intel_skl_events.h | 5 ++-- + lib/events/intel_snb_events.h | 21 ++++++++++++++ + tests/validate_x86.c | 64 +++++++++++++++++++++++++++++++++++++++++-- + 6 files changed, 119 insertions(+), 12 deletions(-) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index 439d3c6..c22755e 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -1806,27 +1806,33 @@ static const intel_x86_umask_t bdw_offcore_response[]={ + { .uname = "L3_MISS_LOCAL", + .udesc = "Supplier: counts L3 misses to local DRAM", + .ucode = 1ULL << (26+8), +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_MISS_LOCAL", + .udesc = "Supplier: counts L3 misses to local DRAM", + .ucode = 1ULL << (26+8), + .uequiv = "L3_MISS_LOCAL", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_MISS_LOCAL_DRAM", + .udesc = "Supplier: counts L3 misses to local DRAM", + .ucode = 1ULL << (26+8), + .uequiv = "L3_MISS_LOCAL", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_MISS", +- .udesc = "Supplier: counts L3 misses", +- .ucode = 0xfULL << (26+8), ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (26+8), ++ .uequiv = "L3_MISS_LOCAL", + .grpid = 1, ++ .umodel = PFM_PMU_INTEL_BDW, ++ }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0xfULL << (26+8), ++ .uequiv = "L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", ++ .umodel = PFM_PMU_INTEL_BDW_EP, ++ .grpid = 1, + }, + { .uname = "L3_MISS_REMOTE_HOP0", + .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", +diff --git a/lib/events/intel_hsw_events.h b/lib/events/intel_hsw_events.h +index e4546cf..426119b 100644 +--- a/lib/events/intel_hsw_events.h 
++++ b/lib/events/intel_hsw_events.h +@@ -1784,12 +1784,19 @@ static const intel_x86_umask_t hsw_offcore_response[]={ + .grpid = 1, + }, + { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 0x1ULL << (22+8), ++ .uequiv = "L3_MISS_LOCAL", ++ .grpid = 1, ++ .umodel = PFM_PMU_INTEL_HSW, ++ }, ++ { .uname = "L3_MISS", + .udesc = "Supplier: counts L3 misses to local or remote DRAM", +- .ucode = 0xfULL << (26+8), ++ .ucode = 0x7ULL << (27+8) | 0x1ULL << (22+8), ++ .uequiv = "L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", + .umodel = PFM_PMU_INTEL_HSW_EP, + .grpid = 1, + }, +- + { .uname = "SPL_HIT", + .udesc = "Supplier: counts L3 supplier hit", + .ucode = 0x1ULL << (30+8), +diff --git a/lib/events/intel_ivb_events.h b/lib/events/intel_ivb_events.h +index cf5059e..fa29dcb 100644 +--- a/lib/events/intel_ivb_events.h ++++ b/lib/events/intel_ivb_events.h +@@ -1732,6 +1732,20 @@ static const intel_x86_umask_t ivb_offcore_response[]={ + .umodel = PFM_PMU_INTEL_IVB_EP, + .grpid = 1, + }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 0x1ULL << (22+8), ++ .grpid = 1, ++ .uequiv = "LLC_MISS_LOCAL", ++ .umodel = PFM_PMU_INTEL_IVB, ++ }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0x3ULL << (22+8), ++ .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", ++ .umodel = PFM_PMU_INTEL_IVB_EP, ++ .grpid = 1, ++ }, + { .uname = "LLC_MISS_REMOTE_DRAM", + .udesc = "Supplier: counts L3 misses to remote DRAM", + .ucode = 0xffULL << (23+8), +diff --git a/lib/events/intel_skl_events.h b/lib/events/intel_skl_events.h +index 4980164..3a107f3 100644 +--- a/lib/events/intel_skl_events.h ++++ b/lib/events/intel_skl_events.h +@@ -1471,12 +1471,13 @@ static const intel_x86_umask_t skl_offcore_response[]={ + { .uname = "L3_MISS_LOCAL", + .udesc = "Supplier: counts L3 misses to local DRAM", + .ucode = 1ULL << (26+8), +- .umodel = 
PFM_PMU_INTEL_SKL, + .grpid = 1, + }, + { .uname = "L3_MISS", + .udesc = "Supplier: counts L3 misses", +- .ucode = 0xfULL << (26+8), ++ .ucode = 0x1ULL << (26+8), ++ .uequiv = "L3_MISS_LOCAL", ++ .umodel = PFM_PMU_INTEL_SKL, + .grpid = 1, + }, + { .uname = "SPL_HIT", +diff --git a/lib/events/intel_snb_events.h b/lib/events/intel_snb_events.h +index 829f710..0d448b7 100644 +--- a/lib/events/intel_snb_events.h ++++ b/lib/events/intel_snb_events.h +@@ -1765,6 +1765,13 @@ static const intel_x86_umask_t snb_offcore_response[]={ + .uequiv = "LLC_MISS_LOCAL_DRAM", + .grpid = 1, + }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 0x1ULL << (22+8), ++ .grpid = 1, ++ .uequiv = "LLC_MISS_LOCAL", ++ .umodel = PFM_PMU_INTEL_SNB, ++ }, + { .uname = "LLC_MISS_REMOTE", + .udesc = "Supplier: counts L3 misses to remote DRAM", + .ucode = 0xffULL << (23+8), +@@ -1778,6 +1785,20 @@ static const intel_x86_umask_t snb_offcore_response[]={ + .grpid = 1, + .umodel = PFM_PMU_INTEL_SNB_EP, + }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0x3ULL << (22+8), ++ .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", ++ .umodel = PFM_PMU_INTEL_SNB_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0x3ULL << (22+8), ++ .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", ++ .umodel = PFM_PMU_INTEL_SNB_EP, ++ .grpid = 1, ++ }, + { .uname = "LLC_HITMESF", + .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", + .ucode = 0xfULL << (18+8), +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 09152f7..876453f 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -998,6 +998,14 @@ static const test_event_t x86_test_events[]={ + .fstr = "snb::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, ++ .name = "snb::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 
2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3f80408fffull, ++ .fstr = "snb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, + .name = "amd64_fam11h_turion::MAB_REQUESTS:DC_BUFFER_0", + .ret = PFM_ERR_NOTFOUND, + }, +@@ -1155,6 +1163,14 @@ static const test_event_t x86_test_events[]={ + .fstr = "ivb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, ++ .name = "ivb::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3f80408fffull, ++ .fstr = "ivb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, + .name = "ivb::DTLB_LOAD_MISSES:STLB_HIT", + .ret = PFM_SUCCESS, + .count = 1, +@@ -1777,6 +1793,15 @@ static const test_event_t x86_test_events[]={ + .fstr = "snb_ep::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", + }, + { SRC_LINE, ++ .name = "snb_ep::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3fffc08fffull, ++ .fstr = "snb_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL_DRAM:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ ++ { SRC_LINE, + .name = "snb_ep::mem_trans_retired:latency_above_threshold", + .ret = PFM_SUCCESS, + .count = 2, +@@ -2028,6 +2053,14 @@ static const test_event_t x86_test_events[]={ + .fstr = "ivb_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0" + }, + { SRC_LINE, ++ .name = "ivb_ep::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3fffc08fffull, ++ .fstr = 
"ivb_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", ++ }, ++ { SRC_LINE, + .name = "hsw::mem_trans_retired:latency_above_threshold:ldlat=3:u", + .ret = PFM_SUCCESS, + .count = 2, +@@ -2167,6 +2200,14 @@ static const test_event_t x86_test_events[]={ + .fstr = "hsw::OFFCORE_RESPONSE_0:DMND_DATA_RD:L3_HITS:SNP_FWD:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "hsw::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3f80408fffull, ++ .fstr = "hsw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, + .name = "ivb_unc_cbo0::unc_clockticks", + .ret = PFM_SUCCESS, + .count = 1, +@@ -2906,6 +2947,14 @@ static const test_event_t x86_test_events[]={ + .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_HITM:L3_HITE:L3_HITS:L3_HITF:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "hsw_ep::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3fb8408fffull, ++ .fstr = "hsw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, + .name = "bdw::mem_trans_retired:latency_above_threshold:ldlat=3:u", + .ret = PFM_SUCCESS, + .count = 2, +@@ -3046,8 +3095,8 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_SUCCESS, + .count = 2, + .codes[0] =0x5301b7, +- .codes[1] = 0x3fbc008fffull, +- .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ .codes[1] = 0x3f84008fffull, ++ .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, + .name = "bdw::offcore_response_1:any_data", +@@ -3063,7 +3112,7 @@ static const test_event_t 
x86_test_events[]={ + .count = 2, + .codes[0] =0x5301b7, + .codes[1] = 0x3fbc008fffull, +- .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, + .name = "bdw_ep::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", +@@ -3935,6 +3984,15 @@ static const test_event_t x86_test_events[]={ + .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "skl::offcore_response_0:l3_miss", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] =0x5301b7, ++ .codes[1] = 0x3f84008fffull, ++ .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ ++ { SRC_LINE, + .name = "skl::cycle_activity:0x6:c=6", + .count = 1, + .codes[0] = 0x65306a3, +-- +2.9.3 + + +From b74653d106613015632d865e5e934bf20137f3a7 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 1 Jul 2016 17:12:19 -0700 +Subject: [PATCH] add support for Intel Goldmont processor + +Enable support for Intel Goldmont processor core PMU. + +Based on official event table from download.01.org +version V6. 
+ +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 1 + + lib/Makefile | 2 + + lib/events/intel_glm_events.h | 1476 +++++++++++++++++++++++++++++++++++++++++ + lib/pfmlib_common.c | 1 + + lib/pfmlib_intel_glm.c | 73 ++ + lib/pfmlib_priv.h | 1 + + 6 files changed, 1554 insertions(+) + create mode 100644 lib/events/intel_glm_events.h + create mode 100644 lib/pfmlib_intel_glm.c + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index 8921164..ba3a54f 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -297,6 +297,7 @@ typedef enum { + PFM_PMU_INTEL_SKL, /* Intel Skylake */ + + PFM_PMU_INTEL_BDW_EP, /* Intel Broadwell EP */ ++ PFM_PMU_INTEL_GLM, /* Intel Goldmont */ + /* MUST ADD NEW PMU MODELS HERE */ + + PFM_PMU_MAX /* end marker */ +diff --git a/lib/Makefile b/lib/Makefile +index f035307..bd74d50 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -93,6 +93,7 @@ SRCS += pfmlib_amd64.c pfmlib_intel_core.c pfmlib_intel_x86.c \ + pfmlib_intel_hswep_unc_sbo.c \ + pfmlib_intel_knc.c \ + pfmlib_intel_slm.c \ ++ pfmlib_intel_glm.c \ + pfmlib_intel_netburst.c \ + pfmlib_amd64_k7.c pfmlib_amd64_k8.c pfmlib_amd64_fam10h.c \ + pfmlib_amd64_fam11h.c pfmlib_amd64_fam12h.c \ +@@ -238,6 +239,7 @@ INC_X86= pfmlib_intel_x86_priv.h \ + events/intel_hsw_events.h \ + events/intel_bdw_events.h \ + events/intel_skl_events.h \ ++ events/intel_glm_events.h \ + pfmlib_intel_snbep_unc_priv.h \ + events/intel_snbep_unc_cbo_events.h \ + events/intel_snbep_unc_ha_events.h \ +diff --git a/lib/events/intel_glm_events.h b/lib/events/intel_glm_events.h +new file mode 100644 +index 0000000..fd0b27c +--- /dev/null ++++ b/lib/events/intel_glm_events.h +@@ -0,0 +1,1476 @@ ++/* ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without 
limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * FILE AUTOMATICALLY GENERATED from download.01.org/perfmon/GLM/Goldmont_core_V6.json ++ * PMU: glm (Intel Goldmont) ++ */ ++static const intel_x86_umask_t glm_icache[]={ ++ { .uname = "HIT", ++ .udesc = "References per ICache line that are available in the ICache (hit). This event counts differently than Intel processors based on Silvermont microarchitecture", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "MISSES", ++ .udesc = "References per ICache line that are not available in the ICache (miss). This event counts differently than Intel processors based on Silvermont microarchitecture", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ACCESSES", ++ .udesc = "References per ICache line. 
This event counts differently than Intel processors based on Silvermont microarchitecture", ++ .ucode = 0x0300, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_l2_reject_xq[]={ ++ { .uname = "ALL", ++ .udesc = "Requests rejected by the XQ", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_hw_interrupts[]={ ++ { .uname = "RECEIVED", ++ .udesc = "Hardware interrupts received", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "PENDING_AND_MASKED", ++ .udesc = "Cycles pending interrupts are masked", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_br_misp_retired[]={ ++ { .uname = "ALL_BRANCHES", ++ .udesc = "Retired mispredicted branch instructions (Precise Event)", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "JCC", ++ .udesc = "Retired mispredicted conditional branch instructions (Precise Event)", ++ .ucode = 0x7e00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "TAKEN_JCC", ++ .udesc = "Retired mispredicted conditional branch instructions that were taken (Precise Event)", ++ .ucode = 0xfe00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "IND_CALL", ++ .udesc = "Retired mispredicted near indirect call instructions (Precise Event)", ++ .ucode = 0xfb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "RETURN", ++ .udesc = "Retired mispredicted near return instructions (Precise Event)", ++ .ucode = 0xf700, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { 
.uname = "NON_RETURN_IND", ++ .udesc = "Retired mispredicted instructions of near indirect Jmp or near indirect call (Precise Event)", ++ .ucode = 0xeb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_decode_restriction[]={ ++ { .uname = "PREDECODE_WRONG", ++ .udesc = "Decode restrictions due to predicting wrong instruction length", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_misalign_mem_ref[]={ ++ { .uname = "LOAD_PAGE_SPLIT", ++ .udesc = "Load uops that split a page (Precise Event)", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "STORE_PAGE_SPLIT", ++ .udesc = "Store uops that split a page (Precise Event)", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_inst_retired[]={ ++ { .uname = "ANY_P", ++ .udesc = "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers. This is an architectural performance event. This event uses a (_P)rogrammable general purpose performance counter. *This event is Precise Event capable: The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event. 
Note: Because PEBS records can be collected only on IA32_PMC0, only one event can use the PEBS facility at a time.", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_issue_slots_not_consumed[]={ ++ { .uname = "RESOURCE_FULL", ++ .udesc = "Unfilled issue slots per cycle because of a full resource in the backend", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "RECOVERY", ++ .udesc = "Unfilled issue slots per cycle to recover", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ANY", ++ .udesc = "Unfilled issue slots per cycle", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_itlb[]={ ++ { .uname = "MISS", ++ .udesc = "ITLB misses", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_longest_lat_cache[]={ ++ { .uname = "REFERENCE", ++ .udesc = "L2 cache requests", ++ .ucode = 0x4f00, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "MISS", ++ .udesc = "L2 cache request misses", ++ .ucode = 0x4100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_mem_load_uops_retired[]={ ++ { .uname = "L1_HIT", ++ .udesc = "Load uops retired that hit L1 data cache (Precise Event)", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "L1_MISS", ++ .udesc = "Load uops retired that missed L1 data cache (Precise Event)", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "L2_HIT", ++ .udesc = "Load uops retired 
that hit L2 (Precise Event)", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "L2_MISS", ++ .udesc = "Load uops retired that missed L2 (Precise Event)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "HITM", ++ .udesc = "Memory uop retired where cross core or cross module HITM occurred (Precise Event)", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "WCB_HIT", ++ .udesc = "Loads retired that hit WCB (Precise Event)", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DRAM_HIT", ++ .udesc = "Loads retired that came from DRAM (Precise Event)", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_ld_blocks[]={ ++ { .uname = "ALL_BLOCK", ++ .udesc = "Loads blocked (Precise Event)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "UTLB_MISS", ++ .udesc = "Loads blocked because address is not in the UTLB (Precise Event)", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "STORE_FORWARD", ++ .udesc = "Loads blocked due to store forward restriction (Precise Event)", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DATA_UNKNOWN", ++ .udesc = "Loads blocked due to store data not ready (Precise Event)", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "4K_ALIAS", ++ .udesc = "Loads blocked because address has 4k partial address false dependence (Precise Event)", ++ .ucode =
0x0400, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_dl1[]={ ++ { .uname = "DIRTY_EVICTION", ++ .udesc = "L1 Cache evictions for dirty data", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_cycles_div_busy[]={ ++ { .uname = "ALL", ++ .udesc = "Cycles a divider is busy", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "IDIV", ++ .udesc = "Cycles the integer divide unit is busy", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "FPDIV", ++ .udesc = "Cycles the FP divide unit is busy", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_ms_decoded[]={ ++ { .uname = "MS_ENTRY", ++ .udesc = "MS decode starts", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_uops_retired[]={ ++ { .uname = "ANY", ++ .udesc = "Uops retired (Precise Event)", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "MS", ++ .udesc = "MS uops retired (Precise Event)", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_offcore_response_1[]={ ++ { .uname = "DMND_DATA_RD", ++ .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. 
Does not count L2 data read prefetches or instruction fetches", ++ .ucode = 1ULL << (0 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "DMND_RFO", ++ .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", ++ .ucode = 1ULL << (1 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "DMND_CODE_RD", ++ .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "WB", ++ .udesc = "Request: number of writebacks (modified to exclusive) transactions", ++ .ucode = 1ULL << (3 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_DATA_RD", ++ .udesc = "Request: number of data cacheline reads generated by L2 prefetcher", ++ .ucode = 1ULL << (4 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_RFO", ++ .udesc = "Request: number of RFO requests generated by L2 prefetcher", ++ .ucode = 1ULL << (5 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PARTIAL_READS", ++ .udesc = "Request: number of partial reads", ++ .ucode = 1ULL << (7 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PARTIAL_WRITES", ++ .udesc = "Request: number of partial writes", ++ .ucode = 1ULL << (8 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD", ++ }, ++ { .uname = "UC_CODE_READS", ++ .udesc = "Request: number of uncached code reads", ++ .ucode = 1ULL << (9 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "BUS_LOCKS", ++ .udesc = "Request: number of bus lock and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "STRM_ST", ++ .udesc = "Request: number of streaming store requests for full cacheline", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ {
.uname = "SW_PF", ++ .udesc = "Request: number of cacheline requests due to software prefetch", ++ .ucode = 1ULL << (12 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_L1_DATA_RD", ++ .udesc = "Request: number of data cacheline reads generated by L1 data prefetcher", ++ .ucode = 1ULL << (13 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PARTIAL_STRM_ST", ++ .udesc = "Request: number of streaming store requests for partial cacheline", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "OTHER", ++ .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", ++ .ucode = 1ULL << (15 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "ANY_CODE_RD", ++ .udesc = "Request: combination of PF_CODE_RD | DMND_CODE_RD | PF_L3_CODE_RD", ++ .ucode = 0x24400, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD:DMND_CODE_RD:PF_L3_CODE_RD", ++ }, ++ { .uname = "ANY_IFETCH", ++ .udesc = "Request: combination of PF_CODE_RD | PF_L3_CODE_RD", ++ .ucode = 0x24000, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD:PF_L3_CODE_RD", ++ }, ++ { .uname = "ANY_REQUEST", ++ .udesc = "Request: combination of all request umasks", ++ .ucode = 0x8fff00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER", ++ }, ++ { .uname = "ANY_DATA", ++ .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_L3_DATA_RD", ++ .ucode = 0x9100, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD", ++ }, ++ { .uname = "ANY_RFO", ++ .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_L3_RFO", ++ .ucode = 0x12200, ++ .grpid = 0, ++ .ucntmsk = 0xffull, 
++ .uequiv = "DMND_RFO:PF_RFO:PF_L3_RFO", ++ }, ++ { .uname = "ANY_RESPONSE", ++ .udesc = "Response: any response type", ++ .ucode = 1ULL << (16 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "NO_SUPP", ++ .udesc = "Supplier: counts number of times supplier information is not available", ++ .ucode = 1ULL << (17 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITM", ++ .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", ++ .ucode = 1ULL << (18 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITE", ++ .udesc = "Supplier: counts L3 hits in E-state", ++ .ucode = 1ULL << (19 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITS", ++ .udesc = "Supplier: counts L3 hits in S-state", ++ .ucode = 1ULL << (20 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HIT", ++ .udesc = "Supplier: counts L3 hits in any state (M, E, S)", ++ .ucode = 7ULL << (18 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ .umodel = PFM_PMU_INTEL_GLM, ++ .uequiv = "L3_HITM:L3_HITE:L3_HITS", ++ }, ++ { .uname = "L3_MISS_LOCAL_DRAM", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (22 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_MISS_REMOTE_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote DRAM", ++ .ucode = 0x7fULL << (23 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0xffULL << (22 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ .uequiv = "L3_MISS_REMOTE_DRAM:L3_MISS_LOCAL_DRAM", ++ }, ++ { .uname = "SPL_HIT", ++ .udesc = "Supplier: counts L3 supplier hit", ++ .ucode = 1ULL << (30 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NONE", ++ .udesc = "Snoop: counts number of times no snoop-related information is available", ++ .ucode = 1ULL << (31 + 
8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NOT_NEEDED", ++ .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", ++ .ucode = 1ULL << (32 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_MISS", ++ .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", ++ .ucode = 1ULL << (33 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NO_FWD", ++ .udesc = "Snoop: counts number of times a snoop was needed and it hit another cache but no data was forwarded", ++ .ucode = 1ULL << (34 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_FWD", ++ .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", ++ .ucode = 1ULL << (35 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_HITM", ++ .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", ++ .ucode = 1ULL << (36 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NON_DRAM", ++ .udesc = "Snoop: counts number of times target was a non-DRAM system address. 
This includes MMIO transactions", ++ .ucode = 1ULL << (37 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_ANY", ++ .udesc = "Snoop: any snoop reason", ++ .ucode = 0x7fULL << (31 + 8), ++ .uflags = INTEL_X86_DFL, ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM", ++ }, ++}; ++ ++static const intel_x86_umask_t glm_machine_clears[]={ ++ { .uname = "SMC", ++ .udesc = "Self-Modifying Code detected", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "MEMORY_ORDERING", ++ .udesc = "Machine clears due to memory ordering issue", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "FP_ASSIST", ++ .udesc = "Machine clears due to FP assists", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DISAMBIGUATION", ++ .udesc = "Machine clears due to memory disambiguation", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ALL", ++ .udesc = "All machine clears", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_br_inst_retired[]={ ++ { .uname = "ALL_BRANCHES", ++ .udesc = "Retired branch instructions (Precise Event)", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "JCC", ++ .udesc = "Retired conditional branch instructions (Precise Event)", ++ .ucode = 0x7e00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "TAKEN_JCC", ++ .udesc = "Retired conditional branch instructions that were taken (Precise Event)", ++ .ucode = 0xfe00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = 
"CALL", ++ .udesc = "Retired near call instructions (Precise Event)", ++ .ucode = 0xf900, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "REL_CALL", ++ .udesc = "Retired near relative call instructions (Precise Event)", ++ .ucode = 0xfd00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "IND_CALL", ++ .udesc = "Retired near indirect call instructions (Precise Event)", ++ .ucode = 0xfb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "RETURN", ++ .udesc = "Retired near return instructions (Precise Event)", ++ .ucode = 0xf700, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "NON_RETURN_IND", ++ .udesc = "Retired instructions of near indirect Jmp or call (Precise Event)", ++ .ucode = 0xeb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "FAR_BRANCH", ++ .udesc = "Retired far branch instructions (Precise Event)", ++ .ucode = 0xbf00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_fetch_stall[]={ ++ { .uname = "ICACHE_FILL_PENDING_CYCLES", ++ .udesc = "Cycles where code-fetch is stalled and an ICache miss is outstanding. 
This is not the same as an ICache Miss", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_uops_not_delivered[]={ ++ { .uname = "ANY", ++ .udesc = "Uops requested but not-delivered to the back-end per cycle", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_mem_uops_retired[]={ ++ { .uname = "ALL_LOADS", ++ .udesc = "Load uops retired (Precise Event)", ++ .ucode = 0x8100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ALL_STORES", ++ .udesc = "Store uops retired (Precise Event)", ++ .ucode = 0x8200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ALL", ++ .udesc = "Memory uops retired (Precise Event)", ++ .ucode = 0x8300, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DTLB_MISS_LOADS", ++ .udesc = "Load uops retired that missed the DTLB (Precise Event)", ++ .ucode = 0x1100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DTLB_MISS_STORES", ++ .udesc = "Store uops retired that missed the DTLB (Precise Event)", ++ .ucode = 0x1200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "DTLB_MISS", ++ .udesc = "Memory uops retired that missed the DTLB (Precise Event)", ++ .ucode = 0x1300, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "LOCK_LOADS", ++ .udesc = "Locked load uops retired (Precise Event)", ++ .ucode = 0x2100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "SPLIT_LOADS", ++ .udesc = "Load uops retired that split a cache-line (Precise Event)", ++ .ucode = 0x4100, ++ .uflags = INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "SPLIT_STORES", ++ .udesc = "Store uops retired that split a cache-line (Precise Event)", ++ .ucode = 0x4200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "SPLIT", ++ .udesc = "Memory uops retired that split a cache-line (Precise Event)", ++ .ucode = 0x4300, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_uops_issued[]={ ++ { .uname = "ANY", ++ .udesc = "Uops issued to the back end per cycle", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_offcore_response_0[]={ ++ { .uname = "DMND_DATA_RD", ++ .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", ++ .ucode = 1ULL << (0 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "DMND_RFO", ++ .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", ++ .ucode = 1ULL << (1 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "DMND_CODE_RD", ++ .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "DMND_IFETCH", ++ .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_CODE_RD", ++ }, ++ { .uname = "WB", ++ .udesc = "Request: number of writebacks (modified to exclusive) transactions", ++ .ucode = 1ULL << (3 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_DATA_RD", ++ .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", ++ .ucode = 1ULL << (4 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_RFO", ++ .udesc = "Request: number of RFO requests generated by L2 prefetchers", ++ .ucode = 1ULL << (5 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_CODE_RD", ++ .udesc = "Request: number of code reads generated by L2 prefetchers", ++ .ucode = 1ULL << (6 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_IFETCH", ++ .udesc = "Request: number of code reads generated by L2 prefetchers", ++ .ucode = 1ULL << (6 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD", ++ }, ++ { .uname = "PF_L3_DATA_RD", ++ .udesc = "Request: number of L2 prefetcher requests to L3 for loads", ++ .ucode = 1ULL << (7 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_L3_RFO", ++ .udesc = "Request: number of RFO requests generated by L2 prefetcher", ++ .ucode = 1ULL << (8 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_L3_CODE_RD", ++ .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", ++ .ucode = 1ULL << (9 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_L3_IFETCH", ++ .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", ++ .ucode = 1ULL << (9 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_L3_CODE_RD", ++ }, ++ { .uname = "SPLIT_LOCK_UC_LOCK", ++ .udesc = "Request: number of bus lock and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname 
= "BUS_LOCKS", ++ .udesc = "Request: number of bus lock and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "SPLIT_LOCK_UC_LOCK", ++ }, ++ { .uname = "BUS_LOCK", ++ .udesc = "Request: number of bus lock and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "SPLIT_LOCK_UC_LOCK", ++ }, ++ { .uname = "STRM_ST", ++ .udesc = "Request: number of streaming store requests", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "OTHER", ++ .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", ++ .ucode = 1ULL << (15 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "ANY_CODE_RD", ++ .udesc = "Request: combination of PF_CODE_RD | DMND_CODE_RD | PF_L3_CODE_RD", ++ .ucode = 0x24400, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD:DMND_CODE_RD:PF_L3_CODE_RD", ++ }, ++ { .uname = "ANY_IFETCH", ++ .udesc = "Request: combination of PF_CODE_RD | PF_L3_CODE_RD", ++ .ucode = 0x24000, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "PF_CODE_RD:PF_L3_CODE_RD", ++ }, ++ { .uname = "ANY_REQUEST", ++ .udesc = "Request: combination of all request umasks", ++ .ucode = 0x8fff00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER", ++ }, ++ { .uname = "ANY_DATA", ++ .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_L3_DATA_RD", ++ .ucode = 0x9100, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD", ++ }, ++ { .uname = "ANY_RFO", ++ .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_L3_RFO", ++ .ucode = 0x12200, ++ .grpid = 0, ++ .ucntmsk = 0xffull, 
++ .uequiv = "DMND_RFO:PF_RFO:PF_L3_RFO", ++ }, ++ { .uname = "ANY_RESPONSE", ++ .udesc = "Response: any response type", ++ .ucode = 1ULL << (16 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "NO_SUPP", ++ .udesc = "Supplier: counts number of times supplier information is not available", ++ .ucode = 1ULL << (17 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITM", ++ .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", ++ .ucode = 1ULL << (18 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITE", ++ .udesc = "Supplier: counts L3 hits in E-state", ++ .ucode = 1ULL << (19 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HITS", ++ .udesc = "Supplier: counts L3 hits in S-state", ++ .ucode = 1ULL << (20 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_HIT", ++ .udesc = "Supplier: counts L3 hits in any state (M, E, S)", ++ .ucode = 7ULL << (18 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ .umodel = PFM_PMU_INTEL_GLM, ++ .uequiv = "L3_HITM:L3_HITE:L3_HITS", ++ }, ++ { .uname = "L3_MISS_LOCAL_DRAM", ++ .udesc = "Supplier: counts L3 misses to local DRAM", ++ .ucode = 1ULL << (22 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_MISS_REMOTE_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote DRAM", ++ .ucode = 0x7fULL << (23 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L3_MISS", ++ .udesc = "Supplier: counts L3 misses to local or remote DRAM", ++ .ucode = 0xffULL << (22 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ .uequiv = "L3_MISS_REMOTE_DRAM:L3_MISS_LOCAL_DRAM", ++ }, ++ { .uname = "SPL_HIT", ++ .udesc = "Supplier: counts L3 supplier hit", ++ .ucode = 1ULL << (30 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NONE", ++ .udesc = "Snoop: counts number of times no snoop-related information is available", ++ .ucode = 1ULL << (31 + 
8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NOT_NEEDED", ++ .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", ++ .ucode = 1ULL << (32 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_MISS", ++ .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", ++ .ucode = 1ULL << (33 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NO_FWD", ++ .udesc = "Snoop: counts number of times a snoop was needed and it hit another cache but no data was forwarded", ++ .ucode = 1ULL << (34 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_FWD", ++ .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", ++ .ucode = 1ULL << (35 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_HITM", ++ .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", ++ .ucode = 1ULL << (36 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_NON_DRAM", ++ .udesc = "Snoop: counts number of times target was a non-DRAM system address. 
This includes MMIO transactions", ++ .ucode = 1ULL << (37 + 8), ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SNP_ANY", ++ .udesc = "Snoop: any snoop reason", ++ .ucode = 0x7fULL << (31 + 8), ++ .uflags = INTEL_X86_DFL, ++ .grpid = 2, ++ .ucntmsk = 0xffull, ++ .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM", ++ }, ++}; ++ ++static const intel_x86_umask_t glm_core_reject_l2q[]={ ++ { .uname = "ALL", ++ .udesc = "Requests rejected by the L2Q ", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_page_walks[]={ ++ { .uname = "D_SIDE_CYCLES", ++ .udesc = "Duration of D-side page-walks in cycles", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "I_SIDE_CYCLES", ++ .udesc = "Duration of I-side pagewalks in cycles", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "CYCLES", ++ .udesc = "Duration of page-walks in cycles", ++ .ucode = 0x0300, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_baclears[]={ ++ { .uname = "ALL", ++ .udesc = "BACLEARs asserted for any branch type", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "RETURN", ++ .udesc = "BACLEARs asserted for return branch", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "COND", ++ .udesc = "BACLEARs asserted for conditional branch", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_umask_t glm_cpu_clk_unhalted[]={ ++ { .uname = "CORE", ++ .udesc = "Core cycles when core is not halted (Fixed event)", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 
0x200000000ull, ++ }, ++ { .uname = "REF_TSC", ++ .udesc = "Reference cycles when core is not halted (Fixed event)", ++ .ucode = 0x0300, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0x400000000ull, ++ }, ++ { .uname = "CORE_P", ++ .udesc = "Core cycles when core is not halted", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "REF", ++ .udesc = "Reference cycles when core is not halted", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++}; ++ ++static const intel_x86_entry_t intel_glm_pe[]={ ++ { .name = "ICACHE", ++ .desc = "References per ICache line that are available in the ICache (hit). This event counts differently than Intel processors based on Silvermont microarchitecture", ++ .code = 0x80, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_icache), ++ .umasks = glm_icache, ++ }, ++ { .name = "L2_REJECT_XQ", ++ .desc = "Requests rejected by the XQ", ++ .code = 0x30, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_l2_reject_xq), ++ .umasks = glm_l2_reject_xq, ++ }, ++ { .name = "HW_INTERRUPTS", ++ .desc = "Hardware interrupts received", ++ .code = 0xcb, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_hw_interrupts), ++ .umasks = glm_hw_interrupts, ++ }, ++ { .name = "BR_MISP_RETIRED", ++ .desc = "Retired mispredicted branch instructions (Precise Event)", ++ .code = 0xc5, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_br_misp_retired), ++ .umasks = glm_br_misp_retired, ++ }, ++ { .name = "DECODE_RESTRICTION", ++ .desc = "Decode restrictions due to predicting wrong instruction length", ++ .code = 0xe9, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = 
LIBPFM_ARRAY_SIZE(glm_decode_restriction), ++ .umasks = glm_decode_restriction, ++ }, ++ { .name = "MISALIGN_MEM_REF", ++ .desc = "Load uops that split a page (Precise Event)", ++ .code = 0x13, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_misalign_mem_ref), ++ .umasks = glm_misalign_mem_ref, ++ }, ++ { .name = "INST_RETIRED", ++ .desc = "Instructions retired (Precise Event)", ++ .code = 0xc0, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0x10000000full, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_inst_retired), ++ .umasks = glm_inst_retired, ++ }, ++ { .name = "INSTRUCTION_RETIRED", ++ .desc = "Number of instructions retired", ++ .code = 0xc0, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0x1000000ffull, ++ .ngrp = 0, ++ }, ++ { .name = "ISSUE_SLOTS_NOT_CONSUMED", ++ .desc = "Unfilled issue slots per cycle because of a full resource in the backend", ++ .code = 0xca, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_issue_slots_not_consumed), ++ .umasks = glm_issue_slots_not_consumed, ++ }, ++ { .name = "ITLB", ++ .desc = "ITLB misses", ++ .code = 0x81, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_itlb), ++ .umasks = glm_itlb, ++ }, ++ { .name = "LONGEST_LAT_CACHE", ++ .desc = "L2 cache requests", ++ .code = 0x2e, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_longest_lat_cache), ++ .umasks = glm_longest_lat_cache, ++ }, ++ { .name = "MEM_LOAD_UOPS_RETIRED", ++ .desc = "Load uops retired that hit L1 data cache (Precise Event)", ++ .code = 0xd1, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_mem_load_uops_retired), ++ .umasks = glm_mem_load_uops_retired, ++ }, ++ { .name = "LD_BLOCKS", ++ .desc = "Loads blocked (Precise 
Event)", ++ .code = 0x03, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_ld_blocks), ++ .umasks = glm_ld_blocks, ++ }, ++ { .name = "DL1", ++ .desc = "L1 Cache evictions for dirty data", ++ .code = 0x51, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_dl1), ++ .umasks = glm_dl1, ++ }, ++ { .name = "CYCLES_DIV_BUSY", ++ .desc = "Cycles a divider is busy", ++ .code = 0xcd, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_cycles_div_busy), ++ .umasks = glm_cycles_div_busy, ++ }, ++ { .name = "MS_DECODED", ++ .desc = "MS decode starts", ++ .code = 0xe7, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_ms_decoded), ++ .umasks = glm_ms_decoded, ++ }, ++ { .name = "UOPS_RETIRED", ++ .desc = "Uops retired (Precise Event)", ++ .code = 0xc2, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_uops_retired), ++ .umasks = glm_uops_retired, ++ }, ++ { .name = "OFFCORE_RESPONSE_1", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .code = 0x1bb, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xffull, ++ .flags = INTEL_X86_NHM_OFFCORE, ++ .ngrp = 3, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_offcore_response_1), ++ .umasks = glm_offcore_response_1, ++ }, ++ { .name = "MACHINE_CLEARS", ++ .desc = "Self-Modifying Code detected", ++ .code = 0xc3, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_machine_clears), ++ .umasks = glm_machine_clears, ++ }, ++ { .name = "BR_INST_RETIRED", ++ .desc = "Retired branch instructions (Precise Event)", ++ .code = 0xc4, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ 
.numasks = LIBPFM_ARRAY_SIZE(glm_br_inst_retired), ++ .umasks = glm_br_inst_retired, ++ }, ++ { .name = "FETCH_STALL", ++ .desc = "Cycles where code-fetch is stalled and an ICache miss is outstanding. This is not the same as an ICache Miss", ++ .code = 0x86, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_fetch_stall), ++ .umasks = glm_fetch_stall, ++ }, ++ { .name = "UOPS_NOT_DELIVERED", ++ .desc = "Uops requested but not-delivered to the back-end per cycle", ++ .code = 0x9c, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_uops_not_delivered), ++ .umasks = glm_uops_not_delivered, ++ }, ++ { .name = "MISPREDICTED_BRANCH_RETIRED", ++ .desc = "Number of mispredicted branch instructions retired", ++ .code = 0xc5, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xffull, ++ .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", ++ .ngrp = 0, ++ }, ++ { .name = "INSTRUCTIONS_RETIRED", ++ .desc = "Number of instructions retired", ++ .code = 0xc0, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0x1000000ffull, ++ .equiv = "INSTRUCTION_RETIRED", ++ .ngrp = 0, ++ }, ++ { .name = "MEM_UOPS_RETIRED", ++ .desc = "Load uops retired (Precise Event)", ++ .code = 0xd0, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_mem_uops_retired), ++ .umasks = glm_mem_uops_retired, ++ }, ++ { .name = "UOPS_ISSUED", ++ .desc = "Uops issued to the back end per cycle", ++ .code = 0x0e, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_uops_issued), ++ .umasks = glm_uops_issued, ++ }, ++ { .name = "OFFCORE_RESPONSE_0", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .code = 0x1b7, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xffull, ++ .flags = INTEL_X86_NHM_OFFCORE, ++ .ngrp = 3, ++ .numasks = 
LIBPFM_ARRAY_SIZE(glm_offcore_response_0), ++ .umasks = glm_offcore_response_0, ++ }, ++ { .name = "UNHALTED_REFERENCE_CYCLES", ++ .desc = "Unhalted reference cycles. Ticks at constant reference frequency", ++ .code = 0x0300, ++ .modmsk = INTEL_FIXED3_ATTRS, ++ .cntmsk = 0x400000000ull, ++ .flags = INTEL_X86_FIXED, ++ .ngrp = 0, ++ }, ++ { .name = "BRANCH_INSTRUCTIONS_RETIRED", ++ .desc = "Number of branch instructions retired", ++ .code = 0xc4, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xffull, ++ .equiv = "BR_INST_RETIRED:ALL_BRANCHES", ++ .ngrp = 0, ++ }, ++ { .name = "CORE_REJECT_L2Q", ++ .desc = "Requests rejected by the L2Q ", ++ .code = 0x31, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_core_reject_l2q), ++ .umasks = glm_core_reject_l2q, ++ }, ++ { .name = "PAGE_WALKS", ++ .desc = "Duration of D-side page-walks in cycles", ++ .code = 0x05, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_page_walks), ++ .umasks = glm_page_walks, ++ }, ++ { .name = "BACLEARS", ++ .desc = "BACLEARs asserted for any branch type", ++ .code = 0xe6, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xfull, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_baclears), ++ .umasks = glm_baclears, ++ }, ++ { .name = "CPU_CLK_UNHALTED", ++ .desc = "Core cycles when core is not halted (Fixed event)", ++ .code = 0x00, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0x60000000full, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(glm_cpu_clk_unhalted), ++ .umasks = glm_cpu_clk_unhalted, ++ }, ++ { .name = "UNHALTED_CORE_CYCLES", ++ .desc = "Core clock cycles whenever the clock signal on the specific core is running (not halted)", ++ .code = 0x3c, ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0x200000000ull, ++ .ngrp = 0, ++ }, ++}; +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index 05ce1c0..4c4c376 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -125,6 +125,7 @@ static 
pfmlib_pmu_t *pfmlib_pmus[]= + &intel_snbep_unc_r3qpi1_support, + &intel_knc_support, + &intel_slm_support, ++ &intel_glm_support, + &intel_ivbep_unc_cb0_support, + &intel_ivbep_unc_cb1_support, + &intel_ivbep_unc_cb2_support, +diff --git a/lib/pfmlib_intel_glm.c b/lib/pfmlib_intel_glm.c +new file mode 100644 +index 0000000..0b8bd9d +--- /dev/null ++++ b/lib/pfmlib_intel_glm.c +@@ -0,0 +1,73 @@ ++/* ++ * pfmlib_intel_glm.c : Intel Goldmont core PMU ++ * ++ * Copyright (c) 2016 Google ++ * Contributed by Stephane Eranian ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "events/intel_glm_events.h" ++ ++static const int glm_models[] = { ++ 92, /* Goldmont */ ++ 95, /* Goldmont Denverton */ ++ 0 ++}; ++ ++static int ++pfm_intel_glm_init(void *this) ++{ ++ pfm_intel_x86_cfg.arch_version = 3; ++ return PFM_SUCCESS; ++} ++ ++pfmlib_pmu_t intel_glm_support={ ++ .desc = "Intel Goldmont", ++ .name = "glm", ++ .pmu = PFM_PMU_INTEL_GLM, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_glm_pe), ++ .type = PFM_PMU_TYPE_CORE, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 3, ++ .max_encoding = 2, ++ .pe = intel_glm_pe, ++ .atdesc = intel_x86_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .supported_plm = INTEL_X86_PLM, ++ ++ .cpu_family = 6, ++ .cpu_models = glm_models, ++ .pmu_detect = pfm_intel_x86_model_detect, ++ .pmu_init = pfm_intel_glm_init, ++ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), ++ ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++}; +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 2c760ea..0d106a4 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -353,6 +353,7 @@ extern pfmlib_pmu_t intel_hswep_unc_r3qpi2_support; + extern pfmlib_pmu_t intel_hswep_unc_irp_support; + extern pfmlib_pmu_t intel_knc_support; + extern pfmlib_pmu_t intel_slm_support; ++extern pfmlib_pmu_t intel_glm_support; + extern pfmlib_pmu_t power4_support; + extern pfmlib_pmu_t ppc970_support; + extern pfmlib_pmu_t ppc970mp_support; +-- +2.9.3 + + +From 
c7e1e2ad413997c0cce36b040681e9e5bf6a8ef8 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Wed, 20 Jul 2016 22:16:18 -0700 +Subject: [PATCH] update Intel Goldmont support + +This patch fixes errors in the initial commit (b74653d10661) +for Intel Goldmont core PMU. Mostly the offcore_response support +was incorrect. + +This patch series adds support for the average latency cycle +feature of offcore_response on Intel Goldmont. As a consequence +a new umask/event flag called INTEL_X86_EXCL_GRP_BUT_0 is introduced. +It allows a umask to restrict which other umasks can be combined. +It allows the current umask group and group 0 to be used; no other +group can be used. This feature is used to support the average latency +encodings in offcore_response on this PMU. + +The patch also adds the proper man page, libpfm_intel_glm.3. +The patch adds validation tests for encoding Goldmont events, including +the new offcore_response features. + +Signed-off-by: Stephane Eranian +--- + README | 1 + + docs/Makefile | 1 + + docs/man3/libpfm_intel_glm.3 | 97 +++++++ + lib/events/intel_glm_events.h | 605 +++++++++++++++--------------------------- + lib/pfmlib_intel_nhm_unc.c | 2 +- + lib/pfmlib_intel_x86.c | 32 ++- + lib/pfmlib_intel_x86_priv.h | 3 +- + tests/validate_x86.c | 82 ++++++ + 8 files changed, 423 insertions(+), 400 deletions(-) + create mode 100644 docs/man3/libpfm_intel_glm.3 + +diff --git a/README b/README +index 6a1bbc1..ce60d3a 100644 +--- a/README ++++ b/README +@@ -52,6 +52,7 @@ The library supports many PMUs.
The current version can handle: + Intel SkyLake + Intel Silvermont + Intel Airmont ++ Intel Goldmont + Intel RAPL (energy consumption) + Intel Knights Corner + Intel architectural perfmon v1, v2, v3 +diff --git a/docs/Makefile b/docs/Makefile +index c7c82ef..873f31f 100644 +--- a/docs/Makefile ++++ b/docs/Makefile +@@ -52,6 +52,7 @@ ARCH_MAN=libpfm_intel_core.3 \ + libpfm_intel_rapl.3 \ + libpfm_intel_slm.3 \ + libpfm_intel_skl.3 \ ++ libpfm_intel_glm.3 \ + libpfm_intel_snbep_unc_cbo.3 \ + libpfm_intel_snbep_unc_ha.3 \ + libpfm_intel_snbep_unc_imc.3 \ +diff --git a/docs/man3/libpfm_intel_glm.3 b/docs/man3/libpfm_intel_glm.3 +new file mode 100644 +index 0000000..1a9338b +--- /dev/null ++++ b/docs/man3/libpfm_intel_glm.3 +@@ -0,0 +1,97 @@ ++.TH LIBPFM 3 "July, 2016" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_glm - support for Intel Goldmont core PMU ++.SH SYNOPSIS ++.nf ++.B #include ++.sp ++.B PMU name: glm ++.B PMU desc: Intel Goldmont ++.sp ++.SH DESCRIPTION ++The library supports the Intel Goldmont core PMU. It should be noted that ++this PMU model only covers each core's PMU and not the socket level ++PMU. ++ ++On Goldmont, the number of generic counters is 4. There is no HyperThreading support. ++The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters ++in \fBnum_cntrs\fR. ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Goldmont processors: ++.TP ++.B u ++Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. ++This is a boolean modifier. ++.TP ++.B k ++Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. ++This is a boolean modifier. ++.TP ++.B i ++Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR ++occurring.
This is a boolean modifier. ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event ++to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. ++This is a boolean modifier. ++.TP ++.B c ++Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles ++in which the number of occurrences of the event is greater or equal to the threshold. This is an integer ++modifier with values in the range [0:255]. ++ ++.SH OFFCORE_RESPONSE events ++Intel Goldmont provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. ++ ++Those events need special treatment in the performance monitoring infrastructure ++because each event uses an extra register to store some settings. Thus, in ++case multiple offcore_response events are monitored simultaneously, the kernel needs ++to manage the sharing of that extra register. ++ ++The offcore_response events are exposed as normal events by the library. The extra ++settings are exposed as regular umasks. The library takes care of encoding the ++events according to the underlying kernel interface. ++ ++On Intel Goldmont, the umasks are divided into 4 categories: request, supplier, ++snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency. ++In the first mode, the two offcore_response events operate independently of each other. The user must provide at ++least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two ++offcore_response events are combined to compute an average latency per request type. ++ ++For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask ++is used, it overrides any supplier and snoop umasks.
In other words, users can ++specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier + snoop umasks. In case no supplier or snoop ++is specified, the library defaults to using \fBANY_RESPONSE\fR. ++ ++For instance, the following are valid event selections: ++.TP ++.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE ++.TP ++.B OFFCORE_RESPONSE_0:ANY_REQUEST ++.TP ++.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY ++ ++.P ++But the following are illegal: ++ ++.TP ++.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE ++.TP ++.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE ++.P ++In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others. The library will enforce that restriction as soon as the \fBOUTSTANDING\fR umask is used. Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. It should be noted that the library encodes events independently of each other and therefore cannot verify that the requests are matching between the two events. ++Examples of average latency settings: ++.TP ++.B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE ++.TP ++.B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE ++.P ++The average latency for the request(s) is obtained by dividing the count of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles.
++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/lib/events/intel_glm_events.h b/lib/events/intel_glm_events.h +index fd0b27c..a7ed811 100644 +--- a/lib/events/intel_glm_events.h ++++ b/lib/events/intel_glm_events.h +@@ -358,7 +358,7 @@ static const intel_x86_umask_t glm_uops_retired[]={ + }, + }; + +-static const intel_x86_umask_t glm_offcore_response_1[]={ ++static const intel_x86_umask_t glm_offcore_response_0[]={ + { .uname = "DMND_DATA_RD", + .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", + .ucode = 1ULL << (0 + 8), +@@ -406,7 +406,6 @@ static const intel_x86_umask_t glm_offcore_response_1[]={ + .ucode = 1ULL << (8 + 8), + .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD", + }, + { .uname = "UC_CODE_READS", + .udesc = "Request: number of uncached code reads", +@@ -420,7 +419,7 @@ static const intel_x86_umask_t glm_offcore_response_1[]={ + .grpid = 0, + .ucntmsk = 0xffull, + }, +- { .uname = "STRM_ST", ++ { .uname = "FULL_STRM_ST", + .udesc = "Request: number of streaming store requests for full cacheline", + .ucode = 1ULL << (11 + 8), + .grpid = 0, +@@ -440,51 +439,37 @@ static const intel_x86_umask_t glm_offcore_response_1[]={ + }, + { .uname = "PARTIAL_STRM_ST", + .udesc = "Request: number of streaming store requests for partial cacheline", +- .ucode = 1ULL << (11 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "OTHER", +- .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", +- .ucode = 1ULL << (15 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "ANY_CODE_RD", +- .udesc = "Request: combination of PF_CODE_RD | DMND_CODE_RD | PF_L3_CODE_RD", +- .ucode = 0x24400, ++ .ucode = 1ULL << (14 + 8), + 
.grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD:DMND_CODE_RD:PF_L3_CODE_RD", + }, +- { .uname = "ANY_IFETCH", +- .udesc = "Request: combination of PF_CODE_RD | PF_L3_CODE_RD", +- .ucode = 0x24000, ++ { .uname = "STRM_ST", ++ .udesc = "Request: number of streaming store requests for partial or full cacheline", ++ .ucode = (1ULL << (14 + 8)) | (1ULL << (11+8)), ++ .uequiv = "FULL_STRM_ST:PARTIAL_STRM_ST", + .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD:PF_L3_CODE_RD", + }, + { .uname = "ANY_REQUEST", + .udesc = "Request: combination of all request umasks", +- .ucode = 0x8fff00, ++ .ucode = 1ULL << (15 + 8), + .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, + .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER", + }, +- { .uname = "ANY_DATA", +- .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_L3_DATA_RD", +- .ucode = 0x9100, ++ { .uname = "ANY_PF_DATA_RD", ++ .udesc = "Request: number of prefetch data reads", ++ .ucode = (1ULL << (4+8)) | (1ULL << (12+8)) | (1ULL << (13+8)), + .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD", ++ .uequiv = "PF_DATA_RD:SW_PF:PF_L1_DATA_RD", + }, + { .uname = "ANY_RFO", +- .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_L3_RFO", +- .ucode = 0x12200, ++ .udesc = "Request: number of RFO", ++ .ucode = (1ULL << (1+8)) | (1ULL << (5+8)), + .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "DMND_RFO:PF_RFO:PF_L3_RFO", ++ .uequiv = "DMND_RFO:PF_RFO", + }, + { .uname = "ANY_RESPONSE", + .udesc = "Response: any response type", +@@ -493,112 +478,210 @@ static const intel_x86_umask_t glm_offcore_response_1[]={ + .grpid = 1, + .ucntmsk = 0xffull, + }, +- { .uname = "NO_SUPP", +- .udesc = "Supplier: counts number of times supplier information is not available", +- .ucode = 1ULL << (17 + 8), ++ { .uname = "L2_HIT", ++ .udesc = "Supplier: 
counts L2 hits", ++ .ucode = 1ULL << (18 + 8), + .grpid = 1, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_HITM", +- .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", +- .ucode = 1ULL << (18 + 8), +- .grpid = 1, ++ { .uname = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED", ++ .udesc = "Snoop: counts number of true misses to this processor module for which a snoop request missed the other processor module or no snoop was needed", ++ .ucode = 1ULL << (33 + 8), ++ .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_HITE", +- .udesc = "Supplier: counts L3 hits in E-state", +- .ucode = 1ULL << (19 + 8), +- .grpid = 1, ++ { .uname = "L2_MISS_HIT_OTHER_CORE_NO_FWD", ++ .udesc = "Snoop: counts number of times a snoop request hits the other processor module but no data forwarding is needed", ++ .ucode = 1ULL << (34 + 8), ++ .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_HITS", +- .udesc = "Supplier: counts L3 hits in S-state", +- .ucode = 1ULL << (20 + 8), +- .grpid = 1, ++ { .uname = "L2_MISS_HITM_OTHER_CORE", ++ .udesc = "Snoop: counts number of times a snoop request hits in the other processor module or other core's L1 where a modified copy (M-state) is found", ++ .ucode = 1ULL << (36 + 8), ++ .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_HIT", +- .udesc = "Supplier: counts L3 hits in any state (M, E, S)", +- .ucode = 7ULL << (18 + 8), +- .grpid = 1, ++ { .uname = "L2_MISS_SNP_NON_DRAM", ++ .udesc = "Snoop: counts number of times target was a non-DRAM system address.
This includes MMIO transactions", ++ .ucode = 1ULL << (37 + 8), ++ .grpid = 2, + .ucntmsk = 0xffull, +- .umodel = PFM_PMU_INTEL_GLM, +- .uequiv = "L3_HITM:L3_HITE:L3_HITS", + }, +- { .uname = "L3_MISS_LOCAL_DRAM", +- .udesc = "Supplier: counts L3 misses to local DRAM", +- .ucode = 1ULL << (22 + 8), +- .grpid = 1, ++ { .uname = "L2_MISS_SNP_ANY", ++ .udesc = "Snoop: any snoop reason", ++ .ucode = 0x1bULL << (33 + 8), ++ .uflags = INTEL_X86_DFL, ++ .uequiv = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:L2_MISS_HIT_OTHER_CORE_NO_FWD:L2_MISS_HITM_OTHER_CORE:L2_MISS_SNP_NON_DRAM", ++ .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_MISS_REMOTE_DRAM", +- .udesc = "Supplier: counts L3 misses to remote DRAM", +- .ucode = 0x7fULL << (23 + 8), +- .grpid = 1, ++ { .uname = "OUTSTANDING", ++ .udesc = "Outstanding request: counts weighted cycles of outstanding offcore requests of the request type specified in the bits 15:0 of offcore_response from the time the XQ receives the request and any response received. Bits 37:16 must be set to 0. This is only available for offcore_response_0", ++ .ucode = 1ULL << (38 + 8), ++ .uflags = INTEL_X86_DFL | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ ++ .grpid = 3, + .ucntmsk = 0xffull, + }, +- { .uname = "L3_MISS", +- .udesc = "Supplier: counts L3 misses to local or remote DRAM", +- .ucode = 0xffULL << (22 + 8), +- .grpid = 1, ++}; ++ ++static const intel_x86_umask_t glm_offcore_response_1[]={ ++ { .uname = "DMND_DATA_RD", ++ .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. 
Does not count L2 data read prefetches or instruction fetches", ++ .ucode = 1ULL << (0 + 8), ++ .grpid = 0, + .ucntmsk = 0xffull, +- .uequiv = "L3_MISS_REMOTE_DRAM:L3_MISS_LOCAL_DRAM", + }, +- { .uname = "SPL_HIT", +- .udesc = "Supplier: counts L3 supplier hit", +- .ucode = 1ULL << (30 + 8), +- .grpid = 1, ++ { .uname = "DMND_RFO", ++ .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", ++ .ucode = 1ULL << (1 + 8), ++ .grpid = 0, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_NONE", +- .udesc = "Snoop: counts number of times no snoop-related information is available", +- .ucode = 1ULL << (31 + 8), +- .grpid = 2, ++ { .uname = "DMND_CODE_RD", ++ .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_NOT_NEEDED", +- .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", +- .ucode = 1ULL << (32 + 8), +- .grpid = 2, ++ { .uname = "WB", ++ .udesc = "Request: number of writebacks (modified to exclusive) transactions", ++ .ucode = 1ULL << (3 + 8), ++ .grpid = 0, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_MISS", +- .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", +- .ucode = 1ULL << (33 + 8), +- .grpid = 2, ++ { .uname = "PF_DATA_RD", ++ .udesc = "Request: number of data cacheline reads generated by L2 prefetcher", ++ .ucode = 1ULL << (4 + 8), ++ .grpid = 0, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_NO_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", +- .ucode = 1ULL << (34 + 8), ++ { .uname = "PF_RFO", ++ .udesc = "Request: number of RFO requests generated by L2 prefetcher", ++ .ucode = 1ULL << (5 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = 
"PARTIAL_READS", ++ .udesc = "Request: number of partial reads", ++ .ucode = 1ULL << (7 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PARTIAL_WRITES", ++ .udesc = "Request: number of partial writes", ++ .ucode = 1ULL << (8 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "UC_CODE_READS", ++ .udesc = "Request: number of uncached code reads", ++ .ucode = 1ULL << (9 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "BUS_LOCKS", ++ .udesc = "Request: number of bus lock and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "FULL_STRM_ST", ++ .udesc = "Request: number of streaming store requests for full cacheline", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "SW_PF", ++ .udesc = "Request: number of cacheline requests due to software prefetch", ++ .ucode = 1ULL << (12 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PF_L1_DATA_RD", ++ .udesc = "Request: number of data cacheline reads generated by L1 data prefetcher", ++ .ucode = 1ULL << (13 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "PARTIAL_STRM_ST", ++ .udesc = "Request: number of streaming store requests for partial cacheline", ++ .ucode = 1ULL << (14 + 8), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "STRM_ST", ++ .udesc = "Request: number of streaming store requests for partial or full cacheline", ++ .ucode = (1ULL << (14 + 8)) | (1ULL << (11+8)), ++ .uequiv = "FULL_STRM_ST:PARTIAL_STRM_ST", ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "ANY_REQUEST", ++ .udesc = "Request: combination of all request umasks", ++ .ucode = 1ULL << (15 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "ANY_PF_DATA_RD", ++ .udesc = "Request: number of prefetch data reads", ++ .ucode = (1ULL << (4+8)) | (1ULL << (12+8)) | (1ULL << (13+8)), ++ .grpid = 0, ++ .ucntmsk =
0xffull, ++ .uequiv = "PF_DATA_RD:SW_PF:PF_L1_DATA_RD", ++ }, ++ { .uname = "ANY_RFO", ++ .udesc = "Request: number of RFO", ++ .ucode = (1ULL << (1+8)) | (1ULL << (5+8)), ++ .grpid = 0, ++ .ucntmsk = 0xffull, ++ .uequiv = "DMND_RFO:PF_RFO", ++ }, ++ { .uname = "ANY_RESPONSE", ++ .udesc = "Response: any response type", ++ .ucode = 1ULL << (16 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L2_HIT", ++ .udesc = "Supplier: counts L2 hits", ++ .ucode = 1ULL << (18 + 8), ++ .grpid = 1, ++ .ucntmsk = 0xffull, ++ }, ++ { .uname = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED", ++ .udesc = "Snoop: counts number of true misses to this processor module for which a snoop request missed the other processor module or no snoop was needed", ++ .ucode = 1ULL << (33 + 8), + .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", +- .ucode = 1ULL << (35 + 8), ++ { .uname = "L2_MISS_HIT_OTHER_CORE_NO_FWD", ++ .udesc = "Snoop: counts number of times a snoop request hits the other processor module but no data forwarding is needed", ++ .ucode = 1ULL << (34 + 8), + .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_HITM", +- .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", ++ { .uname = "L2_MISS_HITM_OTHER_CORE", ++ .udesc = "Snoop: counts number of times a snoop request hits in the other processor module or other core's L1 where a modified copy (M-state) is found", + .ucode = 1ULL << (36 + 8), + .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_NON_DRAM", ++ { .uname = "L2_MISS_SNP_NON_DRAM", + .udesc = "Snoop: counts number of times target was a non-DRAM system address.
This includes MMIO transactions", + .ucode = 1ULL << (37 + 8), + .grpid = 2, + .ucntmsk = 0xffull, + }, +- { .uname = "SNP_ANY", ++ { .uname = "L2_MISS_SNP_ANY", + .udesc = "Snoop: any snoop reason", +- .ucode = 0x7ULL << (31 + 8), ++ .ucode = 0xfULL << (33 + 8), + .uflags = INTEL_X86_DFL, + .grpid = 2, + .ucntmsk = 0xffull, +- .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM", ++ .uequiv = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:L2_MISS_HIT_OTHER_CORE_NO_FWD:L2_MISS_HITM_OTHER_CORE:L2_MISS_SNP_NON_DRAM", + }, + }; + +@@ -809,272 +892,6 @@ static const intel_x86_umask_t glm_uops_issued[]={ + }, + }; + +-static const intel_x86_umask_t glm_offcore_response_0[]={ +- { .uname = "DMND_DATA_RD", +- .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", +- .ucode = 1ULL << (0 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "DMND_RFO", +- .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", +- .ucode = 1ULL << (1 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "DMND_CODE_RD", +- .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", +- .ucode = 1ULL << (2 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "DMND_IFETCH", +- .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", +- .ucode = 1ULL << (2 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "DMND_CODE_RD", +- }, +- { .uname = "WB", +- .udesc = "Request: number of writebacks (modified to exclusive) transactions", +- .ucode = 1ULL << (3 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_DATA_RD", +- .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", +- .ucode = 1ULL << (4 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_RFO", +- .udesc = "Request: number of RFO requests generated by L2 prefetchers", +- .ucode = 1ULL << (5 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_CODE_RD", +- .udesc = "Request: number of code reads generated by L2 prefetchers", +- .ucode = 1ULL << (6 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_IFETCH", +- .udesc = "Request: number of code reads generated by L2 prefetchers", +- .ucode = 1ULL << (6 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD", +- }, +- { .uname = "PF_L3_DATA_RD", +- .udesc = "Request: number of L2 prefetcher requests to L3 for loads", +- .ucode = 1ULL << (7 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_L3_RFO", +- .udesc = "Request: number of RFO requests generated by L2 prefetcher", +- .ucode = 1ULL << (8 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_L3_CODE_RD", +- .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", +- .ucode = 1ULL << (9 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "PF_L3_IFETCH", +- .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", +- .ucode = 1ULL << (9 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "PF_L3_CODE_RD", +- }, +- { .uname = "SPLIT_LOCK_UC_LOCK", +- .udesc = "Request: number of bus lock and split lock requests", +- .ucode = 1ULL << (10 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname 
= "BUS_LOCKS", +- .udesc = "Request: number of bus lock and split lock requests", +- .ucode = 1ULL << (10 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "SPLIT_LOCK_UC_LOCK", +- }, +- { .uname = "BUS_LOCK", +- .udesc = "Request: number of bus lock and split lock requests", +- .ucode = 1ULL << (10 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "SPLIT_LOCK_UC_LOCK", +- }, +- { .uname = "STRM_ST", +- .udesc = "Request: number of streaming store requests", +- .ucode = 1ULL << (11 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "OTHER", +- .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", +- .ucode = 1ULL << (15 + 8), +- .grpid = 0, +- .ucntmsk = 0xffull, +- }, +- { .uname = "ANY_CODE_RD", +- .udesc = "Request: combination of PF_CODE_RD | DMND_CODE_RD | PF_L3_CODE_RD", +- .ucode = 0x24400, +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD:DMND_CODE_RD:PF_L3_CODE_RD", +- }, +- { .uname = "ANY_IFETCH", +- .udesc = "Request: combination of PF_CODE_RD | PF_L3_CODE_RD", +- .ucode = 0x24000, +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "PF_CODE_RD:PF_L3_CODE_RD", +- }, +- { .uname = "ANY_REQUEST", +- .udesc = "Request: combination of all request umasks", +- .ucode = 0x8fff00, +- .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER", +- }, +- { .uname = "ANY_DATA", +- .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_L3_DATA_RD", +- .ucode = 0x9100, +- .grpid = 0, +- .ucntmsk = 0xffull, +- .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD", +- }, +- { .uname = "ANY_RFO", +- .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_L3_RFO", +- .ucode = 0x12200, +- .grpid = 0, +- .ucntmsk = 0xffull, 
+- .uequiv = "DMND_RFO:PF_RFO:PF_L3_RFO", +- }, +- { .uname = "ANY_RESPONSE", +- .udesc = "Response: any response type", +- .ucode = 1ULL << (16 + 8), +- .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "NO_SUPP", +- .udesc = "Supplier: counts number of times supplier information is not available", +- .ucode = 1ULL << (17 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_HITM", +- .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", +- .ucode = 1ULL << (18 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_HITE", +- .udesc = "Supplier: counts L3 hits in E-state", +- .ucode = 1ULL << (19 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_HITS", +- .udesc = "Supplier: counts L3 hits in S-state", +- .ucode = 1ULL << (20 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_HIT", +- .udesc = "Supplier: counts L3 hits in any state (M, E, S)", +- .ucode = 7ULL << (18 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- .umodel = PFM_PMU_INTEL_GLM, +- .uequiv = "L3_HITM:L3_HITE:L3_HITS", +- }, +- { .uname = "L3_MISS_LOCAL_DRAM", +- .udesc = "Supplier: counts L3 misses to local DRAM", +- .ucode = 1ULL << (22 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_MISS_REMOTE_DRAM", +- .udesc = "Supplier: counts L3 misses to remote DRAM", +- .ucode = 0x7fULL << (23 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "L3_MISS", +- .udesc = "Supplier: counts L3 misses to local or remote DRAM", +- .ucode = 0xffULL << (22 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- .uequiv = "L3_MISS_REMOTE_DRAM:L3_MISS_LOCAL_DRAM", +- }, +- { .uname = "SPL_HIT", +- .udesc = "Supplier: counts L3 supplier hit", +- .ucode = 1ULL << (30 + 8), +- .grpid = 1, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_NONE", +- .udesc = "Snoop: counts number of times no snoop-related information is available", +- .ucode = 1ULL << (31 + 
8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_NOT_NEEDED", +- .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", +- .ucode = 1ULL << (32 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_MISS", +- .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", +- .ucode = 1ULL << (33 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_NO_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", +- .ucode = 1ULL << (34 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_FWD", +- .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", +- .ucode = 1ULL << (35 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_HITM", +- .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", +- .ucode = 1ULL << (36 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_NON_DRAM", +- .udesc = "Snoop: counts number of times target was a non-DRAM system address. This includes MMIO transactions", +- .ucode = 1ULL << (37 + 8), +- .grpid = 2, +- .ucntmsk = 0xffull, +- }, +- { .uname = "SNP_ANY", +- .udesc = "Snoop: any snoop reason", +- .ucode = 0x7ULL << (31 + 8), +- .uflags = INTEL_X86_DFL, +- .grpid = 2, +- .ucntmsk = 0xffull, +- .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM", +- }, +-}; +- + static const intel_x86_umask_t glm_core_reject_l2q[]={ + { .uname = "ALL", + .udesc = "Requests rejected by the L2Q ", +@@ -1168,7 +985,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "ICACHE", + .desc = "References per ICache line that are available in the ICache (hit). 
This event counts differently than Intel processors based on Silvermont microarchitecture", + .code = 0x80, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_icache), +@@ -1177,7 +994,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "L2_REJECT_XQ", + .desc = "Requests rejected by the XQ", + .code = 0x30, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_l2_reject_xq), +@@ -1186,7 +1003,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "HW_INTERRUPTS", + .desc = "Hardware interrupts received", + .code = 0xcb, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_hw_interrupts), +@@ -1195,7 +1012,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "BR_MISP_RETIRED", + .desc = "Retired mispredicted branch instructions (Precise Event)", + .code = 0xc5, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1205,7 +1022,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "DECODE_RESTRICTION", + .desc = "Decode restrictions due to predicting wrong instruction length", + .code = 0xe9, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_decode_restriction), +@@ -1214,7 +1031,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MISALIGN_MEM_REF", + .desc = "Load uops that split a page (Precise Event)", + .code = 0x13, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1224,7 +1041,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "INST_RETIRED", + .desc = "Instructions retired (Precise Event)", + .code = 0xc0, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = 
INTEL_V2_ATTRS, + .cntmsk = 0x10000000full, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1234,14 +1051,14 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "INSTRUCTION_RETIRED", + .desc = "Number of instructions retired", + .code = 0xc0, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0x100000ffull, + .ngrp = 0, + }, + { .name = "ISSUE_SLOTS_NOT_CONSUMED", + .desc = "Unfilled issue slots per cycle because of a full resource in the backend", + .code = 0xca, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_issue_slots_not_consumed), +@@ -1250,7 +1067,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "ITLB", + .desc = "ITLB misses", + .code = 0x81, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_itlb), +@@ -1259,7 +1076,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "LONGEST_LAT_CACHE", + .desc = "L2 cache requests", + .code = 0x2e, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_longest_lat_cache), +@@ -1268,7 +1085,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MEM_LOAD_UOPS_RETIRED", + .desc = "Load uops retired that hit L1 data cache (Precise Event)", + .code = 0xd1, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1278,7 +1095,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "LD_BLOCKS", + .desc = "Loads blocked (Precise Event)", + .code = 0x03, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1288,7 +1105,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "DL1", + .desc = "L1 Cache evictions for dirty data", + .code = 0x51, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = 
INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_dl1), +@@ -1297,7 +1114,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "CYCLES_DIV_BUSY", + .desc = "Cycles a divider is busy", + .code = 0xcd, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_cycles_div_busy), +@@ -1306,7 +1123,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MS_DECODED", + .desc = "MS decode starts", + .code = 0xe7, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_ms_decoded), +@@ -1315,7 +1132,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "UOPS_RETIRED", + .desc = "Uops retired (Precise Event)", + .code = 0xc2, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1324,8 +1141,8 @@ static const intel_x86_entry_t intel_glm_pe[]={ + }, + { .name = "OFFCORE_RESPONSE_1", + .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", +- .code = 0x1bb, +- .modmsk = INTEL_V4_ATTRS, ++ .code = 0x2b7, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xffull, + .flags = INTEL_X86_NHM_OFFCORE, + .ngrp = 3, +@@ -1335,7 +1152,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MACHINE_CLEARS", + .desc = "Self-Modifying Code detected", + .code = 0xc3, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_machine_clears), +@@ -1344,7 +1161,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "BR_INST_RETIRED", + .desc = "Retired branch instructions (Precise Event)", + .code = 0xc4, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1354,7 +1171,7 @@ static const 
intel_x86_entry_t intel_glm_pe[]={ + { .name = "FETCH_STALL", + .desc = "Cycles where code-fetch is stalled and an ICache miss is outstanding. This is not the same as an ICache Miss", + .code = 0x86, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_fetch_stall), +@@ -1363,7 +1180,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "UOPS_NOT_DELIVERED", + .desc = "Uops requested but not-delivered to the back-end per cycle", + .code = 0x9c, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_uops_not_delivered), +@@ -1372,7 +1189,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MISPREDICTED_BRANCH_RETIRED", + .desc = "Number of mispredicted branch instructions retired", + .code = 0xc5, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xffull, + .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", + .ngrp = 0, +@@ -1380,7 +1197,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "INSTRUCTIONS_RETIRED", + .desc = "Number of instructions retired", + .code = 0xc0, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0x100000ffull, + .equiv = "INSTRUCTION_RETIRED", + .ngrp = 0, +@@ -1388,7 +1205,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "MEM_UOPS_RETIRED", + .desc = "Load uops retired (Precise Event)", + .code = 0xd0, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .flags = INTEL_X86_PEBS, + .ngrp = 1, +@@ -1398,7 +1215,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "UOPS_ISSUED", + .desc = "Uops issued to the back end per cycle", + .code = 0x0e, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_uops_issued), +@@ -1407,17 +1224,17 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = 
"OFFCORE_RESPONSE_0", + .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", + .code = 0x1b7, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xffull, + .flags = INTEL_X86_NHM_OFFCORE, +- .ngrp = 3, ++ .ngrp = 4, + .numasks = LIBPFM_ARRAY_SIZE(glm_offcore_response_0), + .umasks = glm_offcore_response_0, + }, + { .name = "UNHALTED_REFERENCE_CYCLES", + .desc = "Unhalted reference cycles. Ticks at constant reference frequency", + .code = 0x0300, +- .modmsk = INTEL_FIXED3_ATTRS, ++ .modmsk = INTEL_FIXED2_ATTRS, + .cntmsk = 0x40000000ull, + .flags = INTEL_X86_FIXED, + .ngrp = 0, +@@ -1425,7 +1242,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "BRANCH_INSTRUCTIONS_RETIRED", + .desc = "Number of branch instructions retired", + .code = 0xc4, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xffull, + .equiv = "BR_INST_RETIRED:ALL_BRANCHES", + .ngrp = 0, +@@ -1433,7 +1250,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "CORE_REJECT_L2Q", + .desc = "Requests rejected by the L2Q ", + .code = 0x31, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_core_reject_l2q), +@@ -1442,7 +1259,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "PAGE_WALKS", + .desc = "Duration of D-side page-walks in cycles", + .code = 0x05, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_page_walks), +@@ -1451,7 +1268,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "BACLEARS", + .desc = "BACLEARs asserted for any branch type", + .code = 0xe6, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0xfull, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_baclears), +@@ -1460,7 +1277,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = 
"CPU_CLK_UNHALTED", + .desc = "Core cycles when core is not halted (Fixed event)", + .code = 0x00, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0x60000000full, + .ngrp = 1, + .numasks = LIBPFM_ARRAY_SIZE(glm_cpu_clk_unhalted), +@@ -1469,7 +1286,7 @@ static const intel_x86_entry_t intel_glm_pe[]={ + { .name = "UNHALTED_CORE_CYCLES", + .desc = "Core clock cycles whenever the clock signal on the specific core is running (not halted)", + .code = 0x3c, +- .modmsk = INTEL_V4_ATTRS, ++ .modmsk = INTEL_V2_ATTRS, + .cntmsk = 0x20000000ull, + .ngrp = 0, + }, +diff --git a/lib/pfmlib_intel_nhm_unc.c b/lib/pfmlib_intel_nhm_unc.c +index fbf1b19..4c27b07 100644 +--- a/lib/pfmlib_intel_nhm_unc.c ++++ b/lib/pfmlib_intel_nhm_unc.c +@@ -213,7 +213,7 @@ pfm_nhm_unc_get_encoding(void *this, pfmlib_event_desc_t *e) + */ + if ((ugrpmsk != grpmsk && !intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) || ugrpmsk == 0) { + ugrpmsk ^= grpmsk; +- ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask, -1); ++ ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask, -1, -1); + if (ret != PFM_SUCCESS) + return ret; + } +diff --git a/lib/pfmlib_intel_x86.c b/lib/pfmlib_intel_x86.c +index 031de0d..b698144 100644 +--- a/lib/pfmlib_intel_x86.c ++++ b/lib/pfmlib_intel_x86.c +@@ -200,13 +200,14 @@ int + pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, + unsigned int msk, + uint64_t *umask, +- unsigned int max_grpid) ++ unsigned int max_grpid, ++ int excl_grp_but_0) + { + const intel_x86_entry_t *pe = this_pe(this); + const intel_x86_entry_t *ent; + unsigned int i; + int j, k, added, skip; +- int idx; ++ int idx, grpid; + + k = e->nattrs; + ent = pe+e->event; +@@ -242,6 +243,12 @@ pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, + skip = 1; + continue; + } ++ grpid = ent->umasks[idx].grpid; ++ ++ if (excl_grp_but_0 != -1 && grpid != 0 && excl_grp_but_0 != grpid) { ++ skip = 1; ++ continue; ++ } + + /* umask is default for group */ + if 
(intel_x86_uflag(this, e->event, idx, INTEL_X86_DFL)) { +@@ -373,6 +380,7 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + unsigned int grpid; + int ldlat = 0, ldlat_um = 0; + int fe_thr= 0, fe_thr_um = 0; ++ int excl_grp_but_0 = -1; + int grpcounts[INTEL_X86_NUM_GRP]; + int ncombo[INTEL_X86_NUM_GRP]; + +@@ -425,6 +433,8 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_EXCL_GRP_GT)) + max_grpid = grpid; + ++ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_EXCL_GRP_BUT_0)) ++ excl_grp_but_0 = grpid; + /* + * upper layer has removed duplicates + * so if we come here more than once, it is for two +@@ -580,11 +590,25 @@ pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) + */ + if ((ugrpmsk != grpmsk && !intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) || ugrpmsk == 0) { + ugrpmsk ^= grpmsk; +- ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask2, max_grpid); ++ ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask2, max_grpid, excl_grp_but_0); + if (ret != PFM_SUCCESS) + return ret; + } +- ++ /* ++ * GRP_EXCL_BUT_0 groups require at least one bit set in grpid = 0 and one in theirs ++ * applies to OFFCORE_RESPONSE umasks on some processors (e.g., Goldmont) ++ */ ++ DPRINT("excl_grp_but_0=%d\n", excl_grp_but_0); ++ if (excl_grp_but_0 != -1) { ++ /* skip group 0, because it is authorized */ ++ for (k = 1; k < INTEL_X86_NUM_GRP; k++) { ++ DPRINT("grpcounts[%d]=%d\n", k, grpcounts[k]); ++ if (grpcounts[k] && k != excl_grp_but_0) { ++ DPRINT("GRP_EXCL_BUT_0 but grpcounts[%d]=%d\n", k, grpcounts[k]); ++ return PFM_ERR_FEATCOMB; ++ } ++ } ++ } + ret = intel_x86_check_pebs(this, e); + if (ret != PFM_SUCCESS) + return ret; +diff --git a/lib/pfmlib_intel_x86_priv.h b/lib/pfmlib_intel_x86_priv.h +index 74aab3e..963b41a 100644 +--- a/lib/pfmlib_intel_x86_priv.h ++++ b/lib/pfmlib_intel_x86_priv.h +@@ -89,6 +89,7 @@ typedef struct { + #define INTEL_X86_GRP_DFL_NONE 
0x0800 /* ok if umask group defaults to no umask */ + #define INTEL_X86_FRONTEND 0x1000 /* Skylake Precise frontend */ + #define INTEL_X86_FETHR 0x2000 /* precise frontend umask requires threshold modifier (fe_thres) */ ++#define INTEL_X86_EXCL_GRP_BUT_0 0x4000 /* exclude all groups except self and grpid = 0 */ + + typedef union pfm_intel_x86_reg { + unsigned long long val; /* complete register value */ +@@ -325,7 +326,7 @@ intel_x86_attr2umask(void *this, int pidx, int attr_idx) + } + + extern int pfm_intel_x86_detect(void); +-extern int pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask, unsigned int max_grpid); ++extern int pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask, unsigned int max_grpid, int excl_grp_but_0); + + extern int pfm_intel_x86_event_is_valid(void *this, int pidx); + extern int pfm_intel_x86_get_encoding(void *this, pfmlib_event_desc_t *e); +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 876453f..3e6f408 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4260,6 +4260,88 @@ static const test_event_t x86_test_events[]={ + .name = "skl::offcore_response_1:0x7fffffffff", + .ret = PFM_ERR_ATTR, + }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_1:any_request", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5302b7, ++ .codes[1] = 0x18000, ++ .fstr = "glm::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_1:any_rfo", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5302b7, ++ .codes[1] = 0x10022, ++ .fstr = "glm::OFFCORE_RESPONSE_1:DMND_RFO:PF_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_1:any_rfo:l2_miss_snp_miss_or_no_snoop_needed", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5302b7, ++ .codes[1] = 0x200010022ull, ++ .fstr = 
"glm::OFFCORE_RESPONSE_1:DMND_RFO:PF_RFO:ANY_RESPONSE:L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_0:strm_st", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x14800, ++ .fstr = "glm::OFFCORE_RESPONSE_0:FULL_STRM_ST:PARTIAL_STRM_ST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_1:dmnd_data_rd:outstanding", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_1:dmnd_data_rd:l2_hit:outstanding", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_0:strm_st:outstanding", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x4000004800ull, ++ .fstr = "glm::OFFCORE_RESPONSE_0:FULL_STRM_ST:PARTIAL_STRM_ST:OUTSTANDING:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_0:outstanding:dmnd_data_rd:u", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5101b7, ++ .codes[1] = 0x4000000001ull, ++ .fstr = "glm::OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING:k=0:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::offcore_response_0:strm_st:l2_hit:outstanding", ++ .ret = PFM_ERR_FEATCOMB, ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x5301ca, ++ .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k:c=1:i", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1d201ca, ++ .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=0:e=0:i=1:c=1", ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:u:t", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:u:intxcp", ++ .ret = PFM_ERR_ATTR, ++ }, + }; + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) + 
+-- +2.9.3 + + +From f7d50753d0e0148d00060e191c29afdd9d39d146 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Wed, 20 Jul 2016 22:23:36 -0700 +Subject: [PATCH] fix Intel Broadwell-EP OFFCORE_RESPONSE:L3_MISS_REMOTE + +This encoding of the umask was off by one bit for +L3_MISS_REMOTE and L3_MISS_REMOTE_DRAM (alias). + +Also adds the uequiv alias for the umask. + +Also adds a validation test for the umask. + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 5 +++-- + tests/validate_x86.c | 10 +++++++++- + 2 files changed, 12 insertions(+), 3 deletions(-) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index c22755e..6be3ac9 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -1875,13 +1875,14 @@ static const intel_x86_umask_t bdw_offcore_response[]={ + }, + { .uname = "L3_MISS_REMOTE", + .udesc = "Supplier: counts L3 misses to remote node", +- .ucode = 0x7ULL << (26+8), ++ .uequiv = "L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", ++ .ucode = 0x7ULL << (27+8), + .umodel = PFM_PMU_INTEL_BDW_EP, + .grpid = 1, + }, + { .uname = "L3_MISS_REMOTE_DRAM", + .udesc = "Supplier: counts L3 misses to remote node", +- .ucode = 0x7ULL << (26+8), ++ .ucode = 0x7ULL << (27+8), + .uequiv = "L3_MISS_REMOTE", + .umodel = PFM_PMU_INTEL_BDW_EP, + .grpid = 1, +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 3e6f408..4096372 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -3110,11 +3110,19 @@ static const test_event_t x86_test_events[]={ + .name = "bdw_ep::offcore_response_0:l3_miss", + .ret = PFM_SUCCESS, + .count = 2, +- .codes[0] =0x5301b7, ++ .codes[0] = 0x5301b7, + .codes[1] = 0x3fbc008fffull, + .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, + { SRC_LINE, ++ .name = "bdw_ep::offcore_response_1:l3_miss_remote", ++ 
.ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301bb, ++ .codes[1] = 0x3fb8008fffull, ++ .fstr = "bdw_ep::OFFCORE_RESPONSE_1:ANY_REQUEST:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, + .name = "bdw_ep::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", + .ret = PFM_SUCCESS, + .count = 2, +-- +2.9.3 + + +From a3012f86d5f96ca814585b181f830861774f29da Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Wed, 20 Jul 2016 22:28:01 -0700 +Subject: [PATCH] add Intel Haswell-EP alias for offcore_response remote L3 + miss + +This patch adds offcore_response_*:L3_MISS_REMOTE and L3_MISS_REMOTE_DRAM +umasks to be consistent with Intel Broadwell. + +Also adds a validation test for it. + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_hsw_events.h | 14 ++++++++++++++ + tests/validate_x86.c | 8 ++++++++ + 2 files changed, 22 insertions(+) + +diff --git a/lib/events/intel_hsw_events.h b/lib/events/intel_hsw_events.h +index 426119b..2a17e47 100644 +--- a/lib/events/intel_hsw_events.h ++++ b/lib/events/intel_hsw_events.h +@@ -1797,6 +1797,20 @@ static const intel_x86_umask_t hsw_offcore_response[]={ + .umodel = PFM_PMU_INTEL_HSW_EP, + .grpid = 1, + }, ++ { .uname = "L3_MISS_REMOTE", ++ .udesc = "Supplier: counts L3 misses to remote node", ++ .ucode = 0x7ULL << (27+8), ++ .uequiv = "L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", ++ .umodel = PFM_PMU_INTEL_HSW_EP, ++ .grpid = 1, ++ }, ++ { .uname = "L3_MISS_REMOTE_DRAM", ++ .udesc = "Supplier: counts L3 misses to remote node", ++ .ucode = 0x7ULL << (27+8), ++ .uequiv = "L3_MISS_REMOTE", ++ .umodel = PFM_PMU_INTEL_HSW_EP, ++ .grpid = 1, ++ }, + { .uname = "SPL_HIT", + .udesc = "Supplier: counts L3 supplier hit", + .ucode = 0x1ULL << (30+8), +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 4096372..0247c3e 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -2922,6 +2922,14 @@ static const 
test_event_t x86_test_events[]={ + .codes[1] = 0x3f80400091ull, + .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + }, ++ { SRC_LINE, ++ .name = "hsw_ep::offcore_response_0:any_data:L3_miss_remote", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x3fb8000091ull, ++ .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, + { SRC_LINE, /* here SNP_ANY gets expanded when passed on the cmdline, but not when added automatically by library */ + .name = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", + .ret = PFM_SUCCESS, +-- +2.9.3 + + +From 1d57dbe8dbc4864ca501b6f3666c228adbee8910 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 22 Jul 2016 12:53:32 -0700 +Subject: [PATCH] fix error in pfmlib_is_blacklisted_pmu() with some compilers + +Some compilers or compiler options do not like: + + char buffer[strlen(pfm_cfg.blacklist_pmus) + 1]; + +So revert to a more classic style declaration with heap +allocation via strdup(); + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_common.c | 14 +++++++++++--- + 1 file changed, 11 insertions(+), 3 deletions(-) + +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index 4c4c376..6297fdd 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -676,6 +676,9 @@ pfmlib_match_forced_pmu(const char *name) + static int + pfmlib_is_blacklisted_pmu(pfmlib_pmu_t *p) + { ++ char *q, *buffer; ++ int ret = 1; ++ + if (!pfm_cfg.blacklist_pmus) + return 0; + +@@ -683,15 +686,20 @@ pfmlib_is_blacklisted_pmu(pfmlib_pmu_t *p) + * scan list for matching PMU names, we accept substrings. 
+ * for instance: snbep does match snbep* + */ +- char *q, buffer[strlen(pfm_cfg.blacklist_pmus) + 1]; ++ buffer = strdup(pfm_cfg.blacklist_pmus); ++ if (!buffer) ++ return 0; + + strcpy (buffer, pfm_cfg.blacklist_pmus); + for (q = strtok (buffer, ","); q != NULL; q = strtok (NULL, ",")) { + if (strstr (p->name, q) != NULL) { +- return 1; ++ goto done; + } + } +- return 0; ++ ret = 0; ++done: ++ free(buffer); ++ return ret; + } + + static int +-- +2.9.3 + + +From a347a0a29389093e44c1049e351fb20e8702d040 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 22 Jul 2016 13:02:55 -0700 +Subject: [PATCH] remove duplicate offcore_response_*:l3_miss umask for SNB_EP + +The L3_MISS was duplicated. + +Bug introduced by: +a31c90ed0aec fix/add offcore_response:l3_miss alias for Intel SNB/IVB/HSW/BDW/SKL + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_snb_events.h | 7 ------- + 1 file changed, 7 deletions(-) + +diff --git a/lib/events/intel_snb_events.h b/lib/events/intel_snb_events.h +index 0d448b7..475dd09 100644 +--- a/lib/events/intel_snb_events.h ++++ b/lib/events/intel_snb_events.h +@@ -1792,13 +1792,6 @@ static const intel_x86_umask_t snb_offcore_response[]={ + .umodel = PFM_PMU_INTEL_SNB_EP, + .grpid = 1, + }, +- { .uname = "L3_MISS", +- .udesc = "Supplier: counts L3 misses to local or remote DRAM", +- .ucode = 0x3ULL << (22+8), +- .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", +- .umodel = PFM_PMU_INTEL_SNB_EP, +- .grpid = 1, +- }, + { .uname = "LLC_HITMESF", + .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", + .ucode = 0xfULL << (18+8), +-- +2.9.3 + + +From 06b296c72838be44d8950dc03227fe0dc8ca1fb1 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Fri, 22 Jul 2016 14:35:21 -0700 +Subject: [PATCH] fix struct validation for pfm_event_attr_info_t + +There was a mismatch between the test and the actual struct. +The compiler adds a padding field of 4 bytes before idx for +both 64 and 32-bit modes. 
So take it into account explicitly +to avoid the test failure. + +Signed-off-by: Stephane Eranian +--- + include/perfmon/pfmlib.h | 5 +++-- + tests/validate.c | 3 ++- + 2 files changed, 5 insertions(+), 3 deletions(-) + +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index ba3a54f..d3a3c41 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -421,6 +421,7 @@ typedef struct { + size_t size; /* struct sizeof */ + uint64_t code; /* attribute code */ + pfm_attr_t type; /* attribute type */ ++ int pad; /* padding */ + uint64_t idx; /* attribute opaque index */ + pfm_attr_ctrl_t ctrl; /* what is providing attr */ + struct { +@@ -450,13 +451,13 @@ typedef struct { + #if __WORDSIZE == 64 + #define PFM_PMU_INFO_ABI0 56 + #define PFM_EVENT_INFO_ABI0 64 +-#define PFM_ATTR_INFO_ABI0 64 ++#define PFM_ATTR_INFO_ABI0 72 + + #define PFM_RAW_ENCODE_ABI0 32 + #else + #define PFM_PMU_INFO_ABI0 44 + #define PFM_EVENT_INFO_ABI0 48 +-#define PFM_ATTR_INFO_ABI0 48 ++#define PFM_ATTR_INFO_ABI0 56 + + #define PFM_RAW_ENCODE_ABI0 20 + #endif +diff --git a/tests/validate.c b/tests/validate.c +index 522a6ab..0da0adc 100644 +--- a/tests/validate.c ++++ b/tests/validate.c +@@ -201,6 +201,7 @@ static const struct_desc_t pfmlib_structs[]={ + FIELD(code, pfm_event_attr_info_t), + FIELD(type, pfm_event_attr_info_t), + FIELD(idx, pfm_event_attr_info_t), ++ FIELD(pad, pfm_event_attr_info_t), /* padding */ + FIELD(ctrl, pfm_event_attr_info_t), + LAST_FIELD + }, +@@ -270,7 +271,7 @@ validate_structs(void) + } + + if (sz != d->sz) { +- printf("Failed (invisible padding of %zu bytes)\n", d->sz - sz); ++ printf("Failed (invisible padding of %zu bytes, total struct size %zu bytes)\n", d->sz - sz, d->sz); + errors++; + continue; + } +-- +2.9.3 + + +From bdf03951b7f493306c2c1adf434edbdb62c0f805 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 00:47:21 -0700 +Subject: [PATCH] Add SQ_MISC:SPLIT_LOCK to Intel Broadwell event table + +As 
SQ_MISC:SPLIT_LOCK to Intel Broadwell event table. +Based on V9 from download.01.org. + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 17 +++++++++++++++++ + 1 file changed, 17 insertions(+) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index 6be3ac9..f6ab78a 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -749,6 +749,14 @@ static const intel_x86_umask_t bdw_l1d[]={ + }, + }; + ++static const intel_x86_umask_t bdw_sq_misc[]={ ++ { .uname = "SPLIT_LOCK", ++ .udesc = "Number of split locks in the super queue (SQ)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ + static const intel_x86_umask_t bdw_l1d_pend_miss[]={ + { .uname = "PENDING", + .udesc = "Cycles with L1D load misses outstanding", +@@ -2943,6 +2951,15 @@ static const intel_x86_entry_t intel_bdw_pe[]={ + .ngrp = 1, + .umasks = bdw_uops_dispatches_cancelled, + }, ++ { .name = "SQ_MISC", ++ .desc = "SuperQueue miscellaneous", ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0xf4, ++ .numasks = LIBPFM_ARRAY_SIZE(bdw_sq_misc), ++ .ngrp = 1, ++ .umasks = bdw_sq_misc, ++ }, + { .name = "OFFCORE_RESPONSE_0", + .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", + .modmsk = INTEL_V4_ATTRS, +-- +2.9.3 + + +From 98a2c6461dd01512f06c10966429f7d932642c19 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 01:01:56 -0700 +Subject: [PATCH] update Intel Skylake event table + +Based on V22 from download.01.org. 
+ +Added: BR_MISP_RETIRED.NEAR_CALL +Added: SQ_MISC.SPLIT_LOCK +Added: ITLB_MISSES.WALK_COMPLETED_1G +Added: DTLB_LOAD_MISSES.WALK_COMPLETED_1G +Added: DTLB_STORE_MISSES.WALK_COMPLETED_1G +Added: MEM_LOAD_MISC_RETIRED:UC +Added: CPU_CLK_UNHALTED.RING0_TRANS + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_skl_events.h | 74 +++++++++++++++++++++++++++++++++++++------ + 1 file changed, 65 insertions(+), 9 deletions(-) + +diff --git a/lib/events/intel_skl_events.h b/lib/events/intel_skl_events.h +index 3a107f3..e7b522d 100644 +--- a/lib/events/intel_skl_events.h ++++ b/lib/events/intel_skl_events.h +@@ -94,10 +94,15 @@ static const intel_x86_umask_t skl_br_misp_retired[]={ + .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, + }, + { .uname = "NEAR_TAKEN", +- .udesc = "number of near branch instructions retired that were mispredicted and taken", ++ .udesc = "Number of near branch instructions retired that were mispredicted and taken", + .ucode = 0x2000, + .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, + }, ++ { .uname = "NEAR_CALL", ++ .udesc = "Counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, + }; + + static const intel_x86_umask_t skl_cpu_clk_thread_unhalted[]={ +@@ -129,6 +134,12 @@ static const intel_x86_umask_t skl_cpu_clk_thread_unhalted[]={ + .ucode = 0x200, + .uflags= INTEL_X86_NCOMBO, + }, ++ { .uname = "RING0_TRANS", ++ .udesc = "Counts when the current privilege level transitions from ring 1, 2 or 3 to ring 0 (kernel)", ++ .ucode = 0x000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ ++ .uequiv = "THREAD_P:e:c=1", ++ .uflags= INTEL_X86_NCOMBO, ++ }, + }; + + static const intel_x86_umask_t skl_cycle_activity[]={ +@@ -219,20 +230,25 @@ static const intel_x86_umask_t skl_dtlb_load_misses[]={ + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED", +- .udesc 
= "Misses in all TLB levels causes a page walk of any page size that completes", ++ .udesc = "Number of misses in all TLB levels causing a page walk of any page size that completes", + .ucode = 0xe00, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED_4K", +- .udesc = "Misses in all TLB levels causes a page walk of 4KB page size that completes", ++ .udesc = "Number of misses in all TLB levels causing a page walk of 4KB page size that completes", + .ucode = 0x200, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED_2M_4M", +- .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page size that completes", ++ .udesc = "Number of misses in all TLB levels causing a page walk of 2MB/4MB page size that completes", + .ucode = 0x400, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_1G", ++ .udesc = "Number of misses in all TLB levels causing a page walk of 1GB page size that completes", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_ACTIVE", + .udesc = "Cycles with at least one hardware walker active for a load", + .ucode = 0x1000 | (0x1 << INTEL_X86_CMASK_BIT), +@@ -263,20 +279,25 @@ static const intel_x86_umask_t skl_itlb_misses[]={ + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED", +- .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", ++ .udesc = "Number of misses in all TLB levels causing a page walk of any page size that completes", + .ucode = 0xe00, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED_4K", +- .udesc = "Misses in all TLB levels causes a page walk of 4KB page size that completes", ++ .udesc = "Number of misses in all TLB levels causing a page walk of 4KB page size that completes", + .ucode = 0x200, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "WALK_COMPLETED_2M_4M", +- .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page size that completes", ++ .udesc = "Number of misses in all TLB levels 
causing a page walk of 2MB/4MB page size that completes", + .ucode = 0x400, + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "WALK_COMPLETED_1G", ++ .udesc = "Number of misses in all TLB levels causing a page walk of 1GB page size that completes", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, + { .uname = "WALK_DURATION", + .udesc = "Cycles when PMH is busy with page walks", + .ucode = 0x1000, +@@ -539,6 +560,14 @@ static const intel_x86_umask_t skl_l1d[]={ + }, + }; + ++static const intel_x86_umask_t skl_sq_misc[]={ ++ { .uname = "SPLIT_LOCK", ++ .udesc = "Number of split locks in the super queue (SQ)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ + static const intel_x86_umask_t skl_l1d_pend_miss[]={ + { .uname = "PENDING", + .udesc = "Cycles with L1D load misses outstanding", +@@ -602,8 +631,8 @@ static const intel_x86_umask_t skl_l2_lines_out[]={ + .ucode = 0x200, + .uflags = INTEL_X86_NCOMBO, + }, +- { .uname = "USELESS_PREF", +- .udesc = "TBD", ++ { .uname = "USELESS_HWPREF", ++ .udesc = "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache", + .ucode = 0x400, + .uflags = INTEL_X86_NCOMBO, + }, +@@ -1976,6 +2005,14 @@ static const intel_x86_umask_t skl_offcore_requests_buffer[]={ + }, + }; + ++static const intel_x86_umask_t skl_mem_load_misc_retired[]={ ++ { .uname = "UC", ++ .udesc = "Number of uncached load retired", ++ .ucode = 0x400, ++ .uflags= INTEL_X86_PEBS | INTEL_X86_DFL, ++ }, ++}; ++ + static const intel_x86_entry_t intel_skl_pe[]={ + { .name = "UNHALTED_CORE_CYCLES", + .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", +@@ -2602,6 +2639,25 @@ static const intel_x86_entry_t intel_skl_pe[]={ + .ngrp = 1, + .umasks = skl_hw_interrupts, + }, ++ { .name = "SQ_MISC", ++ .desc = "SuperQueue miscellaneous", ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0xf4, ++ .numasks = LIBPFM_ARRAY_SIZE(skl_sq_misc), ++ .ngrp = 
1, ++ .umasks = skl_sq_misc, ++ }, ++ { .name = "MEM_LOAD_MISC_RETIRED", ++ .desc = "Load retired miscellaneous", ++ .modmsk = INTEL_V4_ATTRS, ++ .flags = INTEL_X86_PEBS, ++ .cntmsk = 0xf, ++ .code = 0xd4, ++ .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_misc_retired), ++ .ngrp = 1, ++ .umasks = skl_mem_load_misc_retired, ++ }, + { .name = "OFFCORE_REQUESTS_BUFFER", + .desc = "Offcore requests buffer", + .modmsk = INTEL_V4_ATTRS, +-- +2.9.3 + + +From 073e4dbbdde1adab02e01c659028bddaea969541 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 09:24:25 -0700 +Subject: [PATCH] add SQ_MISC:SPLIT_LOCK to Intel Haswell event table + +Added: SQ_MISC:SPLIT_LOCK + +Based on V23 of public event table for Haswell published on +download.01.org + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_hsw_events.h | 17 +++++++++++++++++ + 1 file changed, 17 insertions(+) + +diff --git a/lib/events/intel_hsw_events.h b/lib/events/intel_hsw_events.h +index 2a17e47..ab211cc 100644 +--- a/lib/events/intel_hsw_events.h ++++ b/lib/events/intel_hsw_events.h +@@ -2224,6 +2224,14 @@ static const intel_x86_umask_t hsw_avx[]={ + }, + }; + ++static const intel_x86_umask_t hsw_sq_misc[]={ ++ { .uname = "SPLIT_LOCK", ++ .udesc = "Number of split locks in the super queue (SQ)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ + static const intel_x86_entry_t intel_hsw_pe[]={ + { .name = "UNHALTED_CORE_CYCLES", + .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", +@@ -2856,6 +2864,15 @@ static const intel_x86_entry_t intel_hsw_pe[]={ + .ngrp = 1, + .umasks = hsw_avx, + }, ++ { .name = "SQ_MISC", ++ .desc = "SuperQueue miscellaneous", ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0xf4, ++ .numasks = LIBPFM_ARRAY_SIZE(hsw_sq_misc), ++ .ngrp = 1, ++ .umasks = hsw_sq_misc, ++ }, + { .name = "OFFCORE_REQUESTS_BUFFER", + .desc = "Offcore reqest buffer", + .modmsk = INTEL_V4_ATTRS, +-- +2.9.3 + + +From 
25117cf79620936ed58c2c7cff72b77fd678a0a7 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 09:30:19 -0700 +Subject: [PATCH] add SQ_MISC:SPLIT_LOCK to Intel IvyBridge event table + +Added: SQ_MISC:SPLIT_LOCK + +Based on V18 of Intel Ivybridge event table published on +download.01.org. + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_ivb_events.h | 59 ++++++++++++++++++++++++++++--------------- + 1 file changed, 38 insertions(+), 21 deletions(-) + +diff --git a/lib/events/intel_ivb_events.h b/lib/events/intel_ivb_events.h +index fa29dcb..dd4175a 100644 +--- a/lib/events/intel_ivb_events.h ++++ b/lib/events/intel_ivb_events.h +@@ -1970,6 +1970,14 @@ static const intel_x86_umask_t ivb_offcore_requests_buffer[]={ + }, + }; + ++static const intel_x86_umask_t ivb_sq_misc[]={ ++ { .uname = "SPLIT_LOCK", ++ .udesc = "Number of split locks in the super queue (SQ)", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ + static const intel_x86_entry_t intel_ivb_pe[]={ + { .name = "ARITH", + .desc = "Counts arithmetic multiply operations", +@@ -2651,24 +2659,33 @@ static const intel_x86_entry_t intel_ivb_pe[]={ + .ngrp = 1, + .umasks = ivb_offcore_requests_buffer, + }, +-{ .name = "OFFCORE_RESPONSE_0", +- .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", +- .modmsk = INTEL_V3_ATTRS, +- .cntmsk = 0xf, +- .code = 0x1b7, +- .flags= INTEL_X86_NHM_OFFCORE, +- .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), +- .ngrp = 3, +- .umasks = ivb_offcore_response, +-}, +-{ .name = "OFFCORE_RESPONSE_1", +- .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", +- .modmsk = INTEL_V3_ATTRS, +- .cntmsk = 0xf, +- .code = 0x1bb, +- .flags= INTEL_X86_NHM_OFFCORE, +- .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), +- .ngrp = 3, +- .umasks = ivb_offcore_response, /* identical to 
actual umasks list for this event */ +-}, +-}; ++ { .name = "SQ_MISC", ++ .desc = "SuperQueue miscellaneous", ++ .modmsk = INTEL_V4_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0xf4, ++ .numasks = LIBPFM_ARRAY_SIZE(ivb_sq_misc), ++ .ngrp = 1, ++ .umasks = ivb_sq_misc, ++ }, ++ { .name = "OFFCORE_RESPONSE_0", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .modmsk = INTEL_V3_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x1b7, ++ .flags= INTEL_X86_NHM_OFFCORE, ++ .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), ++ .ngrp = 3, ++ .umasks = ivb_offcore_response, ++ }, ++ { .name = "OFFCORE_RESPONSE_1", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .modmsk = INTEL_V3_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x1bb, ++ .flags= INTEL_X86_NHM_OFFCORE, ++ .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), ++ .ngrp = 3, ++ .umasks = ivb_offcore_response, /* identical to actual umasks list for this event */ ++ }, ++ }; +-- +2.9.3 + + +From 6e764d5d2f7a9fbbcdf1c987ab9895600826e467 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 09:54:21 -0700 +Subject: [PATCH] add BR_INST_RETIRED:ALL_TAKEN_BRANCHES to Intel Goldmont + event table + +Added: BR_INST_RETIRED:ALL_TAKEN_BRANCHES + +Based on Goldmont V8 event table published on download.01.org. 
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_glm_events.h | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/lib/events/intel_glm_events.h b/lib/events/intel_glm_events.h +index a7ed811..78dc5da 100644 +--- a/lib/events/intel_glm_events.h ++++ b/lib/events/intel_glm_events.h +@@ -727,6 +727,13 @@ static const intel_x86_umask_t glm_br_inst_retired[]={ + { .uname = "ALL_BRANCHES", + .udesc = "Retired branch instructions (Precise Event)", + .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, ++ { .uname = "ALL_TAKEN_BRANCHES", ++ .udesc = "Retired branch instructions (Precise Event)", ++ .ucode = 0x8000, + .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, + .grpid = 0, + .ucntmsk = 0xfull, +-- +2.9.3 + + +From 7ac65a64d557a02244fef535b26ceb01b2258159 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 23 Aug 2016 10:05:37 -0700 +Subject: [PATCH] add BR_INST_RETIRED:ALL_TAKEN_BRANCHES to Intel Silvermont + event table + +Added: BR_INST_RETIRED:ALL_TAKEN_BRANCHES + +Based on Silvermont V13 event table published on download.01.org. 
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_slm_events.h | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/lib/events/intel_slm_events.h b/lib/events/intel_slm_events.h +index 3dbd90d..3d54f27 100644 +--- a/lib/events/intel_slm_events.h ++++ b/lib/events/intel_slm_events.h +@@ -127,6 +127,13 @@ static const intel_x86_umask_t slm_br_inst_retired[]={ + .ucode = 0x0, + .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, + }, ++ { .uname = "ALL_TAKEN_BRANCHES", ++ .udesc = "Retired branch instructions (Precise Event)", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ .grpid = 0, ++ .ucntmsk = 0xfull, ++ }, + { .uname = "JCC", + .udesc = "JCC instructions retired (Precise Event)", + .ucode = 0x7e00, +-- +2.9.3 + + +From 408701ebe9cd1bb83b711ebdb5cb3d3dd58bec4b Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 30 Aug 2016 09:45:14 -0700 +Subject: [PATCH] fix encodings of L2_RQSTS:PF_MISS and PF_HIT for HSW/BDW + +The encodings of these two umasks were wrong for Haswell and Broadwell. 
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 16 ++++++++++++++-- + lib/events/intel_hsw_events.h | 18 ++++++++++++++++-- + 2 files changed, 30 insertions(+), 4 deletions(-) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index f6ab78a..fba5ad2 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -899,7 +899,13 @@ static const intel_x86_umask_t bdw_l2_rqsts[]={ + }, + { .uname = "L2_PF_MISS", + .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", +- .ucode = 0x3000, ++ .ucode = 0x3800, ++ .uequiv = "PF_MISS", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PF_MISS", ++ .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", ++ .ucode = 0x3800, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "MISS", +@@ -909,7 +915,13 @@ static const intel_x86_umask_t bdw_l2_rqsts[]={ + }, + { .uname = "L2_PF_HIT", + .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", +- .ucode = 0x5000, ++ .ucode = 0xd800, ++ .uequiv = "PF_HIT", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PF_HIT", ++ .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", ++ .ucode = 0xd800, + .uflags = INTEL_X86_NCOMBO, + }, + { .uname = "ALL_DEMAND_DATA_RD", +diff --git a/lib/events/intel_hsw_events.h b/lib/events/intel_hsw_events.h +index ab211cc..64cb06a 100644 +--- a/lib/events/intel_hsw_events.h ++++ b/lib/events/intel_hsw_events.h +@@ -863,9 +863,16 @@ static const intel_x86_umask_t hsw_l2_rqsts[]={ + }, + { .uname = "L2_PF_MISS", + .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", +- .ucode = 0x3000, ++ .ucode = 0x3800, ++ .uequiv = "PF_MISS", ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PF_MISS", ++ .udesc = "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache", ++ .ucode = 0x3800, + .uflags = INTEL_X86_NCOMBO, + }, ++ + { .uname = "MISS", + .udesc = "All 
requests that miss the L2 cache", + .ucode = 0x3f00, +@@ -873,9 +880,16 @@ static const intel_x86_umask_t hsw_l2_rqsts[]={ + }, + { .uname = "L2_PF_HIT", + .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", +- .ucode = 0x5000, ++ .ucode = 0xd800, ++ .uequiv = "PF_HIT", + .uflags = INTEL_X86_NCOMBO, + }, ++ { .uname = "PF_HIT", ++ .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", ++ .ucode = 0xd800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ + { .uname = "ALL_DEMAND_DATA_RD", + .udesc = "Any data read request to L2 cache", + .ucode = 0xe100, +-- +2.9.3 + + +From 359a11a6347b4a6495d4de18a4f916859d8d471a Mon Sep 17 00:00:00 2001 +From: Philip Mucci +Date: Thu, 1 Sep 2016 12:10:14 -0700 +Subject: [PATCH] allow . as a delimiter for event strings + +This patch allows either : or . as the event string delimiter: + +knl::offcore_response_0.any_request.L2_HIT_NEAR_TILE.L2_HIT_FAR_TILE.c=1.u + +is equivalent to + +knl::offcore_response_0:any_request:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE:c=1:u + +Delimiters can be mixed and matched. + +The change is motivated by the fact that it makes it easier to use vendor +provided symbolic event names directly as many of them use the . as the +event/umask delimiter, e.g., Intel event tables. 
+ +Signed-off-by: Philip Mucci +Signed-off-by: Stephane Eranian +--- + docs/man3/libpfm.3 | 4 ++++ + lib/pfmlib_common.c | 22 +++++++++++++--------- + lib/pfmlib_priv.h | 2 +- + tests/validate_x86.c | 24 ++++++++++++++++++++++++ + 4 files changed, 42 insertions(+), 10 deletions(-) + +diff --git a/docs/man3/libpfm.3 b/docs/man3/libpfm.3 +index 08a0f49..3852a3c 100644 +--- a/docs/man3/libpfm.3 ++++ b/docs/man3/libpfm.3 +@@ -62,6 +62,10 @@ The string structure is defined as follows: + .ce + .B [pmu::][event_name][:unit_mask][:modifier|:modifier=val] + ++or ++.ce ++.B [pmu::][event_name][.unit_mask][.modifier|.modifier=val] ++ + The components are defined as follows: + .TP + .B pmu +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index 6297fdd..b4547be 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -913,9 +913,10 @@ pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) + s = str; + + while(s) { +- p = strchr(s, PFMLIB_ATTR_DELIM); +- if (p) +- *p++ = '\0'; ++ p = s; ++ strsep(&p, PFMLIB_ATTR_DELIM); ++ /* if (p) ++ *p++ = '\0'; */ + + q = strchr(s, '='); + if (q) +@@ -1159,9 +1160,10 @@ pfmlib_parse_equiv_event(const char *event, pfmlib_event_desc_t *d) + if (!str) + return PFM_ERR_NOMEM; + +- p = strchr(s, PFMLIB_ATTR_DELIM); +- if (p) +- *p++ = '\0'; ++ p = s; ++ strsep(&p, PFMLIB_ATTR_DELIM); ++ /* if (p) ++ *p++ = '\0'; */ + + match = pmu->match_event ? 
pmu->match_event : match_event; + +@@ -1234,9 +1236,11 @@ pfmlib_parse_event(const char *event, pfmlib_event_desc_t *d) + pname = s; + s = p + strlen(PFMLIB_PMU_DELIM); + } +- p = strchr(s, PFMLIB_ATTR_DELIM); +- if (p) +- *p++ = '\0'; ++ p = s; ++ strsep(&p, PFMLIB_ATTR_DELIM); ++ /* if (p) ++ *p++ = '\0'; */ ++ + /* + * for each pmu + */ +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 0d106a4..5cde35c 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -29,7 +29,7 @@ + + #define PFM_PLM_ALL (PFM_PLM0|PFM_PLM1|PFM_PLM2|PFM_PLM3|PFM_PLMH) + +-#define PFMLIB_ATTR_DELIM ':' /* event attribute delimiter */ ++#define PFMLIB_ATTR_DELIM ":." /* event attribute delimiter possible */ + #define PFMLIB_PMU_DELIM "::" /* pmu to event delimiter */ + #define PFMLIB_EVENT_DELIM ',' /* event to event delimiter */ + +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 0247c3e..83b8c88 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4358,6 +4358,30 @@ static const test_event_t x86_test_events[]={ + .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:u:intxcp", + .ret = PFM_ERR_ATTR, + }, ++ /* ++ * test delimiter options ++ */ ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL.k=1.u=0.e=0.i=0.c=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x15201ca, ++ .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=0:e=0:i=0:c=1", ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x15301ca, ++ .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", ++ }, ++ { SRC_LINE, ++ .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL:k=1:u=1:e=0.i=0.c=1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x15301ca, ++ .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", ++ }, + }; + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) + +-- +2.9.3 
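[editorial note in the series, not part of any patch] The strsep()-based change above makes PFMLIB_ATTR_DELIM a set of characters (":.") rather than a single character, so ':' and '.' can be mixed and matched within one event string, while the PMU prefix is still split on the two-character "::" delimiter. A minimal Python sketch of the resulting tokenizing behavior (`tokenize_event` is a hypothetical helper written for illustration, not part of libpfm):

```python
# Mirrors the delimiter handling after the patch:
# - PFMLIB_PMU_DELIM is the two-character string "::"
# - PFMLIB_ATTR_DELIM is the character set ":." (strsep stops at either)
ATTR_DELIMS = ":."
PMU_DELIM = "::"

def tokenize_event(event: str):
    """Split 'pmu::event:umask.mod=val' into (pmu_or_None, token_list)."""
    pmu = None
    if PMU_DELIM in event:
        # The PMU prefix is peeled off first, so its two colons never
        # reach the attribute tokenizer.
        pmu, event = event.split(PMU_DELIM, 1)
    tokens, cur = [], []
    for ch in event:
        if ch in ATTR_DELIMS:   # like strsep(&p, ":."): cut at either char
            tokens.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    tokens.append("".join(cur))
    return pmu, tokens
```

With this behavior, `tokenize_event("glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL:k=1.u=0")` yields the same tokens as the all-colon spelling, which is what the validate_x86.c test cases added above check at the encoding level.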
+ + +From ef73810593a80b3202daee3e94d090b6ecefa068 Mon Sep 17 00:00:00 2001 +From: Asim YarKhan +Date: Thu, 28 Jul 2016 14:58:11 -0400 +Subject: [PATCH] Add support for Intel Knights Landing core PMU + +This patch adds support for Intel Knights Landing core PMU. + +This patch was contributed by Intel and altered to match updates to +libpfm4. Intel's contributed patch was split into two core and uncore +patches for libpfm4. This is the patch for the KNL core events only. + +Signed-off-by: Peinan Zhang +[yarkhan@icl.utk.edu: Split into core/uncore patches] +Signed-off-by: Asim YarKhan +--- + README | 1 + + docs/Makefile | 1 + + docs/man3/libpfm_intel_knl.3 | 100 ++++ + include/perfmon/pfmlib.h | 3 + + lib/Makefile | 2 + + lib/events/intel_knl_events.h | 1150 +++++++++++++++++++++++++++++++++++++++++ + lib/pfmlib_common.c | 1 + + lib/pfmlib_intel_knl.c | 75 +++ + lib/pfmlib_priv.h | 1 + + tests/validate_x86.c | 96 ++++ + 10 files changed, 1430 insertions(+) + create mode 100644 docs/man3/libpfm_intel_knl.3 + create mode 100644 lib/events/intel_knl_events.h + create mode 100644 lib/pfmlib_intel_knl.c + +diff --git a/README b/README +index ce60d3a..287616e 100644 +--- a/README ++++ b/README +@@ -55,6 +55,7 @@ The library supports many PMUs. 
The current version can handle: + Intel Goldmont + Intel RAPL (energy consumption) + Intel Knights Corner ++ Intel Knights Landing + Intel architectural perfmon v1, v2, v3 + + - For ARM: +diff --git a/docs/Makefile b/docs/Makefile +index 873f31f..f8f8838 100644 +--- a/docs/Makefile ++++ b/docs/Makefile +@@ -53,6 +53,7 @@ ARCH_MAN=libpfm_intel_core.3 \ + libpfm_intel_slm.3 \ + libpfm_intel_skl.3 \ + libpfm_intel_glm.3 \ ++ libpfm_intel_knl.3 \ + libpfm_intel_snbep_unc_cbo.3 \ + libpfm_intel_snbep_unc_ha.3 \ + libpfm_intel_snbep_unc_imc.3 \ +diff --git a/docs/man3/libpfm_intel_knl.3 b/docs/man3/libpfm_intel_knl.3 +new file mode 100644 +index 0000000..e521e01 +--- /dev/null ++++ b/docs/man3/libpfm_intel_knl.3 +@@ -0,0 +1,100 @@ ++.TH LIBPFM 3 "July, 2016" "" "Linux Programmer's Manual" ++.SH NAME ++libpfm_intel_knl - support for Intel Knights Landing core PMU ++.SH SYNOPSIS ++.nf ++.B #include ++.sp ++.B PMU name: knl ++.B PMU desc: Intel Knights Landing ++.sp ++.SH DESCRIPTION ++The library supports the Intel Knights Landing core PMU. It should be noted that ++this PMU model only covers each core's PMU and not the socket level PMU. ++ ++On Knights Landing, the number of generic counters is 4. There is 4-way HyperThreading support. ++The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters ++in \fBnum_cntrs\fR. ++ ++.SH MODIFIERS ++The following modifiers are supported on Intel Knights Landing processors: ++.TP ++.B u ++Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. ++This is a boolean modifier. ++.TP ++.B k ++Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. ++This is a boolean modifier. ++.TP ++.B i ++Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR ++occurring. 
This is a boolean modifier. ++.TP ++.B e ++Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event ++to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. ++This is a boolean modifier. ++.TP ++.B c ++Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles ++in which the number of occurrences of the event is greater or equal to the threshold. This is an integer ++modifier with values in the range [0:255]. ++.TP ++.B t ++Measure on any of the 4 hyper-threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. ++This modifier is only available on fixed counters (unhalted_reference_cycles, instructions_retired, unhalted_core_cycles). ++Depending on the underlying kernel interface, the event may be programmed on a fixed counter or a generic counter, except for ++unhalted_reference_cycles, in which case, this modifier may be ignored or rejected. ++ ++.SH OFFCORE_RESPONSE events ++Intel Knights Landing provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. ++ ++Those events need special treatment in the performance monitoring infrastructure ++because each event uses an extra register to store some settings. Thus, in ++case multiple offcore_response events are monitored simultaneously, the kernel needs ++to manage the sharing of that extra register. ++ ++The offcore_response events are exposed as normal events by the library. The extra ++settings are exposed as regular umasks. The library takes care of encoding the ++events according to the underlying kernel interface. ++ ++On Intel Knights Landing, the umasks are divided into 4 categories: request, supplier, ++snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency. 
++In the first mode, the two offcore_response events operate independently of each other. The user must provide at ++least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two ++offcore_response events are combined to compute an average latency per request type. ++ ++For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask ++is used then it overrides any supplier and snoop umasks. In other words, users can ++specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop ++is specified, the library defaults to using \fBANY_RESPONSE\fR. ++ ++For instance, the following are valid event selections: ++.TP ++.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE ++.TP ++.B OFFCORE_RESPONSE_0:ANY_REQUEST ++.TP ++.B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR ++ ++.P ++But the following is illegal: ++ ++.TP ++.B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR:ANY_RESPONSE ++.P ++In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others. The library will enforce that restriction as soon as the \fBOUTSTANDING\fR umask is used. Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. It should be noted that the library encodes events independently of each other and therefore cannot verify that the requests are matching between the two events. ++Example of average latency settings: ++.TP ++.B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE ++.TP ++.B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE ++.P ++The average latency for the request(s) is obtained by dividing the counts of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles. 
++ ++.SH AUTHORS ++.nf ++Stephane Eranian ++.if ++.PP +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index d3a3c41..b584672 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -297,7 +297,10 @@ typedef enum { + PFM_PMU_INTEL_SKL, /* Intel Skylake */ + + PFM_PMU_INTEL_BDW_EP, /* Intel Broadwell EP */ ++ + PFM_PMU_INTEL_GLM, /* Intel Goldmont */ ++ ++ PFM_PMU_INTEL_KNL, /* Intel Knights Landing */ + /* MUST ADD NEW PMU MODELS HERE */ + + PFM_PMU_MAX /* end marker */ +diff --git a/lib/Makefile b/lib/Makefile +index bd74d50..3c5033f 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -93,6 +93,7 @@ SRCS += pfmlib_amd64.c pfmlib_intel_core.c pfmlib_intel_x86.c \ + pfmlib_intel_hswep_unc_sbo.c \ + pfmlib_intel_knc.c \ + pfmlib_intel_slm.c \ ++ pfmlib_intel_knl.c \ + pfmlib_intel_glm.c \ + pfmlib_intel_netburst.c \ + pfmlib_amd64_k7.c pfmlib_amd64_k8.c pfmlib_amd64_fam10h.c \ +@@ -250,6 +251,7 @@ INC_X86= pfmlib_intel_x86_priv.h \ + events/intel_snbep_unc_r2pcie_events.h \ + events/intel_snbep_unc_r3qpi_events.h \ + events/intel_knc_events.h \ ++ events/intel_knl_events.h \ + events/intel_ivbep_unc_cbo_events.h \ + events/intel_ivbep_unc_ha_events.h \ + events/intel_ivbep_unc_imc_events.h \ +diff --git a/lib/events/intel_knl_events.h b/lib/events/intel_knl_events.h +new file mode 100644 +index 0000000..d0255ba +--- /dev/null ++++ b/lib/events/intel_knl_events.h +@@ -0,0 +1,1150 @@ ++/* ++ * Copyright (c) 2016 Intel Corp. 
All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: knl (Intel Knights Landing) ++ */ ++ ++static const intel_x86_umask_t knl_icache[]={ ++ { .uname = "HIT", ++ .udesc = "Counts all instruction fetches that hit the instruction cache.", ++ .ucode = 0x100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "MISSES", ++ .udesc = "Counts all instruction fetches that miss the instruction cache or produce memory requests. 
An instruction fetch miss is counted only once and not once for every cycle it is outstanding.", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ACCESSES", ++ .udesc = "Counts all instruction fetches, including uncacheable fetches.", ++ .ucode = 0x300, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_uops_retired[]={ ++ { .uname = "ALL", ++ .udesc = "Counts the number of micro-ops retired.", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "MS", ++ .udesc = "Counts the number of micro-ops retired that are from the complex flows issued by the micro-sequencer (MS).", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SCALAR_SIMD", ++ .udesc = "Counts the number of scalar SSE, AVX, AVX2, AVX-512 micro-ops retired. More specifically, it counts scalar SSE, AVX, AVX2, AVX-512 micro-ops except for loads (memory-to-register mov-type micro ops), division, sqrt.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PACKED_SIMD", ++ .udesc = "Counts the number of vector SSE, AVX, AVX2, AVX-512 micro-ops retired. 
More specifically, it counts packed SSE, AVX, AVX2, AVX-512 micro-ops (both floating point and integer) except for loads (memory-to-register mov-type micro-ops), packed byte and word multiplies.", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_inst_retired[]={ ++ { .uname = "ANY_P", ++ .udesc = "Instructions retired using generic counter (precise event)", ++ .ucode = 0x0, ++ .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, ++ }, ++ { .uname = "ANY", ++ .udesc = "Instructions retired using generic counter (precise event)", ++ .uequiv = "ANY_P", ++ .ucode = 0x0, ++ .uflags = INTEL_X86_PEBS, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_l2_requests_reject[]={ ++ { .uname = "ALL", ++ .udesc = "Counts the number of MEC requests from the L2Q that reference a cache line excluding SW prefetches filling only to L2 cache and L1 evictions (automatically excludes L2HWP, UC, WC) that were rejected - Multiple repeated rejects should be counted multiple times.", ++ .ucode = 0x000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_core_reject[]={ ++ { .uname = "ALL", ++ .udesc = "Counts the number of MEC requests that were not accepted into the L2Q because of any L2 queue reject condition. There is no concept of at-ret here. 
It might include requests due to instructions in the speculative path", ++ .ucode = 0x000, ++ .uflags = INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_machine_clears[]={ ++ { .uname = "SMC", ++ .udesc = "Counts the number of times that the machine clears due to program modifying data within 1K of a recently fetched code page.", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "MEMORY_ORDERING", ++ .udesc = "Counts the number of times the machine clears due to memory ordering hazards", ++ .ucode = 0x0200, ++ }, ++ { .uname = "FP_ASSIST", ++ .udesc = "Counts the number of floating operations retired that required microcode assists", ++ .ucode = 0x0400, ++ }, ++ { .uname = "ALL", ++ .udesc = "Counts all nukes", ++ .ucode = 0x0800, ++ }, ++ { .uname = "ANY", ++ .udesc = "Counts all nukes", ++ .uequiv = "ALL", ++ .ucode = 0x0800, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_br_inst_retired[]={ ++ { .uname = "ANY", ++ .udesc = "Counts the number of branch instructions retired (Precise Event)", ++ .ucode = 0x0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, ++ }, ++ { .uname = "ALL_BRANCHES", ++ .udesc = "Counts the number of branch instructions retired", ++ .uequiv = "ANY", ++ .ucode = 0x0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "JCC", ++ .udesc = "Counts the number of branch instructions retired that were conditional jumps.", ++ .ucode = 0x7e00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "TAKEN_JCC", ++ .udesc = "Counts the number of branch instructions retired that were conditional jumps and predicted taken.", ++ .ucode = 0xfe00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "CALL", ++ .udesc = "Counts the number of near CALL branch instructions retired.", ++ .ucode = 0xf900, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "REL_CALL", ++ .udesc = "Counts the number of near relative CALL branch instructions 
retired.", ++ .ucode = 0xfd00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "IND_CALL", ++ .udesc = "Counts the number of near indirect CALL branch instructions retired. (Precise Event)", ++ .ucode = 0xfb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "RETURN", ++ .udesc = "Counts the number of near RET branch instructions retired. (Precise Event)", ++ .ucode = 0xf700, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "NON_RETURN_IND", ++ .udesc = "Counts the number of branch instructions retired that were near indirect CALL or near indirect JMP. (Precise Event)", ++ .ucode = 0xeb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "FAR_BRANCH", ++ .udesc = "Counts the number of far branch instructions retired. (Precise Event)", ++ .uequiv = "FAR", ++ .ucode = 0xbf00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "FAR", ++ .udesc = "Counts the number of far branch instructions retired. (Precise Event)", ++ .ucode = 0xbf00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_fetch_stall[]={ ++ { .uname = "ICACHE_FILL_PENDING_CYCLES", ++ .udesc = "Counts the number of core cycles the fetch stalls because of an icache miss. 
This is a cumulative count of core cycles the fetch stalled for all icache misses", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_baclears[]={ ++ { .uname = "ALL", ++ .udesc = "Counts the number of times the front end resteers for any branch as a result of another branch handling mechanism in the front end.", ++ .ucode = 0x100, ++ .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ANY", ++ .udesc = "Counts the number of times the front end resteers for any branch as a result of another branch handling mechanism in the front end.", ++ .uequiv = "ALL", ++ .ucode = 0x100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RETURN", ++ .udesc = "Counts the number of times the front end resteers for RET branches as a result of another branch handling mechanism in the front end.", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "COND", ++ .udesc = "Counts the number of times the front end resteers for conditional branches as a result of another branch handling mechanism in the front end.", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_cpu_clk_unhalted[]={ ++ { .uname = "THREAD_P", ++ .udesc = "thread cycles when core is not halted", ++ .ucode = 0x0, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "BUS", ++ .udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. This event has a constant ratio with the CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", ++ .uequiv = "REF_P", ++ .ucode = 0x100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REF_P", ++ .udesc = "Number of reference cycles that the cpu is not in a halted state. The core enters the halted state when it is running the HLT instruction. In mobile systems, the core frequency may change from time to time. 
This event is not affected by core frequency changes but counts as if the core is running at the same maximum frequency all the time", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_mem_uops_retired[]={ ++ { .uname = "L1_MISS_LOADS", ++ .udesc = "Counts the number of load micro-ops retired that miss in L1 D cache.", ++ .ucode = 0x100, ++ }, ++ { .uname = "LD_DCU_MISS", ++ .udesc = "Counts the number of load micro-ops retired that miss in L1 D cache.", ++ .uequiv = "L1_MISS_LOADS", ++ .ucode = 0x100, ++ }, ++ { .uname = "L2_HIT_LOADS", ++ .udesc = "Counts the number of load micro-ops retired that hit in the L2.", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_PEBS, ++ }, ++ { .uname = "L2_MISS_LOADS", ++ .udesc = "Counts the number of load micro-ops retired that miss in the L2.", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_PEBS, ++ }, ++ { .uname = "LD_L2_MISS", ++ .udesc = "Counts the number of load micro-ops retired that miss in the L2.", ++ .uequiv = "L2_MISS_LOADS", ++ .ucode = 0x400, ++ .uflags = INTEL_X86_PEBS, ++ }, ++ { .uname = "DTLB_MISS_LOADS", ++ .udesc = "Counts the number of load micro-ops retired that cause a DTLB miss.", ++ .ucode = 0x800, ++ .uflags = INTEL_X86_PEBS, ++ }, ++ { .uname = "UTLB_MISS_LOADS", ++ .udesc = "Counts the number of load micro-ops retired that caused a micro TLB miss.", ++ .ucode = 0x1000, ++ }, ++ { .uname = "LD_UTLB_MISS", ++ .udesc = "Counts the number of load micro-ops retired that caused a micro TLB miss.", ++ .uequiv = "UTLB_MISS_LOADS", ++ .ucode = 0x1000, ++ }, ++ { .uname = "HITM", ++ .udesc = "Counts the loads retired that get the data from the other core in the same tile in M state.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_PEBS, ++ }, ++ { .uname = "ALL_LOADS", ++ .udesc = "Counts all the load micro-ops retired.", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_DFL, ++ }, ++ { .uname = "ANY_LD", ++ .udesc = "Counts all the load micro-ops retired.", ++ .uequiv = "ALL_LOADS", 
++ .ucode = 0x4000, ++ }, ++ { .uname = "ALL_STORES", ++ .udesc = "Counts all the store micro-ops retired.", ++ .ucode = 0x8000, ++ }, ++ { .uname = "ANY_ST", ++ .udesc = "Counts all the store micro-ops retired.", ++ .uequiv = "ALL_STORES", ++ .ucode = 0x8000, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_page_walks[]={ ++ { .uname = "D_SIDE_CYCLES", ++ .udesc = "Counts the total D-side page walks that are completed or started. The page walks started in the speculative path will also be counted.", ++ .ucode = 0x100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "D_SIDE_WALKS", ++ .udesc = "Counts the total number of core cycles for all the D-side page walks. The cycles for page walks started in speculative path will also be included.", ++ .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), ++ .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "I_SIDE_CYCLES", ++ .udesc = "Counts the total I-side page walks that are completed.", ++ .ucode = 0x200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "I_SIDE_WALKS", ++ .udesc = "Counts the total number of core cycles for all the I-side page walks. The cycles for page walks started in speculative path will also be included.", ++ .ucode = 0x200 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), ++ .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CYCLES", ++ .udesc = "Counts the total page walks completed (I-side and D-side)", ++ .ucode = 0x300, ++ .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "WALKS", ++ .udesc = "Counts the total number of core cycles for all the page walks. 
The cycles for page walks started in speculative path will also be included.", ++ .ucode = 0x300 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), ++ .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_l2_rqsts[]={ ++ { .uname = "MISS", ++ .udesc = "Counts the number of L2 cache misses", ++ .ucode = 0x4100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REFERENCE", ++ .udesc = "Counts the total number of L2 cache references.", ++ .ucode = 0x4f00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_recycleq[]={ ++ { .uname = "LD_BLOCK_ST_FORWARD", ++ .udesc = "Counts the number of occurrences a retired load gets blocked because its address partially overlaps with a store (Precise Event).", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "LD_BLOCK_STD_NOTREADY", ++ .udesc = "Counts the number of occurrences a retired load gets blocked because its address overlaps with a store whose data is not ready.", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ST_SPLITS", ++ .udesc = "Counts the number of occurrences of a retired store that is a cache line split. Each split should be counted only once.", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "LD_SPLITS", ++ .udesc = "Counts the number of occurrences of a retired load that is a cache line split. Each split should be counted only once (Precise Event).", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "LOCK", ++ .udesc = "Counts all the retired locked loads.
It does not include stores because we would double count if we count stores.", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "STA_FULL", ++ .udesc = "Counts the store micro-ops retired that were pushed in the rehab queue because the store address buffer is full.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ANY_LD", ++ .udesc = "Counts any retired load that was pushed into the recycle queue for any reason.", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "ANY_ST", ++ .udesc = "Counts any retired store that was pushed into the recycle queue for any reason.", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_offcore_response_0[]={ ++ { .uname = "DMND_DATA_RD", ++ .udesc = "Counts demand cacheable data and L1 prefetch data reads", ++ .ucode = 1ULL << (0 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "DMND_RFO", ++ .udesc = "Counts Demand cacheable data writes", ++ .ucode = 1ULL << (1 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "DMND_CODE_RD", ++ .udesc = "Counts demand code reads and prefetch code reads", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_L2_RFO", ++ .udesc = "Counts L2 data RFO prefetches (includes PREFETCHW instruction)", ++ .ucode = 1ULL << (5 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_L2_CODE_RD", ++ .udesc = "Request: number of code reads generated by L2 prefetchers", ++ .ucode = 1ULL << (6 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_READS", ++ .udesc = "Counts Partial reads (UC or WC and is valid only for Outstanding response type).", ++ .ucode = 1ULL << (7 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_WRITES", ++ .udesc = "Counts Partial writes (UC or WT or WP and should be programmed on PMC1)", ++ .ucode = 1ULL << (8 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "UC_CODE_READS", ++ .udesc = "Counts UC code reads (valid only for Outstanding response type)", ++ .ucode = 1ULL << (9 + 8),
++ .grpid = 0, ++ }, ++ { .uname = "BUS_LOCKS", ++ .udesc = "Counts Bus locks and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "FULL_STREAMING_STORES", ++ .udesc = "Counts Full streaming stores (WC and should be programmed on PMC1)", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_SOFTWARE", ++ .udesc = "Counts Software prefetches", ++ .ucode = 1ULL << (12 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_L1_DATA_RD", ++ .udesc = "Counts L1 data HW prefetches", ++ .ucode = 1ULL << (13 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_STREAMING_STORES", ++ .udesc = "Counts Partial streaming stores (WC and should be programmed on PMC1)", ++ .ucode = 1ULL << (14 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "STREAMING_STORES", ++ .udesc = "Counts all streaming stores (WC and should be programmed on PMC1)", ++ .ucode = (1ULL << 14 | 1ULL << 11) << 8, ++ .uequiv = "PARTIAL_STREAMING_STORES:FULL_STREAMING_STORES", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_REQUEST", ++ .udesc = "Counts any request", ++ .ucode = 1ULL << (15 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ }, ++ { .uname = "ANY_DATA_RD", ++ .udesc = "Counts Demand cacheable data and L1 prefetch data read requests", ++ .ucode = (1ULL << 0 | 1ULL << 7 | 1ULL << 12 | 1ULL << 13) << 8, ++ .uequiv = "DMND_DATA_RD:PARTIAL_READS:PF_SOFTWARE:PF_L1_DATA_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_RFO", ++ .udesc = "Counts Demand cacheable data write requests", ++ .ucode = (1ULL << 1 | 1ULL << 5) << 8, ++ .grpid = 0, ++ }, ++ { .uname = "ANY_CODE_RD", ++ .udesc = "Counts Demand code reads and prefetch code read requests", ++ .ucode = (1ULL << 2 | 1ULL << 6) << 8, ++ .uequiv = "DMND_CODE_RD:PF_L2_CODE_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_READ", ++ .udesc = "Counts any Read request", ++ .ucode = (1ULL << 0 | 1ULL << 1 | 1ULL << 2 | 1ULL << 5 | 1ULL << 6 | 1ULL << 7 | 1ULL << 9 | 1ULL << 12 | 1ULL << 13 ) << 8, ++ .uequiv = 
"DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_PF_L2", ++ .udesc = "Counts any Prefetch requests", ++ .ucode = (1ULL << 5 | 1ULL << 6) << 8, ++ .uequiv = "PF_L2_RFO:PF_L2_CODE_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_RESPONSE", ++ .udesc = "Accounts for any response", ++ .ucode = (1ULL << 16) << 8, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, ++ .grpid = 1, ++ }, ++ { .uname = "DDR_NEAR", ++ .udesc = "Accounts for data responses from DRAM Local.", ++ .ucode = (1ULL << 31 | 1ULL << 23 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "DDR_FAR", ++ .udesc = "Accounts for data responses from DRAM Far.", ++ .ucode = (1ULL << 31 | 1ULL << 24 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM_NEAR", ++ .udesc = "Accounts for data responses from MCDRAM Local.", ++ .ucode = (1ULL << 31 | 1ULL << 21 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM_FAR", ++ .udesc = "Accounts for data responses from MCDRAM Far or Other tile L2 hit far.", ++ .ucode = (1ULL << 32 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE_E_F", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in E/F state.", ++ .ucode = (1ULL << 35 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE_M", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in M state.", ++ .ucode = (1ULL << 36 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE_E_F", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in E/F state. 
Valid only for SNC4 cluster mode.", ++ .ucode = (1ULL << 35 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE_M", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in M state.", ++ .ucode = (1ULL << 36 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "NON_DRAM", ++ .udesc = "accounts for responses from any NON_DRAM system address. This includes MMIO transactions", ++ .ucode = (1ULL << 37 | 1ULL << 17 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM", ++ .udesc = "accounts for responses from MCDRAM (local and far)", ++ .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 22 | 1ULL << 21 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "DDR", ++ .udesc = "accounts for responses from DDR (local and far)", ++ .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 24 | 1ULL << 23 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE", ++ .udesc = "accounts for responses from snoop request hit with data forwarded from its Near-other tile L2 in E/F/M state", ++ .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 20 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE", ++ .udesc = "accounts for responses from snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile L2 in E/F/M state. Valid only in SNC4 Cluster mode.", ++ .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "OUTSTANDING", ++ .udesc = "outstanding, per weighted cycle, from the time of the request to when any response is received.
The outstanding response should be programmed only on PMC0.", ++ .ucode = (1ULL << 38) << 8, ++ .uflags = INTEL_X86_GRP_DFL_NONE | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ ++ .grpid = 2, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_offcore_response_1[]={ ++ { .uname = "DMND_DATA_RD", ++ .udesc = "Counts demand cacheable data and L1 prefetch data reads", ++ .ucode = 1ULL << (0 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "DMND_RFO", ++ .udesc = "Counts Demand cacheable data writes", ++ .ucode = 1ULL << (1 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "DMND_CODE_RD", ++ .udesc = "Counts demand code reads and prefetch code reads", ++ .ucode = 1ULL << (2 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_L2_RFO", ++ .udesc = "Counts L2 data RFO prefetches (includes PREFETCHW instruction)", ++ .ucode = 1ULL << (5 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_L2_CODE_RD", ++ .udesc = "Request: number of code reads generated by L2 prefetchers", ++ .ucode = 1ULL << (6 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_READS", ++ .udesc = "Counts Partial reads (UC or WC and is valid only for Outstanding response type).", ++ .ucode = 1ULL << (7 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_WRITES", ++ .udesc = "Counts Partial writes (UC or WT or WP and should be programmed on PMC1)", ++ .ucode = 1ULL << (8 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "UC_CODE_READS", ++ .udesc = "Counts UC code reads (valid only for Outstanding response type)", ++ .ucode = 1ULL << (9 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "BUS_LOCKS", ++ .udesc = "Counts Bus locks and split lock requests", ++ .ucode = 1ULL << (10 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "FULL_STREAMING_STORES", ++ .udesc = "Counts Full streaming stores (WC and should be programmed on PMC1)", ++ .ucode = 1ULL << (11 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PF_SOFTWARE", ++ .udesc = "Counts Software prefetches", ++ .ucode = 1ULL << (12 + 8), ++ .grpid = 0, ++ }, ++ { .uname = 
"PF_L1_DATA_RD", ++ .udesc = "Counts L1 data HW prefetches", ++ .ucode = 1ULL << (13 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "PARTIAL_STREAMING_STORES", ++ .udesc = "Counts Partial streaming stores (WC and should be programmed on PMC1)", ++ .ucode = 1ULL << (14 + 8), ++ .grpid = 0, ++ }, ++ { .uname = "STREAMING_STORES", ++ .udesc = "Counts all streaming stores (WC and should be programmed on PMC1)", ++ .ucode = (1ULL << 14 | 1ULL << 11) << 8, ++ .uequiv = "PARTIAL_STREAMING_STORES:FULL_STREAMING_STORES", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_REQUEST", ++ .udesc = "Counts any request", ++ .ucode = 1ULL << (15 + 8), ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ .grpid = 0, ++ }, ++ { .uname = "ANY_DATA_RD", ++ .udesc = "Counts Demand cacheable data and L1 prefetch data read requests", ++ .ucode = (1ULL << 0 | 1ULL << 7 | 1ULL << 12 | 1ULL << 13) << 8, ++ .uequiv = "DMND_DATA_RD:PARTIAL_READS:PF_SOFTWARE:PF_L1_DATA_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_RFO", ++ .udesc = "Counts Demand cacheable data write requests", ++ .ucode = (1ULL << 1 | 1ULL << 5) << 8, ++ .grpid = 0, ++ }, ++ { .uname = "ANY_CODE_RD", ++ .udesc = "Counts Demand code reads and prefetch code read requests", ++ .ucode = (1ULL << 2 | 1ULL << 6) << 8, ++ .uequiv = "DMND_CODE_RD:PF_L2_CODE_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_READ", ++ .udesc = "Counts any Read request", ++ .ucode = (1ULL << 0 | 1ULL << 1 | 1ULL << 2 | 1ULL << 5 | 1ULL << 6 | 1ULL << 7 | 1ULL << 9 | 1ULL << 12 | 1ULL << 13 ) << 8, ++ .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_PF_L2", ++ .udesc = "Counts any Prefetch requests", ++ .ucode = (1ULL << 5 | 1ULL << 6) << 8, ++ .uequiv = "PF_L2_RFO:PF_L2_CODE_RD", ++ .grpid = 0, ++ }, ++ { .uname = "ANY_RESPONSE", ++ .udesc = "Accounts for any response", ++ .ucode = (1ULL << 16) << 8, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | 
INTEL_X86_EXCL_GRP_GT, ++ .grpid = 1, ++ }, ++ { .uname = "DDR_NEAR", ++ .udesc = "Accounts for data responses from DRAM Local.", ++ .ucode = (1ULL << 31 | 1ULL << 23 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "DDR_FAR", ++ .udesc = "Accounts for data responses from DRAM Far.", ++ .ucode = (1ULL << 31 | 1ULL << 24 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM_NEAR", ++ .udesc = "Accounts for data responses from MCDRAM Local.", ++ .ucode = (1ULL << 31 | 1ULL << 21 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM_FAR", ++ .udesc = "Accounts for data responses from MCDRAM Far or Other tile L2 hit far.", ++ .ucode = (1ULL << 32 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE_E_F", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in E/F state.", ++ .ucode = (1ULL << 35 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE_M", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in M state.", ++ .ucode = (1ULL << 36 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE_E_F", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in E/F state. Valid only for SNC4 cluster mode.", ++ .ucode = (1ULL << 35 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE_M", ++ .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in M state.", ++ .ucode = (1ULL << 36 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "NON_DRAM", ++ .udesc = "accounts for responses from any NON_DRAM system address. 
This includes MMIO transactions", ++ .ucode = (1ULL << 37 | 1ULL << 17 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "MCDRAM", ++ .udesc = "accounts for responses from MCDRAM (local and far)", ++ .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 22 | 1ULL << 21 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "DDR", ++ .udesc = "accounts for responses from DDR (local and far)", ++ .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 24 | 1ULL << 23 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_NEAR_TILE", ++ .udesc = "accounts for responses from snoop request hit with data forwarded from its Near-other tile L2 in E/F/M state", ++ .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 20 | 1ULL << 19 ) << 8, ++ .grpid = 1, ++ }, ++ { .uname = "L2_HIT_FAR_TILE", ++ .udesc = "accounts for responses from snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile L2 in E/F/M state. Valid only in SNC4 Cluster mode.", ++ .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 22 ) << 8, ++ .grpid = 1, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_br_misp_retired[]={ ++ { .uname = "ALL_BRANCHES", ++ .udesc = "All mispredicted branches (Precise Event)", ++ .uequiv = "ANY", ++ .ucode = 0x0000, /* architectural encoding */ ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "ANY", ++ .udesc = "All mispredicted branches (Precise Event)", ++ .ucode = 0x0000, /* architectural encoding */ ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, ++ }, ++ { .uname = "JCC", ++ .udesc = "Number of mispredicted conditional branch instructions retired (Precise Event)", ++ .ucode = 0x7e00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "NON_RETURN_IND", ++ .udesc = "Number of mispredicted non-return branch instructions retired (Precise Event)", ++ .ucode = 0xeb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "RETURN", ++ .udesc = "Number of mispredicted return branch instructions retired (Precise Event)", ++ .ucode
= 0xf700, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "IND_CALL", ++ .udesc = "Number of mispredicted indirect call branch instructions retired (Precise Event)", ++ .ucode = 0xfb00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "TAKEN_JCC", ++ .udesc = "Number of mispredicted taken conditional branch instructions retired (Precise Event)", ++ .ucode = 0xfe00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "CALL", ++ .udesc = "Counts the number of mispredicted near CALL branch instructions retired.", ++ .ucode = 0xf900, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "REL_CALL", ++ .udesc = "Counts the number of mispredicted near relative CALL branch instructions retired.", ++ .ucode = 0xfd00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++ { .uname = "FAR_BRANCH", ++ .udesc = "Counts the number of mispredicted far branch instructions retired.", ++ .ucode = 0xbf00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_no_alloc_cycles[]={ ++ { .uname = "ROB_FULL", ++ .udesc = "Counts the number of core cycles when no micro-ops are allocated and the ROB is full", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "MISPREDICTS", ++ .udesc = "Counts the number of core cycles when no micro-ops are allocated and the alloc pipe is stalled waiting for a mispredicted branch to retire.", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RAT_STALL", ++ .udesc = "Counts the number of core cycles when no micro-ops are allocated and a RAT stall (caused by reservation station full) is asserted.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "NOT_DELIVERED", ++ .udesc = "Counts the number of core cycles when no micro-ops are allocated, the IQ is empty, and no other condition is blocking allocation.", ++ .ucode = 0x9000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ALL",
++ .udesc = "Counts the total number of core cycles when no micro-ops are allocated for any reason.", ++ .ucode = 0x7f00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "ANY", ++ .udesc = "Counts the total number of core cycles when no micro-ops are allocated for any reason.", ++ .uequiv = "ALL", ++ .ucode = 0x7f00, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_rs_full_stall[]={ ++ { .uname = "MEC", ++ .udesc = "Counts the number of core cycles when allocation pipeline is stalled and is waiting for a free MEC reservation station entry.", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ANY", ++ .udesc = "Counts the total number of core cycles the Alloc pipeline is stalled when any one of the reservation stations is full.", ++ .ucode = 0x1f00, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_cycles_div_busy[]={ ++ { .uname = "ALL", ++ .udesc = "Counts the number of core cycles when divider is busy. 
Does not imply a stall waiting for the divider.", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_ms_decoded[]={ ++ { .uname = "ENTRY", ++ .udesc = "Counts the number of times the MSROM starts a flow of uops.", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_decode_restriction[]={ ++ { .uname = "PREDECODE_WRONG", ++ .udesc = "Number of times the prediction (from the predecode cache) for instruction length is incorrect", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_entry_t intel_knl_pe[]={ ++{ .name = "UNHALTED_CORE_CYCLES", ++ .desc = "Unhalted core cycles", ++ .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ ++ .cntmsk = 0x200000003ull, ++ .code = 0x3c, ++}, ++{ .name = "UNHALTED_REFERENCE_CYCLES", ++ .desc = "Unhalted reference cycle", ++ .modmsk = INTEL_FIXED3_ATTRS, ++ .cntmsk = 0x400000000ull, ++ .code = 0x0300, /* pseudo encoding */ ++ .flags = INTEL_X86_FIXED, ++}, ++{ .name = "INSTRUCTION_RETIRED", ++ .desc = "Instructions retired (any thread modifier supported in fixed counter)", ++ .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ ++ .cntmsk = 0x100000003ull, ++ .code = 0xc0, ++}, ++{ .name = "INSTRUCTIONS_RETIRED", ++ .desc = "This is an alias for INSTRUCTION_RETIRED (any thread modifier supported in fixed counter)", ++ .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ ++ .equiv = "INSTRUCTION_RETIRED", ++ .cntmsk = 0x10003, ++ .code = 0xc0, ++}, ++{ .name = "LLC_REFERENCES", ++ .desc = "Last level of cache references", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x4f2e, ++}, ++{ .name = "LAST_LEVEL_CACHE_REFERENCES", ++ .desc = "This is an alias for LLC_REFERENCES", ++ .modmsk = INTEL_V2_ATTRS, ++ .equiv = "LLC_REFERENCES", ++ .cntmsk = 0x3, ++ .code = 0x4f2e, ++}, 
++{ .name = "LLC_MISSES", ++ .desc = "Last level of cache misses", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x412e, ++}, ++{ .name = "LAST_LEVEL_CACHE_MISSES", ++ .desc = "This is an alias for LLC_MISSES", ++ .modmsk = INTEL_V2_ATTRS, ++ .equiv = "LLC_MISSES", ++ .cntmsk = 0x3, ++ .code = 0x412e, ++}, ++{ .name = "BRANCH_INSTRUCTIONS_RETIRED", ++ .desc = "Branch instructions retired", ++ .modmsk = INTEL_V2_ATTRS, ++ .equiv = "BR_INST_RETIRED:ANY", ++ .cntmsk = 0x3, ++ .code = 0xc4, ++}, ++{ .name = "MISPREDICTED_BRANCH_RETIRED", ++ .desc = "Mispredicted branch instruction retired", ++ .equiv = "BR_MISP_RETIRED:ANY", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc5, ++ .flags = INTEL_X86_PEBS, ++}, ++/* begin model specific events */ ++{ .name = "ICACHE", ++ .desc = "Instruction fetches", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x80, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_icache), ++ .ngrp = 1, ++ .umasks = knl_icache, ++}, ++{ .name = "UOPS_RETIRED", ++ .desc = "Micro-ops retired", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc2, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_uops_retired), ++ .ngrp = 1, ++ .umasks = knl_uops_retired, ++}, ++{ .name = "INST_RETIRED", ++ .desc = "Instructions retired", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc0, ++ .flags = INTEL_X86_PEBS, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_inst_retired), ++ .ngrp = 1, ++ .umasks = knl_inst_retired, ++}, ++{ .name = "CYCLES_DIV_BUSY", ++ .desc = "Counts the number of core cycles when divider is busy.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xcd, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_cycles_div_busy), ++ .ngrp = 1, ++ .umasks = knl_cycles_div_busy, ++}, ++{ .name = "RS_FULL_STALL", ++ .desc = "Counts the number of core cycles when allocation pipeline is stalled.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xcb, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_rs_full_stall), ++ .ngrp = 1, ++ .umasks = 
knl_rs_full_stall, ++}, ++{ .name = "L2_REQUESTS", ++ .desc = "L2 cache requests", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x2e, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_l2_rqsts), ++ .ngrp = 1, ++ .umasks = knl_l2_rqsts, ++}, ++{ .name = "MACHINE_CLEARS", ++ .desc = "Counts the number of times that the machine clears.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc3, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_machine_clears), ++ .ngrp = 1, ++ .umasks = knl_machine_clears, ++}, ++{ .name = "BR_INST_RETIRED", ++ .desc = "Retired branch instructions", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc4, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_br_inst_retired), ++ .flags = INTEL_X86_PEBS, ++ .ngrp = 1, ++ .umasks = knl_br_inst_retired, ++}, ++{ .name = "BR_MISP_RETIRED", ++ .desc = "Counts the number of mispredicted branch instructions retired.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xc5, ++ .flags = INTEL_X86_PEBS, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_br_misp_retired), ++ .ngrp = 1, ++ .umasks = knl_br_misp_retired, ++}, ++{ .name = "MS_DECODED", ++ .desc = "Number of times the MSROM starts a flow of uops.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xe7, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_ms_decoded), ++ .ngrp = 1, ++ .umasks = knl_ms_decoded, ++}, ++{ .name = "FETCH_STALL", ++ .desc = "Counts the number of core cycles the fetch stalls.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x86, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_fetch_stall), ++ .ngrp = 1, ++ .umasks = knl_fetch_stall, ++}, ++{ .name = "BACLEARS", ++ .desc = "Branch address calculator", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xe6, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_baclears), ++ .ngrp = 1, ++ .umasks = knl_baclears, ++}, ++{ .name = "NO_ALLOC_CYCLES", ++ .desc = "Front-end allocation", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0xca, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_no_alloc_cycles), ++ 
.ngrp = 1, ++ .umasks = knl_no_alloc_cycles, ++}, ++{ .name = "CPU_CLK_UNHALTED", ++ .desc = "Core cycles when core is not halted", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x3c, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_cpu_clk_unhalted), ++ .ngrp = 1, ++ .umasks = knl_cpu_clk_unhalted, ++}, ++{ .name = "MEM_UOPS_RETIRED", ++ .desc = "Counts the number of load micro-ops retired.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x4, ++ .flags = INTEL_X86_PEBS, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_mem_uops_retired), ++ .ngrp = 1, ++ .umasks = knl_mem_uops_retired, ++}, ++{ .name = "PAGE_WALKS", ++ .desc = "Number of page-walks executed", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x5, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_page_walks), ++ .ngrp = 1, ++ .umasks = knl_page_walks, ++}, ++{ .name = "L2_REQUESTS_REJECT", ++ .desc = "Counts the number of MEC requests from the L2Q that reference a cache line and were rejected.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x30, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_l2_requests_reject), ++ .ngrp = 1, ++ .umasks = knl_l2_requests_reject, ++}, ++{ .name = "CORE_REJECT_L2Q", ++ .desc = "Number of requests not accepted into the L2Q because of any L2 queue reject condition.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x31, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_core_reject), ++ .ngrp = 1, ++ .umasks = knl_core_reject, ++}, ++{ .name = "RECYCLEQ", ++ .desc = "Counts the number of occurrences a retired load gets blocked.", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0x3, ++ .code = 0x03, ++ .flags = INTEL_X86_PEBS, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_recycleq), ++ .ngrp = 1, ++ .umasks = knl_recycleq, ++}, ++{ .name = "OFFCORE_RESPONSE_0", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x01b7, ++ .flags = INTEL_X86_NHM_OFFCORE, ++ 
.numasks = LIBPFM_ARRAY_SIZE(knl_offcore_response_0), ++ .ngrp = 3, ++ .umasks = knl_offcore_response_0, ++}, ++{ .name = "OFFCORE_RESPONSE_1", ++ .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", ++ .modmsk = INTEL_V2_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x02b7, ++ .flags = INTEL_X86_NHM_OFFCORE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_offcore_response_1), ++ .ngrp = 2, ++ .umasks = knl_offcore_response_1, ++}, ++}; +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index b4547be..f4a56df 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -202,6 +202,7 @@ static pfmlib_pmu_t *pfmlib_pmus[]= + &intel_hswep_unc_r3qpi1_support, + &intel_hswep_unc_r3qpi2_support, + &intel_hswep_unc_irp_support, ++ &intel_knl_support, + &intel_x86_arch_support, /* must always be last for x86 */ + #endif + +diff --git a/lib/pfmlib_intel_knl.c b/lib/pfmlib_intel_knl.c +new file mode 100644 +index 0000000..eb24b96 +--- /dev/null ++++ b/lib/pfmlib_intel_knl.c +@@ -0,0 +1,75 @@ ++/* ++ * pfmlib_intel_knl.c : Intel Knights Landing core PMU ++ * ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. 
++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * Based on Intel Software Optimization Guide 2015 ++ */ ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "events/intel_knl_events.h" ++ ++static const int knl_models[] = { ++ 87, /* knights landing */ ++ 0 ++}; ++ ++static int ++pfm_intel_knl_init(void *this) ++{ ++ pfm_intel_x86_cfg.arch_version = 2; ++ return PFM_SUCCESS; ++} ++ ++pfmlib_pmu_t intel_knl_support={ ++ .desc = "Intel Knights Landing", ++ .name = "knl", ++ .pmu = PFM_PMU_INTEL_KNL, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_pe), ++ .type = PFM_PMU_TYPE_CORE, ++ .num_cntrs = 2, ++ .num_fixed_cntrs = 3, ++ .max_encoding = 2, ++ .pe = intel_knl_pe, ++ .atdesc = intel_x86_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK ++ | INTEL_X86_PMU_FL_ECMASK, ++ .supported_plm = INTEL_X86_PLM, ++ ++ .cpu_family = 6, ++ .cpu_models = knl_models, ++ .pmu_detect = pfm_intel_x86_model_detect, ++ .pmu_init = pfm_intel_knl_init, ++ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), ++ ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid = pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), ++ .get_event_nattrs = 
pfm_intel_x86_get_event_nattrs, ++}; +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index 5cde35c..c49975f 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -353,6 +353,7 @@ extern pfmlib_pmu_t intel_hswep_unc_r3qpi2_support; + extern pfmlib_pmu_t intel_hswep_unc_irp_support; + extern pfmlib_pmu_t intel_knc_support; + extern pfmlib_pmu_t intel_slm_support; ++extern pfmlib_pmu_t intel_knl_support; + extern pfmlib_pmu_t intel_glm_support; + extern pfmlib_pmu_t power4_support; + extern pfmlib_pmu_t ppc970_support; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index 83b8c88..cede40b 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4382,7 +4382,103 @@ static const test_event_t x86_test_events[]={ + .codes[0] = 0x15301ca, + .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", + }, ++ { SRC_LINE, ++ .name = "knl::no_alloc_cycles:all", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x537fca, ++ .fstr = "knl::NO_ALLOC_CYCLES:ALL:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::MEM_UOPS_RETIRED:DTLB_MISS_LOADS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x530804, ++ .fstr = "knl::MEM_UOPS_RETIRED:DTLB_MISS_LOADS:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::uops_retired:any:t", ++ .ret = PFM_ERR_ATTR, ++ }, ++ { SRC_LINE, ++ .name = "knl::unhalted_reference_cycles:u:t", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x710300, ++ .fstr = "knl::UNHALTED_REFERENCE_CYCLES:k=0:u=1:t=1", ++ }, ++ { SRC_LINE, ++ .name = "knl::instructions_retired:k:t", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] =0x7200c0, ++ .fstr = "knl::INSTRUCTION_RETIRED:k=1:u=0:e=0:i=0:c=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "knl::unhalted_core_cycles:k:t", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x72003c, ++ .fstr = "knl::UNHALTED_CORE_CYCLES:k=1:u=0:e=0:i=0:c=0:t=1", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_1:any_request", ++ .ret = PFM_SUCCESS, ++ .count 
= 2, ++ .codes[0] = 0x5302b7, ++ .codes[1] = 0x18000, ++ .fstr = "knl::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_0:any_read", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x132e7, ++ .fstr = "knl::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_1:any_read", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5302b7, ++ .codes[1] = 0x132e7, ++ .fstr = "knl::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_0:any_request:ddr_near", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x80808000ull, ++ .fstr = "knl::OFFCORE_RESPONSE_0:ANY_REQUEST:DDR_NEAR:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_0:any_request:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x1800588000ull, ++ .fstr = "knl::OFFCORE_RESPONSE_0:ANY_REQUEST:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_0:dmnd_data_rd:outstanding", ++ .ret = PFM_SUCCESS, ++ .count = 2, ++ .codes[0] = 0x5301b7, ++ .codes[1] = 0x4000000001ull, ++ .fstr = "knl::OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING:k=1:u=1:e=0:i=0:c=0", ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_0:dmnd_data_rd:ddr_near:outstanding", ++ .ret = PFM_ERR_FEATCOMB, ++ }, ++ { SRC_LINE, ++ .name = "knl::offcore_response_1:dmnd_data_rd:outstanding", ++ .ret = PFM_ERR_ATTR, ++ }, + }; ++ + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) + + static int +-- +2.9.3 + + +From 
d422ba2ed289ba5293c35e11405d0d0ca495d3e9 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 16 Aug 2016 10:08:59 -0700 +Subject: [PATCH] fix Intel Goldmont offcore_response average latency support + +The OUTSTANDING umask is in its own umask group; however, it should +not be the default. Instead, the whole group is optional, so mark +it as such. This avoids issues encoding events such as: +OFFCORE_RESPONSE_0:dmnd_data_rd:l2_hit + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_glm_events.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/lib/events/intel_glm_events.h b/lib/events/intel_glm_events.h +index 78dc5da..4a11b9f 100644 +--- a/lib/events/intel_glm_events.h ++++ b/lib/events/intel_glm_events.h +@@ -519,7 +519,7 @@ static const intel_x86_umask_t glm_offcore_response_0[]={ + { .uname = "OUTSTANDING", + .udesc = "Outstanding request: counts weighted cycles of outstanding offcore requests of the request type specified in the bits 15:0 of offcore_response from the time the XQ receives the request and any response received. Bits 37:16 must be set to 0. This is only available for offcore_response_0", + .ucode = 1ULL << (38 + 8), +- .uflags = INTEL_X86_DFL | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ ++ .uflags = INTEL_X86_GRP_DFL_NONE | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ + .grpid = 3, + .ucntmsk = 0xffull, + }, +-- +2.9.3 + + +From a2348eea45d02dd0e2a22406adb03f858b31a764 Mon Sep 17 00:00:00 2001 +From: Peinan Zhang +Date: Mon, 17 Oct 2016 05:28:44 -0700 +Subject: [PATCH] Add Intel Knights Landing untile PMU support + +This patch adds support for Intel Knights Landing untile (uncore) PMUs.
+ +The patch covers the following PMUs: + - CHA + - EDC + - IMC + - M2PCIE + +Based on the documentation: +Intel Xeon Phi Processor Performance Monitoring Reference Manual Vol2 rev1.0 June2016 +And event table from download.01.org/perfmon/KNL V9. + +Signed-off-by: Peinan Zhang +[yarkhan@icl.utk.edu: Split into core/uncore patches] +Signed-off-by: Asim YarKhan +Reviewed-by: Stephane Eranian +--- + README | 2 +- + include/perfmon/pfmlib.h | 66 ++ + lib/Makefile | 8 + + lib/events/intel_knl_unc_cha_events.h | 1276 ++++++++++++++++++++++++++++++ + lib/events/intel_knl_unc_edc_events.h | 88 +++ + lib/events/intel_knl_unc_imc_events.h | 68 ++ + lib/events/intel_knl_unc_m2pcie_events.h | 145 ++++ + lib/pfmlib_common.c | 63 ++ + lib/pfmlib_intel_knl_unc_cha.c | 103 +++ + lib/pfmlib_intel_knl_unc_edc.c | 111 +++ + lib/pfmlib_intel_knl_unc_imc.c | 101 +++ + lib/pfmlib_intel_knl_unc_m2pcie.c | 80 ++ + lib/pfmlib_intel_snbep_unc.c | 22 + + lib/pfmlib_intel_snbep_unc_priv.h | 3 + + lib/pfmlib_priv.h | 63 ++ + tests/validate_x86.c | 266 +++++++ + 16 files changed, 2464 insertions(+), 1 deletion(-) + create mode 100644 lib/events/intel_knl_unc_cha_events.h + create mode 100644 lib/events/intel_knl_unc_edc_events.h + create mode 100644 lib/events/intel_knl_unc_imc_events.h + create mode 100644 lib/events/intel_knl_unc_m2pcie_events.h + create mode 100644 lib/pfmlib_intel_knl_unc_cha.c + create mode 100644 lib/pfmlib_intel_knl_unc_edc.c + create mode 100644 lib/pfmlib_intel_knl_unc_imc.c + create mode 100644 lib/pfmlib_intel_knl_unc_m2pcie.c + +diff --git a/README b/README +index 287616e..6a49591 100644 +--- a/README ++++ b/README +@@ -55,7 +55,7 @@ The library supports many PMUs. 
The current version can handle: + Intel Goldmont + Intel RAPL (energy consumption) + Intel Knights Corner +- Intel Knights Landing ++ Intel Knights Landing (core, uncore) + Intel architectural perfmon v1, v2, v3 + + - For ARM: +diff --git a/include/perfmon/pfmlib.h b/include/perfmon/pfmlib.h +index b584672..0e370ba 100644 +--- a/include/perfmon/pfmlib.h ++++ b/include/perfmon/pfmlib.h +@@ -301,6 +301,72 @@ typedef enum { + PFM_PMU_INTEL_GLM, /* Intel Goldmont */ + + PFM_PMU_INTEL_KNL, /* Intel Knights Landing */ ++ PFM_PMU_INTEL_KNL_UNC_IMC0, /* Intel KnightLanding IMC channel 0 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC1, /* Intel KnightLanding IMC channel 1 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC2, /* Intel KnightLanding IMC channel 2 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC3, /* Intel KnightLanding IMC channel 3 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC4, /* Intel KnightLanding IMC channel 4 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC5, /* Intel KnightLanding IMC channel 5 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC_UCLK0,/* Intel KnightLanding IMC UCLK unit 0 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_IMC_UCLK1,/* Intel KnightLanding IMC UCLK unit 1 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK0,/* Intel KnightLanding EDC ECLK unit 0 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK1,/* Intel KnightLanding EDC ECLK unit 1 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK2,/* Intel KnightLanding EDC ECLK unit 2 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK3,/* Intel KnightLanding EDC ECLK unit 3 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK4,/* Intel KnightLanding EDC ECLK unit 4 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK5,/* Intel KnightLanding EDC ECLK unit 5 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK6,/* Intel KnightLanding EDC ECLK unit 6 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK7,/* Intel KnightLanding EDC ECLK unit 7 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK0,/* Intel KnightLanding EDC UCLK unit 0 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK1,/* Intel KnightLanding EDC UCLK unit 1 uncore */ ++ 
PFM_PMU_INTEL_KNL_UNC_EDC_UCLK2,/* Intel KnightLanding EDC UCLK unit 2 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK3,/* Intel KnightLanding EDC UCLK unit 3 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK4,/* Intel KnightLanding EDC UCLK unit 4 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK5,/* Intel KnightLanding EDC UCLK unit 5 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK6,/* Intel KnightLanding EDC UCLK unit 6 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK7,/* Intel KnightLanding EDC UCLK unit 7 uncore */ ++ ++ PFM_PMU_INTEL_KNL_UNC_CHA0, /* Intel KnightLanding CHA unit 0 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA1, /* Intel KnightLanding CHA unit 1 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA2, /* Intel KnightLanding CHA unit 2 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA3, /* Intel KnightLanding CHA unit 3 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA4, /* Intel KnightLanding CHA unit 4 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA5, /* Intel KnightLanding CHA unit 5 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA6, /* Intel KnightLanding CHA unit 6 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA7, /* Intel KnightLanding CHA unit 7 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA8, /* Intel KnightLanding CHA unit 8 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA9, /* Intel KnightLanding CHA unit 9 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA10, /* Intel KnightLanding CHA unit 10 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA11, /* Intel KnightLanding CHA unit 11 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA12, /* Intel KnightLanding CHA unit 12 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA13, /* Intel KnightLanding CHA unit 13 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA14, /* Intel KnightLanding CHA unit 14 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA15, /* Intel KnightLanding CHA unit 15 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA16, /* Intel KnightLanding CHA unit 16 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA17, /* Intel KnightLanding CHA unit 17 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA18, /* Intel KnightLanding CHA unit 18 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA19, /* Intel 
KnightLanding CHA unit 19 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA20, /* Intel KnightLanding CHA unit 20 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA21, /* Intel KnightLanding CHA unit 21 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA22, /* Intel KnightLanding CHA unit 22 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA23, /* Intel KnightLanding CHA unit 23 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA24, /* Intel KnightLanding CHA unit 24 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA25, /* Intel KnightLanding CHA unit 25 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA26, /* Intel KnightLanding CHA unit 26 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA27, /* Intel KnightLanding CHA unit 27 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA28, /* Intel KnightLanding CHA unit 28 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA29, /* Intel KnightLanding CHA unit 29 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA30, /* Intel KnightLanding CHA unit 30 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA31, /* Intel KnightLanding CHA unit 31 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA32, /* Intel KnightLanding CHA unit 32 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA33, /* Intel KnightLanding CHA unit 33 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA34, /* Intel KnightLanding CHA unit 34 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA35, /* Intel KnightLanding CHA unit 35 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA36, /* Intel KnightLanding CHA unit 36 uncore */ ++ PFM_PMU_INTEL_KNL_UNC_CHA37, /* Intel KnightLanding CHA unit 37 uncore */ ++ ++ PFM_PMU_INTEL_KNL_UNC_UBOX, /* Intel KnightLanding Ubox uncore */ ++ PFM_PMU_INTEL_KNL_UNC_M2PCIE, /* Intel KnightLanding M2PCIe uncore */ + /* MUST ADD NEW PMU MODELS HERE */ + + PFM_PMU_MAX /* end marker */ +diff --git a/lib/Makefile b/lib/Makefile +index 3c5033f..20fc385 100644 +--- a/lib/Makefile ++++ b/lib/Makefile +@@ -94,6 +94,10 @@ SRCS += pfmlib_amd64.c pfmlib_intel_core.c pfmlib_intel_x86.c \ + pfmlib_intel_knc.c \ + pfmlib_intel_slm.c \ + pfmlib_intel_knl.c \ ++ pfmlib_intel_knl_unc_imc.c \ ++ pfmlib_intel_knl_unc_edc.c \ ++ pfmlib_intel_knl_unc_cha.c \ 
++ pfmlib_intel_knl_unc_m2pcie.c \ + pfmlib_intel_glm.c \ + pfmlib_intel_netburst.c \ + pfmlib_amd64_k7.c pfmlib_amd64_k8.c pfmlib_amd64_fam10h.c \ +@@ -271,6 +275,10 @@ INC_X86= pfmlib_intel_x86_priv.h \ + events/intel_hswep_unc_r2pcie_events.h \ + events/intel_hswep_unc_r3qpi_events.h \ + events/intel_hswep_unc_irp_events.h \ ++ events/intel_knl_unc_imc_events.h \ ++ events/intel_knl_unc_edc_events.h \ ++ events/intel_knl_unc_cha_events.h \ ++ events/intel_knl_unc_m2pcie_events.h \ + events/intel_slm_events.h + + INC_MIPS=events/mips_74k_events.h events/mips_74k_events.h +diff --git a/lib/events/intel_knl_unc_cha_events.h b/lib/events/intel_knl_unc_cha_events.h +new file mode 100644 +index 0000000..11ace65 +--- /dev/null ++++ b/lib/events/intel_knl_unc_cha_events.h +@@ -0,0 +1,1276 @@ ++/* ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: knl_unc_cha (Intel Knights Landing CHA uncore PMU) ++ */ ++ ++static const intel_x86_umask_t knl_unc_cha_llc_lookup[]={ ++ { .uname = "DATA_READ", ++ .udesc = "Data read requests", ++ .ucode = 0x0300, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WRITE", ++ .udesc = "Write requests. Includes all write transactions (cached, uncached)", ++ .ucode = 0x0500, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "REMOTE_SNOOP", ++ .udesc = "External snoop request", ++ .ucode = 0x0900, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ANY", ++ .udesc = "Any request", ++ .ucode = 0x1100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_llc_victims[]={ ++ { .uname = "M_STATE", ++ .udesc = "Lines in M state", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "E_STATE", ++ .udesc = "Lines in E state", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "S_STATE", ++ .udesc = "Lines in S state", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "F_STATE", ++ .udesc = "Lines in F state", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "LOCAL", ++ .udesc = "Victimized lines matching the NID filter.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "REMOTE", ++ .udesc = "Victimized lines not matching the NID filter.", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++ ++static const intel_x86_umask_t knl_unc_cha_ingress_int_starved[]={ ++ { .uname = "IRQ", ++ .udesc = "Internal starved with IRQ.", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IPQ", ++ .udesc = "Internal starved with IPQ.", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ISMQ", ++ .udesc = "Internal starved with ISMQ.", ++ .ucode = 0x0800, ++ 
.uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ", ++ .udesc = "Internal starved with PRQ.", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ingress_ext[]={ ++ { .uname = "IRQ", ++ .udesc = "IRQ", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IRQ_REJ", ++ .udesc = "IRQ rejected", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IPQ", ++ .udesc = "IPQ", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ", ++ .udesc = "PRQ", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ_REJ", ++ .udesc = "PRQ rejected", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++ ++static const intel_x86_umask_t knl_unc_cha_ingress_entry_reject_q0[]={ ++ { .uname = "AD_REQ_VN0", ++ .udesc = "AD Request", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_RSP_VN0", ++ .udesc = "AD Response", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_RSP_VN0", ++ .udesc = "BL Response", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_WB_VN0", ++ .udesc = "BL WB", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_NCB_VN0", ++ .udesc = "BL NCB", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_NCS_VN0", ++ .udesc = "BL NCS", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK_NON_UPI", ++ .udesc = "AK non upi", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV_NON_UPI", ++ .udesc = "IV non upi", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ingress_entry_reject_q1[]={ ++ { .uname = "ANY_REJECT", ++ .udesc = "Any reject from request queue0", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++ { .uname = "SF_VICTIM", ++ .udesc = "SF victim", ++ .ucode = 0x0800, ++ 
.uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "SF_WAY", ++ .udesc = "SF way", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ALLOW_SNP", ++ .udesc = "allow snoop", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PA_MATCH", ++ .udesc = "PA match", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++ ++static const intel_x86_umask_t knl_unc_cha_tor_subevent[]={ ++ { .uname = "IRQ", ++ .udesc = " -IRQ.", ++ .ucode = 0x3100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "EVICT", ++ .udesc = " -SF/LLC Evictions.", ++ .ucode = 0x3200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ", ++ .udesc = " -PRQ.", ++ .ucode = 0x3400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IPQ", ++ .udesc = " -IPQ.", ++ .ucode = 0x3800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HIT", ++ .udesc = " -Hit (Not a Miss).", ++ .ucode = 0x1f00, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "MISS", ++ .udesc = " -Miss.", ++ .ucode = 0x2f00, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IRQ_HIT", ++ .udesc = " -IRQ HIT.", ++ .ucode = 0x1100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IRQ_MISS", ++ .udesc = " -IRQ MISS.", ++ .ucode = 0x2100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ_HIT", ++ .udesc = " -PRQ HIT.", ++ .ucode = 0x1400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "PRQ_MISS", ++ .udesc = " -PRQ MISS.", ++ .ucode = 0x2400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IPQ_HIT", ++ .udesc = " -IPQ HIT", ++ .ucode = 0x1800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IPQ_MISS", ++ .udesc = " -IPQ MISS", ++ .ucode = 0x2800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_misc[]={ ++ { .uname = "RSPI_WAS_FSE", ++ .udesc = "Silent Snoop Eviction", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "WC_ALIASING", ++ .udesc = "Write Combining Aliasing.", ++ .ucode = 0x0200, ++ .uflags = 
INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RFO_HIT_S", ++ .udesc = "Counts the number of times that an RFO hits in S state.", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CV0_PREF_VIC", ++ .udesc = "CV0 Prefetch Victim.", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "CV0_PREF_MISS", ++ .udesc = "CV0 Prefetch Miss.", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_tgr_ext[]={ ++ { .uname = "TGR0", ++ .udesc = "for Transgress 0", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR1", ++ .udesc = "for Transgress 1", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR2", ++ .udesc = "for Transgress 2", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR3", ++ .udesc = "for Transgress 3", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR4", ++ .udesc = "for Transgress 4", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR5", ++ .udesc = "for Transgress 5", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR6", ++ .udesc = "for Transgress 6", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "TGR7", ++ .udesc = "for Transgress 7", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_tgr_ext1[]={ ++ { .uname = "TGR8", ++ .udesc = "for Transgress 8", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "ANY_OF_TGR0_THRU_TGR7", ++ .udesc = "for Transgress 0-7", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_type_agent[]={ ++ { .uname = "AD_AG0", ++ .udesc = "AD - Agent 0", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK_AG0", ++ .udesc = "AK - Agent 0", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_AG0", 
++ .udesc = "BL - Agent 0", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV_AG0", ++ .udesc = "IV - Agent 0", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_AG1", ++ .udesc = "AD - Agent 1", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK_AG1", ++ .udesc = "AK - Agent 1", ++ .ucode = 0x2000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_AG1", ++ .udesc = "BL - Agent 1", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_type[]={ ++ { .uname = "AD", ++ .udesc = " - AD ring", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK", ++ .udesc = " - AK ring", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL", ++ .udesc = " - BL ring", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV", ++ .udesc = " - IV ring", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_dire_ext[]={ ++ { .uname = "VERT", ++ .udesc = " - vertical", ++ .ucode = 0x0000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "HORZ", ++ .udesc = " - horizontal", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_use_vert[]={ ++ { .uname = "UP_EVEN", ++ .udesc = "UP_EVEN", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "UP_ODD", ++ .udesc = "UP_ODD", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DN_EVEN", ++ .udesc = "DN_EVEN", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DN_ODD", ++ .udesc = "DN_ODD", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_use_hori[]={ ++ { .uname = "LEFT_EVEN", ++ .udesc = "LEFT_EVEN", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "LEFT_ODD", ++ .udesc = "LEFT_ODD", ++ 
.ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RIGHT_EVEN", ++ .udesc = "RIGHT_EVEN", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RIGHT_ODD", ++ .udesc = "RIGHT_ODD", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_use_updn[]={ ++ { .uname = "UP", ++ .udesc = "up", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "DN", ++ .udesc = "down", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_ring_use_lfrt[]={ ++ { .uname = "LEFT", ++ .udesc = "left", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "RIGHT", ++ .udesc = "right", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_iv_snp[]={ ++ { .uname = "IV_SNP_GO_UP", ++ .udesc = "IV_SNP_GO_UP", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV_SNP_GO_DN", ++ .udesc = "IV_SNP_GO_DN", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_cms_ext[]={ ++ { .uname = "AD_BNC", ++ .udesc = "AD_BNC", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK_BNC", ++ .udesc = "AK_BNC", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_BNC", ++ .udesc = "BL_BNC", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV_BNC", ++ .udesc = "IV_BNC", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_CRD", ++ .udesc = "AD_CRD", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_CRD", ++ .udesc = "BL_CRD", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_cms_crd_starved[]={ ++ { .uname = "AD_BNC", ++ .udesc = "AD_BNC", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AK_BNC", ++ .udesc = 
"AK_BNC", ++ .ucode = 0x0200, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_BNC", ++ .udesc = "BL_BNC", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IV_BNC", ++ .udesc = "IV_BNC", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_CRD", ++ .udesc = "AD_CRD", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_CRD", ++ .udesc = "BL_CRD", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "IVF", ++ .udesc = "IVF", ++ .ucode = 0x8000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_cha_cms_busy_starved[]={ ++ { .uname = "AD_BNC", ++ .udesc = "AD_BNC", ++ .ucode = 0x0100, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_BNC", ++ .udesc = "BL_BNC", ++ .ucode = 0x0400, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "AD_CRD", ++ .udesc = "AD_CRD", ++ .ucode = 0x1000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++ { .uname = "BL_CRD", ++ .udesc = "BL_CRD", ++ .ucode = 0x4000, ++ .uflags = INTEL_X86_NCOMBO, ++ }, ++}; ++ ++static const intel_x86_entry_t intel_knl_unc_cha_pe[]={ ++ { .name = "UNC_H_U_CLOCKTICKS", ++ .desc = "Uncore clockticks", ++ .modmsk = 0x0, ++ .cntmsk = 0xf, ++ .code = 0x00, ++ .flags = INTEL_X86_FIXED, ++ }, ++ { .name = "UNC_H_INGRESS_OCCUPANCY", ++ .desc = "Ingress Occupancy. Counts number of entries in the specified Ingress queue in each cycle", ++ .cntmsk = 0xf, ++ .code = 0x11, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_ext), ++ .umasks = knl_unc_cha_ingress_ext, ++ }, ++ { .name = "UNC_H_INGRESS_INSERTS", ++ .desc = "Ingress Allocations. 
Counts number of allocations per cycle into the specified Ingress queue", ++ .cntmsk = 0xf, ++ .code = 0x13, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_ext), ++ .umasks = knl_unc_cha_ingress_ext, ++ }, ++ { .name = "UNC_H_INGRESS_INT_STARVED", ++ .desc = "Cycles Internal Starvation", ++ .cntmsk = 0xf, ++ .code = 0x14, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_int_starved), ++ .umasks = knl_unc_cha_ingress_int_starved, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_IRQ0_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x18, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_IRQ01_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x19, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), ++ .umasks = knl_unc_cha_ingress_entry_reject_q1, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_PRQ0_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x20, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_PRQ1_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x21, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), ++ .umasks = knl_unc_cha_ingress_entry_reject_q1, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_IPQ0_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x22, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = 
LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_IPQ1_REJECT", ++ .desc = "Ingress Request Queue Rejects", ++ .cntmsk = 0xf, ++ .code = 0x23, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), ++ .umasks = knl_unc_cha_ingress_entry_reject_q1, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_ISMQ0_REJECT", ++ .desc = "ISMQ Rejects", ++ .cntmsk = 0xf, ++ .code = 0x24, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_REQ_Q0_RETRY", ++ .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", ++ .cntmsk = 0xf, ++ .code = 0x2a, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_REQ_Q1_RETRY", ++ .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", ++ .cntmsk = 0xf, ++ .code = 0x2b, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), ++ .umasks = knl_unc_cha_ingress_entry_reject_q1, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_ISMQ0_RETRY", ++ .desc = "ISMQ retries", ++ .cntmsk = 0xf, ++ .code = 0x2c, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = "UNC_H_INGRESS_RETRY_OTHER0_RETRY", ++ .desc = "Other Queue Retries", ++ .cntmsk = 0xf, ++ .code = 0x2e, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), ++ .umasks = knl_unc_cha_ingress_entry_reject_q0, ++ }, ++ { .name = 
"UNC_H_INGRESS_RETRY_OTHER1_RETRY", ++ .desc = "Other Queue Retries", ++ .cntmsk = 0xf, ++ .code = 0x2f, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), ++ .umasks = knl_unc_cha_ingress_entry_reject_q1, ++ }, ++ { .name = "UNC_H_SF_LOOKUP", ++ .desc = "Cache Lookups. Counts the number of times the LLC was accessed.", ++ .cntmsk = 0xf, ++ .code = 0x34, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_llc_lookup), ++ .umasks = knl_unc_cha_llc_lookup, ++ }, ++ { .name = "UNC_H_CACHE_LINES_VICTIMIZED", ++ .desc = "Cache Lookups. Counts the number of times the LLC was accessed.", ++ .cntmsk = 0xf, ++ .code = 0x37, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_llc_victims), ++ .umasks = knl_unc_cha_llc_victims, ++ }, ++ { .name = "UNC_H_TOR_INSERTS", ++ .desc = "Counts the number of entries successfuly inserted into the TOR that match qualifications specified by the subevent.", ++ .modmsk = KNL_UNC_CHA_TOR_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x35, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tor_subevent), ++ .umasks = knl_unc_cha_tor_subevent ++ }, ++ { .name = "UNC_H_TOR_OCCUPANCY", ++ .desc = "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent", ++ .modmsk = KNL_UNC_CHA_TOR_ATTRS, ++ .cntmsk = 0xf, ++ .code = 0x36, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tor_subevent), ++ .umasks = knl_unc_cha_tor_subevent ++ }, ++ { .name = "UNC_H_MISC", ++ .desc = "Miscellaneous events in the Cha", ++ .cntmsk = 0xf, ++ .code = 0x39, ++ .ngrp = 1, ++ .flags = INTEL_X86_NO_AUTOENCODE, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_misc), ++ .umasks = knl_unc_cha_misc, ++ }, ++ { .name = "UNC_H_AG0_AD_CRD_ACQUIRED", ++ .desc = "CMS Agent0 AD 
Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x80, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG0_AD_CRD_ACQUIRED_EXT", ++ .desc = "CMS Agent0 AD Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x81, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG0_AD_CRD_OCCUPANCY", ++ .desc = "CMS Agent0 AD Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x82, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG0_AD_CRD_OCCUPANCY_EXT", ++ .desc = "CMS Agent0 AD Credits Acquired For Transgress.", ++ .cntmsk = 0xf, ++ .code = 0x83, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_AD_CRD_ACQUIRED", ++ .desc = "CMS Agent1 AD Credits Acquired .", ++ .cntmsk = 0xf, ++ .code = 0x84, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_AD_CRD_ACQUIRED_EXT", ++ .desc = "CMS Agent1 AD Credits Acquired .", ++ .cntmsk = 0xf, ++ .code = 0x85, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_AD_CRD_OCCUPANCY", ++ .desc = "CMS Agent1 AD Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x86, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_AD_CRD_OCCUPANCY_EXT", ++ .desc = "CMS Agent1 AD Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x87, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG0_BL_CRD_ACQUIRED", ++ .desc = "CMS Agent0 BL Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x88, ++ .ngrp = 1, ++ .numasks = 
LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG0_BL_CRD_ACQUIRED_EXT", ++ .desc = "CMS Agent0 BL Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x89, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG0_BL_CRD_OCCUPANCY", ++ .desc = "CMS Agent0 BL Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x8a, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG0_BL_CRD_OCCUPANCY_EXT", ++ .desc = "CMS Agent0 BL Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x8b, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_BL_CRD_ACQUIRED", ++ .desc = "CMS Agent1 BL Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x8c, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_BL_CRD_ACQUIRED_EXT", ++ .desc = "CMS Agent1 BL Credits Acquired.", ++ .cntmsk = 0xf, ++ .code = 0x8d, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_BL_CRD_OCCUPANCY", ++ .desc = "CMS Agent1 BL Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x8e, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_BL_CRD_OCCUPANCY_EXT", ++ .desc = "CMS Agent1 BL Credits Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x8f, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_AD", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD0, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = 
"UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_AD_EXT", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD1, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_AD", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD2, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_AD_EXT", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD3, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_BL", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD4, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_BL_EXT", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD5, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_BL", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD6, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), ++ .umasks = knl_unc_cha_tgr_ext, ++ }, ++ { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_BL_EXT", ++ .desc = "Stall on No AD Transgress Credits.", ++ .cntmsk = 0xf, ++ .code = 0xD7, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), ++ .umasks = knl_unc_cha_tgr_ext1, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_OCCUPANCY", ++ .desc = "CMS Vert Egress Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x90, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = 
"UNC_H_EGRESS_VERT_INSERTS", ++ .desc = "CMS Vert Egress Allocations.", ++ .cntmsk = 0xf, ++ .code = 0x91, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_CYCLES_FULL", ++ .desc = "Cycles CMS Vertical Egress Queue Is Full.", ++ .cntmsk = 0xf, ++ .code = 0x92, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_CYCLES_NE", ++ .desc = "Cycles CMS Vertical Egress Queue Is Not Empty.", ++ .cntmsk = 0xf, ++ .code = 0x93, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_NACK", ++ .desc = "CMS Vertical Egress NACKs.", ++ .cntmsk = 0xf, ++ .code = 0x98, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_STARVED", ++ .desc = "CMS Vertical Egress Injection Starvation.", ++ .cntmsk = 0xf, ++ .code = 0x9a, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_ADS_USED", ++ .desc = "CMS Vertical ADS Used.", ++ .cntmsk = 0xf, ++ .code = 0x9c, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_VERT_BYPASS", ++ .desc = "CMS Vertical Egress Bypass.", ++ .cntmsk = 0xf, ++ .code = 0x9e, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), ++ .umasks = knl_unc_cha_ring_type_agent, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_OCCUPANCY", ++ .desc = "CMS Horizontal Egress Occupancy.", ++ .cntmsk = 0xf, ++ .code = 0x94, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = 
"UNC_H_EGRESS_HORZ_INSERTS", ++ .desc = "CMS Horizontal Egress Inserts.", ++ .cntmsk = 0xf, ++ .code = 0x95, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_CYCLES_FULL", ++ .desc = "Cycles CMS Horizontal Egress Queue is Full.", ++ .cntmsk = 0xf, ++ .code = 0x96, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_CYCLES_NE", ++ .desc = "Cycles CMS Horizontal Egress Queue is Not Empty.", ++ .cntmsk = 0xf, ++ .code = 0x97, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_NACK", ++ .desc = "CMS Horizontal Egress NACKs.", ++ .cntmsk = 0xf, ++ .code = 0x99, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_STARVED", ++ .desc = "CMS Horizontal Egress Injection Starvation.", ++ .cntmsk = 0xf, ++ .code = 0x9b, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_ADS_USED", ++ .desc = "CMS Horizontal ADS Used.", ++ .cntmsk = 0xf, ++ .code = 0x9d, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_EGRESS_HORZ_BYPASS", ++ .desc = "CMS Horizontal Egress Bypass.", ++ .cntmsk = 0xf, ++ .code = 0x9f, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_RING_BOUNCES_VERT", ++ .desc = "Number of incoming messages from the Vertical ring that were bounced, by ring type.", ++ .cntmsk = 0xf, ++ .code = 0xa0, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_RING_BOUNCES_HORZ", ++ .desc = 
"Number of incoming messages from the Horizontal ring that were bounced, by ring type.", ++ .cntmsk = 0xf, ++ .code = 0xa1, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_RING_SINK_STARVED_VERT", ++ .desc = "Vertical ring sink starvation count.", ++ .cntmsk = 0xf, ++ .code = 0xa2, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_RING_SINK_STARVED_HORZ", ++ .desc = "Horizontal ring sink starvation count.", ++ .cntmsk = 0xf, ++ .code = 0xa3, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), ++ .umasks = knl_unc_cha_ring_type, ++ }, ++ { .name = "UNC_H_RING_SRC_THRT", ++ .desc = "Counts cycles in throttle mode.", ++ .cntmsk = 0xf, ++ .code = 0xa4, ++ }, ++ { .name = "UNC_H_FAST_ASSERTED", ++ .desc = "Counts cycles source throttling is adderted", ++ .cntmsk = 0xf, ++ .code = 0xa5, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_dire_ext), ++ .umasks = knl_unc_cha_dire_ext, ++ }, ++ { .name = "UNC_H_VERT_RING_AD_IN_USE", ++ .desc = "Counts the number of cycles that the Vertical AD ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xa6, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), ++ .umasks = knl_unc_cha_ring_use_vert, ++ }, ++ { .name = "UNC_H_HORZ_RING_AD_IN_USE", ++ .desc = "Counts the number of cycles that the Horizontal AD ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xa7, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), ++ .umasks = knl_unc_cha_ring_use_hori, ++ }, ++ { .name = "UNC_H_VERT_RING_AK_IN_USE", ++ .desc = "Counts the number of cycles that the Vertical AK ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xa8, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), ++ .umasks = knl_unc_cha_ring_use_vert, ++ }, ++ { .name = 
"UNC_H_HORZ_RING_AK_IN_USE", ++ .desc = "Counts the number of cycles that the Horizontal AK ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xa9, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), ++ .umasks = knl_unc_cha_ring_use_hori, ++ }, ++ { .name = "UNC_H_VERT_RING_BL_IN_USE", ++ .desc = "Counts the number of cycles that the Vertical BL ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xaa, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), ++ .umasks = knl_unc_cha_ring_use_vert, ++ }, ++ { .name = "UNC_H_HORZ_RING_BL_IN_USE", ++ .desc = "Counts the number of cycles that the Horizontal BL ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xab, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), ++ .umasks = knl_unc_cha_ring_use_hori, ++ }, ++ { .name = "UNC_H_VERT_RING_IV_IN_USE", ++ .desc = "Counts the number of cycles that the Vertical IV ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xac, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_updn), ++ .umasks = knl_unc_cha_ring_use_updn, ++ }, ++ { .name = "UNC_H_HORZ_RING_IV_IN_USE", ++ .desc = "Counts the number of cycles that the Horizontal IV ring is being used at this ring stop.", ++ .cntmsk = 0xf, ++ .code = 0xad, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_lfrt), ++ .umasks = knl_unc_cha_ring_use_lfrt, ++ }, ++ { .name = "UNC_H_EGRESS_ORDERING", ++ .desc = "Counts number of cycles IV was blocked in the TGR Egress due to SNP/GO Ordering requirements.", ++ .cntmsk = 0xf, ++ .code = 0xae, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_iv_snp), ++ .umasks = knl_unc_cha_iv_snp, ++ }, ++ { .name = "UNC_H_TG_INGRESS_OCCUPANCY", ++ .desc = "Transgress Ingress Occupancy. 
Occupancy event for the Ingress buffers in the CMS. The Ingress is used to queue up requests received from the mesh.", ++ .cntmsk = 0xf, ++ .code = 0xb0, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), ++ .umasks = knl_unc_cha_cms_ext, ++ }, ++ { .name = "UNC_H_TG_INGRESS_INSERTS", ++ .desc = "Transgress Ingress Allocations. Number of allocations into the CMS Ingress. The Ingress is used to queue up requests received from the mesh.", ++ .cntmsk = 0xf, ++ .code = 0xb1, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), ++ .umasks = knl_unc_cha_cms_ext, ++ }, ++ { .name = "UNC_H_TG_INGRESS_BYPASS", ++ .desc = "Transgress Ingress Bypass. Number of packets bypassing the CMS Ingress.", ++ .cntmsk = 0xf, ++ .code = 0xb2, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), ++ .umasks = knl_unc_cha_cms_ext, ++ }, ++ { .name = "UNC_H_TG_INGRESS_CRD_STARVED", ++ .desc = "Transgress Injection Starvation. Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. In this case, the Ingress is unable to forward to the Egress due to a lack of credit.", ++ .cntmsk = 0xf, ++ .code = 0xb3, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_crd_starved), ++ .umasks = knl_unc_cha_cms_crd_starved, ++ }, ++ { .name = "UNC_H_TG_INGRESS_BUSY_STARVED", ++ .desc = "Transgress Injection Starvation. Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time.
In this case, because a message from the other queue has higher priority.", ++ .cntmsk = 0xf, ++ .code = 0xb4, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_busy_starved), ++ .umasks = knl_unc_cha_cms_busy_starved, ++ }, ++}; +diff --git a/lib/events/intel_knl_unc_edc_events.h b/lib/events/intel_knl_unc_edc_events.h +new file mode 100644 +index 0000000..3cbd154 +--- /dev/null ++++ b/lib/events/intel_knl_unc_edc_events.h +@@ -0,0 +1,88 @@ ++/* ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. 
++ * ++ * PMU: knl_unc_edc (Intel Knights Landing EDC_UCLK, EDC_ECLK uncore PMUs) ++ */ ++ ++static const intel_x86_umask_t knl_unc_edc_uclk_access_count[]={ ++ { .uname = "HIT_CLEAN", ++ .udesc = "Hit E", ++ .ucode = 0x0100, ++ }, ++ { .uname = "HIT_DIRTY", ++ .udesc = "Hit M", ++ .ucode = 0x0200, ++ }, ++ { .uname = "MISS_CLEAN", ++ .udesc = "Miss E", ++ .ucode = 0x0400, ++ }, ++ { .uname = "MISS_DIRTY", ++ .udesc = "Miss M", ++ .ucode = 0x0800, ++ }, ++ { .uname = "MISS_INVALID", ++ .udesc = "Miss I", ++ .ucode = 0x1000, ++ }, ++ { .uname = "MISS_GARBAGE", ++ .udesc = "Miss G", ++ .ucode = 0x2000, ++ }, ++}; ++ ++ ++static const intel_x86_entry_t intel_knl_unc_edc_uclk_pe[]={ ++ { .name = "UNC_E_U_CLOCKTICKS", ++ .desc = "EDC UCLK clockticks (generic counters)", ++ .code = 0x00, /*encoding for generic counters */ ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_E_EDC_ACCESS", ++ .desc = "Number of EDC Access Hits or Misses.", ++ .code = 0x02, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_edc_uclk_access_count), ++ .umasks = knl_unc_edc_uclk_access_count ++ }, ++}; ++ ++static const intel_x86_entry_t intel_knl_unc_edc_eclk_pe[]={ ++ { .name = "UNC_E_E_CLOCKTICKS", ++ .desc = "EDC ECLK clockticks (generic counters)", ++ .code = 0x00, /*encoding for generic counters */ ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_E_RPQ_INSERTS", ++ .desc = "Counts total number of EDC RPQ inserts", ++ .code = 0x0101, ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_E_WPQ_INSERTS", ++ .desc = "Counts total number of EDC WPQ inserts", ++ .code = 0x0102, ++ .cntmsk = 0xf, ++ }, ++}; +diff --git a/lib/events/intel_knl_unc_imc_events.h b/lib/events/intel_knl_unc_imc_events.h +new file mode 100644 +index 0000000..cc0aa78 +--- /dev/null ++++ b/lib/events/intel_knl_unc_imc_events.h +@@ -0,0 +1,68 @@ ++/* ++ * Copyright (c) 2016 Intel Corp.
All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: knl_unc_imc (Intel Knights Landing IMC uncore PMU) ++ */ ++ ++static const intel_x86_umask_t knl_unc_m_cas_count[]={ ++ { .uname = "ALL", ++ .udesc = "Counts total number of DRAM CAS commands issued on this channel", ++ .ucode = 0x0300, ++ }, ++ { .uname = "RD", ++ .udesc = "Counts all DRAM reads on this channel, incl. 
underfills", ++ .ucode = 0x0100, ++ }, ++ { .uname = "WR", ++ .udesc = "Counts number of DRAM write CAS commands on this channel", ++ .ucode = 0x0200, ++ }, ++}; ++ ++ ++static const intel_x86_entry_t intel_knl_unc_imc_pe[]={ ++ { .name = "UNC_M_D_CLOCKTICKS", ++ .desc = "IMC Uncore DCLK counts", ++ .code = 0x00, /*encoding for generic counters */ ++ .cntmsk = 0xf, ++ }, ++ { .name = "UNC_M_CAS_COUNT", ++ .desc = "DRAM RD_CAS and WR_CAS Commands.", ++ .code = 0x03, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m_cas_count), ++ .umasks = knl_unc_m_cas_count, ++ }, ++}; ++ ++static const intel_x86_entry_t intel_knl_unc_imc_uclk_pe[]={ ++ { .name = "UNC_M_U_CLOCKTICKS", ++ .desc = "IMC UCLK counts", ++ .code = 0x00, /*encoding for generic counters */ ++ .cntmsk = 0xf, ++ }, ++}; ++ ++ +diff --git a/lib/events/intel_knl_unc_m2pcie_events.h b/lib/events/intel_knl_unc_m2pcie_events.h +new file mode 100644 +index 0000000..7c17c2c +--- /dev/null ++++ b/lib/events/intel_knl_unc_m2pcie_events.h +@@ -0,0 +1,145 @@ ++/* ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ * ++ * This file is part of libpfm, a performance monitoring support library for ++ * applications on Linux. ++ * ++ * PMU: knl_unc_m2pcie (Intel Knights Landing M2PCIe uncore) ++ */ ++ ++ ++static const intel_x86_umask_t knl_unc_m2p_ingress_cycles_ne[]={ ++ { .uname = "CBO_IDI", ++ .udesc = "CBO_IDI", ++ .ucode = 0x0100, ++ }, ++ { .uname = "CBO_NCB", ++ .udesc = "CBO_NCB", ++ .ucode = 0x0200, ++ }, ++ { .uname = "CBO_NCS", ++ .udesc = "CBO_NCS", ++ .ucode = 0x0400, ++ }, ++ { .uname = "ALL", ++ .udesc = "All", ++ .ucode = 0x0800, ++ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, ++ }, ++}; ++ ++ ++static const intel_x86_umask_t knl_unc_m2p_egress_cycles[]={ ++ { .uname = "AD_0", ++ .udesc = "AD_0", ++ .ucode = 0x0100, ++ }, ++ { .uname = "AK_0", ++ .udesc = "AK_0", ++ .ucode = 0x0200, ++ }, ++ { .uname = "BL_0", ++ .udesc = "BL_0", ++ .ucode = 0x0400, ++ }, ++ { .uname = "AD_1", ++ .udesc = "AD_1", ++ .ucode = 0x0800, ++ }, ++ { .uname = "AK_1", ++ .udesc = "AK_1", ++ .ucode = 0x1000, ++ }, ++ { .uname = "BL_1", ++ .udesc = "BL_1", ++ .ucode = 0x2000, ++ }, ++}; ++ ++static const intel_x86_umask_t knl_unc_m2p_egress_inserts[]={ ++ { .uname = "AD_0", ++ .udesc = "AD_0", ++ .ucode = 0x0100, ++ }, ++ { .uname = "AK_0", ++ .udesc = "AK_0", ++ .ucode = 0x0200, ++ }, ++ { .uname = "BL_0", ++ .udesc = "BL_0", ++ .ucode = 0x0400, ++ }, ++ { .uname = "AK_CRD_0", ++ .udesc = "AK_CRD_0", ++ .ucode = 0x0800, ++ }, ++ { .uname = "AD_1", ++ .udesc = "AD_1", ++ .ucode = 0x1000, ++ }, ++ { .uname = "AK_1", ++ .udesc = "AK_1", ++ .ucode = 0x2000, ++ }, ++ { .uname = "BL_1", ++ .udesc = "BL_1", ++ .ucode = 0x4000, ++ }, ++ { .uname = "AK_CRD_1", ++ .udesc = "AK_CRD_1", ++ .ucode = 0x8000, ++ }, ++}; ++ ++static 
const intel_x86_entry_t intel_knl_unc_m2pcie_pe[]={ ++ { .name = "UNC_M2P_INGRESS_CYCLES_NE", ++ .desc = "Ingress Queue Cycles Not Empty. Counts the number of cycles when the M2PCIe Ingress is not empty", ++ .code = 0x10, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_ingress_cycles_ne), ++ .umasks = knl_unc_m2p_ingress_cycles_ne ++ }, ++ { .name = "UNC_M2P_EGRESS_CYCLES_NE", ++ .desc = "Egress (to CMS) Cycles Not Empty. Counts the number of cycles when the M2PCIe Egress is not empty", ++ .code = 0x23, ++ .cntmsk = 0x3, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_cycles), ++ .umasks = knl_unc_m2p_egress_cycles ++ }, ++ { .name = "UNC_M2P_EGRESS_INSERTS", ++ .desc = "Egress (to CMS) Inserts. Counts the number of messages inserted into the M2PCIe Egress queue", ++ .code = 0x24, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_inserts), ++ .umasks = knl_unc_m2p_egress_inserts ++ }, ++ { .name = "UNC_M2P_EGRESS_CYCLES_FULL", ++ .desc = "Egress (to CMS) Cycles Full.
Counts the number of cycles when the M2PCIe Egress is full", ++ .code = 0x25, ++ .cntmsk = 0xf, ++ .ngrp = 1, ++ .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_cycles), ++ .umasks = knl_unc_m2p_egress_cycles ++ }, ++}; +diff --git a/lib/pfmlib_common.c b/lib/pfmlib_common.c +index f4a56df..cff4d2e 100644 +--- a/lib/pfmlib_common.c ++++ b/lib/pfmlib_common.c +@@ -203,6 +203,69 @@ static pfmlib_pmu_t *pfmlib_pmus[]= + &intel_hswep_unc_r3qpi2_support, + &intel_hswep_unc_irp_support, + &intel_knl_support, ++ &intel_knl_unc_imc0_support, ++ &intel_knl_unc_imc1_support, ++ &intel_knl_unc_imc2_support, ++ &intel_knl_unc_imc3_support, ++ &intel_knl_unc_imc4_support, ++ &intel_knl_unc_imc5_support, ++ &intel_knl_unc_imc_uclk0_support, ++ &intel_knl_unc_imc_uclk1_support, ++ &intel_knl_unc_edc_uclk0_support, ++ &intel_knl_unc_edc_uclk1_support, ++ &intel_knl_unc_edc_uclk2_support, ++ &intel_knl_unc_edc_uclk3_support, ++ &intel_knl_unc_edc_uclk4_support, ++ &intel_knl_unc_edc_uclk5_support, ++ &intel_knl_unc_edc_uclk6_support, ++ &intel_knl_unc_edc_uclk7_support, ++ &intel_knl_unc_edc_eclk0_support, ++ &intel_knl_unc_edc_eclk1_support, ++ &intel_knl_unc_edc_eclk2_support, ++ &intel_knl_unc_edc_eclk3_support, ++ &intel_knl_unc_edc_eclk4_support, ++ &intel_knl_unc_edc_eclk5_support, ++ &intel_knl_unc_edc_eclk6_support, ++ &intel_knl_unc_edc_eclk7_support, ++ &intel_knl_unc_cha0_support, ++ &intel_knl_unc_cha1_support, ++ &intel_knl_unc_cha2_support, ++ &intel_knl_unc_cha3_support, ++ &intel_knl_unc_cha4_support, ++ &intel_knl_unc_cha5_support, ++ &intel_knl_unc_cha6_support, ++ &intel_knl_unc_cha7_support, ++ &intel_knl_unc_cha8_support, ++ &intel_knl_unc_cha9_support, ++ &intel_knl_unc_cha10_support, ++ &intel_knl_unc_cha11_support, ++ &intel_knl_unc_cha12_support, ++ &intel_knl_unc_cha13_support, ++ &intel_knl_unc_cha14_support, ++ &intel_knl_unc_cha15_support, ++ &intel_knl_unc_cha16_support, ++ &intel_knl_unc_cha17_support, ++ &intel_knl_unc_cha18_support, ++ 
&intel_knl_unc_cha19_support, ++ &intel_knl_unc_cha20_support, ++ &intel_knl_unc_cha21_support, ++ &intel_knl_unc_cha22_support, ++ &intel_knl_unc_cha23_support, ++ &intel_knl_unc_cha24_support, ++ &intel_knl_unc_cha25_support, ++ &intel_knl_unc_cha26_support, ++ &intel_knl_unc_cha27_support, ++ &intel_knl_unc_cha28_support, ++ &intel_knl_unc_cha29_support, ++ &intel_knl_unc_cha30_support, ++ &intel_knl_unc_cha31_support, ++ &intel_knl_unc_cha32_support, ++ &intel_knl_unc_cha33_support, ++ &intel_knl_unc_cha34_support, ++ &intel_knl_unc_cha35_support, ++ &intel_knl_unc_cha36_support, ++ &intel_knl_unc_cha37_support, ++ &intel_knl_unc_m2pcie_support, + &intel_x86_arch_support, /* must always be last for x86 */ + #endif + +diff --git a/lib/pfmlib_intel_knl_unc_cha.c b/lib/pfmlib_intel_knl_unc_cha.c +new file mode 100644 +index 0000000..4f2ee4c +--- /dev/null ++++ b/lib/pfmlib_intel_knl_unc_cha.c +@@ -0,0 +1,103 @@ ++/* ++ * pfmlib_intel_knl_unc_cha.c : Intel KnightsLanding CHA uncore PMU ++ * ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ */ ++#include ++#include ++#include ++#include ++#include ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_knl_unc_cha_events.h" ++ ++#define DEFINE_CHA_BOX(n) \ ++pfmlib_pmu_t intel_knl_unc_cha##n##_support = { \ ++ .desc = "Intel KnightLanding CHA "#n" uncore", \ ++ .name = "knl_unc_cha"#n, \ ++ .perf_name = "uncore_cha_"#n, \ ++ .pmu = PFM_PMU_INTEL_KNL_UNC_CHA##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_cha_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 0, \ ++ .max_encoding = 1, \ ++ .pe = intel_knl_unc_cha_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_knl_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_CHA_BOX(0); ++DEFINE_CHA_BOX(1); ++DEFINE_CHA_BOX(2); ++DEFINE_CHA_BOX(3); ++DEFINE_CHA_BOX(4); ++DEFINE_CHA_BOX(5); ++DEFINE_CHA_BOX(6); ++DEFINE_CHA_BOX(7); ++DEFINE_CHA_BOX(8); ++DEFINE_CHA_BOX(9); ++DEFINE_CHA_BOX(10); ++DEFINE_CHA_BOX(11); 
++DEFINE_CHA_BOX(12); ++DEFINE_CHA_BOX(13); ++DEFINE_CHA_BOX(14); ++DEFINE_CHA_BOX(15); ++DEFINE_CHA_BOX(16); ++DEFINE_CHA_BOX(17); ++DEFINE_CHA_BOX(18); ++DEFINE_CHA_BOX(19); ++DEFINE_CHA_BOX(20); ++DEFINE_CHA_BOX(21); ++DEFINE_CHA_BOX(22); ++DEFINE_CHA_BOX(23); ++DEFINE_CHA_BOX(24); ++DEFINE_CHA_BOX(25); ++DEFINE_CHA_BOX(26); ++DEFINE_CHA_BOX(27); ++DEFINE_CHA_BOX(28); ++DEFINE_CHA_BOX(29); ++DEFINE_CHA_BOX(30); ++DEFINE_CHA_BOX(31); ++DEFINE_CHA_BOX(32); ++DEFINE_CHA_BOX(33); ++DEFINE_CHA_BOX(34); ++DEFINE_CHA_BOX(35); ++DEFINE_CHA_BOX(36); ++DEFINE_CHA_BOX(37); ++ ++ +diff --git a/lib/pfmlib_intel_knl_unc_edc.c b/lib/pfmlib_intel_knl_unc_edc.c +new file mode 100644 +index 0000000..379496a +--- /dev/null ++++ b/lib/pfmlib_intel_knl_unc_edc.c +@@ -0,0 +1,111 @@ ++/* ++ * pfmlib_intel_knl_unc_edc.c : Intel KnightsLanding Integrated EDRAM uncore PMU ++ * ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_knl_unc_edc_events.h" ++ ++ ++#define DEFINE_EDC_UCLK_BOX(n) \ ++pfmlib_pmu_t intel_knl_unc_edc_uclk##n##_support = { \ ++ .desc = "Intel KnightLanding EDC_UCLK_"#n" uncore", \ ++ .name = "knl_unc_edc_uclk"#n, \ ++ .perf_name = "uncore_edc_uclk_"#n, \ ++ .pmu = PFM_PMU_INTEL_KNL_UNC_EDC_UCLK##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_uclk_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 0, \ ++ .max_encoding = 1, \ ++ .pe = intel_knl_unc_edc_uclk_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_knl_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_EDC_UCLK_BOX(0); ++DEFINE_EDC_UCLK_BOX(1); ++DEFINE_EDC_UCLK_BOX(2); ++DEFINE_EDC_UCLK_BOX(3); ++DEFINE_EDC_UCLK_BOX(4); ++DEFINE_EDC_UCLK_BOX(5); ++DEFINE_EDC_UCLK_BOX(6); ++DEFINE_EDC_UCLK_BOX(7); ++ ++ ++#define
DEFINE_EDC_ECLK_BOX(n) \ ++pfmlib_pmu_t intel_knl_unc_edc_eclk##n##_support = { \ ++ .desc = "Intel KnightLanding EDC_ECLK_"#n" uncore", \ ++ .name = "knl_unc_edc_eclk"#n, \ ++ .perf_name = "uncore_edc_eclk_"#n, \ ++ .pmu = PFM_PMU_INTEL_KNL_UNC_EDC_ECLK##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_eclk_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 0, \ ++ .max_encoding = 1, \ ++ .pe = intel_knl_unc_edc_eclk_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_knl_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_EDC_ECLK_BOX(0); ++DEFINE_EDC_ECLK_BOX(1); ++DEFINE_EDC_ECLK_BOX(2); ++DEFINE_EDC_ECLK_BOX(3); ++DEFINE_EDC_ECLK_BOX(4); ++DEFINE_EDC_ECLK_BOX(5); ++DEFINE_EDC_ECLK_BOX(6); ++DEFINE_EDC_ECLK_BOX(7); ++ +diff --git a/lib/pfmlib_intel_knl_unc_imc.c b/lib/pfmlib_intel_knl_unc_imc.c +new file mode 100644 +index 0000000..1d613b2 +--- /dev/null ++++ b/lib/pfmlib_intel_knl_unc_imc.c +@@ -0,0 +1,101 @@ ++/* ++ * pfmlib_intel_knl_unc_imc.c : Intel KnightsLanding Integrated Memory Controller (IMC) uncore PMU ++ * ++ * Copyright (c) 2016 Intel Corp. 
All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. ++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_knl_unc_imc_events.h" ++ ++#define DEFINE_IMC_BOX(n) \ ++pfmlib_pmu_t intel_knl_unc_imc##n##_support = { \ ++ .desc = "Intel KnightLanding IMC "#n" uncore", \ ++ .name = "knl_unc_imc"#n, \ ++ .perf_name = "uncore_imc_"#n, \ ++ .pmu = PFM_PMU_INTEL_KNL_UNC_IMC##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 1, \ ++ .max_encoding = 1, \ ++ .pe = intel_knl_unc_imc_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_knl_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_IMC_BOX(0); ++DEFINE_IMC_BOX(1); ++DEFINE_IMC_BOX(2); ++DEFINE_IMC_BOX(3); ++DEFINE_IMC_BOX(4); ++DEFINE_IMC_BOX(5); ++ ++#define DEFINE_IMC_UCLK_BOX(n) \ ++pfmlib_pmu_t intel_knl_unc_imc_uclk##n##_support = { \ ++ .desc = "Intel KnightLanding IMC UCLK "#n" uncore", \ ++ .name = "knl_unc_imc_uclk"#n, \ ++ .perf_name = "uncore_mc_uclk_"#n, \ ++ .pmu = PFM_PMU_INTEL_KNL_UNC_IMC_UCLK##n, \ ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_uclk_pe), \ ++ .type = PFM_PMU_TYPE_UNCORE, \ ++ .num_cntrs = 4, \ ++ .num_fixed_cntrs = 1, \
++ .max_encoding = 1, \ ++ .pe = intel_knl_unc_imc_uclk_pe, \ ++ .atdesc = snbep_unc_mods, \ ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ ++ .pmu_detect = pfm_intel_knl_unc_detect, \ ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ ++ .get_event_first = pfm_intel_x86_get_event_first, \ ++ .get_event_next = pfm_intel_x86_get_event_next, \ ++ .event_is_valid = pfm_intel_x86_event_is_valid, \ ++ .validate_table = pfm_intel_x86_validate_table, \ ++ .get_event_info = pfm_intel_x86_get_event_info, \ ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ ++}; ++ ++DEFINE_IMC_UCLK_BOX(0); ++DEFINE_IMC_UCLK_BOX(1); ++ +diff --git a/lib/pfmlib_intel_knl_unc_m2pcie.c b/lib/pfmlib_intel_knl_unc_m2pcie.c +new file mode 100644 +index 0000000..c4d6059 +--- /dev/null ++++ b/lib/pfmlib_intel_knl_unc_m2pcie.c +@@ -0,0 +1,80 @@ ++/* ++ * pfmlib_intel_knl_m2pcie.c : Intel Knights Landing M2PCIe uncore PMU ++ * ++ * Copyright (c) 2016 Intel Corp. All rights reserved ++ * Contributed by Peinan Zhang ++ * ++ * Permission is hereby granted, free of charge, to any person obtaining a copy ++ * of this software and associated documentation files (the "Software"), to deal ++ * in the Software without restriction, including without limitation the rights ++ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies ++ * of the Software, and to permit persons to whom the Software is furnished to do so, ++ * subject to the following conditions: ++ * ++ * The above copyright notice and this permission notice shall be included in all ++ * copies or substantial portions of the Software. 
++ * ++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, ++ * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A ++ * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT ++ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF ++ * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ++ * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ++ */ ++#include <sys/types.h> ++#include <ctype.h> ++#include <string.h> ++#include <stdlib.h> ++#include <stdio.h> ++ ++/* private headers */ ++#include "pfmlib_priv.h" ++#include "pfmlib_intel_x86_priv.h" ++#include "pfmlib_intel_snbep_unc_priv.h" ++#include "events/intel_knl_unc_m2pcie_events.h" ++ ++static void ++display_m2p(void *this, pfmlib_event_desc_t *e, void *val) ++{ ++ const intel_x86_entry_t *pe = this_pe(this); ++ pfm_snbep_unc_reg_t *reg = val; ++ ++ __pfm_vbprintf("[UNC_R2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d " ++ "inv=%d edge=%d thres=%d] %s\n", ++ reg->val, ++ reg->com.unc_event, ++ reg->com.unc_umask, ++ reg->com.unc_en, ++ reg->com.unc_inv, ++ reg->com.unc_edge, ++ reg->com.unc_thres, ++ pe[e->event].name); ++} ++ ++pfmlib_pmu_t intel_knl_unc_m2pcie_support = { ++ .desc = "Intel Knights Landing M2PCIe uncore", ++ .name = "knl_unc_m2pcie", ++ .perf_name = "uncore_m2pcie", ++ .pmu = PFM_PMU_INTEL_KNL_UNC_M2PCIE, ++ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_m2pcie_pe), ++ .type = PFM_PMU_TYPE_UNCORE, ++ .num_cntrs = 4, ++ .num_fixed_cntrs = 0, ++ .max_encoding = 1, ++ .pe = intel_knl_unc_m2pcie_pe, ++ .atdesc = snbep_unc_mods, ++ .flags = PFMLIB_PMU_FL_RAW_UMASK, ++ .pmu_detect = pfm_intel_knl_unc_detect, ++ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, ++ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), ++ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), ++ .get_event_first = pfm_intel_x86_get_event_first, ++ .get_event_next = pfm_intel_x86_get_event_next, ++ .event_is_valid
= pfm_intel_x86_event_is_valid, ++ .validate_table = pfm_intel_x86_validate_table, ++ .get_event_info = pfm_intel_x86_get_event_info, ++ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, ++ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), ++ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, ++ .display_reg = display_m2p, ++}; +diff --git a/lib/pfmlib_intel_snbep_unc.c b/lib/pfmlib_intel_snbep_unc.c +index c61065e..075ae33 100644 +--- a/lib/pfmlib_intel_snbep_unc.c ++++ b/lib/pfmlib_intel_snbep_unc.c +@@ -109,6 +109,29 @@ pfm_intel_hswep_unc_detect(void *this) + return PFM_SUCCESS; + } + ++int ++pfm_intel_knl_unc_detect(void *this) ++{ ++ int ret; ++ ++ ret = pfm_intel_x86_detect(); ++ if (ret != PFM_SUCCESS) ++ return ret; ++ ++ if (pfm_intel_x86_cfg.family != 6) ++ return PFM_ERR_NOTSUPP; ++ ++ switch(pfm_intel_x86_cfg.model) { ++ case 87: /* Knights Landing */ ++ break; ++ default: ++ return PFM_ERR_NOTSUPP; ++ } ++ return PFM_SUCCESS; ++} ++ ++ ++ + static void + display_com(void *this, pfmlib_event_desc_t *e, void *val) + { +diff --git a/lib/pfmlib_intel_snbep_unc_priv.h b/lib/pfmlib_intel_snbep_unc_priv.h +index 13875f5..500ff84 100644 +--- a/lib/pfmlib_intel_snbep_unc_priv.h ++++ b/lib/pfmlib_intel_snbep_unc_priv.h +@@ -164,6 +164,8 @@ + #define HSWEP_UNC_SBO_ATTRS \ + (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) + ++#define KNL_UNC_CHA_TOR_ATTRS _SNBEP_UNC_ATTR_NF1 ++ + typedef union { + uint64_t val; + struct { +@@ -324,6 +326,7 @@ extern const pfmlib_attr_desc_t snbep_unc_mods[]; + extern int pfm_intel_snbep_unc_detect(void *this); + extern int pfm_intel_ivbep_unc_detect(void *this); + extern int pfm_intel_hswep_unc_detect(void *this); ++extern int pfm_intel_knl_unc_detect(void *this); + extern int pfm_intel_snbep_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); + extern int pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx); + extern int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int
attr_idx, pfm_event_attr_info_t *info); +diff --git a/lib/pfmlib_priv.h b/lib/pfmlib_priv.h +index c49975f..33d7fdf 100644 +--- a/lib/pfmlib_priv.h ++++ b/lib/pfmlib_priv.h +@@ -354,6 +354,69 @@ extern pfmlib_pmu_t intel_hswep_unc_irp_support; + extern pfmlib_pmu_t intel_knc_support; + extern pfmlib_pmu_t intel_slm_support; + extern pfmlib_pmu_t intel_knl_support; ++extern pfmlib_pmu_t intel_knl_unc_imc0_support; ++extern pfmlib_pmu_t intel_knl_unc_imc1_support; ++extern pfmlib_pmu_t intel_knl_unc_imc2_support; ++extern pfmlib_pmu_t intel_knl_unc_imc3_support; ++extern pfmlib_pmu_t intel_knl_unc_imc4_support; ++extern pfmlib_pmu_t intel_knl_unc_imc5_support; ++extern pfmlib_pmu_t intel_knl_unc_imc_uclk0_support; ++extern pfmlib_pmu_t intel_knl_unc_imc_uclk1_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk0_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk1_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk2_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk3_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk4_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk5_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk6_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_uclk7_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk0_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk1_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk2_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk3_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk4_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk5_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk6_support; ++extern pfmlib_pmu_t intel_knl_unc_edc_eclk7_support; ++extern pfmlib_pmu_t intel_knl_unc_cha0_support; ++extern pfmlib_pmu_t intel_knl_unc_cha1_support; ++extern pfmlib_pmu_t intel_knl_unc_cha2_support; ++extern pfmlib_pmu_t intel_knl_unc_cha3_support; ++extern pfmlib_pmu_t intel_knl_unc_cha4_support; ++extern pfmlib_pmu_t intel_knl_unc_cha5_support; ++extern pfmlib_pmu_t 
intel_knl_unc_cha6_support; ++extern pfmlib_pmu_t intel_knl_unc_cha7_support; ++extern pfmlib_pmu_t intel_knl_unc_cha8_support; ++extern pfmlib_pmu_t intel_knl_unc_cha9_support; ++extern pfmlib_pmu_t intel_knl_unc_cha10_support; ++extern pfmlib_pmu_t intel_knl_unc_cha11_support; ++extern pfmlib_pmu_t intel_knl_unc_cha12_support; ++extern pfmlib_pmu_t intel_knl_unc_cha13_support; ++extern pfmlib_pmu_t intel_knl_unc_cha14_support; ++extern pfmlib_pmu_t intel_knl_unc_cha15_support; ++extern pfmlib_pmu_t intel_knl_unc_cha16_support; ++extern pfmlib_pmu_t intel_knl_unc_cha17_support; ++extern pfmlib_pmu_t intel_knl_unc_cha18_support; ++extern pfmlib_pmu_t intel_knl_unc_cha19_support; ++extern pfmlib_pmu_t intel_knl_unc_cha20_support; ++extern pfmlib_pmu_t intel_knl_unc_cha21_support; ++extern pfmlib_pmu_t intel_knl_unc_cha22_support; ++extern pfmlib_pmu_t intel_knl_unc_cha23_support; ++extern pfmlib_pmu_t intel_knl_unc_cha24_support; ++extern pfmlib_pmu_t intel_knl_unc_cha25_support; ++extern pfmlib_pmu_t intel_knl_unc_cha26_support; ++extern pfmlib_pmu_t intel_knl_unc_cha27_support; ++extern pfmlib_pmu_t intel_knl_unc_cha28_support; ++extern pfmlib_pmu_t intel_knl_unc_cha29_support; ++extern pfmlib_pmu_t intel_knl_unc_cha30_support; ++extern pfmlib_pmu_t intel_knl_unc_cha31_support; ++extern pfmlib_pmu_t intel_knl_unc_cha32_support; ++extern pfmlib_pmu_t intel_knl_unc_cha33_support; ++extern pfmlib_pmu_t intel_knl_unc_cha34_support; ++extern pfmlib_pmu_t intel_knl_unc_cha35_support; ++extern pfmlib_pmu_t intel_knl_unc_cha36_support; ++extern pfmlib_pmu_t intel_knl_unc_cha37_support; ++extern pfmlib_pmu_t intel_knl_unc_m2pcie_support; + extern pfmlib_pmu_t intel_glm_support; + extern pfmlib_pmu_t power4_support; + extern pfmlib_pmu_t ppc970_support; +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index cede40b..c9770fc 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4477,6 +4477,272 @@ static const test_event_t x86_test_events[]={ + .name 
= "knl::offcore_response_1:dmnd_data_rd:outstanding", + .ret = PFM_ERR_ATTR, + }, ++ { SRC_LINE, ++ .name = "knl_unc_imc0::UNC_M_D_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_imc0::UNC_M_D_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_imc0::UNC_M_CAS_COUNT:RD", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0103, ++ .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:RD", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_imc0::UNC_M_CAS_COUNT:WR", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0203, ++ .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:WR", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_imc0::UNC_M_CAS_COUNT:ALL", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0303, ++ .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:ALL", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0102, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0202, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0402, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0802, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", ++ }, ++ { SRC_LINE, ++ .name = 
"knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1002, ++ .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_edc_eclk0::UNC_E_RPQ_INSERTS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0101, ++ .fstr = "knl_unc_edc_eclk0::UNC_E_RPQ_INSERTS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha0::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha1::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha1::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha10::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha10::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha20::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha20::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha25::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha25::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha30::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha30::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha37::UNC_H_U_CLOCKTICKS", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x00, ++ .fstr = "knl_unc_cha37::UNC_H_U_CLOCKTICKS", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0111, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", ++ }, ++ { SRC_LINE, ++ .name = 
"knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0211, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0411, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1011, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x2011, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0113, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0213, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0413, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1013, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x2013, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0218, ++ .fstr = "knl_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", ++ }, ++ { SRC_LINE, ++ .name = 
"knl_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0810, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0123, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0823, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0124, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x1024, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0125, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", ++ }, ++ { SRC_LINE, ++ .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", ++ .ret = PFM_SUCCESS, ++ .count = 1, ++ .codes[0] = 0x0825, ++ .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", ++ }, + }; + + #define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t)) +-- +2.9.3 + + +From 192db474a97b5c67d917e18c04ab0848405e077d Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sun, 6 Nov 2016 23:37:41 -0800 +Subject: [PATCH] add more Skylake models + +Add Skylake X core PMU support (equiv to Skylake desktop for now) +Add Kabylake mobile and desktop. 
+ +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_intel_skl.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/lib/pfmlib_intel_skl.c b/lib/pfmlib_intel_skl.c +index 87ee70d..a190ead 100644 +--- a/lib/pfmlib_intel_skl.c ++++ b/lib/pfmlib_intel_skl.c +@@ -29,6 +29,9 @@ + static const int skl_models[] = { + 78, /* Skylake mobile */ + 94, /* Skylake desktop */ ++ 85, /* Skylake X */ ++ 142,/* KabyLake mobile */ ++ 158,/* KabyLake desktop */ + 0 + }; + +-- +2.9.3 + + +From 05edb2f56598752e14071009c3c52cb22ae6036b Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Sun, 5 Feb 2017 00:35:24 -0800 +Subject: [PATCH] Fix offcore_response for Intel BDW-EP + +The event was missing all the L3_HIT umasks because +they were all marked as Broadwell (client) only. + +Signed-off-by: Stephane Eranian +--- + lib/events/intel_bdw_events.h | 12 ------------ + 1 file changed, 12 deletions(-) + +diff --git a/lib/events/intel_bdw_events.h b/lib/events/intel_bdw_events.h +index fba5ad2..ba5d1f7 100644 +--- a/lib/events/intel_bdw_events.h ++++ b/lib/events/intel_bdw_events.h +@@ -1746,81 +1746,69 @@ static const intel_x86_umask_t bdw_offcore_response[]={ + { .uname = "L3_HITM", + .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", + .ucode = 1ULL << (18+8), +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_HITM", + .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", + .ucode = 1ULL << (18+8), + .uequiv = "L3_HITM", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_HITE", + .udesc = "Supplier: counts L3 hits in E-state", + .ucode = 1ULL << (19+8), +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_HITE", + .udesc = "Supplier: counts L3 hits in E-state", + .ucode = 1ULL << (19+8), + .uequiv = "L3_HITE", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_HITS", + .udesc = "Supplier: counts L3 hits in S-state", + .ucode = 1ULL << (20+8), +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1,
+ }, + { .uname = "LLC_HITS", + .udesc = "Supplier: counts L3 hits in S-state", + .ucode = 1ULL << (20+8), + .uequiv = "L3_HITS", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_HITF", + .udesc = "Supplier: counts L3 hits in F-state", + .ucode = 1ULL << (21+8), +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_HITF", + .udesc = "Supplier: counts L3 hits in F-state", + .ucode = 1ULL << (20+8), + .uequiv = "L3_HITF", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_HITMESF", + .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", + .ucode = 0xfULL << (18+8), + .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_HITMESF", + .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", + .ucode = 0xfULL << (18+8), + .uequiv = "L3_HITMESF", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_HIT", + .udesc = "Alias for L3_HITMESF", + .ucode = 0xfULL << (18+8), + .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "LLC_HIT", + .udesc = "Alias for LLC_HITMESF", + .ucode = 0xfULL << (18+8), + .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", +- .umodel = PFM_PMU_INTEL_BDW, + .grpid = 1, + }, + { .uname = "L3_MISS_LOCAL", +-- +2.9.3 + + +From 28ba4f45ab37915a4e91c6f8d33318bb6a1b1947 Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Tue, 21 Feb 2017 23:49:07 -0800 +Subject: [PATCH] add UOPS_DISPATCHED_PORT event for Intel Skylake + +This patch adds the UOPS_DISPATCHED_PORT event for Intel Skylake. +This is the official name of event 0xa1. + +Make the old UOPS_DISPATCHED event an alias for backward +compatibility reasons. + +Also add a test case for the new event and alias.
+ +Signed-off-by: Stephane Eranian +--- + lib/events/intel_skl_events.h | 19 ++++++++++++++----- + tests/validate_x86.c | 12 ++++++++++++ + 2 files changed, 26 insertions(+), 5 deletions(-) + +diff --git a/lib/events/intel_skl_events.h b/lib/events/intel_skl_events.h +index e7b522d..84dfabf 100644 +--- a/lib/events/intel_skl_events.h ++++ b/lib/events/intel_skl_events.h +@@ -1154,7 +1154,7 @@ static const intel_x86_umask_t skl_uops_executed[]={ + }, + }; + +-static const intel_x86_umask_t skl_uops_dispatched[]={ ++static const intel_x86_umask_t skl_uops_dispatched_port[]={ + { .uname = "PORT_0", + .udesc = "Cycles which a Uop is executed on port 0", + .ucode = 0x100, +@@ -2510,15 +2510,24 @@ static const intel_x86_entry_t intel_skl_pe[]={ + .numasks = LIBPFM_ARRAY_SIZE(skl_lsd), + .umasks = skl_lsd, + }, +- ++ { .name = "UOPS_DISPATCHED_PORT", ++ .desc = "Uops dispatched to specific ports", ++ .code = 0xa1, ++ .cntmsk = 0xff, ++ .ngrp = 1, ++ .modmsk = INTEL_V4_ATTRS, ++ .numasks = LIBPFM_ARRAY_SIZE(skl_uops_dispatched_port), ++ .umasks = skl_uops_dispatched_port, ++ }, + { .name = "UOPS_DISPATCHED", +- .desc = "Uops dispatch to specific ports", ++ .desc = "Uops dispatched to specific ports", ++ .equiv = "UOPS_DISPATCHED_PORT", + .code = 0xa1, + .cntmsk = 0xff, + .ngrp = 1, + .modmsk = INTEL_V4_ATTRS, +- .numasks = LIBPFM_ARRAY_SIZE(skl_uops_dispatched), +- .umasks = skl_uops_dispatched, ++ .numasks = LIBPFM_ARRAY_SIZE(skl_uops_dispatched_port), ++ .umasks = skl_uops_dispatched_port, + }, + { .name = "UOPS_ISSUED", + .desc = "Uops issued", +diff --git a/tests/validate_x86.c b/tests/validate_x86.c +index c9770fc..790ba58 100644 +--- a/tests/validate_x86.c ++++ b/tests/validate_x86.c +@@ -4031,6 +4031,18 @@ static const test_event_t x86_test_events[]={ + .ret = PFM_ERR_ATTR_SET, + }, + { SRC_LINE, ++ .name = "skl::uops_dispatched_port:port_0", ++ .count = 1, ++ .codes[0] = 0x5301a1, ++ .fstr = 
"skl::UOPS_DISPATCHED_PORT:PORT_0:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, ++ .name = "skl::uops_dispatched:port_0", ++ .count = 1, ++ .codes[0] = 0x5301a1, ++ .fstr = "skl::UOPS_DISPATCHED_PORT:PORT_0:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", ++ }, ++ { SRC_LINE, + .name = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", + .ret = PFM_SUCCESS, + .count = 1, +-- +2.9.3 + + +From 1bd352eef242f53e130c3b025bbf7881a5fb5d1e Mon Sep 17 00:00:00 2001 +From: Stephane Eranian +Date: Wed, 22 Feb 2017 01:16:42 -0800 +Subject: [PATCH] update Intel RAPL processor support + +Added Kabylake, Skylake X + +Added PSYS RAPL event for Skylake client. + +Signed-off-by: Stephane Eranian +--- + lib/pfmlib_intel_rapl.c | 51 ++++++++++++++++++++++++++++++++++--------------- + 1 file changed, 36 insertions(+), 15 deletions(-) + +diff --git a/lib/pfmlib_intel_rapl.c b/lib/pfmlib_intel_rapl.c +index 1413b5f..8a04079 100644 +--- a/lib/pfmlib_intel_rapl.c ++++ b/lib/pfmlib_intel_rapl.c +@@ -59,6 +59,20 @@ static const intel_x86_entry_t intel_rapl_cln_pe[]={ + } + }; + ++static const intel_x86_entry_t intel_rapl_skl_cln_pe[]={ ++ RAPL_COMMON_EVENTS, ++ { .name = "RAPL_ENERGY_GPU", ++ .desc = "Number of Joules consumed by the builtin GPU. Unit is 2^-32 Joules", ++ .cntmsk = 0x8, ++ .code = 0x4, ++ }, ++ { .name = "RAPL_ENERGY_PSYS", ++ .desc = "Number of Joules consumed by the builtin PSYS. 
Unit is 2^-32 Joules", ++ .cntmsk = 0x8, ++ .code = 0x5, ++ } ++}; ++ + static const intel_x86_entry_t intel_rapl_srv_pe[]={ + RAPL_COMMON_EVENTS, + { .name = "RAPL_ENERGY_DRAM", +@@ -97,29 +111,36 @@ pfm_rapl_detect(void *this) + return PFM_ERR_NOTSUPP; + + switch(pfm_intel_x86_cfg.model) { +- case 42: /* Sandy Bridge */ +- case 58: /* Ivy Bridge */ +- case 60: /* Haswell */ +- case 69: /* Haswell */ +- case 70: /* Haswell */ +- case 61: /* Broadwell */ +- case 71: /* Broadwell */ +- case 78: /* Skylake */ +- case 94: /* Skylake H/S */ ++ case 42: /* Sandy Bridge */ ++ case 58: /* Ivy Bridge */ ++ case 60: /* Haswell */ ++ case 69: /* Haswell */ ++ case 70: /* Haswell */ ++ case 61: /* Broadwell */ ++ case 71: /* Broadwell GT3E */ ++ case 92: /* Goldmont */ + /* already setup by default */ + break; +- case 45: /* Sandy Bridg-EP */ +- case 62: /* Ivy Bridge-EP */ ++ case 45: /* Sandy Bridg-EP */ ++ case 62: /* Ivy Bridge-EP */ + intel_rapl_support.pe = intel_rapl_srv_pe; + intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_srv_pe); + break; +- case 63: /* Haswell-EP */ +- case 79: /* Broadwell-EP */ +- case 86: /* Broadwell D */ ++ case 78: /* Skylake */ ++ case 94: /* Skylake H/S */ ++ case 142: /* Kabylake */ ++ case 158: /* Kabylake */ ++ intel_rapl_support.pe = intel_rapl_skl_cln_pe; ++ intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_skl_cln_pe); ++ break; ++ case 63: /* Haswell-EP */ ++ case 79: /* Broadwell-EP */ ++ case 86: /* Broadwell D */ ++ case 85: /* Skylake X */ + intel_rapl_support.pe = intel_rapl_hswep_pe; + intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_hswep_pe); + break; +- default: ++ default : + return PFM_ERR_NOTSUPP; + } + return PFM_SUCCESS; +-- +2.9.3 + diff --git a/SPECS/libpfm.spec b/SPECS/libpfm.spec new file mode 100644 index 0000000..e288fc7 --- /dev/null +++ b/SPECS/libpfm.spec @@ -0,0 +1,222 @@ +%bcond_without python +%if %{with python} +%define python_sitearch %(python -c "from distutils.sysconfig 
import get_python_lib; print get_python_lib(1)") +%define python_prefix %(python -c "import sys; print sys.prefix") +%{?filter_setup: +%filter_provides_in %{python_sitearch}/perfmon/.*\.so$ +%filter_setup +} +%endif + +Name: libpfm +Version: 4.7.0 +Release: 10%{?dist} + +Summary: Library to encode performance events for use by perf tool + +Group: System Environment/Libraries +License: MIT +URL: http://perfmon2.sourceforge.net/ +Source0: http://sourceforge.net/projects/perfmon2/files/libpfm4/%{name}-%{version}.tar.gz +Patch1: libpfm-updates.patch +Patch2: libpfm-rhbz1440249.patch +Patch3: libpfm-power9.patch +Patch4: libpfm-bdx_unc.patch +Patch5: libpfm-p9_alt.patch +Patch6: libpfm-p9_uniq.patch +Patch7: libpfm-intel_1port.patch +Patch8: libpfm-s390.patch + +%if %{with python} +BuildRequires: python-devel +BuildRequires: python-setuptools-devel +BuildRequires: swig +%endif + +%description + +libpfm4 is a library to help encode events for use with operating system +kernels performance monitoring interfaces. The current version provides support +for the perf_events interface available in upstream Linux kernels since v2.6.31. + +%package devel +Summary: Development library to encode performance events for perf_events based tools +Group: Development/Libraries +Requires: %{name}%{?_isa} = %{version}-%{release} + +%description devel +Development library and header files to create performance monitoring +applications for the perf_events interface. + +%package static +Summary: Static library to encode performance events for perf_events based tools +Group: Development/Libraries +Requires: %{name}%{?_isa} = %{version}-%{release} + +%description static +Static version of the libpfm library for performance monitoring +applications for the perf_events interface. 
+ +%if %{with python} +%package python +Summary: Python bindings for libpfm and perf_event_open system call +Group: Development/Languages +Requires: %{name}%{?_isa} = %{version}-%{release} + +%description python +Python bindings for libpfm4 and perf_event_open system call. +%endif + +%prep +%setup -q +%patch1 -p1 +%patch2 -p1 +%patch3 -p1 +%patch4 -p1 +%patch5 -p 1 -b .p9_alt +%patch6 -p 1 -b .p9_uniq +%patch7 -p 1 -b .1port +%patch8 -p 1 -b .s390 + +%build +%if %{with python} +%global python_config CONFIG_PFMLIB_NOPYTHON=n +%else +%global python_config CONFIG_PFMLIB_NOPYTHON=y +%endif +make %{python_config} %{?_smp_mflags} + + +%install +rm -rf $RPM_BUILD_ROOT + +%if %{with python} +%global python_config CONFIG_PFMLIB_NOPYTHON=n PYTHON_PREFIX=$RPM_BUILD_ROOT/%{python_prefix} +%else +%global python_config CONFIG_PFMLIB_NOPYTHON=y +%endif + +make \ + PREFIX=$RPM_BUILD_ROOT%{_prefix} \ + LIBDIR=$RPM_BUILD_ROOT%{_libdir} \ + %{python_config} \ + LDCONFIG=/bin/true \ + install + +%post -p /sbin/ldconfig +%postun -p /sbin/ldconfig + +%files +%doc README +%{_libdir}/lib*.so.* + +%files devel +%{_includedir}/* +%{_mandir}/man3/* +%{_libdir}/lib*.so + +%files static +%{_libdir}/lib*.a + +%if %{with python} +%files python +%{python_sitearch}/* +%endif + +%changelog +* Mon Apr 16 2018 William Cohen - 4.7.0-10 +- Add support for z13/z13s. rhbz1548505 + +* Tue Dec 5 2017 William Cohen - 4.7.0-9 +- Correct x86 unit mask naming. rhbz1521076 + +* Mon Dec 4 2017 William Cohen - 4.7.0-8 +- Update IBM Power 9 events. rhbz1510684 + +* Wed Nov 29 2017 William Cohen - 4.7.0-7 +- Update IBM Power 9 events. rhbz1510684 + +* Mon Sep 25 2017 William Cohen - 4.7.0-6 +- Add access to Broadwell Uncore Counters. rhbz1474999 + +* Tue Jun 20 2017 William Cohen - 4.7.0-5 +- Add IBM Power9 support. + +* Wed Apr 12 2017 William Cohen - 4.7.0-4 +- Correct handling of raw offcore umask handling. 
rhbz1440249 + +* Thu Mar 23 2017 William Cohen - 4.7.0-3 +- Avoid ABI breakage caused by some Intel KNL related patches. rhbz1412950 + +* Tue Mar 21 2017 William Cohen - 4.7.0-2 +- Updates for IBM Power and Intel KNL. rhbz1385009, rhbz1412950 + +* Thu May 12 2016 William Cohen - 4.7.0-1 +- Rebase to libpfm-4.7.0. + +* Thu Jun 4 2015 William Cohen - 4.4.0-11 +- Add additional s390 support. rhbz1182187 + +* Fri May 15 2015 William Cohen - 4.4.0-10 +- Add additional s390 support. rhbz1182187 + +* Thu Oct 16 2014 William Cohen - 4.4.0-9 +- Bump and rebuid for chained build. rhbz1126091 + +* Fri Sep 26 2014 William Cohen - 4.4.0-8 +- Update Intel processor support. rhbz1126091 + +* Wed Sep 3 2014 William Cohen - 4.4.0-7 +- Add aarch64 and power8 support for rhbz963457 and rhbz1088557 + +* Fri Jan 24 2014 Daniel Mach - 4.4.0-6 +- Mass rebuild 2014-01-24 + +* Tue Jan 14 2014 William Cohen - 4.4.0-5 +- Update event descriptions. + +* Mon Jan 13 2014 William Cohen - 4.4.0-4 +- Add Haswell model numbers. + +* Fri Dec 27 2013 Daniel Mach - 4.4.0-3 +- Mass rebuild 2013-12-27 + +* Fri Jul 19 2013 William Cohen 4.4.0-2 +- Add IBM power 8 support. + +* Mon Jun 17 2013 William Cohen 4.4.0-1 +- Rebase on libpfm-4.4.0. + +* Thu Feb 14 2013 Fedora Release Engineering - 4.3.0-3 +- Rebuilt for https://fedoraproject.org/wiki/Fedora_19_Mass_Rebuild + +* Tue Aug 28 2012 William Cohen 4.3.0-2 +- Turn off LDCONFIG and remove patch. + +* Tue Aug 28 2012 William Cohen 4.3.0-1 +- Rebase on libpfm-4.3.0. + +* Thu Jul 19 2012 Fedora Release Engineering - 4.2.0-8 +- Rebuilt for https://fedoraproject.org/wiki/Fedora_18_Mass_Rebuild + +* Fri Jun 8 2012 William Cohen 4.2.0-7 +- Eliminate swig error. + +* Thu Jun 7 2012 William Cohen 4.2.0-6 +- Eliminate rpm_build_root macro in build section. +- Correct location of shared library files. + +* Thu Jun 7 2012 William Cohen 4.2.0-5 +- Use siginfo_t for some examples. + +* Mon Jun 4 2012 William Cohen 4.2.0-4 +- Correct python files. 
+ +* Wed Mar 28 2012 William Cohen 4.2.0-3 +- Additional spec file fixup for rhbz804666. + +* Wed Mar 14 2012 William Cohen 4.2.0-2 +- Some spec file fixup. + +* Wed Jan 12 2011 Arun Sharma 4.2.0-0 +Initial revision