Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 94 additions & 0 deletions Documentation/admin-guide/perf/dwc_pcie_pmu.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
======================================================================
Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
======================================================================

DesignWare Cores (DWC) PCIe PMU
===============================

The PMU is a PCIe configuration space register block provided by each PCIe Root
Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
injection, and Statistics).

As the name indicates, the RAS DES capability supports system level
debugging, AER error injection, and collection of statistics. To facilitate
collection of statistics, Synopsys DesignWare Cores PCIe controller
provides the following two features:

- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
time spent in each low-power LTSSM state) and
- one 32-bit counter for Event Counting (error and non-error events for
a specified lane)

Note: There is no interrupt for counter overflow.

Time Based Analysis
-------------------

Using this feature you can obtain information regarding RX/TX data
throughput and time spent in each low-power LTSSM state by the controller.
The PMU measures data in two categories:

- Group#0: Percentage of time the controller stays in LTSSM states.
- Group#1: Amount of data processed (Units of 16 bytes).

Lane Event counters
-------------------

Using this feature you can obtain Error and Non-Error information in
specific lane by the controller. The PMU event is selected by all of:

- Group i
- Event j within the Group i
- Lane k

Some of the events only exist for specific configurations.

DesignWare Cores (DWC) PCIe PMU Driver
=======================================

This driver adds PMU devices for each PCIe Root Port named based on the BDF of
the Root Port. For example,

30:03.0 PCI bridge: Device 1ded:8000 (rev 01)

the PMU device name for this Root Port is dwc_rootport_3018.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: The derivation of the PMU device name from BDF is unclear.

Clarify and document how '30:03.0' is encoded as '3018' in the PMU device name for user understanding.


The DWC PCIe PMU driver registers a perf PMU driver, which provides
description of available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.

The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
templates for all documented events. For example,
"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".

The "perf list" command shall list the available events from sysfs, e.g.::

$# perf list | grep dwc_rootport
<...>
dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
<...>
dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]

Time Based Analysis Event Usage
-------------------------------

Example usage of counting PCIe RX TLP data payload (Units of bytes)::

$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/

The average RX/TX bandwidth can be calculated using the following formula:

PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window

Lane Event Usage
-------------------------------

Each lane has the same event set and to avoid generating a list of hundreds
of events, the user need to specify the lane ID explicitly, e.g.::
Comment on lines +88 to +89

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (typo): Grammatical error: 'need' should be 'needs'.

Change 'the user need' to 'the user needs' for correct grammar.

Suggested change
Each lane has the same event set and to avoid generating a list of hundreds
of events, the user need to specify the lane ID explicitly, e.g.::
Each lane has the same event set and to avoid generating a list of hundreds
of events, the user needs to specify the lane ID explicitly, e.g.::


$# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/

The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.
1 change: 1 addition & 0 deletions Documentation/admin-guide/perf/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ Performance monitor support
arm_dsu_pmu
thunderx2-pmu
alibaba_pmu
dwc_pcie_pmu
nvidia-pmu
meson-ddr-pmu
cxl
2 changes: 0 additions & 2 deletions drivers/infiniband/hw/erdma/erdma_hw.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,6 @@
#include <linux/types.h>

/* PCIe device related definition. */
#define PCI_VENDOR_ID_ALIBABA 0x1ded

#define ERDMA_PCI_WIDTH 64
#define ERDMA_FUNC_BAR 0
#define ERDMA_MISX_BAR 2
Expand Down
12 changes: 12 additions & 0 deletions drivers/pci/access.c
Original file line number Diff line number Diff line change
Expand Up @@ -598,3 +598,15 @@ int pci_write_config_dword(const struct pci_dev *dev, int where,
return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
}
EXPORT_SYMBOL(pci_write_config_dword);

void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos,
u32 clear, u32 set)
{
u32 val;

pci_read_config_dword(dev, pos, &val);

Copilot AI Jun 1, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider adding an inline comment for pci_clear_and_set_config_dword to clarify its intended usage and improve maintainability.

Copilot uses AI. Check for mistakes.
val &= ~clear;
val |= set;
pci_write_config_dword(dev, pos, val);
}
Comment on lines +602 to +611

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): Ignoring return value from pci_read_config_dword

Check the return value of pci_read_config_dword to ensure 'val' is valid before using it.

Suggested change
void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos,
u32 clear, u32 set)
{
u32 val;
pci_read_config_dword(dev, pos, &val);
val &= ~clear;
val |= set;
pci_write_config_dword(dev, pos, val);
}
void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos,
u32 clear, u32 set)
{
u32 val;
int ret;
ret = pci_read_config_dword(dev, pos, &val);
if (ret)
return;
val &= ~clear;
val |= set;
pci_write_config_dword(dev, pos, val);
}

EXPORT_SYMBOL(pci_clear_and_set_config_dword);
65 changes: 30 additions & 35 deletions drivers/pci/pcie/aspm.c
Original file line number Diff line number Diff line change
Expand Up @@ -423,17 +423,6 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
}
}

static void pci_clear_and_set_dword(struct pci_dev *pdev, int pos,
u32 clear, u32 set)
{
u32 val;

pci_read_config_dword(pdev, pos, &val);
val &= ~clear;
val |= set;
pci_write_config_dword(pdev, pos, val);
}

/* Calculate L1.2 PM substate timing parameters */
static void aspm_calc_l12_info(struct pcie_link_state *link,
u32 parent_l1ss_cap, u32 child_l1ss_cap)
Expand Down Expand Up @@ -494,33 +483,39 @@ static void aspm_calc_l12_info(struct pcie_link_state *link,
cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK;

if (pl1_2_enables || cl1_2_enables) {
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1_2_MASK, 0);
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1_2_MASK, 0);
pci_clear_and_set_config_dword(child,
child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1_2_MASK, 0);
pci_clear_and_set_config_dword(parent,
parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1_2_MASK, 0);
}

/* Program T_POWER_ON times in both ports */
pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, ctl2);
pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2);

/* Program Common_Mode_Restore_Time in upstream device */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1);
pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1);

/* Program LTR_L1.2_THRESHOLD time in both ports */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1);
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1);
pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE,
ctl1);
pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_LTR_L12_TH_VALUE |
PCI_L1SS_CTL1_LTR_L12_TH_SCALE,
ctl1);

if (pl1_2_enables || cl1_2_enables) {
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 0,
pl1_2_enables);
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 0,
cl1_2_enables);
pci_clear_and_set_config_dword(parent,
parent->l1ss + PCI_L1SS_CTL1, 0,
pl1_2_enables);
pci_clear_and_set_config_dword(child,
child->l1ss + PCI_L1SS_CTL1, 0,
cl1_2_enables);
}
}

Expand Down Expand Up @@ -680,10 +675,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
*/

/* Disable all L1 substates */
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0);
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0);
pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0);
pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, 0);
/*
* If needed, disable L1, and it gets enabled later
* in pcie_config_aspm_link().
Expand All @@ -706,10 +701,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
val |= PCI_L1SS_CTL1_PCIPM_L1_2;

/* Enable what we need to enable */
pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val);
pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val);
pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val);
pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
PCI_L1SS_CTL1_L1SS_MASK, val);
}

static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
Expand Down
7 changes: 7 additions & 0 deletions drivers/perf/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,13 @@ config MARVELL_CN10K_DDR_PMU
Enable perf support for Marvell DDR Performance monitoring
event on CN10K platform.

config DWC_PCIE_PMU
tristate "Synopsys DesignWare PCIe PMU"
depends on PCI
help
Enable perf support for Synopsys DesignWare PCIe PMU Performance
monitoring event on platform including the Alibaba Yitian 710.

source "drivers/perf/arm_cspmu/Kconfig"

source "drivers/perf/amlogic/Kconfig"
Expand Down
1 change: 1 addition & 0 deletions drivers/perf/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) += alibaba_uncore_drw_pmu.o
obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
Expand Down
Loading
Loading