1 | | -PEP: 830 |
| 1 | +PEP: 831 |
2 | 2 | Title: Frame Pointers Everywhere: Enabling System-Level Observability for Python |
3 | 3 | Author: Pablo Galindo Salgado <pablogsal@python.org>, |
4 | 4 | Ken Jin <kenjin@python.org>, |
@@ -67,7 +67,7 @@ default experience for Python. |
67 | 67 |
68 | 68 | The performance wins that profiling enables far outweigh the modest overhead of |
69 | 69 | frame pointers. As Brendan Gregg notes: "I've seen frame pointers help find |
70 | | -performance wins ranging from 5% to 500%" [#gregg2024]_. A 1-2% overhead that |
| 70 | +performance wins ranging from 5% to 500%" [#gregg2024]_. A 0.5-2% overhead that |
71 | 71 | unlocks the ability to find 5-500% improvements is a favourable trade. |
72 | 72 |
73 | 73 | What Are Frame Pointers? |
@@ -109,7 +109,7 @@ At optimisation levels ``-O1`` and above, GCC and Clang omit frame pointers by |
109 | 109 | default [#gcc_fomit]_. This frees the ``%rbp`` register for general use, |
110 | 110 | giving the optimiser one more register to work with. On x86-64 this is a gain |
111 | 111 | of one register out of 16 (about 7%). The performance benefit is small |
112 | | -(typically 1-2%) but it was considered worthwhile when the convention was |
| 112 | +(typically 0.5-2%) but it was considered worthwhile when the convention was |
113 | 113 | established for 32-bit x86, where the gain was one register out of 6 (~20%). |
114 | 114 |
115 | 115 | Without frame pointers, the linked list does not exist. Tools that need to |
@@ -611,7 +611,7 @@ system calls. |
611 | 611 | A common misconception in the community is that frame pointers carry large |
612 | 612 | overhead "because there was a single Python case that had a +10% slowdown." |
613 | 613 | [#hn-fp]_ That single case is the eval loop benchmark; the geometric mean |
614 | | -across real workloads is 1-2%. |
| 614 | +across real workloads is 0.5-2%. |
615 | 615 |
616 | 616 | Detailed Performance Analysis of CPython with Frame Pointers |
617 | 617 | ------------------------------------------------------------ |
@@ -764,7 +764,7 @@ This distinguishes frame pointers from other flags placed in ``CFLAGS_NODIST``: |
764 | 764 | those flags (such as ``-Werror`` or internal warning suppressions) are |
765 | 765 | correctness or policy controls that are meaningful per-compilation-unit. Frame |
766 | 766 | pointers are an ecosystem-wide property that is only effective when all |
767 | | -participants cooperate. The 1-2% overhead measured on CPython is driven by its |
| 767 | +participants cooperate. The 0.5-2% overhead measured on CPython is driven by its |
768 | 768 | high density of small C helper function calls; typical C extension code does |
769 | 769 | not exhibit the same call density and sees negligible overhead. |
770 | 770 |
@@ -861,7 +861,7 @@ are compiled separately and unaffected by Python's ``CFLAGS``. Extensions with |
861 | 861 | hot scalar C loops (e.g., Cython-generated code) may see measurable but modest |
862 | 862 | overhead. |
863 | 863 |
864 | | -For context, 1-2% geometric mean is comparable to overhead routinely accepted |
| 864 | +For context, 0.5-2% geometric mean is comparable to overhead routinely accepted |
865 | 865 | for build-time defaults such as ``-fstack-protector-strong`` (security) and the |
866 | 866 | ASLR-compatible ``-fPIC`` flag for shared libraries. In return, the entire |
867 | 867 | Python ecosystem gains the ability to produce complete flame graphs, accurate |
@@ -1074,11 +1074,11 @@ The first graph is the overall effect on pyperformance seen on each system. |
1074 | 1074 | Apart from the Ubuntu AWS Graviton System, all system configurations have below 2% |
1075 | 1075 | geometric mean and median slowdown: |
1076 | 1076 |
1077 | | -.. image:: pep-0830_perf_over_baseline.svg |
| 1077 | +.. image:: pep-0831_perf_over_baseline.svg |
1078 | 1078 |
1079 | 1079 | For individual benchmark results, see the following: |
1080 | 1080 |
1081 | | -.. image:: pep-0830_perf_over_baseline_indiv.svg |
| 1081 | +.. image:: pep-0831_perf_over_baseline_indiv.svg |
1082 | 1082 |
1083 | 1083 |
1084 | 1084 | Copyright |