fix(linux): Fix headers and whitespace in AM62X performance rst
The performance guides use an incorrect header hierarchy. Fix the
header hierarchy and whitespace so the generated HTML is easier to
read and parse.
Signed-off-by: Judith Mendez <jm@ti.com>
source/devices/AM62X/linux/Linux_Performance_Guide.rst (78 additions & 65 deletions)
@@ -1,10 +1,11 @@
 
-====================================
+#################################
 Linux 12.00.00 Performance Guide
-====================================
+#################################
 
-.. rubric:: **Read This First**
-:name: read-this-first-kernel-perf-guide
+***************
+Read This First
+***************
 
 **All performance numbers provided in this document are gathered using
 following Evaluation Modules unless otherwise specified.**
@@ -21,25 +22,30 @@ following Evaluation Modules unless otherwise specified.**
 
 Table: Evaluation Modules
 
-.. rubric:: About This Manual
-:name: about-this-manual-kernel-perf-guide
+*****************
+About This Manual
+*****************
 
 This document provides performance data for each of the device drivers
 which are part of the Processor SDK Linux package. This document should be
 used in conjunction with release notes and user guides provided with the
 Processor SDK Linux package for information on specific issues present
 with drivers included in a particular release.
 
-.. rubric:: If You Need Assistance
-:name: if-you-need-assistance-kernel-perf-guide
-
 For further information or to report any problems, contact
 https://e2e.ti.com/ or https://support.ti.com/
 
+|
+
+*****************
 System Benchmarks
------------------
+*****************
+
+|
+
 LMBench
-^^^^^^^
+=======
+
 LMBench is a collection of microbenchmarks of which the memory bandwidth
 and latency related ones are typically used to estimate processor
 memory system performance. More information about lmbench at
@@ -183,7 +189,8 @@ Execute the LMBench with the following:
 "tcp_latency_using_localhost (microsec)","1.00 (min 0.85, max 1.14)","0.89 (min 0.76, max 1.02)","0.76"
 
 Dhrystone
-^^^^^^^^^
+=========
+
 Dhrystone is a core only benchmark that runs from warm L1 caches in all
 modern processors. It scales linearly with clock speed.
 
@@ -205,7 +212,8 @@ Execute the benchmark with the following:
 "dhrystone_per_second (dhrystonep)","6027183.60 (min 5882353.00, max 6250000.00)","6789289.58 (min 6451613.00, max 7142857.00)","6819923.17 (min 6666666.50, max 6896551.50)"
 
 Whetstone
-^^^^^^^^^
+=========
+
 Whetstone is a benchmark primarily measuring floating-point arithmetic performance.
 
 Execute the benchmark with the following:
@@ -220,7 +228,8 @@ Execute the benchmark with the following:
 "whetstone (mips)","4444.43 (min 3333.30, max 5000.00)","5000.00","5000.00"
 
 Linpack
-^^^^^^^
+=======
+
 Linpack measures peak double precision (64 bit) floating point performance in
 solving a dense linear system.
 
@@ -230,7 +239,8 @@ solving a dense linear system.
 "linpack (kflops)","515140.50 (min 508416.00, max 518513.00)","581699.00 (min 581477.00, max 581921.00)","578855.50 (min 578148.00, max 579563.00)"
 
 NBench
-^^^^^^
+======
+
 NBench which stands for Native Benchmark is used to measure macro benchmarks
 for commonly used operations such as sorting and analysis algorithms.
@@ -251,7 +261,8 @@
 "string_sort (iterations)","150.20 (min 150.14, max 150.27)","168.20 (min 168.14, max 168.29)","168.19 (min 168.17, max 168.21)"
 
 Stream
-^^^^^^
+======
+
 STREAM is a microbenchmark for measuring data memory system performance without
 any data reuse. It is designed to miss on caches and exercise data prefetcher
 and speculative accesses.
@@ -277,7 +288,8 @@ Execute the benchmark with the following:
 "triad (mb/s)","1349.88 (min 1303.10, max 1385.40)","1615.13 (min 1494.30, max 1856.90)","1667.70"
 
 CoreMarkPro
-^^^^^^^^^^^
+===========
+
 CoreMark®-Pro is a comprehensive, advanced processor benchmark that works with
 and enhances the market-proven industry-standard EEMBC CoreMark® benchmark.
 While CoreMark stresses the CPU pipeline, CoreMark-Pro tests the entire processor,
@@ -313,7 +325,8 @@ and floating-point workloads, and data sets for utilizing larger memory subsyste
 "zip-test (workloads/)","35.29 (min 33.90, max 36.36)","40.31 (min 38.46, max 42.55)","40.60 (min 38.46, max 41.67)"
 
 MultiBench
-^^^^^^^^^^
+==========
+
 MultiBench™ is a suite of benchmarks that allows processor and system designers to
 analyze, test, and improve multicore processors. It uses three forms of concurrency:
 Data decomposition: multiple threads cooperating on achieving a unified goal and
@@ -361,11 +374,13 @@ thread-enabled workloads to be tested.
 "x264-4mq (workloads/)","0.49 (min 0.48, max 0.50)","0.56 (min 0.55, max 0.57)","0.56 (min 0.56, max 0.57)"
 "x264-4mqw1 (workloads/)","0.49 (min 0.49, max 0.50)","0.56 (min 0.54, max 0.57)","0.56"
 
+|
+
 Boot-time Measurement
----------------------
+=====================
 
 Boot media: MMCSD
-^^^^^^^^^^^^^^^^^
+-----------------
 
 .. csv-table:: Linux boot time MMCSD
 :header: "Boot Configuration","am62xx_lp_sk-fs: Boot time in seconds: avg(min,max)","am62xx_sk-fs: Boot time in seconds: avg(min,max)","am62xxsip_sk-fs: Boot time in seconds: avg(min,max)"
@@ -376,8 +391,8 @@ Boot time numbers [avg, min, max] are measured from "Starting kernel" to Linux p
 
 |
 
-ALSA SoC Audio Driver
----------------------
+ALSA SoC Audio
+==============
 
 #. Access type - RW\_INTERLEAVED
 #. Channels - 2
@@ -412,11 +427,12 @@ ALSA SoC Audio Driver
 
 |
 
-Graphics SGX/RGX Driver
------------------------
+Graphics SGX/RGX
+================
 
 GFXBench
-^^^^^^^^
+--------
+
 Run GFXBench and capture performance reported (Score and Display rate in fps). All display outputs (HDMI, Displayport and/or LCD) are connected when running these tests
 
 .. csv-table:: GFXBench Performance
@@ -427,7 +443,7 @@ Run GFXBench and capture performance reported (Score and Display rate in fps). A
 " GFXBench 5.x gl_5_high_off","11.08 (min 10.87, max 11.19)","0.17","11.79 (min 11.60, max 11.89)","0.18"
 
 Glmark2
-^^^^^^^
+-------
 
 Run Glmark2 and capture performance reported (Score). All display outputs (HDMI, Displayport and/or LCD) are connected when running these tests
 
@@ -441,7 +457,8 @@ Run Glmark2 and capture performance reported (Score). All display outputs (HDMI,
 |
 
 Ethernet
------------------
+========
+
 Ethernet performance benchmarks were measured using :command:`netperf` 2.7.1 https://hewlettpackard.github.io/netperf/doc/netperf.html
 Test procedures were modeled after those defined in RFC-2544:
 https://tools.ietf.org/html/rfc2544, where the DUT is the TI device
@@ -507,29 +524,29 @@ Running the following commands will trigger :command:`netperf` clients to measur
 :header: "Command Used","am62xx_lp_sk-fs: THROUGHPUT (Mbits/sec)","am62xx_lp_sk-fs: CPU Load % (LOCAL_CPU_UTIL)","am62xx_sk-fs: THROUGHPUT (Mbits/sec)","am62xx_sk-fs: CPU Load % (LOCAL_CPU_UTIL)","am62xxsip_sk-fs: THROUGHPUT (Mbits/sec)","am62xxsip_sk-fs: CPU Load % (LOCAL_CPU_UTIL)"
 
 "netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_STREAM; netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_MAERTS","1641.49 (min 1549.80, max 1756.23)","51.62 (min 39.44, max 63.10)","1563.91 (min 1444.87, max 1707.72)","39.79 (min 33.68, max 47.59)","1731.44 (min 1694.58, max 1768.29)","50.23 (min 42.61, max 57.85)"
 
-.. rubric:: UDP Throughput
-:name: CPSW2g-udp-throughput-0-loss
+UDP Throughput
+^^^^^^^^^^^^^^
 
 .. csv-table:: CPSW2g UDP Egress Throughput 0 loss
 :header: "Frame Size(bytes)","am62xx_lp_sk-fs: UDP Datagram Size(bytes) (LOCAL_SEND_SIZE)","am62xx_lp_sk-fs: THROUGHPUT (Mbits/sec)","am62xx_lp_sk-fs: Packets Per Second (kPPS)","am62xx_lp_sk-fs: CPU Load % (LOCAL_CPU_UTIL)","am62xx_sk-fs: UDP Datagram Size(bytes) (LOCAL_SEND_SIZE)","am62xx_sk-fs: THROUGHPUT (Mbits/sec)","am62xx_sk-fs: Packets Per Second (kPPS)","am62xx_sk-fs: CPU Load % (LOCAL_CPU_UTIL)","am62xxsip_sk-fs: UDP Datagram Size(bytes) (LOCAL_SEND_SIZE)","am62xxsip_sk-fs: THROUGHPUT (Mbits/sec)","am62xxsip_sk-fs: Packets Per Second (kPPS)","am62xxsip_sk-fs: CPU Load % (LOCAL_CPU_UTIL)"
@@ -748,7 +759,7 @@
 "256k","32.00 (min 18.90, max 39.70)","1.36 (min 1.21, max 1.47)","83.60 (min 83.20, max 83.90)","1.58 (min 1.40, max 1.80)"
 
 MMC EXT4
-^^^^^^^^
+--------
 
 .. csv-table:: MMC EXT4
 :header: "Buffer size (bytes)","am62xx_sk-fs: Write Raw Throughput (Mbytes/sec)","am62xx_sk-fs: Write Raw CPU Load (%)","am62xx_sk-fs: Read Raw Throughput (Mbytes/sec)","am62xx_sk-fs: Read Raw CPU Load (%)"
@@ -816,11 +827,11 @@ The performance numbers were captured using the following:
@@ -913,11 +924,13 @@ Listed for each algorithm are the code snippets used to run each
 time -v openssl speed -elapsed -evp aes-128-cbc
 
 IPSec Software Performance
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+--------------------------
 
 .. csv-table:: IPSec Software Performance
 :header: "Algorithm","am62xx_sk-fs: Throughput (Mbps)","am62xx_sk-fs: Packets/Sec","am62xx_sk-fs: CPU Load","am62xxsip_sk-fs: Throughput (Mbps)","am62xxsip_sk-fs: Packets/Sec","am62xxsip_sk-fs: CPU Load"
 
 "aes128","77.65 (min 73.60, max 81.70)","6.50 (min 6.00, max 7.00)","51.08 (min 50.78, max 51.38)","80.50","7.00","51.43"
 "aes192","0.60","0.00","56.22"
 "aes256","130.00 (min 53.40, max 206.60)","11.00 (min 4.00, max 18.00)","39.11 (min 27.23, max 50.98)","88.60 (min 0.40, max 205.40)","7.67 (min 0.00, max 18.00)","43.07 (min 27.21, max 51.38)"