Sunny Weather Forecast for ARM's Cost of Solution in Cloud HPC
Published 25/08/2020 By Branden Moore
One of the primary drivers for Cloud computing is access to architectures and
systems which may not be readily available in-house. One example of this is
AWS's somewhat recent introduction of their own custom-designed Graviton 2
processor. This processor is based on the ARM architecture, rather than the
x86-based architectures from Intel and AMD. We have had a number of clients
enquire about how viable ARM is for their HPC needs. While there are a handful
of published benchmarks available, I decided to take an afternoon and try it
for myself.
For this small exercise, I decided to benchmark the weather code WRF v3.9.1.1.
There are two "traditional" benchmarks for WRFv3, representing different
resolutions (12km and 2.5km resolutions). Both benchmarks run for 3 simulated
hours. The smaller benchmark (12km resolution) typically scales well to a few
hundred cores, and the larger benchmark (2.5km resolution) will scale to a few
thousand cores. However, for this project, I ran the benchmarks on only a
single node, and as this exercise was only to satisfy my own curiosity, I did
not re-run the benchmarks multiple times, as we would normally do to capture
statistical variation.
For the benchmarking hardware, I wanted to compare offerings from Intel, AMD
and ARM. AWS is currently the only major cloud provider to offer ARM-based
instances with their AWS Graviton 2 processor. This led me to the C5, C5a and
C6g instance types on AWS, of which I selected the largest instance type
available, in order to get a full node. The benchmarked systems all used
Amazon Linux 2 as the OS, and I used Spack to install GCC 9.3.0 and to build
WRF's dependencies[1], which made building for ARM no more difficult than for
Intel and AMD. When building WRF itself, the only modifications to the default
compilation configuration were to add the aarch64 architecture to the GNU/Linux
configuration section, and to add tuning parameters to optimize for the target
platform (-march=native -mtune=native).
Instance Name   Processor ID                 # Sockets   # vCPU   # Physical Cores   RAM     Price/Hr
c5.24xlarge     Intel Xeon Platinum 8275CL   2           96       48                 192GB   $4.08
c5a.24xlarge    AMD EPYC 7R32                1           96       48                 192GB   $3.696
c6g.16xlarge    AWS Graviton 2               1           64       64                 128GB   $2.176
Already, it is clear that this benchmark will not quite be an apples-to-apples
comparison. The C5 (Intel) instance is a dual-socket configuration, whereas
both the C5a (AMD) and C6g (ARM) instances are single socket. Both C5 and C5a
have SMT (aka, HyperThreading), whereas the C6g does not. If we instead
consider comparing the performance from each instance offering, rather than
looking at the underlying CPU configuration, the comparisons become much more
straightforward.
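To make the mismatch between the instances concrete, here is a minimal sketch that normalises the hourly price by vCPU and by physical core. The figures are copied from the instance table above; the dictionary layout and the function name `price_per_unit` are my own illustration and not part of the benchmark setup.

```python
# Instance data copied from the table above (On-Demand prices as quoted there).
INSTANCES = {
    "c5.24xlarge":  {"vcpus": 96, "cores": 48, "price_per_hr": 4.08},
    "c5a.24xlarge": {"vcpus": 96, "cores": 48, "price_per_hr": 3.696},
    "c6g.16xlarge": {"vcpus": 64, "cores": 64, "price_per_hr": 2.176},
}


def price_per_unit(instance, unit):
    """Hourly price divided by the count of `unit` ("vcpus" or "cores")."""
    return instance["price_per_hr"] / instance[unit]


for name, spec in INSTANCES.items():
    per_vcpu = price_per_unit(spec, "vcpus")
    per_core = price_per_unit(spec, "cores")
    print(f"{name:13s} ${per_vcpu:.4f}/vCPU-hr  ${per_core:.4f}/core-hr")
```

Because the C6g has no SMT, its per-vCPU and per-core prices coincide, while the C5 and C5a prices per physical core are twice their per-vCPU prices. This says nothing about performance per core, which is why the rest of the article compares whole instances instead.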
Results
********** SMT / Hyperthreading **********
Many HPC applications demonstrate a performance degradation when SMT is in use,
so many HPC centers disable it. I wanted to check whether that is the case for
this benchmark. A quick look at Figure 1 shows that in this case there is a
performance advantage to using SMT (that is, using all 96 threads) on both the
Intel and AMD systems for WRF, but the difference is minor at best. We can also
see that the dual-socket Intel system significantly out-performs the
single-socket AMD system on the larger benchmark, most likely due to the higher
overall system memory bandwidth[2]. For the remainder of this article, "full
instance" will refer to using all of the vCPUs available to the instance.
[Performance advantage to using SMT]
Figure 1: Comparing using SMT vs not - shorter is better
********** Comparing Performance for ARM vs Intel vs AMD for WRF **********
Figure 2 shows the total compute time (not including startup time or writing
the results to disk) for WRF running both benchmarks across the three
architectures. It is plain to see that AWS's Graviton 2 chip performs quite
competitively. While it is the slowest of the three for the smaller benchmark
(12km resolution), it out-performs AMD's offering during the larger scale
benchmark (2.5km resolution). The Intel-based system shows a non-trivial
performance advantage over both ARM and AMD.
[Compute time using full instance (all vCPU)]
Figure 2: Compute time using full instance (all vCPU) - shorter is better
I expect that the higher memory speeds available on the Graviton 2 processor are
the main reason that it out-performs the AMD system on the larger-scale
benchmark. If AWS introduces a dual-socket AMD Rome instance type, this should,
of course, be revisited. The Intel processor's higher clock speed, combined
with the increased memory bandwidth from having 2 sockets, gives it a sizable
performance advantage here.
********** Comparing Costs of ARM vs Intel vs AMD for WRF **********
With this study taking place in "the Cloud," it is imperative to also consider
costs when benchmarking. AWS has priced their Graviton 2 offerings extremely
competitively. Recall that the instances being benchmarked cost (as of today,
in the US-EAST-2 region, with On-Demand pricing) $4.08/hr for the Intel system,
$3.70/hr for the AMD system, and only $2.18/hr for the Graviton 2 system.
We multiply the hourly price by our runtime to give us our cost-to-solution
numbers. As can be seen in Figure 3, while the Intel-based instance has much
higher performance than the ARM-based instance, when you factor in prices, the
Graviton 2 gives us a lower cost-to-solution, despite taking a longer time to
reach the solution.
[Cost comparisons]
Figure 3: Cost comparisons - shorter is better
There is a distinct tradeoff between performance and cost for WRF on these
platforms. For the 2.5km benchmark, we would ideally explore a scaling study as
well, to see if there is a point where you can get the performance of an Intel-
based solution, but at a cheaper overall cost, with an ARM-based solution.
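One way to quantify this tradeoff on a single node is a break-even slowdown: how many times longer the Graviton 2 instance could run before it costs as much as the Intel instance. The sketch below uses the hourly prices quoted above and the single-node compute times from the results table in this article. It assumes cost is simply price multiplied by compute time, and the variable names are illustrative only.

```python
# Hourly On-Demand prices quoted in the article (USD).
PRICE_C5 = 4.08    # Intel Xeon Platinum 8275CL instance
PRICE_C6G = 2.176  # AWS Graviton 2 instance

# Single-node compute times in seconds, from the article's results table.
TIMES = {
    "CONUS 12km":  {"c5": 59.00,   "c6g": 75.33},
    "CONUS 2.5km": {"c5": 3395.74, "c6g": 4384.16},
}

# The c6g costs the same as the c5 when it takes this many times longer.
break_even_slowdown = PRICE_C5 / PRICE_C6G

for bench, t in TIMES.items():
    slowdown = t["c6g"] / t["c5"]
    cheaper = slowdown < break_even_slowdown
    print(f"{bench}: c6g is {slowdown:.2f}x slower; "
          f"break-even is {break_even_slowdown:.2f}x; cheaper: {cheaper}")
```

With these numbers the break-even slowdown is 1.875x, while the measured slowdown is roughly 1.28x to 1.29x, so the Graviton 2 instance stays well inside the region where it is the cheaper option. A multi-node scaling study would change the runtimes, so this ratio only describes the single-node case.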
[Cost & Performance comparisons]
Figure 4: Cost and Performance comparisons - shorter is better
System Name            Benchmark      Compute Time (s)   Benchmark Cost ($)
c6g (Graviton 2)       CONUS 12km     75.33              $0.046
c5a (EPYC 7R32)        CONUS 12km     68.26              $0.070
c5 (Xeon 8275CL)       CONUS 12km     59.00              $0.067
c6g (Graviton 2)       CONUS 2.5km    4384.16            $2.65
c5a (EPYC 7R32)        CONUS 2.5km    4799.63            $4.93
c5 (Xeon 8275CL)       CONUS 2.5km    3395.74            $3.85
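The cost column above follows from multiplying the hourly price by the compute time. As a minimal sketch, the snippet below recomputes it from the prices in the instance table and the compute times listed here. It assumes billing is proportional to compute time only, ignoring startup and I/O time, and the helper name `cost_to_solution` is illustrative rather than anything used in the study.

```python
# Hourly On-Demand prices (USD) from the instance table.
PRICE_PER_HOUR = {"c6g": 2.176, "c5a": 3.696, "c5": 4.08}

# Compute times in seconds from the results table.
COMPUTE_TIME_S = {
    ("c6g", "CONUS 12km"): 75.33,
    ("c5a", "CONUS 12km"): 68.26,
    ("c5", "CONUS 12km"): 59.00,
    ("c6g", "CONUS 2.5km"): 4384.16,
    ("c5a", "CONUS 2.5km"): 4799.63,
    ("c5", "CONUS 2.5km"): 3395.74,
}


def cost_to_solution(price_per_hour, seconds):
    """Cost in USD of running for `seconds` at `price_per_hour`."""
    return price_per_hour * seconds / 3600.0


for (system, bench), seconds in COMPUTE_TIME_S.items():
    cost = cost_to_solution(PRICE_PER_HOUR[system], seconds)
    print(f"{system:4s} {bench:12s} {seconds:8.2f} s  ${cost:.3f}")
```

Rounded to the precision used in the table, the recomputed values agree with the reported costs.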
As the AMD EPYC and ARM HPC ecosystems mature, we can hope to see increased
performance from compilers which are more targeted at these architectures (i.e.,
AOCC from AMD, and ARM's Allinea Studio), as well as other LLVM-based
compilers. In the past, we have seen that the Intel compiler does a better job
than gfortran of optimizing WRF for Intel processors. It would be interesting
to revisit this benchmark study with additional compilers.
Summary
In response to questions about the suitability of ARM processors for HPC
today, I ran one popular HPC benchmark, WRFv3, on three different
compute-optimized AWS instance types. I found that AWS's custom ARM-based
offering, while not the fastest processor available for this benchmark,
provides a very cost-efficient solution for WRF, and its performance is
competitive with other, more traditional HPC processors.
If you're interested in more in-depth benchmarking and performance analysis
of various HPC hardware solutions, please get in contact.
********** Further reading **********
* https://www.nag.com/content/nag-cloud-hpc-migration-service
* https://www.nag.com/blog/cost-solution-cloud-hpc
===============================================================================
[1] I first attempted to use GCC 10, but gfortran 10 and WRF do not appear to
get along well. WRF would crash at runtime due to routines in libgfortran.
[2] If AWS introduces a dual-socket AMD Rome instance size, such as are
available on other cloud providers, the performance profile should change
significantly, and this will be worth revisiting.
Copyright 2021, Numerical Algorithms Group Ltd (The)