Gen2 EPYC を搭載する「c5a」インスタンスの性能をOpenSSLで確認してみた

最新 第2世代のAMD EPYCプロセッサ搭載、コンピューティングに最適化された C5Aインスタンスの性能を確認してみました。
2020.06.05

この記事は公開されてから1年以上経過しています。情報が古い可能性がありますので、ご注意ください。

AWSチームのすずきです。

2020/6/5、Gen2 世代 の AMD EPYC Rome プロセッサを搭載した 「C5a」 インスタンスが、 バージニア、オハイオ、アイルランド、フランクフルト、シドニー、シンガポール の各リージョンで利用可能になりました。

オレゴンリージョンで 「c5a.large」 のEC2インスタンスを起動して、OpenSSLコマンドでその処理性能を測定、既存のインスタンスとの比較する機会がありましたので、紹介させて頂きます。

環境

  • リージョン: オレゴン (us-west-2)
  • AMI: amzn2-ami-hvm-2.0.20200520.1-x86_64-gp2 (ami-0e34e7b9ca0ace12d)

cpuinfo

model name は 「AMD EPYC 7R32」、 カスタム型番の模様でした。

sh-4.2$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 49
model name      : AMD EPYC 7R32
stepping        : 0
microcode       : 0x8301025
cpu MHz         : 2999.009
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save rdpid
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 5600.42
TLB size        : 3072 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 49
model name      : AMD EPYC 7R32
stepping        : 0
microcode       : 0x8301025
cpu MHz         : 3115.607
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save rdpid
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 5600.42
TLB size        : 3072 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

openssl

openssl 暗号化方式を変えてパフォーマンスを求めました。

openssl speed -evp aes-128-ctr
openssl speed -evp aes-128-gcm

以下のインスタンスのCPUコア性能の比較を試みてみました。

インスタンスタイプ CPU
c5a.large AMD EPYC 7R32
c5.large Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
m5a.large AMD EPYC 7571
m3.large Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz

結果

第2世代のEPYCを搭載する「c5a」、 第1世代のEPYC コンピューティング最適化ではなく 汎用の 「m5a」と比較して 25% 高い CPUコア性能 で利用できる事が確認できました。

Intel CPU、 同世代の コンピューティング最適化「c5」 との比較では、aes-128-ctr は 約40% 高性能、 aes-128-gcm は 25% 低い性能でした。

aes-128-ctr

aes-128-gcm

まとめ

第2世代のEPYCを搭載する「c5a」、前世代から CPUコア性能が向上している事が確認できました。

Intel Xeonとの比較でも、 Intel CPUに特化した最適化、命令セットに依存したワークロードを除けば、 費用対効果の良い利用が期待できると思われます。

東京リージョンでの「C5a」の早期のリリースを期待したいと思います。

参考リンク

$ openssl speed -evp aes-128-ctr | aes-128-gcm | chacha20-poly1305 の結果を集めるスレ

ログ

lscpu

c5a

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7R32
Stepping:            0
CPU MHz:             3144.445
BogoMIPS:            5600.42
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            4096K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr ssesse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrandhypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcallfsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save rdpid

c5

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Stepping:            4
CPU MHz:             3400.038
BogoMIPS:            6000.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            25344K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr ssesse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke

m5a

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               1
Model name:          AMD EPYC 7571
Stepping:            2
CPU MHz:             2500.805
BogoMIPS:            4399.50
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           64K
L2 cache:            512K
L3 cache:            8192K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr ssesse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat npt nrip_save

m3

sh-4.2$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               62
Model name:          Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Stepping:            4
CPU MHz:             2500.115
BogoMIPS:            5000.13
Hypervisor vendor:   Xen
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            25600K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr ssesse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti fsgsbase smep erms xsaveopt

openssl

$ openssl version
OpenSSL 1.0.2k-fips  26 Jan 2017

c5a

$ openssl speed -evp aes-128-ctr
Doing aes-128-ctr for 3s on 16 size blocks: 155771851 aes-128-ctr's in 2.99s
Doing aes-128-ctr for 3s on 64 size blocks: 119225529 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 256 size blocks: 62484458 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 1024 size blocks: 23514798 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 8192 size blocks: 3432818 aes-128-ctr's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ctr     833561.74k  2543477.95k  5332007.08k  8026384.38k  9373881.69k

$ openssl speed -evp aes-128-gcm
Doing aes-128-gcm for 3s on 16 size blocks: 120025321 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 69139341 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 29172864 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 10332539 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 1419860 aes-128-gcm's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm     640135.05k  1474972.61k  2489417.73k  3526839.98k  3877164.37k

c5

$ openssl speed -evp aes-128-ctr
Doing aes-128-ctr for 3s on 16 size blocks: 155617817 aes-128-ctr's in 2.99s
Doing aes-128-ctr for 3s on 64 size blocks: 102234872 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 256 size blocks: 46302738 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 1024 size blocks: 14538980 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 8192 size blocks: 1958875 aes-128-ctr's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ctr     832737.48k  2181010.60k  3951166.98k  4962638.51k  5349034.67k

$ openssl speed -evp aes-128-gcm
Doing aes-128-gcm for 3s on 16 size blocks: 110622788 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 66645416 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 31259119 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 12223862 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 1897743 aes-128-gcm's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm     589988.20k  1421768.87k  2667444.82k  4172411.56k  5182103.55k

m5a

sh-4.2$ openssl speed -evp aes-128-ctr
Doing aes-128-ctr for 3s on 16 size blocks: 119757154 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 64 size blocks: 89850443 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 256 size blocks: 46666773 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 1024 size blocks: 17375495 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 8192 size blocks: 2506952 aes-128-ctr's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ctr     638704.82k  1916809.45k  3982231.30k  5930835.63k  6845650.26k

sh-4.2$ openssl speed -evp aes-128-gcm
Doing aes-128-gcm for 3s on 16 size blocks: 83049029 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 44164295 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 21574089 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 7471839 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 1087827 aes-128-gcm's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm     442928.15k   942171.63k  1840988.93k  2550387.71k  2970492.93k

m3

sh-4.2$ openssl speed -evp aes-128-ctr
Doing aes-128-ctr for 3s on 16 size blocks: 81601869 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 64 size blocks: 61444877 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 256 size blocks: 31790303 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 1024 size blocks: 10039252 aes-128-ctr's in 3.00s
Doing aes-128-ctr for 3s on 8192 size blocks: 1349002 aes-128-ctr's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ctr     435209.97k  1310824.04k  2712772.52k  3426731.35k  3683674.79k

sh-4.2$ openssl speed -evp aes-128-gcm
Doing aes-128-gcm for 3s on 16 size blocks: 52092242 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 64 size blocks: 33475416 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 256 size blocks: 11592981 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 3180357 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 407830 aes-128-gcm's in 3.00s
OpenSSL 1.0.2k-fips  26 Jan 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm     277825.29k   714142.21k   989267.71k  1085561.86k  1113647.79k