How Good Is Mountpoint for Amazon S3 Read Performance for a Genomics Workload? I Measured It

2023.08.26

Mountpoint for Amazon S3 lets you mount an S3 bucket on EC2 and read and write to it directly. I was curious what kind of throughput and IOPS it delivers, so I measured it.

This article assumes a workload in which an analysis tool reads input files such as a reference genome, and measures sequential-read performance with fio.

Inventory icons created by Freepik - Flaticon

Mountpoint for Amazon S3 is a new tool intended for reading large volumes of data, with genomics analysis named among its expected use cases. Its release is a good opportunity to rethink where storage fits in a genomics analysis workflow.

Mountpoint for Amazon S3 is ideal for workloads that read large datasets (terabytes to petabytes in size) and require the elasticity and high throughput of Amazon Simple Storage Service (Amazon S3). Common use cases include large-scale machine learning (ML) training, autonomous vehicle simulation, genomics analysis, and image rendering. While these workloads read large datasets over several compute instances, they write sequentially to a file from a single node. This means they do not need shared file system features such as locking.

Open Source File Client – Mountpoint for Amazon S3 – AWS

Test Environment

I built the environment with HPC workload considerations in mind. fio is used to simulate the I/O workload.

The EC2 instance accesses the S3 bucket through a Gateway-type VPC endpoint.

EC2 Instance

I chose an instance type that offers a stable network bandwidth of 10 Gbps or more at a reasonable price.

Item                          Value
OS                            Ubuntu 22.04
CPU                           Intel
Instance Type                 m6i.8xlarge
vCPU                          32
Simultaneous Multi-Threading  Enabled
Memory                        128 GiB
Network Bandwidth             12.5 Gbps
Mountpoint for Amazon S3      v1.0.0
fio                           v3.28
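Since the 12.5 Gbps NIC sets a hard ceiling on any S3 read, it helps to convert that figure to GiB/s before looking at the fio numbers. A quick sketch (note that network "giga" is decimal, while fio reports binary GiB):

```shell
# Convert the instance's advertised NIC bandwidth to binary GiB/s.
# 12.5 Gbps (decimal) / 8 bits = bytes per second, then divide by 1024^3.
awk 'BEGIN {
  gbps = 12.5
  bytes_per_sec = gbps * 1e9 / 8
  printf "12.5 Gbps = %.2f GiB/s theoretical maximum\n", bytes_per_sec / (1024 ^ 3)
}'
# Prints: 12.5 Gbps = 1.46 GiB/s theoretical maximum
```

So the best results below run close to line rate for this instance type.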

fio Configuration

I created read-access settings modeled on a typical genomics analysis workload.

  • Read file sizes: 500 MiB, 1 GiB, and 3 GiB (3 patterns)
    • Modeled on typical reference genome sizes
    • Access to input files such as reference genomes is assumed to be sequential reads
  • Parallel job counts: 1, 8, and 16 (3 patterns)
    • Mountpoint for Amazon S3's default limit is 16 threads
    • The degree of parallelism varies by analysis and application, so several values are compared
  • Block size fixed at 8 MiB
    • 8 MiB is Mountpoint for Amazon S3's default part size for parallel requests
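Given the 8 MiB part size, the number of full-size requests fio issues per test file is easy to estimate. A sketch (500 MiB is not a multiple of 8 MiB, so fio reads only the 62 full blocks, i.e. 496 MiB, consistent with the `issued rwts` counts in the logs below):

```shell
# Number of full 8 MiB requests per test file (bs=8m, sequential read).
part_mib=8
for size_mib in 500 1024 3072; do
  echo "${size_mib} MiB -> $((size_mib / part_mib)) requests of ${part_mib} MiB"
done
# 500 MiB -> 62, 1024 MiB -> 128, 3072 MiB -> 384
```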
Configuration files

SEQ500.conf

[global]
ioengine=libaio
direct=1
kb_base=1024
iodepth=128
directory=/mnt/s3
stonewall
group_reporting

[SEQ8M-Q128T1-500M-Read]
size=500M
bs=8m
numjobs=1
rw=read

[SEQ8M-Q128T8-500M-Read]
size=500M
bs=8m
numjobs=8
rw=read

[SEQ8M-Q128T16-500M-Read]
size=500M
bs=8m
numjobs=16
rw=read

SEQ1G.conf

[global]
ioengine=libaio
direct=1
kb_base=1024
iodepth=128
directory=/mnt/s3
stonewall
group_reporting

[SEQ8M-Q128T1-1G-Read]
size=1g
bs=8m
numjobs=1
rw=read

[SEQ8M-Q128T8-1G-Read]
size=1g
bs=8m
numjobs=8
rw=read

[SEQ8M-Q128T16-1G-Read]
size=1g
bs=8m
numjobs=16
rw=read

SEQ3G.conf

[global]
ioengine=libaio
direct=1
kb_base=1024
iodepth=128
directory=/mnt/s3
stonewall
group_reporting

[SEQ8M-Q128T1-3G-Read]
size=3g
bs=8m
numjobs=1
rw=read

[SEQ8M-Q128T8-3G-Read]
size=3g
bs=8m
numjobs=8
rw=read

[SEQ8M-Q128T16-3G-Read]
size=3g
bs=8m
numjobs=16
rw=read

Measuring Storage Performance

I mounted the S3 bucket at /mnt/s3 on the EC2 instance using Mountpoint for Amazon S3.

$ sudo mount-s3 --allow-delete --allow-other hpc-dev-mountpoint /mnt/s3

Then I ran each measurement config in turn.

$ sudo fio ./SEQ500.conf
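A small loop over the three config files does the same thing and keeps a log per config (the log file naming here is my own choice, not from the original runs):

```shell
# Run each fio config in turn, saving the output for later comparison.
for conf in SEQ500.conf SEQ1G.conf SEQ3G.conf; do
  sudo fio "./${conf}" | tee "${conf%.conf}.log"
done
```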

Results

For the workload I had in mind, sequential read throughput reached 1.3 GiB/s. At that speed, keeping input files such as reference genomes in S3 and pointing the analysis tool directly at the mounted path is a real option.

However, with a parallelism of 1 it was slow, especially for smaller files: 375 to 992 MiB/s depending on file size. Since analysis tools generally read large files in the first place, I suspect many of them can access data in parallel.
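To put those numbers in wall-clock terms, here is a rough sketch of how long reading a single 3 GiB reference takes at the measured single-job rate (992 MiB/s) versus the 16-job rate (1423 MiB/s):

```shell
# Time to read one 3 GiB (3072 MiB) file at the two measured rates.
awk 'BEGIN {
  size_mib = 3 * 1024
  printf "1 job  (992 MiB/s): %.1f s\n", size_mib / 992
  printf "16 jobs (1423 MiB/s): %.1f s\n", size_mib / 1423
}'
# 1 job: 3.1 s, 16 jobs: 2.2 s
```

For a single reference genome the difference is seconds, so parallelism matters most when many files or many samples are read.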

Ultimately, you would run the analysis tool once and then decide where to store input files based on the tool's access pattern.

Of course, downloading files from S3 to a fast instance store or to EBS will give you higher throughput and IOPS. But considering convenience, the read performance of Mountpoint for Amazon S3 and S3's low per-GB price as bulk storage are attractive.

For anyone who worries about storage costs but can't keep up with tidying files, this could be an especially good solution.

  • The extra step of downloading files from S3 before an analysis goes away
    • You no longer need to provision a large EBS volume just to hold local copies, which reduces storage costs
    • If you have been using pay-per-use EFS for that purpose, you no longer need to worry about burst credits
    • EBS and EFS still have their place for write access, but as a home for long-lived files they are worth revisiting
  • Input files, which tend to grow in both size and count, can stay in S3 and be accessed directly by the analysis tool
    • With S3 Intelligent-Tiering, files left sitting in S3 move automatically to cheaper storage classes, compressing costs further

Where you previously exchanged files with S3 via the AWS CLI, once the bucket is mounted you can manipulate files with plain cp, so even users unfamiliar with AWS can start using S3 easily. One caveat: Mountpoint for Amazon S3 has limitations, such as not being fully POSIX compliant.
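As a self-contained illustration of that cp workflow (a temporary directory stands in for the real /mnt/s3 mount point, and the file name is made up):

```shell
# Stand-in for the mounted bucket; on a real instance this would be /mnt/s3.
mount_dir=$(mktemp -d)
echo ">chr1" > "${mount_dir}/reference.fa"   # pretend object in the bucket

# Ordinary POSIX tools work against the mounted path.
local_dir=$(mktemp -d)
cp "${mount_dir}/reference.fa" "${local_dir}/reference.fa"
ls "${local_dir}"
```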

Summary

Using S3 as persistent storage and reading from it directly, parallel reads of large files reach 1.3 GiB/s, which is more than adequate. It is worth rethinking where you store input files and how you read them.


How to use Mountpoint for Amazon S3 with ParallelCluster is covered in this blog post.

Execution Logs

SEQ500.conf
SEQ8M-Q128T1-500M-Read: (g=0): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
SEQ8M-Q128T8-500M-Read: (g=1): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
SEQ8M-Q128T16-500M-Read: (g=2): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
fio-3.28
Starting 25 processes
SEQ8M-Q128T1-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T8-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
SEQ8M-Q128T16-500M-Read: Laying out IO file (1 file / 500MiB)
Jobs: 13 (f=13): [_(9),R(4),_(1),R(2),_(1),R(7),_(1)][100.0%][r=1488MiB/s][r=186 IOPS][eta 00m:00s]
SEQ8M-Q128T1-500M-Read: (groupid=0, jobs=1): err= 0: pid=1409: Sat Aug 26 02:28:26 2023
  read: IOPS=46, BW=375MiB/s (394MB/s)(496MiB/1321msec)
    slat (msec): min=3, max=324, avg=19.02, stdev=57.11
    clat (usec): min=5, max=901468, avg=235095.71, stdev=193283.48
     lat (msec): min=3, max=1179, avg=254.11, stdev=225.15
    clat percentiles (usec):
     |  1.00th=[     6],  5.00th=[ 10421], 10.00th=[ 31589], 20.00th=[101188],
     | 30.00th=[123208], 40.00th=[166724], 50.00th=[191890], 60.00th=[246416],
     | 70.00th=[267387], 80.00th=[287310], 90.00th=[633340], 95.00th=[641729],
     | 99.00th=[901776], 99.50th=[901776], 99.90th=[901776], 99.95th=[901776],
     | 99.99th=[901776]
  lat (usec)   : 10=1.61%
  lat (msec)   : 4=1.61%, 10=1.61%, 20=4.84%, 50=1.61%, 100=8.06%
  lat (msec)   : 250=43.55%, 500=25.81%, 750=9.68%, 1000=1.61%
  cpu          : usr=0.00%, sys=7.95%, ctx=498, majf=0, minf=126990
  IO depths    : 1=1.6%, 2=3.2%, 4=6.5%, 8=12.9%, 16=25.8%, 32=50.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=100.0%, >=64=0.0%
     issued rwts: total=62,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T8-500M-Read: (groupid=1, jobs=8): err= 0: pid=1410: Sat Aug 26 02:28:26 2023
  read: IOPS=158, BW=1267MiB/s (1329MB/s)(3968MiB/3131msec)
    slat (msec): min=3, max=800, avg=43.09, stdev=114.80
    clat (usec): min=3, max=2883.5k, avg=845250.75, stdev=643855.32
     lat (msec): min=3, max=3046, avg=888.34, stdev=676.60
    clat percentiles (usec):
     |  1.00th=[      6],  5.00th=[  12911], 10.00th=[  48497],
     | 20.00th=[ 191890], 30.00th=[ 471860], 40.00th=[ 574620],
     | 50.00th=[ 767558], 60.00th=[ 994051], 70.00th=[1082131],
     | 80.00th=[1333789], 90.00th=[1669333], 95.00th=[2264925],
     | 99.00th=[2667578], 99.50th=[2868904], 99.90th=[2868904],
     | 99.95th=[2868904], 99.99th=[2868904]
  lat (usec)   : 4=0.20%, 10=1.41%
  lat (msec)   : 4=0.20%, 10=2.82%, 20=1.81%, 50=4.03%, 100=4.84%
  lat (msec)   : 250=5.65%, 500=10.89%, 750=17.34%, 1000=11.49%, 2000=31.65%
  lat (msec)   : >=2000=7.66%
  cpu          : usr=0.00%, sys=4.99%, ctx=4001, majf=4, minf=1015943
  IO depths    : 1=1.6%, 2=3.2%, 4=6.5%, 8=12.9%, 16=25.8%, 32=50.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=100.0%, >=64=0.0%
     issued rwts: total=496,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T16-500M-Read: (groupid=2, jobs=16): err= 0: pid=1425: Sat Aug 26 02:28:26 2023
  read: IOPS=168, BW=1351MiB/s (1417MB/s)(7936MiB/5874msec)
    slat (msec): min=3, max=3184, avg=84.73, stdev=219.15
    clat (usec): min=4, max=5581.1k, avg=2172166.49, stdev=1457512.25
     lat (msec): min=3, max=5770, avg=2256.90, stdev=1482.10
    clat percentiles (usec):
     |  1.00th=[      7],  5.00th=[  19268], 10.00th=[  78119],
     | 20.00th=[ 658506], 30.00th=[1233126], 40.00th=[1635779],
     | 50.00th=[2122318], 60.00th=[2701132], 70.00th=[3103785],
     | 80.00th=[3405775], 90.00th=[4211082], 95.00th=[4731175],
     | 99.00th=[5133829], 99.50th=[5402264], 99.90th=[5603591],
     | 99.95th=[5603591], 99.99th=[5603591]
  lat (usec)   : 10=1.51%, 20=0.10%
  lat (msec)   : 4=0.10%, 10=1.81%, 20=1.51%, 50=3.02%, 100=2.82%
  lat (msec)   : 250=2.62%, 500=4.64%, 750=3.63%, 1000=4.13%, 2000=20.56%
  lat (msec)   : >=2000=53.53%
  cpu          : usr=0.01%, sys=2.59%, ctx=8026, majf=1, minf=2031871
  IO depths    : 1=1.6%, 2=3.2%, 4=6.5%, 8=12.9%, 16=25.8%, 32=50.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=100.0%, >=64=0.0%
     issued rwts: total=992,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=375MiB/s (394MB/s), 375MiB/s-375MiB/s (394MB/s-394MB/s), io=496MiB (520MB), run=1321-1321msec

Run status group 1 (all jobs):
   READ: bw=1267MiB/s (1329MB/s), 1267MiB/s-1267MiB/s (1329MB/s-1329MB/s), io=3968MiB (4161MB), run=3131-3131msec

Run status group 2 (all jobs):
   READ: bw=1351MiB/s (1417MB/s), 1351MiB/s-1351MiB/s (1417MB/s-1417MB/s), io=7936MiB (8321MB), run=5874-5874msec
SEQ1G.conf
SEQ8M-Q128T1-1G-Read: (g=0): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
SEQ8M-Q128T8-1G-Read: (g=1): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
SEQ8M-Q128T16-1G-Read: (g=2): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
fio-3.28
Starting 25 processes
SEQ8M-Q128T1-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T8-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
SEQ8M-Q128T16-1G-Read: Laying out IO file (1 file / 1024MiB)
Jobs: 10 (f=9): [_(10),f(1),R(1),_(1),R(3),_(1),R(2),_(1),R(2),_(1),R(1),_(1)][100.0%][r=3072MiB/s][r=384 IOPS][eta 00m:Jobs: 5 (f=5): [_(11),R(1),_(1),R(2),_(3),R(1),_(1),R(1),_(4)][100.0%][r=4096MiB/s][r=512 IOPS][eta 00m:00s]
SEQ8M-Q128T1-1G-Read: (groupid=0, jobs=1): err= 0: pid=1501: Sat Aug 26 02:32:12 2023
  read: IOPS=75, BW=605MiB/s (635MB/s)(1024MiB/1692msec)
    slat (msec): min=3, max=276, avg=12.33, stdev=38.64
    clat (usec): min=13, max=1301.7k, avg=403261.62, stdev=339870.88
     lat (msec): min=3, max=1577, avg=415.59, stdev=353.47
    clat percentiles (msec):
     |  1.00th=[    4],  5.00th=[   22], 10.00th=[   43], 20.00th=[   90],
     | 30.00th=[  138], 40.00th=[  186], 50.00th=[  234], 60.00th=[  542],
     | 70.00th=[  693], 80.00th=[  743], 90.00th=[  911], 95.00th=[ 1062],
     | 99.00th=[ 1116], 99.50th=[ 1301], 99.90th=[ 1301], 99.95th=[ 1301],
     | 99.99th=[ 1301]
  lat (usec)   : 20=0.78%
  lat (msec)   : 4=0.78%, 10=0.78%, 20=2.34%, 50=7.03%, 100=10.16%
  lat (msec)   : 250=30.47%, 500=4.69%, 750=25.00%, 1000=12.50%, 2000=5.47%
  cpu          : usr=0.00%, sys=13.31%, ctx=1027, majf=0, minf=262157
  IO depths    : 1=0.8%, 2=1.6%, 4=3.1%, 8=6.2%, 16=12.5%, 32=25.0%, >=64=50.8%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=50.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=50.0%
     issued rwts: total=128,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T8-1G-Read: (groupid=1, jobs=8): err= 0: pid=1502: Sat Aug 26 02:32:12 2023
  read: IOPS=167, BW=1341MiB/s (1406MB/s)(8192MiB/6108msec)
    slat (msec): min=3, max=894, avg=42.56, stdev=109.03
    clat (usec): min=14, max=5868.0k, avg=2097782.59, stdev=1541563.25
     lat (msec): min=3, max=6028, avg=2140.34, stdev=1558.88
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[   28], 10.00th=[   64], 20.00th=[  422],
     | 30.00th=[  726], 40.00th=[ 1401], 50.00th=[ 2232], 60.00th=[ 2802],
     | 70.00th=[ 3239], 80.00th=[ 3507], 90.00th=[ 3977], 95.00th=[ 4665],
     | 99.00th=[ 5269], 99.50th=[ 5470], 99.90th=[ 5738], 99.95th=[ 5873],
     | 99.99th=[ 5873]
  lat (usec)   : 20=0.59%, 50=0.20%
  lat (msec)   : 4=0.20%, 10=1.17%, 20=1.76%, 50=4.49%, 100=4.79%
  lat (msec)   : 250=4.39%, 500=8.01%, 750=5.37%, 1000=2.25%, 2000=13.48%
  lat (msec)   : >=2000=53.32%
  cpu          : usr=0.02%, sys=5.08%, ctx=8242, majf=0, minf=2097278
  IO depths    : 1=0.8%, 2=1.6%, 4=3.1%, 8=6.2%, 16=12.5%, 32=25.0%, >=64=50.8%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=50.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=50.0%
     issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T16-1G-Read: (groupid=2, jobs=16): err= 0: pid=1510: Sat Aug 26 02:32:12 2023
  read: IOPS=174, BW=1395MiB/s (1462MB/s)(16.0GiB/11747msec)
    slat (msec): min=3, max=3443, avg=79.06, stdev=241.03
    clat (usec): min=12, max=11363k, avg=4053985.19, stdev=3015461.88
     lat (msec): min=3, max=11669, avg=4133.05, stdev=3042.88
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[   91], 10.00th=[  224], 20.00th=[ 1011],
     | 30.00th=[ 1653], 40.00th=[ 2467], 50.00th=[ 3339], 60.00th=[ 5336],
     | 70.00th=[ 6477], 80.00th=[ 7215], 90.00th=[ 8020], 95.00th=[ 8792],
     | 99.00th=[10402], 99.50th=[10537], 99.90th=[11208], 99.95th=[11342],
     | 99.99th=[11342]
  lat (usec)   : 20=0.73%, 50=0.05%
  lat (msec)   : 4=0.15%, 10=0.78%, 20=0.63%, 50=1.03%, 100=2.20%
  lat (msec)   : 250=4.88%, 500=2.59%, 750=4.05%, 1000=2.69%, 2000=14.36%
  lat (msec)   : >=2000=65.87%
  cpu          : usr=0.01%, sys=2.82%, ctx=16524, majf=0, minf=4194546
  IO depths    : 1=0.8%, 2=1.6%, 4=3.1%, 8=6.2%, 16=12.5%, 32=25.0%, >=64=50.8%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=50.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=50.0%
     issued rwts: total=2048,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=605MiB/s (635MB/s), 605MiB/s-605MiB/s (635MB/s-635MB/s), io=1024MiB (1074MB), run=1692-1692msec

Run status group 1 (all jobs):
   READ: bw=1341MiB/s (1406MB/s), 1341MiB/s-1341MiB/s (1406MB/s-1406MB/s), io=8192MiB (8590MB), run=6108-6108msec

Run status group 2 (all jobs):
   READ: bw=1395MiB/s (1462MB/s), 1395MiB/s-1395MiB/s (1462MB/s-1462MB/s), io=16.0GiB (17.2GB), run=11747-11747msec
SEQ3G.conf
SEQ8M-Q128T1-3G-Read: (g=0): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
SEQ8M-Q128T8-3G-Read: (g=1): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
SEQ8M-Q128T16-3G-Read: (g=2): rw=read, bs=(R) 8192KiB-8192KiB, (W) 8192KiB-8192KiB, (T) 8192KiB-8192KiB, ioengine=libaio, iodepth=128
...
fio-3.28
Starting 25 processes
SEQ8M-Q128T1-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T8-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
SEQ8M-Q128T16-3G-Read: Laying out IO file (1 file / 3072MiB)
Jobs: 4 (f=4): [_(9),R(3),_(12),R(1)][68.2%][r=4428MiB/s][r=553 IOPS][eta 00m:28s]
SEQ8M-Q128T1-3G-Read: (groupid=0, jobs=1): err= 0: pid=1463: Sat Aug 26 03:00:16 2023
  read: IOPS=124, BW=992MiB/s (1040MB/s)(3072MiB/3096msec)
    slat (msec): min=2, max=270, avg= 7.73, stdev=26.17
    clat (usec): min=2, max=1624.0k, avg=557644.22, stdev=298817.56
     lat (msec): min=2, max=1872, avg=565.38, stdev=306.29
    clat percentiles (msec):
     |  1.00th=[    9],  5.00th=[   46], 10.00th=[   93], 20.00th=[  317],
     | 30.00th=[  443], 40.00th=[  481], 50.00th=[  617], 60.00th=[  634],
     | 70.00th=[  651], 80.00th=[  693], 90.00th=[  919], 95.00th=[ 1099],
     | 99.00th=[ 1401], 99.50th=[ 1418], 99.90th=[ 1620], 99.95th=[ 1620],
     | 99.99th=[ 1620]
   bw (  MiB/s): min= 1248, max= 1410, per=100.00%, avg=1329.41, stdev=115.13, samples=2
   iops        : min=  156, max=  176, avg=166.00, stdev=14.14, samples=2
  lat (usec)   : 4=0.26%
  lat (msec)   : 4=0.26%, 10=0.52%, 20=1.30%, 50=3.12%, 100=5.21%
  lat (msec)   : 250=2.86%, 500=30.73%, 750=41.15%, 1000=4.95%, 2000=9.64%
  cpu          : usr=0.00%, sys=10.53%, ctx=3077, majf=0, minf=262159
  IO depths    : 1=0.3%, 2=0.5%, 4=1.0%, 8=2.1%, 16=4.2%, 32=8.3%, >=64=83.6%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.6%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.4%
     issued rwts: total=384,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T8-3G-Read: (groupid=1, jobs=8): err= 0: pid=1464: Sat Aug 26 03:00:16 2023
  read: IOPS=176, BW=1412MiB/s (1481MB/s)(24.0GiB/17404msec)
    slat (msec): min=2, max=1050, avg=40.41, stdev=97.63
    clat (usec): min=2, max=7749.9k, avg=4005526.31, stdev=2516565.47
     lat (msec): min=2, max=7897, avg=4045.93, stdev=2531.53
    clat percentiles (msec):
     |  1.00th=[   10],  5.00th=[   58], 10.00th=[  140], 20.00th=[  810],
     | 30.00th=[ 2232], 40.00th=[ 3574], 50.00th=[ 4463], 60.00th=[ 5671],
     | 70.00th=[ 6208], 80.00th=[ 6477], 90.00th=[ 6745], 95.00th=[ 6946],
     | 99.00th=[ 7349], 99.50th=[ 7416], 99.90th=[ 7550], 99.95th=[ 7617],
     | 99.99th=[ 7752]
   bw (  MiB/s): min=  304, max= 3985, per=100.00%, avg=1563.57, stdev=112.32, samples=141
   iops        : min=   38, max=  498, avg=195.40, stdev=14.04, samples=141
  lat (usec)   : 4=0.20%, 10=0.07%
  lat (msec)   : 4=0.23%, 10=0.52%, 20=0.88%, 50=2.60%, 100=3.22%
  lat (msec)   : 250=6.67%, 500=1.24%, 750=2.99%, 1000=3.12%, 2000=6.77%
  lat (msec)   : >=2000=71.48%
  cpu          : usr=0.02%, sys=2.06%, ctx=24634, majf=0, minf=2097293
  IO depths    : 1=0.3%, 2=0.5%, 4=1.0%, 8=2.1%, 16=4.2%, 32=8.3%, >=64=83.6%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.6%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.4%
     issued rwts: total=3072,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128
SEQ8M-Q128T16-3G-Read: (groupid=2, jobs=16): err= 0: pid=1483: Sat Aug 26 03:00:16 2023
  read: IOPS=177, BW=1423MiB/s (1492MB/s)(48.0GiB/34539msec)
    slat (msec): min=2, max=12882, avg=81.59, stdev=421.78
    clat (usec): min=2, max=24325k, avg=7783383.56, stdev=4645673.45
     lat (msec): min=2, max=24609, avg=7864.98, stdev=4667.48
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[  464], 10.00th=[ 1385], 20.00th=[ 3842],
     | 30.00th=[ 5738], 40.00th=[ 6611], 50.00th=[ 7483], 60.00th=[ 8154],
     | 70.00th=[ 9329], 80.00th=[11610], 90.00th=[14160], 95.00th=[15503],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  MiB/s): min=  560, max= 4320, per=100.00%, avg=2437.54, stdev=56.25, samples=416
   iops        : min=   70, max=  540, avg=304.69, stdev= 7.03, samples=416
  lat (usec)   : 4=0.08%, 10=0.18%
  lat (msec)   : 4=0.11%, 10=0.24%, 20=0.34%, 50=0.70%, 100=0.57%
  lat (msec)   : 250=0.83%, 500=2.00%, 750=1.40%, 1000=1.33%, 2000=5.73%
  lat (msec)   : >=2000=86.47%
  cpu          : usr=0.01%, sys=1.09%, ctx=49275, majf=1, minf=4194585
  IO depths    : 1=0.3%, 2=0.5%, 4=1.0%, 8=2.1%, 16=4.2%, 32=8.3%, >=64=83.6%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.6%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.4%
     issued rwts: total=6144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=992MiB/s (1040MB/s), 992MiB/s-992MiB/s (1040MB/s-1040MB/s), io=3072MiB (3221MB), run=3096-3096msec

Run status group 1 (all jobs):
   READ: bw=1412MiB/s (1481MB/s), 1412MiB/s-1412MiB/s (1481MB/s-1481MB/s), io=24.0GiB (25.8GB), run=17404-17404msec

Run status group 2 (all jobs):
   READ: bw=1423MiB/s (1492MB/s), 1423MiB/s-1423MiB/s (1492MB/s-1492MB/s), io=48.0GiB (51.5GB), run=34539-34539msec

Closing Thoughts

While experimenting with various fio settings, reads against files I had created in S3 would occasionally cause fio to stop making progress. I have not yet determined whether the issue lies in fio or in Mountpoint for Amazon S3.

I put these fio configs together out of curiosity about Mountpoint for Amazon S3, and it raised a new question: how does it compare with the EFS, EBS, and instance store performance I have relied on so far? When I find time to test that, I will cover it in another post.