[Report] Amazon Redshift: 10 years of innovation in integration, data sharing & more #ANT345 #reinvent

AWS re:Invent 2022

2022.12.10

Hi, this is suto from the Consulting Team in the Data Analytics Business Division.

This entry is a report on the AWS re:Invent 2022 session "ANT345 Amazon Redshift: 10 years of innovation in integration, data sharing & more".

Session overview

Amazon Redshift is continuously reinventing data warehousing to provide powerful analytics in the cloud for you. Amazon Redshift supports your needs for high-performance execution with complex, analytical queries with high concurrency and deep integrations with other AWS services, performing rich analytics on all data and ML-based autonomics that make analytics easier. In this session, learn about Amazon Redshift innovations like data sharing, streaming ingestion, federated query, built-in ML, and Amazon Redshift Serverless. Also explore the newly announced Amazon Redshift integration for Apache Spark, which helps developers easily build and run Apache Spark applications on Amazon Redshift data from Spark-based AWS analytics services like Amazon EMR, AWS Glue, and Amazon SageMaker.

Speakers

Neeraja Rentachintala, Principal Product Manager, Amazon
Ippokratis Pandis, VP/Distinguished Engineer, Amazon Web Services

Session content

Security and Availability

Industry-leading security and access control are available out of the box at no additional cost:
- Column-level access control
- Row-level access control
- Dynamic data masking
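
To make the three controls above more concrete, here is a minimal sketch of the kind of SQL statements involved, wrapped in Python so they can be sent through any DB-API 2.0 cursor. It was not shown in the session. The table `customers`, the role `analyst`, the column names, and the policy names `policy_own_region` and `mask_email` are all hypothetical, and the helper `apply_statements` is an illustration only.

```python
# All object names below (customers, analyst, policy_own_region, mask_email)
# are hypothetical and exist only for illustration.
SECURITY_STATEMENTS = [
    # Column-level access: the role can read only the listed columns.
    "GRANT SELECT (customer_id, region) ON customers TO ROLE analyst;",
    # Row-level access: the role sees only rows that satisfy the predicate.
    "CREATE RLS POLICY policy_own_region "
    "WITH (region VARCHAR(16)) USING (region = 'JP');",
    "ATTACH RLS POLICY policy_own_region ON customers TO ROLE analyst;",
    "ALTER TABLE customers ROW LEVEL SECURITY ON;",
    # Dynamic data masking: the role sees a masked value instead of the email.
    "CREATE MASKING POLICY mask_email "
    "WITH (email VARCHAR(256)) USING ('***'::TEXT);",
    "ATTACH MASKING POLICY mask_email ON customers(email) TO ROLE analyst;",
]


def apply_statements(cursor, statements):
    """Execute each statement in order on a DB-API 2.0 cursor.

    Returns the number of statements executed.
    """
    count = 0
    for sql in statements:
        cursor.execute(sql)
        count += 1
    return count
```

In practice, `cursor` would come from a connection to a Redshift cluster or Serverless workgroup, opened by a user with sufficient privileges to create and attach policies.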

Cross-AZ cluster recovery
- Even if an AZ fails, the cluster can be relocated to another AZ

This year's updates add Multi-AZ support (in preview at the time of the session)
- Because the cluster runs in both AZs, failover can be carried out reliably

Performance

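The query execution mechanism, which was also covered in other sessions:
- When a query runs, Redshift executes it as internally generated C++ code
- This code is highly efficient and makes good use of the EC2 Nitro-based hardware that Redshift runs on
- A cache hit rate of 99.96% is achieved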

This part covered how much price performance has improved over the course of Redshift's development.
The content was almost the same as in my session report blog post "Get better price performance in cloud data warehousing with Amazon Redshift #ANT320-R".

Storage and Compute Elasticity

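With the RA3 instance type, compute and storage are separated, which improves flexibility.
In addition, the Concurrency Scaling feature, coordinated by the leader node, improves flexibility when scaling out.
Throughput grows steadily in proportion to the increase in concurrent users.
This content is also almost the same as in the blog post "Get better price performance in cloud data warehousing with Amazon Redshift #ANT320-R".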