Suggestions for using AWS purpose-built databases for microservices #reinvent [DAT209-L]
This post is a session report on DAT209-L: Leadership session: AWS purpose-built databases, from AWS re:Invent 2019.
The Japanese version of this post is available here.
Overview
In this session, Shawn Bice, VP of databases, discusses the AWS purpose-built database strategy and explains why your application should drive the requirements for which database(s) to use, not the other way around. Learn about the purpose of each AWS database service and how AWS customers are using purpose-built databases to build some of the most scalable applications on the planet. If you are a technology or engineering leader and you’re trying to understand how to modernize your data strategy, this session is for you. We discuss using various approaches for new application development, lifting-and-shifting to managed services, and refactoring monolithic database architectures to use purpose-built databases.
Speakers
- Shawn Bice
- VP, Databases, Amazon Web Services
- Tobias Ternstrom
- Director, RDS & Aurora, Amazon Web Services
- Joseph Idziorek
- Principal Product Manager, Amazon Web Services
To build applications with a microservices architecture, you need to understand how AWS database services are categorized and when to use each of them. In this session, AWS experts explained seven types of databases and their use cases, illustrated with real customer examples.
App architectures & patterns have evolved over the years...
- Builders today are...
- Not really building monolithic applications
- Looking towards purpose-built systems
- Taking a big app and breaking it into smaller parts
- Picking the right tool for the right job
- 60s: Mainframe
- 80s: Client Server
- Separate app logic from a database
- 90s: Three tier
- The internet arrived
- A client layer, an application layer and single database layer
- Microservices
- In this new era of the cloud, systems are far more specialized
- Databases are more specialized than they have ever been before
Common database categories
Easy way to think of a data strategy -> Categorize
- Relational
- Make sure that responses are strongly consistent
- Enforces constraints on data types
- Key-value
- One item or a trillion items in a table performs the same, thanks to scale-out
- Document
- Create the data model on the fly as a JSON
- In-memory
- Query frequently accessed data from it instead of doing a full table scan
- Graph
- Highly connected data
- Time-series
- Time is the single primary axis of the data model
- Doesn't do updates; inserts are append-only
- Ledger
- Once I write to it, I can never change it
- Immutable transaction log with cryptographic verifiability
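The "immutable, cryptographically verifiable" property can be illustrated with a hash chain: each entry's digest covers the previous digest, so tampering with any past record breaks verification. This is a minimal sketch of the idea, not QLDB's actual implementation.

```python
import hashlib
import json

def chain_hash(prev_hash: str, entry: dict) -> str:
    """Digest the previous hash together with the new entry."""
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class LedgerSketch:
    """Append-only log; each entry's hash covers the whole history."""
    def __init__(self):
        self.entries = []
        self.hashes = ["0" * 64]  # genesis digest

    def append(self, entry: dict) -> None:
        self.hashes.append(chain_hash(self.hashes[-1], entry))
        self.entries.append(entry)

    def verify(self) -> bool:
        """Recompute every digest; any tampering breaks the chain."""
        h = "0" * 64
        for entry, expected in zip(self.entries, self.hashes[1:]):
            h = chain_hash(h, entry)
            if h != expected:
                return False
        return True

ledger = LedgerSketch()
ledger.append({"account": "a-1", "amount": 100})
ledger.append({"account": "a-1", "amount": -30})
assert ledger.verify()

ledger.entries[0]["amount"] = 999   # tamper with history
assert not ledger.verify()          # verification now fails
```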
Top of mind for our customers
- Move to Managed
- Services: RDS, Aurora, ElastiCache, DocDB
- Tools: SCT, DMS
- Programs: MAP, Pro Serve, Partner
- Break Free
- Services: Aurora, Amazon Redshift
- Tools: SCT, DMS
- Programs: MAP, DM Freedom, Pro Serve, Partner
- New Modern Apps
- New Requirements
- Users: 1 million+
- Data Volume: TB-PB-EB
- Performance: Milli-Micro sec
- Request Rate: Millions+
- Access: Any device
- Scale: up-out-in
- Economics: Pay as you go
- Developers Access: Managed API
Customer Stories
Lyft using Key-value: Amazon DynamoDB
- Need the performance to be able to scale whether the number of users is 10 or 10 million
- Uses DynamoDB to store individual GPS locations associated with each ride
- The key-value pattern enables retrieving data for an individual rider based on a known key
- DynamoDB scales horizontally, which is virtually unlimited, while still delivering millisecond performance
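The access pattern behind this can be sketched as a partition key plus sort key: all GPS points for one ride live under one key, so a lookup touches a single partition no matter how big the table grows. This is an in-memory sketch of the pattern; the key format `ride#42` and attribute names are illustrative, not Lyft's actual schema.

```python
from collections import defaultdict

class KeyValueTableSketch:
    """Items live under a partition key, ordered by a sort key,
    so one ride's lookup is unaffected by total table size."""
    def __init__(self):
        self.partitions = defaultdict(dict)  # pk -> {sk: item}

    def put_item(self, pk, sk, item):
        self.partitions[pk][sk] = item

    def query(self, pk):
        """All items for one partition key, sorted by sort key."""
        part = self.partitions[pk]
        return [part[sk] for sk in sorted(part)]

table = KeyValueTableSketch()
table.put_item("ride#42", "2019-12-03T10:00:00Z", {"lat": 37.77, "lon": -122.41})
table.put_item("ride#42", "2019-12-03T10:00:05Z", {"lat": 37.78, "lon": -122.42})
points = table.query("ride#42")
assert len(points) == 2
```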
ZipRecruiter using Relational: Amazon Aurora and Amazon RDS
- Need to have a rich query experience over a million businesses and 100 million job seekers
- Don't know exactly who is going to search for what
- Use read replicas to support the read-heavy workload
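The read-replica pattern can be sketched as routing: writes go to one primary, and read traffic fans out across replicas so capacity grows with replica count. This toy model replicates instantly; a real Aurora replica lags the primary slightly.

```python
import random

class ReplicatedDatabaseSketch:
    """Writes go to the primary; reads are served by any replica."""
    def __init__(self, n_replicas: int):
        self.primary = {}
        self.replicas = [{} for _ in range(n_replicas)]

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:   # replication (instant here)
            replica[key] = value

    def read(self, key):
        # Any replica can answer, so read throughput scales out.
        return random.choice(self.replicas).get(key)

db = ReplicatedDatabaseSketch(n_replicas=3)
db.write("job#1", "Backend Engineer")
assert db.read("job#1") == "Backend Engineer"
```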
Liberty Mutual using Document: Amazon DocumentDB
- Need flexible data model for JSON which stores information about customers, policies and assets
- Iterate fast and deliver new features without changing the schema and the database
- If you don't know the access patterns in advance but still need large scale, DocumentDB is preferred over DynamoDB
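The flexible-schema point can be sketched in plain Python: documents in one collection can carry different fields, and new attributes appear without a schema migration. Field names here are hypothetical, not Liberty Mutual's data model.

```python
# Two documents in the same collection with different shapes;
# the second gained a field later with no ALTER TABLE.
policies = [
    {"customer": "c-1", "policy": "auto", "vehicle": {"make": "Ford"}},
    {"customer": "c-2", "policy": "home", "address": {"state": "MA"},
     "smart_devices": 4},
]

def find(collection, **criteria):
    """Match documents on whichever top-level fields they have."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]

home = find(policies, policy="home")
assert home[0]["smart_devices"] == 4
```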
UBISOFT using In-memory: Amazon ElastiCache
- Optimized for latency over durability, which means both reads and writes will get microsecond latency
- Minimal latency is significant for online games
- Practical data structures let you easily create and update leaderboards
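The leaderboard idea maps to a sorted set (the pattern behind Redis's ZADD/ZREVRANGE commands): members are kept ordered by score, so top-N reads are cheap. A minimal in-process sketch, with made-up player names:

```python
class LeaderboardSketch:
    """Scores keyed by player; top-N is a sort by descending score."""
    def __init__(self):
        self.scores = {}

    def add(self, player: str, score: int):
        self.scores[player] = score  # re-adding updates the score

    def top(self, n: int):
        ranked = sorted(self.scores.items(), key=lambda kv: -kv[1])
        return [player for player, _ in ranked[:n]]

board = LeaderboardSketch()
board.add("ayla", 1200)
board.add("kai", 900)
board.add("rin", 1500)
assert board.top(2) == ["rin", "ayla"]
```

In Redis itself the sorted set keeps members ordered on every write, so reads need no sort at all.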
Nike using Graph: Amazon Neptune
- Nodes (circles) represent nouns; edges carry direction and represent connections
- Can actually query on connections
- Build a social graph to connect to customers and athletes
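"Query on connections" means traversal: rather than joining tables, you follow edges. A tiny adjacency-list sketch of a one-hop friends-of-friends lookup, with hypothetical node names (real Neptune queries use Gremlin or SPARQL):

```python
from collections import defaultdict

edges = defaultdict(set)  # directed edges: follower -> followed

def connect(a: str, b: str):
    edges[a].add(b)

connect("athlete-1", "fan-1")
connect("athlete-1", "fan-2")
connect("fan-1", "fan-3")

def friends_of_friends(node: str):
    """Nodes two hops away, excluding direct connections and self."""
    two_hops = {fof for friend in edges[node] for fof in edges[friend]}
    return two_hops - edges[node] - {node}

assert friends_of_friends("athlete-1") == {"fan-3"}
```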
Klarna using Ledger: Amazon QLDB
- Make sure no one comes in and tampers with your records
- Traditional ways to protect records, such as limiting access and auditing, rely on humans
- Uses hashes to verify that nothing was altered
Fender - The right tool for the job
- Migrated their product data, images, and purchase orders
- from SQL Server
- to DynamoDB, Amazon S3, AWS Lambda and Amazon ElastiCache
- Lowered costs by 20%
- Increased speed by 50%
- Migrated the whole system to the cloud in less than 6 months
Recent Updates
Amazon Managed Cassandra Service (Preview)
- Challenges to manage large Cassandra clusters at scale
- Specialized expertise to set up, configure, and maintain infrastructure and software
- Scaling clusters is time-consuming, manual, and error-prone, so many overprovision capacity
- Manual backups and error-prone restore processes to maintain integrity
- Unreliable upgrades with clunky rollback and debugging capabilities
Federated Query for Amazon Athena (Preview)
- Challenges querying data from multiple databases
- Microservices can minimize the blast radius, but querying each data store individually is difficult
- Imagine an e-commerce store with a microservices architecture
- Accessing multiple systems can be challenging
Amazon Aurora integration with ML
- Challenges with integrating machine learning (ML) with your database
- Select and train the model
- Create application code to read data from the database
- Query and format the data for the ML algorithm
- Call an ML service to run the algorithm
- Format the output
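The manual steps above can be sketched with a stubbed model standing in for the ML service call; Aurora's integration pushes this whole pipeline into a SQL function so the application code disappears. The data and the sentiment stub are illustrative, not a real ML service.

```python
def read_rows():
    # Steps 1-2: application code reads rows from the database (stubbed).
    return [{"id": 1, "review": "great strings, fast shipping"}]

def to_features(row):
    # Step 3: query and format the data for the ML algorithm.
    return row["review"].lower()

def call_ml_service(text):
    # Step 4: call an ML service (hypothetical sentiment stub).
    return "POSITIVE" if "great" in text else "NEUTRAL"

def format_output(row, label):
    # Step 5: format the output for the application.
    return {"id": row["id"], "sentiment": label}

results = [format_output(r, call_ml_service(to_features(r)))
           for r in read_rows()]
assert results[0]["sentiment"] == "POSITIVE"
```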