3 FSI applications which moved to the cloud and gained advantages #reinvent [FSI201-L]

This post is a report on session FSI201-L, "Leadership session: Running critical FSI applications on AWS", at AWS re:Invent 2019.

The Japanese version is here.

Abstract

As financial services institutions strive to grow their businesses and better serve customers by building secure, flexible, and scalable solutions on the cloud, they have gained the confidence to run their most business-critical applications on AWS. In this session, Frank Fallon, VP of Worldwide Financial Services at AWS, discusses the industry’s growing reliance on the public cloud and the strengthening of ties between business and IT. Frank is joined by senior executives of leading financial institutions who share why and how they moved key applications to AWS —and the benefits they have realized as a result.

Speakers

  • Frank Fallon
    • VP Financial Services, Amazon Web Services
  • Nitin Tandon
    • Principal, Vanguard
  • Santosh Bardwaj
    • VP, Advanced Analytics & Cloud, Discover Financial Services
  • Robert Palatnick
    • Managing Director, Chief IT Architect, Depository Trust & Clearing Corporation

Three speakers, from DTCC, Vanguard, and Discover Card, told the story of how their organizations migrated critical FSI applications to AWS and now run them there. Financial institutions operate under so many restrictions that migrating to the cloud is not easy, but these three did it and gained the capabilities the cloud offers.

A brief history of Financial Services...at re:Invent

The financial services industry continues to evolve

  • 2015
    • Just getting started and experimenting
    • How to operate the dev test environment?
    • Fairly simple resiliency with multi AZs
  • 2016
    • Talking about more sophisticated topics
    • Disaster recovery, Risk management
  • 2017
    • Getting more serious about moving to the cloud
    • Security & compliance, Account management
  • 2018
    • Digital innovation, Digital channels, Machine learning
    • All areas came to the fore within these conversations

Transformation is top of mind

What are they doing together to help transform their businesses? Bringing forward new research data platforms, bringing capabilities to open banking, and moving critical applications to the cloud.

  • JPMC: 50+ applications moved in three weeks
  • AXA: Global landing zone deployment
  • Goldman Sachs: Investment research platform migration
  • Barclays: New research data platform
  • Nasdaq: From data warehouse to data lake
  • National Australia Bank: Contact-center transformation
  • HSBC: Open banking with serverless technology
  • Rabobank: New ML-enabled lending solution

Customer perspectives - DTCC

  • US$1,854 trillion
    • The value of the securities transactions that we processed last year
  • Formed almost 50 years ago in response to what was really a paper crisis
  • Created to be a centralized, computerized ledger of all paper transactions
  • Provides services for mutual funds, the insurance industry, and ETFs
  • Runs a global swap repository business that provides reporting to our regulators around the world

Business resilience

  • Focusing on resiliency
    • The financial industry should look at resilience not as a technology issue but as a business-owned initiative
  • Technology resilience
  • Operational resilience
    • Owning, operating, testing, having the controls, and knowing what to do in case of a disaster
  • Financial resilience
    • If there is a market disruption or a major financial institution suddenly goes out of business, how do you make sure the markets continue to operate smoothly?

Technology resilience guidelines

  • The model for implementing the resilience methodology and bringing it into our organization and culture works from three angles
    1. Governance: establishing the guidelines and the framework
    2. Architecture: building out the right components
    3. Engineering: providing the resources to do what you want the business to accomplish
  • Design for resilient IT capabilities
    • Assume disruptions will inevitably occur
  • Ensure regional availability
    • Design for transparent failover
  • Leverage out-of-region recovery
    • Minimize data loss and ensure consistency
  • Ensure resilience success
    • Build and automate controls and validation

Architectural approach

  1. Resiliency architecture principles
  2. Reference architecture
    1. The logical and physical model
  3. Reference implementation
    1. How to implement it on AWS with services such as EC2 and S3
  4. Services implementation
    1. Aggregations of those different architectures into something that brings business value

Active engineering of resilience

  • Chaos
    • Netflix, Chaos Monkey
  • Failure Mode Analysis
  • Service Validation
    • Push and illustrate the limits or metrics of service capabilities
  • CARE (Cloud Architecture Resilience Engineering)
    • Based on the experiments DTCC took on, AWS was able to dramatically improve replication latency within four months
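
The chaos and service-validation ideas above can be illustrated with a toy failure-injection experiment. This is a minimal sketch and not DTCC's or Netflix's tooling: `make_flaky`, `resilient_read`, `run_experiment`, and the `read_balance` service are all hypothetical names invented for the illustration. The experiment injects random failures into a primary dependency and checks the hypothesis that every request is still served through a fallback.

```python
import random


class ReplicaDown(Exception):
    """Raised when the (simulated) primary dependency is unavailable."""


def make_flaky(service, failure_rate, rng):
    """Wrap a callable so it raises ReplicaDown with the given probability."""
    def wrapped(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ReplicaDown("injected failure")
        return service(*args, **kwargs)
    return wrapped


def read_balance(account_id):
    """Hypothetical healthy service used as both primary and fallback."""
    return {"account": account_id, "balance": 100}


def resilient_read(primary, fallback, account_id):
    """Try the primary first; fail over to the fallback if it is down."""
    try:
        return primary(account_id), False
    except ReplicaDown:
        return fallback(account_id), True


def run_experiment(failure_rate, calls=1000, seed=0):
    """Return (fraction of requests served correctly, number of failovers)."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    primary = make_flaky(read_balance, failure_rate, rng)
    served = 0
    failovers = 0
    for i in range(calls):
        account_id = f"acct-{i}"
        result, used_fallback = resilient_read(primary, read_balance, account_id)
        served += result["account"] == account_id
        failovers += used_fallback
    return served / calls, failovers


if __name__ == "__main__":
    availability, failovers = run_experiment(failure_rate=0.3)
    print(f"availability={availability:.3f} failovers={failovers}")
```

A real experiment would inject faults into running infrastructure rather than a function wrapper, but the structure is the same: state a hypothesis, inject the failure, and measure whether the service level held.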

Customer perspectives - Vanguard

Who we are

  • Vanguard is one of the world's largest investment management companies
  • Our purpose
    • "To take a stand for all investors, to treat them fairly, and to give them the best chance for investment success."
  • We serve 30 million investors in 15 countries
  • Every day, our portfolio managers invest up to two billion new dollars
  • We manage $6 trillion in investments with about $2 billion of net new money inflow per day
  • We have more than 17,000 crew members, as we call our employees, in keeping with our nautical theme.

Vanguard's cloud journey

  • 2015: Committed to public cloud
    • Could not otherwise match either the scale or the pace of innovation
  • 2016: "Web Apps" MVC
    • Got security approvals
    • Established landing zones
    • Designed a cloud architecture and built a cloud platform
  • 2017: New development in cloud and "Analytics" MVC
    • Moved on-prem Hadoop clusters to Amazon EMR
  • 2018: Web & Analytics expansion
    • Turned attention to mission critical applications
  • 2019: Core website capabilities
    • How do we refactor a large monolithic website?
    • How do we decouple the dependency from the mainframe, DB2?
    • How do we know the solution that we came up with is scalable and will work?

Client website activity

  • 80% of the website traffic was driven to 5 to 7 key pages, such as log-in and balances

Core website

  • Resolve Mainframe dependency
    • Replicate the data in DB2 to DynamoDB
    • Refactor the high-traffic Java pages into microservices that read directly from DynamoDB
  • Scalability
    • Test with 20% of the traffic going to on-premises and 80% going to AWS, or vice versa
    • Gradually shift the load from on-premises to AWS
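
The percentage-based traffic split described above can be sketched as follows. This is a minimal illustration under assumptions, not Vanguard's implementation: the session did not say how requests were divided, so hashing a client ID into a bucket is a technique chosen here for the example, and `route` and `split` are hypothetical names. In practice a split like this is usually done at the DNS or load-balancer layer.

```python
import hashlib


def route(client_id: str, aws_percent: int) -> str:
    """Return "aws" or "on_prem", sending roughly aws_percent% of clients to AWS."""
    if not 0 <= aws_percent <= 100:
        raise ValueError("aws_percent must be between 0 and 100")
    # Hash the client ID into a stable bucket in [0, 100). The same client
    # always gets the same bucket, so raising aws_percent only ever moves
    # clients from on-premises to AWS, never back.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "aws" if bucket < aws_percent else "on_prem"


def split(client_ids, aws_percent):
    """Count how many clients land on each side for a given percentage."""
    counts = {"aws": 0, "on_prem": 0}
    for client_id in client_ids:
        counts[route(client_id, aws_percent)] += 1
    return counts


if __name__ == "__main__":
    clients = [f"client-{i}" for i in range(10_000)]
    # Step the AWS share up gradually, as in the migration described above.
    for percent in (20, 50, 80, 100):
        print(percent, split(clients, percent))
```

Because the bucket depends only on the client ID, a client that was served from AWS at 20% is still served from AWS at 80%, which keeps each client's experience consistent while the load is shifted.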

Benefits

  • 75% reduction in planned downtime
  • 30% decrease in unplanned downtime
  • Deployment frequency increased by about 20x
  • Agile and CI/CD leads to easy scaling
  • Scalability + Agility + NWOW -> Greater client outcomes

Customer perspectives - Discover

About Discover

  • Credit Cards
    • $144Bn Card Sales Volume
    • $74Bn in Credit Card Receivables
  • Digital Banking
    • $52Bn+ Consumer Deposits
    • $10Bn Private Student Loans
    • $8Bn Personal Loans
  • Payment Services
    • $246Bn Payment Services Volume
    • Discover Network: 16 network alliances
    • PULSE: ~2.1MM ATMs in 134 countries
    • Diners: 190+ countries/territories

Data-driven customer experience

Why did we focus on data analytics?

  1. Data was foundational for any technology transformation
  2. We wanted to make decisions in more data- and analytics-centric ways
  3. We wanted concrete use cases tied to the cloud journey in order to deliver value

  1. More data
  2. Better Analytics
  3. Better Experiences
  4. More Customers
  5. (Loop)

Cloud-enabled transformation

  • Scalability: 300% growth in data footprint
  • Delivery: 10x improvement in speed of delivery
  • Time to Market: 99% reduction in deployment time
  • Talent: 25x growth in cloud engineers

CECL: Current Expected Credit Losses

  • What is CECL?
    • CECL is a new accounting standard to calculate the Loan Loss Provision
  • Challenge
    • The calculation has been looking at the last 12 or 18 months of customer transaction data, but that is not enough
    • The time it takes to train our models on terabyte-scale datasets
    • The need to improve customer experiences
    • Legacy environment not suited for machine learning on large datasets
  • Accuracy is essential
    • Precision is extremely important because even a few decimal places make a huge difference
    • Small changes to the calculation result in millions that need to be reserved
  • Improve customer experiences
    • How do technology and business come together?
    • Integrate real-time data
    • Integrate machine learning
    • Have a microservices-based framework
  • On Premises
    • Original Plan: Model training = days, Model production = months (will not complete)
      • Physical limitations to scale, time to provision a large physical footprint, and performance bottlenecks handling large datasets
  • On Cloud
    • Current State: Model training = hours, Model production = days
      • Containerized analytics processing, automation, scale on demand, faster processing on Amazon Redshift
    • Future State: Model training = hours, Model production = hours
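
To see why small changes to the calculation translate into millions of dollars reserved, here is a rough sensitivity sketch. It is a deliberate simplification and not Discover's CECL model: it assumes the reserve is simply receivables multiplied by a single expected lifetime loss rate, the 6.00% and 6.01% rates are hypothetical, and only the $74Bn credit card receivables figure comes from the numbers above. `Decimal` is used so that no binary floating-point error creeps into the result.

```python
from decimal import Decimal

# Credit card receivables quoted earlier in this post ($74Bn).
RECEIVABLES = Decimal(74_000_000_000)


def reserve(receivables: Decimal, loss_rate: Decimal) -> Decimal:
    """Simplified reserve: receivables times expected lifetime loss rate."""
    return receivables * loss_rate


# Hypothetical loss rates that differ by 0.01 percentage points (one basis point).
base = reserve(RECEIVABLES, Decimal("0.0600"))
shifted = reserve(RECEIVABLES, Decimal("0.0601"))
delta = shifted - base

if __name__ == "__main__":
    print(f"Change in reserve: ${delta:,.0f}")
```

Under these assumptions, a shift of a single basis point in the loss rate moves the reserve by $7.4 million, which is why precision in the model output matters so much.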

Collections Lighthouse results: Digital Edge 2019 Awards winner

  • Reduction in false positives
    • Efficient identification of pre-delinquency
  • 40% reduction
    • in outbound calls
  • 60% reduction
    • in time to market
  • 28% increase
    • in on-track payments, enabling customers to improve their credit standing
  • Cloud-enabled customer experience
    • through integration of real-time data and cloud-hosted decision capability

Q&A

  • What's next?
    • Vanguard
      • We are at 70% now and will be at 80% by the end of 2020
      • Hope to have plans for the refactoring, simplification, and platform orientation of the entire portfolio
    • DTCC
      • Cloud governance insights; a framework and a dashboard for different aspects of the controls and governance environment in the cloud
      • The control plane; an ecosystem to which any error and any status can be reported
    • Discover
      • Accelerate cloud adoption
      • Security, especially now that we have data out there
      • Use cloud as a catalyst to continue to invest more in areas such as machine learning and real time
      • Talent
  • What was the biggest impact on the organization of moving to the cloud?
    • Discover
      • Transformation and optimization
    • DTCC
      • Agility of...
        • How fast you can provision infrastructure
        • Capital
        • Access to innovation
        • How quickly we are able to react to clients' needs
      • Deployment frequency is one of the biggest impacts that we were able to drive

Building a brighter future on AWS

Our customers are building a brighter future on AWS

  • Modernizing legacy systems
  • Anticipating changing customer needs
  • Developing and driving new business opportunities
  • Fulfilling strict security and compliance requirements