I attended the AWS Summit Tokyo session "AXA Life Insurance: Ideal Form of Data Utilization, a 'Lakehouse' for Centralized Analysis and Forecasting"

2023.04.23


Introduction

Hemanth of the Alliance Department here. In this blog, I'll be writing about the "Ideal data utilization situations and lakehouse for centralized analysis and forecasting" session [AP-31] at AWS Summit Tokyo 2023. As the session was in Japanese, I have done my best to present all of its content in English here.

Main Speaker

Yuka Inose, Data Solution, AXA Life Insurance Co., Ltd.

Agenda

1) Opening by Databricks Japan Co., Ltd.

2) The introduction of Lakehouse - why is Lakehouse necessary now?

3) Conclusion

Opening

Data Lakehouse Platform of Databricks

1) Data warehouse

2) Data Engineering

3) Data Streaming

4) Data Science, ML

There are other useful components as well, such as Unity Catalog and Delta Lake. According to Databricks, the Lakehouse Platform is:

1) Simple - data warehouse and AI use cases are integrated in one platform

2) Open - built on open source; no need to take data out of the customer's AWS environment

3) TCO reduction - processing costs are down by 91% compared with traditional data platforms

The Introduction of Lakehouse - why is Lakehouse necessary now?

This part was divided into 4 sections.

Data Area Trajectory So Far

Before 2018 ["Think well first" phase] Birth of the first data lake, used only by data scientists

2019-2020 ["It's OK to fail, so act first" phase] Birth of the second-generation data lake, focused on the value chain and data accumulation

2021-2022 Utilization of the data accumulated in the data lake; metadata management

2023-2024 Integration of DL/DWH/DM/MLOps; lakehouse in demand, with rising investment in this field

2025- Lakehouse utilization; every employee becomes familiar with analysis and machine learning

Turning Point for Lakehouse Investment

Three initiatives brought about the change:

1) Data Lake community - for people who have not yet used the data lake or QuickSight; open-door events offering product and case study introductions.

2) Data Kiosk - for people who have already started using it; a single reception point for all data-related inquiries and support.

3) In-house sales activity - proposals to departments with interest and needs; the initial hurdle is lowered by requiring no budget at the start only.

Necessity for Lakehouse

The challenges so far were as follows:

Complicated - the traditional reporting system and multiple independent data lakes existed side by side, so only highly skilled users could use them

Retrodiction - the focus was on how the past was rather than on what the future holds; the mindset needs to shift from "past + present" to "past + present + future"

Closed - an environment had to be created that can withstand the increasing needs of machine learning use cases in the future

Future Prospects

Seen from the operations side, it is about the business, support, and agency functions and how AI can be successfully implemented for use by personnel. From the analytics side, it is about insight into the future, delivered to the data team and the business departments. From both views, it comes down to diverse data for analysis and making use of AI, both of which are directly tied to the lakehouse.

Conclusion

The 3 main points conveyed during the session:

1) Creating a masterpiece of data utilization that can be experienced throughout the organization

2) Just creating a simple structure draws out the underlying strength of the data team

3) An AI utilization partner for each employee, which in turn is achieved by building a lakehouse