Integrating Data for Box via Fivetran and Snowflake

2021.02.09

(Japanese version available here)

Background

Fivetran is committed to providing a wide variety of data connectors in order to enhance the user experience of cloud computing. Fivetran recently released its Box connector as an initial Beta version.

The file types that can be loaded from Box are currently limited to separated-value files (CSV, TSV, and so on), newline-delimited JSON text files, JSON arrays, Avro, and Parquet. For compatibility, these files should be encoded as UTF-8, UTF-16, or UTF-32, in either big-endian or little-endian byte order. If a file does not begin with a byte-order mark (BOM), it is treated as UTF-8.
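To make the encoding rule concrete, here is a minimal Python sketch of BOM-based detection with a UTF-8 fallback. It only illustrates the rule described above and is not Fivetran's actual implementation; the function name `detect_encoding` is hypothetical.

```python
import codecs

# UTF-32 marks are checked before UTF-16 marks because the UTF-32 LE
# mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(head: bytes) -> str:
    """Guess a file's encoding from its first bytes (hypothetical helper).

    Returns the encoding implied by a leading byte-order mark, or
    "utf-8" when no mark is present.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return "utf-8"
```

Running a check like this on the first few bytes of a file before uploading it to Box can help confirm that the file will be read with the encoding you expect.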

The purpose of this post is to familiarize Fivetran users with Box connectivity and the data loading process.

How to Connect and Use

First, make sure your Fivetran account is connected to Snowflake as a destination. The steps are described in detail here.

Next, connect Fivetran to your Box account. In Fivetran, click the “Connector” button as shown below.

Then choose the connector you want to add; in this case, search for Box. The Box connector is currently in Beta, as shown below.

Next, enter a destination schema, which is the name of the dataset you want to create. Then enter the source “Folder URL” from Box in the same format as shown below and click the “Authorize” button.
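Before pasting the folder URL, it can be useful to sanity-check it. The sketch below assumes the URL has the form `https://app.box.com/folder/<numeric id>`, which is how folder pages usually appear in the Box web app; that format, and the helper name `box_folder_id`, are assumptions for illustration and are not taken from the Fivetran setup form.

```python
import re
from urllib.parse import urlparse


def box_folder_id(folder_url: str) -> str:
    """Extract the numeric folder ID from a Box folder URL.

    Assumes a path of the form /folder/<digits>; raises ValueError
    for anything else (for example, a file URL).
    """
    path = urlparse(folder_url).path
    match = re.fullmatch(r"/folder/(\d+)/?", path)
    if match is None:
        raise ValueError(f"not a Box folder URL: {folder_url!r}")
    return match.group(1)


# The URL below is a made-up example, not a real folder.
print(box_folder_id("https://app.box.com/folder/123456789"))
```

The extracted value should correspond to the folder ID that Fivetran later displays for the connection.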

This takes you to the Box authorization page, which asks you to grant access.

Once access is approved, the data connection is established as shown below.

Click the newly established Box connection to explore its details, as shown below.

The “Status” tab shows details about data synchronization. Both automatic and manual synchronization can be used from here. Email alert preferences can be set from the top right corner.

Logs and the schema can also be viewed, as shown below.

The “Setup” tab shows details such as the folder ID, the user who created the connection, the synchronization frequency, and an option to re-synchronize all historical data (especially useful when the Box data has been modified).

An email is sent to active Fivetran users to notify them about the newly added data source. Notification settings can be modified, if needed, as shown below.

Once the data connection is established and the data is loaded into the destination warehouse (Snowflake), we can take a look at the database in Snowflake.

By running a quick SQL query, we can verify that the data has been loaded and is ready for transformation.
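As a rough illustration of such a check, the following Python sketch builds a small preview query for a loaded table. The database, schema, and table names are placeholders, and the helper names are hypothetical; substitute the destination schema you entered in Fivetran and the table created from your Box file. Running the query through a Snowflake worksheet or client is left out to keep the example self-contained.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def preview_query(database: str, schema: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that previews the first rows of a loaded table."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    qualified = ".".join(quote_ident(part) for part in (database, schema, table))
    return f"SELECT * FROM {qualified} LIMIT {limit}"


# Placeholder names for illustration only.
print(preview_query("FIVETRAN_DB", "BOX", "SALES"))
# prints: SELECT * FROM "FIVETRAN_DB"."BOX"."SALES" LIMIT 10
```

Quoted identifiers are case-sensitive in Snowflake, so pass the names exactly as they appear in the Snowflake UI.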

Fivetran works as an ELT platform: it Extracts and Loads the data into the warehouse, where it can later be Transformed.

Summary

As mentioned earlier, the Box connector for Fivetran is currently in Beta, and production use should be avoided until a stable version is available.

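Leave your Fivetran implementation support to Classmethod

Classmethod is a Fivetran license reseller partner. In addition to Fivetran, we have partner agreements with AWS, Looker, Alteryx, Tableau, and Snowflake, with an extensive track record of deployments for each. By integrating these services with Fivetran, we can propose the optimal configuration for your environment.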