A migration guide from Amazon QuickSight Topics to Dataset Q&A for metadata has been published, so I tried it out.

Here is a Python script and procedure for automatically migrating business metadata accumulated in Amazon QuickSight's traditional Q&A feature "Topics" to the latest "Dataset Q&A". You can gradually migrate to the new, higher-precision Q&A experience while leveraging your investments in synonyms, calculated fields, and more.
2026.06.29

This is Ishikawa from the Cloud Business Division. A guide for converting metadata from Topics, the traditional natural language query feature, to the Dataset Enrichment format for Dataset Q&A has been published on the Amazon QuickSight community by alaresch (Amy Marvin@AWS). A Python script automatically extracts business context accumulated in Topics (synonyms, calculated fields, custom instructions, etc.) and carries it over to Dataset Q&A.

https://community.amazonquicksight.com/t/conversion-guide-amazon-quick-topics-dataset-q-a/52500

For those who have already been adding business metadata to Topics, this is a practical guide that allows you to migrate to a new, higher-accuracy Q&A experience without wasting your previous investment.

What are Topics and Dataset Q&A

Amazon QuickSight provides two types of natural language data query experiences.

  • Topics: The traditional Q&A feature. It groups one or more datasets and defines business context and field semantics to scope in advance "what users can ask questions about." It excels at predictable question patterns and continues to be supported.
  • Dataset Q&A: A new feature based on the latest text-to-SQL architecture. It queries datasets directly, converting questions to SQL at runtime and executing them against all data. It supports runtime calculations, row-level security (RLS) / column-level security (CLS), scaling up to 2 billion rows, and each answer comes with an Explanation showing the generated SQL and data sources.

Note that Dataset Enrichment is the mechanism for defining business context at the dataset layer. This guide bridges the curation on the Topics side to this Dataset Enrichment format.

Migrating Metadata from Topics to Dataset Q&A

The main points of this migration are as follows.

  • A Python script (convert_topics_to_enrichment.py) is provided that calls the QuickSight ListTopics and DescribeTopic APIs to automatically extract the metadata curated in Topics.
  • Extraction targets include field synonyms, friendly names, semantic types (currency, dates, etc.), calculated fields, default aggregations, display formats, cell value synonyms, named entities, comparison order (greater-is-better, etc.), custom instructions, and more.
  • Output is available in 3 formats, YAML (default), JSON, and TXT, and is written to files such as <topic-name>_enrichment.yaml.
  • Dataset Q&A supports SPICE datasets as well as direct queries to Amazon Redshift, Amazon Athena, Aurora PostgreSQL, and Amazon S3 Tables (direct queries have no row count limit).
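
To make the extraction step concrete, here is a minimal sketch of the two API calls the script is described as using. It is not the published script. The function names iter_topic_summaries and collect_field_synonyms are hypothetical, and the client is assumed to be a boto3 QuickSight client created by the caller.

```python
from typing import Any, Dict, Iterator, List


def iter_topic_summaries(client: Any, account_id: str) -> Iterator[Dict[str, Any]]:
    """Yield every topic summary from ListTopics, following NextToken pagination."""
    kwargs: Dict[str, Any] = {"AwsAccountId": account_id}
    while True:
        page = client.list_topics(**kwargs)
        yield from page.get("TopicsSummaries", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def collect_field_synonyms(
    client: Any, account_id: str, topic_id: str
) -> Dict[str, List[str]]:
    """Return {column name: synonyms} for one topic, read via DescribeTopic."""
    topic = client.describe_topic(AwsAccountId=account_id, TopicId=topic_id)["Topic"]
    synonyms: Dict[str, List[str]] = {}
    for dataset in topic.get("DataSets", []):
        for column in dataset.get("Columns", []):
            # Columns without curated synonyms are skipped.
            if column.get("ColumnSynonyms"):
                synonyms[column["ColumnName"]] = list(column["ColumnSynonyms"])
    return synonyms


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   client = boto3.client("quicksight", region_name="ap-northeast-1")
#   for summary in iter_topic_summaries(client, "123456789012"):
#       print(summary["TopicId"], summary["Name"])
```

Passing the client in as an argument keeps the functions easy to exercise with a stub object before pointing them at a real account.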

Overview of the Conversion

The migration proceeds in the following 3 phases: identify the Topics to convert, run the conversion script, and enable Dataset Q&A.

How to Use

Phase 1: Identify the Topics to Convert

You don't need to convert all Topics at once. Check usage in the User Activity tab of each Topic and prioritize Topics such as the following.

  • Questions are frequently met with an "I can't answer that" response
  • Calculated fields are complex or frequently updated
  • Feedback scores or usage rates are low
  • Users ask questions that go beyond the pre-configured field scope

Conversely, Topics that are stable with high satisfaction and predictable question patterns can continue to operate as-is for the time being.

Phase 2: Run the Conversion Script

Download the conversion script convert_topics_to_enrichment.py from the community post "Conversion Guide: Amazon Quick Topics → Dataset Q&A" linked above.

Run the script after meeting the following prerequisites.

  • Python 3.8 or higher
  • AWS credentials with quicksight:ListTopics and quicksight:DescribeTopic permissions
  • boto3 installed (pip install boto3)
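
The prerequisites above can be checked before running the script. The following is a minimal sketch, not part of the published guide: check_prerequisites and build_topic_read_policy are hypothetical helper names, and the "Resource": "*" scope in the policy is an assumption that you may want to narrow to specific topic ARNs.

```python
import importlib.util
import json
import sys
from typing import List, Tuple


def check_prerequisites(
    min_version: Tuple[int, int] = (3, 8), module: str = "boto3"
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems: List[str] = []
    if sys.version_info[:2] < min_version:
        problems.append(
            "Python %d.%d or higher is required" % min_version
        )
    if importlib.util.find_spec(module) is None:
        problems.append("%s is not installed (pip install %s)" % (module, module))
    return problems


def build_topic_read_policy() -> dict:
    """Build an IAM policy document allowing the two read-only Topics actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["quicksight:ListTopics", "quicksight:DescribeTopic"],
                "Resource": "*",  # assumption: restrict to topic ARNs if possible
            }
        ],
    }


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
print(json.dumps(build_topic_read_policy(), indent=2))
```

The printed JSON can be pasted into an IAM policy attached to the user or role whose credentials run the script.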

Example invocations are as follows.

# Export all Topics in YAML format (default)
python convert_topics_to_enrichment.py --account-id $(aws sts get-caller-identity --query "Account" --output text)

# Export a specific Topic in JSON format
python convert_topics_to_enrichment.py --account-id $(aws sts get-caller-identity --query "Account" --output text) \
  --topic-id my-topic-id --format json

# Export in human-readable text format
python convert_topics_to_enrichment.py --account-id $(aws sts get-caller-identity --query "Account" --output text) --format txt

When I actually ran it, every Topic in the account was exported at once, including Topics I don't own.

% python convert_topics_to_enrichment.py --account-id $(aws sts get-caller-identity --query "Account" --output text) --region ap-northeast-1
Listing all topics in account 123456789012...
Found 21 topic(s)

:
:

Processing topic: siqCyC17cQiSRZoPuqj2PgzsI2SRZG2L
  Written: software_sales_enrichment.yaml

:
:

Done.

The contents of the generated enrichment file software_sales_enrichment.yaml are as follows.

topic_id: siqCyC17cQiSRZoPuqj2PgzsI2SRZG2L
topic_name: Software Sales
description: 'A sample Q topic with software sales order details topic with product, location, sales and geographical information. '
datasets:
- dataset_arn: arn:aws:quicksight:ap-northeast-1:123456789012:dataset/d5fe021f-e8a9-4582-9ff4-450f2b358989
  dataset_name: Daily Customer Sales
  dataset_description: Daily Customer Sales Data
  data_aggregation:
    date_granularity: DAY
    default_date_column: Order Date
  fields:
  - column_name: Order ID
    friendly_name: Order ID
    description: Order Identifier
    data_role: DIMENSION
    default_aggregation: DISTINCT_COUNT
    not_allowed_aggregations:
    - COUNT
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Identifier
  - column_name: Order Date
    friendly_name: Order Date
    description: Customer Order Date
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Date
      sub_type_name: Day
  - column_name: Date Key
    friendly_name: Date Key
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: false
    semantic_type:
      type_name: DatePart
  - column_name: Contact Name
    friendly_name: Contact Name
    description: Customer Name
    synonyms:
    - Customer Name
    - Customer POC
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Person
  - column_name: Country
    friendly_name: Country
    description: Customer Country
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Location
      sub_type_name: Country
  - column_name: City
    friendly_name: City
    description: Customer City
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Location
      sub_type_name: City
  - column_name: Region
    friendly_name: Region
    description: Sales Region
    synonyms:
    - Area
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Location
  - column_name: Subregion
    friendly_name: Subregion
    description: Sales Sub-Regions
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Location
  - column_name: Customer
    friendly_name: Customer
    synonyms:
    - buyer
    - purchaser
    - company
    - client
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
  - column_name: Customer ID
    friendly_name: Customer ID
    description: Unique Customer Identifier
    data_role: DIMENSION
    default_aggregation: DISTINCT_COUNT
    never_aggregate_in_filter: false
    is_included_in_topic: true
    semantic_type:
      type_name: Identifier
  - column_name: Industry
    friendly_name: Industry
    description: Customer Industry
    synonyms:
    - Domain
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
  - column_name: Segment
    friendly_name: Segment
    description: Customer Business Segment
    synonyms:
    - Sector
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
  - column_name: Product
    friendly_name: Product
    description: Product Name
    synonyms:
    - Item
    - Service
    data_role: DIMENSION
    default_aggregation: DISTINCT_COUNT
    never_aggregate_in_filter: false
    is_included_in_topic: true
  - column_name: License
    friendly_name: License
    description: Product License Detail
    data_role: DIMENSION
    never_aggregate_in_filter: false
    is_included_in_topic: true
  - column_name: Sales
    friendly_name: Sales
    description: Sales in dollar
    synonyms:
    - Revenue
    - Spend
    data_role: MEASURE
    default_aggregation: SUM
    never_aggregate_in_filter: false
    is_included_in_topic: true
    formatting:
      display_format: CURRENCY
      currency_symbol: $
      fraction_digits: 0
      use_grouping: false
      use_blank_cell_format: false
    semantic_type:
      type_name: Currency
  - column_name: Quantity
    friendly_name: Quantity
    description: Quantity Sold
    synonyms:
    - qty
    data_role: MEASURE
    default_aggregation: SUM
    never_aggregate_in_filter: false
    is_included_in_topic: true
    formatting:
      display_format: NUMBER
      fraction_digits: 0
      use_grouping: false
      use_blank_cell_format: false
  - column_name: Discount
    friendly_name: Discount
    description: Sales Discount applied on order
    data_role: MEASURE
    default_aggregation: SUM
    never_aggregate_in_filter: false
    is_included_in_topic: true
    formatting:
      display_format: CURRENCY
      currency_symbol: $
      fraction_digits: 0
      use_grouping: false
      use_blank_cell_format: false
  - column_name: Profit
    friendly_name: Profit
    description: Estimated Gross Profit
    data_role: MEASURE
    default_aggregation: SUM
    never_aggregate_in_filter: false
    is_included_in_topic: true
    formatting:
      display_format: CURRENCY
      currency_symbol: $
      fraction_digits: 0
      use_grouping: false
      use_blank_cell_format: false
  named_entities:
  - entity_name: Order Details
    description: Named Entity with data fields commonly asked for orders
    definitions:
    - field_name: Order Date
    - field_name: Customer
    - field_name: Product
    - field_name: Segment
    - field_name: Industry
    - field_name: Quantity
    - field_name: Sales
    - field_name: Profit
    - field_name: Discount
    - field_name: Order ID
  - entity_name: Order location
    definitions:
    - field_name: Order Date
    - field_name: Product
    - field_name: Customer
    - field_name: Sales
    - field_name: Quantity
    - field_name: Country
    - field_name: City
    - field_name: Region
    - field_name: Subregion

The generated enrichment file can be uploaded directly to Dataset Q&A, or you can use its contents as a reference to manually configure field descriptions and calculated fields via the UpdateDataSet API or the console.
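
Before uploading, it can help to review which terms will resolve to which column. The sketch below assumes the file was exported with --format json and that the JSON output uses the same keys as the YAML shown above (datasets, fields, column_name, friendly_name, synonyms, is_included_in_topic); that equivalence is an assumption, not something the guide states. build_synonym_index and excluded_fields are hypothetical helper names.

```python
from typing import Any, Dict, List


def build_synonym_index(enrichment: Dict[str, Any]) -> Dict[str, str]:
    """Map each lowercased friendly name or synonym to its column name."""
    index: Dict[str, str] = {}
    for dataset in enrichment.get("datasets", []):
        for field in dataset.get("fields", []):
            column = field["column_name"]
            terms = [field.get("friendly_name", column), *field.get("synonyms", [])]
            for term in terms:
                # If two columns share a term, the first one listed wins.
                index.setdefault(term.lower(), column)
    return index


def excluded_fields(enrichment: Dict[str, Any]) -> List[str]:
    """List columns that were marked as not included in the original Topic."""
    return [
        field["column_name"]
        for dataset in enrichment.get("datasets", [])
        for field in dataset.get("fields", [])
        if not field.get("is_included_in_topic", True)
    ]


# A small excerpt mirroring the sample file above.
SAMPLE = {
    "datasets": [
        {
            "fields": [
                {"column_name": "Date Key", "friendly_name": "Date Key",
                 "is_included_in_topic": False},
                {"column_name": "Sales", "friendly_name": "Sales",
                 "synonyms": ["Revenue", "Spend"], "is_included_in_topic": True},
            ]
        }
    ]
}

print(build_synonym_index(SAMPLE)["revenue"])
print(excluded_fields(SAMPLE))

# To inspect a real export (assumed file name):
#   import json
#   with open("software_sales_enrichment.json", encoding="utf-8") as f:
#       enrichment = json.load(f)
```

For the excerpt, the lookup of "revenue" resolves to the Sales column and Date Key is reported as excluded, which matches the synonyms and is_included_in_topic values in the sample file.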

Phase 3: Enable Dataset Q&A

Because Dataset Enrichment was not available for datasets created in the legacy experience, I created a dataset in the new experience.

Open the target dataset in Amazon QuickSight, go to "Dataset details" → "Custom instructions" → "Upload file", and upload the generated enrichment file software_sales_enrichment.yaml. Press the [Save and publish] button to apply the changes.

For the prompt "Please tell me the total revenue," the correct result is returned: no column named "revenue" exists, but "Revenue" is configured as a synonym for Sales.

Notes on Usage

  • Topics will not be deprecated. They continue to be fully supported and will evolve going forward. However, for new Q&A implementations, Dataset Q&A + Dataset Enrichment is recommended.
  • Topics and Dataset Q&A can run in parallel. You can use Topics on sheets and Dataset Q&A via chat.
  • SPICE datasets are subject to capacity limits, but direct queries have no row count limits.
  • Dataset Q&A may automatically generate complex calculations such as window functions, so you may not need to recreate every calculated field you had in Topics (refer to the official documentation for details).

In Closing

The curation accumulated in Amazon QuickSight Topics, such as synonyms and calculated fields, can be carried over to the Dataset Enrichment format for Dataset Q&A using the conversion script. Dataset Q&A offers advantages such as high accuracy from its text-to-SQL foundation, support for large datasets, and automatic RLS/CLS application, so it makes sense to migrate incrementally, starting with Topics that have many unanswered questions or complex calculated fields.

For those already operating Topics, why not start by identifying migration candidates in the Topics User Activity tab and exporting the metadata with the conversion script?
