
Classmethod Data Analytics Newsletter (AWS Data Analytics Edition) – July 2026 Issue
This is Ishikawa from the Consulting Division of the Cloud Business Headquarters. Here are the AWS data analytics updates for June 2026. This month, Amazon QuickSight announced autonomous agents and multi-dataset analytics, AWS Glue Data Catalog added support for semantic search (preview), and Amazon S3 added Annotations to attach business context to objects. It was a month in which natural-language and agent-driven analysis, interactive development with Spark Connect, and semantic data discovery all advanced at the same time. There are other updates as well, so let me introduce them!
Amazon SageMaker Unified Studio
New Features & Updates
2026/06/03 - Amazon SageMaker Unified Studio now supports notebook scheduling
Amazon SageMaker Unified Studio now allows you to schedule, parameterize, and orchestrate notebook execution directly from the notebook interface. You can automate recurring workloads such as daily reports, data quality checks, and model retraining without setting up an external orchestration infrastructure, enabling a smooth transition from experimentation to production operations.
2026/06/03 - Amazon SageMaker Unified Studio now supports a localized experience in twelve languages
The Amazon SageMaker Unified Studio UI now supports localization in 12 languages, including Japanese. The supported languages are English, Chinese (Simplified and Traditional), French, German, Indonesian, Italian, Japanese, Korean, Portuguese (Brazil), Spanish, and Turkish. Both automatic detection from browser settings and manual selection in profile settings are supported.
2026/06/09 - Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
Amazon SageMaker Unified Studio notebooks now support Amazon EMR Serverless via Apache Spark Connect. You can run PySpark and Spark SQL in notebook cells on EMR Serverless Spark applications, and select the Spark runtime from the side panel. Features available include fast session startup with pre-initialized capacity, monitoring with integrated Spark UI, VPC connectivity, and SageMaker Data Agent integration for code generation.
Amazon SageMaker Data Agent
New Features & Updates
2026/06/03 - Amazon SageMaker Data Agent now supports conversation history
Amazon SageMaker Data Agent, available in SageMaker Unified Studio, now supports conversation history. You can retrieve past conversations, agent-generated code, and troubleshooting exchanges later, maintaining continuity of work across analysis sessions.
2026/06/04 - Amazon SageMaker Data Agent integrates business context into conversations
Amazon SageMaker Data Agent has been integrated with business context and metadata from SageMaker Catalog. You can search glossaries, custom metadata forms, asset descriptions, READMEs, and more to discover data using business terminology rather than technical table names, generating more accurate SQL and Python code. Business context synchronized from Collibra, Atlan, and Alation can also be utilized.
Amazon DataZone
New Features & Updates
API Changes
2026/06/15 - Amazon DataZone - 1 new method
Amazon DataZone now allows you to delete lineage events. You can clean up data lineage events that are no longer needed.
Amazon Redshift
New Features & Updates
2026/06/08 - Amazon Redshift reduces manual snapshot cost for Serverless and RG instances
A new billing model has been introduced for manual snapshots of Amazon Redshift Serverless and RG instances. Storage is measured in units of unique data blocks shared between snapshots, rather than the total size of individual snapshots. This eliminates double-charging for duplicate data when retaining multiple snapshots, allowing you to increase backup frequency and retention periods while keeping costs down. Available in all AWS commercial regions and GovCloud regions.
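To make the difference concrete, here is a small illustrative calculation. The block IDs, the 1 MB block size, and the three snapshots are made-up values for illustration only; they are not Redshift internals or actual pricing.

```python
# Illustrative only: block IDs and block size are hypothetical values.
BLOCK_SIZE_MB = 1

# Three manual snapshots that mostly share the same data blocks.
snapshots = {
    "snap-mon": {"b1", "b2", "b3", "b4"},
    "snap-tue": {"b1", "b2", "b3", "b5"},
    "snap-wed": {"b1", "b2", "b5", "b6"},
}


def total_size_measure(snaps):
    """Measure storage as the sum of each snapshot's full size."""
    return sum(len(blocks) for blocks in snaps.values()) * BLOCK_SIZE_MB


def unique_block_measure(snaps):
    """Measure storage by counting each distinct block only once."""
    unique_blocks = set().union(*snaps.values())
    return len(unique_blocks) * BLOCK_SIZE_MB


print(total_size_measure(snapshots))    # 12
print(unique_block_measure(snapshots))  # 6
```

In this toy case the measured storage halves, because blocks shared across snapshots are counted once instead of three times.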
2026/06/25 - Amazon Redshift adds Reserved Instance upfront pricing options for RG instances
Payment options for Reserved Instances (RI) for Amazon Redshift RG instances have been expanded to include "All Upfront" and "Partial Upfront" options. Previously, RIs for RG instances were only available with the No Upfront payment option, but this update now allows you to choose upfront payment methods for 1-year or 3-year terms as well. All Upfront maximizes the discount by paying the full contract amount at the start, while Partial Upfront requires a lump sum payment upfront with the remainder split into lower monthly payments. Combined with the No Upfront option, you can now optimize compute costs according to your cash flow needs by choosing from three payment patterns.
AWS Glue
New Features & Updates
2026/06/17 - AWS Glue Data Catalog now supports business context and semantic search (Preview)
AWS Glue Data Catalog now supports business context and semantic search (preview). By indexing business context alongside technical metadata, you can search for data by "meaning" using the new Glue Search API. Tables can be discovered not only by their structure, but also by their business meaning through glossary terms and descriptive metadata. It also supports use by MCP-compatible AI agents. Available in US East, US West, and Europe (Ireland) regions.
2026/06/17 - AWS Glue Interactive Sessions now support Spark Connect for interactive workloads
AWS Glue Interactive Sessions now support Apache Spark Connect. You can develop and run Spark applications on Glue's serverless infrastructure without cluster management, from managed notebooks in Amazon SageMaker Unified Studio or from your preferred environment such as Jupyter or Visual Studio Code. It can be used for ad hoc data exploration, step-by-step iterative debugging, and incremental development of PySpark jobs before deploying to production. Available in multiple regions including Tokyo.
API Changes
2026/06/04 - AWS Glue - 2 new 3 updated methods
AWS Glue Interactive Sessions now support Apache Spark Connect. Remote Spark execution via gRPC is now available, and the GetSessionEndpoint and GetDashboardUrl APIs have been added. CreateSession now accepts the SPARK_CONNECT session type.
2026/06/12 - AWS Glue - 6 updated methods
GetTable can now retrieve metadata for Apache Iceberg tables. By specifying LATEST_ICEBERG_METADATA in the new AttributesToGet parameter, you can receive the schema, partition specification, sort order, and table properties in the response.
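As a rough sketch, the request might be assembled as follows. The AttributesToGet parameter and the LATEST_ICEBERG_METADATA value are taken from the description above; passing the value as a list, and the database and table names, are my assumptions, so please check the SDK reference for your version before relying on this.

```python
def build_get_table_request(database: str, table: str) -> dict:
    """Build keyword arguments for a Glue GetTable call that asks for Iceberg metadata."""
    return {
        "DatabaseName": database,
        "Name": table,
        # Assumption: the SDK exposes the new parameter under this name
        # and accepts a list of attribute names.
        "AttributesToGet": ["LATEST_ICEBERG_METADATA"],
    }


# "sales_db" and "orders" are hypothetical names.
request = build_get_table_request("sales_db", "orders")
print(request)

# With an SDK version that includes this update, the call might look like:
#   import boto3
#   response = boto3.client("glue").get_table(**request)
```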
2026/06/17 - AWS Glue - 28 new methods
Search and Discovery for AWS Glue is now supported. You can search for assets such as Data Catalog tables and enrich them with business context and glossary terms.
2026/06/19 - AWS Glue - 1 new 4 updated methods
A SearchAssets operation has been added to discover Data Catalog assets using full-text search and filters. The naming of Glossary Terms and Attachment APIs has also been reorganized.
2026/06/29 - AWS Glue - 1 new method
An UpdateAsset operation has been added to set business names and descriptions for existing AWS Glue Data Catalog assets.
AWS Lake Formation
New Features & Updates
2026/06/12 - Access Amazon S3 data files directly using AWS Lake Formation permissions
You can now use AWS Lake Formation permissions to directly read from and write to data files in registered S3 locations from Apache Spark jobs on Amazon EMR. Lake Formation credential vending for S3 location access is available with EMR release label 7.13 and later, Boto3 1.42.29 and later, AWS SDK for Java 2.41.32 and later, and AWS CLI 2.33.1 and later. Fine-grained access control at the table level can be maintained on the Lake Formation side without individually managing S3 bucket policies.
Amazon QuickSight
New Features & Updates
2026/06/01 - Amazon QuickSight now supports VPC connectivity for MCP connections
Amazon QuickSight now supports VPC connectivity to Model Context Protocol (MCP) servers. You can connect privately hosted MCP servers to QuickSight via Amazon VPC without exposing them to the internet.
2026/06/11 - Amazon QuickSight now integrates with Snowflake Cortex AI
Amazon QuickSight has been integrated with Snowflake Cortex AI through the Model Context Protocol. You can query Snowflake data and documents in natural language and automate multi-step workflows. This enables structured data analysis with Cortex Analyst, obtaining insights from documents with Cortex Search, and answers combined with internal knowledge accumulated in QuickSight Spaces.
2026/06/16 - Amazon QuickSight expands integrations with new connectors for Adobe, Figma, WhatsApp, and more
Amazon QuickSight has added 16 new connectors including Adobe, Figma, and WhatsApp. Tools that teams use on a daily basis, spanning productivity, design, analytics, data infrastructure, and communication, can be added to the workspace in minutes and integrated into QuickSight Flows, Chat, and Spaces.
2026/06/17 - Amazon QuickSight announces autonomous agents, multi-dataset analytics, and redesigned activity feed
Amazon QuickSight has announced autonomous agents, multi-dataset analytics, and a redesigned activity feed. With autonomous agents, you can assign tasks in natural language and fine-tune the level of autonomy, from step-by-step approval to broad goal-based discretion. With multi-dataset analytics, you can query across multiple data sources, including Snowflake and relational databases, in natural language.
API Changes
2026/06/01 - Amazon QuickSight - 22 new methods
Public APIs for Amazon QuickSight Spaces, Agents, and Flows have been added. The Spaces API manages curated resource collections, the Agents API provides lifecycle control for AI agents that leverage Spaces, and the Flows API provides create, read, update, delete, and list (CRUDL) operations for automated workflows.
2026/06/05 - Amazon QuickSight - 8 new methods
The Knowledge Base API and the Index Capacity API have been added.
Amazon OpenSearch Service
New Features & Updates
2026/06/10 - Amazon OpenSearch Service launches MCP Apps for agentic observability
Amazon OpenSearch Service now supports MCP Apps, enabling you to use observability workflows directly from compatible agentic IDEs such as Claude Desktop and VS Code. AI agents in your local environment can investigate incidents using logs, traces, metrics, and alerts stored in OpenSearch domains and collections or Amazon Managed Service for Prometheus. Visualization, root cause analysis, and distributed trace exploration can all be performed within the conversation.
2026/06/23 - Amazon OpenSearch Service now offers AI-assisted migrations
An AI-assisted migration experience has been added to the Migration Assistant for Amazon OpenSearch Service. It simplifies migrating self-managed Apache Solr, Elasticsearch, and OpenSearch to OpenSearch Serverless or managed clusters. For Solr, live traffic capture and replay are also supported.
API Changes
2026/06/17 - Amazon OpenSearch Service - 1 updated method
You can now configure IAM Identity Center options for existing OpenSearch applications using the UpdateApplication API.
2026/06/19 - Amazon OpenSearch Service - 4 new methods
Data source attachment APIs have been added. You can attach and detach OpenSearch Service domains and OpenSearch Serverless collections to OpenSearch applications.
Amazon OpenSearch Serverless
New Features & Updates
2026/06/08 - Amazon OpenSearch Serverless now supports Agentic Search
With Agentic Search, you simply describe what you are looking for, such as "search for flights to Tokyo under $800" or "show the best-selling products in the electronics category this month." The system interprets the user's intent, plans the optimal search strategy, generates appropriate DSL queries, and returns results along with a clear explanation of its reasoning. Behind the scenes, a built-in QueryPlanningTool powered by an LLM translates natural language into DSL, orchestrates the appropriate tools, and retrieves results. Configuration and customization are available via API or OpenSearch Dashboards, and the OpenSearch UI provides guided agent creation and search execution.
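For reference, here is a hand-written sketch of the kind of query DSL that the first example request could translate into. The field names destination and price are hypothetical, and the query an agent actually generates will depend on your index mapping.

```python
import json


def flights_under_price(destination: str, max_price: float) -> dict:
    """Build an OpenSearch bool query: match the destination, filter by price."""
    return {
        "query": {
            "bool": {
                # Full-text match on a hypothetical "destination" field.
                "must": [{"match": {"destination": destination}}],
                # "under $800" maps to a strict less-than range filter
                # on a hypothetical "price" field.
                "filter": [{"range": {"price": {"lt": max_price}}}],
            }
        }
    }


query = flights_under_price("Tokyo", 800)
print(json.dumps(query, indent=2))
```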
Amazon EMR
New Features & Updates
2026/06/09 - Announcing Spark Connect on Amazon EMR Serverless
Apache Spark Connect is now supported on Amazon EMR Serverless (EMR release 7.13 / Apache Spark 3.5.6 and later). You can build and debug Spark applications from your preferred local environment such as VS Code or Jupyter, while running full-scale Spark processing on EMR Serverless. By separating the client and server, interactive ad hoc analysis and iterative debugging become easier.
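On the client side, connecting generally means pointing PySpark at a Spark Connect endpoint. The sketch below builds a connection string in the standard sc:// format and passes it to SparkSession.builder.remote. The host name is a placeholder, and how EMR Serverless issues the endpoint and authenticates the client is not covered here, so treat this as a minimal sketch rather than the official procedure.

```python
def spark_connect_url(host: str, port: int = 15002, **params: str) -> str:
    """Build a Spark Connect connection string: sc://host:port/;key=value;..."""
    url = f"sc://{host}:{port}"
    if params:
        url += "/;" + ";".join(f"{key}={value}" for key, value in params.items())
    return url


def connect(url: str):
    """Create a remote SparkSession (requires PySpark with Spark Connect support)."""
    # Imported lazily so that the URL helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(url).getOrCreate()


# "spark-connect.example.internal" is a placeholder host, not a real endpoint.
url = spark_connect_url("spark-connect.example.internal", 443, use_ssl="true")
print(url)  # sc://spark-connect.example.internal:443/;use_ssl=true

# spark = connect(url)
# spark.range(5).show()
```

Because only the thin client runs locally, the DataFrame code itself stays the same whether it targets a local Spark server or a remote one.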
API Changes
2026/06/04 - Amazon EMR - 5 new 4 updated methods
Interactive sessions with Spark Connect are now supported on Amazon EMR on EC2. The StartSession, GetSession, GetSessionEndpoint, ListSessions, and TerminateSession APIs have been added, and a sessionEnabled field has been added to RunJobFlow and DescribeCluster.
2026/06/05 - EMR Serverless - 4 updated methods
You can now update the maximum capacity (maximumCapacity) and custom fields even while an application is in the started state. Capacity can be adjusted without stopping running jobs.
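As a minimal sketch, the request for raising the maximum capacity could be assembled like this. The application ID and capacity values are placeholders; the parameter names follow the existing UpdateApplication API of EMR Serverless.

```python
import uuid
from typing import Optional


def build_capacity_update(application_id: str, cpu: str, memory: str,
                          disk: Optional[str] = None) -> dict:
    """Build keyword arguments for an EMR Serverless UpdateApplication call."""
    maximum_capacity = {"cpu": cpu, "memory": memory}
    if disk is not None:
        maximum_capacity["disk"] = disk
    return {
        "applicationId": application_id,
        # Idempotency token so that a retried request is not applied twice.
        "clientToken": str(uuid.uuid4()),
        "maximumCapacity": maximum_capacity,
    }


# "00example123" and the capacity figures are placeholder values.
update = build_capacity_update("00example123", cpu="400 vCPU", memory="3000 GB")
print(update["maximumCapacity"])

# The request might then be sent with:
#   import boto3
#   boto3.client("emr-serverless").update_application(**update)
```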
Amazon MSK
New Features & Updates
API Changes
2026/06/22 - Managed Streaming for Kafka - 2 updated methods
Amazon MSK Replicator now supports mTLS (mutual TLS) authentication when connecting to external Apache Kafka clusters. Data replication from clusters that require mutual TLS for client authentication is now possible (supported when replicating to MSK Express brokers).
Amazon MWAA
New Features & Updates
2026/06/11 - Amazon MWAA Serverless now supports Amazon EventBridge notifications
Amazon MWAA Serverless can now send workflow and task state change events to Amazon EventBridge. State transitions such as workflow start, running, success, and failure, as well as task scheduled, success, failure, and awaiting retry, can be handled in an event-driven manner. Automation such as sending alerts when production workflows fail or resuming dependent pipelines when upstream workflows succeed can be achieved without custom polling. Available in all regions where Amazon MWAA Serverless is offered.
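The two automation examples above could be routed by a small handler like the one below, for instance in an AWS Lambda function targeted by an EventBridge rule. The detail envelope is standard for EventBridge events, but the state field name and its values are my assumptions; please check the actual event schema before using this.

```python
def route_event(event: dict) -> str:
    """Decide what to do with a workflow state change event."""
    detail = event.get("detail", {})
    # Assumption: the state is carried in a "state" field of the event detail.
    state = str(detail.get("state", "")).lower()

    if state in {"failure", "failed"}:
        return "alert"              # e.g. notify the on-call channel
    if state in {"success", "succeeded"}:
        return "resume_downstream"  # e.g. start the dependent pipeline
    return "ignore"                 # start, running, retries, and so on


# Hypothetical sample events for illustration.
print(route_event({"detail": {"state": "FAILED"}}))   # alert
print(route_event({"detail": {"state": "SUCCESS"}}))  # resume_downstream
print(route_event({"detail": {"state": "RUNNING"}}))  # ignore
```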
Amazon S3
New Features & Updates
2026/06/16 - Amazon S3 adds annotations to provide AI agents and analytics tools with context for data discovery
Amazon S3 now supports Annotations. This is a metadata feature that allows you to attach business context directly to objects in JSON, XML, or YAML format. Annotations have the same durability and consistency as objects, move with objects during copies and replication, and are deleted together when objects are deleted. By integrating with S3 Metadata, they can be queried from Amazon Athena as Apache Iceberg tables, providing context for AI agents and analytics tools to discover and utilize data. Available in all regions including AWS China regions.
API Changes
2026/06/16 - Amazon Simple Storage Service - 5 new 7 updated methods
Support for the Annotations feature has been added. You can attach up to 1,000 annotations per object (each up to 1 MB), and new APIs are provided for creating, retrieving, listing, and deleting annotations. Configuring annotation tables in S3 Metadata is also supported.
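If you generate annotations programmatically, a client-side check against these limits before uploading may save some failed requests. The sketch below does not call any S3 API; the helper name is hypothetical, and interpreting 1 MB as 1,048,576 bytes is my assumption.

```python
import json

MAX_ANNOTATIONS_PER_OBJECT = 1_000
MAX_ANNOTATION_BYTES = 1024 * 1024  # assumption: "1 MB" means 1,048,576 bytes


def validate_annotations(annotations: list) -> list:
    """Return a list of problems found; an empty list means the limits are met."""
    problems = []
    if len(annotations) > MAX_ANNOTATIONS_PER_OBJECT:
        problems.append(
            f"{len(annotations)} annotations exceed the limit of {MAX_ANNOTATIONS_PER_OBJECT}"
        )
    for index, body in enumerate(annotations):
        if len(body) > MAX_ANNOTATION_BYTES:
            problems.append(f"annotation {index} is {len(body)} bytes, over the size limit")
    return problems


# A hypothetical JSON annotation carrying business context.
annotation = json.dumps({"owner": "sales-analytics", "pii": False}).encode("utf-8")
print(validate_annotations([annotation]))  # []
```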
Amazon S3 Vectors
New Features & Updates
2026/06/16 - Amazon S3 Vectors now supports up to 10,000 similarity search results per query
Amazon S3 Vectors can now return up to 10,000 similarity search results per query. This is a 100x increase from the previous limit. This is particularly useful for multi-stage search pipelines that perform additional processing such as reranking, aggregation, and deduplication. Available in all regions where S3 Vectors is offered.
2026/06/16 - Amazon S3 Vectors reduces query charges by up to 80% for large vector indexes
For indexes with more than 10 million vectors in Amazon S3 Vectors, query processing charges (data processed charges) have been reduced by up to 80%. This is a welcome update for those running large-scale AI, RAG, and semantic search workloads, enabling further reduction of running costs. No changes are required on the application side, as this is applied automatically.
API Changes
2026/06/16 - Amazon S3 Vectors - 1 updated method
QueryVectors now supports pagination and can return up to 10,000 results per query.
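A paginated query is typically consumed with a loop like the one below. To avoid guessing at the actual request and response fields, the page fetcher is passed in as a function; fetch_page and fake_fetch are hypothetical stand-ins, not part of the S3 Vectors API.

```python
from typing import Callable, Optional


def collect_results(fetch_page: Callable[[Optional[str]], tuple],
                    limit: int = 10_000) -> list:
    """Follow pagination tokens until the results run out or the limit is reached."""
    results = []
    token = None
    while len(results) < limit:
        items, token = fetch_page(token)
        results.extend(items)
        if token is None:  # no further pages
            break
    return results[:limit]


def fake_fetch(token: Optional[str]) -> tuple:
    """Stand-in for a real paginated call: two pages of made-up results."""
    pages = {
        None: ([{"key": "a"}, {"key": "b"}], "t1"),
        "t1": ([{"key": "c"}], None),
    }
    return pages[token]


print(collect_results(fake_fetch))           # [{'key': 'a'}, {'key': 'b'}, {'key': 'c'}]
print(collect_results(fake_fetch, limit=2))  # [{'key': 'a'}, {'key': 'b'}]
```

The collected candidates can then be handed to later stages such as reranking or deduplication.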
In Closing
June 2026 was a month when two trends, "agents and natural language" and "interactive development experience," converged across data analytics services. The shift toward discovering and analyzing data using business terminology and natural language accelerated further, with Amazon QuickSight's autonomous agents and multi-dataset analytics, SageMaker Data Agent's business context integration, and AWS Glue Data Catalog's semantic search (preview). At the same time, support for Apache Spark Connect expanded across AWS Glue, Amazon EMR, and SageMaker Unified Studio, making it possible to work with Spark interactively on serverless infrastructure from local IDEs and notebooks. Improvements in price-performance and cost optimization continue as well, such as the Graviton-based RG instances for Amazon Redshift and incremental billing for manual snapshots.
If any of these updates catches your interest, please give it a try. I hope this article is helpful to someone.
