I verified 23 new commands and functions added to CloudWatch Logs Insights (June 2026 edition)

In June 2026, we verified the 23 new commands and functions added to CloudWatch Logs Insights. We tested the hash functions, IP classification, type conversion, CSV/XML parsing, histogram, multi-stage stats pipelines, and more using actual queries, and summarized the confirmed behaviors and caveats.

suzuki.ryo

2026.06.10


Introduction
On June 8, 2026, 23 new commands and functions were added to Amazon CloudWatch Logs Insights.
https://aws.amazon.com/jp/about-aws/whats-new/2026/06/amazon-cloudwatch-logs-insights-new/
In the previous article, we introduced the 13 additions announced on May 21, but in less than a month, there has been another significant addition.
https://dev.classmethod.jp/articles/cloudwatch-logs-insights-new-commands-functions-2026/


Announcement Date	Number Added	Main Categories
5/21	13	String operations, encode/decode, logfmt parsing, coordinate calculation
6/8	23	Hash, IP classification, type conversion, time series analysis, CSV/XML parsing, command extensions

Comparing the trends of features added in May and June reveals the direction of the query language.



Aspect	May (13)	June (23)
Character	Log viewing and formatting tool	Log analysis platform
Main work	"Read and search"	"Aggregate and determine"
What it replaced	Manual inspection, manual decoding	Post-processing in Athena/pandas/Splunk
Typical use cases	Initial investigation during incidents	Security audits, traffic analysis, SLO aggregation

While May's additions were focused on "making logs easier to read," June's additions can be described as "getting answers from logs."
The list of new features added this time is as follows.


Category	Items
Hash functions	md5, sha256
String functions	strcontains (case-insensitive support), split
Conditional logic	if
Conversion functions	toNumber, toInt, toLong, toDouble
IP functions	ipv4ToNumber, isPrivateIP, isPublicIP, isReservedIP
Analytics functions	rate, count_over_time, sum_over_time, offset, histogram
Parse functions	parse CSV, parse XML, parse multi, values, addtotals
Other	limit any N, stats command up to 10 stages

Note: The official announcement states "23 new query commands and functions." The table above lists items such as parse CSV, parse XML, and parse multi individually, so it contains more than 23 entries. The official announcement counts syntax extensions together, and this article uses the official count of 23 as-is.
This article runs each of these in practice and reports the results.
Verification Environment
Region: us-east-1
Log group: /test/insights-new-2026-06
Test data: JSON format, CSV format, XML format, and time series data. Fictional user IDs, service names, and IP addresses are used (the IPs consist of RFC 5737 TEST-NET addresses, RFC 1918 private ranges, and well-known addresses such as 8.8.8.8).
Hash Functions (md5, sha256)
md5 and sha256 are functions that generate hash values from field values. They can be used when you want to display hashed versions of user IDs and similar values in logs.
fields user_id, md5(user_id) as md5_hash, sha256(user_id) as sha256_hash
| limit 3


user_id	md5_hash	sha256_hash
usr_010	2fb6c8adce410db19ed04a7157b1ebd0	34f0365ae65242b4664ad6a1e4fe941c77caf56d7bd5aca88f1e9c6927012207
usr_009	3b24508aecac732b2dbf6d4e4bf9c4c2	8c231f143bc453047665350be67bc92029fe34375ddd45bee71164fcb278315c
usr_008	136ea0f2a0158fcbbd873dead2d60963	a8d79cc679fa29d08b483641f5d4b01faea2e81aa44edc7684fc2b936f32437c

md5 returned a 128-bit (32 hexadecimal characters) string, and sha256 returned a 256-bit (64 hexadecimal characters) string.
Warning: Low-entropy values such as user_id can be recovered through dictionary attacks, so hashing alone does not constitute anonymization, even with sha256. In addition, md5 is cryptographically weak (collision attacks are practical), so avoid it for use cases requiring collision resistance.
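To make the two points above concrete (the digest lengths and the dictionary-attack risk), here is a minimal Python sketch using the standard hashlib module. The ID format usr_NNN and the candidate range 0 to 999 are assumptions for illustration. The sketch also assumes the hash is computed over the UTF-8 bytes of the value, which was not confirmed for Logs Insights.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the MD5 digest of a UTF-8 string as lowercase hex."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as lowercase hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_user_id(target_hash: str, max_id: int = 999):
    """Dictionary attack: hash every candidate ID and compare.

    The usr_NNN format is a hypothetical ID scheme for this example.
    """
    for n in range(max_id + 1):
        candidate = f"usr_{n:03d}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    # Digest lengths in hex characters: 32 for MD5, 64 for SHA-256
    print(len(md5_hex("usr_010")), len(sha256_hex("usr_010")))
    # At most 1,000 hash computations are needed to reverse the hash
    print(recover_user_id(sha256_hex("usr_010")))
```

With only a thousand possible IDs, the original value is recovered almost instantly, which is why an unkeyed hash of a predictable identifier should be treated as a pseudonym rather than as anonymized data.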
String Functions (strcontains, split)
split
split splits a string into an array using a specified delimiter.
fields tags, split(tags, ',') as tag_array
| limit 3


tags	tag_array
prod,warning,us-east-1	["prod","warning","us-east-1"]
prod,normal,ap-northeast-1	["prod","normal","ap-northeast-1"]
prod,critical,ap-northeast-1	["prod","critical","ap-northeast-1"]

The comma-delimited tag string was expanded into array format (["prod","warning","us-east-1"]).
strcontains
strcontains determines whether a string contains a specific substring. According to the documentation, specifying true as the third argument enables case-insensitive search.
fields service,
  strcontains(service, 'auth') as has_auth_lower,
  strcontains(service, 'AUTH') as has_AUTH_upper,
  strcontains(service, 'AUTH', true) as has_AUTH_ci
| filter ispresent(service)
| sort service
| limit 5


service	has_auth_lower	has_AUTH_upper	has_AUTH_ci
api-gateway	0	0	0
auth-service	1	0	0
cdn-edge	0	0	0
data-pipeline	0	0	0
geo-service	0	0	0

The basic operation (first and second arguments only) is working correctly. strcontains(service, 'auth') returns 1 for auth-service.
However, in the author's environment, the effect of the third argument true (case-insensitive mode) could not be confirmed. strcontains(service, 'AUTH', true) returned 0 for auth-service, appearing to still behave in a case-sensitive manner. The argument itself was accepted without a syntax error, but specifying it did not change the result.
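For comparison, the documented semantics can be modeled as follows. This Python sketch is a hypothetical model, not the Logs Insights implementation, and the parameter name ignore_case is an assumption. Under this model, the third column would be 1 for auth-service, which is what the documentation leads one to expect and differs from the 0 observed above.

```python
def strcontains(haystack: str, needle: str, ignore_case: bool = False) -> int:
    """Return 1 if needle occurs in haystack, else 0 (hypothetical model)."""
    if ignore_case:
        # casefold() is an aggressive lower() intended for caseless matching
        haystack, needle = haystack.casefold(), needle.casefold()
    return 1 if needle in haystack else 0


if __name__ == "__main__":
    for args in (("auth-service", "auth"),
                 ("auth-service", "AUTH"),
                 ("auth-service", "AUTH", True)):
        print(args, strcontains(*args))
```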
Conditional Logic (if)
if is a function that returns a value based on a condition. It uses the syntax if(condition, value_if_true, value_if_false) and can be used like a ternary operator.
fields service, response_time,
  if(toNumber(response_time) > 1000, 'slow', 'fast') as speed
| limit 5


service	response_time	speed
queue-worker	2045	slow
search-service	156	fast
auth-service	8901	slow
notification	67	fast
geo-service	312	fast

Requests with a response time exceeding 1000ms were successfully classified as slow.
Compared to the case function added in the previous (May) update, if is suited for a single condition with two choices, while case is suited for multiple branches.
Conversion Functions (toNumber, toInt, toLong, toDouble)
Four functions for converting string fields to numeric types have been added. Let's verify the differences between each type conversion.
Converting Integer Values
fields response_time,
  toInt(response_time) as rt_int,
  toLong(response_time) as rt_long,
  toDouble(response_time) as rt_double,
  toNumber(response_time) as rt_number
| filter ispresent(response_time)
| limit 5


response_time	rt_int	rt_long	rt_double	rt_number
234	234	234	234	234
89	89	89	89	89
1523	1523	1523	1523	1523
5002	5002	5002	5002	5002
not_a_number	(null)	(null)	(null)	(null)

For integer values, all four functions returned the same result. When conversion fails (e.g., not_a_number), null is returned without an error.
Converting Decimal Values (where type differences are prominent)
parse @message /latency=(?<lat_val>[\d.]+)/
| display lat_val,
    toInt(lat_val) as lat_int,
    toLong(lat_val) as lat_long,
    toDouble(lat_val) as lat_double,
    toNumber(lat_val) as lat_number
| filter ispresent(lat_val)


lat_val	lat_int	lat_long	lat_double	lat_number
890.12	890	890	890.12	890.12
45.7	45	45	45.7	45.7
123.456	123	123	123.456	123.456

Differences appeared with values containing decimals.
toInt / toLong: Truncated the decimal portion (verified for positive values only; whether negative values are floored or truncated toward zero was not confirmed).
toDouble / toNumber: Retained the decimal portion and yielded the same results within the scope of this verification.
Based on the type names, a difference between toInt and toLong may appear for values exceeding the 32-bit signed integer range (approximately 2.1 billion), but no difference was confirmed with the test data in this verification.
For practical use, toInt/toLong suit aggregations where decimal precision is not needed, while toDouble/toNumber suit calculations where decimals must be retained.
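The observed conversion behavior can be approximated in Python as below. This is a sketch under assumptions, not the Logs Insights implementation: the function names to_double and to_long are hypothetical, and truncation toward zero for negative inputs is Python's behavior, which the verification above did not confirm for Logs Insights.

```python
import math
from typing import Optional

INT32_MAX = 2**31 - 1  # 2,147,483,647, the "approximately 2.1 billion" limit


def to_double(text: str) -> Optional[float]:
    """Parse text as a float; return None (null) when it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def to_long(text: str) -> Optional[int]:
    """Parse text and drop the fractional part; None when not numeric."""
    value = to_double(text)
    if value is None or not math.isfinite(value):
        return None
    # int() truncates toward zero. For positive inputs this matches the
    # observed results; negative inputs were not verified in the article.
    return int(value)


if __name__ == "__main__":
    for raw in ("234", "890.12", "not_a_number", "3000000000"):
        print(raw, to_long(raw), to_double(raw))
    # A value like 3000000000 exceeds INT32_MAX, which is where toInt and
    # toLong might diverge in Logs Insights (not confirmed).
    print(to_long("3000000000") > INT32_MAX)
```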
IP Functions (ipv4ToNumber, isPrivateIP, isPublicIP, isReservedIP)
Four functions for numeric conversion and classification of IP addresses have been added. They are useful for analyzing VPC flow logs and ALB access logs.
fields ip,
  ipv4ToNumber(ip) as ip_num,
  isPrivateIP(ip) as is_private,
  isPublicIP(ip) as is_public,
  isReservedIP(ip) as is_reserved
| limit 10


ip	ip_num	is_private	is_public	is_reserved
10.255.255.1	184549121	1	0	0
52.194.x.x	885131796	0	1	0
192.0.2.1	3221225985	0	0	1
169.254.169.254	2852039166	0	0	1
100.64.0.1	1681915905	0	0	1
8.8.8.8	134744072	0	1	0
172.16.0.10	2886729738	1	0	0
10.0.0.55	167772215	1	0	0
203.0.113.50	3405803826	0	0	1
192.168.1.100	3232235876	1	0	0

Note: The lower octets of some public IPs are masked (complete IP addresses were used in the actual verification).
For the IP addresses prepared this time, we confirmed that the expected classification results were obtained. The classification criteria that can be read from the results are as follows.
Private (isPrivateIP = 1): Private address ranges defined in RFC 1918
10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

Reserved (isReservedIP = 1): Address ranges reserved for special purposes
169.254.0.0/16 (link-local), 100.64.0.0/10 (CGN / Shared Address)
192.0.2.0/24 (TEST-NET-1), 203.0.113.0/24 (TEST-NET-3)

Public (isPublicIP = 1): Within the range tested, general public IP addresses that do not fall under Private or Reserved were determined to be 1
For the IPv4 addresses verified this time, exactly one of the three classifications was 1 for every address, so the three functions were mutually exclusive within the tested range.
ipv4ToNumber converts an IPv4 address to its 32-bit integer value. It can be used for filtering by IP range (ipv4ToNumber(ip) >= X and ipv4ToNumber(ip) <= Y).
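The numeric conversion and the classification criteria read from the results can be reproduced with Python's standard ipaddress module. This is a sketch that encodes only the ranges observed in this verification. The actual list used by Logs Insights is not documented here and is likely longer, and the helper names classify and in_range are hypothetical.

```python
import ipaddress

# RFC 1918 private ranges
PRIVATE_NETS = [ipaddress.ip_network(n) for n in
                ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
# Only the reserved ranges observed in this verification (assumption:
# the real list used by Logs Insights may include more ranges).
RESERVED_NETS = [ipaddress.ip_network(n) for n in
                 ("169.254.0.0/16", "100.64.0.0/10",
                  "192.0.2.0/24", "203.0.113.0/24")]


def ipv4_to_number(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def classify(ip: str) -> str:
    """Return 'private', 'reserved', or 'public' (mutually exclusive)."""
    addr = ipaddress.IPv4Address(ip)
    if any(addr in net for net in PRIVATE_NETS):
        return "private"
    if any(addr in net for net in RESERVED_NETS):
        return "reserved"
    return "public"


def in_range(ip: str, first: str, last: str) -> bool:
    """Range filter equivalent to ipv4ToNumber(ip) >= X and <= Y."""
    return ipv4_to_number(first) <= ipv4_to_number(ip) <= ipv4_to_number(last)


if __name__ == "__main__":
    for ip in ("10.255.255.1", "192.0.2.1", "8.8.8.8"):
        print(ip, ipv4_to_number(ip), classify(ip))
```

The integer values produced this way match the ip_num column above for the unmasked addresses, for example 184549121 for 10.255.255.1.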
Analytics Functions (rate, count_over_time, sum_over_time, offset, histogram)
Five new features related to time series data analysis have been added.
count_over_time
count_over_time counts the number of records per time range.
filter metric = 'cpu_usage'
| stats count_over_time(*) as cot by bin(2m)


bin(2m)	cot
2026-06-09 16:20:00.000	3
2026-06-09 16:18:00.000	4
2026-06-09 16:16:00.000	4
2026-06-09 16:14:00.000	4
2026-06-09 16:12:00.000	4
2026-06-09 16:10:00.000	1

Under the bin aggregation conditions of this verification, the result was equivalent to count(*).
sum_over_time
sum_over_time calculates the total value per time range.
filter metric = 'cpu_usage'
| stats sum_over_time(value) as sot by bin(2m)


bin(2m)	sot
2026-06-09 16:20:00.000	277
2026-06-09 16:18:00.000	447
2026-06-09 16:16:00.000	383
2026-06-09 16:14:00.000	408
2026-06-09 16:12:00.000	384
2026-06-09 16:10:00.000	143

This also yielded the same result as sum(value). count_over_time and sum_over_time appear to be named for time series analysis purposes. Under the bin aggregation conditions of this verification, no behavioral difference from count(*)/sum() was observed, but that alone does not prove the functions behave identically under all conditions.
histogram
histogram is used as a grouping function in the by clause, and aggregates by dividing a numeric field into buckets.
filter metric = 'cpu_usage'
| stats count(*) as cnt by histogram(value, 50)


histogram(value, 50)	cnt
50	10
100	10

The second argument is the bucket width, and the lower bound of the bucket is displayed in the results. Changing the bucket width to 25 shows a more detailed distribution.
filter metric = 'cpu_usage'
| stats count(*) as cnt by histogram(value, 25)


histogram(value, 25)	cnt
50	2
75	8
100	6
125	4

Note that histogram is a grouping function for the by clause, not an aggregation function for stats. While bin is for bucketing on the time axis, histogram is for bucketing on the numeric axis.
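The bucketing described above can be sketched in Python as follows. It assumes each value is assigned to the bucket whose lower bound is floor(value / width) × width, with the lower bound inclusive. That boundary handling is an assumption, and the sample values are made up for illustration rather than taken from the test data.

```python
import math
from collections import Counter


def histogram(values, width):
    """Count values per bucket, keyed by the bucket's lower bound."""
    counts = Counter()
    for v in values:
        lower = math.floor(v / width) * width
        counts[lower] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [52, 80, 99, 100, 130, 149]  # hypothetical cpu_usage values
    print(histogram(sample, 50))
    print(histogram(sample, 25))
```

As in the query results, halving the bucket width splits each bucket in two while the total count stays the same.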
offset
offset is a modifier for bin() that shifts the alignment (starting position) of bin boundaries.
filter metric = 'cpu_usage'
| stats count(*) as cnt by bin(5m) offset 5m


bin(5m)	cnt
2026-06-09 16:20:00.000	3
2026-06-09 16:15:00.000	10
2026-06-09 16:10:00.000	7

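The syntax is by bin(5m) offset 5m, with offset placed after bin. It is a modifier, not a function. Shifting the starting position of bin boundaries lets you align aggregation intervals with, for example, business hours. Note that in this query the offset equals the bin width, so the resulting boundaries (16:10, 16:15, 16:20) coincide with the default 5-minute alignment; an offset that is not a multiple of the bin width should make the shift visible.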
rate
rate is a function that calculates the rate of change of a numeric field within a bin. It uses the syntax rate(field, period), where the second argument specifies the time unit (1s, 1m, 2m, etc.).
filter metric = 'cpu_usage'
| stats rate(value, 1s) as rate_1s, rate(value, 1m) as rate_1m, rate(value, 2m) as rate_2m by bin(5m)


bin(5m)	rate_1s	rate_1m	rate_2m
2026-06-09 18:35:00.000	20	0.3333	0.1667
2026-06-09 18:30:00.000	20	0.3333	0.1667
2026-06-09 18:25:00.000	20	0.3333	0.1667

The test data is a time series whose value increases by 10 every 2 minutes. In the results, the ratio rate_1s : rate_1m : rate_2m was 60 : 1 : 0.5, which is consistent with rate returning the total change in the field value within a bin divided by the number of seconds in the period. Under this behavior, the shorter the period, the larger the returned value.
Note that specifying a bare numeric value (e.g., 60) as the second argument results in an error. The argument must be a duration with a time unit, such as 1m.
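The arithmetic behind that reading of the results can be checked with a few lines of Python. The model below (change within the bin divided by the period in seconds) is inferred from the observed values and is not taken from the documentation. The change of 20 per bin is likewise inferred from rate_1s = 20, and the function name observed_rate is hypothetical.

```python
# Seconds in each period used in the query
PERIOD_SECONDS = {"1s": 1, "1m": 60, "2m": 120}


def observed_rate(change_in_bin: float, period: str) -> float:
    """Inferred model: change within the bin / seconds in the period."""
    return change_in_bin / PERIOD_SECONDS[period]


if __name__ == "__main__":
    change = 20  # inferred from rate_1s = 20 in the results
    for period in ("1s", "1m", "2m"):
        print(period, round(observed_rate(change, period), 4))
```

Rounded to four decimal places, the three values are 20, 0.3333, and 0.1667, matching the rate_1s, rate_1m, and rate_2m columns.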
Parse Syntax (parse CSV, parse XML, parse multi, values, addtotals)
Log parsing capabilities have been significantly expanded. Three new parse modes (CSV, XML, and multi-match) have been added, along with values and addtotals.
parse CSV
The syntax parse @message CSV as alias1, alias2, ... splits CSV-formatted logs by column.
filter @logStream = 'test-csv-stream'
| parse @message CSV as ts, lvl, svc, val
| display ts, lvl, svc, val


ts	lvl	svc	val
2026-06-10T00:04:00Z	DEBUG	cache	10
2026-06-10T00:03:00Z	INFO	search	50
2026-06-10T00:02:00Z	WARN	payment	175
2026-06-10T00:01:00Z	ERROR	auth-service	250

Each comma-delimited value was stored in order into the aliases. There is no distinction between header rows and data rows; all rows are parsed.
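The positional mapping of columns to aliases can be illustrated with Python's csv module. This is only a sketch of the idea: the raw log line is reconstructed from the result table (the article does not show the raw logs), the header line is hypothetical, and parse_csv_line is a hypothetical helper name.

```python
import csv
import io


def parse_csv_line(message: str, aliases):
    """Split one CSV-formatted log line and map columns to aliases in order."""
    row = next(csv.reader(io.StringIO(message)), [])
    return dict(zip(aliases, row))


if __name__ == "__main__":
    aliases = ["ts", "lvl", "svc", "val"]
    # Data row reconstructed from the result table above
    print(parse_csv_line("2026-06-10T00:01:00Z,ERROR,auth-service,250", aliases))
    # A header row (hypothetical) is parsed like any other row
    print(parse_csv_line("timestamp,level,service,value", aliases))
```

Because the mapping is purely positional, a header line ends up as an ordinary record, so it needs to be filtered out separately if the log stream contains one.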
parse XML
For XML-formatted logs, fields are extracted using XPath-style path expressions.
filter @message like /<event>/
| parse @message XML '/event/level' as xlevel
| parse @message XML '/event/service' as xsvc
| parse @message XML '/event/code' as xcode
| display xlevel, xsvc, xcode


xlevel	xsvc	xcode
WARN	payment	429
INFO	api-gw	200
ERROR	auth	401

The syntax is parse @message XML '/element/path' as alias. Within the scope of this verification, values could be retrieved using simple element paths like /event/level. Retrieving the entire document object with parse @message XML as doc or accessing it with dot notation resulted in errors. To extract multiple fields, use multiple parse statements chained with pipes as shown above.
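The one-path-per-field extraction can be mimicked in Python with xml.etree.ElementTree. The sample document shape is an assumption based on the paths used in the query (the raw XML logs are not shown in this article), and extract is a hypothetical helper that handles only simple absolute element paths.

```python
import xml.etree.ElementTree as ET


def extract(message: str, path: str):
    """Return the text at a simple absolute path such as '/event/level'."""
    root = ET.fromstring(message)
    parts = path.strip("/").split("/")
    # The first path segment must name the root element
    if not parts or parts[0] != root.tag:
        return None
    if len(parts) == 1:
        return root.text
    # ElementTree paths are relative to the element, so drop the root segment
    return root.findtext("/".join(parts[1:]))


if __name__ == "__main__":
    # Hypothetical log line shaped after the query's paths
    sample = ("<event><level>WARN</level><service>payment</service>"
              "<code>429</code></event>")
    for path in ("/event/level", "/event/service", "/event/code"):
        print(path, extract(sample, path))
```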
parse multi
With multi, parse expands every match of a regular expression within a single log line into individual records. It is powerful for parsing key=value formatted logs.
filter @message like /^level=/
| parse @message /(?<kname>\w+)=(?<kval>\S+)/ multi
| stats count(*) by kname


kname	count(*)
level	3
service	3
latency	3
request_id	3

The source data has 3 lines, each containing 4 key=value pairs. Adding multi expanded them into 3 lines × 4 pairs = 12 records, enabling aggregation by key name.
Without multi, only the first match per line is extracted.
parse @message /(?<kname>\w+)=(?<kval>\S+)/
| display kname, kval


kname	kval
level	WARN
level	INFO
level	ERROR

With the regex-based parse multi tested this time, extraction worked as expected using named capture groups (?<name>...). On the other hand, the as alias multi syntax described in the documentation resulted in a syntax error in the author's environment.
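The difference between a plain parse and parse ... multi corresponds to the difference between a first-match search and an all-matches iteration. The Python sketch below shows this with the re module. The sample log line is hypothetical, and Python spells named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Same pattern as the query, in Python's named-group syntax
PATTERN = re.compile(r"(?P<kname>\w+)=(?P<kval>\S+)")


def parse_first(message: str):
    """Without multi: only the first match per line."""
    m = PATTERN.search(message)
    return [m.groupdict()] if m else []


def parse_multi(message: str):
    """With multi: one record per match within the line."""
    return [m.groupdict() for m in PATTERN.finditer(message)]


if __name__ == "__main__":
    # Hypothetical log line with four key=value pairs
    line = "level=WARN service=payment latency=890.12 request_id=req-001"
    print(parse_first(line))
    for record in parse_multi(line):
        print(record)
```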
values
values is an aggregation function that returns the distinct values within each group, combined into a single result.
filter ispresent(service) and ispresent(level)
| stats values(service) as services by level


level	services
INFO	test-convert, api-gateway, geo-service, notification, search-service
WARN	payment-service, queue-worker
ERROR	auth-service, cdn-edge
DEBUG	data-pipeline

In the API results of this verification, the values were returned as a comma-delimited string. The function is useful for listing the unique values within a group.
addtotals
addtotals is a command that adds a column containing the sum of the numeric fields in each row. Note that the sum covers all numeric fields present in the query, not only the displayed columns, so the Total value may not match the simple sum of the displayed columns.
filter ispresent(response_time)
| fields toNumber(response_time) as rt, toNumber(response_time) * 2 as rt2
| addtotals
| limit 5


rt	rt2	Total
2045	4090	8180
156	312	624
8901	17802	35604
67	134	268
234	468	936

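By default, the row total is added under the column name Total. In the results above, Total is larger than rt + rt2 (for example, 8180 rather than 6135), which is consistent with the original response_time field (2045) also being included in the sum. The column name can be customized with addtotals fieldname=RowSum.
Specifying col=true should also add a column-total row, but that row was not included in the get-query-results API response. It may only be displayed in the console UI.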
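The Total values in the table can be reproduced with the following Python arithmetic. It encodes one interpretation of the results, namely that the original response_time field is summed together with rt and rt2. This is a hypothesis that fits every row, not documented behavior, and row_total is a hypothetical helper name.

```python
def row_total(response_time: float) -> float:
    """Sum all numeric fields in the row under the stated hypothesis."""
    rt = response_time        # toNumber(response_time) as rt
    rt2 = response_time * 2   # toNumber(response_time) * 2 as rt2
    # Hypothesis: the original response_time field is included as well
    return response_time + rt + rt2


if __name__ == "__main__":
    for value in (2045, 156, 8901, 67, 234):
        print(value, row_total(value))
```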
Other (limit any, stats command up to 10 stages)
limit any
limit any returns any N records without guaranteed ordering. The regular limit returns the top N records in the order of the preceding sort (or, when no sort is specified, in what appears to be descending timestamp order), whereas limit any may return results faster when ordering is not needed.
fields service, level, ip | limit any 2


service	level	ip
search-service	INFO	10.255.255.1
cdn-edge	ERROR	52.194.x.x

This is useful when you want to quickly obtain samples from log groups with large amounts of logs.
stats command up to 10 stages
The number of stats commands that can be chained with pipes has been expanded, allowing up to 10 stages in the Standard log class.
filter ispresent(service)
| stats count(*) as cnt by service, level
| stats sum(cnt) as level_total by level
| stats max(level_total) as max_level_total, min(level_total) as min_level_total


max_level_total	min_level_total
5	1

Three stages of stats are connected with pipes. The first stage counts by service × level, the second stage sums by level, and the third stage calculates the maximum and minimum.
According to the documentation, up to 10 stages can be used in the Standard log class, while up to 2 stages can be used in the Infrequent Access log class. Subsequent stats stages can only reference fields defined in the previous stage, and sort and limit must be placed after the last stats.
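The three-stage query above, where each stage sees only the output of the previous one, can be traced with plain Python collections. The records in the usage example are hypothetical and three_stage is a hypothetical helper name. The sketch only mirrors the data flow of the stages, not how Logs Insights executes them.

```python
from collections import Counter, defaultdict


def three_stage(records):
    """Mirror the three chained stats stages on a list of dict records."""
    # Stage 1: stats count(*) as cnt by service, level
    cnt = Counter((r["service"], r["level"]) for r in records)
    # Stage 2: stats sum(cnt) as level_total by level
    # (only cnt, service, and level are visible here)
    level_total = defaultdict(int)
    for (_service, level), n in cnt.items():
        level_total[level] += n
    # Stage 3: stats max(level_total), min(level_total)
    totals = list(level_total.values())
    return max(totals), min(totals)


if __name__ == "__main__":
    records = [  # hypothetical log records
        {"service": "api-gateway", "level": "INFO"},
        {"service": "api-gateway", "level": "INFO"},
        {"service": "search-service", "level": "INFO"},
        {"service": "auth-service", "level": "ERROR"},
    ]
    print(three_stage(records))
```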
As a practical example, here is a pattern for understanding trends from aggregated message character counts by time period.
fields strlen(@message) as msg_len
| stats sum(msg_len) as total_chars by bin(5m)
| stats max(total_chars) as peak, min(total_chars) as lowest, avg(total_chars) as average


peak	lowest	average
1586	431	935.5

This calculates the total message character count per 5 minutes and then retrieves the peak, minimum, and average from those results. If logs are primarily ASCII, this can also be used to get a rough sense of message size trends. Patterns like "aggregating aggregated results" can now be completed in a single query.
Summary
With 13 additions in May and 23 in June, the CloudWatch Logs Insights query language has expanded significantly in a short period of time.
The features added this time included IP classification, CSV/XML parsing, histogram, and multi-stage stats. It has become easier to perform aggregation, classification, and trend analysis within queries, not just log searching. Within the scope of the verification, it appears there will be more situations where analysis that was previously post-processed in Athena or external tools can be completed entirely within Logs Insights.
On the other hand, at the time of writing, case-insensitive search using the third argument of strcontains did not work as expected in the author's environment.
CloudWatch Logs Insights gives the impression of evolving from "a tool for searching and reviewing logs" into a query environment that is more useful for analytical purposes as well. With these additions, there seem to be even more situations where it can be used for daily investigations and ad hoc analysis.
Reference Links
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html

