A Brief Summary of Compound Sort Key and Interleaved Sort Key with Performance Tests | Redshift

This post is English translation of the Japanese version. I'm not a native English speaker, so please let me know if you find something like a grammatical error. Cheers!

Have you ever received something like following alerts from your Redshift Advisor?

Initialize Interleaved Sort Keys
Recomendation
Run VACUUM REINDEX, as a superuser, on tables with inactive interleaved sort keys.

Replace Single Column Interleaved Sort Keys
Recomendation
Recreate 5 tables to use a single-column compound sort key.

Both recommendations popped up because of using Interleaved Sort Key. Interleaved Sorting gives equal weight to each column, or a subset of columns, which is very convenient for database designers to use this for a bunch of tables "anyway", but this sometimes has a bad influence on Redshift performance. Although you can refer to Recommendations from your Redshift Advisor which lets you know how to deal with this problem, you need to review overall sort key designs of the tables alerted to find a fundamental solution of the problem.

Today, I'm going to provide a brief summary of characteristics of Compound Sort Key and Interleaved Sort Key and differences between them, and then examine each SQL performance.

Contents

Characteristics of Compound Sort Key and Interleaved Sort Key

Compound Sort Key and Interleaved Sort Key have detail (too detail!) characteristics.

Compound Sort Key

Compound sort key is the default sort type in Redshift and made up of one or more of its columns. Basically, Compound sorting is effective with these SQL operations; ORDER BY , GROUP BY and PARTITION BY used in window functions.

When using Compound sorting, you need to consider the order of columns in Compound sort key, the first one is called primary and the second one is called secondary . Querying only by the primary column has a good effect on the speed, while the secondary column and following columns can't be powerful without simultaneously using the primary. For example, when you want to query the following table which has Compound sort key, the primary dateid can be used only in ORDER BY operations, but the secondary eventid is definitely used with the primary dateid otherwise the sorting performance will be degraded.

CREATE TABLE sales_csk
(
  salesid integer not null,
  listid integer not null,
  sellerid integer not null,
  buyerid integer not null,
  eventid integer not null,
  dateid smallint not null,
  qtysold smallint not null,
  pricepaid decimal(8,2),
  commission decimal(8,2),
  saletime timestamp
)
DISTSTYLE KEY DISTKEY (dateid)
COMPOUND SORTKEY (dateid, eventid);

One of the major advantages of Compound sort key is that it enables Merge Join when the following criteria are met. Merge Join is the fastest way to join in Redshift and that advantage can be seen in my examination, too. However, the more complex tables or data are, the harder achieving Merge Join is, because the criteria are very strict. If the criteria are not met, Redshift chooses Hash Join or Nested Loop for the join operator instead.

The criteria of Merge Join

  1. Used for INNER JOIN and OUTER JOIN, not for FULL JOIN.
  2. Both join columns are the distribution key and include the primary of Compound sort key.
  3. Both tables are over 80% sorted.

Also, Compound sort keys help improve compression of the column.

Interleaved Sort Key

On the other hand, Interleaved sort key is also made up of one or more of its columns, but each sort key column has equal importance . That is using only the secondary column can still work. Especially, when filtering with the WHERE clause, or when the sort key column has a long (more than 8 bytes) common prefix, interleaved sorting will give better performance than compound sorting.

To keep fully benefit of Interleaved sorting, you need to periodically execute VACUUM REINDEX to the tables. The total time of VACUUM REINDEX will be longer than that of VACUUM FULL because a VACUUM REINDEX operation requires an extra analysis of the distribution of the values in interleaved columns before performing a VACUUM FULL operation. If a VACUUM REINDEX operation can't be included in a daily batch operation, you shouldn't use an interleaved sort key on columns with monotonically increasing attributes, such as id, dates, or timestamps in order to prevent its distribution from getting larger.

It is said that if a table is large enough to require multiple 1MB blocks per slice, Interleaved sorting performs better than Compound sorting. However, in other words, it doesn't have a remarkable effect on not so large tables. If the table has only 10 million rows at most, it is hard to feel the effect of Interleaved sorting because the power of Redshift is enough to perform a high-speed process to such tables. I have an image of the capability of Interleaved sorting working usefully when a table has over 100 million records.

For your information, you can refer to the indicator interleaved_skew in the view SVV_INTERLEAVED_COLUMNS to find which table should be run VACUUM REINDEX on. If the interleaved_skew is greater than 1.4, a VACUUM REINDEX will usually improve performance.

-- to check the interleaved_skew
SELECT DISTINCT
  sic.tbl as tbl_id,
  sti.schema as schema_name,
  stp.name as table_name,
  sti.tbl_rows,
  sic.col,
  sic.interleaved_skew,
  sic.last_reindex
FROM svv_interleaved_columns sic
LEFT JOIN svv_table_info sti ON sic.tbl = sti.table_id
LEFT JOIN stv_tbl_perm stp ON sic.tbl = stp.id
WHERE interleaved_skew > 1.4
ORDER BY schema_name, table_name, col;

Both keys have pros and cons and very complex features as mentioned above. It is hard to apply appropriately either key for each different use case without any examinations. So, I'm going to execute simple queries and make a comparison between both keys with SQL performance.

Conclusion

I'm going to tell the conclusion first because examinations are quite many. If you are operating a database for batch processing and handling less than 100 million records in a table, you want to basically use Compound Sort Key. Consider doing the following things as part of sort key design:

  1. Make adjustments to the relationships between the columns used as a primary or secondary key and the ones used in JOIN or WHERE queries.
  2. If a table is using JOIN operation in batch processing, confirm whether the joining column can be also defined as the distribution key or not. If possible, the fastest join operator, Merge Join will be selected.
  3. Operating compound-key-based database, if there is a bottleneck at a certain table in batch processing, you want to apply Interleaved Sort Key to it instead.

In case that a table has over than 100 million records, the capability of Interleaved sorting becomes more effective. There is no absolute way of thinking on such a complex case, so you are required to be more attentive to which sort key should be used. Consider the following:

  • Improve performance by choosing Interleaved sort key for columns which are not the primary key and frequently used independently.
  • Compound sorting is the fastest in a large number of cases, but sometimes it's going to be worse than Interleaved sorting. Interleaved sorting works not too good and not too bad on average.
  • Pay attention to Data Redistribution in Query Plan.
  • Avoid using Interleaved sort key for all of the tables because you need to perform VACUUM REINDEX to all huge tables. It will cost a lot.
  • If possible, Use COPY instead of INSERT to load data for Interleaved sort key tables. A deep copy automatically creates interleaved indexes in addition to the data load. But other concurrent updates cannot be made during a deep copy.
  • etc...

Please note this is not the absolute principle for any tables. Also, this principle is effective for tables which are used in batch processing and queried with complex SQL. If tables are referenced by BI clients, you should design sort keys for frequently used SQL.

Word Definitions and Sample Data

In the examination, I created sample tables with each key; No sort key, Compound sort key and Interleaved sort key, and compared them in terms of execution time and Query Plan. The word definitions in Query Plan are the following things:

Cost : A relative value that is useful for comparing operations within a plan. The first value of cost separated by two periods is the relative cost of returning the first row for this operation and the second value is the relative cost of completing the operation . (ex. cost=131.97..133.41 )

Join Operators : Redshift automatically selects these operators in join operations. This is based on the physical design of the tables, such as key design and data sorting. There are three types of operators, Nested Loop, Hash Join and Merge Join, and Merge Join is the fastest in them but hard to be achieved.

Aggregate Operators : Redshift automatically selects these operators in an aggregate function and a GROUP BY operation. A GROUP BY operation has two types of operators, HashAggregate and GroupAggregate, and GroupAggregate will be selected when an aggregate function is performed with sorting.

Data Redistribution : This is a method for how data is moved around a cluster to facilitate the join. There are seven types of attributes in query plans and less data movement is desirable.

Sample tables were created from Step 6: Load Sample Data from Amazon S3 - Amazon Redshift and Step 1: Create a Test Data Set - Amazon Redshift.

-- 365rows
CREATE TABLE date(
	dateid smallint not null,
	caldate date not null,
	day character(3) not null,
	week smallint not null,
	month character(5) not null,
	qtr character(5) not null,
	year smallint not null,
	holiday boolean default('N')
)
DISTSTYLE KEY DISTKEY(dateid)
No sort key | COMPOUND SORTKEY (dateid) | INTERLEAVED SORTKEY (dateid)
;

-- 8,798rows
CREATE TABLE event(
	eventid integer not null,
	venueid smallint not null,
	catid smallint not null,
	dateid smallint not null,
	eventname varchar(200),
	starttime timestamp
)
DISTSTYLE KEY DISTKEY(eventid)
No sort key | COMPOUND SORTKEY (eventid) | INTERLEAVED SORTKEY (eventid)
;

-- 172,456rows
CREATE TABLE sales(
	salesid integer not null,
	listid integer not null,
	sellerid integer not null,
	buyerid integer not null,
	eventid integer not null,
	dateid smallint not null,
	qtysold smallint not null,
	pricepaid decimal(8,2),
	commission decimal(8,2),
	saletime timestamp
)
DISTSTYLE KEY DISTKEY(dateid)
No sort key | COMPOUND SORTKEY (dateid, eventid) | COMPOUND SORTKEY (eventid, dateid) | INTERLEAVED SORTKEY (dateid, eventid)
;

-- 1,400,000rows
CREATE TABLE part(
	p_partkey     INTEGER NOT NULL,
	p_name        VARCHAR(22) NOT NULL,
	p_mfgr        VARCHAR(6) NOT NULL,
	p_category    VARCHAR(7) NOT NULL,
	p_brand1      VARCHAR(9) NOT NULL,
	p_color       VARCHAR(11) NOT NULL,
	p_type        VARCHAR(25) NOT NULL,
	p_size        INTEGER NOT NULL,
	p_container   VARCHAR(10) NOT NULL
)
DISTSTYLE KEY DISTKEY(p_partkey)
No sort key | COMPOUND SORTKEY (p_partkey, p_size) | INTERLEAVED SORTKEY (p_partkey, p_size)
;

-- 3,000,000rows
CREATE TABLE customer(
	c_custkey      INTEGER NOT NULL,
	c_name         VARCHAR(25) NOT NULL,
	c_address      VARCHAR(25) NOT NULL,
	c_city         VARCHAR(10) NOT NULL,
	c_nation       VARCHAR(15) NOT NULL,
	c_region       VARCHAR(12) NOT NULL,
	c_phone        VARCHAR(15) NOT NULL,
	c_mktsegment   VARCHAR(10) NOT NULL
)
DISTSTYLE KEY DISTKEY(c_custkey)
No sort key | COMPOUND SORTKEY (c_custkey, c_region) | INTERLEAVED SORTKEY (c_custkey, c_region)
;

-- 600,037,902rows
CREATE TABLE lineorder(
	lo_orderkey          INTEGER NOT NULL,
	lo_linenumber        INTEGER NOT NULL,
 	lo_custkey           INTEGER NOT NULL,
	lo_partkey           INTEGER NOT NULL,
	lo_suppkey           INTEGER NOT NULL,
	lo_orderdate         INTEGER NOT NULL,
	lo_orderpriority     VARCHAR(15) NOT NULL,
	lo_shippriority      VARCHAR(1) NOT NULL,
	lo_quantity          INTEGER NOT NULL,
	lo_extendedprice     INTEGER NOT NULL,
	lo_ordertotalprice   INTEGER NOT NULL,
	lo_discount          INTEGER NOT NULL,
	lo_revenue           INTEGER NOT NULL,
	lo_supplycost        INTEGER NOT NULL,
	lo_tax               INTEGER NOT NULL,
	lo_commitdate        INTEGER NOT NULL,
	lo_shipmode          VARCHAR(10) NOT NULL
)
DISTSTYLE KEY DISTKEY(c_custkey)
No sort key | COMPOUND SORTKEY (lo_custkey, lo_partkey) | COMPOUND SORTKEY2 (lo_partkey, lo_custkey) | INTERLEAVED SORTKEY (lo_custkey, lo_partkey)
;

/*
nsk: No sort key
csk: COMPOUND SORTKEY
csk2: COMPOUND SORTKEY (The primary and the secondary are reversed) 
ilsk: INTERLEAVED SORTKEY
*/

copy date_nsk from 's3://awssampledbuswest2/tickit/date2008_pipe.txt' 
credentials 'aws_iam_role=<iam-role-arn>' 
delimiter '|' region 'us-west-2';
INSERT INTO date_csk SELECT * FROM date_nsk;
INSERT INTO date_ilsk SELECT * FROM date_nsk;

copy event_nsk from 's3://awssampledbuswest2/tickit/allevents_pipe.txt' 
credentials 'aws_iam_role=<iam-role-arn>'
delimiter '|' timeformat 'YYYY-MM-DD HH:MI:SS' region 'us-west-2';
INSERT INTO event_csk SELECT * FROM event_nsk;
INSERT INTO event_ilsk SELECT * FROM event_nsk;

copy sales_nsk from 's3://awssampledbuswest2/tickit/sales_tab.txt'
credentials 'aws_iam_role=<iam-role-arn>'
delimiter '\t' timeformat 'MM/DD/YYYY HH:MI:SS' region 'us-west-2';
INSERT INTO sales_csk SELECT * FROM sales_nsk;
INSERT INTO sales_csk2 SELECT * FROM sales_nsk;
INSERT INTO sales_ilsk SELECT * FROM sales_nsk;

copy customer_nsk from 's3://awssampledbuswest2/ssbgz/customer' 
credentials 'aws_iam_role=<iam-role-arn>'
gzip compupdate off region 'us-west-2';
INSERT INTO customer_csk SELECT * FROM customer_nsk;
INSERT INTO customer_ilsk SELECT * FROM customer_nsk;

copy lineorder_nsk from 's3://awssampledbuswest2/ssbgz/lineorder' 
credentials 'aws_iam_role=<iam-role-arn>'
gzip compupdate off region 'us-west-2';
INSERT INTO lineorder_csk SELECT * FROM lineorder_nsk;
INSERT INTO lineorder_csk2 SELECT * FROM lineorder_nsk;
INSERT INTO lineorder_ilsk SELECT * FROM lineorder_nsk;

After INSERT, I operated VACUUM FULL or VACUUM REINDEX, and ANALYZE to the tables. Ensure that the result caching has been disabled before the examination. ( Using SET enable_result_cache_for_session = off; )

Examination

ORDER BY

-- Less than 10 million records table
-- ORDER BY with the primary key
SELECT dateid FROM sales ORDER BY dateid;

-- EXPLAIN
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
  Merge Key: dateid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
              Sort Key: dateid
              ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=2)
XN Merge  (cost=0.00..1724.56 rows=172456 width=2)
  Merge Key: dateid
  ->  XN Network  (cost=0.00..1724.56 rows=172456 width=2)
        Send to leader
        ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=2)
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
  Merge Key: dateid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=2)
              Sort Key: dateid
              ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=2)
# No sort key Compound sort key Interleaved sort key
1 398ms 351ms 356ms
2 376ms 350ms 362ms
3 355ms 344ms 354ms
Cost 1000000016724.67..
1000000017155.81
0.00..1724.56 1000000016724.67..
1000000017155.81

Execution times are almost all the same. The cost of Compound sorting is the lowest.

-- Less than 10 million records table
-- ORDER BY with the secondary key
SELECT eventid FROM sales ORDER BY eventid;

-- EXPLAIN
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
  Merge Key: eventid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
              Sort Key: eventid
              ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=4)
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
  Merge Key: eventid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
              Sort Key: eventid
              ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=4)
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
  Merge Key: eventid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=4)
              Sort Key: eventid
              ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=4)
# No sort key Compound sort key Interleaved sort key
1 362ms 346ms 355ms
2 363ms 352ms 349ms
3 355ms 365ms 364ms
Cost 1000000016724.67..
1000000017155.81
1000000016724.67..
1000000017155.81
1000000016724.67..
1000000017155.81

Both execution times and costs are almost all the same.

-- Less than 10 million records table
-- ORDER BY with the primary key and the secondary key
SELECT dateid, eventid FROM sales ORDER BY dateid, eventid;

-- EXPLAIN
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
  Merge Key: dateid, eventid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
              Sort Key: dateid, eventid
              ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=6)
XN Merge  (cost=0.00..1724.56 rows=172456 width=6)
  Merge Key: dateid, eventid
  ->  XN Network  (cost=0.00..1724.56 rows=172456 width=6)
        Send to leader
        ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=6)
XN Merge  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
  Merge Key: dateid, eventid
  ->  XN Network  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
        Send to leader
        ->  XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=6)
              Sort Key: dateid, eventid
              ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=6)
# No sort key Compound sort key Interleaved sort key
1 438ms 417ms 424ms
2 437ms 425ms 439ms
3 445ms 431ms 449ms
Cost 1000000016724.67..
1000000017155.81
0.00..1724.56 1000000016724.67..
1000000017155.81

Execution times are all the same. The cost of Compound sorting is the lowest.

-- More than 10 million records table
-- ORDER BY with the primary key
SELECT lo_custkey FROM lineorder ORDER BY lo_custkey LIMIT 10000;

-- EXPLAIN
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
  Merge Key: lo_custkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
              Sort Key: lo_custkey
              ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN Merge  (cost=0.00..6000378.88 rows=600037888 width=4)
  Merge Key: lo_custkey
  ->  XN Network  (cost=0.00..6000378.88 rows=600037888 width=4)
        Send to leader
        ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
  Merge Key: lo_custkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
              Sort Key: lo_custkey
              ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=4)
# No sort key Compound sort key Interleaved sort key
1 1.8s 97ms 1.4s
2 1.7s 93ms 1.5s
3 1.7s 56ms 1.5s
Cost 1000093487338.12..
1000094987432.84
0.00..6000378.88 1000093487338.12..
1000094987432.84

The execution time of Compound sorting is the fastest and the one of Interleaved sorting is the next-fastest. The cost of Compound sorting is the lowest.

-- More than 10 million records table
-- ORDER BY with the secondary key
SELECT lo_partkey FROM lineorder ORDER BY lo_partkey LIMIT 10000;

-- EXPLAIN
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
  Merge Key: lo_partkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
              Sort Key: lo_partkey
              ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
  Merge Key: lo_partkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
              Sort Key: lo_partkey
              ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
  Merge Key: lo_partkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=4)
              Sort Key: lo_partkey
              ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=4)
# No sort key Compound sort key Interleaved sort key
1 1.9s 1.4s 1.5s
2 1.7s 1.4s 1.4s
3 1.8s 1.5s 1.5s
Cost 1000093487338.12..
1000094987432.84
1000093487338.12..
1000094987432.84
1000093487338.12..
1000094987432.84

In terms of execution times, there is no differences between Compound sorting and Interleaved sorting and both are faster than no sort key. The costs are almost all the same.

-- More than 10 million records table
-- ORDER BY with the primary key and the secondary key
SELECT lo_custkey, lo_partkey FROM lineorder ORDER BY lo_custkey, lo_partkey LIMIT 10000;

-- EXPLAIN
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
  Merge Key: lo_custkey, lo_partkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
              Sort Key: lo_custkey, lo_partkey
              ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=8)
XN Merge  (cost=0.00..6000378.88 rows=600037888 width=8)
  Merge Key: lo_custkey, lo_partkey
  ->  XN Network  (cost=0.00..6000378.88 rows=600037888 width=8)
        Send to leader
        ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=8)
XN Merge  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
  Merge Key: lo_custkey, lo_partkey
  ->  XN Network  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
        Send to leader
        ->  XN Sort  (cost=1000093487338.12..1000094987432.84 rows=600037888 width=8)
              Sort Key: lo_custkey, lo_partkey
              ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=8)
# No sort key Compound sort key Interleaved sort key
1 2.3s 69ms 1.7s
2 2.4s 66ms 1.8s
3 2.3s 61ms 1.7s
Cost 1000093487338.12..
1000094987432.84
0.00..6000378.88 1000093487338.12..
1000094987432.84

The result is the same as "ORDER BY with the primary key", but the difference between them is more obvious. The cost of Compound sorting is the lowest.

GROUP BY

-- Less than 10 million records table
-- GROUP BY with the primary key
SELECT dateid, count(*) FROM sales GROUP BY dateid;

-- EXPLAIN
XN HashAggregate  (cost=2586.84..2587.75 rows=364 width=2)
  ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=2)
XN GroupAggregate  (cost=0.00..2587.71 rows=350 width=2)
  ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=2)
XN HashAggregate  (cost=2586.84..2587.75 rows=363 width=2)
  ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=2)
# No sort key Compound sort key Interleaved sort key
1 22ms 21ms 23ms
2 24ms 22ms 24ms
3 22ms 21ms 22ms
Cost 2586.84..2587.75 0.00..2587.71 2586.84..2587.75
Aggregate Operators HashAggregate GroupAggregate HashAggregate

Execution times are almost the same. The initiate cost of Compound sorting is lower than the others, but the total ones are all the same.

-- Less than 10 million records table
-- GROUP BY with the secondary key
SELECT eventid, count(*) FROM sales GROUP BY eventid;

-- EXPLAIN
XN HashAggregate  (cost=2586.84..2606.31 rows=7787 width=4)
  ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=4)
XN HashAggregate  (cost=2586.84..2606.42 rows=7834 width=4)
  ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=4)
XN HashAggregate  (cost=2586.84..2606.27 rows=7772 width=4)
  ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=4)
# No sort key Compound sort key Interleaved sort key
1 61ms 62ms 53ms
2 53ms 57ms 60ms
3 48ms 48ms 49ms
Cost 2586.84..2606.31 2586.84..2606.42 2586.84..2606.27
Aggregate Operators HashAggregate HashAggregate HashAggregate

Both execution times and costs are almost all the same.

-- Less than 10 million records table
-- GROUP BY with the primary key and the secondary key
SELECT dateid, eventid, count(*) FROM sales GROUP BY dateid, eventid;

-- EXPLAIN
XN HashAggregate  (cost=3017.98..3061.09 rows=17246 width=6)
  ->  XN Seq Scan on sales_nsk  (cost=0.00..1724.56 rows=172456 width=6)
XN GroupAggregate  (cost=0.00..3061.09 rows=17246 width=6)
  ->  XN Seq Scan on sales_csk  (cost=0.00..1724.56 rows=172456 width=6)
XN HashAggregate  (cost=3017.98..3061.09 rows=17246 width=6)
  ->  XN Seq Scan on sales_ilsk  (cost=0.00..1724.56 rows=172456 width=6)
# No sort key Compound sort key Interleaved sort key
1 551ms 437ms 430ms
2 683ms 411ms 586ms
3 439ms 659ms 438ms
Cost 3017.98..3061.09 0.00..3061.09 3017.98..3061.09
Aggregate Operators HashAggregate GroupAggregate HashAggregate

Execution times are almost all the same. The initiate cost of Compound sorting is lower than the others, but the total ones are all the same.

-- More than 10 million records table
-- GROUP BY with the primary key
SELECT lo_custkey, count(*) FROM lineorder GROUP BY lo_custkey;

-- EXPLAIN
XN HashAggregate  (cost=9000568.32..9004948.86 rows=1752214 width=4)
  ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN GroupAggregate  (cost=0.00..9005054.16 rows=1794335 width=4)
  ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN HashAggregate  (cost=9000568.32..9005161.02 rows=1837079 width=4)
  ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=4)
# No sort key Compound sort key Interleaved sort key
1 13.2s 4.6s 8.5s
2 13.3s 4.6s 8.3s
3 13.3s 4.6s 8.4s
Cost 9000568.32..
9004948.86
0.00..
9005054.16
9000568.32..
9005161.02
Aggregate Operators HashAggregate GroupAggregate HashAggregate

The execution time of Compound sorting is the fastest and the one of Interleaved sorting is the next-fastest. In terms of total costs, Interleaved sorting is lower than Compound sorting.

-- More than 10 million records table
-- GROUP BY with the secondary key
SELECT lo_partkey, count(*) FROM lineorder GROUP BY lo_partkey;

-- EXPLAIN
XN HashAggregate  (cost=9000568.32..9002981.02 rows=965078 width=4)
  ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN HashAggregate  (cost=9000568.32..9003137.57 rows=1027699 width=4)
  ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=4)
XN HashAggregate  (cost=9000568.32..9003314.96 rows=1098655 width=4)
  ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=4)
# No sort key Compound sort key Interleaved sort key
1 17.4s 15.8s 7.4s
2 17.2s 15.3s 7.3s
3 16.3s 16.1s 7.3s
Cost 9000568.32..
9002981.02
9000568.32..
9003137.57
9000568.32..
9003314.96
Aggregate Operators HashAggregate HashAggregate HashAggregate

Finally, we can obviously see the advantage of Interleaved sorting in this pattern, "More than 10 million records table" and "GROUP BY with the secondary key". Although the cost of Interleaved sorting is the highest, the execution time is the fastest.

-- More than 10 million records table
-- GROUP BY with the primary key and the secondary key
SELECT lo_custkey, lo_partkey, count(*) FROM lineorder GROUP BY lo_custkey, lo_partkey LIMIT 10000;

-- EXPLAIN
XN HashAggregate  (cost=10500663.04..10650672.51 rows=60003789 width=8)
  ->  XN Seq Scan on lineorder_nsk  (cost=0.00..6000378.88 rows=600037888 width=8)
XN GroupAggregate  (cost=0.00..10650672.51 rows=60003789 width=8)
  ->  XN Seq Scan on lineorder_csk  (cost=0.00..6000378.88 rows=600037888 width=8)
XN HashAggregate  (cost=10500663.04..10650672.51 rows=60003789 width=8)
  ->  XN Seq Scan on lineorder_ilsk  (cost=0.00..6000378.88 rows=600037888 width=8)
# No sort key Compound sort key Interleaved sort key
1 37.7s 60ms 33.7s
2 36.8s 67ms 35.1s
3 38.0s 62ms 34.6s
Cost 10500663.04..
10650672.51
0.00..
10650672.51
10500663.04..
10650672.51
Aggregate Operators HashAggregate GroupAggregate HashAggregate

The performance of Interleaved sorting is getting extremely bad when GROUP BY is executed with the primary and the secondary. On the other hand, the performance of Compound sorting is very good. The initiate cost of Compound sorting is lower than the others, but the total ones are all the same.

WHERE

-- Less than 10 million records table
-- WHERE with the primary key
SELECT dateid FROM sales WHERE dateid IN ('2000', '2050', '1850');

-- EXPLAIN
XN Seq Scan on sales_nsk  (cost=0.00..3017.98 rows=1395 width=2)
  Filter: ((dateid = 1850::smallint) OR (dateid = 2000::smallint) OR (dateid = 2050::smallint))
XN Seq Scan on sales_csk  (cost=0.00..24.44 rows=1397 width=2)
  Filter: ((dateid = 1850::smallint) OR (dateid = 2000::smallint) OR (dateid = 2050::smallint))
XN Seq Scan on sales_ilsk  (cost=0.00..3017.98 rows=1398 width=2)
  Filter: ((dateid = 1850::smallint) OR (dateid = 2000::smallint) OR (dateid = 2050::smallint))
# No sort key Compound sort key Interleaved sort key
1 32ms 35ms 30ms
2 25ms 33ms 27ms
3 27ms 27ms 30ms
Cost 0.00..3017.98 0.00..24.44 0.00..3017.98

Execution times are all the same. The cost of Compound sorting is the lowest.

-- Less than 10 million records table
-- WHERE with the secondary key
SELECT eventid FROM sales WHERE eventid BETWEEN 10 AND 1000;

-- EXPLAIN
XN Seq Scan on sales_nsk  (cost=0.00..2586.84 rows=19865 width=4)
  Filter: ((eventid <= 1000) AND (eventid >= 10))
XN Seq Scan on sales_csk  (cost=0.00..2586.84 rows=19710 width=4)
  Filter: ((eventid <= 1000) AND (eventid >= 10))
XN Seq Scan on sales_ilsk  (cost=0.00..2586.84 rows=19153 width=4)
  Filter: ((eventid <= 1000) AND (eventid >= 10))
# No sort key Compound sort key Interleaved sort key
1 73ms 73ms 75ms
2 72ms 70ms 69ms
3 65ms 63ms 67ms
Cost 0.00..2586.84 0.00..2586.84 0.00..2586.84

Both execution times and costs are almost all the same.

-- Less than 10 million records table
-- WHERE with the primary key and the secondary key
SELECT dateid, eventid FROM sales WHERE dateid BETWEEN 1000 AND 2000 AND eventid BETWEEN 10 AND 1000;

-- EXPLAIN
XN Seq Scan on sales_nsk  (cost=0.00..3449.12 rows=9480 width=6)
  Filter: ((eventid <= 1000) AND (eventid >= 10) AND (dateid <= 2000) AND (dateid >= 1000))
XN Seq Scan on sales_csk  (cost=0.00..1639.04 rows=9366 width=6)
  Filter: ((eventid <= 1000) AND (eventid >= 10) AND (dateid <= 2000) AND (dateid >= 1000))
XN Seq Scan on sales_ilsk  (cost=0.00..3449.12 rows=9093 width=6)
  Filter: ((eventid <= 1000) AND (eventid >= 10) AND (dateid <= 2000) AND (dateid >= 1000))
# No sort key Compound sort key Interleaved sort key
1 64ms 48ms 65ms
2 53ms 49ms 51ms
3 46ms 55ms 54ms
Cost 0.00..3449.12 0.00..1639.04 0.00..3449.12

Execution times are all the same. The cost of Compound sorting is the lowest.

-- More than 10 million records table
-- WHERE with the primary key
SELECT lo_custkey FROM lineorder WHERE lo_custkey IN ('123434', '123', '3415678');

-- EXPLAIN
XN Seq Scan on lineorder_nsk  (cost=0.00..10500663.04 rows=1028 width=4)
  Filter: ((lo_custkey = 123) OR (lo_custkey = 123434) OR (lo_custkey = 3415678))
XN Seq Scan on lineorder_csk  (cost=0.00..17.56 rows=1004 width=4)
  Filter: ((lo_custkey = 123) OR (lo_custkey = 123434) OR (lo_custkey = 3415678))
XN Seq Scan on lineorder_ilsk  (cost=0.00..10500663.04 rows=980 width=4)
  Filter: ((lo_custkey = 123) OR (lo_custkey = 123434) OR (lo_custkey = 3415678))
# No sort key Compound sort key Interleaved sort key
1 1.4s 22ms 143ms
2 1.3s 25ms 135m
3 1.3s 24ms 134ms
Cost 0.00..10500663.04 0.00..17.56 0.00..10500663.04

The execution time of Compound sorting is the fastest and the one of Interleaved sorting is the next-fastest. The cost of Compound sorting is the lowest.

-- More than 10 million records table
-- WHERE with the secondary key
SELECT lo_partkey FROM lineorder WHERE lo_partkey BETWEEN 500 AND 1000;

-- EXPLAIN
XN Seq Scan on lineorder_nsk  (cost=0.00..9000568.32 rows=309378 width=4)
  Filter: ((lo_partkey <= 1000) AND (lo_partkey >= 500))
XN Seq Scan on lineorder_csk  (cost=0.00..9000568.32 rows=308579 width=4)
  Filter: ((lo_partkey <= 1000) AND (lo_partkey >= 500))
XN Seq Scan on lineorder_ilsk  (cost=0.00..9000568.32 rows=305233 width=4)
  Filter: ((lo_partkey <= 1000) AND (lo_partkey >= 500))
# No sort key Compound sort key Interleaved sort key
1 1.3s 2.3s 575ms
2 1.3s 1.8s 579ms
3 1.3s 2.3s 594ms
Cost 0.00..9000568.32 0.00..9000568.32 0.00..9000568.32

As well as GROUP BY, we can see the advantage of Interleaved sorting in this pattern. The execution time of Compound sort key is slower than the one of No sort key. The costs are all the same.

-- More than 10 million records table
-- WHERE with the primary key and the secondary key
SELECT lo_custkey, lo_partkey FROM lineorder WHERE lo_custkey BETWEEN 500 AND 2000 AND lo_partkey BETWEEN 500 AND 1000;

-- EXPLAIN
XN Seq Scan on lineorder_nsk  (cost=0.00..12000757.76 rows=157 width=8)
  Filter: ((lo_custkey <= 2000) AND (lo_custkey >= 500) AND (lo_partkey <= 1000) AND (lo_partkey >= 500))
XN Seq Scan on lineorder_csk  (cost=0.00..6017.71 rows=155 width=8)
  Filter: ((lo_custkey <= 2000) AND (lo_custkey >= 500) AND (lo_partkey <= 1000) AND (lo_partkey >= 500))
XN Seq Scan on lineorder_ilsk  (cost=0.00..12000757.76 rows=151 width=8)
  Filter: ((lo_custkey <= 2000) AND (lo_custkey >= 500) AND (lo_partkey <= 1000) AND (lo_partkey >= 500))
# No sort key Compound sort key Interleaved sort key
1 1.6s 25ms 24ms
2 1.6s 23ms 23ms
3 1.7s 23ms 22ms
Cost 0.00..12000757.76 0.00..6017.71 0.00..12000757.76

Both Compound sorting and Interleaved sorting are much faster than no sorting. Unlike GROUP BY, the Interleaved sort key made up of multiple columns still has good performance in the WHERE clause. The cost of Compound sorting is the lowest.

JOIN

-- Less than 10 million records table
-- JOIN with the primary key
SELECT s.dateid, s.eventid, s.salesid, d.caldate FROM sales s LEFT JOIN date d ON s.dateid = d.dateid;

-- EXPLAIN
XN Hash Left Join DS_DIST_NONE  (cost=4.56..5609.38 rows=172456 width=14)
  Hash Cond: ("outer".dateid = "inner".dateid)
  ->  XN Seq Scan on sales_nsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=3.65..3.65 rows=365 width=6)
        ->  XN Seq Scan on date_nsk d  (cost=0.00..3.65 rows=365 width=6)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..3884.82 rows=172456 width=14)
  Merge Cond: ("outer".dateid = "inner".dateid)
  ->  XN Seq Scan on sales_csk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Seq Scan on date_csk d  (cost=0.00..3.65 rows=365 width=6)
XN Hash Left Join DS_BCAST_INNER  (cost=4.56..58405609.38 rows=172456 width=14)
  Hash Cond: ("outer".dateid = "inner".dateid)
  ->  XN Seq Scan on sales_csk2 s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=3.65..3.65 rows=365 width=6)
        ->  XN Seq Scan on date_csk d  (cost=0.00..3.65 rows=365 width=6)
XN Hash Left Join DS_DIST_NONE  (cost=4.56..5609.38 rows=172456 width=14)
  Hash Cond: ("outer".dateid = "inner".dateid)
  ->  XN Seq Scan on sales_ilsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=3.65..3.65 rows=365 width=6)
        ->  XN Seq Scan on date_ilsk d  (cost=0.00..3.65 rows=365 width=6)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 681ms 655ms 687ms 670ms
2 682ms 679ms 685ms 683ms
3 665ms 684ms 677ms 678ms
Cost 4.56..5609.38 0.00..3884.82 4.56..
58405609.38
4.56..5609.38
Join Operaters Hash Left Join Merge Left Join Hash Left Join Hash Left Join
Data Redistribution DS_DIST_NONE DS_DIST_NONE DS_BCAST_INNER DS_DIST_NONE

Execution times are all the same, but the costs of each sort key are different. The cost of Compound sorting with Merge join as the Join Operator was the lowest, and that of Compound sorting 2 with DS_BCAST_INNER as the Data Redistribution explosively increased.

-- Less than 10 million records table
-- JOIN with the secondary key
SELECT s.dateid, s.eventid, s.salesid, e.eventname FROM sales s LEFT JOIN event e ON s.eventid = e.eventid;

-- EXPLAIN
XN Hash Left Join DS_BCAST_INNER  (cost=109.98..2815365714.80 rows=172456 width=27)
  Hash Cond: ("outer".eventid = "inner".eventid)
  ->  XN Seq Scan on sales_nsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
        ->  XN Seq Scan on event_nsk e  (cost=0.00..87.98 rows=8798 width=21)
XN Hash Left Join DS_BCAST_INNER  (cost=109.98..2815365714.80 rows=172456 width=27)
  Hash Cond: ("outer".eventid = "inner".eventid)
  ->  XN Seq Scan on sales_csk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
        ->  XN Seq Scan on event_csk e  (cost=0.00..87.98 rows=8798 width=21)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..3990.24 rows=172456 width=27)
  Merge Cond: ("outer".eventid = "inner".eventid)
  ->  XN Seq Scan on sales_csk2 s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Seq Scan on event_csk e  (cost=0.00..87.98 rows=8798 width=21)
XN Hash Left Join DS_BCAST_INNER  (cost=109.98..2815365714.80 rows=172456 width=27)
  Hash Cond: ("outer".eventid = "inner".eventid)
  ->  XN Seq Scan on sales_ilsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=21)
        ->  XN Seq Scan on event_ilsk e  (cost=0.00..87.98 rows=8798 width=21)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 699ms 669ms 676ms 720ms
2 752ms 694ms 689ms 680ms
3 691ms 706ms 696ms 675ms
Cost 109.98..
2815365714.80
109.98..
2815365714.80
0.00..3990.24 109.98..
2815365714.80
Join Operaters Hash Left Join Hash Left Join Merge Left Join Hash Left Join
Data Redistribution DS_BCAST_INNER DS_BCAST_INNER DS_DIST_NONE DS_BCAST_INNER

Execution times are all the same. The total cost of Compound sort key with Merge Join is the lowest.

-- Less than 10 million records table
-- JOIN with the primary key and the secondary key
SELECT s.dateid, s.eventid, s.salesid, e.eventname FROM sales s LEFT JOIN event e ON s.dateid = e.dateid AND s.eventid = e.eventid;

-- EXPLAIN
XN Hash Left Join DS_BCAST_INNER  (cost=131.97..2815366172.66 rows=172456 width=27)
  Hash Cond: (("outer".eventid = "inner".eventid) AND ("outer".dateid = "inner".dateid))
  ->  XN Seq Scan on sales_nsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=23)
        ->  XN Seq Scan on event_nsk e  (cost=0.00..87.98 rows=8798 width=23)
XN Hash Left Join DS_BCAST_INNER  (cost=131.97..2815366172.66 rows=172456 width=27)
  Hash Cond: (("outer".eventid = "inner".eventid) AND ("outer".dateid = "inner".dateid))
  ->  XN Seq Scan on sales_csk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=23)
        ->  XN Seq Scan on event_csk e  (cost=0.00..87.98 rows=8798 width=23)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..2723.54 rows=172456 width=27)
  Merge Cond: (("outer".eventid = "inner".eventid) AND ("outer".dateid = "inner".dateid))
  ->  XN Seq Scan on sales_csk2 s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Seq Scan on event_csk e  (cost=0.00..87.98 rows=8798 width=23)
XN Hash Left Join DS_BCAST_INNER  (cost=131.97..2815366172.66 rows=172456 width=27)
  Hash Cond: (("outer".eventid = "inner".eventid) AND ("outer".dateid = "inner".dateid))
  ->  XN Seq Scan on sales_ilsk s  (cost=0.00..1724.56 rows=172456 width=10)
  ->  XN Hash  (cost=87.98..87.98 rows=8798 width=23)
        ->  XN Seq Scan on event_ilsk e  (cost=0.00..87.98 rows=8798 width=23)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 557ms 575ms 574ms 568ms
2 565ms 577ms 561ms 572ms
3 592ms 603ms 603ms 570ms
Cost 131.97..
2815366172.66
131.97..
2815366172.66
0.00..2723.54 131.97..
2815366172.66
Join Operaters Hash Left Join Hash Left Join Merge Left Join Hash Left Join
Data Redistribution DS_BCAST_INNER DS_BCAST_INNER DS_DIST_NONE DS_BCAST_INNER

The result is almost same as the previous one. Using both the primary key and the secondary key rather than only using the primary key decreases the cost.

-- More than 10 million records table
-- JOIN with the primary key
SELECT l.lo_custkey, l.lo_orderdate, c.c_name FROM lineorder l LEFT JOIN customer c ON l.lo_custkey = c.c_custkey LIMIT 10000;

-- EXPLAIN
XN Hash Left Join DS_DIST_NONE  (cost=37500.00..37539868.00 rows=600037888 width=30)
  Hash Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_nsk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Hash  (cost=30000.00..30000.00 rows=3000000 width=26)
        ->  XN Seq Scan on customer_nsk c  (cost=0.00..30000.00 rows=3000000 width=26)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..13538352.48 rows=600037888 width=30)
  Merge Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Seq Scan on customer_csk c  (cost=0.00..30000.00 rows=3000000 width=26)
XN Hash Left Join DS_BCAST_INNER  (cost=37500.00..1080037539868.00 rows=600037888 width=30)
  Hash Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_csk2 l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Hash  (cost=30000.00..30000.00 rows=3000000 width=26)
        ->  XN Seq Scan on customer_csk c  (cost=0.00..30000.00 rows=3000000 width=26)
XN Hash Left Join DS_DIST_NONE  (cost=37500.00..37539868.00 rows=600037888 width=30)
  Hash Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_ilsk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Hash  (cost=30000.00..30000.00 rows=3000000 width=26)
        ->  XN Seq Scan on customer_ilsk c  (cost=0.00..30000.00 rows=3000000 width=26)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 203ms 88ms 594ms 211ms
2 221ms 105ms 567ms 218ms
3 208ms 97ms 628ms 205ms
Cost 37500.00..
37539868.00
0.00..
13538352.48
37500.00..
1080037539868.00
37500.00..
37539868.00
Join Operaters Hash Left Join Merge Left Join Hash Left Join Hash Left Join
Data Redistribution DS_DIST_NONE DS_DIST_NONE DS_BCAST_INNER DS_DIST_NONE

The differences between these sort keys are more obvious when the number of records is over 10 million. Compound sorting was the fastest, and Compound sorting 2 with DS_BCAST_INNER as the Data Redistribution got more than three times as slow as the others.

-- More than 10 million records table
-- JOIN with the secondary key
SELECT l.lo_custkey, l.lo_orderdate, p.p_name FROM lineorder l LEFT JOIN part p ON l.lo_partkey = p.p_partkey LIMIT 10000;

-- EXPLAIN
XN Hash Left Join DS_BCAST_INNER  (cost=17500.00..392025519110.24 rows=600037888 width=24)
  Hash Cond: ("outer".lo_partkey = "inner".p_partkey)
  ->  XN Seq Scan on lineorder_nsk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=14000.00..14000.00 rows=1400000 width=20)
        ->  XN Seq Scan on part_nsk p  (cost=0.00..14000.00 rows=1400000 width=20)
XN Hash Left Join DS_BCAST_INNER  (cost=17500.00..392025519110.24 rows=600037888 width=24)
  Hash Cond: ("outer".lo_partkey = "inner".p_partkey)
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=14000.00..14000.00 rows=1400000 width=20)
        ->  XN Seq Scan on part_csk p  (cost=0.00..14000.00 rows=1400000 width=20)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..13513363.88 rows=600037888 width=24)
  Merge Cond: ("outer".lo_partkey = "inner".p_partkey)
  ->  XN Seq Scan on lineorder_csk2 l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Seq Scan on part_csk p  (cost=0.00..14000.00 rows=1400000 width=20)
XN Hash Left Join DS_BCAST_INNER  (cost=17500.00..392025519110.24 rows=600037888 width=24)
  Hash Cond: ("outer".lo_partkey = "inner".p_partkey)
  ->  XN Seq Scan on lineorder_ilsk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=14000.00..14000.00 rows=1400000 width=20)
        ->  XN Seq Scan on part_ilsk p  (cost=0.00..14000.00 rows=1400000 width=20)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 338ms 355ms 72ms 323ms
2 294ms 320ms 79ms 398ms
3 297ms 332ms 74ms 348ms
Cost 17500.00..
392025519110.24
17500.00..
392025519110.24
0.00..
13513363.88
17500.00..
392025519110.24
Join Operaters Hash Left Join Hash Left Join Merge Left Join Hash Left Join
Data Redistribution DS_BCAST_INNER DS_BCAST_INNER DS_DIST_NONE DS_BCAST_INNER

The performance of Compound sorting with Merge join is by far the best.

-- More than 10 million records table
-- JOIN with the primary key and the secondary key
SELECT l.lo_custkey, l.lo_orderdate FROM lineorder l LEFT JOIN lineorder l2 ON l.lo_custkey = l2.lo_custkey AND l.lo_partkey = l2.lo_partkey LIMIT 10000;

-- EXPLAIN
XN Hash Left Join DS_DIST_NONE  (cost=9000568.32..7230458679.56 rows=600037888 width=8)
  Hash Cond: (("outer".lo_custkey = "inner".lo_custkey) AND ("outer".lo_partkey = "inner".lo_partkey))
  ->  XN Seq Scan on lineorder_nsk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=6000378.88..6000378.88 rows=600037888 width=8)
        ->  XN Seq Scan on lineorder_nsk l2  (cost=0.00..6000378.88 rows=600037888 width=8)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..18003089.13 rows=600037888 width=8)
  Merge Cond: (("outer".lo_custkey = "inner".lo_custkey) AND ("outer".lo_partkey = "inner".lo_partkey))
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Seq Scan on lineorder_csk l2  (cost=0.00..6000378.88 rows=600037888 width=8)
XN Hash Left Join DS_DIST_OUTER  (cost=9000568.32..60011019258465.38 rows=600037888 width=8)
  Outer Dist Key: l.lo_custkey
  Hash Cond: (("outer".lo_custkey = "inner".lo_custkey) AND ("outer".lo_partkey = "inner".lo_partkey))
  ->  XN Seq Scan on lineorder_csk2 l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=6000378.88..6000378.88 rows=600037888 width=8)
        ->  XN Seq Scan on lineorder_csk l2  (cost=0.00..6000378.88 rows=600037888 width=8)
XN Hash Left Join DS_DIST_NONE  (cost=9000568.32..7230458334.30 rows=600037888 width=8)
  Hash Cond: (("outer".lo_custkey = "inner".lo_custkey) AND ("outer".lo_partkey = "inner".lo_partkey))
  ->  XN Seq Scan on lineorder_ilsk l  (cost=0.00..6000378.88 rows=600037888 width=12)
  ->  XN Hash  (cost=6000378.88..6000378.88 rows=600037888 width=8)
        ->  XN Seq Scan on lineorder_ilsk l2  (cost=0.00..6000378.88 rows=600037888 width=8)
# No sort key Compound sort key Compound sort key 2 Interleaved sort key
1 27.0s 90ms 44.9s 25.8s
2 27.1s 77ms 44.7s 25.8s
3 26.9s 69ms 44.7s 25.9s
Cost 9000568.32..
7230458679.56
0.00..
18003089.13
9000568.32..
60011019258465.38
9000568.32..
7230458334.30
Join Operaters Hash Left Join Merge Left Join Hash Left Join Hash Left Join
Data Redistribution DS_DIST_NONE DS_DIST_NONE DS_DIST_OUTER DS_DIST_NONE

I didn't have a just right joining table for lineorder, so I used the same table for joining. As the result of this, Redshift came to handle over 16 billion records, but the performance of Compound sorting was still really good. (That was amazing!) Compound sorting with DS_DIST_OUTER as the Data Redistribution is more than twice as slow as the others.

Therefore, if you can correctly deal with Compound sort keys, you can achieve maximum benefits for the performance of Redshift, otherwise the performance gets worse than Interleaved sorting. Interleaved sorting becomes an easy way out for the sort key design if the tables are really huge and complex.

When to achieve Merge Join

-- More than 10 million records table
-- JOIN with the primary key
SELECT l.lo_custkey, l.lo_orderdate, c.c_name FROM lineorder_csk l LEFT JOIN customer_csk c ON l.lo_custkey = c.c_custkey LIMIT 10000;
SELECT l.lo_custkey, l.lo_orderdate, c.c_name FROM lineorder_csk l LEFT JOIN (SELECT c_custkey, c_name FROM customer_csk ) c ON l.lo_custkey = c.c_custkey LIMIT 10000;
SELECT l.lo_custkey, l.lo_orderdate, c.c_name FROM lineorder_csk l LEFT JOIN customer_csk c ON l.lo_custkey::text = c.c_custkey::text LIMIT 10000;

-- EXPLAIN
XN Merge Left Join DS_DIST_NONE  (cost=0.00..13538352.48 rows=600037888 width=30)
  Merge Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Seq Scan on customer_csk c  (cost=0.00..30000.00 rows=3000000 width=26)
XN Merge Left Join DS_DIST_NONE  (cost=0.00..13538352.48 rows=600037888 width=30)
  Merge Cond: ("outer".lo_custkey = "inner".c_custkey)
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Seq Scan on customer_csk  (cost=0.00..30000.00 rows=3000000 width=26)
XN Hash Left Join DS_BCAST_INNER  (cost=37500.00..6570354213173.60 rows=9000568320000 width=30)
  Hash Cond: ((("outer".lo_custkey)::character varying)::text = (("inner".c_custkey)::character varying)::text)
  ->  XN Seq Scan on lineorder_csk l  (cost=0.00..6000378.88 rows=600037888 width=8)
  ->  XN Hash  (cost=30000.00..30000.00 rows=3000000 width=26)
        ->  XN Seq Scan on customer_csk c  (cost=0.00..30000.00 rows=3000000 width=26)
# As-is Subquery Convert the datatype
1 62ms 82ms 1.2s
2 99ms 81ms 1.2s
3 80ms 81ms 1.2s
cost 0.00..13538352.48 0.00..13538352.48 37500.00..6570354213173.60
Join Operaters Merge Left Join Merge Left Join Hash Left Join

Merge Join is achieved in a subquery, but when you changed the column's datatype, it's not achieved.

References