1 minute read

Ever since AWS moved their Athena engine to version 3 weird things started happening to us.

This query that used to run under 3 minutes now times out after 45 minutes. What the hell?

After trying to debug this query unsuccessfully for hours, we decided to contact AWS Support directly and this is what they said (paraphrased):

AWS Support: So uhmmm, maybe like, yeah, try… adding this to your query somewhere: @{rule_based_join_reorder ='false'}?

AWS Support: …

AWS Support: We just like added this as an experimental feature but forgot to document it lol xoxo aws

If you google this you get literally zero results:

(Update 2023-08-07: My blog post is now the first hit, hehe)

It feels like I’m in possession of this forbidden jutsu that no one else knows and it feels great. Anyway, you can guess what we tried out next:

@{rule_based_join_reorder ='false'}
WITH pupils as (
    SELECT pupilid
        -- very long and complex sql query here

What just happened here exactly? I had no clue so I asked ChatGPT:

Me: Dear ChatGPT, What does @{rule_based_join_reorder =’false’} do?

ChatGPT: In SQL Server, the rule_based_join_reorder option is used to control the query optimizer’s behavior for reordering joins in query execution plans.

… (more bla bla bla)

It turns out that there is this undocumented setting in AWS that allows us to disable the rule based join reordering. The heuristics used to order the joins turned out to be very inefficient for our specific query, resulting in the huge speedup after disabling it.

In other words: haha athena go brrrrrr