Some technical writing by Justin Giancola. I work on the platform team at Precision Nutrition.
Lately lateral join queries have been my go-to Swiss Army Chainsaw. Typically the query I end up with is complicated enough to warrant promoting it to a view. However, for simpler stuff I was pleasantly surprised1 to learn how well Ecto supports lateral joins. As of Ecto 3.4.3 you can construct lateral joins without using fragments.
Let’s see how this works.
This likely warrants a post of its own, but generally speaking lateral joins are like correlated subqueries but way better. They are typically simpler to write, faster to execute, and let you do lots of things correlated subqueries cannot.
As a quick example, let’s say we have a messaging app, and we want to return a list of the most recent messages for each conversation. We could write this as:
select
most_recent_messages.*
from conversations
inner join lateral (
select
*
from messages
where conversation_id = conversations.id
order by messages.inserted_at desc
limit 1
) as most_recent_messages on true;
Before Ecto 3.4.3, one would have had to use fragments to write the above query.
Let’s assume we have Conversation
and Message
schema structs, and we want to run something like the above query and end up with a list of the most recent Message
structs from each convesration. We might2 write:
query =
from c in Conversation,
inner_lateral_join:
m in fragment("select id, body, sender, conversation_id, inserted_at, updated_at from messages where conversation_id = ? order by inserted_at desc limit 1", c.id),
select: %{id: m.id, body: m.body, sender: m.sender, conversation_id: m.conversation_id, inserted_at: m.inserted_at, updated_at: m.updated_at}
query
|> Repo.all()
|> Enum.map(fn(m) -> struct(Message, m) end)
This is pretty cumbersome. Setting aside issues of query hygiene when using fragments, the main thing to notice is that we must specify all fields from the message table in our subquery, and manually map these into Message
structs. We have to do this explicitly rather than just select *
in the query and let Ecto figure out what struct to map into because when we use the fragment, Ecto has no idea what the subquery is returning.
Fortunately there is a better way. We can use Ecto’s support for lazy named bindings to combine the child and parent queries:
subquery =
from m in Message,
where: parent_as(:conversation).id == m.conversation_id,
order_by: [desc: m.inserted_at],
limit: 1
query =
from c in Conversation, as: :conversation,
inner_lateral_join: m in subquery(child_query),
select: m
query
|> Repo.all()
The parts to note here are the parent_as
and as
components which allow us to refer to a named atom binding in the parent query from the subquery. Since we now have a proper subquery instead of a fragment, Ecto understands exactly what the subquery returns and correctly loads the results into Message
structs without any extra mapping required.
Lateral joins are a powerful tool to use when writing queries, and lazy named bindings in Ecto make it easy to write them without using fragments. I think it’s interesting to note that lazy named bindings were added to Ecto rather than, for example, more powerful support for lateral joins.3 It’s a more general feature that also solves for this use case. I think it’s a great example of the Elixir community in general, and José Valim in particular, avoiding very specific solves in favour of more powerful, high-level features that can be combined to address a wider range of problems.
This has not been my experience with other database libraries. ↩
There might be a slightly nicer way to accomplish this using a fragment query, but everything I’ve found looks pretty awful. If you have a better solution using fragments, let me know… ↩
An hypothetical example of what more powerful support for lateral joins in Ecto might look like. ↩