Jonathan
Reinink

Dynamic relationships in Laravel using subqueries

Posted on November 22, 2018

When building web apps that interact with a database, I always have two goals in mind:

  1. Keep database queries to a minimum.
  2. Keep memory usage to a minimum.

These goals can have a drastic impact on the performance of your app.

Developers are typically pretty good at the first goal. We're aware of N+1 style problems, and use techniques like eager-loading to limit database queries. However, we're not always the best at the second goal—keeping memory usage down. In fact, we sometimes do more harm than good trying to reduce database queries at the expense of memory usage.

Let me explain how this happens, and what you can do to satisfy both goals in your app.

The challenge

Consider the following example. You have a users page in your app which shows some info about them, including their last login date. This seemingly simple page actually presents some interesting complexity.

Name Email Last Login
Adam Campbell [email protected] Nov 10, 2018 at 12:01pm
Taylor Otwell [email protected] Never
Jonathan Reinink [email protected] Jun 2, 2018 at 5:30am
Adam Wathan [email protected] Nov 20, 2018 at 7:49pm

In this app we're tracking user logins in a logins table, so we can do statistical reporting on it. Here's what the basic database schema looks like:

Schema::create('users', function (Blueprint $table) {
    $table->increments('id');
    $table->string('name');
    $table->string('email');
    $table->timestamps();
});

Schema::create('logins', function (Blueprint $table) {
    $table->increments('id');
    $table->integer('user_id');
    $table->string('ip_address');
    $table->timestamp('created_at');
});

And here are the corresponding models for those tables with their relationships:

class User extends Model
{
    public function logins()
    {
        return $this->hasMany(Login::class);
    }
}

class Login extends Model
{
    public function user()
    {
        return $this->belongsTo(User::class);
    }
}

So how do we go about creating the users page above? In particular, how do we get the last login date? The easy answer here might be to do the following:

$users = User::all();

@foreach ($users as $user)
    <tr>
        <td>{{ $user->name }}</td>
        <td>{{ $user->email }}</td>
        <td>
            @if ($lastLogin = $user->logins()->latest()->first())
                {{ $lastLogin->created_at->format('M j, Y \a\t g:i a') }}
            @else
                Never
            @endif
        </td>
    </tr>
@endforeach

But, if we're a good developer (and we are), we're going to notice a problem here. We've just created an N+1 issue. For every user we display, we're now running an additional query to get their last login. If our page displays 50 users, we're now executing 51 total queries.

select * from "users";
select * from "logins" where "logins"."user_id" = 1 and "logins"."user_id" is not null order by "created_at" desc limit 1;
select * from "logins" where "logins"."user_id" = 2 and "logins"."user_id" is not null order by "created_at" desc limit 1;
// ...
select * from "logins" where "logins"."user_id" = 49 and "logins"."user_id" is not null order by "created_at" desc limit 1;
select * from "logins" where "logins"."user_id" = 50 and "logins"."user_id" is not null order by "created_at" desc limit 1;

Let's see if we can do better. This time let's eager-load all the login records:

$users = User::with('logins')->get();

@foreach ($users as $user)
    <tr>
        <td>{{ $user->name }}</td>
        <td>{{ $user->email }}</td>
        <td>
            @if ($user->logins->isNotEmpty())
                {{ $user->logins->sortByDesc('created_at')->first()->created_at->format('M j, Y \a\t g:i a') }}
            @else
                Never
            @endif
        </td>
    </tr>
@endforeach

This solution only requires two database queries. One for the users, and a second for the corresponding login records. Success!

Well, not exactly. This is where memory issues become a problem. Sure, we've avoided the N+1 problem, but we've actually created a bigger memory issue:

Users per page 50 users
Average logins per user 250 logins
Total login records loaded 12,500 records

We're now loading 12,500 login records, simply to show the last login for each user. This will not only consume memory, it will also require additional computation, since each record must be initialized as an Eloquent model. And this is a pretty conservative example. You can easily run into similar situations that result in millions of records being loaded.

Caching

You might be thinking at this point, "no big deal, I'll just cache the last login".

Schema::create('users', function (Blueprint $table) {
   $table->integer('last_login_id');
});

When a user logs in, we'll create the new login record, and then update the last_login_id foreign key on the user. We'll then create a lastLogin relationship on our user model, and eager-load that relationship.

$users = User::with('lastLogin')->get();

This is a totally valid solution. But be aware, caching often isn't this simple. Yes, there are absolutely situations where denormalization is appropriate. I just don't like reaching for it because of a perceived limitation in my ORM. We can do better.

Introducing subqueries

There is another way to solve this problem, and that's with subqueries. Subqueries allow us to select extra columns (attributes) right in our primary database query (the users query in our example). Let's look at how we can do this. You may be surprised to learn that Laravel actually has built-in support for subqueries using the selectSub method. First the long version:

$lastLogin = Login::select('created_at')
    ->whereColumn('user_id', 'users.id')
    ->latest()
    ->limit(1)
    ->getQuery();

$users = User::select('users.*')
    ->selectSub($lastLogin, 'last_login_at')
    ->get();

@foreach ($users as $user)
    <tr>
        <td>{{ $user->name }}</td>
        <td>{{ $user->email }}</td>
        <td>
            @if ($user->last_login_at)
                {{ $user->last_login_at->format('M j, Y \a\t g:i a') }}
            @else
                Never
            @endif
        </td>
    </tr>
@endforeach

In this example we're not actually loading a relationship yet. That's coming. What we are doing is using a subquery to get each user's last login date as an attribute. Let's look at the database query that's actually being run:

select
    "users".*,
    (
        select "created_at" from "logins"
        where "user_id" = "users"."id"
        order by "created_at" desc
        limit 1
    ) as "last_login_at"
from "users"

Using a subquery in this way allows us to get all the information we need for our users page in a single query. This technique provides some huge performance wins, since we can keep both our database queries and memory usage to a minimum, plus we've avoided having to use caching.

Macro

Before we go any further, I want to show you a little macro I use to keep the subquery boilerplate to a minimum. We'll do this by adding a new addSubSelect method to the query builder. Put the following code in your AppServiceProvider.

use Illuminate\Database\Query\Builder;

Builder::macro('addSubSelect', function ($column, $query) {
    if (is_null($this->columns)) {
        $this->select($this->from.'.*');
    }

    return $this->selectSub($query->limit(1), $column);
});
Alternatively, I've created a small package for this. Install it using Composer:
composer require reinink/advanced-eloquent

Don't worry too much about what this macro does. However, if you're curious:

  • It adds a select('table.*') to get all the base table columns, in addition to the new subquery column. This is required, since Laravel won't include a "select *" if you have any custom selects defined, which our subquery technically is.
  • It adds a limit of one, since subqueries can only ever return a single row.
  • It calls the selectSub method.

Now, let's rewrite our code to use this new macro:

$users = User::addSubSelect('last_login_at', Login::select('created_at')
    ->whereColumn('user_id', 'users.id')
    ->latest()
)->get();

Wew, pretty nice! 🎉

Scopes

One more thing before we move on to the next step. Let's move our new subquery into a scope on the User model:

class User extends Model
{
    public function scopeWithLastLoginDate($query)
    {
        $query->addSubSelect('last_login_at', Login::select('created_at')
            ->whereColumn('user_id', 'users.id')
            ->latest()
        );
    }
}

$users = User::withLastLoginDate()->get();

I love tucking away query builder code into model scopes like this. Not only does it keep controllers simpler, it also allows easy reuse of these queries. Plus, it's going to help us with the next step, loading dynamic relationships via subqueries.

Dynamic relationships via subqueries

Okay, now for the part we've been building up to. Using a subquery to get the last login date is great, but what if we wanted additional information about the last login? For example, maybe we want to show the IP address of that login as well. How would we do this?

One option is to simply create a second subquery scope:

$users = User::withLastLoginDate()->withLastLoginIpAddress()->get();

{{ $user->last_login_at->format('M j, Y \a\t g:i a') }} ({{ $user->last_login_ip_address }})

And, of course, this would certainly work. But wouldn't it be nicer to work with an actual Login model instance? Especially if that model had additional functionality built into it, such as accessors or relationships.

$users = User::withLastLogin()->get();

{{ $user->lastLogin->created_at->format('M j, Y \a\t g:i a') }} ({{ $user->lastLogin->ip_address }})

Enter dynamic relationships.

We'll start by defining a new lastLogin belongs-to relationship. Now normally for a belongs-to relationship to work, your table needs a column for the foreign key. In our example, that would mean having a last_login_id column on our users table. However, since we're trying to avoid actually having to denormalize and store that data on the users table, we'll use a subquery to select the foreign key instead. Eloquent has no idea that this isn't a real column, meaning everything just works as if it was. Let's look at the code:

class User extends Model
{
    public function lastLogin()
    {
        return $this->belongsTo(Login::class);
    }

    public function scopeWithLastLogin($query)
    {
        $query->addSubSelect('last_login_id', Login::select('id')
            ->whereColumn('user_id', 'users.id')
            ->latest()
        )->with('lastLogin');
    }
}

$users = User::withLastLogin()->get();

<table>
    <tr>
        <th>Name</th>
        <th>Email</th>
        <th>Last Login</th>
    </tr>
    @foreach ($users as $user)
        <tr>
            <td>{{ $user->name }}</td>
            <td>{{ $user->email }}</td>
            <td>
                @if ($user->lastLogin)
                    {{ $user->lastLogin->created_at->format('M j, Y \a\t g:i a') }}
                @else
                    Never
                @endif
            </td>
        </tr>
    @endforeach
</table>

That's all there is to it! The end result here is two database queries. Let's have a look at them:

select
    "users".*,
    (
        select "id" from "logins"
        where "user_id" = "users"."id"
        order by "created_at" desc
        limit 1
    ) as "last_login_id"
from "users"

This query is basically exactly the same as the query we saw before, except instead of selecting the last login date, we're selecting the last login id. We've essentially got the last_login_id column we would have added if we cached the value, without actually having to cache it.

Now let's look at the second query. This is the query that Laravel automatically runs when we eager-load the last logins using with('lastLogin'), which you can see we called in our scope.

select * from "logins" where "logins"."id" in (1, 3, 5, 13, 20 ... 676, 686)

Our subquery has allowed us to select only the last login for each user. Plus, since we're using a standard Laravel relationship for our lastLogin, we also get these records as proper Login Eloquent models. Really nice. 👌

Lazy-loading dynamic relationships

One thing to be aware of with this technique is that you cannot lazy-load dynamic relationships out of the box. This is because our scope will not be added by default.

$lastLogin = User::first()->lastLogin; // will return null

If you'd like lazy-loading to work, you can still do this by adding a global scope to your model:

class User extends Model
{
    protected static function boot()
    {
        parent::boot();

        static::addGlobalScope(function ($query) {
            $query->withLastLogin();
        });
    }
}

Personally, I don't do this. I prefer to explicitly eager-load my dynamic relationships when they are needed.

Can this be done with a has-one?

One last thing before we wrap up. You might be wondering at this point if we could have avoided all this work by simply using a has-one relationship. The short answer is no. Let's look at why.

The first way you might think to do this is by sorting the has-one query:

public function lastLogin()
{
    return $this->hasOne(Login::class)->latest();
}

And, at first, this actually appears to provide the desired results. However, if we look at the generated query, we'll see a problem:

select * from "logins" where "logins"."user_id" in (1, 2, 3...99, 100) order by "created_at" desc

It's eager-loading logins by the user_id, but there are no limits or filters set. Meaning this won't only load the last login, it will load every login record for that user. We're right back to the 12,500 login records problem we saw earlier.

But, we're determined! So, we add a limit:

public function lastLogin()
{
    return $this->hasOne(Login::class)->latest()->limit(1);
}

That seems like it should work, but let's look at the generated query:

select * from "logins" where "logins"."user_id" in (1, 2, 3...99, 100) order by "created_at" desc limit 1

Laravel eager-loads relationships in a single database query, but we've now added a limit of 1. That means we'll only get one record back for all users. This will be the login record for the very last user to have logged in. All other users will have a null relationship.

Summary

I hope that gives you a good overview of how you can use subqueries to create dynamic relationships in Laravel. This is a powerful technique that allows you to push more work into the database layer of your app. This can have a huge impact on performance by allowing you to drastically reduce the number of database queries executed and overall memory used.

If you have any questions or feedback on this article, send me (@reinink) a message on Twitter. I'd love to hear from you.

If you enjoyed this article, and want to be notified in the future about interesting things I'm working on, join my email list below.