How to use `Include()` and `ThenInclude()` for eager loading?

This tutorial demonstrates how to use Include() and ThenInclude() in Entity Framework Core to eager-load related entities. Eager loading retrieves related data as part of the initial query, which can improve performance compared to lazy loading, where related data is fetched on demand the first time a navigation property is accessed.

Understanding Eager Loading

Eager loading is a technique in Entity Framework Core where you load related entities along with the main entity in a single query. This is achieved using the Include() method. When you have multiple levels of related entities, you use ThenInclude() to specify the further relationships to load.

Without eager loading, related entities have to be fetched with additional queries, for example through lazy loading, which EF Core supports only when it is explicitly enabled (it is off by default). That pattern can lead to the N+1 problem: one query retrieves N entities, and then one additional query runs for each of those N entities to fetch its related data. Eager loading avoids this by fetching the related data as part of the original query.
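To make the difference concrete, here is a minimal sketch that counts the queries each approach issues. It assumes the Microsoft.EntityFrameworkCore.InMemory package and a pared-down Blog/Post model. The NPlusOneDemo class, the database name, and the hand-maintained query counter are illustrative assumptions, not part of EF Core; the in-memory provider does not emit SQL, so the counter simply tallies the queries the code itself executes.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

namespace NPlusOneSketch
{
    public class Blog
    {
        public int BlogId { get; set; }
        public string Title { get; set; }
        public List<Post> Posts { get; set; } = new List<Post>();
    }

    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public int BlogId { get; set; }
        public Blog Blog { get; set; }
    }

    public class BloggingContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            // Hypothetical database name, used only by this sketch
            optionsBuilder.UseInMemoryDatabase("NPlusOneSketch");
        }
    }

    public static class NPlusOneDemo
    {
        // Seeds three blogs with two posts each, only if the store is empty
        public static void Seed()
        {
            using (var context = new BloggingContext())
            {
                if (context.Blogs.Any()) return;
                for (var i = 1; i <= 3; i++)
                {
                    var blog = new Blog { Title = $"Blog {i}" };
                    blog.Posts.Add(new Post { Title = $"Post {i}.1" });
                    blog.Posts.Add(new Post { Title = $"Post {i}.2" });
                    context.Blogs.Add(blog);
                }
                context.SaveChanges();
            }
        }

        // N+1 shape: one query for the blogs, then one more query per blog
        public static (int Queries, int Posts) WithoutInclude()
        {
            using (var context = new BloggingContext())
            {
                var queries = 0;
                var posts = 0;
                var blogs = context.Blogs.ToList();
                queries++;
                foreach (var blog in blogs)
                {
                    var blogId = blog.BlogId;
                    posts += context.Posts.Where(p => p.BlogId == blogId).ToList().Count;
                    queries++;
                }
                return (queries, posts);
            }
        }

        // Eager loading: a single query returns the blogs together with their posts
        public static (int Queries, int Posts) WithInclude()
        {
            using (var context = new BloggingContext())
            {
                var blogs = context.Blogs.Include(b => b.Posts).ToList();
                return (1, blogs.Sum(b => b.Posts.Count));
            }
        }
    }
}
```

After NPlusOneDemo.Seed(), WithoutInclude() reports 4 queries (1 + N for N = 3 blogs) while WithInclude() reports 1, and both see the same 6 posts.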

Basic `Include()` Example

This code snippet demonstrates a basic example of using Include() to load a Blog entity along with its related Posts. The Include(b => b.Posts) part tells Entity Framework Core to retrieve all the posts associated with each blog in a single query.

First, the database is created and seeded if it's empty. Then, the query uses .Include(b => b.Posts) to specify that when retrieving Blog entities, the related Posts entities should also be loaded in the same query. Finally, the code iterates through the retrieved blogs and their posts to display their titles. The sample uses the in-memory provider (UseInMemoryDatabase), which ships in the Microsoft.EntityFrameworkCore.InMemory package.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Blog
{
    public int BlogId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseInMemoryDatabase("BloggingDatabase");
    }
}

public class Example
{
  public static void Run()
  {
    using (var context = new BloggingContext())
    {
        // Ensure database is created
        context.Database.EnsureCreated();

        // Add sample data once; the post is attached through the navigation
        // property, so EF Core sets Post.BlogId when the graph is saved
        if (!context.Blogs.Any())
        {
            context.Blogs.Add(new Blog
            {
                Title = "My Blog",
                Content = "Blog Content",
                Posts = new List<Post>
                {
                    new Post { Title = "My First Post", Content = "Post Content" }
                }
            });
            context.SaveChanges();
        }

        // Eager load Blogs with their Posts
        var blogsWithPosts = context.Blogs
            .Include(b => b.Posts)
            .ToList();

        foreach (var blog in blogsWithPosts)
        {
            Console.WriteLine($"Blog: {blog.Title}");
            foreach (var post in blog.Posts)
            {
                Console.WriteLine($"  - Post: {post.Title}");
            }
        }
    }
  }
}

Using `ThenInclude()` for Multiple Levels of Relationships

When you have relationships nested deeper, you use ThenInclude() to traverse the relationships. In this example, we load Blogs, their related Posts, and the Comments associated with each Post.

The key part is .Include(b => b.Posts).ThenInclude(p => p.Comments). First, Include(b => b.Posts) loads the posts related to each blog. Then, ThenInclude(p => p.Comments) loads the comments related to each of those posts. By default, EF Core translates the whole chain into a single query, so the data is retrieved in one database round trip and the N+1 problem is avoided.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Blog
{
    public int BlogId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
    public ICollection<Comment> Comments { get; set; }
}

public class Comment
{
    public int CommentId { get; set; }
    public string Text { get; set; }
    public int PostId { get; set; }
    public Post Post { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
    public DbSet<Comment> Comments { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseInMemoryDatabase("BloggingDatabase");
    }
}

public class Example
{
  public static void Run()
  {
    using (var context = new BloggingContext())
    {
        // Ensure database is created
        context.Database.EnsureCreated();

        // Add sample data once; the post and its comment are attached through
        // navigation properties, so EF Core sets the foreign keys on save
        if (!context.Blogs.Any())
        {
            context.Blogs.Add(new Blog
            {
                Title = "My Blog",
                Content = "Blog Content",
                Posts = new List<Post>
                {
                    new Post
                    {
                        Title = "My First Post",
                        Content = "Post Content",
                        Comments = new List<Comment>
                        {
                            new Comment { Text = "First Comment" }
                        }
                    }
                }
            });
            context.SaveChanges();
        }

        // Eager load Blogs with their Posts and the Posts' Comments
        var blogsWithPostsAndComments = context.Blogs
            .Include(b => b.Posts)
                .ThenInclude(p => p.Comments)
            .ToList();

        foreach (var blog in blogsWithPostsAndComments)
        {
            Console.WriteLine($"Blog: {blog.Title}");
            foreach (var post in blog.Posts)
            {
                Console.WriteLine($"  - Post: {post.Title}");
                foreach (var comment in post.Comments)
                {
                    Console.WriteLine($"    - Comment: {comment.Text}");
                }
            }
        }
    }
  }
}

Concepts Behind the Snippet

The core concept is to minimize database round trips. By eager loading, you reduce the number of queries needed to retrieve related data. This is particularly important in web applications where database latency can significantly impact performance.

Understanding the entity relationships in your data model is crucial. You need to know which entities are related to which, and how deep the relationships go, to effectively use Include() and ThenInclude().

Real-Life Use Case

Consider an e-commerce application. You might want to display a list of products, each with its categories and its reviews, including the customer who wrote each review. Using eager loading, you could fetch all this data with one query:


var products = _context.Products
    .Include(p => p.Categories)
        .ThenInclude(c => c.Category)
    .Include(p => p.Reviews)
        .ThenInclude(r => r.Customer)
    .ToList();

This way, when rendering the product list, you don't need to make separate database queries for each product's categories and reviews.
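The query above presumes a model in which Product reaches Category through a join entity and each Review points at a Customer. The source does not show that model, so the classes below (Product, ProductCategory, Category, Review, Customer, ShopContext, CatalogQueries) are hypothetical, written only to show one shape under which the query compiles and runs. The sketch again assumes the Microsoft.EntityFrameworkCore.InMemory package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

namespace ShopSketch
{
    public class Product
    {
        public int ProductId { get; set; }
        public string Name { get; set; }
        public List<ProductCategory> Categories { get; set; } = new List<ProductCategory>();
        public List<Review> Reviews { get; set; } = new List<Review>();
    }

    // Hypothetical join entity linking a product to a category
    public class ProductCategory
    {
        public int ProductCategoryId { get; set; }
        public int ProductId { get; set; }
        public Product Product { get; set; }
        public int CategoryId { get; set; }
        public Category Category { get; set; }
    }

    public class Category
    {
        public int CategoryId { get; set; }
        public string Name { get; set; }
    }

    public class Review
    {
        public int ReviewId { get; set; }
        public string Text { get; set; }
        public int ProductId { get; set; }
        public Product Product { get; set; }
        public int CustomerId { get; set; }
        public Customer Customer { get; set; }
    }

    public class Customer
    {
        public int CustomerId { get; set; }
        public string Name { get; set; }
    }

    public class ShopContext : DbContext
    {
        // The other entity types are discovered through Product's navigations
        public DbSet<Product> Products { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseInMemoryDatabase("ShopSketch");
        }
    }

    public static class CatalogQueries
    {
        // Seeds one product with one category and one review, only if empty
        public static void Seed()
        {
            using (var context = new ShopContext())
            {
                if (context.Products.Any()) return;
                var product = new Product { Name = "Keyboard" };
                product.Categories.Add(new ProductCategory
                {
                    Category = new Category { Name = "Peripherals" }
                });
                product.Reviews.Add(new Review
                {
                    Text = "Solid keys",
                    Customer = new Customer { Name = "Ada" }
                });
                context.Products.Add(product);
                context.SaveChanges();
            }
        }

        // Two Include branches, each extended one level with ThenInclude
        public static List<Product> LoadCatalog()
        {
            using (var context = new ShopContext())
            {
                return context.Products
                    .Include(p => p.Categories)
                        .ThenInclude(pc => pc.Category)
                    .Include(p => p.Reviews)
                        .ThenInclude(r => r.Customer)
                    .ToList();
            }
        }
    }
}
```

Each Include() call starts a new branch from Product, which is why the second Include() follows the first ThenInclude() rather than chaining another ThenInclude().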

Best Practices

  • Use Eager Loading Judiciously: Only load related data that you actually need. Over-eager loading can lead to unnecessary data retrieval and impact performance.
  • Profile Your Queries: Use database profiling tools to analyze the performance of your queries and identify potential N+1 problems.
  • Consider Projections: If you only need a subset of data from related entities, consider using projections (Select()) instead of eager loading. This can reduce the amount of data transferred from the database.
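The projection advice above can be sketched as follows. Instead of loading full Post entities, the query selects only each blog's title, its post count, and the post titles. BlogSummary, ProjectionDemo, and the database name are hypothetical names introduced for this illustration, and the sketch assumes the Microsoft.EntityFrameworkCore.InMemory package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

namespace ProjectionSketch
{
    public class Blog
    {
        public int BlogId { get; set; }
        public string Title { get; set; }
        public string Content { get; set; }
        public List<Post> Posts { get; set; } = new List<Post>();
    }

    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public string Content { get; set; }
        public int BlogId { get; set; }
        public Blog Blog { get; set; }
    }

    // Hypothetical DTO holding only the fields the caller needs
    public class BlogSummary
    {
        public string Title { get; set; }
        public int PostCount { get; set; }
        public List<string> PostTitles { get; set; }
    }

    public class BloggingContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseInMemoryDatabase("ProjectionSketch");
        }
    }

    public static class ProjectionDemo
    {
        // Seeds one blog with two posts, only if the store is empty
        public static void Seed()
        {
            using (var context = new BloggingContext())
            {
                if (context.Blogs.Any()) return;
                var blog = new Blog { Title = "My Blog", Content = "Blog Content" };
                blog.Posts.Add(new Post { Title = "My First Post", Content = "Post Content" });
                blog.Posts.Add(new Post { Title = "My Second Post", Content = "More Content" });
                context.Blogs.Add(blog);
                context.SaveChanges();
            }
        }

        // No Include() here: the projection names exactly what is read,
        // so Blog.Content and Post.Content are never requested
        public static List<BlogSummary> LoadSummaries()
        {
            using (var context = new BloggingContext())
            {
                return context.Blogs
                    .Select(b => new BlogSummary
                    {
                        Title = b.Title,
                        PostCount = b.Posts.Count(),
                        PostTitles = b.Posts.Select(p => p.Title).ToList()
                    })
                    .ToList();
            }
        }
    }
}
```

Because the result is a list of DTOs rather than entities, the change tracker does not track it, which suits read-only views.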

Interview Tip

When discussing eager loading in an interview, be prepared to explain the N+1 problem and how Include() and ThenInclude() can prevent it. Also, be ready to discuss the trade-offs between eager loading, lazy loading, and explicit loading, and when each approach is most appropriate.

Demonstrate an understanding that while eager loading can improve performance by reducing the number of database queries, it can also increase the amount of data retrieved, potentially impacting memory usage and network bandwidth. Knowing when to use each technique based on the specific requirements is key.

When to Use Them

Use Include() and ThenInclude() when:

  • You need to access related entities frequently.
  • You are experiencing the N+1 problem with lazy loading.
  • The performance gain from reducing database round trips outweighs the cost of retrieving additional data.

Memory Footprint

Eager loading increases the memory footprint of your application because you are loading more data into memory at once. Be mindful of the size of the related entities and the number of entities you are loading. Over-eager loading can lead to memory pressure and performance issues.

Alternatives

Alternatives to eager loading include:

  • Lazy Loading: Related entities are loaded on demand, when they are accessed. This can lead to the N+1 problem.
  • Explicit Loading: You explicitly load related entities using context.Entry(entity).Reference(navigationProperty).Load() or context.Entry(entity).Collection(navigationProperty).Load().
  • Projections (Select()): Retrieve only the data you need from related entities using a projection. This can be more efficient than eager loading if you only need a subset of the data.
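As an illustration of the explicit-loading calls listed above, the following sketch loads a collection navigation and a reference navigation on demand. The ExplicitLoadingDemo class and the database name are hypothetical, and the sketch assumes the Microsoft.EntityFrameworkCore.InMemory package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

namespace ExplicitLoadingSketch
{
    public class Blog
    {
        public int BlogId { get; set; }
        public string Title { get; set; }
        public List<Post> Posts { get; set; } = new List<Post>();
    }

    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public int BlogId { get; set; }
        public Blog Blog { get; set; }
    }

    public class BloggingContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseInMemoryDatabase("ExplicitLoadingSketch");
        }
    }

    public static class ExplicitLoadingDemo
    {
        // Seeds one blog with two posts, only if the store is empty
        public static void Seed()
        {
            using (var context = new BloggingContext())
            {
                if (context.Blogs.Any()) return;
                var blog = new Blog { Title = "My Blog" };
                blog.Posts.Add(new Post { Title = "First Post" });
                blog.Posts.Add(new Post { Title = "Second Post" });
                context.Blogs.Add(blog);
                context.SaveChanges();
            }
        }

        // Collection navigation: load Blog.Posts only when asked to
        public static (int Before, int After) LoadPostsExplicitly()
        {
            using (var context = new BloggingContext())
            {
                var blog = context.Blogs.Single(b => b.Title == "My Blog");
                var before = blog.Posts.Count; // nothing loaded yet: no Include()
                context.Entry(blog).Collection(b => b.Posts).Load();
                return (before, blog.Posts.Count);
            }
        }

        // Reference navigation: load Post.Blog for a single post
        public static string LoadBlogTitleFor(string postTitle)
        {
            using (var context = new BloggingContext())
            {
                var post = context.Posts.Single(p => p.Title == postTitle);
                context.Entry(post).Reference(p => p.Blog).Load();
                return post.Blog.Title;
            }
        }
    }
}
```

After Seed(), LoadPostsExplicitly() returns (0, 2): the collection is empty until Load() runs. Each Load() call is its own query, so calling it inside a loop over many entities reproduces the N+1 pattern.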

Pros

  • Reduces the number of database queries.
  • Avoids the N+1 problem.
  • Improves performance when related entities are frequently accessed.

Cons

  • Increases the amount of data retrieved from the database.
  • Increases memory footprint.
  • Can lead to performance issues if overused.

FAQ

  • What is the N+1 problem?

    The N+1 problem occurs when retrieving N entities and then making one additional query for each of those N entities to fetch related data. This can significantly impact performance, especially with large datasets. Eager loading can help prevent the N+1 problem.

  • When should I use eager loading vs. lazy loading?

    Use eager loading when you know you will need to access the related entities frequently and want to avoid the N+1 problem. Use lazy loading when you only need to access related entities occasionally and want to minimize the amount of data retrieved upfront.

  • Can I use `Include()` and `ThenInclude()` with filtering?

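    Yes, starting with EF Core 5.0. Include() and ThenInclude() accept a filtered collection navigation, for example Include(b => b.Posts.Where(p => p.Title.StartsWith("EF"))). The supported operations are Where, OrderBy, OrderByDescending, ThenBy, ThenByDescending, Skip, and Take. In earlier versions you cannot filter inside Include() or ThenInclude(); use a projection (Select()) and filter the related entities within it instead.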
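Both ways of filtering related data can be sketched side by side: a filtered Include(), which requires EF Core 5.0 or later, and a projection that filters inside Select(), which also works on earlier versions. FilteringDemo, BlogPostTitles, the sample titles, and the database name are hypothetical, and the sketch assumes the Microsoft.EntityFrameworkCore.InMemory package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

namespace FilteredIncludeSketch
{
    public class Blog
    {
        public int BlogId { get; set; }
        public string Title { get; set; }
        public List<Post> Posts { get; set; } = new List<Post>();
    }

    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public int BlogId { get; set; }
        public Blog Blog { get; set; }
    }

    // Hypothetical DTO for the projection variant
    public class BlogPostTitles
    {
        public string Title { get; set; }
        public List<string> PostTitles { get; set; }
    }

    public class BloggingContext : DbContext
    {
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder.UseInMemoryDatabase("FilteredIncludeSketch");
        }
    }

    public static class FilteringDemo
    {
        // Seeds one blog with three posts, two of which match the filter
        public static void Seed()
        {
            using (var context = new BloggingContext())
            {
                if (context.Blogs.Any()) return;
                var blog = new Blog { Title = "My Blog" };
                blog.Posts.Add(new Post { Title = "EF Core tips" });
                blog.Posts.Add(new Post { Title = "Cooking" });
                blog.Posts.Add(new Post { Title = "EF Core basics" });
                context.Blogs.Add(blog);
                context.SaveChanges();
            }
        }

        // Filtered Include (EF Core 5.0+): only matching posts are loaded,
        // and the query asks for them sorted by title
        public static List<Blog> WithFilteredInclude()
        {
            using (var context = new BloggingContext())
            {
                return context.Blogs
                    .Include(b => b.Posts
                        .Where(p => p.Title.StartsWith("EF"))
                        .OrderBy(p => p.Title))
                    .ToList();
            }
        }

        // Projection alternative: the same filter applied inside Select()
        public static List<BlogPostTitles> WithProjection()
        {
            using (var context = new BloggingContext())
            {
                return context.Blogs
                    .Select(b => new BlogPostTitles
                    {
                        Title = b.Title,
                        PostTitles = b.Posts
                            .Where(p => p.Title.StartsWith("EF"))
                            .Select(p => p.Title)
                            .ToList()
                    })
                    .ToList();
            }
        }
    }
}
```

One caveat for the filtered Include() in a tracking query: if other posts of the same blog are already tracked by the context, EF Core's relationship fix-up can still add them to blog.Posts, so the sketch runs each query in a fresh context.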