CHAPTER 3
As you might expect, Entity Framework offers a small number of APIs to get data from the database into objects, designed to cover slightly different scenarios.
All data access layers must support loading by primary key, and certainly EFCF does so. For that, we have the Find method:
Code Listing 53
//retrieving a record by a single primary key consisting of an integer. var project = ctx.Projects.Find(1); |
This method can take any number of parameters; it supports entities having composite primary keys. The following example shows just that:
Code Listing 54
//retrieving a record by a composite primary key consisting of two integers. var project = ctx.SomeEntities.Find(1, 2); |
Find will return NULL if no matching record is found; no exception will be thrown, since this is a perfectly valid result.
Tip: The order and type of the parameters must match the order of primary key properties defined in the mapping for this entity.
Since its introduction with .NET 3.5, Language Integrated Querying (LINQ) has become the de facto standard for querying data of any kind, so it’s no surprise that Entity Framework has LINQ support. It will probably be your API of choice for most scenarios. It is strongly typed (meaning you can tell at compile time that some things are not right), refactor friendly, and its syntax is easy to understand. Let’s see some examples.
A LINQ to Entities query is built from the entity collection properties of the DbSet<T> type exposed by the context, and it consists of an IQueryable<T> implementation.
Tip: Don’t forget that LINQ is about querying. You won’t find any way to change data here.
Execution of LINQ queries is deferred until GetEnumerator is invoked, which occurs in a foreach loop or when a terminal operator such as ToList, ToArray, ToDictionary, Any, Count, LongCount, Single, SingleOrDefault, First, FirstOrDefault, Last, or LastOrDefault is used. You can pick up this query and start adding restrictions such as paging or sorting.
Code Listing 55
//create a base query. var projectsQuery = from p in ctx.Projects select p; //add sorting. var projectsSortedByDateQuery = projectsQuery.OrderBy(x => x.Start);
//execute and get the sorted results. var projectsSortedByDateResults = projectsSortedByDateQuery.ToList(); //add paging and ordering (required for paging). var projectsWithPagingQuery = projectsQuery .OrderBy(x => x.Start) .Take(5) .Skip(0); //execute and get the first 5 results. var projectsWithPagingResults = projectsWithPagingQuery.ToList(); //add a restriction var projectsStartingAWeekAgoQuery = projectsQuery .Where(x => x.Start.Year >= DateTime.Today.Year); //execute and get the projects that started a week ago. var projectsStartingAWeekAgoResults = projectsStartingAWeekAgoQuery.ToList(); |
You will get at most a single result.
Code Listing 56
//retrieving at most a single record with a simple filter. var project = ctx.Projects.SingleOrDefault(x => x.ProjectId == 1); |
Restricting by several properties is just as easy:
Code Listing 57
//retrieving multiple record with two filters. var projects = ctx.Projects .Where(x => x.Name.Contains("Something") && x.Start >= DateTime.Today) .ToList(); |
Or having one of two conditions matched:
Code Listing 58
//or var resourcesKnowingVBOrCS = ctx.Technologies .Where(t => t.Name == "VB.NET" || t.Name == "C#") .SelectMany(x => x.Resources) .Select(x => x.Name) .ToList(); |
Count results:
Code Listing 59
//count var numberOfClosedProjects = ctx.Projects .Where(x => x.End != null && x.End < DateTime.Now) .Count(); |
Check record existence:
Code Listing 60
//check existence var existsProjectBySomeCustomer = ctx.Projects .Any(x => x.Customer.Name == "Some Customer"); |
Perform a projection (only get some parts of an entity):
Code Listing 61
//get only the name of the resource and the name of the associated project. var resourcesXprojects = ctx.Projects .SelectMany(x => x.ProjectResources) .Select(x => new { Resource = x.Resource.Name, Project = x.Project.Name }) .ToList(); |
Do aggregations:
Code Listing 62
var avgResources = ctx.Projects.Average(p => p.ProjectResources.Count()); |
Get distinct values:
Code Listing 63
//distinct roles performed by a resource. var roles = ctx.Resources .SelectMany(x => x.ProjectResources) .Where(x => x.Resource.Name == "Ricardo Peres") .Select(x => x.Role) .Distinct() .ToList(); |
Group on a property:
Code Listing 64
//grouping and projecting. var resourcesGroupedByProjectRole = ctx.Projects .SelectMany(x => x.ProjectResources) .Select(x => new { Role = x.Role, Resource = x.Resource.Name }) .GroupBy(x => x.Role) .Select(x => new { Role = x.Key, Resources = x }) .ToList(); //grouping and counting. var projectsByCustomer = ctx.Projects .GroupBy(x => x.Customer) .Select(x => new { Customer = x.Key.Name, Count = x.Count() }) .ToList(); //top 10 customers having more projects in descending order. var top10CustomersWithMoreProjects = ctx.Projects .GroupBy(x => x.Customer.Name) .Select(x => new { x.Key, Count = x.Count() }) .OrderByDescending(x => x.Count) .Take(10) .ToList(); |
Use the results of a subquery:
Code Listing 65
//subquery var usersKnowingATechnology = (from r in ctx.Resources where r.Technologies.Any(x => (from t in ctx.Technologies where t.Name == "ASP.NET" select t).Contains(x)) select r) .ToList(); |
Partial matches (LIKE):
Code Listing 66
//like var aspTechnologies = (from t in ctx.Technologies where EF.Functions.Like(t.Name, "asp%") select t) .ToList(); |
Tip: Notice the static Like method in the EF.Functions class.
Finally, you can check for one of a set of values:
Code Listing 67
//contains var customersToFind = new string[] { "Some Customer", "Another Customer" }; var projectsOfCustomers = ctx.Projects .Where(x => customersToFind.Contains(x.Customer.Name)) .ToList(); |
Note: In case you are wondering, all literals present in LINQ queries (strings, numbers, dates, etc.) will be turned into parameters for proper execution plan reusing.
Tip: In EF Core 2.0, it is not possible to perform mathematical operations nor operations over dates and times.
Since EF Core 1.0, it has been possible to mix server and client-side code on your queries. For example, if the LINQ parser finds some method call that it does not know how to translate to a database call, it will execute it silently on the client-side. Depending on the call, it can either prove useful or result in a performance penalty. As an example of the latter, imagine the case where you are filtering by a client method: you need to bring all records, apply filtering to them, and only after return the results. But if used sparingly, it can indeed turn out useful; here’s an example:
Code Listing 68
var projects = from p in ctx.Projects select new { p.Name, Age = CalculateAge(p.StartDate) }; |
As a side note, if you want to disable this, effectively reverting to the old, pre-Core behavior, all you need to do is configure logging to throw an exception on the event of a client method call:
Code Listing 69
services.AddDbContext<ProjectsContext>(options => { options .UseSqlServer("<connection string>") .ConfigureWarnings(options => options.Throw(RelationalEventId.QueryClientEvaluationWarning)); }); |
Try as we might, the truth is that when you’re working with relational databases, it’s impossible to escape SQL. This may be because performance is typically better or because some query is difficult or even impossible to express using any of the other APIs, but that’s just the way it is. Entity Framework Code First (and Core) has full support for SQL, including:
It does have all the disadvantages you might expect:
The first case I’m demonstrating is how to execute a SELECT and convert the result into an entity. For that, we shall use the FromSql method of the DbSet<T>.
Code Listing 70
//simple select var projectFromSQL = ctx.Projects .FromSql("SELECT * FROM Project WHERE Name = @p0", "Big Project") .SingleOrDefault(); |
Tip: Notice how we pass parameters directly after the SQL; each must be named @p0, @p1, and so on.
If we wanted to retrieve an entity from a table-valued function, we would use the following:
Code Listing 71
//table-valued function var projectFromFunction = ctx.Projects .FromSql("SELECT * FROM dbo.GetProjectById @p0", 1) .SingleOrDefault(); |
Where the GetProjectById function might be something like this:
Code Listing 72
CREATE FUNCTION dbo.GetProjectById ( @ProjectID INT ) RETURNS TABLE AS RETURN ( SELECT * FROM Project WHERE ProjectId = @ProjectID ) GO |
Tip: Don’t forget that if you want to return entities, your SQL must return columns that match the properties of these entities, as specified in its mapping.
If we want to execute arbitrary SQL modification commands (UPDATE, DELETE, INSERT), we will need to retrieve the underlying DbConnection as shown in the following example.
Code Listing 73
//get the ADO.NET connection. var con = ctx.Database.GetDbConnection(); //create a command. var cmd = con.CreateCommand(); cmd.CommandText = "UPDATE ProjectDetail SET Budget = Budget * 1.1 WHERE ProjectId = @p0"; //create a parameter. var parm = cmd.CreateParameter(); parm.ParameterName = "p0"; cmd.Parameters.Add(parm); //update records. var updatedRecords = cmd.ExecuteUpdate(); |
Also new in EF Core 2 is something that existed in pre-Core EF: the ability to execute database functions. Apply a [DbFunction] attribute to a static method, like this:
Code Listing 74
[DbFunction("ComputeHash")] public static int ComputeHash(this string phrase) { throw new NotImplementedException(); } |
As you can see, it’s implementation is irrelevant, as it will never be called by code. Four things to keep in mind:
The name parameter to the [DbFunction] attribute is not necessary, if the database function has the same name as the method, but you can also add a schema name, if the function is not to be found under the default schema. Now you can use it like this:
Code Listing 75
var hash = from p in ctx.Projects select p.Name.ComputeHash(); |
This will execute the database function on the database and return its result. You can also use it in other contexts, like for filtering, for example.
A new feature coming in EF Core is the ability to mix LINQ with SQL; this allows you to get the best of both worlds:
Let’s see an example:
Code Listing 76
var ps = ctx.Projects .FromSql("SELECT p.* FROM Project p") .OrderBy(p => p.Start) .ToList(); |
You can feed the FromSql method from any valid SQL, including stored procedures or functions. Notice that this method is prototyped to return instances of Project, so any queries to be executed need to return columns that can be mapped to Projects, per the mapping.
There will be cases in which you want all data from the main entity as well as associated entities to be loaded at the same time. This will most likely be for one of two reasons:
Enter eager loading. What eager loading means is, when issuing a query, you explicitly declare the expansion paths that Entity Framework will bring along with the root entities. EF will then generate a different SQL expression than it would normally with using a JOIN for all the required associations.
For example, the following query brings along a Customer and all of its Projects, and it introduces the Include method.
Code Listing 77
//explicitly eager load the Customer for each project. var projectsAndTheirCustomers = ctx.Projects .Include(x => x.Customer) .ToList(); |
For the record, this will produce the following SQL statement:
Code Listing 78
SELECT [Extent1].[ProjectId] AS [ProjectId], [Extent1].[Name] AS [Name], [Extent1].[Start] AS [Start], [Extent1].[End] AS [End], [Extent2].[CustomerId] AS [CustomerId], [Extent2].[Contact_Email] AS [Contact_Email], [Extent2].[Contact_Phone] AS [Contact_Phone], [Extent2].[Name] AS [Name1] FROM [dbo].[Project] AS [Extent1] INNER JOIN [dbo].[Customer] AS [Extent2] ON [Extent1].[Customer_CustomerId] = [Extent2].[CustomerId] |
The Include method can also take a String as its parameter, which must be the name of a navigation property (a reference or a collection).
Code Listing 79
//explicitly eager load the Customer for each project. var projectsAndTheirCustomers = ctx.Projects .Include("Customer") .ToList(); |
Multiple paths can be specified.
Code Listing 80
//two independent include paths. var resourcesProjectResourcesAndTechnologies = ctx.Resources .Include(x => x.ProjectResources) .Include(x => x.Technologies) .ToList(); |
In this case, the SQL will look like the following:
Code Listing 81
SELECT [UnionAll1].[ResourceId] AS [C1], [UnionAll1].[ResourceId1] AS [C2], [UnionAll1].[ResourceId2] AS [C3], [UnionAll1].[Contact_Email] AS [C4], [UnionAll1].[Contact_Phone] AS [C5], [UnionAll1].[Name] AS [C6], [UnionAll1].[C1] AS [C7], [UnionAll1].[ProjectResourceId] AS [C8], [UnionAll1].[ProjectResourceId1] AS [C9], [UnionAll1].[Role] AS [C10], [UnionAll1].[Project_ProjectId] AS [C11], [UnionAll1].[Resource_ResourceId] AS [C12], [UnionAll1].[C2] AS [C13], [UnionAll1].[C3] AS [C14] FROM (SELECT CASE WHEN ([Extent2].[ProjectResourceId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], [Extent1].[ResourceId] AS [ResourceId], [Extent1].[ResourceId] AS [ResourceId1], [Extent1].[ResourceId] AS [ResourceId2], [Extent1].[Contact_Email] AS [Contact_Email], [Extent1].[Contact_Phone] AS [Contact_Phone], [Extent1].[Name] AS [Name], [Extent2].[ProjectResourceId] AS [ProjectResourceId], [Extent2].[ProjectResourceId] AS [ProjectResourceId1], [Extent2].[Role] AS [Role], [Extent2].[Project_ProjectId] AS [Project_ProjectId], [Extent2].[Resource_ResourceId] AS [Resource_ResourceId], CAST(NULL AS int) AS [C2], CAST(NULL AS varchar(1)) AS [C3] FROM [dbo].[Resource] AS [Extent1] LEFT OUTER JOIN [dbo].[ProjectResource] AS [Extent2] ON [Extent1].[ResourceId] = [Extent2].[Resource_ResourceId] UNION ALL SELECT 2 AS [C1], [Extent3].[ResourceId] AS [ResourceId], [Extent3].[ResourceId] AS [ResourceId1], [Extent3].[ResourceId] AS [ResourceId2], [Extent3].[Contact_Email] AS [Contact_Email], [Extent3].[Contact_Phone] AS [Contact_Phone], [Extent3].[Name] AS [Name], CAST(NULL AS int) AS [C2], CAST(NULL AS int) AS [C3], CAST(NULL AS int) AS [C4], CAST(NULL AS int) AS [C5], CAST(NULL AS int) AS [C6], [Join2].[TechnologyId] AS [TechnologyId], [Join2].[Name] AS [Name1] FROM [dbo].[Resource] AS [Extent3] INNER JOIN (SELECT [Extent4].[Resource_ResourceId] AS [Resource_ResourceId], [Extent5].[TechnologyId] AS [TechnologyId], [Extent5].[Name] AS [Name] FROM [dbo].[TechnologyResource] AS [Extent4] INNER JOIN [dbo].[Technology] AS [Extent5] ON [Extent5].[TechnologyId] = [Extent4].[Technology_TechnologyId] ) AS [Join2] ON [Extent3].[ResourceId] = [Join2].[Resource_ResourceId]) AS [UnionAll1] ORDER BY [UnionAll1].[ResourceId1] ASC, [UnionAll1].[C1] ASC |
Here’s a final example with multilevel inclusion:
Code Listing 82
//multilevel include paths. var resourcesProjectResourcesCustomers = ctx .Resources .Include(x => x.ProjectResources.Select(y => y.Project.Customer)) .ToList(); |
The generated SQL will look like this:
Code Listing 83
SELECT [Project1].[ResourceId] AS [ResourceId], [Project1].[Contact_Email] AS [Contact_Email], [Project1].[Contact_Phone] AS [Contact_Phone], [Project1].[Name] AS [Name], [Project1].[C1] AS [C1], [Project1].[ProjectResourceId] AS [ProjectResourceId], [Project1].[Role] AS [Role], [Project1].[ProjectId] AS [ProjectId], [Project1].[Name1] AS [Name1], [Project1].[Start] AS [Start], [Project1].[End] AS [End], [Project1].[CustomerId] AS [CustomerId], [Project1].[Contact_Email1] AS [Contact_Email1], [Project1].[Contact_Phone1] AS [Contact_Phone1], [Project1].[Name2] AS [Name2], [Project1].[Resource_ResourceId] AS [Resource_ResourceId] FROM ( SELECT [Extent1].[ResourceId] AS [ResourceId], [Extent1].[Contact_Email] AS [Contact_Email], [Extent1].[Contact_Phone] AS [Contact_Phone], [Extent1].[Name] AS [Name], [Join2].[ProjectResourceId] AS [ProjectResourceId], [Join2].[Role] AS [Role], [Join2].[Resource_ResourceId] AS [Resource_ResourceId], [Join2].[ProjectId] AS [ProjectId], [Join2].[Name1] AS [Name1], [Join2].[Start] AS [Start], [Join2].[End] AS [End], [Join2].[CustomerId] AS [CustomerId], [Join2].[Contact_Email] AS [Contact_Email1], [Join2].[Contact_Phone] AS [Contact_Phone1], [Join2].[Name2] AS [Name2], CASE WHEN ([Join2].[ProjectResourceId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1] FROM [dbo].[Resource] AS [Extent1] LEFT OUTER JOIN (SELECT [Extent2].[ProjectResourceId] AS [ProjectResourceId], [Extent2].[Role] AS [Role], [Extent2].[Resource_ResourceId] AS [Resource_ResourceId], [Extent3].[ProjectId] AS [ProjectId], [Extent3].[Name] AS [Name1], [Extent3].[Start] AS [Start], [Extent3].[End] AS [End], [Extent4].[CustomerId] AS [CustomerId], [Extent4].[Contact_Email] AS [Contact_Email], [Extent4].[Contact_Phone] AS [Contact_Phone], [Extent4].[Name] AS [Name2] FROM [dbo].[ProjectResource] AS [Extent2] INNER JOIN [dbo].[Project] AS [Extent3] ON [Extent2].[Project_ProjectId] = [Extent3].[ProjectId] INNER JOIN [dbo].[Customer] AS [Extent4] ON [Extent3].[Customer_CustomerId] = [Extent4].[CustomerId] ) AS [Join2] ON [Extent1].[ResourceId] = [Join2].[Resource_ResourceId] ) AS [Project1] ORDER BY [Project1].[ResourceId] ASC, [Project1].[C1] ASC |
As you can imagine, EF goes through a lot of work to JOIN all data that it needs to fetch at the same time, hence the quite complicated SQL.
Here’s a final example with multilevel inclusion using ThenInclude:
Code Listing 84
//multilevel include paths var resourcesProjectResourcesCustomers = ctx .Resources .Include(x => x.ProjectResources) .ThenInclude(x => x.Role) .ToList(); |
ThenInclude can be used only following an Include call to force loading a nested path on the included one.
As you can imagine, EF goes to a lot of work to JOIN all data that it needs to fetch at the same time, hence the quite complicated SQL.
In a case where a reference property was not eagerly loaded, we can still force it to load explicitly:
Code Listing 85
//explicitly load the Customer property. ctx.Entry(project).Reference(x => x.Customer).Load(); |
The same also applies to collections:
Code Listing 86
//see if the ProjectResources collection is loaded. var resourcesLoaded = ctx .Entry(project) .Collection(x => x.ProjectResources) .IsLoaded; if (resourcesLoaded == false) { //explicitly load the ProjectResources collection. ctx.Entry(project).Collection(x => x.ProjectResources).Load(); } |
Another interesting case is where you want to load just a part of some collection by filtering out the entities that do not match a given condition, or even count its members without actually loading them. It is possible to do it with EF, and for this you would issue queries like this:
Code Listing 87
//count an entity's collection entities without loading them. var countDevelopersInProject = ctx .Entry(project) .Collection(x => x.ProjectResources) .Query() .Where(x => x.Role == Role.Developer) .Count(); //filter an entity’s collection without loading it. var developersInProject = ctx .Entry(project) .Collection(x => x.ProjectResources) .Query() .Where(x => x.Role == Role.Developer) .ToList(); |
And you can also force loading and bring along related references or collections—notice the call to Include:
Code Listing 88
//filter an entity’s collection without loading it. var developersInProject = ctx .Entry(project) .Collection(x => x.ProjectResources) .Query() .Include(x => x.Resource) .ToList(); |
So, the difference between lazy and eager loading is that with lazy loading, you don’t need to do anything explicit—you just access the navigation properties without even thinking about it, whereas with explicit loading, you have to perform some action.
Entities known by an Entity Framework context—either loaded from it or marked for deletion or insertion— are stored in what is called a local or first-level cache. Martin Fowler calls it the Identity Map, and you can read more about the concept here. Basically, the context keeps track of all these entities, so that it doesn’t need to materialize them whenever a query that returns their associated records is executed. It is possible to access this cache by means of the ChangeTracker instance. Two extension methods make it easier:
Code Listing 89
public static class DbContextExtensions { public static IEnumerable<EntityEntry<T>> Local<T>(this DbContext context) where T : class { return context.ChangeTracker.Entries<T>(); }
public static IEnumerable<EntityEntry<T>> Local<T>(this DbSet<T> set) where T : class { if (set is InternalDbSet<T>) { var svcs = (set as InternalDbSet<T>) .GetInfrastructure() .GetService<IDbContextServices>();
var ctx = svcs.CurrentContext.Context;
return Local<T>(ctx); }
throw new ArgumentException("Invalid set", "set"); } } //local Projects. var projectsAlreadyLoaded = ctx.Projects.Local(); //filtered local Projects – no need to call ToList. var projectsAlreadyLoadedBelongingToACustomer = projectsAlreadyLoaded .Where(x => x.Customer.Name == "Some Customer"); |
It is possible to know all entities that are present in the local cache, and to see their state, as seen by the context. It’s the responsibility of the ChangeTracker to keep track of all these entities.
Code Listing 90
//get the projects in local cache that have been modified. var modifiedProjects = ctx.ChangeTracker .Entries<Project>() .Where(x => x.State == EntityState.Modified) .Select(x => x.Entity); |
As you can guess, it’s considerably faster to get an entity from the local cache than it is to load it from the database. With that in mind, we can write a method like the following one, for transparently returning a local entity or fetching it with SQL.
Code Listing 91
//retrieve from cache or the database. public static IQueryable<T> LocalOrDatabase<T>(this DbContext context, Expression<Func<T, bool>> expression) where T : class { var localResults = context .Set<T>() .Local() .Where(expression.Compile());
if (localResults.Any() == true) { return localResults.AsQueryable(); }
return context.Set<T>().Where(expression); } |
Another useful technique consists of leveraging LINQ expressions to build complex queries from extension methods.
The BETWEEN SQL operator does not have a corresponding LINQ expression. We can use two simultaneous conditions in our LINQ expression, one for the low end of the range (> X), and one for the high end (< Y). We can also implement a LINQ extension method to provide us this functionality with a single expression.
Code Listing 92
public static class QueryableExtensions { public static IQueryable<TSource> Between<TSource, TKey>( this IQueryable<TSource> source, Expression<Func<TSource, TKey>> property, TKey low, TKey high ) where TKey : IComparable<TKey> { var sourceParameter = Expression.Parameter(typeof(TSource)); var body = property.Body; var parameter = property.Parameters[0]; var compareMethod = typeof(TKey).GetMethod("CompareTo", new Type[] { typeof(TKey) }); var zero = Expression.Constant(0, typeof(int)); var upper = Expression.LessThanOrEqual(Expression.Call(body, compareMethod, Expression.Constant(high)), zero); var lower = Expression.GreaterThanOrEqual(Expression.Call(body, compareMethod, Expression.Constant(low)), zero); var andExpression = Expression.AndAlso(upper, lower); var whereCallExpression = Expression.Call ( typeof(Queryable), "Where", new Type[] { source.ElementType }, source.Expression, Expression.Lambda<Func<TSource, Boolean>>(andExpression, new ParameterExpression[] { parameter }) );
return source.Provider.CreateQuery<TSource>(whereCallExpression); } } |
For a good understanding of how this is implemented, it is crucial to understand LINQ expressions. There are some good links on the Internet. This technology, although complex to master, has great potential and has drawn a lot of attention.
This is an extension method on IQueryable<T>, and it can be used like this:
Code Listing 93
//get projects starting between two dates. var projectsBetweenTodayAndTheDayBefore = ctx .Projects .Between(x => x.Start, DateTime.Today.AddDays(-1), DateTime.Today) .ToList(); //projects with 10 to 20 resources. var projectsWithTwoOrThreeResources = ctx .Projects .Select(x => new { x.Name, ResourceCount = x.ProjectResources.Count() }) .Between(x => x.ResourceCount, 10, 20) .ToList(); |
The LINQ provider will happily chew the new expression and translate it into the appropriate SQL.
Both DbContext and DbSet<T> implement the IInfrastructure<IServiceProvider> interface. This method exposes the internal service provider, and from it you can obtain references to all services used by Entity Framework Core.