CHAPTER 7
Language-Integrated Query (LINQ) allows you to query data with a SQL-like syntax. LINQ can be used with many different types of data and both Microsoft and third parties have built LINQ providers to access a wide range of data sources. This chapter narrows that list by showing you how to use LINQ to Objects. Once you know LINQ to Objects, understanding other LINQ providers is easy because of similar syntax.
Before you write any LINQ code, remember to add a using directive for the System.Linq namespace at the top of your file. Each example in this chapter uses the following classes, which contain the collections to work with:
    using System.Collections.Generic;

    public class Customer
    {
        public int ID { get; set; }
        public string Name { get; set; }
    }

    public class Order
    {
        public int CustomerID { get; set; }
        public string Description { get; set; }
    }

    public static class Company
    {
        // The static constructor fills both collections once, before first use.
        static Company()
        {
            Customers = new List<Customer>
            {
                new Customer { ID = 0, Name = "May" },
                new Customer { ID = 1, Name = "Gary" },
                new Customer { ID = 2, Name = "Jennifer" }
            };

            Orders = new List<Order>
            {
                new Order { CustomerID = 0, Description = "Shoes" },
                new Order { CustomerID = 0, Description = "Purse" },
                new Order { CustomerID = 2, Description = "Headphones" }
            };
        }

        public static List<Customer> Customers { get; set; }
        public static List<Order> Orders { get; set; }
    }
These are collections of objects in memory. To make the collections easy to reach from any example, Company is a static class with a static constructor that initializes its static properties. In a real application, the same data could have been read from a file, database, or REST service. Regardless of the data source or the LINQ provider, the basic LINQ syntax remains the same.
The simplest query needs only the from and select keywords, plus a using directive for the System.Linq namespace. The syntax looks like SQL, as you can see in the following example.
    using System;
    using System.Linq;
    using System.Collections.Generic;

    public class Program
    {
        // Main must be static to serve as the program's entry point.
        public static void Main()
        {
            IEnumerable<Customer> customers =
                from cust in Company.Customers
                select cust;

            foreach (Customer cust in customers)
                Console.WriteLine(cust.Name);
        }
    }
LINQ to Objects queries result in a collection of type IEnumerable<T>. In this case, it’s a collection of Customer objects. The from keyword specifies a range variable, cust, which holds each object from the collection. You specify the collection after the in keyword.
The select clause defines what the query returns. In this example, you’re just returning the whole object, so the resulting sequence contains the same objects, in the same order, as Company.Customers. This isn’t particularly useful in LINQ to Objects, but it is very useful if the data is read from an external data source, like a database, where you just want to get a collection of objects into memory for further manipulation. The select clause also allows you to reshape the data you get back into various projections. The following is a query that gets only the customer name.
    IEnumerable<string> customers2 =
        from cust2 in Company.Customers
        select cust2.Name;
The select uses the cust2 variable to access the Name, resulting in a collection of string (the Name property’s type). Sometimes you need a whole different object, where that object might be defined as:
    public class CustomerViewModel
    {
        public string Name { get; set; }
    }
And a new projection could be written as:
    IEnumerable<CustomerViewModel> customerVMs =
        from custVM in Company.Customers
        select new CustomerViewModel
        {
            Name = custVM.Name
        };
Here, select instantiates a new CustomerViewModel. Then it populates values, using object initialization syntax, to assign the custVM.Name to the new object’s Name property. This results in a collection of type CustomerViewModel.
The previous example assumed you needed to work with a specifically typed collection. However, what if you don’t care what type the collection is and what if you didn’t want to create a new class just to do manipulation in a single algorithm? In that case, you could use an anonymous type, as shown in the following listing.
    var customers3 =
        from cust3 in Company.Customers
        select new
        {
            Name = cust3.Name
        };

    foreach (var cust3 in customers3)
        Console.WriteLine(cust3.Name);
Anonymous types don’t have names you can use, even though C# might create an internal name for its own use. To work around this problem, use the var keyword as the type. Notice how the projection uses new without a type name: an anonymous type. You can define whatever properties you want for an anonymous type; just write them in. Notice also that you can use var in the foreach loop.
If you need to return a collection from a method, create a new (named) type and project into that. Anonymous types are designed for situations limited to the scope in which they are used. You’ll see the var keyword used elsewhere in code, but the reason it was added to the language was to support this scenario. The following listing shows a common way to use var, other than the previous scenario.
    var customer = new Customer();
The previous statement is shorter than specifying the object type of the variable, which is redundant in this case and is obviously Customer. However, the following example is less obvious.
    var response = DoSomethingAndReturnResults();
The problem in the previous statement is that just reading the code doesn’t tell you what type var is. You don’t know whether it’s a single object or a collection. In this case, the code might be more maintainable by specifying the type.
Note: A common misconception is that var is dangerous because it behaves like object, allowing you to set the variable to any type. This is not true. When you use var, the code is still strongly typed: the compiler infers the variable's type from the initializing expression, and that type never changes. In the previous examples, customer is an instance of type Customer. You can’t write code later that assigns an object of another type, such as an Order, to that variable.
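The point of the note can be sketched in a few lines. This is only an illustration: VarDemo and DescribeVar are hypothetical names, and Customer and Order are redeclared so the listing compiles on its own.

```csharp
using System;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int CustomerID { get; set; }
    public string Description { get; set; }
}

public static class VarDemo
{
    // Returns the runtime type name of a variable declared with var.
    public static string DescribeVar()
    {
        // The compiler infers Customer from the right-hand side.
        var customer = new Customer { ID = 0, Name = "May" };

        // Uncommenting the next line causes a compile-time error, because
        // an Order cannot be converted to the inferred type, Customer.
        // customer = new Order();

        // Assigning another Customer is fine because the type matches.
        customer = new Customer { ID = 1, Name = "Gary" };

        return customer.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeVar()); // prints "Customer"
    }
}
```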
You can filter a collection with the where clause, as shown in the following example.
    var customers4 =
        from cust4 in Company.Customers
        where cust4.Name.Length > 3 &&
              !cust4.Name.StartsWith("G")
        select cust4;

    foreach (var cust4 in customers4)
        Console.WriteLine(cust4.Name);
In the previous listing, a customer’s name must be longer than 3 characters, which filters the list down to Gary and Jennifer. The condition to the right of the && operator filters that list even further to names whose first character is not "G", leaving only Jennifer.
In LINQ to Objects, you can create complex conditions in the where clause using logical operators, parentheses for grouping, and any other logic to filter results. You can even call another method that will evaluate the current object being evaluated. The result of the where clause must evaluate to a bool. Other LINQ providers might restrict the type of expressions in a where clause, so you’ll have to review documentation for that particular provider to learn more.
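As a rough sketch of calling your own method from a where clause, the following listing combines a helper with parentheses and logical operators. HasLongName, WhereDemo, and FilterNames are hypothetical names chosen for illustration, and Customer is redeclared so the listing stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class WhereDemo
{
    // Hypothetical helper: any method that returns bool can appear in where.
    public static bool HasLongName(Customer cust)
    {
        return cust.Name != null && cust.Name.Length > 3;
    }

    public static List<string> FilterNames(IEnumerable<Customer> customers)
    {
        var query =
            from cust in customers
            where HasLongName(cust) &&
                  (cust.ID == 1 || cust.Name.StartsWith("J"))
            select cust.Name;

        return query.ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };

        // May is rejected by HasLongName; Gary and Jennifer pass both tests.
        foreach (var name in FilterNames(customers))
            Console.WriteLine(name);
    }
}
```

A helper like this works in LINQ to Objects because the query runs as ordinary C# code; a provider that translates queries to another language may not be able to translate it.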
In LINQ, the orderby clause lets you sort collection results. The following listing demonstrates this.
    var customers5 =
        from cust5 in Company.Customers
        orderby cust5.Name descending
        select cust5;

    foreach (var cust5 in customers5)
        Console.WriteLine(cust5.Name);
In this example, the orderby clause sorts the list by the customer name in descending order. The default order is ascending, which you’ll get by either omitting descending or specifying ascending instead. The output is:
May
Jennifer
Gary
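To illustrate the default, here is a small sketch showing that omitting the direction and writing ascending give the same result. OrderByDemo and its two methods are hypothetical names, and Customer is redeclared so the listing stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class OrderByDemo
{
    // No direction given, so the sort is ascending.
    public static List<string> ImplicitAscending(IEnumerable<Customer> customers)
    {
        return (from cust in customers
                orderby cust.Name
                select cust.Name).ToList();
    }

    // The ascending keyword states the default explicitly.
    public static List<string> ExplicitAscending(IEnumerable<Customer> customers)
    {
        return (from cust in customers
                orderby cust.Name ascending
                select cust.Name).ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };

        // Both calls print Gary, Jennifer, May.
        Console.WriteLine(string.Join(", ", ImplicitAscending(customers)));
        Console.WriteLine(string.Join(", ", ExplicitAscending(customers)));
    }
}
```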
Sometimes you’ll have two different collections of objects or related tables in a database and you need to join them together. To do this, use the join clause.
    var customerOrders =
        from cust in Company.Customers
        join ord in Company.Orders
            on cust.ID equals ord.CustomerID
        select new
        {
            ID = cust.ID,
            Customer = cust.Name,
            Item = ord.Description
        };

    foreach (var custOrd in customerOrders)
        Console.WriteLine(
            $"Customer: {custOrd.Customer}, Item: {custOrd.Item}");
After the from clause, you can use one or more join clauses to bring in the collections you need. The on keyword specifies the keys to match between the two collections, and the comparison uses the equals keyword rather than ==. This example projects into an anonymous type to create a report based on the joined information. This is an inner join, which omits any Customers that have no matching Order, so Gary doesn't appear in the output. The following example lets you do the equivalent of a left join.
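    var customerOrders2 =
        from cust in Company.Customers
        join ord in Company.Orders
            on cust.ID equals ord.CustomerID
            into custOrders
        from order in custOrders.DefaultIfEmpty()
        select new
        {
            ID = cust.ID,
            Customer = cust.Name,
            Item = order?.Description
        };

    foreach (var custOrd2 in customerOrders2)
        Console.WriteLine(
            $"Customer: {custOrd2.Customer}, Item: {custOrd2.Item}");
The difference here is the into clause, which gathers each customer's matching orders into custOrders, followed by a second from clause that calls DefaultIfEmpty on that group. When a customer has no orders, DefaultIfEmpty supplies a single null in place of the empty group, so the Customer with the Name Gary is included even though there aren’t any orders that match his ID. Because order can be null in that case, the projection reads Description with the ?. operator.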
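A self-contained sketch of a left join follows, using the join ... into pattern with DefaultIfEmpty and a placeholder for customers without orders. LeftJoinDemo, BuildReport, and the "(no orders)" text are illustrative assumptions; the data classes are redeclared so the listing stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int CustomerID { get; set; }
    public string Description { get; set; }
}

public static class LeftJoinDemo
{
    public static List<string> BuildReport(
        List<Customer> customers, List<Order> orders)
    {
        var rows =
            from cust in customers
            join ord in orders
                on cust.ID equals ord.CustomerID
                into custOrders
            // DefaultIfEmpty yields one null when a customer has no orders.
            from order in custOrders.DefaultIfEmpty()
            select new
            {
                Customer = cust.Name,
                Item = order?.Description ?? "(no orders)"
            };

        return rows
            .Select(row => "Customer: " + row.Customer + ", Item: " + row.Item)
            .ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };
        var orders = new List<Order>
        {
            new Order { CustomerID = 0, Description = "Shoes" },
            new Order { CustomerID = 0, Description = "Purse" },
            new Order { CustomerID = 2, Description = "Headphones" }
        };

        // Four rows: two for May, one placeholder row for Gary, one for Jennifer.
        foreach (var line in BuildReport(customers, orders))
            Console.WriteLine(line);
    }
}
```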
You’ve seen basic LINQ syntax, but there’s much more available in the form of standard query operators. There are literally dozens of standard query operators, and you can view all of them on MSDN at https://msdn.microsoft.com/en-us/library/vstudio/bb397896(v=vs.120).aspx.
The following code listings are a grab bag of examples, demonstrating how to use standard query operators that you might find useful.
So far, you’ve been working with IEnumerable<T>, where T is the projected type of the query. There is a set of standard query operators that return different collection types, including ToList, ToArray, ToDictionary, and more. Here’s an example that turns the results into a List<T>.
    var custList =
        (from cust in Company.Customers
         select cust)
        .ToList();

    custList.ForEach(cust => Console.WriteLine(cust.Name));
The previous code enclosed the query in parentheses and then called the ToList operator. The ForEach method on List<T> lets you pass a lambda.
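ToDictionary works along the same lines. The following is a minimal sketch, assuming customer IDs are unique (ToDictionary throws an ArgumentException on a duplicate key); DictionaryDemo and NamesById are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class DictionaryDemo
{
    // The first lambda selects the key, the second selects the value.
    public static Dictionary<int, string> NamesById(
        IEnumerable<Customer> customers)
    {
        return customers.ToDictionary(cust => cust.ID, cust => cust.Name);
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };

        var names = NamesById(customers);
        Console.WriteLine(names[2]); // prints "Jennifer"
    }
}
```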
LINQ queries use deferred execution. This means that the query doesn’t execute until you execute a foreach loop or call one of the standard query operators, like ToList, that requests the data.
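One way to observe deferred execution is to change the source collection after defining the query. In this sketch (DeferredDemo and CountBeforeAndAfter are hypothetical names), ToList captures the results at the moment it is called, while the query itself runs again each time it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class DeferredDemo
{
    // Returns { count in the snapshot, count when the query is re-run }.
    public static int[] CountBeforeAndAfter()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" }
        };

        // Defining the query does not run it.
        var query = from cust in customers select cust.Name;

        // ToList runs the query now and copies the results.
        var snapshot = query.ToList();

        customers.Add(new Customer { ID = 1, Name = "Gary" });

        // Count() enumerates the query again, so it sees the added customer.
        return new[] { snapshot.Count, query.Count() };
    }

    public static void Main()
    {
        var counts = CountBeforeAndAfter();
        Console.WriteLine(counts[0] + " " + counts[1]); // prints "1 2"
    }
}
```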
You’ve seen how the C# select, where, orderby, and join keywords help build queries. Each of these keywords has a standard query operator equivalent. These standard query operators use a fluent syntax and give you a different way to write the same query as their matching language syntax. Some people prefer the fluent style and others prefer the language syntax; which one you choose is a matter of personal preference. The following is an example of the Where and Select operators, which mirror the where and select language syntax clauses.
    var customers6 =
        Company.Customers
            .Where(cust => cust.Name.StartsWith("J"));

    foreach (var cust6 in customers6)
        Console.WriteLine(cust6.Name);

    var customers7 =
        Company.Customers.Select(cust => cust.Name);

    foreach (var cust7 in customers7)
        Console.WriteLine(cust7);
The Where lambda must evaluate to a bool and the Select lambda lets you specify the projection.
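The orderby and join keywords have fluent counterparts too. As a sketch, the following listing chains OrderByDescending and Join to build a simple report; FluentDemo and Report are hypothetical names, and the data classes are redeclared so the listing stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int CustomerID { get; set; }
    public string Description { get; set; }
}

public static class FluentDemo
{
    public static List<string> Report(
        List<Customer> customers, List<Order> orders)
    {
        return customers
            // Mirrors: orderby cust.Name descending
            .OrderByDescending(cust => cust.Name)
            // Mirrors: join ord in orders on cust.ID equals ord.CustomerID
            .Join(orders,
                  cust => cust.ID,          // key from the outer sequence
                  ord => ord.CustomerID,    // key from the inner sequence
                  (cust, ord) => cust.Name + ": " + ord.Description)
            .ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };
        var orders = new List<Order>
        {
            new Order { CustomerID = 0, Description = "Shoes" },
            new Order { CustomerID = 0, Description = "Purse" },
            new Order { CustomerID = 2, Description = "Headphones" }
        };

        // Gary has no orders, so the inner join leaves him out.
        foreach (var line in Report(customers, orders))
            Console.WriteLine(line);
    }
}
```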
You can perform set operations like Union, Except, and Intersect. The following listing is an example of Union.
    var additionalCustomers =
        new List<Customer>
        {
            new Customer { ID = 1, Name = "Gary" }
        };

    var customerUnion =
        Company.Customers
            .Union(additionalCustomers)
            .ToArray();

    foreach (var cust in customerUnion)
        Console.WriteLine(cust.Name);
Just pass a compatible collection and Union produces a combined sequence with duplicates removed. Duplicates are detected with the default equality comparer, and because Customer doesn't override Equals, the two Gary objects count as different instances and both appear in the output. I used the ToArray operator in this example too, which results in an array of the collection type, Customer.
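If you want Union to treat two Customer objects with the same ID as the same customer, one option is to pass an IEqualityComparer<Customer>. The comparer below is a hypothetical example written for this sketch, not part of the chapter's classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two customers are equal when their IDs match.
public class CustomerIdComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(Customer obj)
    {
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}

public static class UnionDemo
{
    // Returns { count with the default comparer, count with the ID comparer }.
    public static int[] UnionCounts()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };
        var additional = new List<Customer>
        {
            new Customer { ID = 1, Name = "Gary" }
        };

        // Default comparer: the second Gary is a different object, so it is kept.
        int byReference = customers.Union(additional).Count();

        // ID comparer: the second Gary matches ID 1 and is dropped.
        int byId = customers.Union(additional, new CustomerIdComparer()).Count();

        return new[] { byReference, byId };
    }

    public static void Main()
    {
        var counts = UnionCounts();
        Console.WriteLine(counts[0] + " " + counts[1]); // prints "4 3"
    }
}
```

Except and Intersect accept a comparer in the same way.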
There is a useful set of operators for selecting a single element: First, FirstOrDefault, Single, SingleOrDefault, Last, and LastOrDefault. The following example demonstrates First.
    Console.WriteLine(Company.Customers.First().Name);
The one risk of using First this way is an InvalidOperationException with the message “Sequence contains no elements.” when the sequence is empty. This sequence contains elements, but that isn’t guaranteed in general. It is safer to use the operator with the OrDefault suffix, as in the following listing.
    var empty =
        Company.Customers
            .Where(cust => cust.ID == 999)
            .SingleOrDefault();

    if (empty == null)
        Console.WriteLine("No values returned.");
The previous example writes "No values returned." because there isn’t a customer with ID == 999, so SingleOrDefault returns null, which is the default value of a reference type. Note that SingleOrDefault still throws an InvalidOperationException if more than one element matches.
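The two behaviors can be compared side by side. This is only a sketch: ElementDemo, Describe, and DescribeOrDefault are hypothetical names, and the "not found" text is an arbitrary placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class ElementDemo
{
    // First throws InvalidOperationException when nothing matches,
    // so the caller has to catch it.
    public static string Describe(IEnumerable<Customer> customers, int id)
    {
        try
        {
            return customers.First(cust => cust.ID == id).Name;
        }
        catch (InvalidOperationException)
        {
            return "not found";
        }
    }

    // FirstOrDefault returns null for a reference type when nothing
    // matches, so a simple null check is enough.
    public static string DescribeOrDefault(
        IEnumerable<Customer> customers, int id)
    {
        var match = customers.FirstOrDefault(cust => cust.ID == id);
        return match == null ? "not found" : match.Name;
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { ID = 0, Name = "May" },
            new Customer { ID = 1, Name = "Gary" },
            new Customer { ID = 2, Name = "Jennifer" }
        };

        Console.WriteLine(Describe(customers, 999));        // prints "not found"
        Console.WriteLine(DescribeOrDefault(customers, 2)); // prints "Jennifer"
    }
}
```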
These were only a handful of the available operators, but hopefully you have a sense of the wealth of support in the language syntax as well as the standard query operators that make up LINQ.
LINQ allows you to use SQL-like syntax to query data. The LINQ provider used in this chapter is LINQ to Objects, which lets you query objects in memory, but there are many other LINQ providers for other data sources. Use a from to specify the collection being queried and a select to shape the results. The where clause lets you filter results and takes a bool expression to evaluate if a given object should be included. The orderby clause lets you sort results. The join clause lets you combine two collections. Standard query operators extend LINQ and make it even more powerful than the language keywords.