Monday 13 February 2012

Grouping Records Through LINQ



Uniting the Records



If you have two sets of records of the same type, you can combine them into a single list. This is referred to as uniting the lists. The lists can be created as arrays or as List<> objects. To let you unite two lists, the Enumerable class is equipped with an extension method named Union. A value that appears more than once is included only once in the result. Its syntax is:

public static IEnumerable<TSource> Union<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second);


The list that calls this method and the list passed as argument must hold values of the same type. Here is an example:


using System;
using System.Linq;
using System.Collections.Generic;

public class Exercise
{
    public static int Main()
    {
        var names = new string[]
        {
            "Jerrie Sachs",
            "Stevens Souza",
            "Marianne Swanson",
            "Alain Gudmundson",
            "Jeannette Perkins",
            "Pierrette Perkins"
        };

        var values = new string[]
        {
            "Gertrude Monay",
            "Raymond Kouma"
        };

        IEnumerable<string> result = names.Union<string>(values);

        foreach (var member in result)
            Console.WriteLine(member.ToString());

        Console.WriteLine();
        return 0;
    }
}


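This would produce:

Jerrie Sachs
Stevens Souza
Marianne Swanson
Alain Gudmundson
Jeannette Perkins
Pierrette Perkins
Gertrude Monay
Raymond Kouma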


In the same way, you can unite records from lists of objects of a class. The most important rule to observe is that both lists must hold values of the same type.
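
One detail worth keeping in mind with objects is that Union drops duplicates, and it decides what counts as a duplicate through the Equals and GetHashCode methods of the type. The following is only a minimal sketch: the Contact class and the UniteNames helper are hypothetical names introduced for this illustration, and the rule "same name means same record" is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class used only for this illustration.
public class Contact
{
    public string Name;

    public Contact(string name)
    {
        Name = name;
    }

    // Assumption: two contacts are the same record when their names match.
    public override bool Equals(object obj)
    {
        var other = obj as Contact;
        return other != null && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Name.GetHashCode();
    }
}

public static class UnionSketch
{
    // Unites two lists of contacts and returns the names found in the result.
    public static List<string> UniteNames(IEnumerable<Contact> first,
                                          IEnumerable<Contact> second)
    {
        return first.Union(second).Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        var names = new[] { new Contact("Jerrie Sachs"), new Contact("Raymond Kouma") };
        var values = new[] { new Contact("Gertrude Monay"), new Contact("Raymond Kouma") };

        // "Raymond Kouma" is in both arrays but appears only once in the result.
        foreach (var name in UniteNames(names, values))
            Console.WriteLine(name);
    }
}
```

Without the two overrides, the two "Raymond Kouma" objects would be compared by reference and both would be kept.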


Grouping by Categories


In all statements we have created so far, we stored the result of a select clause in a variable, and that variable was able to provide its list when necessary. We also learned how to sort the values of such a list. As an alternative, you can ask LINQ to produce a list where the values are grouped by categories. For example, if you have a list of students, you may want the list to be organized by gender, or by another category.

To support grouping, LINQ provides the group...by clause. The primary formula to use it is:

var SubListName = from ValueHolder in List group ValueHolder by Category;
 

The new keywords are group and by. They are required. The ValueHolder factor is the same variable declared by the from clause. The Category is the new factor of our formula. It must be a value that can be obtained from ValueHolder, such as one of its fields or properties. Here is an example:

var classroom = from pupils
                in students
                group pupils by pupils.Gender;


The group...by expression creates a list of categories, not a list of values as a select clause would. This means that each item of the list produced by the group...by clause is a category. Therefore, to access a category, use a foreach loop that applies to each item of the list.

Here is an example:

var classroom = from pupils
                in students
                group pupils by pupils.Gender;

foreach (var stds in classroom)
{
    // stds represents one category (one group of students)
}
 

Each category of the group...by list is its own list of items. Therefore, after accessing a category, inside of it, you can access an item. To do this, use a nested foreach loop. 

Here is an example:

using System;
using System.Linq;
using System.Collections.Generic;

public class Exercise
{
    public static int Main()
    {
        var students = new Student[]
        {
            new Student(82495, "Carlton", "Blanchard", 'M'),
            new Student(20857, "Jerrie", "Sachs", 'U'),
            new Student(20935, "Charlotte", "O'Keefe", 'F'),
            new Student(79274, "Christine", "Burns", 'F'),
            new Student(79204, "Bobbie", "Swanson"),
            new Student(14815, "Marianne", "Swanson", 'F'),
            new Student(24958, "Jeannette", "Perkins", 'F'),
            new Student(24759, "Pierrette", "Perkins", 'F'),
            new Student(92804, "Charles", "Pressmann", 'M'),
            new Student(80074, "Alain", "Goodson", 'M')
        };

        var classroom = from pupils
                        in students
                        group pupils by pupils.Gender;
        
        Console.WriteLine("+=======+============+===========+========+");
        Console.WriteLine("| Std # | First Name | Last Name | Gender |");
        foreach (var stds in classroom)
        {
            foreach (var pupil in stds)
            {
                Console.WriteLine("+-------+------------+-----------+--------+");
                Console.WriteLine("| {0,5} | {1,-10} | {2,-9} | {3,4}   |",
                                  pupil.StudentNumber, pupil.FirstName,
                                  pupil.LastName, pupil.Gender);
            }
        }

        Console.WriteLine("+=======+============+===========+========+");

        Console.WriteLine();
        return 0;
    }
}

public class Student
{
    public int    StudentNumber;
    public string FirstName;
    public string LastName;
    public char Gender;

    public Student(int number = 0,
                   string firstName = "Leslie",
                   string lastName = "Doe",
                   char   gdr = 'U')
    {
        StudentNumber = number;
        FirstName = firstName;
        LastName = lastName;
        Gender = gdr;
    }
}


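This would produce a table in which the students are listed one group after another, in the order in which each gender is first encountered in the array: the three M students (Carlton, Charles, Alain), then the two U students (Jerrie, Bobbie), then the five F students.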


Notice that no select clause is needed after the group...by clause: a LINQ query can end with either one.

To restrict the list of records in the result, you can add a where condition. Here is an example:

public class Exercise
{
    public static int Main()
    {
        var students = new Student[]
        {
            new Student(82495, "Carlton", "Blanchard", 'M'),
            new Student(20857, "Jerrie", "Sachs", 'U'),
            new Student(20935, "Charlotte", "O'Keefe", 'F'),
            new Student(79274, "Christine", "Burns", 'F'),
            new Student(79204, "Bobbie", "Swanson"),
            new Student(14815, "Marianne", "Swanson", 'F'),
            new Student(24958, "Jeannette", "Perkins", 'F'),
            new Student(24759, "Pierrette", "Perkins", 'F'),
            new Student(92804, "Charles", "Pressmann", 'M'),
            new Student(80074, "Alain", "Goodson", 'M')
        };

        var classroom = from pupils
                        in students
                        where pupils.FirstName.StartsWith("C")
                        group pupils by pupils.Gender;
        
        Console.WriteLine("+=======+============+===========+========+");
        Console.WriteLine("| Std # | First Name | Last Name | Gender |");
        foreach (var stds in classroom)
        {
            foreach (var pupil in stds)
            {
                Console.WriteLine("+-------+------------+-----------+--------+");
                Console.WriteLine("| {0,5} | {1,-10} | {2,-9} | {3,4}   |",
                                  pupil.StudentNumber, pupil.FirstName,
                                  pupil.LastName, pupil.Gender);
            }
        }
        Console.WriteLine("+=======+============+===========+========+");

        Console.WriteLine();
        return 0;
    }
}


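This would produce a table with only the four students whose first name starts with C: Carlton and Charles in the M group, followed by Charlotte and Christine in the F group.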


The Key to a Group


When you group values, the resulting list is of type IEnumerable<IGrouping<TKey, TElement>>: each category of the list is an object of type IGrouping. The IGrouping interface is defined in the System.Linq namespace of the System.Core.dll assembly. The IGrouping interface is derived from the IEnumerable interface. This means that it gets most of its behaviors from that interface. This also means that an IGrouping object gives you access to the extension methods of the Enumerable class.

The IGrouping interface is a generic interface declared as follows:

public interface IGrouping<TKey, TElement> : IEnumerable<TElement>,
          IEnumerable
 

In our introduction to grouping, we saw that the operation identifies the categories of the items supplied by the from variable. Each category is identified by a key, which is the TKey value of the corresponding IGrouping item. This allows you to access each category and perform an operation on it.

Although the IGrouping interface inherits most of its functionality from the IEnumerable interface, with the supporting methods implemented in the Enumerable class, it declares only one property of its own, named Key. To get the value that identifies an IGrouping category, retrieve it from the Key property.
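
As an illustration of the Key property, here is a minimal sketch that writes each category's key followed by the first names it contains. The Student class is a shortened version of the one used earlier, repeated so the code compiles on its own, and the KeySketch class and its Describe method are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shortened version of the Student class, repeated so the sketch compiles on its own.
public class Student
{
    public string FirstName;
    public char Gender;

    public Student(string firstName, char gender = 'U')
    {
        FirstName = firstName;
        Gender = gender;
    }
}

public static class KeySketch
{
    // Returns one line of text per category: the key, then the names in that category.
    public static List<string> Describe(IEnumerable<Student> students)
    {
        // The query is a sequence of groups; each group is an IGrouping<char, Student>.
        IEnumerable<IGrouping<char, Student>> classroom = from pupils
                                                          in students
                                                          group pupils by pupils.Gender;

        var lines = new List<string>();
        foreach (IGrouping<char, Student> stds in classroom)
        {
            var names = string.Join(", ", stds.Select(p => p.FirstName));
            lines.Add(stds.Key + ": " + names);   // Key holds the category value
        }
        return lines;
    }

    public static void Main()
    {
        var students = new[]
        {
            new Student("Carlton", 'M'),
            new Student("Jerrie"),
            new Student("Charlotte", 'F'),
            new Student("Charles", 'M')
        };

        foreach (var line in Describe(students))
            Console.WriteLine(line);
    }
}
```

Here the explicit IGrouping<char, Student> type is written out only to show what var stands for; in practice var works just as well.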

Grouping Into a Variable


When you create a grouping, you get a list of categories of values and that list becomes ready to be used. In some cases, before exploring the list, you may want to perform an operation on it. One way to do this is to store that list in a (local) variable and use that variable as if it were a from variable.

To declare a variable to store the grouping values, you use the into contextual keyword through the following formula:

var SubListName = from ValueHolder
    in List
    group ValueHolder by Category into GroupVariable ...;
 

The GroupVariable is the new factor in our formula. You specify it as a regular name of a variable. Here is an example:

var empls = from staffMembers
             in employees
             group staffMembers by staffMembers.Gender into Categories
 

After creating the name, you can perform any operation on it inside the LINQ statement. The variable is of type IGrouping. This means that you can access its Key property or you can access one of the methods that the interface gets from IEnumerable, and then use it as you see fit. Here is an example:

var classroom = from pupils
                in students
                group pupils by pupils.Gender into Genders
                where Genders.Contains(students[0])
 

Before ending the LINQ statement, you must create either a group...by expression or a select statement that uses the into variable. Here is an example:

var classroom = from pupils
                in students
                group pupils by pupils.Gender into Genders
                where Genders.Contains(students[0])
                select Genders;
 

This statement, particularly the call to Genders.Contains(students[0]), keeps only the category (group) that contains the first student (index 0) of the main list. In this case, the result is the M group: Carlton Blanchard, Charles Pressmann, and Alain Goodson.


Notice that all records in the final result have a common category, which in this case is the M gender of each student. For this reason, you can omit that column when presenting the values to the user. Here is an example (the Gender column was removed from the table):

using System;
using System.Linq;
using System.Collections.Generic;

public class Exercise
{
    public static int Main()
    {
        var students = new Student[]
        {
            new Student(82495, "Carlton", "Blanchard", 'M'),
            new Student(20857, "Jerrie", "Sachs", 'U'),
            new Student(20935, "Charlotte", "O'Keefe", 'F'),
            new Student(79274, "Christine", "Burns", 'F'),
            new Student(79204, "Bobbie", "Swanson"),
            new Student(14815, "Marianne", "Swanson", 'F'),
            new Student(24958, "Jeannette", "Perkins", 'F'),
            new Student(24759, "Pierrette", "Perkins", 'F'),
            new Student(92804, "Charles", "Pressmann", 'M'),
            new Student(80074, "Alain", "Goodson", 'M')
        };

        var classroom = from pupils
                        in students
                        group pupils by pupils.Gender into Genders
                        where Genders.Contains(students[0])
                        select Genders;

        Console.WriteLine("+=======+============+===========+");
        Console.WriteLine("| Std # | First Name | Last Name |");
        foreach (var stds in classroom)
        {
            foreach (var pupil in stds)
            {
                Console.WriteLine("+-------+------------+-----------+");
                Console.WriteLine("| {0,5} | {1,-10} | {2,-9} |",
                                  pupil.StudentNumber, pupil.FirstName,
                                  pupil.LastName);
            }
        }
        Console.WriteLine("+=======+============+===========+");

        Console.WriteLine();
        return 0;
    }
}

public class Student
{
    public int    StudentNumber;
    public string FirstName;
    public string LastName;
    public char Gender;

    public Student(int number = 0,
                   string firstName = "Leslie",
                   string lastName = "Doe",
                   char   gdr = 'U')
    {
        StudentNumber = number;
        FirstName = firstName;
        LastName = lastName;
        Gender = gdr;
    }
}


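This would produce a table that lists the three students of the M group (Carlton Blanchard, Charles Pressmann, and Alain Goodson) without the Gender column.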



In the same way, to get the category that contains the second student of the main list, you would use Genders.Contains(students[1]). Of course, this means that you can use grouping and the into operator to get a list of items of only one particular category.
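
The Contains test relies on having a student that is known to belong to the desired category. When the category value itself is known, another option is to test the Key property of the into variable. The following is a minimal sketch under that assumption; the Student class is a shortened version of the one used above, and IntoSketch and NamesInCategory are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shortened version of the Student class, repeated so the sketch compiles on its own.
public class Student
{
    public string FirstName;
    public char Gender;

    public Student(string firstName, char gender = 'U')
    {
        FirstName = firstName;
        Gender = gender;
    }
}

public static class IntoSketch
{
    // Hypothetical helper: returns the first names of the students of one gender.
    public static List<string> NamesInCategory(IEnumerable<Student> students, char gender)
    {
        var classroom = from pupils
                        in students
                        group pupils by pupils.Gender into Genders
                        where Genders.Key == gender   // keep only the requested category
                        select Genders;

        var names = new List<string>();
        foreach (var stds in classroom)
            foreach (var pupil in stds)
                names.Add(pupil.FirstName);
        return names;
    }

    public static void Main()
    {
        var students = new[]
        {
            new Student("Carlton", 'M'),
            new Student("Charlotte", 'F'),
            new Student("Bobbie"),
            new Student("Christine", 'F')
        };

        // Lists only the students of the F category.
        foreach (var name in NamesInCategory(students, 'F'))
            Console.WriteLine(name);
    }
}
```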


Although the GroupVariable can be used in a select or a group...by clause, it cannot be used outside the LINQ statement. It exists only within the query in which it is declared.
