Home

Selecting Records Over Partitions

 

Using Groups of Records

 

Introduction

One of the most common features of a list is that it may have repeating values.  For example, if you have a table of employees where the information of each employee includes the department where he works, you will likely find a number of employees in a certain department and another series of employees in another department.

If you have that type of list, you may be interested in getting a certain piece of information for each group of records. For our example of employees, you may want to know the number of employees in each department, the number of female employees in each department, the minimum salary of employees in each department, etc. This is different from simply calling an aggregate function on the table because the aggregate function would apply to all records. This time, you want to get the information for each group in the list.

Author Note

In this and the next lesson, we will use the word "group" or the expression "group of records". The actual word to use is "partition"; and in some cases, we will use "partition". The problem is that the word partition has a particular meaning in the database world and it has to do with "Distributed Databases". Therefore, to reduce confusion, we will mostly use "group" or "group of records" or "groups of records". You are free to substitute "group" with "partition".

Practical LearningPractical Learning: Introducing Record Partitioning

  1. Open the LambdaPropertiesManagement1 file. Select and copy its whole content
  2. Start Microsoft SQL Server and connect
  3. Right-click the name of the server and click New Query
  4. Paste the LambdaPropertiesManagement1 code in the Query Editor
  5. To execute, on the main menu, click Query -> Execute
  6. Right-click the name of the server and click Start PowerShell
  7. Type SQLCMD and press Enter
  8. Type USE LambdaPropertiesManagement1; and press Enter
  9. Type GO and press Enter
     
    Lambda Properties Management
  10. To see the list of tenants, type the following code and press Enter after each line (if PowerShell is not working, type that code in the Query Editor and press F5 to execute):
    SELECT *
    FROM Rentals.Tenants t;
    GO
    Lambda Properties Management
  11. Click inside the Query Editor and press Ctrl + A
  12. Type the following code:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT props.PropertyNumber, props.Bedrooms, props.MonthlyRate
    FROM Listing.Properties props
    ORDER BY props.Bedrooms;
    GO
    Lambda Properties Management
  13. If you observe the records, you would notice that there are six efficiencies (apartments that don't have a formal bedroom) and two of those apartments have the same price, 22 one-bedroom apartments and some of those apartments have the same price, etc.
    Get to the SQL Server Management Studio. Click inside the Query Editor and press Ctrl + A
  14. To call an aggregate function, type the following code:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT MIN(props.MonthlyRate) "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
  15. To execute, on the main menu, click Query -> Execute
  16. Click inside the Query Editor, press Ctrl + A, and press Delete
  17. Return to the PowerShell window
  18. You can also call various aggregate functions to get different statistics. To see an example, get to the PowerShell window, type the following statement and press Enter after each line:
    SELECT COUNT(props.PropertyType) "Total Listing",
           MIN(props.MonthlyRate)    "Cheapest Monthly Rate",
           MAX(props.MonthlyRate)    "Highest Monthly Rate",
           AVG(props.MonthlyRate)    "Average Monthly Rate"
    FROM   Listing.Properties props;
    GO
    Lambda Properties Management

Selecting Records Over Aggregates

As mentioned above, the primary reason to analyse groups of records is to get a piece of information about the groups. This is done using an aggregate function combined with at least the OVER keyword. The primary formula to follow is:

SELECT ... AggregateFunction(WhatField) OVER() ... FROM WhatObject

After the SELECT keyword, you can call an aggregate function applied to a column of the table whose records you want to analyze. After calling the aggregate function, you must call the OVER macro as a function. This means that OVER is followed by parentheses. You can leave them empty or pass some of the options we will learn.

Besides calling an aggregate function, you can also access any field of the table. You can do this before or after the aggregate function (of course, we will see many examples).

Remember that when specifying the fields used in a SQL statement, if a column is a foreign key, you can use a join to get a more significant value from the parent table.

Practical LearningPractical Learning: Selecting Records Over an Aggregate Function

  1. When OVER is called as a parameter-less function, the database engine gets the value of the first record and moves to the next record. It compares the value of that new record to the value it has. Depending on the aggregate function that is being used:
    • If the value it has responds better to the aggregate function than the value of the new record, the interpreter moves to the next record
    • If the value of the new record responds better to the function, the new value becomes selected and the interpreter moves to the next record
    When it gets to the last record, the SQL interpreter should have the right value. One of the particularities of the parameter-less OVER macro is that it returns an instance of each record, including records whose value did not conform to the aggregate function.
    To see an example of using the OVER keyword, type the following code and press Enter after each line:
    SELECT MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    Selecting Records Over an Aggregate Function
  2. Notice that, when the parameter-less OVER is called, the SQL interpreter returns all records.
    To make the parameter-less OVER() return only one value, you can apply the DISTINCT keyword. To try it, type the following statement and press Enter after each line (remember that you can press the up arrow key to locate a record and simply edit it):
    SELECT DISTINCT(MIN(props.MonthlyRate) OVER()) "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    Selecting Records Over an Aggregate Function
  3. To use an additonal field, type the following statement and press Enter after each line:
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
  4. You can use the GROUP BY clause to an aggregate expression. To try it, type the following statement and press Enter after each line:
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
    FROM Listing.Properties props
    GROUP BY props.Bedrooms, props.MonthlyRate;
    GO

Selecting Ordered Records Over

If you simply call the parameter-less OVER, it produces the list of all records as they were created. Here is an example:

Selecting Ordered Records Over

This result does not show records in groups. As we know already, to show records in groups, you can arrange them, which is done by adding an ORDER BY clause. In this case, specify the ordering in the parentheses of OVER(). The formula to follow is:

SELECT ... AggregateFunction(WhatField) OVER(ORDER BY Field) ... FROM WhatObject

Remember that you want to analyse records in groups of values. Obviously the records should have similar values. And obviously a column that includes unique values (such as a primary key column) or rarely repeating values (like a last name column) is not a good candidate. This means that you should arrange the records based on a column that has repeating values.

Practical LearningPractical Learning: Selecting Ordered Records Over an Aggregate Function

  1. Return to the Query Editor
  2. To arrange records OVER, type the following statement:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER(ORDER BY props.MonthlyRate) "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    
  3. To execute, on the main menu, click Query -> Execute
     
    Selecting Ordered Records Over an Aggregate Function Selecting Ordered Records Over an Aggregate Function
    Selecting Ordered Records Over an Aggregate Function
  4. In the above example, we arranged the records based on the same columns on which the aggregate function is used. This is not a rule: You can arrange the records based on any column of your choice, but the column should have repeating values. To see an example, change the statement as follows:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER(ORDER BY props.Bedrooms) "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    
    
  5. To execute, on the main menu, click Query -> Execute
  6. To see another example, change the statement as follows:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT props.PropertyType,
           MIN(props.MonthlyRate) OVER(ORDER BY props.PropertyType) "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    
    
  7. To execute, on the main menu, click Query -> Execute
  8. To see one more example, change the statement as follows:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT props.PropertyType,
           COUNT(props.PropertyType) OVER(ORDER BY props.PropertyType) "Range in Category"
    FROM Listing.Properties props;
    GO
    
    
  9. To execute, on the main menu, click Query -> Execute
  10. Click inside the Query window and press Ctrl + A
  11. Press Delete
 

Selecting Records Over

In order to analyze records in series, you must create the necessary groups. A group is also referred to as a partition. Therefore, to create a group or partition, you must specify what column will be used; that is, what column holds the repeating values. To do this, pass an expression as PARTITION BY to the parentheses of OVER(). The PARTITION BY expression is followed by the name of the field.

When partitioning the records, if you want, you can arrange the groups based on a field of your choice. To do this, after the column applied to the partition, add an ORDER BY clause that uses the column of your choice.

Practical LearningPractical Learning: Selecting Records Over a Partition

  1. Type the following code:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT PropertyType [Property Type],
           COUNT(PropertyType) OVER(PARTITION BY PropertyType) Total
    FROM Listing.Properties;
    GO
  2. Press F5 to execute
     
    Selecting Records Over Partitions Selecting Records Over Partitions Selecting Records Over Partitions
  3. This result seems to indicate that there are 44 apartments, 15 condominiums, 19 single families, and 16 townhouses. We have to check it.
    Click inside the Query Editor, press Ctrl + A, and press Delete
  4. Switch to the PowerShell window.
    To get the unique value per category (per partition), type the following statement and press Enter after each line:
    SELECT DISTINCT(PropertyType) [Property Type],
           COUNT(PropertyType) OVER(PARTITION BY PropertyType) Total
    FROM Listing.Properties;
    GO
    Selecting Records Over a Partition
  5. Once again, notice that we used only one column to represent the values, to perform the aggregate operation, and to partition the records. Depending on the type(s) of statistics you are trying to get, you can use different fields if you want. As an example, type the following code and press Enter after each line:
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
    FROM Listing.Properties props;
    GO
    Selecting Records Over a Partition
  6. When the PARTITION BY clause is passed to OVER, it returns the result of the aggregate function for each group (or part, or partition) but it also produces an instance of each record. To see an example of partitioning and to arrange records OVER, type the following statement and press Enter after each line:
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate Based on Number of Bedrooms"
    FROM Listing.Properties props;
    GO
    
    
    Selecting Records Over a Partition
  7. Notice that, by default, the records are arranged based on the first column of the SQL statement. As mentioned already, you can arrange the groups based on the column of your choice. To see an example, type the following statement and press Enter after each line:
    SELECT props.Bedrooms,
           MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms ORDER BY props.PropertyNumber) N'Cheapest Monthly Rate Based on Number of Bedrooms'
    FROM Listing.Properties props;
    GO
    
    
    Selecting Records Over a Partition
  8. To show a unique value for each partition, pass the partitioned column to DISTINCT. To apply, type the following statement and press Enter after each line:
    SELECT DISTINCT(props.Bedrooms),
           MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate Based on Number of Bedrooms"
    FROM Listing.Properties props;
    GO
    
    
    Selecting Records Over a Partition
  9. To arrange the partitions, type the following statement and press Enter after each line:
    SELECT DISTINCT(props.Bedrooms),
           MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms ORDER BY props.PropertyNumber)
    		N'Cheapest Monthly Rate Based on Number of Bedrooms'
    FROM Listing.Properties props;
    GO
    
    
    Selecting Records Over a Partition
  10. To get the number of cheapest houses/apartments in each category, type the following statement and press Enter after each line:
    SELECT DISTINCT(props.Bedrooms) "Number of Bedrooms",
           COUNT(props.PropertyNumber) "Number of Houses or Apartments",
           MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate"
    FROM Listing.Properties props
    GROUP BY props.Bedrooms, props.MonthlyRate;
    GO
    
    
    Selecting Records Over a Partition
  11. To see the average monthly rate of properties based on the number of bedrooms, type the following statement and press Enter after each line:
    SELECT DISTINCT(props.Bedrooms),
           AVG(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Avergage Monthly Rate Based on Number of Bedrooms"
    FROM Listing.Properties props;
    GO
    
    
    Selecting Records Over a Partition
  12. To format the average value to display in currency, type the following code and press Enter after each line:
    SELECT DISTINCT(props.Bedrooms),
           FORMAT(AVG(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms), N'C')
    			"Avergage Monthly Rate Based on Number of Bedrooms"
    FROM Listing.Properties props;
    GO
  13. Type the following code and press Enter after each line:
    SELECT DISTINCT(pmts.RegistrationNumber),
           SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments]
    FROM Rentals.Payments pmts;
    GO
    Selecting Records Over a Partition
  14. Remember that if a SQL expression contains a foreign key, you can use a join. As an example, type the following code and press Enter after each line:
    SELECT DISTINCT(regs.TenantCode),
           SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments]
    FROM Rentals.Payments pmts
    INNER JOIN Rentals.Registrations regs
    ON pmts.RegistrationNumber = regs.RegistrationNumber;
    GO
    Selecting Records Over a Partition
  15. Switch to the Query Editor and type the following code:
    USE LambdaPropertiesManagement1;
    GO
    
    SELECT DISTINCT(CONCAT(tens.FirstName, N' ', tens.LastName, N', occupying ',
                    LOWER(props.PropertyType), N' #', props.PropertyNumber,
    		N' with a monthly rate of $', props.MonthlyRate)) [Tenant and Registration Information],
           SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments]
    FROM Rentals.Payments pmts
    INNER JOIN Rentals.Registrations regs ON pmts.RegistrationNumber = regs.RegistrationNumber
    INNER JOIN Rentals.Tenants tens ON tens.TenantCode = regs.TenantCode
    INNER JOIN Listing.Properties props ON regs.PropertyNumber = props.PropertyNumber;
    GO                                                                                                   
  16. To execute, press F5

    Selecting Records Over Partitions

  17. Click inside the Query Editor and press Ctrl + A
  18. Press Delete
 
 
   
 

Previous Copyright © 2012-2013 FunctionX Next