Consider the following table named Employees: using System;
using System.Data;
using System.Drawing;
using System.Windows.Forms;
using System.Data.SqlClient;
public class Exercise : System.Windows.Forms.Form
{
DataGridView dgvIdentifications;
public Exercise()
{
InitializeComponent();
}
void InitializeComponent()
{
dgvIdentifications = new DataGridView();
dgvIdentifications.Location = new Point(12, 12);
dgvIdentifications.Size = new System.Drawing.Size(270, 250);
Text = "Exercise";
Controls.Add(dgvIdentifications);
Load += new EventHandler(ExerciseLoad);
dgvIdentifications.Anchor = AnchorStyles.Left | AnchorStyles.Top |
AnchorStyles.Right | AnchorStyles.Bottom;
}
void ExerciseLoad(object sender, EventArgs e)
{
SqlConnectionStringBuilder csbExercise = new SqlConnectionStringBuilder();
csbExercise.DataSource = "(local)";
csbExercise.InitialCatalog = "Exercise1";
csbExercise.IntegratedSecurity = true;
using (SqlConnection cntExercise = new SqlConnection(csbExercise.ConnectionString))
{
SqlCommand cmdEmployees =
new SqlCommand("CREATE SCHEMA Personnel;", cntExercise);
cntExercise.Open();
cmdEmployees.ExecuteNonQuery();
}
using (SqlConnection cntExercise =
new SqlConnection("Data Source='(local)';" +
"Database='Exercise1';" +
"Integrated Security='SSPI';"))
{
SqlCommand cmdEmployees =
new SqlCommand("CREATE TABLE Personnel.Employees(" +
"EmployeeNumber nchar(6) not null primary key, " +
"FirstName nvarchar(20), MiddleName nvarchar(20), " +
"LastName nvarchar(20), " +
"HourlySalary smallmoney, Status nvarchar(40));" +
"INSERT INTO Personnel.Employees " +
"VALUES(N'862804', N'Christopher', NULL, N'Larsen', 14.50, N'Full Time'), " +
" (N'293747', N'Henry', N'Donald', N'Jonathan', 12.85, N'Full Time'), " +
" (N'847597', N'Chistine', NULL, N'Garrison', 24.05, N'Part Time'), " +
" (N'979558', N'Peter', NULL, N'Horries', NULL, NULL), " +
" (N'385807', N'Lance', N'James', N'Seagal', 16.95, N'Full Time'), " +
" (N'927405', N'Paula', N'Roberta', N'Ortez', NULL, N'Full Time'), " +
" (N'790875', N'Paul', NULL, N'Swanson', 10.90, NULL), " +
" (N'384096', N'Kristopher', N'Jude', N'Michaels', 12.85, N'Part Time'), " +
" (N'385968', N'Jennifer', NULL, N'Sanders', 15.00, N'Part Time'), " +
" (N'380696', N'David', N'Peter', N'Monahan', 13.05, N'Full Time');",
cntExercise);
cntExercise.Open();
cmdEmployees.ExecuteNonQuery();
MessageBox.Show("A table named \"Employees\" has been created in the Personnel schema.",
"Exercise",
MessageBoxButtons.OK,
MessageBoxIcon.Information);
}
using (SqlConnection cntExercise = new SqlConnection(csbExercise.ConnectionString))
{
SqlCommand cmdEmployees = new SqlCommand(
"SELECT ALL * FROM Personnel.Employees;",
cntExercise);
cntExercise.Open();
cmdEmployees.ExecuteNonQuery();
SqlDataAdapter sdaEmployees = new SqlDataAdapter(cmdEmployees);
DataSet dsEmployees = new DataSet("EmployeesSet");
sdaEmployees.Fill(dsEmployees);
dgvIdentifications.DataSource = dsEmployees.Tables[0];
}
}
public static int Main()
{
System.Windows.Forms.Application.Run(new Exercise());
return 0;
}
}
Although you can create an aggregate query with all fields or any field(s) of a table or view, the purpose of the query is to summarize data. For a good summary query, you should select a column where the records hold categories of data. This means that the records in the resulting query have to be grouped by categories. To support this, the SQL provides the GROUP BY expression. It is added after the FROM clause. This is done as follows: SELECT WhatField(s) FROM WhatObject(s) GROUP BY Column(s) The new expression in this formula is GROUP BY. This indicates that you want to group some values from one or more columns. Of course, there are rules you must follow. As stated already, the purpose of an aggregate query is to provide some statistics. Therefore, it is normal that you be interested only in the column(s) that hold(s) the desired statistics and avoid the columns that are irrelevant. As a result, if you select (only) the one column that holds the information you want, in the resulting list, each of its categories would display only once.
|
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Transact-SQL provides many built-in functions used to get statistics. These functions are used in various circumstances, depending on the nature of the column being investigated. This means that you should first decide what type of value you wand to get, then choose the appropriate function. To call the function in SQL code, start a SELECT statement and pass the column to the function. The minimum formula to follow is: SELECT FunctionName(FieldName) FROM TableName;
Probably the most basic piece of information you may want to get about a table or query is the number of records it has. In statistics, this is referred to as the number of samples. To help you get this information, Transact-SQL provides a function named Count. It counts the number of records in a column and produces the total. This function also counts NULL values. The syntax of the Count() function is: COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } ) RETURNS INT
This function takes one argument. The Count() function returns an int value. Here is an example:
If you are working on a large number of records, you can call the Count_Big() function. Its syntax is: COUNT_BIG ( { [ ALL | DISTINCT ] expression } | * ) RETURNS bigint
Both count functions allow you to pass the name of the column that has the values. By default, these functions would count only the records that have a known value, not including NULL values.
If you have a list of values, you may want to get the
lowest value. For example, in a list of houses of a real estate company with
each property having a price, you may want to know which house is the
cheapest. To let you get this information, Transact-SQL provides a function
named MIN. Its syntax is: DependsOnType MIN ( [ ALL | DISTINCT ] expression ) The return value of the MIN() function depends on the type of value that is passed to it. For example, if you pass a column that is number-based, the function returns the highest number. Here is an example: USE DepartmentStore1;
GO
SELECT MIN(si.UnitPrice) N'Cheapest'
FROM Inventory.StoreItems si;
GO
If you pass a string-based column, the function returns the the last value in the alphabetical order. Here is an example: USE rosh;
GO
SELECT MIN(stds.LastName) [First Student]
FROM Registration.Students stds;
GO
In the same way, you can pass a date/time-based column. Here is an example: USE rosh;
GO
SELECT MIN(stds.DateOfBirth) "Youngest Student"
FROM Registration.Students stds;
GO
Be careful when passing a value to an aggregate function such as MIN(). For example, if the name of a column is processed by a function, the returned value would be used by the aggregate function. Consider the following call: SELECT MIN(FORMAT(Studs.BirthDate, N'D')) [Earliest Birthdate] FROM Studs; GO This would produce:
Notice that the name Friday, as a string, is the one being processed by the MIN() function, instead of the actual date.
The opposite of the lowest is the highest value of a series. To assist you with getting this value, Transact-SQL provides the Max() function. Its function is: DependsOnType MAX ( [ ALL | DISTINCT ] expression ) This function follows the same rules as its MIN() counterpart, but in reverse order (of the rules). Here is an example: SELECT MAX(si.UnitPrice) N'Most Expensive'
FROM Inventory.StoreItems si;
The sum of the values of a series is gotten by adding all values. In algebra and statistics, it is represented as follows: ∑x To let you calculate the sum of values of a certain
column of a table, Transact-SQL provides a function named Sum. The
syntax of the Sum() function is: Number SUM ( [ ALL | DISTINCT ] expression ) Unlike the MIN() and the MAX() functions that can receive a column of almost any type, the column passed to the SUM() function must be number-based.
In algebra and statistics, the mean is the average of the numeric values of a series. To calculate it, you can divide the sum by the number of values of the series. It is calculated using the following formula:
From this formula:
To support this operation, Transact-SQL provides the
Avg function. Its syntax is: AVG ( [ ALL | DISTINCT ] expression ) RETURNS Number
Imagine you have a column with numeric values. You already know how to get the sum and the mean. The standard deviation is a value by which the elements vary (deviate) from the mean. The formula to calculate the standard deviation is:
From this formula:
The above formula wants you to first calculate the mean. As an alternative, you can use a formula that does not require the mean. It is:
Instead of creating your own function, Transact-SQL can
assist you. First there are two types of standard deviations. The sample
standard deviation relates to a sample. To let you calculate it,
Transact-SQL provides a function named StdDev. Its syntax is: STDEV ( [ ALL | DISTINCT ] expression ) RETURNS float The other standard deviation relates to a population. To help you calculate it, Transact-SQL provides the STDDEVP() function. Its syntax is: STDEVP ( [ ALL | DISTINCT ] expression ) RETURNS float
The variance is the square of the standard deviation. This means that, to calculate it, you can just square the value of a standard deviation. As seen with the standard deviation, there are two types of variances. A sample variance relates to a sample. To help you calculate a sample variance of records, Transact-SQL provides VAR function. Its syntax is: VAR ( [ ALL | DISTINCT ] expression ) RETURNS float The function used to calculate a population variance is VARP and its syntax is: VARP ( [ ALL | DISTINCT ] expression ) RETURNS float
All of the aggregate queries we have used so far involved all the records of a table or view. In some cases, you may want to restrict the records to consider. As you may know by now, this is done by adding a condition to a SQL statement. To add a condition to an aggregate query, instead of WHERE, you use the HAVING keyword. Here is an example: using (SqlConnection scAltairRealtors =
new SqlConnection("Data Source=(local);" +
"Database='AltairRealtors2';" +
"Integrated Security='SSPI';"))
{
SqlCommand cmdProperties =
new SqlCommand("SELECT PropertyType AS Category, " +
" COUNT(PropertyNumber) AS Count, " +
" MIN(MarketValue) AS Minimum, " +
" MAX(MarketValue) AS Maximum, " +
" SUM(MarketValue) AS Sum, " +
" AVG(MarketValue) AS Average, " +
" STDEV(MarketValue) AS StandardDeviation, " +
" VAR(MarketValue) AS Variance " +
"FROM Listings.Properties " +
"GROUP BY PropertyType " +
"HAVING PropertyType = 'Single Family';",
scAltairRealtors);
scAltairRealtors.Open();
cmdProperties.ExecuteNonQuery();
SqlDataAdapter sdaProperties = new SqlDataAdapter(cmdProperties);
DataSet dsProperties = new DataSet("PropertiesSet");
sdaProperties.Fill(dsProperties);
}
Imagine you have a table that has one or more fields with numeric values and you use a SELECT statement to select some of those columns. At the end the statement, you can ask the database engine to perform a calculation using one or more of the aggregate functions and show the result(s). To do this, you use the COMPUTE keyword in a formula as follows: [ COMPUTE
{ { AVG | COUNT | MAX | MIN | STDEV | STDEVP | VAR | VARP | SUM }
( expression ) } [ ,...n ]
[ BY expression [ ,...n ] ]
]
As you can see, you start with COMPUTE followed by the desired function, which uses parentheses. In the parentheses, include the name of the column that holds the numeric values. |
|
|||||||||||||||||||||||||||||||||||||||||
|
|