The DISTINCT clause in agregate functions in T-SQL

The DISTINCT clause is used to return only distinct (different) values.
It is use often with SELECT statement. But it is not only one using this clause. You may use it either in agregate functions.

I show it by get data from Person.Person table form AdventureWorks2012 database.
If You select year of modified date and first and lastname of person with count agregate function counting rows with this firstname,
You see  data as below for 2002 year and name begin with D letter:

  SELECT [FirstName],[LastName]
         ,YEAR([ModifiedDate])
         ,count([FirstName])
  FROM [AdventureWorks2012].[Person].[Person]
  WHERE YEAR([ModifiedDate]) = 2002 and [FirstName] LIKE 'D%'
  GROUP BY [FirstName],[LastName],YEAR([ModifiedDate])
  ORDER BY [FirstName],[LastName],YEAR([ModifiedDate]) ;

qq1With D first letter in firstname is 7 people.

So, let get year modified date and count agregate function counting rows with this firstname where firstname beginn with D letter for all years.

  SELECT YEAR([ModifiedDate])
	 ,count([FirstName])
  FROM [AdventureWorks2012].[Person].[Person]
  WHERE [FirstName] LIKE 'D%'
  GROUP BY YEAR([ModifiedDate])
  ORDER BY YEAR([ModifiedDate]) ;

qq2In 2002 year it is  people as in  above query.

If You use DISTINCT clause in agregate function, You see this data:

  SELECT YEAR([ModifiedDate])
        ,count(DISTINCT [FirstName])
  FROM [AdventureWorks2012].[Person].[Person]
  WHERE [FirstName] LIKE 'D%'
  GROUP BY YEAR([ModifiedDate])
  ORDER BY YEAR([ModifiedDate]) ;

qq3

As You may notice the count of number rows is diffrent, in this data for 2002 it is 4 people. Without DISTINCT the number of rows is bigger( 7 people).
Why? Because with DISTINCT clause the rows with the same data are counting only one time. The David firstname has 4 people so data with this name is counting only one time.

Leave a Reply

Your email address will not be published.

Time limit is exhausted. Please reload CAPTCHA.