Introduction

This article highlights the shrewd utilization of JOINs and how they can be used to take care of regular business issues, which may be generally troublesome and moderate to tackle with other routines. This article does not cover the nuts and bolts of JOINs, but the reader can check further study. The vast majority of the issues, including removing information from numerous connected tables, are typically comprehended utilizing INNER and LEFT OUTER JOINs. On the other hand, numerous a times a few engineers are left pondering which JOIN ought to be actualized in which situation, particularly on the off chance when more than two tables are included.

MCSE Training – Resources (Intense)

Let the Game Begin

Even though it is recommended that one should avoid subqueries, the vast majority don’t hesitate to use them without realizing the benefits that JOINs can provide over subqueries.

Consider the below scenario from the AdventureWorks 2008 database. Let’s say a query needs to extract all product names that have been bought from vendor Mountain Works.

Case 1: Sub-queries VS JOINs

The traditional way

select DISTINCT Name 
 from Production.Product
 where ProductID IN (Select ProductID from Purchasing.ProductVendor
                      where VendorID=( select VendorID 
                                        from Purchasing.Vendor
                                        where Name='Mountain Works'
                                      )
                     )

Using JOINs:

select DISTINCT P.Name
 from Production.Product P
  INNER JOIN Purchasing.ProductVendor PV 
    ON P.ProductID = PV.ProductID
  INNER JOIN Purchasing.Vendor V 
    ON V.VendorID = PV.VendorID
 where V.Name='Mountain Works'

For a table containing less information, both the questions may or less take same measure of time. However, for a table that contains a large number of columns the time distinction would be recognizable. The reason is that the subquery needs to run one inquiry for each returned column though the JOIN simply needs to run one question, and it uses the table files (if there are any). The same result can be accomplished utilizing an internal JOIN.

Case 2: NOT IN clause VS JOINs

Presently we require an inquiry that needs to separate the names of all items that have been purchased by every one of the sellers with the exception of Mountain Works, so this time we will including the NOT IN statement.

Utilizing Sub-Queries (Traditional and moderate)

select DISTINCT Name 
 from Production.Product 
 where ProductID IN (Select ProductID from Purchasing.ProductVendor 
                     where VendorID IN(select VendorID from Purchasing.Vendor 
                                       where Name<>'Mountain Works'
                                      )
                     )

Using JOINs:

select DISTINCT P.Name 
 from Production.Product P
  INNER JOIN Purchasing.ProductVendor PV 
   ON P.ProductID = PV.ProductID
  LEFT JOIN Purchasing.Vendor V 
   ON V.VendorID = PV.VendorID 
   AND V.Name='Mountain Works'
 where V.Name IS NULL

Note that in the second code piece, a LEFT JOIN has been utilized alongside an AND condition. A LEFT JOIN with Purchasing.Vendor alongside AND condition will give all of you the merchants whose name is Mountain Works (all whatever remains of the vendors’ names will be NULL). These are the columns we don’t need, so to sift them through, we simply include ‘V.Name is NULL’.

Once more, the explanation behind utilizing a JOIN rather than the sub-query is that the latter needs to run one query for each returned column, while the former simply needs to run one query altogether that uses the table records.

Case 3: Optimizing Aggregations

Aggregations are a resource-intensive process. The more the number of columns used for a GROUP BY, the slower the query will execute. Hence, it is always advisable to create aggregations only on required columns.

Taking a look at an example where it is required to find the quantity of product ordered along with product details. In the customary way, I have kept all the non-aggregate columns that are present in the select query, in the group by section as well.

Using all columns (traditional and slow)

select P.Name,
       P.ProductNumber,
       P.Color,
       P.DaysToManufacture,
        P.MakeFlag,
        P.FinishedGoodsFlag,
        P.Color,
        P.SafetyStockLevel,
        P.ReorderPoint,
        P.StandardCost,
        P.ProductNumber,
        P.MakeFlag,
        P.FinishedGoodsFlag,
        P.Color,
        P.SafetyStockLevel,
        P.ReorderPoint,
        P.SizeUnitMeasureCode,
        P.WeightUnitMeasureCode,
        P.Weight,
        P.ProductLine,
        P.Class,
        P.Style,
        H.Freight,
        C.AccountNumber,
        S.SalesOrderID,
       SUM(S.OrderQty) AS OrderQty
 FROM Production.Product P
  INNER JOIN Sales.SalesOrderDetail S ON P.ProductID = S.ProductID
  INNER JOIN Sales.SalesOrderHeader H ON H.SalesOrderID = S.SalesOrderID
  INNER JOIN Sales.Customer C ON C.CustomerID = H.CustomerID
 GROUP BY P.Name,
            P.ProductNumber,
            P.Color,
            P.DaysToManufacture,
            P.MakeFlag,
            P.FinishedGoodsFlag,
            P.Color,
            P.SafetyStockLevel,
            P.ReorderPoint,
            P.StandardCost,
            P.ProductNumber,
            P.MakeFlag,
            P.FinishedGoodsFlag,
            P.Color,
            P.SafetyStockLevel,
            P.ReorderPoint,
            P.SizeUnitMeasureCode,
            P.WeightUnitMeasureCode,
            P.Weight,
            P.ProductLine,
            P.Class,
            P.Style,
            H.Freight,
            C.AccountNumber,
            S.SalesOrderID

Using JOINs, I would create an alias table and use it only on two columns not on 25 columns (shown in traditional method). Aggregation is being done on only the important columns, which is a better way.

select P.Name,	
       P.ProductNumber,
       P.Color,
       P.DaysToManufacture,
        P.MakeFlag,
        P.FinishedGoodsFlag,
        P.Color,
        P.SafetyStockLevel,
        P.ReorderPoint,
        P.StandardCost,
        P.ProductNumber,
        P.MakeFlag,
        P.FinishedGoodsFlag,
        P.Color,
        P.SafetyStockLevel,
        P.ReorderPoint,
        P.SizeUnitMeasureCode,
        P.WeightUnitMeasureCode,
        P.Weight,
        P.ProductLine,
        P.Class,
        P.Style,
        H.Freight,
        C.AccountNumber,
        S.SalesOrderID,
        S.OrderQty
 FROM Production.Product P
  INNER JOIN (select ProductID,SalesOrderID,SUM(OrderQty) AS OrderQty FROM Sales.SalesOrderDetail GROUP BY ProductID,SalesOrderID) S ON P.ProductID = S.ProductID
  INNER JOIN Sales.SalesOrderHeader H ON H.SalesOrderID = S.SalesOrderID
  INNER JOIN Sales.Customer C ON C.CustomerID = H.CustomerID

In the previous approach, aggregation has been done on four columns, though in the last case if the same thing is done utilizing JOINs, then collection should be put just on one segment.

End Game

To wrap things up, the accompanying circumstance is the place JOINs have been demonstrated to be greatly accommodating. In a call focus-based application, it is usual to have a table that records the call-related points of interest like the reaction, call date, and customer name. How about we make the base table:

Select 'A' AS CustomerName,'03-12-10' AS CalledOn,'No Response' AS CustomerResponse INTO #TMP
UNION
Select 'A' AS CustomerName,'02-15-11' AS CalledOn,'Busy Call later' AS CustomerResponse
UNION
Select 'B' AS CustomerName,'03-12-10' AS CalledOn,'Interested Call later' AS CustomerResponse
UNION
Select 'A' AS CustomerName,'10-14-11' AS CalledOn,'No Response' AS CustomerResponse
UNION
Select 'C ' AS CustomerName,'03-12-10' AS CalledOn,'No Response' AS CustomerResponse
UNION
Select 'B' AS CustomerName,'04-29-11' AS CalledOn,'No Response' AS CustomerResponse
UNION
Select 'A' AS CustomerName,'02-14-12' AS CalledOn,'Interested Call later' AS CustomerResponse
UNION
Select 'B' AS CustomerName,'05-01-11' AS CalledOn,'No Response' AS CustomerResponse
UNION
Select 'C' AS CustomerName,'01-05-12' AS CalledOn,'Not Interested' AS CustomerResponse
UNION
Select 'B' AS CustomerName,'10-14-11' AS CalledOn,'Sale' AS CustomerResponse

The prerequisite is to discover every one of the customers who have never demonstrated lack of engagement, their last reaction, and the last time they were called.\

SELECT T.CustomerName,T.CustomerResponse,T.CalledOn
 FROM #TMP T
  LEFT OUTER JOIN (SELECT CustomerName 
                    FROM #TMP 
                    WHERE CustomerResponse = 'Not Interested'
                  ) Q
   ON Q.CustomerName = T.CustomerName
 INNER JOIN (SELECT CustomerName,
                    MAX(CONVERT(DATE,CalledOn)) AS MaxCalledOn 
              FROM #TMP 
              GROUP BY CustomerName
            )Q2
  ON Q2.CustomerName = T.CustomerName
 WHERE Q.CustomerName IS NULL
 AND T.CalledOn = Q2.MaxCalledOn

A brief examination of the inquiry:

1. This inquiry is a decent case of the self-join idea.

2. Instead of utilizing NOT As a part of, we have used LEFT OUTER JOIN and IS NULL.

3. There are three sections in the yield of the inquiry; however the GROUP BY is connected just on a solitary segment.

4. One essential idea of JOIN is being misused here. The two subqueries, Q and Q2, give fewer lines in the yield when contrasted with the primary inquiry; however, their outcome gets rehashed as the worth in key segment. That is, CustomerName is repetitive.

Unquestionably, there are different approaches to compose an equal question, but it is left to the readers to attempt that.

Summary

We can undoubtedly keep away from subqueries by utilizing JOINs. It will make your question speedier and less asset escalated. JOINs have awesome force and their wise use would improve your questions as well as make the SQL Server motor upbeat.