Even
 with all the new features in the Microsoft SQL Server Business 
Intelligence (BI), sometimes the only way to accomplish a task is with 
good old fashioned T-SQL code. (Fortunately, “code” is the acronym for 
this great magazine!) In this latest installment of The Baker’s Dozen, I’ll present 13 T-SQL programming tips that could help you accomplish different database tasks.
What’s on the Menu?
Getting right to the point, here are the 13 items on the menu for this article:  
- A T-SQL example that allocates data in one table based on percentages from another table, where you might learn an unexpected lesson about data types.
- The Baker’s Dozen Spotlight: A T-SQL example that uses the MERGE statement in a data warehousing scenario.
- A T-SQL example that performs the equivalent of a MEDIAN function.
- A T-SQL example that queries SQL Server system tables to retrieve specifics about snapshots.
- A T-SQL example to demonstrate the difference between identity columns and GUID columns.
- A T-SQL example to show different ways to perform queries using partial text searches, and how to use SQL Server tools to evaluate performance.
- The Baker’s Dozen Potpourri: An example of T-SQL Ranking and Aggregation and the dangers of trying to accomplish everything in one query.
- An example of using PIVOT where the spreading element is dynamic.
- Determining a Percentile over a range of values.
- A potential “gotcha” when performing a WHERE with an OUTER JOIN.
- Manually setting an identity column.
- Performing a rollback on a TRUNCATE.
- Creating a Date Dimension.
With
 just a few exceptions, the examples use the AdventureWorks2008R2 
database. You can find AdventureWorks2008R2 on the CodePlex site. If 
you’re still using SQL Server 2008 and not 2008R2, the examples will 
still work - you’ll just need to change any 2008R2 references to 2008. 
Tip 1:  Performing an Allocation
Suppose
 you receive budget data at the region level, and you need to allocate 
it downward to the market or account level, based on each 
market/account’s percentage of share of the region. This scenario occurs
 in data warehouse scenarios where a developer must allocate a measure 
based on some weighted %. The developer needs to be careful, not only to
 implement the allocation method correctly, but also to ensure that the 
sum of the allocated numbers equal the original sum of the measure being
 allocated.
Listing 1 shows an example, using the AdventureWorks2008R2 database. The example uses the tables Purchasing.PurchaseOrderHeader and Purchasing.PurchaseOrderDetails,
 and allocates the freight from the order header table down to the 
product line items in the order detail table, for each purchase order. 
Stated simply, if order A has $100 in freight, and order A has two line 
items (1 and 2) with line items order dollars of $500 and $1,500 
respectively, then line item 1 would receive $25 of the freight and line
 item 2 would receive $75 of the freight. This is because line item 1 
had 25% of the total order dollars and line item 2 had 75% of the line 
item dollars.
This certainly seems simple enough - determine each line item allocation ratio (LineItemRatio) by taking the line item dollars (OrderQty * UnitPrice) and dividing by the SubTotal of order dollars in the order header table.
(OrderQty * UnitPrice)    / POH.SubTotal as LineItemRatio 
After that, you would take the LineItemRatio and
 multiply by the Freight, to determine the allocated freight for each 
line item. Then you’d sum all the allocated freights (which would sum to
 $1,583,807.6975) and compare that to the sum of freight in the order 
header table ($1,583,978.2263). But that’s off by roughly $170!
Now,
 one might argue that the difference is insignificant - after all, $170 
is roughly a hundredth of a percent of the 1.5 million in total freight.
 However, it’s possible that accountants might require (with good 
reason) that the numbers either match or are within a few pennies. So 
are we off by $170 because of simple rounding?
It’s actually a bit more complicated. The culprit here is the money data type. The UnitPrice in
 the order detail table is stored as a money data type, which has a 
fixed scale of four decimal positions. Therefore, any ratio that we 
derive from a money data type will also contain four decimal positions. 
This means that the sum for allocated freights (for any one order) will 
differ from the original order freight by several cents or possibly a 
few dollars. When you aggregate that difference across thousands of 
orders, you have the explanation for the difference of $170.
The solution is to cast the result of the numerator (OrderQty * UnitPrice) as a decimal or a floating point data type, or simply multiply the numerator by 1.0 to force a cast, like so:
(OrderQty * UnitPrice) * 1.0     / POH.SubTotal as LineItemRatio  Cast( (OrderQty * UnitPrice) as float)     / POH.SubTotal as LineItemRatio 
Figure 1 shows
 a partial result with the allocation ratio going well beyond four 
decimals. When we apply this logic, the difference across all orders is 
down to 0.000077 of a penny! I challenge anyone to come closer than 
that!
Figure 1: Results of allocation.
 
No comments:
Post a Comment