Practical Problem-Solving with PROC SQL |
Problem |
You want to compute a weighted average of a column of values.
Background Information |
There is one input table, called Sample, that contains the following data:
Sample Input Table for Weighted Averages
Sample Data for Weighted Average Obs Value Weight Gender 1 2893.35 9.0868 F 2 56.13 26.2171 M 3 901.43 -4.0605 F 4 2942.68 -5.6557 M 5 621.16 24.3306 F 6 361.50 13.8971 M 7 2575.09 29.3734 F 8 2157.07 7.0687 M 9 690.73 -40.1271 F 10 2085.80 24.4795 M
Note that some of the weights are negative.
Solution |
Use the following PROC SQL code to obtain weighted averages that are shown in the following output:
proc sql; title 'Weighted Averages from Sample Data'; select Gender, sum(Value*Weight)/sum(Weight) as WeightedAverage from (select Gender, Value, case when Weight gt 0 then Weight else 0 end as Weight from Sample) group by Gender;
PROC SQL Output for Weighted Averages
Weighted Averages from Sample Data Weighted Gender Average -------------------- F 1864.026 M 1015.91
How It Works |
This solution uses an in-line view to create a temporary table that eliminates the negative data values in the Weight column. The in-line view is a query that
selects the Gender and Value columns.
uses a CASE expression to select the value from the Weight column. If Weight is greater than zero, then it is retrieved; if Weight is less than zero, then a value of zero is used in place of the Weight value.
(select Gender, Value, case when Weight>0 then Weight else 0 end as Weight from Sample)
The first, or outer, SELECT statement in the query
selects the Gender column
constructs a weighted average from the results that were retrieved by the in-line view.
select Gender, sum(Value*Weight)/sum(Weight) as WeightedAverage
Finally, the query uses a GROUP BY clause to combine the data so that the calculation is performed for each gender.
group by Gender;
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.