SAS Institute. The Power to Know

SAS(R) 9.2 SQL Procedure User's Guide

Previous Page | Next Page

Practical Problem-Solving with PROC SQL

Computing a Weighted Average


Problem

You want to compute a weighted average of a column of values.


Background Information

There is one input table, called Sample, that contains the following data:

Sample Input Table for Weighted Averages

                        Sample Data for Weighted Average

                      Obs     Value      Weight     Gender

                        1    2893.35      9.0868      F   
                        2      56.13     26.2171      M   
                        3     901.43     -4.0605      F   
                        4    2942.68     -5.6557      M   
                        5     621.16     24.3306      F   
                        6     361.50     13.8971      M   
                        7    2575.09     29.3734      F   
                        8    2157.07      7.0687      M   
                        9     690.73    -40.1271      F   
                       10    2085.80     24.4795      M   

Note that some of the weights are negative.


Solution

Use the following PROC SQL code to obtain weighted averages that are shown in the following output:

proc sql;
   title 'Weighted Averages from Sample Data';
   select Gender, sum(Value*Weight)/sum(Weight) as WeightedAverage
      from (select Gender, Value,
                   case
                      when Weight gt 0 then Weight
                      else 0
                   end as Weight
               from Sample)
      group by Gender;

PROC SQL Output for Weighted Averages

                       Weighted Averages from Sample Data

                                          Weighted
                              Gender       Average
                              --------------------
                              F           1864.026
                              M            1015.91

How It Works

This solution uses an in-line view to create a temporary table that eliminates the negative data values in the Weight column. The in-line view is a query that

  • selects the Gender and Value columns.

  • uses a CASE expression to select the value from the Weight column. If Weight is greater than zero, then it is retrieved; if Weight is less than zero, then a value of zero is used in place of the Weight value.

           (select Gender, Value,
                   case
                      when Weight>0 then Weight
                      else 0
                   end as Weight
               from Sample)

The first, or outer, SELECT statement in the query

  • selects the Gender column

  • constructs a weighted average from the results that were retrieved by the in-line view.

The weighted average is the sum of the products of Value and Weight divided by the sum of the Weights.
select Gender, sum(Value*Weight)/sum(Weight) as WeightedAverage

Finally, the query uses a GROUP BY clause to combine the data so that the calculation is performed for each gender.

group by Gender;

Previous Page | Next Page | Top of Page