# SAS/ETS Web Examples

## Testing for Returns to Scale in a Cobb-Douglas Production Function


# Overview

A production function summarizes the conversion of inputs into outputs. For example, the production of cars using steel, labor, machinery, and plant facilities could be described as $\text{cars} = f(\text{steel}, \text{labor}, \text{machinery}, \text{plant facilities})$. Production functions can be applied to a single firm, an industry, or an entire nation. Note, however, that they describe the production of a single output, so joint production is not allowed, although multiple inputs are used. The simplest production function used frequently in economics is the Cobb-Douglas production function. This is a two-input production function that takes the form

$$q = \beta_0 K^{\beta_1} L^{\beta_2} e^{\varepsilon}$$

where output, $q$, is a function of two inputs, capital ($K$) and labor ($L$). The parameters $\beta_0$, $\beta_1$, and $\beta_2$ are positive constants estimated from empirical data, and $e^{\varepsilon}$ is a multiplicative exponential error. This type of production function is particularly useful because it is linear in logarithms and can be used to determine whether the inputs exhibit increasing, decreasing, or constant returns to scale. For example, suppose both the capital and the labor inputs are doubled. The Cobb-Douglas production function helps you determine what happens to the output. Does it also double? Does it more than double? Or does it increase by less than double?

Formally, for constant returns to scale, $\beta_1 + \beta_2 = 1$. That is, if both of the inputs, capital and labor, are increased by a factor of $m$ (with $m > 1$), then output also increases by a factor of $m$. For increasing returns, if both capital and labor are increased by a factor of $m$, then output increases by a factor greater than $m$. In this case, $\beta_1 + \beta_2 > 1$. The opposite is true for decreasing returns: if both capital and labor are increased by a factor of $m$, then output increases by a factor less than $m$, such that $\beta_1 + \beta_2 < 1$.
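The link between the sum of the exponents and returns to scale can be made explicit with a short derivation. The sketch below assumes the Cobb-Douglas form $q = \beta_0 K^{\beta_1} L^{\beta_2} e^{\varepsilon}$, writes the production function as $f(K, L)$, and scales both inputs by a factor $m$; the symbols $f$ and $m$ are notation chosen for this illustration.

```latex
\begin{aligned}
f(mK, mL) &= \beta_0 (mK)^{\beta_1} (mL)^{\beta_2} e^{\varepsilon} \\
          &= m^{\beta_1 + \beta_2}\, \beta_0 K^{\beta_1} L^{\beta_2} e^{\varepsilon} \\
          &= m^{\beta_1 + \beta_2}\, q
\end{aligned}
```

Output therefore scales by $m^{\beta_1 + \beta_2}$, which equals $m$ only when $\beta_1 + \beta_2 = 1$. As a purely hypothetical illustration, doubling both inputs ($m = 2$) with $\beta_1 + \beta_2 = 1.1$ would multiply output by $2^{1.1}$, or about 2.14.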

The U.S. Census Bureau conducts an Annual Survey of Manufactures in each of the four years between economic censuses. In the 2001 Geographic Area Statistics, this survey is broken down by each of the 50 states and the District of Columbia. The data consist of labor hours as the estimate for labor input, total capital expenditures as the estimate for capital input, and value added as the estimate for output. Labor, capital, and output data are from 1997, the most recent year for which complete data are available.

The following sections of code demonstrate how to manipulate input data into usable formats. You use PROC TRANSPOSE first to transpose the data into a form that is usable for your regression and then modify the data further to make it usable for your graph. The code for creating a graph with two vertical axes is also displayed, followed by a plot of the data in Figure 1.1.

   data census_data;
input state $ type $ am;
datalines;
AL q 28762420000
AL K 2818886000
AL L 544806000
AK q 1153433000
AK K 97292000
AK L 18942000
AZ q 27900974000
AZ K 2593261000
AZ L 245413000

... more lines ...

WI L 829277000
WY q 1030583000
WY K 83569000
WY L 12209000
;
run;

   ** Manipulate the data for running regression and for plotting data **;
proc sort data=census_data;
by state;
run;

** reg_data is the data set you use for running the regression **;
proc transpose data=census_data out=reg_data
(rename = (Col1=q Col2=K Col3=L));
var am;
by state;
run;

data newdata;
set census_data;
** Labor is plotted on the second axis from reg_data, so drop its rows here **;
if type='L' then delete;
lnam=log(am);
** FORMAT is a declarative statement and applies to every row that is kept **;
format lnam dollar15.2;
run;
** plot_data is the data set you use for plotting the data **;
data plot_data;
merge newdata (in=a) reg_data (in=b);
by state;
if a and b;
lnl=round(log(L),.01);
keep state type lnam lnl;
run;
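The PROC TRANSPOSE step above renames the output columns by position, which relies on the rows for each state always appearing in the order q, K, L. A minimal alternative sketch, assuming `census_data` is already sorted by state, uses the ID statement so that the new columns take their names from the values of `type`. The data set name `reg_data_id` is hypothetical and is used only for this illustration.

```sas
** Sketch: name the transposed columns from TYPE instead of by position **;
** reg_data_id is a hypothetical data set name used only in this example **;
proc transpose data=census_data out=reg_data_id (drop=_name_);
   by state;
   id type;
   var am;
run;

** Inspect the first few rows to confirm the columns q, K, and L **;
proc print data=reg_data_id (obs=5);
run;
```

With this variant, a change in row order within a state would not silently swap the capital and labor columns.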

   ** Create graph **;
proc gplot data=plot_data;
plot lnam*state=type / nolegend haxis=axis1 vaxis=axis2;
plot2 lnl*state / vaxis=axis3;
symbol1 c=blue  i=join v=star;
symbol2 c=green i=join v=dot;
symbol3 c=red   i=join v=plus ;
axis1 label=('State');
axis2 label=(angle=90 'Value Added, Capital (in log dollars)')
order=(0 to 30 by 5);
axis3 label=(angle=90 'Labor (in log hours)')
order=(0 to 30 by 5);
title "1997 Production by State";
footnote1 c=blue  '     *  Capital'
c=red   '     +  Labor'
c=green '     .  Value Added';
footnote2 j=l 'Source: U.S. Census Bureau, Annual Survey of Manufactures,';
footnote3 j=l '        2001 Geographic Area Statistics';
run;
quit;
title; footnote;


Figure 1.1 1997 Production by State

# Analysis

Using the 1997 U.S. Census Bureau data, you can test for the three types of returns to scale based on the Cobb-Douglas production function with both F tests and t tests.

## Conducting an F test for Constant Returns to Scale

In this example, you test the simplest case to determine whether the model has constant returns to scale. Beginning with the general form of the Cobb-Douglas equation, take the natural log of both sides of the equation and define the regression equation:

$$\ln(q) = \ln(\beta_0) + \beta_1 \ln(K) + \beta_2 \ln(L) + \varepsilon$$

You are interested in determining whether the function exhibits constant returns to scale, and you test this against the alternative that the returns are not constant. Hence,

$$H_0: \beta_1 + \beta_2 = 1 \qquad H_a: \beta_1 + \beta_2 \neq 1$$

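You can test this hypothesis using SAS by first creating the log variables and then using PROC REG to conduct an F test. Traditionally, you need to create both a full and a reduced model, where the full model regresses $\ln(q)$ on $\ln(K)$ and $\ln(L)$. The reduced model restates the hypothesis as $\beta_2 = 1 - \beta_1$ and substitutes the new value for $\beta_2$ into the full model. Solving for the reduced model, you get the following:

$$\ln(q) = \ln(\beta_0) + \beta_1 \ln(K) + (1 - \beta_1)\ln(L) + \varepsilon$$

$$\ln(q) - \ln(L) = \ln(\beta_0) + \beta_1 \left[\ln(K) - \ln(L)\right] + \varepsilon$$

Thus, you perform the simple linear regression of $\ln(q) - \ln(L)$ on $\ln(K) - \ln(L)$ as the reduced model using the MODEL statement.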
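For readers who want to see the reduced model fit by hand, the following is a minimal sketch. It assumes the restriction is imposed by substituting $\beta_2 = 1 - \beta_1$, so that the log of output per labor hour is regressed on the log of capital per labor hour. It reads the `reg_data` data set created earlier; the names `reduced`, `y_r`, and `x_r` are hypothetical and introduced only for this illustration.

```sas
** Sketch: fit the reduced model explicitly, imposing b1 + b2 = 1 **;
** reduced, y_r, and x_r are hypothetical names for this example  **;
data reduced;
   set reg_data;
   y_r = log(q) - log(L);
   x_r = log(K) - log(L);
run;

proc reg data=reduced;
   REDUCED: model y_r = x_r;
run;
quit;
```

The error sum of squares from this fit, compared with the error sum of squares from the full model, is what the F test for a single restriction is built from.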

However, PROC REG can calculate the reduced model for you by using a TEST statement. After the full model is defined, the conditions for the F test are specified in the TEST statement, and SAS automatically calculates the reduced model necessary to conduct the test. In this case, the condition you use echoes the hypothesis that the sum of the betas equals one. Note that the variable names in the TEST statement correspond to the regressors and that each variable name represents its estimated coefficient.

   ** Create log variables **;
data model (drop=_name_);
set reg_data;
y  = log(q);
x1 = log(K);
x2 = log(L);
run;

** Run regression **;
proc reg data=model;
MODEL_1: model y = x1 x2;
F_TEST: test x1+x2=1;
run;
quit;


The results are given in two separate tables, as shown in Figure 1.2 and Figure 1.3.

Figure 1.2 Fit Statistics and Parameter Estimates

The REG Procedure
Model: MODEL_1
Dependent Variable: y

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 2 | 102.86112 | 51.43056 | 1152.36 | <.0001 |
| Error | 48 | 2.14227 | 0.04463 | | |
| Corrected Total | 50 | 105.00339 | | | |

| Root MSE | Dependent Mean | Coeff Var | R-Square | Adj R-Sq |
|---|---|---|---|---|
| 0.21126 | 23.6025 | 0.89507 | 0.9796 | 0.9787 |

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| Intercept | 1 | 2.46466 | 0.46928 | 5.25 | <.0001 |
| x1 | 1 | 0.66876 | 0.05930 | 11.28 | <.0001 |
| x2 | 1 | 0.36535 | 0.05535 | 6.60 | <.0001 |

Figure 1.3 F test Results

The REG Procedure
Model: MODEL_1

Test F_TEST Results for Dependent Variable y

| Source | DF | Mean Square | F Value | Pr > F |
|---|---|---|---|---|
| Numerator | 1 | 0.11003 | 2.47 | 0.1230 |
| Denominator | 48 | 0.04463 | | |

In the F test results, you find a $p$-value of 0.1230. If the $p$-value is less than the chosen significance level, then you reject the null hypothesis in favor of the alternative. If the $p$-value is greater than the chosen significance level, then there is insufficient evidence to reject the null hypothesis. If you assume a significance level of 0.05 for this example, then $0.1230 > 0.05$, and you fail to reject the null hypothesis. The data are therefore consistent with constant returns to scale.
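As a quick check on the F test output, the F value can be reproduced from the two mean squares that PROC REG reports for the test. The following is a worked calculation using the published figures, not additional SAS output; the subscripts are labels chosen for this illustration.

```latex
F = \frac{\mathrm{MS}_{\text{numerator}}}{\mathrm{MS}_{\text{denominator}}}
  = \frac{0.11003}{0.04463}
  \approx 2.47,
\qquad \text{with } (1,\ 48) \text{ degrees of freedom}
```

The denominator is the error mean square of the full model, which is why it matches the Error line of the Analysis of Variance table.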

## Conducting a t test for Non-Constant Returns to Scale

If you want to perform a specific test for either increasing or decreasing returns to scale, then you need to use a one-sided t test. In the case of increasing returns, you test the following hypothesis and alternative:

$$H_0: \beta_1 + \beta_2 = 1 \qquad H_a: \beta_1 + \beta_2 > 1$$

The case of decreasing returns works similarly. You test the following hypothesis and alternative:

$$H_0: \beta_1 + \beta_2 = 1 \qquad H_a: \beta_1 + \beta_2 < 1$$

You use the same MODEL statement that you used in the preceding F test to conduct this t test for increasing returns to scale. However, you also ask PROC REG to return a covariance matrix by specifying the COVB option on the MODEL statement. This enables you to compute the standard error of a linear combination of parameter estimates. The necessary code and its results follow in Figure 1.4:

   ** t Test Regression **;
proc reg data=model;
T_TEST: model y = x1 x2 / covb;
run;
quit;


Figure 1.4 Model Results and Covariance Matrix

The REG Procedure
Model: T_TEST
Dependent Variable: y

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 2 | 102.86112 | 51.43056 | 1152.36 | <.0001 |
| Error | 48 | 2.14227 | 0.04463 | | |
| Corrected Total | 50 | 105.00339 | | | |

| Root MSE | Dependent Mean | Coeff Var | R-Square | Adj R-Sq |
|---|---|---|---|---|
| 0.21126 | 23.6025 | 0.89507 | 0.9796 | 0.9787 |

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| Intercept | 1 | 2.46466 | 0.46928 | 5.25 | <.0001 |
| x1 | 1 | 0.66876 | 0.05930 | 11.28 | <.0001 |
| x2 | 1 | 0.36535 | 0.05535 | 6.60 | <.0001 |

Covariance of Estimates

| Variable | Intercept | x1 | x2 |
|---|---|---|---|
| Intercept | 0.2202250596 | -0.015334657 | 0.0054015337 |
| x1 | -0.015334657 | 0.0035162514 | -0.003054157 |
| x2 | 0.0054015337 | -0.003054157 | 0.0030639604 |

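You construct the t statistic using the results such that

$$t = \frac{\hat{\beta}_1 + \hat{\beta}_2 - 1}{\operatorname{se}(\hat{\beta}_1 + \hat{\beta}_2)}$$

where the standard error (se) of the sum of the parameter estimates, $\hat{\beta}_1 + \hat{\beta}_2$, is computed as

$$\operatorname{se}(\hat{\beta}_1 + \hat{\beta}_2) = \sqrt{\operatorname{Var}(\hat{\beta}_1) + \operatorname{Var}(\hat{\beta}_2) + 2\operatorname{Cov}(\hat{\beta}_1, \hat{\beta}_2)}$$

by using the results in the covariance matrix.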
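To make the calculation concrete, the following worked example plugs the rounded values from Figure 1.4 into the standard-error and t-statistic formulas. Because the inputs are rounded, the final digits can differ slightly from what SAS computes at full precision.

```latex
\begin{aligned}
\operatorname{se}(\hat{\beta}_1 + \hat{\beta}_2)
  &= \sqrt{0.0035162514 + 0.0030639604 + 2(-0.003054157)} \\
  &= \sqrt{0.0004718978} \approx 0.02172 \\[4pt]
t &= \frac{0.66876 + 0.36535 - 1}{0.02172}
   = \frac{0.03411}{0.02172} \approx 1.570
\end{aligned}
```

The negative covariance between the two estimates is what makes the standard error of their sum much smaller than either individual standard error.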

You can either compute the t statistic by hand, using the parameter estimates, the covariance matrix, and the preceding equations, or you can use the OUTEST= option in the PROC REG statement to achieve the same result. The following code demonstrates how to use the OUTEST= option, together with the COVOUT and EDF options, and its results are shown in Figure 1.5.

   ** t Test Regression with OUTEST= option **;
proc reg data=model outest=est covout edf;
T_TEST_OUT: model y = x1 x2 / covb;
run;
quit;
** Calculate t statistic**;
** Keep the parameter estimates, error degrees of freedom, **;
** variances, and covariance                               **;
data parms;
set est (where=(_type_='PARMS'));
keep x1 x2 _edf_;
run;
data varx1;
set est (where=(_name_='x1'));
varx1=x1;
covx1x2=x2;
keep varx1 covx1x2;
run;
data varx2;
set est (where=(_name_='x2'));
varx2=x2;
keep varx2;
run;
** Calculate the standard error for the denominator, **;
** the t statistic, and the p-values                 **;
data ttest;
merge parms varx1 varx2;
se_x1x2 = sqrt(varx1 + varx2 + (2*covx1x2));
t = (x1 + x2 - 1)/se_x1x2;
p_inc = 1 - probt(t,_edf_);
p_dec = probt(t,_edf_);
run;

      ** Print and view the results **;
proc print data=ttest;
var t _edf_ p_inc p_dec;
title "t statistic and p-values for model";
run;


Figure 1.5 Model Results

t statistic and p-values for model

| Obs | t | _EDF_ | p_inc | p_dec |
|---|---|---|---|---|
| 1 | 1.57014 | 48 | 0.061476 | 0.93852 |

As with the F test, you compare the $p$-values with your chosen level of significance. Assuming a significance level of 0.05, you first compare the $p$-value for increasing returns to scale. If the $p$-value is greater than the chosen significance level, then there is insufficient evidence to reject the null hypothesis. Since $0.0615 > 0.05$, you fail to reject the null hypothesis. Similarly, you compare the $p$-value for decreasing returns to scale with your chosen level of significance. Since $0.9385 > 0.05$, you again fail to reject the null hypothesis. Neither one-sided test rejects constant returns to scale, which agrees with the result of the F test.
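The t test and the F test are two views of the same single linear restriction, so their results can be cross-checked. For one restriction, the square of the t statistic equals the F statistic, and doubling the smaller one-sided $p$-value gives the two-sided $p$-value. The check below uses the reported values from Figure 1.3 and Figure 1.5.

```latex
\begin{aligned}
t^2 &= (1.57014)^2 \approx 2.465 \approx F = 2.47 \\
2 \times p_{\text{inc}} &= 2 \times 0.061476 = 0.122952 \approx 0.1230
\end{aligned}
```

The small differences come only from rounding in the printed output.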

# References

Mendenhall, William and Terry Sincich (2003), A Second Course in Statistics: Regression Analysis, Sixth Edition, New Jersey: Pearson Education Inc.

Nicholson, W. (1992), Microeconomic Theory: Basic Principles and Extensions, Fifth Edition, Fort Worth: Dryden Press.

U.S. Census Bureau (2003), "Annual Survey of Manufactures, 2001 Geographic Area Statistics," http://www.census.gov/mcd/asm-as3.html.

Wooldridge, Jeffrey M. (2003), Introductory Econometrics: A Modern Approach, Second Edition, Ohio: South-Western.

Zellner, Arnold (1971), An Introduction to Bayesian Inference in Econometrics, New York: Wiley.