MEANS Procedure
         
         
         
         
         
         
            
         
         
         
            OUTPUT Statement
            
            Writes statistics to a new SAS data set.
            
            
            
            
            
               Syntax 
               
               
               
                  
                     Optional Arguments
                     
                     - OUT=SAS-data-set
 
                        - 
                           
names the new output
                              data set. If SAS-data-set does
                              not exist, then PROC MEANS creates it. If you omit OUT=, then the
                              data set is named DATAn, where n is
                              the smallest integer that makes the name unique.
                           
                           
                              Default:DATAn
                              Tip:You can use data set
                                 options with the OUT= option.      
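
                              As a minimal sketch of OUT= combined with a data set option, the
                              following step assumes that the SASHELP.CLASS sample data set is
                              available. The output data set name work.stats is hypothetical and
                              chosen only for illustration.

```sas
/* Sketch: write the means of two analysis variables to WORK.STATS.     */
/* SASHELP.CLASS and the name work.stats are illustrative assumptions.  */
proc means data=sashelp.class noprint;
   var height weight;
   /* DROP= is a data set option applied to the OUT= data set */
   output out=work.stats(drop=_type_ _freq_) mean=;
run;
```

                              Because no names follow MEAN=, the output variables take the names
                              of the analysis variables.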
                              
                            
                         
                     
                     - output-statistic-specification(s)
 
                        - 
                           
specifies the statistics
                              to store in the OUT= data set and names one or more variables that
                              contain the statistics. The form of the output-statistic-specification is
                              
                              
                                 statistic-keyword<(variable-list)>=<name(s)>
                                 where 
                           
 
                           - statistic-keyword
 
                              - 
                                 
specifies which statistic
                                    to store in the output data set. The available statistic keywords
                                    fall into three groups: descriptive statistics keywords, quantile
                                    statistics keywords, and hypothesis testing keywords.
                                 
                                 
                                 By default, the statistics
                                    in the output data set automatically inherit the analysis variable's
                                    format, informat, and label. However, statistics computed for N, NMISS,
                                    SUMWGT, USS, CSS, VAR, CV, T, PROBT, PRT, SKEWNESS, and KURTOSIS 
                                    will not inherit the analysis variable's format because this format
                                    might be invalid for these statistics (for example, dollar or datetime
                                    formats).
                                 
 
                                 
                               
                           
                           - variable-list
 
                              - 
                                 
specifies the names
                                    of one or more numeric analysis variables whose statistics you want
                                    to store in the output data set.
                                 
                                 
                                    Default:all numeric analysis variables
                                    
                                  
                               
                           
                           - name(s)
 
                              - 
                                 
specifies one or more
                                    names for the variables in the output data set that will contain the analysis
                                    variable statistics. The first name contains the statistic for the
                                    first analysis variable; the second name contains the statistic for
                                    the second analysis variable; and so on.
                                 
                                 
                                    Default:the analysis variable name. If you specify AUTONAME,
                                       then the default is the combination of the analysis variable name
                                       and the statistic-keyword.
                                       If you use the CLASS statement and an OUTPUT statement without an output-statistic-specification,
                                       then the output data set contains five observations for each combination
                                       of class variables: the value of N, MIN, MAX, MEAN, and STD. If you
                                       use the WEIGHT statement or the WEIGHT option in the VAR statement,
                                       then the output data set also contains an observation with the sum
                                       of weights (SUMWGT) for each combination of class variables.
                                    
                                    Interaction:If you specify variable-list,
                                       then PROC MEANS uses the order in which you specify the analysis variables
                                       to store the statistics in the output data set variables.
                                    
                                    Tip:Use the AUTONAME option to have PROC MEANS generate unique
                                       names for multiple variables and statistics.
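
                                    The output-statistic-specification forms described above can be
                                    sketched as follows. The example assumes the SASHELP.CLASS sample
                                    data set. The output data set name and the variable names
                                    MeanHeight, MeanWeight, and MaxWeight are hypothetical.

```sas
/* Sketch: three output-statistic-specifications in one OUTPUT statement. */
/* Data set and output variable names are illustrative assumptions.       */
proc means data=sashelp.class noprint;
   class sex;
   var height weight;
   output out=work.summary
          mean(height weight)=MeanHeight MeanWeight  /* explicit names, in variable-list order */
          max(weight)=MaxWeight                      /* one statistic for one variable         */
          std=                                       /* names are generated by AUTONAME        */
          / autoname;
run;
```

                                    The STD= specification omits both the variable list and the names, so
                                    it applies to all analysis variables and relies on AUTONAME to keep the
                                    resulting names distinct from the analysis variable names.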
                                    
                                    
                                  
                               
                           
                           
                         
                     
                     - id-group-specification
 
                        - 
                           
combines and extends
                              the features of the ID statement, the IDMIN option in the PROC statement,
                              and the MAXID and MINID options in the OUTPUT statement to create
                              an OUT= data set that identifies multiple extreme values. The form
                              of the id-group-specification is
                              
                              
                                 IDGROUP (<MIN|MAX (variable-list-1) <…MIN|MAX (variable-list-n)>>
                                          <<MISSING> <OBS> <LAST>> OUT <[n]> (id-variable-list)=<name(s)>)
                                 
                                  
                            
                           - MIN|MAX(variable-list)
                              
 
                              - 
                                 
specifies the selection
                                    criteria to determine the extreme values of one or more input data
                                    set variables specified in variable-list.
                                    Use MIN to determine the minimum extreme value and MAX to determine
                                    the maximum extreme value.
                                 
                                 When you specify multiple
                                    selection variables, the ordering of observations for the selection
                                    of n extremes is done the same
                                    way that PROC SORT sorts data with multiple BY variables. PROC MEANS
                                    concatenates the variable values into a single key. The MAX(variable-list)
                                    selection criterion is similar to using PROC SORT and the DESCENDING
                                    option in the BY statement.
                                 
 
                                 
                                    Default:If you do not specify MIN or MAX, then PROC MEANS
                                       uses the observation number as the selection criterion to output observations.
                                    
                                    Restriction:If you specify criteria that are contradictory, then
                                       PROC MEANS uses only the first selection criterion.
                                    
                                    Interaction:When multiple observations contain the same extreme
                                       values in all the MIN or MAX variables, PROC MEANS uses the observation
                                       number to resolve which observation to write to the output. By default,
                                       PROC MEANS uses the first observation to resolve any ties. However,
                                       if you specify the LAST option, then PROC MEANS uses the last observation
                                       to resolve any ties.
                                    
                                  
                               
                           
                           - LAST
 
                              - 
                                 
specifies that the
                                    OUT= data set contains values from the last observation (or the last n observations,
                                    if n is specified). If you
                                    do not specify LAST, then the OUT= data set contains values from the
                                    first observation (or the first n observations,
                                    if n is specified). The OUT=
                                    data set might contain several observations because in addition to
                                    the value of the last (first) observation, the OUT= data set contains
                                    values from the last (first) observation of each subgroup level that
                                    is defined by combinations of class variable values.
                                 
                                 
                                    Interaction:When you specify MIN or MAX and when multiple
                                       observations contain the same extreme values, PROC MEANS uses the
                                       observation number to resolve which observation to save to the OUT=
                                       data set. If you specify LAST, then PROC MEANS uses the later observations
                                       to resolve any ties. If you do not specify LAST, then PROC MEANS uses
                                       the earlier observations to resolve any ties.
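
                                    A minimal sketch of LAST without a MIN or MAX criterion follows, so
                                    that the observation number alone drives the selection. It assumes
                                    the SASHELP.CLASS sample data set. The names work.last_two, LastName,
                                    and LastAge are hypothetical.

```sas
/* Sketch: keep identifying values from the last two observations.  */
/* Data set and output variable names are illustrative assumptions. */
proc means data=sashelp.class noprint;
   class sex;
   var height;
   output out=work.last_two
          idgroup(last out[2](name age)=LastName LastAge);
run;
```

                                    Because a CLASS statement is present, the output data set holds one
                                    observation for the overall level and one for each level of the class
                                    variable.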
                                    
                                  
                               
                           
                           - MISSING
 
                              - 
                                 
specifies that missing
                                    values be used in selection criteria.
                                 
                                 
                               
                           
                           - OBS
 
                              - 
                                 
includes an _OBS_ variable
                                    in the OUT= data set that contains the number of the observation in
                                    the input data set where the extreme value was found.
                                 
                                 
                                    Interactions:If you use WHERE processing, then the value of
                                       _OBS_ might not correspond to the location of the observation in the
                                       input data set.
                                    
                                    If you use [n]
                                       to write multiple extreme values to the output, then PROC MEANS creates n _OBS_
                                       variables and uses the suffix n to
                                       create the variable names, where n is
                                       a sequential integer from 1 to n.
                                    
                                  
                               
                           
                           - [n]
                              
 
                              - 
                                 
specifies the number
                                    of extreme values for each variable in id-variable-list to
                                    include in the OUT= data set. PROC MEANS creates n new
                                    variables and uses the suffix _n to
                                    create the variable names, where n is
                                    a sequential integer from 1 to n.
                                 
                                 By default, PROC MEANS
                                    determines one extreme value for each level of each requested type.
                                    If n is greater than one, then n extremes
                                    are output for each level of each type. When n is
                                    greater than one and you request extreme value selection, the time
                                    complexity is $O(T \cdot N \log_2 n)$, where $T$ is the number of types requested and $N$
                                    is the number of observations in the input data
                                    set. By comparison, to group the entire data set, the time complexity
                                    is $O(N \log_2 N)$.
                                 
 
                                 
                                    Default:1
                                    
                                    Range:an integer between 1 and 100
                                    
                                    Example:To output two minimum extreme values
                                       for each variable, use idgroup(min(x) out[2](x y z)=MinX MinY MinZ);
                                       The OUT= data set contains the variables MinX_1, MinX_2, MinY_1, MinY_2,
                                       MinZ_1, and MinZ_2.
 
                                  
                               
                           
                           - (id-variable-list)
                              
 
                              - 
                                 
identifies one or more
                                    input data set variables whose values PROC MEANS includes in the OUT=
                                    data set. PROC MEANS determines which observations to output by the
                                    selection criteria that you specify (MIN, MAX, and LAST).
                                 
                                 
                                    Alias:IDGRP
                                    
                                    Requirement:You must specify the MIN|MAX selection criteria first
                                       and OUT(id-variable-list)=
                                       after the suboptions MISSING, OBS, and LAST.
                                    
                                    Tips:You can use id-group-specification to
                                       mimic the behavior of the ID statement and a maximum-id-specification or minimum-id-specification in
                                       the OUTPUT statement.
                                    
                                    When you want the output data set to contain extreme values
                                       along with other ID variables, it is more efficient to include them
                                       in the id-variable-list than
                                       to request separate statistics. For example, the statement output
                                          idgrp(max(x) out(x a b)= ); is more efficient than the
                                       statement  output idgrp(max(x) out(a b)= ) max(x)=;
                                    
                                    See:Identifying the Top Three Extreme Values with the Output Statistics
                                  
                               
                           
                           - name(s)
 
                              - 
                                 
specifies one or more
                                    names for variables in the OUT= data set. 
                                 
                                 
                                    Default:If you omit name,
                                       then PROC MEANS uses the  names of variables in the id-variable-list.
                                    
                                    Tip:Use the AUTONAME option to automatically resolve naming
                                       conflicts.
                                    
                                  
                               
                           
                           CAUTION:
                              The IDGROUP
                                 syntax enables you to create output variables with the same name.
                              
                              When this action happens,
                                 only the first variable appears in the output data set. Use the AUTONAME
                                 option to automatically resolve these naming conflicts.
                              
 
                            
                           Note: If you specify fewer new
                              variable names than the combination of  analysis variables and identification
                              variables, then the remaining output variables use the corresponding
                              names of the ID variables as soon as PROC MEANS exhausts the list
                              of new variable names. 
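
                              The required ordering inside an id-group-specification (selection
                              criterion first, then suboptions, then OUT) can be sketched as
                              follows. The example assumes the SASHELP.CLASS sample data set. The
                              names work.tallest, TallName, and TallHeight are hypothetical.

```sas
/* Sketch: identify the three tallest students overall and within each sex. */
/* Data set and output variable names are illustrative assumptions.         */
proc means data=sashelp.class noprint;
   class sex;
   var height;
   output out=work.tallest
          idgroup(max(height)                 /* selection criterion comes first    */
                  obs                         /* record source observation numbers  */
                  out[3](name height)=TallName TallHeight);
run;
```

                              Listing height in the id-variable-list returns the extreme values
                              themselves alongside the identifying variable, which avoids a
                              separate MAX(height)= specification.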
                           
 
                         
                     
                     - maximum-id-specification(s)
 
                        - 
                           
specifies that one
                              or more identification variables be associated with the maximum values
                              of the analysis variables. The form of the maximum-id-specification is
                              
                              
                                 MAXID <(variable-1 <(id-variable-list-1)> <…variable-n   <(id-variable-list-n)>>)> = name(s)
                                  
                            
                           - variable
 
                              - 
                                 
identifies the numeric
                                    analysis variable whose maximum values PROC MEANS determines. PROC
                                    MEANS can determine several maximum values for a variable because,
                                    in addition to the overall maximum value, subgroup levels, which are
                                    defined by combinations of class variable values, also have maximum
                                    values.
                                 
                                 
                                    Tip:If you use an ID statement and omit variable,
                                       then PROC MEANS uses all analysis variables.
                                    
                                  
                               
                           
                           - id-variable-list
 
                              - 
                                 
identifies one or more
                                    variables whose values identify the observations with the maximum
                                    values of the analysis variable.
                                 
                                 
                                    Default:the ID statement variables
                                    
                                  
                               
                           
                           - name(s)
 
                              - 
                                 
specifies the names
                                    for new variables that contain the values of the identification variable
                                    associated with the maximum value of each analysis variable.
                                 
                                 
                                    Note:If multiple observations contain the maximum value within
                                       a class level, then PROC MEANS saves the value of the ID variable
                                       for only the first of those observations in the output data set.
                                    
                                    Tips:If you use an ID statement and omit variable and id-variable-list,
                                       then PROC MEANS associates all ID statement variables with each analysis
                                       variable. Thus, for each analysis variable, the number of variables
                                       that are created in the output data set equals the number of variables
                                       that you specify in the ID statement.
                                    
                                    Use the AUTONAME option to automatically resolve naming
                                       conflicts. 
                                    
                                    
                                  
                               
                           
                           CAUTION:
                              The MAXID
                                 syntax enables you to create output variables with the same name.
                              
                              When this action happens,
                                 only the first variable appears in the output data set. Use the AUTONAME
                                 option to automatically resolve these naming conflicts. 
                              
 
                            
                           Note: If you specify fewer new
                              variable names than the combination of  analysis variables and identification
                              variables, then the remaining output variables use the corresponding
                              names of the ID variables as soon as PROC MEANS exhausts the list
                              of new variable names. 
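
                              A minimal sketch of a maximum-id-specification follows. It assumes
                              the SASHELP.CLASS sample data set. The names work.maxids, MaxHeight,
                              MaxWeight, TallestName, and HeaviestName are hypothetical.

```sas
/* Sketch: store each maximum together with the name of the student who has it. */
/* Data set and output variable names are illustrative assumptions.             */
proc means data=sashelp.class noprint;
   var height weight;
   output out=work.maxids
          max=MaxHeight MaxWeight
          maxid(height(name) weight(name))=TallestName HeaviestName;
run;
```

                              Each analysis variable is paired with its own id-variable-list, and
                              the names after the equal sign are assigned in the same order.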
                           
 
                         
                     
                     - minimum-id-specification
 
                        - 
                           
See the description
                              of maximum-id-specification. This option behaves in exactly the same
                              way, except that PROC MEANS determines the minimum values instead
                              of the maximum values. The form of the minimum-id-specification is
                              
                                 MINID<(variable-1 <(id-variable-list-1)> <…variable-n  <(id-variable-list-n)>>)> = name(s) 
                                 
                                      
                            
                           When MINID is used
                              without an explicit variable list, it is similar to the following
                              more advanced IDGROUP syntax example:  
                              
                                 IDGRP(min(x) missing out(id_variable)=idminx)
                                 IDGRP(min(y) missing out(id_variable)=idminy)
                                 
                                
                            
                           If one or more of the
                              analysis variables has a missing value, then the id_variable value
                              corresponds to the observation with the missing value, not to the observation
                              that holds the value of the MIN statistic.
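
                           The MINID form shown above can be sketched with a small example. The
                              data set WORK.SCORES, its variables, and the output names min_score
                              and low_scorer are hypothetical and chosen only for illustration:

```sas
/* Hypothetical input: one score per student */
data work.scores;
   input name $ score;
   datalines;
Ann 82
Bob 67
Cal 91
;
run;

proc means data=work.scores noprint;
   var score;
   /* MIN= stores the minimum score;                          */
   /* MINID(score(name))= stores the value of NAME taken from */
   /* the observation that holds the minimum of SCORE         */
   output out=work.low
          min=min_score
          minid(score(name))=low_scorer;
run;
```

                           In this sketch, low_scorer should contain the name of the student
                              with the lowest score.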
                           
 
                         
                     
                     - option
 
                        - 
                           
can be one of the following
                              items:
                           
                           - AUTOLABEL
 
                              - 
                                 
specifies that PROC
                                    MEANS appends the statistic name to the end of the variable label.
                                    If an analysis variable has no label, then PROC MEANS creates a label
                                    by appending the statistic name to the analysis variable name.
                                 
                                 
                               
                           
                           - AUTONAME
 
                              - 
                                 
specifies that PROC
                                    MEANS creates a unique variable name for an output statistic when
                                    you do not assign the variable name in the OUTPUT statement. This
                                    action is accomplished by appending the statistic-keyword to the end
                                    of the input variable name from which the statistic was derived. For
                                    example, the statement  output min(x)=/autoname; produces
                                    the x_Min variable in the output data set.  
                                 
                                 AUTONAME activates
                                    the SAS internal mechanism to automatically resolve conflicts in the
                                    variable names in the output data set. Duplicate variables will not
                                    generate errors. As a result, the statement  
output min(x)=
                                       min(x)=/autoname; produces two variables, x_Min and x_Min2,
                                    in the output data set.
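
                                 As a minimal sketch of the naming rule described above, the following
                                    step uses the SASHELP.CLASS sample data set; the output data set name
                                    WORK.STATS is an assumption made for illustration:

```sas
proc means data=sashelp.class noprint;
   var height weight;
   /* No names follow MEAN= or MIN=, so AUTONAME builds each name  */
   /* from the analysis variable name plus the statistic keyword,  */
   /* for example Height_Mean, Weight_Mean, Height_Min, Weight_Min */
   output out=work.stats mean= min= / autoname;
run;
```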
                                 
 
                                 
                               
                           
                           - KEEPLEN
 
                              - 
                                 
specifies that statistics
                                    in the output data set inherit the length of the analysis variable
                                    that PROC MEANS uses to derive them. 
                                 
                                 CAUTION:
                                    You permanently
                                       lose numeric precision when the length of the analysis variable causes
                                       PROC MEANS to truncate or round the value of the statistic. However,
                                       the precision of the statistic will match that of the input.
                                    
                                  
                               
                           
                           - LEVELS
 
                              - 
                                 
includes a variable
                                    named _LEVEL_ in the output data set. This variable contains a value
                                    from 1 to n that indicates
                                    a unique combination of the values of class variables within each
                                    value of the _TYPE_ variable.
                                 
                                 
                               
                           
                           - NOINHERIT
 
                              - 
                                 
specifies that the
                                    variables in the output data set that contain statistics do not inherit
                                    the attributes (label and format) of the analysis variables that
                                    are used to derive them.
                                 
                                 
                                    Interaction: When no option is used (implied INHERIT), the
                                       statistics inherit the attributes (label and format) of the input
                                       analysis variable(s). If the INHERIT option is used in the OUTPUT
                                       statement, then the statistics inherit the length of the input analysis
                                       variable(s) as well as the label and format.
                                    
                                    Tip: By default, the output data set includes an output variable
                                       for each analysis variable and five observations that contain
                                       N, MIN, MAX, MEAN, and STDDEV. Unless you specify NOINHERIT, each
                                       output variable inherits the format of its analysis variable, which can be
                                       invalid for the N statistic (for example, datetime formats).
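
                                    The format problem described in this tip can be sketched as follows.
                                       The data set WORK.VISITS, the variable visit_date, and the output
                                       name n_visits are hypothetical:

```sas
/* Hypothetical input: a date variable that carries the DATE9. format */
data work.visits;
   input visit_date :date9.;
   format visit_date date9.;
   datalines;
01JAN2024
15FEB2024
;
run;

proc means data=work.visits noprint;
   var visit_date;
   /* N is a count, so a date format would be misleading here.   */
   /* NOINHERIT keeps n_visits from inheriting the label and     */
   /* format of visit_date.                                      */
   output out=work.counts n=n_visits / noinherit;
run;
```

                                    Without NOINHERIT, the count would be expected to display as a
                                       date value rather than as a plain number.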
                                    
                                  
                               
                           
                           - WAYS
 
                              - 
                                 
includes a variable
                                    named _WAY_ in the output data set. This variable contains a value
                                    from 1 to the maximum number of class variables that indicates how
                                    many class variables PROC MEANS combines to create the _TYPE_ value.
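
                                 A minimal sketch that requests both the LEVELS and WAYS options, again
                                    using the SASHELP.CLASS sample data set; the names WORK.BY_CLASS and
                                    avg_height are assumptions made for illustration:

```sas
proc means data=sashelp.class noprint;
   class sex age;
   var height;
   /* LEVELS adds _LEVEL_ and WAYS adds _WAY_ next to the */
   /* automatic _TYPE_ and _FREQ_ variables               */
   output out=work.by_class mean=avg_height / levels ways;
run;

/* Inspect how _WAY_ and _LEVEL_ relate to _TYPE_ */
proc print data=work.by_class;
   var _type_ _way_ _level_ sex age avg_height;
run;
```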
                                 
                                 
                               
                           
                         
                     
                   
                
             
          
       
      
      
      
         Copyright © SAS Institute Inc. All rights reserved.