The BOXPLOT statement generates a table that contains the values needed to draw a box plot. It does not draw the plot itself.
Examples: Retrieving Box Values
specifies one or more numeric variables. If you do not specify this option, then all numeric variables in the table are used.
specifies that the levels of the GROUPBY variables are to be arranged in descending order.
Alias | DESC |
specifies the formats for the GROUPBY= variables. If you do not specify the FORMATS= option, or if you omit the entry for a GROUPBY variable, the default format is applied for that variable.
Example:
proc imstat data=lasr1.table1;
   boxplot x / groupby=(a b) formats=("8.3", "$10");
quit;
specifies a list of variable names, or a single variable name, to use as GROUPBY variables in the order of the grouping hierarchy. If you do not specify any GROUPBY variable names, then the calculation is performed across the entire table, possibly subject to a WHERE clause.
specifies the maximum number of levels in a GROUPBY set. When the software determines that there are at least n levels in the GROUPBY set, it abandons the action, returns a message, and does not produce a result set. You can specify the GROUPBYLIMIT= option if you want to avoid creating excessively large result sets in GROUPBY operations.
specifies the number of bins to create when a numeric GROUPBY variable exceeds the MERGELIMIT=n specification. If you specify a MERGELIMIT, but do not specify a value for the MERGEBINS= option, the server automatically calculates the number of bins.
specifies that when the number of unique values in a numeric GROUPBY variable exceeds n, the variable is automatically binned and the GROUPBY structure is determined based on the binned values of the variable, rather than the unique formatted values.
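As a rough illustration of how the grouping controls described above might be combined, the following sketch reuses the table and variable names from the FORMATS= example. The variable age and all of the numeric limits are hypothetical values chosen only for illustration.

```sas
proc imstat data=lasr1.table1;
   /* abandon the request if the GROUPBY set reaches 1000 levels (hypothetical limit) */
   boxplot x / groupby=(a b) groupbylimit=1000;

   /* bin the numeric GROUPBY variable age (hypothetical) into 5 bins
      when it has more than 20 unique values */
   boxplot x / groupby=(age) mergelimit=20 mergebins=5;
quit;
```

If MERGEBINS= is omitted from the second request, the server calculates the number of bins automatically.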
specifies the number of bins for reporting outliers. The default number of bins is 10 if you do not specify an NOUTLIERBINS= value, but do specify the OUTLIERS option. Specifying a nonzero value for NOUTLIERBINS= implies the specification of the OUTLIERS option.
Alias | NOUTBINS= |
Default | 10 |
specifies the largest number of outliers to be returned. If you request outliers with the OUTLIERS option and you specify a NOUTLIERLIMIT= value, the server returns the actual outlier values rather than the binned values. Specifying a nonzero value for NOUTLIERLIMIT= implies the specification of the OUTLIERS option.
Alias | NOUTLIMIT= |
specifies that outliers are included in the computations and results. If the NOUTLIMIT=n option is specified, then the server returns up to n outliers on the high and low ends of the distribution. Otherwise, outliers are binned into NOUTLIERBINS=b bins.
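The three outlier options interact, so a short sketch may help. It reuses the table and variable from the FORMATS= example; the bin count and the limit are hypothetical values.

```sas
proc imstat data=lasr1.table1;
   /* outliers reported in the default number of bins (10) */
   boxplot x / outliers;

   /* outliers reported in 5 bins; a nonzero NOUTLIERBINS= implies OUTLIERS */
   boxplot x / noutlierbins=5;

   /* up to 20 actual outlier values at the high and low ends, not binned */
   boxplot x / outliers noutlierlimit=20;
quit;
```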
When you specify this option and the table is partitioned, the results are calculated separately for each value of the partition key. In other words, the partition variables function as automatic GROUPBY variables. This mode of executing calculations by partition is more efficient than using the GROUPBY= option. With a partitioned table, the server takes advantage of knowing that observations for a partition cannot be located on more than one worker node.
statement / partition="F 11";     /* passed directly to the server */
statement / partition="F","11";   /* composed by the procedure */
Alias | PART= |
specifies that the ordering of the GROUPBY variables is based on the raw values of the variables instead of the formatted values.
saves the result table so that you can use it in other IMSTAT procedure statements like STORE, REPLAY, and FREE. The value for table-name must be unique within the scope of the procedure execution. The name of a table that has been freed with the FREE statement can be used again in subsequent SAVE= options.
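A minimal sketch of saving a result and reusing it follows. The table name boxtab is hypothetical, and the sketch assumes that the REPLAY and FREE statements accept the saved table name directly; check the documentation for those statements for their exact syntax.

```sas
proc imstat data=lasr1.table1;
   /* save the box plot result under the hypothetical name boxtab */
   boxplot x / groupby=(a) save=boxtab;

   /* display the saved result again (assumed statement form) */
   replay boxtab;

   /* release the saved result; the name boxtab can then be reused */
   free boxtab;
quit;
```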
requests that the server estimate the size of the result set. When you specify the SETSIZE option, the procedure does not create a result table. Instead, it writes to the SAS log the number of rows that the request would return and the expected memory consumption for the result set (in KB). See the following log sample:
NOTE: The LASR Analytic Server action request for the STATEMENT
statement would return 17 rows and approximately
3.641 kBytes of data.
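A request that could produce a note like the one above might look as follows. This is only a sketch that reuses the table and variables from the FORMATS= example; the row count and size that the server reports depend on the data.

```sas
proc imstat data=lasr1.table1;
   /* estimate the result size only; no result table is created */
   boxplot x / groupby=(a b) setsize;
quit;
```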
specifies either a quoted string that contains the SAS expression that defines the temporary variables or a file reference to an external file with the SAS statements.
Alias | TE= |
specifies the list of temporary variables for the request. Each temporary variable must be defined through SAS statements that you supply with the TEMPEXPRESS= option.
Alias | TN= |
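The two temporary-variable options are used together, as in the following sketch. It assumes that the option whose alias is TN= is spelled TEMPNAMES= and that it accepts a parenthesized list. The temporary variable ratio and the variable y are hypothetical.

```sas
proc imstat data=lasr1.table1;
   /* ratio is a hypothetical temporary variable defined by the
      quoted SAS statement and then used as the analysis variable */
   boxplot ratio / tempnames=(ratio) tempexpress="ratio = x / y;";
quit;
```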