Knowledge Base


TS-698

Storing Graphs Generated with BY-Group Processing in Separate Files

15MAR04


When exporting a single graph to a specific file, you can name the output file using a FILENAME statement. If you produce several graphs from a single procedure using a BY statement, they will each be written to that same file. Some formats, such as CGM and GIF, can store multiple images in the same file, so with BY-group processing you could append multiple graphs to the same output file. However, some programs can only import one image from such a file.

Other formats, like PNG and JPEG, were designed to store only a single image. When using these formats, each graph that is created from the BY statement will replace the one before it. For these reasons, you may want to store graphs generated from BY-group processing in separate files.

In SAS 7 and above, you can write the graphs to separate files by specifying an aggregate file storage location on the FILENAME statement. The aggregate storage location is typically a directory or a partitioned data set. Each graph is automatically written to a separate file or member in this location. The names of the files are derived from the corresponding graphics catalog (GRSEG) entries.

For more explicit control over the file names, you can create the output using a macro loop instead. The macro can use the value of the BY variable to subset the data and name the output file. This method is also useful in releases prior to SAS 7 where the aggregate file storage method is not available. Examples of both methods are given below.

This document replaces TS-411, "How can I get graphs generated with BY-group processing stored in separate CGM files".

I. Using Aggregate File Storage Location

Each time you create a graph with SAS/GRAPH, a GRSEG entry is created in a graphics catalog. With the aggregate file storage location method, file names are derived from these GRSEG entry names, while the file extension is determined by the device driver. The GRSEG entry name is defined automatically from the procedure name or manually by the NAME= option. If NAME= is not specified, the procedure name is used.

A. Default Names

The following code produces ten graphs, one for each value of REGION. The GRSEG entries are named GCHART, GCHART1, GCHART2, and so on.

proc sort data=sashelp.shoes out=sorted;
by region;
run;
goptions reset=all device=gif gsfname=output gsfmode=replace;
filename output "c:\";
proc gchart data=sorted;
by region;
vbar product / sumvar=inventory;
run;quit;

The code also produces ten GIF files in the C:\ directory. The GIF files are named after their corresponding GRSEG entries: GCHART.GIF, GCHART1.GIF, GCHART2.GIF, and so on. If a different procedure or a different device driver were used, the file names and extensions would change accordingly.
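Because the file names follow the GRSEG entry names, it can help to list the entries in the catalog before or after running the example. The following is a minimal sketch, assuming the graphs were written to the default catalog WORK.GSEG; it uses the CATALOG procedure, which the example above does not.

```sas
/* List the GRSEG entries in the default graphics catalog.     */
/* Assumes the graphs were written to WORK.GSEG (the default). */
proc catalog catalog=work.gseg entrytype=grseg;
 contents;
run;quit;
```

The entry names shown in the listing should correspond to the names of the exported files.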

The FILENAME statement in this example works on hosts with directory-based file systems, such as Windows and UNIX. You may need to change the FILENAME statement if you are on another host. For example, on MVS you might change the FILENAME statement to:

filename output "userid.pdsename"
disp=new space=(trk,(5,1)) recfm=vb;

The members of this partitioned data set would be USERID.PDSENAME.GCHART, USERID.PDSENAME.GCHART1, USERID.PDSENAME.GCHART2, and so on. If the files are transferred to another system, the appropriate extension should be added at that time.

B. The NAME= option

The NAME= option can be used to set the GRSEG entry name, and therefore the name of the exported file. For example, suppose the VBAR statement from the above example is modified as follows:

vbar product / sumvar=inventory name='Region';

The output graphs are now named REGION.GIF, REGION1.GIF, REGION2.GIF, and so on.

The value of the NAME= option is limited to eight characters or fewer. It cannot use the #BYVAL parameter to automatically name the graphs according to the value of the BY variable.

C. Restart numbering to replace files

You may have noticed that the catalog entry names are incremented for each graph generated. Because the GRSEG entry names change, the exported files are not replaced each time the procedure executes. For example, if you run the example in I.A above to produce GCHART.GIF through GCHART9.GIF, and then rerun the code in the same SAS session without deleting the first set of graphs, the next set of graphs will be GCHART10.GIF to GCHART19.GIF.

To have the same names used each time the program is executed in a given SAS session, you can delete the previous catalog entries using the GREPLAY procedure. You can delete all the entries in the catalog, or delete specific entries by name or by number. Place the GREPLAY code at the beginning of your program to delete the previous entries before creating the new ones. By default, graphs are written to the WORK.GSEG catalog.

1. Deleting all entries in the catalog

proc greplay igout=gseg nofs;
delete _all_;
run;quit;

This code deletes all the graphs in the default catalog WORK.GSEG.

2. Deleting entries by name

proc greplay igout=gseg nofs;
delete gchart gchart1 gchart2;
run;quit;

This code deletes the graphs named GCHART, GCHART1, and GCHART2 from the default catalog WORK.GSEG.

3. Deleting entries by number

proc greplay igout=gseg nofs;
delete 1 2 3;
run;quit;

This code deletes the first three entries in the default catalog WORK.GSEG.

You can also specify a range of values:

proc greplay igout=gseg nofs;
delete 1 to 5;
run;quit;

This code deletes the first five entries, GCHART, GCHART1, GCHART2, GCHART3, and GCHART4.
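As a sketch of how these pieces fit together, the following places the GREPLAY step ahead of the example from I.A so that each run in the same session reuses the names GCHART.GIF, GCHART1.GIF, and so on. The CEXIST check and the macro name CLEARGSEG are additions made here, on the assumption that WORK.GSEG may not exist yet in a fresh session; they are not part of the original examples.

```sas
/* Hypothetical helper: delete earlier graphs only if the catalog exists */
%macro cleargseg;
 %if %sysfunc(cexist(work.gseg)) %then %do;
  proc greplay igout=work.gseg nofs;
   delete _all_;
  run;quit;
 %end;
%mend cleargseg;
%cleargseg;

/* Example I.A, unchanged: entry names start again at GCHART */
proc sort data=sashelp.shoes out=sorted;
 by region;
run;
goptions reset=all device=gif gsfname=output gsfmode=replace;
filename output "c:\";
proc gchart data=sorted;
 by region;
 vbar product / sumvar=inventory;
run;quit;
```

If the catalog is known to exist, the GREPLAY step can be run directly, as in the examples above.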

II. Using a Macro

You may want to use a macro to write multiple graphs generated by the same procedure to separate files if you are running a release prior to SAS 7, if you want file names longer than eight characters, or if you want the file name to contain the BY value. The basic steps are as follows:

  1. Sort the data.
  2. Use a DATA step to count the number of BY values, and create a macro variable for each BY value.
  3. Use a %DO loop in the macro to run the procedure for each BY value.
  4. In the macro loop, use the macro variables on a WHERE clause in the procedure to subset the data for each BY value.
  5. In the macro loop, use the macro variables on the FILENAME statement to name the file.

Below are two examples of naming the output files using the BY-value. The first example uses one BY-value; the second example is modified to produce file names from two BY-values.

A. Creating file names from one BY-value

This example produces a separate graph for each REGION from the data set SASHELP.SHOES. The files are stored in the GIF format and named for the REGION they represent. Each numbered line is described below.

proc sort data=sashelp.shoes out=sortreg;
 by region;
run;
data _null_;
 set sortreg end=last;
 by region;
 if first.region then do;                                                 /* 1 */
  if index(region,'/') GT 0 then substr(region,index(region,'/'),1)='_';  /* 2 */
  count+1;                                                                /* 3 */
  call symput('value'||left(put(count,5.)),trim(region));                 /* 4 */
 end;
 if last then call symput('num',put(count,5.));                           /* 5 */
run;
goptions device=gif gsfname=grafout gsfmode=replace;                      /* 6 */
%macro outfile1;
 %do i=1 %to &num;                                                        /* 7 */
  filename grafout "c:\&&value&i...gif";                                  /* 8 */
  title1 height=1 "Sales in &&value&i";                                   /* 9 */
  proc gchart data=sortreg;
   /* 10: TRANSLATE applies the same slash-to-underscore change as step 2, */
   /* so the region whose stored value was altered still matches           */
   where translate(region,'_','/') = "&&value&i";
   vbar product / sumvar=sales;
  run;quit;
  filename grafout clear;
 %end;
%mend;
%outfile1;

The code does the following:

1. The statements in the DO group are executed only once for each region.
2. The region "Central America/Caribbean" contains the forward slash character "/", which is invalid or can be misinterpreted by the FILENAME statement on some systems. The INDEX and SUBSTR functions find this character and change it to an underscore "_".
3. The number of unique BY values is accumulated in the COUNT variable.
4. Assign each BY value to a unique macro variable. The COUNT value is used so the macro variables are given names such as &VALUE1, &VALUE2, etc.
5. On the last observation in the data set, store the total number of BY values in the NUM macro variable. The variable LAST is not an automatic variable; it was defined on the SET statement with the END= option.
6. Set the device driver and GSFNAME.
7. This loop will execute once for each BY value.
8. Set the FILENAME using the BY value. The reference "&&value&i" first resolves &i to produce a macro variable reference, such as &VALUE1. Then this macro variable is resolved to the corresponding REGION.

Notice that with this method the extension is not appended automatically to the file name; if you change the DEVICE, you may need to change the extension to match. On the FILENAME statement, three dots are used between the name and extension: "&&value&i...gif". While the reference is resolving, the macro processor absorbs the first two dots as delimiters: one ends &i and one ends the resulting &VALUEn reference. After the macro variable is resolved, one dot remains, so the file names still contain the ".gif" extension.

The FILENAME statement in this example works on hosts with directory-based file systems, such as Windows and UNIX. You may need to tailor this statement for your host. For example, for MVS you might change the FILENAME statement to:

filename grafout "userid.&&value&i...gif"
disp=new space=(trk,(5,1)) recfm=vb;

9. Writes the current region in the TITLE.
10. Subset the data for this region. Because REGION is a character variable, the macro variable reference is made in quotes. If the variable you specify on the WHERE clause is a numeric variable, do not place the macro variable reference in quotes.
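To illustrate the numeric case mentioned in item 10, here is a minimal sketch that applies the same technique to a numeric BY variable. It assumes the SASHELP.CLASS data set and its numeric variable AGE; the data set SORTAGE, the macro name OUTAGE, and the macro variable prefix AGE are hypothetical names chosen for this illustration.

```sas
proc sort data=sashelp.class out=sortage;
 by age;
run;
data _null_;
 set sortage end=last;
 by age;
 if first.age then do;
  count+1;
  /* Convert the numeric value to text and strip blanks before storing it */
  call symput('age'||left(put(count,5.)),trim(left(put(age,best8.))));
 end;
 if last then call symput('num',put(count,5.));
run;
goptions device=gif gsfname=grafout gsfmode=replace;
%macro outage;
 %do i=1 %to &num;
  filename grafout "c:\age&&age&i...gif";
  title1 "Average Height at Age &&age&i";
  proc gchart data=sortage;
   /* AGE is numeric, so the macro variable reference is not quoted */
   where age = &&age&i;
   vbar sex / sumvar=height type=mean;
  run;quit;
  filename grafout clear;
 %end;
%mend outage;
%outage;
```

The only differences from the character case are the conversion of the value with PUT before CALL SYMPUT and the unquoted reference on the WHERE statement.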

This example would create the following files:

Africa.gif
Asia.gif
Canada.gif
Central America_Caribbean.gif
Eastern Europe.gif
Middle East.gif
Pacific.gif
South America.gif
United States.gif
Western Europe.gif
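The way the dots in the FILENAME statement of this example are consumed can be checked in the SAS log with %PUT. This is a small sketch; the %LET statements stand in for the values that the DATA step and the %DO loop would normally supply.

```sas
/* Stand-ins for the values created by CALL SYMPUT and the %DO loop */
%let value1=Africa;
%let i=1;

/* Three dots: two are absorbed as delimiters, one remains */
%put &&value&i...gif;   /* writes Africa.gif to the log */

/* Two dots: both are absorbed, so the extension separator is lost */
%put &&value&i..gif;    /* writes Africagif to the log */
```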

B. Creating file names from two BY-values

The logic for creating the FILENAME from two BY-values is very similar. The data is sorted by both BY-values, and different macro variables are created for each BY-value for the WHERE and FILENAME statements. For each unique combination of BY-values, a different set of macro variables is created. Some of these macro variables will resolve to the same value, but this method makes the macro processing easier: only one macro loop is needed. Each numbered line is described below.

data subset;
 set sashelp.shoes;
 if index(region,'Europe') GT 0;                                            /* 1 */
run;
proc sort data=subset nodupkey out=sortset;                                 /* 2 */
 by product region;
run;
data _null_;
 set sortset end=last;
 by product region;
 count+1;                                                                   /* 3 */
 call symput('product'||left(put(count,5.)),trim(product));
 call symput('region'||left(put(count,5.)),trim(region));
 if last then call symput('num',put(count,5.));
run;
goptions device=gif gsfname=grafout gsfmode=replace;
%macro outfile2;
 %do i=1 %to &num;                                                          /* 4 */
  filename grafout "c:\&&product&i.._&&region&i...gif";                     /* 5 */
  title1 "Number of Stores Carrying ""&&product&i"" Style in &&region&i";   /* 5 */
  axis1 value=(a=0 r=0);
  proc gchart data=subset;                                                  /* 6 */
   where product="&&product&i" and region="&&region&i";                     /* 5 */
   vbar subsidiary / sumvar=stores sum maxis=axis1;
  run;quit;
 %end;
%mend;
%outfile2;

The code does the following:

1. The data is subset so only two regions, "Eastern Europe" and "Western Europe", are processed. This example produces sixteen graphs from the SUBSET data set.
2. The data is sorted by the desired BY values, PRODUCT and REGION. Because of the NODUPKEY option, one observation for each unique combination of these two variables is stored in the SORTSET data set.
3. The number of unique combinations of PRODUCT and REGION is accumulated in the COUNT variable. For each observation, a new pair of macro variables is assigned to contain the values of PRODUCT and REGION.
4. The macro loop executes once for each unique combination of the two BY variables.
5. The FILENAME, TITLE, and WHERE statements are created using the macro variables.
6. Notice the GCHART procedure uses the SUBSET data set, not the SORTSET data set, to create the graphs. SORTSET contains only one observation per combination, so it is used only to build the macro variables.

The graphs produced from this example are:

Boot_Eastern Europe.gif
Boot_Western Europe.gif
Men's Casual_Eastern Europe.gif
Men's Casual_Western Europe.gif
Men's Dress_Eastern Europe.gif
Men's Dress_Western Europe.gif
Sandal_Eastern Europe.gif
Sandal_Western Europe.gif
Slipper_Eastern Europe.gif
Slipper_Western Europe.gif
Sport Shoe_Eastern Europe.gif
Sport Shoe_Western Europe.gif
Women's Casual_Eastern Europe.gif
Women's Casual_Western Europe.gif
Women's Dress_Eastern Europe.gif
Women's Dress_Western Europe.gif

Notice that the PRODUCT values may contain the single quote character " ' ". If your operating system cannot use this character as part of a file name, you can change the character using code similar to that given in Example II.A:

if index(product,"'") GT 0 then substr(product,index(product,"'"),1)='_';

Or you can simply remove it:

product=compress(product,"'");
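If several characters need to be replaced at once, the TRANSLATE function is one possible alternative to the two statements above. The following is a minimal sketch; the variable FNAME and the choice of characters to replace are assumptions made for illustration, and the set of characters that is actually invalid depends on your operating system.

```sas
data _null_;
 length product fname $ 14;
 product="Men's Casual";
 /* Each character in the third argument is replaced by the character   */
 /* in the same position of the second argument. Here the single quote  */
 /* and the forward slash both become underscores.                      */
 fname=translate(product,'__',"'/");
 put fname=;   /* writes fname=Men_s Casual to the log */
run;
```

Unlike the INDEX and SUBSTR approach, which changes only the first occurrence it finds, TRANSLATE replaces every occurrence of the listed characters.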

With this data, every PRODUCT occurs in every REGION; that is, all values of the second BY variable are represented within each value of the first BY variable. However, this code would also work if the values of the second BY variable were different for each value of the first BY variable. For example, if the BY variables were REGION and SUBSIDIARY, there are twelve unique combinations of these variables in the SUBSET data set:

REGION            SUBSIDIARY
Eastern Europe    Budapest
Eastern Europe    Moscow
Eastern Europe    Prague
Eastern Europe    Warsaw
Western Europe    Copenhagen
Western Europe    Geneva
Western Europe    Heidelberg
Western Europe    Lisbon
Western Europe    London
Western Europe    Madrid
Western Europe    Paris
Western Europe    Rome

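Although each value of SUBSIDIARY occurs in only one REGION, the example can produce the correct graphs for this subset as well, because a separate pair of macro variables is created for every combination that actually exists in the data.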
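As a sketch of that variation, the code from Example II.B could be adapted as follows for REGION and SUBSIDIARY. The macro name OUTFILE3, the macro variable prefix SUBSID, the title text, and the choice of PRODUCT and SALES for the chart are assumptions made for illustration.

```sas
data subset;
 set sashelp.shoes;
 if index(region,'Europe') GT 0;
run;
/* One observation per REGION/SUBSIDIARY combination */
proc sort data=subset nodupkey out=sortset;
 by region subsidiary;
run;
data _null_;
 set sortset end=last;
 count+1;
 call symput('region'||left(put(count,5.)),trim(region));
 call symput('subsid'||left(put(count,5.)),trim(subsidiary));
 if last then call symput('num',put(count,5.));
run;
goptions device=gif gsfname=grafout gsfmode=replace;
%macro outfile3;
 %do i=1 %to &num;
  filename grafout "c:\&&region&i.._&&subsid&i...gif";
  title1 "Sales by Product in &&subsid&i, &&region&i";
  proc gchart data=subset;
   where region="&&region&i" and subsidiary="&&subsid&i";
   vbar product / sumvar=sales;
  run;quit;
  filename grafout clear;
 %end;
%mend outfile3;
%outfile3;
```

With the SUBSET data shown above, this loop would run twelve times and write files such as "Eastern Europe_Budapest.gif" and "Western Europe_Rome.gif".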

III. Reference

SAS OnlineDoc, SAS/GRAPH Software: Reference, "Exporting SAS/GRAPH Output", "Creating External Files with SAS/GRAPH Program Statements"