Sample 24981: Perform item analysis for multiple choice tests

Item analysis for multiple choice tests

Contents: Purpose / History / Requirements / Usage / Limitations
PURPOSE:
The %ITEM macro computes descriptive statistics for analysis of data from a multiple-choice test. Each observation contains the answers from one subject to a set of questions ("items"). The data are compared to an answer key to determine which answers are correct. The score for each subject is computed as the number of correct answers. The output is very similar to that from the ITEM procedure in the SUGI Supplemental library, but several incorrect statistics have been fixed.
HISTORY:
Last updated: 26Jun92
REQUIREMENTS:
Only Base SAS software is required.
USAGE:
Follow the instructions in the Downloads tab of this sample to save the %ITEM macro definition. Replace the text within quotes in the following statement with the location of the %ITEM macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %ITEM macro and make it available for use:
   %inc "<location of your file containing the ITEM macro>";

Following this statement, you may call the %ITEM macro. See the Results tab for an example.
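
As a minimal sketch of a complete call, the program below builds a small test data set and scores it against a quoted key. The data set name QUIZ, the variable names, the student answers, and the key are all hypothetical. The items are read as character variables so that the first character of each value is the response itself.

```sas
/* Substitute the location where you saved the macro definition. */
%inc "<location of your file containing the ITEM macro>";

/* Hypothetical data: three subjects, five items with responses 1-5. */
data quiz;
   input student $ q1 $ q2 $ q3 $ q4 $ q5 $;
   datalines;
Ann 3 1 4 2 5
Bob 3 2 4 2 1
Cy  1 1 4 5 5
;
run;

/* Hypothetical key '31425'; VAR= is required because ID= is given. */
%item(data=quiz, var=q1-q5, id=student, key='31425')
```

Because RESPONSE= is not specified, the default of '12345' determines which answers are treated as valid.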

The following arguments may be listed within parentheses in any order, separated by commas:

   DATA=        SAS data set to analyze. The default is _LAST_.
                Data set options may be used.

   VAR=         List of variables representing items. Only the first
                character of the formatted value of each variable is
                used to compare with the answer key. The variables may
                be numeric or character or a mixture of both. The usual
                forms of abbreviated lists (e.g., X1-X100, ABC--XYZ,
                ABC:) may be used.

                The default is _ALL_; BY variables and ID variables
                are not automatically removed from the default list.

   FORMAT=      An optional SAS format to be applied to all the VAR
                variables.

                If FORMAT= is not specified and if formats have been
                assigned to the VAR variables in a previous DATA step,
                then those formats are used.

   ID=          List of variables to be copied to the OUT= and OUTBIN=
                data sets and printed in the grade report. These
                variables are not used in the analysis. The usual forms
                of abbreviated lists (e.g., X1-X100, ABC--XYZ, ABC:)
                may be used.

                If you specify ID=, you must also specify VAR=.

   COPY=        List of additional variables to be copied to the OUT=
                and OUTBIN= data sets. The usual forms of abbreviated
                lists (e.g., X1-X100, ABC--XYZ, ABC:) may be used.

                If you specify COPY=, you must also specify VAR=.

   BY=          List of variables for BY groups. Abbreviated variable
                lists (e.g., X1-X100, ABC--XYZ, ABC:) may NOT be used.

                If you specify BY=, you must also specify VAR=.

   SUBTEST=     Series of one or more subtest specifications separated
                by one of the characters given in the SUBSEP= argument.

                Each subtest specification contains either a variable
                list that is a subset of the variables in the VAR= list,
                or the keyword _ALL_ indicating all of the variables
                in the VAR= list. A separate analysis is performed for
                each of these variable lists.

                Each subtest specification may also contain a quoted
                string of up to 40 characters providing a title to
                identify the subtest on the printout. This title is
                also written to the output data sets in a variable
                named by the SUBNAME= argument. This string may not
                contain any of the characters in the SUBSEP= argument,
                nor may it contain two consecutive quotation marks.

                If no subtests are specified, only one analysis is
                performed using all the variables in the VAR= list.

   SUBSEP=      One or more characters used to separate subtest
                specifications in the SUBTEST= argument. The default
                is a slash (/). Do not use blanks or commas.

   SUBNAME=     Name of the variable in the output data sets containing
                the subtest title. The default is _SUBTEST.
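
The subtest arguments might be combined as in the sketch below. It assumes the %ITEM macro has already been defined and that a hypothetical data set QUIZ with items Q1-Q5 exists. The placement of the quoted title after each variable list is an assumption; the description above does not state the required order. SECTION is a hypothetical variable name.

```sas
/* Three analyses: two subsets of the VAR= list plus the full list.   */
/* Specifications are separated by the default SUBSEP= character, /.  */
%item(data=quiz, var=q1-q5, key='31425',
      subtest=q1-q3 'Part A' / q4-q5 'Part B' / _ALL_ 'Whole test',
      subname=section)
```

With SUBNAME=SECTION, the subtest titles would be written to a variable named SECTION in the output data sets instead of the default _SUBTEST.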

   RESPONSE=    Quoted string containing all the valid single-character
                responses to all the items. Any answer not in this list
                is declared invalid. There cannot be more than 200
                valid responses because of the limit on the length of
                quoted strings in the SAS system.

                The default is RESPONSE='12345'.

   KEY=         Quoted string specifying the correct single-character
                answers for each item in the same order as specified
                in the VAR= argument. If you specify a quoted string
                for the KEY= argument, then there cannot be more than
                200 items because of the limit on the length of
                character strings in the SAS system. You may specify
                several quoted strings separated by the concatenation
                operator || as long as the total length does not
                exceed 200.

                If you do not specify KEY=, or if you specify
                KEY=_FIRST_, then the first observation read from
                the data set is assumed to contain the answer key,
                and this observation is omitted from other
                computations. This method of specifying the answer key
                must be used when there are more than 200 items. If
                there are BY groups, they are assumed all to have the
                same key, so only the first BY group should have a
                record giving the answer key.
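
The two ways of supplying the key can be sketched as follows, assuming the %ITEM macro has already been defined. The data set name QUIZ_WITH_KEY, the variable names, and the answers are hypothetical. The value KEY in the STUDENT column of the first record is only a label chosen for this example.

```sas
/* The first observation holds the answer key, not a subject. */
data quiz_with_key;
   input student $ q1 $ q2 $ q3 $ q4 $ q5 $;
   datalines;
KEY 3 1 4 2 5
Ann 3 1 4 2 5
Bob 3 2 4 2 1
;
run;

/* Take the key from the first observation; that record is not scored. */
%item(data=quiz_with_key, var=q1-q5, id=student, key=_FIRST_)

/* Alternative for a data set with no key record (here the            */
/* hypothetical QUIZ): a quoted key split into concatenated pieces.   */
%item(data=quiz, var=q1-q5, id=student, key='314' || '25')
```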

   OUT=         SAS data set containing the ID variables, the score for
                each subject, the number of missing and invalid answers,
                and the score as a percentage.
                Data set options may NOT be used.

   OUTBIN=      SAS data set containing the ID variables, the score for
                each subject, and one variable for each item coded as
                1 for a correct answer or 0 for an incorrect answer.
                The variable names are BIN1, BIN2, ..., BINn, where n
                is the number of items.
                Data set options may NOT be used.

   OUTITEM=     SAS data set containing item statistics.
                Data set options may NOT be used.

   SCORE=       Name of the score variable in the OUT= and OUTBIN=
                data sets. The default is SCORE=SCORE.
                The name should not begin with an underscore.

   MISSING=     Name of the variable giving the number of missing values
                in the OUT= data set. The default is MISSING=MISSING.
                The name should not begin with an underscore.

   INVALID=     Name of the variable giving the number of invalid values
                in the OUT= data set. The default is INVALID=INVALID.
                The name should not begin with an underscore.

   PERCENT=     Name of the variable giving the percentage score
                in the OUT= data set. The default is PERCENT=PERCENT.
                The name should not begin with an underscore.

   PROPHECY=    The desired reliability level to be used in the
                Spearman-Brown prophecy formula. The default is
                PROPHECY=.9.
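
For orientation, the standard Spearman-Brown prophecy formula gives the factor k by which a test would need to be lengthened to move from its current reliability to a desired one. The DATA step below is a sketch of that arithmetic only. The current reliability of 0.75 is a made-up value, and the exact quantity that %ITEM reports for PROPHECY= is not stated in this description.

```sas
data _null_;
   rho    = 0.75;   /* hypothetical current reliability          */
   target = 0.90;   /* desired reliability, as in PROPHECY=.9    */
   /* Spearman-Brown solved for the length multiplier k */
   k = (target * (1 - rho)) / (rho * (1 - target));
   put 'Length multiplier: ' k 6.2;
run;
```

With these values k works out to about 3, meaning the test would need roughly three times as many comparable items.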

   UPLO=        The proportion of subjects to be included in each of
                the two subsets of subjects when comparing the
                percentage of occurrence of each response in the
                highest-scoring subset and the lowest-scoring subset.
                The default is UPLO=.3334.

   OPTIONS=     List of additional options separated by blanks:

                NOGRADE        Suppress printing the grade report.
                NOG
                NG

                NOHISTOGRAM    Suppress printing the bar chart of
                NOHIST         score frequencies.

                NOITEM         Suppress printing statistics for
                               each item.

                NOTES          Do not suppress notes in the SAS log
                               for various preliminary procedure
                               and data steps. This option can be
                               useful for diagnosing mysterious error
                               messages.

                NOTOTAL        Suppress computation of item-total
                               correlations.

                NOUPLO         Suppress computation of percentage
                               of responses by subjects in the
                               highest-scoring and lowest-scoring
                               subsets.

                NOX            Suppress printing extra columns for
                               omitted and invalid responses for
                               each item.
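
Several options can be combined in one call, as in the sketch below. It assumes the %ITEM macro has already been defined and that a hypothetical data set QUIZ with items Q1-Q5 exists. The output data set names SCORES and ITEMSTATS are also hypothetical.

```sas
/* Suppress the grade report and the histogram, keep item statistics, */
/* and save the subject scores and item statistics for later use.     */
%item(data=quiz, var=q1-q5, key='31425',
      out=scores, outitem=itemstats,
      options=nograde nohist)

/* Inspect the saved scores. */
proc print data=scores;
run;
```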

LIMITATIONS:
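  • The number of valid responses cannot exceed 200. When KEY= is given as a quoted string, the number of items also cannot exceed 200.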
  • Variable names should not begin with an underscore.
  • A spurious warning from PROC CONTENTS may occur when OPTIONS OBS= is specified.



These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.