Sample Data Sets

Baseball Data

The Baseball data set contains performance measures and salary levels for regular hitters and leading substitute hitters in Major League Baseball for the year 1986 (Reichler 1987). There is one observation per hitter.

The following list describes each variable.

name      player's name
no_atbat  number of times at bat (in 1986)
no_hits   number of hits (in 1986)
no_home   number of home runs (in 1986)
no_runs   number of runs (in 1986)
no_rbi    number of runs batted in (in 1986)
no_bb     number of bases on balls (in 1986)
yr_major  years in the major leagues
cr_atbat  career at-bats
cr_hits   career hits
cr_home   career home runs
cr_runs   career runs
cr_rbi    career runs batted in
cr_bb     career bases on balls
league    player's league at the end of 1986
division  player's division at the end of 1986
team      player's team at the end of 1986
position  positions played (in 1986)
no_outs   number of putouts (in 1986)
no_assts  number of assists (in 1986)
no_error  number of errors (in 1986)
salary    salary, in thousands of dollars (in 1986)
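
The variable names follow a prefix convention: the no_ variables are 1986 counts and the cr_ variables are career totals. The remaining variables carry no prefix, even though some of them (position and salary, for example) also refer to 1986. As a minimal sketch of that convention, the following Python snippet groups the names by prefix. The names VARIABLES and group_by_prefix are hypothetical and are not part of the data set or of any library.

```python
# Variable names copied from the list above.
# VARIABLES and group_by_prefix are hypothetical names used for illustration.
VARIABLES = [
    "name", "no_atbat", "no_hits", "no_home", "no_runs", "no_rbi", "no_bb",
    "yr_major", "cr_atbat", "cr_hits", "cr_home", "cr_runs", "cr_rbi",
    "cr_bb", "league", "division", "team", "position", "no_outs",
    "no_assts", "no_error", "salary",
]


def group_by_prefix(names):
    """Split variable names into no_ (1986 counts), cr_ (career totals), and other."""
    groups = {"no_": [], "cr_": [], "other": []}
    for name in names:
        if name.startswith("no_"):
            groups["no_"].append(name)
        elif name.startswith("cr_"):
            groups["cr_"].append(name)
        else:
            groups["other"].append(name)
    return groups


groups = group_by_prefix(VARIABLES)
```

Each 1986 batting count has a career counterpart with the same suffix (no_hits and cr_hits, for example), whereas the fielding counts no_outs, no_assts, and no_error have no career counterpart.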

The position variable in the Baseball data set is encoded as follows:

13  first base, third base
1B  first base
1O  first base, outfield
23  second base, third base
2B  second base
2S  second base, shortstop
32  third base, second base
3B  third base
3O  third base, outfield
3S  third base, shortstop
C   catcher
CD  center field, designated hitter
CF  center field
CS  center field, shortstop
DH  designated hitter
DO  designated hitter, outfield
LF  left field
O1  outfield, first base
OD  outfield, designated hitter
OF  outfield
OS  outfield, shortstop
RF  right field
S3  shortstop, third base
SS  shortstop
UT  utility
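
When the data are processed outside the original software, the position codes can be decoded with a simple lookup table. The following Python sketch copies the codes and descriptions from the table above. The names POSITION_CODES and decode_position are hypothetical, and trimming whitespace and uppercasing the input are assumptions about how the codes might be stored.

```python
# Position codes and descriptions copied from the table above.
# POSITION_CODES and decode_position are hypothetical names.
POSITION_CODES = {
    "13": "first base, third base",
    "1B": "first base",
    "1O": "first base, outfield",  # letter O, not zero
    "23": "second base, third base",
    "2B": "second base",
    "2S": "second base, shortstop",
    "32": "third base, second base",
    "3B": "third base",
    "3O": "third base, outfield",  # letter O, not zero
    "3S": "third base, shortstop",
    "C": "catcher",
    "CD": "center field, designated hitter",
    "CF": "center field",
    "CS": "center field, shortstop",
    "DH": "designated hitter",
    "DO": "designated hitter, outfield",
    "LF": "left field",
    "O1": "outfield, first base",
    "OD": "outfield, designated hitter",
    "OF": "outfield",
    "OS": "outfield, shortstop",
    "RF": "right field",
    "S3": "shortstop, third base",
    "SS": "shortstop",
    "UT": "utility",
}


def decode_position(code: str) -> str:
    """Return the description for a position code, or raise KeyError."""
    # Assumption: stored codes may carry padding or lowercase letters.
    key = code.strip().upper()
    if key not in POSITION_CODES:
        raise KeyError(f"unknown position code: {code!r}")
    return POSITION_CODES[key]
```

For example, decode_position("SS") returns "shortstop". An unrecognized code raises KeyError rather than returning a default, so unexpected values surface early.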
