Sample Data Sets

Baseball Data

The Baseball data set contains performance measures and salary levels for regular hitters and leading substitute hitters in Major League Baseball for the year 1986 (Reichler 1987). There is one observation per hitter.

The following list describes each variable.

name
player's name

no_atbat
number of times at bat (in 1986)

no_hits
number of hits (in 1986)

no_home
number of home runs (in 1986)

no_runs
number of runs (in 1986)

no_rbi
number of runs batted in (in 1986)

no_bb
number of bases on balls (in 1986)

yr_major
years in the major leagues

cr_atbat
career at-bats

cr_hits
career hits

cr_home
career home runs

cr_runs
career runs

cr_rbi
career runs batted in

cr_bb
career bases on balls

league
player's league at the end of 1986

division
player's division at the end of 1986

team
player's team at the end of 1986

position
positions played (in 1986)

no_outs
number of putouts (in 1986)

no_assts
number of assists (in 1986)

no_error
number of errors (in 1986)

salary
salary, in thousands of dollars (in 1986)
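
As an illustration of the layout described above, the following Python sketch models one observation as a record. The class name `Hitter`, the helper `hitter_from_row`, and the field types are assumptions made for this example; the source specifies only the variable names and their meanings. Treating a blank salary as missing is also an assumption.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Hitter:
    """One observation (one hitter). Types are assumed, not given by the source."""
    name: str
    no_atbat: int   # 1986 season totals
    no_hits: int
    no_home: int
    no_runs: int
    no_rbi: int
    no_bb: int
    yr_major: int   # years in the major leagues
    cr_atbat: int   # career totals
    cr_hits: int
    cr_home: int
    cr_runs: int
    cr_rbi: int
    cr_bb: int
    league: str     # as of the end of 1986
    division: str
    team: str
    position: str   # position code for 1986
    no_outs: int    # 1986 fielding totals
    no_assts: int
    no_error: int
    salary: Optional[float]  # thousands of dollars; None if blank (assumption)


def hitter_from_row(row: dict) -> Hitter:
    """Build a Hitter from a mapping of variable name to raw text value."""
    values = {}
    for f in fields(Hitter):
        raw = str(row[f.name]).strip()
        if f.type is int:
            values[f.name] = int(raw)
        elif f.type is str:
            values[f.name] = raw
        else:
            # Only salary reaches this branch; a blank value is treated as missing.
            values[f.name] = float(raw) if raw else None
    return Hitter(**values)
```

A row read from a delimited export of the data set (for example, with `csv.DictReader`) could be passed to `hitter_from_row` directly, provided the column headers match the variable names listed above.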

The position variable in the Baseball data set is encoded as follows:

13  first base, third base             CS  center field, shortstop
1B  first base                         DH  designated hitter
1O  first base, outfield               DO  designated hitter, outfield
23  second base, third base            LF  left field
2B  second base                        O1  outfield, first base
2S  second base, shortstop             OD  outfield, designated hitter
32  third base, second base            OF  outfield
3B  third base                         OS  outfield, shortstop
3O  third base, outfield               RF  right field
3S  third base, shortstop              S3  shortstop, third base
C   catcher                            SS  shortstop
CD  center field, designated hitter    UT  utility
CF  center field
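
The position encoding can be applied as a simple lookup. The following Python sketch transcribes the codes listed above into a dictionary; the names `POSITION_CODES` and `describe_position` are hypothetical, and returning an unrecognized code unchanged is a design choice of this example rather than documented behavior.

```python
# Position codes and their descriptions, transcribed from the encoding above.
POSITION_CODES = {
    "13": "first base, third base",
    "1B": "first base",
    "1O": "first base, outfield",
    "23": "second base, third base",
    "2B": "second base",
    "2S": "second base, shortstop",
    "32": "third base, second base",
    "3B": "third base",
    "3O": "third base, outfield",
    "3S": "third base, shortstop",
    "C": "catcher",
    "CD": "center field, designated hitter",
    "CF": "center field",
    "CS": "center field, shortstop",
    "DH": "designated hitter",
    "DO": "designated hitter, outfield",
    "LF": "left field",
    "O1": "outfield, first base",
    "OD": "outfield, designated hitter",
    "OF": "outfield",
    "OS": "outfield, shortstop",
    "RF": "right field",
    "S3": "shortstop, third base",
    "SS": "shortstop",
    "UT": "utility",
}


def describe_position(code: str) -> str:
    """Return the description for a position code, or the input itself if unknown."""
    return POSITION_CODES.get(code.strip().upper(), code)
```

For example, `describe_position("2S")` yields "second base, shortstop". Note that the order of a two-position code matters: `O1` and `1O` are distinct codes.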
