The Analytics Edge: Lecture Code in Python, Week 2, Moneyball


Read in data
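A minimal sketch of this step, assuming pandas is used for loading. The rows below are toy values invented for illustration, and the column names (`Team`, `Year`, `RS`, `RA`, `W`) and the filename mentioned in the comment are assumptions, not taken from these notes.

```python
import io
import pandas as pd

# Toy rows invented for illustration only; they are not real season data.
csv_text = """Team,Year,RS,RA,W
AAA,1999,800,700,90
BBB,2001,650,750,70
CCC,2005,720,710,82
"""

# With a file on disk this would be pd.read_csv("baseball.csv")
# (the filename is an assumption). StringIO keeps the sketch self-contained.
baseball = pd.read_csv(io.StringIO(csv_text))

print(baseball.shape)   # number of rows and columns
print(baseball.dtypes)  # column types inferred by pandas
```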

Subset to include only the Moneyball years

First create a Boolean vector of True and False values, then use that vector to subset the DataFrame.
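The two steps can be sketched as follows. The cutoff `Year < 2002` is an assumption about what "Moneyball years" means here, and the data is a toy frame invented for illustration.

```python
import pandas as pd

# Toy frame invented for illustration; the column names are assumptions.
baseball = pd.DataFrame({
    "Year": [1999, 2001, 2002, 2005],
    "W": [90, 70, 103, 82],
})

# Step 1: a Boolean Series, True where the season is before 2002 (assumed cutoff).
is_moneyball = baseball["Year"] < 2002

# Step 2: index the DataFrame with the Boolean Series to keep only the True rows.
moneyball = baseball[is_moneyball]

print(is_moneyball.tolist())
print(moneyball)
```

The same result can be written in one line as `baseball[baseball["Year"] < 2002]`; keeping the mask in its own variable makes it easier to inspect.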

Compute Run Difference

To avoid the error message, use column.copy() rather than using the column directly.
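A hedged sketch of the run-difference step. It assumes the message in question is pandas' `SettingWithCopyWarning`, which can appear when a new column is assigned on a subset of another DataFrame. Instead of copying individual columns, this variant copies the whole subset once, which is a common alternative; the column names `RS` (runs scored), `RA` (runs allowed), and `RD` are assumptions, and the values are invented.

```python
import pandas as pd

# Toy frame invented for illustration; the column names are assumptions.
baseball = pd.DataFrame({
    "Year": [1999, 2001, 2005],
    "RS": [800, 650, 720],
    "RA": [700, 750, 710],
})

# .copy() makes the subset an independent DataFrame, so adding a column
# to it is unambiguous and does not touch the original frame.
moneyball = baseball[baseball["Year"] < 2002].copy()

# Run difference = runs scored minus runs allowed.
moneyball["RD"] = moneyball["RS"] - moneyball["RA"]

print(moneyball)
```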

Scatterplot to check for linear relationship
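One way to draw this plot, assuming matplotlib is available and that the relationship being checked is run difference against wins. The data points are invented for illustration, and the column names `RD` and `W` are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy values invented for illustration; the column names are assumptions.
moneyball = pd.DataFrame({
    "RD": [100, -100, 30, -20],
    "W": [91, 71, 84, 79],
})

# One point per team-season: run difference on x, wins on y.
fig, ax = plt.subplots()
ax.scatter(moneyball["RD"], moneyball["W"])
ax.set_xlabel("Run difference")
ax.set_ylabel("Wins")
ax.set_title("Wins vs. run difference")
```

In a notebook the figure renders inline after the cell runs; in a script, a call to `plt.show()` or `fig.savefig(...)` would be needed to see it. If the points fall roughly along a straight line, a linear model is a reasonable next step.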

Regression model to predict wins

I will be using statsmodels for this linear regression model
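A minimal sketch of fitting the wins model with the statsmodels formula API, assuming wins (`W`) are regressed on run difference (`RD`). The toy data is constructed to be exactly linear with invented coefficients, so it only demonstrates the mechanics and says nothing about real baseball results.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy data invented for illustration: wins are built as an exact linear
# function of run difference, using made-up coefficients (81 and 0.1).
moneyball = pd.DataFrame({"RD": [-100, -50, 0, 50, 100]})
moneyball["W"] = 81 + 0.1 * moneyball["RD"]

# "W ~ RD" means: model W as an intercept plus a coefficient times RD.
wins_model = smf.ols("W ~ RD", data=moneyball).fit()

# Fitted intercept and slope, indexed by "Intercept" and "RD".
print(wins_model.params)
```

On real data, `wins_model.summary()` prints the coefficient table along with R-squared and related diagnostics.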

Regression model to predict runs scored
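A sketch of a runs-scored model with more than one predictor. These notes do not name the predictors, so on-base percentage (`OBP`) and slugging percentage (`SLG`) are assumptions here, as is the column name `RS`. The toy data is generated from invented coefficients purely to show the workflow.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy data invented for illustration; predictors and column names are assumptions.
moneyball = pd.DataFrame({
    "OBP": [0.320, 0.340, 0.330, 0.350, 0.310],
    "SLG": [0.400, 0.410, 0.450, 0.430, 0.390],
})
# Runs scored built from made-up coefficients so the fit is exact.
moneyball["RS"] = -800 + 2700 * moneyball["OBP"] + 1600 * moneyball["SLG"]

# Multiple regression: RS as an intercept plus terms for OBP and SLG.
runs_model = smf.ols("RS ~ OBP + SLG", data=moneyball).fit()
print(runs_model.params)

# Predict runs scored for a hypothetical team with OBP 0.335 and SLG 0.420.
new_team = pd.DataFrame({"OBP": [0.335], "SLG": [0.420]})
predicted_runs = runs_model.predict(new_team)
print(predicted_runs)
```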

