Scatter plots and Lines

This section covers scatter plots. Its main focus is plotting the results of a linear regression, so most of the material deals with fitted lines. In the future, line graphs and scatter plots will be separated into their own “modules”.


  • Scatter plots
  • For this section, we will be using the tadpoles.csv data set. In the second dataset, we analysed tadpole abundance in ponds of different sizes using a linear model (regression). Plotting a linear regression is straightforward, but it can be done in a couple of different ways, depending on what you wish to accomplish. First, let’s run the basic analysis again (excluding the reeds factor):

      tadpoles.lm <- lm(abundance ~ pondsize, data = tadpoles)
      summary(tadpoles.lm)
      ##
      ## Call:
      ## lm(formula = abundance ~ pondsize, data = tadpoles)
      ##
      ## Residuals:
      ## Min 1Q Median 3Q Max
      ## -73.

  • Linear Lines
  • The easiest way to add a line to our graph is geom_smooth(method = lm). By default, geom_smooth() draws a loess smooth through the data (for small data sets) together with a confidence band. Since we have run a linear model, we set the method to that of a linear model (lm):

      ggplot(tadpoles, aes(x = pondsize, y = abundance)) +
        geom_point(alpha = 0.5) +
        geom_smooth(method = lm)

    method = lm tells geom_smooth() to fit and draw a straight-line relationship between the x and y variables mapped in aes().
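As a rough illustration of the "couple of different ways" mentioned above, the sketch below draws the regression line twice: once by letting geom_smooth() fit it, and once from the coefficients of a fitted lm object via geom_abline(). Because tadpoles.csv is not reproduced here, the data frame tadpoles_sim and the model sim.lm are hypothetical, built from simulated values that only borrow the column names used in the text.

```r
library(ggplot2)

# Hypothetical stand-in for tadpoles.csv: simulated values, not the real data.
set.seed(1)
tadpoles_sim <- data.frame(pondsize = runif(30, min = 10, max = 100))
tadpoles_sim$abundance <- 20 + 1.5 * tadpoles_sim$pondsize + rnorm(30, sd = 15)

# Fit the same kind of model as in the text.
sim.lm <- lm(abundance ~ pondsize, data = tadpoles_sim)

# Option 1: let geom_smooth() fit and draw the line (with a confidence band).
p_smooth <- ggplot(tadpoles_sim, aes(x = pondsize, y = abundance)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x)

# Option 2: draw the line from the fitted model's own coefficients.
p_abline <- ggplot(tadpoles_sim, aes(x = pondsize, y = abundance)) +
  geom_point(alpha = 0.5) +
  geom_abline(intercept = coef(sim.lm)[["(Intercept)"]],
              slope = coef(sim.lm)[["pondsize"]])
```

Printing p_smooth or p_abline displays the plot. Option 1 is shorter and adds the confidence band automatically; Option 2 guarantees the line shown is exactly the model you summarised, which may matter once the model includes terms that geom_smooth() does not know about.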

  • Logistic regression
  • For this section, we will be using the nestpredation.csv data set. In our third dataset, we analysed nest predation using a generalised linear model (GLM) with a binomial distribution, also known as a logistic regression. In this scenario, the data record whether or not a nest was attacked in areas with different values of shrubcover. The GLM estimates the probability of a nest being attacked at a given value of shrubcover.
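To make the idea of "probability of attack given shrubcover" concrete, here is a minimal sketch of fitting a binomial GLM and drawing its fitted curve over the raw 0/1 observations. The data frame nests_sim is simulated, and the response column name attacked is an assumption; the real nestpredation.csv may use different names and will give different numbers.

```r
library(ggplot2)

# Hypothetical stand-in for nestpredation.csv: simulated values and an
# assumed 0/1 response column (attacked); real column names may differ.
set.seed(2)
nests_sim <- data.frame(shrubcover = runif(60, min = 0, max = 100))
true_prob <- plogis(2 - 0.05 * nests_sim$shrubcover)
nests_sim$attacked <- rbinom(60, size = 1, prob = true_prob)

# Logistic regression: binomial GLM with the default logit link.
nests.glm <- glm(attacked ~ shrubcover, family = binomial, data = nests_sim)

# Predicted probability of attack at a few shrubcover values.
new_cover <- data.frame(shrubcover = c(10, 50, 90))
new_cover$prob <- predict(nests.glm, newdata = new_cover, type = "response")

# Plot the 0/1 observations with the fitted logistic curve on top.
p_logit <- ggplot(nests_sim, aes(x = shrubcover, y = attacked)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = "binomial"))
```

Using type = "response" in predict() returns probabilities between 0 and 1 rather than values on the logit scale, which is usually what you want on the y-axis of the plot.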