Worksheet 8

Published

October 21, 2024

Questions are below. My solutions will be available after the tutorials are all finished. The whole point of these worksheets is for you to use your lecture notes to figure out what to do. In tutorial, the TAs are available to guide you if you get stuck. Once you have figured out how to do this worksheet, you will be prepared to tackle the assignment that depends on it.

If you are not able to finish in an hour, I encourage you to continue later with what you were unable to finish in tutorial.

Hurricanes

The number of hurricanes making landfall on the east coast of the US was recorded each year from 1904 to 2014. The “hurricane season” is from June 1 to November 30 each year. The data are recorded in the file http://ritsokiguess.site/datafiles/hurricanes.csv. There are three columns: the year, the number of hurricanes, and period, in which the years are divided up into 25-year periods.

You might imagine that climate change has caused the number of hurricanes to change over time. One way to assess this is to divide the years into a number of groups (such as the 25-year periods here), and then to compare the number of hurricanes in each group. If the groups differ, that suggests some kind of trend over time.

  1. Read in and display (some of) the data.
  1. Make a suitable plot of the number of hurricanes by time period.
  1. Run a suitable analysis to compare the numbers of hurricanes in the four time periods. Justify any choices you make. (That is to say, I am not telling you what kind of analysis to run; it is up to you to choose what you think is the best thing to do and to be able to say why you made the choice you did. My solutions to this question go into a fair bit of detail about what your considerations might be.)
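Before settling on an analysis, it can help to see how the counts are distributed within each group. The following is only a rough sketch in Python (standard library only), not a solution. The group labels `A` and `B` and all the counts are invented for illustration and are not taken from hurricanes.csv, and `summarize_by_group` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean, median

# Invented (group, count) pairs for illustration only;
# these are NOT values from hurricanes.csv.
records = [
    ("A", 1), ("A", 2), ("A", 9),
    ("B", 6), ("B", 8), ("B", 10),
]

def summarize_by_group(pairs):
    """Collect the counts for each group and report n, mean and median."""
    groups = defaultdict(list)
    for group, count in pairs:
        groups[group].append(count)
    return {
        group: {"n": len(values), "mean": mean(values), "median": median(values)}
        for group, values in groups.items()
    }

summary = summarize_by_group(records)
for group, stats in summary.items():
    print(group, stats)
```

A large gap between the mean and the median within a group (as in the invented group `A`) can hint at skewness, which is one of the things worth thinking about when you justify your choice of analysis.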

Bread

What makes bread rise? Specifically, what are the effects of baking temperature and the amount of yeast on how much a loaf of bread will rise while baking? To find out, a batch of a certain bread mix was divided into 48 parts. Each part had a randomly chosen amount of yeast added (0.75, 1, or 1.25 teaspoons) and was then baked at a temperature of either 350 or 425 (degrees Fahrenheit). After baking, the height of each (very small) loaf of bread was measured (in inches). Apart from the yeast and the baking temperature, the ingredients for each small loaf were identical, so any differences in height can be attributed to one or both of the amount of yeast used and the baking temperature.

The data are in http://ritsokiguess.site/datafiles/bread_wide.csv.

  1. Read in and display (most of) the data.
  1. The values in the dataframe are all values of height under various conditions (of yeast and temperature). Rearrange the dataframe so that you have one column of heights and one column of conditions.
  1. The column containing the treatment conditions that you created in the previous question contains two things, amount of yeast and temperature. Create two new columns, each containing one of these.
  1. Starting from the data you read in from the file, rearrange it so that there is one column of heights, and columns showing the amount of yeast and the temperature that goes with each height. Do your rearrangement with one command. Save your resulting dataframe.
  1. Make a suitable graph of the three columns (not including row) in your final dataframe.
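The idea behind the rearrangement in the questions above (wide format to one row per measurement, with the condition split into its two parts) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not a solution. The column names (`row`, `yeast0.75_temp350`, `yeast1_temp425`) and the heights are made up, so the actual file may be laid out differently, and `wide_to_long` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical wide-format data: made-up column names and values,
# NOT the contents of bread_wide.csv.
wide_csv = """row,yeast0.75_temp350,yeast1_temp425
1,2.5,4.0
2,3.0,4.5
"""

def wide_to_long(text):
    """Turn each (row, condition) cell into one record with
    separate yeast and temperature fields."""
    long_rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row_id = record.pop("row")
        for condition, height in record.items():
            # Split a name such as "yeast0.75_temp350" into its two parts.
            yeast_part, temp_part = condition.split("_")
            long_rows.append({
                "row": int(row_id),
                "yeast": float(yeast_part[len("yeast"):]),
                "temp": int(temp_part[len("temp"):]),
                "height": float(height),
            })
    return long_rows

long_rows = wide_to_long(wide_csv)
for r in long_rows:
    print(r)
```

Each input row with two condition columns becomes two output records, so the long version has one height per record alongside the yeast amount and temperature that go with it.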