Worksheet 6

Published October 11, 2024

Questions are below. My solutions will be available after the tutorials are all finished. The whole point of these worksheets is for you to use your lecture notes to figure out what to do. In tutorial, the TAs are available to guide you if you get stuck. Once you have figured out how to do this worksheet, you will be prepared to tackle the assignment that depends on it.

If you are not able to finish in an hour, I encourage you to continue later with what you were unable to finish in tutorial.

The thickness of stamps

Collectors of postage stamps know that the same stamp may be made from several different batches of paper of different thicknesses. Our data set, in http://ritsokiguess.site/datafiles/stamp.csv, contains the thickness, in millimetres, of each of 485 stamps that were printed in 1872. It is suspected that the paper used in that year was thinner than in previous years.

Read in and display (some of) the data.

Make a suitable graph of these data. Justify your choice briefly.

From your graph, why do you think it might be a good idea to do a sign test rather than a one-sample \(t\)-test on these data? Explain briefly.

The median thickness in years prior to 1872 was 0.081 mm. Is there evidence that the paper on which stamps were printed in 1872 is thinner than in previous years? Explain briefly.
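
As a reminder of the mechanics (a sketch only, not the solution): a sign test of this kind counts how many observations fall below the hypothesized median. The symbols here are my own notation and do not come from the data set: \(n\) is the number of stamps whose thickness is not exactly 0.081 mm, \(b\) is the number of those that are below 0.081 mm, and \(B\) is the corresponding random count. If the true median were 0.081 mm, each such stamp would be equally likely to fall above or below it, so \(B\) would have a binomial distribution with \(n\) trials and success probability \(1/2\). For the one-sided alternative that the median is less than 0.081 mm, unusually large values of \(b\) count as evidence, and the P-value would be

\[
P(B \ge b) = \sum_{k=b}^{n} \binom{n}{k} \left(\frac{1}{2}\right)^{n}.
\]

Your lecture notes may get the same P-value from a ready-made function rather than from this sum. Either way, the conclusion should come from comparing the P-value with your chosen significance level.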

Canned tuna

Light tuna is sold in 170-gram cans. The tuna can be canned in either water or oil. Is there a difference in typical selling price between tuna canned in water and tuna canned in oil? To find out, 25 supermarkets were sampled. In 14 of them (randomly chosen), the price of a (randomly chosen) brand of tuna in water was recorded, and in the other 11 supermarkets, the price of a (randomly chosen) brand of tuna in oil was recorded. The data are in http://ritsokiguess.site/datafiles/tuna.csv, with the column canned_in saying whether the tuna was canned in oil or water, and the column price giving the price in cents.

Read in and display (some of) the data.

Make a graph of just the prices (that is to say, not including what each can of tuna was canned in).

Make a normal quantile plot of the prices (again, ignoring what the tuna was canned in).

What does your normal quantile plot tell you about the distribution of prices? Is that consistent with your first plot? (I would add “explain briefly” if this were for marks, but you should probably think about how your explanation would look in any case.)

Some of the tuna was canned in oil and some in water, which may make a difference to the distribution of price. Make normal quantile plots of price for the tuna packed in oil and packed in water separately, with one ggplot command.

Why might it be a bad idea to run a two-sample \(t\)-test on these data? Explain briefly.