The data in Table 9.13 are numbers of insurance policies, n, and numbers
of claims, y, for cars in various insurance categories, CAR, tabulated by age
of policy holder, AGE, and district where the policy holder lived (DIST =
1, for London and other major cities, and DIST = 0, otherwise). The table
is derived from the CLAIMS data set in Aitkin et al. (2005), obtained from
a paper by Baxter et al. (1980).
a. Calculate the rate of claims y/n for each category and plot the rates by
AGE, CAR and DIST to get an idea of the main effects of these factors.
b. Use Poisson regression to estimate the main effects (each treated as categorical and modelled using indicator variables) and interaction terms.
c. Based on the modelling in (b), Aitkin et al. (2005) determined that all the interactions were unimportant and decided that AGE and CAR could be treated as though they were continuous variables. Fit a model incorporating these features and compare it with the best model obtained in (b). What conclusions do you reach?
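
As a starting point for part (a), the rate calculation might be sketched as follows. This is only an illustration: the record layout (one dict per cell of Table 9.13 with keys CAR, AGE, DIST, y and n) and the function names claim_rates and marginal_rates are assumptions made here, not part of the exercise, and no values from the table are reproduced.

```python
from collections import defaultdict


def claim_rates(rows):
    """Return a copy of each record with its claim rate y/n added.

    `rows` is assumed to be an iterable of dicts with keys
    "CAR", "AGE", "DIST", "y" and "n" (hypothetical layout).
    Cells with n == 0 are skipped because the rate is undefined.
    """
    return [dict(r, rate=r["y"] / r["n"]) for r in rows if r["n"] > 0]


def marginal_rates(rows, factor):
    """Pooled claim rate (sum of y over sum of n) for each level of one factor."""
    totals = defaultdict(lambda: [0, 0])  # level -> [total claims, total policies]
    for r in rows:
        totals[r[factor]][0] += r["y"]
        totals[r[factor]][1] += r["n"]
    return {level: y / n for level, (y, n) in sorted(totals.items()) if n > 0}
```

Calling marginal_rates(rows, "AGE"), and likewise for "CAR" and "DIST", gives one pooled rate per level, which can then be plotted against the level to see the direction of each main effect.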
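
For parts (b) and (c), one common way to write the two models is with log n as an offset, so that the linear predictor describes the claim rate rather than the raw count. The symbols below (alpha, gamma, delta for the categorical effects and beta_1, beta_2, beta_3 for the slopes) are notation assumed here for illustration, not taken from the exercise.

```latex
% (b) Main effects as factors, cell (j, k, l) = (CAR level, AGE level, DIST level):
\log E(y_{jkl}) = \log n_{jkl} + \beta_0 + \alpha_j + \gamma_k + \delta_l

% (c) CAR and AGE entered as continuous scores, no interactions:
\log E(y_{jkl}) = \log n_{jkl} + \beta_0 + \beta_1\,\mathrm{CAR}_j
                  + \beta_2\,\mathrm{AGE}_k + \beta_3\,\mathrm{DIST}_l

% Comparison of nested models M_0 (simpler) and M_1 (fuller):
\Delta D = D_0 - D_1 \;\overset{\text{approx.}}{\sim}\; \chi^2(p_1 - p_0)
```

If the scores used for CAR and AGE in (c) are the level codes, the model in (c) is nested within the main-effects model in (b), so the difference in deviances can be referred to a chi-squared distribution with degrees of freedom equal to the difference in the numbers of parameters.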