---
title: "Quick start guide"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{quickstart}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This package computes rates adjusted by a reference population or by another set of rates. This is a very common procedure in epidemiology: it allows the rates of an event (such as mortality) to be compared among groups that have different age distributions.

Some packages, such as `epitools`, compute these adjusted rates. The functions in this package wrap the `epitools` functions in a tidy way, so that age-adjusted rates can be computed for several groups at once using key variables, such as year and region.

## Setup

### Installing the package

```{r eval=FALSE}
devtools::install_github("rfsaldanha/tidyrates")
```

### Load the package

```{r}
library(tidyrates)
```

## Direct adjusted rate

### Events and population data

Let's use the Fleiss dataset, as used by the `epitools` package (Fleiss, 1981, p. 249).
```{r}
# Population counts: one row per age group, one column per group (plus a total)
population <- c(
  230061, 329449, 114920, 39487, 14208, 3052,
  72202, 326701, 208667, 83228, 28466, 5375,
  15050, 175702, 207081, 117300, 45026, 8660,
  2293, 68800, 132424, 98301, 46075, 9834,
  327, 30666, 123419, 149919, 104088, 34392,
  319933, 931318, 786511, 488235, 237863, 61313
)
population <- matrix(
  population, 6, 6,
  dimnames = list(
    c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
    c("1", "2", "3", "4", "5+", "Total")
  )
)

# Event counts, in the same layout as the population matrix
count <- c(
  107, 141, 60, 40, 39, 25,
  25, 150, 110, 84, 82, 39,
  3, 71, 114, 103, 108, 75,
  1, 26, 64, 89, 137, 96,
  0, 8, 63, 112, 262, 295,
  136, 396, 411, 428, 628, 530
)
count <- matrix(
  count, 6, 6,
  dimnames = list(
    c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
    c("1", "2", "3", "4", "5+", "Total")
  )
)
```

```{r}
population
```

```{r}
count
```

The Fleiss data give events (the `count` object) and population (the `population` object) for six age groups in five different groups (from 1 to 5+).

The `tidyrates` package provides the same Fleiss data in a tidy way, as a tibble in long format.

```{r}
fleiss_data
```

The `key` variable refers to the groups, `age_group` to the age groups, and `name` separates the `values` into events and population.

*You may use this same structure for your own data.*

### Reference population data

The Fleiss example uses the average population across the five groups as the standard (reference) population.

```{r}
# Row means over the five groups; column 6 ("Total") is dropped
standard <- apply(population[, -6], 1, mean)
standard
```

Using `tidyrates`, we must supply a tibble with two variables: `age_group` and `population`.

```{r}
standard_pop <- tibble::tibble(
  age_group = c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
  population = c(63986.6, 186263.6, 157302.2, 97647.0, 47572.6, 12262.6)
)
```

### Rate computation

To use the direct adjustment procedure, `tidyrates` provides the `rate_adj_direct` function. The `.data` argument must be a tibble with the events and population data, and the `.std` argument must be the standard population tibble.
The `.keys` argument must name the grouping variables in the `.data` tibble, if there are any. The `rate_adj_direct` function computes the crude rate, the adjusted rate, and exact confidence intervals for each group.

```{r}
rate_adj_direct(fleiss_data, .std = standard_pop, .keys = "key")
```

## Indirect adjusted rate

### Events and population data

Let's use the Selvin dataset, as used by the `epitools` package (Selvin, 2004).

```{r}
# Deaths and population by age group, 1940
dth40 <- c(45, 201, 320, 670, 1126, 3160, 9723, 17935, 22179, 13461, 2238)
pop40 <- c(
  906897, 3794573, 10003544, 10629526, 9465330, 8249558,
  7294330, 5022499, 2920220, 1019504, 142532
)
```

The `tidyrates` package provides the same dataset in a tidy way.

```{r}
selvin_data_1940
```

### Events and population reference data

```{r}
# Deaths and population by age group, 1960 (used as the reference)
dth60 <- c(141, 926, 1253, 1080, 1869, 4891, 14956, 30888, 41725, 26501, 5928)
pop60 <- c(
  1784033, 7065148, 15658730, 10482916, 9939972, 10563872,
  9114202, 6850263, 4702482, 1874619, 330915
)
```

The `tidyrates` package provides the same dataset in a tidy way.

```{r}
selvin_data_1960
```

### Rate computation

To use the indirect adjustment procedure, `tidyrates` provides the `rate_adj_indirect` function. The `.data` argument must be a tibble with the events and population data, and the `.std` argument must also be a tibble with events and population data, this time for the reference population. The `.keys` argument must name the grouping variables in the `.data` tibble, if there are any. The `rate_adj_indirect` function computes the crude rate, the adjusted rate, and exact confidence intervals for each group.

```{r}
rate_adj_indirect(selvin_data_1940, selvin_data_1960)
```
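
The arithmetic behind the indirect adjustment can be sketched in base R with the raw Selvin vectors. This is only an illustration of the textbook calculation (expected events, standardized ratio, adjusted rate), not the `tidyrates` or `epitools` source code. The names `std_rates`, `expected`, `smr`, `std_crude_rate` and `adj_rate` are chosen here for illustration, and confidence intervals are left out.

```r
# Selvin data, copied from the chunks above so the sketch is self-contained
dth40 <- c(45, 201, 320, 670, 1126, 3160, 9723, 17935, 22179, 13461, 2238)
pop40 <- c(906897, 3794573, 10003544, 10629526, 9465330, 8249558,
           7294330, 5022499, 2920220, 1019504, 142532)
dth60 <- c(141, 926, 1253, 1080, 1869, 4891, 14956, 30888, 41725, 26501, 5928)
pop60 <- c(1784033, 7065148, 15658730, 10482916, 9939972, 10563872,
           9114202, 6850263, 4702482, 1874619, 330915)

# Age-specific rates in the reference (1960) population
std_rates <- dth60 / pop60

# Deaths expected in 1940 if the 1960 age-specific rates had applied
expected <- sum(std_rates * pop40)

# Standardized ratio: observed deaths over expected deaths
smr <- sum(dth40) / expected

# Crude rates of the study and reference populations
crude_rate <- sum(dth40) / sum(pop40)
std_crude_rate <- sum(dth60) / sum(pop60)

# Indirectly adjusted rate: reference crude rate scaled by the ratio
adj_rate <- std_crude_rate * smr

c(crude_rate = crude_rate, smr = smr, adj_rate = adj_rate)
```

A ratio above 1 means more events were observed than the reference rates would predict. The sketch covers a single group; handling several groups through `.keys` is what the package function adds.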