This package aims to make it really easy to tidy data retrieved from Gapminder. To begin with:
Once you have loaded the package, you have two superpowers (functions) at your disposal: tidy_indice and tidy_bunch.
The tidy_indice function tidies a single data sheet downloaded from Gapminder.
The data sheet can be in either csv or xlsx format, the two formats offered
on the Gapminder site.
tidy_indice takes the path to the file as its argument and
returns the data as a tidy data frame.
filepath <- system.file("extdata", "life_expectancy_years.csv", package = "tidygapminder")
# From .............................
df <- readr::read_csv(filepath)
#> Rows: 187 Columns: 220
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): country
#> dbl (219): 1800, 1801, 1802, 1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810,...
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(df)
#> # A tibble: 6 × 220
#> country `1800` `1801` `1802` `1803` `1804` `1805` `1806` `1807` `1808` `1809`
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Afghani… 28.2 28.2 28.2 28.2 28.2 28.2 28.1 28.1 28.1 28.1
#> 2 Albania 35.4 35.4 35.4 35.4 35.4 35.4 35.4 35.4 35.4 35.4
#> 3 Algeria 28.8 28.8 28.8 28.8 28.8 28.8 28.8 28.8 28.8 28.8
#> 4 Andorra NA NA NA NA NA NA NA NA NA NA
#> 5 Angola 27 27 27 27 27 27 27 27 27 27
#> 6 Antigua… 33.5 33.5 33.5 33.5 33.5 33.5 33.5 33.5 33.5 33.5
#> # ℹ 209 more variables: `1810` <dbl>, `1811` <dbl>, `1812` <dbl>, `1813` <dbl>,
#> # `1814` <dbl>, `1815` <dbl>, `1816` <dbl>, `1817` <dbl>, `1818` <dbl>,
#> # `1819` <dbl>, `1820` <dbl>, `1821` <dbl>, `1822` <dbl>, `1823` <dbl>,
#> # `1824` <dbl>, `1825` <dbl>, `1826` <dbl>, `1827` <dbl>, `1828` <dbl>,
#> # `1829` <dbl>, `1830` <dbl>, `1831` <dbl>, `1832` <dbl>, `1833` <dbl>,
#> # `1834` <dbl>, `1835` <dbl>, `1836` <dbl>, `1837` <dbl>, `1838` <dbl>,
#> # `1839` <dbl>, `1840` <dbl>, `1841` <dbl>, `1842` <dbl>, `1843` <dbl>, …
# To................................
ti_df <- tidy_indice(filepath)
head(ti_df)
#> # A tibble: 6 × 3
#> country year life_expectancy_years
#> <chr> <dbl> <dbl>
#> 1 Afghanistan 1800 28.2
#> 2 Afghanistan 1801 28.2
#> 3 Afghanistan 1802 28.2
#> 4 Afghanistan 1803 28.2
#> 5 Afghanistan 1804 28.2
#> 6 Afghanistan 1805 28.2
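To make the wide-to-long reshaping concrete, here is a minimal base R sketch of the kind of transformation shown above. It is not the package's implementation: to_long is a hypothetical helper introduced only for illustration, and the toy table holds just two countries and two years.

```r
# Toy wide table in the Gapminder layout: one row per country, one column per year.
wide <- data.frame(
  country = c("Afghanistan", "Albania"),
  `1800` = c(28.2, 35.4),
  `1801` = c(28.2, 35.4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of tidygapminder): stack the year columns
# into one `year` column and one value column named `value_name`.
to_long <- function(wide, value_name) {
  year_cols <- setdiff(names(wide), "country")
  long <- data.frame(
    # Countries repeat once per year column ...
    country = rep(wide$country, times = length(year_cols)),
    # ... while each year repeats once per country, matching unlist() order.
    year = rep(as.numeric(year_cols), each = nrow(wide)),
    value = unlist(wide[year_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(long)[3] <- value_name
  # Sort by country, then year, so each country's series is contiguous.
  long <- long[order(long$country, long$year), ]
  rownames(long) <- NULL
  long
}

long <- to_long(wide, "life_expectancy_years")
print(long)
```

Each row of the result holds one observation (a country in a given year), which is the shape tidy_indice returns.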
The tidy_bunch function makes use of tidy_indice
to tidy
a whole set of data sheets, with the option to merge all the data frames
into one big data frame by setting merge
to TRUE.
dir_path <- system.file("extdata", package = "tidygapminder")
# From ................................
list.files(dir_path)
#> [1] "agriculture_land.xlsx" "life_expectancy_years.csv"
# To ..................................
td_dp <- tidy_bunch(dir_path, merge = TRUE)
head(td_dp)
#> country year Agricultural land (% of land area) life_expectancy_years
#> 1 Afghanistan 1800 NA 28.2
#> 2 Afghanistan 1801 NA 28.2
#> 3 Afghanistan 1802 NA 28.2
#> 4 Afghanistan 1803 NA 28.2
#> 5 Afghanistan 1804 NA 28.2
#> 6 Afghanistan 1805 NA 28.2
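The NA values in the merged output suggest that indicators are combined on country and year, keeping rows that exist in only one sheet. How tidy_bunch does this internally is not stated here, so the following base R sketch is an assumption: it performs a full outer join with merge() and Reduce() on two tiny data frames whose column names and values are made up for illustration.

```r
# Two already-tidy indicators (made-up toy values, not real Gapminder data).
life <- data.frame(
  country = c("Afghanistan", "Afghanistan"),
  year = c(1800, 1801),
  life_expectancy_years = c(28.2, 28.2),
  stringsAsFactors = FALSE
)
land <- data.frame(
  country = "Afghanistan",
  year = 1961,
  agriculture_land = 50,
  stringsAsFactors = FALSE
)

# Full outer join on the shared keys; all = TRUE keeps rows that are
# missing from one of the inputs and fills the gaps with NA.
join_two <- function(x, y) merge(x, y, by = c("country", "year"), all = TRUE)

# Reduce() folds the join over any number of tidy data frames.
merged <- Reduce(join_two, list(land, life))
print(merged)
```

Years that appear in only one indicator still get a row, with NA in the other indicator's column, which matches the pattern in the output above.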