Pipe in a data frame to get a diagnosis of the missing and unique values in each column. By default all columns are diagnosed, but a subset can be selected through the dots (...) using tidyselect.
diagnose(df, ...)
df: a data frame
...: columns to diagnose, specified with tidyselect
Returns a data frame summary with one row per diagnosed column.
This function is inspired by the excellent dlookr package. It takes a data frame and returns a summary of the unique and missing values in its columns.
iris %>% diagnose()
#> # A tibble: 5 × 6
#>   variables    types   missing_count missing_percent unique_count unique_rate
#>   <chr>        <chr>           <int>           <dbl>        <int>       <dbl>
#> 1 Sepal.Length numeric             0               0           35       0.233
#> 2 Sepal.Width  numeric             0               0           23       0.153
#> 3 Petal.Length numeric             0               0           43       0.287
#> 4 Petal.Width  numeric             0               0           22       0.147
#> 5 Species      factor              0               0            3       0.02
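
To make the summary columns concrete, here is a minimal base R sketch of how such a table could be computed. It is not the package's implementation: diagnose_sketch and its cols argument are hypothetical names used only for illustration, and columns are chosen by name rather than with tidyselect. Two details are assumptions, since the iris output does not reveal them: missing_percent is scaled from 0 to 100, and NA is counted as a distinct value in unique_count.

```r
# Hypothetical helper for illustration; not the package's diagnose().
diagnose_sketch <- function(df, cols = names(df)) {
  sub <- df[, cols, drop = FALSE]   # keep only the requested columns
  n <- nrow(sub)

  # Number of NA values per column
  missing_count <- vapply(sub, function(x) sum(is.na(x)), integer(1))
  # Number of distinct values per column (assumption: NA counts as a value)
  unique_count <- vapply(sub, function(x) length(unique(x)), integer(1))

  data.frame(
    variables = names(sub),
    types = vapply(sub, function(x) class(x)[1], character(1)),
    missing_count = missing_count,
    missing_percent = 100 * missing_count / n,  # assumption: 0-100 scale
    unique_count = unique_count,
    unique_rate = unique_count / n,             # share of rows that are distinct values
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# All columns, then a subset selected by name
res <- diagnose_sketch(iris)
res_subset <- diagnose_sketch(iris, c("Sepal.Length", "Species"))
print(res)
print(res_subset)
```

In this sketch unique_rate is unique_count divided by the number of rows, which matches the example output above (for instance 35 / 150 ≈ 0.233 for Sepal.Length).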