Pipe in a dataframe to return a diagnosis of the missing and unique values in each column. By default all columns are diagnosed, but a subset can be specified in the dots with tidyselect.

diagnose(df, ...)

Arguments

df

dataframe

...

tidyselect specification of the columns to diagnose. If empty, all columns are diagnosed.

Value

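A tibble with one row per diagnosed column, giving its name, type, missing count and percent, and unique count and rate.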

Details

This function is inspired by the excellent dlookr package. It takes a dataframe and returns a summary of the unique and missing values in each column.
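
To make the summary columns concrete, here is a minimal sketch of how such a diagnosis could be computed. It is not the package's source: `diagnose_sketch` is a hypothetical name, the use of `dplyr::select()` for the dots is an assumption, and so are the choices to express `missing_percent` on a 0 to 100 scale, to count `NA` as a distinct value, and to return a plain data.frame rather than a tibble.

```r
# Hypothetical illustration only; the real diagnose() may differ in detail.
diagnose_sketch <- function(df, ...) {
  # Assumption: dots are forwarded to dplyr::select() for tidyselect semantics.
  if (...length() > 0) {
    df <- dplyr::select(df, ...)
  }
  n <- nrow(df)
  data.frame(
    variables       = names(df),
    # First class only, e.g. "numeric" or "factor"
    types           = vapply(df, function(x) class(x)[1], character(1)),
    missing_count   = vapply(df, function(x) sum(is.na(x)), integer(1)),
    # Assumption: percent on a 0-100 scale
    missing_percent = vapply(df, function(x) mean(is.na(x)) * 100, numeric(1)),
    # Assumption: NA, if present, counts as one distinct value
    unique_count    = vapply(df, function(x) length(unique(x)), integer(1)),
    unique_rate     = vapply(df, function(x) length(unique(x)) / n, numeric(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# All columns, then a tidyselect subset
all_cols   <- diagnose_sketch(iris)
petal_cols <- diagnose_sketch(iris, dplyr::starts_with("Petal"))
```

Under these assumptions, `unique_rate` is simply `unique_count` divided by the number of rows, which is why a low-cardinality column such as `Species` gets a rate close to zero.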

Examples

iris %>% diagnose()
#> # A tibble: 5 × 6
#>   variables    types   missing_count missing_percent unique_count unique_rate
#>   <chr>        <chr>           <int>           <dbl>        <int>       <dbl>
#> 1 Sepal.Length numeric             0               0           35       0.233
#> 2 Sepal.Width  numeric             0               0           23       0.153
#> 3 Petal.Length numeric             0               0           43       0.287
#> 4 Petal.Width  numeric             0               0           22       0.147
#> 5 Species      factor              0               0            3       0.02