Counts the distinct entries of categorical variables. The max_distinct
argument restricts the count to categorical variables with at most that
many unique entries, which keeps the output from overflowing.
diagnose_category(.data, ..., max_distinct = 5)
.data: dataframe
...: tidyselect
max_distinct: integer
Value: dataframe
iris %>%
  diagnose_category()
#> # A tibble: 3 × 4
#>   column  level          n ratio
#>   <chr>   <fct>      <int> <dbl>
#> 1 Species setosa        50 0.333
#> 2 Species versicolor    50 0.333
#> 3 Species virginica     50 0.333
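
To make the output columns concrete, here is a minimal base R sketch of the kind of computation described above. It is an illustration only, not the package's implementation: the helper name count_levels_sketch is hypothetical, it has no tidyselect column selection, and it assumes "categorical" means factor or character columns.

```r
# Hypothetical helper for illustration; not the package's own code.
count_levels_sketch <- function(data, max_distinct = 5) {
  # Assumption: categorical columns are factor or character columns.
  is_cat <- vapply(data, function(x) is.factor(x) || is.character(x), logical(1))
  # Keep only columns with at most max_distinct unique entries.
  is_small <- vapply(data, function(x) length(unique(x)) <= max_distinct, logical(1))
  cols <- names(data)[is_cat & is_small]

  pieces <- lapply(cols, function(col) {
    # Count each level; NA is counted as its own level if present.
    tab <- table(as.character(data[[col]]), useNA = "ifany")
    data.frame(
      column = rep(col, length(tab)),
      level = as.character(names(tab)),
      n = as.integer(tab),
      ratio = as.numeric(tab) / sum(tab),
      stringsAsFactors = FALSE
    )
  })

  # Stack the per-column summaries; returns NULL if no column qualifies.
  do.call(rbind, pieces)
}

res <- count_levels_sketch(iris)
res
```

For iris this yields one row per Species level, each with n = 50 and a ratio of one third, which matches the example output above. Lowering max_distinct below 3 would exclude Species, so the sketch would return NULL.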