Takes a dataframe and returns summary statistics for its numeric columns. For example, zeros returns the number of 0 values in each column, minus counts negative values, and infs counts Inf values. Other, less common metrics that may help with quick diagnosis or understanding of numeric data are also returned: mode returns the most common value in the column (chosen at random in case of a tie), and mode_ratio returns its frequency as a ratio of the total rows.
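To make the column definitions concrete, here is a minimal base R sketch that recomputes a few of them for a single numeric vector. The helper name sketch_numeric_summary is hypothetical and not part of the package, and the details are assumptions rather than the package's actual implementation: infs is taken to count both Inf and -Inf, and missing values are ignored when counting zeros and negatives.

```r
# Hypothetical helper (not part of the package): recomputes a few of the
# documented columns for one numeric vector, using base R only.
sketch_numeric_summary <- function(x) {
  counts <- table(x)                            # frequency of each distinct value (NA dropped)
  top <- names(counts)[counts == max(counts)]   # every value tied for most frequent
  # Pick one of the tied values at random, as the description states for mode
  mode_value <- as.numeric(top[sample.int(length(top), 1)])
  data.frame(
    zeros = sum(x == 0, na.rm = TRUE),          # number of 0 values
    minus = sum(x < 0, na.rm = TRUE),           # number of negative values
    infs = sum(is.infinite(x)),                 # assumption: counts Inf and -Inf
    mode = mode_value,
    mode_ratio = max(counts) / length(x)        # mode frequency over total rows
  )
}

sketch_numeric_summary(iris$Sepal.Width)
```

For iris$Sepal.Width the most common value is 3, which occurs in 26 of 150 rows, so the sketch should agree with the mode and mode_ratio reported for that column in the iris example.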
diagnose_numeric(.data, ...)
.data: dataframe
...: tidyselect. Default: all numeric columns
Returns: dataframe
iris %>%
  diagnose_numeric() %>%
  print(width = Inf)
#> 150 rows
#> # A tibble: 4 × 11
#>   variables    zeros minus  infs   min  mean   max `|x|<1 (ratio)` integer_ratio
#>   <chr>        <int> <int> <int> <dbl> <dbl> <dbl>           <dbl>         <dbl>
#> 1 Sepal.Length     0     0     0   4.3  5.84   7.9           0            0.113
#> 2 Sepal.Width      0     0     0   2    3.06   4.4           0            0.187
#> 3 Petal.Length     0     0     0   1    3.76   6.9           0            0.0867
#> 4 Petal.Width      0     0     0   0.1  1.20   2.5           0.333        0.0867
#>    mode mode_ratio
#>   <dbl>      <dbl>
#> 1   5       0.0667
#> 2   3       0.173
#> 3   1.4     0.0867
#> 4   0.2     0.193