Finds the correlations between numeric variables in a data frame, with the columns chosen using tidyselect. Additional parameters for the correlation test can be specified as in cor.test.

auto_cor(
  .data,
  ...,
  use = c("pairwise.complete.obs", "all.obs", "complete.obs", "everything",
    "na.or.complete"),
  method = c("pearson", "kendall", "spearman", "xicor"),
  include_nominals = TRUE,
  max_levels = 5L,
  sparse = TRUE,
  pval_thresh = 0.1
)

Arguments

.data

data frame

...

columns to include, specified with tidyselect

use

method for handling missing values (NA). The default, "pairwise.complete.obs", removes rows with NA separately for each pair of variables.

method

correlation method. Default is "pearson"; "kendall", "spearman", and "xicor" are also supported.

include_nominals

logical, default TRUE. Should nominal variables be converted to dummy columns?

max_levels

maximum number of dummy columns to create from nominal variables

sparse

logical, default TRUE. Filters and arranges the correlation table.

pval_thresh

p-value threshold used to filter out weak correlations
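The option names accepted by use are the same ones accepted by stats::cor. As a minimal sketch of what those options mean, assuming auto_cor treats them the way base R does (the toy data frame below is made up for illustration), compare pairwise and complete-case deletion:

```r
# Toy data with missing values in two of the three columns (illustrative only)
df <- data.frame(
  a = c(1, 2, 3, 4, 5, NA),
  b = c(2, 1, 4, 3, NA, 6),
  c = c(1, 3, 2, 5, 4, 6)
)

# "pairwise.complete.obs": each pair of columns uses every row where
# both of those two columns are observed
pairwise <- cor(df, use = "pairwise.complete.obs")

# "complete.obs": rows with an NA in any column are dropped first,
# so every pair is computed on the same (smaller) set of rows
complete <- cor(df, use = "complete.obs")

# "everything" (the stats::cor default): an NA in either column gives NA
everything <- cor(df, use = "everything")

pairwise["a", "c"]    # computed on rows 1 to 5
complete["a", "c"]    # computed on rows 1 to 4
everything["a", "c"]  # NA
```

Pairwise deletion keeps more data per pair, at the cost of each correlation being based on a possibly different subset of rows.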

Value

data frame of correlations
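To make the shape of the result concrete, here is a hedged base R sketch of how one row of such a table could be computed by hand. It assumes a nominal level is turned into a 0/1 dummy column and tested with stats::cor.test; the name one_row and the strict comparison against the threshold are illustrative assumptions, not the package's internals.

```r
# Dummy column for one level of a nominal variable (1 = setosa, 0 = otherwise)
species_setosa <- as.integer(iris$Species == "setosa")

# Pearson correlation test between the dummy and a numeric column
ct <- cor.test(species_setosa, iris$Petal.Length, method = "pearson")

# Assemble a single row with the same core columns as the table
one_row <- data.frame(
  x = "species_setosa",
  y = "Petal.Length",
  cor = unname(ct$estimate),
  p.value = ct$p.value
)

# Assumed filter: keep the row only if its p-value is below the threshold
pval_thresh <- 0.1
keep <- one_row$p.value < pval_thresh

one_row
```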

Details

Also supports the asymmetric correlation coefficient xi from xicor, selected with method = "xicor".
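The sketch below illustrates what "asymmetric" means here. It is a simplified base R version of the rank-based xi estimator that assumes no ties in the data; xi_coef is a hypothetical helper written for this illustration and is not the implementation used by xicor or by auto_cor.

```r
# Simplified xi estimator, assuming no ties:
# sort the observations by x, take the ranks of y in that order, and
# measure how much consecutive ranks jump around
xi_coef <- function(x, y) {
  n <- length(x)
  r <- rank(y, ties.method = "max")[order(x)]
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

set.seed(1)
x <- runif(500, -1, 1)
y <- x^2            # y is a function of x, but x is not a function of y

xi_xy <- xi_coef(x, y)  # how well x determines y: close to 1
xi_yx <- xi_coef(y, x)  # how well y determines x: much lower

# Pearson correlation is symmetric and misses this non-monotonic relationship
pearson <- cor(x, y)
```

Because swapping the arguments changes the result, the order of x and y matters when reading xi, unlike for the Pearson, Kendall, or Spearman coefficients.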

Examples


iris %>%
  auto_cor()
#> 1 column(s) have become 3 dummy columns
#> # A tibble: 15 × 6
#>    x                  y               cor  p.value significance method 
#>    <chr>              <chr>         <dbl>    <dbl> <chr>        <chr>  
#>  1 Petal.Width        Petal.Length  0.963 4.68e-86 ***          pearson
#>  2 species_setosa     Petal.Length -0.923 3.62e-63 ***          pearson
#>  3 species_setosa     Petal.Width  -0.887 1.29e-51 ***          pearson
#>  4 Petal.Length       Sepal.Length  0.872 1.04e-47 ***          pearson
#>  5 Petal.Width        Sepal.Length  0.818 2.33e-37 ***          pearson
#>  6 species_virginica  Petal.Width   0.769 1.30e-30 ***          pearson
#>  7 species_virginica  Petal.Length  0.721 2.38e-25 ***          pearson
#>  8 species_setosa     Sepal.Length -0.717 5.29e-25 ***          pearson
#>  9 species_virginica  Sepal.Length  0.638 1.62e-18 ***          pearson
#> 10 species_setosa     Sepal.Width   0.603 3.05e-16 ***          pearson
#> 11 species_versicolor Sepal.Width  -0.468 1.60e- 9 ***          pearson
#> 12 Petal.Length       Sepal.Width  -0.428 4.51e- 8 ***          pearson
#> 13 Petal.Width        Sepal.Width  -0.366 4.07e- 6 ***          pearson
#> 14 species_versicolor Petal.Length  0.202 1.33e- 2 *            pearson
#> 15 species_virginica  Sepal.Width  -0.136 9.79e- 2 .            pearson

# don't use sparse if you're interested in only one target variable
iris %>%
  auto_cor(sparse = FALSE) %>%
  dplyr::filter(x == "Petal.Length")
#> 1 column(s) have become 3 dummy columns
#> # A tibble: 6 × 5
#>   x            y                     cor  p.value significance
#>   <chr>        <chr>               <dbl>    <dbl> <chr>       
#> 1 Petal.Length Sepal.Length        0.872 1.04e-47 ***         
#> 2 Petal.Length Sepal.Width        -0.428 4.51e- 8 ***         
#> 3 Petal.Length Petal.Width         0.963 4.68e-86 ***         
#> 4 Petal.Length species_setosa     -0.923 3.62e-63 ***         
#> 5 Petal.Length species_versicolor  0.202 1.33e- 2 *           
#> 6 Petal.Length species_virginica   0.721 2.38e-25 ***