create dummies — create_dummies • framecleaner

Adapted from the dummy_cols function in the fastDummies package. Adds the option to truncate the dummy column names and to specify the dummy columns using tidyselect.

create_dummies(
  .data,
  ...,
  append_col_name = TRUE,
  max_levels = 10L,
  remove_first_dummy = FALSE,
  remove_most_frequent_dummy = FALSE,
  clean_names = TRUE,
  ignore_na = FALSE,
  split = NULL,
  remove_selected_columns = TRUE
)

Arguments

.data: a data frame
...: tidyselect specification of the columns to convert. The default selection is all character and factor variables
append_col_name: logical, default TRUE. Appends the original column name to each dummy column name. Set to FALSE to truncate the dummy column names to the level names alone
max_levels: integer, default 10L. Uses fct_lump_n to limit the number of categories: only the top n levels are preserved, and the rest are lumped into "other". The default of 10 levels guards against accidentally creating a very large number of columns. Set to Inf to use all levels
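remove_first_dummy: logical, default FALSE. As documented for dummy_cols, TRUE removes the first dummy of every variable so that only n-1 dummies remain, which avoids multicollinearity in models
remove_most_frequent_dummy: logical, default FALSE. As documented for dummy_cols, TRUE removes the dummy for the most frequently observed category so that only n-1 dummies remain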
clean_names: logical, default TRUE. If TRUE, applies clean_names
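ignore_na: logical, default FALSE. As documented for dummy_cols, TRUE ignores NA values in the column, while FALSE creates a separate dummy column for NA
split: string, default NULL. As documented for dummy_cols, a separator used to split cells that hold multiple categories, e.g. "," so that "cat, dog" sets both the cat and the dog dummy
remove_selected_columns: logical, default TRUE. If TRUE, removes the columns used to generate the dummy columns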

Value

data frame

Details

See the fastDummies package for documentation on the original function, dummy_cols.
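
For comparison, here is a minimal sketch of a similar call to the original function, assuming the fastDummies package is installed. Unlike create_dummies, dummy_cols takes the columns as a character vector through select_columns, and it keeps the source column unless remove_selected_columns = TRUE is passed.

```r
library(fastDummies)

# dummy_cols() selects columns by name via a character vector.
# remove_selected_columns = TRUE drops the original Species column,
# which mirrors the default behaviour of create_dummies().
iris_dummies <- dummy_cols(
  iris,
  select_columns = "Species",
  remove_selected_columns = TRUE
)

# Inspect the generated columns
names(iris_dummies)
```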

Examples


iris %>%
  create_dummies(Species, append_col_name = FALSE) %>%
  tibble::as_tibble()
#> 1 column(s) have become 3 dummy columns
#> # A tibble: 150 × 7
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width setosa versicolor virginica
#>           <dbl>       <dbl>        <dbl>       <dbl>  <int>      <int>     <int>
#>  1          5.1         3.5          1.4         0.2      1          0         0
#>  2          4.9         3            1.4         0.2      1          0         0
#>  3          4.7         3.2          1.3         0.2      1          0         0
#>  4          4.6         3.1          1.5         0.2      1          0         0
#>  5          5           3.6          1.4         0.2      1          0         0
#>  6          5.4         3.9          1.7         0.4      1          0         0
#>  7          4.6         3.4          1.4         0.3      1          0         0
#>  8          5           3.4          1.5         0.2      1          0         0
#>  9          4.4         2.9          1.4         0.2      1          0         0
#> 10          4.9         3.1          1.5         0.1      1          0         0
#> # ℹ 140 more rows
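
The following sketch exercises two more arguments from the usage above, remove_first_dummy and max_levels. The paint data frame is hypothetical toy data made up for illustration, and no output is shown because the exact column names depend on append_col_name and clean_names.

```r
library(framecleaner)

# Drop the first dummy of Species; based on the fastDummies
# documentation this should leave n-1 dummy columns.
iris_ref <- create_dummies(iris, Species, remove_first_dummy = TRUE)
names(iris_ref)

# Hypothetical toy data with four colours of unequal frequency.
paint <- data.frame(
  id = 1:10,
  color = c(rep("red", 5), rep("blue", 3), "green", "pink")
)

# Keep only the 2 most frequent levels; per the max_levels
# description, the remaining levels are lumped into "other".
paint_lumped <- create_dummies(paint, color, max_levels = 2L)
names(paint_lumped)
```

A small max_levels is used here only to make the lumping visible on a tiny data set; on real data the default of 10 is usually a reasonable starting point.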