Imputes missing values of a numeric matrix using stochastic gradient descent, via the recosystem package.

impute_recosystem(
  .data,
  lrate = c(0.05, 0.1),
  costp_l1 = c(0, 0.05),
  costq_l1 = c(0, 0.05),
  costp_l2 = c(0, 0.05),
  costq_l2 = c(0, 0.05),
  nthread = 8,
  loss = "l2",
  niter = 15,
  verbose = FALSE,
  nfold = 4,
  seed = 1
)

Arguments

.data

Long-format data frame with three columns: ID, item, and rating (see Details).

lrate

Learning rate, i.e. the step size used in gradient descent.

costp_l1

L1 regularization cost for the user factors (P).

costq_l1

L1 regularization cost for the item factors (Q).

costp_l2

L2 regularization cost for the user factors (P).

costq_l2

L2 regularization cost for the item factors (Q).

nthread

Number of threads to use for parallel computation.

loss

Loss function. Defaults to "l2" (squared error); "l1" (absolute error) can also be used.

niter

Number of training iterations used during tuning.

verbose

Logical. If TRUE, show the training loss.

nfold

Number of cross-validation folds used during tuning.

seed

Random seed, for reproducibility.

Value

Long-format data frame with the same number of rows as the input and no missing values in the rating column.

Details

The input is a long data frame with 3 columns: an ID column, an item column (the column names from pivoting longer), and a rating column (the values from pivoting longer).

Pre-processing generally requires pivoting a wide user-by-item matrix to long format. The missing values from the matrix must be retained as NA values in the rating column; these are the values the algorithm predicts and fills in. The output is a long data frame with the same number of rows as the input, but with no missing values.
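The pre-processing step can be sketched as follows. This is a minimal illustration in base R rather than a prescribed workflow: the matrix `wide` and the column names `user`, `item`, and `rating` are made up for the example, and `as.data.frame(as.table(...))` stands in for a pivot-longer step (tidyr::pivot_longer() would also work, provided NA values are not dropped). The call to impute_recosystem() is guarded so the snippet runs even when the function is not loaded.

```r
# Hypothetical 4 x 3 user-by-item ratings matrix with missing entries
wide <- matrix(
  c( 5, NA,  3,
     4,  2, NA,
    NA,  1,  4,
     2, NA,  5),
  nrow = 4, byrow = TRUE,
  dimnames = list(user = paste0("u", 1:4), item = paste0("i", 1:3))
)

# Pivot to long format: one row per (user, item) cell.
# NA cells are kept as rows with an NA rating, as the function requires.
long <- as.data.frame(as.table(wide),
                      responseName = "rating",
                      stringsAsFactors = FALSE)

nrow(long)               # 12 rows: 4 users x 3 items
sum(is.na(long$rating))  # 4 cells to be imputed

# Guarded call: only runs if impute_recosystem() is available in the session
if (exists("impute_recosystem")) {
  imputed <- impute_recosystem(long, nthread = 1, seed = 1)
}
```

Per the description above, `imputed` should have the same 12 rows as `long`, with the NA ratings replaced by predictions.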

This function automatically tunes the recosystem learner before applying it. A vector of candidate values can be supplied for each tuning parameter. To avoid tuning, supply a single value for each parameter.
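To get a feel for the cost of tuning, the sketch below counts the candidate combinations implied by the default argument values. It assumes the tuner evaluates every combination of the supplied values, and it only considers the parameters exposed in the signature; any parameters the function sets internally are not accounted for. The object names `grid` and `single` are illustrative.

```r
# Candidate values copied from the default arguments in the signature
grid <- expand.grid(
  lrate    = c(0.05, 0.1),
  costp_l1 = c(0, 0.05),
  costq_l1 = c(0, 0.05),
  costp_l2 = c(0, 0.05),
  costq_l2 = c(0, 0.05)
)
nrow(grid)    # 32 combinations (2^5), each cross-validated over nfold folds

# Single values leave exactly one combination, so nothing is left to tune
single <- expand.grid(
  lrate    = 0.1,
  costp_l1 = 0,
  costq_l1 = 0,
  costp_l2 = 0.05,
  costq_l2 = 0.05
)
nrow(single)  # 1
```

Under that assumption, each extra candidate value multiplies the number of combinations, so trimming the vectors is the most direct way to shorten run time.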