19 Working with an SGE scheduler
If you are working on a cluster with an SGE scheduler, you can use the sge_submit() function to submit jobs to it. This function generates the required R script and shell script, and submits the job to the SGE scheduler.

dat <- read("some_data.csv")
resultsdir <- "./results"
logdir <- "./sge_logs"
mod <- sge_submit(
  {
    train_cv(
      dat,
      alg = "lightgbm",
      alg.params = list(num_leaves = 16, learning.rate = 0.01),
      outer.resampling = setup.resample(
        resampler = "kfold", n.resamples = 10, seed = 2023
      ),
      outdir = file.path(resultsdir, "mod_LightGBM16")
    )
  },
  obj_names = c("dat", "resultsdir"),
  packages = "rtemis",
  n_threads = 12,
  sge_out = logdir,
  h_rt = "10:00:00",
  system_command = "module load r"
)

Arguments:
- The first argument is an R expression (enclosed in curly braces) that will be evaluated. In this example, we are training a LightGBM model on the dataset dat.
- obj_names: a character vector of the names of objects to be exported to the job. These objects must be available in the current R session.
- packages: a character vector of packages to be loaded in the job.
- n_threads: the number of threads to be used in the job.
- sge_out: the directory where the SGE output files will be written.
- h_rt: the maximum runtime for the job (here "10:00:00", i.e. 10 hours in hh:mm:ss format).
- system_command: optional system command to be run before the R script. In this example, it is used to load the R module.
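
To make the roles of these arguments more concrete, the following is a minimal sketch of the kind of files such a submission involves. It is not the rtemis implementation: sketch_sge_job() is a hypothetical helper written for illustration, the pe argument (the parallel environment name, "smp" here) is an assumption that varies between clusters, and the exact directives sge_submit() writes may differ. The sketch saves the exported objects, writes an R script that restores them and evaluates the expression, writes a shell script with standard SGE directives, and returns the qsub command without running it.

```r
# Hypothetical helper for illustration only; NOT how rtemis implements sge_submit().
sketch_sge_job <- function(expr,
                           obj_names = character(),
                           packages = character(),
                           n_threads = 1,
                           sge_out = "./sge_logs",
                           h_rt = "01:00:00",
                           system_command = NULL,
                           pe = "smp", # assumption: parallel environment name is site-specific
                           workdir = tempfile("sge_job_")) {
  expr <- substitute(expr) # capture the expression without evaluating it
  caller <- parent.frame() # environment where the exported objects live
  dir.create(workdir, recursive = TRUE)

  # 1. Save the named objects so the job can restore them
  objs_file <- file.path(workdir, "objects.rds")
  saveRDS(mget(obj_names, envir = caller), objs_file)

  # 2. R script: load packages, restore objects, evaluate the expression
  r_file <- file.path(workdir, "job.R")
  writeLines(c(
    sprintf("library(%s)", packages),
    sprintf("list2env(readRDS(%s), envir = globalenv())", deparse(objs_file)),
    deparse(expr)
  ), r_file)

  # 3. Shell script: SGE directives (lines starting with "#$"), then the commands
  sh_file <- file.path(workdir, "job.sh")
  writeLines(c(
    "#!/bin/bash",
    "#$ -cwd",
    sprintf("#$ -o %s", sge_out), # stdout log location
    sprintf("#$ -e %s", sge_out), # stderr log location
    sprintf("#$ -l h_rt=%s", h_rt), # maximum runtime
    sprintf("#$ -pe %s %d", pe, as.integer(n_threads)), # slots requested
    system_command, # dropped automatically when NULL
    sprintf("Rscript %s", shQuote(r_file))
  ), sh_file)

  # Return the paths and the command; nothing is submitted here
  list(
    objects = objs_file,
    r_script = r_file,
    sh_script = sh_file,
    command = paste("qsub", shQuote(sh_file))
  )
}

# Usage: inspect the generated shell script instead of submitting it
x <- 1:10
job <- sketch_sge_job(
  {
    mean(x)
  },
  obj_names = "x",
  n_threads = 2,
  h_rt = "00:05:00",
  system_command = "module load r"
)
cat(readLines(job$sh_script), sep = "\n")
```

Returning the command instead of calling qsub keeps the sketch runnable on a machine without a scheduler, and makes it easy to check the generated scripts before anything is submitted.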