PipeOpTargetMutate breaks internal validation #1401

@wmay

The mlr3 book shows how to set the internal_valid_task property of a task for internal validation, including with graph learners. But the code breaks when using PipeOpTargetMutate.

It looks like PipeOpTargetMutate calls convert_task, which doesn't preserve internal_valid_task. Then the base learner raises an error because it can't find the validation data. Other target-related pipeops also have this issue.

It's not clear to me whether this should be considered a bug in PipeOpTargetMutate or in convert_task.

Minimal demonstration with convert_task alone:

library(mlr3)

tsk_mtcars = tsk("mtcars")
tsk_mtcars$internal_valid_task = sample(tsk_mtcars$nrow, 10)
convert_task(tsk_mtcars) # validation task is dropped

Example:

library(mlr3)
library(mlr3pipelines)
library(mlr3learners) # provides regr.xgboost

tsk_mtcars = tsk("mtcars")
tsk_mtcars$internal_valid_task = sample(tsk_mtcars$nrow, 10)
lrn_xgb = lrn("regr.xgboost")

# working graph learner
glrn = as_learner(po("pca") %>>% lrn_xgb)
set_validate(glrn, validate = "predefined")
glrn$train(tsk_mtcars) # it works

# failing graph learner (PipeOpTargetTrafoScaleRange is another target-related pipeop with the same problem)
ttscalerange = ppl("targettrafo", trafo_pipeop = PipeOpTargetTrafoScaleRange$new(),
                   graph = PipeOpLearner$new(lrn_xgb))
glrn2 = as_learner(ttscalerange)
set_validate(glrn2, validate = "predefined")
glrn2$train(tsk_mtcars) # error

It raises this error:

Error in create_internal_valid_task(validate, task, test_row_ids, prev_valid,  : 
  Parameter 'validate' is set to 'predefined' but no internal validation task is present. This commonly happens in GraphLearners and can be avoided by configuring the validation data for the  GraphLearner via `set_validate(<glrn>, validate = <value>)`. See https://mlr3book.mlr-org.com/chapters/chapter15/predsets_valid_inttune.html for more information.
This happened in PipeOp regr.xgboost's $train()
