Hi,
This may be a bug similar to #1787 (I was asked to open a new issue). With h2o learners, prediction fails when class names contain spaces, and possibly other non-standard characters.
Here's an example based on #1787 that still breaks for me:
(reprex added by @pat-s)
library(mlr)
#> Loading required package: ParamHelpers
set.seed(123)
df <- data.frame(matrix(runif(100, 0, 1), 100, 9))
classx <- sample(paste(letters[1:4],letters[1:4]), 100, replace = TRUE)
df <- cbind(classx, df)
classif.task = makeClassifTask(id = "example",
  data = df,
  target = "classx")
gb.lrn = makeLearner("classif.h2o.randomForest",
  predict.type = "prob")
rdesc = makeResampleDesc("CV", iters = 3, stratify = TRUE)
rin = makeResampleInstance(rdesc, task = classif.task)
r = resample(gb.lrn, classif.task, rin,
  measures = list(mmce))
#> Resampling: cross-validation
#> Measures: mmce
#> Connection successful!
#>
#> R is connected to the H2O cluster:
#> H2O cluster uptime: 1 minutes 1 seconds
#> H2O cluster timezone: Europe/Berlin
#> H2O data parsing timezone: UTC
#> H2O cluster version: 3.26.0.2
#> H2O cluster version age: 5 months and 4 days !!!
#> H2O cluster name: H2O_started_from_R_patrickschratz_fmz906
#> H2O cluster total nodes: 1
#> H2O cluster total memory: 4.00 GB
#> H2O cluster total cores: 8
#> H2O cluster allowed cores: 8
#> H2O cluster healthy: TRUE
#> H2O Connection ip: localhost
#> H2O Connection port: 54321
#> H2O Connection proxy: NA
#> H2O Internal Security: FALSE
#> H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4
#> R Version: R version 3.6.2 Patched (2019-12-12 r77564)
#> Warning in h2o.clusterInfo():
#> Your H2O cluster version is too old (5 months and 4 days)!
#> Please download and install the latest version from http://h2o.ai/download/
#>
#> Error in checkPredictLearnerOutput(.learner, .model, p): predictLearner for classif.h2o.randomForest has returned not the class levels as column names: a.a,b.b,c.c,d.d
Created on 2019-12-31 by the reprex package (v0.3.0)
The example runs without error when classif.h2o.randomForest is replaced with, e.g., classif.randomForestSRC. It also seems to fail with other non-standard characters in the class names, but I haven't done an exhaustive search.
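The column names in the error message (a.a, b.b, c.c, d.d) match what base R's make.names() produces from the original labels. The sketch below only illustrates that mapping and a possible workaround; that the h2o round trip sanitizes names this way is an assumption, not something verified against the mlr or h2o sources.

```r
# Class labels with spaces, as in the reprex above
classx <- factor(c("a a", "b b", "c c", "d d"))

# make.names() replaces characters that are not valid in R names with "."
sanitized <- make.names(levels(classx))
print(sanitized)
#> [1] "a.a" "b.b" "c.c" "d.d"

# Possible workaround (assumption): sanitize the levels up front so the
# task's class levels already match the names seen in the error message
levels(classx) <- make.names(levels(classx), unique = TRUE)
print(levels(classx))
```

In the reprex, the equivalent step would be to sanitize classx before calling makeClassifTask(), at the cost of changing the labels reported in the results.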
Many thanks for the wonderful package.
Andrew