
Importance of random initialisation / seed? #9

@knbknb

I am by no means an expert on t-SNE, and I think your blog post https://distill.pub/2016/misread-tsne/ is quite helpful for beginners like me.

After installing the R package Rtsne (https://github.com/jkrijthe/Rtsne), I ran one of the code examples from the package documentation. There I noticed that the random seed used for initialisation also matters a great deal. However, your blog post does not mention the influence of the random seed.

## documentation code
library(Rtsne) # load the package (needed to run the example)
iris_unique <- unique(iris) # Remove duplicates
iris_matrix <- as.matrix(iris_unique[,1:4])
set.seed(42) # Set a seed if you want reproducible results
tsne_out <- Rtsne(iris_matrix) # Run TSNE

# Show the objects in the 2D tsne representation
plot(tsne_out$Y,col=iris_unique$Species)
### end of r documentation

### my code
# run again without resetting the seed, i.e. with a different random initialisation: completely different-looking plot
tsne_out <- Rtsne(iris_matrix) # Run TSNE
plot(tsne_out$Y,col=iris_unique$Species)

For the small (n = 150) iris dataset, the clusters merely seem to be flipped to different edges of the 2D plotting area, but for more complex data the plots can look completely different, even when the hyperparameters are held constant and only the initial values change.
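To make the comparison easier to repeat, here is a minimal sketch that resets the seed before every run. It assumes the Rtsne package is installed and run with its default settings. `run_tsne` is a hypothetical helper written for this illustration, not part of Rtsne, and the seed values 42 and 1 are arbitrary.

```r
library(Rtsne)

iris_unique <- unique(iris)                    # Rtsne rejects duplicate rows by default
iris_matrix <- as.matrix(iris_unique[, 1:4])

# Hypothetical helper (not part of Rtsne): reset the RNG seed,
# run t-SNE with default hyperparameters, and return the 2D embedding.
run_tsne <- function(seed) {
  set.seed(seed)
  Rtsne(iris_matrix)$Y
}

y_a <- run_tsne(42)
y_b <- run_tsne(42)  # same seed as y_a
y_c <- run_tsne(1)   # different seed, same hyperparameters

# Compare the embeddings numerically instead of by eye.
same_seed_identical <- isTRUE(all.equal(y_a, y_b))
diff_seed_identical <- isTRUE(all.equal(y_a, y_c))
print(c(same_seed = same_seed_identical, diff_seed = diff_seed_identical))

# Side-by-side plots: only the seed differs between the two panels.
op <- par(mfrow = c(1, 2))
plot(y_a, col = iris_unique$Species, main = "seed = 42")
plot(y_c, col = iris_unique$Species, main = "seed = 1")
par(op)
```

With single-threaded defaults, I would expect the two runs with the same seed to give the same coordinates and the run with a different seed to differ, which is the effect described above.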

(Let me know if this is not written clearly enough.)
