Description
I am by no means an expert on t-SNE, and I think your blog post https://distill.pub/2016/misread-tsne/ is quite helpful for beginners like myself.
After installing the R package Rtsne (https://github.com/jkrijthe/Rtsne), I ran one of the code examples from the package documentation. There I noticed that the random seed used for initialisation also matters a great deal. However, your blog post does not mention the influence of the random seed.
## documentation code
library(Rtsne) # Load the package (needed to run this example)
iris_unique <- unique(iris) # Remove duplicates
iris_matrix <- as.matrix(iris_unique[,1:4])
set.seed(42) # Set a seed if you want reproducible results
tsne_out <- Rtsne(iris_matrix) # Run TSNE
# Show the objects in the 2D tsne representation
plot(tsne_out$Y,col=iris_unique$Species)
### end of r documentation
### my code
# run again without resetting the seed, so the random initialization differs:
# the plot looks completely different
tsne_out <- Rtsne(iris_matrix) # Run TSNE
plot(tsne_out$Y,col=iris_unique$Species)
For the small (n = 150) iris dataset, the clusters merely seem to swap positions, ending up at different edges of the 2D plotting area. For more complex data, however, the plots can look completely different, even when the hyperparameters are held constant and only the initial values change.
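To make the comparison easier to reproduce, here is a minimal sketch that runs Rtsne several times with identical hyperparameters and only a different seed, then plots the results side by side. The seeds 1, 2 and 3 and the names `seeds` and `runs` are my own arbitrary choices for illustration; the perplexity is simply written out at the package default.

```r
library(Rtsne)

iris_unique <- unique(iris)                   # Rtsne rejects duplicate rows by default
iris_matrix <- as.matrix(iris_unique[, 1:4])

seeds <- c(1, 2, 3)                           # arbitrary seeds, chosen for illustration
runs <- lapply(seeds, function(s) {
  set.seed(s)                                 # only the seed changes between runs
  Rtsne(iris_matrix, perplexity = 30)         # 30 is the package default
})

op <- par(mfrow = c(1, length(seeds)))        # one panel per seed
for (i in seq_along(seeds)) {
  plot(runs[[i]]$Y, col = iris_unique$Species,
       main = paste("seed =", seeds[i]), xlab = "", ylab = "")
}
par(op)                                       # restore the previous plot layout
```

Each panel uses the same data and the same settings, so any difference between the panels comes from the random initialization alone.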
(Let me know if this is not written clearly enough.)