How to reproduce exact results with the LDA function in R's topicmodels package
I've been unable to create reproducible results with topicmodels' LDA function. To take an example from the documentation:

    library(topicmodels)
    data("AssociatedPress")
    set.seed(0)
    lda1 <- LDA(AssociatedPress[1:20, ], control = list(seed = 0), k = 2)
    set.seed(0)
    lda2 <- LDA(AssociatedPress[1:20, ], control = list(seed = 0), k = 2)
    identical(lda1, lda2)
    # [1] FALSE

How can I get identical results from two separate calls to LDA?
As an aside (in case the package authors are on here), I find the control = list(seed = 0) snippet unfortunate and unnecessary. Behind the scenes, there's the line if (missing(seed)) seed <- as.integer(Sys.time()). That doesn't make the process any more reliably random; it just undoes the specified seed. Am I missing something?
Update: As @hrbrmstr discovered below, passing the seed through control does result in identical objects, the only difference being the location of a temporary local file. So this question was more of a misunderstanding (though it still seems it would be clearer if the function respected set.seed()).
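To illustrate the seeding pattern described above, here is a minimal base R sketch. The function draw() is a hypothetical stand-in and not topicmodels code; it only mimics the quoted if (missing(seed)) line to show why an explicit seed argument reproduces results while a preceding set.seed() call has no influence on the default.

```r
# Hypothetical stand-in for a function that manages its own seed.
# It is NOT part of topicmodels; it only mirrors the quoted pattern.
draw <- function(seed) {
  # With no seed supplied, fall back to the clock, so any earlier
  # set.seed() call by the user does not determine the result.
  if (missing(seed)) seed <- as.integer(Sys.time())
  set.seed(seed)
  runif(3)
}

# Passing the seed explicitly makes both calls start from the same state.
a <- draw(seed = 0)
b <- draw(seed = 0)
identical(a, b)
```

By analogy, this is why the seed has to travel through the control argument rather than through set.seed().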
This is not an "answer", but there's no other way to post code snippets :-)
I gave the following a go:
    library(topicmodels)
    data(AssociatedPress)

    lda1 <- LDA(AssociatedPress[1:20, ], control = list(seed = 0), k = 2)
    lda2 <- LDA(AssociatedPress[1:20, ], control = list(seed = 0), k = 2)

    identical(lda1, lda2)
    [1] FALSE
    all.equal(lda1, lda2)
    [1] "Attributes: < Component 5: Attributes: < Component 10: 1 string mismatch > >"

    a1 <- posterior(lda1, AssociatedPress)
    a2 <- posterior(lda2, AssociatedPress)
    identical(a1, a2)
    [1] TRUE
    all.equal(a1, a2)
    [1] TRUE

    all.equal(lda1@alpha, lda2@alpha)
    [1] TRUE
    all.equal(lda1@call, lda2@call)
    [1] TRUE
    all.equal(lda1@Dim, lda2@Dim)
    [1] TRUE
    all.equal(lda1@control, lda2@control)
    [1] "Attributes: < Component 10: 1 string mismatch >"
    all.equal(lda1@k, lda2@k)
    [1] TRUE
    all.equal(lda1@terms, lda2@terms)
    [1] TRUE
    all.equal(lda1@documents, lda2@documents)
    [1] TRUE
    all.equal(lda1@beta, lda2@beta)
    [1] TRUE
    all.equal(lda1@gamma, lda2@gamma)
    [1] TRUE
    all.equal(lda1@wordassignments, lda2@wordassignments)
    [1] TRUE
    all.equal(lda1@loglikelihood, lda2@loglikelihood)
    [1] TRUE
    all.equal(lda1@iter, lda2@iter)
    [1] TRUE
    all.equal(lda1@logLiks, lda2@logLiks)
    [1] TRUE
    all.equal(lda1@n, lda2@n)
    [1] TRUE

    identical(lda1@alpha, lda2@alpha)
    [1] TRUE
    identical(lda1@call, lda2@call)
    [1] TRUE
    identical(lda1@Dim, lda2@Dim)
    [1] TRUE
    identical(lda1@control, lda2@control)
    [1] FALSE
    identical(lda1@k, lda2@k)
    [1] TRUE
    identical(lda1@terms, lda2@terms)
    [1] TRUE
    identical(lda1@documents, lda2@documents)
    [1] TRUE
    identical(lda1@beta, lda2@beta)
    [1] TRUE
    identical(lda1@gamma, lda2@gamma)
    [1] TRUE
    identical(lda1@wordassignments, lda2@wordassignments)
    [1] TRUE
    identical(lda1@loglikelihood, lda2@loglikelihood)
    [1] TRUE
    identical(lda1@iter, lda2@iter)
    [1] TRUE
    identical(lda1@logLiks, lda2@logLiks)
    [1] TRUE
    identical(lda1@n, lda2@n)
    [1] TRUE

Is the "unequal" @control significant?
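The slot-by-slot checks above could be condensed into a small helper. This is only a sketch: compare_slots() and the ToyFit class are hypothetical names introduced for illustration, and the toy class stands in for a fitted model so that the snippet runs without topicmodels.

```r
library(methods)

# Compare two S4 objects of the same class slot by slot.
# Returns a named logical vector: TRUE where the slots are identical.
compare_slots <- function(x, y) {
  stopifnot(identical(class(x), class(y)))
  vapply(
    slotNames(x),
    function(s) identical(slot(x, s), slot(y, s)),
    logical(1)
  )
}

# Toy S4 class (hypothetical): same numbers, different temp file path.
setClass("ToyFit", representation(beta = "numeric", prefix = "character"))
fit1 <- new("ToyFit", beta = c(0.1, 0.9), prefix = tempfile())
fit2 <- new("ToyFit", beta = c(0.1, 0.9), prefix = tempfile())

res <- compare_slots(fit1, fit2)
names(res)[!res]  # names of the slots that differ
```

Applied to the two fits above as compare_slots(lda1, lda2), this should flag only the control slot, in line with the individual identical() results.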