random - How to reproduce exact results with LDA function in R's topicmodels package


I've been unable to create reproducible results with topicmodels' LDA function. To take an example from the documentation:

    library(topicmodels)
    data("AssociatedPress")  # example corpus shipped with the package

    set.seed(0)
    lda1 <- LDA(AssociatedPress[1:20, ], control=list(seed=0), k=2)
    set.seed(0)
    lda2 <- LDA(AssociatedPress[1:20, ], control=list(seed=0), k=2)
    identical(lda1, lda2)
    # [1] FALSE

How can I get identical results from two separate calls to LDA?

As an aside (in case the package authors are on here), I find the control=list(seed=0) snippet unfortunate and unnecessary. Behind the scenes, there's a line if (missing(seed)) seed <- as.integer(Sys.time()). This doesn't make the process any more reliably random, and it undoes a specified seed. Am I missing something?
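To make the fallback quoted above concrete, here is a minimal sketch of the same pattern in a toy function. The name fit_toy and its body are hypothetical and are not the topicmodels implementation; the sketch only illustrates that a routine which seeds itself from its own seed argument ignores any outer set.seed() call, while an explicitly passed seed still gives repeatable draws.

```r
# Toy stand-in for a fitting routine; NOT the topicmodels code.
fit_toy <- function(n = 5, seed) {
  # Fall back to a clock-based seed only when the caller supplied none.
  if (missing(seed)) seed <- as.integer(Sys.time())
  set.seed(seed)  # the routine seeds its own draws from `seed`
  runif(n)
}

set.seed(123)            # outer seed: has no effect on fit_toy's draws
a <- fit_toy(seed = 0)
set.seed(456)
b <- fit_toy(seed = 0)
same <- identical(a, b)  # same explicit seed, so the draws match
print(same)
```

Calling fit_toy() with no seed takes the clock-based branch, so in this toy the only route to reproducibility is the explicit argument.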

Update: as @hrbrmstr discovered below, passing the seed in control does result in identical objects, the only difference being a temp local file location. So this question was more of a misunderstanding (though it still seems it would be clearer if the function respected set.seed()).
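A small sketch of why a temp file location alone is enough to make identical() return FALSE. The classes ToyControl and ToyModel and the function fit_toy_model are hypothetical stand-ins, not the package's classes; the assumption is only that each fit stores a fresh tempfile() path next to otherwise deterministic results.

```r
library(methods)

# Hypothetical stand-in classes; slot names are illustrative only.
setClass("ToyControl", representation(seed = "integer", prefix = "character"))
setClass("ToyModel", representation(beta = "matrix", control = "ToyControl"))

fit_toy_model <- function(seed) {
  set.seed(seed)
  beta <- matrix(runif(4), nrow = 2)  # deterministic given the seed
  # tempfile() hands back a new path on each call, like a per-fit scratch location
  ctrl <- new("ToyControl", seed = as.integer(seed), prefix = tempfile())
  new("ToyModel", beta = beta, control = ctrl)
}

m1 <- fit_toy_model(0)
m2 <- fit_toy_model(0)

before <- identical(m1, m2)  # the prefix paths differ

# Align the bookkeeping field, then compare again.
m2@control@prefix <- m1@control@prefix
after <- identical(m1, m2)

print(c(before = before, after = after))
```

In this toy the estimated quantities match from the start; only the bookkeeping path keeps the two objects from being identical.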

Not an "answer", but there's no other way to post code snippets :-)

I gave the following a go:

    library(topicmodels)
    data(AssociatedPress)

    lda1 <- LDA(AssociatedPress[1:20, ], control=list(seed=0), k=2)
    lda2 <- LDA(AssociatedPress[1:20, ], control=list(seed=0), k=2)

    identical(lda1, lda2)
    # [1] FALSE

    all.equal(lda1, lda2)
    # [1] "Attributes: < Component 5: Attributes: < Component 10: 1 string mismatch > >"

    # Posterior distributions computed from the two fits
    a1 <- posterior(lda1, AssociatedPress)
    a2 <- posterior(lda2, AssociatedPress)

    identical(a1, a2)
    # [1] TRUE

    all.equal(a1, a2)
    # [1] TRUE

    # Slot-by-slot comparison with all.equal()
    all.equal(lda1@alpha, lda2@alpha)
    # [1] TRUE
    all.equal(lda1@call, lda2@call)
    # [1] TRUE
    all.equal(lda1@Dim, lda2@Dim)
    # [1] TRUE
    all.equal(lda1@control, lda2@control)
    # [1] "Attributes: < Component 10: 1 string mismatch >"
    all.equal(lda1@k, lda2@k)
    # [1] TRUE
    all.equal(lda1@terms, lda2@terms)
    # [1] TRUE
    all.equal(lda1@documents, lda2@documents)
    # [1] TRUE
    all.equal(lda1@beta, lda2@beta)
    # [1] TRUE
    all.equal(lda1@gamma, lda2@gamma)
    # [1] TRUE
    all.equal(lda1@wordassignments, lda2@wordassignments)
    # [1] TRUE
    all.equal(lda1@loglikelihood, lda2@loglikelihood)
    # [1] TRUE
    all.equal(lda1@iter, lda2@iter)
    # [1] TRUE
    all.equal(lda1@logLiks, lda2@logLiks)
    # [1] TRUE
    all.equal(lda1@n, lda2@n)
    # [1] TRUE

    # Slot-by-slot comparison with identical()
    identical(lda1@alpha, lda2@alpha)
    # [1] TRUE
    identical(lda1@call, lda2@call)
    # [1] TRUE
    identical(lda1@Dim, lda2@Dim)
    # [1] TRUE
    identical(lda1@control, lda2@control)
    # [1] FALSE
    identical(lda1@k, lda2@k)
    # [1] TRUE
    identical(lda1@terms, lda2@terms)
    # [1] TRUE
    identical(lda1@documents, lda2@documents)
    # [1] TRUE
    identical(lda1@beta, lda2@beta)
    # [1] TRUE
    identical(lda1@gamma, lda2@gamma)
    # [1] TRUE
    identical(lda1@wordassignments, lda2@wordassignments)
    # [1] TRUE
    identical(lda1@loglikelihood, lda2@loglikelihood)
    # [1] TRUE
    identical(lda1@iter, lda2@iter)
    # [1] TRUE
    identical(lda1@logLiks, lda2@logLiks)
    # [1] TRUE
    identical(lda1@n, lda2@n)
    # [1] TRUE
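The slot-by-slot checks above can also be done in a loop. The following is a sketch, not part of the package: compare_slots is a hypothetical helper built on slotNames() and slot() from the methods package, and the ToyFit class exists only to exercise it without needing a fitted model.

```r
library(methods)

# Compare two S4 objects of the same class slot by slot.
# Returns a named logical vector: TRUE where the slots are identical.
compare_slots <- function(x, y) {
  stopifnot(identical(class(x), class(y)))
  vapply(slotNames(x),
         function(s) identical(slot(x, s), slot(y, s)),
         logical(1))
}

# Hypothetical toy class used only to try the helper out.
setClass("ToyFit",
         representation(k = "integer", beta = "matrix", note = "character"))
f1 <- new("ToyFit", k = 2L, beta = diag(2), note = "run A")
f2 <- new("ToyFit", k = 2L, beta = diag(2), note = "run B")

res <- compare_slots(f1, f2)
differing <- names(res)[!res]  # names of the slots that differ
print(differing)
```

With topicmodels loaded, compare_slots(lda1, lda2) should report the same per-slot picture as the manual calls, assuming both objects have the same class.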

Is the "unequal" @control significant?

