Sentiment analysis - Theano classification task always gives 50% validation error and test error?
I am doing a text classification experiment with Theano's DBN (Deep Belief Network) and SdA (Stacked Denoising Autoencoder) examples. I have produced a feature/label dataset in the same way Theano's MNIST dataset is produced, and changed the feature length and output values of the examples to fit my dataset (2 outputs instead of 10, and the number of features my dataset has). Every time I run the experiments (both DBN and SdA) I get exactly 50% validation error and test error. Does anyone have an idea what I'm doing wrong? All I have done is produce a dataset out of a movie review dataset in the MNIST dataset format and pickle it.
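For readers who want to reproduce the setup, here is a rough sketch of what "MNIST dataset format" means for the tutorial code: a pickled tuple of three (features, labels) pairs for training, validation and test. The function name save_mnist_style, the split sizes and the seed are my own assumptions for illustration, not the code from the original experiment.

```python
import gzip
import pickle

import numpy as np


def save_mnist_style(features, labels, path, n_valid, n_test, seed=0):
    """Pickle (train, valid, test) pairs in the layout of mnist.pkl.gz.

    Each pair is (2-D float array of features, 1-D integer array of labels).
    The name and arguments are hypothetical, chosen for this sketch.
    """
    features = np.asarray(features, dtype="float32")
    labels = np.asarray(labels, dtype="int64")

    # Shuffle first so that no split ends up holding a single class,
    # which can happen when reviews are stored sorted by label.
    order = np.random.RandomState(seed).permutation(len(features))
    features, labels = features[order], labels[order]

    n_train = len(features) - n_valid - n_test
    train = (features[:n_train], labels[:n_train])
    valid = (features[n_train:n_train + n_valid],
             labels[n_train:n_train + n_valid])
    test = (features[n_train + n_valid:], labels[n_train + n_valid:])

    # Protocol 2 stays readable from Python 2, which older Theano setups use.
    with gzip.open(path, "wb") as f:
        pickle.dump((train, valid, test), f, protocol=2)
    return train, valid, test
```

With a file like this, the only changes needed in the tutorial scripts should be the dataset path, the number of inputs and the number of outputs.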
My DBN code is the same code you can find at http://www.deeplearning.net/tutorial/dbn.html and my SdA code is the same code you can find at http://www.deeplearning.net/tutorial/sda.html
The only difference is that I have made my own dataset instead of using the MNIST digit recognition dataset. My dataset consists of bag-of-words features from a movie review dataset, which of course has a different number of features and output classes, so I have made tiny modifications to the function parameters for the number of inputs and output classes. The code runs beautifully, but the results are always 50%. This is a sample output:
pre-training layer 2, epoch 77, cost -11.8415031463
pre-training layer 2, epoch 78, cost -11.8225591118
pre-training layer 2, epoch 79, cost -11.8309999005
pre-training layer 2, epoch 80, cost -11.8362189546
pre-training layer 2, epoch 81, cost -11.8251214285
pre-training layer 2, epoch 82, cost -11.8333494168
pre-training layer 2, epoch 83, cost -11.8564580976
pre-training layer 2, epoch 84, cost -11.8243052414
pre-training layer 2, epoch 85, cost -11.8373403275
pre-training layer 2, epoch 86, cost -11.8341470443
pre-training layer 2, epoch 87, cost -11.8272021013
pre-training layer 2, epoch 88, cost -11.8403720434
pre-training layer 2, epoch 89, cost -11.8393612003
pre-training layer 2, epoch 90, cost -11.828745041
pre-training layer 2, epoch 91, cost -11.8300890796
pre-training layer 2, epoch 92, cost -11.8209189065
pre-training layer 2, epoch 93, cost -11.8263340225
pre-training layer 2, epoch 94, cost -11.8348454378
pre-training layer 2, epoch 95, cost -11.8288419285
pre-training layer 2, epoch 96, cost -11.8366522357
pre-training layer 2, epoch 97, cost -11.840142131
pre-training layer 2, epoch 98, cost -11.8334445128
pre-training layer 2, epoch 99, cost -11.8523094141
The pretraining code for file dbn_moviereview.py ran for 430.33m
... getting finetuning functions
... finetunning model
epoch 1, minibatch 140/140, validation error 50.000000 %
epoch 1, minibatch 140/140, test error of best model 50.000000 %
epoch 2, minibatch 140/140, validation error 50.000000 %
epoch 3, minibatch 140/140, validation error 50.000000 %
epoch 4, minibatch 140/140, validation error 50.000000 %
optimization complete best validation score of 50.000000 %,with test performance 50.000000 %
The fine tuning code for file dbn_moviereview.py ran for 5.48m
I ran both SdA and DBN with 2 different feature sets, and got exactly 50% accuracy in all 4 experiments.
I asked the same question in Theano's user group, and the answer was that the feature values should be between 0 and 1.
So I used a normalizer to normalize the feature values, and that solved the problem.
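For anyone hitting the same thing, a minimal sketch of one way to do this is below. It assumes per-feature min-max scaling with NumPy; the post does not say which normalizer was actually used, and the function names fit_min_max and apply_min_max are made up for the example. The likely reason scaling matters is that the tutorial RBM and denoising autoencoder layers treat their inputs as values in [0, 1], so raw word counts fall outside what the pre-training cost expects.

```python
import numpy as np


def fit_min_max(train_x):
    """Compute the per-feature minimum and range on the training set only."""
    train_x = np.asarray(train_x, dtype="float64")
    col_min = train_x.min(axis=0)
    col_range = train_x.max(axis=0) - col_min
    # A constant column has range 0; use 1 so it maps to 0 instead of NaN.
    col_range[col_range == 0] = 1.0
    return col_min, col_range


def apply_min_max(x, col_min, col_range):
    """Scale x with the training statistics and clip the result into [0, 1]."""
    scaled = (np.asarray(x, dtype="float64") - col_min) / col_range
    # Validation/test values outside the training range are clipped.
    return np.clip(scaled, 0.0, 1.0).astype("float32")


# Toy bag-of-words counts (3 reviews, 3 words), for illustration only.
train_counts = np.array([[0, 3, 5], [2, 3, 10], [4, 3, 0]])
test_counts = np.array([[8, 3, 5]])

col_min, col_range = fit_min_max(train_counts)
train_scaled = apply_min_max(train_counts, col_min, col_range)
test_scaled = apply_min_max(test_counts, col_min, col_range)
```

The statistics are computed on the training set and reused for the validation and test sets, so nothing from the held-out data leaks into the scaling. Binarizing the counts (word present or absent) would be another way to get values into [0, 1].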