java - Mahout random forest classifier example: ArrayIndexOutOfBoundsException


While trying to run the random forest example, I encounter a java.lang.ArrayIndexOutOfBoundsException: 100 error. The 100 here corresponds to the number of trees. The map part is 100% complete and reduce is at 0%. I use hadoop-1.2.1 and mahout-distribution-0.7. I have also tried mahout-distribution-0.9 and get the same error.

Has anyone run this example successfully?

Problem found. If you are running Hadoop with mapred.job.tracker=local, PartialBuilder cannot determine the number of map tasks from mapred.map.tasks. As a consequence, it computes the wrong number of trees per map task.

Solution: don't use the parameter "-p" when running the random forest job on a local Hadoop.
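To make the miscount described above concrete, here is a minimal sketch in plain Java. It is not Mahout's actual code: the class TreeSplitSketch and its methods are hypothetical, and it assumes that each map task takes an even share of the trees (with the remainder going to partition 0) and that the driver collects results into an array sized to the requested tree count. It also assumes, for illustration only, that the job believes there is 1 map task while 4 actually run.

```java
/** Simplified illustration of a trees-per-mapper miscount; NOT Mahout's actual code. */
class TreeSplitSketch {

    /** Trees one mapper builds: an even share, with partition 0 taking the remainder. */
    static int treesForPartition(int assumedMaps, int totalTrees, int partition) {
        int share = totalTrees / assumedMaps;
        if (partition == 0) {
            share += totalTrees - share * assumedMaps;
        }
        return share;
    }

    /** Trees emitted overall when actualMaps mappers each size their share from assumedMaps. */
    static int totalTreesBuilt(int assumedMaps, int actualMaps, int totalTrees) {
        int built = 0;
        for (int partition = 0; partition < actualMaps; partition++) {
            built += treesForPartition(assumedMaps, totalTrees, partition);
        }
        return built;
    }

    /** Stores every emitted tree id in an array that only has room for totalTrees entries. */
    static void collect(int built, int totalTrees) {
        int[] slots = new int[totalTrees];
        for (int i = 0; i < built; i++) {
            slots[i] = i; // fails at i == totalTrees when too many trees were built
        }
    }

    public static void main(String[] args) {
        int totalTrees = 100;

        // Assumed and actual map counts agree: exactly totalTrees trees are built.
        System.out.println("consistent: " + totalTreesBuilt(4, 4, totalTrees));

        // Assumed count is 1 but 4 mappers run: every mapper builds all the trees.
        int built = totalTreesBuilt(1, 4, totalTrees);
        System.out.println("mismatched: " + built);

        try {
            collect(built, totalTrees);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("collecting failed: " + e.getMessage());
        }
    }
}
```

Under these assumptions the first index that does not fit equals the requested number of trees, which matches the 100 reported in the exception in the question.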

Details (a run without "-p" on local Hadoop):

windiana@host:~/mahout/data/> hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-0.9-job.jar org.apache.mahout.classifier.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -d testdata/kddtrain+.arff -ds testdata/kddtrain+.info -sl 5 -t 100 -o nsl-forest
warning: $hadoop_home deprecated.
14/08/07 11:25:18 info mapreduce.buildforest: inmem mapred implementation
14/08/07 11:25:18 info mapreduce.buildforest: building forest...
14/08/07 11:25:18 info util.nativecodeloader: loaded native-hadoop library
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: creating kddtrain+.info in /tmp/hadoop-martin/mapred/local/archive/-1415030653984777464_-1414908735_797966215/filetestdata-work-5026960219142699303 rwxr-xr-x
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: cached testdata/kddtrain+.info /tmp/hadoop-martin/mapred/local/archive/-1415030653984777464_-1414908735_797966215/filetestdata/kddtrain+.info
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: cached testdata/kddtrain+.info /tmp/hadoop-martin/mapred/local/archive/-1415030653984777464_-1414908735_797966215/filetestdata/kddtrain+.info
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: creating kddtrain+.arff in /tmp/hadoop-martin/mapred/local/archive/3941906571438652588_-1415143228_797959215/filetestdata-work-5750487161401524172 rwxr-xr-x
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: cached testdata/kddtrain+.arff /tmp/hadoop-martin/mapred/local/archive/3941906571438652588_-1415143228_797959215/filetestdata/kddtrain+.arff
14/08/07 11:25:19 info filecache.trackerdistributedcachemanager: cached testdata/kddtrain+.arff /tmp/hadoop-martin/mapred/local/archive/3941906571438652588_-1415143228_797959215/filetestdata/kddtrain+.arff
14/08/07 11:25:19 info mapred.jobclient: running job: job_local966281240_0001
14/08/07 11:25:19 info mapred.localjobrunner: waiting map tasks
14/08/07 11:25:19 info mapred.localjobrunner: starting task: attempt_local966281240_0001_m_000000_0
14/08/07 11:25:19 info util.processtree: setsid exited exit code 0
14/08/07 11:25:19 info mapred.task:  using resourcecalculatorplugin : org.apache.hadoop.util.linuxresourcecalculatorplugin@2df8fdda
14/08/07 11:25:19 info mapred.maptask: processing split: [firstid:0, nbtrees:100, seed:null]
14/08/07 11:25:19 info inmem.inmemmapper: loading data...
14/08/07 11:25:20 info mapred.jobclient:  map 0% reduce 0%
14/08/07 11:25:21 info inmem.inmemmapper: data loaded : 125973 instances
14/08/07 11:25:25 info mapred.localjobrunner:
14/08/07 11:25:26 info mapred.jobclient:  map 1% reduce 0%
...
14/08/07 11:27:59 info mapred.jobclient:  map 98% reduce 0%
14/08/07 11:28:00 info mapred.task: task:attempt_local966281240_0001_m_000000_0 done. , in process of commiting
14/08/07 11:28:00 info mapred.localjobrunner:
14/08/07 11:28:00 info mapred.task: task attempt_local966281240_0001_m_000000_0 allowed commit
14/08/07 11:28:00 info output.fileoutputcommitter: saved output of task 'attempt_local966281240_0001_m_000000_0' file:/home/martin/programmieren/mahout/data/cut/nsl-forest
14/08/07 11:28:00 info mapred.localjobrunner:
14/08/07 11:28:00 info mapred.task: task 'attempt_local966281240_0001_m_000000_0' done.
14/08/07 11:28:00 info mapred.localjobrunner: finishing task: attempt_local966281240_0001_m_000000_0
14/08/07 11:28:00 info mapred.localjobrunner: map task executor complete.
14/08/07 11:28:00 info mapred.jobclient:  map 99% reduce 0%
14/08/07 11:28:00 info mapred.jobclient: job complete: job_local966281240_0001
14/08/07 11:28:00 info mapred.jobclient: counters: 12
14/08/07 11:28:00 info mapred.jobclient:   file output format counters
14/08/07 11:28:00 info mapred.jobclient:     bytes written=2353226
14/08/07 11:28:00 info mapred.jobclient:   file input format counters
14/08/07 11:28:00 info mapred.jobclient:     bytes read=0
14/08/07 11:28:00 info mapred.jobclient:   filesystemcounters
14/08/07 11:28:00 info mapred.jobclient:     file_bytes_read=61962918
14/08/07 11:28:00 info mapred.jobclient:     file_bytes_written=45667235
14/08/07 11:28:00 info mapred.jobclient:   map-reduce framework
14/08/07 11:28:00 info mapred.jobclient:     map input records=100
14/08/07 11:28:00 info mapred.jobclient:     physical memory (bytes) snapshot=0
14/08/07 11:28:00 info mapred.jobclient:     spilled records=0
14/08/07 11:28:00 info mapred.jobclient:     total committed heap usage (bytes)=132120576
14/08/07 11:28:00 info mapred.jobclient:     cpu time spent (ms)=0
14/08/07 11:28:00 info mapred.jobclient:     virtual memory (bytes) snapshot=0
14/08/07 11:28:00 info mapred.jobclient:     split_raw_bytes=90
14/08/07 11:28:00 info mapred.jobclient:     map output records=100
14/08/07 11:28:00 info common.hadooputil: deleting file:/home/martin/programmieren/mahout/data/cut/nsl-forest
14/08/07 11:28:00 info mapreduce.buildforest: build time: 0h 2m 41s 702
14/08/07 11:28:00 info mapreduce.buildforest: forest num nodes: 130056
14/08/07 11:28:00 info mapreduce.buildforest: forest mean num nodes: 1300
14/08/07 11:28:00 info mapreduce.buildforest: forest mean max depth: 19
14/08/07 11:28:00 info mapreduce.buildforest: storing forest in: nsl-forest/forest.seq
