Java - Reading a text file against specific words
I am creating a tool in Java (in Eclipse) to determine whether a sentence contains a particular word or not.
I am using the Twitter4J library to search for tweets on Twitter.
I have used the Stanford NLP tagger to tag the tweets from Twitter. The tagged tweets are then stored in a text file.
Here is the code:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;

import edu.stanford.nlp.tagger.maxent.MaxentTagger;

public class TextTag {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        String tagged;

        // Initialize the tagger
        MaxentTagger tagger = new MaxentTagger("taggers/english-left3words-distsim.tagger");

        // The sample string
        String sample = "output tagged";

        // The tagged string
        tagged = tagger.tagString(sample);

        // Output the tagged sample string onto the console
        // System.out.println(tagged);

        /* Pick up sentences from the file output.txt and store the output of
           the tagged sentences in the file entitytagged.txt. */
        FileInputStream fstream = new FileInputStream("output.txt");
        DataInputStream in = new DataInputStream(fstream);
        BufferedReader br = new BufferedReader(new InputStreamReader(in));

        // Pick up sentences line by line from output.txt and store each one in the string sample
        while ((sample = br.readLine()) != null) {
            // Tag the string
            tagged = tagger.tagString(sample);
            FileWriter q = new FileWriter("entitytagged.txt", true);
            BufferedWriter out = new BufferedWriter(q);
            // Write to the file entitytagged.txt
            out.write(tagged);
            out.newLine();
            out.close();
        }
        br.close();
    }
}
My next step is to use the tagged tweets from entitytagged.txt and compare them against a list of positive words and a list of negative words.
I have created two text files, one with a list of positive words and one with a list of negative words. My goal is to loop through 10 different tagged tweets in the 'entitytagged.txt' file and check them against the positive.txt and negative.txt files to find out whether any of those words come up, so that I can tell whether each tweet is positive or negative.
My end result should look like this:
tweet 1: positive
tweet 2: negative
tweet 3: negative
etc.
At the moment, I am struggling to create an algorithm that can implement this.
Any help would be appreciated.
Thank you.
Here's a five-minute algorithm. Store the positive and negative words as delimited strings, then loop through the words in the tweet to see whether they exist in the delimited strings. You'll have to expand the split regex to include special characters:
String positiveWords = "|nice|happy|great|";
positiveWords = positiveWords.toLowerCase();
String negativeWords = "|bad|awful|mean|yuck|sad|";
negativeWords = negativeWords.toLowerCase();
String tweetOne = "nice day happy not sad at all";
tweetOne = tweetOne.toLowerCase();

// Split the tweet into words on whitespace
String[] arrWords = tweetOne.split("\\s");
int value = 0;
for (int i = 0; i < arrWords.length; i++) {
    // A word counts only if it appears between two delimiters
    if (positiveWords.indexOf("|" + arrWords[i] + "|") != -1) {
        System.out.println("pos word(+1): " + arrWords[i]);
        value++;
    }
    if (negativeWords.indexOf("|" + arrWords[i] + "|") != -1) {
        System.out.println("neg word(-1): " + arrWords[i]);
        value--;
    }
}
System.out.println("positive/negative value: " + value);
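To apply the same idea to the files from the question, here is a rough sketch, not a definitive implementation. It assumes positive.txt and negative.txt hold one word per line, and that each line of entitytagged.txt is one tagged tweet whose tokens look like nice_JJ (the tagger's default word_TAG form). The class name TweetSentiment, its helper methods, and the "neutral" label for a score of zero are all made up for illustration. The word lists go into sets instead of delimited strings, and the tag suffix and surrounding punctuation are stripped before the lookup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class TweetSentiment {

    // Hypothetical helper: loads one word per line into a lowercase set.
    static Set<String> loadWords(String path) throws IOException {
        Set<String> words = new HashSet<>();
        for (String line : Files.readAllLines(Paths.get(path))) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Drops a trailing "_TAG" (assumes tagged input), lowercases the word,
    // and trims leading/trailing characters that are not a-z or 0-9.
    static String normalize(String token) {
        int idx = token.lastIndexOf('_');
        String word = idx > 0 ? token.substring(0, idx) : token;
        return word.toLowerCase(Locale.ROOT)
                   .replaceAll("^[^a-z0-9]+|[^a-z0-9]+$", "");
    }

    // +1 for every positive word, -1 for every negative word.
    static int score(String taggedTweet, Set<String> positive, Set<String> negative) {
        int value = 0;
        for (String token : taggedTweet.trim().split("\\s+")) {
            String word = normalize(token);
            if (positive.contains(word)) {
                value++;
            }
            if (negative.contains(word)) {
                value--;
            }
        }
        return value;
    }

    // Assumption: a score of zero is reported as "neutral".
    static String label(int value) {
        if (value > 0) {
            return "positive";
        }
        if (value < 0) {
            return "negative";
        }
        return "neutral";
    }

    public static void main(String[] args) throws IOException {
        Set<String> positive = loadWords("positive.txt");
        Set<String> negative = loadWords("negative.txt");
        List<String> tweets = Files.readAllLines(Paths.get("entitytagged.txt"));
        for (int i = 0; i < tweets.size(); i++) {
            int value = score(tweets.get(i), positive, negative);
            System.out.println("tweet " + (i + 1) + ": " + label(value));
        }
    }
}
```

Like the delimited-string version, this is plain word counting and does not handle negation: in "happy not sad", the word "sad" still subtracts one from the score.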