tokenize - Java: StringTokenizer does not respect separator


I have the following code that extracts tab-separated strings into a list of strings:

static public List<String> getContents(File aFile, String separator) {
    // Strings, split based on separator
    List<String> contentList = new ArrayList<String>();
    StringTokenizer tokenizer = new StringTokenizer(Util.getContents(aFile), separator);
    while (tokenizer.hasMoreTokens()) {
        contentList.add(tokenizer.nextToken());
    }
    return contentList;
}

The separator in this case is therefore "\t".

As long as 2 strings are separated by 1 tab, this works great. However, the dataset sometimes has 2 strings separated by 2 tabs. This means 1 parameter is missing, and an empty string should be added to the list. The method ignores this and returns an array with 1 string less.
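To make the behavior concrete, here is a minimal sketch that reproduces it. The class name TokenizerDemo and the helper tokenize are hypothetical names used only for illustration. StringTokenizer treats a run of consecutive delimiters as a single delimiter, so it never yields an empty token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original code.
class TokenizerDemo {
    // Collects every token StringTokenizer yields for the given input.
    static List<String> tokenize(String text, String separator) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, separator);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a\tb\tc\td\te", "\t").size()); // 5
        System.out.println(tokenize("a\tb\tc\t\te", "\t").size());  // 4: the empty field is dropped
        System.out.println(tokenize("\t\t\t\t", "\t").size());      // 0: no tokens at all
    }
}
```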

In this particular case, I want an array of 5 strings back. That means a text containing 4 tabs and no other text should return an array of 5 empty strings (this is needed because the parsing job is based on that). Unfortunately, I have no control over the content, and I am working with millions of files generated outside of my control.

Is there a better way than StringTokenizer? Or do I have to implement this on my own?

Here are some examples:

String ok = "a\tb\tc\td\te";
String nok = "a\tb\tc\t\te";

Ralf

I found this: How to split a string in Java

and I can use

"mystring".split("\t", -1); 

to obtain empty strings if there are multiple separators clustering in 1 place.
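As a rough sketch of how this could replace the tokenizer loop: the class SplitDemo and the helper splitKeepingEmpty are hypothetical names, and wrapping the separator in Pattern.quote is an added assumption, there because String.split expects a regular expression rather than a literal separator. The negative limit is what keeps empty strings, including trailing ones.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
class SplitDemo {
    // Splits on a literal separator and keeps every empty field, including trailing ones.
    static List<String> splitKeepingEmpty(String text, String separator) {
        // Pattern.quote makes split() treat the separator as literal text, not as a regex.
        return Arrays.asList(text.split(Pattern.quote(separator), -1));
    }

    public static void main(String[] args) {
        System.out.println(splitKeepingEmpty("a\tb\tc\t\te", "\t"));    // [a, b, c, , e]
        System.out.println(splitKeepingEmpty("\t\t\t\t", "\t").size()); // 5
        System.out.println("\t\t\t\t".split("\t").length);              // 0: default limit drops trailing empties
    }
}
```

Without the -1 limit, split removes trailing empty strings, so a line that ends in tabs would still come back too short.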

Thanks anyway!

