tokenize - Java: StringTokenizer does not respect separator

May 15, 2011

I have the following code, which extracts tab-separated strings into a list of strings:

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

static public List<String> getContents(File aFile, String separator) {
    // strings, split based on separator
    List<String> contentList = new ArrayList<String>();
    StringTokenizer tokenizer = new StringTokenizer(Util.getContents(aFile), separator);
    while (tokenizer.hasMoreTokens()) {
        contentList.add(tokenizer.nextToken());
    }
    return contentList;
}

The separator in this case is therefore "\t".

As long as 2 strings are separated by 1 tab, this works great. However, the dataset sometimes has 2 strings separated by 2 tabs. That means 1 parameter is missing, and an empty string should be added to the list. The method ignores this and returns a list with 1 string less.

In this particular case, I want an array of 5 strings back. That means a text containing 4 tabs and no other text should return an array of 5 empty strings (this is needed for a parsing job that is based on it). Unfortunately, I have no control over the content, and I am working with millions of files that are generated outside of my control.

Is there a better way than StringTokenizer? Or do I have to implement this on my own?

Here are some examples:

String ok = "a\tb\tc\td\te";
String nok = "a\tb\tc\t\te";

Ralf
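To reproduce the behaviour described in the question, here is a minimal sketch that runs the same tokenizer loop over the two example strings. The class name TokenizerDemo and the helper tokenize are made up for illustration; they are not part of the original code. StringTokenizer treats a run of consecutive delimiters as a single separator, which is why the empty field disappears.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collects tokens the same way as the loop in the question.
    static List<String> tokenize(String text, String separator) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, separator);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String ok = "a\tb\tc\td\te";
        String nok = "a\tb\tc\t\te";
        System.out.println(tokenize(ok, "\t").size());  // 5
        System.out.println(tokenize(nok, "\t").size()); // 4: the empty field is skipped
    }
}
```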

I found this: How to split a string in Java

and you can use

"mystring".split("\t", -1);

to obtain empty strings if there are multiple separators clustering in 1 place.

Thanks anyway!
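As a rough sketch of how this could replace the tokenizer loop, the helper below splits with a limit of -1 so that empty fields, including trailing ones, are kept. The names SplitDemo and splitKeepingEmpty are hypothetical. Because String.split takes a regular expression, the sketch wraps the separator in Pattern.quote; for a plain "\t" this makes no difference, but it is an added precaution for separators such as "|" or ".".

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitDemo {
    // Hypothetical helper: splits on a literal separator and keeps empty fields.
    static List<String> splitKeepingEmpty(String text, String separator) {
        // Limit -1 tells split to keep trailing empty strings.
        return Arrays.asList(text.split(Pattern.quote(separator), -1));
    }

    public static void main(String[] args) {
        String nok = "a\tb\tc\t\te";
        String onlyTabs = "\t\t\t\t";

        System.out.println(splitKeepingEmpty(nok, "\t").size());      // 5, fourth entry is ""
        System.out.println(splitKeepingEmpty(onlyTabs, "\t").size()); // 5 empty strings
        System.out.println(onlyTabs.split("\t").length);              // 0 without the -1 limit
    }
}
```

Without the limit argument, split drops trailing empty strings, so a line that ends in tabs would still come back short.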
