machine learning - How to use Python to tokenize and chunk tagged sentences line by line


I'm a linguist, and I want to use Python to tokenize the sentences in a CSV document line by line. For each token I want to output its tag and its position within the tag (b for beginning or i for inside), as in the example below.

    "id", "sentence"
    "1", "<person>claire</person>lived in<location>london uk</location>for<time>2 years</time>"
    "2", "<location>uk</location> in<location>europe</location>"
    ...........
    ...........

    import pandas as pd

    dataframe = pd.read_csv(document)
    sentences = dataframe['sentence']
    for line in sentences:
        # print token, position in tag, tag

    >> claire  b-per  person
       lived   null   null
       in      null   null
       london  b-loc  location
       uk      i-loc  location
       for     null   null
       2       b-tim  time
       years   i-tim  time
       uk      b-loc  location
               null   null
       in      null   null
       europe  b-loc  location
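One possible way to get this output is sketched below, using only the standard `re` module. It is a minimal sketch and not a definitive solution. It assumes that tags are never nested, that tokens are separated by whitespace, and that the short label is simply the first three letters of the tag name (`person` becomes `per`). The names `SPAN_RE` and `tag_tokens` are made up for this illustration.

```python
import re

# Matches either a tagged span such as <person>claire</person> (groups 1 and 2)
# or a run of untagged text between spans (group 3).
SPAN_RE = re.compile(r"<(\w+)>(.*?)</\1>|([^<]+)")


def tag_tokens(sentence):
    """Return (token, position, tag) triples for one tagged sentence."""
    rows = []
    for match in SPAN_RE.finditer(sentence):
        tag, tagged_text, plain_text = match.groups()
        if tag is None:
            # Untagged text: every token gets the "null" placeholders
            # used in the expected output of the question.
            rows.extend((token, "null", "null") for token in plain_text.split())
            continue
        short = tag[:3]  # assumption: "person" -> "per", "location" -> "loc"
        for index, token in enumerate(tagged_text.split()):
            prefix = "b" if index == 0 else "i"  # first token begins the chunk
            rows.append((token, f"{prefix}-{short}", tag))
    return rows


sentence = (
    "<person>claire</person>lived in"
    "<location>london uk</location>for<time>2 years</time>"
)
for token, position, tag in tag_tokens(sentence):
    print(f"{token:<8}{position:<8}{tag}")
```

With the CSV loaded through pandas as in the question, the same function could be called inside the loop, for example `for line in dataframe['sentence']: rows = tag_tokens(line)`. Nested or malformed tags would need a real parser instead of a regular expression.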

