machine learning - How to use Python to tokenize and chunk tagged sentences line by line
I'm a linguist and want to use Python to tokenize the sentences in a CSV document line by line, and to tell me each token's tag and its position in the tag (B-beginning or I-inside), as in the example below.

    "id", "sentence"
    "1", "<person>claire</person>lived in<location>london uk</location>for<time>2 years</time>"
    "2", "<location>uk</location> is in<location>europe</location>"
    ...........
    ...........

    import pandas as pd

    dataframe = pd.read_csv(document)
    sentences = dataframe['sentence']
    for line in sentences:
        # print token, position, tag

    >>
    claire   b-per   person
    lived    null    null
    in       null    null
    london   b-loc   location
    uk       i-loc   location
    for      null    null
    2        b-tim   time
    years    i-tim   time
    uk       b-loc   location
    is       null    null
    in       null    null
    europe   b-loc   location
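One possible way to get this kind of output is sketched below, using only the standard-library `re` module. This is a minimal sketch, not a definitive solution: the function name `chunk`, the `SHORT` mapping from tag names to short labels (`per`, `loc`, `tim`), and the whitespace-based tokenization are all assumptions inferred from the expected output in the question. It also assumes tags are well formed and not nested.

```python
import re

# Assumed mapping from markup tag names to the short chunk labels
# seen in the expected output (hypothetical; extend as needed).
SHORT = {"person": "per", "location": "loc", "time": "tim"}

# Matches either a tagged span <tag>text</tag> or a run of untagged text.
SPAN = re.compile(r"<(\w+)>(.*?)</\1>|([^<]+)")


def chunk(sentence):
    """Return (token, position, tag) triples for one tagged sentence."""
    rows = []
    for match in SPAN.finditer(sentence):
        tag, inside, outside = match.groups()
        if tag is None:
            # Untagged text: every whitespace-separated token is outside a chunk.
            rows.extend((tok, "null", "null") for tok in outside.split())
        else:
            short = SHORT.get(tag, tag)  # fall back to the full tag name
            for i, tok in enumerate(inside.split()):
                prefix = "b" if i == 0 else "i"  # first token begins the chunk
                rows.append((tok, f"{prefix}-{short}", tag))
    return rows


if __name__ == "__main__":
    line = ("<person>claire</person>lived in<location>london uk</location>"
            "for<time>2 years</time>")
    for token, position, tag in chunk(line):
        print(token, position, tag)
```

With the DataFrame from the question, the same function could be applied inside the loop over `dataframe['sentence']`, calling `chunk(line)` for each line. Splitting on whitespace is the simplest tokenizer that fits the sample data; punctuation or nested tags would need a more careful approach.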