How do I use regex in Java to pull this from HTML?

April 15, 2011

I'm trying to pull data from ESPN box scores, and one of the HTML files has:

<td style="text-align:left" nowrap><a href="http://espn.go.com/nba/player/_/id/2754/channing-frye">channing frye</a>, pf</td>

and I'm interested in grabbing the name (channing frye) and the position (pf).

Right now I've been using Pattern.quote(start) + "(.*?)" + Pattern.quote(end) to grab the text between start and end, but I'm not sure how to grab text that starts with the pattern ...http://espn.go.com/nba/player/_/id/, which can be followed by (any integer)/anyfirst-anylast">, then grab the name I need (channing frye), then skip </a> and the comma, then grab the position I need (pf), and ends with the pattern </td>.

Thanks!

You can use this pattern (shown as it would be escaped inside a Java string literal):

\\/nba\\/player\\/_\\/.*\\\">(.*)<.+>,\\s(.*)<

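This will match a link in the HTML that contains `/nba/player/`. The first capture group holds the link text (the name) and the second holds the text after the comma (the position).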

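// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Note the \\\" before > : a regex-escaped quote inside a Java string literal.
String re = "\\/nba\\/player\\/_\\/.*\\\">(.*)<.+>,\\s(.*)<";
String str = "<td style=\"text-align:left\" nowrap><a href=\"http://espn.go.com/nba/player/_/id/2754/channing-frye\">channing frye</a>, pf</td>";

Pattern p = Pattern.compile(re, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);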
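To read the captured values, the matcher still has to be run with find() and the groups read back. The following is only a minimal sketch of that step, not the answerer's code. The class name BoxScoreExtractor, the method extractPlayers, and the stricter pattern (digits for the id, [^<]+ instead of greedy .*) are assumptions made for illustration, based solely on the one table cell shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoxScoreExtractor {
    // Hypothetical, stricter pattern (not from the original answer):
    // group 1 = link text (player name), group 2 = text after the comma (position).
    private static final Pattern PLAYER_CELL = Pattern.compile(
            "/nba/player/_/id/\\d+/[^\"]+\">([^<]+)</a>,\\s*([^<]+)</td>",
            Pattern.CASE_INSENSITIVE);

    // Returns one {name, position} pair for every matching table cell in the HTML.
    static List<String[]> extractPlayers(String html) {
        List<String[]> players = new ArrayList<>();
        Matcher m = PLAYER_CELL.matcher(html);
        while (m.find()) {
            players.add(new String[] { m.group(1).trim(), m.group(2).trim() });
        }
        return players;
    }

    public static void main(String[] args) {
        // The sample cell from the question.
        String html = "<td style=\"text-align:left\" nowrap>"
                + "<a href=\"http://espn.go.com/nba/player/_/id/2754/channing-frye\">"
                + "channing frye</a>, pf</td>";
        for (String[] player : extractPlayers(html)) {
            System.out.println(player[0] + " / " + player[1]);
        }
    }
}
```

With the sample cell, the loop finds one match, with "channing frye" in group 1 and "pf" in group 2. Using [^<]+ rather than .* keeps each group inside a single tag, which should matter once a whole page with many player rows is passed in, though that has only been checked here against the single cell above.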

Example: http://regex101.com/r/ha3uv0
