How do I use regex in Java to pull this from HTML?
I'm trying to pull data from ESPN box scores, and one of the HTML files has:
<td style="text-align:left" nowrap><a href="http://espn.go.com/nba/player/_/id/2754/channing-frye">channing frye</a>, pf</td>
and I'm interested in grabbing the name (channing frye) and the position (pf).
Right now, I've been using Pattern.quote(start) + "(.*?)" + Pattern.quote(end) to grab the text between start and end, but I'm not sure how I'm supposed to grab text that starts with the pattern .../http://espn.go.com/nba/player/_/id/ and can contain (any integer)/anyfirst-anylast">, then grab the name I need (channing frye), then </a>, and then grab the position I need (pf), which ends with the pattern </td>.
Thanks!
You can use this pattern:
\\/nba\\/player\\/_\\/.*\\\">(.*)<.+>,\\s(.*)<
This will match a link in the HTML that contains `/nba/player/`, capturing the name in group 1 and the position in group 2.
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // The closing quote of the href must be escaped as \\\" inside the Java string literal.
    String re = "\\/nba\\/player\\/_\\/.*\\\">(.*)<.+>,\\s(.*)<";
    String str = "<td style=\"text-align:left\" nowrap><a href=\"http://espn.go.com/nba/player/_/id/2754/channing-frye\">channing frye</a>, pf</td>";
    Pattern p = Pattern.compile(re, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(str);

Example: http://regex101.com/r/ha3uv0
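The snippet above stops once the matcher is created, so here is a minimal sketch of how the captured groups might be read back with find() and group(). It is not the answerer's exact code: the class name PlayerCellParser, the parse helper, and the stricter pattern (which also captures the numeric id and uses negated character classes instead of the greedy .*) are assumptions made for illustration, based on the "(any integer)/anyfirst-anylast" shape described in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlayerCellParser {
    // Hypothetical stricter variant of the pattern above:
    // group 1 = numeric player id, group 2 = name, group 3 = position.
    private static final Pattern PLAYER = Pattern.compile(
        "/nba/player/_/id/(\\d+)/[^\"]+\">([^<]+)</a>,\\s*([^<]+)</td>",
        Pattern.CASE_INSENSITIVE);

    // Returns one {id, name, position} array per player cell found in the HTML.
    static List<String[]> parse(String html) {
        List<String[]> players = new ArrayList<>();
        Matcher m = PLAYER.matcher(html);
        // find() scans for the next match anywhere in the input;
        // matches() would require the whole string to match.
        while (m.find()) {
            players.add(new String[] {
                m.group(1), m.group(2).trim(), m.group(3).trim()
            });
        }
        return players;
    }

    public static void main(String[] args) {
        String html = "<td style=\"text-align:left\" nowrap>"
            + "<a href=\"http://espn.go.com/nba/player/_/id/2754/channing-frye\">"
            + "channing frye</a>, pf</td>";
        for (String[] p : parse(html)) {
            // prints: channing frye (pf), id 2754
            System.out.println(p[1] + " (" + p[2] + "), id " + p[0]);
        }
    }
}
```

Because [^<]+ and [^"]+ cannot run past the end of a tag or attribute, this variant should keep working when several player cells sit on the same line, where a greedy .* would swallow everything up to the last cell. For anything beyond a fixed, well-known snippet like this one, an HTML parser is usually the more robust choice.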