Python - How to extract a specific paragraph tag

June 15, 2010

I want to extract the contents of this response:

<div class="bio-container">
    <p class="bio profile">
        chinedu boy
    </p>
</div>

Please assume there are other paragraph tags with different class attributes; I want to extract only the one with the class attribute "bio profile".

I want to extract "chinedu boy" to a file.

I tried

    desc = bs.find('p', {'class': 'bio profile'})

but it is not working.

This is the exact code I am trying to apply the answer above to:

    import urllib
    from bs4 import BeautifulSoup as bsoup
    import string

    httpresponse = urllib.urlopen("https://twitter.com/drericcole")
    html = httpresponse.read()
    bs = bsoup(html)
    desc = bs.find("p", class_="bio profile-field")
    print desc.get_text().strip()

But I get this error:

    print desc.get_text().strip()
    AttributeError: 'NoneType' object has no attribute 'get_text'

You should use the .get_text() method on desc. Using Python 2.7 and BeautifulSoup 4.3.2:

    from bs4 import BeautifulSoup as bsoup

    ofile = open("test.html")
    soup = bsoup(ofile)

    # Match the <p> whose class attribute is exactly "bio profile"
    desc = soup.find("p", class_="bio profile")
    # or: desc = soup.find("p", {"class": "bio profile"})
    print desc.get_text().strip()

Result:

    chinedu boy
    [Finished in 0.2s]
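The AttributeError in the question means find() returned None, i.e. no tag matched. Passing a multi-word string such as "bio profile" to class_ only matches when the class attribute is that exact string, so extra classes or a different order will miss. Below is a minimal sketch of one way to make the lookup more tolerant, using a CSS selector and a None check. The function name extract_bio, the inline HTML sample, and the choice of "html.parser" are assumptions made for illustration, not part of the original code.

```python
from bs4 import BeautifulSoup

# Sample markup (assumed for illustration), modelled on the question's snippet.
HTML = """
<div class="bio-container">
    <p class="bio profile">
        chinedu boy
    </p>
    <p class="bio other">
        something else
    </p>
</div>
"""


def extract_bio(html):
    """Return the stripped text of the first <p> carrying both the
    'bio' and 'profile' classes, or None if there is no such tag."""
    soup = BeautifulSoup(html, "html.parser")
    # The selector matches regardless of class order or extra classes.
    matches = soup.select("p.bio.profile")
    if not matches:
        return None
    return matches[0].get_text().strip()


bio = extract_bio(HTML)
if bio is None:
    print("no matching paragraph found")
else:
    print(bio)
```

Checking for None before calling get_text() turns the crash into a message you can act on, for example by printing part of the downloaded HTML to see which class names the page really uses.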

Hope this helps.
