Jsoup: how to get values when the class names are the same for all elements


I have this HTML and want to extract two details from it:

publisher: springer-verlag, price: $7,284

The problem is that the outer and inner class names are the same. Please suggest how to get the above two values from the HTML below using Jsoup.

<div class="details">
    <div class="fullname">analytical and bioanalytical chemistry (2011)</div>
    <div class="catbox">
        <div class="catcontents">
            <div class="contents_ct1">eigenfactor category:</div>
            <div class="contents_ct2" style="margin-left: -5px;">analytic chemistry</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1">isi category:</div>
            <div class="contents_ct2" style="margin-left: -49px;">co ea</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1">group:</div>
            <div class="contents_ct2" style="margin-left: -80px;">sci</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1">total articles (5yrs):</div>
            <div class="contents_ct2" style="margin-left: -12px;">3,544</div>
        </div>
    </div>
    <div class="catbox" style="margin-left: 20px">
        <div class="catcontents">
            <div class="contents_ct1">publisher:</div>
            <div class="contents_ct2" style="margin-left: -55px;">springer-verlag</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1">first published:</div>
            <div class="contents_ct2" style="margin-left: -35px;">2001</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1"><a href="http://journalprices.com/" title="prices provided journalprices.com" target="_blank" style="font-size: 11px">price:</a></div>
            <div class="contents_ct2" style="margin-left: -80px;">$7,284</div>
        </div>
        <div class="catcontents">
            <div class="contents_ct1">cost effectiveness:</div>
            <div class="contents_ct2" style="margin-left: -18px;">1.0302</div>
        </div>
    </div>
    <div class="tgraph">
        <div class="plotb">
            <iframe src="plot1.php?issn=1618-2642" width="370px" height="220px" frameborder=0 scrolling="no"></iframe>
        </div>
        <div class="plotb" style="margin-left: 10px">
            <iframe src="plot2.php?issn=1618-2642" width="340px" height="220px" frameborder=0 scrolling="no"></iframe>
        </div>
    </div>
</div>

Static HTML structure

Assuming the layout always follows the structure of the source provided, you can use simple CSS selector syntax to specify which element to parse.

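// doc is the org.jsoup.nodes.Document parsed from the HTML above.
// :eq(2) matches on the zero-based sibling index, so div.catbox:eq(2) is the second catbox.
Element publisher = doc.select("div.catbox:eq(2) div.catcontents div.contents_ct2").first();
// The price row is the third catcontents (index 2) inside that catbox.
Element price = doc.select("div.catbox:eq(2) div.catcontents:eq(2) div.contents_ct2").first();
System.out.println("publisher: " + publisher.text() + "\nprice: " + price.text());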

This would result in the following printout:

run:
publisher: springer-verlag
price: $7,284
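For anyone who wants to try the selectors end to end, here is a minimal self-contained sketch. It assumes jsoup is on the classpath; the class name StaticSelectorDemo, its helper methods, and the trimmed-down HTML string are made up for illustration and are not part of the original answer. The point it shows is that :eq(n) counts element siblings from zero, so div.fullname takes index 0 and the second catbox ends up at index 2.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class StaticSelectorDemo {
    // Trimmed-down copy of the question's markup (illustrative only).
    static final String HTML =
          "<div class='details'>"
        + "<div class='fullname'>journal name</div>"
        + "<div class='catbox'>"
        + "<div class='catcontents'><div class='contents_ct1'>group:</div>"
        + "<div class='contents_ct2'>sci</div></div>"
        + "</div>"
        + "<div class='catbox'>"
        + "<div class='catcontents'><div class='contents_ct1'>publisher:</div>"
        + "<div class='contents_ct2'>springer-verlag</div></div>"
        + "<div class='catcontents'><div class='contents_ct1'>first published:</div>"
        + "<div class='contents_ct2'>2001</div></div>"
        + "<div class='catcontents'><div class='contents_ct1'>"
        + "<a href='http://journalprices.com/'>price:</a></div>"
        + "<div class='contents_ct2'>$7,284</div></div>"
        + "</div>"
        + "</div>";

    // Returns the text of the first element matching the selector, or null if nothing matches.
    static String firstText(String css) {
        Document doc = Jsoup.parse(HTML);
        Element match = doc.select(css).first();
        return match == null ? null : match.text();
    }

    static String publisher() {
        // Second catbox (sibling index 2), first value cell found inside it.
        return firstText("div.catbox:eq(2) div.catcontents div.contents_ct2");
    }

    static String price() {
        // Second catbox, third catcontents row (sibling index 2), its value cell.
        return firstText("div.catbox:eq(2) div.catcontents:eq(2) div.contents_ct2");
    }

    public static void main(String[] args) {
        System.out.println("publisher: " + publisher());
        System.out.println("price: " + price());
    }
}
```

Because the helper returns null when nothing matches, a change in the page layout shows up as a null value rather than a NullPointerException on .text().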

Dynamic HTML structure

If the structure isn't the same every time, the code below should produce the same result, because it checks the text of the elements to identify them correctly.

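// Collect every label/value row, wherever it sits in the page.
Elements content = doc.select("div.catcontents");
Element publisher = null;
Element price = null;
for (Element element : content) {
    // The row text starts with its label, e.g. "publisher: springer-verlag".
    if (element.text().startsWith("publisher")) {
        publisher = element;
    }
    if (element.text().startsWith("price")) {
        price = element;
    }
}
// Note: either variable stays null if its label is not found in the page.
System.out.println(publisher.text() + "\n" + price.text());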
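If only the bare value is wanted (for example springer-verlag without the publisher: label), one possible variation on the loop is to pair each contents_ct1 label with the contents_ct2 value in the same row and collect them in a map. This is only a sketch under assumptions: jsoup is on the classpath, and the class name LabelLookupDemo, the method readDetails, and the SAMPLE string are hypothetical names introduced here, not something from the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LabelLookupDemo {
    // Two rows in the same shape as the question's markup (illustrative only).
    static final String SAMPLE =
          "<div class='catbox'>"
        + "<div class='catcontents'><div class='contents_ct1'>publisher:</div>"
        + "<div class='contents_ct2'>springer-verlag</div></div>"
        + "<div class='catcontents'><div class='contents_ct1'>"
        + "<a href='http://journalprices.com/'>price:</a></div>"
        + "<div class='contents_ct2'>$7,284</div></div>"
        + "</div>";

    // Maps each label (lowercased, trailing colon removed) to the text of its value cell.
    static Map<String, String> readDetails(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> details = new LinkedHashMap<>();
        for (Element row : doc.select("div.catcontents")) {
            Element label = row.select("div.contents_ct1").first();
            Element value = row.select("div.contents_ct2").first();
            if (label == null || value == null) {
                continue; // skip rows that do not have both cells
            }
            String key = label.text().trim().toLowerCase(Locale.ROOT);
            if (key.endsWith(":")) {
                key = key.substring(0, key.length() - 1);
            }
            details.put(key, value.text());
        }
        return details;
    }

    public static void main(String[] args) {
        Map<String, String> details = readDetails(SAMPLE);
        System.out.println("publisher: " + details.get("publisher"));
        System.out.println("price: " + details.get("price"));
    }
}
```

Looking values up by label means the order of the rows and the position of the catbox no longer matter, and a missing label simply yields null from the map.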
