Python - Looping through files


I have 100 files in a folder, named 1.htm to 100.htm. I run this code to extract info from a file and place the extracted info in another file, final.txt. Currently, I have to run the program manually for each of the 100 files. I need to construct a loop that runs the program 100 times, reading each file once. (Kindly explain in detail the exact edits I need to make in my code.)

Below is the code for file 6.htm:

import glob
import BeautifulSoup
from BeautifulSoup import BeautifulSoup

fo = open("6.htm", "r")
bo = open("output.txt" ,"w")
f = open("final.txt","a+")

htmltext = fo.read()
soup = BeautifulSoup(htmltext)
#print len(urls)
table = soup.findAll('table')
rows = table[0].findAll('tr');
for tr in rows:
    cols = tr.findAll('td')
    for td in cols:
        text = str(td.find(text=True)) + ';;;'
        if(text!=" ;;;"):
            bo.write(text);
            bo.write('\n');
fo.close()
bo.close()

b= open("output.txt", "r")

for j in range (1,5):
    str=b.readline();
for j in range(1, 15):
    str=b.readline();
    c=str.split(";;;")
    #print c[1]
    if(c[0]=="apd id:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="name/class:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="source:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="sequence:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="length:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="net charge:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="hydrophobic residue%:"):
        f.write(c[1])
        f.write("#")
    if(c[0]=="boman index:"):
        f.write(c[1])
        f.write("#")
f.write('\n');
b.close();
f.close();

f.close();
print "end"

import os

f = open("final.txt","a+")
for root, folders, files in os.walk('./path/to/html_files/'):
    for filename in files:
        fo = open(os.path.abspath(root + '/' + filename), "r")
        ...

And the rest of your code goes there.

Also consider (best practice):

with open(os.path.abspath(root + '/' + filename), "r") as fo:
    ...

That way you don't forget to close the file handle. The OS only allows a limited number of open file handles, so this makes sure you don't use them all up by mistake.
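To see what the with statement does with the handle, here is a minimal Python 3 sketch. The function name `read_with_context_manager` is made up for illustration and is not part of the original code:

```python
def read_with_context_manager(path):
    """Read a file and report whether the handle was closed afterwards."""
    with open(path, "r") as fo:
        text = fo.read()
    # Leaving the with-block closes the file, even if read() had raised.
    return text, fo.closed
```

The second value returned is `fo.closed`, which is already true by the time the function returns, without any explicit `fo.close()` call.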

Making your code look like this:

import os

with open("final.txt","a+") as f:
    for root, folders, files in os.walk('./path/to/html_files/'):
        for filename in files:
            with open(os.path.abspath(root + '/' + filename), "r") as fo:
                ...
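Since the files in the question are named 1.htm through 100.htm, another option is to build the names from a counter instead of walking the directory, which also keeps them in numeric order. A rough Python 3 sketch, assuming all files sit in one folder; `numbered_files`, `process_all` and their parameters are hypothetical names, and the line written to the output is only a stand-in for the real parsing:

```python
import os


def numbered_files(folder, first=1, last=100, ext=".htm"):
    """Yield folder/1.htm ... folder/100.htm in numeric order.

    Files that do not exist are skipped instead of raising an error.
    """
    for number in range(first, last + 1):  # range() excludes its end value
        path = os.path.join(folder, f"{number}{ext}")
        if os.path.isfile(path):
            yield path


def process_all(folder, out_path):
    """Append one line per input file to out_path."""
    with open(out_path, "a") as out:
        for path in numbered_files(folder):
            with open(path, "r") as fo:
                html_text = fo.read()
            # Stand-in for the real parsing: record file name and size only.
            out.write(f"{os.path.basename(path)}#{len(html_text)}\n")
```

Calling `process_all("./path/to/html_files/", "final.txt")` would then visit each file once, with the existing extraction code replacing the stand-in line.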

Also, never reuse built-in names such as str for your own variables:

str=b.readline(); 
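This matters more once the code runs in a loop: the script calls str(...) while parsing and later assigns to str, so on the second file that call would hit a string instead of the built-in. A small Python 3 sketch of the failure, where the function names and the sample line are invented for illustration:

```python
def shadowing_demo():
    """Return the name of the error raised when a shadowed `str` is called."""
    str = "apd id:;;;AP00001;;;"  # rebinds the name `str` in this scope
    try:
        str(42)  # no longer the built-in type, just a string
    except TypeError as exc:
        return type(exc).__name__
    return None


def safe_demo():
    """Same logic with a neutral variable name; the built-in keeps working."""
    line = "apd id:;;;AP00001;;;"
    return str(42), line.split(";;;")
```

Renaming the variable to something like `line` is enough to avoid the problem.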

There's no need for ; at the end of code lines. This is Python.. code in a comfy manner!

Last but not least..

if(c[0]=="apd id:"):
if(c[0]=="name/class:"):
if(c[0]=="source:"):
if(c[0]=="sequence:"):
if(c[0]=="length:"):
if(c[0]=="net charge:"):
if(c[0]=="hydrophobic residue%:"):
if(c[0]=="boman index:"):

Should be:

if(c[0]=="apd id:"):
elif(c[0]=="name/class:"):
elif(c[0]=="source:"):
elif(c[0]=="sequence:"):
elif(c[0]=="length:"):
elif(c[0]=="net charge:"):
elif(c[0]=="hydrophobic residue%:"):
elif(c[0]=="boman index:"):

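Unless you modify c along the way of course, which you don't.. so switch! Only one of these conditions can match a given c[0], so the remaining checks are wasted once one has matched.

Shit, I keep finding more horrible things in this code (which looks like it was copy-pasted from examples across the galaxies...):

You can condense the above if/elif chain into one if block: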

if(c[0] in ("apd id:", "name/class:", "source:", "sequence:", "length:", "net charge:", "hydrophobic residue%:", "boman index:")):
    f.write(c[1])
    f.write("#")

And also, skip the ( ... ) around the if conditions. Again.. Python.. program in a comfortable manner:

if c[0] in ("apd id:", "name/class:", "source:", "sequence:", "length:", "net charge:", "hydrophobic residue%:", "boman index:"):
    f.write(c[1])
    f.write("#")
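Going one step further, the membership test can live in a small helper that is called once per file. This is only a sketch in Python 3: `WANTED_LABELS` and `extract_fields` are made-up names, and it assumes each line has the form label;;;value, as the c[0] / c[1] checks above imply:

```python
# Labels copied from the if-block above.
WANTED_LABELS = frozenset([
    "apd id:", "name/class:", "source:", "sequence:",
    "length:", "net charge:", "hydrophobic residue%:", "boman index:",
])


def extract_fields(lines, sep=";;;"):
    """Return the wanted values from `label;;;value` lines, each followed by '#'."""
    parts = []
    for line in lines:
        c = line.rstrip("\n").split(sep)
        # Skip lines without a value and labels we do not care about.
        if len(c) > 1 and c[0] in WANTED_LABELS:
            parts.append(c[1] + "#")
    return "".join(parts)
```

The result can be written to final.txt with a single `f.write(extract_fields(b) + "\n")`, where `b` is the open output.txt handle from the original code.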
