encoding - Python: UnicodeEncodeError with codecs.open


I'm trying to work with orgnode.py (from here) to parse org files. These files are in English/Persian, and according to file -i they seem to be UTF-8 encoded. But I receive an error when I use the makelist function (which itself uses codecs.open with utf-8):

    >>> orgnode.makelist("toread.org")
    [**  [[http://www.apa.org/helpcenter/sexual-orientation.aspx][sexual orientation, homosexuality , bisexuality]]            :toread:
       added:[2013-11-06 wed]
    , **  [[http://stackoverflow.com/questions/11384516/how-to-make-all-org-files-under-a-folder-added-in-agenda-list-automatically][emacs - how to make all org-files under a folder added in agenda-list automatically? - stack overflow]]
       (setq org-agenda-text-search-extra-files '(agenda-archives "~/org/subdir/textfile1.txt" "~/org/subdir/textfile1.txt"))
       added:[2013-07-23 tue]
    , Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 63-66: ordinal not in range(128)

The function returns a list of org headings, but instead of the last item (which is written in Persian) it shows the error. Any suggestions on how I can deal with this error?

As the traceback tells you, the exception is raised by the statement you typed at the Python console (orgnode.makelist("toread.org")) itself, and not in one of the functions called during the evaluation of that statement.

This is typical of encoding errors that occur when the interpreter automatically converts the return value of a statement in order to display it on the console. The text displayed is the result of applying the repr() builtin to the return value.

Here the repr() of the result of makelist is a unicode object, which the interpreter tries to convert to str using the "ascii" codec by default.
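The failing conversion can be reproduced outside the console. The sketch below assumes a made-up Persian heading (`heading`) and a hypothetical helper (`try_ascii`); it performs explicitly the ASCII encoding that the Python 2 console applies implicitly, and captures the resulting exception instead of letting it propagate.

```python
# Hypothetical heading containing three Persian letters (escaped so that the
# source file itself stays pure ASCII).
heading = u"** \u0645\u062a\u0646"


def try_ascii(text):
    """Encode text as ASCII; return the bytes, or the exception on failure."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc


# Non-ASCII text cannot be encoded: the exception object is returned.
result = try_ascii(heading)

# Pure ASCII text encodes without trouble.
plain = try_ascii(u"** plain heading")
```

The `start` and `end` attributes of the captured exception give the offending character range, which is what the "position 63-66" part of the traceback above reports.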

The culprit is the Orgnode.__repr__ method (https://github.com/albins/orgnode/blob/master/orgnode.py#l592), which returns a unicode object (because the node content has been automatically decoded by codecs.open), although __repr__ methods are expected to return strings containing only safe (ASCII) characters.
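To see why the node content ends up as unicode, here is a minimal sketch of what codecs.open does when given an encoding. The file name `sample.org` and its content are made up for the illustration; the file is written to a temporary directory first so that the example is self-contained.

```python
import codecs
import os
import tempfile

# Hypothetical org file, created in a temporary directory for this example.
path = os.path.join(tempfile.mkdtemp(), "sample.org")

# Write one heading containing Persian letters, encoded as UTF-8 on disk.
with codecs.open(path, "w", "utf-8") as f:
    f.write(u"** \u0645\u062a\u0646\n")

# Reading through codecs.open decodes the bytes automatically:
# `line` is text (a unicode object on Python 2), not a byte string.
with codecs.open(path, "r", "utf-8") as f:
    line = f.readline()
```

Any string built from `line`, including the value returned by a `__repr__` method, is therefore unicode as well unless it is encoded explicitly.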

Here is the smallest change you can make to orgnode to work around the problem:

    --- a/orgnode.py
    +++ b/orgnode.py
    @@ -612,4 +612,4 @@ class Orgnode(object):
             # following will output the text used to construct the object
             n = n + "\n" + self.body
     
    -        return n
    +        return n.encode('utf-8')

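If you want a version that returns only ASCII characters, you can use the 'unicode-escape' codec instead of 'utf-8' (the 'string-escape' codec applies to byte strings, not to unicode objects).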
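For a unicode object, one codec that yields an ASCII-only result is 'unicode-escape'. The short sketch below compares it with 'utf-8' on a made-up Persian string (`heading` is an assumption for the illustration, not data from the question).

```python
# Hypothetical Persian text, written with escapes to keep the source ASCII.
heading = u"\u0645\u062a\u0646"

# UTF-8 keeps the characters but produces bytes above 127, so the result
# only displays correctly on a UTF-8 capable terminal.
as_utf8 = heading.encode("utf-8")

# unicode-escape replaces each non-ASCII character with a \uXXXX sequence,
# so the result contains only ASCII bytes.
as_escaped = heading.encode("unicode-escape")

# Both encodings are reversible.
round_trip = as_escaped.decode("unicode-escape")
```

The escaped form is less readable for Persian text, but it is safe on any console.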

This is only a quick and dirty fix. The right solution would be to write a proper __repr__ method, and also to add the __str__ and __unicode__ methods that this class lacks. (I might fix this myself if I find the time, as I am quite interested in using Python code to manipulate org-mode files.)
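As a rough idea of what that cleaner solution could look like, here is a sketch using a hypothetical `Node` class as a stand-in for Orgnode. It is not the real class and not a patch for orgnode.py; the attribute names `heading` and `body` are assumptions. `__repr__` returns an ASCII-only string, while `__unicode__` and `__str__` return the full text.

```python
import sys


class Node(object):
    """Hypothetical stand-in for Orgnode, reduced to a heading and a body."""

    def __init__(self, heading, body=u""):
        self.heading = heading
        self.body = body

    def __unicode__(self):
        # Full text as a unicode object (called by unicode() on Python 2).
        return self.heading + u"\n" + self.body

    def __str__(self):
        text = self.__unicode__()
        if sys.version_info[0] < 3:
            # Python 2: str() must return bytes, so encode explicitly.
            return text.encode("utf-8")
        return text

    def __repr__(self):
        # Escape non-ASCII characters so the result is safe on any console.
        escaped = self.heading.encode("unicode-escape").decode("ascii")
        return str("<Node %s>" % escaped)


# A list of nodes can now be displayed without triggering an encoding error.
nodes = [Node(u"** \u0645\u062a\u0646", u"body text")]
shown = repr(nodes)
```

With this split, the console always gets ASCII from repr(), and code that wants the readable text calls unicode() or str() explicitly.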

