'ascii' codec can't encode character u'\u2602' in position 438: ordinal not in range(128)












I am running into a problem where decoding a string raises one error and encoding it raises another (both errors are shown below). Is there a permanent solution for this?

P.S. Please note that you may not be able to reproduce the encoding error with the string I provided, as I couldn't copy/paste some of the offending text.



text = "sometext"

# Join the unique characters of text, one per line
string = '\n'.join(list(set(text)))
try:
    print "decode"
    text = string.decode('UTF-8')
except Exception as e:
    print e
    # Fall back to encoding if decoding fails
    text = string.encode('UTF-8')


Errors:

Error while using string.decode('UTF-8'):

'ascii' codec can't encode character u'\u2602' in position 438: ordinal not in range(128)

Error while using string.encode('UTF-8'):

Exception All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters









  • You should not be learning Python 2 in this day and age. Are you able to reproduce this with Python 3 (with the required changes to use print() etc)?

    – tripleee
    Nov 15 '18 at 7:17











  • If you can print the repr() of the problematic string, you will see a representation which should be easy to copy/paste here.

    – tripleee
    Nov 15 '18 at 7:17
















python utf-8






asked Nov 15 '18 at 6:41









user3508811













1 Answer
The First Error



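The code you have provided works as posted, because text is a byte string (you are using Python 2). string.decode('UTF-8') turns UTF-8 bytes into a unicode object; it does not produce ASCII. The message you are seeing, however, is an encode error from the ascii codec. In Python 2 that usually means the value was already a unicode object when .decode() was called: Python first encodes it implicitly with the default ASCII codec, which is only possible if the string contains nothing but ASCII characters. In your case it is encountering a Unicode character (specifically ☂, U+2602) that has no ASCII equivalent. If your input is a byte string that contains invalid UTF-8, you can get around decoding failures by using:

string.decode('UTF-8', 'ignore')

which will just ignore (i.e. replace with nothing) the bytes that cannot be decoded. It has no effect on the implicit ASCII encoding step, so it will not help if the value is already unicode.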
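A minimal sketch of one way to avoid the decode error, assuming the value may arrive either as a byte string or as text that is already unicode. In Python 2, calling .decode() on a unicode object first encodes it with the ASCII codec, which is one way a decode call can end in an "'ascii' codec can't encode" error. The helper name to_text is hypothetical and not part of any library; the snippet is written so that it runs on both Python 2.7 and Python 3.

```python
def to_text(value, encoding="utf-8", errors="strict"):
    """Return text; decode only if `value` is a byte string."""
    # On Python 2, `bytes` is an alias for `str`, so this check is true for
    # byte strings and false for `unicode` objects. On Python 3 it is true
    # for `bytes` and false for `str`.
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    # Already text: calling .decode() here is what would trigger the
    # implicit ASCII encode on Python 2, so return it unchanged.
    return value


raw = u"rain \u2602".encode("utf-8")          # UTF-8 bytes for "rain" plus U+2602
decoded = to_text(raw)                        # bytes -> text
unchanged = to_text(decoded)                  # text stays text, no error
lossy = to_text(b"ok\xff", errors="ignore")   # the invalid byte 0xff is dropped
```

Decoding once at the point where bytes enter the program, and keeping everything as text afterwards, tends to avoid this class of error altogether.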



The Second Error



This error is more interesting. The message does not come from the UTF-8 codec itself (UTF-8 can represent NULL bytes and control characters); it matches the check that XML libraries such as lxml apply to the strings passed to them. It appears the text you are handing over contains either NULL bytes or specific control characters, which XML does not allow. Again, the code that you have actually provided works, but something in the text that you are trying to encode is violating that constraint. You can try the same trick as above:

string.encode('UTF-8', 'ignore')

which will simply remove characters the codec cannot encode. It will not strip control characters, though, so you may still need to look into what it is in your specific text input that is causing the problem.
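If the offending characters turn out to be control characters, a hedged sketch of removing them explicitly follows. It assumes the input is already text (unicode on Python 2, str on Python 3) and uses the XML 1.0 rule that the only permitted characters below U+0020 are tab, line feed and carriage return, and that U+FFFE and U+FFFF are not permitted. The function name strip_xml_illegal is hypothetical.

```python
import re

# Characters outside the XML 1.0 "Char" production that commonly show up in
# scraped text: C0 control characters other than tab (\x09), LF (\x0a) and
# CR (\x0d), plus the noncharacters U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(u"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_xml_illegal(text):
    """Return `text` with characters that XML 1.0 forbids removed."""
    return _XML_ILLEGAL.sub(u"", text)


cleaned = strip_xml_illegal(u"a\x00b\x0bc\td\n")   # NULL and \x0b are removed
kept = strip_xml_illegal(u"rain \u2602")           # non-ASCII text is untouched
```

Unlike the 'ignore' error handler, this removes the characters by value, so legitimate non-ASCII characters such as U+2602 are preserved.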






  • This didn't help; even after adding string.decode('UTF-8', 'ignore') I get the same error.

    – user3508811
    Nov 15 '18 at 16:34











  • @user3508811 Can you provide the string you're trying to decode and the version of Python you're running it in?

    – nicklambourne
    Nov 16 '18 at 2:30











answered Nov 15 '18 at 7:25









nicklambourne












