1. What are emoji symbols? They are pictographic characters, most often posted from iPhone clients.
2. When content returned by the API contains an emoji, it usually shows up as an empty box “[]”, as in “[]晚安俩宝” (“good night, you two darlings”), where the box is actually a moon emoji. If you insert such text into a UTF-8 MySQL database, you get a warning like: “Incorrect string value: ‘\xF0\x9F\x94\x8E’ for column ‘line’ at row 1”.
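A byte string such as ‘\xF0\x9F\x94\x8E’ is the 4-byte UTF-8 encoding of a single Unicode character above U+FFFF (this one is U+1F50E; the sleepy face u’\U0001f62a’ is another). MySQL’s utf8 character set stores at most 3 bytes per character, and MySQL versions before 5.5.3 do not support 4-byte characters at all.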
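The byte counts can be checked directly. This is a minimal sketch, assuming Python 3; the helper name `utf8_length` and the sample characters are only illustrative.

```python
# Compare UTF-8 byte lengths of a character inside the Basic Multilingual
# Plane (BMP) with an emoji outside it.
samples = {
    "CJK character": u"\u665a",    # U+665A, inside the BMP
    "sleepy face": u"\U0001f62a",  # U+1F62A, outside the BMP
}

def utf8_length(ch):
    """Return how many bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

for name, ch in samples.items():
    print("%s U+%04X takes %d bytes" % (name, ord(ch), utf8_length(ch)))

# The bytes quoted in the MySQL warning decode to one code point above U+FFFF.
warning_bytes = b"\xF0\x9F\x94\x8E"
decoded = warning_bytes.decode("utf-8")
print("U+%04X" % ord(decoded))
```

Any character whose `utf8_length` is 4 will trigger the warning on a 3-byte utf8 column.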
3. Using a pattern built with re.compile, you can replace every 4-byte Unicode character with whatever text you want.
[cce]
>>> import re
>>> highpoints = re.compile(u'[\U00010000-\U0010ffff]')
>>> example = u'Some example text with a sleepy face: \U0001f62a'
>>> print(highpoints.sub(u'[我是笑脸]', example))
Some example text with a sleepy face: [我是笑脸]
[/cce]
Here \U0001f62a is the sleepy-face emoji (U+1F62A), and the replacement text [我是笑脸] means “[I am a smiley face]”.
You can print the result in a Python shell to confirm.
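The same pattern can be wrapped in a small helper that also reports how many characters were replaced, which is handy for logging before a database insert. This is only a sketch: the function name `replace_4byte` and the default placeholder are hypothetical, and it assumes Python 3 or a "wide" Python 2 build (narrow builds may reject this character range).

```python
import re

# Matches any code point outside the Basic Multilingual Plane (U+10000 and up).
HIGHPOINTS = re.compile(u"[\U00010000-\U0010ffff]")

def replace_4byte(text, placeholder=u"[emoji]"):
    """Return (cleaned_text, count), replacing each 4-byte character."""
    # re.subn returns the new string together with the number of substitutions.
    return HIGHPOINTS.subn(placeholder, text)

cleaned, count = replace_4byte(u"Good night \U0001f62a\U0001f50e")
print(cleaned)
print(count)
```

A count of zero means the text is already safe for a 3-byte utf8 column.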
4. If it still does not print correctly, check your locale settings by typing: locale
Make sure these are set:
[cce]
LANG=en_US.UTF-8
LC_ALL="en_US.UTF-8"
[/cce]
Or run the following to apply the settings to the current session:
[cce]
export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'
[/cce]
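To see what Python itself picked up from the locale, you can inspect the encodings it reports. A rough sketch, assuming Python 3; `can_encode` is a hypothetical helper, not part of any library.

```python
import locale
import sys

# Encoding used when writing to the terminal (can be None if output is piped).
stdout_encoding = getattr(sys.stdout, "encoding", None)
# Encoding derived from the locale settings (LANG / LC_ALL).
preferred_encoding = locale.getpreferredencoding()
print(stdout_encoding)
print(preferred_encoding)

def can_encode(text, encoding):
    """Return True if text can be written using the given encoding."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers an unknown encoding name.
        return False

# An emoji fits in UTF-8 but not in ASCII.
print(can_encode(u"\U0001f62a", "utf-8"))
print(can_encode(u"\U0001f62a", "ascii"))
```

If the reported encodings are not UTF-8, printing an emoji is likely to fail or show garbage until the locale is fixed.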