Add missing checks for failures in calls to PyUnicode_AsUTF8String.

Previously a segfault could occur when passing a string that cannot be encoded as UTF-8 (a lone low surrogate), e.g. passing u"\udcff" to the C layer (Python 3).
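
The failure mode can be reproduced at the Python level. The sketch below is not part of this commit: `encode_utf8_checked` is a hypothetical helper that mirrors the idea of checking the conversion result before using it, assuming Python 3, where `str.encode("utf-8")` rejects lone surrogates much as `PyUnicode_AsUTF8String` returns NULL for them.

```python
def encode_utf8_checked(text):
    """Return the UTF-8 bytes for text, or None if it cannot be encoded.

    Hypothetical helper for illustration only; the None return stands in
    for the NULL result that the C layer has to check for.
    """
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        # Python 3 refuses to encode lone surrogates as UTF-8.
        return None


low_surrogate_string = u"\udcff"

ok = encode_utf8_checked(u"hello")
bad = encode_utf8_checked(low_surrogate_string)

print(ok)
print(bad)
```

Code that uses the result without checking it, as the C layer did before this change, would be working with a missing value in the second case.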
2017-12-04 18:41:55 +00:00
commit b0e29fbdf3
parent 069ce1f6e9
12 changed files with 92 additions and 25 deletions
--- a/Examples/test-suite/python/unicode_strings_runme.py
+++ b/Examples/test-suite/python/unicode_strings_runme.py
@@ -25,3 +25,13 @@ if sys.version_info[0:2] < (3, 0):
    check(unicode_strings.charstring(unicode("hello4")), "hello4")
    unicode_strings.charstring(u"hell\xb05")
    unicode_strings.charstring(u"hell\u00f66")
+
+low_surrogate_string = u"\udcff"
+try:
+    unicode_strings.instring(low_surrogate_string)
+    # Will succeed with Python 2
+except TypeError as e:
+    # Python 3 will fail the PyUnicode_AsUTF8String conversion resulting in a TypeError.
+    # The real error is actually:
+    # UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 0: surrogates not allowed
+    pass