From 01611702ec04fa70445fd2c7d37b9b312d3f7561 Mon Sep 17 00:00:00 2001
From: William S Fulton
Date: Sat, 19 Dec 2015 03:52:33 +0000
Subject: [PATCH] Python 2 Unicode strings can be used as inputs to char * or
std::string types
Requires SWIG_PYTHON_2_UNICODE to be defined when compiling generated code.
---
CHANGES.current | 4 ++
Doc/Manual/Contents.html | 1 +
Doc/Manual/Python.html | 66 +++++++++++++++++++
.../python/unicode_strings_runme.py | 9 +++
Examples/test-suite/unicode_strings.i | 8 +++
5 files changed, 88 insertions(+)
diff --git a/CHANGES.current b/CHANGES.current
index a0e6dfa2b..050ff54cc 100644
--- a/CHANGES.current
+++ b/CHANGES.current
@@ -5,6 +5,10 @@ See the RELEASENOTES file for a summary of changes in each release.
Version 3.0.8 (in progress)
===========================
+2015-12-19: wsfulton
+ [Python] Python 2 Unicode strings can be used as inputs to char * or std::string
+ types, encoded as UTF-8, if the generated C/C++ code has SWIG_PYTHON_2_UNICODE defined.
+
2015-12-17: wsfulton
Issues #286, #128
Remove ccache-swig.1 man page - please use the CCache.html docs instead.
diff --git a/Doc/Manual/Contents.html b/Doc/Manual/Contents.html
index 21ba6eaad..6d2cdaa76 100644
--- a/Doc/Manual/Contents.html
+++ b/Doc/Manual/Contents.html
@@ -1598,6 +1598,7 @@
Buffer interface
Abstract base classes
Byte string output conversion
+Python 2 Unicode
diff --git a/Doc/Manual/Python.html b/Doc/Manual/Python.html
index 962ee6843..c5219b693 100644
--- a/Doc/Manual/Python.html
+++ b/Doc/Manual/Python.html
@@ -122,6 +122,7 @@
Buffer interface
Abstract base classes
Byte string output conversion
+Python 2 Unicode
@@ -6163,6 +6164,71 @@ For more details about the surrogateescape error handler, please see
PEP 383.
+36.12.5 Python 2 Unicode
+
+
+
+A Python 3 string is a Unicode string, so by default a Python 3 string containing Unicode
+characters that is passed to C/C++ is accepted and converted to a C/C++ string
+(char * or std::string types).
+A Python 2 string is not a Unicode string by default, and a Unicode string
+passed to C/C++ fails to convert to a C/C++ string
+(char * or std::string types).
+The Python 2 behavior can be made more like Python 3 by defining
+SWIG_PYTHON_2_UNICODE when compiling the generated C/C++ code.
+By default, when the following is wrapped:
+
+
+
+%module unicode_strings
+char *charstring(char *s) {
+ return s;
+}
+
+
+
+An error will occur when using Unicode strings in Python 2:
+
+
+
+>>> from unicode_strings import *
+>>> charstring("hi")
+'hi'
+>>> charstring(u"hi")
+Traceback (most recent call last):
+ File "<stdin>", line 1, in ?
+TypeError: in method 'charstring', argument 1 of type 'char *'
+
+
+
+When the SWIG_PYTHON_2_UNICODE macro is added to the generated code:
+
+
+
+%module unicode_strings
+%begin %{
+#define SWIG_PYTHON_2_UNICODE
+%}
+
+char *charstring(char *s) {
+ return s;
+}
+
+
+
+Unicode strings are accepted and encoded as UTF-8 for the C/C++ side,
+but note that the result is returned as a normal Python 2 string:
+
+
+
+>>> from unicode_strings import *
+>>> charstring("hi")
+'hi'
+>>> charstring(u"hi")
+'hi'
+>>>
+
+
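The input conversion described in the new section can be modelled in plain Python. This is only a rough sketch for reasoning about the behavior, not SWIG's generated code: the function `to_c_string` and its `accept_unicode` flag are hypothetical names, with the flag standing in for whether `SWIG_PYTHON_2_UNICODE` was defined at compile time. Explicit `b"..."` and `u"..."` literals are used so that the sketch behaves the same on Python 2 and Python 3.

```python
# Hypothetical model of the char * input conversion; not SWIG's actual code.
text_type = type(u"")   # unicode on Python 2, str on Python 3
bytes_type = type(b"")  # str on Python 2, bytes on Python 3


def to_c_string(value, accept_unicode):
    """Return the bytes a char * parameter would receive, or raise TypeError.

    accept_unicode is an assumed stand-in for SWIG_PYTHON_2_UNICODE being defined.
    """
    if isinstance(value, bytes_type):
        # A normal Python 2 string is passed through unchanged.
        return value
    if isinstance(value, text_type) and accept_unicode:
        # A Unicode string is encoded as UTF-8 before reaching C/C++.
        return value.encode("utf-8")
    raise TypeError("in method 'charstring', argument 1 of type 'char *'")


print(repr(to_c_string(b"hi", accept_unicode=False)))
print(repr(to_c_string(u"hi", accept_unicode=True)))
try:
    to_c_string(u"hi", accept_unicode=False)
except TypeError as exc:
    print("TypeError: %s" % exc)
```

In this model the Unicode input comes back as a byte string, which mirrors the note above that the wrapped function returns a normal Python 2 string rather than a Unicode one.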