Python 2 Unicode strings can be used as inputs to char * or std::string types
Requires SWIG_PYTHON_2_UNICODE to be defined when compiling generated code.
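
The input conversion described in this commit message can be modelled in plain Python. This is a minimal sketch, not SWIG code: the name `to_char_star_bytes` and the `unicode_enabled` flag are hypothetical stand-ins for the wrapper behaviour with and without `SWIG_PYTHON_2_UNICODE`. It assumes the Unicode string is encoded as UTF-8 before it reaches the `char *` parameter.

```python
# -*- coding: utf-8 -*-
# Hypothetical pure-Python model of the input conversion; this is not SWIG code.

TEXT_TYPE = type(u"")  # 'unicode' on Python 2, 'str' on Python 3


def to_char_star_bytes(value, unicode_enabled=True):
    """Return the bytes a wrapped char * parameter would be given."""
    if isinstance(value, bytes):
        # Byte strings (the Python 2 str type) pass through unchanged.
        return value
    if isinstance(value, TEXT_TYPE) and unicode_enabled:
        # Assumed behaviour with SWIG_PYTHON_2_UNICODE: encode as UTF-8.
        return value.encode("utf-8")
    # Mirrors the TypeError quoted in the documentation when the macro is absent.
    raise TypeError("in method 'charstring', argument 1 of type 'char *'")


if __name__ == "__main__":
    print(repr(to_char_star_bytes(u"hell\u00f66")))
```

In this model the result is always a byte string, which matches the documentation's note that the value comes back as a normal Python 2 string and not as a Unicode object.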
parent 291186cfaf
commit 01611702ec
5 changed files with 88 additions and 0 deletions

@@ -5,6 +5,10 @@ See the RELEASENOTES file for a summary of changes in each release.
Version 3.0.8 (in progress)
===========================

2015-12-19: wsfulton
[Python] Python 2 Unicode UTF-8 strings can be used as inputs to char * or
std::string types if the generated C/C++ code has SWIG_PYTHON_2_UNICODE defined.

2015-12-17: wsfulton
Issues #286, #128
Remove ccache-swig.1 man page - please use the CCache.html docs instead.

@@ -1598,6 +1598,7 @@
<li><a href="Python.html#Python_nn75">Buffer interface</a>
<li><a href="Python.html#Python_nn76">Abstract base classes</a>
<li><a href="Python.html#Python_nn77">Byte string output conversion</a>
<li><a href="Python.html#Python_2_unicode">Python 2 Unicode</a>
</ul>
</ul>
</div>

@@ -122,6 +122,7 @@
<li><a href="#Python_nn75">Buffer interface</a>
<li><a href="#Python_nn76">Abstract base classes</a>
<li><a href="#Python_nn77">Byte string output conversion</a>
<li><a href="#Python_2_unicode">Python 2 Unicode</a>
</ul>
</ul>
</div>
@@ -6163,6 +6164,71 @@ For more details about the <tt>surrogateescape</tt> error handler, please see
<a href="https://www.python.org/dev/peps/pep-0383/">PEP 383</a>.
</p>

<H3><a name="Python_2_unicode"></a>36.12.5 Python 2 Unicode</H3>


<p>
A Python 3 string is a Unicode string so by default a Python 3 string that contains Unicode
characters passed to C/C++ will be accepted and converted to a C/C++ string
(<tt>char *</tt> or <tt>std::string</tt> types).
A Python 2 string is not a unicode string by default and should a Unicode string be
passed to C/C++ it will fail to convert to a C/C++ string
(<tt>char *</tt> or <tt>std::string</tt> types).
The Python 2 behavior can be made more like Python 3 by defining
<tt>SWIG_PYTHON_2_UNICODE</tt> when compiling the generated C/C++ code.
By default when the following is wrapped:
</p>

<div class="code"><pre>
%module unicode_strings
char *charstring(char *s) {
  return s;
}
</pre></div>

<p>
An error will occur when using Unicode strings in Python 2:
</p>

<div class="targetlang"><pre>
>>> from unicode_strings import *
>>> charstring("hi")
'hi'
>>> charstring(u"hi")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: in method 'charstring', argument 1 of type 'char *'
</pre></div>

<p>
When the <tt>SWIG_PYTHON_2_UNICODE</tt> macro is added to the generated code:
</p>

<div class="code"><pre>
%module unicode_strings
%begin %{
#define SWIG_PYTHON_2_UNICODE
%}

char *charstring(char *s) {
  return s;
}
</pre></div>

<p>
Unicode strings will be successfully accepted and converted from UTF-8,
but note that they are returned as a normal Python 2 string:
</p>

<div class="targetlang"><pre>
>>> from unicode_strings import *
>>> charstring("hi")
'hi'
>>> charstring(u"hi")
'hi'
>>>
</pre></div>

</body>
</html>

@@ -12,3 +12,12 @@ if sys.version_info[0:2] >= (3, 1):
        raise ValueError('Test comparison mismatch')
    if unicode_strings.non_utf8_std_string() != test_string:
        raise ValueError('Test comparison mismatch')

# Testing SWIG_PYTHON_2_UNICODE flag which allows unicode strings to be passed to C
if sys.version_info[0:2] < (3, 0):
    assert unicode_strings.charstring("hello1") == "hello1"
    assert unicode_strings.charstring(str(u"hello2")) == "hello2"
    assert unicode_strings.charstring(u"hello3") == "hello3"
    assert unicode_strings.charstring(unicode("hello4")) == "hello4"
    unicode_strings.charstring(u"hell\xb05")
    unicode_strings.charstring(u"hell\u00f66")

@@ -2,6 +2,10 @@

%include <std_string.i>

%begin %{
#define SWIG_PYTHON_2_UNICODE
%}

%inline %{

const char* non_utf8_c_str(void) {
@@ -12,4 +16,8 @@ std::string non_utf8_std_string(void) {
  return std::string("h\xe9llo w\xc3\xb6rld");
}

char *charstring(char *s) {
  return s;
}

%}