@@ -317,7 +317,7 @@ These APIs can be used to work with surrogates:
317317
318318.. c:function:: Py_UCS4 Py_UNICODE_JOIN_SURROGATES(Py_UCS4 high, Py_UCS4 low)
319319
320- Join two surrogate characters and return a single :c:type:`Py_UCS4` value.
320+ Join two surrogate code points and return a single :c:type:`Py_UCS4` value.
321321 *high* and *low* are respectively the leading and trailing surrogates in a
322322 surrogate pair. *high* must be in the range [0xD800; 0xDBFF] and *low* must
323323 be in the range [0xDC00; 0xDFFF].
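
For readers of this hunk, the arithmetic behind joining a surrogate pair can be sketched in plain C. This is a minimal illustration of the standard UTF-16 calculation, not the definition of ``Py_UNICODE_JOIN_SURROGATES``; the helper name ``join_surrogates`` is hypothetical, and the range checks are an assumption that mirrors the documented preconditions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of CPython): combine a UTF-16 surrogate
   pair into a single code point using the standard UTF-16 arithmetic. */
static uint32_t join_surrogates(uint32_t high, uint32_t low)
{
    /* Preconditions taken from the documented ranges. */
    assert(high >= 0xD800 && high <= 0xDBFF);  /* leading surrogate */
    assert(low >= 0xDC00 && low <= 0xDFFF);    /* trailing surrogate */

    /* Each surrogate carries 10 payload bits; the joined value is
       offset by 0x10000, the first code point outside the BMP. */
    return 0x10000 + (((high & 0x03FF) << 10) | (low & 0x03FF));
}
```

Under these assumptions, the pair (0xD83D, 0xDE00) joins to U+1F600, and the extremes of both ranges map to U+10000 and U+10FFFF.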
@@ -338,6 +338,8 @@ APIs:
338338 This is the recommended way to allocate a new Unicode object. Objects
339339 created using this function are not resizable.
340340
341+ On error, set an exception and return ``NULL``.
342+
341343 .. versionadded:: 3.3
342344
343345
@@ -614,6 +616,8 @@ APIs:
614616
615617 Return the length of the Unicode object, in code points.
616618
619+ On error, set an exception and return ``-1``.
620+
617621 .. versionadded:: 3.3
618622
619623
@@ -657,6 +661,8 @@ APIs:
657661 not out of bounds, and that the object can be modified safely (i.e. that
658662 its reference count is one).
659663
664+ Return ``0`` on success, ``-1`` on error with an exception set.
665+
660666 .. versionadded:: 3.3
661667
662668
@@ -666,6 +672,8 @@ APIs:
666672 Unicode object and the index is not out of bounds, in contrast to
667673 :c:func:`PyUnicode_READ_CHAR`, which performs no error checking.
668674
675+ Return character on success, ``-1`` on error with an exception set.
676+
669677 .. versionadded:: 3.3
670678
671679
@@ -674,6 +682,7 @@ APIs:
674682
675683 Return a substring of *unicode*, from character index *start* (included) to
676684 character index *end* (excluded). Negative indices are not supported.
685+ On error, set an exception and return ``NULL``.
677686
678687 .. versionadded:: 3.3
679688
@@ -990,6 +999,9 @@ These are the UTF-8 codec APIs:
990999 object. Error handling is "strict". Return ``NULL`` if an exception was
9911000 raised by the codec.
9921001
1002+ The function fails if the string contains surrogate code points
1003+ (``U+D800`` - ``U+DFFF``).
1004+
9931005
9941006.. c:function:: const char* PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
9951007
@@ -1002,6 +1014,9 @@ These are the UTF-8 codec APIs:
10021014 On error, set an exception, set *size* to ``-1`` (if it's not NULL) and
10031015 return ``NULL``.
10041016
1017+ The function fails if the string contains surrogate code points
1018+ (``U+D800`` - ``U+DFFF``).
1019+
10051020 This caches the UTF-8 representation of the string in the Unicode object, and
10061021 subsequent calls will return a pointer to the same buffer. The caller is not
10071022 responsible for deallocating the buffer. The buffer is deallocated and
@@ -1429,8 +1444,9 @@ They all return ``NULL`` or ``-1`` if an exception occurs.
14291444 Compare a Unicode object with a char buffer which is interpreted as
14301445 being UTF-8 or ASCII encoded and return true (``1``) if they are equal,
14311446 or false (``0``) otherwise.
1432- If the Unicode object contains surrogate characters or
1433- the C string is not valid UTF-8, false (``0``) is returned.
1447+ If the Unicode object contains surrogate code points
1448+ (``U+D800`` - ``U+DFFF``) or the C string is not valid UTF-8,
1449+ false (``0``) is returned.
14341450
14351451 This function does not raise exceptions.
14361452
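
The surrogate range ``U+D800`` - ``U+DFFF`` named in the UTF-8 hunks above can be illustrated with a small range check in plain C. This is only a sketch of the condition the notes describe, not how CPython detects it; the helpers ``is_surrogate`` and ``contains_surrogate`` are hypothetical, and the input is assumed to be an array of code points rather than a Unicode object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if cp lies in the surrogate range
   U+D800 - U+DFFF, which strict UTF-8 cannot encode. */
static int is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* Hypothetical helper: scan an array of code points and report
   whether any of them is a surrogate. */
static int contains_surrogate(const uint32_t *cps, size_t len)
{
    assert(cps != NULL || len == 0);  /* a NULL array must be empty */
    for (size_t i = 0; i < len; i++) {
        if (is_surrogate(cps[i])) {
            return 1;
        }
    }
    return 0;
}
```

A string for which such a check returns true is the kind of input the added notes say makes the UTF-8 functions fail, and for which the comparison function returns false.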