diff --git a/peps/pep-3131.rst b/peps/pep-3131.rst
index 42bfbbdb4db..d6f9b0608df 100644
--- a/peps/pep-3131.rst
+++ b/peps/pep-3131.rst
@@ -57,8 +57,9 @@ an additional policy is necessary, anyway.
 Specification of Language Changes
 =================================
 
-The syntax of identifiers in Python will be based on the Unicode standard annex
-UAX-31 [1]_, with elaboration and changes as defined below.
+The syntax of identifiers in Python will be based on the `Unicode standard annex
+UAX-31 `__, with elaboration and changes
+as defined below.
 
 Within the ASCII range (U+0001..U+007F), the valid characters for identifiers
 are the same as in Python 2.5. This specification only introduces additional
@@ -69,9 +70,10 @@ the ``unicodedata`` module.
 The identifier syntax is ``<XID_Start> <XID_Continue>*``.
 
 The exact specification of what characters have the XID_Start or
-XID_Continue properties can be found in the DerivedCoreProperties
-file of the Unicode data in use by Python (4.1 at the time this
-PEP was written), see [6]_. For reference, the construction rules
+XID_Continue properties can be found in the `DerivedCoreProperties
+file `__
+of the Unicode data in use by Python (4.1 at the time this
+PEP was written). For reference, the construction rules
 for these sets are given below. The XID_* properties are derived
 from ID_Start/ID_Continue, which are derived themselves.
 
@@ -94,7 +96,7 @@ comparison of identifiers is based on NFKC.
 
 A non-normative HTML file listing all valid identifier characters for
 Unicode 4.1 can be found at
-http://www.dcl.hpi.uni-potsdam.de/home/loewis/table-3131.html.
+https://web.archive.org/web/20081016132748/http://www.dcl.hpi.uni-potsdam.de/home/loewis/table-3131.html.
 
 Policy Specification
 ====================
@@ -136,8 +138,9 @@ The following changes will need to be made to the parser:
 Open Issues
 ===========
 
-John Nagle suggested consideration of Unicode Technical Standard #39,
-[2]_, which discusses security mechanisms for Unicode identifiers.
+John Nagle suggested consideration of `Unicode Technical Standard #39
+`__,
+which discusses security mechanisms for Unicode identifiers.
 It's not clear how that can precisely apply to this PEP; possible
 consequences are
 
@@ -153,7 +156,8 @@ needs two identifiers to compare them for confusion - is it possible to
 somehow apply it to a single identifier only, and warn?
 
 In follow-up discussion, it turns out that John Nagle actually
-meant to suggest UTR#36, level "Highly Restrictive", [3]_.
+meant to suggest `UTR#36 `__,
+level "Highly Restrictive".
 
 Several people suggested to allow and ignore formatting control
 characters (general category Cf), as is done in Java, JavaScript, and
@@ -164,15 +168,17 @@ later.
 Some people would like to see an option on selecting support
 for this PEP at run-time; opinions vary on what precisely
 that option should be, and what precisely its default value
-should be. Guido van Rossum commented in [5]_ that a global
-flag passed to the interpreter is not acceptable, as it would
+should be. `Guido van Rossum commented
+`__
+that a global flag passed to the interpreter is not acceptable, as it would
 apply to all modules.
 
 Discussion
 ==========
 
-Ka-Ping Yee summarizes discussion and further objection
-in [4]_ as such:
+`Ka-Ping Yee summarizes discussion and further objection
+`__
+as such:
 
 A. Should identifiers be allowed to contain any Unicode letter?
 
@@ -250,16 +256,6 @@ F. Which normalization form should be used, NFC or NFKC?
 
 G. Should source code be required to be in normalized form?
 
-References
-==========
-
-.. [1] http://www.unicode.org/reports/tr31/
-.. [2] http://www.unicode.org/reports/tr39/
-.. [3] http://www.unicode.org/reports/tr36/
-.. [4] https://mail.python.org/pipermail/python-3000/2007-June/008161.html
-.. [5] https://mail.python.org/pipermail/python-3000/2007-May/007925.html
-.. [6] http://www.unicode.org/Public/4.1.0/ucd/DerivedCoreProperties.txt
-
 Copyright
 =========