From cdc29c36084ccfa447f171a25de2336b0bd74edd Mon Sep 17 00:00:00 2001
From: Clinton Stimpson
Date: Fri, 26 Dec 2014 21:25:20 -0700
Subject: [PATCH] Encoding: Switch to use UTF-8 internally by default on Windows.

This fixes several reported bugs about CMake not handling non-ascii
paths on Windows.

Practically, the use of some unicode characters may still be limited
by the build or compiler tools. For example, a user may be limited by
the build tools to using characters within the Windows ANSI code page
(which can include non-ascii characters in the current system language).
---
 CMakeLists.txt                     |  2 +-
 Help/manual/cmake-language.7.rst   |  8 +++++---
 Help/release/dev/windows-utf-8.rst | 22 ++++++++++++++++++++++
 3 files changed, 28 insertions(+), 4 deletions(-)
 create mode 100644 Help/release/dev/windows-utf-8.rst

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 1812b2773..33d2ce666 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -45,7 +45,7 @@ if(NOT DEFINED CMAKE_CXX_STANDARD)
 endif()
 
 # option to set the internal encoding of CMake to UTF-8
-option(CMAKE_ENCODING_UTF8 "Use UTF-8 encoding internally (experimental)." OFF)
+option(CMAKE_ENCODING_UTF8 "Use UTF-8 encoding internally." ON)
 mark_as_advanced(CMAKE_ENCODING_UTF8)
 if(CMAKE_ENCODING_UTF8)
   set(KWSYS_ENCODING_DEFAULT_CODEPAGE CP_UTF8)
diff --git a/Help/manual/cmake-language.7.rst b/Help/manual/cmake-language.7.rst
index 15c101f63..5ec5858ff 100644
--- a/Help/manual/cmake-language.7.rst
+++ b/Help/manual/cmake-language.7.rst
@@ -60,14 +60,16 @@ Syntax
 Encoding
 --------
 
-A CMake Language source file must be written in 7-bit ASCII text
-to be portable across all supported platforms. Newlines may be
+A CMake Language source file may be written in 7-bit ASCII text for
+maximum portability across all supported platforms. Newlines may be
 encoded as either ``\n`` or ``\r\n`` but will be converted to ``\n``
 as input files are read.
 
 Note that the implementation is 8-bit clean so source files may
 be encoded as UTF-8 on platforms with system APIs supporting this
-encoding. Furthermore, CMake 3.0 and above allow a leading UTF-8
+encoding. In addition, CMake 3.2 and above support source files
+encoded in UTF-8 on Windows (using UTF-16 to call system APIs).
+Furthermore, CMake 3.0 and above allow a leading UTF-8
 `Byte-Order Mark`_ in source files.
 
 .. _`Byte-Order Mark`: http://en.wikipedia.org/wiki/Byte_order_mark
diff --git a/Help/release/dev/windows-utf-8.rst b/Help/release/dev/windows-utf-8.rst
new file mode 100644
index 000000000..64cd61638
--- /dev/null
+++ b/Help/release/dev/windows-utf-8.rst
@@ -0,0 +1,22 @@
+windows-utf-8
+-------------
+
+* On Windows, CMake learned to support international characters.
+  This allows use of characters from multiple (spoken) languages
+  in CMake code, paths to source files, configured files such as
+  ``.h.in`` files, and other files read and written by CMake.
+  Because CMake interoperates with many other tools, there may
+  still be some limitations when using certain international
+  characters.
+
+  Files written in the :manual:`cmake-language(7)`, such as
+  ``CMakeLists.txt`` or ``*.cmake`` files, are expected to be
+  encoded as UTF-8. If files are already ASCII, they will be
+  compatible. If files were in a different encoding, including
+  Latin 1, they will need to be converted.
+
+  The Visual Studio generators now write solution and project
+  files in UTF-8 instead of Windows-1252. Windows-1252 supported
+  Latin 1 languages such as those found in North and South America
+  and Western Europe. With UTF-8, additional languages are now
+  supported.
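
Reviewer note on the conversion step: the release note says files in another encoding, including Latin 1, need to be converted, but the patch does not ship a tool for that. Below is a minimal sketch of one way to do it, not part of this change. The helper name `convert_to_utf8` and the default `source_encoding` are hypothetical, and the script assumes that any file which is not already valid UTF-8 is in the stated source encoding.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="latin-1"):
    """Re-encode one file to UTF-8 in place; return True if it was rewritten.

    Hypothetical helper for illustration only. Works on raw bytes so that
    existing line endings (``\\n`` or ``\\r\\n``) are left untouched.
    """
    raw = Path(path).read_bytes()
    try:
        # Pure ASCII and existing UTF-8 both decode cleanly: nothing to do.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is in source_encoding.
    text = raw.decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

A project could run this over its `CMakeLists.txt` and `*.cmake` files once before switching to a CMake built with this change. One limitation to keep in mind: a Latin 1 file whose bytes happen to form valid UTF-8 sequences is skipped, so converted files are worth a visual check.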