Encoding: Switch to use UTF-8 internally by default on Windows.

This fixes several reported bugs about CMake not handling
non-ASCII paths on Windows.

In practice, the use of some Unicode characters may still
be limited by the build or compiler tools.

For example, a user may be limited by the build tools to
using characters within the Windows ANSI code page (which can
include non-ASCII characters in the current system language).
Clinton Stimpson 2014-12-26 21:25:20 -07:00 committed by Brad King
parent 2ece34516a
commit cdc29c3608
3 changed files with 28 additions and 4 deletions

@@ -45,7 +45,7 @@ if(NOT DEFINED CMAKE_CXX_STANDARD)
 endif()
 
 # option to set the internal encoding of CMake to UTF-8
-option(CMAKE_ENCODING_UTF8 "Use UTF-8 encoding internally (experimental)." OFF)
+option(CMAKE_ENCODING_UTF8 "Use UTF-8 encoding internally." ON)
 mark_as_advanced(CMAKE_ENCODING_UTF8)
 if(CMAKE_ENCODING_UTF8)
   set(KWSYS_ENCODING_DEFAULT_CODEPAGE CP_UTF8)

@@ -60,14 +60,16 @@ Syntax
 Encoding
 --------
 
-A CMake Language source file must be written in 7-bit ASCII text
-to be portable across all supported platforms. Newlines may be
+A CMake Language source file may be written in 7-bit ASCII text for
+maximum portability across all supported platforms. Newlines may be
 encoded as either ``\n`` or ``\r\n`` but will be converted to ``\n``
 as input files are read.
 
 Note that the implementation is 8-bit clean so source files may
 be encoded as UTF-8 on platforms with system APIs supporting this
-encoding. Furthermore, CMake 3.0 and above allow a leading UTF-8
+encoding. In addition, CMake 3.2 and above support source files
+encoded in UTF-8 on Windows (using UTF-16 to call system APIs).
+Furthermore, CMake 3.0 and above allow a leading UTF-8
 `Byte-Order Mark`_ in source files.
 
 .. _`Byte-Order Mark`: http://en.wikipedia.org/wiki/Byte_order_mark

@@ -0,0 +1,22 @@
+windows-utf-8
+-------------
+
+* On Windows, CMake learned to support international characters.
+  This allows use of characters from multiple (spoken) languages
+  in CMake code, paths to source files, configured files such as
+  ``.h.in`` files, and other files read and written by CMake.
+  Because CMake interoperates with many other tools, there may
+  still be some limitations when using certain international
+  characters.
+
+  Files written in the :manual:`cmake-language(7)`, such as
+  ``CMakeLists.txt`` or ``*.cmake`` files, are expected to be
+  encoded as UTF-8. If files are already ASCII, they will be
+  compatible. If files were in a different encoding, including
+  Latin 1, they will need to be converted.
+
+  The Visual Studio generators now write solution and project
+  files in UTF-8 instead of Windows-1252. Windows-1252 supported
+  Latin 1 languages such as those found in North and South America
+  and Western Europe. With UTF-8, additional languages are now
+  supported.
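
The release note says that CMake files in other encodings, including Latin 1, need to be converted to UTF-8, while ASCII files are already compatible. The following is a minimal Python sketch of such a conversion. It is not part of this commit, and the helper names `to_utf8` and `convert_tree` are hypothetical. It assumes that any file which does not decode as UTF-8 is in a single known legacy encoding (Latin 1 by default; pass `"cp1252"` for files written as Windows-1252).

```python
from pathlib import Path


def to_utf8(path, legacy_encoding="latin-1"):
    """Re-encode one file as UTF-8 in place; return True if it was rewritten."""
    path = Path(path)
    raw = path.read_bytes()
    try:
        # Pure ASCII and existing UTF-8 both decode cleanly, so leave them alone.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 uses the legacy encoding.
    text = raw.decode(legacy_encoding)
    # Work on bytes throughout so that existing newlines are preserved.
    path.write_bytes(text.encode("utf-8"))
    return True


def convert_tree(root, legacy_encoding="latin-1"):
    """Convert CMakeLists.txt and *.cmake files under root; return rewritten paths."""
    root = Path(root)
    candidates = list(root.rglob("CMakeLists.txt")) + list(root.rglob("*.cmake"))
    return [p for p in sorted(candidates) if to_utf8(p, legacy_encoding)]
```

Calling `convert_tree(".")` from the top of a source tree would rewrite only the files that are not already valid UTF-8. A legacy file whose bytes happen to form valid UTF-8 would be left untouched, so the list of rewritten paths is worth reviewing before committing.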