Per-source copyright/license notice headers that spell out copyright holder
names and years are hard to maintain and often out-of-date or plain wrong.
Precise contributor information is already maintained automatically by the
version control tool. Ultimately it is the receiver of a file who is
responsible for determining its licensing status, and per-source notices are
merely a convenience. Therefore it is simpler and more accurate for
each source to have a generic notice of the license name and references to
more detailed information on copyright holders and full license terms.
Our `Copyright.txt` file now contains a list of Contributors whose names
appeared source-level copyright notices. It also references version control
history for more precise information. Therefore we no longer need to spell
out the list of Contributors in each source file notice.
Replace CMake per-source copyright/license notice headers with a short
description of the license and links to `Copyright.txt` and online information
available from "https://cmake.org/licensing". The online URL also handles
cases of modules being copied out of our source into other projects, so we
can drop our notices about replacing links with full license text.
Run the `Utilities/Scripts/filter-notices.bash` script to perform the majority
of the replacements mechanically. Manually fix up shebang lines and trailing
newlines in a few files. Manually update the notices in a few files that the
script does not handle.
Manually extract the C++ portion of `cmListFileLexer.in.l` into a
temporary file, format it, and then move it back into the original file.
Manually format C++ code inside the lexer actions to match our style.
Then re-generate the lexer.
Revise the documented modifications we need to make to the
flex-generated source file according to the needs of the new version.
Update our own implementation to avoid warnings with flex types.
The lexer changes in commit v3.0.0-rc1~495^2 (Add Lua-style long
brackets and long comments to CMake language, 2013-08-06) accidentally
left out matching '[' as a single character in an unquoted argument.
Add a lexer rule to match it and extend the RunCMake.Syntax test to
cover this case.
Teach the CMake language lexer to treat the \-LF pair terminating a
line ending in an odd number of backslashes inside a quoted argument
as a continuation. Drop the pair from the returned quoted argument
token text. This will allow long lines inside quoted argument
strings to be divided across multiple lines in the source file.
It will also allow quoted argument text to start on the line after
the opening quote. For example, the code:
set(x "\
...")
sets variable "x" to the value "..." with no opening newline.
Previously an odd number of backslashes at the end of a line inside
a quoted argument would put a \-LF pair (or a \-CR pair) literally
in the argument. Then the command-argument evaluator would complain
that the \-escape sequence is invalid. Therefore this syntax is
available to use without changing behavior of valid existing code.
Teach the RunCMake.Syntax test to cover cases of quoted arguments
with lines ending in \, \\, and \\\. Odd counts are continuations.
Teach the CMake language parser to recognize Lua-style "long bracket"
arguments. These start with two '[' separated by zero or more '='
characters e.g. "[[" or "[=[" or "[==[". They end with two ']'
separated by the same number of '=' as the opening bracket. There is no
nesting of brackets of the same level (number of '='). No escapes,
variable expansion, or other processing is performed on the content
between such brackets so they always represent exactly one argument.
Also teach CMake to parse and ignore "long comment" syntax. A long
comment starts with "#" immediately followed by an opening long bracket.
It ends at the matching close long bracket.
Teach the RunCMake.Syntax test to cover long bracket and long comment
cases.
Read input files in binary mode instead of text mode and convert CRLF
newlines to LF newlines explicitly in our own buffer. This is necessary
to read CMake source files with CRLF newlines on platforms whose C
runtime libraries do not transform newlines in text mode. For example,
a Cygwin or Linux binary may not transform CRLF -> LF in files read from
a Windows filesystem. Perform the conversion ourselves to ensure that
multi-line string literals in CMake source files have LF newlines
everywhere.
Teach the lexer to read a UTF-8, UTF-16 BE/LE, or UTF-32 BE/LE
Byte-Order-Mark from the start of a file if any is present. Report an
error on files using UTF-16 or UTF-32 and accept a UTF-8 or missing BOM.
Teach the lexer to treat a single letter as an identifier instead of an
unquoted argument. Outside of a command invocation, the parser treats
an identifier as a command name and an unquoted argument as an error.
Inside of a command invocation, the parser treats an identifier as an
unquoted argument. Therefore this change to the lexer will make what
was previously an error case work with no other behavioral change.
Teach cmListFileLexerDestroy to call cmListFileLexerSetToken with a NULL
token to free the token string buffer. Without this, if an error occurs
before the token cleanup happens when EOF is reached, then the token
string buffer may leak.
Teach the lexer to return tokens for whitespace. Teach the parser to
tolerate the space tokens where whitespace is allowed. Also teach the
parser to diagnose and warn about cases of quoted arguments followed
immediately by another argument. This was accidentally allowed
previously, so we only warn.
Update the RunCMake.Syntax test case StringNoSpace expected stderr to
include the warnings.
If a line inside a string ends in a backslash count the following
newline character as a line increment. Add a test covering this case to
verify that subsequent line numbers are correct.
This converts the CMake license to a pure 3-clause OSI-approved BSD
License. We drop the previous license clause requiring modified
versions to be plainly marked. We also update the CMake copyright to
cover the full development time range.