When quoting is used, representing the delimiter itself inside a string literal runs into the problem of delimiter collision. For example, if the delimiter is a double quote, one cannot represent a double quote by the literal """, because the second quote is interpreted as the end of the string literal rather than as part of its value. Similarly, one cannot write "This is "in quotes", but invalid.", because the middle quoted portion is instead interpreted as lying outside of quotes. There are various solutions, the most general-purpose of which is escape sequences, as in "\"" or "This is \"in quotes\" and properly escaped.".

Paired quotes, such as braces in Tcl, allow nested strings, such as {foo {bar} zork}, but do not otherwise solve the problem of delimiter collision, since an unbalanced closing delimiter cannot simply be included, as in {}}.
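As a small illustration of the collision and of the escape-sequence workaround, the following Python sketch may help. Python is used only as a convenient example language; the variable names are illustrative, and `compile` is used merely to observe the parser's reaction without aborting the script.

```python
# Escaping the delimiter with a backslash keeps it inside the literal.
single = "\""
sentence = "This is \"in quotes\" and properly escaped."

print(len(single))  # 1
print(sentence)     # This is "in quotes" and properly escaped.

# The unescaped form collides with the delimiter and is rejected by the
# parser. compile() only parses the text, so nothing is executed.
try:
    compile('s = "This is "in quotes", but invalid."', "<example>", "exec")
    collided = False
except SyntaxError:
    collided = True

print(collided)  # True
```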
=== Doubling up ===
A number of languages, including Pascal, BASIC, DCL, Smalltalk, SQL, J, and Fortran, avoid delimiter collision by doubling up on the quotation marks that are intended to be part of the string literal itself:

 'This Pascal string''contains two apostrophes'''
 "I said, ""Can you hear me?"""
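The doubling rule is mechanical enough to sketch in a few lines. The helpers below, `quote_doubled` and `unquote_doubled`, are hypothetical names chosen for this illustration and do not belong to any particular language's library. They assume a single-character delimiter.

```python
def quote_doubled(text: str, delim: str = "'") -> str:
    """Wrap text in delim, doubling every delim that occurs inside it."""
    return delim + text.replace(delim, delim * 2) + delim


def unquote_doubled(literal: str, delim: str = "'") -> str:
    """Invert quote_doubled: strip the outer delimiters and undouble."""
    if len(literal) < 2 or literal[0] != delim or literal[-1] != delim:
        raise ValueError("literal is not wrapped in the delimiter")
    body = literal[1:-1]
    # Once doubled pairs are removed, any remaining delimiter is unpaired.
    if delim in body.replace(delim * 2, ""):
        raise ValueError("unpaired delimiter inside literal")
    return body.replace(delim * 2, delim)


print(quote_doubled('I said, "Can you hear me?"', '"'))
# "I said, ""Can you hear me?"""
print(unquote_doubled("'John''s'"))
# John's
```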
=== Dual quoting ===
Some languages, such as Fortran, Modula-2, JavaScript, Python, and PHP, allow more than one quoting delimiter; in the case of two possible delimiters, this is known as dual quoting. Typically, the programmer may use either single quotes or double quotes interchangeably, provided that each literal uses one or the other throughout:

 "This is John's apple."
 'I said, "Can you hear me?"'

This does not, however, allow a single literal to contain both delimiters. This can be worked around by using several literals and string concatenation:

 'I said, "This is ' + "John's" + ' apple."'

Python has string literal concatenation, so consecutive string literals are concatenated even without an operator, and this can be reduced to:

 'I said, "This is '"John's"' apple."'
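The Python forms above can be checked directly. The sketch below compares the concatenated variants with an escaped equivalent, and also shows that Python's built-in `repr` itself applies dual quoting when choosing how to display a string. The variable names are illustrative only.

```python
# Either delimiter may enclose a literal that contains the other one.
apple = "This is John's apple."
question = 'I said, "Can you hear me?"'

# A value containing both delimiters, built three equivalent ways.
joined = 'I said, "This is ' + "John's" + ' apple."'
adjacent = 'I said, "This is ' "John's" ' apple."'
escaped = "I said, \"This is John's apple.\""

print(joined == adjacent == escaped)  # True

# repr() picks whichever delimiter avoids a collision when it can.
print(repr("John's"))     # "John's"
print(repr('say "hi"'))   # 'say "hi"'
```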
=== Delimiter quoting ===
C++11 introduced so-called raw string literals. They consist, essentially, of

 R"end-of-string-id(content)end-of-string-id"

That is, after R" the programmer can enter up to 16 characters, excluding whitespace characters, parentheses, and backslashes, which form the end-of-string-id (eos id for short; its purpose is to be repeated to signal the end of the string). An opening parenthesis is then required to denote the end of the eos id. Then follows the actual content of the literal: any sequence of characters may be used, except that it may not contain a closing parenthesis followed by the eos id followed by a quote. Finally, to terminate the string, a closing parenthesis, the eos id, and a quote are required. The simplest case of such a literal has empty content and an empty eos id: R"()". The eos id may itself contain quotes: with " as the eos id, a literal takes the form R""(content)"". Escape sequences don't work in raw string literals.
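The termination rule above suggests a simple way to pick an eos id for arbitrary content: try candidates until `)`, the eos id, and a quote no longer occur together in the content. The following Python sketch generates C++ source text along those lines. `cpp_raw_literal` is a hypothetical helper, and restricting candidates to lowercase letters is an assumption made for brevity (the C++ rule permits more characters).

```python
import itertools
import string

# Assumption of this sketch: draw eos ids from lowercase letters only.
_ALPHABET = string.ascii_lowercase


def cpp_raw_literal(content: str, max_len: int = 16) -> str:
    """Return C++11 raw-string source text whose value is content."""
    for length in range(max_len + 1):
        # length 0 yields the single empty eos id, tried first.
        for chars in itertools.product(_ALPHABET, repeat=length):
            eos_id = "".join(chars)
            # The literal ends at the first `)eos_id"`, so that sequence
            # must not occur inside the content.
            if f'){eos_id}"' not in content:
                return f'R"{eos_id}({content}){eos_id}"'
    raise ValueError("no usable eos id found")


print(cpp_raw_literal(""))            # R"()"
print(cpp_raw_literal('say )" now'))  # R"a(say )" now)a"
```

In practice the search ends almost immediately, since content would have to contain every short candidate terminator to force a long eos id.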
D supports a few quoting delimiters, with such strings starting with q" plus an opening delimiter and ending with the respective closing delimiter and ". Available delimiter pairs are (), <>, {}, and []; an unpaired non-identifier delimiter is its own closing delimiter. The paired delimiters nest, so that q"(foo(xxx))" is a valid literal; an example with the non-nesting / character is q"/foo]/". Similar to C++11, D allows here-document-style literals with end-of-string ids:

 q"end-of-string-id newline
 content newline
 end-of-string-id"

In D, the end-of-string-id must be an identifier (alphanumeric characters).

In some programming languages, such as sh and Perl, there are different delimiters that are treated differently, for example in whether they perform string interpolation, and thus care must be taken when choosing which delimiter to use; see the section on different kinds of strings below.
=== Multiple quoting ===
A further extension is the use of multiple quoting, which allows the author to choose which characters should specify the bounds of a string literal. For example, in Perl:

 qq^I said, "Can you hear me?"^
 qq@I said, "Can you hear me?"@
 qq§I said, "Can you hear me?"§

all produce the desired result. Although this notation is more flexible, few languages support it; other than Perl, Ruby (influenced by Perl) and C++11 also support it. A variant of multiple quoting is the use of here document-style strings.

Lua (as of 5.1) provides a limited form of multiple quoting, particularly to allow nesting of long comments or embedded strings. Normally one uses [[ and ]] to delimit literal strings (initial newline stripped, otherwise raw), but the opening brackets can include any number of equal signs, and only closing brackets with the same number of signs close the string. For example:

 local ls = [=[
 This notation can be used for Windows paths:
 local path = [[C:\Windows\Fonts]]
 ]=]

Multiple quoting is particularly useful with regular expressions that contain usual delimiters such as quotes, as this avoids needing to escape them. An early example is sed, where in the substitution command s/regex/replacement/ the default slash / delimiters can be replaced by another character, as in s,regex,replacement,.
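To make the Lua rule concrete (only a closing bracket with the same number of equal signs ends the string), here is a minimal Python sketch of how such a literal could be read. `parse_lua_long_string` is a hypothetical helper rather than part of any Lua tooling, and it simplifies newline handling to `\n` and `\r\n`.

```python
import re

# An opening long bracket: '[', zero or more '=', then '['.
_OPEN = re.compile(r"\[(=*)\[")


def parse_lua_long_string(source: str) -> str:
    """Return the value of a Lua long-bracket literal at the start of source."""
    opening = _OPEN.match(source)
    if opening is None:
        raise ValueError("not a long-bracket literal")
    level = opening.group(1)        # the run of '=' signs, possibly empty
    closing = "]" + level + "]"     # only this exact closer ends the string
    start = opening.end()
    end = source.find(closing, start)
    if end == -1:
        raise ValueError("unterminated long-bracket literal")
    body = source[start:end]
    # A newline directly after the opening bracket is dropped;
    # everything else is taken raw (simplified: only \n and \r\n).
    if body.startswith("\r\n"):
        body = body[2:]
    elif body.startswith("\n"):
        body = body[1:]
    return body


print(parse_lua_long_string(r"[[C:\Windows\Fonts]]"))  # C:\Windows\Fonts
print(parse_lua_long_string("[=[\na ]] b]=]"))         # a ]] b
```

The second call shows why the level matters: the inner `]]` does not terminate a literal opened with `[=[`.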
=== Constructor functions ===
Another option, rarely used in modern languages because the computation is done at run time rather than at parse time, is to use a function to construct a string rather than representing it via a literal. For example, early forms of BASIC did not include escape sequences or any other workarounds listed here, and thus one was instead required to use the CHR$ function, which returns a string containing the character corresponding to its argument. In ASCII the quotation mark has the value 34, so to represent a string with quotes on an ASCII system one would write

 "I said, " + CHR$(34) + "Can you hear me?" + CHR$(34)

In C, a similar facility is available via sprintf and the %c "character" format specifier, though in the presence of other workarounds this is generally not used:

 char buffer[32];
 snprintf(buffer, sizeof buffer, "This is %cin quotes.%c", 34, 34);

These constructor functions can also be used to represent nonprinting characters, though escape sequences are generally used instead. A similar technique can be used in C++ with the std::string stringification operator.

== Escape sequences ==