Printf

printf is a C standard library function and also a Linux terminal (shell) command that formats text and writes it to standard output. The function accepts a format C-string argument and a variable number of value arguments, which it serializes according to the format string. A mismatch between the format specifiers and the number or types of the values results in undefined behavior: the program may crash or expose a security vulnerability.

History
1950s: Fortran

Early programming languages like Fortran used special statements, with a syntax different from that of other calculations, to build formatting descriptions. In this example, the format is specified on line 601, and the PRINT command refers to it by line number:

      PRINT 601, IA, IB, AREA
 601  FORMAT (4H A= ,I5,5H B= ,I5,8H AREA= ,F10.2, 13H SQUARE UNITS)

Here:
• 4H indicates a string of 4 characters " A= " (H means Hollerith field);
• I5 indicates an integer field of width 5;
• F10.2 indicates a floating-point field of width 10 with 2 digits after the decimal point.

An output with input arguments 100, 200, and 1500.25 might look like this:

    A= 100 B= 200 AREA= 1500.25 SQUARE UNITS

1960s: BCPL and ALGOL 68

In 1967, BCPL appeared. Its library included the WRITEF routine, which looked like any other function call. An example application looks like this:

    WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)

Here:
• %I2 indicates an integer of width 2 (the order of the format specification's field width and type is reversed compared to C's %2d);
• %I5 indicates an integer of width 5;
• *N is a BCPL language escape sequence representing a newline character (for which C uses the escape sequence \n).

ALGOL 68 also had a function-style API, but used special syntax for the format:

    printf(($"Color "g", number1 "6d,", number2 "4zd,", hex "16r2d,", float "-d.2d,", unsigned value"-3d"."l$, "red", 123456, 89, BIN 255, 3.14, 250));

Using normal function syntax for printing simplifies the language and allows the printing to be implemented in the language itself. In most newer languages of that era, I/O is not part of the syntax. However, the format was usually not checked to see whether it matched the type (or even the number) of values being printed, and mismatches resulted in crashes, security exploits, and hardware failures (e.g., a phone's networking capabilities being permanently disabled after it tried to connect to an access point named "%p%s%s%s%s%n").
1970s: C

In 1973, printf was included as a C standard library routine as part of Version 4 Unix.

1990s: Shell command

In 1990, the printf shell command, modeled after the C standard library function, was included with 4.3BSD-Reno. In 1991, a printf command was included with GNU shellutils (now part of GNU Core Utilities). The syntax of the shell command (options, arguments, etc.) differs from that of the C function. For example, the format string does not use positional arguments with a "$" symbol (n$) in the same way as the printf() function:

    str="AA BB CC"                # simple string with 3 fields
    set -- $str                   # convert fields to positional parameters
    printf "%s " $2 $3 $1; echo   # in C printf() uses 2$ 3$..

• prints: BB CC AA

2000s: Java

In 2004, Java 5.0 (1.5) was released. It extended the class java.io.PrintStream with a method named printf(), which functions analogously to printf() in C. Thus, to print a formatted string to the standard output stream, one uses System.out.printf(). Java further introduced the method format to its string class java.lang.String.

2000s: -Wformat safety

The range of problems resulting from the lack of type safety has prompted attempts to make compilers printf-aware. The -Wformat option of the GNU Compiler Collection (GCC) enables compile-time checks of printf calls, allowing the compiler to detect a subset of invalid calls and issue either a warning or an error that terminates compilation, as set by other flags. Because the compiler inspects format specifiers, this feature effectively extends static analysis in C to include formatting aspects.

2020s: std::print

C++ added input/output support using the << operator to avoid the safety issues of printf. This approach uses the types of the arguments to choose which code to execute to print them, avoiding the crashes that are possible with format strings. However, the syntax can be verbose (especially for setting options like precision), and printf remains available in C++ and is often used instead.
C++20 added a new API, std::format, which takes a string consisting of verbatim text and placeholders, followed by the values to print. The format string uses curly braces instead of percent signs, a syntax popularized by Python:

    std::format("The hex value of {} is {:x}.", name, value)

The recommended implementation is Victor Zverovich's fmtlib, which at compile time converts the string and argument types into an optimized formatting object. This is type-safe, and syntax errors are detected at compile time. C++23 introduced the functions std::print and std::println, which combine formatting and output and are therefore a functional replacement for printf. It is possible to make a translator from a printf %-string to the same formatting object, which could produce a type-safe printf, but this was not added to the specification. No analogous modernization of scanf has been introduced, though one has been proposed based on scnlib. These functions can print floating-point values accurately using the fewest digits possible, an ability long missing from printf. Another useful feature is that they ignore the locale.
Format specifier
Formatting of a value is specified as markup in the format string. For example, the following outputs Your age is followed by the value of the variable age in decimal format.

    printf("Your age is %d", age);

Syntax

The syntax for a format specifier is:

    %[parameter][flags][width][.precision][length]type

Parameter field

The parameter field is optional and has the form n$. If included, then matching specifiers to values is not sequential: the numeric value n selects the n-th value argument. This is a POSIX extension, not C99. This field allows the same value to be used multiple times in a format string instead of being passed multiple times. If a specifier includes this field, then subsequent specifiers must also. For example,

    printf("%2$d %2$#x; %1$d %1$#x",16,17);

outputs:

    17 0x11; 16 0x10

This field is very useful for localizing messages to natural languages that use different word orders. In the Windows API, support for this feature is via a different function, _printf_p.

Flags field

The flags field can be zero or more of the following (in any order):
• - left-aligns the output within the field width (the default is right alignment);
• + always prefixes a signed numeric value with a plus or minus sign;
• (space) prefixes a non-negative signed numeric value with a space; ignored if + is also given;
• 0 pads with leading zeros instead of spaces when a width is specified;
• ' applies the thousands grouping separator (a POSIX extension);
• # selects the alternate form: a 0, 0x, or 0X prefix for non-zero o, x, and X values; a decimal point is always output for f, F, e, E, g, and G; trailing zeros are kept for g and G.

Width field

The width field specifies the minimum number of characters to output. If the value can be represented in fewer characters, then the value is left-padded with spaces so that the output has the number of characters specified. If the value requires more characters, then the output is longer than the specified width. A value is never truncated. For example, printf("%3d", 12) specifies a width of 3 and outputs 12 with a space on the left, for a total of 3 characters. The call printf("%3d", 1234) outputs 1234, which is 4 characters long, since that is the minimum width for that value even though the width specified is 3. If the width field is omitted, the output is the minimum number of characters for the value.

If the field is specified as *, then the width value is read from the list of values in the call. For example, printf("%*d", 5, 10) outputs 10 right-aligned in a field of 5 characters, where the second parameter, 5, is the width (matches with *) and 10 is the value to serialize (matches with d).
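Though not part of the width field, a leading zero is interpreted as the zero-padding flag mentioned above, and a negative width passed via * is treated as the positive value in conjunction with the left-alignment flag also mentioned above.

The width field can be used to format values as a table (tabulated output). However, columns do not align if any value is larger than fits in the width specified. For example, notice that the last line's value (1234) does not fit in the first column of width 3, and therefore the column is not aligned.

      1   1
     12  12
    123 123
    1234 123

Precision field

The precision field usually specifies a limit on the output, as set by the formatting type. For floating-point numeric types, it specifies the number of digits to the right of the decimal point to which the output should be rounded; for %g and %G it specifies the total number of significant figures (before and after the decimal point, not including leading or trailing zeroes) to round to. For the string type, it limits the number of characters that are output, after which the string is truncated.

The precision field may be omitted, given as a numeric integer value, or given as a dynamic value passed as another argument when indicated by an asterisk (*). For example, printf("%.*s", 3, "abcdef") outputs abc.

Length field

The length field can be omitted or be any of:
• hh for an integer argument of char size;
• h for an integer argument of short size;
• l for an integer argument of long size, or a wide character or wide string for c and s;
• ll for an integer argument of long long size;
• L for a long double argument;
• z for an integer argument of size_t size;
• j for an integer argument of intmax_t size;
• t for an integer argument of ptrdiff_t size.

Platform-specific length options came to exist prior to widespread use of the ISO C99 extensions, including I, I32, and I64 (Windows) and q (BSD).

ISO C99 includes the inttypes.h header file, which provides a number of macros for cross-platform coding. For example, the format string "%" PRId64 specifies decimal format for a 64-bit signed integer. Since the macros evaluate to a string literal, and the compiler concatenates adjacent string literals, the expression compiles to a single string. Macros include PRId32, PRId64, PRIu32, PRIu64, PRIx32, and PRIx64.

Type field

The type field can be any of:
• % outputs a literal percent sign;
• d, i output a signed int as a decimal number;
• u outputs an unsigned int as a decimal number;
• f, F output a double in fixed-point notation;
• e, E output a double in exponent notation;
• g, G output a double in either fixed-point or exponent notation, whichever suits its magnitude;
• x, X output an unsigned int as a hexadecimal number;
• o outputs an unsigned int in octal;
• s outputs a null-terminated string;
• c outputs a character;
• p outputs a pointer (void*) in an implementation-defined format;
• a, A output a double in hexadecimal notation;
• n outputs nothing, but writes the number of characters written so far into an integer pointer argument.

Custom data type formatting

A common way to handle formatting with a custom data type is to format the custom data type value into a string, then use the %s specifier to include the serialized value in a larger message.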
Some printf-like functions allow extensions to the escape-character-based mini-language, allowing the programmer to use a specific formatting function for non-builtin types. One is glibc's (now deprecated) register_printf_function (https://www.gnu.org/software/libc/manual/html_node/Customizing-Printf.html). However, it is rarely used because it conflicts with static format string checking. Another is Vstr custom formatters, which allow adding multi-character format names.

Some applications (like the Apache HTTP Server) include their own printf-like function and embed extensions into it. However, these all tend to have the same problems that register_printf_function has.

The Linux kernel printk function supports a number of ways to display kernel structures using the generic %p specification, by appending additional format characters. For example, %pI4 prints an IPv4 address in dotted-decimal form. This allows static format string checking (of the %p portion) at the expense of full compatibility with normal printf.
Vulnerabilities
Format string attack

Extra value arguments are ignored, but if the format string has more format specifiers than value arguments passed, the behavior is undefined. With some C compilers, an extra format specifier consumes a value even though none was passed, which is what makes the format string attack possible. Generally, for C, arguments are passed on the stack. If too few arguments are passed, then printf can read past the end of the stack frame, allowing an attacker to read the stack.

Some compilers, like the GNU Compiler Collection, statically check the format strings of printf-like functions and warn about problems (when using the flags -Wall or -Wformat). GCC also warns about user-defined printf-style functions if the non-standard "format" __attribute__ is applied to the function.

Uncontrolled format string exploit

The format string is often a string literal, which allows static analysis of the function call. However, the format string can be the value of a variable, which allows for dynamic formatting but also opens a security vulnerability known as an uncontrolled format string exploit.

Memory write

Although an output function on the surface, printf allows writing to a memory location specified by an argument via %n. This capability is occasionally used as part of more elaborate format-string attacks.

The %n capability also makes printf accidentally Turing-complete, even with a well-formed set of arguments. A game of tic-tac-toe written in the format string is a winner of the 27th IOCCC.
Related functions
Family

Variants of printf in the C standard library include:
• fprintf outputs to a file instead of standard output.
• sprintf writes to a string buffer instead of standard output.
• snprintf provides a level of safety over sprintf, since the caller provides a length n that is the size of the output buffer in bytes (including space for the trailing null character).
• asprintf provides for safety by accepting a string handle (char**) argument. The function allocates a buffer of sufficient size to contain the formatted text and returns the buffer via the handle.

For each function of the family, including printf, there is also a variant that accepts a single va_list argument rather than a variable list of arguments. Typically, these variants start with "v". For example: vfprintf, vsprintf, vsnprintf.

Generally, printf-like functions return the number of bytes output, or a negative value (commonly -1) to indicate failure.

Programming languages that provide printf-like functionality include:
• C
• C++
• D
• F#
• G
• GNU MathProg
• GNU Octave
• Go
• Haskell
• J
• Java (since version 1.5) and JVM languages
• Julia (via Printf standard library)
• Lua (via string.format)
• Maple
• MATLAB
• Max
• Objective-C
• OCaml (via Printf module)
• PARI/GP
• Perl
• Raku
• PHP
• Python (via % operator)
• R
• Red/System
• Ruby
• SQLite
• Tcl (via format command)
• Transact-SQL (see https://technet.microsoft.com/en-us/library/ms175014.aspx)
• Vala