A C program consists of units called source files (or preprocessing files), which, in addition to source code, include directives for the C preprocessor. A translation unit is the output of the C preprocessor: a source file after it has been preprocessed. Preprocessing notably consists of expanding a source file by recursively replacing each #include directive with the literal contents of the file named in the directive (usually a header file, but possibly another source file); the result of this step is a preprocessing translation unit. Further steps include expansion of macros defined by #define directives and conditional compilation controlled by directives such as #ifdef, among others; these steps turn the preprocessing translation unit into a translation unit. From a translation unit, the compiler generates an object file, which can be further processed and linked (possibly with other object files) to form an executable program.

Note that the preprocessor is in principle language agnostic: it is a lexical preprocessor, working at the level of lexical analysis. It does no parsing and thus cannot do any processing specific to C syntax. The input to the compiler is the translation unit, so the compiler does not see any preprocessor directives, all of which have been processed before compilation starts. While a given translation unit is fundamentally based on one file, the source code actually fed into the compiler may look substantially different from the source file that the programmer views, particularly because of the recursive inclusion of headers.
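The preprocessing steps described above can be illustrated with a small source file. This is only a sketch: the names BUFFER_SIZE, SQUARE, USE_DOUBLE_CAPACITY, buffer_capacity and square_of_successor are hypothetical and chosen for illustration.

```c
#include <stddef.h> /* replaced by the header's contents during preprocessing */

#define BUFFER_SIZE 16        /* object-like macro (hypothetical name) */
#define SQUARE(x) ((x) * (x)) /* function-like macro (hypothetical name) */

/* Conditional compilation: only one of these two definitions survives
   preprocessing and reaches the compiler. */
#ifdef USE_DOUBLE_CAPACITY
static const size_t capacity = 2 * BUFFER_SIZE;
#else
static const size_t capacity = BUFFER_SIZE;
#endif

/* After macro expansion the compiler sees the literal 16 (or 2 * 16),
   not the name BUFFER_SIZE. */
size_t buffer_capacity(void)
{
    return capacity;
}

/* SQUARE(n + 1) expands textually to ((n + 1) * (n + 1)). */
int square_of_successor(int n)
{
    return SQUARE(n + 1);
}
```

Many compilers, including GCC and Clang, accept the -E option to stop after preprocessing and print the resulting translation unit, which makes the difference between the source file and the compiler's actual input visible.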

==Scope==
Translation units define a scope, roughly file scope, that functions similarly to module scope; in C terminology this is referred to as internal linkage, which is one of the kinds of linkage in C (external, internal, and none). Names (of functions and variables) declared outside of a function block may be visible only within a given translation unit, in which case they are said to have internal linkage and are not visible to the linker, or they may be visible to other object files, in which case they are said to have external linkage and are visible to the linker.

C does not have a notion of modules. However, separate object files (and hence also the translation units used to produce them) function similarly to separate modules, and if a source file does not include other source files, internal linkage (translation unit scope) may be thought of as "file scope, including all header files".

In C++, a module may be split into a module interface and a module implementation, and may also have module partitions. Each of these module units is itself a translation unit, and the module is the collection of all of them.
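The distinction between internal and external linkage in C can be sketched as follows. All identifiers here (call_count, next_id, shared_total, register_item) are hypothetical; the example assumes the code forms one translation unit of a larger program.

```c
/* Internal linkage: the static keyword at file scope restricts these
   names to this translation unit. */
static int call_count = 0;

static int next_id(void)
{
    return ++call_count;
}

/* External linkage: file-scope names without static can be referenced
   from other translation units and are resolved by the linker. */
int shared_total = 0;

int register_item(int weight)
{
    shared_total += weight;
    return next_id();
}
```

Another translation unit could use the externally linked names by declaring `extern int shared_total;` and `int register_item(int weight);`, whereas it could not refer to call_count or next_id, and could define its own unrelated entities with those names.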

==Code organization==
The bulk of a project's code is typically held in files with a .c suffix (or .cpp, .cxx or .cc for C++, of which .cpp is the most conventional). C++ module files use the extensions .cppm, .ixx, or .mxx, of which .cppm and .ixx are the most popular. Files intended to be included typically have a .h suffix (.hpp or .hh are also used for C++, but .h is the most common even for C++) and generally do not contain function or variable definitions, which avoids name conflicts when a header is included in multiple source files, as is often the case. Header files can be, and often are, included in other header files. It is standard practice for every .c file in a project to include at least one .h file.

==See also==