Cyclomatic complexity

Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.

Description

Definition , so the program's cyclomatic complexity is . There are multiple ways to define cyclomatic complexity of a section of source code. One common way is the number of linearly independent paths within it. A set S of paths is linearly independent if the edge set of any path P in S is not the union of edge sets of the paths in some subset of S/P. If the source code contained no control flow statements (conditionals or decision points) the complexity would be 1, since there would be only a single path through the code. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement is TRUE and another one where it is FALSE. Here, the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3. Another way to define the cyclomatic complexity of a program is to look at its control-flow graph, a directed graph containing the basic blocks of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity is then defined as M = E - N + 2P, where • = the number of edges of the graph. • = the number of nodes of the graph. • = the number of connected components. , which also results in a cyclomatic complexity of 3 using the alternative formulation (). An alternative formulation of this, as originally proposed, is to use a graph in which each exit point is connected back to the entry point. In this case, the graph is strongly connected. Here, the cyclomatic complexity of the program is equal to the cyclomatic number of its graph (also known as the first Betti number), which is defined as M = E - N + 2. Cyclomatic complexity may be applied to several such programs or subprograms at the same time (to all of the methods in a class, for example). In these cases, will equal the number of programs in question, and each subprogram will appear as a disconnected subset of the graph. McCabe showed that the cyclomatic complexity of a structured program with only one entry point and one exit point is equal to the number of decision points ("if" statements or conditional loops) contained in that program plus one. This is true only for decision points counted at the lowest, machine-level instructions. Decisions involving compound predicates like those found in high-level languages like IF cond1 AND cond2 THEN ... should be counted in terms of predicate variables involved. In this example, one should count two decision points because at machine level it is equivalent to IF cond1 THEN IF cond2 THEN .... Cyclomatic complexity may be extended to a program with multiple exit points. In this case, it is equal to \pi - s + 2, where \pi is the number of decision points in the program and is the number of exit points. Interpretation In his presentation "Software Quality Metrics to Identify Risk" for the Department of Homeland Security, Tom McCabe introduced the following categorization of cyclomatic complexity: • 1–10: Simple procedure, little risk • 11–20: More complex, moderate risk • 21–50: Complex, high risk • > 50: Untestable code, very high risk == In algebraic topology ==

{{anchor|Explanation in terms of algebraic topology}}In algebraic topology

An even subgraph of a graph (also known as an Eulerian subgraph) is one in which every vertex is incident with an even number of edges. Such subgraphs are unions of cycles and isolated vertices. Subgraphs will be identified with their edge sets, which is equivalent to only considering those even subgraphs which contain all vertices of the full graph. The set of all even subgraphs of a graph is closed under symmetric difference, and may thus be viewed as a vector space over GF(2). This vector space is called the cycle space of the graph. The cyclomatic number of the graph is defined as the dimension of this space. Since GF(2) has two elements and the cycle space is necessarily finite, the cyclomatic number is also equal to the 2-logarithm of the number of elements in the cycle space. A basis for the cycle space is easily constructed by first fixing a spanning forest of the graph, and then considering the cycles formed by one edge not in the forest and the path in the forest connecting the endpoints of that edge. These cycles form a basis for the cycle space. The cyclomatic number also equals the number of edges not in a maximal spanning forest of a graph. Since the number of edges in a maximal spanning forest of a graph is equal to the number of vertices minus the number of components, the formula E-N+P defines the cyclomatic number. Cyclomatic complexity can also be defined as a relative Betti number, the size of a relative homology group:M := b_1(G,t) := \operatorname{rank}H_1(G,t), which is read as "the rank of the first homology group of the graph G relative to the terminal nodes t". This is a technical way of saying "the number of linearly independent paths through the flow graph from an entry to an exit", where: • "linearly independent" corresponds to homology, and backtracking is not double-counted; • "paths" corresponds to first homology (a path is a one-dimensional object); and • "relative" means the path must begin and end at an entry (or exit) point. This cyclomatic complexity can be calculated. It may also be computed via absolute Betti number by identifying the terminal nodes on a given component, or drawing paths connecting the exits to the entrance. The new, augmented graph \tilde G obtains M = b_1(\tilde G) = \operatorname{rank}H_1(\tilde G). It can also be computed via homotopy. If a (connected) control-flow graph is considered a one-dimensional CW complex called X, the fundamental group of X will be \pi_1(X) \cong \Z^{*n}. The value of n+1 is the cyclomatic complexity. The fundamental group counts how many loops there are through the graph up to homotopy, aligning as expected. ==Applications==

Applications

Limiting complexity during development One of McCabe's original applications was to limit the complexity of routines during program development. He recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10. Measuring the "structuredness" of a program Section VI of McCabe's 1976 paper is concerned with determining what the control-flow graphs (CFGs) of non-structured programs look like in terms of their subgraphs, which McCabe identified. (For details, see structured program theorem.) McCabe concluded that section by proposing a numerical measure of how close to the structured programming ideal a given program is, i.e. its "structuredness". McCabe called the measure he devised for this purpose essential complexity. If a program is structured, then McCabe's reduction/condensation process reduces it to a single CFG node. In contrast, if the program is not structured, the iterative process will identify the irreducible part. The essential complexity measure defined by McCabe is simply the cyclomatic complexity of this irreducible graph, so it will be precisely 1 for all structured programs, but greater than one for non-structured programs. Some studies find a positive correlation between cyclomatic complexity and defects; functions and methods that have the highest complexity tend to also contain the most defects. However, the correlation between cyclomatic complexity and program size (typically measured in lines of code) has been demonstrated many times. Les Hatton has claimed that complexity has the same predictive ability as lines of code. Studies that controlled for program size (i.e., comparing modules that have different complexities but similar size) are generally less conclusive, with many finding no significant correlation, while others do find correlation. Some researchers question the validity of the methods used by the studies finding no correlation. Although this relation likely exists, it is not easily used in practice. Since program size is not a controllable feature of commercial software, the usefulness of McCabe's number has been questioned. ==See also==

Source: Wikipedia ↗

tickerdossier.com tickerdossier.substack.com