CAL Actor Language

CAL is a high-level programming language for writing (dataflow) actors, which are stateful operators that transform input streams of data objects (tokens) into output streams. CAL has been compiled to a variety of target platforms, including single-core processors, multicore processors, and programmable hardware. It has been used in several application areas, including video processing, compression, and cryptography. The MPEG Reconfigurable Video Coding (RVC) working group has adopted CAL as part of its technical standards efforts.

History and Introduction

The CAL Actor Language was developed in 2001 as part of the Ptolemy II project at the University of California, Berkeley. CAL is a dataflow language intended for a variety of application domains, such as multimedia processing, control systems, and network processing.

A common reason for choosing dataflow is that the goal is an efficient parallel implementation, which would be difficult or impossible to achieve using a sequential programming language. Sequential languages are notoriously difficult to parallelize in general, so efficient parallel implementations usually need significant guidance from the user. A CAL dataflow program provides simple, understandable, and powerful abstractions that allow the specification of as much or as little parallelism as needed, enabling tools to produce sophisticated implementations that exploit the concurrent structure of a computation.

When programming in dataflow, the programmer is typically constructing a concurrent description of a computational system, which is different from a common sequential program. Rather than being concerned with the step-by-step execution of an algorithm, a dataflow programmer builds a system of asynchronously communicating entities called actors. Much of the programming effort is directed toward finding a good factoring of a problem into actors, and toward engineering appropriate communication patterns among the actors.

CAL features

The structure of actors

Actors perform their computation in a sequence of steps called firings. In each such step, an actor may:

• Consume tokens from its input ports
• Modify its internal state
• Produce tokens at its output ports

Consequently, describing an actor involves describing its interface to the outside (the ports), the structure of its internal state, the steps it can perform, what these steps do (in terms of token production and consumption, and the update of the actor state), and how to pick the step that the actor will perform next. This section discusses some of the constructs in the CAL language that deal with these issues.

Actions describe the things that occur during a step that an actor takes. It is accurate to say that a step consists of executing an action. When an actor takes a step, it may consume input tokens and produce output tokens. Input patterns therefore serve three purposes:

• They define the number of tokens (for each port) to be consumed when an action is executed (fired).
• They declare the variable symbols by which the tokens consumed by an action firing are referred to within the action.
• They define a firing condition for the action, i.e., a condition that must be met for the action to be able to fire.

The output side of an action is a little simpler: the output expressions define the number and values of the output tokens to be produced on each output port by each firing of the action. It is permissible to omit the explicit naming of the port that an input pattern or output expression applies to if an action provides as many input patterns as there are input ports, or as many output expressions as there are output ports. In such a case, the patterns or expressions are matched by position against the port declarations.

One way to view an actor is as an operator on streams of data: sequences of tokens enter it on its input ports, and sequences of tokens leave it on its output ports.
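The firing model described above can be sketched in Python. This is an illustrative model rather than CAL syntax: the `AddActor` class, its port names `Input1` and `Input2`, and the `firings` counter are all assumptions introduced for the example. The sketch shows an input pattern acting as a firing condition (one token required on each port) and a firing that consumes tokens, updates state, and produces an output token.

```python
from collections import deque


class AddActor:
    """Hypothetical two-input adder; each firing consumes one token per input port."""

    def __init__(self):
        self.inputs = {"Input1": deque(), "Input2": deque()}
        self.output = deque()
        self.firings = 0  # internal state, updated by every firing

    def can_fire(self):
        # The input pattern asks for one token on each port, so the action
        # is fireable only when both queues are non-empty.
        return all(len(queue) >= 1 for queue in self.inputs.values())

    def fire(self):
        a = self.inputs["Input1"].popleft()  # consume input tokens
        b = self.inputs["Input2"].popleft()
        self.firings += 1                    # modify internal state
        self.output.append(a + b)            # produce an output token


def run(actor):
    """Fire the actor until its firing condition no longer holds."""
    while actor.can_fire():
        actor.fire()
    return list(actor.output)


adder = AddActor()
adder.inputs["Input1"].extend([1, 2, 3])
adder.inputs["Input2"].extend([10, 20])
result = run(adder)
print(result, adder.firings, list(adder.inputs["Input1"]))
```

Viewed as an operator on streams, this model maps the input streams 1, 2, 3 and 10, 20 to the output stream of pairwise sums; the unmatched token stays in its queue until a partner arrives.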
When discussing the operation of an actor, it is often useful to look at it as an operator on streams.

Actors can have parameters. They act as constants during the actor execution, and are given a concrete value when an actor is instantiated as part of an actor network. The main purpose of actor parameters is to allow programmers to specify families of related actors without having to duplicate a lot of code.

Non-determinism

A non-deterministic actor is one that, for the same input sequences, allows more than one run and more than one possible output. Non-determinism can be very powerful when used appropriately, but it can also be a very troublesome source of errors. One concern is that non-determinism might be introduced into an actor inadvertently, i.e., an author thinks an actor is deterministic even though it is not. One of the key design goals of the CAL language was to allow the description of non-deterministic actors, while simultaneously permitting tools to identify possible sources of non-determinism, so that they can warn about them.

A key consequence of a non-deterministic actor like NDMerge, which merges the tokens arriving on its two input ports into a single output stream, is that during an execution its output may depend on the timing of its input. If both its input queues are empty and NDMerge is waiting for input, then whichever input port the next token arrives at may be the one whose token is copied next to the output. Consequently, the scheduling of activities in the actor network, or the relative speeds of the actors feeding into an actor like NDMerge, may affect the output of the system. Sometimes this is desirable, and at other times it is not. Regardless, it is a property that must be considered.
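The effect of scheduling on a merge actor can be illustrated with a small Python model. It is a sketch under stated assumptions, not CAL code: the function `nd_merge` and its `choose` parameter are hypothetical, and the model assumes all tokens are already queued, so the only freedom left is which enabled action the scheduler picks.

```python
import random
from collections import deque


def nd_merge(in1, in2, choose):
    """Merge two token sequences; `choose` picks among the enabled input queues."""
    q1, q2 = deque(in1), deque(in2)
    out = []
    while q1 or q2:
        # An action is enabled when its input queue holds at least one token.
        enabled = [queue for queue in (q1, q2) if queue]
        out.append(choose(enabled).popleft())
    return out


# Same inputs, three scheduling policies, potentially three different outputs.
first = nd_merge([1, 2], ["a", "b"], choose=lambda enabled: enabled[0])
last = nd_merge([1, 2], ["a", "b"], choose=lambda enabled: enabled[-1])
rng = random.Random(0)  # seeded only to make the sketch repeatable
either = nd_merge([1, 2], ["a", "b"], choose=rng.choice)
print(first, last, either)
```

Every policy preserves the order of tokens within each input stream; only the interleaving differs, which is exactly the part that depends on timing and scheduling.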
One way to view non-determinism of the kind that makes an actor dependent on the precise timing of token arrivals is that such an actor only appears to be non-deterministic when viewed as an operator on streams, because that view abstracts from the temporal properties of the execution and thus purposefully removes information that is used to determine the sequence in which actions fire. From the perspective of the CAL language, this is not entirely accurate, but even so, it is easy to write non-deterministic actors that would not be deterministic even if everything were known about the timing of the tokens and the actor implementation.

Guarded actions

The guard clause of an action contains a set of expressions that must all be true for the action to be fireable. Consider an actor with two guarded actions that routes each incoming token to one of two outputs, P and N. For the first action to be fireable, the incoming token must be greater than or equal to zero, in which case it is sent to output P. Otherwise that action cannot fire. Conversely, for the second action to be fireable, the token must be less than zero, in which case it is sent to output N. In a run of this actor, the non-negative input tokens appear on P and the negative ones on N.

Guards raise several points. First, a variant of this actor whose first guard requires the token to be strictly greater than zero (SplitDead) could run into trouble if it ever encounters a zero token, because none of its actions will be able to fire on it. It is not illegal to write actors that terminate on some input, and it may even be important to have a few of those in some systems, but it is a pitfall that can cause problems. Second, the guard conditions of the original actor are disjoint in addition to being exhaustive. Finally, guard conditions can peek at incoming tokens without consuming them. If the guards are false or the action is not fired for some other reason, and the token is not consumed by another action, then it remains where it is and is available for the next firing. (Or it can remain there indefinitely, as in the case of the zero token in front of SplitDead, which is never removed because the actor is dead.)

The Select actor is another example of the use of guarded actions.
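Guards that peek at tokens without consuming them can be modeled in a few lines of Python. This is a sketch, not CAL: the helper `fire_once`, the tuple layout of an action, and the port name `Output` are assumptions made for illustration, and the strict guard `a > 0` is the assumed form of the SplitDead variant. The helper tries actions in list order, which is harmless here because the guards in both examples are disjoint.

```python
from collections import deque


def fire_once(actions, ports):
    """Fire one fireable action, if any; return False when the actor is stuck.

    Each action is (needed_ports, guard, effect). The guard only peeks at the
    head tokens; tokens are removed only when the action actually fires.
    """
    for needed, guard, effect in actions:
        if all(ports[p] for p in needed):           # tokens present on every port?
            heads = [ports[p][0] for p in needed]   # peek, do not consume
            if guard(*heads):
                tokens = [ports[p].popleft() for p in needed]  # consume
                effect(ports, *tokens)
                return True
    return False


# SplitDead-like actor: neither guard accepts zero.
split_dead = [
    (["Input"], lambda a: a > 0, lambda ports, a: ports["P"].append(a)),
    (["Input"], lambda a: a < 0, lambda ports, a: ports["N"].append(a)),
]
split_ports = {"Input": deque([3, -1, 0, 5]), "P": deque(), "N": deque()}
while fire_once(split_dead, split_ports):
    pass

# Select-like actor: the Boolean token on S decides whether A or B is copied.
select = [
    (["S", "A"], lambda sel, v: sel, lambda ports, sel, v: ports["Output"].append(v)),
    (["S", "B"], lambda sel, v: not sel, lambda ports, sel, v: ports["Output"].append(v)),
]
select_ports = {
    "S": deque([True, False, False]),
    "A": deque(["a1"]),
    "B": deque(["b1", "b2"]),
    "Output": deque(),
}
while fire_once(select, select_ports):
    pass

print(dict(split_ports), list(select_ports["Output"]))
```

In the first run the zero token blocks the queue: it is never consumed, and the token behind it is never reached.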
Select is similar to the NDMerge actor in the sense that it merges two streams (the ones arriving at its A and B input ports). However, it does so according to the (Boolean) values of the tokens arriving at its S input port.

Actors with state

In all the actors so far, nothing an action firing did would in any way affect subsequent firings of actions of the same actor. Using state variables, action firings can leave information behind for subsequent firings of either the same or a different action of the same actor.

The way Select is written, selecting the next input token and copying that token to the output form one atomic step. IterSelect is a variant that uses a state variable to split this into two steps: it first reads the S token and records the choice in its state, and then copies the matching data token in a separate firing. Select and IterSelect are almost, but not entirely, equivalent. First, IterSelect takes twice as many steps to process the same number of tokens. Second, it reads, and therefore consumes, the S input token irrespective of whether a matching data token is available on A or B.
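The two differences can be made concrete with a Python model of an IterSelect-like actor. This is a sketch rather than CAL code: the class layout and the `steps` counter are hypothetical, and the initial state value 0 is an assumption (the values 1 and 2 follow the description of the state variable in the text).

```python
from collections import deque


class IterSelect:
    """Hypothetical model: one firing reads S, a later firing copies the data token."""

    def __init__(self, s, a, b):
        self.S, self.A, self.B = deque(s), deque(a), deque(b)
        self.output = []
        self.state = 0  # assumed encoding: 0 = read S next, 1 = copy from A, 2 = copy from B
        self.steps = 0

    def step(self):
        """Fire one action if possible; return False when nothing can fire."""
        if self.state == 0 and self.S:
            # Consumes the S token even if no data token is waiting yet.
            self.state = 1 if self.S.popleft() else 2
        elif self.state == 1 and self.A:
            self.output.append(self.A.popleft())
            self.state = 0
        elif self.state == 2 and self.B:
            self.output.append(self.B.popleft())
            self.state = 0
        else:
            return False
        self.steps += 1
        return True


actor = IterSelect(s=[True, False, True], a=["a1"], b=["b1"])
while actor.step():
    pass
print(actor.output, actor.steps, list(actor.S), actor.state)
```

Two data tokens take four steps, and the fifth step consumes the last S token although A is empty, leaving the actor waiting in state 1. An atomic Select would have left that S token in its queue.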

Schedules

The IterSelect actor of the prior section illustrated the use of state to control the selection of actions. This is an extremely common thing to do in practice, and the CAL language provides special syntax for this purpose in the form of schedules. Conceptually, schedules can be viewed as codifying a particular pattern of using a state variable; they add nothing to the language in terms of expressiveness. The rationale for using schedules is twofold:

• They are usually easier to use and less error-prone than a state variable combined with many guards and assignments.
• Tools can use the information encoded in a schedule more easily, and thus recognize regularities in the actor that might help them produce more efficient code, or perform other analyses that help in implementation and design.

A schedule describes a state machine as a list of state transitions. Each state transition consists of three parts: the original state, a list of action tags, and the following state.

When IterSelect is rewritten with a schedule, the number of actions increases: instead of the original three, the new version has four actions. The reason is that an action can no longer directly assign the successor state, as it did in the original, where, depending on the value of the token read, state would be assigned either the value 1 or 2. In the version with a schedule, that state modification is implicit in the structure of the state machine, and it occurs depending on which action fires. Accordingly, the condition that checks the value of the token has moved from within the body of the action to the guards of the two actions tagged readT and readF.
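A schedule can be modeled in Python as a transition table keyed by state and action tag. This is a sketch, not CAL schedule syntax: the state names `init`, `waitA`, and `waitB` and the tags `copyA` and `copyB` are assumptions, while `readT` and `readF` are the tags named in the text. Note that no action assigns the state; the table alone determines the successor state.

```python
from collections import deque

# Transitions: (current state, action tag) -> next state.
SCHEDULE = {
    ("init", "readT"): "waitA",
    ("init", "readF"): "waitB",
    ("waitA", "copyA"): "init",
    ("waitB", "copyB"): "init",
}


def run_iter_select(s, a, b):
    """Run the scheduled actor until no action permitted by the schedule can fire."""
    S, A, B, out = deque(s), deque(a), deque(b), []
    # Each action is (guard, effect). The test on the S token lives in the
    # guards of readT and readF rather than in an action body.
    actions = {
        "readT": (lambda: bool(S) and bool(S[0]), lambda: S.popleft()),
        "readF": (lambda: bool(S) and not S[0], lambda: S.popleft()),
        "copyA": (lambda: bool(A), lambda: out.append(A.popleft())),
        "copyB": (lambda: bool(B), lambda: out.append(B.popleft())),
    }
    state, fired = "init", []
    while True:
        for (src, tag), dst in SCHEDULE.items():
            guard, effect = actions[tag]
            if src == state and guard():
                effect()
                fired.append(tag)
                state = dst  # the successor state comes from the schedule
                break
        else:
            return out, fired, state


out, fired, final_state = run_iter_select([True, False], ["a1"], ["b1"])
print(out, fired, final_state)
```

Because the transitions are plain data, a tool could inspect the table directly, for example to see that every read is followed by exactly one copy.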

Priorities

Consider an actor with two actions, each of which reads from a different input port. As long as input is available on only one of its input ports, everything is unambiguous. But, as with NDMerge, as soon as input is available on both input ports, it could fire either of its two actions, and there is nothing in the actor specification that would predispose it to choose one over the other. None of the language constructs discussed so far would allow us to express such a preference. Unlike schedules, which can be regarded as syntactic sugar because they can be reduced to existing elements of the language (state variables, guards, and assignments), this situation requires a true extension: action priorities.

The basic idea is to add a number of inequalities that relate actions with respect to their firing precedence. As in the case of schedules, action tags are used to identify actions that are referred to later on, this time within the priority inequalities. For an actor with one action tagged config and another tagged process, the priority block contains only one such inequality, relating the action tagged config to the one tagged process and giving the former priority over the latter.

Even with priorities, such an actor is still very timing-dependent. In this case, that need not be a problem, and is probably a requirement for the actor to perform its function. But in general, priorities, especially when used as in this example, must be well understood to yield the correct results. Especially when information about the timing of communication within a network is vague, it is probably best to view them as strong implementation directives.
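The role of a priority between two actions can be sketched in Python. This is an illustrative model, not CAL: the function `run_with_priority` and the behaviour of the two actions (config stores a multiplication factor, process scales a data token by it) are assumptions; only the tags `config` and `process` come from the text.

```python
from collections import deque


def run_with_priority(config_tokens, data_tokens, priority):
    """Fire enabled actions until none is left; `priority` lists tags, highest first."""
    config_q, data_q, out = deque(config_tokens), deque(data_tokens), []
    factor = 1  # hypothetical actor state, set by the config action

    def fire_config():
        nonlocal factor
        factor = config_q.popleft()

    def fire_process():
        out.append(data_q.popleft() * factor)

    actions = {
        "config": (lambda: bool(config_q), fire_config),
        "process": (lambda: bool(data_q), fire_process),
    }
    while True:
        # Among the enabled actions, the one earliest in `priority` wins.
        enabled = [tag for tag in priority if actions[tag][0]()]
        if not enabled:
            return out
        actions[enabled[0]][1]()


config_first = run_with_priority([10], [1, 2], priority=["config", "process"])
process_first = run_with_priority([10], [1, 2], priority=["process", "config"])
print(config_first, process_first)
```

With tokens waiting on both ports, the ordering alone decides whether the data is scaled by the new factor. If the config token arrived only after the data had been processed, the priority would have no effect, which is the remaining timing dependence.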

Statements and expressions

The prior section focused mainly on those constructs in CAL that are related to actor-specific concepts: token input and output, actions, controlling the action selection, and so forth. This section discusses the more pedestrian parts of CAL, the statements and expressions used to manipulate data objects and express (sequential) algorithms. This part of the language is similar to what can be found in many procedural programming languages (such as C, Pascal, Java, Ada), so the focus is on areas that might be slightly different in CAL.

Expressions

Unlike languages such as C, CAL makes a strong distinction between statements and expressions. They have very distinct roles and very distinct meanings, and they can never be used interchangeably. An expression in CAL is a piece of code whose sole purpose is to compute a value. We say that an expression has a value, or that it evaluates to a value. For most expressions, the value that they evaluate to will depend on the values of one or more variables at the time when the expression is evaluated. Since variable values may change over time, the same expression may have different values when evaluated at different times.

Atomic expressions

Probably the most fundamental expressions are constants. Another group of basic expressions are variable references. Syntactically, a variable is any sequence of letters and digits. One important property of expressions is that they are guaranteed not to change variables (we also say they have no side effects). Consequently, within an expression, multiple references to the same variable will always yield the same result.

Simple composite expressions

CAL provides operators of two kinds to build expressions: unary and binary. A unary operator in CAL is always a prefix operator, i.e., it appears before its single operand. A binary operator occurs between its two operands.
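The shape of such expressions (constants, variable references, prefix unary operators, and binary operators) can be modeled with a small side-effect-free evaluator in Python. This is a sketch, not a CAL implementation: the tuple encoding, the `evaluate` function, and the operator tables are assumptions, and the tables are illustrative rather than CAL's actual operator set.

```python
import operator

# Illustrative operator tables, not a complete or authoritative list.
UNARY = {"-": operator.neg, "not": operator.not_}
BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul, "<": operator.lt}


def evaluate(expr, env):
    """Compute the value of `expr` in `env` without modifying `env`."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]  # reading a variable never changes it
    if kind == "unary":
        _, op, operand = expr
        return UNARY[op](evaluate(operand, env))
    if kind == "binary":
        _, op, left, right = expr
        return BINARY[op](evaluate(left, env), evaluate(right, env))
    raise ValueError(f"unknown expression kind: {kind!r}")


# Models x * (x + 1); both references to x see the same value.
expr = ("binary", "*", ("var", "x"), ("binary", "+", ("var", "x"), ("const", 1)))
env = {"x": 3}
value_at_3 = evaluate(expr, env)
env_after = dict(env)
value_at_5 = evaluate(expr, {"x": 5})
negated = evaluate(("unary", "-", ("var", "x")), env)
print(value_at_3, value_at_5, negated, env_after)
```

The same expression yields different values only because the variable bindings differ between evaluations; evaluating it leaves the environment untouched.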

Statements

In some ways, statements in CAL are the opposite of expressions: they have no return value, but they can change the values of variables. Indeed, changing the values of variables is the whole point of statements. Statements are executed in strict sequential order: unless otherwise specified, the execution of statements proceeds in the order in which they appear in the program text. This means that any variable changes produced by a statement may affect the execution of subsequent statements.

Control flow

As in most other programming languages, there are control flow constructs to control the order in which the statements within a program are executed. In a foreach loop, the part that directly follows the foreach keyword is a generator, much like those in list comprehensions.

Action

• Input patterns: declaring variables
• Guard: specifying enabling conditions
• Output expressions: computing output tokens
• Body: modifying the actor state

Source: Wikipedia
