Gearman assigns each involved computer a role as client, job server, or worker. A worker machine can run multiple instances of the worker role, which allows more powerful computers to complete more portions of a given task. Tasks originate on a client, are transmitted from the client to the job server, and are performed on one or more workers. The completed task's output is then returned, again by way of the job server, to the client where the task originated.

Gearman is conceptually related to
MapReduce; Gearman handles MapReduce by allowing worker nodes to map out work to other workers, with the original worker acting as the reducer.

Gearman performs coalescence on the work sent by clients. If two or more clients ask for the same body of work to be completed, it coalesces the requests so that only one worker is used. Gearman detects such duplicates either by seeing that the same blocks are being sent or by using the unique value sent by the client. It does this specifically to avoid thundering herd problems, which are common after cache misses.

To mitigate the damage that would be done if a job server (or its network connection) were to fail, clients can be configured with more than one assigned job server; if the first assigned job server fails, another can be transparently substituted.

Gearman implements a
protocol that consists of binary packets containing requests and responses; this protocol defines the structure of messages passing between the three parts of a Gearman implementation. By default, the Gearman protocol uses TCP port 4730. It previously operated on port 7003, but this conflicted with the AFS port range, and the new port (4730) was assigned by IANA.

The name "Gearman" was chosen as an anagram for "Manager", "since it dispatches jobs to be done, but does not do anything useful itself."

== Features ==
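The binary request and response packets mentioned in the protocol description above can be illustrated with a short sketch. It assumes the layout given in the published Gearman protocol description: a 4-byte magic code, a 4-byte big-endian packet type, a 4-byte big-endian body length, and then NUL-separated arguments. The helper names `encode_packet`, `decode_packet`, and `submit_job` are hypothetical and do not belong to any Gearman library, and the function name `reverse` and the unique value `job-1` are made up for the example. Nothing is sent over the network here.

```python
import struct
from typing import List, Tuple

REQ_MAGIC = b"\x00REQ"  # magic code that opens a request packet
SUBMIT_JOB = 7          # packet type for submitting a foreground job


def encode_packet(magic: bytes, packet_type: int, args: List[bytes]) -> bytes:
    """Build one packet: 12-byte header followed by NUL-separated arguments."""
    body = b"\x00".join(args)
    # Type and body length are 32-bit unsigned integers in network byte order.
    header = magic + struct.pack(">II", packet_type, len(body))
    return header + body


def decode_packet(packet: bytes) -> Tuple[bytes, int, bytes]:
    """Split one packet back into magic code, packet type, and raw body."""
    magic = packet[:4]
    packet_type, size = struct.unpack(">II", packet[4:12])
    return magic, packet_type, packet[12:12 + size]


def submit_job(function: str, unique: str, payload: bytes) -> bytes:
    """Hypothetical helper: encode a job submission for the given function.

    The unique value is the field a job server can compare across clients
    when deciding whether two submissions describe the same work.
    """
    return encode_packet(
        REQ_MAGIC, SUBMIT_JOB, [function.encode(), unique.encode(), payload]
    )


packet = submit_job("reverse", "job-1", b"hello")
magic, packet_type, body = decode_packet(packet)
# Split at most twice so NUL bytes inside the payload are preserved.
function, unique, payload = body.split(b"\x00", 2)
```

In a real deployment a client would write bytes like these to a TCP connection to the job server, by default on port 4730, and read a response packet with the same header layout.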