A
string homomorphism (often referred to simply as a
homomorphism in
formal language theory) is a string substitution such that each character is replaced by a single string. That is, f(a)=s, where s is a string, for each character a. String homomorphisms are
monoid morphisms on the
free monoid, preserving the empty string and the
binary operation of
string concatenation. Given a language L, the set f(L) is called the
homomorphic image of L. The
inverse homomorphic image of a string s is defined as f^{-1}(s) = \{ w \mid f(w) = s \} while the inverse homomorphic image of a language L is defined as f^{-1}(L) = \{ s \mid f(s) \in L \} In general, f(f^{-1}(L)) \neq L, while one does have f(f^{-1}(L)) \subseteq L and L \subseteq f^{-1}(f(L)) for any language L. The class of regular languages is closed under homomorphisms and inverse homomorphisms. Similarly, the context-free languages are closed under homomorphisms and inverse homomorphisms. A string homomorphism is said to be ε-free (or e-free) if f(a) \neq \varepsilon for all
a in the alphabet \Sigma. Simple single-letter
substitution ciphers are examples of (ε-free) string homomorphisms. An example string homomorphism
guc can also be obtained by defining similar to the
above substitution:
guc(‹a›) = ‹A›, ...,
guc(‹0›) = ε, but letting
guc be undefined on punctuation chars. uc differs from
fuc by returning strings, while the latter returned singleton sets of strings. ---> Examples for inverse homomorphic images are •
guc−1({ ‹SSS› }) = { ‹sss›, ‹sß›, ‹ßs› }, since
guc(‹sss›) =
guc(‹sß›) =
guc(‹ßs›) = ‹SSS›, and •
guc−1({ ‹A›, ‹bb› }) = { ‹a› }, since
guc(‹a›) = ‹A›, while ‹bb› cannot be reached by
guc. For the latter language,
guc(
guc−1({ ‹A›, ‹bb› })) =
guc({ ‹a› }) = { ‹A› } ≠ { ‹A›, ‹bb› }. The homomorphism
guc is not ε-free, since it maps e.g. ‹0› to ε. A very simple string homomorphism example that maps each character to just a character is the conversion of an
EBCDIC-encoded string to
ASCII. ==String projection==