Traditionally, deep linguistic processing has been concerned with computational grammar development (for use in both
parsing and generation). These grammars were developed and maintained by hand and were computationally expensive to run. In recent years, machine learning approaches (also known as
shallow linguistic processing) have fundamentally altered the field of
natural language processing. Robust, wide-coverage machine learning NLP tools can be created rapidly and with substantially less manual labor, so deep linguistic processing methods have received less attention. However, some computational linguists believe that for computers to understand natural language or perform
inference, detailed syntactic and
semantic representation is necessary. Moreover, while humans can easily understand a sentence and its meaning, shallow linguistic processing may fall short of human-like language 'understanding'. For example:
:a) Things would be different if Microsoft were located in Georgia.
In sentence (a), a shallow information extraction system might wrongly infer that Microsoft's headquarters is located in Georgia, whereas a human reader understands from the counterfactual that Microsoft is not located in Georgia.
:b) The National Institute for Psychobiology in Israel was established in May 1971 as the Israel Center for Psychobiology by Prof. Joel.
In sentence (b), a shallow system could wrongly infer that Israel was established in May 1971. A human reader knows that it is the National Institute for Psychobiology that was established in 1971.

In summary, deep linguistic processing provides a knowledge-rich analysis of language through manually developed grammars and language resources, whereas shallow linguistic processing provides a knowledge-lean analysis of language through statistical or machine learning manipulation of texts and/or annotated linguistic resources.

==Sub-communities==