Receptive field modification Research on basic
sensory discriminations often show that perceptual
learning effects are specific to the trained task or
stimulus. Many researchers take this to suggest that perceptual learning may work by modifying the
receptive fields of the cells (e.g.,
V1 and V2 cells) that initially encode the stimulus. For example, individual cells could adapt to become more sensitive to important features, effectively recruiting more cells for a particular purpose, making some cells more specifically tuned for the task at hand. Evidence for receptive field change has been found using single-cell recording techniques in
primates in both tactile and auditory domains. However, not all perceptual
learning tasks are specific to the trained stimuli or tasks. Sireteanu and Rettenback discussed
discrimination learning effects that generalize across eyes, retinal locations and tasks. Ahissar and Hochstein used visual search to show that learning to detect a single line element hidden in an array of differently-oriented line segments could generalize to positions at which the target was never presented. In human vision, not enough receptive field modification has been found in early visual areas to explain perceptual learning. Training that produces large behavioral changes such as improvements in discrimination does not produce changes in receptive fields. In studies where changes have been found, the changes are too small to explain changes in behavior.
Reverse hierarchy theory The Reverse Hierarchy Theory (RHT), proposed by Ahissar & Hochstein, aims to link between learning dynamics and specificity and the underlying neuronal sites. RHT proposes that naïve performance is based on responses at high-level cortical areas, where crude, categorical level representations of the environment are represented. Hence initial learning stages involve understanding global aspects of the task. Subsequent practice may yield better perceptual resolution as a consequence of accessing lower-level information via the feedback connections going from high to low levels. Accessing the relevant low-level representations requires a backward search during which informative input populations of neurons in the low level are allocated. Hence, subsequent learning and its specificity reflect the resolution of lower levels. RHT thus proposes that initial performance is limited by the high-level resolution whereas post-training performance is limited by the resolution at low levels. Since high-level representations of different individuals differ due to their prior experience, their initial learning patterns may differ. Several imaging studies are in line with this interpretation, finding that initial performance is correlated with average (BOLD) responses at higher-level areas whereas subsequent performance is more correlated with activity at lower-level areas. RHT proposes that modifications at low levels will occur only when the backward search (from high to low levels of processing) is successful. Such success requires that the backward search will "know" which neurons in the lower level are informative. This "knowledge" is gained by training repeatedly on a limited set of stimuli, such that the same lower-level neuronal populations are informative during several trials. Recent studies found that mixing a broad range of stimuli may also yield effective learning if these stimuli are clearly perceived as different, or are explicitly tagged as different. These findings further support the requirement for top-down guidance in order to obtain effective learning.
Enrichment versus differentiation In some complex perceptual tasks, all
humans are experts. We are all very sophisticated, but not infallible at scene identification, face identification and speech
perception. Traditional explanations attribute this expertise to some holistic, somewhat specialized, mechanisms. Perhaps such quick identifications are achieved by more specific and complex perceptual detectors which gradually "chunk" (i.e., unitize) features that tend to concur, making it easier to pull a whole set of information. Whether any concurrence of features can gradually be chunked with practice or chunking can only be obtained with some pre-disposition (e.g. faces, phonological categories) is an open question. Current findings suggest that such expertise is correlated with a significant increase in the cortical volume involved in these processes. Thus, we all have somewhat specialized face areas, which may reveal an innate property, but we also develop somewhat specialized areas for written words as opposed to single letters or strings of letter-like symbols. Moreover, special experts in a given domain have larger cortical areas involved in that domain. Thus, expert musicians have larger auditory areas. These observations are in line with traditional theories of enrichment proposing that improved performance involves an increase in cortical representation. For this expertise, basic categorical identification may be based on enriched and detailed representations, located to some extent in specialized brain areas.
Physiological evidence suggests that training for refined discrimination along basic dimensions (e.g. frequency in the auditory modality) also increases the representation of the trained parameters, though in these cases the increase may mainly involve lower-level sensory areas.
Selective reweighting In 2005, Petrov, Dosher and Lu pointed out that perceptual
learning may be explained in terms of the selection of which analyzers best perform the classification, even in simple discrimination tasks. They explain that the some part of the neural system responsible for particular decisions have specificity, while low-level perceptual units do not. ==The impact of training protocol and the dynamics of learning==