Human perception is a remarkable thing. We take for granted the ease with which we process a very jumbled and confusing world. Humans are sensory creatures: we gain information about our environment through specific organs and nerves. The sensations we experience are absolute: they are entirely quantifiable, as they are detections of physical energy around us, directly related to the intensity or location of the light entering our eyes, the sound waves entering our ears, and so on. This information is then converted to electrical signals by the sensory nerves and sent toward the brain. This is where sensation ends, perception begins, and the picture gets complicated.

A great way to think about perception is as a probabilistic function: there is no guaranteed output for a given input, just certain outcomes that are more likely than others. The sensory information that gets passed to the brain is messy. As an example, imagine you are in a room listening to your own music. Someone is trying to talk to you, while outside cars are loudly whizzing by. Three high-magnitude sources of sound waves arrive at your ears as a single combined wave. Yet your brain breaks that one wave apart into its primary components and understands each of them about as well as physically possible. There are countless ways to decompose the problem, yet your brain unconsciously does so in exactly the way you need it to. (A toy version of this unmixing problem appears in the first code sketch below.)

Your brain can do this because it is exceptionally good at three things that AI algorithms struggle with: selecting which subset of the information constitutes an object, then focusing on it; organizing sensory information so that irrelevant parts can be disregarded; and using real-world knowledge, which provides predictive power for patterns. Aspects of observed objects that matter for your subsequent behavior are generally not present in, say, the retinal image alone: whether something is poisonous, or deceptively sharp. We need background knowledge to make sense of countless things in our daily life, which is why the brain can be thought of as a prediction machine. This machine's job is to minimize surprise: it constantly takes in information about the world, updating assumptions and expectations so that you can function as seamlessly as possible.

This is all a long-winded way of saying that perception is the recognition and interpretation of the sensory signals your brain receives. Nothing absolute dictates that a completely novel set of sensed stimuli should equate to a picture of a familiar person. But that is what perception does: it characterizes the important elements of the image, discounts the irrelevant portions, and uses prior experience to gauge how the new image should be categorized.
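To make the cocktail-party example concrete, here is a minimal sketch of blind source separation using scikit-learn's FastICA. It cheats relative to the brain: classical ICA needs roughly as many recording channels as sources, so the sketch mixes three synthetic signals into three observed channels, whereas your brain manages with two ears and a lot of prior knowledge. The "music", "voice", and "traffic" signals are made-up stand-ins, not real audio.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Three "sound" sources: synthetic stand-ins for music, a voice, and traffic.
music = np.sin(2 * t)                    # smooth periodic signal
voice = np.sign(np.sin(3 * t))           # square wave
traffic = 2 * ((0.5 * t) % 1) - 1        # sawtooth wave

S = np.c_[music, voice, traffic]
S += 0.1 * rng.standard_normal(S.shape)  # sensor noise
S /= S.std(axis=0)                       # standardize each source

# Each "microphone" hears a different weighted mix of all three sources.
A = np.array([[1.0, 1.0, 1.0],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])          # mixing matrix
X = S @ A.T                              # observed mixtures only

# FastICA recovers the sources (up to order and scale) from the mixtures alone.
ica = FastICA(n_components=3, random_state=0)
S_est = ica.fit_transform(X)

# Check recovery: each true source should correlate strongly with one estimate.
corr = np.corrcoef(S.T, S_est.T)[:3, 3:]
print(np.round(np.abs(corr), 2))
```

The point is not that the brain runs ICA, but that "one wave in, several meaningful streams out" becomes a well-posed problem once you add assumptions (here, statistical independence of the sources).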
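The "prediction machine" idea can likewise be phrased as Bayesian inference: perception combines a prior (expectations built from past experience) with a likelihood (how well each hypothesis explains the current sensory evidence). A minimal sketch, with entirely made-up numbers for the prior and the evidence:

```python
import numpy as np

# Hypotheses about what an ambiguous shape in the grass might be.
hypotheses = ["stick", "snake", "hose"]
prior = np.array([0.70, 0.05, 0.25])        # expectations from experience (made up)

# Noisy sensory evidence: how well each hypothesis predicts what you actually see.
likelihood = np.array([0.20, 0.60, 0.30])   # made-up values

# Bayes' rule: posterior is proportional to likelihood times prior.
posterior = likelihood * prior
posterior /= posterior.sum()

for h, pr, po in zip(hypotheses, prior, posterior):
    print(f"{h:>5}: prior={pr:.2f} -> posterior={po:.2f}")

# "Surprise" is the negative log probability of the evidence under the model;
# updating the prior toward the posterior is what reduces surprise next time.
evidence = (likelihood * prior).sum()
print(f"surprise = {-np.log(evidence):.2f} nats")
```

Folding the posterior back in as the next prior, observation after observation, is the "minimize surprise" loop in miniature.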
As an example, let's walk through visual perception, arguably the most important form of perception since we are primarily visual creatures. Sensory information travels from the optic nerve → thalamus → visual cortex (at the rear of the brain). Currently, we think of visual processing as running through two distinct streams: the What pathway (along the lower rear and side of the brain) for recognizing and identifying objects, and the Where pathway (upper rear of the brain) for object movement and location, important for visually guided behavior.

Subregions of the visual cortex are organized hierarchically: simple visual features are represented in 'lower' areas and more complex features in 'higher' areas. At the bottom, neurons are sensitive to basic signals such as edge orientation and motion direction. Moving up, areas process more complex visual features: contours, textures, and whether something belongs to the foreground or the background. At the top of the What hierarchy sits the inferior temporal (IT) cortex, which represents complete objects (the fusiform face area, for instance, responds specifically to faces). Top regions in the Where stream are involved in tasks like guiding eye movements (saccades) using working memory, and integrating vision with body position (e.g. as you reach for an object). In sum, as visual input works its way up the hierarchy, simple features are combined into more complex ones, until at the top of the hierarchy neurons can represent complete visual objects such as a face. (A toy network mirroring this layered structure appears in the first sketch below.)

Brains are trying to make sense of reality rather than seeking the truth. This means that everything you perceive is interpreted, first, in the context of existing assumptions and in a way consistent with your worldview; something only registers as surprising when it is inconsistent with your predictions. Think of optical illusions like the checker-shadow illusion: squares we perceive as light because they sit in shadow are actually darker than we perceive them to be, because the brain 'corrects' for the shadow it expects.

A really cool example of neurotechnology in this space is using deep neural networks to reconstruct the images shown to people. Researchers do this by working with fMRI data and structuring their neural network so that it reflects, as closely as possible, the hierarchical process of image representation that human brains follow. They take advantage of the multiple levels of visual cortical representation, reconstructing images both from seen contents and from imagined ones.
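This simple-to-complex hierarchy is exactly the structure that convolutional neural networks borrow, which is why they are popular models of the ventral stream. Here is a toy sketch; the layer names and sizes are illustrative analogies, not claims about actual cortical wiring:

```python
import torch
import torch.nn as nn

# A toy ventral-stream analogue. The mapping to brain areas is loose:
# early convolutions ~ V1 (edges/orientations), middle ~ V2/V4 (contours,
# textures), and the final pooled vector ~ IT (whole-object representation).
class ToyVentralStream(nn.Module):
    def __init__(self, n_objects: int = 10):
        super().__init__()
        self.v1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.v2_v4 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.it = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_objects))

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        edges = self.v1(image)        # simple, local features
        contours = self.v2_v4(edges)  # combinations of simple features
        objects = self.it(contours)   # one score per whole-object category
        return objects

x = torch.randn(1, 3, 64, 64)       # a dummy 64x64 RGB "retinal" image
print(ToyVentralStream()(x).shape)  # -> torch.Size([1, 10])
```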
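The reconstruction work described here (for example, the Kamitani lab's deep image reconstruction studies) roughly proceeds in two stages: first, linear decoders are trained to predict a pretrained CNN's layer-wise features from fMRI activity; then an image is optimized by gradient descent until its own CNN features match the decoded ones, layer by layer. The sketch below shows only the second stage, and only under stated assumptions: random target features stand in for decoded fMRI features, and the VGG network is left untrained so the script runs offline. The real method uses pretrained weights and actual decoded features.

```python
import torch
import torchvision.models as models

# Feature-matching reconstruction, stage two. In the real pipeline the targets
# come from linear decoders trained on fMRI; here they are random stand-ins.
torch.manual_seed(0)
vgg = models.vgg19(weights=None).features.eval()  # real use: pretrained weights
for p in vgg.parameters():
    p.requires_grad_(False)

layers = [2, 7, 16, 25]  # a few depths: low-level edges up to high-level features

def layer_features(image: torch.Tensor) -> list[torch.Tensor]:
    feats, x = [], image
    for i, module in enumerate(vgg):
        x = module(x)
        if i in layers:
            feats.append(x)
    return feats

# Stand-ins for the multi-level features decoded from brain activity.
with torch.no_grad():
    targets = layer_features(torch.rand(1, 3, 128, 128))

# Start from noise and nudge the pixels until the image's features match.
image = torch.rand(1, 3, 128, 128, requires_grad=True)
opt = torch.optim.Adam([image], lr=0.05)

for step in range(100):
    opt.zero_grad()
    loss = sum(torch.nn.functional.mse_loss(f, t)
               for f, t in zip(layer_features(image), targets))
    loss.backward()
    opt.step()
    with torch.no_grad():
        image.clamp_(0.0, 1.0)  # keep pixels in a valid range

print(f"final feature-matching loss: {loss.item():.4f}")
```

Matching features at several depths at once is what lets the reconstruction exploit the cortical hierarchy: early layers constrain edges and textures, later layers constrain object-level content.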