August 2017

Race as Information

Understanding how to conceptualize a problem can be useful, even if you aren’t an expert in the field. I don’t work in evolutionary biology, I don’t know their literature, and I don’t know the answers of their field. On the other hand, many of them don’t know the right answers either. I know this because there is a constant debate among otherwise smart academics in the same field over how to understand the interactions between race and genetics. Some claim it’s pseudoscience, others claim it is a rich field of study. How can a non-expert decide who is right?

I first ran into this challenge in my preferred field of economics. About ten years ago Paul Krugman and Milton Friedman were two of my favorite economists, yet they had profound disagreements. I found this to be true across many fields, and ended up growing more interested in the abstract problem of understanding scientific disagreement. As with most ideas I felt I had figured out myself, Scott Alexander had already identified the people who actually discovered them, and had written all about it.

After another few years of reading those guys, studying more modern information and game theory, and making the move into the tech space, I developed a somewhat refined method of classifying problem types. I’m still working it out, but here is an example of what I think is the right way to discuss race:

For those who do want to discuss race, it helps to share rigorous definitions of terms and of the scientific inference underlying the discussion. Race discussions essentially amount to information classification problems: we classify people, and then make inferences about those classes based on the observed outcomes of those who belong to them. I don’t know the answer to racism (is there an answer?) but I at least think there are appropriate scientific ways to discuss race that are generally ignored. These ways are often ignored even by scientists who study the field but have never taken the time to properly learn the philosophy of the scientific method or information theory.

I outlined my view of the philosophy of science and politics in this old post. I think it’s useful to start by imagining the world as an information matrix, with patterns and numbers running a complex simulation, bound only by the laws of reality. In this world what is race? Well, we have all these little simulacrums we call humans. Since we evolved to understand humans as what we are, and as those we interact with and live among, what counts as a human seems obvious. In the simulation though, humans are simply another set of self-replicating patterns. Granted, they are ones that have an exceptionally unusual ability to process and respond to information. Across all the information embedded in these human simulacrums, we could run a learning algorithm to classify the most abstract human. In this case, specifically, we are looking for the set of patterns common across all humans.

For example, we could say every human consists of α, plus a random n-dimensional variable ε that contains all idiosyncratic information. We could represent this as h_i = α + ε_i, where i indexes every human. All information differences among humans would then be contained in the epsilon term. We can do this, and feel confident it corresponds to reality, because we know that everything about ourselves is encoded in our genetic information. In a world where humans were homogeneous, but also contained a noise component, we might imagine everyone looking exactly the same, except some people have green eyes, and some have blue eyes. In this case everything else about us would be encapsulated in α, with the ε_i term simply being a draw from a two-outcome (Bernoulli) distribution that assigns eye color.
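To make the homogeneous toy world concrete, here is a minimal Python sketch of h_i = α + ε_i in which ε_i is a single binary eye-color draw. Everything named here (`ALPHA`, `sample_human`, `sample_population`, the `p_green` parameter) is an illustrative placeholder, and the tuple standing in for α is obviously a cartoon, not a claim about how genetic information is actually structured.

```python
import random

# Hypothetical stand-in for alpha: everything shared by every individual.
ALPHA = ("shared_pattern_1", "shared_pattern_2", "shared_pattern_3")


def sample_human(rng, p_green=0.5):
    """Return one h_i = alpha + epsilon_i in the homogeneous toy world.

    epsilon_i is a single two-outcome (Bernoulli) draw assigning eye color.
    """
    epsilon = "green" if rng.random() < p_green else "blue"
    return {"alpha": ALPHA, "epsilon": epsilon}


def sample_population(n, p_green=0.5, seed=0):
    """Draw n individuals with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sample_human(rng, p_green) for _ in range(n)]


population = sample_population(1000, p_green=0.3)

# Every individual carries the same alpha; all variation lives in epsilon.
shared_parts = {h["alpha"] for h in population}
eye_colors = {h["epsilon"] for h in population}
```

In this sketch `shared_parts` contains a single element, while `eye_colors` holds the only variation in the population. The real ε_i would be high-dimensional and far from independent across traits, which is where the rest of the argument picks up.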

We know though that humans are more complex than that, with there being correlations between traits and geography. As some humans evolved in separate pockets from one another, they developed different traits. Crucially though, that’s not necessarily the same thing as what most people mean when they discuss race.

The challenge is that the ε_i encodes all information differences, yet due to our inability to scan the genetic code of everyone we meet, we have to use our senses to infer their genetic information instead. For example, Africans have an incredible amount of genetic diversity. However, outside of subtle differences, they are generally classified as ‘black.’ Currently I’m classified as white, although back during Churchill’s England, I wouldn’t have been viewed as the same race, due to what they would have considered my mother’s technically non-white ethnicity. We also hear stories about how Irish immigrants were viewed as being from the ‘Irish race.’ I don’t want to dive too deep into a historical study of race classifications, but I’m currently reading a few books on WWII, and at least some of the Americans and British viewed ‘the Hun’ (Germans) as being part of a violent war-mongering race. Hitler, of course, viewed the Slavs as being subhuman. Whereas today Russians are obviously classified as white. What gives?

There have been thousands of books written on the variation of racial classifications over time. It’s an interesting topic and worth studying. The question is: why do they change over time? What does that tell us about human reasoning systems?

When we observe another human we are given access to a set of information that primarily comes through our sense of sight. This set of information we observe isn’t the same as the true set of information. The true set of information is what is running on the simulation mainframe. The human brain observes a set of information, and then makes a classification using its own poorly trained, biased, heuristic classification algorithm. This is why whiteness can change over centuries: whiteness itself is simply an abstraction based on classifying an in-group on an observable characteristic. And what is an incredibly easily observable characteristic? Skin color. In fact, there is an entire field of ‘whiteness studies’ dedicated to this question. Hitler didn’t care much about whiteness; he created a new classification of Aryanism. Colonialist powers seem to have focused more on whiteness, as it was the characteristic with the highest fidelity for distinguishing them from their colonies.

Our classifications do not perfectly map to genetic differences and they are highly biased by cultural, nationalistic, religious, and historical exogenous factors.

But there is a correlation. How can we tell? Because scientists have tested the mapping, as reported at infoproc:

The figure is from the following paper, reporting on a study of over 4000 individuals. The researchers can group most Europeans into a geographical cline (NW vs SE, that’s the red band in the lower right of the figure; there are two clusters but also individuals who are in-between) + Ashkenazim (the pink isolated cluster in the upper left) using a few hundred markers. I’m sure even better resolution can be obtained with more loci.

Paper: Discerning the Ancestry of European Americans in Genetic Association Studies

We see here the mapping between human race-based classifications and the closest we have come so far to using pure unbiased genetic information to create genetic classifications. As it turns out, there is a correlation between the skin color we observe and the genetic clustering of the person we observe it in. However, skin color is a much simpler classification model than the true model. As a result we may bucket people into the same category based on skin color when they actually belong to separate categories.
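The bucketing problem can be sketched in a few lines of Python. This is a deliberately abstract illustration, not a model of any real data: the latent clusters `A` through `D`, their observable values in `CLUSTER_CENTERS`, the noise level, and the threshold in `coarse_label` are all made-up assumptions chosen only to show a one-dimensional observer merging distinct latent groups.

```python
import random
from collections import defaultdict

# Hypothetical latent clusters and the single observable value each tends
# to produce. The observer never sees the cluster name, only the value.
CLUSTER_CENTERS = {"A": 0.20, "B": 0.25, "C": 0.80, "D": 0.30}


def observe(cluster, rng, noise=0.03):
    """Noisy one-dimensional observation of a latent cluster."""
    return CLUSTER_CENTERS[cluster] + rng.gauss(0, noise)


def coarse_label(value, threshold=0.5):
    """The observer's crude two-bucket classifier on the observable."""
    return "low" if value < threshold else "high"


rng = random.Random(0)
buckets = defaultdict(set)
for _ in range(2000):
    cluster = rng.choice(sorted(CLUSTER_CENTERS))
    buckets[coarse_label(observe(cluster, rng))].add(cluster)

# Which latent clusters ended up sharing each observed bucket?
merged = {label: sorted(members) for label, members in buckets.items()}
```

With these assumed values, clusters `A`, `B`, and `D` all land in the `low` bucket even though they are distinct in the latent space, which is the sense in which the simple classifier is lossy.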

What this might mean in practice though, is that when we discuss racial differences we are:

1.) Discussing something correlated with, but by no means determined by, genetic information.

2.) Discussing a topic where the interaction between the first point and societal dynamics can change the measurement itself. For example, if a group of persons observes ‘black people’, they are actually observing a large set of genetic variation. By then enslaving them they introduce a historical shock that severely damages the group’s ability to perform, even after enslavement has ended.

3.) A group may enslave another group, and then use observable genetic differences to create an after-the-fact story about superiority of one group over the other.

4.) On average blue people may be smarter than red people, but this is only because n% of red people are also circles, but there is no way for a color-person to observe geometric shapes. So they instead have to rely on color, which is correlated with geometric shapes.

5.) Different groups of people slowly build shared algorithms (sometimes called culture), which promote education. These algorithms can be incredibly potent at creating efficient and intelligent humans. If another group of people has not yet created this algorithm, or had its progress wiped out by being conquered or enslaved, it may be difficult to infer whether its lack of performance is due to genetic information or to the cultural algorithm it is lacking.

6.) It is also possible that the ability for a society to create a cultural algorithm that promotes intelligence is tied to the intelligence of the society itself. Or, worse, perhaps it is created over centuries by the absolute smartest humans created by this society. These humans may only receive the platform to shape the algorithm if placed within a society that is receptive to their ideas. This iterative process could itself be sensitive to the average base ability. Meaning the ability to self-optimize a culture algorithm to promote intelligence could be radically tied to the distribution of base ability of the society itself.

7.) Selective mating patterns within a homogeneous group of people may lead to rapid improvement of certain traits or intellectual abilities, as we observe in Ashkenazi Jews. These selective mating patterns could be driven by a cultural shift, or an exogenous force (Christians being forbidden from money-lending).

8.) Mating between groups can persist, and create new local areas of genetic variation in high-dimensional genetic space.
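The color-and-shape scenario in point 4 is a standard confounding setup, and a small simulation makes it easier to see. The sketch below is a hypothetical illustration under assumptions of my choosing: the function names `simulate` and `mean_score`, the circle shares for each color, and the score distributions are all invented. The one structural assumption that matters is that the score depends only on shape and never on color.

```python
import random


def simulate(n=20000, p_circle_red=0.4, p_circle_blue=0.1, seed=1):
    """Generate (color, shape, score) rows where score depends only on shape."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        color = "red" if rng.random() < 0.5 else "blue"
        p_circle = p_circle_red if color == "red" else p_circle_blue
        shape = "circle" if rng.random() < p_circle else "square"
        # Assumed outcome model: color plays no causal role here.
        score = rng.gauss(90 if shape == "circle" else 100, 10)
        rows.append((color, shape, score))
    return rows


def mean_score(rows, color=None, shape=None):
    """Average score over rows matching the given color and/or shape."""
    picked = [
        s for c, sh, s in rows
        if (color is None or c == color) and (shape is None or sh == shape)
    ]
    return sum(picked) / len(picked)


rows = simulate()

# What a color-only observer sees: a gap between the two colors.
raw_gap = mean_score(rows, color="blue") - mean_score(rows, color="red")

# What an observer who could see shapes would find: no meaningful gap
# between colors once shape is held fixed.
square_gap = mean_score(rows, "blue", "square") - mean_score(rows, "red", "square")
circle_gap = mean_score(rows, "blue", "circle") - mean_score(rows, "red", "circle")
```

Under these assumptions `raw_gap` is clearly positive while `square_gap` and `circle_gap` hover near zero, up to sampling noise. Color carries predictive information only because it is correlated with the unobserved shape, and the same raw gap says nothing about which of points 2 through 7 produced it.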

So how can we understand whiteness then? My guess is that it’s an interaction between genetic fidelity, the partial information available to human classifier systems, and historical accident. Obviously there is some genetic information that is, on average, different between people classified as white and people classified as non-white. In fact, this is true by definition. The question then is whether any other consistent traits are correlated with these skin tones. Over time, who is considered white and who is not is allowed to vary, but only within a certain structure. This structure is determined by how human classifiers separate out other humans based on inferred genetic information. The way we classify other humans based on their genetic information is itself a function of what categorizations are useful for social interaction. What counts as useful social interaction is often a function of how our evolutionary tribal systems have been optimized.

If all humans had perfect genetic information about one another, there wouldn’t be a need to make probabilistic inferences based on the set of partial information the human eye is capable of noticing. Instead we could exercise a much more efficient and well-structured racism. In fact, it wouldn’t even be racism, it would be individualism. If you knew all the genetic information of another human, and how it mapped to outcomes and expected actions, you might avoid sociopaths you are currently unable to detect. But so long as we need to use our eyes to make partial-information judgements, and so long as these judgements are correlated with information that is predictive, I don’t see how some form of racist thought can ever be purged.

We can, however, make shared agreements to purposefully ignore this partial information. This is the danger with partial information, which societal philosophies based on individualism attempt to circumvent. Whether we like it or not, our race conveys information to those around us. Any individual would optimize their own actions by incorporating more information rather than less, yet this can lead to suboptimal outcomes at the societal level due to the potential for lower trust and social cohesion. But if we choose to do this, the intellectually honest way is to admit that race conveys information, and then ask that we ignore it. The intellectually dishonest way is to claim that there is zero probabilistic information that can possibly be gained by observing another human’s race, and that anything people claim to notice is hateful and racist.