Wednesday, October 23, 2019

The Stanford Prison Experiment was wrong

The replication crisis in psychology has called into question some of the most famous studies in the field, including such stalwarts of undergraduate classes as the Stanford prison experiment and the Stanford marshmallow experiment (the connection with Stanford in both cases is a coincidence). In the case of the latter, a conceptual replication in 2018 failed to find many of the original longitudinal effects, undermining previous claims that a person's ability to inhibit the urge to eat a marshmallow as a child predicts various later life outcomes. The charges against the Stanford prison experiment (SPE) are more serious. Whereas the SPE has traditionally been claimed to show how ordinary people with no history of violence would come to act in brutal and dehumanizing ways when placed into particular situations, recent analysis of archival materials from the study by Thibault Le Texier suggests that this conclusion was reached fraudulently.  

A recent Psychology Today article describes Le Texier's conclusions:

"Zimbardo had actually decided in advance what conclusions he wanted to demonstrate. For example, on only the second day of the experiment, he put out a press release stating that prisons dehumanize their inmates and therefore need to be reformed. Moreover, contrary to his repeated claims that participants in the experiment assigned to the role of guards were not told how to treat the prisoners and were free to make up their own rules, the archival data clearly show that the guards were told in advance what was expected of them, how they were to mistreat the prisoners, and were given a detailed list of rules to follow to ensure that prisoners were humiliated and dehumanized."

Common sense would suggest that the discovery that results cannot be replicated or were fraudulent should have implications for psychological theory and teaching. We should not teach students conclusions that have been undermined or shown to be fraudulent. Nor should such findings be used to prop up theories.

In the case of the marshmallow task, I agree. Of course, it's a moot point because I haven't covered that study in recent years (I find it to exude self-righteousness on behalf of the studious professional class whose achievements it ostensibly explains). But if I had, the largely failed replication would cause me to stop.

The case for dropping the SPE might seem to be even stronger. Its findings were fraudulent, after all. Yet, I don't see myself doing this, for reasons that I think are significant for understanding the underlying issues at stake with the replication crisis, and with the possibility of a science of psychology in the first place.

While the replication crisis groups both experiments together as examples of (ostensibly) scientific psychology, the claims that they make are quite different. The longitudinal findings from the marshmallow experiment pertain to correlations amongst aggregated groupings of human behavior. They are essentially statistical facts, largely devoid of a narrative of human intentionality. The Stanford Prison experiment, on the other hand, lends itself to narration. It is a story, of particular people in a particular type of situation acting in particular ways. 

The recent revelations about the SPE mean that this story is fiction. The "John Wayne"-style guard who initiated the pattern of dehumanizing abuse did not, as the traditional story holds, do so spontaneously due to the situation, but at Zimbardo's behest.

What is crucial here is that, in the case of the SPE, what is fake/fictional is a narrative, rather than a statistical or propositional conclusion. A proposition that is fake can be disregarded. The same cannot be said for a narrative. As Jerome Bruner explains, the value in a narrative is not a function of its veracity, but of its verisimilitude. A narrative that is false may nevertheless be lifelike and compelling, and as a result, worth telling. What matters in narrative is not that what is described did happen, but that it could happen. This is why the claims of fraud by Zimbardo do not make the study (at least in its general form) irrelevant.

This point has significant implications for psychology. In science, conclusions must be drawn from empirical evidence. A physicist could not derive a new theory from how they imagine the physical world works. History provides clear examples of why this would fail. If we didn't know better, we would likely imagine (as people did before Newton) that heavier objects necessarily fall faster. In the case of human behavior, things are different. We can narrate imaginary accounts of how people act, and these accounts may be compelling--in fact, they may even allow us to know people better than we did before--even when they are known to be fake.

There is another relevant issue here for the replication crisis. Given the strange fact that we can deepen our knowledge of the human mind from fictional accounts of the behavior of imagined people, it is somehow fitting that empirical records of actual behavior may function to undermine their own validity. That is, the act of discovering and reporting that people in a given situation respond in a particular way can lead to people choosing, in response, to act in a different way.




















Thursday, April 5, 2018

Some thoughts on a priori, a posteriori and other types of conclusions

Philosophers commonly distinguish between a priori and a posteriori conclusions. The former are conclusions whose truth can be determined solely by virtue of the meanings of the terms that make them up. The common example is the statement all bachelors are unmarried. The philosopher Kant, who articulated this distinction, defined a priori conclusions as those in which the predicate is contained within the subject. By contrast, a posteriori conclusions are those that are true only by virtue of some external facts. For example, the truth of the claim that it will rain in New York on April 4, 2019 can only be ascertained by observing aspects of the world external to that statement.

In the sciences, a related distinction can be seen between theoretical/philosophical work and empirical research. Both lead to conclusions which can be asserted with some certainty (assuming we exclude truly haphazard theorizing). Corresponding to Kant's distinction (above), theoretical/philosophical work reaches conclusions that are not immediately dependent on observation, whereas empirical research reaches conclusions that are.

While this distinction has proven to be useful and relevant, I would like to point out how processes of drawing conclusions do not necessarily fall into only one or the other category in a mutually exclusive way--which is not to say that more or less pure cases of drawing conclusions of either type can't be found. 

To illustrate this, I'd like to use a recent project (my dissertation) as an example. My dissertation analyzed the discursive functioning of what I call knowledge claims. Knowledge claims are claims about what people know of the form "s/he knows that X", "s/he has a concept of Y", or "s/he knows how to Z". The goal of my work (which is the subject of a forthcoming article in Theoretical and Philosophical Psychology, co-authored with Shahida Abdulsalam and Eugene Vvedenskiy) was to understand what it means to know something by analyzing the ways that knowledge claims function in discourse. Specifically, under what circumstances (esp. for what reasons) may they be asserted or contested? Concerning the reasons that are given to support the assertion or contestation of a claim, what is it about these reasons (and not others) that makes them valid?

As the forthcoming article explains, the things that knowledge claims claim that people know (e.g., how to ride a bike, that Albany is the capital of NY, a concept of number) appear to be generalized descriptions of behavior, rather than of any kind of brain or cognitive content. This conclusion was reached through the careful analysis of texts (published research articles on children's understanding of number) in which knowledge claims were extensively used.

Despite the reliance of my analysis on published texts as a type of data source, I would not consider the conclusions to be a posteriori. The reason is that, although the conclusions were informed by the analyzed texts, the specific aspect of the texts that informed the conclusions--and the subject matter of the conclusions themselves--were conventional/normative in relation to both the researchers' and assumed audience's respective uses of knowledge claims. Therefore, although an external data source was analyzed to inform the ultimate conclusions, the conclusions could have been reached without consulting any external data source. The "researcher" could have just made up the data. In the case of the knowledge claim project, this would involve thinking up various ways that typical interlocutors might assert and justify knowledge claims in conventional ways. The analysis of already-existing texts simply directed the analysis towards relevant features of normative discourse that might otherwise have been laborious to think up.

While the preceding argument would seem to suggest that the conclusions about knowledge claims were a priori, this is also problematic. A priori claims are those that are true by virtue of the meanings of the terms themselves--i.e., by virtue of the normative meanings of the words that constitute them. The conclusion that knowledge claims are descriptions of behavior is not self-evidently true based on the definition of knowledge claims. (If one objects that knowledge claims are too obscure a category, the same point can be made with a particular example of a knowledge claim: Having a concept of number does not appear, self-evidently, to be a description of behavior.)

It might even be argued that it is an a priori truth that a description of what someone knows is not a description of what they can do. Yet, the claim that this is not true is corroborated in a number of different ways, and--at least in principle--this corroboration could have been done exclusively with spontaneously imagined examples of normative conversations involving knowledge claims.





Wednesday, April 4, 2018

Some musings on the "sameness" of standardized experimental stimuli

The traditional approach to experimental research in psychology involves the creation of experimental settings with standardized stimuli and measures of response. Significant care is taken to eliminate differences in the stimuli presented to participants, except insofar as this constitutes manipulation of the independent variable. For example, a study investigating the effect of the color of light on mood might expose participants to environments that vary in color, but in no other way. It is assumed that participants exposed to the same color environment are being exposed to the same stimulus.

It's not hard to see how this can become problematic. In a psychological context, different qualities (e.g., color) are not readily manipulated independently of other qualities. For example, changing a heart from yellow to purple is hardly just a change in color, since the latter color evokes the U.S. military service award, the Purple Heart. Consequently, the variable color is confounded with the presence or absence of whatever associations a particular participant may or may not have with the military medal.

There are two issues here. The first is the way that variables that are analytically distinct may be conflated empirically (color with connotations of military service). The second is the fact that this conflation is not solely a result of the stimulus itself, but also of a particular participant's interpretation of that stimulus. In other words, the "same" stimulus can be interpreted in different ways by different participants (or the same participant on different occasions). In the color research example, this means that some participants exposed to the purple heart would perceive the connotation of military service honor and others would not. In effect, participants ostensibly exposed to the same stimulus have actually been exposed to different percepts.

This second issue--which has been raised by a number of critical commentators--is hardly hidden. It's fairly obvious that the same stimulus could have different meanings for different participants. Still, the fact that researchers go on as if stimuli can be treated as "the same" for different participants is hardly just an error.

In acting as if the same stimulus will be "the same" for different participants, researchers are making an assumption that underlies human social interactions in general. The foundation of human cultural life is intersubjectivity, i.e., the assumption of a shared world of objects which exist for me as they do for you. If we were to replace wholesale the assumption of intersubjectivity with the skepticism that things as they appear to me are not necessarily as they appear to you, language would be impossible.

While accepting the basic reasonableness of the critique of standardized stimuli, it's worth considering whether the intersubjective basis of human cultural life entails some important role for standardized stimuli. In other words, what does the centrality of the assumption of a shared environment in human life in general mean for research contexts where this assumption may be made?






Friday, September 16, 2016

Basic reasons to be critical of mainstream psychology: A short introduction

It's reasonable to assume that the authority and influence of modern psychology is as much due to the perception that it's objective and scientific as it is to the relevance of its subject matter. The problem is, when it comes to the measurement of variables like intelligence, self-esteem, happiness, motivation, etc., this objectivity is an illusion. This can be shown with the following argument:

1. The power and authority of psychologists rests on the perception that their conclusions are more objective or definitive than biased everyday value judgments. So, while someone may "feel that their friend is intelligent", a psychologist is seen as able to more definitively "measure their intelligence." (The same holds for similar cases involving beliefs, self-esteem, attitudes, personality, etc.)

2. Despite any suggestions to the contrary, psychological variables (unlike, e.g., length, weight, temperature, etc.) are not literally measurable; they can only be measured figuratively.

3. The figurative measurement of psychological variables is only possible by relying on subjective, common-sensical assumptions about which form(s) of behavior exemplify a particular variable (e.g., in the case of intelligence, the behavioral acts that are assumed to indicate intelligence). Having done this, psychologists may then quantify the variable in one or both of the following ways:
    • Relying on subjective reflection to make a quantity out of something intrinsically non-quantitative, as in the case of a Likert scale (e.g., on a scale of 1-5, how much is s/he able to think outside of the box?).
    • Aggregating some set of target behaviors that exemplify a given variable, and then "measuring" the variable in terms of the number of these behaviors a person is judged to have performed.
IMPLICATIONS
The decisions described in (3) can only be made in terms of commonsense, everyday, subjective judgments. There is literally no other authority or basis on which to make these decisions.

Now, it could be argued that this criticism is too harsh. After all, the problems raised do not preclude the use of "measures" of things like intelligence, happiness, etc. in surveys or polls where their standardized form makes them useful for aggregating data and making comparisons across variables.  While there certainly are some valid and unproblematic uses of this type (Binet's use of the first intelligence test in the Paris school system is a good example), the previously identified problems are still very much relevant. The fundamental issue is that any findings reached with a measure that is constructed in the ways described above are only as definitive as the value judgments from which the measure was derived. No amount of care, standardization, validation or anything else can change this.

This limitation can be illustrated with the example of an intelligence test applied across different groups of people. While the results of this might be described as "group a performed significantly better than group b on this objective measure of intelligence", in reality the result would be somewhat more similar to "a psychologist (acting indirectly via a written assessment) used their own judgment to evaluate the intelligence of a series of people and aggregated the results, revealing some group differences."
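
The dependence of an aggregated "measure" on prior value judgments can be sketched in a few lines of Python. Everything here is hypothetical and invented purely for illustration: the people, the behaviors, and the two test designers are assumptions, not data from any real instrument. The sketch scores each person by counting how many designer-chosen indicator behaviors they were judged to perform, and shows that two designers with different common-sense ideas of intelligence can rank the same people in opposite orders.

```python
# Hypothetical judgments of which behaviors each person performed (invented data).
OBSERVED = {
    "person_a": {"solves_puzzle", "recalls_digits"},
    "person_b": {"tells_joke", "reads_mood", "negotiates"},
}

# Two hypothetical test designers, each relying on their own common-sense
# assumptions about which behaviors exemplify "intelligence".
DESIGNER_1 = {"solves_puzzle", "recalls_digits", "defines_words"}
DESIGNER_2 = {"tells_joke", "reads_mood", "negotiates", "solves_puzzle"}


def score(observed_behaviors, indicators):
    """Count how many of the designer's indicator behaviors were observed."""
    return len(observed_behaviors & indicators)


def rank(people, indicators):
    """Order people from highest to lowest score under one designer's indicators."""
    return sorted(people, key=lambda p: score(people[p], indicators), reverse=True)


if __name__ == "__main__":
    print(rank(OBSERVED, DESIGNER_1))  # person_a ranks first
    print(rank(OBSERVED, DESIGNER_2))  # person_b ranks first
```

The counting step is perfectly standardized in both cases; the disagreement comes entirely from the choice of indicator behaviors, which is the judgment described in (3).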

Monday, July 2, 2012

One of the most interesting things about music is how the clearly psychological enjoyment of music correlates with certain types of acoustic phenomena. While it would be a mistake to say that good music correlates perfectly with harmonies that reflect simple mathematical relationships, there is a clear relation between simple mathematical relationships in the frequency of sound waves, and stereotypically enjoyable music. While dissonance is an increasing characteristic of music over the last few hundred years, this is not just any dissonance. Instead, it would be better to say that it's a very tangentially developed version of order. The blue notes heard in jazz or blues music (among others) are not dissonant for the sake of disorder. Rather, their dissonance creates a tension in the underlying order that, when developed and resolved skillfully by the musician, is a celebration of that order; a gesture that attempts to cleverly, passionately, triumphantly reaffirm that order.
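
The "simple mathematical relationships" in question can be made concrete with a small Python sketch. It assumes just-intonation tuning, in which common intervals correspond to whole-number frequency ratios; the ratios listed are the standard textbook values. The `ratio_complexity` function is a hypothetical heuristic introduced only for illustration (numerator plus denominator), not an established measure of consonance.

```python
from fractions import Fraction

# Frequency ratios of some intervals in just intonation (standard textbook values).
INTERVALS = {
    "octave": Fraction(2, 1),
    "perfect fifth": Fraction(3, 2),
    "perfect fourth": Fraction(4, 3),
    "major third": Fraction(5, 4),
    "minor second": Fraction(16, 15),
    "tritone": Fraction(45, 32),
}


def ratio_complexity(ratio):
    """Hypothetical heuristic: a smaller numerator + denominator means a simpler ratio."""
    return ratio.numerator + ratio.denominator


def rank_by_simplicity(intervals):
    """List interval names from the simplest ratio to the most complex."""
    return sorted(intervals, key=lambda name: ratio_complexity(intervals[name]))


def upper_frequency(base_hz, ratio):
    """Frequency in Hz of the note lying the given interval above base_hz."""
    return float(base_hz * ratio)


if __name__ == "__main__":
    for name in rank_by_simplicity(INTERVALS):
        print(name, INTERVALS[name], upper_frequency(440, INTERVALS[name]))
```

Under this rough heuristic the stereotypically consonant intervals (octave, fifth, fourth) come out simplest, and the intervals usually heard as dissonant (minor second, tritone) come out most complex.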

The appreciation of more profound dissonance as it relates to order is a clearly more developmentally complex state than the appreciation of moderate, or minimal dissonance. For example, appreciation of elevator music is more simple than appreciation of modern jazz. Elevator music is so simple as to be boring. Modern jazz may be complex enough to sound completely disordered.

Learning a musical instrument is an interesting application of this type of thinking. What are the different ways that the development of the appreciation of music--in particular the harmonic aspects of music--can relate to the development of the ability to play an instrument? One possibility is that there are two alternative developmental relationships between harmonic appreciation and instrument ability. One is that the musician learns the instrument while still at a primitive level of harmonic appreciation, and the development of harmonic appreciation is scaffolded by the instrument as a psychological structure. The other is that advanced harmonic appreciation develops far in advance of musical talent. In this case, an awareness of how certain notes relate to the harmonic order of the song is embodied not by knowledge of the instrument, but instead by some internalized spatial schema that does not relate to the instrument. The latter type of musician would be able to understand how complex harmonies relate to a song without being able to express this flexibly, or with any depth, with the instrument. The difference between these two possibilities is equivalent to the difference between a person who knows something in a way that is difficult for them to communicate (which doesn't mean that they don't know it), and a person who can easily communicate the same knowledge. In this example, language command functions the same way that command of the musical instrument does.

Wednesday, April 11, 2012

Science Teaching Controversy

Recently, the news in TN has involved the introduction, and today, passing of a controversial bill involving the teaching of science. The way I'm describing this is very vague, and this is intentional. While the bill itself is clear and concise (well, maybe not conceptually), the media reaction to it is anything but. In its own words, the bill seeks to "[help] students to understand, analyze, critique and review in an objective manner the scientific strengths and scientific weaknesses of existing scientific theories." This doesn't or shouldn't sound controversial, but given the history of this country, and in particular Tennessee, and the fact that this bill is getting so much attention, it's almost impossible to not immediately make the connection to the creationism-evolution debate, of which the bill is most certainly a part.

The previous sentence captures in a nutshell what is so interesting about this issue. On one hand, the bill itself contains very little that any reasonable person would find objectionable: It simply states that in teaching science, critique of existing scientific theories, and discussion of alternatives (initiated only by students) should be allowed. The governor of TN, Bill Haslam, who has been under intense scrutiny, has made much the same point, saying that it wasn't clear to him what if anything the bill would change.

It's not just that the bill is unobjectionable. Far from being unobjectionable (which has neutral connotations), I would argue that the bill comes across as distinctly appealing. It stresses the value of unimpeded discussion, and the freedom to debate any topic that seems problematic. This embodies values that are central to both the United States of America AND to the practice of scientific inquiry.

Now, if the issues at stake in the bill were simply those that are explicitly stated, there would be no reason for news headlines like the following: "TN Governor Signs Law Protecting Creationist, Denialist Science Teachers and Their Theories" (from Democratic Underground)

While this title reflects a definite resistance to the bill, this resistance isn't to the bill's explicit meaning, but instead to the implications that it would have for the separation between church and state, and in particular, the teaching of creationism in school. The asymmetry between the bill and its opposition reflects the fact that there are two narratives being used to account for what's going on: the narrative of the separation between church and state, and another narrative about the need for free and open discussion in this country. Proponents of the bill are claiming that it provides a way to ensure the open exchange of ideas. Critics remain mute on that topic and instead claim that the bill will protect teachers or students who wish to bring religion into the classroom.

The simultaneous coexistence of multiple narratives that make sense of a situation is something that happens all the time. Different narratives stress different meanings, and skew situations in different ways. What is so interesting about the current case is the different values that are elicited by the narratives. The case against the Tennessee bill (and against other similar bills) is practically being strangled to death by its adherence to multiple, conflicting values.

Opposition to the bill, whether in official statements, public comments, or "digital comments" in online forums, tends to be based around one or both of the following: First, the claim that the bill would compromise the separation between church and state by providing a defense for those who introduce creationism into the classroom. Second, the related claims that evolution is practically universally accepted as a scientific fact, that alternatives are only found outside the mainstream, and that evolution, unlike the alternatives, is based on empirical evidence.

These claims are relatively powerless against the bill. First of all, the need to maintain a separation between church and state is explicitly mentioned in the bill. This ensures that efforts to block the bill cannot be made on that basis. Consequently, the ACLU has stated that it will use confederates in the TN school system to monitor classroom activities, and build its case on evidence that these activities are compromising the separation between church and state.

The second claim (that evolution is a better theory and accepted scientific fact) connects to the most interesting part of this whole issue. No reasonable person would disagree with these claims about evolution. However, to use these claims as the basis of an argument against the bill or its effects is unfortunately an exercise in futility. Arguing that evolution should be taught just because it's widely accepted contradicts innumerable statements of the fundamental tenets and values of scientific practice (e.g., as articulated in curriculum materials). These principles stress the value of questioning existing beliefs or theories, and drawing conclusions based on evidence. Ironically, some of the most high-visibility critics of religion (to say nothing of bringing religion into schools) base their arguments on these values. According to people like Steven Pinker and Richard Dawkins, the ideal person is completely rational, basing his or her decisions on empirically grounded facts.

As this illustrates, the argument that evolution is too well established to be questioned contradicts the basic spirit of science as outlined by many of its most public advocates. As if this weren't enough, this argument also contradicts basic values of freedom--freedom from having to accept a viewpoint one doesn't want to accept, or the freedom to call anything into question. It's a safe bet to say that no value has been more frequently affirmed in the United States.

This situation is clearly a mess. If I had to sum it up neatly, I'd say that "creationists" (that's what they are after all) are taking advantage of the post-modern epistemological crisis in science. They're using the disputed territory that exists between naive claims about the epistemology of science (as made by Dawkins, Pinker, and others) and the reality of scientific practice and so-called "rational thinking" (in which rational knowledge exists on a foundation of beliefs and assumptions about the world) to further their own ends of bringing religion into school. Preventing this from happening requires that those on the "science side" resolve the tension between the naive epistemology of many scientists, and the somewhat messier reality of actual scientific practice. Unfortunately, I think that this will have to involve some kind of very unscientific value judgment about the superiority of science as a way of understanding the world.

Wednesday, February 15, 2012

Children's understanding of pretense

My research work as an undergraduate, and now as a graduate student has been frequently concerned with children’s developing understanding of number, or phrased differently, with their developing ability to use numbers. This topic is one that I’ve come to as a result of chance, and have stuck with as a result of the comparative ease of doing work in an area that I’m familiar with, more than with any overarching interest.

A good deal of research in this area is done by researchers who ask young preschoolers questions that involve numbers and can be answered by counting or basic logical reasoning. These situations involve an interesting juxtaposition of knowledge and ignorance on the part of the researchers. On one hand, the researchers put themselves in these situations (in preschools, daycare centers, home environments, etc.) because of their ignorance about children’s developing numerical thinking. On the other hand, once in these situations, the researchers ask children questions that in many cases they already know the answer to. Researchers do not go to children because they can’t count, or because they want to know the mathematical result of an addition or subtraction operation. They know the “objective” answers to these questions, and are instead interested in whether or not children know them.

The situation described above involves, depending on your interpretation, either dishonesty, or an asymmetrical form of communication. What do the children involved think of this situation? The most ideal state of affairs, as far as the research is concerned, is for children to assume that the question asked of them is one whose answer the researcher genuinely doesn’t know, and needs their help with. In such a situation, a child’s behavior is driven solely by the desire to get the “right” answer. Of course, this may not be how things actually are. Is it actually reasonable to think that children believe that the researcher really needs help answering the question they have been asked? This question can’t be answered without knowing what children think these questions are about in the first place, and there’s an abundance of evidence that 3- to 4-year-olds don’t understand what type of answer a given numerical question is seeking, let alone how to reach that answer.

It may be that children’s understanding of the nature of social situations outstrips their understanding of numbers. This might lead them to grasp the pretense nature of research situations before they can correctly answer the questions asked therein. If so, then the issue is once again what children think the goal of the question is. In other words, if they understand the pretense nature of the situation, and they are compliant, their response will reflect their best ideas about how to answer the question.

Both of the above considerations indicate that there may be an interaction between (a) children’s understanding of the situation (pretense or serious), and (b) their interpretation of the problem. That is, children’s understanding of the situation may shape their developing understanding of the nature of the problem. Conversely, their developing understanding of the problem may shape their understanding of the situation. Just as a small number of possible answers may be found for algebraic equations with multiple variables, so too may a consideration of the possibilities above lead to a narrowing down of possible conclusions.

The question of whether children understand the element of pretense in research designs is contingent on how they understand the question. Certain types of questions make pretense a realistic interpretation, while in others, it makes no sense. If children see numerical questions as commands for them to perform a certain series of behaviors, then the realization that there is an element of pretense would never arise. Such children wouldn’t see the situation as involving the researcher’s desire for a question to be answered, but as involving their giving a performance. They would see the act itself as the object of the situation, rather than as a means to an end. Conversely, if it can be established that children understand the pretense nature of the research situation, then this would indicate a different understanding of the question, namely, one in which pretense makes sense.

The relationship between children’s understanding of numerical questions and their understanding of the situation in which these questions are asked is not just important to figure out at any given point in development. The relation between these two things determines the nature of the developmental process itself. The development of numerical thinking involves not only a developing understanding of counting and numerical logic, but also an understanding of the goals of the situations in which this is used. Children who are oblivious to a researcher’s/teacher’s/parent’s pretense will make sense of numerical situations in a very different way from those who grasp pretense.