New Acquisitions: On the Wisdom of Noah Smith
Bret Devereaux, A Collection of Unmitigated Pedantry
I generally try to avoid having Twitter disputes spill onto the blog. Generally what happens on Twitter is best left on Twitter, and in some cases not even that. However, this past week I was pulled into a Twitter debate with Noah Smith about the validity of the way that historians offer our knowledge into the public debate. He then opted to continue that debate in a long-form blog post tackling my work in particular, which in turn seemed to demand a long-form response rather than something in 280 characters or less.
Noah's initial tweet declared:
My beef with academic history is not that it's woke. Nor that it's anti-woke. It's that the theories are given even more credence than macroeconomics even though they're even less empirically testable.
Carlos Morel responded with some confusion, noting that historians tend to be quite wary of what he termed 'theorizing' and perhaps foolishly I waded in because I felt Morel had made a valid point somewhat confused by the fact that Smith seemed unaware that the word 'theory' is used differently in different fields. And so to compress quite a few tweets into a short statement, I offered that in fact this was a real difference: historians do not generally aim to construct laws of general applicability (quite unlike social scientists, who do), but instead to study and furnish relevant exemplars as tools (but not predictions) for thinking about current problems.
To which Smith responded that he'd be putting together a post where my public writings would "form the core" showing how contra to what I said that "academic historians make strong theoretical claims that cannot be evaluated empirically." And here I want to ask you to please put a pin in the word 'empirically' now used twice because we're going to come back to it.
In any event he did write that post and it is here and you can read it; unfortunately I cannot recommend it. We're going to talk about why, but it is going to end up being rather involved because to explain why a shallow critique of a discipline's methods is shallow, you have to explain how that discipline functions and why it does so. But before we get to the complicated stuff we should deal with:
The Bad Faith Complaints
We can start with the conceit of the section titled "Historical analogies are theories" (mark that word 'theories' again because we're coming back to it), which is the section that focuses on my work. For the most part I just want to get most of this out of the way before I get to the real meat of Smith's complaint. There are a few problems with his reading of the essays in question which seem to speak either to bad faith or a failure in comprehension. Let's take this paragraph of his:
Which is building off of this paragraph of mine:
Now perhaps this is unfair of me coming as a specialist trained in the reading of texts but it sure does seem to me like Smith has stripped out quite a lot of qualifiers here to fundamentally misrepresent a paragraph that is in fact a giant caveat, instead presenting it as a claim of general statistical predictability. Phrases like "stretch the scientific metaphor" before "laboratories of democracy" fairly clearly indicate that, no, I am well aware these are not actual laboratories; the double-quote marks around "data set" do the same, acknowledging that this is analogous to a data set but not an actual data set. Moreover I am leading the paragraph with the idea that there is fundamental uncertainty, indeed "always risks" in "drawing comparisons across vast chasms of time and culture" before introducing the idea that there may well be other examples which might offer different lessons. Rather than "asserting that ancient Greece is an appropriate analogy for our modern politics," as Smith has it, I have explicitly opened the door to the idea that it might not be. As we'll see, this sort of stress on context and contingency over rules of general applicability is a key difference between the methods historians use and the social sciences.
Smith then has a brief foray into wondering if I am constructing an argument-by-definition that tyrants are defined by their effort to repeatedly seek power (a pointless digression; I define 'tyrant' in the piece) which we may mostly skip over except to note that I think Smith's failure to familiarize himself at even a basic level with the material in question is rather exposed by his use of Richard Nixon as the example of a tyrant who would be 'No True Scotsman'd' out of this potential definition I am not actually proposing. Richard Nixon was a crook and a bad president, but I know of no definition where a politician who won election to his office and then was forced to leave it by constitutional processes and didn't use violence to attempt to retain power would qualify as a tyrant; Watergate does not have a body count. The ancient Greek definition (the one I was working with here) functionally requires the position to be extra-constitutional and violence to be used in the seizure and maintenance of power as standard definitional features. Even the remarkably reductive characterization I offered in the essay, that tyranny was a neutral Greek term for one-man rule (if you think you detect an editor shaving down a more complete definition for the sake of pacing, that's because you do), still disqualifies Nixon who - quite manifestly in the instance as he was about to be removed from power by Congress - did not rule alone.
The broader problem here is Smith's choice of targets, which are not a bunch of peer-reviewed journal articles but in fact a number of mostly short-form (c. 1200-1500 word) essays in traditional media publications. At first when Smith read the phrase "Would-be tyrants keep trying until they succeed" to mean "all would-be tyrants keep trying until they succeed," I assumed he was just unfamiliar with this genre of writing, where editors tend to strip out caveats like 'sometimes' or 'frequently' as redundant and where the general mode is to present a single detailed example on the assumption that, as historians, we have in fact considered the broader evidence as it exists but do not have the word-space to exhaustively list all examples.
Indeed in most cases, Smith has removed what nuance was contained in the original essays. A conclusion of "a need to choose: Either trim down American objectives in what are, effectively, occupied countries to those which can be achieved merely by organizing the existing military and political structures or settle down to the task of building a new military organization [there] from the ground up" becomes in Smith's summation, "to predict that the U.S. military would be more effective if it reorganized itself in certain ways." The latter is frankly a misleading summary of an argument which instead offers the Romans as one tool (of potentially many, including others like the British army in India mentioned in that very essay) with which to think about the trade-offs inherent in trying to raise military force in occupied countries. Perhaps Smith has never written in this genre before and so is unfamiliar with it and its constraints.
Am I to assume that Smith has done a statistical study of every transformative president and confirmed that the mean number of crises they required to be transformative was two? Why hasn't he included these studies in his short article? Has he done empirical verification of the two-crisis theory? Of course not, he is reasoning from a single historical exemplar and drawing a conclusion from it, a fairly standard use of case-based inductive reasoning. It turns out he does know how this works.
Collectively I think these problems speak to the shallow degree to which Smith is engaging with the question; critiquing the non-comprehensiveness of short, traditional media articles is a failure of seriousness. No one is going to give a complete accounting of ancient Greek tyranny in the c. 1,000 known Greek poleis in 1,500 words; demanding they do so is a literal category error, mistaking the think-piece for the footnote-laden journal article.
With that out of the way, we can at last get to the meat of the disagreement.
Epistemologies
In his tweets (above) and in the essay itself, Smith is consistent in his call for empirical methods to be used and for historians to acknowledge that they are making "predictive theories." I think he reaches his main point most clearly here:
And my sense is that the problem here is that Smith has not familiarized himself with history as a discipline or historical argumentation, both how it works but also why it works that way. Consequently he's attempting to jam them into his social science (economics, particularly) framework, apparently unaware that there are different methodologies and indeed often different epistemologies than his own. That is a fairly big problem and a disappointing one, so let me explain what I mean.
First, an epistemology is a theory of knowledge - that is a theory of how we can come to know things. You will note that I keep using it in the plural because there is not one epistemology in use but in fact several; actual functioning humans rely on different epistemic principles at different times and in different subjects. Empiricism is one of these epistemologies, which argues that direct, personal sense-perception is the chief or even only valid form of knowledge; in its pure form, empiricism rejects authority, testimony, and rationalism (that is the application of raw logic) as sources of knowledge. To test something empirically is thus to test it experientially, typically in the form of an experiment whose results can be observed empirically, that is with the senses. That of course includes using all sorts of tools, including statistical tools; a statistical test of data is an empirical test of that data as it outputs a result which can be observed.
Empiricism, as you may gather, is the epistemology which underlies the scientific method and so is the chief epistemology in the natural sciences and the social sciences. And I am terribly fond of it; for things which can be tested empirically, it works great and is generally to be preferred over other epistemological approaches. Those who know my actual research will be well aware that I think forms of empirical testing (such as experimental archaeology) can be very valuable in helping to understand the past.
The problem is that not all things can be tested empirically, either because it is impossible to do so or because it is impractical to do so. On the impossible end, we have phenomena which are not subject to independent sense-perception, like the thoughts of a person other than yourself; we are not (yet?) able to export someone's thoughts and brain imaging provides at best a very incomplete picture of someone's mental state. The best one can do is ask the person what they are thinking but for the researcher this introduces a non-empirical break in which they are forced to rely on the subject telling the truth. Likewise, a fairly large category of things which cannot be tested empirically is everything that does not exist right now, since humans experience time in a linear, continuously forward-moving fashion, leaving all things that once existed but no longer do out of the realm of human sense-perception. That, of course, is terribly relevant to historians, because it means very little of what we study is subject to empirical tests. Most humans, indeed even famous humans, leave no empirical evidence of their existence; one could not, for instance, empirically prove the existence of Socrates. And yet we can be very confident Socrates existed!
On the other end, some things are impractical to test empirically; empirical tests rely on repeated experiments under changing conditions (the scientific method) to determine how something behaves. This is well enough when you are studying something relatively simple, but a difficult task when you are studying, say, a society. Social scientists look for 'natural experiments' to aid their understanding, but this is compounded by the rarity of some really important phenomena. An economist studying market interactions can statistically analyze millions of daily trades on the NYSE, but a political scientist has to wait four years to add one data-point to a data-series on presidential elections. Consequently empirical methods struggle to establish solid predictability for such rare events despite the best efforts of very talented analysts. Needless to say, asking people to do a controlled to-scale experiment in, say, warfare or pandemic control would face severe legal, ethical and practical hurdles, but at the same time these events are sufficiently rare and complex that relying only on natural experiments imposes severe limits. Again, empiricism is great, when you can use it.
Now a philosopher might then insist on pure empiricism (or, as in David Hume's formulation, pure empiricism in matters of fact and pure logic for the relations of ideas) and just declare everything outside of it unknowable, but in practice this is absurdly limiting. In our actual lives and also in the course of nearly every kind of scholarship (humanities, social sciences or STEM) we rely on a range of epistemologies. Some things are considered proved by the raw application of deductive reasoning and logic, a form of rationalism rather than empiricism (one cannot, after all, sense-perceive the square root of negative one). In some things testimony must be relied on; perhaps the most important elements of history as a discipline are the systems we apply to assess the reliability of testimony for events which, having taken place in the past, cannot be viewed empirically in the present (the term for all of these methods collectively is 'the historical method.' Historians are not creative people when it comes to naming things).
To insist that all knowledge must be empirical knowledge and that all theories must be empirically testable theories is to either misunderstand what the words mean or to engage in scientism - the delusion that all knowledge is empirical, scientific knowledge.
The irony in this is that when Smith suggests a specific test in response to my article it is not, in fact, an empirical test. In particular he suggests that, "Before we conclude that "would-be tyrants keep trying until they succeed", we should rigorously and systematically check the historical record to see if we could identify, ex ante, a set of characteristics that allowed us to predict who would keep trying to seize power and who would give up." But without then running the experiment in the present (that is, waiting to see what kind of dictators show up and if they match the 'set of characteristics') there's no empirical test there. It is difficult to even imagine how such a test would be managed (how do you isolate the variables?), but its results would be useless to anyone today in any event because they wouldn't be known for decades. This is of course the great difficulty that political scientists wrestle with (not without significant success, mind you): the effort to apply scientific methods to the subject (large-scale human interactions which often include violence) where controlled experiments are impossible, unethical or both.
Instead what he's actually suggesting isn't an empirical test at all: it is a quantitative, statistical test of data none of which can be empirically verified because it all happened in the past; only the statistical analysis is subject to empirical verification. It's a lot easier to see how this would be done: past leaders would be grouped into tyrants and non-tyrants, each assigned a series of mathematically defined qualities and then one would run a regression analysis to determine which variables are most predictive of being a tyrant. One would then 'test' the analysis against other historical tyrants not included in the original sample to see if the predictability held. That's still not an empirical test (neither the main data set, nor the test data set are empirically derived - only their comparison is empirically observed which won't correct for any problems in the non-empirical elements: the evidence and the process of reducing it to data), but this is precisely the sort of work that political scientists do and often to useful result. That said I hope in this case the difficulties implied by assigning historical figures mathematically defined qualities to enable statistical comparison, especially in the context of incomplete historical information, are not lost on the reader. There is perhaps a reason no political scientist has yet tried to go and run this analysis.
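For readers who have never seen one of these analyses, here is a minimal sketch of the procedure just described, written in Python with only the standard library. Everything in it is an assumption made for illustration: the three "scores" per leader, the rule that generates the tyrant label, and the function names are all invented, and no real historical figure is encoded. It fits a logistic regression on one part of a synthetic sample and then checks its predictions against the held-out remainder.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression by plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 ('tyrant') when the fitted probability is at least 0.5."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if sigmoid(score) >= 0.5 else 0


def make_synthetic_leaders(count, rng):
    """INVENTED data: three made-up scores in [0, 1] and a made-up label."""
    rows, labels = [], []
    for _ in range(count):
        x = [rng.random(), rng.random(), rng.random()]
        # Hypothetical rule: only the first score matters, plus noise.
        labels.append(1 if x[0] + rng.gauss(0.0, 0.15) > 0.5 else 0)
        rows.append(x)
    return rows, labels


def run_regression_demo(seed=0):
    """Train on 150 synthetic leaders, then score the 50 held out."""
    rng = random.Random(seed)
    rows, labels = make_synthetic_leaders(200, rng)
    train_x, train_y = rows[:150], labels[:150]
    test_x, test_y = rows[150:], labels[150:]
    weights, bias = fit_logistic(train_x, train_y)
    hits = sum(predict(weights, bias, x) == y for x, y in zip(test_x, test_y))
    return weights, bias, hits / len(test_x)


if __name__ == "__main__":
    weights, bias, holdout_accuracy = run_regression_demo()
    print("weights:", [round(w, 2) for w in weights])
    print("held-out accuracy:", round(holdout_accuracy, 2))
```

Note what the sketch quietly assumes: that each leader already arrives as three clean numbers and an unambiguous label. The arithmetic is the easy part; nothing in the code can check whether those numbers faithfully represent the evidence they were reduced from.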
How History Works
And now at last we get into both how and why the historical method differs from social science approaches to the past. Now the difference here shouldn't be taken as too binary; there is significant overlap in both methods and concerns. But generally the social sciences aim to establish general rules for how societies function which have strong future predictiveness; laws of the workings of society akin to the laws of the workings of physics whereby we can predict with quite a lot of precision where a ball will go when thrown; for the sake of clarity I'm going to call these 'laws of general applicability.' By contrast the focus of historians is on the past itself; while historians of past decades often toyed with the idea of 'grand narratives' akin to the social sciences' laws of general applicability, these have long since been abandoned by all but a few because the exceptions kept overwhelming the general rules. Of course historians hope that the work we do to create knowledge of the past will be useful in the present, but the discipline prioritizes the former over the latter. As a result, historians generally reject the creation of laws of general applicability, insisting that while the past is a useful teacher, efforts at strict predictability will always be overwhelmed by contingency, context and unexpected variables.
That in turn leads to differences in methods. Social scientists - and here we mostly mean economists and political scientists, the two social sciences that collide with history most often - use many methods but perhaps the most important is quantitative analysis using historical data. Historians by and large do not reject that method but tend to be leery of it because a key step in the process is the reduction of a whole bunch of very complex evidence to a handful of mathematical variables which can then be analyzed statistically. I use the word reduction here quite intentionally; those mathematical values can only be either simplifications or distortions of the evidence used to create them. Going back to our example above with tyrants, imagine the difficulties in reducing figures like Peisistratos or Cylon (or Napoleon, Caesar or Hitler) to mathematical expressions; there is ample room for the introduction of new bias but even if done faithfully the result will flatten out much of the nuance of these figures (or be forced to infer in places where we lack evidence). And so even when such data is generated with great care that means taking complex, difficult phenomena and flattening them.
Let's take an example and we'll pick a statistical argument that I think actually has some considerable merit to it: the democratic peace theory. The theory (in its modern form), arguing that democratic countries do not (generally) go to war with each other, emerged out of statistical studies comparing historical democracies on a period-by-period basis with a list of historical conflicts and observing the lower rate of democracy-on-democracy conflict. But a brief look at the original study reveals how much the complexity of the actual history was flattened to provide for statistical analysis; the United States is a democracy in 1776, but Great Britain isn't until 1832 - a look at the actual criteria (fn. a on p. 212) reveals the tortured efforts to get a binary classification which produced data that made even minimal sense. Non-European style potentially democratic states - I have in mind the Six Nations of the Iroquois Confederacy - are entirely excluded. The binary created means that the United States in 1812 is 100% a democracy despite holding a not-trivial proportion of its populace in slavery, while its enemy Great Britain is 100% not a democracy despite the main political power being vested in an elected parliament. The actual complexity has to be flattened out to produce data; this isn't a critique of the theory - the flattening was unavoidable. Some more recent efforts at the problem have tried assigning democracy or liberalism 'scores' to countries but of course that itself introduces all sorts of complications. The conversion or compression of fuzzy, non-numerical evidence to data is thus not free, it is a 'lossy' process.
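To make the 'lossy' point concrete, here is a small Python sketch of a binary coding rule. The polities, their scores, the threshold and the rule itself are all hypothetical placeholders invented for this illustration; they are not taken from the study discussed above or from any real coding scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polity:
    name: str
    franchise_share: float      # hypothetical 0..1 score
    elected_legislature: float  # hypothetical 0..1 score


def code_as_democracy(polity, threshold=0.5):
    """Collapse two fuzzy scores into one yes/no flag (an assumed rule)."""
    return int(polity.franchise_share >= threshold
               and polity.elected_legislature >= threshold)


# Placeholder polities with invented scores; they stand for no real state.
POLITIES = [
    Polity("Polity A", franchise_share=0.55, elected_legislature=0.90),
    Polity("Polity B", franchise_share=0.95, elected_legislature=0.95),
    Polity("Polity C", franchise_share=0.45, elected_legislature=0.90),
]

CODED = {p.name: code_as_democracy(p) for p in POLITIES}

if __name__ == "__main__":
    for p in POLITIES:
        print(p.name, p.franchise_share, p.elected_legislature, "->", CODED[p.name])
```

Under this made-up rule, Polity A and Polity B receive the same code despite quite different scores, while Polity A and Polity C, which differ only slightly on one score, land on opposite sides of the line. Once only the flag is kept, the original scores cannot be recovered from it.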
And that leads to a fairly frequent kind of interaction between the disciplines: historians scour the evidence and produce our best assessment of it, often at fairly low confidence. Social scientists then take these assessments and turn them into data, stripping or flattening out the caveats in the process and then produce impressive looking statistics like those implied by the chart below and attempt to draw conclusions from them, while historians cry foul over how - in Smith's phrasing - 'disciplining with data' cannot correct for the problems with the data.
In the case of this chart, what Max Roser has succeeded in doing is not charting global deaths in conflict, but rather in charting the rate at which evidence for battles is preserved over time and the reliability of the estimates of their casualties. One can actually see the problem with this effort at creating a map of all known battles:
The conclusion drawn in the tweet is, of course, spurious; the prominence of Europe here is an artifact of what battles are well-documented in the languages that the database was constructed with. And before one argues this is just a tweet, John Keegan tried the same thing in A History of Warfare (1993). The map and its conclusions are only so good as the evidence that informs the data it is based on.
Which appears to contend that the Mongolian Steppe was a relatively peaceful place. Of course it wasn't, but it was a place that produced almost no written records and so provides almost no recorded battles to the database, though in the brief moments we can see clearly into the history of this region it was very violent indeed. But the gaps here are massive; keep in mind for instance on the first chart that prior to 1492 (so nearly the whole first sixth of the Max Roser chart) functionally none of the conflicts in North or South America can be represented because they leave no trace in the evidence. Was it peaceful? No, it doesn't seem to have been, but that's not reflected in the data.
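The mechanism at work here can be shown with a toy simulation in Python. It is only a sketch under stated assumptions: two imaginary regions are given exactly the same number of battles, and the only difference between them is an invented probability that any given battle leaves a surviving record. The counts, probabilities and function names are made up and describe no real database.

```python
import random


def recorded_battles(true_count, survival_probability, rng):
    """Count how many of `true_count` battles leave a surviving record."""
    return sum(1 for _ in range(true_count)
               if rng.random() < survival_probability)


def run_survival_demo(seed=0):
    """Compare two regions that differ only in record survival."""
    rng = random.Random(seed)
    true_count = 1000  # identical in both regions by construction
    # Assumed, invented survival rates for written records.
    well_documented = recorded_battles(true_count, 0.80, rng)
    poorly_documented = recorded_battles(true_count, 0.05, rng)
    return true_count, well_documented, poorly_documented


if __name__ == "__main__":
    true_count, well, poorly = run_survival_demo()
    print("true battles in each region:", true_count)
    print("recorded, well-documented region:", well)
    print("recorded, poorly documented region:", poorly)
```

A map drawn from the recorded counts alone would show one region as far more violent than the other, even though the simulation made them equally violent. The difference on the map is a difference in documentation.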
All of which is to say that while the social science approach to understanding the past has its benefits, it also has some fairly severe limitations. It is not simply a superior method. My own view is that both historians and social scientists have a fair bit to learn from each other's methods and conclusions (and caveats and criticisms) but of course this requires mutual respect and a mutual methodological understanding.
Historians by contrast are constrained by two key factors: as a discipline we're taught to avoid simplifying our subjects for the sake of analysis (some historians are more careful in this than others) and rather than focusing on converting the historical research of another field into data, historians deal directly with primary sources, which in turn demand quite a lot of time and energy be invested into collecting and working through the evidence. Smith's insinuation that historians aren't doing the "hard, often unrewarding work" is incredible coming from someone who works in a discipline that has all of its historical data 'pre-chewed' for it by historians. Every dot on that chart of battle deaths above likely represents years of work by historians collecting, sorting and understanding difficult and often contradictory primary source material. Without that work, producing the chart would be impossible.
Consequently that means that rather than engaging in very expansive (mile wide, inch deep) studies aimed at teasing out general laws of society, historians focus very narrowly in both chronological and topical scope. It is not rare to see entire careers dedicated to the study of a single social institution in a single country for a relatively short time because that is frequently the level of granularity demanded when you are working with the actual source evidence 'in the raw.'
Nevertheless as a discipline historians have always held that understanding the past is useful for understanding the present. Or as arguably the first historian, Thucydides puts it, "if it be judged useful by those inquirers who desire an exact knowledge of the past as an aid to the understanding of the future, which in the course of human affairs must resemble, if it does not reflect it, I shall be content." Smith declares that this sort of use of history means that these "are social-science theories" (emphasis original), which is an absurd bit of turf-claiming from a discipline (the social sciences) which is, as a practice distinct from philosophy or history, something like 2,225 years younger than history. So how do historians build arguments with present-tense implications and is this approach valid?
The present-tense implications of historical research generally come in two kinds: either the history of a thing (usually an institution) that still exists is used to explain how that thing came to exist as it does or the history of something in the past is presented as analogous to something similar in the present, such that the former is a useful tool when thinking about the latter. Smith is clearly focused on the latter kind of historical argument, so we can set the history-of-a-thing-that-exists argument aside for today.
The epistemic foundation of these kinds of arguments is actually fairly simple: it rests on the notion that because humans remain relatively constant, situations in the past that are similar to situations today may thus produce similar outcomes. This is no new thing; the attentive will notice our good friend Thucydides laying out this very logic c. 2,420 years ago. At the same time it comes with a caveat: historians avoid claiming strict predictability because our small-scale, granular studies direct so much of our attention to how contingent historical events are. Humans remain constant, but conditions, technology, culture, and a thousand other things do not. I think it would be fair to say that historians - and this is a serious contrast with many social scientists - generally consider strong predictions of that sort impossible when applied to human affairs. Which is why, to the frustration of some, we tend to refuse to engage counter-factuals or grand narrative predictions.
… and it's remarkable — and honestly confusing to visitors from other fields — the extent to which historians resist explicit reasoning about causation and counterfactual analysis even while constantly saying things that clearly implicate these ideas.
We tend to refuse to engage in counterfactual analysis because we look at the evidence and conclude that it cannot support the level of confidence we'd need to have. This is not a mindless, Luddite resistance but a considered position on the epistemic limits of knowing the past or predicting the future.
Instead historians are taught when making present-tense arguments to adopt a very limited kind of argument: Phenomenon A1 occurred before and it resulted in Result B, therefore as Phenomenon A2 occurs now, result B may happen. Tyrants in the past have made multiple attempts to seize power, therefore tyrants in the present may as well, therefore some concern over this possibility is warranted. The result is not a prediction but rather an acknowledgement of possibility; the historian does not offer a precise estimate of probability (in the Bayesian way) because they don't think accurately calculating even that is possible - the 'unknown unknowns' (that is to say, contingent factors) overwhelm any system of assessing probability statistically. Once again what Smith mistakes for lethargy is in fact a considered position by historians that further certainty is not possible; the critique historians make is that the methods Smith advises take that essential, unresolvable uncertainty, dress it up with numbers and pretend it has vanished (or been quantified) when it hasn't and cannot be.
Nevertheless this historian's approach holds significant advantages. By treating individual examples in something closer to the full complexity (in as much as the format will allow) rather than flattening them into data, they can offer context both to the past event and the current one. What elements of the past event - including elements that are difficult or even impossible to quantify - are like the current one? Which are unlike? How did it make people then feel and so how might it make me feel now? These are valid and useful questions which the historian's approach can speak to, if not answer, and serve as good examples of how the quantitative or 'empirical' approaches that Smith insists on are not, in fact, the sum of knowledge or required to make a useful and intellectually rigorous contribution to public debate.
Though I've already offered several examples where Smith critiques historical methodologies without actually bothering to understand them, I want to add one more, which is Smith's apparent lack of awareness of the different uses of the word 'theory,' which came up in the debates that led to his essay. Smith uses 'theory' to mean 'hypothesis' or 'predictive theory' but within the discipline of history 'theory' as a term refers to the broad intellectual framework into which evidence is interpreted; that ought to make sense given that history is by and large a discipline of source criticism rather than one engaged in hypothesis testing (though we do a bit of that too). Historical theory is thus often concerned with questions of what sort of history is important, what questions can and ought to be asked of the sources and how their importance should be understood (e.g. mentalités within the Annales school; a study which focuses on mentalités can be described as being situated within an Annales theoretical framework), a very different beast from a hypothesis.
The Wisdom of Noah Smith
In conclusion, Noah Smith's analysis here cannot be recommended. While accusing historians of shrugging off the "hard, often unrewarding work" he has failed to engage meaningfully with the historical method and its epistemic systems; the easy, swiftly rewarding work of establishing a basic understanding of other disciplines. The result is an analysis which repeatedly misunderstands the claims that historians are making and deliberately ignores the way they signal uncertainty. Instead, Smith indulges in rank scientism, insisting that all claims be confirmed empirically, a call he himself failed to answer on the previous day in arguing for the elite overproduction hypothesis, which he admits "make[s] some questionable assumptions about how labor markets work" and which isn't empirically tested. For my own part I would note that Peter Turchin's effort to find support for this hypothesis in the ancient world is fatally flawed by the lack of evidence; Turchin has built a vast castle on sand - it can be no stronger than its meager evidentiary foundation.
Of course some historians absolutely do make arguments that extend out beyond what their evidence can support firmly. Sometimes they are responsible in signalling uncertainty and sometimes they are not. The problem is especially acute in compressed formats and genres. Of course social scientists do much the same; it cannot for instance both be the case that we can be certain that forgiving student loan debt both absolutely will and assuredly will not induce more inflation. On this latter point, I suspect Smith would agree and yet I don't see him suggesting that economics, as a discipline, should pack it up and go home.
Great to see history majors being abandoned. Maybe now all the historians will have time to go read a book pic.twitter.com/e5xE7Exfr7
And it would be easy enough to dismiss one ill-advised essay on an internet full of them were it not for the fact that this kind of scientism contributes to the hardship of a discipline already under sustained attack not because of the predictions it supposedly makes but because of the true things we insist on teaching about the past, while at the same time history departments continue to shrink. Of course these trends worry me as a historian, but they ought to worry social scientists too, who after all, as noted, rely on historians to process the evidence they use to produce the data that forms the foundations of their conclusions and also benefit from historians offering a critical second look at their models and conclusions. Without historians to do that work, the ability of social scientists to reach for data earlier than the rise of the modern administrative state functionally vanishes.
Yet for all of this it is the wisdom of most historians to be able to see the value in methodological and epistemological approaches other than their own. Alas, it is a wisdom Noah Smith apparently lacks.