One Weird Trick to Promote Persistent Information Asymmetry

When I first began researching the legal services market (LSM), I quickly became aware of the serious information asymmetry problem across the marketplace – especially for criminal defense. As I’ve probably mentioned before, defendants have no real way to estimate the quality of an attorney when considering hiring one for a criminal case (except obviously by consulting Blackstone Trial Analytics, LLC – the trusted name in attorney referrals and quantitative LSM analysis). This is a classic adverse selection problem. Defendants know (or can approximately make themselves aware of) the average outcome of a criminal charge; but they don’t know which attorneys deliver better-than-average outcomes and which deliver worse-than-average outcomes.

Ultimately, attorneys seem to set roughly comparable prices for their services and share the market. Of course many defendants would like to pay more for high quality attorneys and probably all defendants would like to avoid low quality attorneys (at least at the prevailing prices). I wondered why (high quality) attorneys didn’t try to solve this problem. They could, for example, publicize their records. But this wouldn’t work if many other attorneys simply didn’t publicize their own records. Probably the public would have a hard time interpreting the record in the context of suppliers of criminal defense services generally. This is especially true if people systematically overrate their probability of success at trial.

More plausibly, I thought, attorneys could make their fees contingent on case outcomes. For example, they could charge some variable amount (by quality) for plea bargains and more for trials – much more where the defendant wins. Obviously any specific deal is possible, including zero or even negative fees. These arrangements, I thought, would probably produce few poor incentives, remove some bad existing incentives and communicate important facts about quality to defendants.

Generally, when there appears to be an obvious and easy solution to an apparent market failure, a non-market failure lurks just behind it. This is one of those cases. Here’s the relevant ABA rule (Model Rule 1.5(d)):

(d) A lawyer shall not enter into an arrangement for, charge, or collect:

(1) any fee in a domestic relations matter, the payment or amount of which is contingent upon the securing of a divorce or upon the amount of alimony or support, or property settlement in lieu thereof; or

(2) a contingent fee for representing a defendant in a criminal case.

 


Juries: 12 Increasingly Angry Men

One of the most enjoyable things about economic analysis is that it often yields surprising facts about cherished institutions. For example, the jury selection process in the United States probably burdens defendants disproportionately in criminal trials, even though most people believe jury selection serves to mitigate the problem of biased juries.

Why? Imagine the distribution of jurors by how sympathetic they are to the defendant. I would guess the distribution isn’t normal with a mean of “indifferent”; instead, the average juror probably starts from a relatively unsympathetic position. Assume for a particular defendant, 60% of the population has some bias against the defendant and 40% has some bias in favor. If the jury is randomly selected, we would expect 6 of every 10 jurors to be relatively anti-defendant. If the prosecutor and defense attorney can identify the jurors’ biases and are allowed one challenge each, the defense strikes one anti-defendant juror, the prosecutor strikes one pro-defendant juror, and the two random replacements add 1.2 anti-defendant jurors in expectation (5 + 1.2 = 6.2). The final jury should therefore be 6.2/10 anti-defendant. Every additional challenge increases the frequency of anti-defendant jurors in the panel.
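The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions: a 10-person panel, a 60/40 population split, perfectly identifiable biases, random replacement of every struck juror, and each side always finding a juror to strike. The function name and parameters are mine, not part of any dataset.

```python
def expected_anti_jurors(jury_size, share_anti, challenges_each):
    """Expected number of anti-defendant jurors after both sides use their challenges.

    Assumes each side can always find a juror to strike and that every
    struck juror is replaced by a random draw from the population.
    """
    anti = jury_size * share_anti  # expected count on a randomly selected jury
    for _ in range(challenges_each):
        anti -= 1               # the defense strikes one anti-defendant juror
        anti += 2 * share_anti  # two seats (one per strike) are refilled at random
    return anti


# Expected anti-defendant jurors for 0 to 3 challenges per side.
for k in range(4):
    print(k, round(expected_anti_jurors(10, 0.6, k), 1))
```

Under these assumptions each round of challenges adds 2 × 0.6 − 1 = 0.2 anti-defendant jurors in expectation, so the drift only runs against the defendant when more than half the population starts out unsympathetic.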

Apparently Scots law embraces (wisely, I think) random jury selection. The quotation below is from Peter Duff’s “The Scottish Criminal Jury: A Very Peculiar Institution”:

There is no equivalent to the voir dire procedure in Scotland, a fact which might surprise some American readers. The strong opposition of the Scottish criminal justice system to any procedure of this type is well illustrated by the observations of the Appeal Court in McCadden v. H. M. Advocate:

There may never be a process which eliminates the possibility of personal prejudices existing among jurors, the nearest practical one (and it is not foolproof) being possibly the “vetting” of jurors, a system against which the law of Scotland has steadfastly closed the doors. Evidence of how it is used and abused in countries in which it is operated only tends to confirm the wisdom of that decision.

The court went on to observe that it should not be “lightly assumed” that jurors will pursue their prejudices in defiance of their oath and the directions of the judge. On a more practical note, the court pointed out that the broad base from which jurors are drawn means that any prejudices and biases tend to cancel each other out, and further, that the majority verdict, whereby a bare eight-to-seven vote either way suffices, ensures that it is unlikely that one prejudiced juror can affect the outcome of the case.

Why Are Public Defenders So Good?

In my last post, I provided a graph suggesting public defenders have above average win-rates. Most people find this surprising. Actually, this fits neatly into a model of the LSM where defense attorneys are profit maximizers and public defenders are sentence minimizers. Profit maximization does not imply sentence minimization. Instead, defense attorneys focus on “Win-Stay” and “Lose-Stay” outcomes. To see what I mean, consider Bayes’ Rule

P(A|B) = P(B|A) P(A) / [ P(B|A) P(A) + P(B|~A) P(~A) ]

All of this means that the probability of A conditional on B equals the probability of B conditional on A multiplied by the probability you assign to A, over the probability of B conditional on A multiplied by the probability you assign to A plus the probability of B conditional on not-A multiplied by the probability you assign to not-A. If the above isn’t clear, check out Bryan Caplan’s excellent lecture notes or this post at Econlog.

Here’s an example relevant to the defense attorney profit maximization problem:

P(A|B) = P(Attorney is Optimal|Bad Case Outcome)
P(B|A) = P(Bad Case Outcome|Attorney is Optimal)
P(A) = Probability Attorney is Optimal
P(~A) = Probability Attorney is not Optimal

The profit maximizing attorney wants to persuade clients with bad outcomes that their attorney was still the correct choice. This way, the attorney keeps access to that client’s network (and, of course, to the same client’s future cases). In order to do this, attorneys should focus on increasing their clients’ subjectively held belief that they are high quality, P(A), and increasing the clients’ belief that bad outcomes with high quality attorneys are common, P(B|A). For simplicity, let’s assume that the attorney’s clients will stay with them or recommend them to others if the attorney wins their case.
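To make the Lose-Stay logic concrete, here is a minimal Python sketch of the client’s update after a bad outcome. All of the probabilities are made-up illustrative numbers, not estimates from any data, and the function name is hypothetical.

```python
def posterior_optimal_given_bad(prior_optimal, p_bad_given_optimal, p_bad_given_not_optimal):
    """Bayes' Rule: P(Attorney is Optimal | Bad Case Outcome)."""
    numerator = p_bad_given_optimal * prior_optimal
    denominator = numerator + p_bad_given_not_optimal * (1 - prior_optimal)
    return numerator / denominator


# Hypothetical client before any signaling by the attorney:
# P(A) = 0.5, P(B|A) = 0.3, P(B|~A) = 0.6.
baseline = posterior_optimal_given_bad(0.5, 0.3, 0.6)

# Same client after the attorney raises both P(A) and P(B|A):
# P(A) = 0.7, P(B|A) = 0.5, P(B|~A) unchanged.
after_signaling = posterior_optimal_given_bad(0.7, 0.5, 0.6)

print(round(baseline, 3), round(after_signaling, 3))
```

With these invented numbers, the client’s belief that the attorney was the right choice after a loss roughly doubles, even though nothing about the attorney’s actual performance has changed.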

These incentives create a potential principal-agent problem in the attorney-defendant relationship. If an hour of signaling “I have a great win-rate” does more to increase the probability of Lose-Stay outcomes than an hour of work increasing the probability of winning, the attorney will invest too little (from the defendant’s perspective) in actually winning.

Public defenders, as sentence minimizers, don’t have this problem. Basically – and this can be seen in the data – the average public defender is a better agent than the average private defense attorney. Of course public defenders have obvious weaknesses – essentially zero budget for non-procedural trial inputs, for example. But with respect to procedural inputs, they should behave as if they have been given infinitely large budgets.

 

Should Public Defender Caseloads Matter for Indigent Defense Outcomes?

Unsurprisingly, I believe the answer is yes; surprisingly, I expect heavy caseloads should improve defense outcomes. My reasoning:

  • (1) Public defenders (PDs) behave as if they are not budget constrained (with respect to trial inputs they supply themselves).
  • (2) Heavy caseloads should increase the probability of seeing any given case continued on a given day.
  • (3) The average indigent defendant is more likely than the average wealthy defendant to have a prior criminal record, and the marginal disutility of additional criminal charges is probably strongly diminishing. In other words, prosecutors quickly lose (or often lack) the ability to tempt indigent defendants with plea bargains that offer features like amended charges with better labor market signaling. Indigent defendants have more taste for trials, which are costly to prosecutors.

(1) strongly suggests that PD clients are relatively more expensive to prosecute, especially at general district court levels where things like (costly) expert testimony are less common. (3) also suggests a level effect; when indigent defendants and relatively wealthy defendants have identical case details, the indigent defendants will probably do somewhat better in terms of trial outcomes as a group. (2) implies that as PD caseloads increase, the probability of continuances across the PD portfolio increases. Sentence-maximizing prosecutors will know this, and will be induced to offer more favorable pleas or drop charges as needed.

If I Love the Legal System So Much, Why Don’t I Marry It?

As a consultant, my job is simple: increase mean, decrease variance. In order to do this, I need to have some expertise in how the criminal justice system (CJS) works. I think the best way to understand the CJS is to break the system up into critical component parts. Each component part involves interactions between agents, and each agent has goals. Using court data, we can refine and parameterize our agent models until we have a predictive service to offer our clients. Many of the conclusions of this type of analysis are surprising – for example, the finding that judges are better adjudicators than most people believe. I frequently point out that the CJS today doesn’t seem wildly different from the CJS most reformers pine for.

This should not be construed to mean I think the CJS is optimal. Rather, I think judges and prosecutors behave basically like the types of agents you would want in an optimal CJS. The real problem isn’t with the adjudicators, it’s with the arresting authorities and the laws. The arresting authorities have an easy enough fix in principle; we could just tie their pay to their successful prosecution rate (or any derivative of this plan). Would this fix magically give us an optimal CJS? No, but it would likely help fight the pervasive over-arresting of blacks by police. The more fundamental problem is the law generally. The optimal CJS is that which maximizes aggregate utility; by backward induction the optimal set of laws is that which maximizes aggregate utility.

I submit that the current set of laws is nowhere close to the utility maximizing set. While I think many laws are undesirable and should be repealed, the worst feature of our legal regime seems to be the given penalties for breaking most laws. We all know long prison sentences are overrated; the wise Alex Tabarrok prefers this alternative (do read his entire post):

I favor more police on the street to make punishment more quick, clear, and consistent. I would be much happier with more police on the street, however, if that policy was combined with an end to the “war on drugs”, shorter sentences, and an end to brutal post-prison policies that exclude millions of citizens from voting, housing, and jobs.

I suspect such a policy regime would move us unambiguously closer to the optimal regime (i.e. more deterrence and more utility), but it neglects the problem of police behavior. Presumably, the social costs would still be borne disproportionately by populations of color.

Are Police Better People? Probably Not.

After reviewing the FCGDC data, it seems more likely that disparate black/white CJS outcomes are driven by police rather than judges or prosecutors. It seems the fundamental problem is that blacks are arrested at rates much higher than you would expect based on Fairfax County demographic information. There are a couple of ways to think about this. Perhaps (1) police officers tend to arrest blacks at a lower level of confidence than whites with respect to suspected guilt. Perhaps (2) police arrest whites and blacks at comparable confidence levels but they target relatively black neighborhoods. Alternatively, (3) black populations may be associated with more crimes per capita than white populations. Given the relationship between income and criminality, it seems likely that the third option may have some truth – but it seems intuitively unlikely that this can account for the vast disparity between white and black arrest rates.

On the other hand, imagine that police officers maximize arrests at a certain level of confidence. Imagine that they have no bias against any particular group. In a highly stylized setting where a representative police officer can choose between patrolling two neighborhoods, identical in every way, he will be indifferent. Now suppose one neighborhood is somewhat poorer and the average inhabitant of that neighborhood is somewhat more likely to be involved in criminal activity. If this police officer is the ideal social agent (i.e. only cares about enforcing society’s laws), he will focus disproportionately on the poorer neighborhood. Specifically, he will patrol the poorer neighborhood until the marginal benefit of search (in arrests) equals the marginal cost (in time); at that point he will be indifferent between patrolling either neighborhood. In this scenario, the majority of arrests will come from the poorer neighborhood. If we assume the relative population size of the two neighborhoods is large compared to the size of the police force, we should expect a vast majority of arrests to come from the poorer neighborhood. If we add enough assumptions, we can create a scenario where the police behavior in Fairfax County today is “socially optimal” under the dubious meta-assumption that the Code of Virginia is optimally designed.
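A toy version of the patrol story can be written down explicitly. The sketch below adds an assumption that is not in the post: arrests in each neighborhood equal a “crime density” parameter times the square root of patrol hours, which is just one convenient way to get diminishing returns to search. Under that assumption, the officer’s best split of a fixed number of hours is the one that equalizes marginal arrests per hour across the two neighborhoods. The densities and the 40-hour week are invented for illustration.

```python
import math


def arrests(density, hours):
    """Assumed arrest technology: diminishing returns to patrol hours."""
    return density * math.sqrt(hours)


def marginal_arrests(density, hours):
    """Derivative of arrests() with respect to hours."""
    return density / (2 * math.sqrt(hours))


def patrol_split(total_hours, density_poor, density_rich):
    """Hours in each neighborhood that equalize marginal arrests per hour.

    With the square-root technology, hours are allocated in proportion to
    the squared densities.
    """
    weight_poor = density_poor ** 2
    weight_rich = density_rich ** 2
    hours_poor = total_hours * weight_poor / (weight_poor + weight_rich)
    return hours_poor, total_hours - hours_poor


# Hypothetical numbers: the poorer neighborhood has 1.5x the crime density.
hours_poor, hours_rich = patrol_split(40, 1.5, 1.0)
total = arrests(1.5, hours_poor) + arrests(1.0, hours_rich)
share_poor = arrests(1.5, hours_poor) / total
print(round(hours_poor, 1), round(hours_rich, 1), round(share_poor, 3))
```

In this toy setup a 50% difference in density puts roughly 69% of both patrol hours and arrests in the poorer neighborhood, with no bias on the officer’s part. How strong the effect is depends entirely on the assumed arrest technology.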

Can we test to see if reality approximates the story above? Yes. We already know that blacks and whites face comparable trial outcomes when controlling for income, so it’s unlikely that the average black defendant is more guilty than the average white defendant (i.e. the fact bundle against black defendants seems as strong as the bundle against white defendants on average). So what? The important takeaway from this fact is that one group doesn’t seem to commit crimes more conspicuously or hide crimes more adeptly; framed differently, it isn’t easier to search for either black or white criminals, although the density of criminals in one area may be relatively higher than that of another area. This makes reality look somewhat similar to the theoretical world outlined above. We also know that blacks are relatively more likely than whites to have charges against them dropped. This suggests support for police behavior hypothesis (1): police arrest blacks at lower confidence levels than whites. Reviewing the FCGDC police data itself, we find additional support for hypotheses (1) and (2).

With respect to the data, average defendant win-rates by individual police officer are distributed fairly normally (we only looked at officers with at least 50 trials); but a histogram alone doesn’t tell us too much in this case. Maybe police arrest everyone at 50% confidence levels, and variation in defendant win-rates is more about whether the individual police officer is in the short-run or long-run. This is analogous to flipping a fair coin some number of times; over 50 trials, you may see a relatively large number of heads turn up. Over 1,000,000 trials, the Heads/Tails ratio should settle down at 1:1. Actually, we find that defendant win-rate by police officer doesn’t change much over time. A much more plausible story is that individual police officers arrest defendants at individual confidence levels. We also find substantial differences in black/white defendant win-rates across a number of police officers.
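The coin-flip analogy can be quantified with the standard error of a proportion. The sketch below assumes every trial is an independent draw with the same 50% defendant win probability, which is the null story described above rather than anything estimated from the FCGDC data.

```python
import math


def win_rate_std_error(p, n_trials):
    """Standard error of an observed win-rate over n independent trials."""
    return math.sqrt(p * (1 - p) / n_trials)


# How much an officer's observed defendant win-rate would wobble by chance alone.
for n in (50, 1_000, 1_000_000):
    print(n, round(win_rate_std_error(0.5, n), 4))
```

At 50 trials the chance-driven spread is about seven percentage points, so some variation across officers is expected even if they all arrest at the same confidence level. Pure noise would shrink as an officer’s trial count grows, though, which is why win-rates that stay put over time point toward officer-specific confidence levels instead.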

To sum up, even assuming the unexpectedly high level of black arrests is purely a function of crime density in poorer areas:

  • Individual police officers make arrests at dramatically differing levels of confidence.
  • A number of police seem to arrest blacks at lower levels of confidence than whites.

This is about the most charitable view of the arresting authorities one could reasonably give and it doesn’t exactly paint the police in a very flattering light.

Are Judges Better People? Probably.

In my first post, I wrote about data from the Fairfax County General District Court (FCGDC) and how it seems to support the view that courts themselves should not be the primary target of the typical criminal justice reformer. This should not be misconstrued as evidence that courts are models of well-designed institutions that we should copy whenever possible. The FCGDC data provides support for the view that prosecutors are extremely effective at giving adjudicating authorities what they want, and by a stroke of fortune the adjudicating authorities want things like accuracy and procedural discipline instead of things like apartheid. In other words, the FCGDC doesn’t seem to be a particularly robust institution, but nevertheless it happens to work surprisingly well. In this post, I’ll give a few reasons why I think that’s true.

If the story I’ve outlined is correct, judges are basically benevolent dictators. This is exceedingly rare in politics; why does it make sense in the courtroom?

From “‘Ideology’ or ‘Situation Sense’? An Experimental Investigation of Motivated Reasoning and Professional Judgment” by Dan Kahan, David Hoffman, Danieli Evans, Neal Devins, Eugene Lucci, and Katherine Cheng:

The study involved a sample of sitting judges (n = 253), who, like members of a general public sample (n = 800), were culturally polarized on climate change, marijuana legalization and other contested issues. When the study subjects were assigned to analyze statutory interpretation problems, however, only the responses of the general-public subjects and not those of the judges varied in patterns that reflected the subjects’ cultural values. The responses of a sample of lawyers (n = 217) were also uninfluenced by their cultural values; the responses of a sample of law students (n = 284), in contrast, displayed a level of cultural bias only modestly less pronounced than that observed in the general-public sample.

The key takeaway from this study is that judges are less susceptible to cognitive biases than members of the public. The authors attribute this to legal training, and their results do seem to support this story; law students do worse than judges and members of the public do worse than law students. I wonder how much of the story can be explained by IQ alone (judges have higher IQs than law students, law students have higher IQs than the general population); it would be interesting to see the study redone testing additional high-IQ groups without formal legal training (sociologists, engineers, economists, etc.). At any rate, the results are interesting.

Obviously the study results come from judges responding to an anonymous, low-stakes survey. But the results from the FCGDC dataset support the view that judges abstain from motivated reasoning even in relatively high-stakes trials. Given that few formal institutions constrain judge behavior (in practice they are basically free to adjudicate however they want), what can account for a judge’s style? Probably the same things that prevent people from producing low-quality work in general – it’s embarrassing, for one. Economists often note that the bureaucracy, for example, works better than you would expect assuming all of its agents are highly self-interested. The same seems to apply to judges.