At a talk on a late-stage draft of this book, after I presented the results at the end of chapter 9 (showing that my proof of concept measure behaves as we'd expect a valid rule of law measure to behave, i.e., being closely correlated with economic development, individual liberty, etc.), an audience member asked a question I hadn't considered before: does my measure do better than the alternatives with respect to those other variables?
Of course, an apples-to-apples comparison makes little sense, since a key advantage of my measure is that it is unidimensional, whereas the other leading candidates tend to be multidimensional. But statistical dimensionality reduction techniques make a practical comparison possible anyway.
Hence, this document presents a comparison of the behavior of my measure with the behavior of the first principal component of the World Justice Project's 2012 index (the best of the alternative measures), based on its 8 main factors (n.b., the WJP's own statistical audit reports that "the eight dimensions share a single latent factor that captures 81% of the total variance").
There was not enough time for this material to make it into the book, but it should nonetheless be useful to readers. I claim that it shows that my measure is capturing essentially the same thing as the WJP's entire measure, except that mine has the advantage of parsimony and simplicity, being based on far fewer variables, and is more closely tied to a theoretical account of what the rule of law is. Thus, it further validates the measurement approach presented in The Rule of Law in the Real World.
Replication information:
This document contains working Python code and output, mixed with text. For my own convenience, I pre-created a single CSV containing both my data and the WJP's 2012 factors, which may be downloaded here. If you'd like to get the data directly from the WJP, as of December 12, 2015, it's available here. All libraries and tools (including the wonderful IPython notebook that I used to run the code and generate this page) are packaged into the excellent free Anaconda scientific Python distribution, which I highly recommend. The one exception is that the fancy interactive graphics use the bokeh library, which, after you install Anaconda, can be installed by going to the command line and typing conda install bokeh. I used Python 2, but it should work on the more modern Python 3 too (untested).
If you don't feel like looking at all the data stuff, you can also go home.
from bokeh.plotting import figure, output_notebook, show, ColumnDataSource
from bokeh.models import HoverTool
def setFigParams(fig):
    # Shared styling for every plot in this document.
    # Written against the bokeh API of late 2015; newer releases renamed some
    # of these properties (e.g. legend.location, fig.title.text_font_style).
    fig.legend.orientation = "bottom_right"
    fig.legend.border_line_alpha = 0.2
    fig.legend.background_fill_color = "gold"
    fig.legend.background_fill_alpha = 0.05
    fig.grid.grid_line_alpha = 0.1
    fig.grid.minor_grid_line_alpha = 0.1
    fig.axis.minor_tick_line_color = None
    fig.axis.major_label_text_font_size = '15pt'
    fig.axis.major_label_text_alpha = 0.7
    fig.axis.axis_label_standoff = 15
    fig.axis.axis_label_text_font_size = '15pt'
    fig.axis.axis_label_text_alpha = 0.7
    fig.axis.axis_line_alpha = 0.1
    fig.outline_line_alpha = 0.1
    fig.title_text_font_style = "bold"
    fig.title_text_font = "arial"
    fig.title_text_alpha = 0.8
    fig.title_text_baseline = "bottom"
import numpy as np
import pylab as plt
import pandas as pd
import sklearn
rolds = pd.read_csv("rol-scores.csv")
rolds.head() # display the first five rows
F1LGP, F2AOC, etc. are the eight main factors from the WJP data for 2012; the sub-factors (which the WJP composes into these aggregate factors) are more numerous and are not included. Factors 7 and 8 are "civil justice" and "criminal justice," hence the identical acronyms.
# off the shelf principal components analysis
wjp_columns = ["F1LGP", "F2AOC", "F3OS", "F4FR", "F5OG", "F6RE", "F7CJ", "F8CJ"]
from sklearn.decomposition import PCA
wjpPC = PCA(n_components=1)
onedim = wjpPC.fit_transform(rolds[wjp_columns])
rolds['wjppc'] = onedim
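One way to sanity-check a one-component projection like the one above is to ask how much of the total variance the first component carries. The sketch below does this by hand with NumPy's SVD on synthetic data. The eight columns are made-up stand-ins that share one latent score, not the WJP factors, and the names `latent`, `factors`, and `explained_ratio` are mine. With the fitted scikit-learn object above, the comparable figure should be available as wjpPC.explained_variance_ratio_.

```python
import numpy as np

# Synthetic stand-in for eight factors: 100 hypothetical countries, each
# factor equal to one shared latent score plus independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))
factors = latent + 0.4 * rng.normal(size=(100, 8))

# PCA by hand: center each column, then take the singular value decomposition.
centered = factors - factors.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Share of total variance captured by each principal component.
explained_ratio = S**2 / np.sum(S**2)

# Scores on the first principal component (one number per row).
first_pc = centered @ Vt[0]

print("first component share of variance:", round(float(explained_ratio[0]), 2))
```

When the columns really do share a single latent factor, as in this toy setup, the first share dominates the rest.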
# create per capita GDP variable
gdppc = rolds['2012GDP']/rolds['Pop. In Millions for 2012']
rolds['gdppc'] = gdppc
# convenience dataframe with just the columns I'm working with; also copy the column whose name has a typo-ed trailing space to a clean name.
rolds['elec_pros'] = rolds['elec_pros ']
finalds = pd.DataFrame(rolds[['State','RoLScore','wjppc','gdppc','elec_pros','pol_plur', 'per_auto','hprop']])
finalds.head()
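The copy in the cell above leaves the original 'elec_pros ' column in place alongside the clean one. An alternative, sketched here on a toy frame rather than the real CSV, is to strip stray whitespace from every column label at once. The frame `toy` and its values are hypothetical.

```python
import pandas as pd

# Toy frame with a trailing space in one column name (hypothetical values).
toy = pd.DataFrame({"State": ["A", "B"], "elec_pros ": [1.0, 2.0]})

# Strip leading and trailing whitespace from every column label in one pass.
toy.columns = toy.columns.str.strip()

print(list(toy.columns))  # ['State', 'elec_pros']
```

This renames in place, so no duplicate column is left behind.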
First, let's look at the relationship between my score and the first WJP principal component.
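A single-number summary can accompany the plot. The sketch below computes a Pearson correlation on toy data (made-up numbers, not the real scores) and also illustrates one thing to watch for: the sign of a principal component is arbitrary, so a strongly negative correlation can simply mean the component points the other way. The column names mirror finalds; everything else is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for finalds: made-up numbers, NOT the real scores.
rng = np.random.default_rng(1)
base = rng.normal(size=50)
toy = pd.DataFrame({
    "RoLScore": base + 0.3 * rng.normal(size=50),
    "wjppc": -(base + 0.3 * rng.normal(size=50)),  # deliberately sign-flipped
})

# Pearson correlation between the two measures.
r = toy["RoLScore"].corr(toy["wjppc"])

# Flip the component if it points the opposite way from the other measure.
if r < 0:
    toy["wjppc"] = -toy["wjppc"]
r_aligned = toy["RoLScore"].corr(toy["wjppc"])

print(r, r_aligned)
```

On the real data the same two calls would be finalds['RoLScore'].corr(finalds['wjppc']), with the flip applied only if the sign comes out negative.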
output_notebook()