Comparison of RLRW scores to WJP scores

At a talk on a late-stage draft of this book, after I presented the results at the end of chapter 9 (showing that my proof of concept measure behaves as we'd expect a valid rule of law measure to behave, i.e., being closely correlated with economic development, individual liberty, etc.), an audience member asked a question I hadn't considered before: does my measure do better than the alternatives with respect to those other variables?

Of course, apples-to-apples comparison makes little sense, as a key advantage of my measure is that it is unidimensional, whereas the other leading candidates tend to be multidimensional. But that isn't a barrier to practical comparability with the aid of statistical dimensionality reduction techniques.

Hence, this document presents a comparison of the behavior of my measure with the behavior of the first principal component of the World Justice Project's 2012 index (the best of the alternative measures), based on its eight main factors (n.b., the WJP's own statistical audit reports that "the eight dimensions share a single latent factor that captures 81% of the total variance").

There was not enough time for this material to make it into the book, but it should nonetheless be useful to readers. I claim that it shows that my measure is capturing essentially the same thing as the WJP's entire measure, except that mine has the advantage of parsimony and simplicity, being based on far fewer variables, and is more closely tied to a theoretical account of what the rule of law is. Thus, it further validates the measurement approach presented in The Rule of Law in the Real World.

Replication information:
This document contains working Python code and output, mixed with text. For my own convenience, I pre-created a single CSV containing both my data and the WJP's 2012 factors, which may be downloaded here. If you'd like to get the data directly from the WJP, as of December 12, 2015, it's available here. All libraries and tools (including the wonderful IPython notebook that I used to run the code and generate this page) are packaged in the excellent free Anaconda scientific Python distribution, which I highly recommend. The one exception is the bokeh library, used for the fancy interactive graphics; after you install Anaconda, you can get it by going to the command line and typing conda install bokeh. I used Python 2, but the code should also work on the more modern Python 3 (untested).

If you don't feel like looking at all the data stuff, you can also go home.

In [1]:
from bokeh.plotting import figure, output_notebook, show, ColumnDataSource
from bokeh.models import HoverTool
In [2]:
def setFigParams(fig):
    fig.legend.orientation = "bottom_right"
    fig.legend.border_line_alpha = 0.2
    fig.legend.background_fill_color = "gold"
    fig.legend.background_fill_alpha = 0.05
    fig.grid.grid_line_alpha = 0.1
    fig.grid.minor_grid_line_alpha = 0.1
    fig.axis.minor_tick_line_color = None
    fig.axis.major_label_text_font_size = '15pt'
    fig.axis.major_label_text_alpha = 0.7
    fig.axis.axis_label_standoff = 15
    fig.axis.axis_label_text_font_size = '15pt'
    fig.axis.axis_label_text_alpha = 0.7
    fig.axis.axis_line_alpha = 0.1
    fig.outline_line_alpha = 0.1
    fig.title_text_font_style = "bold"
    fig.title_text_font = "arial"
    fig.title_text_alpha = 0.8
    fig.title_text_baseline ="bottom"
In [3]:
import numpy as np
import pylab as plt
import pandas as pd
import sklearn 
rolds = pd.read_csv("rol-scores.csv")
rolds.head()  # display the first five rows
Out[3]:
   State       Pop. In Millions for 2012  RoLScore  elec_pros  pol_plur  free_expr  assoc_org  per_auto       2012GDP  hprop  ...  htra  hinv     F1LGP     F2AOC      F3OS      F4FR      F5OG      F6RE      F7CJ      F8CJ
0  Albania                           3.2     42.60          8        10         13          8         9  1.264810e+10     30  ...  79.8    65  0.457571  0.311833  0.729112  0.631233  0.444956  0.426012  0.507478  0.410052
1  Argentina                        41.1     51.94         11        15         14         11        13  4.760000e+11     15  ...  67.6    40  0.459925  0.474730  0.595719  0.630741  0.479807  0.431074  0.536658  0.434365
2  Australia                        22.7     73.28         12        15         16         12        15  1.530000e+12     90  ...  86.2    80  0.883457  0.896729  0.858811  0.842861  0.840763  0.830545  0.723283  0.723643
3  Austria                           8.4     73.15         12        15         16         12        15  3.950000e+11     90  ...  86.8    85  0.822501  0.773411  0.885257  0.824339  0.802236  0.844839  0.743924  0.747908
4  Bangladesh                      154.7     31.57          9        11          9          8         9  1.160000e+11     20  ...  54.0    55  0.402931  0.290016  0.623857  0.433688  0.352168  0.355572  0.322477  0.381836

5 rows × 23 columns

F1LGP, F2AOC, etc. are the eight main factors from the WJP data for 2012; the sub-factors (which the WJP composes into these aggregate factors) are more numerous and are not included. Numbers 7 and 8 are "civil justice" and "criminal justice," hence the identical acronyms.

In [4]:
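# off-the-shelf principal components analysis on the eight WJP factors
from sklearn.decomposition import PCA
wjp_columns = ["F1LGP", "F2AOC", "F3OS", "F4FR", "F5OG", "F6RE", "F7CJ", "F8CJ"]
wjpPC = PCA(n_components=1)  # keep only the first principal component
onedim = wjpPC.fit_transform(rolds[wjp_columns])  # array of shape (n_countries, 1)
rolds['wjppc'] = onedim[:, 0]  # flatten to one score per country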
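
For readers curious about what the first principal component is doing, here is a minimal sketch using plain NumPy on synthetic data. The data below are made up (one shared latent score plus noise across eight columns) and are not the WJP factors; the names latent, factors, and first_pc are mine. On the real fit, the analogous share of variance should be available from the fitted object as wjpPC.explained_variance_ratio_. Note that this share is not necessarily the same quantity as the 81% figure from the WJP audit, which may rest on a different method.

```python
import numpy as np

# Synthetic stand-in for eight factor columns (NOT the real WJP data):
# one shared latent score plus independent noise in each column.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))
factors = 0.6 + 0.15 * latent + 0.05 * rng.normal(size=(100, 8))

# PCA centers each column, then finds the direction of greatest variance.
centered = factors - factors.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

first_pc = centered @ Vt[0]         # one score per row on the first component
explained = S**2 / np.sum(S**2)     # share of total variance per component

# The sign of a principal component is arbitrary, so orient it such that
# higher factor values mean higher scores.
if Vt[0].sum() < 0:
    first_pc = -first_pc

print("share of variance on the first component: %.2f" % explained[0])
```

The sign flip matters in practice: a first component that comes out inverted would make every scatterplot look like a negative relationship without changing anything substantive.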
In [5]:
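# create per capita GDP variable
# (population is in millions, so this is really GDP per million people; the
# constant factor of 1e6 drops out when the variables are standardized later)
gdppc = rolds['2012GDP']/rolds['Pop. In Millions for 2012']
rolds['gdppc'] = gdppc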
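
A note on units, as a small worked sketch. Because the population column is in millions, dividing GDP by it gives GDP per million people rather than per person. The example below uses only the five rows displayed in Out[3] (so the displayed GDP figures are rounded) and checks that the extra factor of 1e6 makes no difference once a variable is converted to z-scores. The helper zscores is my own illustration, not part of the original code.

```python
import statistics

# The five rows displayed in Out[3]: total GDP and population in millions.
gdp = [1.264810e+10, 4.760000e+11, 1.530000e+12, 3.950000e+11, 1.160000e+11]
pop_millions = [3.2, 41.1, 22.7, 8.4, 154.7]

# What the notebook computes: GDP per million people.
as_computed = [g / p for g, p in zip(gdp, pop_millions)]
# GDP per person: convert millions of people to people first.
per_person = [g / (p * 1e6) for g, p in zip(gdp, pop_millions)]

def zscores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample sd, the same convention as pandas' .std()
    return [(v - m) / s for v in values]

# Standardizing removes the constant factor, so both versions agree.
for a, b in zip(zscores(as_computed), zscores(per_person)):
    assert abs(a - b) < 1e-9

print(round(per_person[0], 2))  # Albania, roughly 3952.53
```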
In [6]:
# convenience dataframe with just the columns I'm working with; also copy the column whose name has a typo-ed trailing space to a clean name
rolds['elec_pros'] = rolds['elec_pros ']
finalds = rolds[['State','RoLScore','wjppc','gdppc','elec_pros','pol_plur', 'per_auto','hprop']].copy()  # explicit copy, so later edits don't touch rolds
finalds.head()
Out[6]:
   State       RoLScore      wjppc         gdppc  elec_pros  pol_plur  per_auto  hprop
0  Albania        42.60  -0.278243  3.952530e+09          8        10         9     30
1  Argentina      51.94  -0.198056  1.158151e+10         11        15        13     15
2  Australia      73.28   0.726105  6.740088e+10         12        15        15     90
3  Austria        73.15   0.648907  4.702381e+10         12        15        15     90
4  Bangladesh     31.57  -0.516102  7.498384e+08          9        11         9     20

First, let's look at the relationship between my score and the first WJP principal component.

In [7]:
output_notebook()
BokehJS successfully loaded.

Graphics

You can hover the mouse over a data point to view the country. You can also use the mouse to drag the plot around, and the scroll wheel (or two-finger scroll on a Mac) to zoom in and out.

If you click the magnifying glass icon then draw a rectangle, the plot will zoom in on it.

If you click the circular arrows, it will all reset.

In [8]:
hover = HoverTool(tooltips=[("State", "@State"), ("RLRW Score", '@RoLScore'),("WJP Score", "@wjppc")])
basic1 = figure(width=850, height=550, responsive=True, title='RLRW against WJP Scores', x_axis_label = 'RLRW', y_axis_label = 'WJP', tools=['pan', 'box_zoom', 'wheel_zoom', 'reset', 'save', hover])
basic1.scatter(finalds['RoLScore'], finalds['wjppc'], source=ColumnDataSource(data=finalds), name='basic1', marker='square', size=8, fill_alpha=0.4, color='black')
setFigParams(basic1)
show(basic1)

As we can see, they very closely track one another. Now let's use overlaid scatterplots to compare how they look with respect to all the comparator variables given in the book. In order to do this, we'll have to scale the variables first.
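
To put a number on "closely track," one option is the Pearson correlation. The sketch below uses only the five rows displayed in Out[6], so it illustrates the calculation and says nothing reliable about the full dataset. On the full dataframe, the equivalent call should be finalds['RoLScore'].corr(finalds['wjppc']).

```python
import numpy as np

# The five rows displayed in Out[6] (not the full dataset).
rlrw = np.array([42.60, 51.94, 73.28, 73.15, 31.57])
wjp_pc = np.array([-0.278243, -0.198056, 0.726105, 0.648907, -0.516102])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two score vectors.
r = np.corrcoef(rlrw, wjp_pc)[0, 1]
print("Pearson r on the five displayed rows: %.3f" % r)
```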

In [9]:
from __future__ import division  # to avoid any weird nonsense with integer division 
def scale(variable):
    # standardize to z-scores: mean 0, standard deviation 1
    # (pandas' .std() is the sample standard deviation)
    return (variable - variable.mean()) / variable.std()
v2s = ['RoLScore','wjppc','gdppc','elec_pros','pol_plur', 'per_auto','hprop']
for v in v2s:
    finalds[v] = scale(finalds[v])
finalds.head()
Out[9]:
   State        RoLScore      wjppc      gdppc  elec_pros   pol_plur   per_auto      hprop
0  Albania     -0.583004  -0.661588  -0.578904  -0.203537  -0.340063  -0.493828  -0.690925
1  Argentina    0.007954  -0.470926  -0.184115   0.625800   0.850157   0.704527  -1.282219
2  Australia    1.358174   1.726488   2.704461   0.902245   0.850157   1.303705   1.674249
3  Austria      1.349949   1.542930   1.649975   0.902245   0.850157   1.303705   1.674249
4  Bangladesh  -1.280892  -1.227155  -0.744639   0.072909  -0.102019  -0.493828  -1.085121

Now we can simply loop over the variables of interest and display overlaid scatterplots. In the plots below, red squares indicate the score from The Rule of Law in the Real World, while blue circles indicate the score from the first WJP principal component.

In [10]:
vofi = [('gdppc', "GDP Per Capita"),('elec_pros', "Freedom House Electoral Process"),('pol_plur', "Freedom House Political Pluralism"), ('per_auto', "Freedom House Personal Autonomy"),('hprop', "Heritage Foundation Property Rights")]
In [11]:
def makePlot(variable):
    hover = HoverTool(tooltips=[("State", "@State"),(variable[1], '@' + variable[0]),("RLRW Score", '@RoLScore'),("WJP Score", "@wjppc")])
    aplot = figure(width=850, height=550, responsive=True, title=variable[1], x_axis_label = 'Rule of Law Scores', y_axis_label = variable[1], tools=['pan', 'box_zoom', 'wheel_zoom', 'reset', 'save', hover])
    aplot.scatter(finalds['RoLScore'], finalds[variable[0]], source=ColumnDataSource(data=finalds), marker='square', size=8, fill_alpha=0.4, legend="RLRW", color='red')
    aplot.scatter(finalds['wjppc'], finalds[variable[0]], source=ColumnDataSource(data=finalds), marker='circle', size=8, fill_alpha=0.4, legend="WJP", color='blue')
    setFigParams(aplot)
    show(aplot)
    
for var in vofi:
    makePlot(var)
    print('\n\n\n')

As you have seen, my measure and the first principal component of the WJP behave essentially the same with respect to the variables of interest.
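
The audience question (does one measure do better than the other with respect to the comparator variables?) can also be approached numerically, by correlating each measure with each comparator and setting the results side by side. The sketch below is one way to do that; the helper compare is hypothetical, and the data are only the five scaled rows shown in Out[9] for two of the comparators, so the printed numbers illustrate the procedure and should not be read as findings.

```python
import numpy as np

# Five scaled rows from Out[9]; an illustration, not the full dataset.
rows = {
    'RoLScore': [-0.583004, 0.007954, 1.358174, 1.349949, -1.280892],
    'wjppc':    [-0.661588, -0.470926, 1.726488, 1.542930, -1.227155],
    'gdppc':    [-0.578904, -0.184115, 2.704461, 1.649975, -0.744639],
    'hprop':    [-0.690925, -1.282219, 1.674249, 1.674249, -1.085121],
}

def compare(data, measures, comparators):
    """Pearson correlation of every measure with every comparator."""
    table = {}
    for c in comparators:
        table[c] = {m: float(np.corrcoef(data[m], data[c])[0, 1])
                    for m in measures}
    return table

table = compare(rows, ['RoLScore', 'wjppc'], ['gdppc', 'hprop'])
for comparator, rs in table.items():
    print(comparator, {m: round(r, 3) for m, r in rs.items()})
```

Run on the full dataframe, with all five comparators, a table like this would show at a glance whether the two measures differ in how strongly they relate to each variable.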
