legislatoR

legislatoR is a package for the software environment R that facilitates access to the Comparative Legislators Database (CLD). The package is available through CRAN and GitHub. To install the package from CRAN, type:

install.packages("legislatoR")

You can also install the development version directly from GitHub:

#install.packages("devtools")
devtools::install_github("saschagobel/legislatoR")

Usage

A working Internet connection is required to access the CLD in R, because the data are stored online rather than installed with the package. The package provides one function per table, named after the table and prefixed with legislatoR::get_. To fetch the Core table, use legislatoR::get_core(); for the Political table, use legislatoR::get_political(). Call the package help file via ?legislatoR to get an overview of all functions. Tables are legislature-specific, so a legislature code must be passed as an argument to the function: a three-letter country code for most legislatures, or usa_house/usa_senate for the United States Congress. The table below lists all codes. You can also call legislatoR::cld_content() to get an overview of the CLD's scope and valid country codes.

Legislature                  Code
Austrian Nationalrat         aut
Canadian House of Commons    can
Czech Poslanecka Snemovna    cze
French Assemblée             fra
German Bundestag             deu
Irish Dail                   irl
Scottish Parliament          sco
Spanish Congreso             esp
UK House of Commons          gbr
United States Congress       usa_house / usa_senate
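When fetching tables for several legislatures in a loop, it can help to check each code against the list above first. The following is a minimal sketch and not part of the package: valid_codes simply copies the codes from the table (which may be out of date), and fetch_core_checked() is a hypothetical helper name. legislatoR::cld_content() remains the authoritative source for valid codes.

```r
# Codes copied from the table above (assumption: the list may be outdated;
# legislatoR::cld_content() is the authoritative source)
valid_codes <- c("aut", "can", "cze", "fra", "deu", "irl",
                 "sco", "esp", "gbr", "usa_house", "usa_senate")

# Hypothetical helper (not part of legislatoR): validate, then fetch the Core table
fetch_core_checked <- function(code) {
  if (!(code %in% valid_codes)) {
    stop("Unknown legislature code: ", code, call. = FALSE)
  }
  legislatoR::get_core(legislature = code)  # requires an Internet connection
}

# Usage (needs the package and a connection):
# aut_core <- fetch_core_checked("aut")
```

The check runs before any request is made, so a mistyped code fails immediately with a readable message.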

Working with the legislatoR package in R

The legislatoR package provides an easy-to-use interface to the CLD in R. This brief tutorial presents two use cases that show how to combine the different tables of the database to extract information for analysis. Specifically, we will:

  • explore the distribution of seats in the U.S. Senate during the 116th United States Congress (Jan 2019 - Jan 2021)
  • map the birthplaces of all the members of the Bundestag by political party

The first step is to load the libraries in R. We use several packages for data manipulation and visualization. If you have not used one of these packages before, you will need to install it first.

# load the libraries
library(legislatoR)
library(dplyr) #data manipulation
library(ggplot2) #data visualization
library(ggtext) #text aesthetics in ggplot2
library(ggpol) #geom_parliament() (parliament plots)
library(sf) #spatial vector encodings in R
library(rnaturalearth) #natural earth map data
library(rnaturalearthdata) #natural earth map data

set.seed(1310) # to get consistent results from randomization

The U.S. Senate during the 116th United States Congress

The Core table is at the center of the database structure. This table contains basic demographic information about the legislators, in addition to the joining keys: the Wikipedia page ID (pageid) and the Wikidata ID (wikidataid). These data can be retrieved with the get_core() function. All get_*() functions take the legislature code as an argument (i.e., get_*(legislature = "code")). Each table can be called independently, depending on the use case, and linked to the others through one of the joining keys.

# load US Senate core
legislatoR::get_core(legislature = "usa_senate") %>%
  dplyr::sample_n(10) # get ten random entries

In this case, we will use the Political table to count the seats each party held:

# load US Senate political table for the 116th Congress
us_senate_political <- legislatoR::get_political(legislature = "usa_senate") %>%
  dplyr::filter(session == 116) #filter only legislators from the 116th Congress

dplyr::sample_n(us_senate_political, 10) #print ten random entries

You can employ dplyr verbs to extract the seat counts per party and generate a visual illustration of the distribution:

# get the seat counts per party for parliament plot
us_senate_counts <- us_senate_political %>%
  dplyr::group_by(party) %>% # group by party
  dplyr::summarize(seats = n()) %>% # count rows per party
  dplyr::mutate(colors = dplyr::case_when(party == "D" ~ "#3885D3",
                                          party == "R" ~ "#E02E44",
                                          party == "Independent" ~ "#637684")) # assign party colors for visualization

us_senate_counts
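On data shaped like the Political table, the group_by() and summarize() pair gives the same counts as dplyr::count(). The sketch below uses a small made-up data frame (toy_political is hypothetical and contains no CLD data) so that it runs without a connection.

```r
library(dplyr)

# Hypothetical stand-in for the Political table; values are invented
toy_political <- data.frame(
  pageid = 1:6,
  party  = c("D", "R", "R", "Independent", "D", "R")
)

# Same pattern as above: group, then count rows per group
by_summarize <- toy_political %>%
  group_by(party) %>%
  summarize(seats = n())

# Shorthand that yields the same counts
by_count <- count(toy_political, party, name = "seats")

by_count
```

Either form works for the plot; the explicit version makes it easier to add further summaries later.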

We can use the us_senate_counts data frame as the input for our plot:

# a caption with some Markdown/HTML formatting
caption_plot <- paste0("The 116th US Senate was controlled by the <b style='color:#E02E44'>Republican party</b> with <b style='color:#E02E44'>",
                       us_senate_counts$seats[us_senate_counts$party == "R"], # number of Republican seats
                       " seats</b>.<br>The <b style='color:#3885D3'>Democratic party</b> held <b style='color:#3885D3'>",
                       us_senate_counts$seats[us_senate_counts$party == "D"], # number of Democratic seats
                       " seats</b>. The <b style='color:#637684'>",
                       us_senate_counts$seats[us_senate_counts$party == "Independent"], # number of Independent seats
                       " Independents </b>caucused with the Democrats."
)

# plot parliament seats
ggplot(us_senate_counts) +
  ggpol::geom_parliament(aes(seats = seats, fill = party), color = "black") +
  scale_fill_manual(values = us_senate_counts$colors, labels = us_senate_counts$party) +
  coord_fixed() +
  labs(title = "<b>US Senate membership (116th Congress)</b>",
       caption = caption_plot) +
  theme_void() +
  theme(legend.position = "none",
        plot.title = ggtext::element_markdown(),
        plot.caption = ggtext::element_markdown())

[Figure: parliament plot of US Senate membership in the 116th Congress]


The birthplaces of German parliamentarians

In most instances, you will need to link information from different tables in the database. For instance, in this case we need the parliamentarians' birthplace coordinates and their political party. These two data points are in the Core and Political tables. One easy way to link the tables is by employing dplyr joins.

# assign Political and Core tables to the environment
deu_politicians <- dplyr::left_join(x = legislatoR::get_political(legislature = "deu"),
                                    y = legislatoR::get_core(legislature = "deu"),
                                    by = "pageid") #these two tables can be joined through their Wikipedia page ID

head(deu_politicians) # print the first few observations
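A left join keeps every row of the first table, so the result has as many rows as the Political table, not one row per legislator. The sketch below illustrates this with two invented miniature tables (core_toy and political_toy are hypothetical). It assumes that the Political table can hold several rows per legislator, one for each session served, which is why a distinct() step is needed before plotting one point per person.

```r
library(dplyr)

# Hypothetical miniature tables; the real ones come from get_core() and get_political()
core_toy <- data.frame(
  pageid     = c(1, 2),
  name       = c("Legislator A", "Legislator B"),
  birthplace = c("52.5,13.4", NA)
)
political_toy <- data.frame(
  pageid  = c(1, 1, 2),
  session = c(18, 19, 19),
  party   = c("X", "X", "Y")
)

# One row per legislator-session, with the Core columns repeated
joined_toy <- left_join(political_toy, core_toy, by = "pageid")
nrow(joined_toy)

# Collapse to one row per legislator (keeps the first row of each)
unique_toy <- distinct(joined_toy, pageid, .keep_all = TRUE)
nrow(unique_toy)
```

Note that distinct() keeps the first row it encounters, so a legislator who changed party between sessions is represented by the party of their first listed session.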

The deu_politicians data frame contains all the information from the two tables. We can use these data to extract the latitudes and longitudes of the legislators' birthplace for a map.

# extract birthplace latitudes and longitudes with regular expressions
# (stringr is called via its namespace, so it must be installed)
deu_birthplace_map_df <- deu_politicians %>%
  dplyr::distinct(wikidataid, .keep_all = TRUE) %>% # keep unique entries of legislators
  dplyr::mutate(lat = stringr::str_extract(birthplace, "[-[:digit:]]{1,4}\\.[[:digit:]]+") %>% as.numeric(), # first decimal number
                lon = stringr::str_extract(birthplace, "[-[:digit:]]{1,4}\\.[[:digit:]]+$") %>% as.numeric()) # decimal number at the end
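The regular expressions above take the first decimal number in birthplace as the latitude and the decimal number at the end of the string as the longitude. Assuming the column stores coordinates as a single "lat,lon" string (an assumption inferred from those patterns, not confirmed here), an alternative is to split on the comma in base R. parse_latlon() is a hypothetical helper, and the example values are made up.

```r
# Hypothetical helper: split "lat,lon" strings into numeric columns.
# Assumes exactly one comma separates latitude and longitude;
# anything else (including NA) yields NA for both columns.
parse_latlon <- function(x) {
  parts <- strsplit(as.character(x), ",", fixed = TRUE)
  pick <- function(i) {
    vapply(parts, function(p) {
      if (length(p) == 2) as.numeric(trimws(p[i])) else NA_real_
    }, numeric(1))
  }
  data.frame(lat = pick(1), lon = pick(2))
}

# Made-up example values, including a missing birthplace
parse_latlon(c("52.51667,13.38333", NA, "-33.9,18.4"))
```

Splitting avoids the digit-count limits in the regular expressions, at the cost of relying more strictly on the assumed "lat,lon" layout.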

# define German boundaries
lat1 <- 47; lat2 <- 55.5 ; lon1 <- 5.5; lon2 <- 15.5

germany_sf <- rnaturalearth::ne_countries(scale = "medium", returnclass = "sf", country = "Germany") #get spatial encodings for Germany

ggplot(germany_sf) +
  geom_sf(size = 1) +
  geom_point(data = deu_birthplace_map_df, aes(x = lon, y = lat), size = .25,
             shape = 20, color = "#cc0065", alpha = 0.5) +
  theme_bw() +
  facet_wrap(~party) + # one panel per party
  coord_sf(xlim = c(lon1, lon2), ylim = c(lat1, lat2), expand = FALSE) +
  labs(title = "<b>Birthplace of Bundestag members</b>") +
  theme(plot.margin = grid::unit(c(0, 0, 0, 0), "mm"),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = "none", text = element_text(size = 10),
        panel.grid.minor.y = element_blank(),
        strip.background = element_rect(fill = "white"),
        strip.text.x = element_text(color = "darkblue", face = "bold"),
        plot.title = ggtext::element_markdown(family = "Source Sans Pro"))

[Figure: map of Bundestag members' birthplaces, faceted by party]