Council spending – open data

My local authority recently decided to publish all spending over £500 in an effort to be more transparent. Here’s a post taking an overview of what they’ve published. I’ve used R for the analysis. The dataset doesn’t contain much detail, but if you’ve analysis suggestions, please add them in the comments!

You can download the spending data here. It’s available in pdf (why?!) and xlsx (plain text would be more open).

First off, some packages:

library(tidyverse)
library(readxl)
library(janitor)
library(lubridate)
library(formattable)

Read in the dataset:

df = read_excel("~/Downloads/midlothian_payments_over_500_01042019_to_15092019.xlsx") %>%
   clean_names()

We’ve got six columns:

  • type
  • date_paid
  • supplier
  • amount
  • our_ref
  • financial_year

 

Busiest day:

df %>%
   mutate(day = weekdays(date_paid)) %>%
   group_by(day) %>%
   summarise(transactions = n(),
             thousands_pounds_spent = sum(amount) / 1000) %>%
   mutate(day = fct_relevel(day, rev(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")))) %>%
   gather(var, value, -day) %>%
   ggplot(aes(day, value)) +
   geom_col() +
   facet_wrap(~var, scales = "free_x") +
   coord_flip() +
   scale_y_continuous(labels = scales::comma) +
   labs(title = "Busiest day of the week",
        x = "",
        y = "")

day

Busiest time of year:

df %>%
   mutate(dow = weekdays(date_paid),
          dow = if_else(dow == "Tuesday" | dow == "Friday", "Tue/Fri", "Other")) %>%
   group_by(date_paid, dow) %>%
   summarise(transactions = n(),
             pounds_spent = sum(amount)) %>%
   gather(var, value, -date_paid, -dow) %>%
   ggplot(aes(date_paid, value, colour = dow)) +
   geom_point() +
   facet_wrap(~var, scales = "free_y") +
   scale_y_log10(labels = scales::comma) +
   scale_colour_brewer(type = "qual", palette = "Set2") +
   labs(title = "Busiest day of the year",
        x = "",
        y = "")

year

Top 10 payees by value:

df %>%
   group_by(supplier) %>%
   summarise(pounds_spent = sum(amount),
             transactions = n()) %>%
   arrange(desc(pounds_spent)) %>%
   top_n(n = 10, wt = pounds_spent) %>%
   mutate(pounds_spent = currency(pounds_spent, "£", digits = 0L)) %>%
   formattable(list(`pounds_spent` = color_bar("#FA614B"),
                    `transactions` = color_bar("lightpink")))

Screenshot from 2019-10-22 11-59-36

In Scotland local authorities collect water charges on behalf of the water authority, which they then pass on. It’s not surprise that Scottish Water is the biggest supplier.

Top 10 payees by frequency:

df %>%
   group_by(supplier) %>%
   summarise(pounds_spent = sum(amount),
             transactions = n()) %>%
   arrange(desc(transactions)) %>%
   top_n(n = 10, wt = transactions) %>%
   mutate(pounds_spent = currency(pounds_spent, "£", digits = 0L)) %>%
   formattable(list(`pounds_spent` = color_bar("lightpink"),
                    `transactions` = color_bar("#FA614B")))

Screenshot from 2019-10-22 11-59-46

As a final note, writing this post is reminding me again I should be moving away from wordpress because incorporating code and output would be much easier with mark/blog down! As always, legacy is holding me back.

1 Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s