ASER Trends: Year-over-Year Changes

Overview

Objective: Estimate how much learning outcomes change between cohorts in the Annual Status of Education Report (ASER) data, and assess whether a location-based difference-in-differences (DiD) strategy applied to these data would yield precise estimates.

A location-based DiD would compare year-over-year changes in outcomes across locations, exploiting variation in when an intervention rolled out across areas. Precision depends on how much of the total variation in outcomes is within-location over time (signal) versus across locations (noise absorbed by location fixed effects). If year-over-year changes are small relative to cross-location differences, the approach may have limited statistical power.
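
To make the within-versus-across distinction concrete, the sketch below splits the total sum of squares of an outcome into a between-state part (state means around the grand mean) and a within-state part (observations around their own state mean). The data frame `toy`, its column `score`, and all values are invented for illustration and are not ASER figures. Base R is used so the block runs without extra packages.

```r
# Hypothetical toy panel: 3 states x 4 years (invented values, not ASER data)
toy <- data.frame(
  State = rep(c("A", "B", "C"), each = 4),
  year  = rep(2007:2010, times = 3),
  score = c(40, 42, 41, 43,   # state A
            60, 61, 59, 62,   # state B
            75, 74, 76, 77)   # state C
)

# State mean, repeated on every row belonging to that state
state_mean <- ave(toy$score, toy$State)

# Sums of squares: total = between + within when grouping by state
ss_total   <- sum((toy$score - mean(toy$score))^2)
ss_between <- sum((state_mean - mean(toy$score))^2)
ss_within  <- sum((toy$score - state_mean)^2)

# Share of total variation that is within-state over time
within_share <- ss_within / ss_total
round(within_share, 3)
```

In this invented panel almost all variation sits between states, so state fixed effects would absorb most of it and only a small amount of over-time movement would remain. Whether the ASER data look like this is the empirical question the analyses below address.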

Data: State-level ASER survey data from 2005 to 2014, drawn from the publicly available dataset at github.com/dougj892/public-datasets. All four outcomes reported in the original ASER reports are examined: the share of Std 3 students reading at Std 1 level, the share of Std 5 students reading at Std 2 level, the share of Std 3 students who can subtract, and the share of Std 5 students who can divide.

Analyses:

  1. Year-over-year changes: For each state and year, the change from the prior year is computed for each outcome (undefined for a state's first survey year). Histograms show the distribution of these annual changes, giving a sense of how much scores typically shift from one cohort to the next.
  2. Pairwise state differences: For each year from 2007 onward, absolute differences between all pairs of states are computed. Histograms show the distribution of these cross-location gaps. Comparing the two distributions indicates whether within-location change is large enough relative to cross-location variation to support a precise DiD estimate.
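
One possible way to put the two distributions side by side is to compare a typical annual change with a typical cross-state gap. The following is a minimal sketch on invented data: `toy`, `score`, and every value are hypothetical, and the use of medians is an assumption of this sketch rather than something the analysis itself reports.

```r
# Hypothetical toy panel: 3 states x 3 years (invented values, not ASER data)
toy <- data.frame(
  State = rep(c("A", "B", "C"), each = 3),
  year  = rep(2007:2009, times = 3),
  score = c(40, 42, 41,
            60, 61, 59,
            75, 74, 76)
)
toy <- toy[order(toy$State, toy$year), ]

# 1. Year-over-year change within each state (NA for each state's first year)
toy$d_score <- ave(toy$score, toy$State, FUN = function(x) c(NA, diff(x)))

# 2. Absolute gap between every unordered pair of states, year by year;
#    dist() on a numeric vector returns all pairwise absolute differences
pair_gaps <- unlist(lapply(split(toy$score, toy$year),
                           function(x) as.vector(dist(x))))

median_abs_change <- median(abs(toy$d_score), na.rm = TRUE)
median_pair_gap   <- median(pair_gaps)

# A ratio well below 1 means annual changes are small next to cross-state gaps
median_abs_change / median_pair_gap
```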

Code
library(tidyverse)
Code
# Load the state-level ASER series and keep survey years through 2014
aser <- read_csv("../data/raw/ASER trends over time.csv") |>
  filter(year <= 2014) |>
  # Sort so that lag() refers to the previous survey row of the same state
  arrange(State, year) |>
  group_by(State) |>
  # Change from the prior row; NA for each state's first year
  mutate(
    d_std3_read_std1_all = std3_read_std1_all - lag(std3_read_std1_all),
    d_std5_read_std2_all = std5_read_std2_all - lag(std5_read_std2_all),
    d_std3_subtract_all  = std3_subtract_all  - lag(std3_subtract_all),
    d_std5_divis_all     = std5_divis_all     - lag(std5_divis_all)
  ) |>
  ungroup()
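
Because `lag()` returns the previous row within each state, the differences above are true one-year changes only if every state has a row for each consecutive year. The sketch below shows one way to check this and to blank out differences that span a gap. It uses invented data with a deliberately missing year, and the names `toy`, `year_gap`, `d_raw`, and `d_score` are hypothetical. It is written in base R so it runs on its own.

```r
# Hypothetical toy panel with a gap: state B has no 2008 row (invented values)
toy <- data.frame(
  State = c("A", "A", "A", "B", "B"),
  year  = c(2007, 2008, 2009, 2007, 2009),
  score = c(40, 42, 41, 60, 59)
)
toy <- toy[order(toy$State, toy$year), ]

# Years elapsed since the previous row of the same state (NA for the first row)
toy$year_gap <- ave(toy$year, toy$State, FUN = function(x) c(NA, diff(x)))

# Raw row-to-row difference, the base R equivalent of x - lag(x) by state
toy$d_raw <- ave(toy$score, toy$State, FUN = function(x) c(NA, diff(x)))

# Keep the difference only when the previous row is exactly one year earlier
toy$d_score <- ifelse(toy$year_gap == 1, toy$d_raw, NA)

toy
```

If no state-year rows are missing in the real data, `year_gap` would be 1 everywhere except each state's first year and the guarded differences would match the raw ones.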
Code
# The four outcome columns examined
outcome_vars <- c(
  "std3_read_std1_all", "std5_read_std2_all",
  "std3_subtract_all",  "std5_divis_all"
)

# Pairwise gaps use survey years from 2007 onward
aser_slim <- aser |>
  filter(year >= 2007) |>
  select(year, State, all_of(outcome_vars))

# Self-join on year to pair every state with every other state in that year.
# dplyr >= 1.1.0 warns about a many-to-many match here; that is expected.
aser_pairs <- aser_slim |>
  inner_join(aser_slim, by = "year", suffix = c("_1", "_2")) |>
  # Keep each unordered pair once and drop self-pairs
  filter(State_1 < State_2) |>
  mutate(
    abs_diff_std3_read_std1_all = abs(std3_read_std1_all_1 - std3_read_std1_all_2),
    abs_diff_std5_read_std2_all = abs(std5_read_std2_all_1 - std5_read_std2_all_2),
    abs_diff_std3_subtract_all  = abs(std3_subtract_all_1  - std3_subtract_all_2),
    abs_diff_std5_divis_all     = abs(std5_divis_all_1     - std5_divis_all_2)
  ) |>
  select(year, State_1, State_2, starts_with("abs_diff_"))
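
The self-join pairs every state with every state in the same year, and the `State_1 < State_2` filter then drops self-pairs and mirror-image duplicates, so n states should yield n(n-1)/2 pairs per year. A minimal sanity check of that count is sketched below with base R `merge()` standing in for `inner_join()`. The data frame `toy` and its values are invented for illustration.

```r
# Hypothetical toy data: 4 states in each of 2 years (invented values)
toy <- data.frame(
  year  = rep(c(2007, 2008), each = 4),
  State = rep(c("A", "B", "C", "D"), times = 2),
  score = c(40, 60, 75, 52, 42, 61, 74, 55)
)

# Self-join on year: every state paired with every state, including itself
pairs <- merge(toy, toy, by = "year", suffixes = c("_1", "_2"))

# Keep each unordered pair once; this also removes State_1 == State_2
pairs <- pairs[pairs$State_1 < pairs$State_2, ]
pairs$abs_diff <- abs(pairs$score_1 - pairs$score_2)

# Compare the observed pair count per year with choose(n, 2)
n_states       <- tapply(toy$State, toy$year, function(s) length(unique(s)))
pairs_per_year <- table(pairs$year)
counts_ok <- all(as.vector(pairs_per_year) == choose(as.vector(n_states), 2))
counts_ok
```

A count below `choose(n, 2)` in some year would point to states missing from that year, and a count above it would point to duplicated state-year rows.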
Code
ggplot(aser, aes(x = d_std3_read_std1_all)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(
    x = "Change from prior year",
    y = "Count",
    title = "Std 3 reading at Std 1 level (all): annual change"
  ) +
  theme_minimal()

Year-over-year change in Std 3 reading at Std 1 level (all)
Code
ggplot(aser, aes(x = d_std5_read_std2_all)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(
    x = "Change from prior year",
    y = "Count",
    title = "Std 5 reading at Std 2 level (all): annual change"
  ) +
  theme_minimal()

Year-over-year change in Std 5 reading at Std 2 level (all)
Code
ggplot(aser, aes(x = d_std3_subtract_all)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(
    x = "Change from prior year",
    y = "Count",
    title = "Std 3 subtraction (all): annual change"
  ) +
  theme_minimal()

Year-over-year change in Std 3 subtraction (all)
Code
ggplot(aser, aes(x = d_std5_divis_all)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(
    x = "Change from prior year",
    y = "Count",
    title = "Std 5 division (all): annual change"
  ) +
  theme_minimal()

Year-over-year change in Std 5 division (all)
Code
ggplot(aser_pairs, aes(x = abs_diff_std3_read_std1_all)) +
  geom_histogram(bins = 30, fill = "coral", color = "white") +
  labs(
    x = "Absolute difference between states",
    y = "Count",
    title = "Std 3 reading at Std 1 level (all): pairwise abs. difference"
  ) +
  theme_minimal()

Pairwise absolute difference: Std 3 reading at Std 1 level (all)
Code
ggplot(aser_pairs, aes(x = abs_diff_std5_read_std2_all)) +
  geom_histogram(bins = 30, fill = "coral", color = "white") +
  labs(
    x = "Absolute difference between states",
    y = "Count",
    title = "Std 5 reading at Std 2 level (all): pairwise abs. difference"
  ) +
  theme_minimal()

Pairwise absolute difference: Std 5 reading at Std 2 level (all)
Code
ggplot(aser_pairs, aes(x = abs_diff_std3_subtract_all)) +
  geom_histogram(bins = 30, fill = "coral", color = "white") +
  labs(
    x = "Absolute difference between states",
    y = "Count",
    title = "Std 3 subtraction (all): pairwise abs. difference"
  ) +
  theme_minimal()

Pairwise absolute difference: Std 3 subtraction (all)
Code
ggplot(aser_pairs, aes(x = abs_diff_std5_divis_all)) +
  geom_histogram(bins = 30, fill = "coral", color = "white") +
  labs(
    x = "Absolute difference between states",
    y = "Count",
    title = "Std 5 division (all): pairwise abs. difference"
  ) +
  theme_minimal()

Pairwise absolute difference: Std 5 division (all)