Coursework
A credit risk report written by a pipeline of seven Claude agents.
The course wanted a credit risk write-up. Logistic regression, a decision tree, a random forest, on the UCI Taiwan 2005 default panel. Thirty thousand cardholders, 22 percent default rate. The kind of dataset that gets assigned because every step in the process is well documented and the right answer is already in the literature.
So the modeling was not where the project went.
The seven-agent pipeline
Seven Claude Code agents. Each one has a single responsibility. They communicate only by writing files to the repo, never directly to each other.
A1 Style Forensics reads the AI detection literature and writes a constraint file. A2 Domain Voice reads Basel, SR 11-7, and FICO disclosures, and writes a credit analyst tone profile. A3 fits the models in R with tidymodels under a fixed seed. A4 drafts the report under the A1 and A2 constraints. A5 grep-checks every decimal in the draft against the actual numbers in results/. A6 runs a regex over the draft for banned words and stylistic tells, and only signs off after three consecutive clean passes. A7 renders the PDF and verifies it lands inside the eight-to-twelve-page rubric.
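The A5 and A6 checks are simple enough to sketch. What follows is a minimal Python illustration, not the agents' actual code. The function names, the decimal pattern, and the two-word banned list are all placeholders; the real constraint file is not reproduced here. It also assumes a draft number must appear in results/ as the exact same string, so rounding differences would be flagged.

```python
import re
from pathlib import Path

# Plain decimals such as 0.7655 or 1.92 (assumed number format).
DECIMAL = re.compile(r"\b\d+\.\d+\b")

# Placeholder banned-word list; the real one lives in the A1 constraint file.
BANNED = re.compile(r"\b(?:delve|tapestry)\b", re.IGNORECASE)


def decimals_in(text: str) -> set[str]:
    """Return every decimal literal found in the text."""
    return set(DECIMAL.findall(text))


def unsupported_decimals(draft_text: str, results_dir: str) -> list[str]:
    """A5-style check: decimals in the draft that appear in no results file."""
    known: set[str] = set()
    for path in Path(results_dir).rglob("*"):
        if path.is_file():
            known |= decimals_in(path.read_text(encoding="utf-8", errors="ignore"))
    return sorted(decimals_in(draft_text) - known)


def signs_off(draft_versions: list[str], required: int = 3) -> bool:
    """A6-style check: True once `required` consecutive versions are clean."""
    streak = 0
    for text in draft_versions:
        # Any banned word resets the streak to zero.
        streak = 0 if BANNED.search(text) else streak + 1
        if streak >= required:
            return True
    return False
```

An empty list from unsupported_decimals means every number in the draft traces back to a file in results/. Resetting the streak on any hit is one reading of "three consecutive clean passes"; the source does not say how passes are counted.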
The rule that no agent both writes and checks its own output is enforced by sentinel files in workflow/status.log. An agent that finishes its job appends a sentinel; the next agent in line blocks until that sentinel exists.
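The sentinel handoff can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's implementation: the "A3 DONE" line format, the function names, the polling interval, and the timeout are all hypothetical. Only the workflow/status.log path comes from the description above.

```python
import time
from pathlib import Path

STATUS_LOG = Path("workflow/status.log")  # path named in the write-up


def append_sentinel(agent: str, log: Path = STATUS_LOG) -> None:
    """Record that an agent finished by appending one line to the log."""
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"{agent} DONE\n")  # hypothetical sentinel format


def has_sentinel(agent: str, log: Path = STATUS_LOG) -> bool:
    """True if the agent's sentinel line is already in the log."""
    if not log.exists():
        return False
    return f"{agent} DONE" in log.read_text(encoding="utf-8").splitlines()


def wait_for(agent: str, log: Path = STATUS_LOG,
             poll_seconds: float = 1.0, timeout: float = 3600.0) -> None:
    """Block until the upstream agent's sentinel exists, or time out."""
    deadline = time.monotonic() + timeout
    while not has_sentinel(agent, log):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no sentinel from {agent} within {timeout}s")
        time.sleep(poll_seconds)
```

Under this sketch, A4 would call wait_for("A3") before drafting and append_sentinel("A4") when done. Because the log is only ever appended to, rerunning a finished stage adds a duplicate line but does not change what downstream agents see.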
The models themselves
Three. Logistic regression as the audit-friendly baseline. A pruned tree for an interpretable single path. A random forest tuned on a five-fold cross-validation grid.
The forest wins on AUC and Brier (0.7655 and 0.1394). Logistic regression reports an odds ratio of 1.92 per unit increase in PAY_0, the most recent month’s payment status. The strongest permutation importance is avg_delinquency, narrowly above PAY_0 itself.
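For readers who want the arithmetic behind those numbers, here is a small Python sketch. It assumes PAY_0 enters the logistic model as a single numeric term, and it uses the rounded 22 percent base rate, so the baseline is approximate. The function and variable names are illustrative, not taken from the project.

```python
import math


def brier_score(outcomes, probabilities):
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    if len(outcomes) != len(probabilities) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


ODDS_RATIO_PAY_0 = 1.92  # reported in the text

# An odds ratio is exp(coefficient), so the log-odds coefficient is its log.
coefficient = math.log(ODDS_RATIO_PAY_0)  # about 0.652

# Under the linear-term assumption, a two-unit increase multiplies
# the odds by the ratio squared.
two_step_multiplier = ODDS_RATIO_PAY_0 ** 2  # about 3.69

# Predicting the base rate p for everyone scores p * (1 - p) on data
# whose default rate is p.
BASE_RATE = 0.22  # rounded rate from the text
constant_baseline = BASE_RATE * (1 - BASE_RATE)  # about 0.1716
```

Against that rough constant-rate baseline of about 0.17, the forest's 0.1394 is a real but moderate gain. The comparison is only indicative, since the default rate of the evaluation split is not stated.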
The whole pipeline runs in roughly an hour, cold start to signed-off PDF, under one seed. It’s idempotent. That part was the point.