TECHIN 510 — Spring 2026

Working with Data

Week 4: Pandas, Visualization & AI-Assisted Analysis
University of Washington • Global Innovation Exchange
01 / 25
What You Will Learn

Learning Objectives

1
Conduct a structured stakeholder interview
Extract decision-making needs from real users
2
Write a JTBD statement
"When [person] is [situation], they want [motivation] so they can [outcome]"
3
Build a Streamlit dashboard
KPI cards, columns, and expanders for a decision-maker audience
4
Create interactive filters
st.selectbox, st.slider, and st.date_input
02 / 25
What You Will Learn

Learning Objectives (continued)

5
Write assert statements using PRIMM
Verify dashboard logic with assertion-based testing
6
Arrange visualizations into a story-first layout
Situation → Data → Insight → Action

Capstone Objectives

C
Design a story-first dashboard layout
Map JTBD elements to specific dashboard components
D
Write 2+ assert statements
Verify data transformation correctness and explain why each matters
03 / 25
Part 1

The Data Pipeline

From physical to digital
04 / 25
Week 4 Pipeline

The Data Pipeline

RAW DATA (CSV, JSON, API response) → TRANSFORMED (pandas groupby, filtering, joins, aggregation) → VISUALIZATION (Plotly chart, map, histogram) → DECISION (action taken)

Week 4 data pipeline: nine stages (Week 3) collapsed to four — complexity shifted, not removed.

The hard problems shifted from physical (noise, voltage, timing) to semantic (column meaning, NaN values, aggregation correctness).

05 / 25
The Shift

From Physical to Semantic

Semantic problem: What does this column actually mean? Why are there 847 NaN values in a dataset that’s supposed to be complete?

Key question: Did the AI write that aggregation correctly, or did it silently drop the rows it couldn’t handle?
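A minimal pandas sketch of that silent-drop question, using hypothetical toy data: `groupby` drops rows whose key is NaN by default, so a row can vanish from an aggregation without any error.

```python
import pandas as pd

# Hypothetical 4-row permits table; one row has a missing neighborhood
df = pd.DataFrame({
    "neighborhood": ["Ballard", "Fremont", None, "Ballard"],
    "value": [100, 200, 300, 400],
})

counts = df.groupby("neighborhood").size()  # dropna=True by default
print(counts.sum())   # 3, not 4; the NaN-keyed row vanished silently
```

Passing `dropna=False` to `groupby` keeps the NaN bucket visible, which is one way to make the drop explicit instead of silent.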

06 / 25
Jobs-to-be-Done

JTBD Drives the Pipeline

“When the GIX Facilities Coordinator is fielding twenty onboarding questions per new cohort, they want automated answers so they can reclaim three hours every quarter.”

1
The decision at the end of the pipeline
Everything — CSV, pandas code, chart — exists to serve that decision
2
If you don’t know the decision first…
You don’t know what data you need, and you can’t judge if the AI gave you the right aggregation
3
JTBD contains your spec
“Allocating inspectors by neighborhood” tells you the unit is count, not dollar value
07 / 25
Part 2

Prompt Engineering for Data

From vague to precise
08 / 25
Prompt Comparison

Bad Prompt vs. Good Prompt

Vague Prompt
make a bar chart of the
building permits data

AI guesses column names, aggregation method, axis labels, and chart type. Every guess is a silent failure point.

CTOC-Specified Prompt
Using the Seattle building permits
DataFrame with columns [permit_type,
neighborhood, application_date,
issue_date, value, contractor],
create a bar chart showing the count
of permits by neighborhood.
Group by 'neighborhood', count rows
(not sum of value). Sort descending,
show top 15. Use Plotly Express.

The difference: The AI no longer guesses. Every ambiguous decision is now specified.

09 / 25
When AI Breaks Quietly

AI Failure Modes

1
Hallucinated column name
df['permit_category'] → KeyError (or silently wrong if column exists with different data)
2
Wrong aggregation
.sum() on ‘value’ instead of .count() on rows — chart looks fine, scale is completely wrong
3
Silent NaN drop
df.dropna() called before grouping — neighborhoods with incomplete records vanish, totals no longer match

The worst kind of bug: the one that doesn’t crash. The code runs, the chart renders, and the numbers are wrong.

Agent analog: Hallucinated column = agent calls wrong tool. Wrong aggregation = agent chains tools in wrong order. Silent NaN drop = agent swallows tool errors. Same failure modes, different context.
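Failure mode 2 in runnable form, with made-up toy numbers rather than the real permits data: both aggregations execute cleanly, but one answers a dollar question while the JTBD asked a count question.

```python
import pandas as pd

df = pd.DataFrame({
    "neighborhood": ["Ballard", "Ballard", "Fremont"],
    "value": [450_000, 1_200_000, 300_000],
})

wrong = df.groupby("neighborhood")["value"].sum()  # dollar value of permits
right = df.groupby("neighborhood").size()          # number of permits

print(wrong["Ballard"])  # 1650000, a plausible-looking bar in the wrong unit
print(right["Ballard"])  # 2, the count the JTBD actually asked for
```

Neither line raises an error, which is exactly why this belongs in the "breaks quietly" category.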

10 / 25
Prompt Spec

The CTOC Template

C
Context
Dataset name + actual column names (prevents hallucination)
T
Task
Specific transformation — group by X, count/sum/mean Y
O
Output
Chart type, axis labels, sort order, title
C
Constraints
Handle NaN by dropping/filling, top N only, date range X to Y

Four fields. That’s it. The 40% planning phase applied directly to data work.

CTOC: A Chart-Level Functional Spec. Same structure returns in Week 7: Context, Task, Output, Constraints — CTOC is a mini system prompt for each visualization.

11 / 25
Code Example

CTOC: Seattle Building Permits

Context: Seattle building permits DataFrame.
         Columns: permit_type, neighborhood,
         application_date, issue_date, value, contractor

Task:    Group by neighborhood. Count the number of
         rows (permits) per neighborhood.
         Do NOT sum the value column.

Output:  Plotly Express bar chart.
         Title: "Building Permits by Neighborhood"
         X-axis: "Neighborhood", Y-axis: "Permit Count"
         Sort descending by count. Show top 15 only.
         Rotate x-axis labels 45 degrees.

Constraints: Drop rows where neighborhood is NaN
             before grouping. Log how many rows
             were dropped.
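One way the pandas half of this spec might come back from the AI (a sketch, not course-provided code; `permits_by_neighborhood` is an illustrative name, and the Plotly Express call from the Output section is left as a comment so the transformation stays in focus):

```python
import pandas as pd

def permits_by_neighborhood(df: pd.DataFrame, top_n: int = 15) -> pd.DataFrame:
    # Constraint: drop rows with NaN neighborhood and log how many were lost
    before = len(df)
    clean = df.dropna(subset=["neighborhood"])
    print(f"Dropped {before - len(clean)} rows with missing neighborhood")

    # Task: count rows per neighborhood, NOT sum of the value column
    return (clean.groupby("neighborhood")
                 .size()
                 .reset_index(name="permit_count")
                 .sort_values("permit_count", ascending=False)
                 .head(top_n))

# The Output section would then render the result, e.g.:
# px.bar(counts, x="neighborhood", y="permit_count",
#        title="Building Permits by Neighborhood")
```

Every line maps back to a CTOC field, which is what makes the generated code checkable instead of a guess.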
12 / 25
The Full Flow

JTBD → Spec → Prompt → Verify

JTBD → SPEC (40%) → PROMPT (20%) → VERIFY (40%)

JTBD: "When a city planner reviews permits..." → defines the aggregation
SPEC (40% planning): column names + aggregation + chart type + NaN fills
PROMPT (20% code): CTOC → AI generates code
VERIFY (40% verification): assert len ≤ 15, assert monotonic, assert dropped ≥ 0
13 / 25
Big Idea

Context Engineering

“You are not just writing a better prompt. You are engineering the context the model operates in so that its failure modes are structurally eliminated.”

If you give the model the real column names, it cannot hallucinate fake ones. The spec constrains the AI’s generation space.

This same principle applies in Week 7 when you write system prompts for the Anthropic API.

14 / 25
Part 3

Visualization Landscape

Choosing the right library for the job
15 / 25
Library Landscape

Four Libraries, Four Jobs

M
Matplotlib
Static charts, publication figures. import matplotlib.pyplot as plt — No interactivity.
P
Plotly Express (this course)
Web dashboards, production. import plotly.express as px — Full hover, zoom, pan.
A
Altair
Declarative, grammar of graphics. import altair as alt — Good interactivity.
S
Streamlit native
Quick prototypes. st.line_chart(), st.bar_chart() — Minimal, fast to write.

Rule of thumb: Start with Streamlit native to explore, switch to Plotly before you show anyone.

16 / 25
Part 4

Story-First Dashboard

Situation → Evidence → Action
17 / 25
App Skeleton

Page Config + @st.cache_data

import pandas as pd
import streamlit as st

st.set_page_config(
    page_title="Decision-Maker Dashboard",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)

@st.cache_data
def load_data(uploaded_file):
    df = pd.read_csv(uploaded_file)
    return df

layout="wide" is non-negotiable for dashboards. The default narrow column is wrong for chart-heavy layouts.

@st.cache_data caches the return value. Without it, every slider move re-reads the CSV from disk.

18 / 25
Dashboard Components

KPI Cards — st.columns + st.metric

col1, col2, col3 = st.columns(3)
col1.metric("Total Leads", 1247, delta="+12%")
col2.metric("Cost per Lead", "$23.50",
            delta="-8%", delta_color="inverse")
col3.metric("Conversion Rate", "3.2%", delta="+0.5%")

st.columns(3) splits horizontal space into equal-width containers. Three columns, three KPI cards.

delta_color="inverse" — a decrease in cost is good news. Business logic in one keyword argument.
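The numbers fed into those metric calls come straight out of pandas. A hedged sketch with made-up current figures; `prev_total` and `prev_cpl` are hypothetical prior-period baselines, not values from the course dataset:

```python
import pandas as pd

df = pd.DataFrame({"leads": [600, 647], "spend": [14_000.0, 15_345.0]})
prev_total, prev_cpl = 1113, 25.54   # hypothetical prior-period baselines

total_leads = int(df["leads"].sum())                 # 1247
cost_per_lead = df["spend"].sum() / total_leads      # ~23.53
delta_leads = f"{(total_leads - prev_total) / prev_total:+.0%}"
delta_cpl = f"{(cost_per_lead - prev_cpl) / prev_cpl:+.0%}"

# The values then plug into the metric calls above:
# col1.metric("Total Leads", total_leads, delta=delta_leads)
# col2.metric("Cost per Lead", f"${cost_per_lead:.2f}",
#             delta=delta_cpl, delta_color="inverse")
```

Computing the deltas in code, rather than hard-coding strings, keeps the KPI cards honest when the underlying CSV changes.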

19 / 25
Layout Pattern

Story-First vs. Unclear Layout

Story-first
Streamlit dashboard with headline asking whether Q3 qualified leads are on track, KPI row, bar chart by channel, and a recommended next step in an info callout.

Same synthetic data: headline and caption state the decision, KPIs answer “what happened?,” chart supports “why?,” and an action closes the loop.

Unclear story
Streamlit dashboard with generic title Dashboard, sidebar widgets first, bar chart before KPI metrics and extra filters, no recommendation or decision question.

Same data: widgets and chart come first; the viewer never sees what decision this supports or what to do next.

A dashboard without an action layer is a report. A dashboard with one is a decision tool.

20 / 25
Part 5

Trust But Verify

Assertion-Based Verification
21 / 25
Growing Rigor

The Verification Ladder

1
Week 1: Smoke test “Does the app run?”
3
Week 3: Physical test “Does the number match reality?” (thermometer check)
4
Week 4: Assert ← You are here “Does the data meet expectations?” (code checks code)
8
Week 8: Automated test “Does the code pass every time?” (Vitest / pytest)

AI wrote the KPI in 3 lines. It takes a larger block of asserts to know whether to trust it. That ratio doesn’t change.

22 / 25
Assert Framework

Four Assert Categories

1. Shape & Existence
assert len(df) > 0
assert 'leads' in df.columns

Does the data exist? Right columns?

2. Type
assert df['leads'].dtype in ['int64', 'float64']
assert df['date'].dtype == 'datetime64[ns]'

Date-as-string bugs live here.

3. Range
assert df['leads'].min(skipna=False) >= 0
assert df['spend'].max() < 1_000_000

Catches NaN bugs and data errors; skipna=False makes min() propagate NaN instead of silently skipping it.

4. Relational
assert grouped['leads'].sum().sum() == df['leads'].sum()
assert len(filtered) <= len(df)

Do the parts add up to the whole?
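All four categories together on a tiny synthetic frame (column names mirror the examples above; the data itself is made up):

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2026-04-01", "2026-04-02", "2026-04-02"]),
    "channel": ["ads", "ads", "email"],
    "leads": [120, 95, 40],
    "spend": [2400.0, 1900.0, 300.0],
})
grouped = df.groupby("channel")
filtered = df[df["channel"] == "ads"]

assert len(df) > 0 and "leads" in df.columns        # 1. shape & existence
assert df["leads"].dtype in ("int64", "float64")    # 2. type
assert df["date"].dtype == "datetime64[ns]"
assert df["leads"].min(skipna=False) >= 0           # 3. range (skipna=False so NaN propagates)
assert df["spend"].max() < 1_000_000
assert grouped["leads"].sum().sum() == df["leads"].sum()  # 4. relational
assert len(filtered) <= len(df)
print("all four categories pass")
```

Eight asserts for a three-row frame may look heavy, but each one eliminates a distinct silent-failure mode from the previous slide.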

23 / 25
Live Demo

The NaN Bug

1
AI generates a KPI calculation
total_leads = df['leads'].sum() — looks perfect, runs clean
2
Output looks plausible
Total leads: 842.0 — Cost per lead: $47.51 — no errors
3
Plant a NaN: df.loc[5, 'leads'] = float('nan')
Simulates a missing value from the data source
4
Assert catches it: assert df['leads'].min(skipna=False) >= 0
pandas' min() skips NaN by default; skipna=False makes it return NaN, NaN ≥ 0 is False, and the assert fires
5
Fix the transformation, not the assert
df = df.dropna(subset=['leads']) — then the assert passes cleanly

Never delete an assert because it’s inconvenient. That’s like unplugging a smoke detector because it keeps going off.
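The same five steps as a runnable script, using a synthetic leads column. Note the `skipna=False`: pandas' `min()` skips NaN by default, so without it the range assert would never see the planted value.

```python
import pandas as pd

df = pd.DataFrame({"leads": [120.0, 95.0, 230.0, 80.0, 190.0, 127.0]})
print(df["leads"].sum())                     # 842.0, looks perfect

df.loc[5, "leads"] = float("nan")            # plant the missing value
bad_min = df["leads"].min(skipna=False)      # NaN propagates into the result
assert not (bad_min >= 0)                    # NaN >= 0 is False: the assert would fire

df = df.dropna(subset=["leads"])             # fix the transformation, not the assert
assert df["leads"].min(skipna=False) >= 0    # now passes cleanly
```

The fix removes the bad row explicitly and logs nothing away in secret, so the assert keeps guarding future loads.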

24 / 25
Testing & Full-Stack Practice

Decision Quality & Fairness

E2E Consistency Check
# Does the KPI match the chart?
assert kpi_total == chart_df['leads'].sum()
# Does the filter actually filter?
assert set(filtered_df['region']) == {'West'}
Who Disappears?

Every dropna() is a design decision. Which rows vanish? Are they random, or do they correlate with a demographic?

Check: does your cleaning step disproportionately remove data from one group?
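A quick way to ask that question of any dataset (toy data; groups "A" and "B" stand in for whatever demographic column you actually have):

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "leads": [10.0, 12.0, 9.0, 11.0, None, None],
})

# Fraction of each group's rows that dropna(subset=['leads']) would remove
drop_rate = df["leads"].isna().groupby(df["group"]).mean()
print(drop_rate)   # A: 0.0, B: 0.667; the loss is not evenly distributed
```

If one group's drop rate is far above the others, dropna() is not a neutral cleaning step for that dataset; consider imputing, or at least reporting the gap alongside the dashboard.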

Verification ladder rung 3 (assert). Next: security checks in Week 6.

25 / 25