Capstone Objectives
The hard problems shifted from physical (noise, voltage, timing) to semantic (column meaning, NaN values, aggregation correctness).
Semantic problem: What does this column actually mean? Why are there 847 NaN values in a dataset that’s supposed to be complete?
Key question: Did the AI write that aggregation correctly, or did it silently drop the rows it couldn’t handle?
“When the GIX Facilities Coordinator is fielding twenty onboarding questions per new cohort, they want automated answers so they can reclaim three hours every quarter.”
"make a bar chart of the building permits data"
AI guesses column names, aggregation method, axis labels, and chart type. Every guess is a silent failure point.
"Using the Seattle building permits DataFrame with columns [permit_type, neighborhood, application_date, issue_date, value, contractor], create a bar chart showing the count of permits by neighborhood. Group by 'neighborhood', count rows (not sum of value). Sort descending, show top 15. Use Plotly Express."
The difference: The AI no longer guesses. Every ambiguous decision is now specified.
The worst kind of bug: the one that doesn’t crash. The code runs, the chart renders, and the numbers are wrong.
Agent analog: Hallucinated column = agent calls wrong tool. Wrong aggregation = agent chains tools in wrong order. Silent NaN drop = agent swallows tool errors. Same failure modes, different context.
Four fields. That’s it. The 40% planning phase applied directly to data work.
CTOC: A Chart-Level Functional Spec. Same structure returns in Week 7: Context, Task, Output, Constraints — CTOC is a mini system prompt for each visualization.
Context: Seattle building permits DataFrame. Columns: permit_type, neighborhood, application_date, issue_date, value, contractor
Task: Group by neighborhood. Count the number of rows (permits) per neighborhood. Do NOT sum the value column.
Output: Plotly Express bar chart. Title: "Building Permits by Neighborhood". X-axis: "Neighborhood", Y-axis: "Permit Count". Sort descending by count. Show top 15 only. Rotate x-axis labels 45 degrees.
Constraints: Drop rows where neighborhood is NaN before grouping. Log how many rows were dropped.
“You are not just writing a better prompt. You are engineering the context the model operates in so that its failure modes are structurally eliminated.”
If you give the model the real column names, it cannot hallucinate fake ones. The spec constrains the AI’s generation space.
This same principle applies in Week 7 when you write system prompts for the Anthropic API.
Rule of thumb: Start with Streamlit native to explore, switch to Plotly before you show anyone.
import streamlit as st
import pandas as pd

st.set_page_config(
    page_title="Decision-Maker Dashboard",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)

@st.cache_data
def load_data(uploaded_file):
    df = pd.read_csv(uploaded_file)
    return df
layout="wide" is non-negotiable for dashboards. The default narrow column is wrong for chart-heavy layouts.
@st.cache_data caches the return value. Without it, every widget interaction re-runs the script and re-reads the CSV from disk.
col1, col2, col3 = st.columns(3)
col1.metric("Total Leads", 1247, delta="+12%")
col2.metric("Cost per Lead", "$23.50", delta="-8%", delta_color="inverse")
col3.metric("Conversion Rate", "3.2%", delta="+0.5%")
st.columns(3) splits horizontal space into equal-width containers. Three columns, three KPI cards.
delta_color="inverse" — a decrease in cost is good news. Business logic in one keyword argument.
Same synthetic data: headline and caption state the decision, KPIs answer “what happened?,” chart supports “why?,” and an action closes the loop.
Same data: widgets and chart come first; the viewer never sees what decision this supports or what to do next.
A dashboard without an action layer is a report. A dashboard with one is a decision tool.
The AI wrote the KPI in three lines; it takes a larger block of asserts to know whether to trust it. That ratio doesn't change.
assert len(df) > 0
assert 'leads' in df.columns
Does the data exist? Right columns?
assert df['leads'].dtype in ['int64', 'float64']
assert df['date'].dtype == 'datetime64[ns]'
Date-as-string bugs live here.
assert df['leads'].min() >= 0
assert df['spend'].max() < 1_000_000
Catches NaN bugs and data errors.
assert grouped['leads'].sum().sum() == df['leads'].sum()
assert len(filtered) <= len(df)
Do the parts add up to the whole?
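The four rungs above can be bundled into one validation function that runs on every load. A sketch, assuming the hypothetical `date`/`leads`/`spend` columns from the slides:

```python
import pandas as pd

def validate(df: pd.DataFrame) -> None:
    """Run the assert ladder: existence, types, ranges, reconciliation."""
    # Rung 1: does the data exist, with the right columns?
    assert len(df) > 0
    assert {"date", "leads", "spend"}.issubset(df.columns)

    # Rung 2: types -- date-as-string bugs live here.
    assert str(df["leads"].dtype) in ("int64", "float64")
    assert str(df["date"].dtype) == "datetime64[ns]"

    # Rung 3: ranges -- catches NaN bugs and data-entry errors.
    assert df["leads"].min() >= 0
    assert df["spend"].max() < 1_000_000

    # Rung 4: reconciliation -- do the parts add up to the whole?
    grouped = df.groupby("date")["leads"].sum()
    assert grouped.sum() == df["leads"].sum()

# Invented two-row example to exercise the function.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "leads": [10, 20],
    "spend": [230.0, 410.0],
})
validate(df)
print("all checks passed")
```

Call `validate(df)` right after `load_data` so a bad upload fails loudly at the top of the script, not silently in a chart.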
Never delete an assert because it’s inconvenient. That’s like unplugging a smoke detector because it keeps going off.
# Does the KPI match the chart?
assert kpi_total == chart_df['leads'].sum()
# Does the filter actually filter?
assert list(filtered_df['region'].unique()) == ['West']
Every dropna() is a design decision. Which rows vanish? Are they random, or do they correlate with a demographic?
Check: does your cleaning step disproportionately remove data from one group?
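One way to run that check: compare each group's share of the data before and after the dropna(). A minimal sketch with invented data (the `region`/`income` columns and the 10-point threshold are assumptions for illustration):

```python
import pandas as pd

# Invented example: 'region' is complete, 'income' has gaps -- and the gaps
# all fall in one group.
df = pd.DataFrame({
    "region": ["West", "West", "East", "East", "East", "East"],
    "income": [55_000, 58_000, 60_000, None, None, None],
})

# Group shares before and after the dropna().
before = df["region"].value_counts(normalize=True)
after = df.dropna(subset=["income"])["region"].value_counts(normalize=True)
shift = (after - before).abs()

# Flag any group whose share moved by more than 10 percentage points.
biased = shift[shift > 0.10]
print(biased)
```

Here East holds two thirds of the raw rows but only one third of the cleaned rows, so both groups get flagged. If `biased` is non-empty, the dropna() is not neutral and the decision deserves a sentence in the dashboard's caption.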
Verification ladder rung 3 (assert). Next: security checks in Week 6.