Journal

Weekly entries from a six-month AI product development program.

Week 13 Journal: Meet Dino, Your Las Vegas Restaurant Insider

Built Dino, a consumer-facing AI dining concierge for Las Vegas with a Rat Pack personality, real Google Maps restaurant data, mock reservation booking, and Google Calendar deep links, deployed on Railway via a FastAPI agentic architecture.

agentic-architecture, fastapi, claude-code, conversational-ai, google-maps-api, railway, consumer-product, personality-engineering, tool-use

Week 12 Journal: Feature Intake Copilot

Built a conversational AI intake copilot that transforms messy stakeholder feature requests into structured specs using a two-gate review architecture.

conversational-ai, claude, streamlit, supabase, safe, product-management, advisor-tool, multi-call-llm

Week 11 Journal: Anthropic Advisor Tool Experiment

Took the SAFe Feature Spec System from prototype to production by wiring in governance, migrating from SQLite to PostgreSQL, building a ConnectorInterface abstraction, and deploying the full v3 system to Streamlit Cloud.

advisor-tool, postgresql, streamlit, python, safe, evaluation, production-deployment, connector-pattern, governance, anthropic-api, supabase

Week 10 Journal: Responsible AI

Built nine responsible AI modules (including cost guardrails, grounding checks, content safety, bias detection, audit trails, and prompt governance), transforming the SAFe Feature Spec System into a pipeline that can be trusted in production.

responsible-ai, guardrails, grounding, content-safety, bias-detection, audit-trail, cost-governance, prompt-governance, streamlit, sqlite, python, claude-code

Week 9 Journal: AI Evaluation System

Built an evaluation pipeline for the SAFe Feature Spec System: SQLite persistence, prompt versioning, a golden test set, a Streamlit dashboard, and an AI improvement suggester, turning prompt engineering from guesswork into measurement.

evaluation, llm-as-judge, sqlite, streamlit, prompt-engineering, a-b-testing, python, claude-code, safe, prompt-versioning

Week 8 Journal: Return of the Feature Spec Generator

Built a six-agent Streamlit system that automates the full SAFe feature spec workflow — from classification to scoring to polish — replacing a multi-session manual process with a consistent, 10-15 minute pipeline.

multi-agent, streamlit, safe, feature-spec, session-state, reflexive-architecture, llm-as-judge, workflow-automation

Week 7 Journal: Agentic RAG + MCP

Upgraded the Knowledge Assistant from a local ChromaDB prototype to a production-grade Pinecone-backed system, added agentic retrieval, and refactored tools into an MCP-style composable layer.

rag, pinecone, agentic-rag, mcp, vector-database, tool-use, streamlit, multi-agent, embeddings

Week 6 Journal: RAG Time

Built a full RAG pipeline from embeddings to deployed Knowledge Assistant, explored hybrid search and re-ranking, and established a baseline evaluation framework for the Feature Spec Generator.

rag, embeddings, chromadb, vector-store, hybrid-search, re-ranking, metadata-filtering, streamlit, evaluation, llm-as-judge, openai

Week 5 Journal: Feature Spec Generator Ship and Share

Converted the Feature Spec Generator from a CLI tool into a Streamlit web app, deployed it to Streamlit Cloud, and shipped it to real users with Teams webhook notifications.

streamlit, deployment, webhooks, microsoft-teams, session-state, feature-spec-generator, production

Week 4 Journal: Multi-Agent Systems

A grueling but rewarding week building production-grade multi-agent systems — from a hierarchical Feature Spec Generator to an ROI Analyzer hardened with resilience patterns, cost optimization, and smart model routing.

multi-agent, tool-use, function-calling, resilience, cost-optimization, prompt-engineering, asyncio, model-routing

Week 3 Journal: Tools and Agents

Built a progression from calculator tool to autonomous research agent, and learned that resilience code, context management, and bounded workflows aren't optional — they're load-bearing.

tool-use, agents, python, context-window, prompt-engineering, error-handling

Week 2: LLM and API Basics

Hands-on experiments with tokens, context windows, streaming, and prompting techniques reveal that AI product management requires an entirely new economic mental model.

tokens, context-window, prompt-engineering, streaming, python, rate-limits, product-management

Week 1: First Steps

A non-engineer sets up a Python dev environment, generates an API key, and makes a first Claude API call — and discovers that vibe-coding has limits.

python, anthropic-sdk, dev-environment, cli, api