September 18, 2025

Workshop: Validating Generative AI Models

October 28, 5:30–7:30 PM • Downtown Toronto

GenAI is racing into production, and your validation program has to keep pace. Join us after work in downtown Toronto for a practical, hands-on session on validating GenAI systems in a way you can defend to stakeholders, auditors, and regulators. It is also aimed at teams looking for E-23 compliance strategies.

Who should attend

  • Model Risk & Validation, Compliance, Internal Audit
  • Data Science & ML Engineering, ML Ops
  • Product Owners and Risk Champions for GenAI initiatives

What you’ll learn (and practice)

  • Scope the system, not just the model: Map prompts, RAG pipelines, tools, and guardrails; define boundaries and dependencies.
  • Design defensible tests: Build acceptance criteria for hallucination, harmful content, bias, robustness, privacy leakage, and IP risk.
  • Evaluation methods that work: Pairwise judging, rubric-based LLM-as-judge, golden sets, and human review—and when to use each.
  • RAG & retrieval validation: Groundedness, citation quality, retrieval recall/precision, and corpus hygiene checks.
  • Prompt & config change control: Versioning, test harnesses, regression suites, and rollback criteria.
  • Monitoring in production: Metrics for degradation, drift, jailbreaks, and safety incidents; alert→ticket→remediation loops.
  • Evidence & documentation: Model/system cards, validation reports, and audit-ready artifacts that align with policy and controls.
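
To give a flavour of the retrieval validation topic above, here is a minimal sketch of a precision/recall check against a golden set. It is an illustration only, not workshop material: the names GoldenCase, precision_recall, and passes_acceptance are hypothetical, and the thresholds are placeholders you would set from your own acceptance criteria.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    """One golden-set entry: a query plus the document IDs a reviewer marked relevant."""
    query: str
    relevant_ids: set[str]


def precision_recall(retrieved_ids: list[str], relevant_ids: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for the documents retrieved for one query."""
    retrieved = set(retrieved_ids)
    hits = len(retrieved & relevant_ids)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def passes_acceptance(
    cases: list[GoldenCase],
    retrieve: Callable[[str], list[str]],  # hypothetical: your retriever, query -> doc IDs
    min_precision: float,
    min_recall: float,
) -> bool:
    """Average precision and recall over the golden set and compare to the thresholds."""
    scores = [precision_recall(retrieve(c.query), c.relevant_ids) for c in cases]
    avg_precision = sum(p for p, _ in scores) / len(scores)
    avg_recall = sum(r for _, r in scores) / len(scores)
    return avg_precision >= min_precision and avg_recall >= min_recall
```

A check like this can run in a regression suite whenever the prompt, retriever configuration, or corpus changes, so that a drop below the agreed thresholds blocks the release.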

Format (interactive + practical)

Short lightning talks followed by guided mini-exercises with realistic case studies and editable templates. You’ll leave with assets you can adapt immediately.

Agenda (2 hours)

  • 5:30 PM — Arrival & networking (light bites & drinks)
  • 5:45 PM — Welcome & objectives
  • 5:55 PM — Scoping GenAI Systems (components, risks, and control points)
  • 6:10 PM — Mini-Exercise: Define acceptance criteria for a GenAI use case
  • 6:30 PM — Evaluation Techniques (LLM-as-judge, golden sets, human review)
  • 6:45 PM — Mini-Exercise: Build a small evaluation plan & test harness outline
  • 7:05 PM — Monitoring & Change Control (from pre-prod tests to on-call playbooks)
  • 7:20 PM — Debrief & immediate next steps
  • 7:30 PM — Close

What you’ll take away

  • A GenAI Validation Plan template (scope, risks, test strategy, acceptance criteria)
  • Evaluation workbook: examples for groundedness, harmful content, bias, and robustness
  • A lightweight test harness checklist for prompts, RAG, and guardrails
  • A 60–90 day playbook to operationalize validation and monitoring

Logistics

  • Date & Time: October 28, 5:30–7:30 PM
  • Location: Downtown Toronto (venue details provided upon registration)
  • Capacity: Limited to keep the session highly interactive
  • Bring: Laptop recommended (we’ll share templates); optional: a GenAI use case from your org

Why attend

  • Actionable, not academic: Concrete artifacts, controls, and workflows.
  • Cross-functional by design: Risk, data, and product perspectives in one room.
  • Audit-ready outcomes: Evidence you can stand behind.
