TECH

Multi-agent meeting analysis with built-in peer review

Role

Architecture, Development

Team

Daan, Nina, Rogier

Stack

Python, FastAPI, LangGraph, Azure OpenAI

Status

Live (v2 launching)

FLAI analyses team meetings for psychological safety, power dynamics, conflict handling, and decision-making patterns. Input: a transcript. Output: eight construct analyses, synthesised into functional and dysfunctional observations plus underlying patterns. It's used inside &samhoud to do team development work that previously required multiple observer sessions per team.

I rebuilt the architecture from the ground up. The system runs on three LangGraph graphs. One generates meeting context from supplementary documents. One runs the construct analysis as a fan-out across eight parallel branches. One synthesises the results into a final analysis. No RAG. Behaviour analysis happens directly from the transcript with construct theory embedded in the prompts.
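The fan-out shape can be sketched in plain Python. The real system runs this as parallel branches inside a LangGraph graph; this asyncio sketch only shows the structure. Four of the eight construct names come from the description above, the rest are placeholders, and `analyze_construct` is a stub for what is a full LLM pipeline in practice.

```python
import asyncio

# Four constructs are named in the text; the other four are placeholders here.
CONSTRUCTS = [
    "psychological_safety", "power_dynamics", "conflict_handling",
    "decision_making", "construct_5", "construct_6",
    "construct_7", "construct_8",
]

async def analyze_construct(name: str, transcript: str) -> dict:
    """Stub for one construct branch (a multi-role LLM pipeline in reality)."""
    await asyncio.sleep(0)  # placeholder for the actual LLM calls
    return {"construct": name, "analysis": f"analysis of {name}"}

async def run_fanout(transcript: str) -> list[dict]:
    # All eight branches run concurrently; the gathered results feed
    # the synthesis graph.
    return await asyncio.gather(
        *(analyze_construct(c, transcript) for c in CONSTRUCTS)
    )

results = asyncio.run(run_fanout("...transcript..."))
```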

It's a peer review system the model runs on itself. Roughly 45 LLM calls per analysis: eight constructs times five roles, plus context generation and synthesis. Slow. Expensive. Sharper than any single-pass version we tested.

The interesting part sits inside each of the eight construct branches. Every construct runs through a five-role pipeline instead of a single prompt. An analyst does the first read. A depth psychologist looks for root causes. A devil's advocate attacks the evidence and proposes alternative explanations, labelling each claim's strength. An arbiter weighs the critique and steelmans what holds up. A senior writer produces the final output.
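One branch can be sketched as a sequential role chain where each role reads the previous role's output. The role names and their order come from the description above; the instruction strings, the `call_llm` stub, and the per-role model mapping for the first two roles are illustrative assumptions (the text only pins critique and verdict to gpt-5.1 and writing-heavy steps to gpt-5).

```python
def call_llm(role: str, instruction: str, context: str) -> str:
    """Placeholder for an Azure OpenAI call; returns a stub response."""
    return f"[{role}] response to: {context[:40]}"

# The five roles in pipeline order, as described in the text.
ROLES = [
    ("analyst", "Do the first read of the transcript for this construct."),
    ("depth_psychologist", "Look for root causes behind the behaviour."),
    ("devils_advocate", "Attack the evidence, propose alternative "
                        "explanations, label each claim's strength."),
    ("arbiter", "Weigh the critique and steelman what holds up."),
    ("senior_writer", "Produce the final construct analysis."),
]

# Routing sketch: critique and verdict on gpt-5.1, writing on gpt-5.
# The analyst/depth-psychologist assignments are assumptions.
MODEL_BY_ROLE = {
    "analyst": "gpt-5",
    "depth_psychologist": "gpt-5",
    "devils_advocate": "gpt-5.1",
    "arbiter": "gpt-5.1",
    "senior_writer": "gpt-5",
}

def run_construct_pipeline(transcript: str) -> str:
    context = transcript
    for role, instruction in ROLES:
        # Each role's output becomes the next role's input context.
        context = call_llm(role, instruction, context)
    return context

final = run_construct_pipeline("...transcript...")
```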

A few design decisions worth naming. Models are split by role: gpt-5.1 for critique, verdict, and final synthesis; gpt-5 for creative and writing-heavy steps; gpt-5-mini for context generation. Each construct is grounded in actual theory (Edmondson on psychological safety, Lencioni, Kahneman on bias), which is embedded in the prompt rather than retrieved. Structured JSON output is produced deterministically from LLM-written markdown, not by asking the model for JSON directly, which removed a whole class of parsing failures.
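The markdown-to-JSON step can be sketched as a deterministic parser over headed sections: the model writes well-structured markdown, and a plain parser turns it into a dict, so malformed-JSON failures cannot occur. The heading names below are hypothetical examples, not the real output schema.

```python
import re

def markdown_to_sections(md: str) -> dict[str, str]:
    """Split LLM-written markdown into a dict keyed by ## headings.

    The LLM never emits JSON; structure is recovered here,
    deterministically, from the markdown it wrote.
    """
    sections: dict[str, str] = {}
    current = None
    buffer: list[str] = []
    for line in md.splitlines():
        m = re.match(r"^##\s+(.+)$", line)
        if m:
            if current is not None:
                sections[current] = "\n".join(buffer).strip()
            # Normalise the heading into a stable key.
            current = m.group(1).strip().lower().replace(" ", "_")
            buffer = []
        elif current is not None:
            buffer.append(line)
    if current is not None:
        sections[current] = "\n".join(buffer).strip()
    return sections

md = (
    "## Functional observations\nThe team ...\n\n"
    "## Dysfunctional observations\nTurn-taking ..."
)
doc = markdown_to_sections(md)
```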

What we're building next is a much bigger version. FLAI for Leadership. Organisations enrol all their leaders into the programme, define organisation-wide leadership development goals, and each leader uploads their own meetings continuously. The system observes and coaches them against their personal goals, but always in the context of where the organisation wants to go. It's a shift from one-off team analyses to a longitudinal coaching layer that runs across an entire leadership population. Different data model, different graphs, different product. Same engine underneath.

Stack

Python

FastAPI

LangGraph

LangChain

Azure OpenAI

PostgreSQL

Azure Blob Storage

React 19

TypeScript

Tailwind v4

Azure Speech Services

UTRECHT, NL
