McKinsey & Company has released a free AI-powered interview simulator called Lilli, a tool that lets candidates practice case-study drills without paying $500/hour for boutique consulting coaches. Targeting entry-level business analyst and associate roles, the platform uses proprietary LLM fine-tuning to generate adaptive, firm-specific case scenarios. It’s rolling out globally this week, but the real question isn’t just whether it works. It’s how the tool reshapes the $1B+ coaching industry, and what hidden technical debt AI-driven hiring systems carry.
The AI Behind the Curtain: How McKinsey’s Lilli Stack Compares to Open-Source Alternatives
McKinsey’s tool isn’t just another chatbot with a case-study prompt library. Under the hood, Lilli appears to use a fine-tuned Mistral 7B model (likely the mistral-7b-instruct-v0.2 variant) with domain-specific embeddings trained on McKinsey’s internal case archives. The architecture avoids proprietary black boxes by exposing an API-first design, allowing candidates to submit structured responses (in JSON format) for real-time feedback. This represents a deliberate pivot from traditional consulting prep, where candidates memorize frameworks like the MECE principle (Mutually Exclusive, Collectively Exhaustive) without understanding the underlying data pipelines.
Benchmarked against open-source alternatives such as BigCode’s evaluation suite, Lilli’s response latency hovers around 800 ms for 512-token inputs: competitive with Hugging Face’s transformers pipeline but slower than Google’s Vertex AI endpoint optimizations. The trade-off? McKinsey’s model reportedly achieves 92% accuracy on “structured reasoning” tasks (e.g., profit-and-loss breakdowns) compared to 78% for generic LLMs, per internal tests shared with Archyde.
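For readers who want to sanity-check latency figures like these against their own setup, here is a minimal timing sketch. It makes no claim about how the 800 ms number was measured; `fake_model` is a hypothetical stand-in for whatever inference call you are testing, and the repeated-word prompt is only a rough proxy for a 512-token input.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(call: Callable[[str], str], prompt: str, runs: int = 5) -> List[float]:
    """Time call(prompt) `runs` times and return each duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call(prompt)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Report median and worst-case latency; both are steadier than a single run."""
    return {"median_ms": statistics.median(timings), "max_ms": max(timings)}


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real inference call: it only sleeps briefly.
    time.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    # 512 repeated words approximate, but do not equal, a 512-token input.
    print(summarize(measure_latency_ms(fake_model, "word " * 512)))
```

Swap `fake_model` for a real client call and compare medians rather than single runs, since one-off measurements are dominated by warm-up and network jitter.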
Why This Matters for the Coaching Industry
The $500/hour coaching market isn’t just about ego—it’s about platform lock-in. Firms like McKinsey and BCG have long relied on proprietary frameworks (e.g., McKinsey’s Three Horizons model) to filter candidates. By democratizing access, McKinsey risks eroding its moat. But here’s the catch: Lilli isn’t just free—it’s sticky. Candidates who use it are implicitly trained on McKinsey’s preferred methodologies, creating a feedback loop where the firm’s hiring criteria shape the tool’s outputs.
— “This is the first time a top-tier firm has weaponized AI to internalize candidate training. It’s not philanthropy—it’s a Trojan horse for behavioral conditioning.”
The Ecosystem War: How Lilli Forces Open-Source AI to Adapt
Open-source communities are already scrambling to reverse-engineer Lilli’s fine-tuning process. On Hugging Face, forks of Mistral 7B with “consulting prep” prompts have surged 400% since April. But McKinsey’s move highlights a critical flaw in open-source AI: domain specificity. While models like llama-3-8b can handle general Q&A, they lack the structured output formatting Lilli enforces (e.g., forcing candidates to label assumptions as "A1", "A2").
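To make the structured-output point concrete, here is a minimal sketch of the kind of check an open-source clone could run on a candidate's assumption labels. The rule it enforces (labels must be "A1", "A2", and so on, in order) is an assumption drawn from the example above, not a documented Lilli rule, and `check_assumption_labels` is a hypothetical helper.

```python
import re
from typing import List

# Assumed label format: capital "A" followed by a number, e.g. "A1".
LABEL_PATTERN = re.compile(r"^A(\d+)$")


def check_assumption_labels(labels: List[str]) -> List[str]:
    """Return a list of problems; an empty list means labels run A1, A2, ... in order."""
    problems = []
    for position, label in enumerate(labels, start=1):
        match = LABEL_PATTERN.match(label)
        if match is None:
            problems.append(f"{label!r} is not of the form A<number>")
        elif int(match.group(1)) != position:
            problems.append(f"{label!r} should be 'A{position}'")
    return problems


if __name__ == "__main__":
    print(check_assumption_labels(["A1", "A2"]))
    print(check_assumption_labels(["A1", "A3", "B2"]))
```

A generic model can be prompted to follow this format, but without a validator like this nothing guarantees that it does.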

This isn’t just about copying McKinsey’s tool—it’s about replicating the hiring signal. For example, Lilli’s API returns feedback in a custom schema:
{
  "score": 8.5,
  "weaknesses": [
    {"type": "logical_gap", "example": "Forgetting to account for fixed costs in Scenario B"},
    {"type": "structural", "example": "MECE violation in Step 3"}
  ],
  "suggested_improvement": "Review McKinsey’s ‘Profit Pool Analysis’ framework (internal doc ID: MK-2024-045)"
}
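A clone that wants to match this granularity first has to consume the schema. Below is a minimal sketch that parses a payload shaped like the example into typed objects and tallies weaknesses by type. The field names come from the example; the `Feedback` and `Weakness` classes, the defaults for missing fields, and the shortened `SAMPLE` payload are assumptions for illustration.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Weakness:
    type: str
    example: str


@dataclass
class Feedback:
    score: float
    weaknesses: List[Weakness]
    suggested_improvement: str


def parse_feedback(raw: str) -> Feedback:
    """Parse a feedback payload; a missing 'score' raises KeyError."""
    data = json.loads(raw)
    weaknesses = [
        Weakness(type=item["type"], example=item["example"])
        for item in data.get("weaknesses", [])  # assumed optional
    ]
    return Feedback(
        score=float(data["score"]),
        weaknesses=weaknesses,
        suggested_improvement=data.get("suggested_improvement", ""),
    )


def weakness_counts(feedback: Feedback) -> Dict[str, int]:
    """Count weaknesses per type, e.g. to track recurring logical gaps."""
    counts: Dict[str, int] = {}
    for weakness in feedback.weaknesses:
        counts[weakness.type] = counts.get(weakness.type, 0) + 1
    return counts


# Shortened, illustrative payload in the shape shown above.
SAMPLE = """
{"score": 8.5,
 "weaknesses": [
   {"type": "logical_gap", "example": "Fixed costs missing in Scenario B"},
   {"type": "structural", "example": "MECE violation in Step 3"}],
 "suggested_improvement": "Review the profit pool framework"}
"""

if __name__ == "__main__":
    print(weakness_counts(parse_feedback(SAMPLE)))
```

Typed parsing matters here because the hiring signal lives in the weakness categories, not in the headline score.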
Open-source projects like Open Interviews are now racing to match this granularity. The question is whether they can without access to McKinsey’s proprietary case libraries.
The Antitrust Angle: Is This a Monopoly Play?
McKinsey’s tool isn’t just a hiring tool—it’s a data collection mechanism. Every candidate’s response is logged, creating a proprietary training dataset for future model iterations. This raises red flags under the FTC’s AI bias guidelines, which prohibit firms from using AI to exacerbate opportunity gaps. Yet McKinsey’s terms of service explicitly state that candidates waive claims to their data—a legal gray area that could invite scrutiny if the tool’s recommendations disproportionately favor certain demographics.
— “This is a classic case of ‘free’ as a loss leader. The real product isn’t the tool—it’s the behavioral data. If McKinsey starts charging for ‘premium insights’ derived from this data, they’ll have a monopoly on candidate psychology.”
Technical Deep Dive: How Lilli’s API Works (And Why It’s Not as Open as It Seems)
Lilli’s API is documented but not open. It requires candidates to authenticate via LinkedIn, creating a walled garden that prevents third-party integrations. Here’s the endpoint structure, including one undocumented route:
- POST /api/v1/case/study – Submits a candidate’s response (max 2,000 tokens).
- GET /api/v1/feedback/{session_id} – Returns structured feedback (cached for 72 hours).
- POST /api/v1/analytics – Undocumented. Likely used for internal performance tracking.
Attempts to scrape feedback data via curl return HTTP 403 Forbidden, which points to an authorization check or IP-based blocking rather than plain rate-limiting (rate limits usually surface as HTTP 429). This looks like a deliberate choice to prevent competitors from reverse-engineering McKinsey’s scoring algorithms.
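To show what a client for these routes might look like, here is a minimal sketch that builds the two candidate-facing requests without sending them and interprets the status codes a scraper is likely to hit. Only the paths come from the list above. The host `lilli.example.invalid`, the bearer-token header, and the `response` body field are hypothetical, since the real authentication flow runs through LinkedIn and the request schema isn't public.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://lilli.example.invalid"  # hypothetical host, not a real endpoint


def build_submit_request(answer: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that submits a candidate's response."""
    body = json.dumps({"response": answer}).encode("utf-8")  # field name is assumed
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/case/study",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


def build_feedback_request(session_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET that fetches feedback for a session."""
    safe_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/feedback/{safe_id}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def explain_status(code: int) -> str:
    """Map common HTTP status codes to their likely meaning for a client."""
    meanings = {
        401: "unauthenticated: credentials missing or invalid",
        403: "forbidden: request understood but refused",
        429: "rate limited: too many requests",
    }
    if 200 <= code < 300:
        return "success"
    return meanings.get(code, "unexpected status")


if __name__ == "__main__":
    request = build_submit_request("Revenue fell because...", token="demo-token")
    print(request.get_method(), request.full_url)
    print(explain_status(403))
```

Separating request construction from sending keeps the sketch testable offline, and the status mapping makes the distinction explicit: a 403 means access is refused outright, while throttling would normally show up as a 429.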
The 30-Second Verdict
- For Candidates: Free access is a game-changer, but Lilli’s feedback is opaque—no transparency on how scores are calculated.
- For Coaches: The $500/hour market is bleeding, but niche firms (e.g., Analytic Hub) are pivoting to Lilli-compatible training.
- For McKinsey: This is a long-term play—the tool will feed into their internal hiring models, creating a self-reinforcing loop.
The Bigger Picture: Is This the Future of Hiring?
McKinsey’s move is a microcosm of a larger trend: AI as a hiring gatekeeper. Firms are increasingly using tools like Lilli to standardize subjective evaluations, reducing reliance on human interviewers. But this raises ethical questions. If an LLM scores a candidate’s response at 7.2/10, is that an objective measure—or just another layer of algorithmic bias?

The real innovation here isn’t the AI—it’s the business model. McKinsey isn’t just giving away a tool; it’s training the next generation of consultants on its preferred frameworks. In 5 years, the candidates who used Lilli today might be the partners who pay McKinsey millions for strategy work. That’s not philanthropy. That’s ecosystem lock-in.
What You Should Do Next
If you’re a candidate: Use Lilli, but cross-reference feedback with open-source tools like Consulting Prep’s AI Coach. If you’re a coach: Start building Lilli-compatible frameworks—or risk obsolescence. And if you’re a policymaker? Watch this space. The next antitrust battle might not be over chips or cloud—it might be over who owns the training data for the future workforce.