TalentMatch OS — Fábio Silva

TalentMatch OS end to end: search the pool by meaning, open a role and rediscover the best-fitting people already in it, then prep the interview and reach out — each step grounded in the candidate's real background.

Recruiting keeps paying to forget

A recruiter meets a brilliant backend engineer who isn’t right for the role that’s open today. Six weeks later a perfect role lands — and that engineer is a name half-remembered in an inbox, a CV in a folder nobody can search. So the search starts again: post the role, pay the agency, sift the inbound, and hope.

That is the quiet, expensive waste at the centre of hiring. The most valuable asset a recruiter owns is the people they’ve already met — and it’s the asset they’re worst at keeping. Every CV was paid for once, in time or in fees, and then allowed to go cold.

TalentMatch OS is built around a single correction: every CV becomes a permanent, searchable, re-rankable asset the moment it arrives. The problem was never sourcing more people. It was being unable to find the ones you already had.

The talent pool is the product

The centre of the system is a pool you search by meaning, not keywords. Ask for “a senior Python engineer with fintech experience and a history of leading teams” and it returns people whose CVs say that in their own words — no Boolean incantations, no exact-match brittleness.

That reframes the most expensive moment in recruiting. Open a role and the system runs rediscovery: it surfaces the best-fitting people already in your pool, each with an explainable fit score. The score is not a black-box number; it comes from a recall step followed by a reasoned re-rank you can read. For an agency, that’s the difference between placing from your own network and going back to the market. For an in-house team, it’s the difference between a role that takes weeks and one you fill from people you already know.

Local recall, Claude reasoning

The interesting engineering isn’t “we call an LLM.” Everyone can call an LLM. It’s the split that makes search both private and affordable at the scale of a whole talent pool.

Recall is local. Every CV is embedded on-device with a multilingual model running in ONNX: no torch and, once the model has been downloaded, no external calls. Candidate text never leaves your infrastructure to be searched. pgvector does cosine search across the entire pool in one query.
Reasoning is selective. Only the top handful of recalled candidates are handed to Claude for the expensive part — the re-rank, the fit explanation, the judgment. The cheap, high-volume step runs locally; the model is spent only where it changes the answer.
Outputs are structured. Every Claude call that has to be parsed — profiles, fit scores, scorecard criteria — returns against a JSON schema, so the reasoning lands as reliable data, not prose to scrape.
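
As a rough illustration of the receiving side of a structured output, the sketch below parses a model response into a typed record and rejects anything that does not match. The field names (candidate_id, score, reasons) and the 0-100 scale are assumptions made for the example, not the product's actual schema, and a real backend would more likely lean on a schema validator than on hand-written checks.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for one fit score; field names are illustrative assumptions.
FIT_SCORE_FIELDS = {"candidate_id": str, "score": int, "reasons": list}


@dataclass(frozen=True)
class FitScore:
    candidate_id: str
    score: int      # assumed scale: 0-100
    reasons: list   # short human-readable justifications


def parse_fit_score(raw: str) -> FitScore:
    """Parse a model response and reject anything that does not match the schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(FIT_SCORE_FIELDS):
        raise ValueError(f"unexpected fields: {sorted(data)}")
    for name, expected in FIT_SCORE_FIELDS.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(data[name], bool) or not isinstance(data[name], expected):
            raise ValueError(f"{name} must be {expected.__name__}")
    if not 0 <= data["score"] <= 100:
        raise ValueError("score out of range")
    if not all(isinstance(reason, str) for reason in data["reasons"]):
        raise ValueError("reasons must be strings")
    return FitScore(**data)
```

The point of the strictness is that downstream code (ranking, the pipeline card) can trust the record it receives instead of scraping prose.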

It’s a deliberate division of labour: a fast, local, free retrieval layer underneath, and a frontier model used sparingly on top. The result feels like the model is everywhere; the bill and the privacy posture say otherwise.
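
The split can be sketched in a few lines. This is a toy version under stated assumptions: embeddings are plain Python lists, the pool is an in-memory dict rather than a pgvector table, and rerank is a hypothetical callable standing in for the Claude call. It is meant only to show the shape of the pipeline: the cheap step sees the whole pool, the expensive step sees only the shortlist.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec: Sequence[float],
           pool: Dict[str, Sequence[float]],
           k: int) -> List[str]:
    """Cheap local step: rank the whole pool by similarity and keep the top k ids."""
    ranked = sorted(pool, key=lambda cid: cosine(query_vec, pool[cid]), reverse=True)
    return ranked[:k]


def rediscover(query_vec: Sequence[float],
               pool: Dict[str, Sequence[float]],
               k: int,
               rerank: Callable[[List[str]], List[str]]) -> List[str]:
    """Expensive step: `rerank` (a stand-in for the model call) sees only k candidates."""
    shortlist = recall(query_vec, pool, k)
    return rerank(shortlist)
```

With a pool of thousands of CVs and k set to a handful, the number of candidates sent for reasoning stays constant as the pool grows, which is what keeps the cost in proportion.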

AI that does the work — and shows it

This is unapologetically an AI-forward product, because reading and reasoning over CVs at scale is something language models genuinely do well. The discipline is in keeping every one of those judgments explainable and human-owned:

Ideal profiles — describe a role and Claude drafts the must-haves, nice-to-haves, seniority, and a summary you can edit, so “fit” is measured against criteria you can see and change.
Interview prep from the gap — for a given candidate and role, it generates focus areas, targeted questions, claims to verify, and red flags, derived from the difference between their profile and the role. Not generic questions — the ones this conversation needs.
Grounded outreach — first-touch messages and follow-up sequences written from the candidate’s real background, in the tone you pick. Saveable templates with placeholders, an inbox that tracks every sequence, and real SMTP sending when you configure it.
Scorecards — AI-drafted, role-specific criteria with “what to probe” hints; humans fill in the ratings, and multiple scorecards aggregate onto the pipeline card.

At no point does the model hire anyone. It drafts, scores, explains, and prepares; a person decides. The fit scores have reasons attached, the interview prep cites the gap it came from, and the recruiter’s own notes sit alongside it all.
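
One plausible way for several human-filled scorecards to roll up onto a pipeline card is a per-criterion average, sketched below. The rating scale, the criterion names, and the choice of a plain mean are all assumptions for illustration; the source does not say how the aggregation is computed.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def aggregate_scorecards(scorecards: List[Dict[str, int]]) -> Dict[str, float]:
    """Average each criterion across every scorecard that rated it."""
    by_criterion = defaultdict(list)
    for card in scorecards:
        for criterion, rating in card.items():
            by_criterion[criterion].append(rating)
    # Criteria an interviewer skipped are simply absent from that card,
    # so they do not drag the average down.
    return {criterion: round(mean(ratings), 2)
            for criterion, ratings in by_criterion.items()}
```

The ratings here are entered by people; the model's part ends at drafting the criteria.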

Privacy is architecture, not a policy line

Recruiting data is some of the most sensitive a company holds, so the privacy story is built in rather than promised:

On-device embeddings mean CV text is vectorised locally — nothing leaves your infrastructure for search.
Per-tenant isolation: every record is scoped to the signed-in user’s company, behind HMAC-signed session tokens and pbkdf2-hashed passwords.
The public careers portal carries a GDPR consent gate, rate-limiting, and file validation — and applicants are auto-parsed straight into the pool and the role’s pipeline, so they arrive screened and ranked with no manual step.
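
To make the auth primitives concrete, here is a minimal standard-library sketch of pbkdf2 password hashing and an HMAC-signed token that carries the company scope. The claim names (user_id, company_id), the token layout, and the iteration count are illustrative assumptions rather than the product's actual format, and a production token would also carry an expiry.

```python
import base64
import hashlib
import hmac
import json
import os
from typing import Optional, Tuple

# Assumption: a real deployment loads the secret from configuration.
SECRET = os.urandom(32)
ITERATIONS = 200_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)  # constant-time comparison


def sign_token(claims: dict) -> str:
    """Encode the claims and append an HMAC-SHA256 signature over them."""
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"


def read_token(token: str) -> dict:
    """Return the claims only if the signature matches; otherwise raise ValueError."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))
```

Because the company id sits inside the signed body, a client cannot switch tenants by editing the token; every query can then be filtered by the company_id that read_token returns.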

End to end, not a point tool

The pool is the foundation, but the workflow runs the whole arc — source → screen → engage → interview → hire:

Ingestion & dedup — upload single CVs, bulk, or a whole folder (PDF / DOCX / TXT); a scanned-PDF guard rejects image-only files, and same-person uploads update the record instead of duplicating it.
ATS pipeline — a drag-and-drop Kanban per role (Applied → Screening → Interview → Offer → Hired → Rejected), with rediscovered candidates added straight from the sourcing tab.
Careers portal — branded public pages with AI-written role posts, feeding applications back into the pool automatically.
Executive analytics — time-to-hire, offer acceptance, inbound-sourcing share, pipeline conversion, and an editable estimate of recruitment savings, so the cost case is legible to the people who sign off on it.
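
The dedup behaviour in the ingestion step can be pictured as an upsert. The sketch below keys candidates on a normalised email address, which is an assumption made for the example: the source does not say how "same person" is detected, and the in-memory dict stands in for the database.

```python
from typing import Dict


def normalise_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def upsert_candidate(pool: Dict[str, dict], parsed: dict) -> str:
    """Insert a parsed CV, or update the existing record for the same person."""
    key = normalise_email(parsed["email"])
    record = pool.setdefault(key, {"email": key, "uploads": 0})
    # Newer parsed fields overwrite older ones; the identity key stays fixed.
    record.update({k: v for k, v in parsed.items() if k != "email"})
    record["uploads"] += 1
    return key
```

Uploading the same person's CV twice then leaves one record with the latest details, rather than two competing entries in the pool.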

Architecture

Layer	Technologies
Frontend	Next.js 14 · TypeScript · Tailwind · Framer Motion
Backend	FastAPI · SQLAlchemy · structured (JSON-schema) outputs
Reasoning	Claude — Opus 4.8 for profiles, careers posts & interview prep; Haiku 4.5 for fast extraction & matching
Retrieval	fastembed (multilingual e5, ONNX — on-device) · pgvector cosine search
Data	Postgres 16 + pgvector · per-tenant isolation
Auth	HMAC-signed bearer tokens · pbkdf2 hashing · per-request company scoping

The embedding model is downloaded once, on the first request; after that, retrieval runs offline. Two Claude tiers keep cost in proportion to the work: a fast model for high-volume extraction and matching, the frontier model only for the judgment-heavy steps.
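
The two-tier idea amounts to a small routing table. In this sketch the task names are hypothetical labels chosen to mirror the Reasoning row of the table above, and the tiers are abstract rather than real model identifiers; it only shows that each task is bound to a tier up front instead of defaulting everything to the largest model.

```python
from enum import Enum


class Tier(Enum):
    FAST = "fast"          # high-volume extraction and matching
    FRONTIER = "frontier"  # judgment-heavy drafting


# Hypothetical task names; the grouping follows the Reasoning row of the table.
TASK_TIER = {
    "extraction": Tier.FAST,
    "matching": Tier.FAST,
    "ideal_profile": Tier.FRONTIER,
    "careers_post": Tier.FRONTIER,
    "interview_prep": Tier.FRONTIER,
}


def tier_for(task: str) -> Tier:
    """Look up the tier for a task; unknown tasks fail loudly instead of guessing."""
    try:
        return TASK_TIER[task]
    except KeyError:
        raise ValueError(f"unknown task: {task}") from None
```

Failing on unknown tasks is a deliberate choice in the sketch: a silent default to the frontier tier would quietly undo the cost discipline.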

The broader point

The temptation with a tool like this is to sell the AI. But the model isn’t the moat — anyone can wire up Claude. The value is the memory: turning a pile of CVs nobody could search into a compounding asset that gets more useful with every candidate you add, and then being disciplined about how the AI touches it — recall kept local and private, reasoning kept explainable, and every hiring decision left with a human.

The challenge wasn’t “use AI to recruit.” It was the older, more boring one underneath: stop paying to re-find people you already have. AI is simply the right tool for that job — once the asset it reasons over has been built properly first.