Deal-Sourcing Engine
A scoring pipeline that ingests structured (PitchBook-class) and unstructured (news, hiring, patents, alt-data) sources, applies your fund's thesis-specific filters, and ranks candidate targets — feeding your CRM with the top tier weekly.
- Engagement: 6–10 week build · ongoing data ops
- Built for: Deal teams · Sourcing analysts · GP principals
PitchBook gives you the universe. The work is filtering the universe down to the 30 companies that match the thesis, ranked by signal strength, refreshed weekly so nothing gets missed while the team is in close mode on the current deal.
What this is
A scoring pipeline that takes the world of private companies and surfaces the ones that match the fund's thesis. Three layers:
- Universe ingestion. Structured sources (PitchBook-class) plus unstructured (news, patents, hiring, alt-data), normalized into a single candidate-company graph with daily delta tracking.
- Thesis scoring. A rubric encoding the fund's thesis — sector, size, growth, margin profile, defensibility, ownership status — applied to each candidate. Auditable. Versioned. When the thesis evolves, so does the rubric.
- CRM hand-off. Weekly top-tier delivery into Affinity, DealCloud, or whatever your team uses. Disqualifiers fire transparently so nothing surprises the partner when they see the list.
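The scoring and disqualifier layers above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual implementation: the `Candidate` fields, the weights, the target sectors, and the revenue thresholds are all hypothetical placeholders standing in for a fund's real rubric.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    sector: str
    revenue_musd: float   # annual revenue, millions of USD
    growth_pct: float     # year-over-year revenue growth
    pe_owned: bool        # already sponsor-owned?

# Hypothetical rubric: every value below is a placeholder, not a real thesis.
RUBRIC_VERSION = "example-v1"
WEIGHTS = {"sector": 0.4, "size": 0.3, "growth": 0.3}
TARGET_SECTORS = {"industrial software", "healthcare services"}

def disqualifiers(c: Candidate) -> list[str]:
    """Return every reason the candidate is excluded (empty list if none)."""
    reasons = []
    if c.pe_owned:
        reasons.append("already sponsor-owned")
    if c.revenue_musd < 10:
        reasons.append("revenue below 10M USD floor")
    return reasons

def score(c: Candidate) -> float:
    """Weighted sum of per-criterion scores, each in [0, 1]."""
    sector = 1.0 if c.sector in TARGET_SECTORS else 0.0
    size = 1.0 if 20 <= c.revenue_musd <= 200 else 0.5
    growth = min(max(c.growth_pct, 0.0) / 30.0, 1.0)  # capped at 30% growth
    return (WEIGHTS["sector"] * sector
            + WEIGHTS["size"] * size
            + WEIGHTS["growth"] * growth)

def rank(candidates, top_n=30):
    """Split candidates into a ranked top tier and a log of what was dropped and why."""
    kept, dropped = [], []
    for c in candidates:
        reasons = disqualifiers(c)
        if reasons:
            dropped.append((c.name, reasons))
        else:
            kept.append((c.name, round(score(c), 3)))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n], dropped
```

Returning the dropped candidates alongside the ranked list is what keeps the disqualifiers transparent: the reader of the list can see what was filtered out and the stated reason for each exclusion.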
How it's built
Polars / DuckDB for the data layer, FastAPI for the serving layer, your CRM's API for delivery. Where the universe data is accessible through a public API, we wire direct integrations; where it requires scrape-and-normalize work, we own that pipeline with documented rate-limiting and license handling.
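As one illustration of what rate-limiting in a scrape pipeline can look like, here is a minimal token-bucket sketch. The token bucket is an assumption chosen for the example, since the text does not say which scheme is used, and the `TokenBucket` class and its parameters are hypothetical.

```python
import time

class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def try_acquire(self):
        """Return True and spend one token if a request may go out now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A fetch loop would call `try_acquire()` before each request and sleep briefly when it returns `False`. One bucket per source keeps each source's documented limit separate from the others.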
What you get
- The rubric, versioned and auditable.
- The weekly top-tier delivery into your CRM.
- Disqualifier tracking — knowing what was filtered out and why.
- Source-level reliability scoring so the partner reading the list knows what to trust.
- Runbook for the data ops your team owns post-handoff.
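Source-level reliability scoring can be pictured as a reliability-weighted average of the signals observed for a company. The sketch below is illustrative only: the source names and the weights in `SOURCE_RELIABILITY` are hypothetical, and real values would be calibrated per engagement.

```python
# Hypothetical reliability weights in [0, 1]; placeholders, not calibrated values.
SOURCE_RELIABILITY = {
    "structured_db": 0.9,
    "patent_office": 0.8,
    "news": 0.6,
    "job_board": 0.5,
}

def weighted_signal(observations):
    """Combine (source, strength) pairs into one score plus an audit trace.

    `strength` is assumed to be in [0, 1]. Sources missing from
    SOURCE_RELIABILITY get weight 0, so they appear in the trace
    but do not move the score.
    """
    total = 0.0
    total_weight = 0.0
    trace = []
    for source, strength in observations:
        weight = SOURCE_RELIABILITY.get(source, 0.0)
        total += weight * strength
        total_weight += weight
        trace.append((source, weight, strength))
    score = total / total_weight if total_weight else 0.0
    return round(score, 3), trace
```

The trace is the part the partner reads: it shows which sources contributed to a score and how much each one was trusted.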
An engagement has a shape, not a list price.
Length and price are functions of the data and the destination. The shape below is the typical engagement.
- Length: 6–10 week build · ongoing data ops. Scoped during the discovery call against the actual data and the operation it integrates with.
- Lead: Bogdan. Principal engineer. Architecture and most code ship through one keyboard.
- Cadence: Async, weekly. Written updates in between, calls when the decision needs the room.
- Bar: Production. Async correctness, capacity under burst, observability at every boundary.
Products this composes with.
From the same suite, or vertical-specialized versions in another suite.
- Same suite · Private Equity Suite
Technical Due Diligence Engine
A code- and architecture-level diligence report — red/yellow/green risk heatmap across modules, dependency analysis, test-coverage and CI/CD posture, security vulnerability surface, and the operating-partner-readable summary of what the next twelve months of platform work will cost.
- Same suite · Private Equity Suite
Diligence Red-Flag Engine
An AI-extraction-and-review layer that ingests the data room, surfaces the anomalies and the off-market clauses, benchmarks against comparable transactions, and screens beneficial ownership against sanctions and PEP lists — delivered as a risk heatmap the deal team works against.
What buyers ask about this one.
We already use PitchBook + Affinity. What's different?
Those are excellent universe and CRM tools. The work is the ranking layer on top — your fund's thesis encoded as scoring rules, ingestion of the unstructured signals PitchBook misses (hiring deltas, patent filings, supplier-relationship changes), and the weekly refresh that gets the top tier in front of the deal team without manual triage.
How do you encode our thesis?
First engagement: discovery interviews with the partners and the deal team, structured against the fund's mandate. The thesis becomes a scoring rubric (sector, size, growth, margin profile, defensibility, ownership) and a set of disqualifiers. The rubric is auditable and versioned — when the thesis evolves, the rubric evolves with it.
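One way to make a rubric auditable and versioned is to fingerprint its content, so that every weekly list can be tied to the exact rubric that produced it. The sketch below assumes the rubric is stored as a plain JSON-serializable dictionary. The function names and rubric keys are hypothetical.

```python
import hashlib
import json

def rubric_fingerprint(rubric: dict) -> str:
    """Short, stable identifier derived from the rubric's content.

    Keys are sorted before hashing, so two rubrics with the same content
    get the same fingerprint regardless of key order.
    """
    canonical = json.dumps(rubric, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

def changed_keys(old: dict, new: dict) -> list[str]:
    """Top-level rubric entries that differ between two versions."""
    return sorted(k for k in old.keys() | new.keys()
                  if old.get(k) != new.get(k))

# Usage: record what changed when the thesis evolves.
v1 = {"sectors": ["industrial software"], "min_revenue_musd": 20}
v2 = {"sectors": ["industrial software"], "min_revenue_musd": 25}
print(rubric_fingerprint(v1), "->", rubric_fingerprint(v2), changed_keys(v1, v2))
```

Storing the fingerprint next to each delivered list means a later question such as "why was this company ranked here in March?" can be answered against the rubric as it stood at the time.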
What alt-data sources do you integrate?
Hiring activity (LinkedIn / job-board signals), patent filings (USPTO + WIPO), product-launch tracking, supplier-and-customer mention graphs from public news, employee-review-site trajectory, executive-departure signals. Per-engagement, the source list is scoped to what actually moves the rubric for your fund's sector.
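As an example of turning one of these sources into a signal, a hiring delta can be approximated by comparing recent open-role counts with the preceding period. This is a sketch under assumptions: the weekly counts are presumed to be already collected and deduplicated, and the function name and the four-week window are hypothetical choices.

```python
from statistics import mean

def hiring_delta(weekly_open_roles, window=4):
    """Relative change in average open roles: last `window` weeks vs the `window` before.

    Returns None when there is too little history or the baseline is zero,
    rather than inventing a number.
    """
    if window <= 0 or len(weekly_open_roles) < 2 * window:
        return None
    recent = mean(weekly_open_roles[-window:])
    baseline = mean(weekly_open_roles[-2 * window:-window])
    if baseline == 0:
        return None
    return (recent - baseline) / baseline
```

For instance, eight weeks of counts that step from 10 to 15 open roles yield a delta of 0.5, which a rubric could then map onto a growth or momentum criterion.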
We've seen vendors promise '3× pipeline' — is that real?
The honest answer is it depends on what your baseline is. Funds with strong existing sourcing networks see incremental lift; funds whose sourcing was reactive see a step-change. We don't promise a multiplier — we'd rather scope to a concrete outcome (e.g., top-50 surfaced weekly, partner-time on triage cut by N hours) and measure against that.
Pricing?
Scoped against source coverage and CRM-integration depth. Discovery call covers both.
If the deliverable matches the gap, the next step is one call.
We'll scope length and price against your data and the operation it integrates with. No retainer, no fishing.
Bogdan and team · async-first · OP—2026