§ Product

Deal-Sourcing Engine

A scoring pipeline that ingests structured (PitchBook-class) and unstructured (news, hiring, patents, alt-data) sources, applies your fund's thesis-specific filters, and ranks candidate targets — feeding your CRM with the top tier weekly.

Engagement
6–10 week build · ongoing data ops
Built for
Deal teams · Sourcing analysts · GP principals
§ Problem

PitchBook gives you the universe. The work is filtering the universe down to the 30 companies that match the thesis, ranked by signal strength, refreshed weekly so nothing gets missed while the team is in close mode on the current deal.

What this is

A scoring pipeline that takes the world of private companies and surfaces the ones that match the fund's thesis. Three layers:

  • Universe ingestion. Structured sources (PitchBook-class) plus unstructured (news, patents, hiring, alt-data), normalized into a single candidate-company graph with daily delta tracking.
  • Thesis scoring. A rubric encoding the fund's thesis — sector, size, growth, margin profile, defensibility, ownership status — applied to each candidate. Auditable. Versioned. When the thesis evolves, so does the rubric.
  • CRM hand-off. Weekly top-tier delivery into Affinity, DealCloud, or whatever your team uses. Disqualifiers fire transparently so nothing surprises the partner when they see the list.
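As a rough illustration of the thesis-scoring layer and its transparent disqualifiers, here is a minimal sketch in plain Python. The field names, weights, thresholds, and the `Rubric` and `Company` classes are all hypothetical; a real rubric comes out of discovery with the fund and would cover more criteria (margin profile, defensibility) than the two scored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    # Hypothetical candidate record; real fields depend on the sources in scope.
    name: str
    sector: str
    revenue_musd: float
    growth_pct: float
    sponsor_owned: bool


@dataclass(frozen=True)
class Rubric:
    # Hypothetical rubric shape: a version label plus thesis parameters.
    version: str
    target_sectors: frozenset
    min_revenue_musd: float
    weights: dict  # criterion name -> weight (assumed to sum to 1.0)

    def disqualifiers(self, c: Company) -> list:
        """Return human-readable reasons a company is filtered out."""
        reasons = []
        if c.sector not in self.target_sectors:
            reasons.append(f"sector '{c.sector}' outside thesis")
        if c.revenue_musd < self.min_revenue_musd:
            reasons.append(f"revenue {c.revenue_musd}M below floor {self.min_revenue_musd}M")
        if c.sponsor_owned:
            reasons.append("already sponsor-owned")
        return reasons

    def score(self, c: Company) -> float:
        """Weighted sum of criteria, each normalized to the 0..1 range."""
        growth = min(max(c.growth_pct, 0.0), 100.0) / 100.0
        size = min(c.revenue_musd / (4 * self.min_revenue_musd), 1.0)
        return round(self.weights["growth"] * growth + self.weights["size"] * size, 4)


def rank(rubric: Rubric, companies: list):
    """Split candidates into a ranked list and a rejected list with reasons."""
    ranked, rejected = [], []
    for c in companies:
        reasons = rubric.disqualifiers(c)
        if reasons:
            rejected.append((c.name, reasons))
        else:
            ranked.append((c.name, rubric.score(c)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked, rejected
```

Keeping the rejected list alongside the ranked one is what makes the disqualifiers visible: every filtered company carries the reasons it was dropped.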

How it's built

Polars / DuckDB for the data layer, FastAPI for the serving layer, your CRM's API for delivery. Where the universe data is accessible through a public API, we wire direct integrations; where it requires scrape-and-normalize work, we own that pipeline with documented rate-limiting and license handling.
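To make the daily delta tracking in the ingestion layer concrete, here is a minimal sketch in plain Python. It assumes each daily snapshot is a dict keyed by a company id (a hypothetical shape); a production version on the stack named above would more likely express the same comparison as a Polars or DuckDB join.

```python
def snapshot_delta(previous: dict, current: dict) -> dict:
    """Compare two snapshots keyed by company id.

    Each value is a dict of field name -> value. Returns the ids that were
    added, the ids that were removed, and per-field (before, after) pairs
    for ids present in both snapshots.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = {}
    for company_id in current.keys() & previous.keys():
        before, after = previous[company_id], current[company_id]
        diff = {
            field: (before.get(field), after.get(field))
            for field in before.keys() | after.keys()
            if before.get(field) != after.get(field)
        }
        if diff:
            changed[company_id] = diff
    return {"added": added, "removed": removed, "changed": changed}
```

Only the changed fields need to be rescored each day, which keeps the weekly refresh cheap relative to rescoring the full universe.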

What you get

  • The rubric, versioned and auditable.
  • The weekly top-tier delivery into your CRM.
  • Disqualifier tracking — knowing what was filtered out and why.
  • Source-level reliability scoring so the partner reading the list knows what to trust.
  • Runbook for the data ops your team owns post-handoff.
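One way source-level reliability scoring could feed the ranking is a reliability-weighted mean of signal strengths. The sketch below is an assumption about the approach, not a description of the delivered system; the source names, the reliability values, and the default weight are hypothetical.

```python
def weighted_signal(signals: list, reliability: dict, default: float = 0.5) -> float:
    """Reliability-weighted mean of signal strengths.

    signals: list of (source_name, strength) pairs, strength in 0..1.
    reliability: source_name -> weight in 0..1; unknown sources get `default`.
    Returns 0.0 when there are no signals or every weight is zero.
    """
    weighted_total = 0.0
    weight_total = 0.0
    for source, strength in signals:
        weight = reliability.get(source, default)
        weighted_total += weight * strength
        weight_total += weight
    return weighted_total / weight_total if weight_total else 0.0
```

For example, a strong signal from a source weighted 0.9 and an absent signal from a source weighted 0.3 combine to 0.75, so the more reliable source dominates the result.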
§ How we engage

Engagement is a shape, not a price list.

Length and price are functions of the data and the destination. The shape below is the typical engagement.

Length
6–10 week build · ongoing data ops

Scoped during the discovery call against the actual data and the operation it integrates with.

Lead
Bogdan

Principal engineer. Architecture and most of the code ship through one keyboard.

Cadence
Async, weekly

Written updates in between; calls when a decision needs the room.

Bar
Production

Async correctness, capacity under burst, observability at every boundary.

§ Questions

What buyers ask about this one.

  • We already use PitchBook + Affinity. What's different?

    Those are excellent universe and CRM tools. The work is the ranking layer on top — your fund's thesis encoded as scoring rules, ingestion of the unstructured signals PitchBook misses (hiring deltas, patent filings, supplier-relationship changes), and the weekly refresh that gets the top tier in front of the deal team without manual triage.

  • How do you encode our thesis?

    First engagement: discovery interviews with the partners and the deal team, structured against the fund's mandate. The thesis becomes a scoring rubric (sector, size, growth, margin profile, defensibility, ownership) and a set of disqualifiers. The rubric is auditable and versioned — when the thesis evolves, the rubric evolves with it.
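    A sketch of one way to make rubric versions auditable: derive a short fingerprint from the rubric's content, so any change to a weight or disqualifier yields a new identifier that can be stamped on every scored list. The rubric layout and the function name are hypothetical.

```python
import hashlib
import json


def rubric_fingerprint(rubric: dict) -> str:
    """Return a 12-character content hash of a JSON-serializable rubric.

    Keys are sorted before hashing, so two rubrics with the same content
    get the same fingerprint regardless of key order.
    """
    canonical = json.dumps(rubric, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

    Storing the fingerprint next to each weekly delivery lets a partner trace any ranked company back to the exact rubric that scored it.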

  • What alt-data sources do you integrate?

    Hiring activity (LinkedIn / job-board signals), patent filings (USPTO + WIPO), product-launch tracking, supplier-and-customer mention graphs from public news, employee-review-site trajectory, executive-departure signals. Per-engagement, the source list is scoped to what actually moves the rubric for your fund's sector.

  • We've seen vendors promise '3× pipeline' — is that real?

    The honest answer: it depends on your baseline. Funds with strong existing sourcing networks see incremental lift; funds whose sourcing was reactive see a step-change. We don't promise a multiplier. We'd rather scope to a concrete outcome (e.g., top-50 surfaced weekly, partner-time on triage cut by N hours) and measure against that.

  • Pricing?

    Scoped against source coverage and CRM-integration depth. Discovery call covers both.

§ The next step

If the deliverable matches the gap, the next step is one call.

We'll scope length and price against your data and the operation it integrates with. No retainer, no fishing.

Bogdan and team · async-first · OP—2026