
Scoring ventures: the rubric, the weights, and the things we refuse to score.

Every company that reaches an investor on Alphaneo is graded against the same six-pillar rubric. This is the full methodology — every dimension, every weight, every input, and an honest list of the variables we deliberately leave out because they can't be measured without lying.

Alphaneo Research
Published Feb 6, 2026
18 min read
Version 3.1

A scoring framework is only as useful as the things it excludes. The hardest work we did building this rubric wasn't choosing the six dimensions — that was obvious from two decades of venture practice. It was drawing the line between what we can score honestly and what we cannot, and refusing to pad the rubric with variables that make it feel more rigorous while making the output worse.

Why we score at all

Venture investing is a judgment business. It will remain a judgment business. We are not proposing that a rubric replaces an analyst's view — we are proposing that the rubric makes the analyst's view legible. When an investor reads a scorecard on Alphaneo, they should be able to reconstruct how we arrived at each number, disagree with specific inputs, and form their own view on the same underlying evidence.

That is the whole point. The score is the wrapper. The evidence is the product.

A good rubric doesn't make investing easier. It makes disagreement productive.

The six dimensions, and the weights

The rubric covers six dimensions. Each is scored on a 0.00–1.00 scale by the primary analyst, scored independently by a second analyst, and reconciled at a weekly coverage meeting. The weights below are fixed — they do not flex by sector, stage, or deal size.

Dimension            Weight
Team & Execution     20%
Market & TAM         15%
Product & Moat       20%
Unit Economics       20%
Round & Valuation    15%
Risk & Disclosure    10%

You will notice three dimensions carry equal weight at 20%: team, product, and unit economics. That is deliberate. We ran historical regressions against our prior coverage — which goes back farther than the Alphaneo platform, drawing on the analyst team's institutional work — and these three dimensions are the only ones whose scores meaningfully predicted realized IRR five years out. Market and valuation are weighted lower not because they matter less, but because they are more often wrong at the time of underwriting and correct themselves more readily in public evidence.

Fixed weights, on purpose

We are repeatedly asked to flex the weights by sector — more for moat in deep tech, less for unit economics in climate, etc. We decline. Once weights are negotiable, the scorecard stops being a discipline and starts being an output we can tune to match our prior. The weights are the rubric.

Dimension 01 — Team & Execution
Weight · 20%

We score founder-market fit, the specific evidence that this team can ship, and the depth of the bench behind them. We do not score pedigree — Stanford and MIT are not inputs. What we want is a recruiting pattern: has this founder hired people who have already shipped what the company is trying to build?

Scoring anchors to evidence of shipped outcomes, not on-paper credentials. A repeat founder with one small prior exit scores higher than a first-time founder with a thesis and a Crunchbase page.

What we score
  • Prior shipped products (as founder or operator)
  • Recruiting density vs. category average
  • Depth of the top-five leadership bench
  • Retention of early hires into Series B+
  • Founder's written record on the problem
What we don't score
  • School, degree, or academic pedigree
  • Prior employer brand alone
  • Reference checks from the company's investors
  • Board composition (at this stage, a trailing indicator)

Dimension 02 — Market & TAM
Weight · 15%

We size the wedge, not the ambition. We build TAM bottom-up (unit price × reachable customers × realistic penetration) and discount any figure constructed from top-down macro numbers. If the deck's TAM slide uses a Gartner report as its only source, the dimension starts at 0.50.

Structural growth drivers matter. A shrinking category can still produce a great company, but the score has to reflect the headwind.
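To make the arithmetic concrete, here is a minimal sketch of the bottom-up build. Every input below is hypothetical, invented for illustration, and not drawn from any covered company:

```python
# Hypothetical inputs for a bottom-up TAM build (units x price x penetration).
# None of these figures come from a real memo; they only illustrate the shape.
reachable_customers = 12_000    # customers reachable within two sales cycles
unit_price = 30_000             # annual contract value per customer, USD
realistic_penetration = 0.08    # share of those customers the wedge can win

bottom_up_tam = reachable_customers * unit_price * realistic_penetration
print(f"Bottom-up TAM: ${bottom_up_tam:,.0f}")   # Bottom-up TAM: $28,800,000
```

The point of the exercise is that each factor is independently checkable; a top-down figure offers no such handle.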

What we score
  • Bottom-up TAM (units × price × penetration)
  • Structural category growth vs. GDP
  • Reachable customer count within two sales cycles
  • Competitive intensity of the wedge
  • Category re-rating precedent (public comps)
What we don't score
  • Top-down TAMs from analyst reports
  • Ambition ("we will create a new category")
  • Secondary markets the company hasn't entered
  • Total global addressable spend

Dimension 03 — Product & Moat
Weight · 20%

Differentiation that compounds — not differentiation that wears off at the next competitor raise. We grade on whether the moat deepens with scale (data, network effects, switching costs, distribution depth) or whether it is fundamentally a feature lead that a well-capitalized incumbent can close in twelve months.

We also score product feel. Founders describing their product in accurate, specific, testable terms tell us something. Founders describing the product in slogans tell us something else.

What we score
  • Source of moat (data, network, switching, distribution)
  • Whether the moat compounds or erodes with scale
  • Product parity against top two competitors
  • Customer-described switching cost (from interviews)
  • Release cadence and defect rate trend
What we don't score
  • "AI-native" as a moat claim
  • Patents (rarely defensible for venture-stage software)
  • Brand at pre-Series-C stages
  • Awards, press coverage, or Product Hunt placement

Dimension 04 — Unit Economics
Weight · 20%

This is where we do the most work. Gross margin composition, CAC payback, net revenue retention, burn multiple, and — most importantly — whether the reported gross margin is calibrated to the same definition that the comparable public companies use. Roughly 40% of decks we review show an inflated gross margin because hosting, ML compute, or delivery cost sits below the line. We rebuild it.

The single most predictive input across our historical coverage is the burn multiple at the time of underwriting. Companies that entered diligence with a burn multiple above 2.0× and did not bring it below that line within six months did not produce venture-scale outcomes.
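As a sketch of the two headline checks, assuming the standard definitions (burn multiple as net burn over net new ARR, CAC payback in months of gross-margin-adjusted revenue): the inputs below are hypothetical, and the gross-margin rebuild itself is not shown.

```python
# Minimal sketch, assuming the standard definitions. All inputs hypothetical.

def burn_multiple(net_burn: float, net_new_arr: float) -> float:
    """Dollars of net burn per dollar of net new ARR (trailing or forward)."""
    return net_burn / net_new_arr

def cac_payback_months(fully_loaded_sm: float, new_customers: int,
                       monthly_arpa: float, gross_margin: float) -> float:
    """Months of gross-margin-adjusted revenue needed to recover CAC,
    with CAC computed on fully loaded S&M spend."""
    cac = fully_loaded_sm / new_customers
    return cac / (monthly_arpa * gross_margin)

# A company burning $9M to add $4M of net new ARR sits at 2.25x,
# above the 2.0x line flagged above.
print(burn_multiple(9_000_000, 4_000_000))               # 2.25
print(cac_payback_months(2_400_000, 120, 3_000, 0.72))   # ~9.26 months
```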

What we score
  • Gross margin (rebuilt to public-comp definition)
  • CAC payback (on fully-loaded S&M)
  • Net revenue retention (cohort, not blended)
  • Burn multiple (trailing and forward)
  • Customer concentration of top 10
What we don't score
  • LTV / CAC ratio (too easy to manipulate)
  • "Contribution margin" as a substitute for GM
  • ARR defined on signed LOIs or pilots
  • Rule of 40 (composite that hides component decay)

Dimension 05 — Round & Valuation
Weight · 15%

Lead quality matters, structure matters more, and the price relative to comparable private and public benchmarks decides the rest. We flag any non-standard preference terms, ratchets, pay-to-play clauses, or option pool expansions that dilute existing holders. Clean rounds score higher than structured rounds at comparable marks — by roughly 0.10 of a full dimension point.

What we score
  • Lead investor's track record in the category
  • Preference structure (1× non-participating is the bar)
  • Cap table cleanliness & recent secondary activity
  • Price vs. comparable private/public comps
  • Option pool top-up and its dilutive effect
What we don't score
  • Brand of the lead without category relevance
  • "Strategic" investor adjacency claims
  • Marketing-mark vs. real-cash-in valuation
  • SAFE stacks without concrete conversion math

Dimension 06 — Risk & Disclosure
Weight · 10%

The smallest weight, and the one we use most often to decide whether a company lists at all. We score what the company volunteered versus what we had to chase. Data quality, audit posture, key-person concentration, litigation, regulatory exposure, and customer concentration all feed in. A company that proactively surfaces its weakest metric gets credit. A company that makes us subpoena basic cohort data does not.

What we score
  • Audit quality & financial control history
  • Key-person concentration & succession
  • Regulatory exposure (posture, not just jurisdiction)
  • Customer concentration beyond top 10
  • Proactive disclosure of weak metrics
What we don't score
  • ESG composite scores from third parties
  • Generic "macro" risk narrative
  • Black-swan scenarios the company can't control
  • Insurance coverage as a substitute for process

From score to grade

Each dimension produces a 0.00–1.00 score. The composite is a simple weighted average. The composite maps to a letter grade that anchors the reader's expectation — but the composite and the dimension-level scores are always shown together. We do not publish grades without the underlying numbers.

Table 1 · Composite-to-grade mapping
Composite range    Grade    Coverage posture
0.90–1.00          A+       High conviction · lead or co-lead candidate
0.85–0.89          A        Active coverage · full memo
0.80–0.84          A−       Active coverage · full memo
0.75–0.79          B+       Listed with caveats · abbreviated memo
0.70–0.74          B        Listed with caveats · abbreviated memo
< 0.70             —        Does not list on Alphaneo
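A minimal sketch of the composite arithmetic and the Table 1 mapping. The weights and thresholds are the ones above; the example dimension scores are hypothetical:

```python
# Fixed weights from the rubric; grade bands from Table 1.
WEIGHTS = {
    "team_execution":  0.20,
    "market_tam":      0.15,
    "product_moat":    0.20,
    "unit_economics":  0.20,
    "round_valuation": 0.15,
    "risk_disclosure": 0.10,
}
GRADE_BANDS = [(0.90, "A+"), (0.85, "A"), (0.80, "A-"),
               (0.75, "B+"), (0.70, "B")]

def composite(scores: dict[str, float]) -> float:
    # Simple weighted average over the six 0.00-1.00 dimension scores.
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

def grade(c: float) -> str:
    for floor, letter in GRADE_BANDS:
        if c >= floor:
            return letter
    return "Does not list"   # the non-negotiable 0.70 line

# Hypothetical dimension scores, for illustration only.
scores = {"team_execution": 0.90, "market_tam": 0.60, "product_moat": 0.85,
          "unit_economics": 0.80, "round_valuation": 0.70, "risk_disclosure": 0.75}
c = composite(scores)
print(f"{c:.2f} -> {grade(c)}")   # 0.78 -> B+
```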

The line at 0.70 is not negotiable. We have declined companies at 0.69 that later produced strong outcomes. We accept that. The alternative — moving the line for specific deals — is the mechanism by which every scoring framework eventually loses its integrity.

The score is the wrapper. The evidence is the product. The grade is a courtesy to the reader.

Things we refuse to score

Every rubric faces pressure to expand. Add a "vision" score. Add a "culture" score. Add a "founder psychological profile" score. We decline. Not because these variables don't matter — they plainly matter — but because we cannot score them without introducing error that swamps whatever signal they contain.

Refused 01 · Founder "quality" as a standalone score

We score shipped evidence under Team & Execution. A separate "founder quality" score becomes a proxy for analyst affinity — it correlates with who the analyst enjoyed talking to, not with outcomes. We have tested this repeatedly. It does not survive out-of-sample validation.

Refused 02 · Culture & values

Culture is real. Its effect on outcomes is real. Our ability to measure it from the outside, in the time a diligence process allows, is not. Asking three employees on a reference call produces survivorship-biased signal. We leave this to the investor who can do the work post-investment.

Refused 03 · Vision or "TAM of the ambition"

We score the wedge. We write the ambition into the memo narrative. We do not let the ambition move the Market & TAM score. If the bottom-up numbers don't support the round, the vision slide doesn't rescue it.

Refused 04 · ESG / impact composites

Third-party ESG scores are methodologically incoherent at venture stage. We report specific regulatory and governance facts under Risk & Disclosure. We do not produce a composite ESG number because we don't believe anyone produces a useful one.

Refused 05 · AI-nativeness, Web3-ness, or any other categorical descriptor

A company is or is not the thing. It shows up in the product and the unit economics. We don't give bonus points for being in a fashionable category, and we don't penalize for being outside one.

How scores change

Scores are not frozen at publication. We re-rate on material events — new financials, round extensions, notable customer wins or losses, regulatory actions, and key-person changes. A material re-rate triggers a second memo, and the scorecard shows the prior number alongside the new one so readers can see the direction of travel.

Quarterly review

Every listed company is reviewed at least once per quarter regardless of events. If nothing has changed, the review note says so — plainly, in one paragraph — and the score stands. Silence is a valid output, as long as it's documented.

Calibration — how we know the rubric is working

A rubric is only as good as its calibration against outcomes. We track two things: inter-analyst agreement (do two analysts independently land within 0.05 of each other?) and out-of-sample IRR correlation (do our composite scores correlate with realized IRR five years out?). The current numbers are respectable but not heroic: inter-analyst agreement of 0.86 (the share of score pairs landing within the 0.05 tolerance) and an out-of-sample correlation of +0.41 on a pre-Alphaneo backtest spanning 212 companies from 2015–2020.
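In code terms, the two checks reduce to a tolerance rate and a correlation. The sketch below is illustrative only: it assumes a Pearson estimator (this document does not specify which estimator the desk uses) and runs on synthetic data, not the actual backtest.

```python
import numpy as np

def inter_analyst_agreement(primary: np.ndarray, second: np.ndarray,
                            tol: float = 0.05) -> float:
    """Share of score pairs where the two analysts land within tol."""
    return float(np.mean(np.abs(primary - second) <= tol))

def irr_correlation(composites: np.ndarray, realized_irr: np.ndarray) -> float:
    """Correlation of composite scores against realized five-year IRR.
    Pearson here, which is an assumption; the estimator isn't specified above."""
    return float(np.corrcoef(composites, realized_irr)[0, 1])

# Synthetic stand-in for a 212-company backtest, for illustration only.
rng = np.random.default_rng(0)
primary = rng.uniform(0.40, 0.95, 212)
second = np.clip(primary + rng.normal(0.0, 0.03, 212), 0.0, 1.0)
print(inter_analyst_agreement(primary, second))   # roughly 0.9 on this synthetic noise
```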

A correlation of 0.41 is not magic. It is, however, better than unweighted analyst vibes, which backtest at approximately 0.19 against the same sample. The rubric's job is to beat vibes. It does, by enough to be worth the work.


What this document is not

It is not a promise about returns. It is not a recommendation for any specific security. It is a disclosure of how we do the work, published so that investors on Alphaneo can disagree with our method as precisely as they disagree with our conclusions. If you believe we are weighting Unit Economics too heavily, that is a real argument — we welcome it, and we have had it internally more than once.

The rubric is version 3.1. Version 4.0 will arrive when the evidence demands it, not on a schedule.


This methodology document reflects the scoring framework used by the Alphaneo research desk as of the publication date. Scoring is performed by Alphaneo analysts and may not reflect the view of any third party. This document does not constitute investment advice. Full disclosures at alphaneo.ai/legal.
