This section describes how extracted evidence is converted into edges and clusters. It covers signal weighting, admission gates, conflict resolution, safeguards, and transitive controls. The objective is to ensure that linkages are precise, reproducible, and resilient under adversarial variation. All thresholds and constants are omitted by design.

7.1 Signal Contribution

Each pivot contributes weight subject to rarity adjustment, hub suppression, and temporal decay. Rarity is rewarded through an IDF term that is capped per class, high-degree hubs are downweighted, and aged signals lose influence over time.
Formal definition
For a signal $s$:
$$\mathrm{contribution}(s) = w_{\mathrm{class}(s)} \cdot \min\{ \mathrm{IDF}(s), \tau^\star_{\mathrm{class}} \} \cdot \phi_{\mathrm{hub}}(s) \cdot \lambda(s),$$
where $\phi_{\mathrm{hub}}$ suppresses high-degree nodes and $\lambda(s)$ applies age-dependent decay. Parameters omitted by design.
Invariants
  • Rare and recent signals dominate
  • Hub suppression prevents clustering on common infra
  • Decay logic handles pruning of stale evidence
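
The contribution formula above can be sketched in a few lines of Python. This is a minimal illustration, not the production scorer: the class weights, IDF caps, the logarithmic IDF, the hub-suppression form, and the half-life decay are all assumptions chosen for readability, since the source omits the real parameters by design.

```python
import math
from dataclasses import dataclass

# All constants are placeholders; the real values are omitted by design.
CLASS_WEIGHT = {"hard": 1.0, "medium": 0.5, "soft": 0.2}  # w_class (hypothetical)
IDF_CAP = {"hard": 8.0, "medium": 5.0, "soft": 3.0}       # tau*_class (hypothetical)
HALF_LIFE_DAYS = 90.0                                      # decay rate (hypothetical)


@dataclass
class Signal:
    cls: str          # "hard", "medium", or "soft"
    doc_freq: int     # number of entities exhibiting this signal
    corpus_size: int  # total number of entities observed
    degree: int       # number of nodes the pivot value connects (hub size)
    age_days: float   # time since the signal was last observed


def idf(s: Signal) -> float:
    # Assumed form: plain logarithmic inverse document frequency.
    return math.log(s.corpus_size / max(s.doc_freq, 1))


def hub_factor(s: Signal) -> float:
    # phi_hub, assumed form: equals 1 for degree 1 and shrinks as degree grows.
    return 1.0 / (1.0 + math.log1p(max(s.degree - 1, 0)))


def decay(s: Signal) -> float:
    # lambda, assumed form: exponential half-life decay.
    return 0.5 ** (s.age_days / HALF_LIFE_DAYS)


def contribution(s: Signal) -> float:
    capped_idf = min(idf(s), IDF_CAP[s.cls])
    return CLASS_WEIGHT[s.cls] * capped_idf * hub_factor(s) * decay(s)
```

Under these assumptions, a rare signal seen yesterday on two nodes outweighs a common signal seen a year ago on a large hub, which matches the first two invariants.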

7.2 Typed Edge Gates

Edges are admitted through class-stratified gates. Hard evidence can be admitted alone, medium evidence requires diversity, and soft evidence cannot be promoted on its own.
Formal definition
$$\begin{aligned} \text{Hard gate: } & w_{\mathrm{hard}} \ge \tau^\star_h \\ \text{Medium gate: } & (k_{\mathrm{med}} \ge 2) \land (w_{\mathrm{total}} \ge \tau^\star_m) \\ \text{Soft-only: } & w_{\mathrm{hard}}=w_{\mathrm{med}}=0 \;\Rightarrow\; \text{reject} \end{aligned}$$
Invariants
  • Deterministic pivots dominate
  • Medium pivots require multiple classes
  • Soft evidence is supplemental only
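
The three gates can be expressed as a single decision function. The sketch below is one possible reading: it assumes $k_{\mathrm{med}}$ counts distinct medium pivot classes and that $w_{\mathrm{total}}$ sums weight across all tiers. The thresholds and the pivot class names used in the usage example are hypothetical.

```python
from collections import defaultdict

TAU_HARD = 1.0    # tau*_h (hypothetical)
TAU_MEDIUM = 1.5  # tau*_m (hypothetical)


def admit_edge(signals):
    """signals: list of (tier, pivot_class, weight). Returns (admitted, reason)."""
    tier_weight = defaultdict(float)
    medium_classes = set()
    for tier, pivot_class, weight in signals:
        tier_weight[tier] += weight
        if tier == "medium":
            medium_classes.add(pivot_class)

    w_hard = tier_weight["hard"]
    w_med = tier_weight["medium"]
    w_total = sum(tier_weight.values())

    if w_hard >= TAU_HARD:
        return True, "hard gate"
    if w_hard == 0 and w_med == 0:
        return False, "soft-only"
    if len(medium_classes) >= 2 and w_total >= TAU_MEDIUM:
        return True, "medium gate"
    return False, "below gates"
```

For example, `admit_edge([("medium", "favicon", 0.9), ("medium", "phone", 0.7)])` passes the medium gate, while two `favicon` signals of the same weight do not, because they represent a single class.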

7.3 Conflict Handling

Contradictory evidence overrides raw score. Payment mutexes, TLS issuer contradictions, geographic mismatches, and placeholder surfaces reduce or veto link strength.
Formal definition
$$\mathrm{linkScore}(A,B) = w(A,B) - \Delta(\gamma(A,B)),$$
where $\gamma$ is a normalized conflict score and $\Delta(\cdot)$ applies either subtraction or rejection. Parameters omitted by design.
Invariants
  • Logical contradictions override weight
  • Partial conflicts reduce confidence
  • Reasons are preserved for auditability
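
A small sketch of the link score follows. It assumes that $\gamma$ is the maximum severity among the observed conflicts, that $\Delta$ is linear below a veto level, and that rejection is modelled as an infinite penalty. None of these choices come from the source, and the conflict names are hypothetical.

```python
import math

VETO_LEVEL = 1.0     # gamma at or above this rejects outright (hypothetical)
PENALTY_SCALE = 2.0  # slope of the partial-conflict penalty (hypothetical)


def delta(gamma):
    """Penalty for a normalized conflict score gamma in [0, 1]."""
    if gamma >= VETO_LEVEL:
        return math.inf  # rejection: no weight can overcome it
    return PENALTY_SCALE * gamma


def link_score(weight, conflicts):
    """conflicts: dict of reason -> severity in [0, 1]. Returns (score, reasons)."""
    gamma = max(conflicts.values(), default=0.0)
    # Keep the reasons alongside the score so the decision stays auditable.
    reasons = sorted(r for r, severity in conflicts.items() if severity > 0)
    return weight - delta(gamma), reasons
```

With these placeholder values, `link_score(3.0, {"geo_mismatch": 0.25})` yields a reduced score of `2.5`, whereas a full-severity payment mutex drives the score to negative infinity regardless of the raw weight.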

7.4 Diversity and Safeguards

Additional safeguards prevent spurious links and enforce evidence diversity:
  • Infra-only disallow: infrastructure pivots alone cannot admit edges
  • Diversity requirement: non-hard links must include multiple asset-like classes and a score floor
  • DOM reinforcement: approximate DOM similarity contributes a bounded bonus but never suffices on its own
Invariants
  • Blocks links formed only by infrastructure coincidences
  • Ensures multi-faceted support for edges without hard pivots
  • Allows structural reinforcement without over-reliance on layout alone
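
The three safeguards can be combined into one check, sketched below under stated assumptions. The set of infrastructure classes, the bonus cap, the score floor, and the minimum class count are hypothetical placeholders, and the treatment of links with a hard pivot (exempt from the diversity requirement) is an interpretation of "non-hard links".

```python
INFRA_CLASSES = {"ip", "asn", "nameserver", "cdn"}  # hypothetical class names
DOM_BONUS_CAP = 0.3     # upper bound on the DOM similarity bonus (hypothetical)
SCORE_FLOOR = 1.0       # minimum score for non-hard links (hypothetical)
MIN_ASSET_CLASSES = 2   # required asset-like class diversity (hypothetical)


def passes_safeguards(pivots, dom_similarity, has_hard):
    """pivots: dict of pivot_class -> weight; dom_similarity in [0, 1]."""
    asset = {c: w for c, w in pivots.items() if c not in INFRA_CLASSES}
    if not asset:
        return False  # infra-only (or DOM-only) evidence never admits an edge
    if has_hard:
        return True   # diversity requirement applies to non-hard links only
    bonus = min(max(dom_similarity, 0.0), 1.0) * DOM_BONUS_CAP
    score = sum(pivots.values()) + bonus
    return len(asset) >= MIN_ASSET_CLASSES and score >= SCORE_FLOOR
```

In this sketch the DOM bonus can lift a borderline link over the floor, but a link with no asset-like pivots is rejected before the bonus is ever considered.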

7.5 Negative Evidence Veto

Veto rules exclude edges arising from weak overlaps or known false commonalities. These checks are applied before final admission.
Predicates include
  • Shared agency accounts without corroborating identity
  • Vendor templates or placeholder surfaces reused across many operators
  • Lone shared extension without other overlap
  • Geographic mismatches when identity is weak
  • Payment or TLS contradictions
Invariants
  • High-precision exclusion suppresses false positives
  • Strong identity is always required for survival
  • Veto application is logged for traceability
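
One way to organise the predicates is as a named list that is evaluated and logged before final admission. The sketch below covers only the predicates listed above; the evidence field names and the logger name are hypothetical, and the rule for a "lone" extension (at most one overlapping feature) is an assumption.

```python
import logging

log = logging.getLogger("link.veto")  # hypothetical logger name

# Each predicate inspects a dict of evidence flags; all field names are hypothetical.
VETO_PREDICATES = [
    ("agency_account_without_identity",
     lambda e: e.get("shared_agency_account") and not e.get("strong_identity")),
    ("vendor_template",
     lambda e: e.get("vendor_template")),
    ("lone_shared_extension",
     lambda e: e.get("shared_extension") and e.get("overlap_count", 0) <= 1),
    ("geo_mismatch_weak_identity",
     lambda e: e.get("geo_mismatch") and not e.get("strong_identity")),
    ("payment_or_tls_contradiction",
     lambda e: e.get("payment_conflict") or e.get("tls_conflict")),
]


def apply_vetoes(edge_id, evidence):
    """Return the names of all vetoes that fire; an empty list means none fired."""
    fired = [name for name, predicate in VETO_PREDICATES if predicate(evidence)]
    for name in fired:
        log.info("veto edge=%s reason=%s", edge_id, name)  # traceability
    return fired
```

Returning every fired veto, rather than stopping at the first, keeps the full rationale available for later audit.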

7.6 Bounded Transitive Expansion

Soft evidence chains are explicitly bounded. Let $G=(V,E)$ and let $E_{\mathrm{soft}}$ be the set of soft edges. A proposed soft edge $(u,v)$ is rejected if it would create a path
$$u \leadsto v \quad \text{using only } E_{\mathrm{soft}} \quad \text{with length} > L^\star.$$
Parameters omitted by design.
Baseline contrast
Unbounded chaining of weak overlaps causes “snowballing,” where unrelated nodes are merged through repeated soft coincidences. The bounded predicate localizes soft pivots and prevents cluster inflation.
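
The bounded predicate admits more than one implementation. The sketch below takes one conservative reading: before adding a soft edge, it measures how far soft-only chains already extend from each endpoint (by breadth-first search) and rejects the edge if the chain running through it would exceed the bound. The value of `MAX_SOFT_CHAIN` and the class name `SoftGraph` are hypothetical.

```python
from collections import defaultdict, deque

MAX_SOFT_CHAIN = 2  # L* (hypothetical)


def eccentricity(adj, start):
    """Longest breadth-first distance from start, following soft edges only."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values())


class SoftGraph:
    def __init__(self):
        self.adj = defaultdict(set)  # soft edges only

    def try_add(self, u, v):
        """Add soft edge (u, v) unless the chain through it would exceed the bound."""
        chain = eccentricity(self.adj, u) + 1 + eccentricity(self.adj, v)
        if chain > MAX_SOFT_CHAIN:
            return False
        self.adj[u].add(v)
        self.adj[v].add(u)
        return True
```

With a bound of 2, the edges `a-b` and `b-c` are accepted, but `c-d` is rejected because it would extend the soft chain `a-b-c` to length 3. The check is conservative when `u` and `v` are already soft-connected, since it may overestimate the resulting chain.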

7.7 Community Detection

Admitted edges form a weighted graph. Clusters are extracted using modularity-based community detection.
Formal definition
$$Q = \frac{1}{2m} \sum_{u,v} \Big( A_{uv} - \frac{k_u k_v}{2m} \Big)\, \delta(c_u, c_v),$$
where $A_{uv}$ are edge weights, $k_u$ are (weighted) node degrees, $m$ is the total edge weight, and $c_u$ are community assignments. Maximization of $Q$ yields partitions; the fallback is connected components when community detection is unavailable. Parameters omitted by design.
Baseline contrast
Using only connected components tends to over-merge clusters when a single bridge exists. Modularity-based detection favors dense substructures and suppresses merges across weak bridges.
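
The contrast can be checked numerically. The sketch below evaluates $Q$ for a given partition using the equivalent per-community form $Q = \sum_c \big[ L_c/m - (d_c/2m)^2 \big]$, where $L_c$ is the internal edge weight and $d_c$ the total degree of community $c$, and compares it with the connected-components fallback. It only scores partitions; a real optimizer would search over them. The example graph is invented for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """edges: {(u, v): weight}, each undirected edge listed once, no self-loops."""
    degree = defaultdict(float)
    internal = defaultdict(float)  # L_c: internal edge weight per community
    m = 0.0
    for (u, v), w in edges.items():
        degree[u] += w
        degree[v] += w
        m += w
        if community[u] == community[v]:
            internal[community[u]] += w
    total = defaultdict(float)     # d_c: summed degree per community
    for node, k in degree.items():
        total[community[node]] += k
    return sum(internal[c] / m - (total[c] / (2 * m)) ** 2 for c in total)


def connected_components(edges):
    """Fallback clustering: union-find over all edges, ignoring weights."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return {node: find(node) for node in parent}


# Two dense triangles joined by a single weak bridge (illustrative data).
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
         ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
         ("c", "d"): 0.1}
merged = connected_components(edges)
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
```

Here the fallback places all six nodes in one cluster, for which $Q$ is zero, while the two-triangle partition scores strictly higher, so a modularity maximizer would prefer to cut the weak bridge.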

7.8 Candidate Generation

Candidate edges are proposed upstream through approximate similarity search in embedding space.
Formal definition
For a domain $D$ with embedding $f(D)$:
$$N_k(D) = \{ \text{top-}k\ \mathrm{neighbors}\ \text{of } f(D) \},$$
with $k$ chosen adaptively (omitted). Candidate pairs are $\{(D,D') : D' \in N_k(D)\}$.
Baseline contrast
Exhaustive enumeration is $O(N^2)$; approximate similarity retrieval scales sub-quadratically. Prefilters remove trivial overlaps (e.g. provider-wide or CDN-wide artifacts) before scoring.
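
The neighbor set $N_k(D)$ can be illustrated with an exact top-$k$ search. Note that this sketch is the quadratic baseline, not the approximate retrieval described above: an approximate nearest-neighbor index would replace the inner scan. Cosine similarity, a fixed `k`, and deduplication into unordered pairs are assumptions made for the example.

```python
import heapq
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidate_pairs(embeddings, k):
    """embeddings: dict of domain -> vector. Returns a set of unordered pairs."""
    pairs = set()
    for domain, vec in embeddings.items():
        # Exact scan over all other domains; an ANN index would go here.
        scored = ((cosine(vec, other_vec), other)
                  for other, other_vec in embeddings.items() if other != domain)
        for _, neighbor in heapq.nlargest(k, scored):
            pairs.add(tuple(sorted((domain, neighbor))))
    return pairs
```

Only the pairs returned here would proceed to prefiltering and scoring, which keeps the downstream gates from evaluating every possible pair.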

7.9 Outcome Properties

System outcomes reflect the guarantees of the scoring and correlation framework:
  • Explainability: every admitted edge carries metadata on weight, rarity caps, decay, conflicts, and veto results
  • Precision-first growth: infra-only and soft-only links are excluded; medium pivots must pass diversity; hard pivots dominate
  • Stability: reproducibility is measured via bootstrap resampling, with cluster Jaccard index enforced:
$$J = \frac{|C^{(1)} \cap C^{(2)}|}{|C^{(1)} \cup C^{(2)}|}, \quad J \ge J^\star.$$
  • Scalability: clustering methods partition graphs at the $10^5$ to $10^6$ node scale under configured resources
  • Auditability: all admitted and rejected edges retain their rationale, enabling external verification
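
The stability criterion reduces to a set comparison between the same cluster as recovered in two runs. A minimal sketch follows, assuming each cluster is represented as a set of node identifiers and that every resampled run must meet the floor. The value of `J_MIN` is hypothetical, and the resampling and re-clustering steps are left out.

```python
J_MIN = 0.8  # J* (hypothetical)


def jaccard(cluster_a, cluster_b):
    a, b = set(cluster_a), set(cluster_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def is_stable(reference, resampled_runs, j_min=J_MIN):
    """True if the cluster reappears with Jaccard >= j_min in every resampled run."""
    return all(jaccard(reference, run) >= j_min for run in resampled_runs)
```

For instance, a five-node cluster that loses one member in a resampled run has a Jaccard index of 0.8 and would just meet this placeholder floor.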