# Blockchain Analytics

This page provides an overview of blockchain analytics within Ospree’s compliance framework. It explains why blockchain analytics matters for identifying and assessing financial crime risks, and how institutions can use it to enhance compliance controls and support more informed decision-making. It also includes technical details for users seeking a deeper understanding of how these capabilities are implemented in practice.

{% hint style="success" %}
For technical API details on blockchain analytics, please refer to the following [section](https://docs.ospree.io/ospree-api/blockchain-analytics/create-address).
{% endhint %}

### Understanding Blockchain Analytics Data in Ospree

Ospree’s Blockchain Analytics module helps compliance, risk, and operations teams interpret blockchain exposure in a structured, actionable way. To support transaction monitoring, wallet screening, investigations, and Travel Rule decisions, the interface distills complex intelligence into four key fields:

* **Risk Score:** The quantified risk level of a transaction or wallet.
* **Sources:** The underlying analytics providers supplying the intelligence.
* **Risk Score Descriptors:** The specific behaviors or typologies triggering the score.
* **Controlling Entity:** The real-world organization or service attributed to the wallet.
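As a rough mental model, the four fields can be thought of as one record per screened object. The sketch below is illustrative only: the class name `ScreeningResult`, its attribute names, and the provider name `ExampleProvider` are assumptions made for this example and do not describe Ospree's actual API schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScreeningResult:
    """Hypothetical container mirroring the four fields shown in the interface."""

    # Kept as the provider's raw value (e.g. "100%" or "high"), not re-scaled.
    risk_score: str
    # Analytics provider that produced this result.
    source: str
    # Tags explaining why the score was assigned.
    risk_score_descriptors: List[str] = field(default_factory=list)
    # None when the provider has no attribution for the wallet.
    controlling_entity: Optional[str] = None


# Example record with made-up values.
example = ScreeningResult(
    risk_score="100%",
    source="ExampleProvider",
    risk_score_descriptors=["Sanctioned OFAC"],
)
print(example)
```

Storing the score as a raw string reflects the fact that different sources may report it on different scales.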

At a high level, this data is produced through a combination of on-chain data ingestion, wallet clustering heuristics, entity attribution, and risk modeling.

Ospree does not depend on a single proprietary view of blockchain risk. Instead, our value proposition is to provide a modular, data-agnostic compliance layer that integrates analytics from multiple specialist providers. This allows customers to cross-reference data, improve asset coverage, reduce false positives by corroborating results across sources, and make more defensible compliance decisions.

<figure><img src="https://4249894339-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FQg7OS46Jha8pRGgGDSDo%2Fuploads%2FmZBGsvsHbQnGtvSHqore%2FBody%20(1).png?alt=media&#x26;token=a556ac96-54a0-44f9-9951-3b4d7eaf1dd4" alt=""><figcaption></figcaption></figure>

### How Blockchain Analytics Data Is Created

Before a blockchain analytics provider can assign any score or label, it must first collect and structure raw blockchain activity. Providers such as Chainalysis, TRM Labs, Elliptic, Global Ledger, Crystal, Blockchain Intelligence Group, and others run their own infrastructure to ingest blockchain data at scale. This usually includes archive nodes, transaction parsers, mempool listeners, and smart contract event indexers across many supported networks.

The raw data is then normalized into searchable graph networks and databases that can interpret both major blockchain models:

* UTXO-based chains: such as Bitcoin and Litecoin.
* Account-based chains: such as Ethereum, Polygon, and Avalanche.
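To make the difference between the two models concrete, the following minimal sketch turns one transaction of each kind into graph edges. The dictionary layouts (`txid`, `inputs`, `outputs`, `from`, `to`) and function names are simplified assumptions for illustration; real provider pipelines use far richer schemas.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source node, target node)


def utxo_tx_to_edges(tx: Dict) -> List[Edge]:
    """UTXO model: many inputs and many outputs meet in one transaction node."""
    tx_node = "tx:" + tx["txid"]
    edges = [(addr, tx_node) for addr in tx["inputs"]]
    edges += [(tx_node, addr) for addr in tx["outputs"]]
    return edges


def account_tx_to_edges(tx: Dict) -> List[Edge]:
    """Account model: a transfer is a single edge from sender to recipient."""
    return [(tx["from"], tx["to"])]


# Made-up example transactions.
btc_like = {"txid": "t1", "inputs": ["a", "b"], "outputs": ["c"]}
eth_like = {"from": "0xsender", "to": "0xrecipient"}

print(utxo_tx_to_edges(btc_like))
print(account_tx_to_edges(eth_like))
```

Keeping the transaction as its own node on UTXO chains preserves the fact that several inputs were spent together, which later clustering steps rely on.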

Once raw blockchain events are indexed, providers apply clustering logic. This is a core part of the analytics process. A single service, such as an exchange, may control thousands or millions of deposit addresses. The analytics engine uses heuristics to infer whether many addresses likely belong to the same operator. Examples include:

* Common-input ownership heuristic: Used on UTXO chains, where multiple inputs spent in the same transaction strongly suggest that one party controls all the corresponding keys. Collaborative transactions such as CoinJoin are a known exception.
* Change address detection: Where algorithms estimate which output returns the remaining value back to the sender.
* Deposit address sweeping analysis: Used on account-based chains, where many small user addresses systematically forward funds into a central hot wallet.
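The common-input ownership heuristic from the list above can be sketched with a union-find (disjoint-set) structure. This is a textbook illustration under simplifying assumptions, not any provider's production logic: each transaction is reduced to a list of its input addresses, and the function and class names are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class UnionFind:
    """Minimal disjoint-set structure keyed by address string."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: str, b: str) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_common_inputs(transactions: Iterable[List[str]]) -> List[Set[str]]:
    """Merge all addresses that appear together as inputs of a transaction."""
    uf = UnionFind()
    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        uf.find(first)  # register addresses from single-input transactions too
        for addr in inputs[1:]:
            uf.union(first, addr)
    groups: Dict[str, Set[str]] = defaultdict(set)
    for addr in list(uf.parent):
        groups[uf.find(addr)].add(addr)
    return list(groups.values())


# "a" and "b" are co-spent, then "b" and "c": all three end up in one cluster.
clusters = cluster_by_common_inputs([["a", "b"], ["b", "c"], ["d"]])
print(sorted(sorted(c) for c in clusters))
```

Because merges are transitive, a single misleading transaction can join two unrelated clusters, which is one reason providers combine several heuristics rather than relying on this one alone.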

Clustering alone does not identify the real-world owner; that requires entity attribution. Providers combine open-source intelligence (OSINT), active service interactions, internal investigations, sanctions data, and law enforcement disclosures to label wallet clusters with real-world names (e.g., an exchange, mixer, ransomware syndicate, darknet market, or sanctioned entity).
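A simplified way to picture attribution is label propagation: if any address in a cluster has a known label, the whole cluster inherits it. The sketch below assumes a hypothetical `known_labels` lookup built from the intelligence sources described above; the function name and data are invented for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def attribute_clusters(
    clusters: Iterable[Set[str]],
    known_labels: Dict[str, str],
) -> List[Tuple[Set[str], List[str]]]:
    """Attach every label found on any member address to the whole cluster."""
    attributed = []
    for cluster in clusters:
        labels = sorted({known_labels[a] for a in cluster if a in known_labels})
        attributed.append((cluster, labels))
    return attributed


# Made-up data: one labeled address is enough to name its cluster.
clusters = [{"a", "b", "c"}, {"d"}]
known_labels = {"b": "Example Exchange"}

for members, labels in attribute_clusters(clusters, known_labels):
    print(sorted(members), labels or ["unattributed"])
```

A cluster may collect more than one label, or none at all, so the result is kept as a list rather than a single name.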

After attribution, providers apply risk models that evaluate whether a wallet or transaction has exposure to categories of concern, such as fraud, theft, terrorism financing, or suspicious counterparties. Exposure may be measured directly or indirectly (tracking funds across multiple intermediate "hops"), depending on the provider’s specific methodology.
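Indirect exposure can be illustrated as a bounded breadth-first search over a transaction graph. This sketch assumes a hypothetical adjacency mapping from each address to its counterparties and a set of flagged addresses; actual provider methodologies typically also weigh amounts, direction, and timing, which are omitted here.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def hops_to_flagged(
    graph: Dict[str, List[str]],
    start: str,
    flagged: Set[str],
    max_hops: int,
) -> Optional[int]:
    """Return the fewest hops from start to any flagged address, or None."""
    if start in flagged:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not look beyond the configured hop limit
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            if neighbor in flagged:
                return dist + 1
            seen.add(neighbor)
            queue.append((neighbor, dist + 1))
    return None


# Made-up graph: wallet -> a -> b -> mixer
graph = {"wallet": ["a"], "a": ["b"], "b": ["mixer"]}
print(hops_to_flagged(graph, "wallet", {"mixer"}, max_hops=3))
print(hops_to_flagged(graph, "wallet", {"mixer"}, max_hops=2))
```

The hop limit matters: the same wallet can appear exposed under one provider's tracing depth and clean under another's.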

### Risk Score

The Risk Score is a summarized indicator of the risk associated with a blockchain address, wallet cluster, transaction, or counterparty, based on the analytics source selected in Ospree.

It is intended to help users quickly assess whether the screened object presents a heightened AML, CTF, sanctions, or fraud-related risk. In most cases, the score reflects a provider’s internal model, which may consider:

* Direct exposure to known illicit or sanctioned entities.
* Indirect exposure through intermediary wallets or hops.
* Behavioral patterns associated with typologies such as layering, obfuscation, or mixer usage.
* Proximity to identified high-risk services.
* Observed transaction flows across blockchains, bridges, or decentralized protocols.

A score of 100% risk does not necessarily mean ownership by a criminal actor. It typically means the provider has identified exposure or attribution that falls into its highest risk category according to its methodology. Customers should therefore treat the score as a decision-support signal, not as standalone proof.

It is also important to understand that risk scores are not fully standardized across the industry. One provider may use a 0–100 scale, another may use categories such as low, medium, high, and severe, and another may apply a weighted typology model. Ospree displays the score and related metadata exactly as provided by the source so that customers can review it in context.
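Because scales differ, a customer who wants to compare sources side by side may map each raw score onto an internal band on their own side. The sketch below is one possible approach, not Ospree functionality: the four bands, the category names, and the 25/50/75 cut-offs are all assumptions that an institution would replace with values from its own policy.

```python
from typing import Union

# Hypothetical internal bands: 0 = low, 1 = medium, 2 = high, 3 = severe.
CATEGORY_BANDS = {"low": 0, "medium": 1, "high": 2, "severe": 3}


def to_band(raw: Union[str, int, float]) -> int:
    """Map a provider's raw score (category name or 0-100 value) to a band."""
    if isinstance(raw, str):
        text = raw.strip().lower()
        if text in CATEGORY_BANDS:
            return CATEGORY_BANDS[text]
        raw = float(text.rstrip("%"))  # raises ValueError for unknown text
    if not 0 <= raw <= 100:
        raise ValueError(f"score out of range: {raw}")
    # Assumed cut-offs; adjust to the institution's risk appetite.
    if raw < 25:
        return 0
    if raw < 50:
        return 1
    if raw < 75:
        return 2
    return 3


print(to_band("Severe"), to_band("100%"), to_band(10), to_band(60))
```

The raw value should still be stored alongside the band so that an analyst can always see what the source actually reported.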

For this reason, Ospree’s multi-provider approach is highly valuable. A compliance team can compare results across sources and align the data with their own institutional risk appetite and internal policy rules to determine whether to allow, flag, hold, escalate, or block activity.
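A policy rule of this kind might look like the following sketch. It assumes each source's result has already been mapped to an integer band from 0 (low) to 3 (severe); the function name, the band-to-action mapping, and the rule that sends strongly conflicting results to an analyst are hypothetical choices, not a recommended or built-in policy.

```python
from typing import Dict

# Assumed mapping from the worst observed band to an action.
ACTION_BY_BAND = {0: "allow", 1: "flag", 2: "hold", 3: "block"}


def decide(bands_by_source: Dict[str, int]) -> str:
    """Pick an action from per-source bands (0 = low ... 3 = severe)."""
    if not bands_by_source:
        return "escalate"  # no analytics data: route to a human analyst
    worst = max(bands_by_source.values())
    best = min(bands_by_source.values())
    if worst - best >= 2:
        return "escalate"  # sources disagree strongly: needs manual review
    return ACTION_BY_BAND[worst]


# Made-up provider names and bands.
print(decide({"ProviderA": 0, "ProviderB": 0}))
print(decide({"ProviderA": 1, "ProviderB": 2}))
print(decide({"ProviderA": 0, "ProviderB": 3}))
```

Taking the worst band is a conservative default; an institution with a different risk appetite could instead require agreement between sources before holding or blocking.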

### Sources

The Source field identifies the blockchain analytics provider from which the displayed data originates. This field matters because analytics outputs are heavily provider-dependent. Different vendors vary significantly in:

* Blockchain coverage
* Attribution depth
* Update frequency
* Cross-chain tracing capability
* Typology taxonomy
* Scoring models
* Treatment of indirect exposure
* Confidence thresholds for attribution

For example, some providers are especially strong in law-enforcement-grade investigations, some excel in DeFi tracing and cross-chain movement, others specialize in visually mapping exposure, and some focus heavily on enterprise-grade data management. No single provider is universally best for every use case, asset, or jurisdiction.

Ospree’s approach is not to hide this complexity, but to make it manageable. By surfacing the source clearly, Ospree allows customers to understand exactly where the result came from, compare outputs across vendors where configured, and align internal controls with the specific source methodologies they trust.

In practice, the Source field helps answer a basic but critical question: *“Whose analytics interpretation am I looking at?”*

### Risk Score Descriptors

Risk Score Descriptors are qualitative tags or labels that explain *why* a score is high, medium, or otherwise relevant.

These descriptors usually represent exposure categories, entity types, or behavioral indicators identified by the analytics provider. In the example shown above, descriptors such as `Sanctioned OFAC`, `sent_to_binance`, or `sent_to_BitMex` indicate that the screened wallet or entity has known exposure patterns or connections that the provider considers relevant.

Descriptors can reflect several types of insight:

* **Regulatory or sanctions exposure**: Connection to an OFAC-sanctioned entity or wallet cluster.
* **Counterparty/service interaction**: Funds sent to or received from a major exchange, mixer, gambling service, darknet market, or bridge.
* **Behavioral or typology indicators**: Layering, rapid peel chains, or suspicious smart contract interactions.
* **Attribution context**: Association with ransomware infrastructure, fraud networks, or illicit service operators.

Descriptors are critical because the raw score alone is not enough for a defensible compliance decision. A high-risk score driven by direct sanctions exposure requires a fundamentally different operational response than one driven by historical interaction with a loosely regulated exchange. Compliance teams need context, not just severity.

Customers should also note that descriptors heavily depend on the provider’s specific naming conventions. Some labels are highly human-readable, while others may reflect internal taxonomies or machine-oriented tagging. Ospree surfaces these exactly as provided to preserve transparency and help analysts accurately understand the basis of the result.
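Since naming conventions differ by provider, some teams keep their own lookup from raw descriptors to internal categories. The sketch below uses the three example descriptors mentioned earlier; the category names and the `unmapped` fallback are assumptions for illustration, and a real mapping would be maintained per provider.

```python
from typing import Dict, Iterable

# Hypothetical customer-side mapping, keyed by lower-cased raw descriptor.
DESCRIPTOR_CATEGORIES = {
    "sanctioned ofac": "sanctions",
    "sent_to_binance": "counterparty",
    "sent_to_bitmex": "counterparty",
}


def categorize(descriptors: Iterable[str]) -> Dict[str, str]:
    """Map each raw descriptor to an internal category, keeping the raw text."""
    return {
        raw: DESCRIPTOR_CATEGORIES.get(raw.strip().lower(), "unmapped")
        for raw in descriptors
    }


print(categorize(["Sanctioned OFAC", "sent_to_BitMex", "new_provider_tag"]))
```

Routing unknown tags to an explicit `unmapped` category, rather than ignoring them, makes new provider labels visible so the mapping can be reviewed and extended.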

### Controlling Entity

The Controlling Entity field represents the real-world organization, service, operator, or cluster that the analytics provider believes controls the screened wallet or group of wallets.

This is one of the most valuable outputs in blockchain analytics because blockchain addresses themselves are pseudonymous. A wallet string alone does not say whether it belongs to Binance, a sanctioned mixer, a phishing scam operator, or a legitimate OTC desk. The Controlling Entity field attempts to bridge that gap.

The result is based on complex clustering and attribution work performed by the provider. It may use:

* Transaction pattern analysis
* Address reuse analysis
* Wallet sweep behavior
* Smart contract interaction patterns
* Deposit and withdrawal tracing
* Exchange exposure mapping
* OSINT and public disclosures
* Proprietary investigative intelligence

This field should be interpreted carefully. In some cases, the provider may attribute a cluster confidently to a single service. In others, it may show multiple associated names, aliases, historical brands, or related entities. That does not always mean the address is currently controlled by all listed entities simultaneously; it may reflect attribution overlap, merged intelligence, or grouped service infrastructure.

For compliance purposes, the Controlling Entity helps answer a fundamental question: *“Who is likely behind this address or cluster?”* That context is especially critical when making operational decisions about sanctions screening, Travel Rule obligations, counterparty due diligence, investigations, or escalation workflows.

***

### Regulatory Guidance

* Financial Action Task Force. (2019). *Guidance for a risk-based approach to virtual assets and virtual asset service providers*. FATF.
  * Context: Foundational reference for AML, CTF, and risk-based controls in crypto.
* Financial Action Task Force. (2021). *Updated guidance for a risk-based approach to virtual assets and virtual asset service providers*. FATF.
  * Context: Useful for understanding Travel Rule expectations and supervisory direction.
* Office of Foreign Assets Control. (2021). *Sanctions compliance guidance for the virtual currency industry*. U.S. Department of the Treasury.
  * Context: Important for understanding sanctions screening expectations.

***

### Academic Literature

**UTXO Heuristics (Common-Input Ownership & Change Address)**

The concepts of "Common-input ownership" and "Change address detection" were formally defined and tested in early academic literature examining Bitcoin's pseudonymity.

* Meiklejohn, S., Pomarole, M., Jordan, G., Levchenko, K., McCoy, D., Voelker, G. M., & Savage, S. (2013). A fistful of bitcoins: Characterizing payments among men with no names. *Proceedings of the 2013 Internet Measurement Conference (IMC '13)*, 127–140.
* Zhang, Y., Wang, J., & Luo, J. (2020). Heuristic-based address clustering in Bitcoin. *IEEE Access, 8*, 210582–210591.
  * Context: This paper explores modern challenges in identifying change addresses and measures how effectively these heuristics reduce the anonymity set of UTXO-based blockchains.

**Account-Based / Ethereum Heuristics (Deposit Sweeping)**

Because account-based chains like Ethereum do not use UTXOs, the academic community had to develop entirely new heuristic models, specifically focusing on how exchanges aggregate funds.

* Victor, F. (2020). Address clustering heuristics for Ethereum. In *Financial Cryptography and Data Security: 24th International Conference (FC 2020)* (pp. 136–154). Springer.

**Entity Attribution (OSINT & Active Interaction)**

The process of moving from an anonymous cluster of addresses to a named, real-world entity requires active data gathering, a methodology heavily documented in forensic science.

* Ermilov, D., Panov, M., & Yanovich, Y. (2017). Automatic Bitcoin address clustering. *2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)*, 461–466.

**Risk Modeling and Taint Analysis (Tracing Hops)**

The concept of tracking funds through multiple intermediaries (indirect exposure) is rooted in network graph theory and flow analysis.

* Möser, M., Böhme, R., & Breuker, D. (2013). An inquiry into money laundering tools in the Bitcoin ecosystem. *2013 eCrime Researchers Summit*, 1–14.

