The Moment You Outgrow the Standard API
There is a point in every trading infrastructure journey where the standard API stops being enough.
It happens quietly. Your team has been using TickDB for six months — streaming US equity quotes, backtesting mean-reversion strategies, monitoring after-hours price action. Everything works. Then your risk desk asks a question that the standard API cannot answer: "Can we normalize our internal data schema across TickDB feeds and our proprietary market microstructure feed without a custom translation layer?"
The standard API says no. Your custom SKILL says yes.
This is the gap that the TickDB SKILL protocol was designed to bridge. Rather than modifying the core API for every specialized enterprise requirement — a path that leads to API bloat and version chaos — TickDB exposes a structured extension layer. This layer lets engineering teams encapsulate proprietary logic, data transformations, and workflow integrations into deployable units called SKILLs.
This article walks through the complete development lifecycle of a custom TickDB SKILL: the protocol specification, function extension patterns, enterprise deployment topologies, and the specific architectural decisions that separate a fragile proof-of-concept from a production-grade extension that survives a 3 AM market open.
1. Understanding the SKILL Protocol Architecture
1.1 What a SKILL Actually Is
Before writing a single line of code, it is worth establishing a precise definition.
A SKILL in the TickDB ecosystem is a declarative, self-describing package that extends TickDB's runtime capabilities. It is not a plugin in the traditional sense — there is no shared library injection, no JVM classloader magic, no native binary deployment. Instead, a SKILL is a structured configuration bundle that the TickDB runtime interprets and orchestrates.
Think of it as an event-driven function pipeline: SKILLs consume data from TickDB streams, apply transformations or business logic, and emit results that downstream systems consume. The runtime manages orchestration, error handling, and lifecycle — your custom logic is isolated, versioned, and auditable.
The three core components of a SKILL are:
| Component | Role | Developer responsibility |
|---|---|---|
| Manifest (skill.yaml) | Declares the SKILL's identity, dependencies, interface, and lifecycle hooks | Define the configuration schema |
| Functions (functions/ directory) | Implements the custom business logic | Write production-grade code |
| Runtime bindings | Connects the SKILL to TickDB streams and external systems | Configure triggers, outputs, and error paths |
1.2 The Protocol Specification
The SKILL protocol is defined by a strict schema that ensures compatibility across TickDB versions and deployment targets. The manifest file is the contract between your custom logic and the TickDB runtime.
```yaml
# skill.yaml: SKILL manifest structure
apiVersion: tickdb.io/v1
kind: Skill
metadata:
  name: enterprise-orderbook-normalizer
  version: 1.2.0
  description: >
    Normalizes TickDB depth snapshots into the internal order-book schema
    used by Acme Capital's risk aggregation system.
  author: platform-engineering@acmecapital.com
  license: proprietary
spec:
  # Runtime requirements
  runtime:
    language: python3.11
    memoryLimitMb: 512
    timeoutSeconds: 30
  # Data inputs: what TickDB streams does this SKILL consume?
  inputs:
    - channel: depth
      markets: [US, HK, CRYPTO]
      levels: [1, 5, 10]
      samplingMode: real-time
    - channel: kline
      markets: [US]
      intervals: [1m, 5m, 1h]
  # Data outputs: where does processed data go?
  outputs:
    - type: webhook
      endpoint: "https://risk.internal.acmecapital.com/ingest"
      auth:
        method: mTLS
        certPath: /run/secrets/risk-tls/cert.pem
        keyPath: /run/secrets/risk-tls/key.pem
    - type: internal-queue
      queueName: normalized-orderbook
      retentionSeconds: 3600
  # Lifecycle hooks
  hooks:
    onInit: init_schema
    onData: transform_orderbook
    onError: escalate_alert
    onShutdown: flush_pending
  # Enterprise-specific: RBAC and audit
  accessControl:
    allowedTeams:
      - risk-engineering
      - platform-ops
    auditLevel: full
```
The manifest declares the SKILL's intent. The runtime enforces the contract — validating that the declared inputs exist in your TickDB subscription, that memory and timeout limits are within acceptable bounds, and that authentication credentials are properly mounted.
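To make the idea of contract enforcement concrete, here is a minimal sketch of the kind of pre-deployment check described above. It is not TickDB's actual validation logic: the function name `validate_manifest`, the limits `MAX_MEMORY_MB` and `MAX_TIMEOUT_SECONDS`, the nesting of `hooks` under `spec`, and the idea of passing the subscribed channels in as a set are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical platform limits; real bounds would come from the TickDB runtime.
MAX_MEMORY_MB = 2048
MAX_TIMEOUT_SECONDS = 60
REQUIRED_HOOKS = ("onInit", "onData", "onError", "onShutdown")


def validate_manifest(manifest: dict[str, Any], subscribed: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: list[str] = []
    spec = manifest.get("spec", {})
    runtime = spec.get("runtime", {})

    # Resource limits must be present and inside the allowed range.
    if not 0 < runtime.get("memoryLimitMb", 0) <= MAX_MEMORY_MB:
        problems.append("memoryLimitMb missing or out of bounds")
    if not 0 < runtime.get("timeoutSeconds", 0) <= MAX_TIMEOUT_SECONDS:
        problems.append("timeoutSeconds missing or out of bounds")

    # Every declared input channel must be part of the subscription.
    for item in spec.get("inputs", []):
        channel = item.get("channel")
        if channel not in subscribed:
            problems.append(f"input channel not in subscription: {channel}")

    # All four lifecycle hooks must be bound to a function name.
    hooks = spec.get("hooks", {})
    for hook in REQUIRED_HOOKS:
        if hook not in hooks:
            problems.append(f"missing lifecycle hook: {hook}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a deployment tool report every issue in a single pass.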
2. Function Extension: Writing the Core Logic
2.1 The Function Interface Contract
Every function in a SKILL must conform to a strict interface. The TickDB runtime passes a standardized context object to your function and expects a predictable return structure. Deviating from this contract causes the runtime to reject the function at deployment time — not at invocation time. This is intentional: catching interface mismatches during deployment rather than at runtime is a fundamental design principle.
```python
# functions/transform_orderbook.py
"""
Enterprise Order Book Normalizer: core transform function.

This function consumes raw TickDB depth snapshots, normalizes the schema
to Acme Capital's internal order-book format, and computes derived metrics
(pressure ratio, spread dynamics, imbalance score) before forwarding to
the risk aggregation system.
"""
import hashlib
import hmac
import json
import os
from datetime import datetime, timezone
from typing import Any, TypedDict


class OrderBookContext(TypedDict):
    """Standard TickDB SKILL function context."""
    market: str
    symbol: str
    timestamp: int  # Unix milliseconds
    bid_levels: list[dict[str, Any]]
    ask_levels: list[dict[str, Any]]
    sequence_id: int


class NormalizedOutput(TypedDict):
    """Expected return structure for this function."""
    status: str  # "ok" | "drop" | "error"
    output: dict[str, Any] | None
    error_message: str | None
    metadata: dict[str, Any]


def transform_orderbook(ctx: OrderBookContext) -> NormalizedOutput:
    """
    Transform a raw TickDB depth snapshot into the internal schema.

    Transformation rules:
    1. Flatten nested levels into sorted bid and ask lists (best price first).
    2. Compute pressure ratio: sum(bid sizes, top 5) / sum(ask sizes, top 5).
    3. Compute imbalance score: (bid_vol - ask_vol) / (bid_vol + ask_vol).
    4. Attach schema version and processing timestamp.
    5. Sign the output payload with HMAC-SHA256 so the receiver can verify
       it. Transport-level mTLS is configured separately in the manifest.

    Returns:
        NormalizedOutput: A dict conforming to the SKILL function contract.
    """
    try:
        # --- Schema validation ---
        required_fields = ("market", "symbol", "timestamp",
                           "bid_levels", "ask_levels", "sequence_id")
        for field in required_fields:
            if field not in ctx:
                return _error_result(f"Missing required field: {field}")

        # --- Normalize bid/ask levels (best price first on both sides) ---
        bids = _flatten_levels(ctx["bid_levels"], descending=True)
        asks = _flatten_levels(ctx["ask_levels"], descending=False)

        # --- Compute derived metrics ---
        top_n = min(5, len(bids), len(asks))
        bid_volume_top = sum(b["size"] for b in bids[:top_n])
        ask_volume_top = sum(a["size"] for a in asks[:top_n])
        total_bid_volume = sum(b["size"] for b in bids)
        total_ask_volume = sum(a["size"] for a in asks)

        # None rather than float("inf"): infinity does not serialize to
        # valid JSON, which matters for the webhook output.
        pressure_ratio = (
            round(bid_volume_top / ask_volume_top, 4)
            if ask_volume_top > 0 else None
        )
        imbalance_score = (
            (total_bid_volume - total_ask_volume)
            / (total_bid_volume + total_ask_volume)
            if (total_bid_volume + total_ask_volume) > 0 else 0.0
        )
        best_bid = bids[0]["price"] if bids else 0.0
        best_ask = asks[0]["price"] if asks else float("inf")
        spread_bps = (
            ((best_ask - best_bid) / best_bid) * 10_000
            if best_bid > 0 and best_ask != float("inf") else 0.0
        )

        # --- Build internal schema ---
        normalized = {
            "schema_version": "2.1.0",
            "internal_symbol": _map_symbol(ctx["market"], ctx["symbol"]),
            "exchange_timestamp": ctx["timestamp"],
            "processing_timestamp": int(
                datetime.now(timezone.utc).timestamp() * 1000
            ),
            "sequence_id": ctx["sequence_id"],
            "bids": bids,
            "asks": asks,
            "metrics": {
                "pressure_ratio": pressure_ratio,
                "imbalance_score": round(imbalance_score, 4),
                "spread_bps": round(spread_bps, 2),
                "total_bid_volume": total_bid_volume,
                "total_ask_volume": total_ask_volume,
            },
        }

        # --- Payload signature (application-level, in addition to mTLS) ---
        signing_key = os.environ.get("SKILL_SIGNING_KEY", "")
        if not signing_key:
            # Signing with an empty key would produce a worthless signature.
            return _error_result("SKILL_SIGNING_KEY is not configured")
        normalized["_signature"] = _compute_signature(normalized, signing_key)

        return {
            "status": "ok",
            "output": normalized,
            "error_message": None,
            "metadata": {
                "function": "transform_orderbook",
                "schema_version": "2.1.0",
                "market": ctx["market"],
                "symbol": ctx["symbol"],
            },
        }
    except Exception as e:
        return _error_result(f"Unhandled exception: {e}")


# --- Internal helpers ---

def _flatten_levels(
    levels: list[dict[str, Any]], descending: bool
) -> list[dict[str, Any]]:
    """
    Flatten and sort order book levels by price: descending for bids,
    ascending for asks, so index 0 is always the best price.
    """
    if not levels:
        return []
    # Normalize to a list of dicts with float 'price' and 'size'
    normalized = [
        {"price": float(level.get("price", level.get("p", 0))),
         "size": float(level.get("size", level.get("s", 0)))}
        for level in levels
    ]
    return sorted(normalized, key=lambda x: x["price"], reverse=descending)


def _map_symbol(market: str, symbol: str) -> str:
    """
    Map TickDB symbol format to Acme Capital's internal symbol convention.
    This is a placeholder: real implementations query an internal
    reference data service.
    """
    mapping = {
        "US": lambda s: s,
        "HK": lambda s: f"{s}.HK",
        "CRYPTO": lambda s: s.upper(),
    }
    normalizer = mapping.get(market, lambda s: s)
    return normalizer(symbol)


def _compute_signature(payload: dict, secret: str) -> str:
    """HMAC-SHA256 signature for payload authentication."""
    serialized = json.dumps(payload, sort_keys=True, default=str)
    return hmac.new(
        secret.encode(),
        serialized.encode(),
        hashlib.sha256,
    ).hexdigest()


def _error_result(message: str) -> NormalizedOutput:
    return {
        "status": "error",
        "output": None,
        "error_message": message,
        "metadata": {
            "function": "transform_orderbook",
            "error_timestamp": int(
                datetime.now(timezone.utc).timestamp() * 1000
            ),
        },
    }
```
A few engineering decisions in this code deserve explicit justification:
Sequential processing with stateless design. Each invocation of transform_orderbook is fully independent — it operates on a single context object and produces a single output. This is not an accident. Stateful functions that accumulate state across invocations introduce subtle ordering bugs when the runtime distributes messages across parallel workers. If your use case genuinely requires state accumulation (e.g., a rolling window indicator), implement it as a separate stateful service that consumes the stateless output from this function.
Explicit schema versioning. The output includes "schema_version": "2.1.0". Downstream consumers of your SKILL output may break if you change the output schema without warning. Semantic versioning in the payload allows consumers to implement their own version-branching logic independently of the SKILL deployment lifecycle.
Error isolation. The function wraps its entire body in a try/except and returns a structured error result rather than raising an exception. This is not defensive coding — it is the contract. The SKILL runtime's error hook (declared in the manifest) receives the error result, not the Python exception. Raising an unhandled exception at this layer causes the runtime to terminate the worker and reschedule the message, which can produce duplicate processing in production.
3. Lifecycle Hooks: The Orchestration Layer
3.1 Hook Anatomy
The SKILL manifest declares four lifecycle hooks. These are not decoration — they are the orchestration interface between your custom logic and the TickDB runtime's error management, scaling, and observability infrastructure.
| Hook | When it fires | Use case |
|---|---|---|
| onInit | Before the SKILL begins processing | Initialize schema validation, establish database connections, warm caches |
| onData | On every data event from subscribed channels | The primary transform function |
| onError | When onData returns a status of "error" | Alert routing, dead-letter queue, circuit breaker |
| onShutdown | When the SKILL is gracefully stopped | Flush pending writes, close connections, log state |
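The order in which the four hooks fire can be illustrated with a tiny local harness. This is only a sketch of the sequencing in the table above, not the TickDB runtime: `run_skill` and the shape of the `hooks` dict are hypothetical, and a real runtime would add scheduling, retries, and timeouts.

```python
from typing import Any, Callable

Hook = Callable[..., Any]


def run_skill(hooks: dict[str, Hook], events: list[dict[str, Any]]) -> list[Any]:
    """Drive the four lifecycle hooks over a finite list of events."""
    delivered: list[Any] = []
    hooks["onInit"]()
    try:
        for event in events:
            result = hooks["onData"](event)
            if result["status"] == "ok":
                delivered.append(result["output"])
            elif result["status"] == "error":
                # onError receives the structured error result, not an exception.
                hooks["onError"]({
                    "function_name": "onData",
                    "error_message": result["error_message"],
                    "metadata": result.get("metadata", {}),
                })
            # status == "drop": the event is intentionally discarded.
    finally:
        # onShutdown runs even if a hook raises unexpectedly.
        hooks["onShutdown"]()
    return delivered
```

A harness like this is also a convenient way to unit-test hook functions locally before deploying them.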
3.2 Implementing the Error Hook
The onError hook is where most enterprise SKILLs diverge most significantly from a proof-of-concept. A production-grade error handler does not merely log the error — it implements a circuit breaker pattern to prevent cascade failures under sustained high-error conditions.
```python
# functions/escalate_alert.py
"""
Error escalation hook for enterprise-orderbook-normalizer.

Implements a circuit breaker:
- Track errors over a rolling 60-second window. This hook only sees
  failures, not total invocations, so the threshold is expressed in
  errors per second rather than as a percentage of traffic.
- If the rate exceeds 0.10 errors/second (6 per minute), open the circuit
  and mark subsequent errors as suppressed for 30 seconds (preventing
  downstream alert fatigue).
- Log all errors to the internal SIEM regardless of circuit state.
- After the cooldown period, move to half-open. On the next error, close
  the circuit if the rate has fallen below the threshold; otherwise
  re-open it.
"""
import logging
import os
import time
from collections import deque
from datetime import datetime, timezone
from typing import Any, TypedDict

import requests

# --- Circuit breaker state (module-level singleton per worker) ---
_circuit_state = {
    "errors": deque(maxlen=100),     # Rolling window of (timestamp, error_msg)
    "state": "closed",               # closed | open | half-open
    "last_open_time": None,
    "cooldown_seconds": 30,
    "error_threshold_rate": 0.10,    # Errors per second (6 per minute)
    "window_seconds": 60,
}

# SIEM endpoint: credentials loaded from environment at import time
_SIEM_ENDPOINT = os.environ.get("SIEM_ENDPOINT", "")
_SIEM_API_KEY = os.environ.get("SIEM_API_KEY", "")


class ErrorContext(TypedDict):
    function_name: str
    error_message: str
    metadata: dict[str, Any]


def escalate_alert(ctx: ErrorContext) -> dict[str, Any]:
    """
    Circuit-breaker error handler.

    Returns a dict with:
    - 'circuit_state': current state after processing this error
    - 'alert_sent': whether the SIEM accepted the log entry
    - 'message_suppressed': whether alerting should be skipped (circuit open)
    - 'error_rate': errors per second over the rolling window
    """
    now = time.time()
    state = _circuit_state

    # --- Purge expired errors from rolling window ---
    cutoff = now - state["window_seconds"]
    while state["errors"] and state["errors"][0][0] < cutoff:
        state["errors"].popleft()

    # --- Record this error ---
    state["errors"].append((now, ctx["error_message"]))

    # --- Errors per second, averaged over the full window ---
    # Dividing by the fixed window length (not the time since the first
    # error) avoids a spurious spike when two errors arrive back to back.
    error_rate = len(state["errors"]) / state["window_seconds"]
    over_threshold = error_rate > state["error_threshold_rate"]

    # --- Circuit breaker state machine ---
    suppress = False
    if state["state"] == "open":
        if state["last_open_time"] is not None and \
                (now - state["last_open_time"]) > state["cooldown_seconds"]:
            state["state"] = "half-open"  # Let this error through as a probe
        else:
            suppress = True
    elif state["state"] == "half-open":
        if over_threshold:
            state["state"] = "open"       # Still failing: re-open
            state["last_open_time"] = now
        else:
            state["state"] = "closed"     # Rate has recovered
            state["last_open_time"] = None
    elif over_threshold:                  # state == "closed"
        state["state"] = "open"
        state["last_open_time"] = now     # Still alert on the transition

    # --- Always log to SIEM (non-negotiable for enterprise audit) ---
    alert_payload = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "skill": "enterprise-orderbook-normalizer",
        "function": ctx["function_name"],
        "error_message": ctx["error_message"],
        "metadata": ctx["metadata"],
        "circuit_state": state["state"],
    }
    sent = _dispatch_to_siem(alert_payload)

    return {
        "circuit_state": state["state"],
        "alert_sent": sent,
        "message_suppressed": suppress,
        "error_rate": round(error_rate, 4),
    }


def _dispatch_to_siem(payload: dict) -> bool:
    """
    Best-effort synchronous SIEM dispatch (2 s connect, 5 s read timeout).

    Never raise an exception here: this function is called from the
    error hook, and raising would trigger recursion. Returns True only
    if the SIEM responded with a success status.
    """
    if not _SIEM_ENDPOINT:
        logging.warning("SIEM_ENDPOINT not configured; skipping alert dispatch")
        return False
    try:
        response = requests.post(
            _SIEM_ENDPOINT,
            json=payload,
            headers={
                "Authorization": f"Bearer {_SIEM_API_KEY}",
                "Content-Type": "application/json",
            },
            timeout=(2.0, 5.0),  # (connect timeout, read timeout)
        )
        return response.ok
    except requests.Timeout:
        logging.error(f"SIEM dispatch timed out for payload: {payload}")
    except requests.RequestException as e:
        logging.error(f"SIEM dispatch failed: {e}")
    return False
```
This implementation handles an important subtlety in distributed error handling: alert fatigue is a reliability risk. An error handler that raises an alert for every failure will, under a sustained outage, flood the paging channel until operators or the alerting system start dropping alerts, silently defeating the purpose of the monitoring infrastructure. Here every error is still logged to the SIEM for audit purposes, but while the circuit is open the hook returns message_suppressed as true so that alert routing can skip repeated notifications. This balances alert responsiveness with downstream system protection.
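Because the breaker's behaviour depends on elapsed time, it is easier to test when the clock is injectable. The following is a simplified sketch of that idea, not the hook above: the class name `ErrorCircuit` and its parameters are hypothetical, it counts errors in the window instead of computing a rate, and it folds the half-open probe into the first error after the cooldown.

```python
import time
from collections import deque
from typing import Callable


class ErrorCircuit:
    """Suppress alerts once more than `max_errors` occur within `window_s`."""

    def __init__(self, max_errors: int = 6, window_s: float = 60.0,
                 cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_errors = max_errors
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._errors: deque[float] = deque()
        self._opened_at: float | None = None

    def record_error(self) -> bool:
        """Record one error; return True if its alert should be suppressed."""
        now = self._clock()
        # Drop errors that have aged out of the rolling window.
        while self._errors and self._errors[0] < now - self.window_s:
            self._errors.popleft()
        self._errors.append(now)

        if self._opened_at is not None:
            if now - self._opened_at < self.cooldown_s:
                return True            # Circuit open: suppress
            self._opened_at = None     # Cooldown over: re-evaluate below
        if len(self._errors) > self.max_errors:
            self._opened_at = now      # (Re)open, but alert on the transition
        return False
```

In a test, passing `clock=lambda: t[0]` with a mutable `t = [0.0]` lets you step time forward deterministically instead of sleeping through the cooldown.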
4. Private Deployment: Enterprise Deployment Topologies
4.1 Three Deployment Models
TickDB supports three deployment topologies for enterprise SKILLs. The right choice depends on your data residency requirements, network topology, and operational maturity.
| Model | Description | When to use |
|---|---|---|
| Managed (Cloud) | SKILLs run in TickDB's managed cloud infrastructure. Runtime, scaling, and observability are handled by TickDB. | Fastest time-to-market; no compliance restrictions on cloud deployment; team lacks platform engineering capacity |
| Co-located (On-prem connector) | TickDB operates a secure connector inside your network perimeter. SKILLs run in your infrastructure but are managed by TickDB's control plane. | Data residency requirements (e.g., HKMA, SEC regulations); latency-sensitive applications that need co-location |
| Fully private | Entire TickDB stack, including the control plane, runs inside your private cloud. No external network egress. | Maximum regulatory isolation (defense, sovereign wealth funds); zero-trust network environments |
4.2 Fully Private Deployment: Architecture Walkthrough
For organizations with strict data residency requirements, the fully private deployment model is the relevant option. The following architecture diagram describes a reference deployment on Kubernetes:
┌──────────────────────────────────────────────────────────────┐
│ Acme Capital Private Cloud │
│ (AWS GovCloud / Azure Sovereign) │
├──────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ Data feeds │───▶│ TickDB Core │───▶│ SKILL Runtime │ │
│ │ (proprietary │ │ (on-prem) │ │ (Kubernetes) │ │
│ │ + exchange) │ │ │ │ │ │
│ └─────────────┘ └──────────────┘ └───────┬────────┘ │
│ │ │
│ ┌──────────────────────┼──────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌────────────┐ ┌──────┐ │
│ │ Risk Engine │ │ Data Lake │ │ SIEM │ │
│ │ (internal) │ │ (S3/GCS) │ │ │ │
│ └──────────────┘ └────────────┘ └──────┘ │
│ │
└───────────────────────────────────────────────────────────────┘
The SKILL Runtime in this topology is a Kubernetes Deployment with the following properties:
```yaml
# kubernetes/skill-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: enterprise-orderbook-normalizer
  namespace: tickdb-skills
  labels:
    app: tickdb-skill
    skill: enterprise-orderbook-normalizer
spec:
  replicas: 3  # Active-active for HA; runtime distributes load
  selector:
    matchLabels:
      app: tickdb-skill
      skill: enterprise-orderbook-normalizer
  template:
    metadata:
      labels:
        app: tickdb-skill
        skill: enterprise-orderbook-normalizer
    spec:
      containers:
        - name: skill-runtime
          image: tickdb/skill-runtime:2.4.1
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"  # Matched to manifest memoryLimitMb
              cpu: "500m"
          env:
            - name: TICKDB_API_KEY
              valueFrom:
                secretKeyRef:
                  name: tickdb-credentials
                  key: api-key
            - name: SKILL_SIGNING_KEY
              valueFrom:
                secretKeyRef:
                  name: skill-internal-secrets
                  key: signing-key
            - name: SIEM_ENDPOINT
              value: "https://siem.internal.acmecapital.com/api/v1/ingest"
            - name: SIEM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: skill-internal-secrets
                  key: siem-api-key
          volumeMounts:
            - name: skill-package
              mountPath: /app/skill
              readOnly: true
            - name: tls-certs
              mountPath: /run/secrets/risk-tls
              readOnly: true
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
      volumes:
        - name: skill-package
          configMap:
            name: enterprise-orderbook-normalizer
        - name: tls-certs
          secret:
            secretName: risk-tls-certificates
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: tickdb-skill
                topologyKey: "kubernetes.io/hostname"
```
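The podAntiAffinity rule asks the scheduler to place the three replicas on different nodes, so that a single node failure does not eliminate all processing capacity. Note that preferredDuringSchedulingIgnoredDuringExecution is a soft preference: if the cluster lacks capacity, replicas can still end up on the same node. For a hard guarantee, use requiredDuringSchedulingIgnoredDuringExecution instead, at the cost of pods staying unscheduled when no separate node is available.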
4.3 Secret Management
Hardcoding credential values in manifests or source code may be tolerable on a developer laptop; it is a compliance violation in enterprise deployments. In the Kubernetes topology above, credentials are injected from Secret objects, which themselves should be provisioned by an external secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault).
The deployment manifest uses secretKeyRef rather than raw values. This is not merely a security best practice: it is commonly expected in regulated deployments audited against frameworks like SOC 2 Type II and ISO 27001. Auditors will typically flag literal credentials in a manifest as a finding.
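On the application side, it helps to fail fast when an injected secret is missing rather than fall back to an empty default. A minimal sketch, assuming the three secret names used in the deployment manifest above; the helper name `load_secrets` is hypothetical and would typically run in the onInit hook:

```python
import os
from typing import Mapping

REQUIRED_SECRETS = ("TICKDB_API_KEY", "SKILL_SIGNING_KEY", "SIEM_API_KEY")


def load_secrets(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required secrets, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        # Report names only; never echo secret values into logs.
        raise RuntimeError(f"missing required secrets: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Accepting the environment as a parameter keeps the function testable without mutating the real process environment.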
5. Enterprise Customization: Beyond the Standard SKILL
5.1 The Function Extension Registry
Beyond the standard SKILL lifecycle, TickDB exposes a Function Extension Registry — a structured way to register custom functions that can be invoked by external systems (trading systems, risk engines, reporting pipelines) via the TickDB API gateway, without those external systems needing to know the function's implementation details.
This is a different deployment model from the stream-processing SKILLs described above. A Function Extension is a callable endpoint — it receives an HTTP request with a structured payload, executes your custom logic, and returns a response.
```python
# functions/calculate_intraday_volatility.py
"""
Function Extension: Intraday Volatility Surface Calculator.

This function is registered in the Function Extension Registry and can be
invoked by any authorized system via the TickDB API gateway:

    POST /v1/extensions/calculate_intraday_volatility
    Authorization: Bearer <service-account-token>

Use case: Risk engines that need real-time volatility estimates for Greeks
calculation without maintaining their own OHLCV processing pipeline.
"""
import math
from typing import TypedDict


class VolatilityInput(TypedDict):
    symbol: str
    lookback_minutes: int
    return_type: str  # "simple" | "log"


class VolatilityOutput(TypedDict):
    symbol: str
    volatility_annualized: float
    volatility_daily: float
    sample_count: int
    confidence: str  # "high" | "medium" | "low"


def calculate_intraday_volatility(
    input_data: VolatilityInput
) -> VolatilityOutput:
    """
    Calculate annualized and daily volatility from intraday returns.

    This is a simplified implementation. Production versions should:
    - Fetch OHLCV data from TickDB /v1/market/kline
    - Handle sparse data (pre-market, lunch hours)
    - Apply EWMA or GARCH for forward-looking adjustment
    """
    # Placeholder: a real implementation fetches bars from TickDB and
    # computes the rolling standard deviation of returns. Here we assume
    # 1-minute bars, so the sample count equals the lookback in minutes.
    sample_count = max(0, input_data["lookback_minutes"])

    # 390 one-minute bars is one full regular US equity session.
    confidence = (
        "high" if sample_count >= 390
        else "medium" if sample_count >= 60
        else "low"
    )

    # Simplified placeholder calculation
    volatility_daily = 0.015  # ~1.5% daily vol (placeholder)
    volatility_annualized = volatility_daily * math.sqrt(252)

    return {
        "symbol": input_data["symbol"],
        "volatility_annualized": round(volatility_annualized, 6),
        "volatility_daily": round(volatility_daily, 6),
        "sample_count": sample_count,
        "confidence": confidence,
    }
```
The Function Extension Registry enables a powerful architectural pattern: logic centralization. Rather than copying your volatility calculation logic into every trading system that needs it, you write it once as a Function Extension and every authorized system calls the same endpoint. When you improve the calculation (switching from simple rolling standard deviation to GARCH), every consumer benefits immediately.
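For readers who want to see what the centralized calculation might contain once the placeholder is replaced, here is a minimal sketch of a simple rolling-standard-deviation estimate. It assumes evenly spaced 1-minute closes, 390 bars per session, 252 trading days per year, and square-root-of-time scaling; the function name `realized_volatility` is hypothetical, and fetching the closes from TickDB is left out.

```python
import math
from statistics import stdev


def realized_volatility(closes: list[float], bars_per_day: int = 390,
                        trading_days: int = 252) -> tuple[float, float]:
    """Return (daily, annualized) volatility from consecutive bar closes."""
    if len(closes) < 3:
        raise ValueError("need at least 3 closes (2 returns)")
    # Log return between each pair of consecutive bars.
    log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    per_bar = stdev(log_returns)          # Sample standard deviation
    daily = per_bar * math.sqrt(bars_per_day)
    return daily, daily * math.sqrt(trading_days)
```

Swapping this estimator for EWMA or GARCH later would change only the body of the function, which is the point of centralizing it behind one endpoint.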
5.2 Versioning and Rollback
Enterprise SKILLs must be versioned with the same rigor as production services. The manifest's version field is the primary version identifier. TickDB's SKILL runtime supports:
- Semantic versioning: MAJOR.MINOR.PATCH (e.g., 1.2.0). A MAJOR bump indicates a breaking change to the input/output schema.
- Zero-downtime deployment: The runtime supports blue-green deployment of SKILL versions. Traffic is shifted gradually from the old version to the new version, with automatic rollback if the error rate spikes.
- Schema compatibility checking: Before deploying a new version, the runtime validates that the new version's manifest is compatible with the declared inputs and outputs. Breaking schema changes are blocked at deployment time, not discovered in production.
```bash
# SKILL deployment lifecycle commands

# Deploy a new version (runtime validates schema compatibility)
tickdb skill deploy enterprise-orderbook-normalizer:1.3.0 ./dist/

# Roll back to the previous version
tickdb skill rollback enterprise-orderbook-normalizer

# Check deployment status across replicas
tickdb skill status enterprise-orderbook-normalizer --watch
```
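The semantic-versioning rule above (a MAJOR bump signals a breaking schema change) is easy to encode on the consumer side. A minimal sketch with hypothetical helper names; it handles plain MAJOR.MINOR.PATCH strings only, not pre-release or build suffixes:

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking(old: str, new: str) -> bool:
    """True if moving from `old` to `new` crosses a MAJOR boundary."""
    return parse_version(new)[0] > parse_version(old)[0]


def consumer_supports(payload_version: str, supported_major: int) -> bool:
    """A consumer accepts any payload whose MAJOR version it was built for."""
    return parse_version(payload_version)[0] == supported_major
```

Comparing parsed tuples rather than raw strings matters: as strings, "1.10.0" sorts before "1.9.0".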
6. Security and Access Control
6.1 RBAC Integration
The SKILL manifest includes an accessControl block. This is not a documentation field — it is enforced at the runtime level. Teams not listed in allowedTeams cannot invoke the SKILL, even if they have a valid TickDB API key.
This matters for enterprise compliance: a risk analyst with a valid TickDB read key should not be able to trigger a SKILL that writes to the risk aggregation system. Role-based access control at the SKILL layer provides an additional enforcement boundary beyond the API key itself.
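Conceptually, the team check is a deny-by-default lookup against the manifest. The sketch below only illustrates that idea; the function name is hypothetical, it assumes `accessControl` sits under `spec`, and the real enforcement happens inside the TickDB runtime, not in your code.

```python
def is_invocation_allowed(manifest: dict, caller_team: str) -> bool:
    """Deny by default: only teams listed in allowedTeams may invoke."""
    allowed = (
        manifest.get("spec", {})
        .get("accessControl", {})
        .get("allowedTeams", [])
    )
    return caller_team in allowed
```

A missing or empty accessControl block results in every caller being rejected, which is the safer failure mode.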
6.2 Audit Logging
Every SKILL invocation produces an audit log entry that is forwarded to your configured SIEM. The audit log captures:
- Invoker identity (team, service account)
- Input payload hash (not the full payload — protecting sensitive market data)
- Processing duration
- Output status (ok / drop / error)
- Error details (if applicable)
Audit logs are retained for the period required by your regulatory framework. For most financial institutions, this is 7 years minimum.
7. Monitoring and Observability
7.1 The Four Signals
A production SKILL is not complete without observability instrumentation. The SKILL runtime natively exposes four signals as metrics:
| Signal | What it measures | Alert threshold (typical) |
|---|---|---|
| Latency | Time from data arrival to output dispatch | p99 > 500 ms |
| Error rate | Fraction of invocations returning "error" status | > 1% over 5 minutes |
| Throughput | Messages processed per second | Deviation > 30% from baseline |
| Circuit state | Current circuit breaker state | Any state other than "closed" |
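To show how the typical thresholds in the table translate into checks, here is a small sketch that evaluates all four signals over one observation window. In practice these would be Prometheus alerting rules; the function names and argument shapes here are hypothetical, and the percentile uses the nearest-rank method.

```python
def p99(latencies_ms: list[float]) -> float:
    """Nearest-rank 99th percentile (integer arithmetic avoids float rounding)."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100   # ceil(0.99 * n)
    return ordered[rank - 1]


def breached(latencies_ms: list[float], statuses: list[str],
             baseline_tps: float, current_tps: float,
             circuit_state: str) -> dict[str, bool]:
    """Return which of the four signals exceed their typical thresholds."""
    error_rate = statuses.count("error") / len(statuses)
    return {
        "latency": p99(latencies_ms) > 500,
        "error_rate": error_rate > 0.01,
        "throughput": abs(current_tps - baseline_tps) / baseline_tps > 0.30,
        "circuit_state": circuit_state != "closed",
    }
```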
7.2 Integration with Prometheus + Grafana
```yaml
# kubernetes/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: enterprise-orderbook-normalizer
  namespace: tickdb-skills
spec:
  selector:
    matchLabels:
      app: tickdb-skill
      skill: enterprise-orderbook-normalizer
  endpoints:
    - port: metrics
      path: /metrics
      interval: 15s
```
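With the ServiceMonitor deployed, the SKILL's four signals are scraped automatically by Prometheus and available in Grafana dashboards. Note that a ServiceMonitor selects Service objects rather than pods, so this assumes a Service carrying the same labels exposes a port named metrics; the Deployment shown earlier does not define one. Alerting rules can be configured to page the platform engineering team when error rate or circuit state thresholds are breached.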
8. When to Build a SKILL vs. When to Extend the API
This is the most important strategic question in SKILL development, and the answer is not always "build a SKILL."
Build a SKILL when:
- You need to encapsulate business logic that transforms or enriches TickDB data before it reaches downstream systems.
- Your use case requires data from multiple TickDB channels (depth + kline + trades) in combination.
- You need to route processed data to internal systems that are not part of the standard TickDB output ecosystem.
- Your organization has compliance or audit requirements that demand isolated processing pipelines.
Extend the core API (through TickDB's enterprise roadmap process) when:
- The required functionality is universally applicable — not specific to your organization.
- The feature requires changes to the core data model that would benefit all users.
- The use case is latency-critical enough that the SKILL runtime's orchestration overhead is unacceptable.
The SKILL protocol is explicitly designed for enterprise customization — not for features that belong in the core platform. Misusing the SKILL layer as a substitute for core API development creates maintenance burden for your team and reduces the ecosystem value of the platform for everyone else.
9. Closing
The gap between a standard API and an enterprise-grade data infrastructure is not filled by more endpoints. It is filled by structured extensibility — the ability to embed your organization's specific logic, compliance requirements, and integration needs into the data pipeline without fracturing the core platform.
The TickDB SKILL protocol is that structured extensibility layer. A well-designed custom SKILL is a deployable, versioned, observable, auditable unit of business logic that outlives any individual developer on your team. When the next regulatory requirement arrives — and it will — the SKILL is the unit you update, test, and redeploy.
Next Steps
If you are an enterprise platform engineer evaluating TickDB for custom workflows, contact enterprise@tickdb.ai to discuss private deployment options and the Function Extension Registry.
If you want to explore the SKILL development kit, the open-source SKILL SDK is available at the TickDB developer portal. The SDK includes a local runtime emulator, schema validation tools, and a deployment CLI.
If you are building a proof-of-concept SKILL and need guidance on architecture, the TickDB developer relations team offers office hours for enterprise customers. Schedule a session through the TickDB dashboard.
If you use AI coding assistants, search for and install the tickdb-market-data SKILL in your AI tool's marketplace — it includes pre-built templates for the patterns covered in this article.
This article does not constitute investment advice. Market data infrastructure decisions should be evaluated in the context of your organization's specific compliance, security, and operational requirements.