Performance figures are obtained on the CloudsineAI recommended hardware specification. Latency is measured under no-load conditions; throughput is measured under the stated maximum concurrent-user load with the listed prompt sizing.
Recommended hardware
| Specification | Value |
|---|---|
| AWS Instance Type | g6e.2xlarge |
| vCPU | 8 |
| GPU | NVIDIA L40S, 48 GB |
| Memory | 64 GB |
| Storage | 450 GB |
Test profile
| Test profile | Prompt size | Latency SLO |
|---|---|---|
| Input-guardrail performance | 115 tokens (~614 characters) | ≤1.5 seconds per message |
| Output-guardrail performance | 1,150 tokens (~6,140 characters) | ≤1.5 seconds per message |
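The prompt sizes above imply roughly 5.3 characters per token in both profiles. As a rough aid for judging whether a message falls within the tested sizing, the sketch below estimates a token count from character length using that ratio. The names `PROFILES`, `estimate_tokens`, and `fits_profile` are hypothetical and are not part of any CloudsineAI API. Real token counts depend on the tokenizer in use, so treat the result as an approximation only.

```python
import math

# (tokens, characters) pairs taken from the test profile table.
PROFILES = {
    "input": (115, 614),
    "output": (1150, 6140),
}


def estimate_tokens(text: str, profile: str) -> int:
    """Estimate a token count from character length.

    Assumes the chars-per-token ratio of the chosen test profile holds
    for the text, which is only approximately true in practice.
    """
    tokens, chars = PROFILES[profile]
    return math.ceil(len(text) * tokens / chars)


def fits_profile(text: str, profile: str) -> bool:
    """Return True if the estimated size is within the tested prompt size."""
    tokens, _ = PROFILES[profile]
    return estimate_tokens(text, profile) <= tokens


if __name__ == "__main__":
    sample = "x" * 500
    print(estimate_tokens(sample, "input"), fits_profile(sample, "input"))
```

Messages larger than the tested prompt size may still be processed, but the published latency SLO was measured only at the listed sizing.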
Per-guardrail latency and throughput tables are available on request as part of the full UAT performance report. They cover no-load per-message latency, maximum concurrent-user counts at the 1.5-second SLO, and the supporting CSV exports.
Sample latencies
Indicative figures from a recent benchmark run (LLM-only configuration against a public prompt-injection corpus):
| Metric | Value |
|---|---|
| Mean latency | ~1.3 seconds |
| P95 latency | ~1.5–1.6 seconds |
| Precision | ~97% |
Exact values vary by configuration (guardrail mix, vector sensitivity) and by dataset.
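To reproduce comparable summary figures from your own benchmark run, the metrics in the table can be computed from raw per-message latencies and detection counts. The following is a minimal sketch, not the method used to produce the figures above: the function names are hypothetical, the nearest-rank percentile definition is an assumption, and precision is taken as true positives divided by all flagged messages.

```python
import math
import statistics

SLO_SECONDS = 1.5  # latency SLO from the test profile


def percentile(samples, pct):
    """Nearest-rank percentile (assumed definition) for an integer pct."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def precision(true_positives, false_positives):
    """Share of flagged messages that were genuinely malicious."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no messages were flagged")
    return true_positives / flagged


def summarize(latencies, true_positives, false_positives):
    """Summarize one benchmark run (latencies in seconds)."""
    within = sum(1 for s in latencies if s <= SLO_SECONDS)
    return {
        "mean_s": statistics.mean(latencies),
        "p95_s": percentile(latencies, 95),
        "precision": precision(true_positives, false_positives),
        "share_within_slo": within / len(latencies),
    }


if __name__ == "__main__":
    # Illustrative values only, not measured results.
    print(summarize([1.0, 1.2, 1.4, 1.6], true_positives=97, false_positives=3))
```

Comparing `p95_s` against `SLO_SECONDS` gives a quick check of whether a given configuration stays within the 1.5-second target on your dataset.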