Engine Pricing

Important Note: We've introduced Smart Context Compression (formerly Palo Output). This feature intelligently abstracts and compresses your input context, significantly reducing your effective token usage. However, for the 770 and DEEP-R engines, compression is not applied to code or research data, which the external LLM processes in full to maintain precision.

Subscription Plans: Architect ($20/month) includes 120M tokens. Business ($35/user/month) includes 200M tokens. Overage is billed at the rates below. See Billing for full plan details.

| Engine / API Name | Mini (Lite) (mpalo-palo-lite) | Palo Bloom (mpalo-palo) | 770 (mpalo-palo-770) | DEEP-R (mpalo-palo-DEEP-R) |
| --- | --- | --- | --- | --- |
| Blended In/Out Rate (per 1M tokens) | $0.3 | $0.9 | $2.1 | $2.9 |
| Memory Traversal (optional, per 1M tokens) | $0.18 | $0.54 | $1.26 | $1.76 |
| Memory Mapping (optional, per 1M tokens) | $0.24 | $0.72 | $1.68 | $2.25 |
| Image Processing (per image) | $0.0005 | $0.001 | $0.003 | $0.015 |
| Audio Processing (Q4 2026 Beta) | ~$1.50 in / $6 out | ~$2.00 in / $8 out | ~$3.00 in / $10 out | ~$5.00 in / $12 out |

Memory Traversal (optional): Traversal of memories for better supported recall (Episodic). Priced at 60% of the blended In/Out rate.
Memory Mapping (optional): Mapping of memories for more accurate supported recall (Episodic + Temporal). Priced at 80% of the blended In/Out rate.

The Blended In/Out Rate combines input and output pricing for simplicity. Memory Traversal and Memory Mapping are optional features charged only when used.

How it works: Memory token costs are calculated as a fixed share of the engine's blended In/Out rate: 60% for Traversal and 80% for Mapping. Memory costs therefore stay predictable and proportional to the engine you choose, regardless of your workflow.
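As a rough check of the percentages described above, the sketch below derives the Traversal and Mapping rates from each engine's blended rate. The dictionary and function names are illustrative and not part of any Mpalo SDK; the blended rates are copied from the pricing table. The straight percentages give $1.74 and $2.32 for DEEP-R, while the table lists $1.76 and $2.25, so treat the published table as the source of truth.

```python
from decimal import Decimal

# Blended In/Out rates per 1M tokens, copied from the pricing table.
BLENDED_RATE = {
    "mpalo-palo-lite": Decimal("0.30"),
    "mpalo-palo": Decimal("0.90"),
    "mpalo-palo-770": Decimal("2.10"),
    "mpalo-palo-DEEP-R": Decimal("2.90"),
}

TRAVERSAL_SHARE = Decimal("0.60")  # Traversal is priced at 60% of the blended rate
MAPPING_SHARE = Decimal("0.80")    # Mapping is priced at 80% of the blended rate


def memory_rates(engine: str) -> tuple[Decimal, Decimal]:
    """Return (traversal, mapping) rates per 1M tokens for an engine."""
    blended = BLENDED_RATE[engine]
    return blended * TRAVERSAL_SHARE, blended * MAPPING_SHARE


for engine in BLENDED_RATE:
    traversal, mapping = memory_rates(engine)
    print(f"{engine}: traversal ${traversal:.2f}, mapping ${mapping:.2f}")
```

Decimal is used here only to avoid floating-point rounding noise in currency arithmetic.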

Detailed pricing breakdown (for reference):

| Engine / API Name | Mini (Lite) (mpalo-palo-lite) | Palo Bloom (mpalo-palo) | 770 (mpalo-palo-770) | DEEP-R (mpalo-palo-DEEP-R) |
| --- | --- | --- | --- | --- |
| Input Price (per 1M tokens) | $0.235 | $0.806 | $1.8 | $2.484 |
| Output Price (per 1M tokens) | $0.355 | $1.032 | $2.34 | $3.229 |
| Memory Traversal (60% of blended, per 1M tokens) | $0.18 | $0.54 | $1.26 | $1.76 |
| Memory Mapping (80% of blended, per 1M tokens) | $0.24 | $0.72 | $1.68 | $2.25 |
| Image Processing (per image) | $0.0005 | $0.001 | $0.003 | $0.015 |
| Audio Processing (Est. per 1M) | Betas Q2 '26 | Betas Q2 '26 | Betas Q2 '26 | Betas Q2 '26 |

Note: Prices shown are illustrative and apply to Mpalo Palo engine usage only.

Storage Costs (BYOVS): You choose your vector storage provider (e.g., Pinecone, Weaviate). Storage costs are paid directly to them (typically $0.10–$0.50/GB-month). With Mpalo, you pay for the intelligent memory access (Traversal/Mapping) to utilize that data, not the passive storage itself.
Billing for external AI models (OpenAI, Google) is also separate and handled via your API keys.

Feature Decision Guide: Traversal vs. Mapping

| Feature | Best For | Recall Type | Performance |
| --- | --- | --- | --- |
| Memory Traversal | Chatbots, FAQs, simple personalization | Episodic (Event-based) | Fast (~50ms latency), Lower Cost |
| Memory Mapping | Complex Agents, Long-term reasoning, Research | Episodic + Temporal (Time-aware) | Deeper (~200ms latency), Higher Accuracy |

Add-ons & Extras

| Add-on | Price | Availability |
| --- | --- | --- |
| Custom Connections (Ollama, etc.) | $5 / slot / month | Free on Architect (2 slots) & Business (5 slots) |
| Secure Tunnel | $20 / month | Business & Enterprise Plans |
| Private Data Spaces | Usage-based | Enterprise Only |

Monthly Cost Calculator

To provide a helpful starting point, Memory Traversal and Mapping costs are automatically estimated based on your input size (4.5x and 5.3x respectively). The total is calculated using the detailed pricing table above.
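The estimate described above can be sketched in a few lines. This is a minimal illustration, not Mpalo's actual calculator: it assumes the 4.5x and 5.3x figures mean that Traversal and Mapping token volumes are estimated as multiples of your input tokens, and every name in it (EnginePrices, PRICES, estimate_monthly_cost) is hypothetical. Input, output, and image prices come from the detailed breakdown; memory rates come from the main pricing table.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000   # prices are quoted per 1M tokens
TRAVERSAL_MULTIPLIER = 4.5    # assumed: traversal tokens = 4.5 x input tokens
MAPPING_MULTIPLIER = 5.3      # assumed: mapping tokens = 5.3 x input tokens


@dataclass(frozen=True)
class EnginePrices:
    input_per_m: float
    output_per_m: float
    traversal_per_m: float
    mapping_per_m: float
    per_image: float


# Illustrative prices copied from the tables on this page.
PRICES = {
    "mpalo-palo-lite": EnginePrices(0.235, 0.355, 0.18, 0.24, 0.0005),
    "mpalo-palo": EnginePrices(0.806, 1.032, 0.54, 0.72, 0.001),
    "mpalo-palo-770": EnginePrices(1.8, 2.34, 1.26, 1.68, 0.003),
    "mpalo-palo-DEEP-R": EnginePrices(2.484, 3.229, 1.76, 2.25, 0.015),
}


def estimate_monthly_cost(
    engine: str,
    input_tokens: int,
    output_tokens: int,
    images: int = 0,
    use_traversal: bool = False,
    use_mapping: bool = False,
) -> dict[str, float]:
    """Return a per-line-item cost breakdown in USD, plus a total."""
    p = PRICES[engine]
    # Memory features are optional and charged only when used.
    traversal_tokens = input_tokens * TRAVERSAL_MULTIPLIER if use_traversal else 0
    mapping_tokens = input_tokens * MAPPING_MULTIPLIER if use_mapping else 0
    costs = {
        "input": input_tokens / TOKENS_PER_UNIT * p.input_per_m,
        "output": output_tokens / TOKENS_PER_UNIT * p.output_per_m,
        "traversal": traversal_tokens / TOKENS_PER_UNIT * p.traversal_per_m,
        "mapping": mapping_tokens / TOKENS_PER_UNIT * p.mapping_per_m,
        "images": images * p.per_image,
    }
    costs["total"] = sum(costs.values())
    return costs


breakdown = estimate_monthly_cost(
    "mpalo-palo", input_tokens=50_000_000, output_tokens=10_000_000,
    images=200, use_traversal=True,
)
for item, usd in breakdown.items():
    print(f"{item}: ${usd:.2f}")
```

The sketch ignores plan-included tokens and third-party charges (vector storage, external LLM billing), which are billed separately as described above.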

Feature Limitations

| Feature | Mini | Palo Bloom | DEEP | DEEP-Research |
| --- | --- | --- | --- | --- |
| Max Image Size | 5 MB | 10 MB | 50 MB | 100 MB |
| Image Formats | JPG, PNG | JPG, PNG, GIF | JPG, PNG, GIF, WebP, TIFF, BMP | All formats + RAW |
| Max Images per Call | 1 | 3 | 10 | 25 |
| Max Audio Size (TBD) | 5 MB | 10 MB | 50 MB | 100 MB |
| Audio Formats (TBD) | MP3, WAV | MP3, WAV, AAC | MP3, WAV, AAC, FLAC, OGG | All formats + hi-res |
| Max Audio per Call (TBD) | 1 | 3 | 10 | 25 |
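A client could check a batch of images against these per-engine limits before making a call. The sketch below is one possible way to do that; the names (ImageLimits, IMAGE_LIMITS, check_image_batch) are hypothetical, the tier labels are copied from the table header, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the table does not say which definition applies.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class ImageLimits:
    max_size_mb: int
    max_per_call: int


# Image limits per tier, copied from the table.
IMAGE_LIMITS = {
    "Mini": ImageLimits(max_size_mb=5, max_per_call=1),
    "Palo Bloom": ImageLimits(max_size_mb=10, max_per_call=3),
    "DEEP": ImageLimits(max_size_mb=50, max_per_call=10),
    "DEEP-Research": ImageLimits(max_size_mb=100, max_per_call=25),
}


def check_image_batch(tier: str, sizes_in_bytes: list[int]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the batch fits."""
    limits = IMAGE_LIMITS[tier]
    problems = []
    if len(sizes_in_bytes) > limits.max_per_call:
        problems.append(
            f"{len(sizes_in_bytes)} images exceeds the {limits.max_per_call}-per-call limit"
        )
    for index, size in enumerate(sizes_in_bytes):
        if size > limits.max_size_mb * MB:
            problems.append(f"image {index} exceeds {limits.max_size_mb} MB")
    return problems


print(check_image_batch("Palo Bloom", [2 * MB, 12 * MB]))
```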

Understanding Mpalo's Capabilities and Limitations

It is crucial to understand that Mpalo's Palo Engines do not generate images or audio based on prompts. Instead, our models are designed to summarize, memorize, and process visual and auditory input to build rich, persistent memory representations.

Image Input Requirements

Input images must meet the following requirements to be processed by the API:

  • Supported File Types: PNG, JPEG, WEBP, and non-animated GIF.
  • Size Limits: Up to 20MB per image.
  • Quality: Images are resized before analysis, which may affect original dimensions.
  • Content: No watermarks, logos, or NSFW content. Must be clear enough for a human to understand.
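The file type and size requirements above lend themselves to a simple pre-upload check. The following is a minimal sketch under stated assumptions: the function name check_image is hypothetical, file type is inferred from the filename extension only, and 20 MB is taken as 20 * 1024 * 1024 bytes. It does not detect animated GIFs, watermarks, or content issues, which would need actual image inspection.

```python
# Extensions for the supported types listed above (PNG, JPEG, WEBP, GIF).
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: 20 MB in binary megabytes


def check_image(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems for one image; an empty list means it passes."""
    problems = []
    dot = filename.rfind(".")
    extension = filename[dot:].lower() if dot != -1 else ""
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file is larger than 20 MB")
    return problems


print(check_image("diagram.webp", 3_500_000))
print(check_image("scan.tiff", 25 * 1024 * 1024))
```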

Additional Model Limitations

While our vision capabilities are powerful, it's important to understand their current limitations:

  • Non-English Text: May not perform optimally with non-Latin alphabets.
  • Small Text: Enlarge text within the image for better readability.
  • Rotation: May misinterpret rotated or upside-down content.
  • Visual Elements: May struggle with graphs or text where meaning relies on colors or styles.
  • Spatial Reasoning: Struggles with tasks requiring precise spatial localization.
  • Counting: May provide approximate counts of objects.

We are actively working to enhance these capabilities.