
The Palo BLOOM Parable

The Palo Model Family

Palo Models

Palo Mini

Fast and efficient, Palo Mini provides essential episodic memory for applications that need quick, contextual recall with low latency. Ideal for simple chatbots, command-line interfaces, or any setting where minimal resource usage is critical. Supports basic multi-modal data handling (text, simple image references).

Learn more »

Palo Bloom

A balanced and versatile model, Palo Bloom offers enhanced episodic and semantic memory capabilities. Optimized for mobile applications, edge devices, and personal assistants demanding a good mix of performance, recall depth, and resource efficiency. Supports broader multi-modal inputs.

Learn more »

Palo Research

Our most advanced offering, Palo Research is engineered for complex tasks requiring profound reasoning, rich contextual understanding, and robust long-term memory. It integrates advanced features like Memory Mapping and is influenced by Memory-Traversal concepts for comprehensive semantic network building and nuanced recall. Ideal for enterprise-grade applications, sophisticated research, and advanced robotics requiring multi-modal data fusion and complex pattern recognition.

Learn more »

Precise Episodic Memory with Palo

Palo offers improved recall of specific past interactions and events. This focuses on strengthening the timeline of experiences, allowing for more detailed and accurate references to previous points in a dialogue or process.

Memory-Traversal & Mapping

Palo's memory system is inspired by cognitive science. Our advanced models use two key capabilities:

Memory-Traversal

Navigate and link sequences of related interactions, forming coherent event chains for better contextual understanding.

Memory Mapping

Create interconnected knowledge structures to understand relationships and identify subtle patterns.

See Our Benchmarks

Advanced Reasoning through Memory-Traversal

Certain configurations of Palo incorporate techniques like Memory-Traversal. This allows for more complex reasoning by utilizing sequences of recalled information, contributing to deeper contextual understanding and enhanced problem-solving processes.


Expanded Recall with Memory Mapping

More advanced Palo capabilities include Memory Mapping. This feature enables the system to access and synthesize information from a broader range of past interactions, helping to identify subtle patterns and connections within the data.

Proactive Contextual Synthesis in Palo

Future-oriented developments for Palo focus on proactively synthesizing relevant memories. The goal is to anticipate user needs and offer contextually relevant insights, aiming for a tighter integration of recall and application.


Smart Context Compression

Palo intelligently abstracts and compresses your input context before it reaches the LLM. This process significantly reduces downstream token consumption while preserving the core informational value, leading to major cost savings on long-context tasks.

Compression Efficiency Example

Raw Input Context: 50,000 tokens
Compressed Context: 15,000 tokens
Effective Savings: ~70% Reduction
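The arithmetic behind the figures above is straightforward. This minimal sketch uses only the illustrative token counts from the example; the $1.00-per-1M-token input price is an assumed placeholder, not a quoted rate.

```python
raw_tokens = 50_000
compressed_tokens = 15_000

# Reduction is the fraction of tokens removed by compression.
reduction = 1 - compressed_tokens / raw_tokens
print(f"Reduction: {reduction:.0%}")  # Reduction: 70%

# Cost impact at an assumed input price of $1.00 per 1M tokens:
price_per_m = 1.00
saved_dollars = (raw_tokens - compressed_tokens) / 1_000_000 * price_per_m
print(f"Saved per call: ${saved_dollars:.4f}")  # Saved per call: $0.0350
```

On long-context workloads this per-call saving is multiplied across every turn, which is where the bulk of the cost reduction comes from.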

Value Proposition: Memory That Pays for Itself

Mpalo distinguishes itself not just by price, but by architecture. While models like Gemini 2.5 Flash are aggressively priced, they lack persistent episodic memory. Mpalo's bundled approach (memory as inference cost) is often cheaper and faster than building your own vector database infrastructure.

Competitive Reality (2026)

Gemini 2.5 Flash: ~$0.24 blended. Cheap, but context is transient.

GPT-4o mini: ~$1.33 blended. Good baseline, no memory.

Palo Bloom: ~$0.90 blended (or ~$1.50 w/ Memory).

Why Pay More? You don't re-send 10k tokens of history every turn. You send 500 tokens of query + memory references.

Result: ~40-60% savings in total monthly costs.
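The input-side effect of sending memory references instead of full history can be sketched as follows. The blended prices are the figures quoted above (treated here as $ per 1M tokens); the 1M-turns-per-month workload and the 10k-vs-500 token counts are illustrative assumptions. Note this compares input tokens only; total monthly savings also depend on output tokens and conversation mix, which is why the overall figure lands in the ~40-60% range rather than the larger input-only gap.

```python
# Hypothetical workload: 1M turns/month.
# Without memory: each turn re-sends ~10k tokens of history.
# With Palo memory: each turn sends ~500 tokens of query + memory references.
TURNS = 1_000_000

def monthly_input_cost(price_per_m_tokens: float, tokens_per_turn: int) -> float:
    """Input-token spend for the month at a blended per-1M-token price."""
    return TURNS * tokens_per_turn / 1_000_000 * price_per_m_tokens

no_memory = monthly_input_cost(1.33, 10_000)  # GPT-4o mini, full history each turn
with_memory = monthly_input_cost(1.50, 500)   # Palo Bloom w/ Memory, references only

print(f"No memory:   ${no_memory:,.0f}/mo")   # No memory:   $13,300/mo
print(f"With memory: ${with_memory:,.0f}/mo") # With memory: $750/mo
```

Even at a higher blended rate, the much smaller per-turn payload dominates the input-side bill.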