Memory and Ultra Mode
Understand how memory works in Astell search and what Ultra mode is
Memory
Memory controls how much of your past conversation history Astell can carry into a new chat. The goal isn’t to replay everything you’ve ever said. The goal is continuity: Astell should be able to pick up the right context so you don’t have to restate background every time.
Astell Memory works across threads. That means when you start a new conversation, Astell can still use relevant context from previous conversations.
How Memory works
When you send a message, Astell does not load your entire chat history into the model. Instead, it runs a context-selection step:
- Astell searches across your conversation history (including other threads).
- It ranks prior messages by relevance to your current prompt.
- It includes the most relevant prior messages in the model’s context window for the new response.
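Conceptually, the selection step behaves like the sketch below. This is an illustration only: the function names, the toy relevance scoring, and the message cap are assumptions, not Astell’s actual implementation.

```python
# Illustrative sketch of context selection; not Astell's real code.

def score_relevance(prompt: str, message: str) -> float:
    """Toy relevance score: fraction of prompt words that also appear in the message."""
    prompt_words = set(prompt.lower().split())
    message_words = set(message.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & message_words) / len(prompt_words)

def select_context(prompt: str, threads: list[list[str]], max_messages: int) -> list[str]:
    """Pick the prior messages most relevant to the new prompt."""
    # 1. Search across the user's history, including other threads.
    candidates = [msg for thread in threads for msg in thread]
    # 2. Rank prior messages by relevance to the current prompt.
    ranked = sorted(candidates, key=lambda m: score_relevance(prompt, m), reverse=True)
    # 3. Include only the top-ranked messages in the model's context window.
    return ranked[:max_messages]
```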
Standard vs Expanded Memory
Both Standard and Expanded Memory search your past conversation history across threads. The difference is how many prior messages can be included in the model’s context for a single response.
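Continuing the sketch above, the two tiers differ only in how many ranked messages make it into the context. The caps below are placeholders for illustration, not Astell’s published limits:

```python
# Placeholder caps for illustration only; Astell's real limits are not published here.
MEMORY_MESSAGE_LIMITS = {
    "standard": 20,   # fewer prior messages included per response
    "expanded": 100,  # more prior messages included per response
}

# Both tiers search the same history; only the cap on included messages changes.
threads = [["We chose a phased rollout.", "The timeline slips one week."]]
context = select_context("Continue the rollout plan", threads, MEMORY_MESSAGE_LIMITS["expanded"])
```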
Memory availability by plan
Memory is plan-defined. You don’t choose a Memory setting manually.
- Sapling: Standard Memory
- Tree: Expanded Memory
- Grove: Expanded Memory
- Forest (Enterprise): Expanded Memory
When Memory helps most
Memory is most helpful when the “right answer” depends on what you already discussed:
- Long-running workstreams where decisions and constraints carry over
- Iterative planning where you want to continue from prior steps
- “What did we decide last time?” questions
- Ongoing drafting where intent and constraints matter across sessions
How to get better results with Memory
- Reference the project or decision explicitly (“Continue the onboarding plan we discussed last week.”)
- Ask for deltas (“What changed since our last decision?”)
- Reassert constraints when needed (“Same scope, new timeline.”)
What Memory does not do
Memory does not expand permissions or override access rules. It also doesn’t replace ingestion: Memory is conversation-history continuity, not a mechanism for turning documents into searchable workspace knowledge.
Ultra mode
Ultra mode is designed for complex queries that need extended workspace context across multiple sources. It works only with Advanced models, and it’s available on Grove and above. In most cases a normal Advanced model response is enough; Ultra mode is meant for the rare cases where you need to reason over your entire context, pulling together many sources at once and producing a single “final answer” output. You can toggle Ultra mode per chat, and a visible indicator shows when it’s active. Each Ultra-mode input uses 2 Advanced allowances.
What Ultra mode changes
Ultra mode increases the maximum context window used for the request so Advanced models can incorporate more workspace context across multiple sources in one response.
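A rough way to picture the toggle, with every number and name below a placeholder rather than a published Astell parameter:

```python
# All values here are placeholders for illustration, not Astell's real parameters.
STANDARD_CONTEXT_TOKENS = 100_000   # hypothetical normal window
ULTRA_CONTEXT_TOKENS = 400_000      # hypothetical enlarged window
ULTRA_ALLOWANCE_COST = 2            # stated above: each Ultra-mode input uses 2 Advanced allowances

def request_budget(ultra_enabled: bool) -> tuple[int, int]:
    """Return (max context tokens, Advanced allowances used) for one request."""
    if ultra_enabled:
        return ULTRA_CONTEXT_TOKENS, ULTRA_ALLOWANCE_COST
    return STANDARD_CONTEXT_TOKENS, 1
```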
Common use cases include:
- Pulling together relevant material across docs, threads, tickets, and pull requests
- Summarizing a long time range (for example, “last 90 days”)
- Reconciling conflicting notes across sources and producing one recommendation
When Ultra mode is worth it (and when it isn’t)
Ultra mode is worth it when one answer has to draw on many sources across your workspace at once, such as reconciling months of decisions, docs, and threads into a single recommendation. It isn’t worth it for routine questions: a normal Advanced model response usually covers those and doesn’t use the extra allowance.
How to prompt in Ultra mode (to avoid wasting allowance)
Because Ultra mode uses twice the Advanced allowance per input, treat it like a single high-quality request. The strongest prompts usually include:
- Objective (what “done” looks like)
- Scope (which systems and timeframe)
- Structure (summary → evidence → risks → recommendation → next steps)
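If it helps, here is that checklist as a fill-in template; the bracketed values below are examples, not required wording:

```python
# A fill-in prompt template following the Objective / Scope / Structure checklist above.
ULTRA_PROMPT_TEMPLATE = """\
Objective: {objective}
Scope: {sources}, covering {timeframe}
Structure the answer as: summary, evidence, risks, recommendation, next steps.
"""

prompt = ULTRA_PROMPT_TEMPLATE.format(
    objective="a final rollout plan with owners and a timeline",
    sources="docs, threads, tickets, and pull requests",
    timeframe="the last 90 days",
)
print(prompt)
```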
Practical examples
Memory example:
“Continue the rollout plan we discussed last week. Keep the same constraints and update the timeline based on what changed.”
Ultra mode example (Advanced + Ultra on):
“Pull together the last 90 days of decisions across docs and threads, then give me a final rollout plan with risks, owners, and timeline.”