Native vs. Processed Data
Understand the two types of data ingested and their respective costs.
What's the difference between Native Data and Processed Data?
- Native Data is structured content from integrations (messages, tasks, text-only docs, CRM records, calendar items, code-related text). Syncing it is free and unlimited and uses zero tokens.
- Processed Data generally refers to attachments and media (like documents, images, audio, and video). These consume tokens when they're processed.
What content is always free (Native Data)?
Native Data syncs without consuming tokens and is unlimited on all plans. The following content is always free:
- Messages: Slack messages/threads, Microsoft Teams chats, Discord messages, email conversations (Gmail/Outlook), DMs across platforms
- Tasks: Jira tickets/issues, GitHub issues/PRs, Linear tasks/projects, Asana tasks, Monday.com items, Trello cards
- CRM & business data: Salesforce records, HubSpot contacts/deals, customer notes/interactions, deal pipelines
- Calendar: Google Calendar events, Outlook calendar, meeting details/attendees, event descriptions
- Code & development: source code files, PR discussions, commit messages, code review comments, repository metadata
What kinds of content consume tokens?
Processed Data consumes tokens when it's ingested and processed. This includes all documents plus other file and media types. Common examples include:
- Documents: Notion pages, Confluence pages, Google Docs, PDFs, Word documents, PowerPoint/Slides, Excel/Sheets, and other document formats
- Images: screenshots, photos, diagrams, scanned documents, and other image files
- Audio: voice notes, meeting audio recordings, call recordings, and other audio files
- Video: meeting recordings, demos, training videos, and other video files
- Web pages: ingested web content processed for indexing
- Meetings: recorded meetings that require transcription and processing
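The two lists above amount to a lookup from content category to billing class. Here is a minimal sketch in Python. The category labels and the `consumes_tokens` helper are hypothetical and are not part of any Astell API, and the sketch does not model how many tokens a given item costs:

```python
# Hypothetical category labels drawn from the lists in this article.
# They are illustrative only and not part of any Astell API.
NATIVE_CATEGORIES = {"message", "task", "crm", "calendar", "code"}
PROCESSED_CATEGORIES = {
    "document",
    "image",
    "audio",
    "video",
    "web_page",
    "meeting_recording",
}


def consumes_tokens(category: str) -> bool:
    """Return True for Processed Data categories, False for Native Data."""
    if category in PROCESSED_CATEGORIES:
        return True
    if category in NATIVE_CATEGORIES:
        return False
    # Anything not listed in the article is left undecided rather than guessed.
    raise ValueError(f"unknown category: {category!r}")
```

For example, `consumes_tokens("message")` is `False` (a Slack message is Native Data), while `consumes_tokens("video")` is `True` (a meeting recording must be transcribed and analyzed).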
Why does some content consume tokens when text syncing is free?
Non-text content requires additional work to process and index. For example:
- Documents are priced based on the amount of content Astell has to process.
- Video processing includes transcription, diarization, and frame analysis.
- Audio processing includes transcription and diarization.
- Embedded images inside documents are processed separately.
How does unlimited syncing work?
Once you connect an integration like Slack, Gmail, or GitHub, Astell syncs your existing content and then keeps new content up to date in real time. All Native Data is indexed and fully searchable, and syncing it never consumes tokens.
Is "unlimited" subject to the Fair Use Policy?
Yes. We aim to make Native Data ingestion feel as unlimited as possible, but to guard against certain edge cases it is subject to the Fair Use Policy.