Pricing
Built for Salesforce teams.
Priced for AppExchange.
You bring your own LLM API keys. FlowMason never charges per token — your AI costs go directly to your provider.
Starter
For individual developers and small teams exploring AI pipelines
- Up to 3 active pipelines
- 1 LLM provider
- Flow Builder invocation (@InvocableMethod)
- Apex facade (PipelineRunner.run())
- Basic audit log (30-day retention)
- FMTestMocks test framework
- Community support
Professional
For Salesforce teams shipping AI features in production
- Unlimited active pipelines
- All 7 LLM providers + provider switching
- Pipeline Studio (visual builder + SFDX export)
- Full audit log + cost reports (unlimited retention)
- All 6 LWC drop-in components
- REST API endpoints
- Trigger framework + Platform Events + Scheduler
- Async execution + GovernorMonitor
- Email support + response SLA
Enterprise
For orgs with compliance, multi-org, or private AI requirements
- Everything in Professional
- Multi-org deployment
- Private/on-premise LLM support (Ollama, custom Bedrock)
- Custom permission set configuration
- SSO integration support
- Dedicated onboarding + architecture review
- Custom SLA + named support contact
Full feature comparison
| Feature | Starter | Professional | Enterprise |
|---|---|---|---|
| Active pipelines | 3 | Unlimited | Unlimited |
| LLM providers | 1 | All 7 | All 7 + custom |
| Flow Builder (@InvocableMethod) | ✓ | ✓ | ✓ |
| Apex facade (PipelineRunner) | ✓ | ✓ | ✓ |
| FMTestMocks (unit testing) | ✓ | ✓ | ✓ |
| Pipeline Studio + SFDX export | — | ✓ | ✓ |
| LWC drop-in components (6) | — | ✓ | ✓ |
| REST API endpoints | — | ✓ | ✓ |
| Trigger framework + Platform Events | — | ✓ | ✓ |
| Async execution + GovernorMonitor | — | ✓ | ✓ |
| Audit log | 30 days | Unlimited | Unlimited |
| Cost reports | — | ✓ | ✓ |
| Private / on-premise LLM | — | — | ✓ |
| Multi-org deployment | — | — | ✓ |
| SSO integration | — | — | ✓ |
| Support | Community | Email + SLA | Named contact |
Beyond the FlowMason subscription
FlowMason itself is a per-org subscription. LLM calls bill to your vendor account directly; we never mark up. Below is how to forecast the variable line.
Direct mode (vendor APIs)
Calls go to Anthropic / OpenAI / Bedrock / etc. through a Named Credential, and the vendor bills you directly. No Einstein Requests consumed.
- Per-stage Cost__c in Pipeline_Stage_Log__c
- Aggregated per-run in ExecutionTrace.usage
- Pricing rows in FM_Provider_Pricing__mdt
Trust-layer mode (Models API)
Calls route through Salesforce Einstein Models API. Bills as Einstein Requests, not USD.
- Source of truth: Setup → Einstein → Usage
- BYOLLM Open Connector models 30% cheaper than Salesforce-managed
- FlowMason logs request counts to FMLog
Tool-calling cost (ADR-013)
Tool-calling adds round-trips per turn but reduces prompt tokens, because conversation history lives server-side in FMThreadState and is not re-sent each turn.
| Metric | Pre-tool-calling | Tool-calling on |
|---|---|---|
| Round-trips / turn | 2 | 3-5 (capped at orgChatToolCallingMaxCalls, default 5) |
| Prompt tokens | baseline | ≥ 30% reduction on long conversations |
| Wall-clock budget | n/a | orgChatToolCallingTimeoutMs default 25s |
Net cost depends on conversation length. Short conversations cost slightly more; long ones cost less.
Order-of-magnitude forecasting
- Anthropic Sonnet, 1k input / 500 output, $3 / $15 per 1M = ~$0.011 / call
- OpenAI GPT-4o, 1k input / 500 output, $2.50 / $10 per 1M = ~$0.0075 / call
- 100 turns/day × 4 round-trips × $0.01 = ~$4.00 / day ≈ $120/month per active chat user
2026-04 vendor rates. FM_Provider_Pricing__mdt is the source of truth.
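The per-call figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, not FlowMason code: the `Rate` class and both function names are hypothetical, the rates are hard-coded from the bullets above rather than read from FM_Provider_Pricing__mdt, and the 30-day month is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per 1M tokens. Hypothetical stand-in for one pricing row."""
    input_per_million: float
    output_per_million: float


def cost_per_call(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single LLM call at the given token counts."""
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / 1_000_000


def monthly_cost_per_user(call_cost: float, turns_per_day: int,
                          round_trips_per_turn: int, days: int = 30) -> float:
    """Order-of-magnitude monthly spend for one active chat user."""
    return call_cost * turns_per_day * round_trips_per_turn * days


# Rates copied from the bullets above (assumed current; check your own pricing rows).
SONNET = Rate(input_per_million=3.00, output_per_million=15.00)
GPT_4O = Rate(input_per_million=2.50, output_per_million=10.00)

print(round(cost_per_call(SONNET, 1000, 500), 4))  # 0.0105
print(round(cost_per_call(GPT_4O, 1000, 500), 4))  # 0.0075
```

Pass your own turns per day and round-trips per turn to `monthly_cost_per_user` to scale a per-call figure to a per-user monthly estimate.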
Cost-optimisation playbook
- Cache hot prompts. Set cacheable: true on stages with stable system + user + model + temp. Identical input → zero callout.
- Pin smaller models per stage. Classification + extraction usually work with the smallest model in the family.
- Drop useFewShot where unused. Few-shot prefix tokens land on every call.
- Watch __meta.providerAttempts. High = first provider failing often. Rotate the order in providerFallback.
- Tune orgChatToolCallingMaxCalls. Over-permissive lets the model loop; under-permissive aborts useful flows.
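To make the "identical input → zero callout" idea concrete, here is a minimal sketch of a prompt cache keyed on the four inputs named in the first tip. It is written in Python for illustration only and is an assumption about the general technique, not FlowMason's implementation: `PromptCache`, `fake_provider`, and the in-memory dictionary are all hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict


class PromptCache:
    """In-memory cache keyed on (system, user, model, temperature)."""

    def __init__(self, call_provider: Callable[[str, str, str, float], str]):
        self._call_provider = call_provider
        self._store: Dict[str, str] = {}
        self.callouts = 0  # number of real provider calls made

    @staticmethod
    def key(system: str, user: str, model: str, temperature: float) -> str:
        # Serialise all four inputs so that any change produces a new key.
        payload = json.dumps([system, user, model, temperature])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, system: str, user: str, model: str,
                 temperature: float) -> str:
        k = self.key(system, user, model, temperature)
        if k not in self._store:
            # Cache miss: pay for one callout and remember the result.
            self.callouts += 1
            self._store[k] = self._call_provider(system, user, model, temperature)
        return self._store[k]


def fake_provider(system: str, user: str, model: str, temperature: float) -> str:
    """Hypothetical stand-in for a billed LLM call."""
    return f"{model}:{user.upper()}"


cache = PromptCache(fake_provider)
cache.complete("You classify cases.", "billing issue", "small-model", 0.0)
cache.complete("You classify cases.", "billing issue", "small-model", 0.0)
print(cache.callouts)  # 1
```

The second call is served from the cache, so only one callout is billed. Changing any of the four inputs, including temperature, produces a different key and a fresh callout, which is why caching only pays off on stages where all four stay stable.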