FLOW MASON

AI Insights & Analytics

Track costs, monitor performance, detect anomalies, and get AI-powered optimization recommendations.

Beta Feature - This feature is currently in beta and under active testing. For early access, contact [email protected].

FlowMason Studio provides AI-powered analytics that automatically detect patterns, anomalies, and optimization opportunities in your pipeline executions.

Overview

The insights engine analyzes your execution history to provide:

  • Cost Analysis - Track spending, detect spikes, forecast future costs
  • Performance Monitoring - Identify degradation, latency outliers, slow stages
  • Reliability Tracking - Failure patterns, error distribution, mean time to recovery (MTTR)
  • Usage Patterns - Peak hours, busiest pipelines, model preferences
  • Optimization Recommendations - Model selection, cost reduction opportunities

Quick Start

Get Insights Summary

GET /api/v1/analytics/insights/summary

Response:

{
  "generated_at": "2024-01-15T10:30:00Z",
  "total_insights": 5,
  "critical_count": 1,
  "warning_count": 2,
  "info_count": 2,
  "top_cost_insight": {
    "type": "cost_spike",
    "severity": "warning",
    "title": "Cost increased by 45%",
    "description": "Spending increased from $12.50 to $18.12..."
  },
  "top_reliability_insight": {
    "type": "failure_pattern",
    "severity": "critical",
    "title": "Critical failure rate: 28%"
  },
  "estimated_savings": 8.50,
  "performance_change_percent": -5.2,
  "reliability_change_percent": 3.1
}

Get Full Report

GET /api/v1/analytics/insights/report?days=30

Returns a comprehensive analysis including trends, breakdowns, forecasts, and recommendations.
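
Both endpoints can be called from any HTTP client. The sketch below uses Python requests; the base URL and bearer-token header are assumptions, so adjust them to match your Studio deployment.

import requests

BASE_URL = "http://localhost:8000"  # assumed Studio address -- adjust for your deployment
HEADERS = {"Authorization": "Bearer <your-api-token>"}  # assumed auth scheme

# Quick summary of current insights
summary = requests.get(
    f"{BASE_URL}/api/v1/analytics/insights/summary",
    headers=HEADERS,
    timeout=30,
).json()
print(f"{summary['critical_count']} critical, {summary['warning_count']} warnings")
print(f"Estimated savings: ${summary['estimated_savings']:.2f}")

# Full 30-day report (trends, breakdowns, forecasts, recommendations)
report = requests.get(
    f"{BASE_URL}/api/v1/analytics/insights/report",
    params={"days": 30},
    headers=HEADERS,
    timeout=30,
).json()
print(f"Report sections: {sorted(report)}")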

Insight Types

Cost Insights

Type                   Description                        Severity
cost_spike             Spending increased significantly   Warning/Critical
cost_optimization      Waste or concentration detected    Info
model_recommendation   Cheaper model available            Info

Example:

{
  "type": "cost_spike",
  "severity": "warning",
  "title": "Cost increased by 45%",
  "description": "Spending increased from $12.50 to $18.12",
  "data": {
    "previous_cost": 12.50,
    "current_cost": 18.12,
    "increase_percent": 45
  },
  "recommendations": [
    "Review usage patterns for unexpected increases",
    "Consider smaller models for simple tasks",
    "Check for runaway pipelines"
  ]
}

Performance Insights

Type                      Description                Severity
performance_degradation   Execution time increased   Warning
performance_improvement   Execution time decreased   Info
anomaly                   High latency variability   Info

Example:

{
  "type": "performance_degradation",
  "severity": "warning",
  "title": "Execution time increased by 35%",
  "data": {
    "previous_avg_ms": 1200,
    "current_avg_ms": 1620
  },
  "recommendations": [
    "Check for slow API responses or rate limiting",
    "Review recent pipeline changes",
    "Consider caching repeated operations"
  ]
}

Reliability Insights

Type              Description          Severity
failure_pattern   High failure rate    Critical/Warning
reliability       Reliability issues   Warning

Example:

{
  "type": "failure_pattern",
  "severity": "critical",
  "title": "Critical failure rate: 28%",
  "data": {
    "failure_rate": 0.28,
    "total_failures": 42,
    "by_error_type": {
      "api_error": 25,
      "timeout": 12,
      "validation_error": 5
    }
  },
  "recommendations": [
    "Check recent pipeline changes",
    "Verify API credentials and rate limits",
    "Review error logs for root cause"
  ]
}

Trend Analysis

Track metrics over time:

{
  "cost_trend": {
    "metric_name": "cost",
    "current_value": 18.12,
    "previous_value": 12.50,
    "change_percent": 44.96,
    "direction": "up",
    "is_significant": true
  }
}

The direction field is one of up, down, or stable.
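
The change_percent and direction fields follow directly from the two period totals. Here is a minimal sketch of that calculation; the 5% cutoff used for is_significant is an illustrative assumption, not a documented engine threshold.

def compute_trend(previous_value: float, current_value: float,
                  significance_pct: float = 5.0) -> dict:
    # Derive change_percent and direction from two period totals.
    # The 5% significance cutoff is an illustrative assumption.
    if previous_value == 0:
        change_percent = 0.0
    else:
        change_percent = (current_value - previous_value) / previous_value * 100
    if abs(change_percent) < significance_pct:
        direction = "stable"
    elif change_percent > 0:
        direction = "up"
    else:
        direction = "down"
    return {
        "current_value": current_value,
        "previous_value": previous_value,
        "change_percent": round(change_percent, 2),
        "direction": direction,
        "is_significant": abs(change_percent) >= significance_pct,
    }

# (18.12 - 12.50) / 12.50 * 100 == 44.96 -> "up" and significant, as in the example
print(compute_trend(12.50, 18.12))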

Performance Metrics

{
  "performance_metrics": {
    "avg_duration_ms": 1250,
    "p50_duration_ms": 1100,
    "p95_duration_ms": 2800,
    "p99_duration_ms": 4500,
    "slowest_stages": [
      { "stage_id": "generator_1", "avg_ms": 2500 }
    ]
  }
}
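
If you want to sanity-check these numbers against your own execution logs, the percentile latencies can be reproduced from raw durations. The sketch below uses a simple nearest-rank method on made-up sample data; the engine's exact percentile method may differ.

def percentile(samples: list[float], pct: float) -> float:
    # Nearest-rank percentile; the engine's exact method may differ.
    ordered = sorted(samples)
    index = max(0, round(pct / 100 * len(ordered)) - 1)
    return ordered[index]

durations_ms = [950, 1100, 1180, 1250, 1400, 2800, 4500]  # illustrative data only
metrics = {
    "avg_duration_ms": round(sum(durations_ms) / len(durations_ms)),
    "p50_duration_ms": percentile(durations_ms, 50),
    "p95_duration_ms": percentile(durations_ms, 95),
    "p99_duration_ms": percentile(durations_ms, 99),
}
print(metrics)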

Cost Forecasting

{
  "cost_forecast": {
    "current_daily_avg": 2.50,
    "projected_daily": 2.75,
    "projected_weekly": 19.25,
    "projected_monthly": 82.50,
    "trend": "up",
    "confidence": 0.85
  }
}
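
The projected figures are simple multiples of the projected daily average; a 30-day month is assumed here, consistent with the numbers in the example.

projected_daily = 2.75
print({
    "projected_daily": projected_daily,
    "projected_weekly": projected_daily * 7,    # 19.25
    "projected_monthly": projected_daily * 30,  # 82.50
})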

Optimization Opportunities

The engine identifies actionable optimizations:

{
  "optimization_opportunities": [
    {
      "type": "model_downgrade",
      "description": "Switch to claude-3-haiku for simple tasks",
      "current_cost": 45.00,
      "potential_savings": 31.50,
      "savings_percent": 70,
      "difficulty": "medium"
    },
    {
      "type": "reliability",
      "description": "Reduce failures to cut wasted resources",
      "current_cost": 5.60,
      "potential_savings": 4.48,
      "difficulty": "medium"
    }
  ]
}
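
When acting on these, it helps to rank opportunities by absolute savings. A small sketch, where the hard-coded list stands in for the report's optimization_opportunities array:

opportunities = [
    {"type": "model_downgrade", "current_cost": 45.00, "potential_savings": 31.50},
    {"type": "reliability", "current_cost": 5.60, "potential_savings": 4.48},
]

for opp in sorted(opportunities, key=lambda o: o["potential_savings"], reverse=True):
    pct = opp["potential_savings"] / opp["current_cost"] * 100
    print(f"{opp['type']}: save ${opp['potential_savings']:.2f} (~{pct:.0f}%)")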

Model Efficiency

Compare efficiency across models:

{
  "model_efficiency": [
    {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet",
      "avg_latency_ms": 1200,
      "cost_per_1k_tokens": 0.003,
      "success_rate": 0.98,
      "usage_count": 1250
    },
    {
      "provider": "openai",
      "model": "gpt-4o-mini",
      "avg_latency_ms": 800,
      "cost_per_1k_tokens": 0.00015,
      "success_rate": 0.99,
      "usage_count": 850
    }
  ]
}
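
One way to use this data is to rank models by cost adjusted for reliability. The composite score below (cost per 1k tokens divided by success rate) is an illustrative choice, not the engine's own ranking.

models = [
    {"model": "claude-3-5-sonnet", "cost_per_1k_tokens": 0.003,
     "avg_latency_ms": 1200, "success_rate": 0.98},
    {"model": "gpt-4o-mini", "cost_per_1k_tokens": 0.00015,
     "avg_latency_ms": 800, "success_rate": 0.99},
]

for m in sorted(models, key=lambda m: m["cost_per_1k_tokens"] / m["success_rate"]):
    print(f"{m['model']}: ${m['cost_per_1k_tokens']:.5f}/1k tokens, "
          f"{m['avg_latency_ms']} ms avg, {m['success_rate']:.0%} success")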

Filtering Insights

GET /api/v1/analytics/insights?category=cost&severity=warning&pipeline_id=my-pipeline

Filter by:

  • category: cost, performance, reliability, usage, optimization
  • severity: critical, warning, info
  • pipeline_id: Focus on specific pipeline
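
From Python, the same filters can be passed as query parameters; the base URL and auth header below are the same assumptions as in the Quick Start sketch.

import requests

insights = requests.get(
    "http://localhost:8000/api/v1/analytics/insights",     # assumed base URL
    params={"category": "cost", "severity": "warning", "pipeline_id": "my-pipeline"},
    headers={"Authorization": "Bearer <your-api-token>"},   # assumed auth scheme
    timeout=30,
).json()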

Detection Thresholds

Metric                    Warning        Critical
Cost spike                50% increase   100% increase
Failure rate              10%            25%
Performance degradation   30% slower     N/A
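
These thresholds map onto severities as sketched below. This is illustrative client-side logic only; the insights engine applies the thresholds server-side.

def cost_spike_severity(increase_percent: float):
    if increase_percent >= 100:
        return "critical"
    if increase_percent >= 50:
        return "warning"
    return None

def failure_rate_severity(failure_rate: float):
    if failure_rate >= 0.25:
        return "critical"
    if failure_rate >= 0.10:
        return "warning"
    return None

print(cost_spike_severity(120))     # critical
print(failure_rate_severity(0.28))  # critical, as in the example insight above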

Python Integration

from flowmason_studio.services.insights_service import get_insights_service

service = get_insights_service()

# Generate full report
report = service.generate_report(
    org_id="default",
    days=30,
    include_recommendations=True,
    include_forecasts=True,
)

print(f"Total insights: {len(report.insights)}")
print(f"Potential savings: ${report.total_potential_savings:.2f}")

# Check for critical issues
critical = [i for i in report.insights if i.severity == "critical"]
if critical:
    print(f"ALERT: {len(critical)} critical issues!")
    for insight in critical:
        print(f"  - {insight.title}")

# Get quick summary
summary = service.get_summary(org_id="default", days=7)
print(f"Critical: {summary.critical_count}")
print(f"Warnings: {summary.warning_count}")

Best Practices

  1. Review daily - Check insights summary for critical issues
  2. Act on critical - Address critical severity immediately
  3. Track trends - Monitor direction over time
  4. Optimize incrementally - Address one opportunity at a time
  5. Set alerts - Integrate with webhooks for critical insights (see the sketch after this list)
  6. Compare periods - Use different time ranges to spot patterns
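
For best practice 5, a minimal polling alerter might look like the following. The webhook URL is a placeholder, and the Studio base URL and auth header are the same assumptions as in the earlier sketches.

import requests

STUDIO_URL = "http://localhost:8000"                         # assumed Studio address
WEBHOOK_URL = "https://hooks.example.com/flowmason-alerts"   # placeholder webhook
HEADERS = {"Authorization": "Bearer <your-api-token>"}       # assumed auth scheme

summary = requests.get(
    f"{STUDIO_URL}/api/v1/analytics/insights/summary",
    headers=HEADERS,
    timeout=30,
).json()

if summary["critical_count"] > 0:
    requests.post(
        WEBHOOK_URL,
        json={
            "text": f"FlowMason: {summary['critical_count']} critical insight(s) detected",
            "top_reliability_insight": summary.get("top_reliability_insight"),
        },
        timeout=30,
    )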

Dashboard Integration

The Studio web portal displays insights in real time:

  • Summary cards - Quick view of critical/warning counts
  • Trend charts - Visualize cost and performance over time
  • Failure analysis - Breakdown by error type and pipeline
  • Model comparison - Side-by-side efficiency metrics
  • Recommendations - Actionable optimization suggestions