In this tutorial, you’ll build a content summarizer pipeline that fetches web content, extracts the text, and generates a summary using AI.
What You’ll Learn
- Define multi-stage pipelines
- Use built-in operators and nodes
- Connect stages with dependencies
- Use template expressions
The Pipeline We’ll Build
Our content summarizer will:
- Fetch - Retrieve content from a URL
- Extract - Parse and extract the main text
- Summarize - Generate a summary using AI
Step 1: Create the Pipeline File
Create a new file pipelines/content-summarizer.pipeline.json:
{
  "name": "content-summarizer",
  "version": "1.0.0",
  "description": "Fetch and summarize web content"
}
Step 2: Define Input Schema
Add an input schema to validate the URL:
{
  "name": "content-summarizer",
  "version": "1.0.0",
  "description": "Fetch and summarize web content",
  "input": {
    "type": "object",
    "properties": {
      "url": {
        "type": "string",
        "format": "uri",
        "description": "URL to summarize"
      }
    },
    "required": ["url"]
  }
}
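To see what this schema accepts and rejects, here is a minimal Python sketch. It is not the runtime's validator: `validate_input` is a hypothetical helper that covers only the keywords used in this tutorial (`required`, type `string`, format `uri`), and its URI check is simplified to "has a scheme and a host".

```python
from urllib.parse import urlparse

# Input schema copied from the pipeline definition.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "url": {"type": "string", "format": "uri", "description": "URL to summarize"}
    },
    "required": ["url"],
}


def validate_input(payload, schema=INPUT_SCHEMA):
    """Return a list of error messages; an empty list means the payload passed.

    Hypothetical helper, not part of the pipeline tooling.
    """
    if not isinstance(payload, dict):
        return ["input must be an object"]
    errors = []
    for key in schema.get("required", []):
        if key not in payload:
            errors.append(f"missing required property: {key}")
    for key, rules in schema.get("properties", {}).items():
        if key not in payload:
            continue
        value = payload[key]
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{key} must be a string")
            continue
        if rules.get("format") == "uri":
            # Simplified check: require both a scheme and a host.
            parsed = urlparse(value)
            if not parsed.scheme or not parsed.netloc:
                errors.append(f"{key} must be an absolute URI")
    return errors


print(validate_input({"url": "https://example.com/article"}))
print(validate_input({}))
```

The first call prints an empty list, and the second reports the missing `url` property.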
Step 3: Add the Fetch Stage
Add the first stage to fetch content:
{
  "stages": [
    {
      "id": "fetch",
      "component": "http-request",
      "config": {
        "url": "{{input.url}}",
        "method": "GET"
      }
    }
  ]
}
Notice the {{input.url}} template expression: it references the url property of the input we defined in Step 2.
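If it helps to picture what a template expression does, the following Python sketch replaces `{{dotted.path}}` placeholders with values from a nested dictionary. The names `render` and `lookup` and the regular expression are assumptions made for illustration. The real engine's syntax rules and error behavior may differ.

```python
import re

# Matches {{ dotted.path }} with optional whitespace inside the braces.
TEMPLATE = re.compile(r"\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}")


def lookup(context, path):
    """Walk a dotted path such as 'input.url' through nested dicts."""
    value = context
    for part in path.split("."):
        value = value[part]  # raises KeyError if the path does not exist
    return value


def render(template, context):
    """Replace every {{path}} placeholder with the value found in context."""
    return TEMPLATE.sub(lambda m: str(lookup(context, m.group(1))), template)


context = {"input": {"url": "https://example.com/article"}}
print(render("{{input.url}}", context))  # https://example.com/article
```

In this sketch an unknown path raises a `KeyError` rather than rendering an empty string, which makes typos in expressions easy to spot.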
Step 4: Add the Extract Stage
Add a second stage to the stages array to extract the content:
{
  "id": "extract",
  "component": "json-transform",
  "depends_on": ["fetch"],
  "config": {
    "expression": "body.content || body.text || body"
  }
}
The depends_on array specifies that this stage waits for “fetch” to complete.
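One way to think about depends_on is as edges in a dependency graph, from which a run order can be derived. The sketch below uses Python's standard `graphlib` module (Python 3.9+) on the two stages defined so far. The function name `execution_order` is hypothetical, and the actual scheduler may work differently.

```python
from graphlib import TopologicalSorter

# Stages listed out of order on purpose; only id and depends_on matter here.
stages = [
    {"id": "extract", "depends_on": ["fetch"]},
    {"id": "fetch"},
]


def execution_order(stages):
    """Return stage ids so that every stage comes after its dependencies."""
    # Map each stage id to the set of stage ids it waits for.
    graph = {stage["id"]: set(stage.get("depends_on", [])) for stage in stages}
    return list(TopologicalSorter(graph).static_order())


print(execution_order(stages))  # ['fetch', 'extract']
```

A circular dependency, such as two stages that depend on each other, makes `static_order()` raise `graphlib.CycleError`.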
Step 5: Add the Summarize Stage
Add the AI summarization stage:
{
  "id": "summarize",
  "component": "generator",
  "depends_on": ["extract"],
  "config": {
    "prompt": "Summarize the following content in 3 bullet points:\n\n{{stages.extract.output}}"
  }
}
The {{stages.extract.output}} expression references the output of the extract stage.
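To make the data flow between stages concrete, here is a rough Python sketch that stubs the fetch result, applies a Python reading of the extract expression, and builds the prompt from the extract output. The shape of the fetch output (a `body` field) and the helper names are assumptions. Python's `or` only approximates the `||` fallback for the string values used here.

```python
def extract_text(body):
    """Approximate "body.content || body.text || body": first truthy value wins."""
    if isinstance(body, dict):
        return body.get("content") or body.get("text") or body
    return body


def build_prompt(context):
    # Stands in for rendering {{stages.extract.output}} inside the prompt string.
    return (
        "Summarize the following content in 3 bullet points:\n\n"
        + str(context["stages"]["extract"]["output"])
    )


# Stubbed fetch result; a real run would perform the HTTP request instead.
context = {
    "stages": {
        "fetch": {"output": {"body": {"text": "Pipelines chain stages together."}}}
    }
}

# Each stage stores its result under stages.<id>.output for later stages to read.
body = context["stages"]["fetch"]["output"]["body"]
context["stages"]["extract"] = {"output": extract_text(body)}

print(build_prompt(context))
```

Because the stubbed body has no `content` field, the extract step falls back to `text`, and that string ends up at the end of the prompt.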
Complete Pipeline
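Here’s the full pipeline. In addition to the stages from the previous steps, it declares an output schema with a summary string: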
{
  "name": "content-summarizer",
  "version": "1.0.0",
  "description": "Fetch and summarize web content",
  "input": {
    "type": "object",
    "properties": {
      "url": {
        "type": "string",
        "format": "uri",
        "description": "URL to summarize"
      }
    },
    "required": ["url"]
  },
  "output": {
    "type": "object",
    "properties": {
      "summary": { "type": "string" }
    }
  },
  "stages": [
    {
      "id": "fetch",
      "component": "http-request",
      "config": {
        "url": "{{input.url}}",
        "method": "GET"
      }
    },
    {
      "id": "extract",
      "component": "json-transform",
      "depends_on": ["fetch"],
      "config": {
        "expression": "body.content || body.text || body"
      }
    },
    {
      "id": "summarize",
      "component": "generator",
      "depends_on": ["extract"],
      "config": {
        "prompt": "Summarize the following content in 3 bullet points:\n\n{{stages.extract.output}}"
      }
    }
  ]
}
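Before running a pipeline like this, it can be useful to check that the stage graph is consistent. The following Python sketch is a hypothetical lint (`check_pipeline` is not part of the tooling) that flags duplicate stage ids and depends_on entries that name undefined stages. It uses a trimmed copy of the pipeline with only the fields it inspects.

```python
def check_pipeline(pipeline):
    """Return a list of problems in the stage graph; an empty list means none."""
    problems = []
    stages = pipeline.get("stages", [])
    seen = set()
    for stage in stages:
        if stage["id"] in seen:
            problems.append(f"duplicate stage id: {stage['id']}")
        seen.add(stage["id"])
    for stage in stages:
        for dep in stage.get("depends_on", []):
            if dep not in seen:
                problems.append(f"{stage['id']} depends on unknown stage: {dep}")
    return problems


# Trimmed copy of the pipeline: only id and depends_on are checked.
pipeline = {
    "name": "content-summarizer",
    "stages": [
        {"id": "fetch"},
        {"id": "extract", "depends_on": ["fetch"]},
        {"id": "summarize", "depends_on": ["extract"]},
    ],
}

print(check_pipeline(pipeline))  # []
```

A typo such as `"depends_on": ["extrcat"]` would show up as `summarize depends on unknown stage: extrcat`.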
Step 6: Run the Pipeline
Using CLI
fm run pipelines/content-summarizer.pipeline.json \
  --input '{"url": "https://example.com/article"}'
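If you generate the command from a script, quoting the JSON input by hand is error-prone. This small Python sketch builds the same command line with the standard library. It assumes only the `fm run` command and `--input` flag shown above, and `build_run_command` is a hypothetical helper.

```python
import json
import shlex


def build_run_command(pipeline_path, payload):
    """Build a POSIX shell command line for running a pipeline with JSON input."""
    # json.dumps produces the JSON text; shlex.join quotes it for the shell.
    return shlex.join(["fm", "run", pipeline_path, "--input", json.dumps(payload)])


command = build_run_command(
    "pipelines/content-summarizer.pipeline.json",
    {"url": "https://example.com/article"},
)
print(command)
```

The printed command matches the one above on a single line, with the JSON argument wrapped in single quotes.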
Using VS Code
- Open the pipeline file
- Press F5
- Enter the URL when prompted
Step 7: View in DAG Editor
- With the pipeline open, click the DAG icon in the toolbar
- See your pipeline as a visual graph:
[fetch] → [extract] → [summarize]
Adding Error Handling
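To handle a failed request, wrap the fetch stage in a trycatch component. The catch_stage value refers to a handle-error stage, which is not part of the pipeline above, so you would need to define it as well: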
{
  "id": "safe-fetch",
  "component": "trycatch",
  "config": {
    "try_stage": "fetch",
    "catch_stage": "handle-error"
  }
}
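Conceptually, a try/catch wrapper of this kind runs one stage and falls back to another when the first fails. The Python sketch below illustrates that idea with plain functions. The exact semantics of the trycatch component, and what a handle-error stage receives, are assumptions here, and the stub `fetch` always fails to show the fallback path.

```python
def run_with_catch(try_stage, catch_stage, payload):
    """Run try_stage; if it raises, pass the error to catch_stage instead."""
    try:
        return try_stage(payload)
    except Exception as error:
        return catch_stage(payload, error)


def fetch(payload):
    # Stub that simulates a network failure.
    raise ConnectionError(f"could not reach {payload['url']}")


def handle_error(payload, error):
    # Return a well-formed result so later stages can decide what to do.
    return {"summary": None, "error": str(error)}


result = run_with_catch(fetch, handle_error, {"url": "https://example.com/article"})
print(result)
```

Here the result carries the error message instead of a summary, so the failure is reported rather than crashing the run.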
Exercises
Try these enhancements:
- Add language detection - Detect the content language before summarizing
- Add keyword extraction - Extract key topics from the content
- Multiple formats - Output summary as Markdown, HTML, or plain text
What’s Next?
You’ve built a multi-stage AI pipeline! In the next tutorial, you’ll learn how to debug pipelines effectively.
Key concepts covered:
- Multi-stage pipeline structure
- Input/output schemas
- Template expressions
- Stage dependencies
- Built-in components
Continue to Debugging Pipelines.