Setting Up Alerts¶
Alerts notify you when something goes wrong with your integrations -- before users notice and before issues cascade. DIBOP supports automated alerts based on configurable rules.
What Alerts Do¶
An alert is an automated notification that fires when a metric crosses a threshold you define. Alerts help you:
- Detect orchestration failures within minutes instead of hours
- Catch SLA breaches before they accumulate
- Learn when a connected system goes offline
- Respond to rate limiting or credential issues quickly
Alert Types¶
DIBOP supports several types of alerts:
| Type | Description | Example |
|---|---|---|
| Execution Failure | An orchestration execution fails | "VIN Decode Sync failed" |
| Consecutive Failures | An orchestration fails N times in a row | "Inventory Sync has failed 5 consecutive times" |
| SLA Breach | Success rate drops below a threshold over a time window | "SLA dropped below 99% over the last 7 days" |
| System Down | A connected system fails health checks | "Mercedes-Benz OneAPI is offline" |
| Latency Threshold | Average response time exceeds a limit | "DMS average latency exceeded 5000ms" |
| Error Rate Spike | Error rate exceeds a percentage over a time window | "Error rate for CRM exceeded 10% in the last hour" |
Notification Channels¶
When an alert fires, DIBOP can notify you through:
Email¶
Send an email to one or more recipients. The email includes:
- Alert rule name and description
- The metric value that triggered the alert
- A link to the relevant dashboard or execution trace
- Suggested actions
Webhook¶
Send an HTTP POST to a URL of your choice. The webhook payload includes structured JSON data about the alert, suitable for integration with:
- Incident management tools (PagerDuty, OpsGenie)
- Chat platforms (via custom integrations)
- Internal monitoring systems
Example webhook payload:
```json
{
  "alert_rule": "SLA Breach - Inventory Sync",
  "severity": "critical",
  "metric": "success_rate",
  "threshold": 99.0,
  "actual_value": 97.5,
  "window": "7d",
  "triggered_at": "2026-04-08T14:32:00Z",
  "orchestration_id": "abc-123",
  "dashboard_url": "https://app.dibop.ca/observability"
}
```
Slack¶
Send a message to a Slack channel. Configure this in the alert rule by providing a Slack webhook URL.
Creating Your First Alert¶
Step 1: Navigate to Alert Rules¶
Go to MONITOR > Alert Rules in the sidebar and click Create Rule.
Step 2: Define the Rule¶
| Field | Description |
|---|---|
| Name | A descriptive name (e.g., "SLA Breach - Vehicle Sync") |
| Description | What this alert monitors and why |
| Type | The alert type (from the list above) |
| Scope | Which orchestrations or systems this rule applies to (all, or specific ones) |
Step 3: Set the Condition¶
Configure the threshold that triggers the alert:
Example: SLA Breach
| Setting | Value |
|---|---|
| Metric | Success Rate |
| Operator | Less than |
| Threshold | 99% |
| Window | 7 days |
| Evaluate Every | 1 hour |
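The SLA Breach condition above boils down to comparing a windowed success rate against the threshold. A minimal sketch of that check, assuming you already have success and total counts for the window (the function name and signature are hypothetical, not DIBOP's API):

```python
def sla_breached(successes: int, total: int, threshold_pct: float = 99.0) -> bool:
    """Hypothetical evaluation: does the success rate over the window
    fall below the threshold?"""
    if total == 0:
        return False  # no executions in the window: nothing to evaluate
    success_rate = 100.0 * successes / total
    return success_rate < threshold_pct

# 975 successes out of 1000 executions = 97.5%, below 99% -> breach
print(sla_breached(975, 1000))  # True
print(sla_breached(995, 1000))  # False
```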
Example: Consecutive Failures
| Setting | Value |
|---|---|
| Metric | Consecutive Failures |
| Operator | Greater than or equal to |
| Threshold | 3 |
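The Consecutive Failures condition counts the failure streak at the end of the execution history. A sketch of that counting logic (the status strings and function name are hypothetical):

```python
def consecutive_failures(statuses: list[str]) -> int:
    """Count the failure streak at the end of an execution history
    (oldest first, newest last). Status strings are hypothetical."""
    streak = 0
    for status in reversed(statuses):
        if status != "failed":
            break
        streak += 1
    return streak

history = ["success", "failed", "failed", "failed"]
print(consecutive_failures(history))  # 3 -> a threshold of 3 would fire
```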
Step 4: Configure Notifications¶
Choose one or more notification channels and configure them:
- Email: Enter recipient email addresses
- Webhook: Enter the webhook URL
- Slack: Enter the Slack webhook URL
Step 5: Save and Activate¶
Click Save. The rule is immediately active and begins evaluating at the configured interval.
Silencing Alerts¶
During planned maintenance, you may want to silence alerts temporarily to avoid false notifications.
Silencing All Alerts¶
- On the Observability Dashboard, click Silence Alerts
- Set the silence duration (e.g., 2 hours)
- Optionally add a reason (e.g., "Planned DMS maintenance")
- Click Silence
Silencing a Specific Rule¶
- Go to MONITOR > Alert Rules
- Find the rule you want to silence
- Click the three-dot menu and select Silence
- Set the duration and reason
- Click Silence
During a silence period:
- Alerts are still evaluated and recorded in the alert history
- No notifications are sent
- The silence expires automatically after the configured duration
- You can manually unsilence at any time
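The silence behavior described above can be sketched as "always record, suppress notifications while the silence is active" (the `Silence` class and `process_firing` function are hypothetical illustrations, not DIBOP internals):

```python
from datetime import datetime, timedelta, timezone

class Silence:
    """Hypothetical silence window with an automatic expiry."""
    def __init__(self, duration: timedelta, reason: str = ""):
        self.expires_at = datetime.now(timezone.utc) + duration
        self.reason = reason

    def active(self) -> bool:
        return datetime.now(timezone.utc) < self.expires_at

def process_firing(alert_name: str, silence: "Silence | None", history: list) -> bool:
    """Record the firing in the alert history; return True only if a
    notification should be sent (i.e. no silence is active)."""
    history.append(alert_name)
    return not (silence is not None and silence.active())

history: list = []
silence = Silence(timedelta(hours=2), "Planned DMS maintenance")
notified = process_firing("SLA Breach - Inventory Sync", silence, history)
print(len(history), notified)  # 1 False -> recorded but not notified
```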
Do Not Forget
If maintenance finishes early, unsilence your alerts manually. Otherwise, real issues that occur during the remaining silence window will not generate notifications.
Alert History¶
View the history of all alert firings:
- Go to MONITOR > Alert Rules
- Click on a rule to see its history
- Each firing shows:
    - When the alert triggered
    - The metric value at the time
    - Which notifications were sent
    - When the alert resolved (if auto-resolving)
Auto-Resolving Alerts¶
Some alert types auto-resolve when the condition is no longer met:
- SLA Breach: Resolves when the success rate returns above the threshold
- System Down: Resolves when the system passes a health check
- Latency Threshold: Resolves when latency drops below the threshold
Auto-resolved alerts send a resolution notification to the same channels that received the original alert.
Best Practices¶
- Start with broad rules: Create a few high-level rules (e.g., "SLA below 99%" for all orchestrations) before drilling into specific rules
- Avoid alert fatigue: Do not create too many rules with low thresholds -- this leads to ignored alerts
- Use escalation: Set a mild threshold for the first notification, and a stricter threshold for a more urgent notification
- Document your rules: Use the description field to explain why the rule exists and what the responder should do
- Review alert history monthly: Are alerts firing frequently? Adjust thresholds or fix the underlying issues
Next Steps¶
- Alert Rules -- detailed reference for rule configuration
- Observability Dashboard -- see alert indicators on the dashboard
- Execution Log -- investigate the executions that triggered alerts