Setting Up Alerts¶
Alerts notify you when something goes wrong with your integrations -- before users notice and before issues cascade. DIBOP supports automated alerts based on configurable rules.
What Alerts Do¶
An alert is an automated notification that fires when a metric crosses a threshold you define. Alerts help you:
- Detect orchestration failures within minutes instead of hours
- Catch SLA breaches before they accumulate
- Learn when a connected system goes offline
- Respond to rate limiting or credential issues quickly
Alert Types¶
DIBOP supports several types of alerts:
| Type | Description | Example |
|---|---|---|
| Execution Failure | An orchestration execution fails | "VIN Decode Sync failed" |
| Consecutive Failures | An orchestration fails N times in a row | "Inventory Sync has failed 5 consecutive times" |
| SLA Breach | Success rate drops below a threshold over a time window | "SLA dropped below 99% over the last 7 days" |
| System Down | A connected system fails health checks | "Mercedes-Benz OneAPI is offline" |
| Latency Threshold | Average response time exceeds a limit | "DMS average latency exceeded 5000ms" |
| Error Rate Spike | Error rate exceeds a percentage over a time window | "Error rate for CRM exceeded 10% in the last hour" |
Notification Channels¶
When an alert fires, DIBOP can notify you through:
Email¶
Send an email to one or more recipients. The email includes:
- Alert rule name and description
- The metric value that triggered the alert
- A link to the relevant dashboard or execution trace
- Suggested actions
Webhook¶
Send an HTTP POST to a URL of your choice. The webhook payload includes structured JSON data about the alert, suitable for integration with:
- Incident management tools (PagerDuty, OpsGenie)
- Chat platforms (via custom integrations)
- Internal monitoring systems
Example webhook payload:
```json
{
  "alert_rule": "SLA Breach - Inventory Sync",
  "severity": "critical",
  "metric": "success_rate",
  "threshold": 99.0,
  "actual_value": 97.5,
  "window": "7d",
  "triggered_at": "2026-04-08T14:32:00Z",
  "orchestration_id": "abc-123",
  "dashboard_url": "https://app.dibop.ca/observability"
}
```
Slack¶
Send a message to a Slack channel. Configure this in the alert rule by providing a Slack webhook URL.
Creating Your First Alert¶
Step 1: Navigate to Alert Rules¶
Go to MONITOR > Alert Rules in the sidebar and click Create Rule.
Step 2: Define the Rule¶
| Field | Description |
|---|---|
| Name | A descriptive name (e.g., "SLA Breach - Vehicle Sync") |
| Description | What this alert monitors and why |
| Type | The alert type (from the list above) |
| Scope | Which orchestrations or systems this rule applies to (all, or specific ones) |
Step 3: Set the Condition¶
Configure the threshold that triggers the alert:
Example: SLA Breach
| Setting | Value |
|---|---|
| Metric | Success Rate |
| Operator | Less than |
| Threshold | 99% |
| Window | 7 days |
| Evaluate Every | 1 hour |
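The SLA Breach condition above boils down to comparing a windowed success rate against the threshold. A minimal sketch of that check, assuming you already have success and total counts for the window (the function name and signature are hypothetical, not DIBOP's API):

```python
def sla_breached(successes: int, total: int, threshold_pct: float = 99.0) -> bool:
    """Hypothetical evaluation: does the success rate over the window
    fall below the threshold?"""
    if total == 0:
        return False  # no executions in the window: nothing to evaluate
    success_rate = 100.0 * successes / total
    return success_rate < threshold_pct

# 975 successes out of 1000 executions = 97.5%, below 99% -> breach
print(sla_breached(975, 1000))  # True
print(sla_breached(995, 1000))  # False
```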
Example: Consecutive Failures
| Setting | Value |
|---|---|
| Metric | Consecutive Failures |
| Operator | Greater than or equal to |
| Threshold | 3 |
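The Consecutive Failures condition counts the failure streak at the end of the execution history. A sketch of that counting logic (the status strings and function name are hypothetical):

```python
def consecutive_failures(statuses: list[str]) -> int:
    """Count the failure streak at the end of an execution history
    (oldest first, newest last). Status strings are hypothetical."""
    streak = 0
    for status in reversed(statuses):
        if status != "failed":
            break
        streak += 1
    return streak

history = ["success", "failed", "failed", "failed"]
print(consecutive_failures(history))  # 3 -> a threshold of 3 would fire
```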
Step 4: Configure Notifications¶
Choose one or more notification channels and configure them:
- Email: Enter recipient email addresses
- Webhook: Enter the webhook URL
- Slack: Enter the Slack webhook URL
Step 5: Save and Activate¶
Click Save. The rule is immediately active and begins evaluating at the configured interval.
Silencing Alerts¶
During planned maintenance, you may want to silence alerts temporarily to avoid false notifications.
Silencing All Alerts¶
- On the Observability Dashboard, click Silence Alerts
- Set the silence duration (e.g., 2 hours)
- Optionally add a reason (e.g., "Planned DMS maintenance")
- Click Silence
Silencing a Specific Rule¶
- Go to MONITOR > Alert Rules
- Find the rule you want to silence
- Click the three-dot menu and select Silence
- Set the duration and reason
- Click Silence
During a silence period:
- Alerts are still evaluated and recorded in the alert history
- No notifications are sent
- The silence expires automatically after the configured duration
- You can manually unsilence at any time
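The silence behavior described above can be sketched as "always record, suppress notifications while the silence is active" (the `Silence` class and `process_firing` function are hypothetical illustrations, not DIBOP internals):

```python
from datetime import datetime, timedelta, timezone

class Silence:
    """Hypothetical silence window with an automatic expiry."""
    def __init__(self, duration: timedelta, reason: str = ""):
        self.expires_at = datetime.now(timezone.utc) + duration
        self.reason = reason

    def active(self) -> bool:
        return datetime.now(timezone.utc) < self.expires_at

def process_firing(alert_name: str, silence: "Silence | None", history: list) -> bool:
    """Record the firing in the alert history; return True only if a
    notification should be sent (i.e. no silence is active)."""
    history.append(alert_name)
    return not (silence is not None and silence.active())

history: list = []
silence = Silence(timedelta(hours=2), "Planned DMS maintenance")
notified = process_firing("SLA Breach - Inventory Sync", silence, history)
print(len(history), notified)  # 1 False -> recorded but not notified
```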
Do Not Forget
If maintenance finishes early, unsilence your alerts manually. Otherwise, real issues that occur during the remaining silence window will not generate notifications.
Alert History¶
View the history of all alert firings:
- Go to MONITOR > Alert Rules
- Click on a rule to see its history
- Each firing shows:
    - When the alert triggered
    - The metric value at the time
    - Which notifications were sent
    - When the alert resolved (if auto-resolving)
Auto-Resolving Alerts¶
Some alert types auto-resolve when the condition is no longer met:
- SLA Breach: Resolves when the success rate returns above the threshold
- System Down: Resolves when the system passes a health check
- Latency Threshold: Resolves when latency drops below the threshold
Auto-resolved alerts send a resolution notification to the same channels that received the original alert.
Best Practices¶
- Start with broad rules: Create a few high-level rules (e.g., "SLA below 99%" for all orchestrations) before drilling into specific rules
- Avoid alert fatigue: Do not create too many rules with low thresholds -- this leads to ignored alerts
- Use escalation: Set a mild threshold for the first notification, and a stricter threshold for a more urgent notification
- Document your rules: Use the description field to explain why the rule exists and what the responder should do
- Review alert history monthly: Are alerts firing frequently? Adjust thresholds or fix the underlying issues
Next Steps¶
- Alert Rules -- detailed reference for rule configuration
- Observability Dashboard -- see alert indicators on the dashboard
- Execution Log -- investigate the executions that triggered alerts