Setting Up Alerts

Alerts notify you when something goes wrong with your integrations -- before users notice and before issues cascade. DIBOP supports automated alerts based on configurable rules.


What Alerts Do

An alert is an automated notification that fires when a metric crosses a threshold you define. Alerts help you:

  • Detect orchestration failures within minutes instead of hours
  • Catch SLA breaches before they accumulate
  • Be notified when a connected system goes offline
  • Respond to rate limiting or credential issues quickly

Alert Types

DIBOP supports several types of alerts:

  • Execution Failure: An orchestration execution fails (e.g., "VIN Decode Sync failed")
  • Consecutive Failures: An orchestration fails N times in a row (e.g., "Inventory Sync has failed 5 consecutive times")
  • SLA Breach: Success rate drops below a threshold over a time window (e.g., "SLA dropped below 99% over the last 7 days")
  • System Down: A connected system fails health checks (e.g., "Mercedes-Benz OneAPI is offline")
  • Latency Threshold: Average response time exceeds a limit (e.g., "DMS average latency exceeded 5000ms")
  • Error Rate Spike: Error rate exceeds a percentage over a time window (e.g., "Error rate for CRM exceeded 10% in the last hour")

Notification Channels

When an alert fires, DIBOP can notify you through:

Email

Send an email to one or more recipients. The email includes:

  • Alert rule name and description
  • The metric value that triggered the alert
  • A link to the relevant dashboard or execution trace
  • Suggested actions

Webhook

Send an HTTP POST to a URL of your choice. The webhook payload includes structured JSON data about the alert, suitable for integration with:

  • Incident management tools (PagerDuty, OpsGenie)
  • Chat platforms (via custom integrations)
  • Internal monitoring systems

Example webhook payload:

{
  "alert_rule": "SLA Breach - Inventory Sync",
  "severity": "critical",
  "metric": "success_rate",
  "threshold": 99.0,
  "actual_value": 97.5,
  "window": "7d",
  "triggered_at": "2026-04-08T14:32:00Z",
  "orchestration_id": "abc-123",
  "dashboard_url": "https://app.dibop.ca/observability"
}
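A payload in this shape can be consumed with a few lines of code. The sketch below, in Python, parses the body and routes by severity; the field names follow the example payload above, while the routing rule and return strings are illustrative, not part of DIBOP.

```python
import json

# Minimal sketch of a webhook consumer for the alert payload shown above.
# Field names follow the example payload; the routing logic is illustrative.

def handle_alert(raw_body: str) -> str:
    """Parse an alert webhook body and decide where to route it."""
    alert = json.loads(raw_body)
    severity = alert.get("severity", "warning")
    # Route critical alerts to an on-call pager, everything else to chat.
    if severity == "critical":
        return f"page: {alert['alert_rule']} ({alert['metric']}={alert['actual_value']})"
    return f"chat: {alert['alert_rule']}"

body = json.dumps({
    "alert_rule": "SLA Breach - Inventory Sync",
    "severity": "critical",
    "metric": "success_rate",
    "threshold": 99.0,
    "actual_value": 97.5,
})
print(handle_alert(body))  # routes to the pager because severity is critical
```

In a real receiver this function would sit behind an HTTP endpoint; the parsing and routing logic stays the same.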

Slack

Send a message to a Slack channel. Configure this in the alert rule by providing a Slack webhook URL.
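If you are wiring up the Slack channel yourself, a message can be posted with the standard library alone. Slack incoming webhooks accept a JSON body with a "text" field; the alert fields and message wording below are illustrative.

```python
import json
import urllib.request

# Sketch of posting an alert to a Slack incoming webhook.
# The message format and alert fields are illustrative.

def slack_message(alert: dict) -> dict:
    """Build the JSON body for a Slack incoming webhook."""
    return {
        "text": (
            f":rotating_light: *{alert['alert_rule']}* fired: "
            f"{alert['metric']} is {alert['actual_value']} "
            f"(threshold {alert['threshold']})"
        )
    }

def post_to_slack(webhook_url: str, alert: dict) -> None:
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(slack_message(alert)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # network call, not executed in this sketch
```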


Creating Your First Alert

Step 1: Navigate to Alert Rules

Go to MONITOR > Alert Rules in the sidebar and click Create Rule.

Step 2: Define the Rule

  • Name: A descriptive name (e.g., "SLA Breach - Vehicle Sync")
  • Description: What this alert monitors and why
  • Type: The alert type (from the list above)
  • Scope: Which orchestrations or systems this rule applies to (all, or specific ones)

Step 3: Set the Condition

Configure the threshold that triggers the alert:

Example: SLA Breach

  • Metric: Success Rate
  • Operator: Less than
  • Threshold: 99%
  • Window: 7 days
  • Evaluate: Every 1 hour
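Conceptually, this condition computes the success rate over a sliding window and compares it to the threshold. A minimal sketch, assuming execution history is available as (timestamp, succeeded) pairs, which is an assumption about the data model, not DIBOP's actual storage:

```python
from datetime import datetime, timedelta, timezone

# Sketch of an "SLA Breach" evaluation: success rate over a sliding
# window compared against a threshold. The (timestamp, succeeded)
# history format is an assumption for illustration.

def sla_breached(executions, now, window=timedelta(days=7), threshold=99.0):
    recent = [ok for ts, ok in executions if now - ts <= window]
    if not recent:
        return False  # no executions in the window: nothing to evaluate
    success_rate = 100.0 * sum(recent) / len(recent)
    return success_rate < threshold

# Hourly runs over ~8 days, with roughly 1 in 40 failing.
now = datetime(2026, 4, 8, tzinfo=timezone.utc)
runs = [(now - timedelta(hours=i), i % 40 != 0) for i in range(200)]
print(sla_breached(runs, now))  # success rate ~97% < 99%, so True
```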

Example: Consecutive Failures

  • Metric: Consecutive Failures
  • Operator: Greater than or equal to
  • Threshold: 3
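The consecutive-failures condition only counts the trailing run of failures, so a single intermittent success resets it. A sketch, assuming history is a newest-last list of booleans (an illustrative format):

```python
# Sketch of the "Consecutive Failures" condition: count the trailing run
# of failures and compare to the threshold (>= 3 in the example above).
# The history format (newest last, True = success) is an assumption.

def consecutive_failures(history):
    count = 0
    for ok in reversed(history):
        if ok:
            break  # a success anywhere resets the streak
        count += 1
    return count

history = [True, True, False, True, False, False, False]
print(consecutive_failures(history) >= 3)  # 3 trailing failures: True
```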

Step 4: Configure Notifications

Choose one or more notification channels and configure them:

  • Email: Enter recipient email addresses
  • Webhook: Enter the webhook URL
  • Slack: Enter the Slack webhook URL

Step 5: Save and Activate

Click Save. The rule is immediately active and begins evaluating at the configured interval.


Silencing Alerts

During planned maintenance, you may want to silence alerts temporarily to avoid false notifications.

Silencing All Alerts

  1. On the Observability Dashboard, click Silence Alerts
  2. Set the silence duration (e.g., 2 hours)
  3. Optionally add a reason (e.g., "Planned DMS maintenance")
  4. Click Silence

Silencing a Specific Rule

  1. Go to MONITOR > Alert Rules
  2. Find the rule you want to silence
  3. Click the three-dot menu and select Silence
  4. Set the duration and reason
  5. Click Silence

During a silence period:

  • Alerts are still evaluated and recorded in the alert history
  • No notifications are sent
  • The silence expires automatically after the configured duration
  • You can manually unsilence at any time
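The behaviour above amounts to: evaluate and record always, suppress notifications while the silence is active, and let the silence lapse on its own. A sketch of that logic, with illustrative class and field names that are not DIBOP's internals:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the silence behaviour described above: firings are still
# recorded, but notifications are suppressed until the silence expires.
# Class and field names are illustrative.

class AlertRule:
    def __init__(self, name):
        self.name = name
        self.history = []          # every firing is recorded here
        self.silenced_until = None

    def silence(self, now, duration, reason=""):
        self.silenced_until = now + duration

    def fire(self, now, value):
        self.history.append((now, value))  # recorded even while silenced
        if self.silenced_until and now < self.silenced_until:
            return False  # suppressed: no notification sent
        return True       # notification sent

rule = AlertRule("SLA Breach - Inventory Sync")
now = datetime(2026, 4, 8, tzinfo=timezone.utc)
rule.silence(now, timedelta(hours=2), reason="Planned DMS maintenance")
print(rule.fire(now + timedelta(minutes=30), 97.5))  # False: silenced
print(rule.fire(now + timedelta(hours=3), 97.5))     # True: silence expired
```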

Do Not Forget

If maintenance finishes early, unsilence your alerts manually. Otherwise, real issues that occur during the remaining silence period will not trigger notifications.


Alert History

View the history of all alert firings:

  1. Go to MONITOR > Alert Rules
  2. Click on a rule to see its history
  3. Each firing shows:
    • When the alert triggered
    • The metric value at the time
    • Which notifications were sent
    • When the alert resolved (if auto-resolving)

Auto-Resolving Alerts

Some alert types auto-resolve when the condition is no longer met:

  • SLA Breach: Resolves when the success rate returns above the threshold
  • System Down: Resolves when the system passes a health check
  • Latency Threshold: Resolves when latency drops below the threshold

Auto-resolved alerts send a resolution notification to the same channels that received the original alert.
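Auto-resolution is a small state machine: an alert moves to "firing" when its condition is breached and back to "ok" when the condition clears, notifying on each transition. A sketch with illustrative state names and a notify callback:

```python
# Sketch of the auto-resolve behaviour: an alert stays "firing" until its
# condition clears, then a resolution notification goes out. State names
# and the notify callback are illustrative.

def evaluate(state, breached, notify):
    if state == "ok" and breached:
        notify("fired")
        return "firing"
    if state == "firing" and not breached:
        notify("resolved")  # sent to the same channels as the original alert
        return "ok"
    return state  # no transition, no notification

events = []
state = "ok"
for breached in [False, True, True, False]:
    state = evaluate(state, breached, events.append)
print(events)  # one "fired", one "resolved"; the repeat breach is not re-notified
```

Note that the repeated breach in the middle does not re-notify; only transitions do.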


Best Practices

  1. Start with broad rules: Create a few high-level rules (e.g., "SLA below 99%" for all orchestrations) before drilling into specific rules
  2. Avoid alert fatigue: Do not create too many rules with low thresholds -- this leads to ignored alerts
  3. Use escalation: Set a mild threshold for the first notification, and a stricter threshold for a more urgent notification
  4. Document your rules: Use the description field to explain why the rule exists and what the responder should do
  5. Review alert history monthly: Are alerts firing frequently? Adjust thresholds or fix the underlying issues
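The escalation idea in practice (best practice 3) is two thresholds mapping to two severities. A sketch with illustrative threshold values and severity labels:

```python
# Sketch of the escalation pattern: a mild threshold produces a warning,
# a stricter one produces a critical alert. Threshold values and severity
# labels are illustrative, not DIBOP defaults.

def severity(success_rate, warn_below=99.0, critical_below=95.0):
    if success_rate < critical_below:
        return "critical"  # page someone
    if success_rate < warn_below:
        return "warning"   # post to chat
    return None            # no alert

print(severity(99.5), severity(97.5), severity(92.0))
```

In DIBOP this maps to two rules on the same metric with different thresholds and different notification channels.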

Next Steps