Observability Dashboard

The Observability Dashboard is your central monitoring hub. It provides real-time visibility into the health of your orchestrations, API calls, and connected systems, with SLA tracking and trend analysis.


Accessing the Dashboard

Navigate to MONITOR > Observability in the sidebar. The dashboard opens with two tabs:

  • Orchestrations -- execution metrics, SLA compliance, and trends
  • APIs -- per-system API call metrics and SLA table

Orchestrations Tab

The Orchestrations tab shows your current SLA status, execution charts, and a per-orchestration metrics breakdown.

SLA Banner

At the top of the tab, a prominent banner shows your current SLA status:

Indicator   Meaning
Green       SLA is within target (e.g., 99%+ success rate)
Amber       SLA is approaching the threshold
Red         SLA has been breached

The banner shows:

  • Current SLA percentage (e.g., "99.2% over 7 days")
  • Number of successful vs failed executions
  • Time since last failure
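
The banner logic can be pictured with a minimal Python sketch. It is not DIBOP's implementation: the function name, the `amber_margin_pct` parameter, and the rule that "approaching the threshold" means "within a small margin above the target" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlaSummary:
    success_pct: float
    status: str  # "green", "amber", or "red"


def summarize_sla(succeeded: int, failed: int,
                  target_pct: float = 99.0,
                  amber_margin_pct: float = 0.1) -> SlaSummary:
    """Classify an SLA window from success and failure counts.

    ASSUMPTION: amber means the success rate is at or above the target but
    less than amber_margin_pct points above it. The real rule is not documented.
    """
    total = succeeded + failed
    if total == 0:
        # No executions in the window, so nothing has failed.
        return SlaSummary(100.0, "green")
    pct = 100.0 * succeeded / total
    if pct < target_pct:
        status = "red"       # SLA breached
    elif pct < target_pct + amber_margin_pct:
        status = "amber"     # still within target, but close to it
    else:
        status = "green"
    return SlaSummary(round(pct, 2), status)
```

With these assumed defaults, 992 successes and 8 failures give 99.2% and a green status, while 980 successes and 20 failures give 98.0% and a red status.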

Execution Charts

Below the SLA banner, the tab displays the following charts:

Executions Over Time

A time-series chart showing the number of orchestration executions per hour or per day. Bars are colour-coded:

  • Green: Successful executions
  • Red: Failed executions
  • Amber: Partially successful (some steps failed but the orchestration completed)
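
If you want to reproduce this kind of chart from your own execution records, the bucketing can be sketched as below. The record shape (a timestamp plus a status string) and the status names `"success"`, `"failed"`, and `"partial"` are hypothetical, not DIBOP's data model.

```python
from collections import Counter
from datetime import datetime


def executions_per_hour(executions):
    """Count executions per (hour, status) pair.

    `executions` is an iterable of (timestamp, status) tuples.
    ASSUMPTION: status is one of "success", "failed", "partial".
    """
    counts = Counter()
    for timestamp, status in executions:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(hour, status)] += 1
    return counts
```

Each `(hour, status)` key corresponds to one coloured bar segment. For a per-day chart, truncate to the date instead of the hour.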

Success Rate Trend

A line chart showing the success rate percentage over time. This helps you spot gradual degradation before it becomes an SLA breach.
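
One way to quantify gradual degradation is the slope of a straight line fitted to the success-rate series. The sketch below uses an ordinary least-squares fit. The function name and the choice of technique are illustrative and do not describe how the dashboard draws its trend.

```python
def success_rate_slope(rates_pct):
    """Least-squares slope of a success-rate series.

    The result is in percentage points per interval (hour or day,
    depending on how the series was sampled). A negative value means
    the success rate is declining.
    """
    n = len(rates_pct)
    if n < 2:
        return 0.0  # a single point has no trend
    mean_x = (n - 1) / 2
    mean_y = sum(rates_pct) / n
    numerator = sum((i - mean_x) * (y - mean_y)
                    for i, y in enumerate(rates_pct))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A series that loses about 0.1 percentage points per day yields a slope near -0.1, which may be worth investigating even while the SLA banner is still green.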

Duration Distribution

A histogram showing how long executions take. Use this to identify:

  • The typical execution time (the peak of the distribution)
  • Outliers (unusually slow executions)
  • Whether execution times are increasing over time
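
The same reading can be done programmatically on a list of execution durations. This is a rough sketch only: the 500 ms bucket width and the "three times the median" outlier rule are arbitrary assumptions chosen for illustration.

```python
from collections import Counter
from statistics import median


def duration_histogram(durations_ms, bucket_ms=500):
    """Count executions per fixed-width bucket, keyed by bucket start (ms)."""
    counts = Counter((int(d) // bucket_ms) * bucket_ms for d in durations_ms)
    return dict(sorted(counts.items()))


def slow_outliers(durations_ms, factor=3.0):
    """Return durations greater than `factor` times the median.

    ASSUMPTION: 'factor x median' is an illustrative outlier rule,
    not the one the dashboard uses.
    """
    if not durations_ms:
        return []
    cutoff = factor * median(durations_ms)
    return [d for d in durations_ms if d > cutoff]
```

The bucket with the highest count approximates the typical execution time, and `slow_outliers` lists the executions worth drilling into.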

Time Range

Use the time range selector to view data for:

  • Last 24 hours
  • Last 7 days
  • Last 30 days
  • Custom date range

Per-Orchestration Breakdown

Scroll down to see a table listing each orchestration with its metrics:

Column          Description
Orchestration   Name of the orchestration
Executions      Total number of executions in the selected period
Success Rate    Percentage of successful executions
Avg Duration    Average execution time in milliseconds
Last Run        Timestamp of the most recent execution
Status          Current state (Active, Paused, Draft)

Click any row to drill into that orchestration's execution history.


APIs Tab

The APIs tab shows metrics for every external API call DIBOP makes on your behalf.

Per-System SLA Table

A table listing each connected system with its API call metrics:

Column          Description
System          The connected system name and icon
Calls (7d)      Total API calls in the last 7 days
Success Rate    Percentage of calls that returned a successful response (2xx)
Avg Duration    Average response time in milliseconds
SLA Status      Whether the system meets its SLA target (by default, 99% success rate and 2000 ms average latency; both are configurable per system)
Last Call       Timestamp of the most recent API call
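
To make the column definitions concrete, here is a minimal sketch that derives the same figures from raw call records. The record shape `(system, status_code, duration_ms)` and the function name are hypothetical. Only the 2xx definition of success comes from the table above.

```python
from collections import defaultdict


def per_system_metrics(calls):
    """Aggregate API call records into per-system metrics.

    `calls` is an iterable of (system, status_code, duration_ms) tuples.
    """
    grouped = defaultdict(list)
    for system, status_code, duration_ms in calls:
        grouped[system].append((status_code, duration_ms))

    metrics = {}
    for system, rows in grouped.items():
        # A call counts as successful when it returns a 2xx status.
        ok = sum(1 for status_code, _ in rows if 200 <= status_code < 300)
        metrics[system] = {
            "calls": len(rows),
            "success_rate": 100.0 * ok / len(rows),
            "avg_duration_ms": sum(d for _, d in rows) / len(rows),
        }
    return metrics
```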

SLA Thresholds

SLA thresholds are configurable per system. The defaults are a 99% success rate and a 2000 ms average latency. Contact your platform administrator to adjust these thresholds.
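
The default thresholds translate into a simple check, sketched below. The class and function names are illustrative, and treating a value exactly on the threshold as compliant is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlaThreshold:
    min_success_pct: float = 99.0        # documented default
    max_avg_latency_ms: float = 2000.0   # documented default


def meets_sla(success_pct: float, avg_latency_ms: float,
              threshold: Optional[SlaThreshold] = None) -> bool:
    """Return True when both the success rate and latency targets are met.

    ASSUMPTION: values exactly on a threshold count as meeting it.
    """
    threshold = threshold or SlaThreshold()
    return (success_pct >= threshold.min_success_pct
            and avg_latency_ms <= threshold.max_avg_latency_ms)
```

A system with a 99.4% success rate still fails this check if its average latency is 2300 ms, because both conditions must hold.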

Recent Calls

Below the SLA table, a live feed of recent API calls shows:

  • System name and icon
  • HTTP method and endpoint
  • Response status code
  • Duration
  • Timestamp

Click any call to see the full request and response details in the API Call Log.


Alert Indicators

If any alert rules have been triggered, alert indicators appear on the dashboard:

  • Active alerts are shown as red badges on the affected metrics
  • Click an alert badge to see the alert details and history
  • Silenced alerts are shown with a muted indicator

See Setting Up Alerts for how to configure alert rules.

Silencing Alerts During Maintenance

If you are performing planned maintenance and expect temporary failures:

  1. Click the Silence Alerts button on the dashboard
  2. Set a silence duration (e.g., 2 hours)

During the silence period, alerts are still recorded but no notifications are sent. Notifications resume automatically when the silence period ends.
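
The silencing behaviour described above (alerts are always recorded, notifications are suppressed only inside the silence window) can be sketched as follows. `SilenceWindow` and `AlertLog` are hypothetical names used for illustration and are not part of DIBOP.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class SilenceWindow:
    start: datetime
    duration: timedelta

    def is_active(self, now: datetime) -> bool:
        # Active from start (inclusive) until start + duration (exclusive).
        return self.start <= now < self.start + self.duration


@dataclass
class AlertLog:
    recorded: List[str] = field(default_factory=list)
    notified: List[str] = field(default_factory=list)

    def handle(self, alert: str, now: datetime,
               silence: Optional[SilenceWindow] = None) -> None:
        self.recorded.append(alert)  # alerts are always recorded
        if silence is None or not silence.is_active(now):
            self.notified.append(alert)  # notify only outside the window
```

An alert raised 30 minutes into a 2-hour window ends up in `recorded` only, while one raised 3 hours after the start appears in both lists.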

Do Not Forget to Unsilence

If you set a long silence period and maintenance finishes early, remember to manually unsilence alerts. Otherwise, real issues that occur after maintenance may go unnoticed.


Exporting Data

You can export dashboard data for reporting or further analysis:

  1. Click the Export button in the toolbar
  2. Choose the format:
    • CSV -- tabular data for spreadsheets
    • JSON -- structured data for programmatic use
  3. Select the time range and metrics to include
  4. Click Download
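
Once downloaded, a CSV export can be processed with a few lines of Python. The column names (`orchestration`, `executions`, `success_rate`) and the sample rows below are made up for illustration. Check the header row of your actual export before relying on them.

```python
import csv
import io

# ASSUMPTION: hypothetical header and sample rows, not a real DIBOP export.
SAMPLE_EXPORT = """orchestration,executions,success_rate
order-sync,1200,99.6
invoice-push,340,97.1
"""


def below_target(csv_text, target_pct=99.0):
    """Return the names of orchestrations whose success rate is under target."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["orchestration"] for row in reader
            if float(row["success_rate"]) < target_pct]
```

To read a downloaded file instead of the inline sample, pass the contents of the file (for example, `open(path, newline="").read()`) to `below_target`.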

Refreshing

The dashboard loads data when you navigate to it and does not auto-refresh. Click the Refresh button in the toolbar to load the latest data. For real-time monitoring, open the Execution Log, which updates continuously.


Best Practices

  1. Check the dashboard daily -- a quick glance at the SLA banner tells you if action is needed
  2. Set up alert rules -- do not rely on manual dashboard checks; let DIBOP notify you of issues
  3. Investigate trends -- a slowly declining success rate is often more important than a single failure
  4. Use the time range selector -- compare 7-day and 30-day views to spot long-term trends
  5. Drill into failures -- click through to the Execution Trace to understand why failures occur

Next Steps