Observability Dashboard¶
The Observability Dashboard is your central monitoring hub. It shows the health of your orchestrations, API calls, and connected systems as of the last load or refresh, with SLA tracking and trend analysis.
Accessing the Dashboard¶
Navigate to MONITOR > Observability in the sidebar. The dashboard opens with two tabs:
- Orchestrations -- execution metrics, SLA compliance, and trends
- APIs -- per-system API call metrics and SLA table
Orchestrations Tab¶
The Orchestrations tab shows SLA status, execution charts, and a per-orchestration breakdown of metrics.
SLA Banner¶
At the top of the tab, a prominent banner shows your current SLA status:
| Indicator | Meaning |
|---|---|
| Green | SLA is within target (e.g., 99%+ success rate) |
| Amber | SLA is approaching the threshold |
| Red | SLA has been breached |
The banner shows:
- Current SLA percentage (e.g., "99.2% over 7 days")
- Number of successful vs failed executions
- Time since last failure
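As an illustration of how the three banner figures relate to raw execution data, here is a minimal Python sketch. It is not DIBOP's implementation: the `Execution` record, the `banner_summary` function, and the treatment of each execution as simply succeeded or failed are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Execution(NamedTuple):
    """Hypothetical execution record; field names are assumptions."""
    finished_at: datetime
    succeeded: bool


def banner_summary(executions, now, window_days=7):
    """Compute the banner figures for the trailing window ending at `now`."""
    cutoff = now - timedelta(days=window_days)
    recent = [e for e in executions if cutoff <= e.finished_at <= now]
    successful = sum(1 for e in recent if e.succeeded)
    failed = len(recent) - successful
    # "Time since last failure" looks at all history, not only the window.
    failures = [e.finished_at for e in executions
                if not e.succeeded and e.finished_at <= now]
    return {
        # None when there were no executions, to avoid dividing by zero.
        "sla_percent": round(100.0 * successful / len(recent), 1) if recent else None,
        "successful": successful,
        "failed": failed,
        "since_last_failure": (now - max(failures)) if failures else None,
    }


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
runs = [Execution(now - timedelta(hours=h), True) for h in range(4, 128)]
runs.append(Execution(now - timedelta(hours=3), False))
print(banner_summary(runs, now))
```

With 124 successes and 1 failure in the window, the sketch reports an SLA of 99.2%, matching the style of the "99.2% over 7 days" example above.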
Execution Charts¶
Below the SLA banner, three charts summarise execution volume, success rate, and duration.
Executions Over Time¶
A time-series chart showing the number of orchestration executions per hour or per day. Bars are colour-coded:
- Green: Successful executions
- Red: Failed executions
- Amber: Partially successful (some steps failed but the orchestration completed)
Success Rate Trend¶
A line chart showing the success rate percentage over time. This helps you spot gradual degradation before it becomes an SLA breach.
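To make "gradual degradation" concrete, the following Python sketch computes a daily success rate and a least-squares slope over it. It is only one possible way to quantify a trend, not the method the dashboard uses. The input shape (pairs of day and success flag) and both function names are assumptions, and the slope treats the days as evenly spaced.

```python
from collections import defaultdict
from datetime import date


def daily_success_rates(executions):
    """executions: iterable of (day, succeeded) pairs. Returns [(day, percent)]."""
    totals = defaultdict(lambda: [0, 0])  # day -> [successes, total]
    for day, succeeded in executions:
        totals[day][0] += int(succeeded)
        totals[day][1] += 1
    return [(day, 100.0 * ok / n) for day, (ok, n) in sorted(totals.items())]


def trend_slope(rates):
    """Least-squares slope in percentage points per day.

    Uses each day's position in the series as x, so it assumes the days
    are consecutive. Returns 0.0 when there are fewer than two points.
    """
    n = len(rates)
    if n < 2:
        return 0.0
    xs = list(range(n))
    ys = [rate for _, rate in rates]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Three days with 100, 99, and 98 successes out of 100 executions each.
sample = []
for offset, failures in enumerate([0, 1, 2]):
    day = date(2024, 1, 1 + offset)
    sample += [(day, True)] * (100 - failures) + [(day, False)] * failures
print(trend_slope(daily_success_rates(sample)))
```

A negative slope means the success rate is falling; in the sample data it drops by one percentage point per day, the kind of decline that is easy to miss when looking at single failures.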
Duration Distribution¶
A histogram showing how long executions take. Use this to identify:
- The typical execution time (the peak of the distribution)
- Outliers (unusually slow executions)
- Whether execution times are increasing over time
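If you want to reproduce this kind of analysis on your own data, the sketch below buckets durations into a histogram and flags slow outliers. The bucket width and the use of Tukey's rule (values above Q3 + 1.5 × IQR) are choices made for this example; the dashboard's own bucketing and any outlier logic it applies are not documented here.

```python
import statistics
from collections import Counter


def duration_histogram(durations_ms, bucket_ms=500):
    """Count executions per fixed-width bucket, keyed by the bucket's lower bound."""
    counts = Counter((int(d) // bucket_ms) * bucket_ms for d in durations_ms)
    return dict(sorted(counts.items()))


def slow_outliers(durations_ms, k=1.5):
    """Return durations above Q3 + k * IQR (Tukey's rule). Needs >= 2 values."""
    q1, _, q3 = statistics.quantiles(durations_ms, n=4)
    fence = q3 + k * (q3 - q1)
    return [d for d in durations_ms if d > fence]


durations = [400, 450, 500, 520, 480, 510, 470, 5000]
print(duration_histogram(durations))
print(slow_outliers(durations))
```

The bucket with the highest count corresponds to the peak of the distribution, and the flagged values are the candidates worth opening in the execution history.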
Time Range¶
Use the time range selector to view data for:
- Last 24 hours
- Last 7 days
- Last 30 days
- Custom date range
Per-Orchestration Breakdown¶
Scroll down to see a table listing each orchestration with its metrics:
| Column | Description |
|---|---|
| Orchestration | Name of the orchestration |
| Executions | Total number of executions in the selected period |
| Success Rate | Percentage of successful executions |
| Avg Duration | Average execution time in milliseconds |
| Last Run | Timestamp of the most recent execution |
| Status | Current state (Active, Paused, Draft) |
Click any row to drill into that orchestration's execution history.
APIs Tab¶
The APIs tab shows metrics for every external API call DIBOP makes on your behalf.
Per-System SLA Table¶
A table listing each connected system with its API call metrics:
| Column | Description |
|---|---|
| System | The connected system name and icon |
| Calls (7d) | Total API calls in the last 7 days |
| Success Rate | Percentage of calls that returned a successful response (2xx) |
| Avg Duration | Average response time in milliseconds |
| SLA Status | Whether the system meets its SLA targets for success rate and average latency (defaults: 99% and 2000 ms, configurable per system) |
| Last Call | Timestamp of the most recent API call |
SLA Thresholds
SLA thresholds are configurable per system. The default is 99% success rate and 2000ms average latency. Contact your platform administrator to adjust these thresholds.
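The check behind the SLA Status column can be sketched as follows. The defaults (99% success rate, 2000 ms average latency) and the 2xx success criterion come from this page; the `SlaThreshold` class, the `meets_sla` function, the tuple-based input, and the handling of systems with no calls are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaThreshold:
    """Hypothetical per-system threshold holder using the documented defaults."""
    min_success_rate: float = 99.0       # percent
    max_avg_latency_ms: float = 2000.0   # milliseconds


def meets_sla(calls, threshold=SlaThreshold()):
    """calls: list of (status_code, duration_ms) tuples for one system."""
    if not calls:
        return True  # assumption: no traffic means nothing was breached
    ok = sum(1 for status, _ in calls if 200 <= status < 300)
    success_rate = 100.0 * ok / len(calls)
    avg_latency = sum(duration for _, duration in calls) / len(calls)
    return (success_rate >= threshold.min_success_rate
            and avg_latency <= threshold.max_avg_latency_ms)


calls = [(200, 120)] * 99 + [(500, 120)]
print(meets_sla(calls))                                         # default thresholds
print(meets_sla(calls, SlaThreshold(min_success_rate=99.9)))    # stricter system
```

Note that both conditions must hold: a system with a perfect success rate still fails the check if its average latency exceeds the threshold.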
Recent Calls¶
Below the SLA table, a live feed of recent API calls shows:
- System name and icon
- HTTP method and endpoint
- Response status code
- Duration
- Timestamp
Click any call to see the full request and response details in the API Call Log.
Alert Indicators¶
If any alert rules have been triggered, alert indicators appear on the dashboard:
- Active alerts are shown as red badges on the affected metrics
- Click an alert badge to see the alert details and history
- Silenced alerts are shown with a muted indicator
See Setting Up Alerts for how to configure alert rules.
Silencing Alerts During Maintenance¶
If you are performing planned maintenance and expect temporary failures:
- Click the Silence Alerts button on the dashboard
- Set a silence duration (e.g., 2 hours)
- During the silence period, alerts are still recorded but no notifications are sent
- Alerts resume automatically when the silence period ends
Do Not Forget to Unsilence
If you set a long silence period and maintenance finishes early, remember to manually unsilence alerts. Otherwise, real issues that occur after maintenance may go unnoticed.
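The silence behaviour described above (alerts still recorded, notifications suppressed, automatic resumption, optional manual unsilence) can be modelled in a few lines. This is a conceptual sketch only; the `AlertSilencer` class and its methods are hypothetical and do not correspond to a DIBOP API.

```python
from datetime import datetime, timedelta, timezone


class AlertSilencer:
    """Toy model of a silence window; all names are hypothetical."""

    def __init__(self):
        self.silenced_until = None
        self.recorded = []   # every alert, silenced or not
        self.notified = []   # alerts that produced a notification

    def silence(self, now, duration):
        self.silenced_until = now + duration

    def unsilence(self):
        # Manual unsilence, e.g. when maintenance finishes early.
        self.silenced_until = None

    def is_silenced(self, now):
        return self.silenced_until is not None and now < self.silenced_until

    def handle(self, alert, now):
        self.recorded.append(alert)        # always recorded
        if not self.is_silenced(now):
            self.notified.append(alert)    # only notified outside the window


start = datetime(2024, 1, 8, 9, 0, tzinfo=timezone.utc)
silencer = AlertSilencer()
silencer.silence(start, timedelta(hours=2))
silencer.handle("expected failure during maintenance", start + timedelta(minutes=30))
silencer.unsilence()  # maintenance finished early
silencer.handle("real failure after maintenance", start + timedelta(hours=1))
print(silencer.recorded)
print(silencer.notified)
```

Without the `unsilence()` call, the second alert would also fall inside the two-hour window and be recorded without a notification, which is the situation the warning above describes.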
Exporting Data¶
You can export dashboard data for reporting or further analysis:
- Click the Export button in the toolbar
- Choose the format:
    - CSV -- tabular data for spreadsheets
    - JSON -- structured data for programmatic use
- Select the time range and metrics to include
- Click Download
Refreshing¶
The dashboard loads data when you navigate to it and does not auto-refresh. Click the Refresh button in the toolbar to load the latest data. For real-time monitoring, open the Execution Log, which updates in real time.
Best Practices¶
- Check the dashboard daily -- a quick glance at the SLA banner tells you if action is needed
- Set up alert rules -- do not rely on manual dashboard checks; let DIBOP notify you of issues
- Investigate trends -- a slowly declining success rate is often more important than a single failure
- Use the time range selector -- compare 7-day and 30-day views to spot long-term trends
- Drill into failures -- click through to the Execution Trace to understand why failures occur
Next Steps¶
- Execution Log -- detailed list of every execution
- Execution Trace -- step-by-step breakdown of a single execution
- API Call Log -- every external API call
- Setting Up Alerts -- get notified before issues become outages