What is the Canonical Data Model?
The Canonical Data Model (CDM) is DIBOP's universal data schema -- a single, shared definition of how business data is structured. It is the foundation that makes DIBOP's integration capabilities possible.
The Problem Without a CDM
Imagine you have five systems: an OEM portal, a DMS, a CRM, a finance platform, and a telematics provider. Each system uses its own names and formats for the same concepts:
| Concept | OEM Portal | DMS | CRM | Finance | Telematics |
|---|---|---|---|---|---|
| Vehicle ID | `fahrzeug_id` | `stock_vin` | `asset_number` | `collateral_id` | `device_vehicle_id` |
| Customer Name | `kunde_name` | `cust_full_name` | `contact_name` | `borrower_name` | `owner` |
| Mileage | `kilometerstand` | `odometer_reading` | `mileage` | N/A | `total_distance_km` |
Without a common model, connecting these systems requires building a separate translation for every pair and direction: OEM-to-DMS, OEM-to-CRM, DMS-to-CRM, DMS-to-Finance, and so on. For N systems, that is N x (N-1) translations, a number that grows quadratically: the five systems above already need 20.
The CDM Solution
The CDM introduces a single, authoritative definition for each business concept:
| CDM Field | Definition |
|---|---|
| `vin` | The 17-character Vehicle Identification Number |
| `customer_name` | The customer's full name |
| `mileage_km` | Current odometer reading in kilometres |
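Each connector maps its native fields to and from the CDM -- once. Any pair of systems can then exchange data by going through the CDM: the source connector translates its native fields into CDM fields, and the target connector translates those CDM fields into its own.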
For N systems, you need only 2N translations (one inbound, one outbound per system) instead of N x (N-1). Adding a new system requires only one new set of mappings, and it instantly works with all existing systems.
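The arithmetic above can be checked with a few lines of Python. This is only a worked illustration of the two formulas from the text; the function names are made up for the example.

```python
def point_to_point(n: int) -> int:
    """Directed translations needed when every system maps to every other one."""
    return n * (n - 1)


def via_cdm(n: int) -> int:
    """Translations needed when each system maps only to and from the CDM."""
    return 2 * n


for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: {point_to_point(n):>3} point-to-point, {via_cdm(n):>2} via CDM")
```

The two approaches break even at three systems (6 translations each). From four systems onwards the CDM needs fewer, and the gap widens with every system added: 20 versus 10 for the five systems in the example above.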
CDM Structure
The CDM is organised into domains, each representing a business entity:
Canonical Data Model
├── Vehicle
│   ├── vin (string, required)
│   ├── make (string)
│   ├── model (string)
│   ├── year (integer)
│   ├── mileage_km (number)
│   └── ... (30+ fields)
├── Customer
│   ├── customer_id (string, required)
│   ├── customer_name (string, PII)
│   ├── email (string, PII)
│   ├── phone (string, PII)
│   └── ... (20+ fields)
├── Order
│   ├── order_id (string, required)
│   ├── order_date (datetime)
│   ├── total_amount (number)
│   └── ...
└── ... (22 domains total)
Each domain has:
- Required fields: Must be present for the record to be valid
- Optional fields: Available for mapping but not required
- PII classification: Each field is marked as None, PII, or Sensitive PII
- Data types: Strict typing (string, number, integer, boolean, datetime)
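To make the four properties above concrete, here is a minimal sketch of how a domain definition and a validity check might look. It is not DIBOP's actual schema format: `FieldSpec`, `CDM_TYPES`, `VEHICLE` and `validate` are hypothetical names, and the mapping of CDM types to Python types is an assumption. Only the field names and types come from the Vehicle excerpt above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any

# CDM type names from the list above, mapped to Python types (an assumption).
CDM_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "datetime": datetime,
}


@dataclass(frozen=True)
class FieldSpec:
    type_name: str
    required: bool = False
    pii: str = "None"  # "None", "PII", or "Sensitive PII"


# Hypothetical excerpt of the Vehicle domain, limited to the fields shown above.
VEHICLE = {
    "vin": FieldSpec("string", required=True),
    "make": FieldSpec("string"),
    "model": FieldSpec("string"),
    "year": FieldSpec("integer"),
    "mileage_km": FieldSpec("number"),
}


def validate(record: dict[str, Any], domain: dict[str, FieldSpec]) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    errors = []
    for name, spec in domain.items():
        if name not in record:
            if spec.required:
                errors.append(f"missing required field: {name}")
            continue
        value = record[name]
        expected = CDM_TYPES[spec.type_name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_bool_mismatch = isinstance(value, bool) and spec.type_name != "boolean"
        if is_bool_mismatch or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec.type_name}")
    return errors
```

With this sketch, `validate({"make": "BMW", "year": "2021"}, VEHICLE)` reports both the missing `vin` and the string-typed `year`, while a record containing only a `vin` passes because every other field is optional.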
Why "Canonical"?
The word "canonical" means "according to a standard rule or authority." In data modelling, the canonical model is the one true representation. When two systems disagree about how to name a field, the CDM provides the definitive answer.
This means:
- There is exactly one definition of what a "vehicle" looks like
- All connectors translate to and from this one definition
- All orchestrations work with this one set of field names
- No ambiguity about what a field means or what format it uses
How the CDM Is Used
In Connectors
Each connector includes mappings that translate its native fields to CDM fields and back:
Mercedes-Benz OneAPI        CDM
────────────────────        ───
fin                    →    vin
marke                  →    make
modell                 →    model
erstzulassung          →    first_registration_date
kilometerstand         →    mileage_km
These mappings are defined when the connector is built (using the Connector SDK) and can be enhanced with AI-assisted mapping.
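As a rough illustration of what a bidirectional mapping amounts to, the field pairs shown above can be expressed as a dictionary and its inverse. This is a sketch, not the Connector SDK's API: `NATIVE_TO_CDM`, `CDM_TO_NATIVE` and `rename` are hypothetical names, the sample record is invented, and a real mapping would presumably also convert values and formats, not only rename keys.

```python
# Native-to-CDM field names, copied from the mapping shown above.
NATIVE_TO_CDM = {
    "fin": "vin",
    "marke": "make",
    "modell": "model",
    "erstzulassung": "first_registration_date",
    "kilometerstand": "mileage_km",
}
# The outbound direction is the inverse of the inbound one.
CDM_TO_NATIVE = {cdm: native for native, cdm in NATIVE_TO_CDM.items()}


def rename(record: dict, mapping: dict) -> dict:
    """Rename the keys listed in `mapping`; keys without a mapping are dropped."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Invented sample record using the connector's native field names.
native = {"fin": "WDB1234567890ABCD", "marke": "Mercedes-Benz", "kilometerstand": 48210}
cdm = rename(native, NATIVE_TO_CDM)
round_trip = rename(cdm, CDM_TO_NATIVE)
```

Here `cdm` uses the keys `vin`, `make` and `mileage_km`, and `round_trip` equals the original `native` record, which is the property that lets any two connectors exchange data through the CDM.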
In Orchestrations
When you build an orchestration, you work with CDM fields:
{
  "step": "enrich_vehicle",
  "input": {
    "vin": "${fetch_vehicle.vin}",
    "make": "${fetch_vehicle.make}",
    "model": "${fetch_vehicle.model}"
  }
}
The CDM field names are consistent regardless of which source or target system is involved.
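One way to read the `${fetch_vehicle.vin}` references in the step above is as placeholders for fields in an earlier step's output. The sketch below illustrates that reading only. How DIBOP actually resolves these expressions is not described here, so `resolve`, the `outputs` structure, and the sample values are all assumptions.

```python
import re

# Matches ${step_name.field_name}; the exact syntax rules are an assumption.
PLACEHOLDER = re.compile(r"\$\{(\w+)\.(\w+)\}")


def resolve(value: str, outputs: dict[str, dict]) -> str:
    """Replace each ${step.field} with the matching value from earlier step outputs."""
    return PLACEHOLDER.sub(lambda m: str(outputs[m.group(1)][m.group(2)]), value)


step = {
    "step": "enrich_vehicle",
    "input": {
        "vin": "${fetch_vehicle.vin}",
        "make": "${fetch_vehicle.make}",
        "model": "${fetch_vehicle.model}",
    },
}

# Invented output of a hypothetical earlier step, already in CDM field names.
outputs = {
    "fetch_vehicle": {"vin": "WDB1234567890ABCD", "make": "Mercedes-Benz", "model": "C 200"}
}

resolved = {name: resolve(expr, outputs) for name, expr in step["input"].items()}
```

Because `fetch_vehicle` already returns CDM fields, the same step definition would work whichever connector supplied the vehicle.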
In Monitoring
The CDM enables cross-system analytics. Because all data is normalised to the same fields, you can:
- Count vehicles across all systems using a single query
- Track customer records across DMS and CRM using a shared `customer_id`
- Identify data discrepancies between systems (e.g., different mileage values for the same VIN)
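The discrepancy check in the last bullet becomes a simple grouping once every system reports `vin` and `mileage_km` under the same names. The following is a minimal sketch with invented records. The `source` tag and the function name are assumptions, not part of the CDM as documented here.

```python
from collections import defaultdict

# Invented CDM-normalised records, tagged with the system they came from.
records = [
    {"source": "dms", "vin": "WDB1234567890ABCD", "mileage_km": 48210},
    {"source": "telematics", "vin": "WDB1234567890ABCD", "mileage_km": 48975},
    {"source": "dms", "vin": "WBA9876543210WXYZ", "mileage_km": 12000},
    {"source": "crm", "vin": "WBA9876543210WXYZ", "mileage_km": 12000},
]


def mileage_discrepancies(rows):
    """Map each VIN to its per-source mileage when the sources disagree."""
    by_vin = defaultdict(dict)
    for row in rows:
        by_vin[row["vin"]][row["source"]] = row["mileage_km"]
    return {
        vin: readings
        for vin, readings in by_vin.items()
        if len(set(readings.values())) > 1
    }


conflicts = mileage_discrepancies(records)
vehicle_count = len({row["vin"] for row in records})
```

In this sample, `conflicts` contains only the first VIN, where the DMS and telematics readings differ, and `vehicle_count` is 2 even though four records were received.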
PII Classification
Every field in the CDM is classified for personally identifiable information:
| Classification | Meaning | Examples | How DIBOP Handles It |
|---|---|---|---|
| None | Not personally identifiable | Vehicle make, model, year | Shown in full in logs and exports |
| PII | Personally identifiable | Name, email, phone, address | Partially masked in logs; encrypted at rest |
| Sensitive PII | Highly sensitive | Government ID, SSN, financial accounts | Fully redacted in logs; encrypted at rest |
PII classification is enforced automatically:
- Execution logs and API call logs mask PII fields
- Exports include PII warnings
- Retention policies can be configured separately for PII data
See the Canonical Explorer to view PII classifications for every field.
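The log handling in the table above can be sketched as a lookup on each field's classification. The exact masking format DIBOP uses is not specified here, so the "keep the first character" rule, the `[REDACTED]` marker, the `government_id` field name, and the choice to redact unclassified fields are all assumptions made for illustration.

```python
# Classification per field; `government_id` is a hypothetical field name.
PII_LEVELS = {
    "make": "None",
    "customer_name": "PII",
    "email": "PII",
    "government_id": "Sensitive PII",
}


def mask_for_log(field: str, value: str) -> str:
    """Return the form of `value` that would be written to a log."""
    # Unknown fields are treated as Sensitive PII so nothing leaks by default.
    level = PII_LEVELS.get(field, "Sensitive PII")
    if level == "None":
        return value
    if level == "PII":
        # Partial mask: keep the first character and hide the rest.
        return value[:1] + "*" * (len(value) - 1)
    return "[REDACTED]"
```

For example, `mask_for_log("make", "Audi")` returns the value unchanged, `mask_for_log("email", "a@b.de")` returns `a*****`, and any Sensitive PII or unclassified field comes back as `[REDACTED]`.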
Extensibility
The CDM is not static. It grows as DIBOP adds support for new business domains:
- DIBOP ships with 22 standard domains
- Platform Admins can request new domains for industry-specific needs
- Custom fields can be added to existing domains without breaking existing mappings
Standard First
Always check the existing CDM domains and fields before requesting custom additions. The more data that maps to standard fields, the more interoperable your integrations become.
Next Steps
- Available Domains -- browse all 22 domains with field lists
- Field Mapping -- learn how to map connector fields to the CDM
- Canonical Explorer -- explore the CDM interactively in the DIBOP UI
- CDM in Orchestrations -- how orchestrations use CDM fields