Data models

Why model data for APIs?

Having a clear data model for your APIs matters for several reasons: it makes your APIs more reliable, keeps them consistent, and makes them easier for everyone to use.

Establishing a shared contract

A well-defined data model acts like a clear “agreement” between the people who create an API and the people who use it. This shared understanding makes sure that requests and responses are handled and checked in the same way across different teams and systems, leading to smoother interactions.

Enabling validation and consistency

By clearly stating what fields should be there, what type of data they should hold, which ones are required, and what values are allowed (often done using definitions like OpenAPI or JSON Schema), APIs can automatically check incoming data. This helps prevent errors and ensures the data stays correct and reliable.
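
For instance, here is a minimal sketch of how such rules might be declared on an OpenAPI request body; the /orders path and field names are illustrative, not part of any real API:

paths:
  /orders:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [productId, quantity]   # these fields must be present
              properties:
                productId:
                  type: integer
                quantity:
                  type: integer
                  minimum: 1                    # rule for allowed values
                currency:
                  type: string
                  enum: [EUR, USD, GBP]         # allowed options only

A validator wired to this definition can then reject a request that omits productId or sends quantity: 0, typically with a 400 Bad Request.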

Supporting tools & automation

When API data models are well-defined, they allow for powerful automation:

Automatic documentation: Tools can automatically create easy-to-read, interactive documentation from your OpenAPI definition (for example with Swagger UI or Redoc).
Automatic code generation: You can automatically generate client-side code (SDKs) and server-side code.
Type information: It provides crucial type information for programming languages that use strong typing (like TypeScript, Java, or Go), making development more robust.

Promoting scalability and maintainability

Building a data model that is separate from your database structure allows your APIs to change and grow without directly affecting how your data is stored internally. This prevents tight connections between different parts of your system, makes your API more flexible, and reduces the chance of breaking existing code when internal database structures are updated.

Improving API design & developer experience

Thinking about APIs as “models” rather than just a list of endpoints encourages careful design of resources, actions, and how data flows. This approach leads to APIs that are more intuitive and effective for developers to use.

Classic data modeling: Conceptual → Logical → Physical

Data modeling traditionally follows a three-layer approach, first described by ANSI in 1975. This layered structure helps keep things organised and flexible when designing information systems.

Conceptual model

– Purpose: This is the highest-level view. It captures the main ideas of a business area, its key “things” (entities) and how they relate to each other, without worrying about any technical details of how it will be built.
– Audience & tools: It’s for business people and system planners. It’s often shown using Entity-Relationship Diagrams (ERDs), which use boxes for entities and lines for relationships.
– Key characteristics:
      – Defines the main entities (like “Customer,” “Order”) and how they connect (e.g., one customer can have many orders).
      – Uses general descriptions for data types (like “string” or “number”), avoiding specific technical types.
      – Focuses on the scope: deciding what data is important for the model and what isn’t.

Logical model

– Purpose: This provides a detailed picture of the data structure, but it’s still independent of any specific technology. It maps the entities from the conceptual model to “tables” and the attributes to “columns” (like in a database).
– Audience & tools: It’s for data architects and developers. It commonly uses UML diagrams or more detailed ERD forms.
– Key characteristics:
      – Lists all the specific pieces of information (attributes) for each entity, defining their types (like integer, date, or text).
      – Clearly shows relationships with how many items are involved (cardinality, e.g., one-to-many), including primary keys (unique identifiers for each record) and foreign keys (links to other tables).
      – Emphasises “normalisation”, which means organising the data to reduce repetition and improve consistency.

Physical model

– Purpose: This is the concrete, real-world plan. It takes the logical model and turns it into something that can be built using a specific database system and computer infrastructure.
– Audience & tools: It’s for database engineers and operations teams who focus on setting up the database (using DDL commands) and making sure it performs well.
– Key characteristics:
      – Includes exact names for tables and columns, with precise data types (e.g., VARCHAR(100) for text up to 100 characters, INT for whole numbers).
      – Designs elements like indexes (to speed up searches), partitions (to divide large tables), and other settings for performance and scalability.
      – Defines rules to keep data valid, such as primary keys, foreign keys, unique rules, and even automated actions (triggers) or pre-calculated views of data.

Why layered modeling matters

Decoupling & flexibility: Each level of the model is separate. This means you can change how your data is stored (physical layer) without necessarily having to redesign your logical or conceptual models.
Improved collaboration: Non-technical people can easily understand and contribute early on with the conceptual models. Technical teams then get the precision they need with the logical and physical layers.
Governance & manageability: This layered approach helps with strict documentation, version control, and overall management. It ensures you can trace every detail, from initial business rules all the way to how the data is actually implemented.

Modeling in REST, OpenAPI & JSON Schema

For APIs that follow the REST style, data models are typically defined and enforced using JSON Schema, especially when you’re also using OpenAPI. This approach makes sure that the data your API sends and receives is both well-documented and automatically checked by machines.

REST payloads & JSON Schema

Core concepts

JSON Schema is a standard that describes the exact structure of JSON data. It defines things like:

– What kind of data each piece of information is (e.g., text, numbers, true/false).
– Which pieces of information are required.
– Specific formats (e.g., date, email).
– Allowed options (enumerations).
– How complex, nested pieces of data (objects and arrays) should be structured.
– Rules for values (e.g., a minimum number, a specific text pattern).
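
As a rough illustration, a hypothetical Customer schema (all field names here are invented for the example, shown as it would appear under components/schemas) could combine several of these rules:

Customer:
  type: object
  required: [id, email]            # required pieces of information
  properties:
    id:
      type: string
      pattern: '^[0-9a-f]{8}$'     # rule for values: a specific text pattern
    email:
      type: string
      format: email                # specific format
    signupDate:
      type: string
      format: date
    status:
      type: string
      enum: [active, suspended]    # allowed options (enumeration)
    address:                       # nested object
      type: object
      properties:
        city:
          type: string
        postcode:
          type: string
    tags:                          # array of simple values
      type: array
      items:
        type: string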

OpenAPI integration

OpenAPI, a standard for describing APIs, uses JSON Schema to define the API’s data models. OpenAPI v3.0 uses an older version of JSON Schema, while v3.1 fully supports the latest JSON Schema 2020-12 standard.

Benefits

Validation: Tools can automatically check if incoming data matches your defined schema. If the data is wrong, the API can immediately send back a 400 Bad Request error.
Documentation: Your schema can automatically generate interactive API documentation (using tools like Swagger UI or Redoc), making it easier for developers to understand your API.
Tooling & code generation: A defined schema allows for automatically generating code for client applications (SDKs), type definitions (for languages like TypeScript, Java, or Go), and server-side code (stubs).

Anatomy of an OpenAPI schema object

A schema object in OpenAPI defines the fields (called “properties”) of your data, their types, example values, and rules.

Example (for a pet in a pet store API):

components:
  schemas:
    Pet:
      type: object
      required: [id, type, price]
      properties:
        id:
          type: integer
        type:
          type: string
          enum: [dog, cat, fish]
        price:
          type: number
          minimum: 25
          maximum: 500

Schema composition: OpenAPI schemas can also handle complex relationships and options using keywords like allOf, oneOf, anyOf, and not. These help reduce repetition and model more intricate data rules.
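
For example, building on the Pet schema above, allOf can extend a shared base schema while oneOf expresses an either/or choice; the Dog, Cat, and NewPetRequest names below are purely illustrative:

components:
  schemas:
    Dog:
      allOf:                                   # Dog = everything in Pet, plus extra fields
        - $ref: '#/components/schemas/Pet'
        - type: object
          properties:
            breed:
              type: string
    NewPetRequest:
      oneOf:                                   # the payload must match exactly one option
        - $ref: '#/components/schemas/Dog'
        - $ref: '#/components/schemas/Cat'     # Cat would be defined similarly to Dog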

Version matters: OpenAPI 3.0 vs. 3.1

There are important differences in how OpenAPI versions use JSON Schema:

JSON Schema version

– OpenAPI 3.0: Partial support, based on an older draft (Draft-05)
– OpenAPI 3.1: Full support for JSON Schema 2020-12

Keywords support

– OpenAPI 3.0: Limited set of keywords
– OpenAPI 3.1: Full set, including unevaluatedProperties and dynamic $ref

External schema references

– OpenAPI 3.0: Partial support via $ref
– OpenAPI 3.1: Full support for standard JSON Schema referencing

Discriminator support

– OpenAPI 3.0: Available
– OpenAPI 3.1: Available, with enhancements
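
To make the keyword gap concrete, the fragment below is valid in OpenAPI 3.1 because unevaluatedProperties comes from JSON Schema 2019-09/2020-12, but it has no direct equivalent in 3.0 (the ExtendedPet name is illustrative):

ExtendedPet:
  allOf:
    - $ref: '#/components/schemas/Pet'
    - type: object
      properties:
        nickname:
          type: string
  unevaluatedProperties: false   # reject any property not declared in Pet or in this schema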

Validation & tooling

Several tools help you validate JSON data against your schemas:

Middleware libraries: Some libraries (like Committee for Ruby/Rails or openapi-schema-validation for Java/Spring) can check both incoming requests and outgoing responses directly against your OpenAPI specification. If data doesn’t match, they return clear 400 Bad Request errors.
Schema validation engines: Tools like openapi4j ensure that your API design adheres correctly to the Schema Object rules. Generic JSON Schema validators can also automatically figure out data types based on your schemas.

Reusability & maintainability

Reusable schemas: Define common data structures once under components/schemas in your OpenAPI spec. Then, you can refer to them ($ref) throughout your API, avoiding repetition.
Modular design: You can split your OpenAPI specification into multiple files and use references to link them together. This helps organise large APIs and keeps different concerns separate.
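
A small sketch of both ideas; the /pets and /owners paths and the paths/owners.yaml file name are illustrative:

paths:
  /pets:
    get:
      responses:
        '200':
          description: A list of pets
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Pet'   # reuse the shared Pet schema
  /owners:
    $ref: './paths/owners.yaml'                      # path item kept in a separate file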

Sportmonks and API data modeling

Structured JSON schema: Sportmonks’ Football API 3.0 delivers data in well-defined JSON formats, with endpoints like fixtures, livescores, statistics, and expected goals (xG) clearly documented in nested structures (“includes”) and demo JSON response files.
API‑centric contract design: Our documentation emphasises defining available fields, data types, filter options, rate limits, and error codes, mirroring best practices in contractual API modeling via OpenAPI/JSON Schema.
Versioned model evolution: We rebuilt our data model from API 2.0 to 3.0 to improve efficiency, support more sports, simplify authentication, and refine nested includes.
Modular data design: We provide reusable components (“components guide”), demo files, and Postman collections, enabling clients to discover, test, and integrate payload schemas effectively.

Build with confidence using Sportmonks’ structured data models

Our Football API 3.0 is built on clean, well-defined JSON schemas designed to support scalable, reliable integrations. From fixtures to player stats, our modular structure and demo responses help you model data accurately from day one.

Get started today and bring structure to your sports data workflows.

FAQs about data models

What is a data model in API?
A data model in an API is an abstract framework that defines the structure of the data exchanged, typically in JSON or XML. It specifies which fields are present, their data types (e.g. text, number, date), how they relate to each other, and the constraints for validating values. This model forms the contract that ensures consistent communication between systems.
What is the API model?
The API model refers to the structured representation of data objects and their relationships as exposed via the API. It acts as the blueprint for request and response formats, guiding both how data is exchanged over the wire and how tools like documentation, validation, and code generation operate.
What data structure is used in an API?
Most modern APIs use structured formats like JSON or XML. Schemas (e.g. JSON Schema, OpenAPI Schema) are used to define these structures: specifying object types, nested arrays, enumerations, required fields, and validation rules.
What are the four different types of data models?
The most commonly recognised types are:
  1. Conceptual model – A high-level, business-oriented view that defines entities and their relationships abstractly.
  2. Logical model – A more detailed, tech-agnostic design mapping entities to named attributes (e.g. tables and columns).
  3. Physical model – A concrete model tailored to a specific database, with precise column types, indexes, partitions, and constraints.
  4. Specialised models (sometimes counted separately) – such as hierarchical, network, object-oriented, and entity–attribute–value (EAV) models used in specific storage contexts.

Written by David Jaja

David Jaja is a technical content manager at Sportmonks, where he makes complex football data easier to understand for developers and businesses. With a background in frontend development and technical writing, he helps bridge the gap between technology and sports data. Through clear, insightful content, he ensures Sportmonks' APIs are accessible and easy to use, empowering developers to build standout football applications.