API Composition: API Integration, Aggregation & Orchestration

We use the term API composition to encompass three main aspects of working with multiple API endpoints: integration, orchestration, and aggregation.

A key driver for the supergraph is the need for API composition; adopting GraphQL (monolithic or federated) is a special case of addressing this need.

What is API composition & why is it hard?

While domain owners (producers) own their domain APIs, in a multi-consumer, multi-producer scenario API consumers often also need specialized APIs that are optimized for their use cases.


This creates a tension between “domain-driven” API design and ownership and “consumer-driven” API design.

| Design & Ownership | Benefits | Challenges |
| --- | --- | --- |
| Domain-driven | Consistent, standardized API for multiple consumers | Not optimized for consumer needs |
| Consumer-driven | Optimized for a single consumer and can incorporate consumer-specific business logic | Hard to standardize and needs to be purpose-built for every consumer |

A supergraph allows both these API design and ownership models to co-exist. A supergraph platform brings in domain APIs via its subgraph connectors and provides a self-serve API orchestration and aggregation layer across the various domains. This allows domain owners to design and evolve their domain API, and other supergraph stakeholders to aggregate APIs on-demand and add custom orchestration workflows.

A supergraph platform solves the following three API composition problems:

- Integration
- Aggregation
- Orchestration

Solving API Integration

Given that domains and domain APIs exist, API integration remains challenging for API consumers for the following reasons:

- The API output format or protocol is not ideal or optimal for a consumer.
- The API does not have a typed schema and/or does not provide an SDK experience for the consumer.
- The API’s documentation is missing or out of date.
- The API does not have standardized conventions or follow a consistent design.
- API versioning creates tension with high-velocity development.

A supergraph provides a systematic way to address these challenges because it provides a common semantic layer and registry for the underlying domains and their APIs. A well-setup supergraph platform provides out-of-the-box solutions for the challenges mentioned above.
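
As a rough illustration of what a common typed layer buys the consumer, the sketch below maps two differently shaped domain payloads onto one shared type. The `Customer` type, the two adapter functions, and the payload field names are all hypothetical; a real supergraph platform would typically derive such mappings from connector metadata rather than hand-written code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Customer:
    """Hypothetical shared type that consumers see, whatever the source API."""
    id: str
    name: str
    email: str


def from_billing_api(payload: dict[str, Any]) -> Customer:
    # Hypothetical billing domain: camelCase keys and a numeric id.
    return Customer(
        id=str(payload["customerId"]),
        name=payload["fullName"],
        email=payload["emailAddress"].lower(),
    )


def from_crm_api(payload: dict[str, Any]) -> Customer:
    # Hypothetical CRM domain: snake_case keys and split name fields.
    return Customer(
        id=str(payload["id"]),
        name=f'{payload["first_name"]} {payload["last_name"]}',
        email=payload["email"].lower(),
    )


billing = from_billing_api(
    {"customerId": 42, "fullName": "Ada Lovelace", "emailAddress": "Ada@Example.com"}
)
crm = from_crm_api(
    {"id": "42", "first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"}
)
print(billing == crm)  # True: both payloads normalize to the same typed value
```

The consumer codes against `Customer` only, so differences in output format and naming conventions between the underlying APIs stay behind the adapters.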

Solving API Aggregation (or batching)

API consumers often need to fetch data from multiple API endpoints. API aggregation or batching, performed closer to the domains, can prevent excessive data transfer and reduce network round trips.

API aggregation is challenging because:

- Explosive creation of new aggregation API endpoints: Different consumers have different needs, and their needs evolve rapidly in a high-velocity environment.
- Fuzzy ownership: Domain APIs are owned and designed by domain owners, but it is often unclear who designs, builds, and operates the API endpoints that aggregate data across these domains.

A supergraph provides a self-service model for API aggregation and batching by modeling the underlying domains as a “graph” and then allowing API consumers to fetch whatever slice of data they need on-demand without requiring the development and maintenance of new aggregate endpoints.

A well-setup supergraph platform provides a high level of composability that makes different types of API aggregation possible on demand. For example:

- Joins: Fetch data from A and related data from B.
- Nested filters: Fetch data from A, filtered by a property value of its related data B.
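
The two aggregation shapes listed above can be sketched over in-memory data. This is a minimal illustration, not how any particular supergraph engine executes queries: `ORDERS` and `CUSTOMERS` are hypothetical stand-ins for two domain APIs, and the helper names are made up for this example.

```python
# Hypothetical in-memory stand-ins for two domain APIs (A = orders, B = customers).
ORDERS = [
    {"id": 1, "customer_id": "c1", "total": 120},
    {"id": 2, "customer_id": "c2", "total": 35},
    {"id": 3, "customer_id": "c1", "total": 60},
]
CUSTOMERS = {
    "c1": {"id": "c1", "name": "Ada", "country": "UK"},
    "c2": {"id": "c2", "name": "Grace", "country": "US"},
}


def orders_with_customer(orders, customers):
    """Join: return each order (A) together with its related customer (B)."""
    return [{**o, "customer": customers[o["customer_id"]]} for o in orders]


def orders_where_customer(orders, customers, **conditions):
    """Nested filter: return orders (A) whose related customer (B) matches."""
    def matches(order):
        customer = customers[order["customer_id"]]
        return all(customer.get(k) == v for k, v in conditions.items())

    return [o for o in orders if matches(o)]


joined = orders_with_customer(ORDERS, CUSTOMERS)
uk_orders = orders_where_customer(ORDERS, CUSTOMERS, country="UK")
print([o["id"] for o in uk_orders])  # [1, 3]
```

In a supergraph, the consumer would express both shapes in a single request, and the engine would perform the equivalent work close to the domains instead of the consumer stitching responses together.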

Solving API Orchestration

API consumers often need to create reliable workflows that require sequencing multiple API calls interspersed with business logic. Even if the underlying domain APIs exist, API orchestration is challenging because it is the part of the API that is consumer-defined and potentially spans multiple domains.

This makes it challenging to create a unified technology approach and identify owners to build and operate these workflows.

Related: sagas, distributed transactions, state machines.

A supergraph platform should provide a clear operating model and technology best practices to manage API orchestration, beyond simple aggregation/batching use-cases.
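
To make the orchestration problem concrete, here is a minimal saga-style sketch: steps run in sequence, and if one fails, the steps that already completed are compensated in reverse order. Everything here is hypothetical (the `Step` type, `run_saga`, and the three simulated domain calls); a production workflow would also need persistence, retries, and idempotency, which this sketch omits.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]      # performs one (simulated) domain API call
    compensate: Callable[[dict], None]  # undoes it if a later step fails


def run_saga(steps: list[Step], ctx: dict) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: list[Step] = []
    for step in steps:
        try:
            step.action(ctx)
            done.append(step)
        except Exception as exc:
            ctx["log"].append(f"{step.name} failed: {exc}")
            for prev in reversed(done):
                prev.compensate(ctx)
            return False
    return True


def log(msg: str) -> Callable[[dict], None]:
    # Helper that stands in for a domain call by recording a message.
    return lambda ctx: ctx["log"].append(msg)


def charge_card(ctx: dict) -> None:
    # Consumer-defined business logic interleaved with the simulated call.
    if ctx["amount"] > ctx["limit"]:
        raise ValueError("amount exceeds limit")
    ctx["log"].append("card charged")


steps = [
    Step("reserve_stock", log("stock reserved"), log("stock released")),
    Step("charge_card", charge_card, log("charge refunded")),
    Step("create_shipment", log("shipment created"), log("shipment cancelled")),
]

ctx = {"amount": 500, "limit": 100, "log": []}
ok = run_saga(steps, ctx)
print(ok, ctx["log"])
```

With the amount above the limit, the second step fails, so only the stock reservation is rolled back and the shipment step never runs. The open question the section raises is who owns and operates code like `steps`, since it spans several domains.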

Supergraph checklist for API composition

| Criterion | What to check |
| --- | --- |
| 1. Integration | Making it easy for API consumers to integrate APIs into their services |
| 1.1 Multiple API formats | Can the supergraph platform automatically provide output formats beyond GraphQL (e.g., REST/OpenAPI, gRPC)? This is required to prevent lock-in to the GraphQL protocol as needs change over time. |
| 1.2 Documentation | Does the supergraph platform help domain owners maintain documentation? If the underlying domain (database, code, or APIs) is already documented, is that documentation automatically picked up by the supergraph platform? |
| 1.3 Standardization | Does the supergraph platform provide or enforce a standardized domain API design (e.g., pagination, filtering, sorting)? |
| 2. Aggregation | Making it easy for API consumers to aggregate/batch multiple API calls into one |
| 2.1 Relationships | Does the supergraph provide a way of creating relationships between any 2 entities or endpoints without requiring changes from the domain owners? |
| 2.2 Composability | How many "join" features does the supergraph provide, given a relationship between 2 entities in the supergraph? Examples: joins and nested filters. |
| 3. Orchestration | Making it easy for supergraph stakeholders to author custom API orchestration |
| 3.1 Custom orchestration business logic | Does the supergraph provide a way to author orchestration flows within or across underlying domains? |

Federated GraphQL Anti-patterns

Building a federated GraphQL API with a supergraph is a strategic decision, and the wrong choice can create thousands of person-hours of technical debt and legacy that become hard to unwind. When the key desired benefit of GraphQL is to solve an API composition problem on top of existing domains, the key expected ROI is improved API integration, aggregation, and orchestration. In this case, the following anti-patterns should be avoided.

❌ Pure schema-driven implementation:
- The situation: The entire GraphQL schema is hand-written, purportedly to meet the goal of a “consumer first” API design.
- The problem: Domains and domain APIs are already designed with the needs of one or more consumers in mind. Recreating these parts of the API in the GraphQL schema is a complete duplication of effort and results in the creation of a parallel standard. A consumer-driven approach is only important for the parts of the API that require API aggregation and custom API orchestration workflows.
- Symptoms:
  - A parallel API standards group starts to exist.
  - Existing domain owners are not willing to own their subgraph GraphQL servers and find it tedious.
  - A new squad or team is spending their entire development time building and maintaining a GraphQL wrapper.
- The solution:
  - Domain-driven subgraphs: Auto-generate subgraphs that accurately reflect the domain. Subgraph improvements are driven by domain design improvements.
  - Consumer-driven additions to the supergraph: Create a tech stack and an operating model for making high-velocity additions to the supergraph based on the needs of a specific consumer.
❌ Forcing subgraph owners to own inter-subgraph relationships:
- The situation: Relationships between subgraphs can only be specified by subgraph owners.
- The problem: While this approach allows subgraph owners to extend and connect their subgraphs to other subgraphs, it requires domain owners (subgraph owners) to understand other subgraphs.
- Symptoms:
  - Disconnected supergraphs. Inefficient query execution plans if relationships are hard to implement in some directions.
  - API consumers who are frustrated at not being able to participate in creating inter-subgraph relationships.
- The solution: In addition to allowing subgraph owners to create inter-subgraph relationships, allow the creation of relationships in the supergraph engine outside the subgraphs as well.
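
A minimal sketch of the solution above, assuming a toy engine: subgraphs are registered by their owners and know nothing about each other, while a relationship between them is registered separately at the supergraph level. The `Supergraph` class, `Relationship` record, and the `users`/`invoices` data are hypothetical and only illustrate where the relationship lives, not how a real engine plans queries.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Relationship:
    """Hypothetical relationship record held by the supergraph engine."""
    name: str
    source: str      # entity the relationship starts from
    target: str      # entity it points to
    source_key: str
    target_key: str


class Supergraph:
    def __init__(self) -> None:
        self._fetchers: dict[str, Callable[[], list[Row]]] = {}
        self._relationships: dict[tuple[str, str], Relationship] = {}

    def add_subgraph(self, entity: str, fetch: Callable[[], list[Row]]) -> None:
        # Registered by the domain owner; no knowledge of other subgraphs needed.
        self._fetchers[entity] = fetch

    def add_relationship(self, rel: Relationship) -> None:
        # Registered by any stakeholder, without changing either subgraph.
        self._relationships[(rel.source, rel.name)] = rel

    def fetch(self, entity: str, include: str) -> list[Row]:
        rel = self._relationships[(entity, include)]
        targets = self._fetchers[rel.target]()
        return [
            {**row, include: [t for t in targets if t[rel.target_key] == row[rel.source_key]]}
            for row in self._fetchers[entity]()
        ]


graph = Supergraph()
graph.add_subgraph("users", lambda: [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}])
graph.add_subgraph("invoices", lambda: [{"id": 10, "user_id": 1}, {"id": 11, "user_id": 1}])
graph.add_relationship(Relationship("invoices", "users", "invoices", "id", "user_id"))

result = graph.fetch("users", include="invoices")
print([(u["name"], [i["id"] for i in u["invoices"]]) for u in result])
```

Neither the `users` nor the `invoices` fetcher had to change for the relationship to exist, which is the property the solution asks for.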
❌ Requiring an additional API registry to create custom resolvers (e.g., federated mutations):
- The situation: Mutations can only originate from a domain subgraph.
- The problem: Supergraph stakeholders are forced to create a new subgraph that directly connects to underlying domains. This requires stakeholders to refer to another API registry to understand how to connect to the underlying subgraph domains (e.g., database, REST).
- Symptoms:
  - API consumers are frustrated at not being able to easily create custom resolvers that represent some unique business logic and workflow for their specific need.
  - Lack of an operating model around federating mutations, or API sagas, or distributed transactions.
- The solution: The supergraph platform should provide a way for supergraph stakeholders to author custom API workflows interspersed with business logic without having to refer to APIs outside of the supergraph.