The Supergraph Manifesto #
Supergraph is an architecture framework that offers reference architectures, design guidelines and principles, and an operating model to help multiple teams collaborate on a self-serve platform for federated data access, API integration and composition, or GraphQL APIs. An implementation artifact of the Supergraph architecture is called a supergraph (lowercase s).
When a supergraph is built with a GraphQL federation stack, the engine is often called a gateway or a router and the subgraph connectors are often GraphQL services.
A supergraph is typically used for the following two use cases:
- Self-serve API composition platform: A self-serve operating model for API integration, orchestration & aggregation.
- Federated data access layer: A federated data layer that allows realtime access to data sources with cross-domain composability (joins, filtering etc.). Related: Data mesh, data products.
Strategy and Core concepts #
A supergraph approach aims to build a flywheel of data access and supply to incrementally improve self-service access to data and APIs.
I. CONNECT domains #
Domain owners (or data owners, or API producers) should be able to seamlessly connect their domains to the platform. A major challenge in building a supergraph is resistance to change from the domain owners. They often oppose having to build, operate and maintain another API layer, such as a GraphQL server that wraps their domain yet again. This reluctance is understandable and valid, and it must be systematically addressed by the supergraph platform strategy and the supergraph reference architecture.
This has two main implications for the subgraph connector’s lifecycle and runtime:
- Subgraph connector CI/CD: As domain owners change their domains, the API contract published via the supergraph engine must stay in sync with the least possible overhead for the domain owner. The domain owners' SDLC, change-management or CI/CD process must include updating their API contract (e.g. versioning), preventing breaking changes and keeping documentation up to date.
- Subgraph connector performance: The subgraph connector must not reduce performance compared to accessing the underlying domain directly. API performance is measured by latency, payload size & concurrency.
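As an illustration of the CI/CD point above, a breaking-change check could compare the published API contract with a proposed one before a change is released. The following is a minimal sketch, assuming a deliberately simplified contract format (model name mapped to field names and type names); the `Contract` alias, the `breaking_changes` helper and the sample data are hypothetical and not part of any supergraph engine.

```python
# Hypothetical, simplified contract: model name -> {field name -> type name}.
Contract = dict[str, dict[str, str]]


def breaking_changes(old: Contract, new: Contract) -> list[str]:
    """List changes in `new` that would break consumers relying on `old`."""
    problems: list[str] = []
    for model, old_fields in old.items():
        if model not in new:
            problems.append(f"model removed: {model}")
            continue
        new_fields = new[model]
        for field, old_type in old_fields.items():
            if field not in new_fields:
                problems.append(f"field removed: {model}.{field}")
            elif new_fields[field] != old_type:
                problems.append(
                    f"type changed: {model}.{field} {old_type} -> {new_fields[field]}"
                )
    return problems


# Illustrative contracts: the proposal drops one field and adds another.
published = {"Article": {"id": "ID", "title": "String", "published_at": "Date"}}
proposed = {"Article": {"id": "ID", "title": "String", "views": "Int"}}

print(breaking_changes(published, proposed))
```

In this sketch, adding the `views` field is treated as non-breaking, while removing `published_at` is reported. A CI step could fail the build whenever the returned list is non-empty.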
Guaranteeing a smooth CI/CD process and high-performance connectivity gives domain owners confidence that they can connect their domains to the supergraph platform and iterate on changes to their domains fearlessly.
This unlocks self-serve connectivity for domain owners.
II. CONSUME APIs #
API consumers should be able to discover and consume APIs in a way that, as far as possible, doesn’t require manual API integration, aggregation or composition effort. API consumers have several common needs when they’re dealing with fixed API endpoints or specific data queries:
- fetch different projections of data to prevent over-fetching
- join data from multiple places to prevent under-fetching
- filter, paginate, sort and aggregate data from multiple places
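The first need in the list above, fetching a projection to avoid over-fetching, can be sketched in a few lines. This is an illustration only: the `project` helper and the sample `article` record are hypothetical.

```python
def project(record: dict, fields: list[str]) -> dict:
    """Keep only the requested fields of a record (a projection)."""
    return {name: record[name] for name in fields if name in record}


# Hypothetical full payload that a fixed endpoint might return.
article = {"id": 1, "title": "Hello", "body": "full text", "author_id": 7}

# The consumer only needs the id and title, so the body is never transferred.
summary = project(article, ["id", "title"])
print(summary)
```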
To provide an API experience that makes the consumption experience truly self-serve, there are two key requirements:
- Composable API design: The API presented by the supergraph engine must allow for on-demand composability. GraphQL is a great API to express composability semantics, but regardless of the API format used, a standardized, composable API design is a critical requirement.
- API portal: High-quality search, discovery and documentation of both the API and the underlying API models is critical to enable self-serve consumption. The more information that can be made available to API consumers the better, for example data lineage and authorization policies, as appropriate.
This unlocks self-serve consumption for API consumers.
III. DISCOVER demand #
Understanding how API consumers use their domain and identifying their unmet needs is crucial for API producers. This insight allows API producers to enhance their domain. It also helps identify new domain owners who should connect their domains to the supergraph.
This necessitates 2 key capabilities of the supergraph platform to create a consumer-first, agile culture:
- API consumption, API schema & portal analytics: A supergraph is analogous to a marketplace and needs to provide the marketplace owners and producers with insights to help improve the marketplace for the consumers.
- Ecosystem integrations: The supergraph platform should be able to integrate with existing communication and catalog tools, in particular to help understand unmet demand of API consumers.
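To make the analytics capability above concrete, here is a minimal sketch of a demand report. It assumes the platform can log which field paths consumers request or search for; the `demand_report` function, the field names and the sample data are all hypothetical.

```python
from collections import Counter

# Hypothetical inputs: the fields the supergraph exposes today, and the field
# paths consumers asked for (from request logs or portal searches).
schema_fields = {"Article.title", "Article.published_at", "Author.name"}
requested = [
    "Article.title", "Author.name", "Article.title",
    "Author.email", "Author.email", "Article.views",
]


def demand_report(schema_fields: set[str], requested: list[str]):
    """Split demand into served fields, unmet requests and unused fields."""
    counts = Counter(requested)
    served = {f: n for f, n in counts.items() if f in schema_fields}
    unmet = {f: n for f, n in counts.items() if f not in schema_fields}
    unused = sorted(schema_fields - set(counts))
    return served, unmet, unused


served, unmet, unused = demand_report(schema_fields, requested)
print("served:", served)
print("unmet:", unmet)
print("unused:", unused)
```

In this sketch, `unmet` surfaces fields that consumers want but no producer supplies yet, and `unused` lists exposed fields that nobody has requested.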
This closes the loop and allows the supergraph platform to create a virtuous cycle of success for producers and consumers.
Architecture guide #
CI/CD and build system (control plane) #
The control plane of the supergraph is critical to help domain owners connect their domains to the supergraph.
There are 3 components in the control plane of the supergraph:
- The domain itself
- The subgraph
- The supergraph
The control plane should define the following SDLC to help keep the supergraph in sync with the domain as the underlying domain changes.
Distributed data plane #
The supergraph data plane is critical to enable high performance access to upstream domains so that API producers can maintain their domain without hidden future maintenance costs:
API schema design guide #
Standardization #
A supergraph API schema should create standardized conventions on the following:
Standardization Attribute | Capability |
S1 | Separating models (resources) & commands (methods) |
S2 | Model filtering. Example: Get a list of articles published this year |
S3 | Model sorting. Example: Get a list of articles sorted in reverse by the date of publishing |
S4 | Model pagination. Example: Paginate the above list with 20 objects per page and fetch the 3rd page |
S5 | Model aggregations over fields. Example: Get a count of authors and their average age |
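The examples for S2 to S5 in the table above can be sketched over in-memory data. This is only an illustration of the semantics, not of how a supergraph engine implements them; the sample records, field names and helper functions are hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical sample data; field names are illustrative only.
articles = [
    {"id": i, "title": f"Article {i}", "published_on": date(2023 + i % 2, 1 + i % 12, 1)}
    for i in range(1, 71)
]
authors = [{"name": "Ada", "age": 36}, {"name": "Lin", "age": 44}, {"name": "Sam", "age": 52}]


def filter_by_year(items, year):
    """S2: keep the articles published in the given year."""
    return [a for a in items if a["published_on"].year == year]


def sort_by_date_desc(items):
    """S3: sort articles in reverse order of publishing date."""
    return sorted(items, key=lambda a: a["published_on"], reverse=True)


def paginate(items, page, per_page=20):
    """S4: return one page of items; pages are numbered from 1."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


def aggregate_authors(items):
    """S5: count authors and compute their average age."""
    return {"count": len(items), "avg_age": mean([a["age"] for a in items])}


third_page = paginate(sort_by_date_desc(articles), page=3)
print(len(filter_by_year(articles, 2024)), len(third_page), aggregate_authors(authors))
```

A standardized API would expose the same four operations with the same argument conventions on every model, so consumers learn them once.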
Prior art
- Google Cloud API design guide
  - Resource: A resource-oriented API is generally modeled as a resource hierarchy, where each node is either a simple resource or a collection resource
  - Method: Resources are manipulated via a small set of methods
Composability #
The supergraph API is typically a GraphQL / JSON API. There are varying degrees of composability an API can offer, as listed out in the following table:
Composability Attribute | Capability | Description |
C1 | Joining data | Join related data together in a "foreign key"-like join. Example: Get a list of authors and their articles |
C2 | Nested filtering | Filter a parent by a property of its child (i.e. a property of a related entity). Example: Get a list of authors who have published an article this year |
C3 | Nested sorting | Sort a parent by a property of its child (i.e. a property of a related entity). Example: Get a list of articles sorted by the names of their authors |
C4 | Nested pagination | Fetch a paginated list of parents, along with a paginated & sorted list of children for each parent. Example: Get the 2nd page of a list of authors and the first page of their articles, sorted by the article's title field |
C5 | Nested aggregation | Aggregate a child/parent in the context of its parent/child. Example: Get a list of authors and the number of articles written by each author |
These composability attributes are what increase the level of self-serve composition and reduce the need for manual API aggregation and composition.
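The composability attributes C1 to C5 can likewise be sketched over two related in-memory collections. This is a minimal illustration of what each attribute means, assuming a simple `author_id` foreign key; the data, field names and functions are hypothetical, and a real engine would typically push this work down to the underlying sources rather than loop in memory.

```python
# Hypothetical sample data linked by an author_id foreign key.
authors = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}, {"id": 3, "name": "Sam"}]
articles = [
    {"id": 10, "author_id": 1, "title": "Zebra", "year": 2024},
    {"id": 11, "author_id": 1, "title": "Apple", "year": 2022},
    {"id": 12, "author_id": 2, "title": "Mango", "year": 2024},
]


def articles_of(author):
    """Children of one parent: all articles written by the author."""
    return [a for a in articles if a["author_id"] == author["id"]]


def page(items, number, size):
    """Return page `number` (1-based) with `size` items per page."""
    return items[(number - 1) * size:number * size]


def authors_with_articles():
    """C1: join each author with their articles."""
    return [{**au, "articles": articles_of(au)} for au in authors]


def authors_published_in(year):
    """C2: filter parents by a property of their children."""
    return [au for au in authors if any(a["year"] == year for a in articles_of(au))]


def articles_sorted_by_author_name():
    """C3: sort articles by a property of the related author."""
    names = {au["id"]: au["name"] for au in authors}
    return sorted(articles, key=lambda a: names[a["author_id"]])


def authors_page_with_articles(page_no, size, child_size):
    """C4: a page of authors, each with the first page of articles sorted by title."""
    return [
        {**au, "articles": page(sorted(articles_of(au), key=lambda a: a["title"]), 1, child_size)}
        for au in page(authors, page_no, size)
    ]


def article_counts():
    """C5: aggregate children in the context of each parent."""
    return [{"name": au["name"], "article_count": len(articles_of(au))} for au in authors]


print(article_counts())
```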