
System Design & Distributed Systems Glossary

📌 If you have any questions shoot us an email or join us on Discord! 💜

Software engineering terms often lack industry-wide definitions due to the field's breadth and rapid evolution. While developing Multiplayer, we realized that terms like "system design" and "software architecture" were used interchangeably, so we created shared definitions to ensure clear communication internally and externally.

This glossary is a reflection of how we approach these ideas and it’s intended to help others align on key concepts. This isn't an exhaustive list and we’re open to suggestions and feedback!

Distributed systems

Distributed systems consist of multiple independent components (or nodes) that work together, communicating over a network, to solve shared problems or deliver services.

These systems enhance scalability, fault tolerance, and efficiency by distributing tasks across individual devices or servers.

🔖 Deep dive into Distributed Systems Architecture

System design

System design (not to be confused with design systems) is the iterative process of defining and evolving the system architecture to meet the business requirements.

It involves defining the high-level conceptual structure of the software system and all its major components and interactions - across all aspects of the system (i.e. software, hardware, data, interfaces, and user interactions).

Expected outputs of the system design process:

🔖 Deep dive into System Design vs Software Design

🔖 Deep dive into a System Design Primer & Examples

Software design

Software design focuses on the software architecture, providing a detailed blueprint for individual components, how they interact, and how the code is written (e.g. classes, functions, and modules).

Although software architecture has a narrower scope than system architecture, it is similarly dynamic: it evolves over time, as the requirements change.

Detailed software design documentation is an expected output of the software design process.

🔖 Deep dive into System Design vs Software Design

Design Systems

System Design and Design Systems are two completely different terms although, understandably, they are easily confused.

A Design System is a collection of reusable components that allows developers to create interfaces and experiences quickly while keeping a consistent look and feel in terms of colors, typography, spacing, etc. It goes beyond a style guide and pattern library: it also includes standards and documentation on why and how to use the design components.

System architecture styles

System architecture styles define the overall structure and organization of a software system, showing the highest level of abstraction of the system design.

Architecture styles answer fundamental questions such as how the system components communicate, how data flows, and how the system is divided into modules or layers. Changes to architectural styles are significant and can be costly.

System architectures can often be categorized into two broad paradigms:

  • Centralized, which are typically represented by client-server systems
  • Decentralized, which are exemplified by peer-to-peer systems.

Examples of common system architecture styles:

🔖 Deep dive into System Architecture Styles

Software architecture styles

Often there is no distinction between system and software architecture styles. However, some architectural styles focus specifically on how to organize the information at the software level. For example:

  • Component-based Architecture
  • Domain Driven Design (DDD) Architecture
  • Object Oriented Architecture
  • Hexagonal Architecture (Ports and Adapters)
  • Clean Architecture
  • Functional Architecture
  • Data Driven Architecture

Distributed system design patterns

Distributed system design patterns are frequently used solutions to common challenges in distributed computing related to data storage, messaging, system management, and compute capability. Here are some examples:

  • Ambassador
  • Circuit Breaker
  • CQRS
  • Event Sourcing
  • Leader Election
  • Publisher / Subscriber
  • Sharding
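
As one illustration of the patterns listed above, here is a minimal sketch of a Circuit Breaker: after a number of consecutive failures it rejects calls immediately for a cool-down period, then lets a single trial call through. The class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are assumptions made for this example, not part of any particular library.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    """Hypothetical minimal circuit breaker (illustrative only)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: "half-open", so one trial call is allowed through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would typically wrap each outbound request, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a hypothetical function, and treat `CircuitOpenError` as a signal to fall back or fail fast.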

🔖 Deep dive into Distributed System Design Patterns

Software design patterns

Software design patterns are reusable solutions to common coding and implementation problems. They fall into categories such as Creational (e.g., Factory Method, Builder, Singleton), Structural (e.g., Adapter, Bridge, Composite, also known as Object Tree), and Behavioral (e.g., Chain of Responsibility (CoR), Observer).

System architecture diagram

A system architecture diagram provides a high-level, visual overview of a system's components, services, and interactions. It helps stakeholders understand the overall architecture, the individual relationships between them, and how data and processes flow through the system.

🔖 Deep dive into Architecture Diagram Examples

🔖 Deep dive into System Platforms

Sequence diagram

Sequence diagrams illustrate the order of interactions between system components, focusing on the sequence of events and operations over time. They are particularly useful for understanding the flow of processes in dynamic systems.

🔖 Deep dive into Architecture Diagram Examples

Network diagram

Network diagrams show the physical or logical layout of a network, detailing how different network nodes (like servers or routers) are connected. They aid in network troubleshooting, design, and security analysis by providing a clear view of the system’s communication structure.

🔖 Deep dive into Architecture Diagram Examples

API flow diagram

API flow diagrams visually represent how an API interacts with different system components, showcasing the expected behavior and flow of data across these interactions.

They help with clearly defining and communicating the expected behavior of an API in various levels of detail, which is especially useful when onboarding new team members or clarifying API functionality within teams.

🔖 Deep dive into API Flow Diagram

Microservices diagram

A microservices diagram illustrates the interactions and dependencies between individual microservices within a system. These diagrams clarify how services communicate, helping teams understand the system’s structure and plan for scalability and system evolution.

🔖 Deep dive into Microservices Diagram

System design review

A system design review is an important practice in software development to assess the technical feasibility, performance, and potential risks of a system. This practice is essential not only for new system architectures but also throughout the software development lifecycle (SDLC) and should be performed when adding features, editing APIs, refactoring existing solutions, migrating to new paradigms, etc.

Regular design reviews ensure alignment with business requirements, document key architectural decisions, and prevent technical debt accumulation (e.g. uncovering any unintentional architectural violations before implementation).

By continuously assessing and refining the system design, teams can optimize the architecture and avoid bottlenecks. Proactively identifying, evaluating, and prioritizing the areas of the system that require re-architecture and standardization helps ensure the long-term scalability and health of the software system.

🔖 Deep dive into System Design Reviews

Backend architecture

Backend architecture refers to the design and organization of all system components, infrastructure, and processes that are not visible to end users. This includes elements such as servers, databases, APIs, third-party integrations, business logic, caching solutions, message brokers, a security layer, and more.

It’s typically responsible for processes like authentication, data management, third-party integrations, and security. The architecture forms the backbone of how a system functions behind the scenes, ensuring performance, scalability, and security.

🔖 Deep dive into Backend Architecture

Frontend architecture

Frontend architecture focuses on the design and structure of the user interface (UI), structuring and organizing frontend code, leveraging reusable components, and implementing design patterns to manage data flows and interactions.

It ensures efficient, maintainable, and scalable web applications by structuring the interactions between users and backend systems.

Evolutionary architecture

Neal Ford, Rebecca Parsons and Patrick Kua proposed this idea in the book “Building Evolutionary Architectures” which focuses on the concept that “An evolutionary architecture supports guided, incremental change as a first principle across multiple dimensions.”

In short, evolutionary architecture promotes continuous, incremental changes to a system’s architecture over time. It supports adaptability to changing requirements, technologies, and business priorities, allowing the system to evolve while maintaining its core principles without requiring costly overhauls.

Technical debt (TD)

Introduced by Ward Cunningham in the early '90s, technical debt (TD) is a metaphor comparing the accumulated consequences of past decisions, compromises, and shortcuts in software development to financial debt.

Just as unpaid loans accrue interest, software TD accumulates "interest" in the form of bugs, inefficiencies, and increased development time.

While taking on technical debt can provide initial speed (i.e. prioritizing short-term benefits at the expense of long-term health), delaying "repayment" can make future changes more difficult and costly.

And indeed the cost of technical debt extends beyond the immediate need to refactor initial design and implementation choices; it also encompasses the broader business impact of missed opportunities, failed modernization efforts, and high turnover due to inflexible and brittle IT systems.

🔖 Deep dive into Strategies for Managing Technical Debt

🔖 Deep dive into Technical Debt Examples

Technical credit

Technical credit is a strategy that seeks to preemptively forestall the accumulation of technical debt by proactively designing systems to prevent future bottlenecks.

However, because of time constraints and unforeseeable future requirements, the frequent outcome is an over-engineered solution where no corrective action is taken.

This inaction is partly due to the sunk cost fallacy—the hope that the currently unused solution might find relevance in the future. Another deterrent is the cost associated with refactoring. Removing an integrated solution from the architecture demands significant development effort to ensure it doesn't negatively impact customers or hinder the delivery of new features. Once a solution is entrenched in the architecture, excising it becomes a challenge.

🔖 Deep dive into Strategies for Managing Technical Debt

Architectural technical debt (ATD)

Architectural technical debt (ATD) is a category of technical debt that represents the long-term impact of decisions made during the system design process, especially concerning the choice of structure (e.g., architectural style), the choice of technologies (e.g., frameworks, packages, libraries), and even the choice of programming languages and development methodologies.

Unlike code-level debt, ATD imposes greater constraints on a system’s scalability and adaptability and can result in a fragile architecture, limiting future evolution and system performance.

🔖 Deep dive into Understanding Architectural Technical Debt

🔖 Deep dive into Technical Debt Examples

Architectural drift

Architectural drift falls under the broader category of architectural technical debt and refers to the gradual deviation of a system's architectural design from its original or intended architecture due to ad-hoc alterations and additions.

This form of degradation introduces (both intentionally and unintentionally) design elements that, while not part of the initial architectural plan, do not necessarily contravene it. The architecture remains fundamentally intact, but accumulates unaccounted-for decisions like inconsistent coding practices, redundant components, or tangled dependencies. These elements often go undocumented, rendering the original architecture misleading and potentially undermining trust in both the system architecture and its associated documentation.

🔖 Deep dive into Recovering your Architecture after Drift and Erosion

Architectural erosion

Architectural erosion falls under the broader category of architectural technical debt and occurs when new design elements directly conflict with or undermine the system's foundational architecture, thus violating its guiding principles. Examples include tightly coupled modules, bypassing security protocols, and ignoring performance constraints. Erosion not only compromises the system's integrity but leads to a fragile architecture that is likely to encounter significant issues in the future.

🔖 Deep dive into Recovering your Architecture after Drift and Erosion

Architecture recovery

Architecture recovery is the process of rediscovering or reconstructing a system’s intended architecture after architectural drift or erosion. This involves analyzing code and documentation, often in legacy systems, to recover or update architectural details for improved maintenance and scalability.

It is often mentioned in conjunction with ‘software archaeology’ which is the study of poorly documented legacy software.

Both of these processes are tedious, time-consuming, and error-prone when done by hand. By using tools like Multiplayer’s Auto-Documentation you can discover, track and auto-document your system effortlessly.

Architectural observability

Architectural observability involves having a thorough, accurate, and real-time view of the system's architecture. Its objective is to foster a shared understanding of the architecture and the system's internal behaviors among developers, architects, and various other stakeholders.

This also ensures the identification of early signs of architectural drift from the original design and sheds light on any instances of architectural erosion, preventing the accumulation of architectural technical debt.

Without the proper tools, visualizing the entire system architecture in real time becomes a daunting task, often relegated to manual, general-purpose diagramming tools. Tools like Multiplayer’s Auto-Documentation can help you discover, track, and auto-document your system.

Minimum viable architecture (MVA)

A minimum viable architecture (MVA) is the simplest design of a system’s architecture that supports the core functionalities of a Minimum Viable Product (MVP) while (a) allowing for future scalability and adaptability, (b) preventing potential user disillusionment due to performance issues, and (c) safeguarding against competitors.

In short the MVA allows the application to meet its current requirements, without jeopardizing its ability to meet future requirements - i.e. minimizing architectural technical debt and achieving long-term viability.

🔖 Deep dive into the Minimum Viable Architecture

Application performance monitoring (APM)

Application Performance Monitoring (APM) is the practice of using software tools and telemetry data to track key metrics for application health and performance.

APM tools are usually associated with Observability 1.0 and help detect software health problems and identify trends in the application's overall performance and user experience.

🔖 Deep dive into the Platform Debugger

Observability

Observability enables teams to gain insights into system behavior beyond traditional monitoring. It involves revealing unknown unknowns and understanding complex system interactions, allowing teams to diagnose why issues occur.

The term "observability" and its definition have long been debated, and recently a distinction has emerged between Observability 1.0 and 2.0:

  • Observability 1.0 closely aligns with traditional monitoring frameworks and APM tools and refers to the traditional approach where vast amounts of telemetry data (metrics, logs, and traces) is collected and then displayed with dashboards. These tools are great for detecting known unknowns and spotting trends in production. But they often leave developers sifting through mountains of data, piecing together insights like searching for a needle in a haystack.

  • Observability 2.0 emphasizes real-time, actionable insights across the entire software development lifecycle. Instead of simply monitoring for problems, it helps developers understand system behaviors, revealing the unknown unknowns.

The evolution of how we think about and implement observability reflects the growing complexity of modern software systems and the need for more sophisticated tools to manage and understand them.

🔖 Deep dive into Observability 1.0 vs Observability 2.0

System requirements documentation

System requirements documentation outlines the functional and non-functional requirements that guide system design, highlighting how they will drive the architectural decisions (or how the requirements might be influenced by the system architecture itself).

System architecture documentation

System architecture documentation is the comprehensive collection of information about the system architecture, its design, and evolution over time. It includes various diagrams (e.g. system architecture diagram), records of the architecture's goals, constraints, design decisions, and historical evolution (e.g. architecture decision records), any other documents that help guide the development process (e.g. HLD, SAD, KDD, ARD, LLD, ADR, etc.) plus any formalized processes or practices adopted by the team (e.g. system analysis and design (SAD) practices).

Clear and comprehensive documentation enables effective communication among stakeholders, ensuring that everyone understands the system's structure, components, and interactions. It facilitates better decision-making by providing insights into design choices, trade-offs, and constraints, thus minimizing the risk of misunderstandings, errors, and rework. Moreover, architecture documentation is a reference for future maintenance, enhancements, and scalability, allowing teams to maintain system integrity and coherence over time.

🔖 Deep dive into Architecture Documentation

Architecture decision record (ADR)

An architecture decision record (ADR) captures an architectural decision, along with its context, options considered, and rationale. Lightweight ADRs include title, context, decision, and consequences, serving as a snapshot of choices made at specific points in the project.
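
To make the lightweight format concrete, here is a minimal sketch that models an ADR with the four fields named above (title, context, decision, consequences) and renders it as plain text. The `ADR` class, its `render` method, and the sample content are hypothetical and exist only for illustration; teams often keep ADRs as simple text files instead.

```python
from dataclasses import dataclass, field


@dataclass
class ADR:
    """Hypothetical lightweight ADR with the four fields named in the text."""

    title: str
    context: str
    decision: str
    consequences: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the record as a short plain-text document."""
        lines = [
            f"Title: {self.title}",
            f"Context: {self.context}",
            f"Decision: {self.decision}",
            "Consequences:",
        ]
        lines += [f"  - {item}" for item in self.consequences]
        return "\n".join(lines)


# Sample content is invented for the example.
example = ADR(
    title="Use a message broker between orders and billing",
    context="Billing outages currently block order creation.",
    decision="Orders publishes events; billing consumes them asynchronously.",
    consequences=["Orders no longer waits on billing", "Billing becomes eventually consistent"],
)
print(example.render())
```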

Software design documentation

Software design documentation outlines the specifications for how each individual component, module, or subsystem will be implemented and all the interactions and dependencies between them (e.g. technical details, environment variables (required, optional, and default values), algorithms, data structures, coding guidelines, runtime configuration, etc.).

Big design up front (BDUF)

Big Design Up Front (BDUF) involves creating a comprehensive system and software design before implementation begins, traditionally associated with the waterfall development model.

While this upfront planning helps address system constraints and requirements early, it contrasts with agile approaches that prioritize iterative design. A balanced approach involves establishing core architectural constraints upfront while remaining open to evolving the design as requirements change with continuous system design reviews.

Enterprise architecture (EA)

Enterprise architecture (EA) aligns technology infrastructure with business goals, establishing a strategic roadmap for IT and business alignment. This involves analyzing systems, identifying optimization opportunities, enforcing technology standards, choices, and frameworks across projects, and creating architectural artefacts (e.g., business capability models, target state architectures).

Enterprise architects work with leadership to ensure technology investments support long-term business goals, although the necessity of this role in modern agile environments is sometimes debated.

🔖 Deep dive into Enterprise Architecture Strategy

System analysis and design (SAD)

System analysis and design (SAD) is a structured approach to developing information systems that meet organizational objectives. SAD involves assessing current systems, identifying improvement areas, and designing new solutions that align with both functional and non-functional requirements.

The core principles and methodologies that underpin SAD include an iterative process of understanding and documenting system requirements, ensuring stakeholder engagement, and using tools and techniques for conceptualizing and planning system architectures.

🔖 Deep dive into System Analysis and Design

Architecture tradeoff analysis method (ATAM)

The architecture tradeoff analysis method (ATAM) is an approach within the system analysis and design (SAD) practice.

It was developed by the Software Engineering Institute (SEI) at Carnegie Mellon University and it uses a structured approach to identifying and mitigating risks associated with the proposed architecture.

ATAM engages stakeholders to prioritize quality requirements and weigh tradeoffs, such as balancing speed with system stability, enabling teams to make informed architectural decisions.

🔖 Deep dive into System Analysis and Design

Software architecture analysis method (SAAM)

The software architecture analysis method (SAAM) is an approach within the system analysis and design (SAD) practice.

It is another method developed by the Software Engineering Institute (SEI) at Carnegie Mellon University, which evaluates the system architecture’s ability to meet specific quality attributes (e.g. performance, scalability, and maintainability) through scenario-based evaluations.

🔖 Deep dive into System Analysis and Design

Scalability

Scalability refers to the system’s ability to handle increased workloads without degrading performance. It can be achieved via:

  • Horizontal Scaling: Adding more servers/nodes.
  • Vertical Scaling: Increasing resources on existing nodes.

🔖 Deep dive into a System Design Primer & Examples

Availability

Availability ensures that a system is consistently accessible to users with minimal downtime. System architects often incorporate redundancy—such as backup systems or duplicate components—to enhance availability. This redundancy allows for seamless operation even if a primary component fails, ensuring uninterrupted service for users.
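
The effect of redundancy can be estimated with a simple calculation. The sketch below assumes that replicas fail independently and that the service stays up as long as at least one replica is up; both are simplifying assumptions that real systems often violate (for example, through shared power or network failures). The function name is hypothetical.

```python
def combined_availability(component_availability: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies, assuming independent failures.

    The service is down only when every replica is down at the same time,
    so availability = 1 - (1 - a) ** n.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - component_availability) ** replicas


# Two replicas that are each 99% available give roughly 99.99% under these assumptions.
print(f"{combined_availability(0.99, 2):.4%}")
```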

🔖 Deep dive into a System Design Primer & Examples

Reliability

Reliability involves the system's resilience and ability to handle errors gracefully. A reliable system consistently delivers accurate results and mitigates potential failures to avoid crashes or unexpected behavior, contributing to overall stability.

🔖 Deep dive into a System Design Primer & Examples

Maintainability

Maintainability refers to a system’s ease of modification, debugging, and adaptation over time. A maintainable system has a modular, clean codebase, well-documented system information, and low technical debt.

🔖 Deep dive into a System Design Primer & Examples

Fault tolerance

Fault tolerance ensures continued operation despite failures in hardware, software, or network. Key methods include:

  • Replication: Duplicating system elements for redundancy.
  • Checkpoints: Periodically saving the system’s state for recovery.
  • Failover: Automatically switching to backup systems when the primary fails.
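
Of the methods above, failover is the simplest to sketch in code: try the primary first and fall through to backups in order. The `call_with_failover` helper and the backend callables are hypothetical; a production setup would usually add health checks, timeouts, and logging.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(backends: Sequence[Callable[[], T]]) -> T:
    """Call each backend in priority order and return the first successful result."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # broad on purpose: any failure triggers failover
            last_error = exc
    # Every backend failed (or none were given); surface the last error as the cause.
    raise RuntimeError("all backends failed") from last_error
```

Usage might look like `call_with_failover([read_from_primary, read_from_replica])`, where both functions are placeholders for real data-access calls.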

🔖 Deep dive into a System Design Primer & Examples

Consistency

Consistency in distributed systems ensures uniform data across all nodes, but it can be managed differently depending on application requirements. Common models include:

  • Strong Consistency: Guarantees that all data replicas reflect the latest update immediately, but it can impact performance due to synchronization overhead.
  • Eventual Consistency: Prioritizes speed, allowing temporary inconsistencies that resolve over time, as updates are eventually propagated across all replicas.
  • Causal Consistency: Maintains order for causally related updates, but does not guarantee immediate consistency. It offers a balance between performance and consistency requirements.

🔖 Deep dive into a System Design Primer & Examples

Partitioning/Sharding

Partitioning (or sharding) splits large datasets into smaller, more manageable parts distributed across nodes. This approach enhances scalability and fault tolerance, as operations run in parallel, and failures are isolated to individual partitions.

Implementing sharding requires careful consideration of data distribution, shard keys, and balancing data across shards.
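
As a rough illustration of how a shard key maps to a shard, the sketch below hashes the key and takes the result modulo the number of shards. This is only one possible scheme, chosen here for brevity; the function name and key format are assumptions.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index in the range 0..num_shards-1.

    A stable hash (SHA-256) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical user IDs routed across four shards.
for user_id in ("user-1", "user-2", "user-3"):
    print(user_id, "->", shard_for(user_id, 4))
```

One design caveat: with plain modulo hashing, changing `num_shards` remaps most keys, which is why techniques such as consistent hashing are often considered when shards are added or removed frequently.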

🔖 Deep dive into a System Design Primer & Examples

Monolithic architecture

In a monolithic architecture, an application functions as a single unit deployed on a single server or across multiple servers with load balancers within a distributed system.

This setup is straightforward to develop, test, and deploy, with faster execution due to fewer inter-process communications. However, scaling specific components is challenging, often requiring the entire application to scale together, resulting in inefficient resource use and increased maintenance complexity.

🔖 Deep dive into System Architecture Design

Microservices architecture

Microservices architecture structures applications as loosely coupled, independent services focused on specific business functions. Each service has its own data storage and communicates with others via APIs, allowing for independent development, deployment, and scalability.

This isolation enhances resilience, as failures are contained, but the architecture requires careful design to manage data consistency, communication overhead, and distributed system challenges.

🔖 Deep dive into Microservices Design Patterns

🔖 Deep dive into System Architecture Design

Event-driven architecture

Event-driven architectures enable asynchronous communication between components via events, promoting scalability and flexibility. This approach decouples components, allowing services to operate independently in response to specific events. However, handling event flow, ensuring message order, and managing eventual consistency introduce complexity.
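
The decoupling described above can be sketched with a tiny in-process event bus: publishers enqueue events without knowing who consumes them, and handlers run later when the queue is drained. This is a simplified stand-in under stated assumptions; real systems typically rely on a message broker, and the `EventBus` class and event names here are hypothetical.

```python
from collections import defaultdict, deque
from typing import Any, Callable


class EventBus:
    """Hypothetical in-process event bus with deferred delivery."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)  # event type -> list of handlers
        self._queue = deque()               # pending (event type, payload) pairs

    def subscribe(self, event_type: str, handler: Callable[[Any], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Any) -> None:
        # The publisher returns immediately; delivery happens in dispatch_pending().
        self._queue.append((event_type, payload))

    def dispatch_pending(self) -> int:
        """Deliver queued events in publish order; return how many were processed."""
        processed = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in list(self._handlers.get(event_type, [])):
                handler(payload)
            processed += 1
        return processed
```

For example, a billing component could call `bus.subscribe("order.created", create_invoice)` while the orders component only ever calls `bus.publish("order.created", order)`; neither needs a reference to the other.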

🔖 Deep dive into System Architecture Design

Serverless architecture

Serverless architectures allow developers to focus on code, with cloud providers managing infrastructure, scaling, and maintenance.

Functions are deployed as independent services, scaling automatically with demand.

While serverless architectures improve cost-efficiency and deployment speed, they come with limitations like vendor lock-in, latency from cold starts, and complexities in debugging and monitoring.

🔖 Deep dive into System Architecture Design

Edge computing architecture

Edge computing moves data processing closer to users and devices, reducing latency and bandwidth usage. By processing locally rather than relying on central cloud servers, edge computing improves application reliability but requires robust management of resource constraints, connectivity, and security across distributed infrastructure.

🔖 Deep dive into System Architecture Design

Peer-to-peer architecture

Peer-to-peer (P2P) architecture distributes control across nodes, eliminating central servers and allowing the network to scale dynamically with each added node. This decentralization increases resilience and efficiency but introduces complexities in network management, data consistency, and security, requiring careful design to ensure reliability in large, dynamic networks.

🔖 Deep dive into System Architecture Design

Multi cloud architecture

Multi-cloud architecture is a strategic approach to designing and managing a software system that spans multiple cloud providers. By leveraging services and infrastructure from different vendors—such as AWS, Azure, and Google Cloud—organizations can optimize their systems for redundancy, performance, cost efficiency, and compliance requirements.

🔖 Deep dive into Multi Cloud Architecture

Components

A system comprises various software and hardware components: self-contained units that perform specific functions, such as databases, servers, clients, APIs, or applications. Together, system components collaborate to execute tasks, manage resources, and provide interfaces for user interaction.

Components are also the building blocks for system architecture diagrams, helping to illustrate system interactions and responsibilities.

🔖 Deep dive into System Architecture

Dependencies

Dependencies are relationships between components. They can be internal to the application or connect it to external services (e.g. managed service providers, APM tools, logging systems, etc.).

System diagrams usually use lines or arrows to illustrate the interactions and communication pathways between components (i.e. how data flows, control is passed, or services are invoked).

Application dependency mapping (ADM)

Application dependency mapping (ADM) is a process used to visualize the connections between an application’s components, producing a topological view of all the system dependencies.
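
A dependency map can be represented as a simple graph. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to derive an ordering in which every component appears after the things it depends on, plus a helper that lists what would be affected if a component failed. The component names and both helper functions are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each component -> the components it depends on.
dependencies = {
    "web-frontend": {"api-gateway"},
    "api-gateway": {"orders-service", "auth-service"},
    "orders-service": {"orders-db", "message-broker"},
    "auth-service": {"users-db"},
}


def startup_order(deps):
    """Return components ordered so that dependencies come before dependents."""
    return list(TopologicalSorter(deps).static_order())


def impacted_by(component, deps):
    """Return every component that depends on `component`, directly or transitively."""
    impacted = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for name, needs in deps.items():
            if current in needs and name not in impacted:
                impacted.add(name)
                frontier.append(name)
    return impacted


print(startup_order(dependencies))
print(sorted(impacted_by("orders-db", dependencies)))
```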

🔖 Deep dive into Application Dependency Mapping

Application programming interfaces (APIs)

Application programming interfaces (APIs) facilitate communication between components, systems, and applications through a set of rules and protocols.

APIs specify methods, URIs and URLs, and data formats that applications can use to request and receive data. API types include RESTful APIs, GraphQL, and gRPC, each suited to specific communication needs.

API gateways centralize requests, improving client interaction and managing complex architectures.

🔖 Deep dive into API Design Patterns

Data interfaces

Data interfaces define storage mechanisms (e.g. SQL vs NoSQL databases), access patterns (e.g., sequential access, etc.), and strategies for optimizing data retrieval and management.

🔖 Deep dive into System Architecture

User interaction layers (UI and UX)

User interaction layers define how users interact with the system, aiming to create an intuitive and user-friendly environment for end users. They include:

  • User interface (UI): the visual elements for user interaction
  • User experience (UX): the overall system experience.

🔖 Deep dive into System Architecture

Security layers

Security layers implement measures like user authentication, authorization, and data encryption to safeguard the system from unauthorized access, breaches, and other threats.

🔖 Deep dive into System Architecture

Server

Servers provide resources, data, and services within a network, managing business logic and responding to client requests. They can be provisioned locally, in the cloud, or in a hybrid setup.

🔖 Deep dive into Backend Architecture

Caching solution

Caching temporarily stores frequently accessed data in a fast-access storage layer. This boosts performance by reducing retrieval time. It can be implemented in-browser, through CDNs, with in-memory caches like Redis, or within application code.
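
As a minimal sketch of caching within application code, the class below keeps values in memory for a fixed time-to-live (TTL) and only calls the slower loader on a miss or after expiry. The `TTLCache` name, the `get_or_load` method, and the injectable `clock` are assumptions for this example; it is not the API of Redis or any other caching product.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock        # injectable so tests can control time
        self._entries = {}        # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for `key`, calling `loader(key)` on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]       # cache hit: skip the slow lookup
        value = loader(key)       # cache miss: fetch from the slower source
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

The TTL is the main design lever here: a longer TTL means fewer slow lookups but a higher chance of serving stale data.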

🔖 Deep dive into Backend Architecture

Database

Databases store, manage, and retrieve data. Relational (SQL) databases such as MySQL and PostgreSQL organize data in tables with predefined schemas, while non-relational (NoSQL) databases such as MongoDB and Cassandra provide more flexible schemas.

🔖 Deep dive into Backend Architecture

Platform engineering

Platform engineering is the discipline of designing, building, and maintaining an internal developer platform (IDP) that abstracts and automates complex infrastructure and operational tasks. It focuses on creating reusable tools, workflows, and self-service capabilities, empowering developers to build, deploy, and manage software with greater efficiency and autonomy.

The goal of platform engineering is to reduce friction in the software development lifecycle by providing standardized solutions for common challenges, such as environment provisioning, CI/CD pipelines, observability, and security. This enables engineering teams to focus on delivering value to end users rather than getting bogged down by repetitive or complex infrastructure tasks.

🔖 Deep dive into Platform Engineering