Commit ea73f4c

committed
update glossary.md
1 parent c1d21ad commit ea73f4c

1 file changed

Lines changed: 250 additions & 0 deletions

File tree

Glossary.md

@@ -11,6 +11,7 @@
1111
1. [Cloud Design Patterns](#cloud-design-patterns)
1212
1. [Compiler Framework: LLVM vs GCC](#compiler-framework-llvm-vs-gcc)
1313
1. [Conway's Law](#conways-law)
14+
1. [CPU 101](#cpu-101)
1415
1. [Cracking Coding Interviews](#cracking-coding-interviews)
1516
1. [Data Engineering & Data Scientists Vocab 101](#data-engineering-data-scientists-vocab-101)
1617
1. [Data Management in Distributed Systems (Partitioning, Shuffling and Bucketing)](#data-management-in-distributed-systems-partitioning-shuffling-and-bucketing)
@@ -23,12 +24,14 @@
2324
1. [Gartner's PACE Layered Application Strategy](#gartners-pace-layered-application-strategy)
2425
1. [Generic: PECS (Producer Extends, Consumer Super)](#generic-pecs-producer-extends-consumer-super)
2526
1. [Hadoop Ecosystem](#hadoop-ecosystem)
27+
1. [Idempotent, Backfill](#idempotent-backfill)
2628
1. [JIT vs AOT](#jit-vs-aot)
2729
1. [Measuring Engineering Productivity (DORA, SPACE, DX Core 4, DevEx)](#measuring-engineering-productivity-dora-space-dx-core-4-devex)
2830
1. [Medallion Architecture](#medallion-architecture)
2931
1. [Memory Consistency Model (SC vs TSO vs Relaxed)](#memory-consistency-model-sc-vs-tso-vs-relaxed)
3032
1. [Message Broker Pattern](#message-broker-pattern)
3133
1. [Mixin](#mixin)
34+
1. [Network Design 101](#network-design-101)
3235
1. [OLAP vs OLTP](#olap-vs-oltp)
3336
1. [Passkey](#passkey)
3437
1. [Popular Enterprise Architecture Frameworks](#popular-enterprise-architecture-frameworks)
@@ -38,6 +41,7 @@
3841
1. [RBAC vs ReBAC](#rbac-vs-rebac)
3942
1. [Reactive Programming vs Event-Driven Architecture](#reactive-programming-vs-event-driven-architecture)
4043
1. [Real-time Communication and Messaging (MQTT, AMQP and WebSocket)](#real-time-communication-and-messaging-mqtt-amqp-and-websocket)
44+
1. [Scaling a system 101](#scaling-a-system-101)
4145
1. [Security Words 101](#security-words-101)
4246
1. [SLA, SLO, and SLI](#sla-slo-and-sli)
4347
1. [Slowly Changing Dimensions (SCD)](#slowly-changing-dimensions-scd)
@@ -53,6 +57,7 @@
5357
1. [Transfer Learning, Fine-tuning, Multitask Learning and Federated Learning](#transfer-learning-fine-tuning-multitask-learning-and-federated-learning)
5458
1. [Web Services and APIs (SOAP, RestAPI, GraphQL, gRPC and Kafka)](#web-services-and-apis-soap-restapi-graphql-grpc-and-kafka)
5559
1. [Windows UI Development Frameworks](#windows-ui-development-frameworks)
60+
1. [Zanzibar](#zanzibar)
5661

5762
#### 9 Clean Code Principles
5863
<a id="9-clean-code-principles-2"></a>
@@ -259,6 +264,100 @@ Software engineering principle that states that the structure of a system reflec
259264

260265
---
261266

267+
#### CPU 101
268+
<a id="cpu-101-2"></a>
269+
270+
[ref](https://cpu.land/editions/one-pager)
271+
272+
```mermaid
273+
flowchart TD
274+
%% ── Boot ────────────────────────────────────────────────────
275+
Firmware["UEFI / BIOS"] --> Bootloader["Bootloader (GRUB)"]
276+
Bootloader --> KernelInit["Kernel Init<br/>(rings · page tables · IDT · scheduler)"]
277+
KernelInit --> InitProcess["Init Process (PID 1)"]
278+
InitProcess -->|"spawns via"| Fork
279+
280+
%% ── fork + exec ─────────────────────────────────────────────
281+
Fork["fork()"] -->|"SYSCALL: rax=syscall#<br/>rdi/rsi=args → clone + COW pages"| IDT
282+
Fork -->|"child calls"| Exec["exec()"]
283+
Exec -->|"SYSCALL: execve"| IDT
284+
IDT["IDT Handlers<br/>(syscalls · interrupts)"] -->|"exec: load ELF segments"| UserPages["User Pages (Ring 3)"]
285+
IDT -->|"exec: set IP to entry point"| IP["Instruction Pointer (IP)"]
286+
IDT -->|"Ring 0→3: SYSRET restores user mode<br/>(returning from kernel back to user · result in rax)"| Registers["Registers (eax, ebx, ...)"]
287+
288+
%% ── CPU Fetch-Execute loop ──────────────────────────────────
289+
%% NOTE: pure computation (math, logic, local memory reads) runs here
290+
%% directly in Ring 3 — NO syscall needed
291+
NoteRing3["⚑ Not all operations go to kernel<br/>arithmetic · logic · local memory stay in Ring 3 — no mode switch"]
292+
NoteRing3 -.->|"no kernel handoff"| IP
293+
294+
IP -->|"fetch (virtual addr)"| MMU["MMU<br/>Memory Management Unit<br/>(translate + ring protection)"]
295+
Registers -->|"load / store"| MMU
296+
MMU -->|"page table walk"| PageTable["Page Table"]
297+
PageTable --> KernelPages["Kernel Pages (Ring 0)"]
298+
PageTable --> UserPages
299+
300+
%% ── Syscall boundary note ───────────────────────────────────
301+
%% NOTE: syscall only triggered at system resource boundary
302+
NoteSyscall["⚑ User code is handed over to kernel via SYSCALL<br/>Kernel manages system interactions (files · network · memory)<br/>User gets result back in rax"]
303+
NoteSyscall -.->|"triggers SYSCALL instruction"| IDT
304+
305+
%% ── Preemptive Multitasking ─────────────────────────────────
306+
Timer["Hardware Timer Interrupt (PIT)"] -->|"fires IRQ every ~few ms"| IDT
307+
IDT -->|"timer: run scheduler"| Scheduler["Scheduler"]
308+
Scheduler -->|"restore registers + IP"| Registers
309+
Scheduler -->|"swap CR3 (address space)"| MMU
310+
311+
%% ── Styling ─────────────────────────────────────────────────
312+
classDef boot fill:#fffacd,stroke:#333,stroke-width:2px
313+
classDef cpu fill:#f0f8ff,stroke:#333,stroke-width:2px
314+
classDef mem fill:#fff0f5,stroke:#333,stroke-width:2px
315+
classDef kernel fill:#f5f5dc,stroke:#333,stroke-width:2px
316+
classDef hw fill:#e6e6fa,stroke:#333,stroke-width:2px
317+
classDef proc fill:#ffe4e1,stroke:#333,stroke-width:2px
318+
classDef note fill:#f0fff0,stroke:#999,stroke-width:1px,stroke-dasharray:4 2,color:#555
319+
320+
class Firmware,Bootloader,KernelInit,InitProcess boot
321+
class IP,Registers cpu
322+
class MMU,PageTable,KernelPages,UserPages mem
323+
class IDT,Scheduler kernel
324+
class Timer hw
325+
class Fork,Exec proc
326+
class NoteRing3,NoteSyscall note
327+
```
328+
329+
🔹 **Fetch-Execute Cycle**: The CPU holds an **instruction pointer** (register) pointing into RAM. It endlessly repeats: fetch instruction → execute → advance pointer. Jump instructions alter the pointer; this is how control flow works.
330+
331+
🔹 **Registers**: Small, extremely fast storage buckets inside the CPU (e.g., `eax`, `ebx`). One special register is the instruction pointer. Others control CPU modes and permission levels.
332+
333+
🔹 **Privilege Rings (Kernel vs User mode)**: Modern CPUs have at least two modes.
334+
- **Kernel mode (Ring 0)**: unrestricted — any instruction, any memory.
335+
- **User mode (Ring 3)**: limited — no direct I/O, no arbitrary memory access, no changing CPU settings.
336+
The kernel runs in Ring 0; user programs run in Ring 3. The CPU starts in kernel mode at boot; the OS switches to user mode before running programs.
337+
338+
🔹 **System Calls (Syscalls)**: The only safe way for user-mode code to request kernel services (open file, allocate memory, spawn process, etc.).
339+
1. OS pre-registers handler addresses at boot: interrupt handlers in the **Interrupt Descriptor Table (IDT)**, and the `SYSCALL` entry point in a model-specific register (MSR).
340+
2. Program triggers a **software interrupt** (`INT 0x80`) or uses `SYSCALL` / `SYSENTER` instructions.
341+
3. CPU switches to kernel mode and jumps to the registered handler.
342+
4. Kernel does the work, then executes `IRET` / `SYSRET` to return to user mode.
343+
344+
🔹 **Paging & Virtual Memory**: Every memory address a program uses is a **virtual address**. The **Memory Management Unit (MMU)** translates it to a physical RAM address using a **page table** (a dictionary stored in RAM, pointed to by a CPU register). Benefits:
345+
- Each process has its own isolated address space (e.g., two processes can both use `0x400000` pointing to different physical memory).
346+
- Kernel marks its own pages as ring-0-only, so user-mode code cannot read kernel memory even though kernel addresses are present in the virtual map.
347+
- **Demand paging**: pages are only loaded into physical RAM when first accessed (page fault → kernel loads the page → retries the instruction).
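
The translation and demand-paging behaviour described above can be sketched with a toy model. This is a simplified illustration, not how a real MMU is built: it assumes a single-level page table held in a Python dict, and `ToyAddressSpace`, `PageFault`, and the shared `frames` counter are hypothetical names introduced only for this example.

```python
import itertools

PAGE_SIZE = 4096  # 4 KiB pages, a common default on x86


class PageFault(Exception):
    """Raised when a virtual page has no physical frame mapped yet."""


class ToyAddressSpace:
    """Hypothetical per-process page table: virtual page number -> physical frame number."""

    def __init__(self, frames):
        self.frames = frames      # shared iterator handing out free physical frames
        self.page_table = {}
        self.fault_count = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise PageFault(vpn)
        return self.page_table[vpn] * PAGE_SIZE + offset

    def access(self, vaddr):
        """Translate an address, mapping the page on first touch (demand paging)."""
        try:
            return self.translate(vaddr)
        except PageFault as fault:
            # The "kernel" handles the fault by assigning a free frame...
            self.fault_count += 1
            self.page_table[fault.args[0]] = next(self.frames)
            # ...and the faulting access is retried.
            return self.translate(vaddr)


frames = itertools.count()            # physical frames 0, 1, 2, ...
proc_a = ToyAddressSpace(frames)
proc_b = ToyAddressSpace(frames)
phys_a = proc_a.access(0x400000)      # first touch: page fault, then mapped
phys_b = proc_b.access(0x400000)      # same virtual address, different frame
```

Both toy processes use virtual address `0x400000`, yet each ends up with its own physical frame, which is the isolation property listed above.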
348+
349+
🔹 **Preemptive Multitasking**: A **hardware timer** (the PIT on legacy PCs, typically the local APIC timer on modern x86 systems) fires a **hardware interrupt** every few milliseconds. The CPU switches to kernel mode, and the OS scheduler saves the current process state (registers, instruction pointer) and restores another process. This swap is the **context switch**. Timeslices on Linux are typically 0.75 – 6 ms.
350+
351+
🔹 **Boot → Run sequence**:
352+
`Firmware (UEFI/BIOS)` → `Bootloader (GRUB)` → `Kernel init` → `Page tables set up, interrupts registered` → `init process (PID 1, e.g. systemd)` → `fork/exec` → user programs running
353+
354+
🔹 **fork & exec pattern**:
355+
- `fork()`: clones the current process. The child receives a return value of 0 (this is a return value, not the child's PID); the parent receives the child's PID. Memory pages are marked **copy-on-write (COW)**; no physical copy is made until a write occurs.
356+
- `exec()`: replaces the current process image with a new program. The kernel parses the ELF binary, maps its `.text`, `.data`, and `.bss` contents into virtual memory, and jumps to the entry point.
357+
Every process on Linux traces its ancestry back to PID 1 via fork-exec.
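
The fork-then-exec pattern can be tried from Python on a POSIX system, since `os.fork`, `os.execvp`, and `os.waitpid` are thin wrappers over the underlying system calls. The helper name `run_child` is hypothetical, and the sketch assumes a Unix-like OS (`os.fork` is not available on Windows).

```python
import os
import sys


def run_child(argv):
    """Fork, exec argv in the child, and return the child's exit code to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Replace this process image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit immediately without parent cleanup.
            os._exit(127)
    # Parent: fork() returned the child's PID. Block until the child finishes.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


exit_code = run_child([sys.executable, "-c", "raise SystemExit(7)"])
```

The branch on `pid == 0` is the whole trick: both processes continue from the same line after `fork()`, and only the return value tells them apart.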
358+
359+
---
360+
262361
#### Cracking Coding Interviews
263362
<a id="cracking-coding-interviews-2"></a>
264363
🔹[ref](https://x.com/systemdesign42/status/1776590986837160427)
@@ -486,6 +585,14 @@ public static void consume(List<? super Shape> shapes) {
486585

487586
---
488587

588+
#### Idempotent, Backfill
589+
<a id="idempotent-backfill-2"></a>
590+
591+
🔹 **Idempotent**: An operation that produces the same result regardless of how many times it is applied. For example, a database upsert or an HTTP PUT request. Critical for safe retries in distributed systems.
592+
🔹 **Backfill**: The process of reprocessing or reloading historical data into a system, often used in data pipelines to populate missing or updated records retroactively.
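
The two terms often appear together: a backfill is only safe to rerun if each load is idempotent. Below is a minimal sketch using an in-memory SQLite database; the `daily_totals` table, the `load_day` and `backfill` helpers, and the sample data are all hypothetical.

```python
import sqlite3


def load_day(conn, day, rows):
    """Idempotent load: rows are keyed by (day, user_id), so a rerun overwrites instead of duplicating."""
    conn.executemany(
        "INSERT OR REPLACE INTO daily_totals (day, user_id, total) VALUES (?, ?, ?)",
        [(day, user_id, total) for user_id, total in rows],
    )
    conn.commit()


def backfill(conn, source, days):
    """Reprocess historical days from a source mapping of day -> rows."""
    for day in days:
        load_day(conn, day, source[day])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_totals ("
    "day TEXT, user_id INTEGER, total REAL, PRIMARY KEY (day, user_id))"
)
source = {
    "2024-01-01": [(1, 10.0), (2, 5.0)],
    "2024-01-02": [(1, 7.5)],
}
backfill(conn, source, sorted(source))
backfill(conn, source, sorted(source))  # retry after a suspected failure: same final state
row_count = conn.execute("SELECT COUNT(*) FROM daily_totals").fetchone()[0]
```

Because the primary key makes each write an upsert, running the backfill twice leaves the table in the same state as running it once.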
593+
594+
---
595+
489596
#### JIT vs AOT
490597
<a id="jit-vs-aot-2"></a>
491598
🔹[JIT vs AOT](https://stackoverflow.com/questions/32653951/when-does-ahead-of-time-aot-compilation-happen): **JIT** and **AOT** are two types of compilers that differ in when they convert a program from one language to another, either at run-time or build-time.
@@ -545,6 +652,105 @@ Memory consistency model: [A Primer on Memory Consistency and Cache Coherence](h
545652

546653
---
547654

655+
#### Network Design 101
656+
<a id="network-design-101-2"></a>
657+
658+
###### Routing Protocols
659+
1. Internal routing
660+
RIPv2 – Distance-vector, small networks
661+
OSPF – Link-state, fast internal routing
662+
EIGRP – Cisco hybrid, efficient IGP
663+
2. External routing
664+
BGP – Inter-domain routing, policy-based
665+
666+
###### Network Functions / Devices
667+
NAT – Private ↔ public IP translation
668+
PAT (NAPT) – Many private IPs → one public IP
669+
L2 Switch – MAC-based forwarding
670+
L3 Switch – Routing + switching combined
671+
VLAN – Logical network segmentation
672+
ICMP – Network error & reachability checks
673+
SNMP – Device monitoring & alerts
674+
ARP – IP → MAC address resolution
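
To make the PAT entry concrete, here is a toy translation table in which many private `(IP, port)` pairs share one public IP and are told apart by the public port. The class name `PatTable`, the starting port, and the addresses (taken from documentation and private ranges) are assumptions for illustration; real devices also track protocol, timeouts, and remote endpoints.

```python
class PatTable:
    """Hypothetical PAT table: many private (ip, port) pairs behind one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet, reusing an existing mapping if any."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Reverse lookup for return traffic; None means no mapping, so the packet is dropped."""
        return self.inbound.get(public_port)


pat = PatTable("203.0.113.5")
first = pat.translate_out("192.168.10.11", 51000)
second = pat.translate_out("192.168.10.12", 51000)
reply_target = pat.translate_in(second[1])
```

Two hosts using the same private source port still get distinct public ports, which is what lets the reverse lookup deliver replies to the right host.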
675+
676+
###### VLAN vs VNET vs VPC
677+
In classic networking, a VLAN segments traffic at Layer 2 inside one physical network, while a virtual network (referred to here as VNet) is defined by Layer 3 subnetting and routing. In public cloud, VPC (AWS/GCP) and VNet (Azure) are tenant-scoped boundaries for networking, services, and security. These constructs operate at different abstraction levels and should not be treated as interchangeable.
678+
679+
###### E2E Network flow
680+
```mermaid
681+
sequenceDiagram
682+
autonumber
683+
684+
participant Host as Client Endpoint<br/>(PC / Laptop, VLAN 10)
685+
participant L2 as Access Switch<br/>(L2 Switch)
686+
participant L3 as Distribution Switch<br/>(L3 Switch / Router)
687+
participant Core as Core Router<br/>(Core Routing)
688+
participant Edge as Edge Firewall / Router<br/>(NAT / PAT)
689+
participant ISP as ISP Router<br/>(Internet Gateway)
690+
participant Remote as External Server<br/>(Public Service)
691+
692+
%% --- Participant Notes (Layman) ---
693+
Note over Host: User device that sends and receives data
694+
Note over L2: Connects devices and forwards frames by MAC
695+
Note over L3: Routes traffic between local IP networks
696+
Note over Core: High-speed backbone for internal traffic
697+
Note over Edge: Internet exit that translates addresses
698+
Note over ISP: Provider router carrying Internet traffic
699+
Note over Remote: Remote system providing the service
700+
701+
%% --- Design Rationale ---
702+
Note over Host,L2: L2 access retained for endpoint scale,<br/>VLAN segmentation, and broadcast control
703+
704+
%% --- L2 / VLAN / ARP ---
705+
Host->>L2: Ethernet Frame (VLAN 10)
706+
Note right of L2: 802.1Q VLAN tagging
707+
708+
Host->>L2: ARP Request (Who is default gateway?)
709+
L2->>L3: Forward ARP request (VLAN 10)
710+
711+
%% --- SVI ---
712+
Note right of L3: SVI (Vlan10)<br/>Virtual L3 interface<br/>Default gateway for VLAN 10
713+
714+
L3->>L2: ARP Reply (SVI MAC)
715+
L2->>Host: ARP Reply delivered
716+
717+
%% --- L3 Routing ---
718+
Host->>L2: IP Packet to default gateway
719+
L2->>L3: Frame forwarded to SVI
720+
Note right of L3: Inter-VLAN routing via SVI
721+
722+
%% --- Internal Routing ---
723+
Note over L3,Core: IGP (OSPF / EIGRP / RIP)<br/>Fast routing inside one network
724+
L3->>Core: Forward packet (best internal path)
725+
726+
%% --- IGP vs BGP Explanation ---
727+
Note over Core,Edge: IGP = internal path selection<br/>BGP = external path & policy control
728+
729+
%% --- Edge / NAT ---
730+
Core->>Edge: Forward to perimeter
731+
Edge->>Edge: NAT / PAT translation
732+
Note right of Edge: Private IP → Public IP
733+
734+
%% --- External Routing ---
735+
Note over Edge,ISP: BGP (External Routing)<br/>Policy-based Internet path selection
736+
Edge->>ISP: Forward packet
737+
ISP->>Remote: Deliver packet
738+
739+
%% --- Return Traffic ---
740+
Remote->>ISP: Response
741+
ISP->>Edge: Return packet
742+
Edge->>Edge: Reverse NAT
743+
Edge->>Core: Forward
744+
Core->>L3: Forward
745+
L3->>L2: Frame to VLAN 10
746+
L2->>Host: Packet delivered
747+
748+
%% --- Monitoring ---
749+
Note over L3,Edge: SNMP monitoring (health & counters)
750+
```
751+
752+
---
753+
548754
#### OLAP vs OLTP
549755
<a id="olap-vs-oltp-2"></a>
550756
🔹**OLAP**: Used for complex data analysis and business reporting, such as financial analysis and sales forecasting.
@@ -634,6 +840,23 @@ Push & Pull model in Azure
634840

635841
---
636842

843+
#### Scaling a system 101
844+
<a id="scaling-a-system-101-2"></a>
845+
846+
[ref](https://blog.algomaster.io/p/scaling-a-system-from-0-to-10-million-users)
847+
848+
| Stage | Users | Strategic Focus | Architecture | Primary Bottleneck | Key Techniques | Core Takeaway |
849+
|-------|-------|-----------------|--------------|-------------------|----------------|---------------|
850+
| **1 – Single Server** | 0 – 100 | Ship fast | Everything on one VM | Dev speed, no load yet | Monolith, single VM + DB (e.g. $20–50/mo VPS), reverse proxy (Nginx) | Optimize for iteration speed, not scalability. Don't over-engineer. |
851+
| **2 – Separate DB** | 100 – 1K | Stabilize | App server + dedicated DB | App & DB compete for same CPU/memory | Move DB to its own server (managed: RDS/Supabase), connection pooling (PgBouncer) | Isolate DB resource contention; use managed services to save ops time. |
852+
| **3 – Load Balancer + Horizontal Scale** | 1K – 10K | Handle burst | Stateless app tier behind LB | Single app server is a SPOF | Add load balancer, 2+ stateless app servers, shared session store (Redis), auto-scaling group | Make app tier stateless so any server can handle any request. |
853+
| **4 – Caching + CDN** | 10K – 100K | Protect DB | Read-heavy architecture | DB read saturation | CDN for static assets, cache-aside with Redis/Memcached, read replicas, DB query optimization | 80–90%+ of reads can be served from cache; CDN removes static load entirely. |
854+
| **5 – Async + Queues** | 100K – 1M | Automate | Stateless + event-driven | Traffic spikes, slow write paths | Message queues (SQS/RabbitMQ), async workers (Celery/Sidekiq), rate limiting, auto-scaling policies | Decouple slow/heavy work from the request path; absorb traffic spikes via queues. |
855+
| **6 – Microservices + CQRS** | 1M – 10M | Reliability | Service-oriented + CQRS | Monolith deployment risk, DB write contention | Break into microservices, CQRS (separate read/write models), event sourcing, per-service DBs | Enables independent deployments and scaling; adds operational complexity. |
856+
| **7 – Multi-region + Sharding** | 10M+ | Global resilience | Distributed global systems | Latency, cross-region reliability, single-DB limits | Multi-region deployment (active-active/active-passive), DB sharding, global CDN, data locality policies | Shift focus from performance → reliability & user-perceived latency by geography. |
857+
858+
---
859+
637860
#### Security Words 101
638861
<a id="security-words-101-2"></a>
639862

@@ -646,6 +869,7 @@ Push & Pull model in Azure
646869
🔹 **IAM**: Identity and Access Management
647870
🔹 **SSO**: Single Sign-On
648871
🔹 **MFA**: Multi-Factor Authentication
872+
🔹 **SSPR**: Self-Service Password Reset
649873

650874
- **Threat Detection and Response**
651875
🔹 **ATA**: Advanced Threat Analytics
@@ -682,6 +906,8 @@ Push & Pull model in Azure
682906
🔹 **SCOM/ACS**: System Center Operations Manager / Audit Collection Services
683907
🔹 **GRC**: Governance, Risk, and Compliance
684908
🔹 **SOC**: Security Operations Center
909+
🔹 **CSPM**: Cloud Security Posture Management
910+
🔹 **CIEM**: Cloud Infrastructure Entitlement Management
685911

686912
---
687913

@@ -697,6 +923,13 @@ Push & Pull model in Azure
697923
<a id="slowly-changing-dimensions-scd-2"></a>
698924
**Slowly Changing Dimensions** change over time, but at a slow pace and unpredictably. For example, a customer's address in a retail business.
699925

926+
| Type | Strategy | Description | Trade-off |
927+
| ---- | -------- | ----------- | --------- |
928+
| **SCD Type 0** | Retain original | Dimension values never change; original value is always preserved. | No history; ignores real-world changes. |
929+
| **SCD Type 1** | Overwrite | Old value is replaced with the new value; no history kept. | Simple to implement; history is lost. |
930+
| **SCD Type 2** | Add new row | A new row is inserted for each change; old row is marked inactive (with `start_date` / `end_date` or `is_current` flag). | Full history preserved; table can grow large. |
931+
| **SCD Type 3** | Add new column | A new column stores the previous value alongside the current value. | Limited history (only one prior value). |
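
As an illustration of SCD Type 2, the sketch below keeps history by closing the current row and appending a new one. It uses plain Python dicts in place of a dimension table; the function name `apply_scd2`, the column names beyond those in the table above, and the sample addresses are assumptions.

```python
from datetime import date


def apply_scd2(rows, customer_id, new_address, change_date):
    """Close the current row for customer_id (if the value changed) and append a new current row."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["address"] == new_address:
                return rows  # nothing changed, so no new version is recorded
            row["end_date"] = change_date
            row["is_current"] = False
    rows.append({
        "customer_id": customer_id,
        "address": new_address,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return rows


dimension = []
apply_scd2(dimension, 1, "12 Oak St", date(2023, 1, 1))
apply_scd2(dimension, 1, "9 Elm Ave", date(2024, 6, 1))
apply_scd2(dimension, 1, "9 Elm Ave", date(2024, 7, 1))  # same value: no new row
```

After these calls the table holds two versions for the customer: the closed original row and one current row, which is the "full history, growing table" trade-off noted above.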
932+
700933
---
701934

702935
#### Software Defined Networking (SDN)
@@ -894,3 +1127,20 @@ graph TD
8941127
```
8951128

8961129
**[`^ back to top ^`](#index)**
1130+
1131+
---
1132+
1133+
#### Zanzibar
1134+
<a id="zanzibar-2"></a>
1135+
1136+
**Zanzibar** is Google's global authorization system (published 2019) that underpins access control for Google Drive, YouTube, Maps, and other services.
1137+
1138+
🔹 **Tuple-based approach**: Permissions are stored as relationship tuples `(object#relation@user)`, e.g., `doc:readme#owner@user:alice`. This makes relationships explicit and queryable.
1139+
🔹 **Zookie**: A consistency token returned on each write. Clients pass it back on subsequent reads to guarantee "read-your-writes" consistency without requiring full global linearizability on every read.
1140+
🔹 **Configuration language**: A schema DSL defines object types, relations, and permission inheritance rules (e.g., "every editor is also a viewer"), making access policies auditable and reusable.
1141+
🔹 **Leopard**: An indexing subsystem inside Zanzibar that pre-computes and caches transitive group membership, optimizing large fan-out permission checks.
1142+
🔹 **Spanner**: Zanzibar uses Google Spanner as its underlying storage, providing globally distributed, externally consistent transactions via TrueTime.
1143+
🔹 **External consistency**: Reads and writes are globally ordered using Spanner's TrueTime API, ensuring no stale permission grants across distributed replicas.
1144+
🔹 **Open-source adoptions**: OpenFGA (CNCF), SpiceDB (Authzed), and Ory Keto are popular open-source implementations inspired by Zanzibar.
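
A toy permission check over relationship tuples may help make the tuple format and inheritance rules concrete. This is a simplified sketch, not Zanzibar's API or configuration language: `parse_tuple`, `check`, the `INCLUDES` schema, and the sample tuples are hypothetical, and consistency tokens, caching, and indexing are omitted entirely.

```python
def parse_tuple(text):
    """Split 'object#relation@subject' into its three parts."""
    object_and_relation, subject = text.split("@", 1)
    obj, relation = object_and_relation.split("#", 1)
    return obj, relation, subject


# Hypothetical schema: a relation also includes everyone holding the listed relations.
INCLUDES = {"viewer": ["editor"], "editor": ["owner"], "owner": [], "member": []}


def check(tuples, obj, relation, user, _seen=None):
    """Return True if user holds relation on obj, directly, via a group, or via inheritance."""
    seen = _seen if _seen is not None else set()
    if (obj, relation) in seen:
        return False  # already explored; also guards against cycles
    seen.add((obj, relation))
    for t_obj, t_rel, subject in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if subject == user:
            return True
        if "#" in subject:  # userset subject such as group:eng#member
            s_obj, s_rel = subject.split("#", 1)
            if check(tuples, s_obj, s_rel, user, seen):
                return True
    return any(
        check(tuples, obj, parent, user, seen)
        for parent in INCLUDES.get(relation, [])
    )


tuples = [parse_tuple(t) for t in (
    "doc:readme#owner@user:alice",
    "doc:readme#viewer@group:eng#member",
    "group:eng#member@user:bob",
)]
alice_can_view = check(tuples, "doc:readme", "viewer", "user:alice")
bob_can_view = check(tuples, "doc:readme", "viewer", "user:bob")
bob_can_edit = check(tuples, "doc:readme", "editor", "user:bob")
```

Alice can view because owners are included in editors and editors in viewers; Bob can view only through group membership and gains no edit rights.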
1145+
1146+
**[`^ back to top ^`](#index)**
