Networking Fundamentals

Every distributed system communicates over a network. Understanding DNS, TCP/IP, HTTP, and related protocols is non-negotiable. This chapter covers the networking concepts you need to make sound architectural decisions.

The OSI Model (Simplified)

The Open Systems Interconnection (OSI) model breaks network communication into seven layers. For system design, four layers matter most:

| Layer | Name | Key Protocols | What It Does |
| --- | --- | --- | --- |
| 7 | Application | HTTP, HTTPS, WebSocket, gRPC, DNS | The interface your application code interacts with directly. |
| 4 | Transport | TCP, UDP | Reliable (TCP) or fast unreliable (UDP) delivery of data between hosts. |
| 3 | Network | IP (IPv4, IPv6) | Routing packets between networks using IP addresses. |
| 2 | Data Link | Ethernet, Wi-Fi | Transmitting frames between devices on the same local network. |

IP Addresses and Ports

An IP address uniquely identifies a machine on a network. IPv4 addresses are 32-bit (e.g., 192.168.1.1). IPv6 addresses are 128-bit, solving the address exhaustion problem.
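The difference in address width can be checked with Python's standard ipaddress module. This is a minimal sketch; the IPv6 literal is an illustrative address from the documentation prefix, not one taken from this chapter.

```python
import ipaddress

# Parse one address of each family; both literals are illustrative only.
v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:db8::1")  # documentation prefix, not a real host

# max_prefixlen is the address width in bits.
print(v4.version, v4.max_prefixlen)  # 4 32
print(v6.version, v6.max_prefixlen)  # 6 128

# An IPv4 address is a 32-bit integer written in dotted-decimal form.
print(int(v4))                       # 3232235777
print(2 ** 32, 2 ** 128)             # size of each address space
```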

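A port identifies a specific process or service on a machine. Ports 0-1023 are well-known (80 = HTTP, 443 = HTTPS, 22 = SSH). Ports 1024-49151 are registered for specific services (5432 = PostgreSQL), and ports 49152-65535 are dynamic (ephemeral) ports intended for short-lived, client-side use.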

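A socket address is the combination of IP address + port that identifies one endpoint of a connection (e.g., 93.184.216.34:443). A TCP connection as a whole is identified by its pair of endpoints: source IP and port plus destination IP and port.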
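To see endpoints in practice, the following sketch opens a TCP connection over the loopback interface and prints the IP address and port on each side. It assumes only the standard socket module. Binding to port 0 is a convenience that lets the operating system pick a free port.

```python
import socket

# Listen on the loopback interface; port 0 asks the OS to pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_endpoint = server.getsockname()

# The kernel completes the connection against the listen backlog, so
# accept() can follow connect() in the same thread without deadlocking.
client = socket.create_connection(server_endpoint)
conn, peer_endpoint = server.accept()

client_endpoint = client.getsockname()
print("server endpoint:", server_endpoint)
print("client endpoint:", client_endpoint)    # OS-assigned client port
print("peer seen by server:", peer_endpoint)  # same as the client endpoint

for s in (conn, client, server):
    s.close()
```

The server sees the client's endpoint as the peer address, and the two endpoints together identify this one connection.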

TCP: Transmission Control Protocol

TCP provides reliable, ordered, error-checked delivery of data. It underlies HTTP/1.1 and HTTP/2 (including HTTPS), SSH, and most other application-level protocols.

The Three-Way Handshake

TCP Three-Way Handshake

  1. Client -> Server: SYN (seq=x)
  2. Server -> Client: SYN-ACK (seq=y, ack=x+1)
  3. Client -> Server: ACK (ack=y+1)

Connection established.
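The sequence and acknowledgment arithmetic in the handshake can be sketched as a toy simulation. It is not a real TCP implementation. The function name and tuple layout are hypothetical, and the only behavior modeled is that each side acknowledges the other's initial sequence number plus one, modulo 2^32.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments as (sender, flags, fields) tuples."""
    return [
        ("client", "SYN", {"seq": x}),
        ("server", "SYN-ACK", {"seq": y, "ack": (x + 1) % SEQ_SPACE}),
        ("client", "ACK", {"ack": (y + 1) % SEQ_SPACE}),
    ]

# Each side picks its own initial sequence number; random values stand in here.
x = random.getrandbits(32)
y = random.getrandbits(32)
for sender, flags, fields in three_way_handshake(x, y):
    print(f"{sender:>6} sends {flags:<7} {fields}")
```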

Key TCP Concepts

  • Reliability: TCP retransmits lost packets. The receiver acknowledges each segment; if an ACK is not received in time, the sender resends.
  • Ordering: Each byte is numbered with a sequence number. The receiver reassembles data in the correct order regardless of arrival order.
  • Flow Control: The receiver advertises a window size indicating how much data it can accept. The sender never exceeds this window.
  • Congestion Control: TCP monitors the network for signs of congestion (packet loss, increased round-trip time) and reduces sending rate accordingly. Algorithms like slow start, congestion avoidance, and fast retransmit govern this behavior.
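As a rough illustration of congestion control, the sketch below traces a congestion window through slow start, congestion avoidance, and a loss event. It is a deliberately simplified model: the window is counted in whole segments, every loss halves it, and the function and parameter names are hypothetical. Real TCP stacks are considerably more involved.

```python
def cwnd_trace(rounds: int, ssthresh: int, loss_rounds: set) -> list:
    """Toy congestion window, in segments, at the start of each round trip."""
    cwnd = 1
    trace = []
    for rtt in range(rounds):
        trace.append(cwnd)
        if rtt in loss_rounds:
            # Loss signals congestion: halve the threshold and the window.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: double each round trip, capped at the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: add one segment per round trip.
            cwnd += 1
    return trace

print(cwnd_trace(rounds=10, ssthresh=8, loss_rounds={6}))
# [1, 2, 4, 8, 9, 10, 11, 5, 6, 7]
```

The trace grows exponentially up to the threshold, then linearly, then drops sharply when a loss is detected in round 6.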

UDP: User Datagram Protocol

UDP is a connectionless, unreliable protocol. It sends datagrams without establishing a connection, without acknowledgments, and without ordering guarantees.

Why use it? Speed. No handshake overhead, no retransmission delay. Applications that can tolerate some data loss but need low latency prefer UDP:

  • Live video and voice calls (VoIP)
  • Online gaming
  • DNS queries (small, quick lookups)
  • IoT sensor data
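The absence of a handshake is visible in code. In this minimal sketch over the loopback interface, the sender transmits a datagram without ever calling connect(). The payload is a made-up sensor reading, and the timeout is there because UDP itself gives no signal when a datagram is lost.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
receiver.settimeout(2.0)          # no delivery guarantee, so never wait forever

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No connect() and no handshake: the first packet sent is the data itself.
sender.sendto(b"temp=21.5", receiver.getsockname())

payload, source = receiver.recvfrom(1024)
print(payload, "from", source)

sender.close()
receiver.close()
```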

TCP

  • Connection-oriented (handshake)
  • Reliable delivery with retransmissions
  • Ordered data stream
  • Flow and congestion control
  • Higher latency, higher overhead

UDP

  • Connectionless (no handshake)
  • Unreliable: no delivery guarantee
  • No ordering guarantee
  • No built-in flow control
  • Lower latency, lower overhead

DNS: Domain Name System

DNS translates human-readable domain names into IP addresses. It is a hierarchical, distributed database.

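DNS Resolution Flow

  1. Browser asks the Recursive Resolver.
  2. Recursive Resolver asks Root DNS (.), which refers it to the TLD servers.
  3. Recursive Resolver asks TLD DNS (.com, .org), which refers it to the authoritative server.
  4. Recursive Resolver asks the Authoritative server (example.com), which returns the IP address.
  5. Recursive Resolver returns IP Address 93.184.216.34 to the browser.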
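The resolution flow can be mimicked with in-memory dictionaries standing in for the root, TLD, and authoritative servers. This is only a sketch of the delegation walk: the server names (tld-com, ns-example) and the data layout are hypothetical, and a real resolver exchanges DNS messages over the network.

```python
# Hypothetical in-memory zone data; real servers answer over the network.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"example.com": "93.184.216.34"}}

def resolve(name: str):
    """Walk root -> TLD -> authoritative, as a recursive resolver would."""
    path = ["root"]
    tld_server = ROOT[name.rsplit(".", 1)[-1]]   # root refers us to the TLD servers
    path.append(tld_server)
    auth_server = TLD[tld_server][name]          # TLD refers us to the authoritative server
    path.append(auth_server)
    address = AUTHORITATIVE[auth_server][name]   # authoritative server returns the A record
    return address, path

address, path = resolve("example.com")
print(address, path)
# 93.184.216.34 ['root', 'tld-com', 'ns-example']
```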

DNS Record Types

| Type | Purpose | Example |
| --- | --- | --- |
| A | Maps domain to IPv4 address | example.com -> 93.184.216.34 |
| AAAA | Maps domain to IPv6 address | example.com -> 2606:2800:220:1:... |
| CNAME | Alias one domain to another | www.example.com -> example.com |
| MX | Mail exchange servers | example.com -> mail.example.com |
| NS | Authoritative name servers for domain | example.com -> ns1.example.com |
| TXT | Arbitrary text (SPF, verification) | example.com -> "v=spf1 ..." |
| SRV | Service location (host + port) | _sip._tcp.example.com -> ... |

DNS Caching

DNS responses include a TTL (Time To Live) value that tells resolvers how long to cache the result. Lower TTL means faster propagation of changes but more DNS lookups. Higher TTL reduces lookup latency but slows down changes. Common TTLs range from 60 seconds to 24 hours.
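A TTL-based cache is small enough to sketch directly. The class below is a hypothetical illustration, not how any particular resolver is implemented. It stores each answer with an expiry time and treats expired entries as misses. The clock is injectable so the example runs without sleeping.

```python
import time

class DnsCache:
    """Caches answers until their TTL expires; `clock` is injectable for testing."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must look it up again
            return None
        return address

now = [0.0]  # fake clock so the example runs instantly
cache = DnsCache(clock=lambda: now[0])
cache.put("example.com", "93.184.216.34", ttl_seconds=60)
now[0] = 59
print(cache.get("example.com"))  # 93.184.216.34 (still cached)
now[0] = 60
print(cache.get("example.com"))  # None (TTL expired)
```

With a 60-second TTL, a changed record is picked up within a minute, at the cost of one fresh lookup per minute for each cached name.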

System Design Relevance
DNS is your first line of traffic routing. With DNS-based load balancing, you return different IP addresses to distribute traffic across data centers. Services like Route 53 (AWS) support weighted routing, latency-based routing, and health-checked failover, all at the DNS layer.
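Weighted routing with health checks can be sketched as a weighted random choice over the healthy endpoints. This illustrates the idea only and does not reflect how Route 53 or any other service is implemented. The pool, weights, and addresses (taken from documentation ranges) are all made up.

```python
import random

# Hypothetical pool: address -> (weight, healthy).
POOL = {
    "192.0.2.10": (70, True),
    "198.51.100.20": (30, True),
    "203.0.113.30": (50, False),  # failed its health check
}

def pick_address(pool, rng=random):
    """Return one healthy address, chosen in proportion to its weight."""
    healthy = {ip: weight for ip, (weight, ok) in pool.items() if ok}
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    ips = list(healthy)
    return rng.choices(ips, weights=[healthy[ip] for ip in ips], k=1)[0]

rng = random.Random(0)  # seeded so runs are repeatable
answers = [pick_address(POOL, rng) for _ in range(10_000)]
print({ip: answers.count(ip) for ip in POOL})
```

Over many queries, roughly 70% of answers point at the first address and 30% at the second, while the unhealthy endpoint is never returned.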

HTTP and HTTPS

HTTP (Hypertext Transfer Protocol) is the application-layer protocol that powers the web. HTTPS is HTTP over TLS: the same protocol with encryption.

HTTP/1.1 vs. HTTP/2 vs. HTTP/3

| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
| --- | --- | --- | --- |
| Transport | TCP | TCP | QUIC (over UDP) |
| Multiplexing | No (one request per connection at a time) | Yes (multiple streams per connection) | Yes (with no head-of-line blocking) |
| Header Compression | No | HPACK | QPACK |
| Server Push | No | Yes | Yes |
| Connection Setup | TCP + TLS (2-3 round trips) | TCP + TLS (2-3 round trips) | 0-RTT or 1-RTT |
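The connection-setup row translates directly into latency. Assuming a 50 ms round-trip time (an arbitrary figure chosen for illustration), this small calculation shows how long each option waits before the first request can be sent. The 0-RTT case applies only when resuming a connection to a server the client has contacted before.

```python
RTT_MS = 50  # assumed client-server round-trip time

# Round trips spent on connection setup, using the figures from the table.
setup_round_trips = {
    "HTTP/1.1 or HTTP/2, TCP + TLS (best case)": 2,
    "HTTP/1.1 or HTTP/2, TCP + TLS (worst case)": 3,
    "HTTP/3, new connection (1-RTT)": 1,
    "HTTP/3, resumed connection (0-RTT)": 0,
}

for label, round_trips in setup_round_trips.items():
    print(f"{label}: {round_trips * RTT_MS} ms")
```

On a 50 ms link, the worst-case TCP + TLS setup costs 150 ms before any application data flows, against 50 ms or less for HTTP/3.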

TLS / SSL

TLS (Transport Layer Security) encrypts data in transit. A TLS handshake establishes a shared secret between client and server:

  1. Client sends a "Client Hello" with supported cipher suites and a random number.
  2. Server responds with "Server Hello," chosen cipher suite, its certificate, and a random number.
  3. Client verifies the server's certificate against trusted Certificate Authorities (CAs).
  4. Both sides derive session keys from the exchanged information.
  5. All subsequent data is encrypted with these session keys.

In TLS 1.2 this exchange takes two round trips. TLS 1.3 reduced it to a single round trip (1-RTT) and eliminated insecure cipher suites.
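On the client side, most of the handshake is handled by the TLS library. What the application controls is the policy. The sketch below uses Python's standard ssl module to build a client context that verifies certificates and host names and refuses protocol versions older than TLS 1.2. The commented-out connection uses example.com purely as an illustrative host and needs network access.

```python
import ssl

# create_default_context() enables certificate and hostname verification
# against the system's trusted Certificate Authorities.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True

# To use it (needs network access; the host name is illustrative):
#   import socket
#   with socket.create_connection(("example.com", 443)) as raw:
#       with context.wrap_socket(raw, server_hostname="example.com") as tls:
#           print(tls.version())
```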

WebSockets

WebSockets upgrade an HTTP connection to a full-duplex communication channel. After the initial handshake:

  • Both client and server can send messages at any time.
  • Messages are framed (not streamed), so each message is a discrete unit.
  • The connection stays open until explicitly closed.
  • Overhead per message is minimal (2-14 bytes) compared to HTTP headers (hundreds of bytes).

Use WebSockets for chat applications, live sports scores, collaborative editing, multiplayer games, and any scenario requiring low-latency bidirectional communication.
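The 2-14 byte overhead figure follows from the WebSocket frame layout. Every frame has a 2-byte base header, larger payloads add an extended length field, and client-to-server frames add a 4-byte masking key. The helper below is a sketch of that arithmetic. Its name is hypothetical and it does not build actual frames.

```python
def frame_header_size(payload_len: int, masked: bool) -> int:
    """Header bytes for one WebSocket frame carrying payload_len bytes."""
    size = 2          # flags/opcode byte + mask bit and 7-bit length byte
    if payload_len > 0xFFFF:
        size += 8     # 64-bit extended payload length
    elif payload_len > 125:
        size += 2     # 16-bit extended payload length
    if masked:
        size += 4     # masking key, present on client-to-server frames
    return size

print(frame_header_size(10, masked=False))      # 2  (small server-to-client message)
print(frame_header_size(10, masked=True))       # 6  (small client-to-server message)
print(frame_header_size(70_000, masked=True))   # 14 (largest possible header)
```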

gRPC

gRPC is a high-performance RPC framework built on HTTP/2 and Protocol Buffers (protobuf). Key properties:

  • Binary serialization (protobuf) typically produces smaller payloads than JSON and is faster to encode and parse.
  • Supports streaming: unary, server-streaming, client-streaming, and bidirectional streaming.
  • Strongly typed contracts defined in .proto files.
  • Code generation for multiple languages.
  • Commonly used for internal service-to-service communication in microservice architectures.
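To give a feel for why the binary encoding is compact, the sketch below hand-encodes a single integer field in the protobuf wire format and compares it with the equivalent JSON. It assumes a hypothetical message whose field number 1 is an integer called id. It handles only non-negative integers and is not a substitute for the generated protobuf code.

```python
import json

def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_varint_field(field_number: int, value: int) -> bytes:
    key = (field_number << 3) | 0    # wire type 0 = varint
    return encode_varint(key) + encode_varint(value)

binary = encode_varint_field(1, 150)          # hypothetical field 1 ("id") = 150
text = json.dumps({"id": 150}).encode()
print(binary.hex(), len(binary))  # 089601 3
print(text, len(text))            # b'{"id": 150}' 11
```

The field name never appears on the wire. Only the field number does, which is why both sides need the shared .proto contract.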

Key Takeaways

  • TCP provides reliability at the cost of latency; UDP provides speed at the cost of reliability. Most web traffic uses TCP.
  • DNS is a distributed, hierarchical system. It is cacheable and can be used for load balancing and failover.
  • HTTP/2 and HTTP/3 solve many performance problems of HTTP/1.1 through multiplexing and reduced connection overhead.
  • TLS encrypts data in transit. Always use HTTPS in production.
  • Choose WebSockets for real-time bidirectional data; use Server-Sent Events (SSE) when only server-to-client push is needed.
  • gRPC is a strong choice for inter-service communication where performance and type safety matter.
