
Networking: Absolute Basics for DevOps

OSI, TCP/IP, DNS, HTTP, HTTPS, load balancing, and the debugging tools you'll use daily


Why Networking Matters

Most bugs in a distributed system come down to either the code or the network. “It works on my machine” often points to a networking or environment difference rather than a code change. You don’t need to be a network engineer, but you do need to know how data travels and where it can fail.


OSI vs TCP/IP (Practical Mapping)

Forget memorizing 7 layers; know what each layer means operationally.

| OSI Layer | TCP/IP Layer | What it is | Examples |
| --- | --- | --- | --- |
| 7 Application | Application | Your app’s protocol | HTTP, DNS, SSH, SMTP |
| 6 Presentation | Application | Encoding, encryption | TLS, JSON serialization |
| 5 Session | Application | Connection management | TLS sessions |
| 4 Transport | Transport | End-to-end delivery | TCP, UDP |
| 3 Network | Internet | Routing between networks | IP, ICMP |
| 2 Data Link | Network Access | Delivery on one network | Ethernet, WiFi, ARP |
| 1 Physical | Network Access | Cables, signals | Fiber, copper, radio |

DevOps practical view:

  • L4 = ports and connections (TCP/UDP): ss, netstat, firewalls
  • L7 = application protocols: curl, dig, app logs

IP Addressing & CIDR

IP Address: 192.168.1.100
Subnet Mask: 255.255.255.0 = /24
Network: 192.168.1.0
Broadcast: 192.168.1.255
Hosts: 192.168.1.1 - 192.168.1.254 (254 usable)
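
The figures above can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch that simply re-derives the same example; the variable names are illustrative.

```python
import ipaddress

# The interface address from the example above, written with its prefix length.
iface = ipaddress.ip_interface("192.168.1.100/24")
net = iface.network

# Usable addresses: the network and broadcast addresses are excluded.
hosts = list(net.hosts())

print("Netmask:  ", net.netmask)
print("Network:  ", net.network_address)
print("Broadcast:", net.broadcast_address)
print("Hosts:    ", hosts[0], "-", hosts[-1], f"({len(hosts)} usable)")
```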

CIDR Notation

| CIDR | Hosts | Common Use |
| --- | --- | --- |
| /32 | 1 | Single host (security group rules) |
| /30 | 2 | Point-to-point links |
| /28 | 14 | Small subnet |
| /24 | 254 | Standard LAN, small VPC subnet |
| /16 | 65,534 | Large VPC CIDR block |
| /8 | 16,777,214 | RFC 1918 private range (10.0.0.0/8) |
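
The host counts in the table follow from the prefix length: a subnet holds 2^(32 - prefix) addresses, and all but the smallest subnets give up two of them (network and broadcast). A short sketch of that arithmetic, where `usable_hosts` is a hypothetical helper name:

```python
def usable_hosts(prefix: int) -> int:
    """Usable IPv4 host addresses in a subnet with the given prefix length."""
    if not 0 <= prefix <= 32:
        raise ValueError("prefix must be between 0 and 32")
    total = 2 ** (32 - prefix)
    # /32 is a single host and /31 is a two-address point-to-point link (RFC 3021);
    # every other subnet loses its network and broadcast addresses.
    return total if prefix >= 31 else total - 2


for prefix in (32, 30, 28, 24, 16, 8):
    print(f"/{prefix}: {usable_hosts(prefix):,}")
```

Cloud providers often reserve a few extra addresses per subnet, so the real number available in a VPC subnet can be slightly lower than this formula gives.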

Private (RFC 1918) Ranges

10.0.0.0/8 (10.x.x.x)
172.16.0.0/12 (172.16.x.x - 172.31.x.x)
192.168.0.0/16 (192.168.x.x)
# Quick CIDR calculations
ipcalc 192.168.1.0/24 # detailed breakdown
python3 -c "import ipaddress; print(list(ipaddress.IPv4Network('10.0.1.0/28')))"
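
To check in a script whether an address falls inside one of the three RFC 1918 ranges listed above, a membership test is enough. This is a sketch with a hypothetical helper name, `is_rfc1918`. It tests the three ranges explicitly because the `is_private` attribute in `ipaddress` also covers other reserved ranges such as loopback.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr lies inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


for candidate in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
    print(candidate, is_rfc1918(candidate))
```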

Ports & Sockets

A socket is an IP address + port + protocol. It uniquely identifies a network endpoint.
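
A small loopback experiment makes the endpoint idea concrete. The sketch below opens a listener on an OS-assigned port, connects to it, and prints both ends of the connection. It assumes nothing beyond the standard `socket` module and a working loopback interface.

```python
import socket

# Listen on an ephemeral loopback port (port 0 lets the OS pick a free one).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_endpoint = server.getsockname()   # (ip, port) the server listens on

client = socket.create_connection(server_endpoint, timeout=2)
conn, peer = server.accept()             # peer is the client's (ip, port)
client_local = client.getsockname()

# Each side's local endpoint is the other side's remote endpoint.
print("client:", client_local, "->", client.getpeername())
print("server:", conn.getsockname(), "->", conn.getpeername())

for s in (client, conn, server):
    s.close()
```

The client never chose its own port: the OS assigned an ephemeral one, which is why a connection is identified by both endpoints together rather than by the server port alone.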

Well-known ports:

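| Port | Protocol | Service |
| --- | --- | --- |
| 22 | TCP | SSH |
| 25 | TCP | SMTP |
| 53 | TCP/UDP | DNS |
| 80 | TCP | HTTP |
| 443 | TCP | HTTPS |
| 3306 | TCP | MySQL |
| 5432 | TCP | PostgreSQL |
| 6379 | TCP | Redis |
| 8080 | TCP | Alt HTTP / app servers |
| 27017 | TCP | MongoDB |
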
# What's listening on which ports?
ss -tlnp # TCP, listening, numeric, with processes
ss -ulnp # UDP equivalent
# Who is using port 8080?
ss -tlnp 'sport = :8080'
lsof -i :8080
# Check if remote port is reachable
nc -zv 10.0.0.5 5432 # TCP
nc -zuv 10.0.0.5 53 # UDP (no handshake, so success only means no ICMP error came back)
timeout 3 bash -c "echo > /dev/tcp/10.0.0.5/80" && echo "open"

TCP Handshake & Common Failures

TCP requires a 3-way handshake before any data flows:

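Client                  Server
  |── SYN ───────────→ |   "I want to connect"
  |← SYN-ACK ───────── |   "OK, I'm ready"
  |── ACK ───────────→ |   "Acknowledged, let's go"
  |                    |
  |←───── DATA ──────→ |   (now data can flow both ways)
  |                    |
  |── FIN ───────────→ |   "I'm done sending"
  |← ACK ───────────── |   "Got it"
  |← FIN ───────────── |   "I'm done too"
  |── ACK ───────────→ |   "Closed" (the side that sent the first FIN enters TIME_WAIT)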

TCP State Machine (the states you’ll see)

| State | Meaning |
| --- | --- |
| LISTEN | Server waiting for connections |
| ESTABLISHED | Active connection |
| TIME_WAIT | Waiting after close (2x MSL, ~60s) |
| CLOSE_WAIT | Remote closed, local hasn’t yet |
| SYN_SENT | Client sent SYN, waiting for SYN-ACK |
| SYN_RECV | Server received SYN, sent SYN-ACK |

# See connection states
ss -tn state established
ss -tn state time-wait | wc -l # too many = high traffic or slow close
ss -tn state close-wait # app not closing connections = socket (file descriptor) leak

Common TCP Failures

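| Symptom | Likely Cause |
| --- | --- |
| Connection refused | Nothing listening on that port, or a firewall rejecting with RST |
| Connection timed out | Firewall dropping packets, or host unreachable |
| SYN_SENT stuck | SYN or SYN-ACK dropped on the path (firewall, security group, routing) |
| Many TIME_WAIT | Normal at high traffic, or connections not being reused (no keep-alive or pooling) |
| Many CLOSE_WAIT | Application not closing sockets |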
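
The first two rows are easy to tell apart in code, because the operating system reports them as different errors. The sketch below uses a hypothetical helper, `probe`, that attempts a TCP connect and names the outcome. The return strings are this example's own labels, not standard values.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connect and name the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"            # handshake completed
    except ConnectionRefusedError:
        return "refused"             # the host answered with RST
    except socket.timeout:
        return "timed out"           # no answer at all within the timeout
    except OSError as exc:
        return f"error: {exc}"       # e.g. no route to host, name lookup failure
```

For example, `probe("10.0.0.5", 5432)` returning "timed out" points toward a firewall or routing problem, while "refused" means the packet reached something that actively rejected it.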

DNS

Resolution Flow

Browser/App
│
├─ 1. Check local cache (/etc/hosts, OS cache)
│
├─ 2. Ask recursive resolver (usually your router or ISP)
│     └─ Resolver checks its cache
│
├─ 3. Resolver asks Root servers → "who handles .com?"
│
├─ 4. Resolver asks TLD servers → "who handles example.com?"
│
├─ 5. Resolver asks Authoritative NS → "what's www.example.com?"
│
└─ 6. Returns IP, caches result with TTL

# Check /etc/hosts (consulted before DNS under the usual nsswitch.conf order)
cat /etc/hosts
# Lookup order is configured in:
grep '^hosts' /etc/nsswitch.conf
# Which DNS server am I using?
cat /etc/resolv.conf
resolvectl status # systemd-resolved systems
# Query DNS directly (skips /etc/hosts; the resolver may still answer from its own cache)
dig www.example.com
nslookup www.example.com
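
Applications usually do not behave like `dig`: they ask the operating system, which follows the nsswitch.conf order and therefore honors /etc/hosts. When `dig` and your app disagree, it helps to resolve the name the way the app does. A minimal sketch using the standard library, where `resolve` is a hypothetical helper name:

```python
import socket


def resolve(name: str) -> list:
    """Resolve a name through the OS resolver, as most applications do."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


print(resolve("localhost"))
```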

DNS Record Types

| Type | Purpose | Example |
| --- | --- | --- |
| A | IPv4 address | example.com → 93.184.216.34 |
| AAAA | IPv6 address | example.com → 2606:2800:... |
| CNAME | Alias to another name | www → example.com |
| MX | Mail server | example.com → mail.example.com |
| TXT | Arbitrary text | SPF, DKIM, verification records |
| NS | Nameserver | Who handles this domain |
| PTR | Reverse DNS (IP → name) | 93.184.216.34 → example.com |
| SOA | Zone authority | Primary NS, contact, refresh |
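
A PTR lookup does not query the IP address itself. It queries a special name built by reversing the address under `in-addr.arpa` (or `ip6.arpa` for IPv6). The sketch below shows that name using the standard `ipaddress` module; `ptr_name` is a hypothetical helper.

```python
import ipaddress


def ptr_name(addr: str) -> str:
    """Name that a reverse (PTR) lookup queries for the given IP address."""
    return ipaddress.ip_address(addr).reverse_pointer


print(ptr_name("93.184.216.34"))   # 34.216.184.93.in-addr.arpa
```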

Caching & TTL

# Check TTL on a record
dig www.example.com | grep -A1 "ANSWER SECTION"
# ;; ANSWER SECTION:
# www.example.com. 300 IN A 93.184.216.34
#                  ^^^ TTL in seconds
# Query specific DNS server
dig @8.8.8.8 www.example.com
# Query all record types (many servers now refuse or minimize ANY queries, see RFC 8482)
dig example.com ANY
# Reverse lookup (IP β†’ hostname)
dig -x 93.184.216.34
# Trace full resolution path
dig +trace www.example.com

TTL debugging: If you changed a DNS record and the change isn’t showing up, check the old TTL. That is how long resolvers were told to cache the previous answer.
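
As a worked example of that rule, the latest time a well-behaved resolver may keep serving the old answer is the change time plus the old TTL. The timestamp and helper name below are hypothetical, and resolvers that ignore TTLs can hold stale answers longer.

```python
from datetime import datetime, timedelta, timezone


def stale_until(changed_at: datetime, old_ttl_seconds: int) -> datetime:
    """Latest time a resolver that cached the old record just before the change may serve it."""
    return changed_at + timedelta(seconds=old_ttl_seconds)


changed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)   # hypothetical change time
print(stale_until(changed, 300))      # old TTL of 300 seconds
print(stale_until(changed, 86400))    # old TTL of one day
```

This is why lowering the TTL well ahead of a planned change shortens the window in which clients see the old address.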


HTTP

Request Lifecycle

1. DNS resolution → get server IP
2. TCP handshake → establish connection
3. (HTTPS) TLS handshake → negotiate encryption
4. HTTP request → client sends request headers + body
5. Server processes request
6. HTTP response → server sends status + headers + body
7. Connection close or keep-alive for next request

Request Structure

GET /api/users?page=1 HTTP/1.1
Host: api.example.com
Accept: application/json
Authorization: Bearer eyJ...
User-Agent: curl/7.88.0
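
An HTTP/1.1 request is plain text over a TCP connection: a request line, headers, then a blank line. To show this without any external network, the sketch below starts a throwaway local server and sends a hand-built request over a raw socket. The `Hello` handler and `build_request` helper are hypothetical names for this example only.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    """Tiny local server so the example needs no external network."""

    def do_GET(self):
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep the demo output quiet
        pass


def build_request(method: str, target: str, host: str, headers: dict) -> bytes:
    """Request line, Host header, other headers, then a blank line."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


server = HTTPServer(("127.0.0.1", 0), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

request = build_request("GET", "/api/users?page=1", f"127.0.0.1:{port}",
                        {"Accept": "application/json", "Connection": "close"})

with socket.create_connection(("127.0.0.1", port), timeout=2) as sock:
    sock.sendall(request)
    raw = b""
    while chunk := sock.recv(4096):   # read until the server closes the connection
        raw += chunk

# The response mirrors the request: status line, headers, blank line, body.
head, _, body = raw.partition(b"\r\n\r\n")
status_line = head.split(b"\r\n")[0].decode()
print(status_line)
print(body.decode())

server.shutdown()
server.server_close()
```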

Response Status Codes

| Range | Meaning | Common ones |
| --- | --- | --- |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirect | 301 Moved Permanently, 302 Found (temporary), 304 Not Modified |
| 4xx | Client error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found |
| 5xx | Server error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |
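
Scripts and monitors usually branch on the range rather than the exact code. A small sketch of that idea follows. Both function names are hypothetical, and the retry policy is only an assumption for illustration (retrying gateway and availability errors), not a rule from any standard.

```python
def describe_status(code: int) -> str:
    """Map an HTTP status code to the range it belongs to."""
    ranges = {2: "success", 3: "redirect", 4: "client error", 5: "server error"}
    return ranges.get(code // 100, "other")


def worth_retrying(code: int) -> bool:
    """Assumed policy for this sketch: retry only 502, 503 and 504."""
    return code in {502, 503, 504}


for code in (200, 304, 404, 503):
    print(code, describe_status(code), worth_retrying(code))
```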

Key Headers

# Content negotiation
Accept: application/json
Content-Type: application/json
# Authentication
Authorization: Bearer <token>
Authorization: Basic <base64(user:pass)>
# Caching
Cache-Control: max-age=3600
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d"
# Security
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
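
The `Basic` scheme shown above is just `user:pass` encoded with base64, which is an encoding and not encryption, so it should only travel over HTTPS. A minimal sketch with a hypothetical helper name and placeholder credentials:

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


print(basic_auth_header("user", "pass"))   # Basic dXNlcjpwYXNz
```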

HTTPS: Certificates & TLS

TLS Termination

Client → [HTTPS] → Load Balancer → [HTTP] → App Server
                   (TLS terminated here)

TLS termination decrypts traffic at the load balancer/reverse proxy so your app servers don’t need to handle encryption.

Certificates

A TLS certificate:

  1. Proves server identity (signed by a trusted CA)
  2. Contains the server’s public key
  3. Has an expiry date
# Check a site's certificate
curl -v https://example.com 2>&1 | grep -A5 "Server certificate"
# Detailed cert info
openssl s_client -connect example.com:443 -servername example.com < /dev/null
# Check cert expiry (-servername sends SNI so the right certificate is returned)
echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null | \
  openssl x509 -noout -dates
# Check local cert file
openssl x509 -in cert.pem -noout -text
openssl x509 -in cert.pem -noout -dates
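
The same expiry check can be scripted for monitoring. The sketch below uses Python's standard `ssl` module. The helper names are hypothetical, and `fetch_not_after` needs network access to the target host, so only the date arithmetic is exercised without a connection.

```python
import socket
import ssl
import time
from typing import Optional


def days_left(not_after: str, now: Optional[float] = None) -> float:
    """Days until a certificate's notAfter time, e.g. 'Jun  1 12:00:00 2030 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - (time.time() if now is None else now)) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect with certificate verification and return the peer cert's notAfter string."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # server_hostname sets SNI and is also used for hostname verification.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A typical use would be `days_left(fetch_not_after("example.com"))`, alerting when the result drops below a threshold of your choosing. Because verification is on, an expired or untrusted certificate raises an error instead of returning a date.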

Common certificate errors:

| Error | Cause |
| --- | --- |
| Certificate expired | Past notAfter date |
| Certificate not trusted | Self-signed or unknown CA |
| Hostname mismatch | CN/SAN doesn’t match the domain |
| Incomplete chain | Missing intermediate certificates |

Load Balancing

L4 vs L7

|  | Layer 4 (Transport) | Layer 7 (Application) |
| --- | --- | --- |
| Sees | IP + TCP/UDP | HTTP headers, URL, cookies |
| Routes by | IP:port | URL path, host header, content |
| TLS | Pass-through or terminate | Terminate (usually) |
| Speed | Faster | Slightly slower |
| Examples | AWS NLB, HAProxy (TCP mode) | AWS ALB, nginx, HAProxy (HTTP mode) |
| Use case | Raw TCP, non-HTTP | HTTP services, A/B testing |
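
The "Routes by" row is the key difference, and a toy sketch makes it concrete. The pool names, addresses, and both functions below are hypothetical. Real load balancers use more careful algorithms, so treat this only as an illustration of what information each layer has available.

```python
import hashlib

# Hypothetical backend pools, for illustration only.
TCP_POOL = ["10.0.0.1:5432", "10.0.0.2:5432"]
HTTP_ROUTES = {"/api/": "api-servers", "/static/": "static-servers"}


def l4_pick(client_ip: str, client_port: int) -> str:
    """L4: only addresses and ports are visible, so hash the client endpoint."""
    digest = hashlib.sha256(f"{client_ip}:{client_port}".encode()).digest()
    return TCP_POOL[int.from_bytes(digest[:4], "big") % len(TCP_POOL)]


def l7_pick(path: str, default: str = "web-servers") -> str:
    """L7: the request has been parsed, so the URL path can choose the pool."""
    for prefix, pool in HTTP_ROUTES.items():
        if path.startswith(prefix):
            return pool
    return default


print(l4_pick("203.0.113.7", 50000))
print(l7_pick("/api/users"))
```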

Health Checks

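# nginx upstream with passive health checks
# (open-source nginx marks a server unavailable after max_fails failed attempts
#  within fail_timeout; active probing needs NGINX Plus or an external checker)
upstream backend {
    server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.3:8080 backup;   # only used if the others fail
    keepalive 32;                  # idle upstream connections kept open per worker
}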

Health check types:

| Type | What it checks |
| --- | --- |
| TCP | Port is accepting connections |
| HTTP | Returns 2xx status code |
| HTTPS | TLS + 2xx status code |
| Custom | Specific response body, latency threshold |
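
An HTTP health check from the table can be sketched in a few lines. The example below probes a throwaway local server so it runs without external network access. The `/healthz` path, the `Demo` handler, and the `http_healthy` helper are assumptions made for this sketch.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def http_healthy(host: str, port: int, path: str = "/healthz", timeout: float = 2.0) -> bool:
    """HTTP health check: healthy only if the endpoint answers with a 2xx status."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            return 200 <= conn.getresponse().status < 300
        finally:
            conn.close()
    except (OSError, http.client.HTTPException):
        return False   # refused connections and timeouts count as unhealthy


class Demo(BaseHTTPRequestHandler):
    """Hypothetical app: /healthz is fine, every other path reports 503."""

    def do_GET(self):
        self.send_response(200 if self.path == "/healthz" else 503)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):   # keep the demo output quiet
        pass


server = HTTPServer(("127.0.0.1", 0), Demo)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

ok = http_healthy("127.0.0.1", port)               # 200 from /healthz
bad = http_healthy("127.0.0.1", port, "/broken")   # 503 from any other path
server.shutdown()
server.server_close()
down = http_healthy("127.0.0.1", port)             # nothing listening any more
print(ok, bad, down)
```

A TCP-only check would have reported the `/broken` case as healthy, since the port was accepting connections. That gap is the reason to prefer HTTP checks for HTTP services.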

Debugging Tools

curl (Advanced Usage)

# Full timing breakdown (each value is cumulative seconds since the request started)
curl -w "\ndns:%{time_namelookup}s connect:%{time_connect}s tls:%{time_appconnect}s ttfb:%{time_starttransfer}s total:%{time_total}s\n" \
  -s -o /dev/null https://example.com
# Follow redirects
curl -L https://example.com
# Custom headers
curl -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://api.example.com/endpoint
# POST with JSON body
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"key":"value"}' \
  https://api.example.com/data
# Save response with headers
curl -D headers.txt -o response.body https://example.com
# Pin the hostname to a specific IP (bypasses DNS for this request)
curl --resolve example.com:443:93.184.216.34 https://example.com
# Verbose (see headers, TLS handshake)
curl -v https://example.com
# Ignore TLS cert errors (for testing ONLY)
curl -k https://self-signed.example.com
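
The first two numbers in curl's timing breakdown can be approximated from a script when curl is not available. The sketch below times the name lookup and the TCP connect, reporting both as cumulative values from the start, as curl does. The `connect_timings` helper is hypothetical, and it does not cover the TLS or transfer phases.

```python
import socket
import time


def connect_timings(host: str, port: int, timeout: float = 3.0) -> dict:
    """Measure name lookup and TCP connect, both cumulative from the start."""
    start = time.perf_counter()
    # Use the first address the resolver returns, as a simple client would.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    dns_done = time.perf_counter()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(addr)
        connected = time.perf_counter()
    return {"dns": dns_done - start, "connect": connected - start}
```

A slow `dns` value points at the resolver, while a large gap between `dns` and `connect` points at network latency or packet loss on the path to the server.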

dig / nslookup

# Basic lookup
dig example.com
# Short output (just IP)
dig +short example.com
# Specific record type
dig example.com MX
dig example.com TXT
dig example.com NS
# Query specific DNS server
dig @1.1.1.1 example.com
# Reverse lookup
dig -x 93.184.216.34
# Trace full resolution
dig +trace example.com
# Check all authoritative servers for a domain
dig +nssearch example.com

ss / netstat

# ss: modern replacement for netstat
ss -tlnp # TCP, listening, numeric ports, with processes
ss -tuln # TCP + UDP, listening
ss -tn dst :443 # connections to port 443
ss -s # summary statistics
# Filter by state
ss -tn state established
ss -tn state time-wait | wc -l
# netstat (older, still on many systems)
netstat -tlnp
netstat -an | grep ESTABLISHED | wc -l

traceroute

# Trace the network path to a host
traceroute example.com
# Use TCP (more firewall-friendly)
traceroute -T -p 443 example.com
# mtr: continuous traceroute (better for debugging)
mtr example.com
mtr --report example.com # run for 10 cycles then print report

tcpdump (Basic)

# Capture on interface
tcpdump -i eth0
# Filter by host
tcpdump -i eth0 host 10.0.0.5
# Filter by port
tcpdump -i eth0 port 80
# Capture to file for Wireshark analysis
tcpdump -i eth0 -w capture.pcap
# Read capture file
tcpdump -r capture.pcap
# Useful filters
tcpdump -i eth0 'tcp port 443 and host api.example.com'
tcpdump -i eth0 'tcp[tcpflags] & tcp-rst != 0' # capture RSTs