Socket Basics#
Socket programming is the foundation of network communication in Python and virtually
all networked applications. A socket is an endpoint for sending and receiving data
across a network, providing a bidirectional communication channel between processes
on the same machine or across different machines over the Internet. While Python
provides high-level networking interfaces like urllib, requests, and asyncio,
understanding low-level socket operations is essential for building custom protocols,
debugging network issues, implementing network tools, and interfacing with system-level
networking APIs.
This cheat sheet covers the fundamentals of socket programming in Python, including hostname and DNS resolution, IP address manipulation, network byte order conversion, timeout handling, multicast communication, and proxy support. Whether you’re building a simple client-server application, implementing a custom protocol, or troubleshooting network connectivity issues, these examples provide the building blocks you need.
Get Hostname#
The socket.gethostname() function returns the current machine’s hostname as
configured in the operating system, while socket.gethostbyname() performs a
DNS lookup to resolve a hostname to its IPv4 address. These functions are the most
basic building blocks for network programming, allowing your application to identify
itself and resolve other hosts on the network. Note that gethostbyname() only
returns IPv4 addresses; use getaddrinfo() for IPv6 support.
>>> import socket
>>> socket.gethostname()
'MacBookPro-4380.local'
>>> hostname = socket.gethostname()
>>> socket.gethostbyname(hostname)
'172.20.10.4'
>>> socket.gethostbyname('localhost')
'127.0.0.1'
Get Address Info (DNS Resolution)#
socket.getaddrinfo() is the most versatile and recommended function for DNS
resolution in modern Python code. Unlike gethostbyname(), it supports both IPv4
and IPv6 addresses, returns multiple results when available, and provides complete
information including address family, socket type, protocol, canonical name, and
socket address. This function is essential for writing protocol-agnostic code that
works seamlessly with both IPv4 and IPv6 networks.
import socket
import sys
try:
for res in socket.getaddrinfo(sys.argv[1], None,
proto=socket.IPPROTO_TCP):
family = res[0]
sockaddr = res[4]
print(family, sockaddr)
except socket.gaierror:
print("Invalid")
Output:
$ python gai.py 192.0.2.244
AddressFamily.AF_INET ('192.0.2.244', 0)
$ python gai.py 2001:db8:f00d::1:d
AddressFamily.AF_INET6 ('2001:db8:f00d::1:d', 0, 0, 0)
$ python gai.py www.google.com
AddressFamily.AF_INET6 ('2607:f8b0:4006:818::2004', 0, 0, 0)
AddressFamily.AF_INET ('172.217.10.132', 0)
It handles unusual cases, valid and invalid:
$ python gai.py 10.0.0.256 # octet overflow
Invalid
$ python gai.py not-exist.example.com # unresolvable
Invalid
$ python gai.py fe80::1%eth0 # scoped
AddressFamily.AF_INET6 ('fe80::1%eth0', 0, 0, 2)
$ python gai.py ::ffff:192.0.2.128 # IPv4-Mapped
AddressFamily.AF_INET6 ('::ffff:192.0.2.128', 0, 0, 0)
$ python gai.py 0xc000027b # IPv4 in hex
AddressFamily.AF_INET ('192.0.2.123', 0)
Advanced DNS Queries#
While socket.getaddrinfo() handles basic hostname resolution, many applications
require more advanced DNS operations like querying specific record types. MX records
identify mail servers for a domain, TXT records store SPF and DKIM data for email
authentication, NS records list authoritative name servers, and SRV records enable
service discovery. The dnspython library provides a comprehensive DNS toolkit
that supports all record types, custom nameservers, DNSSEC validation, and zone
transfers. This is essential for building email validation systems, service discovery
mechanisms, and DNS monitoring tools.
# pip install dnspython
import dns.resolver
# Query MX records (mail servers)
answers = dns.resolver.resolve('google.com', 'MX')
for rdata in answers:
print(f"MX: {rdata.exchange} (priority: {rdata.preference})")
# Query TXT records (SPF, DKIM, etc.)
answers = dns.resolver.resolve('google.com', 'TXT')
for rdata in answers:
print(f"TXT: {rdata}")
# Query NS records (name servers)
answers = dns.resolver.resolve('google.com', 'NS')
for rdata in answers:
print(f"NS: {rdata}")
# Query A records with custom nameserver
resolver = dns.resolver.Resolver()
resolver.nameservers = ['8.8.8.8'] # Use Google DNS
answers = resolver.resolve('example.com', 'A')
for rdata in answers:
print(f"A: {rdata}")
Reverse DNS Lookup#
Reverse DNS (rDNS) lookup converts an IP address back to its associated hostname by querying PTR records in the in-addr.arpa (IPv4) or ip6.arpa (IPv6) domains. This is commonly used for logging to make IP addresses human-readable, security analysis to verify that a connecting client’s IP matches its claimed hostname, spam filtering to check if mail servers have valid reverse DNS, and network troubleshooting to identify devices on a network.
>>> import socket
>>> # Reverse lookup returns (hostname, aliases, addresses)
>>> socket.gethostbyaddr('8.8.8.8')
('dns.google', [], ['8.8.8.8'])
>>> socket.gethostbyaddr('140.82.112.4')
('github.com', [], ['140.82.112.4'])
>>> # Using getfqdn for fully qualified domain name
>>> socket.getfqdn('8.8.8.8')
'dns.google'
Network Byte Order Conversion#
Network protocols universally use big-endian byte order (most significant byte first),
also called “network byte order,” while most modern CPUs (x86, ARM in little-endian
mode) use little-endian (least significant byte first). When sending multi-byte
integers over the network, you must convert from host byte order to network byte
order, and vice versa when receiving. The htons/htonl functions convert host
to network order for short (16-bit) and long (32-bit) integers, while ntohs/ntohl
convert network to host order. Failing to perform these conversions causes subtle
bugs where values appear corrupted on machines with different endianness.
# little-endian machine
>>> import socket
>>> a = 1 # host endian
>>> socket.htons(a) # host to network short (16-bit)
256
>>> socket.htonl(a) # host to network long (32-bit)
16777216
>>> socket.ntohs(256) # network to host short
1
>>> socket.ntohl(16777216) # network to host long
1
IP Address String/Binary Conversion#
IP addresses are typically displayed as human-readable strings (dotted-quad for IPv4
like “192.168.1.1”, or colon-hex for IPv6 like “2001:db8::1”), but network protocols
transmit them as binary data (4 bytes for IPv4, 16 bytes for IPv6). The inet_aton
and inet_ntoa functions convert between string and binary formats for IPv4 only.
For code that needs to support both IPv4 and IPv6, use inet_pton (presentation
to network) and inet_ntop (network to presentation), which take an address family
parameter to specify the IP version.
>>> import socket
>>> # IPv4: string to binary
>>> addr = socket.inet_aton('127.0.0.1')
>>> addr
b'\x7f\x00\x00\x01'
>>> # IPv4: binary to string
>>> socket.inet_ntoa(addr)
'127.0.0.1'
>>> # IPv4/IPv6: use inet_pton/inet_ntop
>>> socket.inet_pton(socket.AF_INET, '192.168.1.1')
b'\xc0\xa8\x01\x01'
>>> socket.inet_pton(socket.AF_INET6, '::1')
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'
>>> socket.inet_ntop(socket.AF_INET6, b'\x00' * 15 + b'\x01')
'::1'
MAC Address Conversion#
MAC (Media Access Control) addresses are 48-bit hardware identifiers assigned to
network interface cards, typically displayed as six colon-separated hexadecimal
pairs like “00:11:22:33:44:55”. When working with raw Ethernet frames or ARP packets,
you need to convert between this human-readable format and the 6-byte binary format
used in network protocols. The binascii module provides hexlify and unhexlify
functions for this conversion.
>>> import binascii
>>> mac = '00:11:32:3c:c3:0b'
>>> byte = binascii.unhexlify(mac.replace(':', ''))
>>> byte
b'\x00\x112<\xc3\x0b'
>>> binascii.hexlify(byte)
b'0011323cc30b'
>>> # Format back to colon-separated
>>> ':'.join(f'{b:02x}' for b in byte)
'00:11:32:3c:c3:0b'
Check Port Availability#
Before starting a server, you often need to verify that the desired port is available
for binding. Similarly, network monitoring tools need to check if remote services are
reachable. The is_port_open function attempts a TCP connection to test remote
service availability, while is_port_available tries to bind locally to check if
a port is free. These checks are essential for service health monitoring, port
scanning, and avoiding “Address already in use” errors when starting servers.
import socket
def is_port_open(host, port, timeout=3):
"""Check if a port is open on a remote host."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(timeout)
try:
sock.connect((host, port))
return True
except (socket.timeout, ConnectionRefusedError, OSError):
return False
finally:
sock.close()
def is_port_available(port):
"""Check if a local port is available for binding."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
sock.bind(('', port))
return True
except OSError:
return False
finally:
sock.close()
# Usage
print(is_port_open('google.com', 443)) # True
print(is_port_available(8080)) # True if not in use
Get Network Interfaces#
Multi-homed servers (machines with multiple network interfaces) need to discover
their available IP addresses to bind to specific interfaces or advertise their
addresses to clients. The basic approach uses getaddrinfo on the hostname, but
for detailed interface information including interface names, netmasks, and broadcast
addresses, the netifaces library provides a cross-platform solution. This is
useful for network configuration tools, service discovery, and building applications
that need to select specific network interfaces.
import socket
def get_local_ips():
"""Get all local IP addresses."""
ips = []
hostname = socket.gethostname()
try:
# Get all addresses for hostname
for info in socket.getaddrinfo(hostname, None):
ip = info[4][0]
if ip not in ips:
ips.append(ip)
except socket.gaierror:
pass
return ips
# For more detailed interface info, use netifaces
# pip install netifaces
import netifaces
for iface in netifaces.interfaces():
addrs = netifaces.ifaddresses(iface)
if netifaces.AF_INET in addrs:
for addr in addrs[netifaces.AF_INET]:
print(f"{iface}: {addr['addr']}")
Socket Options#
Socket options control low-level socket behavior and are essential for building
robust network applications. SO_REUSEADDR allows immediate restart of servers
without waiting for TIME_WAIT to expire. SO_REUSEPORT enables multiple processes
to bind to the same port for load balancing. Buffer size options (SO_SNDBUF,
SO_RCVBUF) tune throughput for high-bandwidth applications. SO_KEEPALIVE
detects dead connections by sending periodic probes. Understanding these options
helps you optimize performance and handle edge cases in production systems.
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Reuse address (avoid "Address already in use" error)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# Reuse port (multiple processes can bind to same port)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
# Set send/receive buffer sizes
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
# Enable TCP keepalive
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# Set timeout (seconds)
sock.settimeout(10.0)
# Non-blocking mode
sock.setblocking(False)
# Get current option value
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
Troubleshooting: Connection Refused#
“Connection refused” is one of the most common network errors, but its cause isn’t always obvious. It can mean the target port has no listening service, a firewall is actively rejecting connections, or the service crashed. Other errors like “Connection timed out” suggest the host is unreachable or a firewall is silently dropping packets, while “Network unreachable” indicates routing problems. This diagnostic function categorizes different error types to help identify the root cause, which is essential for debugging network connectivity issues in development and production environments.
import socket
import errno
def diagnose_connection(host, port):
"""Diagnose connection issues."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
try:
sock.connect((host, port))
print(f"✓ Connected to {host}:{port}")
except socket.timeout:
print(f"✗ Timeout - host may be unreachable or firewalled")
except ConnectionRefusedError:
print(f"✗ Connection refused - no service on {port}")
except socket.gaierror as e:
print(f"✗ DNS error - cannot resolve {host}: {e}")
except OSError as e:
if e.errno == errno.ENETUNREACH:
print(f"✗ Network unreachable")
elif e.errno == errno.EHOSTUNREACH:
print(f"✗ Host unreachable")
else:
print(f"✗ OS error: {e}")
finally:
sock.close()
diagnose_connection('localhost', 8080)
Timeout Handling#
Network operations can block indefinitely if a remote host becomes unresponsive,
a network path fails, or packets are lost. Without proper timeout handling, your
application may hang forever waiting for data that will never arrive. Python sockets
support timeouts at multiple levels: a global timeout via settimeout() that applies
to all operations, or per-operation timeouts using select() for more precise control.
Always set appropriate timeouts based on your application’s requirements—too short
causes false failures, too long delays error detection.
import socket
import errno
def connect_with_timeout(host, port, timeout=5):
"""Connect with timeout and proper error handling."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(timeout)
try:
sock.connect((host, port))
return sock
except socket.timeout:
print(f"Connection to {host}:{port} timed out")
sock.close()
return None
except OSError as e:
print(f"Connection failed: {e}")
sock.close()
return None
def recv_with_timeout(sock, bufsize=4096, timeout=10):
"""Receive data with timeout."""
sock.settimeout(timeout)
try:
return sock.recv(bufsize)
except socket.timeout:
return None # Timeout, no data
# Per-operation timeout using select
import select
def recv_timeout(sock, bufsize, timeout):
"""Receive with timeout using select (more precise)."""
ready, _, _ = select.select([sock], [], [], timeout)
if ready:
return sock.recv(bufsize)
raise socket.timeout("recv timed out")
Graceful Shutdown#
Simply calling close() on a socket may lose data still in transit. The TCP
protocol requires a proper four-way handshake (FIN-ACK sequence) to ensure both
sides have finished sending. The shutdown() method provides fine-grained control:
SHUT_WR sends a FIN packet signaling “I’m done sending” while still allowing
reads, SHUT_RD stops receiving, and SHUT_RDWR does both. For clean termination,
call shutdown(SHUT_WR) first, drain any remaining incoming data, then close().
This pattern is especially important for protocols where the server waits for client
EOF before sending its final response.
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('example.com', 80))
sock.send(b'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n')
# Shutdown write side - signals EOF to server
sock.shutdown(socket.SHUT_WR)
# Read remaining data
response = b''
while True:
data = sock.recv(4096)
if not data:
break
response += data
# Now close the socket
sock.close()
# shutdown options:
# SHUT_RD - no more reads
# SHUT_WR - no more writes (sends FIN)
# SHUT_RDWR - no more reads or writes
Multicast UDP#
Multicast is a one-to-many communication model where a single packet is delivered to multiple receivers simultaneously. Unlike broadcast (which floods the entire network) or unicast (one sender, one receiver), multicast uses special IP addresses (224.0.0.0 to 239.255.255.255) and IGMP protocol to efficiently route packets only to interested receivers. Receivers must explicitly join a multicast group to receive traffic. The TTL (Time To Live) controls how far packets travel—TTL=1 stays on the local subnet, higher values cross routers. Multicast is ideal for streaming media, real-time data feeds, and service discovery where the same data goes to many clients.
import socket
import struct
MCAST_GROUP = '224.1.1.1'
MCAST_PORT = 5007
# Sender
def multicast_sender():
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
sock.sendto(b'Hello Multicast!', (MCAST_GROUP, MCAST_PORT))
sock.close()
# Receiver
def multicast_receiver():
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('', MCAST_PORT))
# Join multicast group
mreq = struct.pack('4sl', socket.inet_aton(MCAST_GROUP), socket.INADDR_ANY)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
data, addr = sock.recvfrom(1024)
print(f"Received: {data} from {addr}")
sock.close()
HTTP Client with Sockets#
While high-level libraries like urllib and requests handle HTTP elegantly,
understanding raw HTTP over sockets is invaluable for debugging, implementing custom
protocols, or working in constrained environments. HTTP/1.1 is a text-based protocol:
you send a request line (GET /path HTTP/1.1), headers (key-value pairs), a blank
line, and optionally a body. The server responds similarly. Key headers include
Host (required in HTTP/1.1), Connection: close (to signal single request),
and Content-Length for bodies. This low-level approach reveals exactly what’s
happening on the wire, making it easier to diagnose issues like malformed headers,
encoding problems, or TLS handshake failures.
import socket
import ssl
def http_get(host, path='/', port=80, use_ssl=False):
"""Simple HTTP GET using raw sockets."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
if use_ssl:
context = ssl.create_default_context()
sock = context.wrap_socket(sock, server_hostname=host)
port = 443
sock.connect((host, port))
request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
sock.send(request.encode())
response = b''
while True:
data = sock.recv(4096)
if not data:
break
response += data
sock.close()
# Split headers and body
header_end = response.find(b'\r\n\r\n')
headers = response[:header_end].decode()
body = response[header_end + 4:]
return headers, body
# Usage
headers, body = http_get('example.com', '/', use_ssl=True)
print(headers)
SOCKS Proxy Support#
SOCKS (Socket Secure) is a protocol that routes network traffic through a proxy
server, providing anonymity and the ability to bypass firewalls or geographic
restrictions. Unlike HTTP proxies that only handle HTTP traffic, SOCKS operates
at a lower level and can proxy any TCP (and with SOCKS5, UDP) traffic. SOCKS5
adds authentication and IPv6 support. Common use cases include routing traffic
through Tor (which uses SOCKS5 on port 9050), accessing internal networks via
SSH tunnels (ssh -D), or corporate proxy requirements. The PySocks library
makes it easy to route Python socket connections through SOCKS proxies, either
globally (patching all sockets) or per-connection.
# pip install PySocks
import socks
import socket
# Method 1: Patch all sockets globally
socks.set_default_proxy(socks.SOCKS5, "localhost", 9050)
socket.socket = socks.socksocket
# Now all socket connections go through the proxy
s = socket.socket()
s.connect(("example.com", 80))
# Method 2: Create proxy socket directly
s = socks.socksocket()
s.set_proxy(socks.SOCKS5, "localhost", 9050)
s.connect(("example.com", 80))
# Method 3: With authentication
s = socks.socksocket()
s.set_proxy(socks.SOCKS5, "proxy.example.com", 1080,
username="user", password="pass")