Network Programming Basics in Ruby
In this series of blog posts, we want to pull back the curtain on building web applications. In my experience, most developers use a framework and application server and do not look behind the curtain to see what happens inside of them. Frameworks are great for making development quicker and more comfortable, providing abstractions for low-level details and recurring patterns. But there’s always the risk of losing touch with what’s actually happening under the hood, which can complicate debugging and lead to unnecessary complexity. It helps to have a sense of what’s going on beneath those abstractions. Let’s start with the basics of networking.
The Internet protocol suite
Before we start diving into network programming, let us briefly recap the four layers of the Internet protocol suite as defined in RFC 1122. Each layer builds upon the layer beneath to offer services to the layers above it.
On the lowest level, we find the link layer. The link layer describes the communication between two physically connected nodes. One example of a link layer protocol is Ethernet: It specifies how the bits of the layers above are turned into electrical signals within a cable. It also describes the direct connection of one node to another. Routing data on a network is left to the layer above.
On top of this, we find the Internet layer with the Internet protocols (IPv4 and IPv6[1]). Everything that is considered part of the Internet uses the IP protocols. They offer the abstraction of sending packets from one node to another, even when the two are not directly connected on the link layer. Each packet is independent, though, which makes IP a connectionless protocol. This layer provides a node with an IP address. The so-called loopback addresses[2] (::1 in IPv6 and 127.0.0.1 in IPv4) and their alias localhost allow us to address our own machine.
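To make the loopback addresses a bit more tangible, here is a minimal sketch that asks Ruby's socket library whether an address is a loopback address. The helper name loopback? is made up for this illustration; Addrinfo.ip, ipv4_loopback? and ipv6_loopback? are part of Ruby's standard library.

```ruby
require "socket"

# Hypothetical helper (not part of Ruby): true if the given numeric
# IP address is a loopback address in IPv4 or IPv6.
def loopback?(address)
  info = Addrinfo.ip(address)
  info.ipv4_loopback? || info.ipv6_loopback?
end
```

With this helper, loopback?("::1") and loopback?("127.0.0.1") both return true, while an address outside the loopback range, such as the documentation address 192.0.2.1, returns false.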
The third layer is called the transport layer. This is where protocols like TCP and UDP reside. TCP provides us with connection-oriented, reliable, ordered and error-checked communication. The much more minimal but sometimes more efficient UDP protocol does not provide any of TCP’s features or guarantees. Both TCP and UDP provide us with “ports”, which are 16-bit unsigned integers (a number between 0 and 65,535[3]).
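The port range follows directly from the 16 bits. The following sketch shows this in Ruby; the constant MAX_PORT and the two helper methods are names invented for this example, not part of any library.

```ruby
# Hypothetical names for illustration only.
MAX_PORT = 2**16 - 1 # the largest 16-bit unsigned integer: 65_535

# True if the number fits into the 16 bits reserved for a port.
def valid_port?(number)
  number.is_a?(Integer) && number.between?(0, MAX_PORT)
end

# The two bytes that represent a port; "n" packs a 16-bit
# unsigned integer in network byte order.
def port_bytes(port)
  [port].pack("n").bytes
end
```

For example, port_bytes(8080) returns the two bytes [31, 144], and valid_port?(70_000) is false because that number no longer fits into 16 bits.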
The transport layer is also responsible for three essential qualities: confidentiality, integrity, and authenticity, as offered by Transport Layer Security (TLS). TLS is not a transport protocol itself, but an extension to an existing transport protocol like TCP[4]. When using TLS, the communication between two parties is confidential, protected from tampering, and we know that the two parties are who they claim to be. (In most cases, only one of the two parties proves their identity – the case where both do it is called mTLS, m for mutual). TLS is essential for any secure Internet communication. Only when we talk to our own machine via loopback can we do without TLS.
According to Daniel Stenberg (of curl fame), it is basically impossible to introduce a new transport protocol today. Even Google, arguably one of the most powerful players today, decided to build their new QUIC protocol on top of the existing UDP protocol. We still consider it a transport protocol because the layer above uses it like one, hiding UDP from it entirely. QUIC offers the same guarantees as TCP, with the goal of better performance and mandatory TLS. QUIC with its advantages and disadvantages is out of scope for this blog post.
At the top layer, we have a plethora of application protocols, from mail protocols like IMAP and SMTP to file transfer protocols like SFTP. It’s important to note that the three bottom layers are typically provided by your operating system kernel (QUIC being an exception, at least for now). The kernel provides abstractions to communicate either via TCP or UDP. The protocols at the application layer, on the other hand, are mostly implemented in userland.
In this series of blog posts, we will focus on the Web and its Hypertext Transfer Protocol (HTTP). HTTP requires a reliable connection between client and server: HTTP/1.1 and HTTP/2 both use TCP over IP (often referred to as TCP/IP) while HTTP/3 uses QUIC over IP (or QUIC/IP).
netcat is a command-line application that allows two computers to talk to each other. We will
cheat a bit in this chapter and let our computer talk to itself – but it works in the same way. So
feel free to try it out later if you have more than one computer at hand (you will have to replace
::1 with the IP address of the other computer). In case you run into problems connecting via
netcat in the following sections, check if your firewall might be blocking the connection.
netcat is one of those tools that has been implemented many times. On macOS and Linux, some version of netcat is usually installed. Depending on your system, the command may go by a different name. We will be using nc in our code samples.
netcat can be started in two different modes:
- In server mode, netcat will listen for clients that want to connect to it.
- In client mode, netcat connects to a server that is listening for connections.
Note that we are using IPv6 in our examples. If you have trouble with that, you can remove the -6 option and replace ::1 with 127.0.0.1.
So let us open two terminal windows. In the first window we start our netcat server:
nc -6 -l 8080
The -6 option ensures we are using IPv6. The -l option means that our netcat is listening. The number 8080 is the port it should listen on. Our netcat is now eager for input. In our second terminal window, we will start a netcat client:
nc ::1 8080
Here we told netcat to connect to our own computer, using
::1 and the same port that we’d chosen
above. As soon as we press enter, we establish a network connection between the two netcats.
Now netcat is waiting for us. Type Hello! and press enter to send it. This message will be shown in the other terminal as well! It works in the other direction, too: If we write something in the window of our listening netcat and hit enter, it appears in the window of the connecting netcat. In other words: We are using netcat as a chat application. To exit it, we use Ctrl+C. If you close either one of the two netcats, the other one will notice that its chat partner is gone and exit as well.
Let’s write some Ruby code, starting with writing text to a file:
file = File.open("output.txt", "w") # `w` stands for write-only mode
file.write("Hello\n")
file.write("World")
file.close
The output.txt file will now contain two separate lines: Hello and World. This is due to \n, which represents a line break.
We can also read a certain number of characters from a file when we put it into read mode:
file = File.open("output.txt", "r")
p file.read(3)
p file.read(2)
file.close
This will print "Hel" and then "lo". The read method always reads the next characters. Think of it as a tape deck. Remember those? When we open a file, we start at the beginning. But every time we read, we move the read head along in the file. We can “rewind” that tape to the beginning with the – you guessed it – rewind method.
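Here is a small sketch of the tape deck analogy. To keep it self-contained, it writes to a temporary file instead of output.txt; the method name read_with_rewind is invented for this example.

```ruby
require "tempfile"

# Hypothetical helper: reads from a temporary file, rewinds, and reads again.
def read_with_rewind(text)
  Tempfile.create("tape") do |file|
    file.write(text)
    file.rewind            # back to the start before we begin reading
    first = file.read(3)   # moves the read head three characters along
    second = file.read(2)  # continues where the previous read stopped
    file.rewind            # "rewind the tape" to the beginning
    again = file.read(3)   # reads the first three characters once more
    [first, second, again]
  end
end
```

Calling read_with_rewind("Hello\nWorld") returns ["Hel", "lo", "Hel"]: the third read starts from the beginning again because of the rewind.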
There is also a way to read text files line by line using gets:
file = File.open("output.txt", "r")
p file.gets
p file.gets
file.close
This will first output "Hello\n" and then "World".
The counterpart for writing is puts:
file = File.open("output.txt", "w")
file.puts("Hello")
file.puts("Bye")
file.close
This will write two lines into the text file.
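The difference between write and puts is easy to miss, so here is a minimal sketch that compares the two. It uses a temporary file to stay self-contained, and the method name written_with is made up for this illustration.

```ruby
require "tempfile"

# Hypothetical helper: writes "Hello" and "Bye" with the given method
# (:write or :puts) and returns the resulting file content.
def written_with(method_name)
  Tempfile.create("demo") do |file|
    file.public_send(method_name, "Hello")
    file.public_send(method_name, "Bye")
    file.rewind
    file.read
  end
end
```

written_with(:write) returns "HelloBye", because write adds nothing on its own, while written_with(:puts) returns "Hello\nBye\n", because puts appends a line break to each string.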
gets and puts can also be used to read from and write to the terminal. You probably have used puts in that way. What we interact with when we call these two methods is called stdin and stdout. They behave like files.
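One way to see that stdin and stdout behave like files is to write a method that only relies on gets and puts and does not care what it is talking to. This is a sketch; echo_line is a made-up name, and StringIO merely stands in for a file or terminal here.

```ruby
require "stringio"

# Hypothetical helper: reads one line from `input` and writes a reply
# to `output`. By default it talks to the terminal via stdin and stdout.
def echo_line(input = $stdin, output = $stdout)
  line = input.gets
  output.puts("You said: #{line}")
end

# The same method works with anything that offers gets and puts,
# for example an in-memory StringIO:
fake_input = StringIO.new("hi\n")
fake_output = StringIO.new
echo_line(fake_input, fake_output)
fake_output.string # the text that would otherwise appear on the terminal
```

Called without arguments, echo_line waits for a line typed into the terminal and prints the reply there.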
Alright. Let’s talk about sockets. A socket (also known as Berkeley socket, POSIX socket or BSD socket) is a file that represents a connection. You have probably heard the saying “In UNIX, everything is a file”. In modern UNIX systems, we can represent all kinds of things (though not everything) as a file: a printer, for example, or a network connection. So when we represent a network connection as a file, we can interact with it as if it were a file: We can read from it or write to it. The socket interface first shipped with BSD UNIX in 1983, and it remains the de facto standard for network communication on UNIX-like systems (for example GNU/Linux or macOS). Most programming languages have some way to interact with a socket. So does Ruby. Everything we learned above about files in Ruby also works with a network connection.
A socket can operate in either server or client mode. In the same way that we need to provide a mode when we open a file, we need to choose a mode when we open a socket. If we want Ruby to open a file, we need to tell Ruby the path to the file. If we want Ruby to open a socket, we instead provide a network address consisting of a network interface and a port. For a server socket, this is the network address it listens on – in the case of a client socket, it is the network address it wants to connect to.
For each socket we need to make the following choices:
- Should it be an IPv4, IPv6 or UNIX domain socket?
- Should it communicate via stream (TCP) or datagram (UDP)?
- Which network interface should it listen on (for a server socket) or connect to (for a client socket)?
- Which port should it listen on (for a server socket) or connect to (for a client socket)?
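Ruby's Addrinfo class bundles these choices into one object, which makes it a handy way to look at them side by side. The following is only a sketch: socket_choices is a hypothetical helper, and the key names in the returned hash are chosen for this example.

```ruby
require "socket"

# Hypothetical helper: spells out the choices encoded in an Addrinfo.
def socket_choices(addrinfo)
  family = if addrinfo.ipv6? then "IPv6"
           elsif addrinfo.ipv4? then "IPv4"
           else "UNIX"
           end
  {
    family: family,
    type: addrinfo.socktype == Socket::SOCK_STREAM ? "stream" : "datagram",
    # For IP sockets this is the address of the network interface,
    # for UNIX domain sockets it is a path in the file system.
    address: addrinfo.ip? ? addrinfo.ip_address : addrinfo.unix_path,
    port: addrinfo.ip? ? addrinfo.ip_port : nil
  }
end
```

For instance, socket_choices(Addrinfo.tcp("::1", 8080)) describes an IPv6 stream socket on ::1 with port 8080, while socket_choices(Addrinfo.udp("127.0.0.1", 8080)) describes an IPv4 datagram socket.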
When the socket connection is established, we can use commands like read and write on it. In the next section, we will replace our netcat server with a small Ruby script.
Our first Ruby server
When we want to work with sockets in Ruby, we first require the
socket standard library and then
initialize our server socket:
require "socket"
server_socket = Socket.new(:INET6, :STREAM)
The :INET6 argument means that we want to use IPv6; the :STREAM argument means we want to use TCP (and not UDP – which would be :DGRAM). After creating the socket, we need an address (the combination of a port and the loopback address):
address = Socket.pack_sockaddr_in(8080, "::1")
We then continue by binding our socket to this address:
server_socket.bind(address)
Then we need to tell the socket to listen for incoming connections. When we do that, we also provide a size for the listen queue, which holds incoming connections that we have not accepted yet. Whenever we are ready to answer another connection, we take the next one from this queue. We will allow up to five waiting connections for now:
server_socket.listen(5)
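To see the listen queue at work, the following sketch lets a client connect and send a line before the server has called accept; the connection simply waits in the queue. Several details are assumptions made for this example: the method name queued_greeting, the use of port 0 (which asks the operating system for any free port), and a Ruby client created with Socket.tcp instead of netcat.

```ruby
require "socket"

# Hypothetical helper: shows that a client can connect and send data
# while its connection is still waiting in the listen queue.
def queued_greeting(host = "::1")
  family = host.include?(":") ? :INET6 : :INET
  server = Socket.new(family, :STREAM)
  server.bind(Socket.pack_sockaddr_in(0, host)) # port 0: let the OS pick a free port
  server.listen(5)
  port = server.local_address.ip_port

  client = Socket.tcp(host, port)   # succeeds although we have not accepted yet
  client.puts("queued hello")

  connection, _addrinfo = server.accept # takes the waiting connection from the queue
  connection.gets
ensure
  [client, connection, server].each { |socket| socket&.close }
end
```

Calling queued_greeting returns "queued hello\n". If IPv6 causes trouble on your machine, queued_greeting("127.0.0.1") does the same over IPv4.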
Ok, we are now ready and listening. Let’s accept a connection:
connection_socket, addrinfo = server_socket.accept
We get back another socket: This socket represents this specific connection to a specific client. addrinfo is a bit of information about our client. For now, we will output the info and close the connection:
pp addrinfo
connection_socket.close
Put these lines of code into a Ruby file and start the program. It will create no output, but it will also not quit. Instead, it just waits for a connection, like netcat in our previous example. We already know how to make a connection with netcat, so let’s start our client in a separate terminal window:
echo "hello from netcat" | nc ::1 8080
We have made a connection. The program outputs a line and then quits:
#<Addrinfo: [::1]:62392 TCP>
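The Addrinfo object is more than a printable string: we can also ask it for the individual pieces. Here is a minimal sketch; describe_client is a made-up helper, and the Addrinfo built with Addrinfo.tcp only mimics the client from the output above.

```ruby
require "socket"

# Hypothetical helper: turns the client information into a readable sentence.
def describe_client(addrinfo)
  "#{addrinfo.ip_address} port #{addrinfo.ip_port} (IPv6: #{addrinfo.ipv6?})"
end

# Stand-in for the addrinfo we would get back from accept:
example_client = Addrinfo.tcp("::1", 62392)
describe_client(example_client)
```

Note that the port in this output belongs to the client. It is picked by the client's operating system and will differ on every run.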
Well. Interesting. But wouldn’t it be nice if it kept running and waited for more clients to connect? Let’s wrap the lines that accept and close connections in a loop:
loop do
  connection_socket, addrinfo = server_socket.accept
  pp addrinfo
  connection_socket.close
end
Here is the entire script:
require "socket"
server_socket = Socket.new(:INET6, :STREAM)
address = Socket.pack_sockaddr_in(8080, "::1")
server_socket.bind(address)
server_socket.listen(5)
loop do
  connection_socket, addrinfo = server_socket.accept
  pp addrinfo
  connection_socket.close
end
Ok, let’s try it again:
echo "hello from netcat" | nc ::1 8080
This time, the server stays open. We can run our client again and again. Keeping the server open and answering connections in a loop is a very common pattern for server sockets. It is so common that Ruby offers us a single method to do it:
require "socket"
Socket.tcp_server_loop("::1", 8080) do |connection, addrinfo|
  pp addrinfo
  connection.close
end
This will do the same thing as the code above. Notice that we also replaced connection_socket with connection. It is still a socket, but connection is a little shorter. To wrap up our work, let’s print out what the client is sending us and reply with a friendly “hello from Ruby” response. This is where the “a socket is like a file” part comes into play. In the same way that we can call gets and puts on files, we can call them on our socket:
require "socket"
Socket.tcp_server_loop("::1", 8080) do |connection|
  puts connection.gets
  connection.puts("hello from Ruby")
  connection.close
end
If we start our server and run
echo 'hello from netcat' | nc ::1 8080 in the other window, we see
the messages from netcat on the server and the message from Ruby on the client. We have written a
TCP-based server in Ruby!
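The client side does not have to be netcat. As a sketch of a socket in client mode, the following method connects to a server, sends one line and returns the reply. The method name greet_server and the default message are made up for this example; Socket.tcp is part of Ruby's socket library.

```ruby
require "socket"

# Hypothetical helper: connects to host and port, sends one line
# and returns the line the server sends back.
def greet_server(host, port, message = "hello from a Ruby client")
  Socket.tcp(host, port) do |socket|
    socket.puts(message) # write to the connection like we would to a file
    socket.gets          # read the reply line; the socket is closed afterwards
  end
end
```

With the server from above running in another terminal window, puts greet_server("::1", 8080) should print the reply from the server.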
The tiny server we built talks to us. Most servers will, however, not talk to a human being. Instead, they will speak with another application. Computers are not huge fans of informal communication. So when we want two computers to talk to each other, we need a language that they can use. We call these languages application protocols. This will be the topic of the next blog post.
Thanks to Essy, Bascht, FND, Stefan Bodewig and mkhl for their feedback ❤️
[1] This layer is moving quite slowly: The IPv6 standard became a draft in 1998, and was ratified in 2017. RFC 9386 estimates that in January 2022, about 30% of Internet users were using IPv6 – leaving 70% using its predecessor IPv4, introduced in 1981.
[2] To be precise, there are entire blocks reserved for loopback: the address block 127.0.0.0/8 in IPv4 and ::1/128 in IPv6.
[3] IANA reserves a list of standardized ports (up to port 49,151). This is why we need special privileges to listen on ports lower than 1024 on some systems. These reservations are not as relevant today, and in most cases you can choose your ports arbitrarily. You can learn more about the port number ranges and the registered service names in IANA's Service Name and Transport Protocol Port Number Registry.