Network Programming Basics in Ruby

June 04, 2023

Let's explore the basics of network programming with Ruby.

In this series of blog posts, we want to pull back the curtain on building web applications. In my experience, most developers use a framework and application server and do not look behind the curtain to see what happens inside of them. Frameworks are great for making development quicker and more comfortable, providing abstractions for low-level details and recurring patterns. But there’s always the risk of losing touch with what’s actually happening under the hood, which can complicate debugging and lead to unnecessary complexity. It helps to have a sense of what’s going on beneath those abstractions. Let’s start with the basics of networking.

The Internet protocol suite
netcat
Sockets
Our first Ruby server
Conclusion
Thanks

The Internet protocol suite

Before we start diving into network programming, let us briefly recap the four layers of the Internet protocol suite as defined in RFC 1122. Each layer builds upon the layer beneath to offer services to the layers above it.

Link Layer

On the lowest level, we find the link layer. The link layer describes the communication between two physically connected nodes. One example of a link layer protocol is Ethernet: It specifies how the bits of the layers above are turned into electrical signals within a cable. It also describes the direct connection of one node to another. Routing data on a network is left to the layer above.

Internet Layer

On top of this, we find the Internet layer with the Internet protocols (IPv4 and IPv6¹). Everything that is considered part of the Internet uses one of the IP protocols. They offer the abstraction of sending packets from one node to another, even when the two are not directly connected on the link layer. Each packet is independent though, making IP a connectionless protocol. This layer provides a node with an IP address. The so-called loopback address² ::1 (or 127.0.0.1 in IPv4) and its alias localhost allow us to address our own machine.
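To make the loopback addresses a bit more tangible, here is a small sketch using Ruby's ipaddr standard library (our choice for illustration; the rest of this post uses the socket library). It assumes Ruby 2.5 or newer for the loopback? method, and 127.42.0.1 is just an arbitrary address picked from the IPv4 loopback block.

```ruby
require "ipaddr"

v6 = IPAddr.new("::1")
v4 = IPAddr.new("127.0.0.1")

# Both addresses are recognized as loopback addresses.
puts v6.loopback?
puts v4.loopback?

# In IPv4, the whole 127.0.0.0/8 block is reserved for loopback,
# so an arbitrary address from that block is covered as well.
loopback_block = IPAddr.new("127.0.0.0/8")
in_block = loopback_block.include?(IPAddr.new("127.42.0.1"))
puts in_block
```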

Transport Layer

The third layer is called the transport layer. This is where protocols like TCP and UDP reside. TCP provides us with connection-oriented, reliable, ordered and error-checked communication. The much more minimal but sometimes more efficient UDP protocol does not provide any of TCP’s features or guarantees. Both TCP and UDP provide us with “ports”, which are 16-bit unsigned integers (a number between 0 and 65,535³).
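The port range follows directly from the 16 bits. As a quick sanity check, the sketch below computes the upper bound and packs a port number into its two-byte form; the port 8080 is only an example value, and the "n" directive of Array#pack means "16-bit unsigned integer in network byte order".

```ruby
# The largest value that fits into 16 bits.
max_port = 2**16 - 1
puts max_port

# Counting 0 as well, that gives us 2**16 possible ports.
port_count = (0..max_port).size
puts port_count

# A port number always fits into exactly two bytes.
packed = [8080].pack("n")
puts packed.bytesize
puts packed.unpack1("n")
```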

The transport layer is also responsible for three essential qualities: confidentiality, integrity, and authenticity, as offered by Transport Layer Security (TLS). TLS is not a transport protocol itself, but an extension to an existing transport protocol like TCP⁴. When using TLS, the communication between two parties is confidential, protected from tampering, and we know that the two parties are who they claim to be. (In most cases, only one of the two parties proves their identity – the case where both do it is called mTLS, m for mutual). TLS is essential for any secure Internet communication. Only when we talk to our own machine via loopback can we do without TLS.

According to Daniel Stenberg (of curl fame), it is basically impossible to introduce a new transport protocol today. Even Google, arguably one of the most powerful players today, decided to build their new QUIC protocol on top of the existing UDP protocol. We still consider it a transport protocol because the layer above uses it like one, hiding UDP from it entirely. QUIC offers the same guarantees as TCP, with the goal of better performance and mandatory TLS. QUIC with its advantages and disadvantages is out of scope for this blog post.

Application Layer

At the top layer, we have a plethora of application protocols, from mail protocols like IMAP and SMTP to file transfer protocols like SFTP. It’s important to note that the three bottom layers are typically provided by your operating system kernel (QUIC being an exception, at least for now). The kernel provides abstractions to communicate either via TCP or UDP. The protocols at the application layer, on the other hand, are mostly implemented in userland.

In this series of blog posts, we will focus on the Web and its Hypertext Transfer Protocol (HTTP). HTTP requires a reliable connection between client and server: HTTP/1.1 and HTTP/2 both use TCP over IP (often referred to as TCP/IP) while HTTP/3 uses QUIC over IP (or QUIC/IP).

netcat

netcat is a command-line application that allows two computers to talk to each other. We will cheat a bit in this chapter and let our computer talk to itself – but it works in the same way. So feel free to try it out later if you have more than one computer at hand (you will have to replace ::1 with the IP address of the other computer). In case you run into problems connecting via netcat in the following sections, check if your firewall might be blocking the connection.

netcat is one of those tools that has been implemented many times. On macOS and most Linux distributions, some version of netcat is already installed. Depending on your system, it might be called nc or ncat (socat is a related but separate tool with its own syntax). We will be using nc in our code samples.

netcat can be started in two different modes:

In server mode, netcat will listen for clients that want to connect to it.
In client mode, netcat connects to a server that is listening for connections.

Note that we are using IPv6 in our examples. If you have trouble with that, you can remove the -6 option and replace ::1 with 127.0.0.1.

So let us open two terminal windows. In the first window we start our netcat server:

nc -6 -l 8080

The -6 option ensures we are using IPv6. The -l option means that our netcat is listening. The number is the port it should listen on. Our netcat is now eager for input. In our second terminal window, we will start a netcat client:

nc ::1 8080

Here we told netcat to connect to our own computer, using ::1 and the same port that we’d chosen above. As soon as we press enter, we establish a network connection between the two netcats.

Now netcat is waiting for us. Type Hello! and press enter to send it. This message will be shown in the other terminal as well! It works in the other direction, too: if we write something in the window of our listening netcat and hit enter, it will appear in the client window. In other words: We are using netcat as a chat application. To exit it, we use ctrl+c. If you close either one of the two netcats, the other one will notice that its chat partner is gone and exit as well.

Sockets

Let’s write some Ruby code, starting with writing text to a file:

file = File.open("output.txt", "w") # `w` stands for write-only mode
file.write("Hello\n")
file.write("World")
file.close

The output.txt file will now contain two separate lines: Hello and World. This is due to \n, which represents a line break.

We can also read a certain number of characters from a file when we put it into read mode:

file = File.open("output.txt", "r")
p file.read(3)
p file.read(2)
file.close

This will print "Hel" and then "lo". The read method always reads the next characters. Think of it as a tape deck. Remember those? When we open a file, we start at the beginning. But every time we call read we move the read head along in the file. We can “rewind” that tape to the beginning with the – you guessed it – rewind method.
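Here is a minimal sketch of the tape deck in action. To keep it self-contained, it writes to a temporary file via Tempfile.create (our choice, so nothing is left behind on disk) instead of the output.txt used above.

```ruby
require "tempfile"

# Tempfile.create opens the file for reading and writing and
# deletes it again once the block is done.
first, second, again = Tempfile.create("output") do |file|
  file.write("Hello\nWorld")

  file.rewind              # back to the start after writing
  first = file.read(3)     # reads "Hel" and moves the read head
  second = file.read(2)    # continues where we left off: "lo"

  file.rewind              # rewind the tape once more
  again = file.read(3)     # "Hel" again

  [first, second, again]
end

p first, second, again
```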

There is also a way to read text files line by line using gets:

file = File.open("output.txt", "r")
p file.gets
p file.gets
file.close

This will first output "Hello\n" and then "World".

The counterpart for writing is puts:

file = File.open("output.txt", "w")
file.puts("Hello")
file.puts("Bye")
file.close

This will write two lines into the text file.

Both puts and gets can also be used to write to and read from the terminal. You probably have used at least puts in that way. What we interact with when we call these two methods is called stdin and stdout. They behave like files.
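A short sketch of what "behave like files" means in Ruby terms: the global $stdout is an IO object, and File is a subclass of IO, so both share the same methods. The file descriptor numbers 0 and 1 are the conventional ones on UNIX-like systems.

```ruby
# A bare `puts` writes to $stdout, so we can also call it explicitly.
$stdout.puts("Hello")
$stdout.write("World\n")

# File inherits from IO, which is why files and the terminal share
# methods like puts, gets, read and write.
file_is_io = File.ancestors.include?(IO)
puts file_is_io

# stdin and stdout are simply the first two file descriptors.
puts STDIN.fileno
puts STDOUT.fileno
```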

Alright. Let’s talk about sockets. A socket (also known as Berkeley socket, POSIX socket or BSD socket) is a file that represents a connection. You have probably heard the saying “In UNIX, everything is a file”. In modern UNIX systems, we can represent all kinds of things as a file (but not everything): a printer, for example, or a network connection. So when we represent a network connection as a file, we can interact with it as if it were a file: We can read from it or write to it. The socket API was introduced with 4.2BSD in 1983, and it remains the de-facto standard for network communication on UNIX-like systems (for example GNU/Linux or macOS). Most programming languages have some way to interact with a socket. So does Ruby. Almost everything we learned above about files in Ruby also works with a network connection; the exception is rewind, because a stream of network data cannot be rewound.
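We can try out the file-like behaviour without any network at all. The sketch below uses Socket.pair, which hands us two sockets that are already connected to each other; using a UNIX domain socket pair here is our choice to keep the example self-contained, and it assumes a UNIX-like system.

```ruby
require "socket"

# Two sockets that are already connected to each other.
left, right = Socket.pair(:UNIX, :STREAM, 0)

# Sockets are IO objects, just like files.
puts left.is_a?(IO)

# The same methods we used on files work here.
left.puts("Hello")
left.write("World\n")
left.close # closing one end signals "end of file" to the other

first = right.gets
second = right.gets
at_end = right.gets # nil, because the other end was closed
right.close

p first, second, at_end
```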

A socket can operate in either server or client mode. In the same way that we need to provide a mode when we open a file, we need to choose a mode when we open a socket. If we want Ruby to open a file, we need to tell Ruby the path to the file. If we want Ruby to open a socket, we instead provide a network address consisting of a network interface and a port. For a server socket, this is the network address it listens on – in the case of a client socket, it is the network address it wants to connect to.

For each socket we need to make the following choices:

Should it be an IPv4, IPv6 or UNIX domain socket?
Should it communicate via stream (TCP) or datagram (UDP)?
Which network interface should it listen on (for a server socket) or connect to (for a client socket)?
Which port should it listen on (for a server socket) or connect to (for a client socket)?
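One way to see these choices side by side is Ruby's Addrinfo class from the socket library, which describes an address without opening a connection. This is only a sketch: the port 8080 and the path /tmp/example.sock are example values, and nothing is actually created on disk or on the network.

```ruby
require "socket"

# IPv6 + stream (TCP) on the loopback address, port 8080
tcp_v6 = Addrinfo.tcp("::1", 8080)
# IPv4 + datagram (UDP) on the loopback address, port 8080
udp_v4 = Addrinfo.udp("127.0.0.1", 8080)
# A UNIX domain socket is addressed by a path instead of address and port
unix = Addrinfo.unix("/tmp/example.sock")

puts tcp_v6.ipv6?
puts tcp_v6.socktype == Socket::SOCK_STREAM
puts tcp_v6.ip_port

puts udp_v4.ipv4?
puts udp_v4.socktype == Socket::SOCK_DGRAM

puts unix.unix?
puts unix.unix_path
```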

When the socket connection is established, we can use commands like read and write on it. In the next section, we will replace our netcat server with a small Ruby script.

Our first Ruby server

When we want to work with sockets in Ruby, we first require the socket standard library and then initialize our server socket:

require "socket"

server_socket = Socket.new(:INET6, :STREAM)

The :INET6 argument means that we want to use IPv6, the :STREAM argument means we want to use TCP (and not UDP – which would be :DGRAM). After creating the socket, we need an address (the combination of a port and the loopback address):

address = Socket.pack_sockaddr_in(8080, "::1")

We then continue by binding our socket to this address:

server_socket.bind(address)

Then we need to tell the socket to listen for incoming connections. When we do that, we additionally provide a size for the listen queue, which holds incoming connections that we have not accepted yet. Whenever we are ready to answer another connection, we take the next one from that queue. We will allow up to five waiting connections for now:

server_socket.listen(5)

Ok, we are now ready and listening. Let’s accept a connection:

connection_socket, addrinfo = server_socket.accept

We get back another socket: This socket represents this specific connection to a specific client. The addrinfo is a bit of information about our client. For now, we will output the info and close the socket:

pp addrinfo
connection_socket.close

Put these lines of code into a Ruby file and start the program. It will create no output, but it will also not quit. Instead, it just waits for a connection, like netcat in our previous example. We already know how to make a connection with netcat, so let’s start our client in a separate terminal window:

echo "hello from netcat" | nc ::1 8080

We have made a connection. The program outputs a line and then quits:

#<Addrinfo: [::1]:62392 TCP>
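If we want the individual parts of that information rather than the inspected form, Addrinfo offers accessor methods. In this sketch we build a stand-in object by hand, reusing the address and port from the output above, because we have no real client connection at hand; in the server, the addrinfo returned by accept could be queried the same way.

```ruby
require "socket"

# Stand-in for the addrinfo that accept returned above.
addrinfo = Addrinfo.tcp("::1", 62392)

puts addrinfo.ip_address     # the client's IP address as a string
puts addrinfo.ip_port        # the client's port as an integer
puts addrinfo.ipv6_loopback? # did the client connect via loopback?
```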

Well. Interesting. But wouldn’t it be nice if it kept running and waited for more clients to connect? Let’s wrap the lines that accept a connection, print the client info and close the connection in a loop:

loop do
  connection_socket, addrinfo = server_socket.accept
  pp addrinfo
  connection_socket.close
end

Here is the entire script:

require "socket"

server_socket = Socket.new(:INET6, :STREAM)
address = Socket.pack_sockaddr_in(8080, "::1")

server_socket.bind(address)

server_socket.listen(5)

loop do
  connection_socket, addrinfo = server_socket.accept
  pp addrinfo
  connection_socket.close
end

Ok, let’s try it again:

echo "hello from netcat" | nc ::1 8080

This time, the server stays open. We can run our client again and again. Keeping the server open and answering connections in a loop is a very common pattern for server sockets. It is so common that Ruby offers us a single method to do it:

require "socket"

Socket.tcp_server_loop("::1", 8080) do |connection, addrinfo|
  pp addrinfo
  connection.close
end

This will do the same thing as the code above. Notice that we also replaced connection_socket with connection. It is still a socket, but connection is a little shorter. To wrap up our work, let’s print out what the client is sending us and reply with a friendly “hello from Ruby” response. This is where the “a socket is like a file” part comes into play. In the same way that we can call gets and puts on files, we can call them on our socket:

require "socket"

Socket.tcp_server_loop("::1", 8080) do |connection|
  puts connection.gets

  connection.puts("hello from Ruby")
  connection.close
end

If we start our server and run echo 'hello from netcat' | nc ::1 8080 in the other window, we see the messages from netcat on the server and the message from Ruby on the client. We have written a TCP-based server in Ruby!
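The client side can be written in Ruby as well. The following sketch runs a one-shot server and a client in the same script so that it is self-contained. A few things here are our assumptions rather than part of the server above: it uses the TCPServer and TCPSocket convenience classes from the socket library, the IPv4 loopback address, and port 0, which asks the operating system for any free port.

```ruby
require "socket"

# Port 0 lets the operating system pick a free port for us.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Server side: accept one connection, read a line, reply, close.
server_thread = Thread.new do
  connection = server.accept
  line = connection.gets
  connection.puts("hello from Ruby")
  connection.close
  line
end

# Client side: roughly what `echo ... | nc` did for us before.
client = TCPSocket.new("127.0.0.1", port)
client.puts("hello from a Ruby client")
reply = client.gets
client.close

# Thread#value waits for the thread and returns the block's result.
received = server_thread.value
server.close

puts "server received: #{received}"
puts "client received: #{reply}"
```

To talk to the server from this post instead, a client would connect to "::1" on port 8080 while the server script is running.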

Conclusion

The tiny server we built talks to us. Most servers will, however, not talk to a human being. Instead, they will speak with another application. Computers are not huge fans of informal communication. So when we want two computers to talk to each other, we need a language that they can use. We call these languages application protocols. This will be the topic of the next blog post.

Thanks

Thanks to Essy, Bascht, FND, Stefan Bodewig and mkhl for their feedback ❤️

Footnotes

¹ This layer is moving quite slowly: The IPv6 standard became a draft in 1998, and was ratified in 2017. RFC 9386 estimates that in January 2022, about 30% of Internet users were using IPv6 – leaving 70% using its predecessor IPv4, introduced in 1981.
² To be precise, there are entire blocks reserved for loopback: the address block 127.0.0.0/8 in IPv4 and ::1/128 in IPv6.
³ IANA reserves part of this range for standardized ports (up to port 49,151). The ports below 1024 are the so-called system ports, which is why we need special privileges to listen on them on some systems. These reservations are not as relevant today, and in most cases you can choose your ports arbitrarily. You can learn more about the port number ranges and the registered service names in IANA's Service Name and Transport Protocol Port Number Registry.
⁴ In the OSI model, TLS would be on the session layer rather than at the transport layer. The Internet protocol suite combines these layers into one, which leads to this weirdness.