Practical launchd sockets

I run a bunch of services, such as the smbd(8) file sharing server, on my personal server that I don’t wish to expose to the outside world. Some of them are therefore only accessible on local addresses, which is obviously impractical if I want to access them from outside my home. I have a VPN set up, so this problem has an easy solution (well, setting up IPsec can be a headache too), but it seemed too heavy to me.

What I consider a nice solution for this situation is the ability of ssh(1) to forward ports. With a single command like ssh -L4445:localhost:445 srv1 [1] I am able to mount the shared drive on my Mac using mount_smbfs //localhost:4445/<share> <dest>. This is easy, but I still have to fire up the ssh command manually, which is annoying. Anyway, as my thesis supervisor used to say: Life is much more fun when it can be automated.

My first encounter with the solution was at the University, when a friend of mine told me about systemd.socket(5). Soon I learned that this functionality was implemented a long time ago by the inetd daemon, then by launchd, and only then came systemd [2].

Not like this

My first attempt was straightforward… and wrong. I created the following configuration as me.kabele.smbd_forward.plist in ~/Library/LaunchAgents.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>me.kabele.smbd_forward</string>
	<key>ProgramArguments</key>
	<array>
		<string>ssh</string>
		<string>-TL4445:localhost:445</string>
		<string>srv1</string>
	</array>
	<key>Sockets</key>
	<dict>
		<key>Listeners</key>
		<dict>
			<key>SockType</key>
			<string>stream</string>
			<key>SockNodeName</key>
			<string>127.0.0.1</string>
			<key>SockServiceName</key>
			<integer>4445</integer>
		</dict>
	</dict>
	<key>inetdCompatibility</key>
	<dict>
		<key>Wait</key>
		<true/>
	</dict>
	<key>StandardErrorPath</key>
	<string>/Users/vitkabele/Library/Logs/me.kabele.smbd_forward.err</string>
	<key>StandardOutPath</key>
	<string>/Users/vitkabele/Library/Logs/me.kabele.smbd_forward.out</string>
</dict>
</plist>

Then I needed to somehow start the job. Sadly, there is little documentation on this around the web (which is also why I decided to write this) compared to, let’s say, the systemd suite. Luckily, systemd is inspired by launchd and they are both intended to do the same thing, so the ideas are similar.

First, launchd must be notified about the existence of the new service definition, which is done by launchctl bootstrap gui/501 me.kabele.smbd_forward.plist (501 being my user ID). As far as I understand at this moment, this is similar to calling systemctl --user daemon-reload, except that the systemd version automatically flushes the old version and loads the new one, while launchd refuses to load the service again if you didn’t do the bootout first. Also, daemon-reload reloads all files from the watched paths, while bootstrap is called once per service. The benefit of launchctl bootstrap is that you can load the service plist from an arbitrary location, while daemon-reload only scans its service definition directories (such as ~/.config/systemd/user).

On the other hand, if the service file is placed in the well-known location, it is loaded automatically on domain startup, which for the gui/<uid> domain is the moment when the user logs in.

After loading the service file and connecting to localhost:4445, my connection was immediately dropped. I consulted the output of launchctl print gui/501/me.kabele.smbd_forward and noticed two things. First, the service kept restarting endlessly, and second, it reported that there were still bytes to read on the associated socket. That made me suspect that the service somehow did not take over the socket and simply died, so launchd kept restarting it because the demand (a connection to the port) still existed.

Looking at all of this, I realised that I really hadn’t thought much about the whole thing. If the port is already bound by launchd, how could my ssh command possibly bind to the same port? Indeed, the stderr log file was full of messages like: bind [127.0.0.1]:4445: Address already in use. My configuration was obviously nonsense, so there was nothing left but to finally do what I should have done in the first place – look up the documentation of inetd.

To get rid of the endlessly restarting service, one must run launchctl bootout gui/501/me.kabele.smbd_forward. It took me a while to do this correctly, because I had copied the plist template from the web and hadn’t changed the Label value, which was still com.example.service. launchd differs from systemd in that service names are not derived from the service file names but are taken from the Label key instead. The bootout command therefore failed, because I tried to boot out gui/501/me.kabele.smbd_forward when I should have referenced the service as gui/501/com.example.service.

Second try

The inetd daemon can be configured to launch the services in two different ways. They are called wait and nowait.

  • wait: in this mode, inetd passes the listening socket to the launched service as file descriptor 0 and lets the service do all the work. The service must accept connections on the socket on its own and then handle the particular requests. The service manager waits until the service exits (hence the mode name) and then takes over the socket again.

  • nowait: in this mode, the service manager first accepts the connection (using accept(2)), which produces a new file descriptor that is then passed to the launched service. The service does not need to accept connections and can focus on processing the request. Here launchd never gives up the listening socket and only hands over the accepted connections. It therefore does not need to wait for the service to exit, and multiple instances of the service may run in parallel under launchd supervision.
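Seen from the service’s side, the two modes differ only in what kind of socket it is handed. The following is a minimal sketch of that difference, not launchd’s or inetd’s actual code; the function names and the loopback helper are made up for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* "wait" style: fd is the LISTENING socket, so the service itself must
 * accept(2) a connection before it can read the request. */
ssize_t serve_wait_style(int listen_fd, char *buf, size_t len) {
	assert(buf != NULL);
	int conn = accept(listen_fd, NULL, NULL);
	if (conn == -1)
		return -1;
	ssize_t n = read(conn, buf, len);
	close(conn);
	return n;
}

/* "nowait" style: the manager has already called accept(2), so fd is the
 * CONNECTED socket and the service can read the request right away. */
ssize_t serve_nowait_style(int conn_fd, char *buf, size_t len) {
	assert(buf != NULL);
	return read(conn_fd, buf, len);
}

/* Hypothetical helper for trying this out: listen on an ephemeral loopback
 * port and connect one client to it. Returns 0 on success, -1 on error. */
int make_listener_and_client(int *listen_fd, int *client_fd) {
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */

	*listen_fd = socket(AF_INET, SOCK_STREAM, 0);
	*client_fd = socket(AF_INET, SOCK_STREAM, 0);
	if (*listen_fd == -1 || *client_fd == -1)
		return -1;
	if (bind(*listen_fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
		return -1;
	if (listen(*listen_fd, 1) == -1)
		return -1;
	/* Find out which port the kernel picked. */
	if (getsockname(*listen_fd, (struct sockaddr *)&addr, &alen) == -1)
		return -1;
	/* The connection is queued by the kernel, so connect(2) succeeds even
	 * before anybody calls accept(2) on the listening socket. */
	return connect(*client_fd, (struct sockaddr *)&addr, sizeof(addr));
}
```

In the wait case the manager has to step back until the service exits, because both would otherwise compete for the same listening socket. In the nowait case each accepted connection gets its own process.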

The launchd.plist(5) provides a key called inetdCompatibility with a subkey (?) Wait that controls exactly these two types of behaviour. To make things more fun, launchd behaves in yet another way if the inetdCompatibility key is omitted, i.e. the default launchd behaviour is neither wait nor nowait.

With this knowledge I modified my plist configuration as follows (only the diff is shown).

*** service.v1.plist	2022-02-04 12:13:25.000000000 +0100
--- service.v2.plist	2022-02-04 12:21:02.000000000 +0100
***************
*** 9,12 ****
              <string>ssh</string>
-             <string>-TL4445:localhost:445</string>
              <string>srv1</string>
          </array>
--- 9,12 ----
              <string>ssh</string>
              <string>srv1</string>
+             <string>bash</string>
          </array>
***************
*** 27,29 ****
              <key>Wait</key>
!             <true/>
          </dict>
--- 27,29 ----
              <key>Wait</key>
!             <false/>
          </dict>

After modifying the plist file, it is necessary to notify launchd about the change by a combination of bootout + bootstrap (or by restarting the computer :).

Now the ssh command is launched with file descriptor zero being the accepted connection. Because ssh treats file descriptor zero as its standard input and forwards it to fd 0 of the remote command, the bash launched remotely reads whatever arrives on the connection.

This is way better! Listing open files (e.g. with lsof(1)) now really shows that file descriptor zero is an open network connection, and the remote bash really receives what I send to my local port.

This is certainly progress, but our communication is only one-way. Commands written to the socket are executed on the remote machine, and the output of these commands goes to stdout, which according to our plist file is redirected to ~/Library/Logs/me.kabele.smbd_forward.out. That is still not enough to use the socket for serious network communication like the SMB or HTTP protocol.

Working setup

For bi-directional communication, we need our ssh command to write its standard output to the same socket that is open as file descriptor zero (something like the shell redirection 1>&0).
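To see what such a redirection does at the descriptor level, here is a minimal sketch. It uses a socketpair(2) as a stand-in for the connection handed over by launchd and /dev/null as a stand-in for the log file. The function names are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Equivalent of the shell redirection OUT>&IN: after this call out_fd refers
 * to the same open socket as in_fd (dup2 closes out_fd first if it is open). */
int point_output_at_input(int in_fd, int out_fd) {
	assert(in_fd >= 0 && out_fd >= 0);
	return dup2(in_fd, out_fd) == -1 ? -1 : 0;
}

/* Demo: `out` starts as a log-like descriptor, just as stdout pointed at
 * StandardOutPath. After the redirection, a reply written to `out` arrives
 * at the peer. Returns the number of bytes the peer read, or -1 on error. */
ssize_t reply_reaches_peer(const char *reply, size_t reply_len,
                           char *buf, size_t buf_len) {
	int sv[2];
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	int out = open("/dev/null", O_WRONLY);
	if (out == -1)
		return -1;

	/* sv[1] plays the role of fd 0, `out` the role of fd 1. */
	if (point_output_at_input(sv[1], out) == -1)
		return -1;
	if (write(out, reply, reply_len) != (ssize_t)reply_len)
		return -1;

	/* sv[0] is the client's end of the connection. */
	ssize_t n = read(sv[0], buf, buf_len);
	close(out);
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

A socket is bi-directional, so a single descriptor duplicated onto both fd 0 and fd 1 is enough for a program that only knows about stdin and stdout.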

I wrote a simple C wrapper program for testing purposes. It uses the dup2(2) syscall to close fd 1 and replace it with a copy of fd 0, and then calls execve(2) to run the required program.

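#include <stdio.h>
#include <unistd.h>

/* Make stdout a copy of stdin (the socket handed over by launchd),
 * then replace this process with the program given in argv[1..]. */
int main(int argc, char **argv, char **envp) {

	if (argc < 2) {
		fprintf(stderr, "usage: %s /full/path/to/program [args...]\n", argv[0]);
		return 2;
	}

	/* dup2 closes fd 1 if it is open and makes it refer to the same socket as fd 0 */
	if (dup2(0, 1) == -1) {
		perror("Cannot dup stdout");
		return 2;
	}

	/* execve does no $PATH lookup, so argv[1] must be a full path */
	execve(argv[1], &argv[1], envp);

	/* execve returns only on failure */
	perror("Cannot exec");
	return 2;
}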

The wrapper program is now executed by launchd and then replaces itself with /usr/bin/ssh via execve. Note that ssh is now referred to by its full path, because my wrapper does not perform a $PATH lookup. The StandardOutPath key is also deleted, because it is no longer needed: stdout now goes to the socket.
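If the full path bothers you, execvp(3) does the $PATH lookup itself. Below is a sketch of such a variant of the wrapper’s core. It is not what I run; run_and_capture is a made-up harness that plays the role of launchd by giving a child one end of a socketpair(2) as fd 0.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Variant of the wrapper's core: make fd 1 a copy of fd 0, then exec argv[0]
 * with $PATH lookup (execvp) instead of requiring a full path (execve).
 * Returns only if something failed. */
int exec_with_stdout_on_stdin(char *const argv[]) {
	assert(argv != NULL && argv[0] != NULL);
	if (dup2(0, 1) == -1)
		return -1;
	execvp(argv[0], argv);
	return -1;
}

/* Hypothetical harness standing in for launchd: the child gets one end of a
 * socketpair as fd 0 and runs the wrapper logic; the parent reads whatever
 * the program printed from the other end. Returns the byte count, or -1. */
ssize_t run_and_capture(char *const argv[], char *buf, size_t len) {
	int sv[2];
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	pid_t pid = fork();
	if (pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		if (dup2(sv[1], 0) == -1)
			_exit(126);
		exec_with_stdout_on_stdin(argv);
		_exit(127); /* exec failed */
	}

	close(sv[1]); /* keep only the "client" end in the parent */
	ssize_t n = read(sv[0], buf, len);
	close(sv[0]);
	waitpid(pid, NULL, 0);
	return n;
}
```

Keep in mind that a job started by launchd may see a different $PATH than your interactive shell, so the full path used in the plist is the more predictable choice.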

*** service.v2.plist	2022-02-04 12:21:02.000000000 +0100
--- service.v3.plist	2022-02-04 13:55:58.000000000 +0100
***************
*** 8,10 ****
          <array>
!             <string>ssh</string>
              <string>srv1</string>
--- 8,11 ----
          <array>
!             <string>/Users/vitkabele/wrapper/a.out</string>
!             <string>/usr/bin/ssh</string>
              <string>srv1</string>
***************
*** 31,34 ****
          <string>/Users/vitkabele/Library/Logs/me.kabele.smbd_forward.err</string>
-         <key>StandardOutPath</key>
-         <string>/Users/vitkabele/Library/Logs/me.kabele.smbd_forward.out</string>
      </dict>
--- 32,33 ----

Let’s reload the service and try it. With this configuration I was indeed able to access a remote bash with netcat on my local port 4445 without launching the port forwarding manually. When running multiple nc localhost 4445 instances, we obtain multiple instances of the remote shell, as expected.

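The following set of changes finally produces the desired behaviour. Instead of a remote bash, ssh now runs nc on the server, which relays the connection to the server’s local SMB port 445.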

*** service.v3.plist	2022-02-04 13:55:58.000000000 +0100
--- service.v4.plist	2022-02-04 14:01:42.000000000 +0100
***************
*** 10,13 ****
              <string>/usr/bin/ssh</string>
              <string>srv1</string>
!             <string>bash</string>
          </array>
--- 10,15 ----
              <string>/usr/bin/ssh</string>
              <string>srv1</string>
!             <string>nc</string>
!             <string>localhost</string>
!             <string>445</string>
          </array>

Running mount_smbfs //localhost:4445/<share> <path> on my local machine works. I am asked for a password and then the remote directory is indeed mounted. It also works from the Finder’s graphical interface (press CMD+K in Finder and fill in localhost:4445).

Conclusion

This setup indeed yields a pretty seamless workflow with pretty low resource usage. The required forwarding is started on demand, and when it is not needed for a longer time, it is killed and its resources are released. It also does not suffer from the ubiquitous broken pipe error when you migrate between different connections, which is another reason why unconditionally running the ssh -L command in the background is not feasible.

I measured a throughput of about 5 MiB/s with pv(1), which is not awesome, but also not bad given that I am in western Germany and my server is in southern Bohemia. Care must be taken when measuring download speed, because the mounted SMB filesystem caches files, so a second measurement returns insane speeds in the hundreds of MiB/s. What I measured is the speed of copying files, which is fine, but the overall user experience is a whole different story. Listing directory contents is way too slow in both Finder and the terminal. I should not, however, blame ssh or the network for this, because it is more likely that my SMB server is misconfigured.

All fine, but the custom wrapper is a fragile piece of code and ideally it should go away. I’d expect a program for duplicating file descriptors to be present among the basic UNIX system tools, yet I didn’t find any. Maybe there is none, or maybe I just didn’t look hard enough.

In the end, I am satisfied with the result.


  1. Notice that the local port is the remote port plus 4000, because I want this whole setup to work without root privileges. The mount itself does not require them on macOS. ↩︎

  2. http://0pointer.de/blog/projects/systemd.html ↩︎