I’ve been playing with xenserver lately to quickly model small Hadoop clusters. One of the frustrating things about xenserver is the lack of good graphical user interfaces that provide even a minimal amount of automation. This means I’m frequently dropping to the command line on the xenserver master and running the xe tools by hand in order to speed up certain operations. This can get tedious, so I started playing with Docker on my Mac to run the xe tools over an encrypted tunnel (via stunnel).
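For context, the end state I’m after is just running xe inside a container and pointing it at the pool master over the network, something like the following. The host name and password are placeholders; -s, -u, and -pw are xe’s standard flags for addressing a remote server:

[root@8a4a435df6a8 /]# xe -s xenserver-master.example.com -u root -pw secret vm-list params=uuid --minimal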
Down the stunnel rabbit hole
Now, xe handles the stunnel setup pretty seamlessly. You provide it a username, password, and the name of the xenserver master, and it does the rest. When I started using this under Docker, I quickly encountered weird behavior between the two tools. No matter what I did, stunnel would throw logging output regarding the session setup.
[root@8a4a435df6a8 stunnel]# xe vm-list name-label=vm params=uuid --minimal
stunnel: LOG5[137:139863907756096]: stunnel 4.56 on x86_64-redhat-linux-gnu platform
stunnel: LOG5[137:139863907756096]: Compiled/running with OpenSSL 1.0.1e-fips 11 Feb 2013
stunnel: LOG5[137:139863907756096]: Threading:PTHREAD Sockets:POLL,IPv6 SSL:ENGINE,OCSP,FIPS Auth:LIBWRAP
stunnel: LOG5[137:139863907756096]: Reading configuration from descriptor 4
stunnel: LOG5[137:139863907756096]: FIPS mode is disabled
stunnel: LOG5[137:139863907756096]: Configuration successful
stunnel: LOG5[137:139863907756096]: Service [stunnel] accepted connection from unnamed socket
stunnel: LOG5[137:139863907756096]: connect_blocking: connected 10.1.1.32:443
stunnel: LOG5[137:139863907756096]: Service [stunnel] connected remote server from 172.17.0.2:41462
6c281c5c-375c-48f8-0917-75cfc414f671
As you can see, that’s a bunch of crap dumped out. At first, I thought the logs were coming in via stderr, but attempts to redirect it went nowhere; it kept showing up. When running the same xe command on a non-Dockerized machine, the output looked correct: I got the uuid with no weird logging output.
realhost# xe vm-list name-label=vm params=uuid --minimal
6c281c5c-375c-48f8-0917-75cfc414f671
When good guesses go bad
My hunch was that the logging was being written out on some file descriptor other than stderr. The quick way to confirm this was to run the command through strace and see what was happening under the hood as the syscalls were invoked.
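The excerpts below came from an invocation roughly like this one; the exact flags and the /tmp/xe.trace path are my reconstruction, but -f follows the forked stunnel child, -tt and -T account for the timestamps and per-call timings in the output, and -o keeps everything in a file for later grepping:

[root@8a4a435df6a8 stunnel]# strace -f -tt -T -o /tmp/xe.trace xe vm-list name-label=vm params=uuid --minimal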
20:17:28.671410 execve("/usr/bin/stunnel", ["/usr/bin/stunnel", "-fd", "4"], [/* 10 vars */]) = 0 <0.000637>
The initial entry point that I found was possibly on file descriptor 4, but I realized this was being used as a communications input path from xe to stunnel, not output.
Further up in the strace output, I found where the stdin, stdout, and stderr file descriptors were duplicated during the process fork, but they were eventually closed out, so none of the standard channels were even in use. Where could the output still be coming from?
My next stop was to see if there was anything interesting in and around the write() syscall outputting one of the messages.
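A quick way to zero in on them is to grep the trace for the LOG5 prefix and pull in a few lines of surrounding context (again assuming the trace landed in /tmp/xe.trace):

[root@8a4a435df6a8 stunnel]# grep -n -A 12 'LOG5' /tmp/xe.trace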
95 20:17:28.831429 write(2, "2016.10.05 20:17:28 LOG5[95:140377872062528]: Compiled/running with OpenSSL 1.0.1e-fips 11 Feb 2013\n", 100) = 100 <0.000251>
95 20:17:28.831867 mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fac45337000 <0.000107>
95 20:17:28.832108 mremap(0x7fac45337000, 12288, 4096, MREMAP_MAYMOVE) = 0x7fac45337000 <0.000106>
95 20:17:28.832307 socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4 <0.000126>
95 20:17:28.832610 connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory) <0.000109>
95 20:17:28.832811 close(4) = 0 <0.000148>
95 20:17:28.833161 open("/dev/console", O_WRONLY|O_NOCTTY) = 4 <0.000115>
95 20:17:28.833375 fstat(4, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0 <0.000062>
95 20:17:28.833619 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fac45339000 <0.000050>
95 20:17:28.833801 lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) <0.000046>
95 20:17:28.833927 write(4, "stunnel: LOG5[95:140377872062528]: Threading:PTHREAD Sockets:POLL,IPv6 SSL:ENGINE,OCSP,FIPS Auth:LIBWRAP\r\n", 106) = 106 <0.000119>
95 20:17:28.834134 munmap(0x7fac45339000, 4096) = 0 <0.000057>
95 20:17:28.834354 close(4) = 0 <0.000063>
Here’s where it gets interesting.
Docker always gets you
Recall that I’m running this under Docker. For this test, I was doing something like:
$ docker run -i -t centos:7 /bin/bash
[root@8a4a435df6a8 /]#
This ends up running a pretty cut-down environment where bash is doing the work of systemd or init. Nothing else is running.
[root@8a4a435df6a8 stunnel]# ps
  PID TTY          TIME CMD
    1 ?        00:00:00 bash
  139 ?        00:00:00 ps
Ordinarily, this isn’t a big deal, except in the case where a tool is attempting to use syslog messages to log information. Which stunnel does.
95 20:17:28.832610 connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory) <0.000109>
95 20:17:28.832811 close(4) = 0 <0.000148>
95 20:17:28.833161 open("/dev/console", O_WRONLY|O_NOCTTY) = 4 <0.000115>
A lightbulb appears
Here’s my “aha!” moment. stunnel is attempting to talk to a local syslog daemon via the /dev/log device (a unix socket). On a full install of a system, this special device file exists to provide a pathway for system-local processes to communicate with syslog. But in a cut-down Docker environment, not everything is fully populated inside the container. So what happens?
- stunnel starts.
- It attempts to open the unix socket to communicate and log data to syslog.
- The syscall fails with “No such file or directory” because /dev/log doesn’t exist (easy to confirm by hand, as shown below).
- stunnel falls through to a backup course of communication by looking for /dev/console.
- stunnel finds it, opens it, and writes the log message to it.
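That chain is easy to sanity-check from inside the container; the output below is just what the strace already tells us to expect, rather than a verbatim capture:

[root@8a4a435df6a8 stunnel]# ls /dev/log
ls: cannot access /dev/log: No such file or directory
[root@8a4a435df6a8 stunnel]# ls /dev/console
/dev/console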
But, why should I be seeing this? Am I running from a console? Yes. Because of the way I’ve invoked my container, the console is tied to my terminal, which I confirmed with the following:
[root@8a4a435df6a8 stunnel]# tty
/dev/console
stunnel was using a failsafe logging mechanism and writing directly to my terminal. The log output would show up on my screen, but because it wasn’t flowing through the process’s stdout or stderr, no amount of shell redirection would ever catch it.
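That also explains why my earlier redirection attempts went nowhere. Even with stderr thrown away, the messages still land on the terminal (output abridged):

[root@8a4a435df6a8 stunnel]# xe vm-list name-label=vm params=uuid --minimal 2>/dev/null
stunnel: LOG5[137:139863907756096]: stunnel 4.56 on x86_64-redhat-linux-gnu platform
...
6c281c5c-375c-48f8-0917-75cfc414f671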
I’m not sure how I’ll work around this. I do want to be able to run xe commands from my Mac, but I don’t want to go through the steps of building the xapi-xe packages natively; I tried that and failed after a few hours of staring at OCaml code.
For now, this is just annoying output and doesn’t hamper my ability to automate xenserver commands from the container, so I’ll just live with it.
A happy ending
A week after posting this, a few alternative methods were found to address the problem. I’m a happier person now.
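I won’t claim this is exactly what the suggestions were, but one fix falls straight out of the diagnosis above: give the container a /dev/log socket so stunnel’s syslog path succeeds and it never falls back to /dev/console. A throwaway datagram listener is enough; socat is just my choice here, and anything that binds a datagram socket at /dev/log and discards what it reads would do the same job:

[root@8a4a435df6a8 /]# yum install -y socat
[root@8a4a435df6a8 /]# socat -u UNIX-RECV:/dev/log /dev/null &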