Advanced Namespace Tools blog

05 February 2018

Setting up a new public grid

Fully ten years after the first attempt to make a public grid of 9p services, 9gridchan finally has a useful and usable implementation of the concept. Right now, a small but brave band of Grid Pioneers is working collaboratively to explore the possibilities of a Net that Never Was, but May Yet Be - a shared environment built from a distributed set of 9p fileservers. For information on connecting to and using the grid, visit wiki.9gridchan.info.

This is the behind-the-scenes story of how the grid is assembled, complete with gory technical information. Send your children out of the room, because we will be sharing the explicit, unexpurgated shell commands used to create it.

Overall Architecture

The current grid is provided by four machines. A central root fileserver running venti+fossil is the foundation, and 3 cpu servers tcp boot from it. This arrangement is convenient because all the servers share the same root fs, but it has the disadvantage that an issue with the root fileserver will toast the entire grid. This was dramatized by the fact that my VPS provider, the generally admirable vultr, decided to reboot that node 15 minutes after I went to sleep with the public grid still on the front page of hn. Because I had yet to create automated setup/restart/failure-detection scripts, this caused a regrettable service outage at an inconvenient moment, but so it goes. It might be better to shift to an architecture without a central point of failure like this; nothing other than convenience requires all the grid servers to share the same root filesystem.

Anyway, 3 cpus tcp boot from the main fileserver. Inferno runs on one of them to provide a service registry; that machine and the others then provide a chat service, a wiki service, a writable ramfs, a read-only export of the root fs, a plumber with other grid resources in its namespace, and an additional publicly writable registry service. The 'gridstart' script dials the registry and obtains the list of available services, then dials and mounts them and starts a subrio in the grid namespace with several applications running. This subrio acts as a collaborative environment: an irc-like chat, a shared filesystem for uploads and downloads, and a shared plumber that auto-opens plumbed links in the mothra web browser, graphics/document files in the page viewer, and text files in the acme editor. The end result is a collective environment which still maintains a sane privilege boundary between systems, because they are simply accessing the same filesystem data via 9p.

Registry Service

The inferno registry fs keeps a dynamic list of available services. Servers mount the registry, open the 'new' file and write a string that defines the location of the service and its attributes. Plan 9 doesn't support this mechanism by default, so I modified the 'listen1' command into 'gridlisten1' to also announce the started service to an inferno registry. The registry tracks what services are available by the fact that clients 'hold on' to the file descriptor. Once it is closed, the service is removed from the registry. Here is the relevant 'doregister' function body:

int regfd;
char *loc;
switch(rfork(RFPROC|RFFDG)) {
case -1:
	fprint(2, "error forking\n");
	break;
case 0:
	/* child: prefer the public ip from the environment, fall back to sysname */
	if((loc=getenv("myip")) == 0)
		loc=getenv("sysname");
	regfd=open("/mnt/registry/new", OWRITE);
	fprint(regfd, "tcp!%s!%s sys %s service %s mountpoint %s is %s", loc, announce+6, getenv("sysname"), service, mountpoint, description);
	/* hold regfd open forever; the registry drops the entry when it is closed */
	for(;;)
		sleep(1000);
	break;
default:
	/* parent: return to the listener's main flow */
	return;
}

So when gridlisten1 calls doregister, the registry-writing process is forked off; it tries to get 'myip' from the environment and writes a string that includes the service name and desired mountpoint parameters that were passed to the command. That process sleeps forever without closing regfd while the main flow of control returns to the listener process.
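
For example, with myip set to 107.191.50.176, the gridchat listener shown later (port 9997, advertised mountpoint /n/chat) would write a registration string along these lines - reconstructed from the fprint format above, not copied from a live registry:

tcp!107.191.50.176!9997 sys cpu32 service /bin/exportfs mountpoint /n/chat is gridchat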

Once all the nodes are booted and ready to start providing grid services, the first thing to set up is the grid registry. Inferno currently requires a 32-bit environment to run on plan9, so a machine named cpu32 hosts it. We want the Inferno to run in the background so we put it in a hubfs like this:

hubfs -s regemu
mount -c /srv/regemu /n/hubfs
touch /n/hubfs/regemu0 /n/hubfs/regemu1 /n/hubfs/regemu2
emu </n/hubfs/regemu0 >>/n/hubfs/regemu1 >>[2]/n/hubfs/regemu2 &

That starts the emu running inside the hubfs. Now we want to tell the inferno to start registry services and export them to the Plan 9 environment via the host machine's /srv. We can either attach to the hubfs or write the commands to the input hub file descriptor. The following are inferno commands:

bind -c '#₪' /srv
mount -A -c {ndb/registry} /mnt/registry
9srvfs registry /mnt/registry &
mount -A -c {ndb/registry} /n/pubregistry
9srvfs pubregistry /n/pubregistry &
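
If we are not attached to the hubfs interactively, the same commands can instead be appended to the emu's input hub from the host's rc. A sketch (note the doubled quotes rc needs in order to pass the literal single quotes through to the inferno shell):

echo 'bind -c ''#₪'' /srv' >>/n/hubfs/regemu0
echo 'mount -A -c {ndb/registry} /mnt/registry' >>/n/hubfs/regemu0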

Now we want to export these registries publicly. We also need to add a line to /lib/namespace.local so that a listener that changes to 'none' will mount the registry in its namespace; the added line is:

mount -c /srv/registry /mnt/registry

Now we will use the gridlisten1 utility to export the registry service and announce it to itself, and put the listener in a hubfs to capture its output and allow us to easily terminate it if desired. We also want to allow none to mount the services and specify our ip.

chmod 666 /srv/registry
chmod 666 /srv/pubregistry
myip=107.191.50.176
hub listenreg
gridlisten1 -v -d gridregistry -m /mnt/registry tcp!*!6675 /bin/exportfs -S /srv/registry
%detach
hub listenpubreg
gridlisten1 -v -d pubregistry -m /n/pubregistry tcp!*!7675 /bin/exportfs -S /srv/pubregistry
%detach
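
At this point the plain registry can be checked from any other Plan 9 machine; a quick sanity test might look roughly like this:

srv tcp!107.191.50.176!6675 gridregistry
mount -c /srv/gridregistry /n/registry
lc /n/registry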

The grid also provides tls-encrypted versions of the services, using 9front's ability to wrap a tls connection with dp9ik auth. The password is publicly known; we are just using it to set up tls transport encryption. We can use a trampoline to accept incoming calls, wrap them in encryption, and proxy to the services on the local machine. Because we need to stay in the namespace holding the factotum we added the grid key to, we add the -t flag to gridlisten1 so it does not change to the 'none' user. Still in the same shell, so myip is still in the env:

auth/factotum
echo 'key proto=dp9ik user=glenda dom=grid !password=9gridchan' >/mnt/factotum/ctl
mount -c /srv/registry /mnt/registry
hub listenregtls
gridlisten1 -tv -d gridregistry -m /mnt/registry tcp!*!16675 tlssrv -A /bin/aux/trampoline tcp!127.1!6675
%detach
hub listenpubregtls
gridlisten1 -tv -d pubregistry -m /n/pubregistry tcp!*!17675 tlssrv -A /bin/aux/trampoline tcp!127.1!7675
%detach
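
On the client side, those tls endpoints pair with the same public grid key: load the key into factotum and dial with a tls-capable helper. A sketch, assuming 9front's srvtls helper (dp9ik auth, then tls) is available on the client; the exact helper and flags may differ on your system:

echo 'key proto=dp9ik user=glenda dom=grid !password=9gridchan' >/mnt/factotum/ctl
srvtls tcp!107.191.50.176!16675 gridregistry
mount -c /srv/gridregistry /n/registry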

Additional services

Because registry service is very light in terms of load and data transfer, it makes sense to put some more services on this box as well: the hubfs chat server and the plumber. The plumber setup will have to wait until it can mount the other grid services, but hubfs chat can be set up now. There are a couple of minor modifications to the standard version of hubfs to make it slightly more resilient as a chat service. Files cannot be removed or truncated, and the total number of hubs is arbitrarily capped at 7. A larger number of hubs will make sense if the userbase increases and needs more channels. The diff is minimal:

< 	MAXHUBS = 77,				/* Total number of hubs that can be created */
---
> 	MAXHUBS = 7,				/* Total number of hubs that can be created */
88a89
> void fsremove(Req *r);
97a99
> 	.remove = fsremove,
412a415,420
> void
> fsremove(Req *r)
> {
> 	respond(r, "Hub removal prohibited");
> }
> 
429a438
> /*
434a444
> */

The comment marks in the diff enclose the "if(r->ifcall.mode&OTRUNC)" block, disabling truncation. Setting up the chat hubfs, still in the same shell used earlier:

hubfs -s gridchat
mount -c /srv/gridchat /n/gridchat
touch /n/gridchat/chat
chmod 666 /srv/gridchat
hub listenchat
gridlisten1 -v -d gridchat -m /n/chat tcp!*!9997 /bin/exportfs -S /srv/gridchat
%detach
hub listenchattls
gridlisten1 -tv -d gridchat -m /n/chat tcp!*!19997 tlssrv -A /bin/aux/trampoline tcp!127.1!9997
%detach
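
On a client that has mounted the chat, using it is just ordinary file i/o on the hub; a rough manual sketch (the gridstart environment wraps this up more comfortably):

cat /n/chat/chat &
echo 'glenda: hello grid' >>/n/chat/chat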

The wiki service runs on a dedicated vps node because it may receive substantial web traffic separate from the grid usage. On one of the other tcp boot cpus, we will need to import the registry /srv, and we won't be able to change to none because the imported /srv won't be in none's namespace.

[myip and factotum boilerplate omitted]
rimport -a cpu32 /srv
mount -c /srv/registry /mnt/registry
wikifs -p 666 -s gridwiki /sys/lib/wiki
hub listenwiki
gridlisten1 -tv -d gridwiki -m /mnt/wiki tcp!*!17035 /bin/exportfs -S /srv/gridwiki
%detach
hub listenwikitls
gridlisten1 -tv -d gridwiki -m /mnt/wiki tcp!*!27035 tlssrv -A /bin/aux/trampoline tcp!127.1!17035
%detach

At this point you are probably beginning to see a pattern with all this, but I will continue through the rest of the services. We also have a dedicated node for gridram, because it is one of the heavier services, both in ram usage (obviously) and network i/o.

[myip, factotum, registry import boilerplate omitted]
ramfs -S gridram
chmod 666 /srv/gridram
hub listenram
gridlisten1 -tv -d gridram -m /n/gridram tcp!*!9996 /bin/exportfs -S /srv/gridram
[parallel tls service boilerplate omitted]

We also want to serve a read only export of the root of the main grid server, so on that machine we do some additional shenanigans for security reasons. In particular, we definitely want to be none for this export, so we srvfs a mount of the registry import, which keeps /srv/registry mountable even after the change to none:

rimport cpu32 /srv /n/c32srv
mount -c /n/c32srv/registry /n/registry
srvfs registry /n/registry
hub listenroot
gridlisten1 -v -d gridroot -m /n/gridroot tcp!*!564 /bin/exportfs -R -S /srv/boot

With all of this setup, we are ready to go back to cpu32 to set up the grid plumber. The plumber has a minor change: its rules file is immutable, and the default rules are changed substantially.

96c96
< 	{ "rules",		QTFILE,	Qrules,		0600 },
---
> 	{ "rules",		QTFILE,	Qrules,		0400 },

The rules are changed mostly so that the plumber never runs any applications, just sends messages. The reason is that since the plumber runs on a grid server, starting applications would start them on the grid server, not in the workspace of grid clients! Additionally, one of the default plumb rules would be potentially disastrous in this situation:

type	is	text
data	matches	'Local (.*)'
plumb	to	none
plumb	start	rc -c $1

That rule would allow grid clients to cause the plumber to run arbitrary code on the grid server, an obvious security foot-gun. So the plumb rules are stripped down to simply transmit messages for opening links and files in the appropriate running applications in the client namespace, and the rules file is made read-only to keep it that way. The reason the grid plumber needs to be set up last is that for the plumber to work, it needs the gridram and gridroot to be mounted in its namespace, so that plumbed messages referencing those files will be transmitted.
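The rules that remain are of the harmless, message-only kind; a representative sketch (not the exact gridplumbrules content) that sends urls to whatever is listening on the web port in the client's own namespace:

type	is	text
data	matches	'https?://[^ ]+'
plumb	to	web

With gridram and gridroot mounted, the restricted plumber is started and exported like so: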

rimport -a cpu64 /srv
mount /srv/gridram /n/gridram
mount /srv/boot /n/gridroot
safeplumber -p gridplumbrules
[check /srv for the id of this plumber]
chmod 666 /srv/plumb.foo.pid
hub listenplumber
gridlisten1 -tv -d gridplumber -m /mnt/plumb tcp!*!9998 /bin/exportfs -S /srv/plumb.foo.pid

End results

With all this done, we now have a registry exported that looks about like this:

tcp!107.191.50.176!6675 is gridregistry mountpoint /mnt/registry service /bin/exportfs sys cpu32
tcp!107.191.50.176!9997 is gridchat mountpoint /n/chat service /bin/exportfs sys cpu32
tcp!107.191.50.176!16675 is gridregistry mountpoint /mnt/registry service /bin/tlssrv sys cpu32
tcp!107.191.50.176!19998 is gridplumber mountpoint /mnt/plumb service /bin/tlssrv sys cpu32
tcp!45.76.231.117!9996 is gridram mountpoint /n/gridram service /bin/exportfs sys cpu64
tcp!45.76.231.117!19996 is gridram mountpoint /n/gridram service /bin/tlssrv sys cpu64
tcp!45.63.75.148!10564 is gridroot mountpoint /n/gridroot service /bin/tlssrv sys pgvf
tcp!107.191.50.176!17675 is pubregistry mountpoint /n/pubregistry service /bin/tlssrv sys cpu32
tcp!45.76.22.6!17035 is gridwiki mountpoint /mnt/wiki service /bin/exportfs sys gridwiki
tcp!45.76.22.6!27035 is gridwiki mountpoint /mnt/wiki service /bin/tlssrv sys gridwiki
tcp!107.191.50.176!9998 is gridplumber mountpoint /mnt/plumb service /bin/exportfs sys cpu32
tcp!107.191.50.176!19997 is gridchat mountpoint /n/chat service /bin/tlssrv sys cpu32
tcp!107.191.50.176!7675 is pubregistry mountpoint /n/pubregistry service /bin/exportfs sys cpu32

This is processed by the gridstart script into a set of commands for mounting the resources, according to whether or not the -t flag for tls service is present.
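
A minimal sketch of that processing, assuming the listing above has been captured to a hypothetical file called reglist and ignoring the tls entries (the real gridstart script does considerably more, including the tls handling and starting the subrio):

grep 'service /bin/exportfs' reglist |
	awk '{print "srv " $1 " " $3 " && mount -c /srv/" $3 " " $5}' | rc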

This grid project has so far been surprisingly successful - users have been finding the set of services synergistic and enjoyable to work with. We have seen several newcomers to Plan 9 successfully join and explore the concepts of the operating system, and we think this framework has the potential to expand and gain additional utility. There are some obvious possible extensions, such as offering fully usable shells on grid servers, but this might require a system of accounts and passwords rather than the completely open access that has existed until now. On the other hand, the current users have been entirely constructive and non-abusive, so perhaps nothing bad would happen if we allowed unrestricted code execution on a node dedicated to that purpose. In general, providing Plan 9 services in an open fashion has so far never resulted in malicious mischief - people interested in exploring Plan 9 don't seem to be the type who want to spoil the fun for everyone. Thanks to everyone who has participated so far. I think this phase of the grid is just a beginning - this feels like something Plan 9 was meant to do, and we can expand the range of services and functionality considerably.

I hope some interested readers will join us on the grid.