Procd Wisdom

By Greg Quinn

Introduction

This page will serve as my brain dump of all things ProcD. My goal is to highlight things that aren't obvious from looking at the code. In addition, I'm neither dying nor moving farther than 0.2 miles from the HTCondor project's home in the CS building so please feel free to let me know if something is glaringly missing from this page.

High-Level Summary of ProcD's Operation

When the ProcD starts up, it looks up its parent PID and begins monitoring that process and any child processes. Child processes are discovered by polling the system. The interval at which the ProcD polls the system can be specified via the command line (actually this is the maximum interval; see below about registering subfamilies).

For a family that the ProcD is monitoring, clients can request the following actions:

  • Get resource usage totals for the processes in the family
  • Send a signal to the family's "root" process
  • Suspend / continue all the family's processes
  • Kill all the family's processes

The ProcD supports nested families (or subfamilies). For example, the Master may start a ProcD which then begins monitoring the Master's family. When the Master starts a daemon like the StartD, it will want to be able to act on the StartD and all its children in the ways listed above. So it registers a subfamily with the ProcD. The StartD will similarly register a nested family when it spawns the Starter, as will the Starter when it spawns the job.
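The nesting described above can be pictured as a tree of families. The following is a minimal Python sketch, not the ProcD's actual data structure: the Family class, its fields, and the PIDs are all hypothetical, and it assumes that acting on a family also covers every family nested beneath it.

```python
class Family:
    """Hypothetical model of one registered (sub)family."""

    def __init__(self, root_pid, parent=None):
        self.root_pid = root_pid
        self.parent = parent
        self.pids = {root_pid}      # processes that belong directly to this family
        self.subfamilies = []       # families registered beneath this one
        if parent is not None:
            parent.subfamilies.append(self)

    def all_pids(self):
        """PIDs an action on this family would reach, including nested families."""
        result = set(self.pids)
        for sub in self.subfamilies:
            result |= sub.all_pids()
        return result


# Master -> StartD -> Starter -> job, using made-up PIDs.
master = Family(100)
startd = Family(200, parent=master)
starter = Family(300, parent=startd)
job = Family(400, parent=starter)
```

In this model, killing the StartD's family would reach the StartD, the Starter, and the job, while killing the job's family would reach only the job.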

When a new subfamily is registered, the client specifies a maximum snapshot interval. For example, the Master may only care to poll for new processes in the StartD's family every 60 seconds, but the Starter may want to check for new processes in a job's family every 5 seconds. The actual polling interval that the ProcD will use is the minimum of all these requested maximums, including the "default", which is 60 seconds if not specified on the command line.
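The interval rule above is simple arithmetic. A small Python sketch follows; the function and constant names are hypothetical and exist only for illustration.

```python
DEFAULT_MAX_SNAPSHOT_INTERVAL = 60  # seconds; the default stated in the text


def effective_polling_interval(requested_maximums,
                               default=DEFAULT_MAX_SNAPSHOT_INTERVAL):
    """Return the smallest of the default and every subfamily's requested maximum."""
    return min([default, *requested_maximums])


# The Master asks for 60 seconds on the StartD's family and the Starter asks
# for 5 seconds on the job's family, so polling happens every 5 seconds.
interval = effective_polling_interval([60, 5])
```

Note that a client asking for a longer interval than the default never slows the polling down; each request can only lower the interval.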

After registering a new subfamily, a client of the ProcD can tell the ProcD what "tracking methods" it should use in discovering what processes belong to the family. Methods include things like using environment variables, UIDs, or GIDs. Without specifying any additional methods, only PPIDs will be used.
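To illustrate the default PPID-only tracking, here is a minimal Python sketch that computes a family from one snapshot. It is not the ProcD's code: the snapshot is assumed to be a plain mapping from PID to PPID, and the function name is hypothetical.

```python
def family_members(root_pid, snapshot):
    """Return root_pid plus every PID in the snapshot that descends from it.

    snapshot maps pid -> ppid for every process seen in one poll.
    """
    # Invert the mapping so each parent lists its children.
    children = {}
    for pid, ppid in snapshot.items():
        children.setdefault(ppid, []).append(pid)

    members = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in members:
            continue
        members.add(pid)
        stack.extend(children.get(pid, []))
    return members


# Made-up snapshot: 101 and 102 descend from 100, while 200 is unrelated.
snapshot = {100: 1, 101: 100, 102: 101, 200: 1}
members = family_members(100, snapshot)
```

On UNIX, a process whose parent exits is reparented, so its PPID no longer points into the family. A PPID-only walk like this one would miss such a process, which is one reason the additional tracking methods can be useful.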

Relationship to PrivSep

The ProcD was born out of the PrivSep project. The idea was that while most root-enabled things could be accomplished with a stateless root-owned setuid binary (the condor_root_switchboard), our process family tracking code needed a root-running daemon. In hindsight, I think this may have been a mistake (see ticket #110). Even without the ProcD's supposed benefit for PrivSep it is still a big win for scalability, particularly as we move toward more and more cores. Previously, HTCondor was executing the process tracking code in each Starter in addition to in the StartD and Master. In addition, the ProcD has some other performance advantages over the older process tracking code, like using a hash table when snapshotting to "remember" whether processes that have been seen before are in families we are interested in or not.
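The "remembering" idea can be sketched as a small cache in Python. This is an illustration only, not the ProcD's implementation: the class and method names are hypothetical, and keying entries on (pid, start_time) to guard against PID reuse is an assumption of this sketch.

```python
class FamilyMembershipCache:
    """Remembers, per process, which tracked family it belongs to (or None)."""

    def __init__(self, classify):
        self._classify = classify  # the expensive lookup, supplied by the caller
        self._known = {}           # (pid, start_time) -> family name or None

    def lookup(self, pid, start_time):
        key = (pid, start_time)
        if key not in self._known:
            # First sighting: do the expensive work once and remember the answer,
            # including a None answer for processes we do not care about.
            self._known[key] = self._classify(pid)
        return self._known[key]

    def prune(self, live_keys):
        """Drop entries for processes absent from the latest snapshot."""
        live = set(live_keys)
        self._known = {k: v for k, v in self._known.items() if k in live}
```

With a cache like this, each snapshot only pays the classification cost for processes that appeared since the previous poll.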

There are some important consequences of the ProcD's PrivSep heritage. The fact that the ProcD is included in the "trusted" portion of the PrivSep architecture inspired us to make it include as little code as possible. As such, it does not use dprintf logging, doesn't use HTCondor's configuration subsystem, and doesn't (currently) use some handy features like the Google core dumper. Logging is handled in a very basic way and is missing crucial features that dprintf has, like rotation. As a result, the ProcD's log file is disabled by default. Configuration is handled via the command line. An HTCondor daemon starting the ProcD will read its configuration parameters out of the HTCondor config file and then place those on the ProcD's command line.

Named Pipe Communication

The ProcD uses named pipe IPC for communication with its clients. This is true on both UNIX and Windows, though what "named pipe" means is wildly different in the two contexts. Most of the differences are hidden away in the LocalClient/LocalServer classes in the ProcD. The main user-visible difference is how named pipes are actually named. On UNIX, they exist as nodes in the file system. On Windows, they use a separate namespace. Ticket #292 includes further detail.

The ProcD relies on OS-level authorization mechanisms to ensure that only HTCondor can make requests of it. On Windows, everything is running as SYSTEM so this is simple. On UNIX, the ProcD runs as root but it must allow access to HTCondor daemons that may not be root (think PrivSep). To do this, the ProcD chowns the named pipe it uses for incoming requests to the HTCondor UID (which HTCondor must provide as a command line argument when starting the ProcD).
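On UNIX, the chown approach above can be sketched with standard calls. This Python example is only an illustration: the function name, the path handling, and the 0600 permission mode are assumptions of the sketch, not details taken from the ProcD.

```python
import os
import stat


def create_request_pipe(path, client_uid):
    """Create a FIFO for incoming requests and hand ownership to client_uid."""
    os.mkfifo(path)
    # Assumed mode: owner-only read/write, so the file owner is the only
    # non-root user able to open the pipe. Set it explicitly so the result
    # does not depend on the umask.
    os.chmod(path, 0o600)
    # A gid of -1 leaves the group unchanged. Changing the owner to another
    # user requires root, which matches the ProcD running as root.
    os.chown(path, client_uid, -1)
    return os.stat(path)
```

A caller could check the result with stat.S_ISFIFO(st.st_mode) and st.st_uid on the returned value.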

Integration with the Rest of HTCondor

Access to the ProcD from other HTCondor code is provided via DaemonCore. Create_Process has a parameter of type FamilyInfo that can be used when the created process should be registered as a subfamily with the ProcD. The FamilyInfo struct contains information regarding the maximum acceptable snapshot interval and what tracking methods to use. After the Create_Process call, the ProcD's services are accessible via other calls into DaemonCore (e.g., Get_Family_Usage and Kill_Family).

Each daemon can be configured to either use the ProcD for process tracking or to use the old-school KillFamily class. The reason is that when the ProcD is enabled and it crashes, any daemon using it will EXCEPT. Early on in the ProcD's existence this happened to several of the Master daemons in our CS pool, which was enough to get the old KillFamily code back as an alternative. The ProcFamilyInterface, ProcFamilyProxy, and ProcFamilyDirect classes in condor_c++_util provide the common interface to these different implementations. The current default is for the Master to use the old code and all other daemons to use the ProcD.

An HTCondor daemon configured to use a ProcD will share the ProcD of its parent if one is available. Environment variables are used to communicate whether this is the case. See the ProcFamilyProxy constructor for details.
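The sharing decision can be sketched as follows in Python. The variable name EXAMPLE_PROCD_ADDRESS and the function are hypothetical stand-ins, not HTCondor's real names; the ProcFamilyProxy constructor remains the authority on what is actually checked.

```python
ADDRESS_VAR = "EXAMPLE_PROCD_ADDRESS"  # hypothetical name, not HTCondor's real variable


def find_or_start_procd(environ, start_procd):
    """Return (address, started_here) for the ProcD this daemon should use.

    environ is a mapping such as os.environ; start_procd is a callable that
    launches a new ProcD and returns its address.
    """
    address = environ.get(ADDRESS_VAR)
    if address:
        # A parent daemon already runs a ProcD, so share it.
        return address, False
    address = start_procd()
    # Export the address so that child daemons find and reuse this ProcD.
    environ[ADDRESS_VAR] = address
    return address, True
```

Passing the environment in as a parameter keeps the sketch easy to exercise with a plain dictionary instead of the real process environment.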

An HTCondor daemon configured to use the ProcD will start it on demand. Specifically, the first call to DaemonCore::Create_Process with a non-NULL FamilyInfo argument will result in the ProcD being spawned. Alternatively, a daemon can force the ProcD to be started with a call to DaemonCore::Proc_Family_Init. The SchedD does this since it never calls Create_Process with a non-NULL FamilyInfo argument, but we want any local universe Starters that are created to share a single ProcD instead of having them each create their own.
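The on-demand behavior is a form of lazy initialization. The Python sketch below mirrors the two entry points named above, but it is not DaemonCore's code: the class, its method signatures, and the spawn_procd callable are hypothetical.

```python
class ProcFamilyClient:
    """Hypothetical stand-in for the part of DaemonCore that owns the ProcD."""

    def __init__(self, spawn_procd):
        self._spawn_procd = spawn_procd  # callable that launches a ProcD
        self._procd = None               # nothing is started up front

    def proc_family_init(self):
        """Force the ProcD to exist, starting it at most once."""
        if self._procd is None:
            self._procd = self._spawn_procd()
        return self._procd

    def create_process(self, argv, family_info=None):
        # Only a request to register a subfamily triggers the ProcD.
        if family_info is not None:
            self.proc_family_init()
        return list(argv)  # stand-in for actually launching the process
```

In this model, a daemon like the SchedD that never passes family information would call proc_family_init() itself, so that the processes it creates can find and share the already-running ProcD.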

OSG ProcD

Oh yeah, then there's the OsgProcd, documented on a separate page.