Set Process Title

From CodeCodex

On *nix systems, a process listing (for example with ps(1)) will normally show the command-line arguments passed to each process. Some programs overwrite their command-line argument area with an informative message to give the user an idea of what they’re doing. For example, on my currently-running Debian system, a ps listing shows lines like

root     25403 25368  0 Aug09 ?        00:00:23 hald-addon-input: Listening on /dev/input/event2 /d
root     25502 25368  0 Aug09 ?        00:06:55 hald-addon-storage: polling /dev/sr0 (every 2 sec)

How does a program do this? On BSD derivatives, the standard C library routine setproctitle(3) takes care of the details. Linux, however, seems to have no standard equivalent. The following is the closest I’ve been able to come to the BSD routine; unlike the BSD version, it needs argv passed in as an extra first argument.

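#include <stdarg.h>    /* va_list, va_start, va_end */
#include <stdbool.h>   /* bool, true, false */
#include <stdio.h>     /* vsnprintf */
#include <stdlib.h>    /* malloc */
#include <string.h>    /* strlen, strdup, memset, memcpy */
#include <sys/types.h> /* ssize_t */

extern char **
    environ;
static char *
    argstart = NULL; /* start of the argument string area, set on first call */
static size_t
    maxarglen; /* maximum available size of argument area */
static bool
    envmoved = false; /* true once relocating the environment has been tried */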

void setproctitle
  (              
    char ** argv, /* argv as passed to main, so args can be moved if necessary */
    const char * fmt, /* printf(3)-style format string for process title */
    ... /* args to format string */
  )                                                                              
  /* something as close as possible to BSD setproctitle(3), but for Linux.
    Note I need argv as passed to main, in order to be able to poke the process
    arguments area. Also don't call routines like putenv(3) or setenv(3)
    prior to using this routine. */
  {
    char title[512]; /* big enough? */
    ssize_t titlelen;
      {
        va_list args;
        va_start(args, fmt);
        titlelen = vsnprintf(title, sizeof title, fmt, args);
        va_end(args);
        if (titlelen < 0)
          {
            titlelen = 0; /* ignore error */
            title[0] = 0;
          } /*if*/
        titlelen += 1; /* including trailing nul */
        if ((size_t)titlelen > sizeof title)
          {
          /* title was truncated; vsnprintf has already nul-terminated it,
            so the explicit store below is only a safeguard */
            title[sizeof title - 1] = '\0';
            titlelen = sizeof title;
          } /*if*/
      }
    if (argstart == NULL)
      {
      /* first call, find and initialize argument area */
        char ** thisarg = argv;
        maxarglen = 0;
        argstart = *thisarg;
        while (*thisarg != NULL)
          {
            maxarglen += strlen(*thisarg++) + 1; /* including terminating nul */
          } /*while*/
        memset(argstart, 0, maxarglen); /* clear it all out */
      } /*if*/
    if (titlelen > maxarglen && !envmoved)
      {
      /* relocate the environment strings and use that area for the command line
        as well */
        char ** srcenv;
        char ** dstenv;
        char ** newenv;
        size_t envlen = 0;
        size_t nrenv = 1; /* nr env strings + 1 for terminating NULL pointer */
        if (argstart + maxarglen == environ[0]) /* not already moved by e.g. libc */
          {
            srcenv = environ;
            while (*srcenv != NULL)
              {
                envlen += strlen(*srcenv++) + 1; /* including terminating nul */
                ++nrenv; /* count 'em up */
              } /*while*/
            newenv = (char **)malloc(sizeof(char *) * nrenv); /* new env array, never freed! */
            if (newenv != NULL) /* on allocation failure, leave the environment alone */
              {
                srcenv = environ;
                dstenv = newenv;
                while (*srcenv != NULL)
                  {
                  /* copy the environment strings (strdup failure is not checked) */
                    *dstenv++ = strdup(*srcenv++);
                  } /*while*/
                *dstenv = NULL; /* mark end of new environment array */
                memset(environ[0], 0, envlen); /* clear old environment area */
                maxarglen += envlen; /* this much extra space now available */
                environ = newenv; /* so libc etc pick up new environment location */
              } /*if*/
          } /*if*/
        envmoved = true;
      } /*if*/
    if (titlelen > maxarglen)
      {
        titlelen = maxarglen; /* truncate to fit available area */
      } /*if*/
    if (titlelen > 0)
      {
      /* set the new title */
        const size_t oldtitlelen = strlen(argstart) + 1; /* including trailing nul */
        memcpy(argstart, title, titlelen);
        argstart[titlelen - 1] = '\0'; /* if not already done */
        if (oldtitlelen > titlelen)
          {
          /* wipe out remnants of previous title */
            memset(argstart + titlelen, 0, oldtitlelen - titlelen);
          } /*if*/
      } /*if*/
  } /*setproctitle*/

Theory of Operation

The command-line arguments and environment variables for a process are delimited by the fields arg_start, arg_end, env_start and env_end of the mm_struct object defined in include/linux/mm_types.h in the current kernel sources. These areas are set up in different ways depending on the executable format; for ELF, see the routine create_elf_tables in fs/binfmt_elf.c. That routine puts the envp pointer array immediately after the null entry that ends the argv pointer array, both on the userspace stack. The strings themselves are stored contiguously: the argument strings begin at mm_struct.arg_start, and the environment strings begin at mm_struct.env_start, which immediately follows mm_struct.arg_end. The current process command line is made visible to utilities like ps(1) via the proc_pid_cmdline routine in fs/proc/base.c. This routine checks that the last byte of the argument area (the one just before mm_struct.arg_end) is still a null; if not, it assumes the process has overwritten its argument and environment area with an extra-long title, and appends the extra data beginning at mm_struct.env_start as well.
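To see what the kernel currently reports for a process, one option is to read /proc/self/cmdline directly, since that is the data the routine described above produces. The following is a minimal sketch, assuming a Linux system with procfs mounted at /proc; the helper names read_own_cmdline and count_cmdline_strings are made up for illustration and are not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the raw contents of /proc/self/cmdline into buf.
   Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_own_cmdline(char *buf, size_t buflen)
{
    FILE *f = fopen("/proc/self/cmdline", "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, buflen, f);
    fclose(f);
    return (long)n;
}

/* Hypothetical helper: count the strings in the first len bytes of buf.
   Strings are separated by nul bytes; a final string without a
   terminating nul is still counted. */
size_t count_cmdline_strings(const char *buf, size_t len)
{
    size_t count = 0;
    size_t pos = 0;
    while (pos < len) {
        const char *end = memchr(buf + pos, '\0', len - pos);
        pos = (end == NULL) ? len : (size_t)(end - buf) + 1;
        ++count;
    }
    return count;
}
```

Calling read_own_cmdline before and after setting a new title should show the buffer contents change from the original nul-separated arguments to the title string.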

Limitations: this routine can only use the available argument and environment area. If the command-line arguments and environment are small to begin with, that limits the length of the process title that can be set. Also, libc routines like putenv(3) do their own relocation of the environ array and strings; if they are used before the first call of this routine that needs to overflow into the environment area, then it won't be able to find the original location of the environment strings.
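The size limit can be estimated before setting a title. The sketch below adds up the argument strings and, only if the environment strings start exactly where the arguments end, the environment strings as well. It assumes the strings within each array are laid out contiguously, as described above; string_area_size and max_title_len are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Total bytes occupied by a NULL-terminated array of strings,
   counting each string's terminating nul. */
size_t string_area_size(char *const *strs)
{
    size_t total = 0;
    while (*strs != NULL)
        total += strlen(*strs++) + 1;
    return total;
}

/* Hypothetical helper: longest title (excluding its terminating nul)
   that fits in the argument area, plus the environment area when the
   two are adjacent in memory. */
size_t max_title_len(char *const *argv, char *const *envp)
{
    size_t area = string_area_size(argv);
    /* The environment can only be borrowed if its first string starts
       exactly where the argument strings end. */
    if (argv[0] != NULL && envp != NULL && envp[0] != NULL
        && argv[0] + area == envp[0])
        area += string_area_size(envp);
    return area > 0 ? area - 1 : 0;
}
```

In a real program this might be called as max_title_len(argv, environ) at the top of main, before anything has touched the environment.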

Example

#include <unistd.h> /* getpid, sleep */

int main
  (
    int argc,
    char ** argv
  )
  {
    (void)argc; /* unused */
    setproctitle(argv, "my process ID is %d, isn't that nice", (int)getpid());
    sleep(30);
    return
        0;
  } /*main*/