C:Workshop:2018:D4

De wiki-prog
Révision de 11 janvier 2018 à 13:17 par Slashvar (discuter | contributions) (dotpsrtee.c and Makefile)

Aller à : navigation, rechercher
 Due date for this tutorial: 2018-01-22 at 2pm
 Assignment directory: dotpstree

Introduction

Our goal for the last session of this workshop is to implement a variant of pstree(1). This command shows a tree representation of the running processes (using ASCII art), since don't really want to do ASCII art, we'll use the dot format instead and use dot(1) (or any other graphviz commands) to obtain a graphical representation of this tree. We will limit our display to the pid of running processes.

Tree of running processes

For that, we need to access the processes information described in /proc, but we don't want to loose time on directory crawling. Instead we'll use the base library of commands like ps(1) and top(1): libprocps. It's not part of the standard C library but is available on all Linux (in the base system).

The workflow of our program is as follow:

  • get the list of processes and their information
  • build the tree in memory
  • serialize the tree in the dot format

The following sessions are here to guide you in this process.

Trees, lists and process representation

All data structures will be intrusive (see [1]).

General Trees as binary trees

The file tree.h (code follow) describes the lists and trees (note that this header is self contained, no other files is needed):

/* tree.h : intrusive general tree as binary trees */
/* provides also basic intrusive lists             */
 
# ifndef DOTPSTREE_TREE_H_
# define DOTPSTREE_TREE_H_
 
# include <stddef.h>
 
# define CONTAINER_OF(_type_, _field_, _ptr_) \
	  ((_type_*)((char*)(_ptr_) - offsetof(_type_, _field_)))
 
/* classic intrusive lists */
struct list
{
  struct list *next;
};
 
/* list_init_sentinel(head) initializes a new list with head as a sentinel */
/* head must be allocated separately, of course                            */
static inline
void list_init_sentinel(struct list *head)
{
  head->next = NULL;
}
 
/* list_push_front(head, elm) pushes elm in front of the sentinel head */
/* head and elm must be allocated list cells                           */
static inline
void list_push_front(struct list *head, struct list *elm)
{
  elm->next = head->next;
  head->next = elm;
}
 
/* list_pop_front(head) pops the first element of the list and returns it */
/* returns NULL if the list is empty                                      */
static inline
struct list* list_pop_front(struct list *head)
{
  struct list *elm = head->next;
  if (elm) {
    head->next = elm->next;
    elm->next = NULL;
  }
  return elm;
}
 
/* foreach_list(_head_, _list_iter_) iterating macro on list */
/* _list_iter_ must be the name of an existing variable of   */
/* type struct list*                                         */
 
# define foreach_list(_head_, _list_iter_)	\
	 for (_list_iter_ = _head_;		\
	      _list_iter_->next;		\
	      _list_iter_ = _list_iter_->next )
 
/* General trees implemented using binary representation */
 
struct tree
{
  struct tree *child, *sibling;
};
 
/* tree_attach_child(parent, child) attaches the node child to parent      */
/* both parent and child must be valid nodes, child must not have siblings */
static inline
void tree_attach_child(struct tree *parent, struct tree *child)
{
  child->sibling = parent->child;
  parent->child = child;
}
 
/* foreach_child(_node_, _chld_iter_) macro iterating on children of _node_ */
/* _chld_iter_ must the name of a variable of type struct tree*             */
 
# define foreach_child(_node_, _chld_iter_)	  \
	 for (_chld_iter_ = _node_->child;	  \
	      _chld_iter_;			  \
	      _chld_iter_ = _chld_iter_->sibling)
 
/* for example only */
static inline
size_t tree_number_of_children(struct tree *node)
{
  size_t count = 0;
  struct tree *child;
  foreach_child(node, child) {
    count += 1;
  }
  return count;
}
 
# endif /* DOTPSTREE_TREE_H_ */

The implementation follows the classical First-Child-Right-Sibling model where a node contains a pointer to its first child and a pointer to its right sibling (the next child of its parent.)

Process and memory management

Memory Pool

During the construction of the process tree, we will build a lot of nodes and then throw them away when finished. It can interesting to use a memory pool allocator rather than relying on malloc and having to free each node one by one.

Here is an implementation of such a memory pool:

/* mempool.h: a simple memory pool */
 
# ifndef DOTPSTREE_MEMPOOL_H_
# define DOTPSTREE_MEMPOOL_H_
 
# include <assert.h>
# include <stddef.h>
# include <stdlib.h>
 
struct mempool
{
  size_t capacity, size;
  char *data;
};
 
static inline
struct mempool* mempool_init(size_t capacity)
{
  struct mempool *pool = malloc(sizeof (struct mempool));
  pool->size = 0;
  pool->capacity = capacity;
  pool->data = calloc(capacity, sizeof (char));
  return pool;
}
 
static inline
void *mempool_alloc(struct mempool *pool, size_t size)
{
  // you should take care of initial capacity
  assert(pool->size + size <= pool->capacity);
  void *ptr = pool->data + pool->size;
  pool->size += size;
  return ptr;
}
 
static inline
void mempool_reset(struct mempool *pool)
{
  pool->size = 0;
}
 
static inline
void mempool_delete(struct mempool *pool)
{
  free(pool->data);
  free(pool);
}
 
# endif /* DOTPSTREE_MEMPOOL_H_ */

Process representation

The following structure will represent a process:

struct process
{
  pid_t pid;
  struct list list;
  struct tree tree;
};

We limit our data to the pid of each process, but a lot more information can be added if you want to go further.

Each process will belong to a process list (for lookup) and a process tree.

These processes will be stored in a process_set described by this structure:

struct process_set
{
  struct list *process_list;
  struct tree *process_tree;
  struct mempool *pool;
};

The roles of process_list and process_tree are obvious. pool is the memory pool used to allocate memory all nodes. Note that thanks to the intrusive implementation, we have only memory cell allocated for each process, even if process nodes belong to two data structures.

Here is the complete process.h header, describing these two structures and their basic operations. Note that this header defines functions that you'll need to implement in process.c (see assignment later.)

/* process.h: process structure */
 
# ifndef DOTPSTREE_PROCESS_H_
# define DOTPSTREE_PROCESS_H_
 
# include <stddef.h>
# include <unistd.h>
# include <proc/readproc.h>
 
# include "mempool.h"
# include "tree.h"
 
struct process
{
  pid_t pid;
  struct list list;
  struct tree tree;
};
 
struct process_set
{
  struct list *process_list;
  struct tree *process_tree;
  struct mempool *pool;
};
 
static inline
void process_set_init(struct process_set *set)
{
  // be large, allocate a million of struct process
  // lazy memory mapping is nice (and any way what is 100MB)
  set->pool = mempool_init(sizeof (struct process) * (1 << 20));
  set->process_list = mempool_alloc(set->pool, sizeof (struct list));
  list_init_sentinel(set->process_list);
  set->process_tree = NULL;
}
 
static inline
struct process* process_set_find(struct process_set *set, pid_t pid)
{
  struct list *cur;
  foreach_list(set->process_list, cur) {
    struct process *proc = CONTAINER_OF(struct process, list, cur->next);
    if (proc->pid == pid)
      return proc;
  }
  return NULL;
}
 
struct process* process_set_add(struct process_set *set, pid_t pid);
void process_set_attach_process(struct process_set *set, proc_t *procinfo);
struct process_set* make_process_set(void);
 
static inline
void process_set_delete(struct process_set *set)
{
  mempool_delete(set->pool);
  free(set);
}
 
# endif /* DOTPSTREE_PROCESS_H_ */

Assignment

Your job is to implement our dot version of pstree.

Your must provide a directory (in your git) named dotpstree and containing:

  • dotpstree:
    • AUTHORS
    • Makefile
    • dot_out.c and dot_out.h: file containing tree traversal and dot output building
    • dotpstree.c: the main file
    • mempool.h
    • process.c: implementation of the corresponding header
    • process.h
    • tree.h

process.c

You must provide implementation for the following functions:

/* process_set_add(set, pid) tries to add the process pid to the list and returns it */
/* if it's already in the list, returns the process node                             */
/* if it's the first encounter of the process, it'll add it to the process list      */
/* the process will be set as the root of process tree if pid==1                     */
struct process* process_set_add(struct process_set *set, pid_t pid);
 
/* process_set_attach_process(set, procinfo) attaches a new process to the set       */
/* this function is responsible to add the process and attach it to its parent       */
void process_set_attach_process(struct process_set *set, proc_t *procinfo);
 
/* make_process_set() creates and populates the set of processes                     */
struct process_set* make_process_set(void);

Here are some details:

  • process_set_add(set, pid):
    • if pid is already in the list, return it
    • create the node (using the memory pool !)
    • attach it to the list
    • if pid == 1, then set the root of process tree to this node
  • process_set_attach(set, procinfo):
    • get needed information from procinfo
    • add the process node
    • if pid is not 1, search for its parent and add the node to children the parent's children list
  • make_process_set():
    • allocate (using malloc(3)) and initialize a process set
    • get the process information list using readproctab(PROC_FILLSTAT)
    • iterate through the list of process information and attach the corresponding process
    • returns the set

readproctab(3)

This function is provided by libprocps, read its manual page for more information.

readproctab(PROC_FILLSTAT) will return an array of proc_t* terminated by NULL

The type proc_t is a structure type, it is described in /usr/include/proc/readproc.h, but concretely what we need is only the following fields:

  • tgid which is (almost) a synonym for pid
  • ppid the pid of the parent of the current process

Note that the manual page describes a function named freeproctab(3) but this function doesn't exist. We won't free the array obtained.

dot_out.c

The header file:

/* dot_out.h: produce dot file from process tree */
 
# ifndef DOTPSTREE_DOT_OUT_H_
# define DOTPSTREE_DOT_OUT_H_
 
# include <stdio.h>
 
# include "process.h"
 
void dot_output(FILE *out, struct process_set *set);
 
# endif /* DOTPSTREE_DOT_OUT_H_ */

Implement the function dot_output(out, set). The parameter out is a stdio stream usable with fprintf(3) it will serve as an output for our dot.

Building the tree requires a simple traversal like a depth first one.

The function must build the whole dot file looking like the following example (this the file used to generate the tree at the beginning of this subject, using the command dot -Grankdir=LR):

digraph PSTree {
  1 -> 22066;
  22066 -> 22069;
  22069 -> 22159;
  22159 -> 22165;
  22069 -> 22077;
  22077 -> 22078;
  22078 -> 22083;
  22083 -> 24030;
  22083 -> 23745;
  22083 -> 23624;
  22083 -> 23498;
  22083 -> 23465;
  22083 -> 22871;
  22083 -> 22852;
  22083 -> 22553;
  22083 -> 22227;
  22083 -> 22221;
  22083 -> 22200;
  22083 -> 20416;
  22083 -> 19785;
  22083 -> 19742;
  22083 -> 19507;
  22083 -> 18099;
  22078 -> 22080;
  22080 -> 22081;
  22069 -> 22075;
  22069 -> 22074;
  1 -> 20680;
  1 -> 19999;
  1 -> 17082;
  17082 -> 22448;
  17082 -> 20725;
  17082 -> 20670;
  20670 -> 22446;
  17082 -> 19990;
  17082 -> 19981;
  17082 -> 19976;
  17082 -> 17099;
  1 -> 16841;
  1 -> 10646;
  10646 -> 17134;
  17134 -> 19942;
  19942 -> 20019;
  20019 -> 20372;
  20372 -> 20668;
  20372 -> 20662;
  20372 -> 18163;
  18163 -> 18164;
  20372 -> 9630;
  9630 -> 9631;
  9631 -> 29337;
  20372 -> 8660;
  8660 -> 8661;
  20019 -> 20022;
  1 -> 8813;
  1 -> 583;
  1 -> 529;
  1 -> 520;
  1 -> 514;
  1 -> 508;
  1 -> 501;
  501 -> 23894;
  1 -> 498;
  1 -> 475;
  475 -> 513;
  1 -> 474;
  1 -> 473;
  1 -> 471;
  1 -> 464;
  1 -> 462;
  1 -> 461;
  461 -> 463;
  1 -> 388;
  1 -> 294;
  1 -> 248;
}

dotpsrtee.c and Makefile

Here is a complete Makefile:

# Makefile
 
CC=gcc
CPPFLAGS= -MMD -D_XOPEN_SOURCE=500
CFLAGS= -Wall -Wextra -std=c99 -O0 -g
LDFLAGS=
LDLIBS= -lprocps
 
SRC= process.c dot_out.c dotpstree.c
OBJ= ${SRC:.c=.o}
DEP= ${SRC:.c=.d}
 
all: dotpstree
 
dotpstree: ${OBJ}
 
clean:
	${RM} ${OBJ}
	${RM} ${DEP}
	${RM} dotpstree
 
-include ${DEP}
 
# END

And the main program:

/* dotpstree.c: a dot version of pstree using procps-ng */
 
# include <err.h>
# include <stdlib.h>
# include <stdio.h>
 
# include "dot_out.h"
# include "process.h"
 
int main(int argc, char *argv[])
{
 
  struct process_set *set = make_process_set();
 
  int has_file_output = 0;
  FILE *out = stdout;
  if (argc > 1) {
    out = fopen(argv[1], "w");
    has_file_output = 1;
    if (!out)
      err(1, "can't open %s", argv[1]);
  }
 
  dot_output(out, set);
 
  if (has_file_output)
    fclose(out);
 
  process_set_delete(set);
 
  return 0;
}

Advanced options

Once your able to build the complete tree of processes, we can add a simple filter in order to only display process for a given user.