C:Workshop:2018:D4 : Différence entre versions

De wiki-prog
Aller à : navigation, rechercher
(Advanced options)
(Selecting a user)
Ligne 545 : Ligne 545 :
 
Once your able to build the complete tree of processes, we can add a simple filter in order to only display process for a given user.
 
Once your able to build the complete tree of processes, we can add a simple filter in order to only display process for a given user.
  
Again, we have some fields in <tt>proc_t</tt> for that: <tt>ruser</tt> and <tt>ruid</tt>.
+
This a little bit more tricky than command name: we can pass some options to <tt>readproctab(3)</tt> in order to get only process matching a given user id, but in that case we probably won't have a tree but a forest. Implementing this bonus requires some change in the way we build the tree, but you can simplifies the problem by keeping a fake root in place of the process with <tt>pid 1</tt>.
  
The first one is a static string and the second one a simple user id stored as an <tt>int</tt>. You can choose to support selection by user name or user id as you want.
+
In order to obtain processes from a given uid, you will need to use the following call to <tt>readproctab(3)</tt> (see also <tt>openproc(3)</tt> for more information):
  
The simplest strategy for this bonus is to build the complete tree and only filter process when building the output (which will probably be a forest and not a tree).
+
<pre>
 +
  tab = readproctab(PROC_FILLSTAT | PROC_UID(uid));
 +
</pre>
 +
 
 +
If you want to use a username rather than a uid number, you need <tt>getpwwnam(3)</tt> (read the manual).

Version du 11 janvier 2018 à 14:24

 Due date for this tutorial: 2018-01-22 at 2pm
 Assignment directory: dotpstree

Introduction

Our goal for the last session of this workshop is to implement a variant of pstree(1). This command shows a tree representation of the running processes (using ASCII art), since don't really want to do ASCII art, we'll use the dot format instead and use dot(1) (or any other graphviz commands) to obtain a graphical representation of this tree. We will limit our display to the pid of running processes.

Tree of running processes

For that, we need to access the processes information described in /proc, but we don't want to loose time on directory crawling. Instead we'll use the base library of commands like ps(1) and top(1): libprocps. It's not part of the standard C library but is available on all Linux (in the base system).

The workflow of our program is as follow:

  • get the list of processes and their information
  • build the tree in memory
  • serialize the tree in the dot format

The following sessions are here to guide you in this process.

Trees, lists and process representation

All data structures will be intrusive (see [1]).

General Trees as binary trees

The file tree.h (code follow) describes the lists and trees (note that this header is self contained, no other files is needed):

/* tree.h : intrusive general tree as binary trees */
/* provides also basic intrusive lists             */
 
# ifndef DOTPSTREE_TREE_H_
# define DOTPSTREE_TREE_H_
 
# include <stddef.h>
 
# define CONTAINER_OF(_type_, _field_, _ptr_) \
	  ((_type_*)((char*)(_ptr_) - offsetof(_type_, _field_)))
 
/* classic intrusive lists */
struct list
{
  struct list *next;
};
 
/* list_init_sentinel(head) initializes a new list with head as a sentinel */
/* head must be allocated separately, of course                            */
static inline
void list_init_sentinel(struct list *head)
{
  head->next = NULL;
}
 
/* list_push_front(head, elm) pushes elm in front of the sentinel head */
/* head and elm must be allocated list cells                           */
static inline
void list_push_front(struct list *head, struct list *elm)
{
  elm->next = head->next;
  head->next = elm;
}
 
/* list_pop_front(head) pops the first element of the list and returns it */
/* returns NULL if the list is empty                                      */
static inline
struct list* list_pop_front(struct list *head)
{
  struct list *elm = head->next;
  if (elm) {
    head->next = elm->next;
    elm->next = NULL;
  }
  return elm;
}
 
/* foreach_list(_head_, _list_iter_) iterating macro on list */
/* _list_iter_ must be the name of an existing variable of   */
/* type struct list*                                         */
 
# define foreach_list(_head_, _list_iter_)	\
	 for (_list_iter_ = _head_;		\
	      _list_iter_->next;		\
	      _list_iter_ = _list_iter_->next )
 
/* General trees implemented using binary representation */
 
struct tree
{
  struct tree *child, *sibling;
};
 
/* tree_attach_child(parent, child) attaches the node child to parent      */
/* both parent and child must be valid nodes, child must not have siblings */
static inline
void tree_attach_child(struct tree *parent, struct tree *child)
{
  child->sibling = parent->child;
  parent->child = child;
}
 
/* foreach_child(_node_, _chld_iter_) macro iterating on children of _node_ */
/* _chld_iter_ must the name of a variable of type struct tree*             */
 
# define foreach_child(_node_, _chld_iter_)	  \
	 for (_chld_iter_ = _node_->child;	  \
	      _chld_iter_;			  \
	      _chld_iter_ = _chld_iter_->sibling)
 
/* for example only */
static inline
size_t tree_number_of_children(struct tree *node)
{
  size_t count = 0;
  struct tree *child;
  foreach_child(node, child) {
    count += 1;
  }
  return count;
}
 
# endif /* DOTPSTREE_TREE_H_ */

The implementation follows the classical First-Child-Right-Sibling model where a node contains a pointer to its first child and a pointer to its right sibling (the next child of its parent.)

Process and memory management

Memory Pool

During the construction of the process tree, we will build a lot of nodes and then throw them away when finished. It can interesting to use a memory pool allocator rather than relying on malloc and having to free each node one by one.

Here is an implementation of such a memory pool:

/* mempool.h: a simple memory pool */
 
# ifndef DOTPSTREE_MEMPOOL_H_
# define DOTPSTREE_MEMPOOL_H_
 
# include <assert.h>
# include <stddef.h>
# include <stdlib.h>
 
struct mempool
{
  size_t capacity, size;
  char *data;
};
 
static inline
struct mempool* mempool_init(size_t capacity)
{
  struct mempool *pool = malloc(sizeof (struct mempool));
  pool->size = 0;
  pool->capacity = capacity;
  pool->data = calloc(capacity, sizeof (char));
  return pool;
}
 
static inline
void *mempool_alloc(struct mempool *pool, size_t size)
{
  // you should take care of initial capacity
  assert(pool->size + size <= pool->capacity);
  void *ptr = pool->data + pool->size;
  pool->size += size;
  return ptr;
}
 
static inline
void mempool_reset(struct mempool *pool)
{
  pool->size = 0;
}
 
static inline
void mempool_delete(struct mempool *pool)
{
  free(pool->data);
  free(pool);
}
 
# endif /* DOTPSTREE_MEMPOOL_H_ */

Process representation

The following structure will represent a process:

struct process
{
  pid_t pid;
  struct list list;
  struct tree tree;
};

We limit our data to the pid of each process, but a lot more information can be added if you want to go further.

Each process will belong to a process list (for lookup) and a process tree.

These processes will be stored in a process_set described by this structure:

struct process_set
{
  struct list *process_list;
  struct tree *process_tree;
  struct mempool *pool;
};

The roles of process_list and process_tree are obvious. pool is the memory pool used to allocate memory all nodes. Note that thanks to the intrusive implementation, we have only memory cell allocated for each process, even if process nodes belong to two data structures.

Here is the complete process.h header, describing these two structures and their basic operations. Note that this header defines functions that you'll need to implement in process.c (see assignment later.)

/* process.h: process structure */
 
# ifndef DOTPSTREE_PROCESS_H_
# define DOTPSTREE_PROCESS_H_
 
# include <stddef.h>
# include <unistd.h>
# include <proc/readproc.h>
 
# include "mempool.h"
# include "tree.h"
 
struct process
{
  pid_t pid;
  struct list list;
  struct tree tree;
};
 
struct process_set
{
  struct list *process_list;
  struct tree *process_tree;
  struct mempool *pool;
};
 
static inline
void process_set_init(struct process_set *set)
{
  // be large, allocate a million of struct process
  // lazy memory mapping is nice (and any way what is 100MB)
  set->pool = mempool_init(sizeof (struct process) * (1 << 20));
  set->process_list = mempool_alloc(set->pool, sizeof (struct list));
  list_init_sentinel(set->process_list);
  set->process_tree = NULL;
}
 
static inline
struct process* process_set_find(struct process_set *set, pid_t pid)
{
  struct list *cur;
  foreach_list(set->process_list, cur) {
    struct process *proc = CONTAINER_OF(struct process, list, cur->next);
    if (proc->pid == pid)
      return proc;
  }
  return NULL;
}
 
struct process* process_set_add(struct process_set *set, pid_t pid);
void process_set_attach_process(struct process_set *set, proc_t *procinfo);
struct process_set* make_process_set(void);
 
static inline
void process_set_delete(struct process_set *set)
{
  mempool_delete(set->pool);
  free(set);
}
 
# endif /* DOTPSTREE_PROCESS_H_ */

Assignment

Your job is to implement our dot version of pstree.

Your must provide a directory (in your git) named dotpstree and containing:

  • dotpstree:
    • AUTHORS
    • Makefile
    • dot_out.c and dot_out.h: file containing tree traversal and dot output building
    • dotpstree.c: the main file
    • mempool.h
    • process.c: implementation of the corresponding header
    • process.h
    • tree.h

process.c

You must provide implementation for the following functions:

/* process_set_add(set, pid) tries to add the process pid to the list and returns it */
/* if it's already in the list, returns the process node                             */
/* if it's the first encounter of the process, it'll add it to the process list      */
/* the process will be set as the root of process tree if pid==1                     */
struct process* process_set_add(struct process_set *set, pid_t pid);
 
/* process_set_attach_process(set, procinfo) attaches a new process to the set       */
/* this function is responsible to add the process and attach it to its parent       */
void process_set_attach_process(struct process_set *set, proc_t *procinfo);
 
/* make_process_set() creates and populates the set of processes                     */
struct process_set* make_process_set(void);

Here are some details:

  • process_set_add(set, pid):
    • if pid is already in the list, return it
    • create the node (using the memory pool !)
    • attach it to the list
    • if pid == 1, then set the root of process tree to this node
  • process_set_attach(set, procinfo):
    • get needed information from procinfo
    • add the process node
    • if pid is not 1, search for its parent and add the node to children the parent's children list
  • make_process_set():
    • allocate (using malloc(3)) and initialize a process set
    • get the process information list using readproctab(PROC_FILLSTAT)
    • iterate through the list of process information and attach the corresponding process
    • returns the set

readproctab(3)

This function is provided by libprocps, read its manual page for more information.

readproctab(PROC_FILLSTAT) will return an array of proc_t* terminated by NULL

The type proc_t is a structure type, it is described in /usr/include/proc/readproc.h, but concretely what we need is only the following fields:

  • tgid which is (almost) a synonym for pid
  • ppid the pid of the parent of the current process

Note that the manual page describes a function named freeproctab(3) but this function doesn't exist. We won't free the array obtained.

dot_out.c

The header file:

/* dot_out.h: produce dot file from process tree */
 
# ifndef DOTPSTREE_DOT_OUT_H_
# define DOTPSTREE_DOT_OUT_H_
 
# include <stdio.h>
 
# include "process.h"
 
void dot_output(FILE *out, struct process_set *set);
 
# endif /* DOTPSTREE_DOT_OUT_H_ */

Implement the function dot_output(out, set). The parameter out is a stdio stream usable with fprintf(3) it will serve as an output for our dot.

Building the tree requires a simple traversal like a depth first one.

The function must build the whole dot file looking like the following example (this the file used to generate the tree at the beginning of this subject, using the command dot -Grankdir=LR):

digraph PSTree {
  1 -> 22066;
  22066 -> 22069;
  22069 -> 22159;
  22159 -> 22165;
  22069 -> 22077;
  22077 -> 22078;
  22078 -> 22083;
  22083 -> 24030;
  22083 -> 23745;
  22083 -> 23624;
  22083 -> 23498;
  22083 -> 23465;
  22083 -> 22871;
  22083 -> 22852;
  22083 -> 22553;
  22083 -> 22227;
  22083 -> 22221;
  22083 -> 22200;
  22083 -> 20416;
  22083 -> 19785;
  22083 -> 19742;
  22083 -> 19507;
  22083 -> 18099;
  22078 -> 22080;
  22080 -> 22081;
  22069 -> 22075;
  22069 -> 22074;
  1 -> 20680;
  1 -> 19999;
  1 -> 17082;
  17082 -> 22448;
  17082 -> 20725;
  17082 -> 20670;
  20670 -> 22446;
  17082 -> 19990;
  17082 -> 19981;
  17082 -> 19976;
  17082 -> 17099;
  1 -> 16841;
  1 -> 10646;
  10646 -> 17134;
  17134 -> 19942;
  19942 -> 20019;
  20019 -> 20372;
  20372 -> 20668;
  20372 -> 20662;
  20372 -> 18163;
  18163 -> 18164;
  20372 -> 9630;
  9630 -> 9631;
  9631 -> 29337;
  20372 -> 8660;
  8660 -> 8661;
  20019 -> 20022;
  1 -> 8813;
  1 -> 583;
  1 -> 529;
  1 -> 520;
  1 -> 514;
  1 -> 508;
  1 -> 501;
  501 -> 23894;
  1 -> 498;
  1 -> 475;
  475 -> 513;
  1 -> 474;
  1 -> 473;
  1 -> 471;
  1 -> 464;
  1 -> 462;
  1 -> 461;
  461 -> 463;
  1 -> 388;
  1 -> 294;
  1 -> 248;
}

dotpsrtee.c and Makefile

Here is a complete Makefile:

# Makefile
 
CC=gcc
CPPFLAGS= -MMD -D_XOPEN_SOURCE=500
CFLAGS= -Wall -Wextra -std=c99 -O0 -g
LDFLAGS=
LDLIBS= -lprocps
 
SRC= process.c dot_out.c dotpstree.c
OBJ= ${SRC:.c=.o}
DEP= ${SRC:.c=.d}
 
all: dotpstree
 
dotpstree: ${OBJ}
 
clean:
	${RM} ${OBJ}
	${RM} ${DEP}
	${RM} dotpstree
 
-include ${DEP}
 
# END

And the main program:

/* dotpstree.c: a dot version of pstree using procps-ng */
 
# include <err.h>
# include <stdlib.h>
# include <stdio.h>
 
# include "dot_out.h"
# include "process.h"
 
int main(int argc, char *argv[])
{
 
  struct process_set *set = make_process_set();
 
  int has_file_output = 0;
  FILE *out = stdout;
  if (argc > 1) {
    out = fopen(argv[1], "w");
    has_file_output = 1;
    if (!out)
      err(1, "can't open %s", argv[1]);
  }
 
  dot_output(out, set);
 
  if (has_file_output)
    fclose(out);
 
  process_set_delete(set);
 
  return 0;
}

Advanced options

In this section we will extend our program with some extra features, if you want to go a little bit further, you can take a look at the man page for pstree(1).

In order to support those options, you need to add parameters to the command line of the program, you are free to manage them as you wish but the program must keep the original behavior when run with no parameter or with a single one. You will provide a file called BONUS with a quick description on how we can tests your bonus.

These extensions are bonus and thus less guided.

Display process name

So far we only use PID of process, it would be nice to have the command name.

In structure proc_t there's two fields that can be used for that: cmdline and cmd.

The first one is an array of strings corresponding to the argv array in the process and the second one is the basename of the command (the last element of command path, thus the command name). The cmd field seems appropriate, it's a static string (of max 16char) in the structure, you will extend our process structure in order to store that name and copy the content of cmd in your struct.

You can use that name when generating the dot output by adding for each process a line that may look like this:

  1 [label="init"];

Selecting a user

Once your able to build the complete tree of processes, we can add a simple filter in order to only display process for a given user.

This a little bit more tricky than command name: we can pass some options to readproctab(3) in order to get only process matching a given user id, but in that case we probably won't have a tree but a forest. Implementing this bonus requires some change in the way we build the tree, but you can simplifies the problem by keeping a fake root in place of the process with pid 1.

In order to obtain processes from a given uid, you will need to use the following call to readproctab(3) (see also openproc(3) for more information):

  tab = readproctab(PROC_FILLSTAT | PROC_UID(uid));

If you want to use a username rather than a uid number, you need getpwwnam(3) (read the manual).