CS 420 - Assignment 4

Updated: October 17th (due date extended)

Updated: October 16th

Due: Friday, October 26th by 11:59 PM

Cooperative Threads, part 1

In this assignment, you will implement a simple cooperative many-to-one thread library.

A cooperative thread implementation is one where a running thread remains executing on the CPU until it voluntarily allows another thread to be scheduled.

The YThreads API

The thread library you will implement is called "YThreads", and its API functions closely resemble those used in POSIX threads (pthreads).

For this assignment (part 1), you will implement the following functions:

/*
* Initialize any global data structures used by the
* ythreads library.
*/
void ythread_init(void);

/*
* Create a new thread.
*
* Params:
* p_thread_id - pointer to a ythread_t variable where the new thread's
* thread id will be stored
* start_function - pointer to the new thread's start function
* arg - argument that will be passed to the new thread's
* start function
*
* Returns:
* 1 if successful, 0 if the new thread can't be created.
*/
int ythread_create(ythread_t *p_thread_id, long (*start_function)(void *), void *arg);

/*
* Return the thread id of the current thread.
*
* Returns:
* Thread id of the current thread.
*/
ythread_t ythread_self(void);

/*
* Cause the current thread to exit.
*
* Params:
* exit_code - the thread's exit code
*/
void ythread_exit(long exit_code);

/*
* Wait for a given thread to exit.
*
* Params:
* thread_id - the thread id of the thread to wait for
* p_exit_code - pointer to a long variable in which the waited-for thread's
* exit code will be stored (pass NULL if you don't
* care about the exit code)
*/
void ythread_join(ythread_t thread_id, long *p_exit_code);

/*
* Cause the current thread to be suspended (put in the run queue),
* and a new scheduling decision to be made.
*/
void ythread_yield(void);

Getting started

Updated: October 17th

Download CS420_Assign4.zip and extract its contents.

For this project, you must work from the command line in Linux or Solaris.  Development with cygwin or Visual Studio isn't supported.

If you need to use Windows:

The following ISO image contains a virtual machine image for Fedora Core 5 (Linux), as well as the Windows executable for VMware player.  You can do the project from within the virtual machine.

http://faculty.ycp.edu/~dhovemey/vmware-player-fc5.iso

Note: this ISO image is approximately 1.5 GB in size.

Burn the image to a DVD (it won't fit on a CD --- see me if you need a blank DVD).  The DVD contains a single zip archive with a number of files inside.  Extract the files inside the zip archive to a folder on your Windows machine.  First, install VMware player using the provided executable.  Next, load the file fedora-fc5-i386.vmx from within VMware player.

Once you are running Linux inside the virtual machine, you can use a web browser to download the assignment and (later) upload the completed assignment to the Marmoset server.

Use Applications->Accessories->Terminal to open a command prompt window, and Applications->Accessories->Text Editor to open a text editor.

When you're done with your assignment, execute the command make clean in a terminal window, open a file browser window, find the directory containing your assignment, right-click, choose Create Archive, and create a zip file of your completed assignment.  Then upload the zip file using a web browser.

The provided Makefile allows the ythreads library to be compiled, along with the test programs that use the library.  Just run the command make in the same directory as the Makefile.

You should also run the command make depend at least once.  This will ensure that everything is recompiled appropriately when header files are modified.

You will implement the functions in the file src/ythread.c.  You may change any variables and data structures as you see fit; the ones already in the file are a suggestion that you may find useful.

Implementation

This assignment uses some advanced programming techniques.  You will need to think about how to implement each function.

This section describes the important implementation techniques you will need to use.  Please read it carefully!

Thread data structure, thread table, and thread identifiers

Each thread in the process has a corresponding thread data structure, which is an instance of struct ythread.  This struct is defined in src/ythread.c.  The thread data structure keeps track of the important information about the thread, such as its state and its saved context.

The thread table is an array of thread data structures.  The thread table is the array called s_thread_table in src/ythread.c.  The elements of this array whose state is set to NONEXISTENT are unused, and may be allocated as needed when new threads are created.

A thread's thread identifier is simply the index of the thread's thread data structure in the thread table.  E.g., the thread whose thread data structure is stored in the first element of the thread table (element 0) has thread identifier 0.  The type ythread_t represents thread identifiers; it is an alias for int.

The private function get_thread_id takes a pointer to a thread data structure and returns the thread id of that thread.

The private function get_thread takes a thread id and returns a pointer to the thread data structure of that thread.

Thread stacks

Each thread must have its own activation record stack.  Therefore, when a new thread is created, you must allocate a buffer to use for its stack.  The constant STACK_SIZE in src/ythread.c defines the number of bytes you should allocate for each new thread's stack.  Use malloc to allocate each stack.

Saving and restoring thread context

To switch between threads, you must save the context of the thread being suspended and restore the context of the thread being activated.  A thread's context consists of the contents of CPU registers, including the general-purpose registers and also the stack pointer, frame pointer, and instruction pointer.  The stack pointer and frame pointer registers are used to push and pop activation records as the thread calls and returns from functions.  The instruction pointer keeps track of the next instruction to be executed.

The C library functions setjmp and longjmp save and restore thread contexts, respectively.  Each takes a jmp_buf data structure as an argument.  (Observe that this is the type of the context field in struct ythread.)

To save a thread context, if t is a pointer to a thread data structure, then:

setjmp(t->context)
saves the current thread's context (CPU register contents).  (Note that you don't have to put an ampersand in front of t->context: jmp_buf is an array type, so it is passed as a pointer automatically.)  Saving a context "freezes" the state of the thread, so it can be resumed later.

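The setjmp function returns a value, and in general, you need to use the returned value to correctly use the function.  Here is how to interpret the return value:

If setjmp returns 0, the context has just been saved.

If setjmp returns a non-zero value, the context has just been restored by a call to longjmp.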

In other words, after a context has been saved (with setjmp returning 0), the saved context can later be restored by a call to the longjmp function elsewhere in the program.  This causes control to return to the location of the original setjmp where the context was saved, but this time setjmp returns a non-zero value.  [This may require some thought to wrap your head around.]

So, the idiom for using setjmp looks like this:

if (setjmp(t->context) == 0) {
    /* the current thread context (CPU register contents)
     * has been saved in t->context */
} else {
    /* t->context has been restored */
}
The longjmp function restores the thread context saved by a previous call to setjmp.  It is called as follows: if t is a pointer to a thread data structure, then
longjmp(t->context, 1);
restores the context stored in t->context, thus resuming execution at the point of the setjmp that saved that context originally.

Note that the longjmp function never returns: it causes the original call to setjmp, at the point where the context was saved, to return again.  [Again, this may require some pondering.]

Updated: October 16th

At points in the program where the current thread is suspended and a new thread is chosen, you will need to think about what to do with the thread being suspended.  In addition to saving its context, you also need to think about how you can "park" the thread so it will be considered for scheduling at a later time.

A good approach is the following:

If the suspended thread is moving to the Ready state, park it in the run queue

If the suspended thread is waiting for an event, park it in a queue associated with the event it is going to wait for

Bootstrapping the main thread

Before any threads are explicitly created, the program will invoke the ythread_init function, which should correctly initialize the data used by the thread library.

In addition to initializing the thread table and other variables used by the library, ythread_init must bootstrap the main thread.  Bootstrapping the main thread means allocating a thread data structure for the "main thread" of the program, which is (implicitly) the thread of execution that executes the program's main function.  You don't need to explicitly allocate a stack for the main thread, since the OS kernel creates one automatically before the program starts.  However, the main thread's data structure is needed so that at the point in the program where the main thread is suspended and another thread is resumed, the thread library has somewhere to save the main thread's context.

Activating a new thread

The trickiest part of implementing a thread library is activating a newly created thread for the first time.  The new thread must begin executing using its own private stack.  Let's say that thread A is creating thread B.  So, there must be some way to switch from thread A's stack to thread B's stack.  Once this stack switch has been accomplished, then thread B can call its start function.

It is possible to switch stacks in Unix (and Linux, etc.) by having the calling thread send the process a signal, and specifying an alternate stack on which the signal should be handled.  Unix signals are like interrupts: when a signal occurs, the OS transfers control to a signal handler function that the process has registered to handle the signal.  When the signal handler function returns, the OS automatically switches back to the stack of the thread that was executing when the signal occurred.

So, to activate a new thread, do the following:

You need to be careful that the signal handler only returns once.  (Remember: setjmp returns twice!)

The initial version of ythread_init() contains code to install the start_new_thread function as a handler for the SIGUSR1 signal.  So, in the ythread_create function, once you are ready to start executing a new thread, all you need to do is:

As mentioned before, start_new_thread must take care to return only once.

Updated: October 16th

The sigaltstack function can be used as follows:

stack_t stack;

...

stack.ss_sp = malloc(STACK_SIZE);
stack.ss_flags = 0;
stack.ss_size = STACK_SIZE;

...

sigaltstack(&stack, NULL);

Don't forget that the pointer to the memory allocated for the new thread's stack also needs to be saved in the new thread's thread data structure, so that it may be de-allocated when the thread exits.

Thread queues

As we've discussed in class, queues are a very useful data structure when working with threads, since they can easily keep track of any number of waiting threads.  In addition, by enforcing a first-in first-out discipline, they can ensure that threads are scheduled fairly.

The files include/queue.h and src/queue.c define a queue data structure, queue_t, and four functions that operate on queues.

Here are the functions:

/*
* Initialize the queue object whose pointer
* is given.
*
* Params:
* q - pointer to a queue_t object that should
* be initialized
*/
void queue_init(queue_t *q);

/*
* Return whether or not the given queue is
* empty.
*
* Params:
* q - pointer to a queue_t object
*
* Returns:
* Non-zero (true) if the queue is empty,
* zero (false) if the queue is not empty.
*/
int queue_is_empty(queue_t *q);

/*
* Enqueue given item onto given queue.
* The item becomes the new tail item.
*
* Params:
* q - pointer to a queue_t object
* item - pointer value that refers to the item to be added to the queue
*/
void queue_enqueue(queue_t *q, void *item);

/*
* Dequeue the current head item from the given queue,
* and return a pointer to that item.
* Note: this function should NOT be called if
* the queue is empty!
*
* Params:
* q - pointer to a queue_t object
*
* Returns:
* Pointer to the head item (the item which was enqueued
* least recently.)
*/
void *queue_dequeue(queue_t *q);

The file testprog/qtest.c demonstrates how to use the queue data structure and its functions.

Note that the type of enqueued and dequeued items is void *: this is a "generic pointer" type.  You can freely convert between void * and any other pointer type.  I suggest you use queues to store pointers to thread data structures.

You can use the static variable called s_runqueue to store the list of threads in the READY state.  Just make sure that it is initialized in the ythread_init function.

Test programs

Two test programs are provided.  They are built automatically whenever you run the make command.

testprog/abdemo

This program creates two threads.  One thread prints "A" in a loop, the other thread prints "B".  Because each thread calls ythread_yield at the end of each loop iteration, they take turns executing, so the string "ABABAB..." is printed.

The file oracle/abdemo_expected.txt contains the expected output of this program.

testprog/ythread_test

This program is a more complete test of the ythread functions.  It creates 5 threads, each of which executes a loop which calls ythread_yield.  The program tests that the third argument passed to ythread_create is successfully passed to the start function of the newly created thread.  It also tests that the value returned by the start function is correctly assigned to the variable whose address (pointer) is passed to ythread_join.

The file oracle/ythread_test_expected.txt contains the expected output of this program.  Note that because the output includes thread identifiers, it could be slightly different for your implementation.  If you see the word "Error" anywhere in the output, then there is something wrong with your implementation.

Submitting

Run make clean, create a zip file of the entire project, and upload the zip file to Marmoset as Project 4:

https://camel.ycp.edu:8443/