Supercomputing Challenge

Project SIAN

Team: 44


Area of Science: Biochemistry

Problem Definition:
Boolean networks are used to model the regulation of genetic networks. Probabilistic Boolean networks, although statistical in nature, can provide an accurate means of predicting the most efficient method of controlling dysfunctional networks. The purpose of The Boolean Program (TBP) is to create a computational tool that readily identifies the main targets of diseases, most importantly forms of malignant cancer, and provides a therapeutic intervention on the gene networks. Both random and real-life gene perturbations will be used to determine the effectiveness of the recognition and intervention processes.

Solution Process:
To build a model structure that makes the intervention methods as effective as possible, the problem is divided into three major procedures: Simulation, Inference, and Intervention. The Boolean network, the Boolean node (gene), and the user specifications are defined within the structure.

The purpose of the Simulation algorithm is to produce random, yet reliable, arrays of gene perturbations. These arrays are expressed in Boolean terms (0’s and 1’s) within a truth table in order to create definite trajectories. Specifically, the value of each node is determined by a function of its parent(s).
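
As an illustration of how a trajectory can follow from truth tables, here is a minimal Python sketch. It assumes synchronous updates (every node is recomputed at once from the previous state) and a table indexed by the parent values read as a binary number. The names step and simulate and the three-node example network are hypothetical and are not taken from TBP itself.

```python
def step(state, parents, tables):
    """Compute the next network state from the current one (synchronous update)."""
    nxt = []
    for node in range(len(state)):
        # Read the parent values as a binary number to index the truth table.
        index = 0
        for p in parents[node]:
            index = (index << 1) | state[p]
        nxt.append(tables[node][index])
    return tuple(nxt)


def simulate(initial, parents, tables, steps):
    """Return the trajectory: the initial state followed by `steps` successors."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1], parents, tables))
    return trajectory


# Hypothetical example network (an assumption for illustration only):
#   node 0 = NOT node 2, node 1 = node 0 AND node 2, node 2 = node 1
parents = [[2], [0, 2], [1]]
tables = [[1, 0], [0, 0, 0, 1], [0, 1]]
trajectory = simulate((1, 0, 1), parents, tables, 4)
```

In this example the network alternates between the states (1, 0, 1) and (0, 1, 0), so the trajectory is a cycle of length two.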

The Inference procedure can be considered the inverse of the Simulation algorithm. Two major steps must be completed to translate the trajectory of the Boolean network back into truth tables and so verify each node’s parent(s). First, the program takes the trajectories created by the Simulation algorithm and arranges their values into a transition table, an enumeration of binary digits (0’s and 1’s). Second, it tests candidate parent sets: for a set of nodes to be valid parent(s) of a node, each combination of values of that set must always be followed by the same value of the affected node.
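
The consistency condition can be sketched in Python as follows. This is an illustrative brute-force search over candidate parent sets of increasing size, not necessarily the procedure TBP uses. The function names and the sample trajectory are assumptions made for the example.

```python
from itertools import combinations


def is_consistent(trajectory, target, candidate):
    """True if every combination of candidate values is always followed
    by the same value of the target node."""
    seen = {}
    for current, nxt in zip(trajectory, trajectory[1:]):
        key = tuple(current[p] for p in candidate)
        value = nxt[target]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def infer_parents(trajectory, target, k_max):
    """Return the first consistent parent set with at most k_max members, or None."""
    n = len(trajectory[0])
    for k in range(k_max + 1):
        for candidate in combinations(range(n), k):
            if is_consistent(trajectory, target, candidate):
                return candidate
    return None


# Hypothetical trajectory of a 3-node network (one state per time step).
trajectory = [(1, 1, 1), (0, 1, 1), (0, 0, 1), (0, 0, 0), (1, 0, 0), (1, 0, 0)]
parents_of_node_2 = infer_parents(trajectory, target=2, k_max=2)
parents_of_node_0 = infer_parents(trajectory, target=0, k_max=2)
```

A short trajectory can leave several parent sets consistent with the data, so a set found this way is a candidate rather than a proven answer. Longer or more varied trajectories narrow the choice.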

Most important in real-world situations is the Intervention algorithm. During this process, the identified perturbation in the Boolean network is altered to move the network toward a more normal state. The program searches for the smallest number of genetic changes needed to return the network to its previous normal state. Simple as this may sound, many gene modifications may have to be made, a process that can prove costly and inefficient.
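
One way to picture the search for the fewest genetic changes is the exhaustive Python sketch below. It tries flipping sets of genes of increasing size and checks whether the network then reaches a given normal state within a fixed number of steps. The search strategy, the horizon parameter, the function names, and the example network are all assumptions for illustration, not the Intervention algorithm of TBP.

```python
from itertools import combinations


def step(state, parents, tables):
    """Synchronous update: each node reads its truth table at its parents' values."""
    nxt = []
    for node in range(len(state)):
        index = 0
        for p in parents[node]:
            index = (index << 1) | state[p]
        nxt.append(tables[node][index])
    return tuple(nxt)


def reaches(state, goal, parents, tables, horizon):
    """True if the goal state is reached within `horizon` update steps."""
    for _ in range(horizon + 1):
        if state == goal:
            return True
        state = step(state, parents, tables)
    return False


def minimal_intervention(state, goal, parents, tables, horizon):
    """Return the smallest set of genes whose flipping lets the network reach goal."""
    n = len(state)
    for size in range(n + 1):
        for genes in combinations(range(n), size):
            flipped = tuple(v ^ 1 if i in genes else v for i, v in enumerate(state))
            if reaches(flipped, goal, parents, tables, horizon):
                return genes
    return None


# Hypothetical network: node 0 keeps its value, node 1 copies node 0,
# node 2 = node 0 AND node 1. Both (0, 0, 0) and (1, 1, 1) are fixed points.
parents = [[0], [0], [0, 1]]
tables = [[0, 1], [0, 1], [0, 0, 0, 1]]
genes_to_flip = minimal_intervention((0, 0, 0), (1, 1, 1), parents, tables, horizon=3)
```

Here flipping gene 0 alone is enough, because the change then propagates to the other two nodes. The number of gene sets grows quickly with network size, which is why an exhaustive search like this is only practical for small networks.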

Since a network of n nodes has 2^n possible states, efficiency in determining the favorable gene therapy procedure is crucial. Thus, access to supercomputing resources will be the most effective way to advance.
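
To make the growth concrete: n Boolean nodes, each either 0 or 1, give 2^n joint network states. The short Python sketch below enumerates them for a small n and computes the count for larger n. The function name all_states is a hypothetical helper.

```python
from itertools import product


def all_states(n):
    """Enumerate every joint state of n Boolean nodes."""
    return list(product((0, 1), repeat=n))


states = all_states(3)                              # 8 states for 3 nodes
counts = {n: 2 ** n for n in (10, 20, 30)}          # state counts for larger networks
```

Each added node doubles the number of states, so listing them all quickly becomes impractical.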

Up to this point, the data structures and file formats for the Boolean network, the Boolean node, and the user specifications have been determined and implemented in the program. The structure and format of these objects are the foundation of the entire program.

The Boolean network structure: The Boolean Network class has two member variables: the list of nodes and the number of nodes. Member functions include the creation of random Boolean networks, the simulation process, and an input/output user selection.

The Boolean node structure: The member variables of the Boolean node class are the list of parents, the truth tables, and a saved Excel document of the values in the Boolean network. A member function evaluates the node, given the value(s) of its parent(s).
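
A rough Python sketch of the two classes might look like the following. The original implementation language is not stated, so Python is used here only for illustration. The attribute and method names are hypothetical, the number of nodes is derived from the list rather than stored separately, and the saved Excel document is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BooleanNode:
    parents: List[int]        # indices of the parent nodes
    truth_table: List[int]    # one output bit per combination of parent values

    def evaluate(self, state: Tuple[int, ...]) -> int:
        """Evaluate this node, given the current values of its parents."""
        index = 0
        for p in self.parents:
            index = (index << 1) | state[p]
        return self.truth_table[index]


@dataclass
class BooleanNetwork:
    nodes: List[BooleanNode] = field(default_factory=list)

    @property
    def num_nodes(self) -> int:
        return len(self.nodes)

    def step(self, state: Tuple[int, ...]) -> Tuple[int, ...]:
        """Advance the whole network by one synchronous update."""
        return tuple(node.evaluate(state) for node in self.nodes)


# Hypothetical 3-node example: NOT node 2, node 0 AND node 2, copy of node 1.
network = BooleanNetwork([
    BooleanNode([2], [1, 0]),
    BooleanNode([0, 2], [0, 0, 0, 1]),
    BooleanNode([1], [0, 1]),
])
next_state = network.step((1, 0, 1))
```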

The Boolean network format: The file format created for individual Boolean networks is an input/output text file containing a visual depiction of the truth tables, which can be easily translated into network trajectories using the Simulation algorithm.

Lastly, the user specifications have been determined to be the number of nodes the user wishes to create and the maximum number of parents per node, denoted k(max).
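
Given these two specifications, a random Boolean network could be generated along the lines of the Python sketch below. It assumes that every node has at least one parent, that parents are chosen uniformly without repetition, and that truth-table bits are drawn uniformly at random. The function name random_network and the seed parameter are hypothetical.

```python
import random


def random_network(num_nodes, k_max, seed=None):
    """Build random parent lists and truth tables for a Boolean network."""
    rng = random.Random(seed)
    parents, tables = [], []
    for _ in range(num_nodes):
        # Number of parents for this node: between 1 and k_max (assumption).
        k = rng.randint(1, min(k_max, num_nodes))
        chosen = sorted(rng.sample(range(num_nodes), k))
        parents.append(chosen)
        # One random output bit for each of the 2^k parent-value combinations.
        tables.append([rng.randint(0, 1) for _ in range(2 ** k)])
    return parents, tables


parents, tables = random_network(num_nodes=5, k_max=3, seed=1)
```

Passing a fixed seed makes the generated network reproducible, which helps when comparing simulation and inference runs.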

Expected results:
Ultimately, the final program will have a user-friendly interface with functions that make it easy to navigate. The process of generating random Boolean networks that contain perturbations will be incorporated into the program. More importantly, the program will allow a gene therapist to enter the data he or she has on a patient and use the program to find the most efficient course of action in accordance with the diagnosis. A fourth portion of the program will also be created, consisting of a graphical procedure. After obtaining the trajectories from the two genes, the graphical display will create a Flash-generated visualization of the cell population over time.

Team Members:

  Jerry Yeh
  Christopher Smith

Sponsoring Teacher: Gregory Marez
