Supercomputing Challenge

DNA: Duchenne's Not Allowed

Team: 14


Area of Science: Biochemistry

Problem Definition
Every year, a large number of boys die from Duchenne Muscular Dystrophy (DMD), a fatal form among the more than 70 forms of muscular dystrophy. In their efforts to develop a cure, scientists constantly run into funding shortages because it costs $62 a minute to continue their research.
The goal of our project is to create a program that will significantly reduce the amount of time and money needed to find the data crucial to an extended research effort. This program can be used in laboratories to analyze the specific DNA of each victim of DMD so the correct gene sequence can be substituted. This computer program will be a great contribution to scientists working for cures.

Problem Solution
Our program will be a Java-based model and will complete a series of tasks. First, it will find every mutated section of the sequence and discard the correct sections. Once these sections have been located, it will take each one individually and, with a side-by-side comparison to a correct section of the dystrophin gene, find and mark every difference. Next, the program will determine how to correct each error. Finally, it will print out the results for every mutated section for human analysis.
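The side-by-side comparison step can be illustrated with a minimal Java sketch. It assumes the two sections are already the same length, so it only catches substituted bases; missing or extra bases would need an alignment step that is not shown here. The class name SequenceDiff, its method names, and the sample sequences are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the team's actual program.
class SequenceDiff {

    // Returns the 0-based positions where two equal-length sequences differ.
    static List<Integer> findMismatches(String reference, String sample) {
        if (reference.length() != sample.length()) {
            throw new IllegalArgumentException("sequences must have equal length");
        }
        List<Integer> mismatches = new ArrayList<>();
        for (int i = 0; i < reference.length(); i++) {
            if (reference.charAt(i) != sample.charAt(i)) {
                mismatches.add(i);
            }
        }
        return mismatches;
    }

    // Builds a marker line with '^' under every differing base and ' ' elsewhere.
    static String markerLine(String reference, String sample) {
        List<Integer> mismatches = findMismatches(reference, sample);
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < reference.length(); i++) {
            line.append(mismatches.contains(i) ? '^' : ' ');
        }
        return line.toString();
    }

    public static void main(String[] args) {
        String reference = "ATGGCTTGA"; // made-up "correct" section
        String sample    = "ATGGATTGA"; // made-up "mutated" section
        System.out.println(reference);
        System.out.println(sample);
        System.out.println(markerLine(reference, sample));
        System.out.println("Mismatch positions: " + findMismatches(reference, sample));
    }
}
```

A text marker stands in for the color marking here; a graphical version could use the same list of positions to decide which bases to draw in red.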

Progress To Date
So far, we have developed pseudocode and a plan for our program. Our program will run as a series of algorithms:
Primo: This will read the correct and incorrect gene sequences (approximately 2.4 MB) into the program. After the first load, we plan on saving the correct sequence so it won’t have to be processed again on every run. Next, it will scan the base pairs, locate all start triplets, stop triplets, introns, and exons, and flag each type with a different color.
Segundo: This will use the results of Primo for the incorrect gene. It will count all base pairs from the first base of the start triplet to the last base of the stop triplet. Any such stretch that is not exactly 13,900 base pairs long will be put in a database, a figurative “mutated pile.”
James Bond: Because most cases of dystrophin mutation occur in introns and exons, this algorithm will compare the introns and exons to the “perfect sequence.” If they are different, the sequences will be added to the “mutated pile.”
Incorrecto: This takes one mutated section and removes all color. It figuratively places the correct and incorrect sections next to each other and compares them. Every incorrect area is marked red.
Recorrecto: There are many different outcomes for this algorithm. If the base pair is missing, it will insert an “A” and check if it is correct. If it is not, it will repeat the process with the other base pairs until it is correct. Once the correct base pair is found, it will insert it in blue. If the base pair is incorrect, the program will exchange it with each base pair in turn until the correct one is found. It will then write the correct base pair in green after the incorrect one. If there is an extra base pair, the program will remove the first red base pair in the chain and continue this until the sections match. Once the chain is repaired, the program will reinsert the incorrect base pairs with yellow parentheses around them.
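The scanning and length check described for Primo and Segundo above can be sketched roughly as follows. This is only a sketch under stated assumptions: the sequence is a plain string of A, C, G, and T; ATG is treated as the start triplet and TAA, TAG, and TGA as stop triplets (the standard codons); and intron and exon detection is left out. The class TripletScanner, its methods, and the toy sequence are hypothetical, and the expected length of 9 in the example is a stand-in for the 13,900 base pairs mentioned in the plan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the Primo and Segundo steps; names are illustrative only.
class TripletScanner {
    static final String START = "ATG";
    static final Set<String> STOPS = Set.of("TAA", "TAG", "TGA");

    // A stretch from the first base of a start triplet to the last base
    // of the next in-frame stop triplet. 'end' is exclusive.
    static final class Region {
        final int start;
        final int end;
        Region(int start, int end) { this.start = start; this.end = end; }
        int length() { return end - start; }
    }

    // Primo (simplified): locate every start-to-stop stretch in the sequence.
    static List<Region> findRegions(String dna) {
        List<Region> regions = new ArrayList<>();
        int i = 0;
        while (i + 3 <= dna.length()) {
            if (dna.startsWith(START, i)) {
                int end = findStop(dna, i + 3);
                if (end >= 0) {
                    regions.add(new Region(i, end));
                    i = end; // resume scanning after the stop triplet
                    continue;
                }
            }
            i++;
        }
        return regions;
    }

    // Reads triplets in frame from 'from'; returns the index just past the
    // first stop triplet, or -1 if none is found.
    static int findStop(String dna, int from) {
        for (int j = from; j + 3 <= dna.length(); j += 3) {
            if (STOPS.contains(dna.substring(j, j + 3))) {
                return j + 3;
            }
        }
        return -1;
    }

    // Segundo (simplified): keep only the stretches whose length is unexpected.
    static List<Region> mutatedPile(List<Region> regions, int expectedLength) {
        List<Region> pile = new ArrayList<>();
        for (Region r : regions) {
            if (r.length() != expectedLength) {
                pile.add(r);
            }
        }
        return pile;
    }

    public static void main(String[] args) {
        String dna = "CCATGAAATGACCATGCCCGGGTAA"; // made-up toy sequence
        for (Region r : mutatedPile(findRegions(dna), 9)) {
            System.out.println("Unexpected length " + r.length()
                    + " at positions " + r.start + " to " + (r.end - 1));
        }
    }
}
```

A length check of this kind only catches missing or extra base pairs; substitutions leave the length unchanged, which is why the plan follows it with a direct comparison against the correct sequence.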

Expected Results
We expect this program to be a success. With this information, we can take the location of the mutation and design a short hairpin RNA. This can be inserted into a harmless virus which can invade human cells and alter the dystrophin gene to the required specifications. Thus, a cure.

Mentor: Stuart Taylor

Team Members:

  Jessica MacKinnon
  Kathryn Moore

Sponsoring Teacher: Randall Gaylor
