CS200 Summary: Richard Chen
Lecture: 2/11/99
Summary written by Elizabeth C. Douglas
Bioinformatics is a distinctive area of computer science because it uses computers and associated applications to further biological research. Although Bioinformatics has been formally recognized for only about ten years, it has recently become very important, fueled by an explosion of biological data. Using computers, scientists, whether biologists, computer scientists, or a combination of the two, are able to study and analyze biological data in new, revealing, and useful ways.
One example of Bioinformatics is using the computer to predict the structure of a ribosome. Biologists hope that knowing the structure of the ribosome will allow them to design better antibiotics. Similarly, by studying other biological structures in three dimensions, scientists can more easily determine and measure the structure of genes, cells, and other biological entities. Furthermore, using the computer, this information can be quantified and categorized, making it more useful to people not directly involved in its creation.
One large-scale project undertaken by scientists involved in Bioinformatics is the Human Genome Project, a sequencing effort in which scientists are trying to create and document a genetic blueprint of the human genome, along with the genomes of several model organisms. This data is useful to many different kinds of people, so efforts are being made to publish it both in journals and in public databases, since the information is useless unless it is processed and available for analysis. Once processed, this information has various benefits, including the clinical benefits that will result from having the entire human genome documented.
Other biological research in this field focuses on understanding DNA sequences. Since a single mutation in a gene can cause a disease, it would be useful if scientists could determine how genes affect proteins in the human body. While this information can be processed many times faster with computers, integrating the data so that it can be of use to others is a huge challenge. For instance, scientists hypothesize that if a high degree of homology exists between two sequences, the two genes might have similar functions. However, without an efficient means of comparing the sequences, they cannot even begin to identify and study the similarities.
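To make the idea of comparing two sequences concrete, here is a minimal sketch of a global alignment score in the style of the Needleman-Wunsch dynamic programming method. This is only an illustration and was not presented in the lecture. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity; real tools use more elaborate scoring schemes.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score the best global alignment of sequences a and b.

    The scoring values are illustrative assumptions, not standard
    parameters. Only the score is computed, not the alignment itself.
    """
    # prev[j] is the best score for aligning the first i-1 letters of a
    # with the first j letters of b. Row 0 aligns an empty prefix of a
    # against b, which costs one gap per letter of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 aligns the first i letters of a against nothing.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = prev[j - 1] + pair   # align a[i-1] with b[j-1]
            up = prev[j] + gap              # a[i-1] aligned to a gap
            left = curr[j - 1] + gap        # b[j-1] aligned to a gap
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # A higher score suggests the two sequences are more similar.
    print(global_alignment_score("GATTACA", "GATCACA"))
    print(global_alignment_score("GATTACA", "CCCCCCC"))
```

Under these assumed scores, two sequences that differ by a single letter score much higher than two unrelated sequences, which is the kind of signal a scientist would use as a first hint of homology.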
Another use of Bioinformatics is the analysis of gene expression data. By looking at gene expression over time and under different experimental conditions, scientists hope to determine which genes are active in diseased organisms but not active in a similar, healthy control organism.
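As a rough illustration of comparing gene activity between a diseased organism and a control, the sketch below flags genes whose expression level differs by at least a chosen log2 fold change. It was not part of the lecture. The function name, the gene names, the expression values, and the threshold are all hypothetical, and real analyses also need replicates and statistical testing.

```python
import math


def differentially_active(diseased, control, min_log2_fold=1.0):
    """Return {gene: log2 fold change} for genes that differ enough.

    diseased and control map gene names to positive expression levels.
    The threshold of 1.0 (a two-fold change) is an assumed default.
    """
    flagged = {}
    # Only genes measured in both samples can be compared.
    for gene in diseased.keys() & control.keys():
        d, c = diseased[gene], control[gene]
        if d <= 0 or c <= 0:
            continue  # a log ratio is undefined for non-positive levels
        log2_fold = math.log2(d / c)
        if abs(log2_fold) >= min_log2_fold:
            flagged[gene] = log2_fold
    return flagged


if __name__ == "__main__":
    # Hypothetical expression levels, invented for illustration only.
    diseased = {"geneA": 8.0, "geneB": 1.0, "geneC": 2.0}
    control = {"geneA": 2.0, "geneB": 1.0, "geneC": 8.0}
    print(differentially_active(diseased, control))
```

A positive value means the gene is more active in the diseased sample, and a negative value means it is less active than in the control.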
A somewhat different focus of Bioinformatics is databases. By creating efficient, useful databases, scientists hope to organize all the data that is collected and make it accessible to others. To be useful and successful, such a database must offer information retrieval, universal access, continuous maintenance, support for scientific communication, and standardized terminology. For anyone interested in visiting an online biological database, Richard Chen recommended the URL: http://www.ncbi.nlm.nih.gov.
For those interested in taking classes dealing with Bioinformatics, Richard mentioned both MIS 214 and MIS 231. MIS 214 is a class in which students write algorithms and do analysis, whereas MIS 231 is more of an overview class in which students survey issues and problems. Other classes in this discipline focus on Biochemistry, Molecular Biology, Computer Science, Statistics, and Decision Theory.
Another field, Clinical Informatics, is also associated with the MIS program. Clinical Informatics is the study of information flow in support of patient care. At Stanford, it coexists with the Bioinformatics program. The research focuses on Medical Expert Systems, including Medical Imaging (images of anatomy), Medical Information Systems, and Evidence-Based Medicine (a systematic collection which would provide information for evaluation). Classes associated with Clinical Informatics are MIS 210 and MIS 211. There is a lot of interest in Clinical Informatics in industry, since it would assist companies in the drug discovery process, which is currently very costly and risky.
So where exactly do these Bioinformatics people come from? Since most people focus on either Biology or Computer Science as undergraduates, both computer scientists and biologists can become involved in Bioinformatics, provided they are strong candidates in their own field. For anyone interested in Bioinformatics, Richard recommends taking Computer Science, Biology, Math, and Statistics classes as an undergraduate. Richard also noted that a person should be aggressive and proactive in following his or her interests and pursuing research.
Richard Chen, a Ph.D./M.D. candidate, was both a Computer Science major and a Pre-Med student during his undergraduate career. He predicts his graduate program will take about eight years in total, with two years spent on his M.D., four years on his Ph.D., and two years doing clinical M.D. work. While he is clearly enthusiastic about his work in Bioinformatics under his advisor, Russ Altman, he is also involved in a venture-funded start-up company called Ingenuity Systems, which has been growing for the past six months.
If one wants to contact Richard, his email is rchen@smi.stanford.edu or rchen@ingsys.com.