Background
Arun Jagota was born on February 23, 1959, in New Delhi.
Gene chips are enabling the simultaneous monitoring of the expression levels of thousands of genes. This revolutionary technology holds great potential for the diagnosis and treatment of disease. It also has immense scientific value, since it facilitates studies of the expression levels of all genes in a genome. To realize the benefits of this technology, the flood of data being produced needs to be analyzed in various ways. This short book presents statistical and computational methods that are being used, or will be used, for these analyses. Covered methods include statistical hypothesis tests for differential expression analysis, principal components analysis and other methods for visualizing very high-dimensional microarray data, cluster analysis for grouping together genes or samples with similar expression patterns, and neural networks and other classifiers for predictively classifying sample expression patterns as one of several types (e.g., cancerous or not). There is also a brief chapter on the emerging area of inferring gene networks from microarray data. Most of the covered methods are supported by some software package, public domain or commercial, for microarray data analysis.
The biologist working on or having a keen interest in microarrays will find it very important to understand these methods. She can skip some of the algorithmic details (e.g., the inner workings of PCA, MDS, SOMs, or neural networks) but should understand the statistical hypothesis testing sections thoroughly. (It is essential to obtain a thorough understanding of the inner workings of a statistical test to be able to use it effectively.) The mathematician or the computer scientist working on or having a keen interest in microarray data analysis will find the algorithmic descriptions useful. This short book covers the foundations of the topic.
This material has been (and continues to be) classroom-tested by the author in a short course, "Microarray Data Analysis".
http://www.amazon.com/gp/product/097002973X/?tag=2022091-20
Informatics in the genomic setting involves the storage, search, and manipulation of textual data such as DNA and protein sequences and related matter such as structural or functional annotations, authors, cross-links to other sequences or structures, and more. (See the GenBank, EMBL, and PDB databases to get a better idea.) While some of a user's search and manipulation needs may be met by generic web interfaces, often they won't be. Some needs may not have been anticipated when the web interface was constructed; others were perhaps too complex to support over the web. Perl is an ideal language for such needs. With Perl one will sometimes be able to solve a data search or manipulation problem in a matter of hours, if not minutes. (Solving the same problem in C/C++/Java could take much, much longer.)
Here are some particular problems in this setting for which Perl is a great fit: searching sequence databases with regular expression patterns; parsing entries in databases (e.g., reading a GenBank entry of a gene and extracting its exons); converting database entries from one format to another (e.g., converting from GenBank to EMBL format); and using standard sequence analysis tools written in Perl (see the various Perl packages supported by the BioPerl project).
This short book introduces Perl to the biologist or computer scientist interested in or working in bioinformatics. Chapter 1 covers data types. Chapter 2 covers control structures. Chapter 3 covers input and output. Chapter 4 covers regular expressions. Chapter 5 covers handy functions on strings. Chapter 6 covers subroutines. All these chapters contain illustrative examples from bioinformatics. They cover only those features of Perl that are particularly important to know in the context of search and manipulation of biomolecular data. In particular, Unix-specific features such as those involving Unix file, directory, and process management are omitted.
Chapter 7 presents several Perl scripts for various common bioinformatics tasks. Chapter 8 covers the BioPerl project. Chapter 9 presents some modules, with examples from bioinformatics.
http://www.amazon.com/gp/product/0970029721/?tag=2022091-20
With the explosion of sequence data in public and private databases, and the coming explosion of gene expression data in a similar vein, it is becoming increasingly important to understand how to apply well-established data analysis and data classification methods developed in other fields to this one: to try to make sense of the data, to glean biological insights from it, to categorize the data, and to put all of these to good use in industrial applications. This book introduces the main methods of data analysis and of data classification, as applied to sequence and gene expression analysis, to the biologist and to the computer scientist in this field. It contains material that is presently being taught by the author in the course Data Analysis, Modeling, and Visualization for Bioinformatics at the University of California, Santa Cruz Extension to workers in the biotechnology industry in Silicon Valley.
http://www.amazon.com/gp/product/0970029705/?tag=2022091-20
Bachelor of Technology, Indian Institute of Technology, 1981. Master of Science, University of Kansas, 1984. Doctor of Philosophy, State University of New York, Buffalo, 1993.
Software analyst/programmer, Intel Corporation, Portland, Oregon, 1984–1987. Visiting faculty, University of Memphis, 1993–1995; University of North Texas, Denton, 1995–1996. Affiliated faculty, University of California, Santa Cruz, 1996–2000.
Adjunct faculty, Santa Clara University (California), since 1999. Instructor, University of California, Berkeley, 1996; Santa Clara, 1999; San Jose State University (California), 1999–2000.
Married.