KPO: Bio Information Services - Knowledge Process Outsourcing

Bioinformatics is the use of IT in biotechnology for storing, warehousing, and analyzing data such as DNA sequences. It draws on many fields: biology, mathematics, computer science, physics, and chemistry, along with a sound knowledge of IT for analyzing biotech data. Bioinformatics is not limited to processing data; it can be used to solve many biological problems and to find out how living things work.

Skills Required to Become a Successful Bioinformatician

As mentioned earlier, the bioinformatics profession requires a wide range of skills, and it is not possible to learn all of them. Here are the topics that are essential for entering this profession.

1. Molecular Biology

2. Central Dogma of molecular biology

3. Experience with one or more molecular biology software packages. Learn to use sequence analysis and molecular modeling software. Some of the molecular biology packages are GCG, BLAST, and FASTA.

4. Learn Unix or Linux

Unix and Linux (the latter free and open source) are used extensively in biotechnology for their robustness and for the tools and software available on these platforms, so it is very important to learn these operating systems.

5. Computer programming languages: a bioinformatician should know languages such as C/C++, Perl or Python, Java, and HTML.

6. Database Management Systems

Learn Oracle and MySQL (a free database server), which are used extensively to store gigabytes of biotech data for further analysis.
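As a rough illustration of storing sequence data in a relational database, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle or MySQL so that it runs without a server; the table layout, the function names, and the sample records are all assumptions made for this example, not a standard schema.

```python
import sqlite3

def build_sequence_db(records):
    """Create an in-memory database and load (accession, organism, residues) rows.

    The table and column names are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, "
        "organism TEXT NOT NULL, "
        "residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", records)
    conn.commit()
    return conn

def sequences_for(conn, organism):
    """Return (accession, sequence length) pairs for one organism."""
    cursor = conn.execute(
        "SELECT accession, length(residues) FROM sequences "
        "WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return cursor.fetchall()

# Made-up demo records; the accessions are not real database identifiers.
demo_records = [
    ("DEMO001", "yeast", "ATGGCCTAA"),
    ("DEMO002", "yeast", "ATGAAATTTGGGTAG"),
    ("DEMO003", "fly", "ATGCCCTGA"),
]
demo_conn = build_sequence_db(demo_records)
yeast_rows = sequences_for(demo_conn, "yeast")
```

The same SQL would carry over to a server database with only the connection code changed, which is one reason to learn the query language rather than a single product.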

Over the last ten years or so, numerous innovations have emerged, and the consequence is a new biological research paradigm, one that is information-heavy and computer-driven. As genetic information is collected in computerized databases whose sizes are steadily growing, molecular biologists need effective and efficient computational tools to store and retrieve the associated bibliographic or biological information from those databases, to analyze the sequence patterns they contain, and to extract the biological knowledge the sequences hold. There is also a strong need for mathematical methods and computational techniques for challenging tasks such as predicting the three-dimensional structure of the molecules the sequences represent and constructing evolutionary trees from sequence data. These tools will also be used to learn basic facts about biology, such as which DNA sequences code for proteins and which do not, leading to a greater understanding of genes and how they influence diseases.
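To make the idea of finding protein-coding stretches concrete, here is a minimal sketch of an open reading frame (ORF) scan. It is a deliberate simplification: it looks only at the three forward reading frames, treats ATG as the only start codon, and ignores the reverse strand and everything real gene finders account for. The function name and the min_codons parameter are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop stretch.

    Scans the three forward reading frames only. `end` is exclusive and
    includes the stop codon. `min_codons` counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # remember where this candidate ORF begins
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next start codon in this frame
    return orfs

demo_orfs = find_orfs("CCATGGCCTAAGG")
```

In the demo string, the only ORF sits in the third reading frame and spans ATG, GCC, and the stop codon TAA.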

Biology represents its information in a digital language built on a four-letter alphabet (A, C, G, T). All the chromosomes in an organism's cells are represented and identified using these letters. The demanding challenge is to determine how this digital language of the chromosomes is converted into the three-dimensional, and sometimes four-dimensional, languages of living and breathing organisms.
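Because the alphabet has only four letters, each base can be stored in two bits. The sketch below shows one way to do that; the particular bit assignment (A=00, C=01, G=10, T=11) and the function names are arbitrary choices made for illustration, and other assignments work equally well.

```python
# Illustrative 2-bit code per base; the assignment itself is arbitrary.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}

def encode(sequence):
    """Pack a DNA string into a single integer, two bits per base."""
    value = 0
    for base in sequence.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value

def decode(value, length):
    """Unpack `length` bases from an integer produced by encode()."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])  # read the lowest two bits
        value >>= 2
    return "".join(reversed(bases))

packed = encode("ACGT")            # bits 00 01 10 11
restored = decode(encode("GATTACA"), 7)
```

The length has to be stored alongside the integer, since leading A bases encode as zero bits and would otherwise be lost.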

Information Technology in Biology

Performing all of the tasks mentioned above manually is nearly impossible because of the massive volumes of biological data and the precision the work demands, so computers became mandatory for these purposes. Bioinformatics therefore deals with designing and deploying efficient software tools that accomplish these tasks quickly and precisely. Bridging the gap between the real world of biology and the precise, logical nature of computers requires an interdisciplinary perspective.

Software and Hardware Advancements in Biology

The tools of computer science, statistics, and mathematics are critical for studying biology as an informational science.

Recent advances include improved DNA sequencing methods, new approaches to identifying protein structure, and revolutionary methods for monitoring the expression of many genes in parallel. Designing techniques that can deal with different sources of incomplete and noisy data has become another crucial goal for the bioinformatics community. In addition, there is a need for computational solutions based on theoretical frameworks that allow scientists to perform complex inferences about the phenomena under study.

Genomics in the recent past has triggered the development of high-throughput instrumentation for DNA sequencing, DNA arrays, genotyping, proteomics, etc. These instruments have catalyzed a new type of science for biology termed discovery science.

Human Genome Project - An Introduction

The Human Genome Project has encouraged a series of paradigm changes leading to the view that biology is an informational science. The draft of the human genome has given us a genetic parts list of what is necessary for building a human: approximately 35,000 genes (the estimate at the time of the draft; later analyses lowered it to roughly 20,000 protein-coding genes), their regulatory regions, a lexicon of motifs that are the building-block components of proteins and genes, and access to the human variability that makes us each different from one another.

Genomes - Discovering Methodology and Study

Discovery science defines all of the elements in a biological system, for example the sequence of the genome, or the identification and quantitation of all of the mRNAs or proteins in a particular cell type (respectively the genome, the transcriptome, and the proteome). Discovery science creates databases of information, in contrast to the more classical hypothesis-driven science that formulates hypotheses and attempts to test them. The high-throughput tools both provide the means for discovery science and can assay how global information sets, for example transcriptomes or proteomes, change as systems are perturbed.

The genomes of model organisms such as yeast, worm, and fly have demonstrated the fundamental conservation of the basic informational pathways among all living organisms. Hence systems can be perturbed in model organisms to gain insight into their functioning, and these data will provide fundamental insights into human biology. From the genome, the information pathways and networks can be extracted to begin understanding the logic of life. Furthermore, different genomes can be compared to identify similarities and differences in the strategies for the logic of life, and these comparisons provide fundamental insights into development, physiology, and evolution. The first eukaryotic genome to be fully sequenced and annotated was that of Saccharomyces cerevisiae, which has greatly helped the development of biological and computational tools for genomic and postgenomic research.

In the era of automated DNA sequencing and revolutionary advances in DNA sequence analysis, the attention of many researchers is shifting away from the study of single genes or small gene clusters toward whole-genome analyses. Knowing the complete sequence of a genome is only the first step in understanding how the wealth of information contained within the genes is transcribed and ultimately translated into functional proteins. In the post-genomic era, functional genomic and proteomic studies help to build a picture of the dynamic cell.

Systems Biology

Biology is a highly informational science. There are mainly two types of biological information.

* The information of genes or proteins, which are the molecular machines of life

* The information of the regulatory networks that coordinate and specify the expression patterns of the genes and proteins.

All biological information is hierarchical. DNA is transcribed into mRNA, which in turn is translated into protein. Proteins engage in protein interactions, which create informational pathways. These pathways form informational networks, which in turn make up cells. Cells form networks of cells, and an individual is a collection of cells. A host of individuals forms a population, and a variety of populations makes up an ecology. This hierarchy poses a primary challenge for researchers and scientists: to create tools and mechanisms that capture these different levels of biological information and integrate them to gain insight into how they function.
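The first two levels of this hierarchy, DNA to mRNA to protein, can be sketched in a few lines. The example below assumes the input is the coding strand of the DNA (so transcription reduces to replacing T with U) and uses the standard genetic code; the function names are chosen for this illustration only.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")  # table is keyed on DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

demo_mrna = transcribe("ATGGCCTAA")
demo_protein = translate(demo_mrna)
```

Here ATG encodes methionine (M), GCC encodes alanine (A), and TAA is a stop codon, so the demo yields the two-residue peptide "MA".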

All of these paradigm shifts lead to the view that the major challenges for biology and medicine in this new century will be the study of complex systems and the approach necessary for studying these biological complexities. One viable approach is as follows:

* Identify all elements in the system, such as the sequences of genomes, with currently available discovery tools.

* Use current knowledge of the system to formulate a model predicting its behavior.

* Perturb the system in a model organism using biological, genetic, or environmental perturbations; capture information at all relevant levels, such as DNA, mRNA, protein, and protein interactions; and integrate the collected information.

* Compare theoretical predictions and experimental data, carry out additional perturbations to bring theory and experiment into closer agreement, and integrate the new data into the model.

* Iterate the third and fourth steps until the mathematical model can predict the structure of the system and its systems or emergent properties given particular perturbations.
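The perturb, compare, and refine loop described in the steps above can be sketched with a toy example. Everything here is an assumption made for illustration: the "experiment" is a stand-in function with a hidden linear response, the model has a single parameter k, and the update rule is a plain gradient step. Real systems biology models are far richer, but the control flow is the same.

```python
def run_experiment(perturbation):
    """Stand-in for a wet-lab measurement; the hidden rule is response = 2.5 * dose."""
    return 2.5 * perturbation

def refine_model(perturbations, rate=0.1, tolerance=1e-6, max_rounds=1000):
    """Adjust the model parameter k until predictions match the experiment.

    Model: predicted response = k * perturbation. Returns (k, rounds used).
    """
    k = 1.0  # initial guess from "current knowledge"
    for round_number in range(1, max_rounds + 1):
        worst_error = 0.0
        for dose in perturbations:
            observed = run_experiment(dose)      # perturb and measure
            predicted = k * dose                 # model prediction
            error = observed - predicted         # compare theory and experiment
            k += rate * error * dose             # fold the new data into the model
            worst_error = max(worst_error, abs(error))
        if worst_error < tolerance:              # model now predicts the system
            return k, round_number
    return k, max_rounds

fitted_k, rounds_used = refine_model([0.5, 1.0, 2.0])
```

The loop stops once every prediction in a round falls within the tolerance, which mirrors the stopping condition in the last step of the list.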

Systems Biology - Challenges Ahead

* The integration of technology, biology, and computation.

* The integration of the various levels of biological information and the modeling.

* The proper annotation of biological information and its storage and integration in databases.

* The inclusion of other molecules, large and small, in the systems approach.

* The integration imperatives of systems biology present many challenges to industry and academia.

The Definition of Bioinformatics

Bioinformatics is the analysis of biological information using computers and statistical techniques; it is the science of developing and using computer databases and algorithms to accelerate and enhance biological research. Bioinformatics is more a set of tools than a discipline: the tools for analyzing biological data.

The National Center for Biotechnology Information (NCBI 2001) defines bioinformatics as:

"Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. There are three important sub-disciplines within bioinformatics: the development of new algorithms and statistics with which to assess relationships among members of large data sets; the analysis and interpretation of various types of data including nucleotide and amino acid sequences, protein domains, and protein structures; and the development and implementation of tools that enable efficient access and management of different types of information."

From Webopedia:

The application of computer technology to the management of biological information. Specifically, it is the science of developing computer databases and algorithms to facilitate and expedite biological research. Bioinformatics is being used largely in the field of human genome research by the Human Genome Project that has been determining the sequence of the entire human genome (about 3 billion base pairs) and is essential in using genomic information to understand diseases. It is also used largely for the identification of new molecular targets for drug discovery.

The three terms bioinformatics, computational biology, and bioinformation infrastructure are often used interchangeably. They may be defined as follows:

1. bioinformatics refers to database-like activities, involving persistent sets of data that are maintained in a consistent state over essentially indefinite periods of time;

2. computational biology encompasses the use of algorithmic tools to facilitate biological analyses; while

3. bioinformation infrastructure comprises the entire collective of information management systems, analysis tools and communication networks supporting biology. Thus, the latter may be viewed as a computational scaffold of the former two.


Wednesday, December 7, 2011

Bio Information Technology