Information Retrieval
Title: Bioinformatic Database Integration Using Data Fusion Approach
Work by: Adel Ardalan
Supervisor: Dr. Behzad Moshiri
Advisor: Dr. Masoud Rahgozar
Team Members: None

Problem Definition, Goal and Importance:
Bioinformatics is the development of algorithms, computational or statistical approaches, and theories for solving scientific and formal problems that arise from the analysis of biological data. In other words, it is the science of extracting knowledge from biological data sources through computer analysis, solving biological problems with informatics approaches, and introducing effective algorithms and methods for analyzing biological data.
Biological data are heterogeneous by nature: the same subject can be represented in different formats, such as numeric data, stream data, and images (2D or 3D). This heterogeneity makes it difficult for biological data warehouses to offer a single, unified view of the data and of the operations on them.
The existence of such heterogeneous and distributed biological databases makes integration mandatory, so that the databases can be placed in a single software/hardware framework and managed centrally.
Integration has many benefits: relations between different data about the same subject become easier to find; higher-quality information and new semantics can be extracted from relations between data in different sources; and the data themselves can be better understood.
Data fusion is another approach to handling biological data: it merges data from different sources, such as databases, sensors, and images, in order to better understand the environment. Its benefits closely match the desirable features of an information integration system, such as reduced redundancy, more complete data, shorter response time, and lower cost.
Classifier fusion is an important topic in pattern recognition. Methods used for it include neural networks and fuzzy operators such as OWA (ordered weighted averaging).
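As a rough illustration of classifier fusion with OWA, the sketch below combines the per-class scores of several classifiers. The OWA operator sorts its inputs in descending order and takes a weighted sum with position-based weights that add up to 1. The function names, the class labels, the example scores, and the weight vector are all assumptions made for illustration; they are not taken from the project.

```python
from typing import Dict, List, Sequence, Tuple


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted averaging: weights apply to rank positions, not sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(scores, reverse=True)  # largest score gets the first weight
    return sum(w * s for w, s in zip(weights, ordered))


def fuse_classifiers(
    outputs: List[Dict[str, float]], weights: Sequence[float]
) -> Tuple[str, Dict[str, float]]:
    """Fuse per-class scores from several classifiers and pick the best class."""
    classes = outputs[0].keys()
    fused = {c: owa([out[c] for out in outputs], weights) for c in classes}
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical scores from three classifiers for two made-up classes.
outputs = [
    {"enzyme": 0.9, "non_enzyme": 0.1},
    {"enzyme": 0.6, "non_enzyme": 0.4},
    {"enzyme": 0.3, "non_enzyme": 0.7},
]
label, fused = fuse_classifiers(outputs, weights=[0.5, 0.3, 0.2])
print(label, fused)
```

The choice of weights controls the behavior of the operator: `[1, 0, 0]` reduces it to the maximum, `[0, 0, 1]` to the minimum, and equal weights to the arithmetic mean.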
Approach:
Many approaches are used for data integration; XML is one of them. Among pattern recognition techniques, OWA is of particular importance. In this project, the methods are validated by running analysis and simulation programs and, if possible, by performing biological experiments and analyzing their output with those programs.
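To make the XML-based integration idea concrete, the following minimal sketch merges records from two sources into one document by matching a shared identifier. The element names, the `id` attribute, the two sample documents, and the `integrate` function are hypothetical and only illustrate the idea; real biological databases use their own schemas.

```python
import xml.etree.ElementTree as ET

# Two made-up sources describing the same entries in different ways.
SEQ_XML = """
<sequences>
  <entry id="G1"><sequence>ATGC</sequence></entry>
  <entry id="G2"><sequence>GGTA</sequence></entry>
</sequences>
"""
STRUCT_XML = """
<structures>
  <entry id="G1"><fold>alpha-beta</fold></entry>
</structures>
"""


def integrate(*xml_docs: str) -> ET.Element:
    """Merge <entry> elements from all documents, keyed by their id attribute."""
    root = ET.Element("integrated")
    merged = {}  # id -> merged <entry> element
    for doc in xml_docs:
        for entry in ET.fromstring(doc).findall("entry"):
            key = entry.get("id")
            target = merged.get(key)
            if target is None:
                target = ET.SubElement(root, "entry", id=key)
                merged[key] = target
            for child in entry:
                target.append(child)  # collect fields from every source
    return root


integrated = integrate(SEQ_XML, STRUCT_XML)
print(ET.tostring(integrated, encoding="unicode"))
```

After the merge, entry `G1` carries both its sequence and its fold, which gives the single view over the sources that integration aims for. The sketch does not handle conflicting values for the same field.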
Research Prerequisites:
[1] B. A. Pierce, “Genetics: A Conceptual Approach”, 2002, W. H. Freeman.
[2] D. L. Hall, J. Llinas, “Handbook of Multisensor Data Fusion”, 2001, CRC Press LLC.
[3] M. R. Barnes, I. C. Gray, “Bioinformatics for Geneticists”, 2003, John Wiley and Sons Ltd.
[4] N. C. Jones, P. A. Pevzner, “An Introduction to Bioinformatics Algorithms”, 2004, The MIT Press.
[5] G. R. G. Lanckriet et al., “A Statistical Framework for Genomic Data Fusion”, Bioinformatics, vol. 20, no. 16, 2004.
[6] H. Chuang, “Combination Methods in Microarray Analysis”, Proc. 7th Intl. Symp. Parallel Architectures, Algorithms and Networks, 2004.
[7] C. Y. Lin et al., “Feature Selection and Combination Criteria for Improving Predictive Accuracy in Protein Structure Classification”, Proc. 5th IEEE Symp. Bioinformatics and Bioengineering, 2005.
[8] B. Olsson, “An Information Fusion Approach to Controlling Complexity in Bioinformatics Research”, Proc. 2005 IEEE Computational Systems Bioinformatics Conference Workshops, 2005.