0797 - Special Topics: Biological Sequence Analysis (Taught in English)

Biological Sequence Analysis

Course Objectives

1. Introduce the models and computational methods used in practice in molecular biology. Students see first-hand how graph theory, probability theory, statistical methods, data structures, algorithms, and other mathematical models are applied to molecular biology through the complete modeling process.
2. Bioinformatics is an interdisciplinary field, and the undergraduate years are the best time to begin learning one. This course is an introduction to bioinformatics; afterwards, students can take related courses in other departments to lay the foundation for further development. This is a necessary step in interdisciplinary study: to work across fields, one must step into the fields being crossed.

Course Description

What is "computational biology" (also called bioinformatics)?

DNA is written with four letters: a, t, c, g. The following is an example of a DNA sequence:

atgcactctt caatagtttt ggccaccgtg ctctttgtag cgattgcttc agcatcaaaa acgcgagagc tatgcatgaa atcgctcgag catgccaagg ttggcaccag caaggaggcg

(By convention, the letters are written in blocks of 10, separated by a space. This example contains 120 letters, so we say its length is 120.)

Human DNA is about 3 billion letters long, and these 3 billion letters determine a person. The main goal of the Human Genome Project, whose planning began in 1988 and which formally launched in 1990, was to write out these 3 billion letters. How the letters actually work remains to be understood. The secrets of life hidden in these letters are what we call biological information.

DNA directs the production of proteins, which build a living organism. Proteins are written with an alphabet of 20 letters, each representing one amino acid, and typically range in length from a few dozen to several hundred letters (some are much longer). The following is an example of a protein sequence:

mhssivlatv lfvaiasask trelcmksle hakvgtskea kqdgidlykh mfehypamkk yfkhrenytp advqkdpffi kqgqnillac hvlcatyddr etfdayvgel marherdhvk

Humans have roughly 20,000 protein-coding genes. How the resulting proteins build a living organism also remains to be understood, and the secrets hidden in their letters are likewise biological information. Studying how DNA works and how proteins create life, that is, studying biological information, is the aim of today's rapidly growing life sciences.

A "biological sequence" is a DNA sequence or a protein sequence. Devising effective methods (algorithms or models) for analyzing biological sequences, and using computers as tools to mine the biological information hidden in vast numbers of letters, is what we call "computational biology" or "bioinformatics".
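The two example sequences above can be related by a short computation. The following is a minimal Python sketch and is not part of the course materials: the helper names `parse_blocks` and `translate` are hypothetical, and the sketch assumes the standard genetic code with the reading frame starting at the first letter. Under those assumptions, the 120-letter DNA example translates to the first 40 letters of the protein example.

```python
from itertools import product

# Standard genetic code (assumption: no organism-specific variant is needed).
# Codons are enumerated in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_blocks(text):
    """Join space-separated blocks of 10 letters into one uppercase sequence."""
    return "".join(text.split()).upper()


def translate(dna):
    """Translate DNA three letters at a time from position 0 (hypothetical helper).

    Translation stops at the first stop codon; trailing letters that do not
    form a complete codon are ignored.
    """
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


DNA_EXAMPLE = (
    "atgcactctt caatagtttt ggccaccgtg ctctttgtag cgattgcttc agcatcaaaa "
    "acgcgagagc tatgcatgaa atcgctcgag catgccaagg ttggcaccag caaggaggcg"
)
PROTEIN_EXAMPLE = (
    "mhssivlatv lfvaiasask trelcmksle hakvgtskea kqdgidlykh mfehypamkk "
    "yfkhrenytp advqkdpffi kqgqnillac hvlcatyddr etfdayvgel marherdhvk"
)

dna = parse_blocks(DNA_EXAMPLE)
protein = translate(dna)

print(len(dna))   # 120
print(protein)    # MHSSIVLATVLFVAIASASKTRELCMKSLEHAKVGTSKEA
print(parse_blocks(PROTEIN_EXAMPLE).startswith(protein))  # True
```

Each group of three DNA letters (a codon) maps to one protein letter, which is why 120 DNA letters yield 40 protein letters. Building the table from a 64-character string keeps the sketch short; a real analysis would more likely rely on an established library than on a hand-written table.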

Reference Books

Textbook:
An Introduction to Bioinformatics Algorithms
Neil C. Jones and Pavel A. Pevzner
2004, MIT Press

References:
1. Algorithms on Strings, Trees, and Sequences
Dan Gusfield
1997, Cambridge University Press

2. Biological Sequence Analysis
R. Durbin et al.
1998, Cambridge University Press

3. Computers and Intractability
Michael R. Garey and David S. Johnson
1979, W. H. Freeman and Company

4. Bioinformatics for Biologists
Pavel A. Pevzner et al.
2011, Cambridge University Press

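Grading

Grading Item	Percentage	Description
Midterm exam	50	Oral exam. Besides testing how well students understand the course content, oral expression and the ability to read English-language materials are important grading criteria. Students must be able to present any one topic in full (starting from a real biological problem → modeling, computation, and analysis of computational complexity → answering the biological problem from the computed results (data)) and to answer questions.
Final exam	50	Same as above.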

Course Information

Basic Information

Course Code: 0797
Credit: 0-3
Course Time: Friday/3,4,B[ST527]
Teacher: 謝維華
Class: Department of Applied Mathematics, years 2-4

Enrollment Status

Current Enrollment: 20

Tunghai University Exchange Student Course Information Site
