Department of Applied Mathematics
Course information for 108-1 | 5475 Select Writings on Computational Biology (計算生物論文解析)

5475 - Select Writings on Computational Biology (計算生物論文解析)


Course Objectives

1. This semester we will read academic papers on the relationship between amino acid sequence and protein function.
2. Why this topic matters: understanding the relationship between amino acid sequence and protein function is a long-standing goal of molecular biology, with far-reaching impact and broad applications. To date, the functions of about one third of microbial proteins remain unknown, to say nothing of proteins from higher organisms, so there is still a long way to go.
3. We focus on papers that approach this problem computationally. These papers use protein databases (such as Pfam), in which the relationship between amino acid sequence and protein function is already known, to build models (such as deep learning models and hidden Markov models) that aim to predict protein function from amino acid sequence.
4. Pfam is an example of big data, and methods such as deep learning belong to AI.


Course Description

What is "computational biology" (also called bioinformatics)?

DNA is written with four letters: a, t, c, and g. The following is an example of a DNA sequence:

atgcactctt caatagtttt ggccaccgtg ctctttgtag cgattgcttc agcatcaaaa acgcgagagc tatgcatgaa atcgctcgag catgccaagg ttggcaccag caaggaggcg

(By convention, the letters are written in blocks of 10, separated by a space. This example has 120 letters, so we say its length is 120.)

The total length of human DNA is 3 billion letters, and these 3 billion letters determine a person. The main purpose of the Human Genome Project, which began in 1988, was to write out these 3 billion letters. How the letters actually work remains to be understood. We call the secrets of life hidden in these letters biological information.

DNA produces proteins, which build living organisms. A protein is written with 20 letters of the English alphabet, each representing one amino acid, and its length ranges from tens to hundreds of letters. The following is an example of a protein sequence:

mhssivlatv lfvaiasask trelcmksle hakvgtskea kqdgidlykh mfehypamkk yfkhrenytp advqkdpffi kqgqnillac hvlcatyddr etfdayvgel marherdhvk

Humans have about 20,000 different proteins. How these proteins give rise to life remains to be understood, and the secrets hidden in these letters are also biological information.

Studying how DNA works and how proteins give rise to life, that is, studying biological information, is the aim of today's rapidly growing life sciences.

A "biological sequence" is a DNA sequence or a protein sequence. Proposing effective methods (algorithms or models) for biological sequence analysis, and using computers to mine the biological information hidden in large numbers of letters, is what we call "computational biology" or "bioinformatics".
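As a small illustration of the sequence conventions described above, the following Python sketch joins the 10-letter blocks of the two example sequences, checks their lengths, and translates the DNA example with the standard genetic code. The helper names `join_blocks` and `translate` are hypothetical, chosen only for this example. The course page does not state how the two examples are related; under the assumption that the DNA is read in frame from its first letter, the translation coincides with the first 40 letters of the protein example.

```python
import itertools

# Standard genetic code, with codons enumerated in t, c, a, g order
# (ttt, ttc, tta, ttg, tct, ...). "*" marks a stop codon.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}

# The two example sequences from the course description, in 10-letter blocks.
DNA_EXAMPLE = (
    "atgcactctt caatagtttt ggccaccgtg ctctttgtag cgattgcttc agcatcaaaa "
    "acgcgagagc tatgcatgaa atcgctcgag catgccaagg ttggcaccag caaggaggcg"
)
PROTEIN_EXAMPLE = (
    "mhssivlatv lfvaiasask trelcmksle hakvgtskea kqdgidlykh mfehypamkk "
    "yfkhrenytp advqkdpffi kqgqnillac hvlcatyddr etfdayvgel marherdhvk"
)


def join_blocks(text):
    """Remove the spaces between 10-letter blocks and lowercase the result."""
    return "".join(text.split()).lower()


def translate(dna):
    """Translate DNA in frame from its first letter, stopping at a stop codon.

    Assumption of this sketch: the reading frame starts at position 0 and
    any trailing letters that do not fill a whole codon are ignored.
    """
    protein = []
    usable = len(dna) - len(dna) % 3
    for i in range(0, usable, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein).lower()


if __name__ == "__main__":
    dna = join_blocks(DNA_EXAMPLE)
    protein = join_blocks(PROTEIN_EXAMPLE)
    print("DNA length:", len(dna))
    print("Protein length:", len(protein))
    print("Translated DNA:", translate(dna))
    print("Matches start of protein:", protein.startswith(translate(dna)))
```

Each codon is three letters, so the 120-letter DNA example yields 40 amino acids, which is why it can cover only the first part of the 120-letter protein example.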


Reference Books

Academic papers


Grading

Grading item            Percentage   Description
Midterm exam            40
Final exam              40
Classroom performance   20


Course Information

Description

Credit: 3-0
Course Time: Tuesday, periods 2, 3, 4 [ST517]
Teacher: 謝維華
Class: Applied Mathematics master's program, years 1 and 2 (應數碩1,2)
Memo: Undergraduates need the instructor's consent to enroll.

Enrollment Status

There are currently 2 students enrolled in the class.
