5700 - Topics in Statistical Learning
Course Objectives
This course is offered mainly to students in the Department of Statistics and the Department of Computer Science and Information Engineering. Students in the College of Management who have taken a statistics course are also welcome to take it as an elective. Through lectures, individual homework, and a group final report, students should have the following concepts and skills by the end of the semester: (1) Understand which methods suit which kinds of data. (2) Understand the process and steps of data analysis and data mining. (3) Use R or Python to write programs and carry out statistical analysis. (4) Visualize data in order to communicate results. (5) Apply the statistical learning methods taught in class, together with R and Python, to analyze real data for an interesting or industry-related problem.
Course Description
This course aims to cultivate "Data+" talent: people who can handle big data, work with statistical and machine learning models, interpret analysis results, and use statistics, data science, and machine learning techniques to solve practical problems directly and effectively. The subject matter, statistical learning, is a framework for understanding data on the basis of statistics. It can be divided into supervised and unsupervised learning, and can also be described as a set of tools and methods for analyzing and modeling complex data. It is a recently developed area of statistics that has grown in parallel with, and blended into, computer science, especially machine learning. The course introduces data of different types and complexity and covers data cleaning, feature selection and processing, and commonly used statistical and machine learning methods: regression, classification and regression trees, Boosting, Support Vector Machines, generalized estimating equations (GEE), methods for automatically selecting the best parameter settings for classification techniques, handling the very common problem of imbalanced data, cluster analysis and other multivariate methods, ensemble learning, convolutional neural networks (CNN), RNNs, and the training and optimization of neural networks. Each method is paired with hands-on examples and analysis. R, Python, and TensorFlow/Keras are required for the course. For the final project, students form teams to take on a Kaggle competition.
Reference Books
1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning: with Applications in R.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition.
3. R語言機器學習 [Machine Learning in the R Language]. Translated by 吳金朝 (Wu Jinchao).
Grading
Grading Item | Percentage | Description |
---|---|---|
Midterm exam | 30 | |
Final report | 30 | |
Homework | 30 | |
Class participation and attendance | 10 | |
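Course Information
Basic Information
- Course Code: 5700
- Credits: 0-3
- Course Time: Friday, periods 2, 3, 4 [M023]
- Instructors: 蔡清欉/張玉媚
- Class: Computer Science and Information Engineering master's program, years 1 and 2
- Memo: Held in the computer classroom on the second campus. Taught jointly with Statistics master's course 6190.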