Course information for 108-2 | 5700 Topics in Statistical Learning (統計學習專題)

5700 - Topics in Statistical Learning (統計學習專題)


Course Objectives

This course is offered mainly to students in the Department of Statistics and the Department of Computer Science and Information Engineering. Students in the College of Management who have taken a statistics course are also welcome to take it as an elective. Through lectures, individual assignments, and a group final report, students should have acquired the following concepts and skills by the end of the semester:
(1) Know which methods suit which kinds of data.
(2) Understand the process, steps, and methods of data analysis and data mining.
(3) Use R or Python for programming and statistical analysis.
(4) Communicate through data visualization.
(5) Apply the statistical learning methods taught in the course, together with R and Python, to analyze real data for an interesting or industry-related problem.


Course Description

This course aims to cultivate Data+ talent: people who can handle big data, work with statistical and machine learning models, interpret analysis results, and solve practical problems directly and effectively by applying statistics, data science, and machine learning. The course content is statistical learning, a framework for understanding data based on statistics that can be divided into supervised and unsupervised learning; it can also be described as a set of tools and methods for analyzing and modeling complex data. It is a recently developed area of statistics that has grown in parallel with, and blended into, computer science, machine learning in particular. The course covers data of different types and complexity, including data cleaning, feature selection and processing, and commonly used statistical and machine learning methods: regression, classification and regression trees, boosting, support vector machines, generalized estimating equations (GEE), methods for automatically selecting the best parameter settings for classification techniques, handling imbalanced data (one of the most common problems), cluster analysis and other multivariate methods, ensemble learning, convolutional neural networks (CNN), recurrent neural networks (RNN), and training and optimizing neural networks. Each method is paired with hands-on examples and analysis. R, Python, and TensorFlow/Keras are required for the course. For the final project, students form teams and take on a Kaggle competition.


Reference Books

1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning with Applications in R.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition.
3. R語言機器學習 (R Language Machine Learning), translated by Wu Jinchao (吳金朝).


Grading

Grading item                        Percentage (%)
Midterm exam                        30
Final report                        30
Homework                            30
Class participation and attendance  10



Related Course

Elective 6190 Topics in Statistical Learning (統計學習專題) (Statistics; shared master's elective offered by Statistics; instructors: 蔡清欉/張玉媚; Friday, periods 2, 3, 4 [M023])

Course Information

Description

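Credit: 0-3
Course Time: Friday, periods 2, 3, 4 [M023]
Instructors: 蔡清欉/張玉媚
Class: Computer Science and Information Engineering, master's program, years 1 and 2
Memo: Held in the computer classroom on the second campus. Taught jointly with the Statistics master's course 6190.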

Enrollment Status

There are currently 6 students enrolled in the class.