Course information for 109-2 | 6192 Data Visualization Analysis (資料視覺化分析)

6192 - Data Visualization Analysis (資料視覺化分析)


Course Objectives

Data visualization analysis is very helpful for cleaning data, exploring data structure, detecting outliers and anomalous groups, identifying trends and clusters, discovering local patterns, evaluating model output, and presenting analysis results. It is indispensable for exploratory data analysis, data mining, and network analysis. The main goal of this course is to use R and real survey data to show what information data visualization can reveal in a data set. Through hands-on practice, students gain experience in designing, drawing, and interpreting statistical graphics, and so come to understand data visualization analysis efficiently.


Course Description

Data visualization is an important problem in high-dimensional data analysis, and it has grown in importance with advances in computing and graphics technology. The difficulty lies in how to visualize a high-dimensional structure or data set. Such questions share a common root in statistics. This course introduces statistical methodologies useful for exploring voluminous data. The main topics fall into two parts, though the course is not limited to them. The first part covers dimension reduction methods, including Principal Component Analysis (PCA), Projection Pursuit, Sliced Inverse Regression (SIR), Principal Hessian Direction (PHD), Minimum Average Variance Estimation (MAVE), and LASSO. The second part is a collection of dimension-free methods, including the Parallel Coordinate Plot, Matrix Visualization, and Generalized Association Plots (GAP). Most methods will be discussed from both theoretical and practical perspectives throughout the course, with examples from various application areas.
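As a rough illustration of the first dimension reduction method named in the course description, the sketch below computes PCA for two-dimensional data in plain Python. The course itself uses R; the function name `pca_2d` and the sample points are hypothetical and are not taken from the course materials.

```python
import math


def pca_2d(points):
    """PCA for 2-D points: returns (eigenvalues, leading direction, scores)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    gap = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + gap, half_trace - gap)

    # Angle of the direction that maximizes the projected variance.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Scores: coordinates of the centered points along that direction.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return eigenvalues, direction, scores


# Made-up points lying exactly on the line y = 2x.
eigenvalues, direction, scores = pca_2d([(0, 0), (1, 2), (2, 4), (3, 6)])
# The leading direction has slope 2 and the second eigenvalue is 0
# (up to floating-point rounding), so one component captures all variance.
```

Plotting the scores along the leading direction is the one-dimensional view that PCA offers of such data; with more than two variables the same idea would need a general eigen-solver rather than this closed form.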


Reference Books

(1) Antony Unwin, 2015, Graphical Data Analysis with R, CRC Press.
(2) Tamara Munzner, 2014, Visualization Analysis and Design, CRC Press.
(3) Claus O. Wilke, 2019, Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures, O’Reilly Media.
(4) Winston Chang, 2012, R Graphics Cookbook: Practical Recipes for Visualizing Data, O’Reilly Media.
(5) Eric D. Kolaczyk and Gábor Csárdi, 2020, Statistical Analysis of Network Data with R, Springer.



Grading

Grading item | Percentage | Description
Class participation and attendance | 15% | Attendance is taken every week; the final attendance score is based on five weeks selected at random using R. If you need to take leave, email the supporting documentation to the instructor.
Regular group assignments | 20% | Includes in-class software exercises and their output.
Midterm group oral report | 15% | Covers the use of R and the interpretation of the analysis. Present a visual analysis and interpretation addressing the research question for your chosen data. Citations and writing format should follow journal manuscript style. Plagiarism will receive a score of zero.
Midterm group written report | 15% | Same requirements as the midterm group oral report.
Final group oral report | 15% | Same requirements as the midterm group oral report.
Final group written report | 20% | Same requirements as the midterm group oral report.
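The attendance rule in the grading table (five weeks chosen at random) can be sketched as follows. This is only an illustration in Python, whereas the course states that R is used. The function names, the 18-week semester length, and the seed are assumptions, and the sketch deliberately stops at counting attendance because the page does not say how the count is converted into the 15% score.

```python
import random


def sample_attendance_weeks(total_weeks, k=5, seed=None):
    """Pick k distinct week numbers (1-based) uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_weeks + 1), k))


def count_present(attended_weeks, sampled_weeks):
    """Count how many of the sampled weeks a student attended."""
    return sum(1 for week in sampled_weeks if week in attended_weeks)


# Hypothetical usage: an assumed 18-week semester and a fixed seed so the
# draw can be reproduced.
sampled = sample_attendance_weeks(total_weeks=18, seed=2021)
present = count_present(attended_weeks={1, 2, 3, 5, 8, 13}, sampled_weeks=sampled)
```

Sampling without replacement means that no week is counted twice.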




Course Information

Description

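Credit: 0-3
Course Time: Wednesday, periods 5, 6, 7 [M442]
Teacher: 陳語婕
Class: Department of Statistics, years 3 and 4; master's program, years 1 and 2
Memo: Also applies to the undergraduate Big Data group (大數據資料群組) for academic years 106-109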

Enrollment Status

There are currently 30 students enrolled in the class.
