Course information for 113-2 | 1594 Data Visualization Analysis (資料視覺化分析)

1594 - 資料視覺化分析 Data Visualization Analysis


Course Objectives

Empirical research in the social sciences often involves data collection, and data visualization turns large, messy data sets into images that are easier to understand. A good chart communicates the information in the data effectively, whereas a poor chart can mislead readers. Data visualization helps with cleaning data, exploring data structure, detecting outliers and unusual groups, identifying trends and clusters, discovering local patterns, evaluating model output, and presenting results. It is indispensable for exploratory data analysis and latent variable analysis.

The main goal of this course is to use R and real social science survey data to reveal the information in the data through visual analysis. The first half of the semester introduces the concepts of graphical theory; students then learn exploratory data visualization efficiently through hands-on practice in designing, drawing, and interpreting statistical graphics. The second half of the semester introduces how to use visual models to analyze concepts or attitudes that interest most researchers but cannot be measured directly.

After completing the course, students will be able to draw charts in R, perform latent variable analysis, and interpret the results.


Course Description

Data visualization is an important issue in high-dimensional data analysis, and it has become increasingly important with the advent of computer and graphics technology. The difficulty lies in how to visualize a high-dimensional structure or data set. Questions of this kind have a common root in statistics. This course introduces statistical methodologies useful for exploring voluminous data. The main topics include, but are not limited to, two parts. The first part covers dimension reduction methods, including Principal Component Analysis (PCA), Projection Pursuit, Sliced Inverse Regression (SIR), Principal Hessian Directions (PHD), Minimum Average Variance Estimation (MAVE), and LASSO. The second part is a collection of dimension-free methods, including the Parallel Coordinate Plot, Matrix Visualization, and Generalized Association Plots (GAP). Most methods will be discussed from both theoretical and practical perspectives throughout the course, with examples from various application areas.
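The description names PCA as one of the dimension reduction methods. As a rough illustration only (it is not taken from the course materials, which use R), the Python sketch below projects synthetic five-dimensional data onto its first two principal components so that the data could be drawn as an ordinary scatter plot. The synthetic data, the function names, and the use of power iteration with deflation to find the eigenvectors are all assumptions made for this example.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def center(rows):
    """Subtract each column mean so every variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (divisor n - 1) of centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenvector(matrix, rng, iterations=500):
    """Power iteration: repeatedly multiply and renormalize a random vector."""
    p = len(matrix)
    v = [rng.gauss(0, 1) for _ in range(p)]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:  # matrix has no remaining variance in this direction
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])  # Rayleigh quotient
    return eigenvalue, v


def pca_2d(rows, seed=0):
    """Return 2-D scores, the two component variances, and the two directions."""
    rng = random.Random(seed)
    centered = center(rows)
    cov = covariance(centered)
    l1, v1 = leading_eigenvector(cov, rng)
    # Deflation: remove the first component, then find the next one.
    p = len(cov)
    deflated = [[cov[i][j] - l1 * v1[i] * v1[j] for j in range(p)]
                for i in range(p)]
    l2, v2 = leading_eigenvector(deflated, rng)
    scores = [(dot(r, v1), dot(r, v2)) for r in centered]
    return scores, (l1, l2), (v1, v2)


# Synthetic data (an assumption for the demo): five observed variables
# driven mostly by two hidden factors a and b, plus small noise.
demo_rng = random.Random(42)
data = []
for _ in range(200):
    a, b = demo_rng.gauss(0, 3), demo_rng.gauss(0, 1)
    data.append([
        a + demo_rng.gauss(0, 0.1),
        a - b + demo_rng.gauss(0, 0.1),
        b + demo_rng.gauss(0, 0.1),
        0.5 * a + b + demo_rng.gauss(0, 0.1),
        demo_rng.gauss(0, 0.1),
    ])

scores, variances, components = pca_2d(data)
print("variance along PC1 and PC2:", variances)
```

Each pair in `scores` gives the coordinates of one observation in the reduced two-dimensional space. In R, the built-in `prcomp` function performs the same decomposition more robustly.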


Reference Books

(1) Kieran Healy, 2019, Data Visualization: A Practical Introduction, Princeton University Press: Princeton and Oxford.

(2) Antony Unwin, 2015, Graphical Data Analysis with R, CRC Press.

(3) W. Holmes Finch, Brian F. French, 2015, Latent Variable Modeling with R, Routledge.

(4) A. Alexander Beaujean, 2014, Latent Variable Modeling Using R: A Step-by-Step Guide, Routledge.


Grading

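Grading item (percentage): description

Class participation and attendance (15%): Attendance is taken every week. The end-of-term attendance grade is based on five weeks selected at random using R. To request leave, email the supporting documentation to the instructor or apply through the student information system.

Homework and quizzes (25%): Includes in-class software exercises, their output, and interpretation of that output.

Midterm oral group report (15%): Covers the use of R and interpretation of the analysis. Produce a visual analysis and interpretation of the research question for the chosen data. Every group member must present; anyone who does not present or is absent receives zero.

Midterm written group report (15%): Covers the use of R and interpretation of the analysis. Produce a visual analysis and interpretation of the research question for the chosen data. Citations and formatting should follow journal manuscript style; any plagiarism results in zero.

Final oral group report (15%): Covers the use of R and interpretation of the analysis. Produce a visual analysis and interpretation of the research question for the chosen data. Every group member must present; anyone who does not present or is absent receives zero.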




Course Information

Description

Credit: 0-3
Course Time: Friday/2,3,4 [ST023]
Teacher: 陳語婕
Class: Department of Statistics, years 2-4
Memo: Big Data group (applicable to 109-113)

Enrollment Status

There are currently 4 students enrolled in the class.
