6238 - Big Data Analysis and Application (Taught in English)

Course Objectives

This course is built around real big data sets: students work with actual big data projects and learn the associated methods and technologies. The course first explains the background and source of the data, the problems to be solved, and the relevant domain knowledge. It then covers four topics: (1) data collection, storage, and cleaning; (2) model building and analysis methods; (3) presentation, interpretation, and visualization of results; and (4) prototyping software that automates the analysis workflow. For each topic, the course presents existing concepts, methods, and implementation tools, and then discusses newer methods.
The amount of data in the world is growing rapidly. The Large Synoptic Survey Telescope (LSST) project can collect approximately 20 TB (1 TB = 1000 GB) of astronomical data every night; a single medical institution can sequence the 3 billion base pairs of the human genome in just one day; about 7 billion shares are traded on the U.S. stock market every day. Among Internet companies, Google processes more than 24 PB (1 PB = 1000 TB) of data every day, Facebook receives more than 10 million new photos and 3 billion comments every hour, and YouTube users upload more than one hour of video every second. Used well, this "big data" can bring new value and innovation to many areas of life, including medical care, government, education, the economy, and the humanities. However, big data is often messy, of uneven quality, and spread across countless servers. Extracting the value hidden in it has therefore become an urgent task, and a new scientific field, data science, has emerged in response. (Reference source: http://www.stat.nctu.edu.tw/data/super_pages.php?ID=data1)

Reference Books

1. Lecture handouts and official SAS teaching materials.
2. 應用 R 語言於資料分析：從機器學習、資料探勘到巨量資料 (Applying the R Language to Data Analysis: From Machine Learning and Data Mining to Big Data).

Grading

Grading Item	Percentage	Description
Homework and attendance	40	Homework includes project work
Midterm exam	30
Final exam and final group project	30

Course Information

Basic Information

Course Code: 6238
Credits: 0-3
Course Time: Wednesday/10,11,12 [M023]
Teacher: 姜自強
Class: Department of Information Management, 4th-year undergraduates; 1st- and 2nd-year master's students

Enrollment Status

Current Enrollment: 34

Tunghai University Course Information for Exchange Students