DSC301 Fundamental Research in Academic Project 2025/04 Dr Subashini Ganapathy
2024 - 2025

This course introduces students to the concepts of academic research in data science. General and specific research methodologies for quantitative and qualitative data analysis are introduced.

Students are guided in formulating a research proposal that includes an abstract, research objectives, research aims, a research timeline, a literature review, and a methodology.

Big Data Analytics 2024/09 Toa Chean Khim (Toa)
2024 - 2025

We look at the details of Hadoop, Storm, and related tools that provide SQL-like access to unstructured data, namely Pig and Hive. We analyze so-called NoSQL storage solutions such as HBase, Cassandra, and Oracle NoSQL for their critical features: speed of reads and writes, data consistency, and ability to scale to extreme volumes. We introduce the VM techniques used in data centers. We also investigate data deduplication and NVM techniques for reducing data volume and speeding up processing. A large section of the course is devoted to methods of statistical analysis and to case studies from companies such as Google, Facebook, and IBM. We work with open-source frameworks like Mahout and Open R, along with other statistical tools. Part of the course is devoted to the public cloud as a resource for big data analytics.

Introduction to Data Science 2024/09 Toa Chean Khim (Toa)
2024 - 2025

This course introduces the field of data science in a practical manner to students who have no prior knowledge of the subject. Students are expected to appreciate the significance of data science in daily life and in scientific domains. After completing this course, students will be able to grasp the principles of data science and data handling techniques. Major tools for data science such as UNIX, Python, R, and MySQL are covered in the context of their applications. Lastly, principles of machine learning are introduced through hands-on problem-solving in data science.

Statistical Programming Using R 2024/09 Toa Chean Khim (Toa)
2024 - 2025

This course introduces students to R programming for performing statistical data analyses. Basic operations of R programming are introduced in Chapters 1 and 2, followed by statistics-oriented R operations in subsequent chapters. The course covers basic statistical programming, including the computation of covariance and correlation. Students are also exposed to methods for generating various types of tables and graphs. Intermediate-level topics in statistical programming, such as regression and nonparametric tests, allow students to master skills that will be useful for other courses in data science. Upon completion of this course, students will be able to use the R language to analyze statistical data in any data science subdiscipline.