## INTRODUCTION TO DATA SCIENCE

| Field | Value |
| --- | --- |
| SHE Level | 3 |
| SCQF Credit Points | 20.00 |
| ECTS Credit Points | 10.00 |
| Module Code | M3I326697 |
| Module Leader | Gordon Morison |
| School | School of Computing, Engineering and Built Environment |
| Subject | Computing |
| Trimester | A (September start) |

### Summary of Content

This module introduces the software programming skills used in data science, using an appropriate programming language. It gives the student the opportunity to explore software development methods for the key aspects of data science, such as data capture, wrangling, analysis, processing and visualisation. The student will gain an understanding of the various data science software ecosystems in order to apply statistical data analysis techniques (descriptive and inferential), machine learning and information visualisation techniques. These topics are introduced through practical examples using both simulated data and real-world datasets, allowing the student to make decisions that are supported by data.

### Syllabus

- Data Science basics: matrix and vector representations
- Python data science software ecosystem: NumPy, SciPy, scikit-learn, Matplotlib, pandas
- Exploratory data analysis
- Basic statistics: population vs sample, mean, median, mode, standard deviation, skewness, variance, correlation, covariance
- Hypothesis testing: statistical distributions, standard error and confidence interval, type 1 and type 2 errors, p-value
- Data manipulation: software implementation using pandas
- Data visualisation: software implementation using Matplotlib
- Data science applications: statistical models, analysis techniques, line fitting, prediction, applying algorithms, forecasting
- Introduction to machine learning: classification and regression
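As a rough illustration of the "basic statistics" and "line fitting" items above, the sketch below computes a few descriptive statistics and a least-squares line. It is not taken from the module material: the dataset (`hours`, `marks`) and the helper names `sample_covariance` and `fit_line` are hypothetical, and it uses only the Python standard library so that it runs without NumPy or pandas, which the module itself would normally use for this work.

```python
import math
import statistics

# Hypothetical data: hours studied and mark achieved for eight students.
hours = [2, 4, 5, 7, 8, 10, 11, 13]
marks = [35, 42, 50, 55, 61, 68, 74, 79]

# Descriptive statistics for the marks.
mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)
sd_mark = statistics.stdev(marks)               # sample standard deviation (n - 1)
std_error = sd_mark / math.sqrt(len(marks))     # standard error of the mean


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    slope = sample_covariance(xs, ys) / statistics.variance(xs)
    intercept = statistics.mean(ys) - slope * statistics.mean(xs)
    return slope, intercept


# Pearson correlation: covariance scaled by both standard deviations.
correlation = sample_covariance(hours, marks) / (statistics.stdev(hours) * sd_mark)

slope, intercept = fit_line(hours, marks)

# Prediction for a new, unseen value of the explanatory variable.
predicted_mark = slope * 9 + intercept

print(f"mean={mean_mark}, median={median_mark}, sd={sd_mark:.2f}, se={std_error:.2f}")
print(f"correlation={correlation:.3f}, slope={slope:.3f}, intercept={intercept:.3f}")
print(f"predicted mark for 9 hours: {predicted_mark:.1f}")
```

With pandas and Matplotlib, the same steps would typically be written with `DataFrame.describe()`, `DataFrame.corr()` and a scatter plot with the fitted line drawn over it.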

### Learning Outcomes

On successful completion of this module students should be able to:

1. Demonstrate familiarity with the problems and issues surrounding real-world data sources
2. Apply filtering, cleaning and transformation techniques to data
3. Understand how probability theory and statistical methods are applied to data
4. Detail how the output of statistical models is interpreted to form an insight into data
5. Understand the basics of Machine Learning, including Classification and Regression
6. Implement Data Science methods in an appropriate software programming language

### Teaching / Learning Strategy

The university 'Strategy for Learning' documentation has informed the learning and teaching strategy for this module. The module's material will be introduced through lectures, while practical exercises, based on the lecture material, will be given to students for their laboratory sessions. Tutorials will be used to help explain and elaborate on both the lecture material and the laboratory exercises. All lecture, laboratory and tutorial material will be made available on GCU Learn.

During all lab and tutorial sessions students will receive formative feedback on their performance in undertaking the laboratory and tutorial exercises. Summative feedback and grades for the coursework assignment undertaken as part of the module will also be provided through GCU Learn. GCU Learn will also be used to provide students with module-specific discussion forums to stimulate student and lecturer interaction outside the normal lecture, laboratory and tutorial sessions.

Lectures are supplemented by directed reading of relevant sources in both hard-copy and electronic formats, and varied further reading is encouraged. Hands-on experience is gained by completing lab exercises tailored to demonstrate the required tool. Students are supported in their studies by both face-to-face and online tutorials and by online quiz material. Learning and teaching strategies will be developed and implemented, appropriate to students' needs, to enable all students to participate fully in the module.

### Indicative Reading

- An Introduction to Data Science, by Jeffrey S. Saltz and Jeffrey Morgan Stanton, 21 Dec 2017
- Data Science (MIT Press Essential Knowledge series), by John D. Kelleher
- Python Data Science Handbook: Essential Tools for Working with Data, by Jake VanderPlas, 2017

### Transferrable Skills

- D1 Critical thinking and problem solving
- D6 Time management (organising and planning work)
- D10 Information retrieval skills
- D13 IT Skills
- D14 Communication skills, written, oral and listening

### Module Structure

| Activity | Total Hours |
| --- | --- |
| Lectures (FT) | 36.00 |
| Independent Learning (FT) | 100.00 |
| Assessment (FT) | 28.00 |
| Tutorials (FT) | 24.00 |
| Seminars (FT) | 12.00 |

### Assessment Methods

| Component | Duration | Weighting | Threshold | Description |
| --- | --- | --- | --- | --- |
| Course Work 02 | n/a | 50.00 | 35% | Practical Assignment |
| Course Work 01 | n/a | 50.00 | 35% | Practical Assignment |