DIGITAL ANALYTICS

SHE Level 5
SCQF Credit Points 15.00
ECTS Credit Points 7.50
Module Code MMI224940
Module Leader Frances Garven
School School of Computing, Engineering and Built Environment
Subject Cyber Security and Networks
Trimester
  • A (September start)

Summary of Content

This module covers the basic concepts of statistics needed to understand the critical concepts of data mining, machine learning and predictive analytics used in the visualisation and analysis of data, particularly Big Data. Big Data is the term used for a collection of structured and unstructured data sets so large and complex that it is difficult to process using the commonly available relational database management systems and statistical software packages. Data mining is the process of discovering useful patterns and trends in large data sets. Predictive analytics is the process of extracting information from large data sets in order to make predictions and estimates about future outcomes. Students will gain an understanding of data preparation, the process models used in analytics, the algorithms and their requirements, the implementation of these algorithms using current technologies, and their applicability to different types of scenario. They will also gain advanced practical skills in the design, implementation and evaluation of analytical solutions to problems involving Big Data.

Syllabus

Overview: What is Data Analytics and Big Data? Data types: Structured and Unstructured data. Terminology: Data Mining, Machine Learning, Predictive Analytics, Business Intelligence, Data Science, Prediction, Classification, Segmentation. Supervised and Unsupervised Learning. Applications and Use Cases.
Data Architecture and Analytics Process Models: Data acquisition and data integration. Data from different types of data sources. Analytics sandboxes. Data Warehousing and OLAP, Data Marts, Data Lakes, Data Staging area, Data Streams. Symmetric Multiprocessing (SMP) vs. Massive Parallel Processing (MPP). Data Analytics Lifecycle, IBM CRISP, SAS-SEMMA.
Basic Data Analytic Methods: Loading/importing data for analysis, sampling, data preparation and pre-processing. Variable selection and transformation. Outliers. Categorisation. Treatment of missing values. Exploring the data: exploratory data analysis, descriptive statistics and graphical visualisations. Statistical Inference: Hypothesis testing Framework. Chi-Square test. Linear Regression.
Advanced Analytical Methods: Logistic Regression, Decision Trees, Association Rules, Cluster Analysis.
Model Evaluation, Comparison and Deployment: Data set split up. Quantifying the performance of the models (lift curves, ROC curves). Ensemble methods (bootstrapping, bagging, boosting). Deploy, monitor and backtest analytical models.
Big Data Analytics Technologies: Hadoop, Cloud-based, R-based tools, Python-based tools, Java-based tools (WEKA), SAS Data Analytics, IBM Watson Analytics.

Examples of tasks undertaken by students in practical sessions are:
  • Loading/Importing and preparation of data for analysis using appropriate software, including SAS.
  • Design, construction and interpretation of visual representations of statistics and data.
  • Applying various analytical methods such as regression and decision-

Learning Outcomes

On successful completion of this module a student should be able to:
1. Demonstrate an in-depth understanding of how key algorithms and models are applied in developing analytical solutions.
2. Demonstrate an in-depth understanding of how analytical solutions can deliver benefits to organizations.
3. Critically appraise the approach, the selected data, the fitted models and evaluations used to solve Big Data problems.
4. Apply their knowledge using various state-of-the-art tools such as SAS.
5. Implement the various Data Mining steps to develop and evaluate appropriate models for a Big Data scenario.

Teaching / Learning Strategy

The learning and teaching strategy for this module has been informed by the university's 'Strategy for Learning' design principles. The course material is introduced through lectures and laboratory sessions that draw upon and extend the lecture material to deepen students' knowledge. The laboratory sessions are designed as a set of formative exercises and a substantial summative exercise spanning several weeks. The formative exercises introduce a range of technologies that allow students to gain confidence and build knowledge of the range of solutions that can be applied to particular problems. Summative exercises provide experience in real-world problem-solving and challenge students to demonstrate analytical skills and capacity for divergent thinking.
Tutorials will be used to help explain and elaborate on both the lecture material and the laboratory exercises; these will include a range of case studies that bring a global perspective to the subject matter. During all lab and tutorial sessions students receive formative feedback on their performance in undertaking the laboratory and tutorial exercises. Summative feedback and grades are also provided for the coursework assignments undertaken as part of the module, using GCULearn. GCULearn is also used to provide the students with module-specific Forums and Wikis to stimulate student and lecturer interaction outwith the normal lecture, laboratory and tutorial sessions. Flexible learning is encouraged and supported. All teaching materials and self-testing exercises are made available on GCULearn and links are provided to external materials such as podcasts, MOOCs, videos and relevant literature. All the computing resources used for laboratories are made available either by virtual machine images (supplied to students for use on their own computers) or online using industry-standard cloud com

Indicative Reading

Data Mining and Predictive Analytics (Wiley Series on Methods and Applications in Data Mining). Daniel T. Larose, Chantal D. Larose (24 Apr 2015)
Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems). Jiawei Han. 3rd Revised edition (8 Mar 2011)
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data. EMC Education Services (17 Mar 2015)
Analytics in a Big Data World: The Essential Guide to Data Science and its Applications (Wiley and SAS Business Series). Bart Baesens (1 Jul 2014)
Big Data Analytics For Beginners. Faraz Rabbani, Ali Roghani (26 Jan 2015)
Practical Business Analytics Using SAS: A Hands-on Guide. Shailendra Kadre, Venkat Reddy Konasani (30 Jan 2015)
Hadoop: The Definitive Guide. 3rd Edition. Tom White (29 May 2012)
Data Analytics with Hadoop. Benjamin Bengfort, Jenny Kim (25 Jan 2016)
<http://www.kdnuggets.com/>

Transferrable Skills

D1 Specialist knowledge and application
D2 Critical thinking and problem solving
D3 Critical analysis
D4 Communication skills, written, oral and listening
D5 Numeracy
D6 Effective information retrieval and research skills
D7 Computer literacy
D8 Self-confidence, self-discipline & self-reliance (independent working)
D10 Creativity, innovation & independent thinking
D14 Ability to prioritise tasks and time management
D16 Presentation skills

Module Structure

Activity Total Hours
Tutorials (FT) 9.00
Tutorials (FDL) 9.00
Lectures (FT) 18.00
Independent Learning (FT) 75.00
Assessment (FT) 30.00
Assessment (FDL) 30.00
Practicals (FT) 18.00
Practicals (FDL) 18.00
Lectures (FDL) 18.00
Independent Learning (FDL) 75.00

Assessment Methods

Component Duration (hours) Weighting (%) Threshold Description
Exam (School) 2.00 50.00 45% Lab test: Practical lab test involving an application/on-line practical lab test (2 hours)
Exam (School) 1.00 50.00 45% Class test: Written paper/on-line class test (1 hour)