BIG DATA LANDSCAPE

SHE Level 5
SCQF Credit Points 20.00
ECTS Credit Points 10.00
Module Code MMI223994
Module Leader Brian Hainey
School School of Computing, Engineering and Built Environment
Subject Computing
Trimester
  • A (September start)

Summary of Content

This module covers the process of managing Big Data throughout its lifecycle, from requirements through retirement. The lifecycle crosses different application systems, databases and storage media. Students will gain an understanding of the full Big Data value chain. They will be able to analyse the challenges and opportunities associated with the different stages through which Big Data passes.

Syllabus

  • Landscape Overview
  • Introduction
  • Applications
  • Infrastructure
  • Technologies
  • Types of Big Data: Graph, Social, Security, Telematics etc.
  • Metadata
  • Big data lifecycle management
  • Data lineage, provenance and infrastructure
  • Data at rest and in motion
  • Data Warehousing, marts, lakes
  • Big data value chains: from big data to intelligence/knowledge; from big data to decision making
  • The interface of artificial intelligence and Big Data
  • Data visualisation
  • Big data ecosystems
  • Big data reference architecture
  • Data integration
  • Data virtualization
  • Data federation
  • Linked open data: RDF, Semantics and Ontology
  • Big data applications
  • Big Data privacy, security, legal and governance

Examples of tasks undertaken by students in practical sessions are:
  • Configuring and using the Hadoop file system.
  • Implementing parallel, distributed algorithms on a cluster using MapReduce.
  • Creation, deployment and testing of an end-to-end information value-chain in a virtual environment.
  • Utilising NoSQL databases.

Learning Outcomes

On completion of the module the student should be able to:
  • Demonstrate a detailed understanding of the requirements, concepts and principles of Big Data systems.
  • Demonstrate a detailed understanding of Big Data storage, transmission, processing and analytics from an enterprise system perspective.
  • Critically appraise the data components of an enterprise system.
  • Analyse the role of enterprise system data components in the Big Data information value chain.
  • Critically appraise Big Data technologies, infrastructure, services and standards.

Teaching / Learning Strategy

The learning and teaching strategy for this module has been informed by the university's 'Strategy for Learning' design principles. The course material is introduced through lectures; laboratory sessions draw upon and extend the lecture material to deepen students' knowledge. The laboratory sessions are designed as a set of formative exercises and a substantial summative exercise spanning several weeks. The formative exercises introduce a range of technologies that allow students to gain confidence and build knowledge of the range of solutions that can be applied to particular problems. Summative exercises provide experience in real-world problem-solving and challenge students to demonstrate analytical skills and capacity for divergent thinking. Tutorials will be used to help explain and elaborate on both the lecture material and the laboratory exercises; these will include a range of case studies that bring a global perspective to the subject matter.
During all lab and tutorial sessions students receive formative feedback on their performance in undertaking the laboratory and tutorial exercises. Summative feedback and grades are also provided for the coursework assignments undertaken as part of the module, using GCULearn. GCULearn is also used to provide the students with module-specific Forums and Wikis to stimulate student and lecturer interaction outwith the normal lecture, laboratory and tutorial sessions.
Flexible learning is encouraged and supported. All teaching materials and self-testing exercises are made available on GCULearn and links are provided to external materials such as podcasts, MOOCs, videos and relevant literature. All the computing resources used for laboratories are made available either by virtual machine images (supplied to students for use on their own computers) or online using industry standard cloud computing services provided by major global computing industry vendors.
Due to the provision of all material and computing facilities online, the module is suitable for use where Flexible and Distributed Learning (FDL) is required.

Indicative Reading

  • David Feinleib. Big Data Bootcamp: What Managers Need to Know to Profit from the Big Data Revolution. ISBN: 978-1-4842-0041-4 (Print) 978-1-4842-0040-7 (Online), Springer, 2014
  • Paul C. Zikopoulos, et al. Harness the Power of Big Data: The IBM Big Data Platform. ISBN: 978-0-07180818-7, McGraw-Hill, 2013, 281 pages
  • Mayer-Schonberger V, Cukier K (2013) Big Data: A Revolution That Will Transform How We Live, Work and Think, John Murray
  • Zikopoulos P, deRoos D, Bienko C, Buglio R, Andrews M (2015) Big Data Beyond the Hype, McGraw-Hill
  • Kitchin R (2014) The Data Revolution
  • H. Gilbert Miller, Peter Mork, "From Data to Decisions: A Value Chain for Big Data", IT Professional, vol. 15, no. 1, pp. 57-59, Jan.-Feb. 2013, doi:10.1109/MITP.2013.11
  • Jingran Li, et al. Big Data in product lifecycle management. Int J Adv Manuf Technol (2015) vol. 81, pp. 667-684
  • http://www.bigdatalandscape.com
  • Y. Demchenko, C. de Laat, P. Membrey. Defining architecture components of the Big Data Ecosystem. 2014 International Conference on Collaboration Technologies and Systems (CTS'14), Minneapolis, USA, 19-23 May 2014, pp. 104-112
  • Chuck Ballard, et al. IBM Informix: Integration Through Data Federation. August 2003, 270 pages. ibm.com/redbooks
  • SAP. Data Federation Administration Tool Guide. SAP, 2013.
  • Foster Provost and Tom Fawcett. Data science and its relationship to big data and data-driven decision making. Big Data, Vol. 1 No. 1, March 2013. DOI: 10.1089/big.2013.1508
  • Chuck Ballard, et al. Information Governance Principles and Practices for a Big Data Landscape. ibm.com/redbooks, March 2014, 280 pages
  • Jos van Wezel, et al. Data Life Cycle Labs: A New Concept to Support Data-Intensive Science. http://arxiv.org/abs/1212.5596
  • Sunil Soares, Big Data Governance: An Emerging Imperative, 2012, MC Press, ISBN 978-1583473771
  • Peter Schlampp, Spark: Lighting Up The Big Data Landscape, https://www.platfora.com/blog-post/spark-lighting-up-the-big-data-landscape/ June 15, 201

Transferable Skills

  • D1 Specialist knowledge and application
  • D2 Critical thinking and problem solving
  • D3 Critical analysis
  • D4 Communication skills, written, oral and listening
  • D6 Effective information retrieval and research skills
  • D11 Knowledge of international affairs
  • D14 Ability to prioritise tasks and time management
  • D16 Presentation skills
  • D17 Commercial awareness

Module Structure

Activity Total Hours
Tutorials (FT) 12.00
Practicals (FT) 24.00
Lectures (FT) 24.00
Assessment (FT) 20.00
Independent Learning (FT) 120.00

Assessment Methods

Component     Duration  Weighting  Threshold  Description
Coursework 2  n/a       60.00      45%        Research based investigative report (2500 words approx.)
Coursework 1  n/a       40.00      45%        Problem based assessment