
Key points about this course

Duration: 2 Days
Course Fee: RM2,799.00

HRD Corp Claimable Course

Foundations for Big Data Analysis with SQL

Live Virtual Class

Public Class

In-House Training

Private Class

Course Overview

In this course, you'll get a big-picture view of using SQL for big data, starting with an overview of data, database systems, and the common querying language (SQL). Then you'll learn the characteristics of big data and SQL tools for working on big data platforms.

You'll also install an exercise environment (a virtual machine) for use throughout the specialization courses, and you'll have an opportunity to do some initial exploration of databases and tables in that environment.

This course is available as face-to-face classroom training, live virtual class training, or online class training.

Course Prerequisites

Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:

• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)

• 64-bit operating system (32-bit operating systems will not work)

• 8 GB RAM or more

• 25 GB free disk space or more

• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)

• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
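
Some of the requirements above can be checked from a script before class. The following is a minimal sketch, not part of the course materials: the function names are hypothetical, it only covers the 64-bit and free-disk-space items, and it reports on the Python build rather than the operating system itself (a 32-bit Python can run on a 64-bit system). RAM and virtualization support still need to be checked by hand.

```python
import platform
import shutil
import sys

REQUIRED_FREE_GB = 25  # free disk space figure taken from the list above


def is_64bit_python() -> bool:
    """Return True when this Python build uses 64-bit pointers."""
    return sys.maxsize > 2**32


def free_disk_gb(path: str = ".") -> float:
    """Return the free space, in gigabytes, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def check_prerequisites(path: str = ".") -> dict:
    """Collect the checks this sketch can perform (hypothetical helper)."""
    return {
        "os": platform.system(),
        "machine": platform.machine(),
        "64-bit": is_64bit_python(),
        "enough_disk": free_disk_gb(path) >= REQUIRED_FREE_GB,
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

Run it from the drive where you plan to store the virtual machine, since free space is measured for the current directory by default.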

Course Objectives

By the end of the course, you will be able to:

• distinguish operational from analytic databases, and understand how these are applied in big data;

• understand how database and table design provides structures for working with data;

• appreciate how differences in the volume and variety of data affect your choice of an appropriate database system;

• recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis; and

• explore databases and tables in a big data platform.

To use the hands-on environment for this course, you need to download and install a virtual machine and the software on which to run it.

Course Content

 

Data and Databases

You'll learn about database systems and the distinction between operational and analytic databases.

  • What Is Data?
  • Why Organize Data?
  • What Does a DBMS Do?
  • Relational Databases and SQL
  • The Success of RDBMSs and SQL
  • Operational and Analytic Databases
  • Comparing Operational and Analytic DBs: SELECT Statements
  • Comparing Operational and Analytic DBs: DML Activity
  • Operational and Analytic Databases: Further Comparisons
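
To give a feel for the SELECT comparison listed above, here is a small illustrative sketch using Python's built-in sqlite3 module. It is not taken from the course materials: the orders table, its columns, and its rows are invented for the example, and SQLite stands in for whichever database systems the course uses. An operational query typically touches a single row by key, while an analytic query scans many rows and aggregates them.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 14.5)],
)

# Operational style: look up one specific row by its primary key.
one_order = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# Analytic style: scan the whole table and summarize it per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(one_order)  # ('bob', 35.5)
print(totals)     # [('alice', 34.5), ('bob', 35.5)]
```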

Relational Databases and SQL

  • Introducing Table Schemas
  • NULL Values
  • Data Types
  • Primary Keys
  • Foreign Keys
  • Two Strategies for Database Design
  • Database Normalization
  • Denormalization
  • Differences
  • Trade-offs
  • Database Transactions
  • ACID
  • Enforcing Business Rules: Constraints and Triggers
  • Business Rules and ACID for Analytics?

Big Data

  • How Big Is Big Data?
  • Distributed Storage
  • Distributed Processing
  • Structured Data
  • Unstructured Data
  • Semi-Structured Data
  • Strengths of Traditional RDBMSs
  • Limitations of Traditional RDBMSs
  • SQL and Structured Data
  • SQL and Semi-structured Data
  • SQL and Unstructured Data

SQL Tools for Big Data Analysis

  • Big Data Analytic Databases (Data Warehouses)
  • NoSQL: Operational, Unstructured and Semi-structured
  • Non-transactional, Structured Systems
  • Big Data ACID-Compliant RDBMSs
  • Search Engines
  • Challenges
  • What We Keep
  • What We Give Up
  • What We Add
  • Where to Store Big Data
  • Coupling of Data and Metadata
