Key points about this course

Duration : 4 Days
Course Fee : RM 5,200.00

HRD Corp Claimable Course

Introduction to Data Science in Python
Exam Code : Not available

Live Virtual Class

Public Class

In-House Training

Private Class

Course Overview

This course introduces the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library.

The course then covers data manipulation and cleaning techniques using the popular Python pandas data science library and presents the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python; Applied Machine Learning in Python; Applied Text Mining in Python; and Applied Social Network Analysis in Python.


Course Objectives

What You Will Learn

  • Describe common Python functionality and features used for data science
  • Explain distributions, sampling, and t-tests
  • Query DataFrame structures for cleaning and processing
  • Understand techniques such as lambdas and manipulating CSV files

Course Content


Module 1: Fundamentals of Data Manipulation with Python

In this module you’ll get an introduction to the field of data science, review common Python functionality and features that data scientists use, and be introduced to the Coursera Jupyter Notebook used for the lectures. All of the course information on grading, prerequisites, and expectations is in the course syllabus.

Related Topics

  • Data Science
  • The Coursera Jupyter Notebook System
  • Python Functions
  • Python Types and Sequences
  • Python More on Strings
  • Python Demonstration: Reading and Writing CSV files
  • Python Dates and Times
  • Advanced Python Objects, map()
  • Advanced Python Lambda and List Comprehensions
  • Advanced Python Demonstration: The Numerical Python Library (NumPy)
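
As a rough illustration of how several of these topics fit together, the sketch below reads CSV text, converts a column with map() and a lambda, filters rows with a list comprehension, and applies a NumPy boolean mask. The sample data and all variable names are made up for illustration and are not taken from the course material.

```python
import csv
import io

import numpy as np

# Hypothetical CSV content; in practice this would be read from a file.
raw = "name,city,mpg\nA,Ann Arbor,30\nB,Detroit,22\nC,Ann Arbor,27\n"

# csv.DictReader yields one dict per data row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw)))

# map() with a lambda converts the "mpg" strings to floats.
mpg = list(map(lambda row: float(row["mpg"]), rows))

# A list comprehension keeps only the names of rows from one city.
ann_arbor = [row["name"] for row in rows if row["city"] == "Ann Arbor"]

# NumPy arrays support vectorised comparisons and boolean-mask selection.
mpg_array = np.array(mpg)
above_25 = mpg_array[mpg_array > 25]
```

The same conversion could be written as a list comprehension; map() with a lambda is used here only because both appear in the topic list.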


Module 2: Basic Data Processing with Pandas

In this module of the course, you’ll learn the fundamentals of pandas, one of the most important toolkits Python has for data cleaning and processing. You’ll learn how to read data into DataFrame structures, how to query these structures, and the details of how such structures are indexed.

Related Topics

  • Introduction
  • The Series Data Structure
  • Querying a Series
  • The DataFrame Data Structure
  • DataFrame Indexing and Loading
  • Querying a DataFrame
  • Indexing DataFrames
  • Missing Values
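
A minimal sketch of the ideas in this module, assuming a small hand-made data set: it builds a Series and a DataFrame, queries them by label, by position, and with a boolean mask, and handles a missing value. The names and scores are hypothetical.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labelled array; NaN marks a missing value.
scores = pd.Series([85, 92, np.nan], index=["alice", "bob", "carol"], name="score")

bob_score = scores.loc["bob"]    # label-based lookup
first_score = scores.iloc[0]     # position-based lookup

# A DataFrame is a table of columns that share one index.
df = pd.DataFrame(
    {
        "name": ["alice", "bob", "carol"],
        "dept": ["ops", "ops", "lab"],
        "score": [85, 92, np.nan],
    }
).set_index("name")

# Boolean masking: rows with a missing score compare as False and drop out.
high = df[df["score"] > 90]

# Count missing values, then fill them with the column mean.
missing_count = df["score"].isna().sum()
filled = df["score"].fillna(df["score"].mean())
```

Filling with the mean is only one possible choice; whether it is appropriate depends on the data and the question being asked.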


Module 3: More Data Processing with Pandas

In this module you’ll deepen your understanding of the Python pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We’ll also refresh your understanding of scales of data and discuss issues with creating metrics for analysis.

Related Topics

  • Merging DataFrames
  • Pandas Idioms
  • Group by
  • Scales
  • Pivot Tables
  • Date Functionality
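
The operations named above can be sketched on two small, made-up tables. The example below is an illustration under those assumptions, not the course's own exercise: it parses dates, merges the tables, groups by department, and builds a pivot table by month.

```python
import pandas as pd

# Hypothetical staff and sales tables.
staff = pd.DataFrame(
    {"name": ["alice", "bob", "carol"], "dept": ["ops", "ops", "lab"]}
)
sales = pd.DataFrame(
    {
        "name": ["alice", "bob", "alice", "dave"],
        "date": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-10"],
        "amount": [100, 150, 200, 50],
    }
)

# Date functionality: parse strings into datetime64 values.
sales["date"] = pd.to_datetime(sales["date"])

# A left merge keeps every sale; "dave" has no staff record, so dept is NaN.
merged = pd.merge(sales, staff, on="name", how="left")

# Group by: total amount per department (rows with a NaN key are dropped).
totals = merged.groupby("dept")["amount"].sum()

# Pivot table: one row per person, one column per calendar month.
merged["month"] = merged["date"].dt.month
pivot = merged.pivot_table(
    index="name", columns="month", values="amount", aggfunc="sum", fill_value=0
)
```

Changing how="left" to "inner" would drop the unmatched sale instead of keeping it with a missing department.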


Module 4: Answering Questions with Messy Data

In this module of the course, you’ll be introduced to a variety of statistical techniques such as distributions, sampling, and t-tests.

Related Topics

  • Basic Statistical Testing
  • Other Forms of Structured Data
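
As a hedged sketch of what sampling from a distribution and running a t-test can look like, the example below draws two samples from normal distributions with NumPy and compares their means with SciPy. The use of SciPy, the Welch (unequal variance) variant, and the distribution parameters are all assumptions made for illustration; the course outline does not specify them.

```python
import numpy as np
from scipy import stats

# A seeded generator makes the random samples reproducible.
rng = np.random.default_rng(seed=0)

# Sampling: draw 200 values from each of two normal distributions
# whose true means differ by 5 (hypothetical parameters).
group_a = rng.normal(loc=70.0, scale=5.0, size=200)
group_b = rng.normal(loc=75.0, scale=5.0, size=200)

# Independent two-sample t-test; equal_var=False selects Welch's t-test.
result = stats.ttest_ind(group_a, group_b, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

# A small p-value suggests the difference in sample means is unlikely
# under the null hypothesis that both groups share the same mean.
significant = p_value < 0.05
```

The 0.05 threshold is a common convention rather than a rule, and it should be chosen before looking at the data.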