W1D04 Piscine AI - Data Science

Data wrangling with Pandas

Data wrangling is one of the crucial tasks in data science and analysis. It includes operations like:

  • Data Sorting: To rearrange values in ascending or descending order.
  • Data Filtration: To create a subset of available data.
  • Data Reduction: To eliminate or replace unwanted values.
  • Data Access: To read or write data files.
  • Data Processing: To perform aggregation, statistical, and similar operations on specific values.

As explained before, Pandas is an open source library developed specifically for data science and analysis. It is built on the NumPy package (which handles numeric data in tabular form) and has built-in data structures that ease data manipulation, also known as data munging or wrangling.
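As a rough illustration of the five operations listed above, here is a minimal sketch in Pandas. The DataFrame, its column names (`city`, `sales`), and its values are hypothetical and invented for this example; they are not taken from the exercises.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon"],
    "sales": [120.0, 80.0, np.nan, 45.0, 100.0],
})

# Data sorting: order rows by sales, largest first (NaN is placed last).
sorted_df = df.sort_values("sales", ascending=False)

# Data filtration: keep only the rows with sales above 50.
filtered = df[df["sales"] > 50]

# Data reduction: replace the missing sales value with 0.
cleaned = df.fillna({"sales": 0.0})

# Data access: write to CSV and read it back (an in-memory buffer
# stands in for a file on disk).
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# Data processing: aggregate total sales per city.
totals = cleaned.groupby("city")["sales"].sum()
print(totals)
```

Each step returns a new object and leaves `df` unchanged, which makes it easy to inspect intermediate results in a notebook.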

Exercises of the day

  • Exercise 1 Concatenate
  • Exercise 2 Merge
  • Exercise 3 Merge MultiIndex
  • Exercise 4 Groupby Apply
  • Exercise 5 Groupby Agg
  • Exercise 6 Unstack
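To give a feel for the operations named in the exercise titles, the sketch below runs concatenate, merge, groupby with apply and agg, and unstack on small made-up frames. The frames `q1`, `q2`, and `regions` and their columns are hypothetical; this is not a solution to any exercise.

```python
import pandas as pd

# Hypothetical toy frames, invented for illustration only.
q1 = pd.DataFrame({"store": ["A", "B"], "quarter": ["Q1", "Q1"], "revenue": [10, 20]})
q2 = pd.DataFrame({"store": ["A", "B"], "quarter": ["Q2", "Q2"], "revenue": [15, 25]})
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

# Concatenate: stack the two quarterly frames vertically.
sales = pd.concat([q1, q2], ignore_index=True)

# Merge: attach the region of each store (SQL-style left join on "store").
merged = sales.merge(regions, on="store", how="left")

# Groupby + apply: custom function per group (revenue range per store).
spread = merged.groupby("store")["revenue"].apply(lambda s: s.max() - s.min())

# Groupby + agg: several built-in aggregations at once.
summary = merged.groupby("region")["revenue"].agg(["sum", "mean"])

# Unstack: pivot the inner index level ("quarter") into columns.
wide = merged.set_index(["store", "quarter"])["revenue"].unstack("quarter")
print(wide)
```

`agg` is usually the better fit when built-in reductions are enough, while `apply` accepts arbitrary functions at the cost of speed.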

Virtual Environment

  • Python 3.x
  • NumPy
  • Pandas
  • Jupyter or JupyterLab

I used Pandas 1.0.1 to do the exercises. I suggest using the most recent version.
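One way to check which versions are installed in the active environment is sketched below. The comparison against 1.0 only mirrors the version mentioned above; the helper name `major_minor` is hypothetical.

```python
import numpy as np
import pandas as pd


def major_minor(version: str) -> tuple:
    """Return the (major, minor) part of a version string such as '1.0.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


installed = major_minor(pd.__version__)
print("pandas:", pd.__version__)
print("numpy:", np.__version__)

# Warn if the installed Pandas is older than the 1.0 series.
if installed < (1, 0):
    print("Consider upgrading Pandas, for example with: pip install --upgrade pandas")
```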

Resources