W1D03 Piscine AI - Data Science

Visualizations

When working with a dataset, it is important to check the distribution of the data. Most people find it difficult to visualize data in more than three dimensions.

Visualization ("viz") is important both to understand the data and to show results. We'll discover three libraries for visualizing data in Python. They are among the most widely used visualization libraries in Python:

  • Pandas visualization module
  • Matplotlib
  • Plotly

The goal is to understand the basics of these libraries. You'll have time during the project to master one (or all three) of them. You may wonder why one library is not enough. The reason is simple: it depends on the use case. For example, if you want to check the data quickly, you may want to use the Pandas visualization module or Matplotlib. If you want a custom, more elaborate plot, I suggest using Matplotlib or Plotly. And if you want a polished, interactive plot, I suggest using Plotly.

Exercises of the day

  • Exercise 1 Pandas plot 1
  • Exercise 2 Pandas plot 2
  • Exercise 3 Matplotlib 1
  • Exercise 4 Matplotlib 2
  • Exercise 5 Matplotlib subplots
  • Exercise 6 Plotly 1
  • Exercise 7 Plotly Box plots

Virtual Environment

  • Python 3.x
  • NumPy
  • Pandas
  • Matplotlib
  • Plotly
  • Jupyter or JupyterLab

I suggest using the most recent versions of the packages.
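One way to check what is installed in the virtual environment is to query the package metadata from Python. The sketch below assumes the packages were installed under their usual distribution names (`numpy`, `pandas`, `matplotlib`, `plotly`, `jupyterlab`). The helper `installed_versions` is a hypothetical name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names for the packages listed above.
PACKAGES = ["numpy", "pandas", "matplotlib", "plotly", "jupyterlab"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```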

Resources