We'll be using the Anaconda distribution, which is a suite of common Python data science tools bundled around a package manager. The package manager helps manage virtual environments and project dependencies.
NumPy Tutorial: Your First Steps Into Data Science in Python - Real Python
<aside> 💠Legend:
Python term Numpy command IDE term/command Snippet Reference
</aside>
Install NumPy using Anaconda:
conda install numpy matplotlib
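A quick way to confirm the install worked is to ask Python which versions it can see. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name chosen for illustration, not part of the tutorial.

```python
from importlib import metadata, util


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    if util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages installed by the conda command.
for name in ("numpy", "matplotlib"):
    print(name, installed_version(name))
```

A `None` next to either name suggests the command ran in a different environment than the one Python is using.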
We can use IPython, an upgraded Python "Read Eval Print Loop" (REPL) that makes editing code in a live interpreter session more straightforward and prettier.
In [1]: import numpy as np
In [2]: digits = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [6, 7, 8],
])
In [3]: digits
Out[3]:
array([[1, 2, 3],
       [4, 5, 6],
       [6, 7, 8]])
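Once the array exists, its basic attributes can be inspected in the same session. The sketch below rebuilds the same `digits` array and prints a few standard `ndarray` attributes; the choice of which ones to show is mine, not the tutorial's.

```python
import numpy as np

digits = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [6, 7, 8],
])

print(digits.shape)   # (rows, columns) → (3, 3)
print(digits.ndim)    # number of dimensions → 2
print(digits.size)    # total element count → 9
print(digits[1, 2])   # row 1, column 2 (zero-based) → 6
print(digits[:, 0])   # first column → [1 4 6]
```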
IPython can be installed as a standalone package (pip install ipython) or be bundled with the other tools.
Notebooks provide a series of mini-scripts called cells that can be run and re-run in whatever order you want, all in the same Python memory session. Graphs and Markdown can be rendered between cells.
Jupyter Notebook is the most popular notebook, but nteract wraps the Jupyter functionality and makes it more approachable.
import numpy as np

CURVE_CENTER = 80

grades = np.array([72, 35, 64, 88, 51, 90, 74, 12])  # one-dimensional array, shape (8,); dtype is int64 on most platforms

def curve(grades):  # takes the grades array as a parameter
    average = grades.mean()  # .mean() method of the array
    change = CURVE_CENTER - average  # a scalar: a single number with shape ()
    new_grades = grades + change  # vectorization: the same operation runs on every element; broadcasting stretches the scalar to match the array
    return np.clip(new_grades, grades, 100)  # clip each value to a minimum and maximum: no lower than the original grade, no higher than 100

curve(grades)
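To see what vectorization and broadcasting buy here, the curve can be worked through by hand and compared against an explicit Python loop. This is an illustrative sketch; the names `vectorized` and `looped` are mine, and the loop exists only for comparison.

```python
import numpy as np

CURVE_CENTER = 80
grades = np.array([72, 35, 64, 88, 51, 90, 74, 12])

# The mean is 486 / 8 = 60.75, so every grade shifts by 80 - 60.75 = 19.25.
change = CURVE_CENTER - grades.mean()

# Vectorized: the scalar is broadcast across all eight elements, then each
# result is clipped between its original grade (minimum) and 100 (maximum).
vectorized = np.clip(grades + change, grades, 100)

# Equivalent element-by-element loop, shown only for comparison.
looped = np.array([min(max(g + change, g), 100) for g in grades])

print(vectorized)
# values: 91.25, 54.25, 83.25, 100.0, 70.25, 100.0, 93.25, 31.25
print(np.allclose(vectorized, looped))  # True
```

The 88 and the 90 would have landed at 107.25 and 109.25, so they are capped at 100. The lower bound never triggers here because the shift is positive.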