This post continues our series on getting started with data science using Python (part 1, part 2, part 3, part 4, part 5, part 6, part 7, part 8, part 9). Today we will explore the building blocks of Python for data science, with special emphasis on NumPy.
NumPy is one of the most frequently used Python libraries in the data science community. In this post we will find out why it is used so extensively.
NumPy stands for 'Numerical Python'. It is the core library for scientific computing in Python, and it provides the tools and techniques needed to compute the mathematical models used in science.
Its multidimensional array object (the ndarray) is a powerful data structure for computing with arrays and matrices.
Enough theory, let's dig deeper.
If you have been following us for a while, you have probably already installed Anaconda, which comes pre-loaded with NumPy!
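A NumPy array comes with two important state variables: its dtype (the type of its elements) and its shape (its size along each dimension). If the dtype is not specified, NumPy detects it automatically from the data, much as Python infers the type of a value.
Let's look at an example: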
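Here is a minimal sketch of those state variables, assuming the two meant here are `dtype` and `shape`. The array names `a`, `b` and `c` are only illustrative.

```python
import numpy as np

# dtype is inferred from the values when it is not given explicitly
a = np.array([1, 2, 3])
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
# dtype can also be requested explicitly
c = np.array([1, 2, 3], dtype=np.float32)

print(a.dtype, a.shape)  # an integer dtype (int64 on most platforms), shape (3,)
print(b.dtype, b.shape)  # float64, shape (2, 3)
print(c.dtype, c.shape)  # float32, shape (3,)
```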
Initialization of NumPy arrays via arange()
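A short sketch of `np.arange`, which works much like Python's built-in `range` but returns an array. The variable names are illustrative.

```python
import numpy as np

r1 = np.arange(5)           # 0, 1, 2, 3, 4 (stop value is excluded)
r2 = np.arange(2, 10, 2)    # start, stop, step -> 2, 4, 6, 8
r3 = np.arange(0, 1, 0.25)  # float steps also work -> 0, 0.25, 0.5, 0.75

print(r1)
print(r2)
print(r3)
```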
Initialization with zeros and a 2D array
By default, NumPy stores a matrix in row-major format, i.e. the entire first row is stored first, followed by the second row, and so on.
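The sketch below creates a 2D array of zeros and then illustrates the row-major layout on a small matrix. The names `z`, `m` and `flat` are placeholders chosen for this example.

```python
import numpy as np

z = np.zeros((2, 3))  # 2 rows, 3 columns, float64 by default
m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(z.shape)                  # (2, 3)
print(m.flags['C_CONTIGUOUS'])  # True: the data is laid out row by row
flat = m.ravel()                # flatten in row-major order
print(flat)                     # first row, then second row: 1 2 3 4 5 6
```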
Similarly, you can initialize an array with ones:
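A minimal sketch using `np.ones`, which takes the same kind of shape argument as `np.zeros`:

```python
import numpy as np

o = np.ones((3, 4))              # 3x4 array filled with 1.0
oi = np.ones((2, 2), dtype=int)  # integer ones instead of floats

print(o.shape, o.dtype)  # (3, 4) float64
print(oi)
```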
More examples:
In other words, the result of x-y has a shape of (3,4): x had a shape of (3,4) and y had a shape of (4,). The resulting array takes the larger size along each dimension, with y applied to every row of x.
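The original arrays are not shown here, so the sketch below assumes an `x` of ones with shape (3, 4) and a `y` built with `np.arange(4)`, purely to reproduce the shapes described above.

```python
import numpy as np

x = np.ones((3, 4))  # shape (3, 4), assumed values
y = np.arange(4)     # shape (4,), assumed values
d = x - y            # y is subtracted from every row of x

print(d.shape)  # (3, 4)
print(d[0])     # 1-0, 1-1, 1-2, 1-3; every row of d is the same
```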
Initialization with Random Values
numpy.random.randint(low, high=None, size=None, dtype=int): the values of the returned array are random integers from low (inclusive) up to high (exclusive).
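A short sketch of `np.random.randint`. The seed value is arbitrary and is set only so that repeated runs give the same numbers.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability
r = np.random.randint(1, 10, size=(3, 3))  # integers in [1, 10)

print(r.shape)                     # (3, 3)
print(r.min() >= 1, r.max() < 10)  # True True
```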
Slicing and indexing in NumPy:
This is a very important concept in NumPy, and it can be used in a variety of ways. It is very similar to the list operation, where you index and slice using square brackets [ ].
Just like lists, NumPy arrays use zero-based indexing. Let us look at some commonly used slicing techniques.
- Generic slicing operation: [start:end:jump]
- Only jump ::2
- Only end :5
- Start and jump 2::-1
- End and Jump :5:2
- Start, end and jump 2:7:3
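The slicing patterns listed above can be tried on a simple array. The sketch uses `np.arange(10)` as sample data so that each value equals its own index.

```python
import numpy as np

a = np.arange(10)  # values 0..9

only_jump = a[::2]      # every second element: 0 2 4 6 8
only_end = a[:5]        # first five elements: 0 1 2 3 4
start_jump = a[2::-1]   # from index 2 backwards: 2 1 0
end_jump = a[:5:2]      # up to index 5, step 2: 0 2 4
all_three = a[2:7:3]    # from 2 up to 7, step 3: 2 5

print(only_jump, only_end, start_jump, end_jump, all_three)
```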
Solving mathematical problems using NumPy:
NumPy has a wide range of functions for mathematical operations, including:
np.add(), np.subtract(), np.multiply(), np.divide() and np.remainder().
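A quick sketch of these functions on two small arrays (the sample values are arbitrary). Each function works element by element.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([3, 4, 5])

total = np.add(a, b)        # 13, 24, 35
diff = np.subtract(a, b)    # 7, 16, 25
prod = np.multiply(a, b)    # 30, 80, 150
quot = np.divide(a, b)      # true division, so the result is float
rem = np.remainder(a, b)    # 1, 0, 0

print(total, diff, prod, quot, rem)
```

The operators `+`, `-`, `*`, `/` and `%` give the same results on arrays.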
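Broadcasting in NumPy:
Broadcasting lets NumPy combine arrays of different shapes in element-wise operations: the data of lower or same dimension is stretched to match the other ndarray. It is loosely similar to the map operation in Python, in that one operation is applied across every element.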
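A minimal sketch of broadcasting with a scalar, a row and a column. The array names and values are illustrative only.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

scaled = m * 2         # the scalar is applied to every element
plus_row = m + row     # row is added to each row of m
plus_col = m + col     # col is added to each column of m

print(scaled)
print(plus_row)
print(plus_col)
```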
Matrix operations:
These are among the most important operations to know, as they are used extensively in image processing, vector calculations and a lot more:
- Inner Product
- Vector Vector
- Matrix Vector
- Matrix Matrix
- Outer product
The inner product, or dot product, is nothing but the sum of the element-wise products.
So, the dot product between the vectors v = [1, 2, 3] and w = [2, 4, 6] will be 1×2 + 2×4 + 3×6 = 28.
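The same calculation can be checked in NumPy, both by hand (multiply, then sum) and with `np.dot`:

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([2, 4, 6])

manual = np.sum(v * w)  # element-wise products, then their sum
dot = np.dot(v, w)      # built-in dot product

print(manual, dot)  # 28 28
```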
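The outer product works like an inverted L: the first vector runs down the side as a column, the second runs across the top as a row, and each entry of the result is the product of the matching pair of elements.
Let's look at a few examples: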
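Here is a sketch of the outer product together with the matrix-vector and matrix-matrix products. The matrices `A` and `B` are made-up sample data.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([2, 4, 6])
A = np.array([[1, 0, 2],
              [0, 1, 1]])   # shape (2, 3)
B = np.array([[1, 2],
              [3, 4],
              [5, 6]])      # shape (3, 2)

outer = np.outer(v, w)  # shape (3, 3); outer[i, j] == v[i] * w[j]
mv = A @ v              # matrix-vector product, shape (2,)
mm = A @ B              # matrix-matrix product, shape (2, 2)

print(outer)
print(mv)
print(mm)
```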
Statistical functions in NumPy:
- percentile(data, percentage, axis=0)
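A small sketch of `np.percentile` on made-up data. Note that if `axis` is left out, NumPy computes the percentile over the flattened array, so `axis=0` has to be passed explicitly to work column by column.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

p50_cols = np.percentile(data, 50, axis=0)  # 50th percentile of each column
p50_rows = np.percentile(data, 50, axis=1)  # 50th percentile of each row
p50_all = np.percentile(data, 50)           # over all values

print(p50_cols)  # 5, 6, 7, 8
print(p50_rows)  # 2.5, 6.5, 10.5
print(p50_all)   # 6.5
```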
Reshaping and resizing of a matrix:
Resizing an array changes its dimensions so that it becomes compatible with another array in an operation. The new array can have the same or different dimensions, and the number of elements can change in the process.
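A minimal sketch using `np.resize`, which returns a new array of the requested shape. When the new array is larger, it is filled with repeated copies of the original data. When it is smaller, the extra elements are dropped.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

bigger = np.resize(a, (3, 3))   # 1 2 3 / 4 1 2 / 3 4 1 (data repeats)
smaller = np.resize(a, (1, 3))  # 1 2 3 (data is truncated)

print(bigger)
print(smaller)
```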
Reshaping an array:
Besides resizing, you can also reshape your array: you give it a new shape without changing its data. The one rule of reshaping is that the total number of elements must stay the same.
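A sketch of `reshape` on a 12-element array, including what happens when the element count does not match. The `failed` flag is only there to make the outcome visible.

```python
import numpy as np

a = np.arange(12)       # 12 elements
b = a.reshape(3, 4)     # 3 * 4 == 12, so this works
c = a.reshape(2, -1)    # -1 asks NumPy to infer the dimension: (2, 6)

failed = False
try:
    a.reshape(5, 3)     # 5 * 3 == 15 != 12
except ValueError as err:
    failed = True
    print("cannot reshape:", err)

print(b.shape, c.shape)  # (3, 4) (2, 6)
```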
Transpose operation:
For a 2D array, this function reverses the axes, i.e. it swaps the rows and columns.
For a multidimensional array, it permutes the axes according to the axes argument. If no argument is given, it reverses their order.
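A short sketch of both cases, using `np.arange` data reshaped into a 2D and a 3D array. The names `m` and `cube` are illustrative.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)
t = m.T                      # rows and columns swapped: shape (3, 2)

cube = np.arange(24).reshape(2, 3, 4)
p = np.transpose(cube, axes=(1, 0, 2))  # swap the first two axes: (3, 2, 4)
r = cube.transpose()                    # no argument: axes reversed, (4, 3, 2)

print(t.shape, p.shape, r.shape)
```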
These are some of the main functions of NumPy; you can even use it for small visualization tasks.
There is a lot more that can be done with NumPy, and we will explore more of it when we cover exploratory data analysis.
Stay tuned and happy learning!