Exploring Numpy and Pandas in Python 3.9: Installation, Benefits, Data Import, Manipulation, and Export
Introduction #
Numpy and Pandas are fundamental libraries for data manipulation and analysis in Python. Numpy provides support for large, multidimensional arrays and mathematical functions, while Pandas excels at handling structured, labeled data. In this article, we’ll explore both libraries in Python 3.9, covering installation, caveats, and the benefits each offers. We’ll import data from CSV, Excel, and tab-delimited files, generate useful metrics, manipulate and massage the data, and finally export it in different formats.
Installation of Numpy and Pandas on Python 3.9 #
To install Numpy and Pandas for Python 3.9, you can use pip, the package installer for Python:
pip install numpy pandas
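Once the install finishes, a quick check like the sketch below confirms that both libraries import and shows which versions were resolved. pip normally picks the newest release that still declares support for the running interpreter, so on Python 3.9 the versions printed may be older than the latest ones on PyPI. The `versions` dictionary is just an illustrative name.

```python
import sys

import numpy as np
import pandas as pd

# Collect the interpreter and library versions that pip resolved.
versions = {
    "python": ".".join(str(part) for part in sys.version_info[:3]),
    "numpy": np.__version__,
    "pandas": pd.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```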
Benefits of Numpy and Pandas #
Numpy: #

Efficiency: Numpy arrays store elements of a single type in one contiguous block of memory, which makes them more memory-efficient and faster for numerical computations than standard Python lists.

Multidimensional Arrays: Numpy allows you to work with multidimensional arrays, enabling efficient handling of large datasets.

Broadcasting: Numpy supports broadcasting, which lets arrays of different but compatible shapes be combined in a single operation without manually repeating the smaller one.

Mathematical Functions: Numpy comes with a wide range of mathematical functions for various numerical operations.
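To make broadcasting and vectorized math concrete, here is a minimal sketch. The array values and variable names are made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)

# Broadcasting: the row is applied to both rows of the matrix.
shifted = matrix + row

# A column vector of shape (2, 1) broadcasts across the columns instead.
col = np.array([[100], [200]])
offset = matrix + col

# Vectorized math: one call operates on every element, no Python loop.
roots = np.sqrt(matrix)

print(shifted)
print(offset)
```

Here `shifted` adds 10, 20, and 30 to the first, second, and third column, while `offset` adds 100 to the first row and 200 to the second.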
Pandas: #

Data Structures: Pandas provides two primary data structures, Series (one-dimensional) and DataFrame (two-dimensional, tabular), that are ideal for handling structured data.

Data Alignment: Pandas aligns data automatically based on labels, making it easy to perform operations on datasets with missing or misaligned data.

Data Wrangling: Pandas offers powerful tools for data wrangling, including filtering, transforming, and aggregating data.
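Label-based alignment is easiest to see with two Series whose indexes only partly overlap. The fruit names and numbers below are hypothetical.

```python
import pandas as pd

q1 = pd.Series({"apples": 10, "pears": 4})
q2 = pd.Series({"pears": 6, "plums": 3})

# Addition matches on labels, not position; unmatched labels become NaN.
total = q1 + q2

# fill_value=0 treats a label missing on one side as zero instead.
total_filled = q1.add(q2, fill_value=0)

print(total)
print(total_filled)
```

Only "pears" exists in both Series, so it is the only label with a numeric result in `total`; `total_filled` keeps all three labels.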
Importing Data from Files into Numpy and Pandas #
Importing CSV Files #
Numpy:
import numpy as np
data_np = np.genfromtxt('data.csv', delimiter=',', skip_header=1)
Pandas:
import pandas as pd
data_pd = pd.read_csv('data.csv')
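The two loaders treat the same file quite differently. The sketch below uses `io.StringIO` as a stand-in for `data.csv` so it runs without a file on disk; the column names and values are invented for the example.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical file contents; the second data row has an empty last field.
csv_text = "id,height,weight\n1,1.70,65\n2,1.82,\n3,1.65,58\n"

# Numpy: the header is skipped, every field is parsed as float,
# and the empty field becomes nan.
data_np = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Pandas: the header becomes column names and a dtype is inferred per column.
data_pd = pd.read_csv(io.StringIO(csv_text))

print(data_np)
print(data_pd.dtypes)
```

With Numpy the column names are lost and any non-numeric column would also turn into nan, so Pandas is usually the more convenient choice for files that mix text and numbers.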
Importing Excel Files #
Numpy:
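Numpy does not have direct support for reading Excel files. You can use Pandas to read the data and then convert it to a Numpy array. Reading .xlsx files requires an Excel engine such as openpyxl, which is installed separately (pip install openpyxl):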
import pandas as pd
import numpy as np
data_pd = pd.read_excel('data.xlsx')
data_np = data_pd.to_numpy()
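One caveat with the conversion step: a sheet that mixes text and numbers does not become a numeric array. The sketch below builds a small DataFrame by hand as a hypothetical stand-in for the result of `pd.read_excel`, so it runs without an Excel file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a sheet loaded with pd.read_excel.
data_pd = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1.5, 2.5, 3.5],
    "count": [1, 2, 3],
})

# Mixed text and numbers fall back to a generic object array.
mixed_np = data_pd.to_numpy()

# Keeping only the numeric columns gives a float array ready for math.
numeric_np = data_pd.select_dtypes(include="number").to_numpy()

print(mixed_np.dtype, numeric_np.dtype)
```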
Importing Tab-Delimited Files #
Numpy:
import numpy as np
data_np = np.genfromtxt('data.txt', delimiter='\t', skip_header=1)
Pandas:
import pandas as pd
data_pd = pd.read_csv('data.txt', delimiter='\t')
Generating Useful Metrics with Numpy and Pandas #
Numpy:
import numpy as np
data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
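# Mean of all elements (pass axis=0 or axis=1 for per-column or per-row means)
mean_np = np.mean(data_np)
# Standard deviation of all elements (population formula, ddof=0 by default)
std_dev_np = np.std(data_np)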
# Sum along rows or columns
sum_rows_np = np.sum(data_np, axis=1)
sum_cols_np = np.sum(data_np, axis=0)
Pandas:
import pandas as pd
data_pd = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Mean of each column
mean_pd = data_pd.mean().values
# Standard deviation of each column (sample formula, ddof=1 by default)
std_dev_pd = data_pd.std().values
# Sum along rows or columns
sum_rows_pd = data_pd.sum(axis=1).values
sum_cols_pd = data_pd.sum().values
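The two libraries do not return the same standard deviation out of the box: Numpy defaults to the population formula (`ddof=0`) and Pandas to the sample formula (`ddof=1`). A minimal sketch that puts them side by side, using the same 3x3 data as above:

```python
import numpy as np
import pandas as pd

data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
data_pd = pd.DataFrame(data_np)

# Column-wise statistics in Numpy need an explicit axis.
col_mean_np = data_np.mean(axis=0)
col_std_population = data_np.std(axis=0)        # ddof=0 (Numpy default)
col_std_sample = data_np.std(axis=0, ddof=1)    # ddof=1 (Pandas default)

# Pandas computes per-column sample statistics by default.
col_std_pd = data_pd.std().to_numpy()

# describe() bundles count, mean, std, min, quartiles, and max per column.
summary = data_pd.describe()

print(np.allclose(col_std_sample, col_std_pd))
print(summary)
```

Passing `ddof=1` to Numpy (or `ddof=0` to Pandas) makes the results agree.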
Data Manipulation and Massaging with Numpy and Pandas #
Numpy:
import numpy as np
data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Transpose
transposed_data_np = np.transpose(data_np)
# Reshape
reshaped_data_np = data_np.reshape((1, 9))
# Slicing
subset_np = data_np[0:2, 1:3]
# Element-wise operations
doubled_data_np = data_np * 2
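A few more Numpy operations come up often when massaging data: boolean masks, conditional replacement, inferred reshapes, and stacking. The following sketch reuses the same array; the extra row is made up for illustration.

```python
import numpy as np

data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Boolean mask: keep only the values greater than 5 (the result is 1-D).
large_values = data_np[data_np > 5]

# np.where: keep even values, replace odd values with 0.
evens_only = np.where(data_np % 2 == 0, data_np, 0)

# reshape with -1 lets Numpy infer that dimension from the array size.
flat = data_np.reshape(-1)

# Stack arrays vertically to append a new row.
with_extra_row = np.vstack([data_np, [10, 11, 12]])

print(large_values)
print(evens_only)
```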
Pandas:
import pandas as pd
data_pd = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Transpose
transposed_data_pd = data_pd.transpose()
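# Reshape: a DataFrame has no reshape() method; convert with to_numpy() first,
# or use melt(), stack(), or pivot() to restructure labeled data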
# Slicing
subset_pd = data_pd.iloc[0:2, 1:3]
# Element-wise operations
doubled_data_pd = data_pd * 2
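The filtering, transforming, and aggregating mentioned earlier look like this in practice. The `sales` table is hypothetical and exists only to give the operations something to work on.

```python
import pandas as pd

# Hypothetical sales records used only for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 3, 7, 12],
    "price": [2.0, 5.0, 2.0, 5.0],
})

# Transforming: derive a new column from existing ones.
sales["revenue"] = sales["units"] * sales["price"]

# Filtering: keep rows that satisfy a boolean condition.
big_orders = sales[sales["units"] >= 7]

# Aggregating: total revenue per region.
revenue_by_region = sales.groupby("region")["revenue"].sum()

print(big_orders)
print(revenue_by_region)
```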
Exporting Data in Various Formats using Numpy and Pandas #
Exporting to CSV #
Numpy:
import numpy as np
data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# fmt='%d' writes plain integers; the default format writes scientific notation
np.savetxt('output_np.csv', data_np, delimiter=',', fmt='%d')
Pandas:
import pandas as pd
data_pd = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
data_pd.to_csv('output_pd.csv', index=False)
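By default `np.savetxt` writes no header and formats every value as `'%.18e'`, while `to_csv` writes column names automatically. The sketch below, which uses a temporary directory and invented column names `a`, `b`, `c`, shows one way to make the two outputs match and reads both files back to check.

```python
import os
import tempfile

import numpy as np
import pandas as pd

data_np = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    np_path = os.path.join(tmp_dir, "output_np.csv")
    pd_path = os.path.join(tmp_dir, "output_pd.csv")

    # Numpy: the header is written as a "# " comment line unless comments="".
    np.savetxt(np_path, data_np, delimiter=",", fmt="%d",
               header="a,b,c", comments="")

    # Pandas: column names are written as the header row automatically.
    pd.DataFrame(data_np, columns=["a", "b", "c"]).to_csv(pd_path, index=False)

    # Read both files back to confirm they hold the same table.
    back_np = pd.read_csv(np_path)
    back_pd = pd.read_csv(pd_path)

print(back_np.equals(back_pd))
```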
Exporting to Excel #
Numpy:
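Numpy does not have direct support for writing to Excel files. You can wrap the array in a Pandas DataFrame and write that instead. Writing .xlsx files requires an Excel engine such as openpyxl, installed separately: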
import pandas as pd
import numpy as np
data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
data_pd = pd.DataFrame(data_np)
data_pd.to_excel('output.xlsx', index=False)
Exporting to Tab-Delimited Text File #
Numpy:
import numpy as np
data_np = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# fmt='%d' writes plain integers; the default format writes scientific notation
np.savetxt('output.txt', data_np, delimiter='\t', fmt='%d')
Pandas:
import pandas as pd
data_pd = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
data_pd.to_csv('output.txt', sep='\t', index=False)
Conclusion #
Both Numpy and Pandas are powerful libraries for data manipulation and analysis in Python. Numpy excels at numerical computations with multidimensional arrays, while Pandas is ideal for handling structured data. Having worked through installation, data import, metrics generation, data manipulation, and export, you’ll be well-equipped to take on a wide range of data-centric projects. Whether you’re performing scientific computing, data analysis, or data cleaning, these libraries will streamline your workflow. Happy data crunching!