Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython

by William McKinney

Paperback

$39.99 

Overview

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Product Details

ISBN-13: 9781449319793
Publisher: O'Reilly Media, Incorporated
Publication date: 10/29/2012
Pages: 463
Product dimensions: 7.00(w) x 9.10(h) x 0.90(d)

About the Author

Wes McKinney is the main author of pandas, the popular open source Python library for data analysis. Wes is an active speaker and participant in the Python and open source communities. He worked as a quantitative analyst at AQR Capital Management and Python consultant before founding DataPad, a data analytics company, in 2013. He graduated from MIT with an S.B. in Mathematics.

Table of Contents

  • Preface; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us
  • Chapter 1: Preliminaries; 1.1 What Is This Book About?; 1.2 Why Python for Data Analysis?; 1.3 Essential Python Libraries; 1.4 Installation and Setup; 1.5 Community and Conferences; 1.6 Navigating This Book; 1.7 Acknowledgements
  • Chapter 2: Introductory Examples; 2.1 1.usa.gov data from bit.ly; 2.2 MovieLens 1M Data Set; 2.3 US Baby Names 1880-2010; 2.4 Conclusions and The Path Ahead
  • Chapter 3: IPython: An Interactive Computing and Development Environment; 3.1 IPython Basics; 3.2 Using the Command History; 3.3 Interacting with the Operating System; 3.4 Software Development Tools; 3.5 IPython HTML Notebook; 3.6 Tips for Productive Code Development Using IPython; 3.7 Advanced IPython Features; 3.8 Credits
  • Chapter 4: NumPy Basics: Arrays and Vectorized Computation; 4.1 The NumPy ndarray: A Multidimensional Array Object; 4.2 Universal Functions: Fast Element-wise Array Functions; 4.3 Data Processing Using Arrays; 4.4 File Input and Output with Arrays; 4.5 Linear Algebra; 4.6 Random Number Generation; 4.7 Example: Random Walks
  • Chapter 5: Getting Started with pandas; 5.1 Introduction to pandas Data Structures; 5.2 Essential Functionality; 5.3 Summarizing and Computing Descriptive Statistics; 5.4 Handling Missing Data; 5.5 Hierarchical Indexing; 5.6 Other pandas Topics
  • Chapter 6: Data Loading, Storage, and File Formats; 6.1 Reading and Writing Data in Text Format; 6.2 Binary Data Formats; 6.3 Interacting with HTML and Web APIs; 6.4 Interacting with Databases
  • Chapter 7: Data Wrangling: Clean, Transform, Merge, Reshape; 7.1 Combining and Merging Data Sets; 7.2 Reshaping and Pivoting; 7.3 Data Transformation; 7.4 String Manipulation; 7.5 Example: USDA Food Database
  • Chapter 8: Plotting and Visualization; 8.1 A Brief matplotlib API Primer; 8.2 Plotting Functions in pandas; 8.3 Plotting Maps: Visualizing Haiti Earthquake Crisis Data; 8.4 Python Visualization Tool Ecosystem
  • Chapter 9: Data Aggregation and Group Operations; 9.1 GroupBy Mechanics; 9.2 Data Aggregation; 9.3 Group-wise Operations and Transformations; 9.4 Pivot Tables and Cross-Tabulation; 9.5 Example: 2012 Federal Election Commission Database
  • Chapter 10: Time Series; 10.1 Date and Time Data Types and Tools; 10.2 Time Series Basics; 10.3 Date Ranges, Frequencies, and Shifting; 10.4 Time Zone Handling; 10.5 Periods and Period Arithmetic; 10.6 Resampling and Frequency Conversion; 10.7 Time Series Plotting; 10.8 Moving Window Functions; 10.9 Performance and Memory Usage Notes
  • Chapter 11: Financial and Economic Data Applications; 11.1 Data Munging Topics; 11.2 Group Transforms and Analysis; 11.3 More Example Applications
  • Chapter 12: Advanced NumPy; 12.1 ndarray Object Internals; 12.2 Advanced Array Manipulation; 12.3 Broadcasting; 12.4 Advanced ufunc Usage; 12.5 Structured and Record Arrays; 12.6 More About Sorting; 12.7 NumPy Matrix Class; 12.8 Advanced Array Input and Output; 12.9 Performance Tips
  • Python Language Essentials; The Python Interpreter; The Basics; Data Structures and Sequences; Functions; Files and the operating system
  • Colophon