Preliminaries

These tutorials make use of a number of Python packages. This section describes how to install these packages. (If you do not already have Python 3 installed on your system you may wish to skip to the section Installing Python.)

Required Python packages

The tutorials collected here assume that Python version 3.3 or higher is installed along with the following packages: NumPy, SciPy, matplotlib, pandas, NLTK, and scikit-learn.

Note

As of this writing, NLTK 3.0 “alpha” is the only version released that works with Python 3. It must be installed from source. If you have pip available just use: pip install http://nltk.org/nltk3-alpha/nltk-3.0a4.tar.gz

If these packages are installed, the following lines of code should run in a Python interpreter without error.

In [1]: import numpy

In [2]: numpy.__version__
Out[2]: '1.9.1'

In [3]: import scipy

In [4]: scipy.__version__
Out[4]: '0.13.3'

In [5]: import matplotlib

In [6]: matplotlib.__version__
Out[6]: '1.4.2'

In [7]: import pandas

In [8]: pandas.__version__
Out[8]: '0.15.1'

In [9]: import nltk

In [10]: nltk.__version__
Out[10]: '3.0.0'

In [11]: import sklearn

In [12]: sklearn.__version__
Out[12]: '0.15.2'
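The same check can be scripted rather than typed line by line. The following is a minimal sketch, not part of the original tutorials: the helper name installed_versions and the REQUIRED list are illustrative, and the list simply repeats the import names used in the session above.

```python
import importlib

# Import names of the packages checked above (scikit-learn imports as "sklearn").
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "nltk", "sklearn"]


def installed_versions(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to a placeholder.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(name, version if version is not None else "NOT INSTALLED")
```

Any package reported as NOT INSTALLED needs to be installed using one of the methods described later in this section.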

The following packages are also worth checking out:

In [13]: import IPython

In [14]: IPython.__version__
Out[14]: '3.0.0'

In [15]: import statsmodels

In [16]: statsmodels.__version__
Out[16]: '0.6.1'

Note

Why Python 3? Python 3 stores strings as Unicode. This makes working with languages other than English immeasurably easier. In Python 3, “文” and “ø” are characters like “a” and “o” and require no special handling.
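A short illustration of the point above, offered as a sketch: the sample string is made up for this example, and the byte count assumes UTF-8 encoding.

```python
# In Python 3 a str is a sequence of Unicode characters, not bytes.
text = "文øa"

n_chars = len(text)                   # three characters
upper = text.upper()                  # case mapping works for "ø" as it does for "a"
n_bytes = len(text.encode("utf-8"))   # bytes appear only after explicit encoding

print(n_chars, n_bytes, upper == "文ØA")
```

The string has three characters but occupies six bytes once encoded as UTF-8. String operations such as len and upper work on characters, so no byte-level handling is needed.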

Installing Python

Installing Python on Linux

On Debian-based systems such as Ubuntu, Python 3 may be installed with:

apt-get install python3

On Fedora the command is:

yum install python3

Depending on how current the operating system is, these commands may or may not install Python version 3.3 or higher. Find the version of Python available by running python3 --version in a terminal.
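The version can also be checked from inside the interpreter. This is a small sketch; the function name version_ok and the constant MINIMUM are illustrative, with the minimum taken from the 3.3 requirement stated earlier.

```python
import sys

MINIMUM = (3, 3)  # minimum version assumed by these tutorials


def version_ok(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print(sys.version.split()[0], "OK" if version_ok() else "too old")
```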

Installing Python on Mac OS X

Installing Python 3 via Homebrew is the preferred method for those comfortable with the OS X command line interface.

Mac OS X installers for Python may be found on the official Python website.

Finally, Python 3.3 may also be installed via MacPorts <http://macports.org>.

Installing Python on Windows

A number of distributions of Python for Windows come bundled with Python packages relevant to scientific computing, including NumPy, SciPy, and scikit-learn. One such distribution with excellent support for Python 3 is Anaconda Python.

Installing Python packages

Installing packages on Linux

Note

Advanced users may want to consider isolating these packages in a virtual environment.
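One way to create such an environment is the venv module in the Python standard library (available since Python 3.3; the with_pip option requires Python 3.4 or higher). The sketch below is only one possible approach. The function name make_environment and the directory name used afterwards are hypothetical.

```python
import os
import venv


def make_environment(path, with_pip=True):
    """Create a virtual environment at `path` using the standard-library venv module."""
    venv.create(path, with_pip=with_pip)
    # venv writes a pyvenv.cfg file at the root of every environment it creates.
    return os.path.join(path, "pyvenv.cfg")
```

After calling, for example, make_environment("tutorial-env"), the environment is activated on Linux with source tutorial-env/bin/activate. Packages installed with pip while it is active stay inside that directory.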

Using the package manager

On recent versions of Debian and Ubuntu as well as Fedora Linux there are precompiled packages available that cover almost all of the requirements. With apt-get, most of the requirements are installed with the following command:

sudo apt-get install python3-numpy python3-scipy python3-pandas python3-matplotlib ipython3

Using pip

Installing the required packages is straightforward if the pip installer is available. For example, NLTK may be installed with the following command:

pip install http://nltk.org/nltk3-alpha/nltk-3.0a4.tar.gz

scikit-learn may also be installed with pip:

pip install scikit-learn
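When several Python versions are installed, a bare pip command may belong to a different interpreter than the one used for the tutorials. A common way around this is to run pip as a module of the intended interpreter. The following sketch shows that idea; the function names are illustrative and not part of pip.

```python
import importlib.util
import subprocess
import sys


def pip_available():
    """Return True if the pip module can be found by the running interpreter."""
    return importlib.util.find_spec("pip") is not None


def pip_install_command(requirement):
    """Build the argument list for installing `requirement` with this interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_install(requirement):
    """Run pip in a subprocess; raises CalledProcessError if the install fails."""
    if not pip_available():
        raise RuntimeError("pip is not available for " + sys.executable)
    subprocess.check_call(pip_install_command(requirement))
```

Calling pip_install("scikit-learn") then has the same effect as the command above, but is guaranteed to install into the interpreter that runs the script.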

Installing from source

pip should be available on any system with Python 3.4 or higher installed. If pip is not available, the packages may be installed from source “tarballs”. To install NumPy from source, use the following commands:

curl -O https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz
tar zxvf numpy-1.8.1.tar.gz
cd numpy-1.8.1
python setup.py install

To install matplotlib from source, enter the following commands:

curl -O -L https://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.3.1/matplotlib-1.3.1.tar.gz
tar zxvf matplotlib-1.3.1.tar.gz
cd matplotlib-1.3.1
python setup.py install

To install NLTK:

curl -O http://nltk.org/nltk3-alpha/nltk-3.0a4.tar.gz
tar zxvf nltk-3.0a4.tar.gz
cd nltk-3.0a4
python setup.py install
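On systems without the tar command, the extraction step can be done with the tarfile module from the Python standard library. This is a minimal sketch assuming the archive has already been downloaded; the function name extract_tarball is illustrative.

```python
import tarfile


def extract_tarball(archive_path, destination="."):
    """Extract a .tar.gz archive into `destination` and return its member names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Only extract archives obtained from a source you trust.
        archive.extractall(destination)
    return names
```

For example, extract_tarball("nltk-3.0a4.tar.gz") does the work of the tar zxvf command above. The setup.py step is then run from the extracted directory as before.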

Installing packages on Mac OS X

Installation of Python 3 and the required packages may be accomplished using MacPorts or Homebrew. For example, the following command installs matplotlib for Python version 3.3 under MacPorts:

sudo port install py33-matplotlib

The Homebrew wiki has a page, “Homebrew and Python”, that describes how Homebrew handles Python.

Installing packages on Windows

A number of distributions of Python for Windows come bundled with packages relevant to scientific computing, such as NumPy and SciPy. One example is Anaconda Python, which includes almost all the packages used here. Instructions on how to use Python 3 with Anaconda are also available <http://continuum.io/blog/anaconda-python-3>.