Driven by the needs of data science, Python is now the 4th most popular language (according to the TIOBE Index for December), and there has been a lot of interesting work to improve the usability of its tools.
Based on the homework I did today, the easiest way to set up your Python dev environment is simply to use Anaconda, a great open source analytics platform from Continuum Analytics. It comes with tools such as conda, the package manager, and many popular Python libraries for data science. The company also offers cloud-based services for life cycle management of Python packages, notebooks, etc.
Fat Installation with Anaconda
Simply follow the instructions here: http://docs.continuum.io/anaconda/install. By default it installs to your home directory (~/anaconda), which can be customized in the installer. You need to add ~/anaconda/bin to your PATH if the installer does not patch your PATH environment setting.
To update your Anaconda installation, simply run:
>conda update anaconda
Conda is a great package manager for Python; more details on conda come later in the post. It is not limited to Python packages either. For example, you can install R support with:
>conda install -c r r-essentials
This installs "IRKernel and over 80 mostly used R packages including dplyr, shiny, ggplot2, tidyr, caret and nnet".
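To double-check the PATH note above, here is a minimal Python sketch that reports whether a given directory appears on PATH. The function name `dir_on_path` is hypothetical, and `~/anaconda/bin` is only the default location mentioned earlier; adjust it if you customized the install directory.

```python
import os


def dir_on_path(directory, path_env=None):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")

    def normalize(p):
        # Expand "~" and collapse redundant separators (e.g. a trailing slash).
        return os.path.normpath(os.path.expanduser(p))

    target = normalize(directory)
    entries = [normalize(p) for p in path_env.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    # Assumption: the default install location from this post.
    print(dir_on_path("~/anaconda/bin"))
```

If this prints False, the installer probably did not patch your shell profile, and you need to add the directory to PATH yourself.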
Slim Installation with Miniconda
If you do not want the fat installation from Anaconda, you can install Miniconda instead, which only includes Python and several essential packages. You can download the installer for your platform; see the instructions here.
Using Anaconda
With Anaconda or Miniconda installed, you are all set for development. Here are several quick notes that could help you have more fun.
conda, a package manager to rule them all
Conda is the command line package manager that solves a lot of issues with package and library management in Python. It is actually not just a package manager for Python; I even found NodeJS libraries there.
A quick list of features conda provides:
- virtual environments: it lets you create separate environments, each with its own Python version, list of libraries, etc. This is what Virtualenv tries to provide, but conda makes it much easier.
- package management: install, update, and remove packages together with their dependencies.
- build and distribute packages: you can either use the Anaconda Cloud service or easily host your own.
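To make the virtual environment feature concrete, here is a small Python sketch that builds `conda create` commands for two environments with different Python versions. The helper names (`conda_create_args`, `as_shell`), the environment names, and the package lists are all hypothetical and chosen for illustration; the sketch only prints the commands rather than running them.

```python
import shlex


def conda_create_args(env_name, python_version, packages=()):
    """Build the argument list for `conda create` with a pinned Python version."""
    args = ["conda", "create", "--yes", "-n", env_name, "python=" + python_version]
    args.extend(packages)
    return args


def as_shell(args):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Hypothetical environment names and package lists, for illustration only.
    print(as_shell(conda_create_args("py27", "2.7", ["numpy"])))
    print(as_shell(conda_create_args("py35", "3.5", ["numpy", "pandas"])))
```

To actually create an environment, you could paste the printed command into a shell or pass the argument list to `subprocess.run`. Activating the environment afterwards is done from the shell, and the exact activate command depends on your conda version and platform.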
To learn more, check out the conda cheat sheet (PDF), read the official conda docs, and watch the demo video (around 20 minutes, highly recommended).
Anaconda Cloud
Anaconda Cloud (previously known as Binstar) is a hosted package management service for notebooks, environments, conda and PyPI packages, etc. Several quick links:
- anaconda FAQ: http://docs.continuum.io/anaconda/faq
- anaconda command (CLI for Anaconda Cloud) reference: http://docs.anaconda.org/cli.html
- Using Anaconda Cloud: http://docs.anaconda.org/using.html
- IDE integration: http://docs.continuum.io/anaconda/ide_integration
- Anaconda Launcher (GUI desktop launcher app): http://docs.continuum.io/anaconda-launcher/index
- building conda packages: http://conda.pydata.org/docs/building/build.html
IDE integration
Anaconda can be easily integrated with your favorite IDE, as mentioned here. To be frank, I am not familiar with many Python IDEs. I mostly use either a text editor (such as Vim) or PyCharm from JetBrains.
The latest PyCharm already supports conda. All you need to do is add a new interpreter in the preferences and point it at your Anaconda Python installation (e.g. ~/anaconda/bin/python) or at the Python installation of a specific conda environment (PyCharm supports both VirtualEnv and Conda environments).
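To confirm that the IDE really picked up the interpreter you intended, you can run a short script like the sketch below from inside the IDE. The function name `interpreter_info` is hypothetical, and the conda check relies on the assumption that conda-managed installations and environments keep a `conda-meta` directory under their prefix.

```python
import os
import sys


def interpreter_info(prefix=None, executable=None):
    """Describe the running interpreter and guess whether conda manages it."""
    prefix = sys.prefix if prefix is None else prefix
    executable = sys.executable if executable is None else executable
    return {
        "executable": executable,
        "prefix": prefix,
        # Assumption: conda-managed prefixes contain a "conda-meta" directory.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(key, value)
```

If the executable path does not point into your Anaconda directory or the chosen environment, revisit the interpreter setting in the IDE.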
OK, that's about it. I hope you enjoy Anaconda and Python without the hassle of dev environment setup.