Setting up the Environment [1]
The suite relies on several external Python 3 libraries to accomplish its tasks. These dependencies [2] must be installed and working before you get started.
Installing the Dependencies
Matplotlib (a plotting library for Python) powers the data visualization capabilities of the suite. Its Tk-based backend relies on Tkinter (the Tk interface for Python). To install Tkinter:
$ sudo apt-get install python3-tk
Using pip and virtualenv is strongly recommended for installing project-specific Python packages.
For instructions on installing and setting up pip and virtualenv, refer to Installing packages using pip and virtualenv.
Note
Ensure that you activate your virtualenv (if you have created one) before installing Python packages.
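Whether a virtualenv is currently active can also be checked from within Python. The following is a minimal sketch, not part of the suite; the helper name in_virtualenv is hypothetical, and it assumes the environment was created with either the venv module or the virtualenv tool.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter appears to run inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        # Older virtualenv releases set sys.real_prefix; venv and newer
        # virtualenv releases set sys.base_prefix instead.
        base_prefix = getattr(sys, "real_prefix",
                              getattr(sys, "base_prefix", sys.prefix))
    # Inside an environment, sys.prefix points at the environment directory
    # and therefore differs from the base installation prefix.
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtualenv active" if in_virtualenv() else "no virtualenv detected")
```

If the check reports that no virtualenv was detected, packages installed with pip go into the system-wide or per-user site-packages instead.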
The suite is built upon NumPy, the fundamental package for numerical and scientific computing in Python. To install NumPy:
(env) $ python3 -m pip install numpy
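After the two installation steps above, a quick way to confirm that the modules can be found is to ask the import system for them. This is only an illustrative sketch; the helper name missing_modules is hypothetical and not part of the suite.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # tkinter comes from the python3-tk system package; numpy is installed with pip.
    missing = missing_modules(["tkinter", "numpy"])
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result means both modules are visible to the interpreter that ran the check, so run it with the same (env) interpreter you used for pip.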
Installing Buddi-CRAVeD Suite
Installing from PyPI
The Buddi-CRAVeD (alias craved) package is available on PyPI (the Python Package Index). Pip allows for a one-step installation of the package and its dependencies using:
(env) $ python3 -m pip install craved
Installing from Source
Alternatively, the source code of the Buddi-CRAVeD suite can be downloaded from https://github.com/elixir-code/craved.git.
$ git clone https://github.com/elixir-code/craved.git
Pip allows for a one-step installation of the package and its dependencies from local source using:
(env) $ python3 -m pip install <path to source directory>
For instructions on installing packages from a VCS (Version Control System) or from local source, refer to the Python Packaging User Guide.
Preparing the Data Warehouse
The Buddi-CRAVeD warehouse directory functions as an aggregated store for intermediate data structures, sampled datasets and accumulated results.
To configure and set up an empty directory of your choice as the suite’s data warehouse, run the “craved-warehouse” script in a terminal from within that directory.
$ craved-warehouse
Note
The ‘craved-warehouse’ script, which configures and sets up the package’s warehouse, can be invoked only in empty directories.
Warning
craved does not allow multiple warehouses to be configured simultaneously. Each successful reinvocation of the ‘craved-warehouse’ script in another directory invalidates the previous configuration.
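Because the script only works in empty directories, it can help to check a candidate directory first. The sketch below is not part of the suite; the helper name is_empty_directory is hypothetical, and the demonstration uses a temporary directory in place of a real warehouse location.

```python
import os
import tempfile


def is_empty_directory(path):
    """Return True only if path is an existing directory with no entries."""
    return os.path.isdir(path) and not os.listdir(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as candidate:
        # A freshly created directory has no entries yet.
        print(is_empty_directory(candidate))
        # After adding a file the directory no longer qualifies.
        open(os.path.join(candidate, "notes.txt"), "w").close()
        print(is_empty_directory(candidate))
```

Hidden files count as entries here, so a directory holding only dotfiles is reported as not empty.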
Extending Support for Large Datasets
The Buddi-CRAVeD suite’s enhanced support for cluster analysis of “larger” datasets is enabled through our modified versions of the companion libraries – scikit-learn and scipy.
These libraries in part derive their numerical computation capabilities from ATLAS (Automatically Tuned Linear Algebra Software). To install ATLAS:
$ sudo apt-get install libatlas-base-dev
Python wheels (built for Linux systems) of the modified companion libraries can be downloaded from SourceForge (project: craved-support): scikit_learn-0.18.1-cp35-cp35m-linux_x86_64.whl and scipy-0.19.1-cp35-cp35m-linux_x86_64.whl.
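Pip can uninstall the stock packages and install the remote wheels in their place. Note that the --use-wheel flag is accepted only by pip versions older than 10; later releases removed it.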
(env) $ python3 -m pip uninstall scikit-learn
(env) $ python3 -m pip install --use-wheel --no-index --find-links=https://sourceforge.net/projects/craved-support/files/scikit_learn-0.18.1-cp35-cp35m-linux_x86_64.whl scikit-learn
(env) $ python3 -m pip uninstall scipy
(env) $ python3 -m pip install --use-wheel --no-index --find-links=https://sourceforge.net/projects/craved-support/files/scipy-0.19.1-cp35-cp35m-linux_x86_64.whl scipy
For instructions on the usage of pip and wheel utilities for installing remote and local wheels, refer to the Wheel documentation.
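The cp35-cp35m-linux_x86_64 tag in the wheel file names indicates that the wheels target CPython 3.5 on 64-bit Linux. The following sketch compares the running interpreter against that tag. The function name matches_cp35_linux_x86_64 is hypothetical, and the check is deliberately coarse: it does not inspect the ABI flag (the trailing "m") and cannot tell a 32-bit Python on a 64-bit kernel apart from a 64-bit one.

```python
import platform
import sys


def matches_cp35_linux_x86_64(version_info=None, implementation=None,
                              system=None, machine=None):
    """Coarsely check an interpreter against the cp35-cp35m-linux_x86_64 tag."""
    # Fall back to the running interpreter for any value not supplied.
    version_info = sys.version_info if version_info is None else version_info
    implementation = (platform.python_implementation()
                      if implementation is None else implementation)
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return (
        implementation == "CPython"               # "cp" in the tag
        and tuple(version_info[:2]) == (3, 5)     # "35" in the tag
        and system == "Linux"                     # "linux" in the tag
        and machine == "x86_64"                   # "x86_64" in the tag
    )


if __name__ == "__main__":
    if matches_cp35_linux_x86_64():
        print("interpreter matches the wheel tag")
    else:
        print("interpreter does not match cp35-cp35m-linux_x86_64")
```

If the check fails, pip is likely to reject the wheels as unsupported on the current platform.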
Footnotes
| [1] | The instructions for setting up the environment are specific to Ubuntu-based operating systems. However, they can be replicated on other Linux distributions and Windows systems. |
| [2] | The list of dependencies was generated with pip in a Python virtualenv created exclusively for the project: (env) $ python3 -m pip freeze |