I wanted to conveniently use data science tools without the hassle of installing the required languages and packages, while benefiting from the strengths of the Linux command line tools. There is a pre-packaged VM called the Data Science Toolbox that fills this need.
It comes with R and Python installed, along with the popular data analysis packages for each. Following the instructions on the website, including the installation of prerequisites such as VirtualBox and Vagrant, should be enough to get the VM installed.
One thing to note is that the VM is set up to be accessed through SSH and is configured to use 2 CPU cores and 2 GB of RAM by default. To increase the RAM allocation, edit the Vagrantfile and change the vb.memory setting, which is expressed in MB. For example, 8 GB is 8192 MB; see the code sample below.
config.vm.provider "virtualbox" do |vb|
  # Customize the amount of memory on the VM (value is in MB):
  vb.memory = "8192"
end
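Since the Vagrantfile is plain Ruby, the GB-to-MB conversion can be written out rather than computed by hand. The snippet below is only a small sketch of that arithmetic; `gb_to_mb` and `MB_PER_GB` are hypothetical names made up for illustration and are not part of Vagrant or the Data Science Toolbox.

```ruby
# Hypothetical helper (not part of Vagrant): convert a RAM size in GB
# to the MB figure that the vb.memory setting expects.
MB_PER_GB = 1024

def gb_to_mb(gigabytes)
  (gigabytes * MB_PER_GB).to_i
end

puts gb_to_mb(2)  # the default 2 GB allocation => 2048
puts gb_to_mb(8)  # the 8 GB example above      => 8192
```

If a helper like this were defined near the top of the Vagrantfile, the setting could be written as vb.memory = gb_to_mb(8). Running vagrant reload afterwards restarts the VM so that the new allocation takes effect.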