Data Analytics with HPC

Queen’s University Belfast - 20th-21st June 2018

Terry Sloan, Ioanna Lampaki 
__________________________________________________________________
 
Course Page including timetable: https://events.prace-ri.eu/event/695/
 
__________________________________________________________________
 
This is a live collaborative online document which we will use to share links, information and comments. All course participants are encouraged to contribute. 

Hi all, Eilidh here. I’ll be teaching the Data Cleaning Practical later.

For the Data Cleaning Practical at 12:

To run on your own laptop:

Prerequisites
•Python 2.7 and Conda. (I think it works with Python 3 too).

Download the Data Cleaning practical from 

Command line install:
•cd to the downloaded Data Cleaning practical directory
•conda create --name pythonData
•conda install -n pythonData jupyter pandas
•source activate pythonData (with conda 4.4 or later, conda activate pythonData also works)
•jupyter notebook
•Open http://localhost:8888 in a browser.
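
If the notebook does not start, it may help to check that the environment actually contains what the practical needs. The snippet below is only a rough sketch, not part of the course material: the helper name missing_packages is made up for illustration, and it assumes the module names are "pandas" for pandas and "notebook" for Jupyter Notebook. It should run under both Python 2.7 and Python 3.

```python
from __future__ import print_function  # keeps print() working on Python 2.7 too

import importlib
import sys


def missing_packages(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the module is not installed in this environment.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed module names: "pandas" for pandas, "notebook" for Jupyter Notebook.
    required = ["pandas", "notebook"]
    print("Python", sys.version.split()[0])
    absent = missing_packages(required)
    if absent:
        print("Not importable:", ", ".join(absent))
    else:
        print("All required packages are importable.")
```

Run it with python from inside the activated pythonData environment. If anything is reported as not importable, repeating the conda install step above is the likely fix.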

You should be able to see the Data Cleaning practical notebook in the browser that opens. Start a new notebook and follow the instructions on the PowerPoint slides at http://www.archer.ac.uk/training/course-material/2018/02/data-an-belfast/DAwHPC-L03-Data-Cleaning-Practical.pdf

To run on one of the desktop machines here

Download the Data Cleaning practical from 
to the Documents folder.

From the Start menu, choose All Programs > Anaconda3 > Jupyter Notebook

You should be able to see the Data Cleaning practical notebook in the browser that opens. Start a new notebook and follow the instructions on the PowerPoint slides at http://www.archer.ac.uk/training/course-material/2018/02/data-an-belfast/DAwHPC-L03-Data-Cleaning-Practical.pdf

To run on the DAC (data analytics cluster) at EPCC

N.B. Don’t follow the Jupyter Notebook instructions on the PowerPoint slides.

Log in to DAC from Windows
Open PuTTY and set up an SSH connection:
•Host Name: username@login.rdf.ac.uk
•Port: 22

Log in to DAC from Mac or Unix