ClowdControl

Team

@Greg K @Anisha K  @Andrew D @Lee T @Amanda E @Chris G

Mission

Integrating cloud-deployable pipelines and deep-learning tools with the MindControl QC platform for accessible, reliable, and scalable deployment regardless of available/“captive” compute resources.

Url

Contact

Summary

ClowdControl extends the MindControl web-based image QC platform to enable deployment of pipelines on high-performance computing environments in the cloud. Leveraging tools packaged in Docker, built in JavaScript and Python, and solutions for data storage and cloud computing such as Amazon Web Services, ClowdControl shortens the loop between proofreading, editing, and reprocessing data without requiring further computational resources or know-how on the part of the user. As this tool evolves, it will support data access and control through a variety of databases and authentication methods, and will interface with any Boutiques-documented pipeline on a variety of computational platforms, both in the cloud and locally.

Demo

  • MindControl in the Cloud (currently offline to save $)  

Hacking

Plan

  • Status: ✔️ 
  • Status: (there will always be more pipelines)
  • Status: ✔️
  • Status: ✔️
  • Status: ✔️ 
  • Status: (because it’ll never be done-done)
  • Status: ✔️ 

1. ABIDE Data

  • Available scripts:
  • Download ABIDE data for a list of subjects 
  • Download entire freesurfer directory, BIDS directory, or both from Amazon S3
  • Same as get_abide_fs.py, but also runs Andrew’s QC (for BIDS data)
  • Download raw ABIDE2 data for a list of subjects from Amazon S3
  • Edit Dockerfile with name of script and your subject list
  • Build and run docker container (see README.md)
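
The download step for a list of subjects can be sketched roughly as follows. This is a minimal illustration, not the actual scripts: the key prefix, the function names, and the `fetch` callable are all hypothetical. In practice `fetch` would wrap an S3 client call (for example boto3's `download_file`) and list the objects under each prefix.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def plan_downloads(subjects: Iterable[str], prefix: str, out_dir: str) -> List[Tuple[str, str]]:
    """Pair each subject's (hypothetical) S3 key prefix with a local target directory."""
    plan = []
    for sub in subjects:
        sub = sub.strip()
        if not sub:  # skip blank lines coming from a subject list file
            continue
        key_prefix = f"{prefix.rstrip('/')}/{sub}/"
        plan.append((key_prefix, str(Path(out_dir) / sub)))
    return plan


def run_downloads(plan: List[Tuple[str, str]], fetch: Callable[[str, str], None]) -> int:
    """Call fetch(key_prefix, local_dir) for each planned entry; return how many were fetched."""
    for key_prefix, local_dir in plan:
        fetch(key_prefix, local_dir)
    return len(plan)


# Stand-in fetch that only records what would be downloaded (no network access).
requested = []
plan = plan_downloads(["sub-0001", "", "sub-0002\n"], "hypothetical/abide/freesurfer", "data")
count = run_downloads(plan, lambda key, dest: requested.append((key, dest)))
```

Keeping the planning step separate from the transfer makes it easy to check a subject list before any data is moved.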

2. Freesurfer Pipelines in Docker

  • Translate the mindcontrol_docs FSPrep ipynb into a script to use as an entrypoint for a docker container.
  • ✔️ Built a docker container that takes a BIDS directory with FreeSurfer output in derivatives, and prepares images and stats to be loaded into Mindcontrol
  • Todo:
  •  Save control points (or other edits) created in Mindcontrol into the format/location readable by FreeSurfer
  • Rewrite parse stats commands to be version agnostic, and add more relevant stats
  • Write workflow to run the FreeSurfer BIDS-app to re-run recon-all after edits are made
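
One possible approach to the version-agnostic stats parsing in the todo list is to look measures up by name rather than by line position. The sketch below assumes the stats header contains `# Measure` lines with comma-separated fields (name, short name, description, value, unit). The function name is hypothetical and the sample text is illustrative, with made-up numbers.

```python
from typing import Dict


def parse_measures(stats_text: str) -> Dict[str, float]:
    """Collect '# Measure' header lines into {short_name: value}."""
    measures = {}
    for line in stats_text.splitlines():
        if not line.startswith("# Measure"):
            continue
        fields = [f.strip() for f in line[len("# Measure"):].split(",")]
        if len(fields) < 5:
            continue  # unexpected layout: skip rather than guess
        # Take the value from the end so commas in the description do not matter.
        short_name, value = fields[1], fields[-2]
        try:
            measures[short_name] = float(value)
        except ValueError:
            continue  # non-numeric value: ignore this line
    return measures


# Illustrative text only; the numbers are made up.
sample = """# Title Segmentation Statistics
# Measure BrainSeg, BrainSegVol, Brain Segmentation Volume, 1200000.0, mm^3
# Measure EstimatedTotalIntraCranialVol, eTIV, Estimated Total Intracranial Volume, 1500000.5, mm^3
# ColHeaders Index SegId NVoxels
"""
stats = parse_measures(sample)
```

A measure that is absent in a given version is simply missing from the returned dictionary, so downstream code can decide how to handle it.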

3. LORIS QA SotW

Here are the fields in LORIS:
  • reduced dynamic range due to bright artifact/pixel 
  • slice to slice intensity differences
  • noisy scan
  • susceptibility artifact above the ear canals
  • susceptibility artifact due to dental work
  • susceptibility artifact due to anatomy
  • sagittal ghosts
  • slight ringing artefacts
  • severe ringing artefacts
  • movement artefact due to eyes
  • movement artefact due to carotid flow
  • slight movement between packets
  • large movement between packets
  • Large AP wrap around affecting brain
  • Medium AP wrap around no affect on brain
  • Small AP wrap around no affect on brain
  • Too tight LR cutting into scalp
  • Too tight LR affecting brain
  • Top of scalp cut off
  • Top of brain cut off
  • Base of cerebellum cut off
  • missing top third - minc conversion?
  • checkerboard artifact
  • horizontal intensity striping (Venetian blind effect)
  • high intensity in direction of acquisition
  • signal loss (dark patches)
  • copy of prev data
  • Duplicate series

4. MindControl in Docker

When run, the Docker image launches a Meteor server running MindControl. The Docker image is available on Docker Hub and the Dockerfile is on GitHub.

5. JSON Schema for Clowder

Things to consider:
  • Where is data coming from?
  • How will data get to the cloud?
  • Where will data go after being processed?
  • How will processed data get back into researcher’s hands?
  • What tool is being run in the cloud?
  • How is the tool being run (e.g. options, containers)?

Building blocks:
  • BIDS apps for data formats
  • Boutiques for command-line options
  • AWS: S3, Batch/EC2

Remaining pieces:
  • Data transfer
  • Resource provisioning

Link to schema available on GitHub.
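
To make the considerations above concrete, here is a hedged sketch of what a request document answering those questions might look like, with a minimal completeness check. The key names (`input`, `output`, `tool`, `invocation`, `cloud`) and all values are hypothetical and are not taken from the actual schema.

```python
import json

# Hypothetical top-level keys: where data comes from, where it goes,
# what tool runs, how it runs, and which cloud resources are used.
REQUIRED_KEYS = ("input", "output", "tool", "invocation", "cloud")


def missing_keys(request: dict) -> list:
    """Return the required top-level keys that are absent from the request."""
    return [key for key in REQUIRED_KEYS if key not in request]


request = json.loads("""
{
  "input": {"location": "local", "path": "/path/to/bids"},
  "output": {"location": "s3", "bucket": "hypothetical-bucket"},
  "tool": {"container": "hypothetical/pipeline:latest", "descriptor": "pipeline_boutiques.json"},
  "invocation": {"participant_label": ["01"]},
  "cloud": {"provider": "aws", "service": "batch"}
}
""")
problems = missing_keys(request)
```

A real implementation would more likely validate against the published JSON schema with a schema validator than with a hand-written key check.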

6. Write Clowder

There are two distinct portions of Clowder: the client-side and server-side applications.

Client-side
Necessary tasks:
  • parse/validate clowder request
  • validate data organization
  • validate compute pipeline
  • push input data to the cloud (if necessary)
  • launch server-side Clowder in the cloud
  • check status of jobs in the Clowder
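
The client-side tasks above could be ordered roughly as in the following sketch: run each validation in turn, push data only when it is local, then launch. All names here (`submit`, `input_location`, the stand-in checks) are assumptions for illustration, not the Clowder API. Job status checking is left out.

```python
from typing import Callable, Dict, List, Tuple

Check = Tuple[str, Callable[[Dict], bool]]


def submit(request: Dict, checks: List[Check],
           push: Callable[[Dict], None], launch: Callable[[Dict], str]) -> str:
    """Validate the request, push local input data if needed, then launch the job."""
    for name, check in checks:
        if not check(request):
            raise ValueError(f"validation failed: {name}")
    if request.get("input_location") == "local":  # data already in the cloud needs no push
        push(request)
    return launch(request)


# Stand-in checks and callbacks that record what happened instead of touching the cloud.
events = []
checks = [
    ("request", lambda r: "tool" in r),
    ("data organization", lambda r: "input_path" in r),
    ("compute pipeline", lambda r: bool(r.get("tool"))),
]
job_id = submit(
    {"tool": "hypothetical/pipeline", "input_path": "/data/bids", "input_location": "local"},
    checks,
    push=lambda r: events.append("push"),
    launch=lambda r: events.append("launch") or "job-0",
)
```

Failing fast on validation means nothing is uploaded or launched for a malformed request.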

Server-side
Necessary tasks:
  • interpret commands sent from client-side Clowder
  • pull input data from the cloud
  • launch pipeline on the data
  • push output data to the cloud
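
A minimal sketch of the server-side sequence (pull, run, push), assuming a command with hypothetical `input_uri` and `output_uri` fields. The `pull`, `run`, and `push` callables are placeholders for real storage transfers and a container invocation.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def handle_job(command: Dict,
               pull: Callable[[str, Path], None],
               run: Callable[[Path, Path], int],
               push: Callable[[Path, str], None]) -> int:
    """Pull inputs, run the pipeline, and push outputs only if the run succeeded."""
    with tempfile.TemporaryDirectory() as scratch:
        in_dir, out_dir = Path(scratch) / "in", Path(scratch) / "out"
        in_dir.mkdir()
        out_dir.mkdir()
        pull(command["input_uri"], in_dir)
        exit_code = run(in_dir, out_dir)
        if exit_code == 0:
            push(out_dir, command["output_uri"])
        return exit_code


# Stand-ins that work on local files only.
def fake_pull(uri, dest):
    (dest / "T1w.nii.gz").write_text("placeholder")


def fake_run(src, dst):
    (dst / "done.txt").write_text(",".join(p.name for p in src.iterdir()))
    return 0


pushed = []


def fake_push(src, uri):
    pushed.append((uri, sorted(p.name for p in src.iterdir())))


status = handle_job(
    {"input_uri": "s3://hypothetical-bucket/in", "output_uri": "s3://hypothetical-bucket/out"},
    fake_pull, fake_run, fake_push,
)
```

The scratch directory is removed when the job finishes, so anything worth keeping has to be pushed before the function returns.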

7. Integration

There is a button available for the user in MindControl that grabs relevant data from the database entry for a corresponding file and initiates Clowder to begin the necessary data transfer and pipeline execution in the cloud.

Metadata

Original Repos


Archive

Original Clowder Proposal

  • There exist many tools in the realm of human brain mapping that perform various complicated transforms on brain images. Many of these tools have become easier to install since the advent of Docker containers (and BIDS Apps); however, without significant computational resources it is still challenging to run sophisticated pipelines on one's own. Leveraging Docker and the descriptive I/O framework Boutiques, we propose to build a Python package that takes any pipeline with an accompanying Docker container, as well as Amazon Web Services credentials, and enables that tool to be deployed in the cloud completely transparently to the user. The package will be able to push and pull data within the cloud, and will benefit from the strengths of Boutiques such as input parameter validation prior to job submission. Users will be able to deploy jobs, monitor their status, and store, access, or distribute their results in the cloud. This tool will lower the barrier to entry for researchers who wish to perform computationally burdensome tasks but do not have access to computing resources. This tool can also be extended to run web services in the cloud, such as Jupyter Notebooks, providing users with a web UI for running and interacting with tools, whether for data visualization, introductory demonstration, or wrapping pipeline deployment within a more accessible interface than the command line. This package will be completely open source, and will be distributed on PyPI for both Python 2 and Python 3.

Original Mindcontrol Proposal

  • FreeSurfer is the most widely used software for modeling cortical surfaces, capable of providing a complete neuroanatomical description of the brain from T1-weighted images through both surface-based and volume-based processing streams. However, manual intervention is necessary to QC results or when certain stages of the pipeline fail, especially when analyzing pathological brains. When analyzing large datasets this process may become error-prone. To help organize inspection and editing of FreeSurfer output among collaborators, Mindcontrol was developed as an open-source web-based tool for quality control of neuroimaging pipeline outputs.

  • During this brainhack, we aim to extend Mindcontrol in the following ways:
  • Create a Docker container encapsulating all of the components of Mindcontrol, to make it easier to deploy.
  • Continue development of Nipype workflows to automatically prepare FreeSurfer output to be viewed within Mindcontrol.
  • Create workflows to re-run various FreeSurfer processing steps after manual interventions have been made in Mindcontrol.
  • Extend documentation to make it easier for other users to set up Mindcontrol for their own data.