
ML WG #3

Attendance

Noah

David

Stephan

Tom

Agenda

Progress updates

Notes

Distributed training: Horovod

Dask communications layer using NVLink. Blog article in progress: https://github.com/dask/dask-blog/pull/28

PyTorch & distributed discussion: https://github.com/dask/distributed/issues/2581

Noah: workflow

cloud training

Stephan: uses GPUs at Paperspace and AWS (several environments)

Managing environments with TensorFlow is annoying

Tom: conda should improve this

Need for intermediate models

Hard-coded NN in Fortran code (hard to change)

Online learning: Python climate model interface

Streamz is being maintained at Anaconda (Martin)

would interface with Kafka (and other data sources)

provide higher level API

How suitable is it for deep learning pipelines?

Data loading issue

Summarize this https://github.com/pangeo-data/pangeo/issues/567 @Noah B

https://github.com/dmlc/dlpack

__cuda_array_interface__: no-copy operations. Could be good for evaluation in a climate model

Not so useful for on-disk?
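A CPU-only analogy for the no-copy idea above. This sketch does not use DLPack or __cuda_array_interface__ themselves; it swaps in Python's built-in buffer protocol (a memoryview over an array.array) to show a consumer reading and writing a producer's memory without a copy. The names state and consumer_view are hypothetical.

```python
from array import array

# Hypothetical "model state" owned by a producer (e.g. a climate model).
state = array("d", [1.0, 2.0, 3.0, 4.0])

# A memoryview exposes the same underlying buffer; no data is copied.
consumer_view = memoryview(state)

# Writes through the view are visible to the owner (shared memory).
consumer_view[0] = 10.0

# Slicing a memoryview yields another view, not a copy.
tail = consumer_view[2:]
tail[0] = 30.0

shared = state.tolist()

# An explicit copy, for contrast: changes here do not reach `state`.
copied = array("d", state)
copied[1] = -1.0
```

The GPU protocols discussed above play a similar role between libraries: the consumer receives a pointer plus shape and dtype metadata rather than a duplicate of the data.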

Blog post: loop in Tom, Stephan, and Ryan.

Want nice benchmarks for data loading

Toy problem for data loading: add a simple ML model and measure the ratio of training time to loading time

Need to detect bottlenecks (check hardware).
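A minimal sketch of the toy benchmark described above, assuming a synthetic in-memory data source and a one-variable linear model trained by gradient descent. Everything here is hypothetical scaffolding (load_batches, sgd_step, benchmark, and the io_delay parameter standing in for disk or network latency); a real benchmark would swap in the actual loader and model.

```python
import random
import time


def load_batches(n_batches, batch_size, io_delay=0.0, seed=0):
    """Yield synthetic (xs, ys) batches with y = 3x + 1 plus small noise.

    io_delay is a hypothetical stand-in for disk or network latency.
    """
    rng = random.Random(seed)
    for _ in range(n_batches):
        if io_delay:
            time.sleep(io_delay)
        xs = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]
        ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.01) for x in xs]
        yield xs, ys


def sgd_step(w, b, xs, ys, lr=0.1):
    """One gradient step for a 1-D linear model under mean squared error."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


def benchmark(n_batches=200, batch_size=64, io_delay=0.0):
    """Time loading and training separately and return both totals."""
    w, b = 0.0, 0.0
    load_time = 0.0
    train_time = 0.0
    batches = load_batches(n_batches, batch_size, io_delay)
    while True:
        t0 = time.perf_counter()
        batch = next(batches, None)      # time spent waiting for data
        t1 = time.perf_counter()
        if batch is None:
            break
        w, b = sgd_step(w, b, *batch)    # time spent in the model update
        t2 = time.perf_counter()
        load_time += t1 - t0
        train_time += t2 - t1
    ratio = train_time / load_time if load_time else float("inf")
    return {"w": w, "b": b, "load_s": load_time,
            "train_s": train_time, "train_to_load": ratio}


result = benchmark()
print(f"train/load ratio: {result['train_to_load']:.2f}")
```

A ratio well below 1 suggests the pipeline is loading-bound; raising io_delay in this sketch mimics slower storage and pushes the ratio down.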

Next meeting: Monday, July 1, 9 AM