ML WG #3
Attendance
Noah
David
Stephan
Tom
Agenda
Progress updates
Notes
Distributed training: Horovod (see the sketch below)
Dask communications layer using NVLink; blog article in progress:
https://github.com/dask/dask-blog/pull/28
PyTorch & distributed discussion
https://github.com/dask/distributed/issues/2581
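A minimal sketch of what Horovod-based data-parallel training looks like with PyTorch (assumes Horovod is built with the PyTorch extension and a GPU is available; the model and learning rate are placeholders):

import torch
import horovod.torch as hvd

# Initialize Horovod and pin each worker to its local GPU.
hvd.init()
torch.cuda.set_device(hvd.local_rank())

# Placeholder model and optimizer; scale the learning rate by the
# number of workers, as is common for data-parallel training.
model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across workers, and
# broadcast the initial state from rank 0 so all workers start equal.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)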
Noah: workflow
Cloud training
Stephan: uses GPUs at Paperspace and AWS (several environments)
Setting up environments with TensorFlow is annoying
Tom: conda should improve this
Need for intermediate models
Hard-coded NN in Fortran code (hard to change)
Online learning. Python climate model interface
Streamz is being maintained at Anaconda (Martin)
Would interface with Kafka (and other data sources)
Provides a higher-level API
How suitable is it for deep learning pipelines? (see the sketch below)
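A rough sketch of how a streamz pipeline might feed a training loop; preprocess and train_on_batch are hypothetical placeholders, and the Kafka source is only indicated in a comment:

from streamz import Stream

def preprocess(batch):
    # Hypothetical placeholder: decode / normalize an incoming batch.
    return batch

def train_on_batch(batch):
    # Hypothetical placeholder: one optimizer step on the batch.
    print("trained on", batch)

# In practice the source could come from Kafka or another data source;
# here we emit into a plain in-memory stream for illustration.
source = Stream()
source.map(preprocess).sink(train_on_batch)

for batch in range(3):
    source.emit(batch)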
Data loading issue
Summarize this (@Noah B):
https://github.com/pangeo-data/pangeo/issues/567
https://github.com/dmlc/dlpack
__cuda_array_interface__: no-copy operations; could be good for evaluation in a climate model (see the sketch below)
Not so useful for on-disk data?
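A minimal sketch of the no-copy handoff these interfaces enable, assuming CuPy and PyTorch with CUDA support are installed:

import cupy
import torch
from torch.utils import dlpack

# Array produced on the GPU, e.g. a field computed inside a model step.
x_cupy = cupy.arange(10, dtype=cupy.float32)

# DLPack hands the same device buffer to PyTorch without copying.
x_torch = dlpack.from_dlpack(x_cupy.toDlpack())

# __cuda_array_interface__ describes the same buffer so that other
# libraries (Numba, RAPIDS, ...) can also consume it with no copy.
print(x_cupy.__cuda_array_interface__["data"])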
Blog post: loop in Tom, Stephan, and Ryan
Nice benchmarks for data loading
Toy problem for data loading: add a simple ML model, time the ratio of training to loading (see the sketch below)
Need to detect bottlenecks (check hardware)
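A toy sketch of the proposed loading-vs-training benchmark; load_batch and train_step are hypothetical stand-ins for real I/O and a real model:

import time
import numpy as np

def load_batch():
    # Hypothetical stand-in for reading a batch from disk (zarr, netCDF, ...).
    return np.random.rand(64, 128).astype("float32")

def train_step(batch, w):
    # Hypothetical stand-in for a simple model update.
    return batch @ w

w = np.random.rand(128, 1).astype("float32")
load_time = train_time = 0.0
for _ in range(100):
    t0 = time.perf_counter()
    batch = load_batch()
    t1 = time.perf_counter()
    train_step(batch, w)
    t2 = time.perf_counter()
    load_time += t1 - t0
    train_time += t2 - t1

# A ratio well above 1 means the pipeline is I/O-bound, not compute-bound.
print(f"loading / training time ratio: {load_time / train_time:.2f}")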
Next meeting: Monday, July 1, 9 AM