ML WG #3

Attendance

  • Noah
  • David
  • Stephan
  • Tom

Agenda

  • Progress updates

Notes

  • Distributed training: Horovod
  • Noah: workflow
  • Cloud training
  • Stephan: uses GPUs at Paperspace and AWS (several environments)
  • Managing environments with TensorFlow is annoying
  • Tom: conda should improve this
  • Need for intermediate models
  • Hard-coded NN in Fortran code (hard to change)
  • Online learning. Python climate model interface
  • Streamz is being maintained at Anaconda (Martin)
  • would interface with Kafka (and other data sources)
  • provide higher level API
  • How suitable is it for deep learning pipelines?
  • Data loading issue
  • __cuda_array_interface__: no-copy operations. Could be good for evaluation in climate model
  • Not so useful for on-disk?
  • Blog post: loop in Tom, Stephan, and Ryan.
  • nice benchmarks for data loading
  • Toy problem for data loading: add a simple ML model and measure the ratio of training time to loading time
  • need to detect bottlenecks (check hardware).
  • Next meeting: Monday, July 1, 9 AM
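
Sketch for the data-loading toy problem noted above (not discussed in this form at the meeting): a minimal, standard-library-only Python example that times loading and training separately and reports their ratio. Everything in it is an assumption for illustration: the names load_batches, sgd_step, and benchmark are hypothetical, the sleep call stands in for real disk or network reads, and a one-variable linear regression stands in for the "simple ML model". A real benchmark would swap in the actual data source and framework.

```python
import random
import time


def load_batches(n_batches, batch_size, io_delay_s):
    """Yield synthetic (xs, ys) batches, sleeping to imitate storage latency."""
    rng = random.Random(0)
    for _ in range(n_batches):
        time.sleep(io_delay_s)  # assumption: stand-in for disk or network reads
        xs = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]
        ys = [3.0 * x + 1.0 for x in xs]  # known linear target: w=3, b=1
        yield xs, ys


def sgd_step(w, b, xs, ys, lr):
    """One mini-batch gradient step for y = w*x + b under mean squared error."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


def benchmark(n_batches=50, batch_size=256, io_delay_s=0.002, lr=0.1):
    """Return (load_seconds, train_seconds, w, b) for one pass over the data."""
    w, b = 0.0, 0.0
    load_s = 0.0
    train_s = 0.0
    batches = load_batches(n_batches, batch_size, io_delay_s)
    while True:
        t0 = time.perf_counter()
        batch = next(batches, None)  # time spent here counts as loading
        t1 = time.perf_counter()
        if batch is None:
            break
        load_s += t1 - t0
        xs, ys = batch
        w, b = sgd_step(w, b, xs, ys, lr)  # time spent here counts as training
        train_s += time.perf_counter() - t1
    return load_s, train_s, w, b


if __name__ == "__main__":
    load_s, train_s, w, b = benchmark()
    print(f"loading:  {load_s:.4f} s")
    print(f"training: {train_s:.4f} s")
    print(f"training/loading ratio: {train_s / load_s:.2f}")
```

A ratio well below 1 would suggest the loader is the bottleneck; a ratio well above 1 would point at compute. The numbers depend on the hardware, so they are only comparable between runs on the same machine.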