🕘 Meeting notes: 2019-04-29 ML WG

Apr 29, 2019


Attendees

Ryan A / Columbia, LDEO / rabernat
Noah Brenowitz / UW / nbren12
Joe Hamman
Jim Bednar / Anaconda / jbednar
David Gagne / UCAR /


Agenda

  • TFRecords vs. PyTables


Discussion

Ryan’s example

Learn geostrophic balance on a lat/lon grid
CNN didn’t work
Augmented data with 3D coordinates (x, y, z); struggled with periodic boundary conditions
xbatcher
Many samples (millions)
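
A minimal sketch of the coordinate augmentation mentioned above, assuming the standard mapping from lat/lon to points on the unit sphere. The notes do not record how Ryan actually built the features, so the function name and the 2-degree grid are hypothetical. The point of the mapping is that longitudes 0 and 360 land on the same (x, y, z), so the inputs carry no seam at the periodic boundary.

```python
import numpy as np

def latlon_to_xyz(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to Cartesian points on the unit sphere."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)
    return x, y, z

# Hypothetical 2-degree global grid (not the grid used in the meeting example).
lat = np.arange(-89.0, 90.0, 2.0)
lon = np.arange(0.0, 360.0, 2.0)
lon2d, lat2d = np.meshgrid(lon, lat)

# Three extra input channels, shape (3, nlat, nlon), to stack next to the data.
coords = np.stack(latlon_to_xyz(lat2d, lon2d), axis=0)
```

This only removes the discontinuity in the coordinate features; a CNN's convolution padding at the longitude edges would still need separate handling.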

Jim

As part of new NASA/Anaconda project, wanting to collect workflows to optimize as best practices.  Will be monitoring the ml-workflow-examples and adapting them if appropriate.

David

Interpretability, partial dependence
Supercells
U, V, T → 1 km vorticity
CSV data (84 MB)
model agnostic
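
A rough sketch of what model-agnostic partial dependence looks like, assuming only that the model exposes a predict function. This is not David's code; the toy model, the random data, and all names are hypothetical. One feature is pinned to each grid value in turn, the other features are left as observed, and the predictions are averaged over samples.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction as one feature is swept over `grid`, others left as observed."""
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value          # pin the chosen feature for every sample
        pd_values.append(predict(X_mod).mean())
    return np.array(pd_values)

# Hypothetical stand-in for a trained model: any callable on a 2D array works.
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))             # synthetic inputs, e.g. three predictors
grid = np.linspace(-1.0, 1.0, 5)
pd_u = partial_dependence(toy_predict, X, feature=0, grid=grid)
```

Because only `predict` is called, the same function applies to any model type.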

Questions:
How will dask do on bigger problems?
Could be made data parallel by changing order of “means”
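
One possible reading of the “means” remark, sketched under that assumption: if the quantity is an average of model predictions over samples, each chunk of samples can contribute a partial sum and a count, and the division happens once at the end. Chunks could then be processed independently (for example by dask workers). All names and the toy model are hypothetical.

```python
import numpy as np

def chunked_partial_dependence(predict, chunks, feature, grid):
    """Accumulate per-chunk prediction sums, then divide once by the total sample count."""
    totals = np.zeros(len(grid))
    count = 0
    for chunk in chunks:                   # each chunk is independent of the others
        for i, value in enumerate(grid):
            modified = chunk.copy()
            modified[:, feature] = value
            totals[i] += predict(modified).sum()
        count += len(chunk)
    return totals / count

# Hypothetical stand-in model and synthetic data.
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
grid = np.linspace(-1.0, 1.0, 5)

pd_chunked = chunked_partial_dependence(toy_predict, np.array_split(X, 4), 0, grid)
pd_full = chunked_partial_dependence(toy_predict, [X], 0, grid)  # single chunk for comparison
```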

Ryan: TensorFlow didn’t like dask-based DataLoaders

Action items
