- How are partitions represented to solids?
- config / environment dict as configured by user
- context resource
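The two representation options above can be contrasted with a small sketch. This is plain Python with hypothetical names (`Partition`, `Context`, `config_for_partition`, `ingest`), not the actual Dagster API; it only illustrates the shape of each option.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Partition:
    # One partition of the data, e.g. one day of logs. Hypothetical type.
    name: str
    value: Any

# Option 1: the partition is baked into the config/environment dict
# that the user (or a selector) constructs before the run starts.
def config_for_partition(partition: Partition) -> Dict[str, Any]:
    return {"solids": {"ingest": {"config": {"date": partition.value}}}}

# Option 2: the partition rides along on the context, like a resource,
# so any solid can read it without threading it through config.
@dataclass
class Context:
    resources: Dict[str, Any] = field(default_factory=dict)

def ingest(context: Context) -> str:
    return f"ingesting data for {context.resources['partition'].value}"
```

Option 1 keeps solids ignorant of partitioning; option 2 makes the partition available everywhere but couples solids to a partition-aware context.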
- How do we group runs for the same partition?
- How do we group runs across partitions for the same pipeline?
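Both groupings fall out naturally if each run record carries its pipeline name and a partition tag. A minimal sketch, assuming a run is just a dict with `"pipeline"` and `"partition"` keys (hypothetical schema, not the real run storage model):

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Run = Dict[str, str]  # e.g. {"pipeline": "logs", "partition": "2019-01-01"}

def runs_by_partition(runs: List[Run]) -> Dict[Tuple[str, str], List[Run]]:
    # Runs for the same partition: one bucket per (pipeline, partition).
    groups: Dict[Tuple[str, str], List[Run]] = defaultdict(list)
    for run in runs:
        groups[(run["pipeline"], run["partition"])].append(run)
    return dict(groups)

def runs_by_pipeline(runs: List[Run]) -> Dict[str, List[Run]]:
    # Runs across partitions: everything for a pipeline, partition ignored.
    groups: Dict[str, List[Run]] = defaultdict(list)
    for run in runs:
        groups[run["pipeline"]].append(run)
    return dict(groups)
```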
- Where do we need to specify the set of possible partitions?
- pipeline definition vs schedule definition vs standalone
- We should shoot for standalone; get the schedule definition right first, then work on hacking around the dagit UI
- How do we specify execution of a partition?
- Where do we do partition selection?
- explicit selector on schedule definitions
- Figure out how to resolve config/environment dict based on partition
- Look at presets?
- Figure out execution API for partition
- for now, using tags
- Figure out whether we should allow non-partitioned jobs for partitioned pipelines, and where partition selection should happen in that case
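Tying the pieces above together: a schedule tick picks a partition via an explicit selector, resolves the config/environment dict from that partition, and records the partition as a run tag. A plain-Python sketch under those assumptions (the function and tag names are hypothetical, not settled API):

```python
from typing import Any, Callable, Dict, List

Partition = str  # e.g. a date string like "2019-01-01"

def schedule_tick(
    partitions: List[Partition],
    selector: Callable[[List[Partition]], Partition],
    config_fn: Callable[[Partition], Dict[str, Any]],
) -> Dict[str, Any]:
    # One schedule tick: select a partition, resolve the environment
    # dict for it, and tag the run with the chosen partition.
    partition = selector(partitions)
    return {
        "environment_dict": config_fn(partition),
        "tags": {"partition": partition},
    }

# An explicit selector on the schedule definition: run the latest partition.
def last_partition(partitions: List[Partition]) -> Partition:
    return partitions[-1]
```

The tag is what lets us group and query runs by partition later without changing run storage.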
- Support time-based partitions (variable-sized)
- Support fixed partitions (ML-style)
- Support execution of a partition through dagit
- Support execution of a partition through a scheduler
- Support execution of a batched set of partitions (backfill)
- Take in config to designate partition
- View job/run status by partition
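The feature list above mixes three partition shapes: time-based variable-sized partitions, fixed ML-style partitions, and batched execution (backfill). A sketch of each, in plain Python with hypothetical names; calendar months illustrate "variable-sized" because months have different lengths:

```python
from datetime import date, timedelta
from typing import Dict, List

def monthly_partitions(start: date, end: date) -> List[str]:
    # Time-based, variable-sized: one partition per calendar month.
    months = []
    current = date(start.year, start.month, 1)
    while current < end:
        months.append(current.strftime("%Y-%m"))
        # Jump past the end of the month, then snap to the 1st.
        current = (current.replace(day=28) + timedelta(days=4)).replace(day=1)
    return months

# Fixed, ML-style partitions: a static, enumerated set.
FIXED_PARTITIONS = ["train", "validation", "test"]

def backfill(partitions: List[str]) -> List[Dict[str, object]]:
    # A backfill launches one run per partition in the batch,
    # tagging each run with its partition.
    return [{"partition": p, "tags": {"partition": p}} for p in partitions]
```

Viewing run status by partition then reduces to grouping runs on the partition tag.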
Add a partition definition function to PipelineDefinition
- Pro: natural, since the pipeline author already knows how to partition the data
- Con: risk of overfitting by making Partition too prominent in the core abstraction
Put partition definition function on a ScheduleDefinition
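The two placements can be sketched side by side. Plain Python with hypothetical dataclasses standing in for the real definitions; only the location of `partition_fn` differs:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Option A: the pipeline itself declares its partitions.
@dataclass
class PipelineDefinition:
    name: str
    partition_fn: Optional[Callable[[], List[str]]] = None

# Option B: partitioning lives on the schedule that runs the pipeline,
# keeping Partition out of the core pipeline abstraction.
@dataclass
class ScheduleDefinition:
    pipeline: PipelineDefinition
    partition_fn: Callable[[], List[str]] = list
```

Option A lets any consumer (dagit, backfills, ad hoc runs) discover partitions from the pipeline alone; option B keeps the pipeline abstraction lean but means standalone (non-scheduled) execution has no partition source.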