This breakout will focus on the datasets, infrastructure, and tools that make a hackathon successful.
Participants
(please add yourself to the table!)
Name | Organization | Email | Twitter
Dave Goodsmith | DataScience | dave@datascience.com | thegoodsmith, datascienceinc
Facilitators or a volunteer from the group will report out (4:00 PM - Challenges/Lessons Learned, and 4:45 PM - Curating and Building Solutions)
Team countries: France, Kenya, USA
Team sectors: universities, research center (non-profit), industry
Challenges and Lessons Learned:
Challenge: the tools teams planned to use were not adequate for the problems they were trying to address.
Challenge: last-minute glitches and unexpected changes in the network. Lesson learned: have failover plans.
Challenge: finding datasets. Data scientist = data hunter. Lesson learned: it took one team a week to find the right data to use. Don’t assume that open datasets are freely available. Public datasets are not always robust enough.
Challenge: having ground truth for analysis. Have a domain expert validate the research.
Challenge: organizations collect datasets without realizing that the data is meaningless.
Challenge: how to get companies to share data, e.g. network data or intellectual property/patent data.
Lesson learned: have really good administrative staff on hand to solve issues, e.g. network issues.
Lesson learned: use a virtual, collaborative document tool, e.g. Etherpad. It gives the team commonality and structure.
Lesson learned: follow best practices from the Dept. of Commerce hackathon. Give clear guidelines for output. Have staff who vet the projects. A hackathon can be a way to demonstrate what is cool about an API. See commerce.gov/datausability
Challenge: can’t get the type of data you want. Lesson learned: create a synthetic dataset and provide an end-to-end set of instructions for using it.
Lesson learned: a hackathon is a way to recruit/select data scientists (highervue).
Lesson learned: make predictions using fun pop-culture data, e.g. predict which Taylor Swift song will go viral.
Lesson learned: participants can sometimes cheat without breaking the rules you’ve defined.
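Sketch for the synthetic dataset lesson above: one possible way to generate a small, reproducible synthetic dataset when real data cannot be shared. Everything here is an assumption made for illustration (the network-flow theme, the column names, the 5% anomaly rate, and the function names make_synthetic_records and to_csv); none of it comes from the session. The labeled is_anomaly column doubles as a simple ground truth for participants to score against.

```python
import csv
import io
import random

# Hypothetical column layout for illustration only.
FIELDNAMES = ["flow_id", "src_host", "dst_host", "bytes_sent", "is_anomaly"]


def make_synthetic_records(n_rows, seed=0, anomaly_rate=0.05):
    """Return n_rows fake network-flow records as a list of dicts.

    A fixed seed makes the dataset reproducible, so every team
    starts from exactly the same data.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n_rows):
        is_anomaly = rng.random() < anomaly_rate
        # Anomalous flows send far more data than normal ones (assumed values).
        mean_bytes = 50_000 if is_anomaly else 2_000
        records.append({
            "flow_id": i,
            "src_host": f"host-{rng.randint(1, 20)}",
            "dst_host": f"host-{rng.randint(1, 20)}",
            "bytes_sent": max(0, int(rng.gauss(mean_bytes, mean_bytes * 0.2))),
            "is_anomaly": int(is_anomaly),
        })
    return records


def to_csv(records):
    """Serialize the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    csv_text = to_csv(make_synthetic_records(1000, seed=42))
    with open("synthetic_flows.csv", "w", newline="") as f:
        f.write(csv_text)
```

Publishing the generator script next to the CSV is one way to cover the "end-to-end set of instructions": participants can see exactly how the data was made and regenerate it with the same seed.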
Resources:
RESOURCE | TESTIMONIALS/COMMENTS (including “how might we build / scale this?”)
Cloudera |
Data Robot |
Tips/how-to guide or other solutions and approaches to the challenges mentioned: