Your organizers, Ursula Kaczmarek and Matthew Lean, are here if you need anything.
Restrooms are down the hall toward the elevator and to the right.
Find Alex to connect your GitHub profile to the JupyterHub server.
10:30-11:15 AM: Intro to Shiny with Jason Bryer
This workshop will introduce participants to the Shiny package in R. Shiny provides a framework for creating web applications in R. We will work through creating a simple Shiny application to analyze high school graduation rates for New York State.
1:30-2:15 PM: Open data sources with Matthew Lean
2:30-3:15 PM: Intro to machine learning with open data with Ursula Kaczmarek
This workshop will demonstrate a typical machine learning workflow in R using data from the NYS open data catalogue.
There’s a companion Jupyter notebook for Python users.
Project Hack Ideas
Open Source Software in Healthcare
Healthcare data is always a good one!
Doing good with data - using my skills toward something more fulfilling than marketing attribution modeling :)
Quandl, SEC EDGAR, Amazon Product Information
Interested in public health, sustainability, and environmental public data and how they are currently being used to drive innovation
State/county/municipal website audit: standards for accessibility (visual/audio)
Uncovering unexpected sources of open data and linking the data to insights: FOILed data, muckrock.com, LittleSis
Find public uses/collection of personal information like SSNs
Create a dataset that could be used to extend Dependency Check to include vulnerabilities beyond just those in the National Vulnerability Database (NVD). Sample vulnerability scraper for Chromium. Perhaps develop a GitHub scraper that collects vulnerability-related bugs for an arbitrary project.
Visualize NVD data feeds to understand which third-party libraries are “covered” and where vulnerability data needs to be collected
Efficacy of education dollars spent
NYSERDA green building initiative
Mashup with some form of public health data
Visualize water quality alerts following wet weather
Harvest data out of unstructured corporate 10-K annual reports to do… something