Open Data Day Albany/Troy 2018 #OpenDataDay

This hackpad: http://tiny.cc/odd18

Need help?

  • Your organizers, Ursula Kaczmarek and Matthew Lean, are here if you need anything.
  • Restrooms are down the hall toward the elevator and to the right. 
  • Find Alex to connect your GitHub profile to the JupyterHub server.

Workshops

10:30-11:15 AM:  Intro to Shiny with Jason Bryer
  • This workshop will introduce participants to the Shiny package in R. Shiny provides a framework for creating web applications in R. We will work through creating a simple Shiny application to analyze high school graduation rates for New York State.

1:30-2:15 PM: Open data sources with Matthew Lean

2:30-3:15 PM: Intro to machine learning with open data with Ursula Kaczmarek
  • This workshop will demonstrate a typical machine learning workflow in R using data from the NYS open data catalogue. 
  • There’s a companion Jupyter notebook for Python users.
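
For Python users who want a feel for the shape of such a workflow before the session, here is a minimal sketch using only the standard library. It is not the workshop's code or the companion notebook: the toy rows are made up, the nearest-centroid model is just a stand-in for whatever the workshop uses, and every function name is hypothetical. The steps are the usual ones: split the data, fit on the training part, score on the held-out part.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Average the feature vectors of each label (a nearest-centroid model)."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of held-out rows whose predicted label matches the true label."""
    correct = sum(predict(centroids, features) == label for features, label in test)
    return correct / len(test)


# Made-up toy rows, NOT real NYS data: ([feature_1, feature_2], label)
toy_rows = [
    ([1.0, 1.2], "low"), ([0.8, 1.0], "low"), ([1.1, 0.9], "low"), ([0.9, 1.1], "low"),
    ([5.0, 5.2], "high"), ([4.8, 5.1], "high"), ([5.2, 4.9], "high"), ([5.1, 5.0], "high"),
]

train, test = train_test_split(toy_rows)
model = fit_centroids(train)
print("held-out accuracy:", accuracy(model, test))
```

With a real dataset you would replace toy_rows with rows loaded from a file and would likely reach for a proper library rather than hand-written helpers.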

Project Hack Ideas

  • Open Source Software in Healthcare
  • healthcare data is always a good one!
  • Doing good with data - using my skills toward something more fulfilling than marketing attribution modeling :)
  • Using TensorFlow
  • Quandl, SEC EDGAR, Amazon Product Information
  • Interested in public health, sustainability, and environmental public data and how they are currently being used to drive innovation
  • State/county/municipal website audit:  standards for accessibility (visual/audio)
  • Uncovering unexpected sources of open data and linking the data to insights: FOILed data, muckrock.com, lilSis
  • usaspending.gov
  • Cybersecurity
  • Find public uses/collection of personal information like SSNs
  • Create dataset that could be used to extend Dependency Check to include vulnerabilities beyond just those in the National Vulnerability Database (NVD). Sample vulnerability scraper for Chromium. Perhaps develop a GitHub scraper that collects vulnerability-related bugs for an arbitrary project
  • Visualize NVD data feeds to understand which third-party libraries are “covered” and where vulnerability data needs to be collected
  • efficacy of education dollars spent
  • NYSERDA green building initiative 
  • Water quality
  • Mashup with some form of public health data
  • Visualize water quality alerts following wet weather
  • Harvest data out of unstructured corporate 10-K annual (or 10-Q quarterly) reports to do… something
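
As a possible starting point for the NVD coverage idea in the list above, here is a rough sketch that counts how often each (vendor, product) pair shows up in a list of CPE 2.3 strings and reports which of a project's dependencies never appear. It assumes the CPE strings have already been pulled out of an NVD data feed (loading the feed is not shown). The sample strings, vendor and product names, and function names are all made up for illustration. Counts are CPE entries, not distinct vulnerabilities.

```python
from collections import Counter


def parse_cpe(cpe):
    """Return (vendor, product) from a CPE 2.3 formatted string, or None if malformed.

    Simplification: splits on ':' and does not handle escaped colons.
    """
    parts = cpe.split(":")
    if len(parts) < 5 or parts[0] != "cpe" or parts[1] != "2.3":
        return None
    # Field order: cpe, 2.3, part, vendor, product, version, ...
    return parts[3], parts[4]


def coverage_counts(cpes):
    """Count CPE entries per (vendor, product), skipping malformed strings."""
    counts = Counter()
    for cpe in cpes:
        key = parse_cpe(cpe)
        if key is not None:
            counts[key] += 1
    return counts


def uncovered(dependencies, counts):
    """List the dependencies that have no CPE entries at all."""
    return [dep for dep in dependencies if counts[dep] == 0]


# Made-up sample input, not taken from a real feed.
sample_cpes = [
    "cpe:2.3:a:examplevendor:libfoo:1.0.0:*:*:*:*:*:*:*",
    "cpe:2.3:a:examplevendor:libfoo:1.1.0:*:*:*:*:*:*:*",
    "cpe:2.3:a:othervendor:libbar:2.3.1:*:*:*:*:*:*:*",
    "not-a-cpe",
]
# Hypothetical dependency list for some project, as (vendor, product) pairs.
project_deps = [
    ("examplevendor", "libfoo"),
    ("othervendor", "libbar"),
    ("thirdvendor", "libbaz"),
]

counts = coverage_counts(sample_cpes)
for dep in project_deps:
    print(dep, counts[dep])
print("no NVD entries found for:", uncovered(project_deps, counts))
```

The per-product counts could feed a bar chart, and the uncovered list points at libraries where vulnerability data would have to be collected from other sources.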