So you want to build a project using machine learning!
You might have little to no knowledge of how ML works, and you might have some idea of the kind of ML project you want to build, but you’re not sure how to actually get started with machine learning. This starter pack is designed to give you a quick overview and a bunch of resources to jumpstart your machine learning hack!
If you have any questions or feedback on this starter pack, tell us on Slack (DM Jazz!). If you have any technical questions along the way, be sure to ask mentors on our help queue.
Have fun! ❤
Jazz and all the Cal Hacks Directors
What’s machine learning?
Machine learning is a way to make predictions about data. Essentially, the term “machine learning” refers to a bunch of algorithms that allow you to “learn” the trends present in a data set, and then predict what a future data point might look like.
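To make "learn the trend, then predict a future data point" concrete, here is a minimal sketch in plain Python that fits a straight line to a handful of points and uses it to guess the next one. The data (hours of practice vs. quiz score) is made up for illustration, and `fit_line` is a hypothetical helper name, not part of any library. A least-squares line is about the simplest "model" there is; real projects usually lean on a library instead.

```python
def fit_line(xs, ys):
    """Learn the slope and intercept of the best-fit line (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data set: hours practiced vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [51, 62, 69, 80, 88]

# "Learning" the trend present in the data.
slope, intercept = fit_line(hours, scores)

# Predicting a data point we have not seen yet (6 hours of practice).
predicted = slope * 6 + intercept
print(f"Predicted score after 6 hours: {predicted:.1f}")
```

The "learning" here is just two numbers (slope and intercept), but the workflow is the same one bigger models follow: look at known data, capture the trend, apply it to something new.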
Here are some super simple, common examples of cases where machine learning is useful:
Recognizing handwritten text (OCR)
Deciphering the emotions present in tweets (sentiment analysis)
Filtering out unwanted messages (spam detection)
To build a machine learning project, you need three things:
A model that you want to train.
Training data that you use to teach your model.
Something you want to predict based on that training.
It helps to think of a machine learning model as a black box. You’re not sure what happens inside it, but you do know that if you toss in enough high-quality inputs as training data, the black box will eventually learn how to output correct values for new inputs.
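The three ingredients above (a model, training data, and something to predict) can be sketched with a toy spam detector in plain Python. Everything here is an illustrative assumption: `WordCountClassifier` is a made-up class, the four training messages are invented, and scoring by word overlap is a deliberately naive stand-in for what a real model would do.

```python
from collections import Counter


class WordCountClassifier:
    """A toy model: remembers how often each word shows up under each label."""

    def __init__(self):
        self.word_counts = {}  # label -> Counter of words seen with that label

    def train(self, examples):
        """Toss labeled (text, label) pairs into the black box."""
        for text, label in examples:
            counts = self.word_counts.setdefault(label, Counter())
            counts.update(text.lower().split())

    def predict(self, text):
        """Return the label whose training words best match the new text."""
        words = text.lower().split()

        def score(label):
            counts = self.word_counts[label]
            total = sum(counts.values())
            # Share of this label's training words that match the new text.
            return sum(counts[word] for word in words) / total

        return max(self.word_counts, key=score)


# 1. A model that you want to train.
model = WordCountClassifier()

# 2. Training data (invented for this sketch).
training_data = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("are we meeting for lunch", "not spam"),
    ("see you at the hackathon", "not spam"),
]
model.train(training_data)

# 3. Something you want to predict for inputs the model has never seen.
print(model.predict("claim your free money"))
print(model.predict("lunch at the hackathon"))
```

With only four training messages this model is easy to fool, which is the point of the black-box analogy: the more high-quality examples you feed in, the better the outputs tend to get.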
But what’s inside the black box? A ton of math and statistics :) These resources provide good descriptions, in order of least to most detail, of how classical machine learning and deep learning models work (deep learning is machine learning but with neural networks). Pick a few to at least skim over, and be sure to check out the Intro to Deep Learning workshop on Friday night at 1 a.m. for an in-depth explanation of deep learning!
Note that unless you have a good understanding of linear algebra and multivariable calculus, you should try to avoid the trap of spending too much time delving into the intricacies of things like backpropagation. Some of these resources can be intimidating and hard to grasp initially - that’s completely natural! Keep in mind that, while it’s beautiful and enlightening and Joy In Its Purest Form to understand how machine learning models work, you don’t need to grasp all or even most of this content to use machine learning. Tons of libraries and APIs exist so that you can leverage ML from a higher level of abstraction.
What ML hackathon projects can I make? 😮
Here are some cool ML-based hackathon projects people have made in the past for inspiration!
News Report: A web app that helps find the ground truth amongst differing news articles using NLP (Winner, Best ML Hack @ Cal Hacks 4.0)
Stage Hand: A web app which helps users improve their public speaking ability using speech recognition and sentiment analysis (Winner, Best Use of Azure or Microsoft Tech @ Cal Hacks 4.0)
SpotMe: An app + hardware hack which tracks and corrects athletes’ movements (Winner, 2nd Place Overall + Cal Hacks Fellowship @ Cal Hacks 4.0)
How can I get started quickly and easily?
Instead of spinning up your own ML models, you can capitalize on models that others have already trained. This is often the quickest and easiest way to implement machine learning solutions at a hackathon, especially if the machine learning task you need to solve as part of your project is common enough to have already been researched, your data set isn’t too niche, and/or you have never built your own machine learning models before.
If you’re on the fence about whether you should use a machine learning API, check whether your task falls into one of these common categories, and browse the capabilities of each API in the next list:
Language detection / translation
Speech recognition, text-to-speech, or speech-to-text
Below are APIs you can leverage to quickly and easily apply machine learning to your project: