[Workshop] How to Build Great Data Science Projects - Part 1
Use this document as a guide and a template for conceptualizing, planning and working on your data science project. Use the “Create Doc” button above to start with a fresh copy of this doc. Feel free to delete any sections or explanations to streamline your doc.
For the purpose of this workshop, we’ll use the word project to refer to an original piece of work that takes at least 20-30 hours and produces a substantial output in one or more of these forms: Jupyter notebook, code repository, project report, blog post, web application, presentation or video walkthrough.
What are the steps involved in building a great project?
1. Finding a project topic/idea/domain
2. Finding a dataset for your project
3. Preparing an outline and setting deadlines
4. Executing and iterating on your project
5. Project Documentation & Presentation
6. Showcasing your Project on your Resume/LinkedIn
7. Maximizing your reach and improving your project
In Part 1 of this workshop, we’ll cover steps 1 to 4. Let’s get started!
Step 1: Finding a Project Topic/Idea/Domain
It’s OK to feel lost when you’re trying to figure out what topic you should build your next/first project on. Generally speaking, it should be approachable while being just a bit challenging, so that you can both complete the project and learn something.
What should be the topic of your project?
Something you have learned and want to practice e.g. Data Analysis with Python
For every course you take, you should have a project, ideally 2-3.
Something you are interested in learning e.g. scikit-learn, plotly etc.
It’s OK if you don’t already know the topics; you can learn while doing the project.
Something that will fill a gap in your portfolio/Resume e.g. model deployment
Go through your Resume, or check out some job listings on LinkedIn, to spot skills you haven’t demonstrated yet
Something interesting you came across and want to replicate with a different dataset
Topics for Projects in Data Science & Machine Learning
Exploratory Data Analysis of a dataset
Data Visualization with Seaborn, Plotly, Folium (maps)
Supervised machine learning (regression/classification) on tabular data
Unsupervised machine learning (clustering/recommendations) on tabular data
Deep Learning on images (Computer vision)
Deep Learning for natural language processing
Data cleaning and feature engineering
Reinforcement learning
Dataset creation using web scraping or official APIs
Model deployment using Flask & Heroku
Creating a web/mobile application powered by machine learning
Participating in an active/completed data science competition on Kaggle etc.
A practical tutorial on any topic related to machine learning
Implementing a paper on a different dataset
Create and publish a Python library with utilities, models, functions etc.
Apply any of the above to a specific domain e.g. energy, astronomy, Covid-19
Over the course of your learning journey, try to cover most, if not all, of the above topics (a single project can cover multiple topics).
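To make the idea of one project covering multiple topics concrete, here is a minimal sketch, not a template for a real project. It touches data cleaning, exploratory analysis and supervised regression using only the Python standard library. The dataset (`raw_rows`) and the helper names `clean`, `summarize` and `fit_line` are hypothetical, made up purely for illustration. A real project would typically use a real dataset and libraries such as pandas or scikit-learn.

```python
import statistics

# Hypothetical toy data: (hours_studied, exam_score); None marks a missing value.
raw_rows = [(1.0, 52.0), (2.0, 55.0), (3.0, None),
            (4.0, 65.0), (5.0, 70.0), (6.0, 74.0)]


def clean(rows):
    """Data cleaning: drop rows that contain a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def summarize(values):
    """Exploratory analysis: basic summary statistics for one column."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def fit_line(rows):
    """Supervised learning: least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_mean, y_mean = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


rows = clean(raw_rows)
summary = summarize([y for _, y in rows])
slope, intercept = fit_line(rows)

print(summary)
print(f"score ~ {slope:.2f} * hours + {intercept:.2f}")
```

Each function maps to one topic from the list above, so even a small project like this can contribute to several areas of your portfolio.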
Inspiration for interesting projects
Whatever topic you have in mind, it’s quite likely that you can find projects done by others on the same topic. Look through 10-12 projects for inspiration before you finalize your topic.