NLP Systems Curriculum
At a high level, I just want to be as useful as possible. AI is ascendant, NLP has a world of value left to unlock, and I prefer engineering to research, so that means focusing on systems engineering, data engineering, and NLP. I want to be the guy who can conceptualize an NLP application, implement the whole system, assemble the dataset, and train the models. I’ll know I’ve succeeded when small, elite teams (i.e., successful ex-founders) want me to join them as a co-founder.

For systems and data engineering, a mentor recommended that I continue focusing on the relevant computer science (databases, operating systems, and distributed systems) and then learn subject-specific tools.


  • Mining of Massive Datasets.

Everything below is deprecated.

Resources wanted

  • Subject-specific tools. Airflow or Luigi? Spark?
  • Plus the best resources for these tools
  • More data science
  • NLP projects sequence

Would these be useful?

  • The Little Book of Semaphores. I’ve never gone deep on concurrency before, and this book is the best resource I’ve found.


  • Machine learning performance resources (whatever Yaroslav has been reading)?