Modular Typeshed

Agenda

  • Vision for typeshed
  • Which modules should be in typeshed?
  • Mass-generating stubs using stubgen
  • Scaling typeshed to thousands of packages

Monolithic Typeshed

  • Entire typeshed is shipped as a single package
  • A buggy stub can be a big problem for users
  • Slow iteration speed on fixes
  • Can’t pin to a particular version of a stub

Monolithic Typeshed (continued)

  • Can’t have stubs for multiple versions of a package
  • Users of old package versions are out of luck
  • Users may be stuck with old versions of type checkers due to typeshed
  • Problems will only get worse if we grow typeshed aggressively

Vision for Typeshed

  • Modular typeshed: separate PyPI package for each stub package (PEP 561)
  • Typical Python user can find stubs for every package they use
  • Stubs for thousands of popular 3rd party packages available
  • Fast iteration speed: new stub package ~immediately available after PR has been merged
  • Bugs in stubs do less damage: fixes ship quickly and users can pin a known-good version
  • Move fast & fix things
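As a rough illustration of what one stub package per library could look like, here is a minimal sketch of the packaging metadata for a PEP 561 stub-only package. PEP 561 requires the stub directory to be named after the runtime package with a "-stubs" suffix; the "types-" distribution prefix, the helper name stub_package_metadata, and the package name "examplepkg" are assumptions made for this example only.

```python
def stub_package_metadata(runtime_package: str, version: str) -> dict:
    """Build setuptools keyword arguments for a PEP 561 stub-only package.

    Hypothetical helper for illustration; not part of typeshed or setuptools.
    """
    # PEP 561: a stub-only package is named "<runtime package>-stubs".
    stub_dir = f"{runtime_package}-stubs"
    return {
        # Assumed naming scheme for the PyPI distribution.
        "name": f"types-{runtime_package}",
        "version": version,
        "packages": [stub_dir],
        # Ship the .pyi files, which setuptools does not include by default.
        "package_data": {stub_dir: ["*.pyi"]},
    }


# A setup.py for the stub package could then contain:
#
#     from setuptools import setup
#     setup(**stub_package_metadata("examplepkg", "0.1.0"))
```

With a layout like this, each stub package gets its own version number, so a project could pin or roll back the stubs for a single library without touching anything else.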

Challenge 1: Which Packages to Include

  • Thousands of PyPI packages 
  • We can’t just generate stubs for all packages in one go (or ever)
  • Many PyPI packages are used by at most a few projects
  • Some PyPI packages are used by many thousands of users

🤔 How can we decide which packages are the most useful?

Idea: PyPI Download Counts

  • Stats on top downloaded PyPI packages are readily available
  • Problems:
      • Many packages only have high counts because a popular package depends on them
      • What about Conda, deb/rpm packages, and private pip repositories?
      • Counts inflated by big CI jobs
      • Counts inflated by VM provisioning

Idea: Search Hits on GitHub 

  • Find all "from pkg import" and "import pkg" statements across GitHub
  • Problems:
      • Identical code copied over many repositories
      • Packages with generic names that are hard to search for
      • Low-value repositories
      • Legacy code and obsolete packages
      • Proprietary code missing
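The import search can be approximated offline on a set of already-downloaded source files. The sketch below is not the GitHub search described above: it swaps in Python's ast module to count how many distinct files import each top-level package, and it skips byte-identical files to soften the copied-code problem. The function names are hypothetical.

```python
import ast
import hashlib
from collections import Counter
from typing import Iterable, Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level package names imported by one source file."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" counts as a hit for package "a".
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) point into the repository itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def count_import_hits(sources: Iterable[str]) -> Counter:
    """Count, per package, the number of distinct files that import it."""
    hits = Counter()
    seen = set()
    for source in sources:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # identical file copied into another repository
        seen.add(digest)
        try:
            hits.update(top_level_imports(source))
        except SyntaxError:
            continue  # unparsable file (for example, Python 2 code)
    return hits
```

Exact-hash deduplication only catches verbatim copies; lightly edited forks, low-value repositories, and legacy code would still need separate filtering.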

Idea: PyPI Downloads + GitHub Search

  • Select libraries with N+ PyPI downloads and M+ search hits on GitHub
  • Results look good 👍 
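The combined filter amounts to requiring both signals at once. A minimal sketch, assuming the download counts and search-hit counts have already been collected into dictionaries keyed by package name; the function name select_packages is hypothetical and the thresholds stand in for the unspecified N and M.

```python
from typing import Dict, List


def select_packages(
    downloads: Dict[str, int],
    github_hits: Dict[str, int],
    min_downloads: int,  # plays the role of N
    min_hits: int,       # plays the role of M
) -> List[str]:
    """Return packages that pass both thresholds, most downloaded first."""
    selected = [
        pkg
        for pkg, count in downloads.items()
        # A package missing from the search results counts as zero hits.
        if count >= min_downloads and github_hits.get(pkg, 0) >= min_hits
    ]
    return sorted(selected, key=lambda pkg: downloads[pkg], reverse=True)
```

Requiring both signals filters out packages that are downloaded heavily only as transitive dependencies (many downloads, few direct imports) as well as packages whose hits come mostly from copied code (many hits, few downloads).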

Challenge 2: Generating Baseline Stubs

  • Goal: generate baseline stubs for 100 top packages