#434 Most of OpenAI’s tech stack runs on Python

2025/6/2
Python Bytes

People
Brian Okken
Michael Kennedy
Topics
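Brian Okken: I shared an article about speeding up PyPI's test suite. It shows that running tests in parallel with pytest-xdist, using Python 3.12's sys.monitoring to speed up coverage, optimizing test discovery, and eliminating unnecessary imports can make tests significantly faster. My takeaway is that pytest is well suited to large projects, and that a few optimization techniques can improve its performance further. For example, with pytest-xdist you need to watch out for database isolation, and pytest-sugar can improve the output. In addition, the -p no:<plugin> option can disable unnecessary plugins and reduce import overhead. I plan to write an article or series on speeding up pytest test suites to share more practical tips.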

Chapters
Trail of Bits significantly sped up PyPI's test suite using several techniques. Key improvements came from parallelizing tests with pytest-xdist, leveraging Python 3.12's sys.monitoring for faster coverage, optimizing test discovery, and eliminating unnecessary imports.
  • PyPI's test suite improved from 163 seconds to 30 seconds.
  • pytest-xdist enabled a 67% time reduction by utilizing multiple cores.
  • Python 3.12's sys.monitoring and COVERAGE_CORE=sysmon resulted in a 53% time reduction.
  • Optimizing test discovery and eliminating unnecessary imports further enhanced performance.

Topics covered in this episode:

- [**Making PyPI’s test suite 81% faster**](https://blog.trailofbits.com/2025/05/01/making-pypis-test-suite-81-faster/?featured_on=pythonbytes)

Watch on YouTube

About the show

Sponsored by Digital Ocean: pythonbytes.fm/digitalocean-gen-ai. Use code DO4BYTES and get $200 in free credit.

Connect with the hosts

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions are available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list. We'll never share it.

Brian #1: Making PyPI’s test suite 81% faster

  • Alexis Challande

  • The PyPI backend is a project called Warehouse

  • It’s tested with pytest, and it’s a large project, thousands of tests.

  • Steps for speedup

  • Parallelizing test execution with pytest-xdist

  • 67% time reduction

  • --numprocesses=auto allows for using all cores

  • DB isolation: a cool example of how to configure Postgres to give each test worker its own DB

  • They used pytest-sugar to help with visualization, as xdist defaults to quite terse output

  • Use Python 3.12’s sys.monitoring to speed up coverage instrumentation

  • 53% time reduction

  • Nice example of using COVERAGE_CORE=sysmon

  • Optimize test discovery

  • Always use testpaths

  • Sped up collection time. 66% reduction (collection was 10% of time)

  • Not a huge savings, but it’s 1 line of config

  • Eliminate unnecessary imports

  • Use python -X importtime

  • Examine dependencies not used in testing.

  • Their example: ddtrace

  • A tool they use in production, but it also has a couple pytest plugins included

  • Those plugins caused ddtrace to get imported

  • Using -p no:ddtrace turns off the plugin bits
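The DB isolation step can be sketched roughly as follows. This is not the article's Postgres configuration, only a minimal illustration of the idea of one database per worker. It relies on the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in each worker process; the function name and the `app_test` base name are hypothetical.

```python
import os


def worker_database_name(base="app_test"):
    """Return a database name unique to the current pytest-xdist worker.

    pytest-xdist sets PYTEST_XDIST_WORKER (for example "gw0", "gw1") in each
    worker process. The variable is absent when tests run without xdist, in
    which case the base name is returned unchanged.

    `base` is a hypothetical name chosen for illustration.
    """
    worker = os.environ.get("PYTEST_XDIST_WORKER")
    return f"{base}_{worker}" if worker else base
```

A session-scoped fixture could call this once per worker and create or connect to the resulting database, so parallel workers never write to the same tables.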

  • Notes from Brian:

  • I often get questions about whether pytest is useful for large projects.

  • Short answer: Yes!

  • Longer answer: But you’ll probably want to speed it up

  • I need to extend this article with a general purpose “speeding up pytest” post or series.

  • -p no:<plugin> can also be used to turn off any plugin, even built-in ones.

  • Examples include

  • Nice-to-have, developer-focused pytest plugins that may not be necessary in CI

  • CI reporting plugins that aren’t needed by devs running tests locally
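As a rough sketch, the flags from this item can be assembled into one pytest invocation. The helper name and its parameters are hypothetical. It assumes pytest-xdist is installed (for `--numprocesses=auto`) and that coverage is collected by coverage.py, which reads `COVERAGE_CORE`.

```python
import os
import sys


def build_pytest_run(disabled_plugins=(), parallel=True, version_info=sys.version_info):
    """Return (command, env) for a pytest run using the options discussed above.

    Hypothetical helper for illustration; it only builds the command and
    environment and does not start pytest itself.
    """
    command = [sys.executable, "-m", "pytest"]
    if parallel:
        # Provided by pytest-xdist: one worker per available core.
        command.append("--numprocesses=auto")
    for name in disabled_plugins:
        # "-p no:NAME" prevents pytest from loading the named plugin.
        command += ["-p", f"no:{name}"]

    env = dict(os.environ)
    if version_info >= (3, 12):
        # sys.monitoring exists from Python 3.12 on; coverage.py uses it
        # when COVERAGE_CORE is set to "sysmon".
        env["COVERAGE_CORE"] = "sysmon"
    return command, env
```

A CI job and a local run could pass different `disabled_plugins` lists and then hand the result to `subprocess.run(command, env=env)`.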

Michael #2: People aren’t talking enough about how most of OpenAI’s tech stack runs on Python

  • Original article: Building, launching, and scaling ChatGPT Images

  • Tech stack: The technology choices behind the product are surprisingly simple; dare I say, pragmatic!

  • Python: most of the product’s code is written in this language.

  • FastAPI: the Python framework used for building APIs quickly, using standard Python type hints. As the name suggests, FastAPI’s strength is that it takes less effort to create functional, production-ready APIs to be consumed by other services.

  • C: for parts of the code that need to be highly optimized, the team uses the lower-level C programming language.

  • Temporal: used for asynchronous workflows and operations inside OpenAI. Temporal is a neat workflow solution that makes multi-step workflows reliable even when individual steps crash, without much effort by developers. It’s particularly useful for longer-running workflows like image generation at scale.

Michael #3: PyCon Talks on YouTube

Brian #4: Optimizing Python Import Performance

  • Mostly pay attention to #'s 1-3

  • This is related to speeding up a test suite, speeding up necessary imports.

  • Finding what’s slow

  • Use python -X importtime <the rest of the command>

  • Ex: python -X importtime -m pytest

  • Techniques

  • Lazy imports

  • move slow-to-import imports into functions/methods

  • Avoiding circular imports

  • hopefully you’re doing that already

  • Optimize __init__.py files

  • Avoid unnecessary imports, heavy computations, complex logic
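The lazy import technique can look like the sketch below. The function is hypothetical, and `fractions` is only a stand-in for a dependency that is genuinely slow to import.

```python
def to_fraction(text):
    """Parse text such as "3/4" into a Fraction."""
    # Deferred import: the module is loaded on the first call to this
    # function, not when the file containing it is imported.
    # `fractions` stands in for a genuinely slow dependency.
    from fractions import Fraction

    return Fraction(text)
```

Importing the file that defines `to_fraction` costs nothing extra. The cost of the deferred import is paid only by callers that actually use the function, and only on the first call, since later calls find the module already cached in `sys.modules`.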

  • Notes from Brian

  • Some questions remain open for me

  • Does module aliasing really help much?

  • This applies to testing in a big way

  • Test collection imports your test suite, so anything imported at the top level of a file gets imported at test collection time, even if you are only running a subset of tests using filtering like -k or -m or other filter methods.

  • Run -X importtime on test collection.

  • Move slow imports into fixtures, so they get imported when needed, but NOT at collection.

  • See also:

  • The -X option in the standard docs

  • Consider using import_profile
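To make the `-X importtime` output easier to read, a small script can sort it by cumulative time. This is a minimal sketch with hypothetical function names. It assumes the report is written to stderr as lines of the form `import time: <self us> | <cumulative us> | <module>`, which is how CPython prints it.

```python
import subprocess
import sys

PREFIX = "import time:"


def parse_importtime(stderr_text):
    """Return (cumulative_us, self_us, module) tuples, slowest first."""
    rows = []
    for line in stderr_text.splitlines():
        if not line.startswith(PREFIX):
            continue  # unrelated stderr output
        parts = line[len(PREFIX):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us = int(parts[0])
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # the header row has no numbers
        rows.append((cumulative_us, self_us, parts[2].strip()))
    return sorted(rows, reverse=True)


def profile_imports(python_args, top=10):
    """Run `python -X importtime <python_args>` and return the slowest imports."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", *python_args],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_importtime(proc.stderr)[:top]
```

For example, `profile_imports(["-c", "import json"])` profiles a single import, while `profile_imports(["-m", "pytest", "--collect-only", "-q"])` profiles test collection, assuming pytest is installed in the same environment.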

Extras

Brian:

  • PEPs & Co.

  • PEP is a “backronym”, an acronym where the words it stands for are filled in after the acronym is chosen. Barry Warsaw made this one up.

  • There are a lot of “enhancement proposal” and “improvement proposal” acronyms now from other communities

  • pythontest.com has a new theme

  • More colorful. Neat search feature

  • Now it’s excruciatingly obvious that I haven’t blogged regularly in a while

  • I gotta get on that

  • Code highlighting might need to be tweaked for dark mode

Michael:

Joke: There is hope.