We're currently in the gold rush period of AI. The world cannot get enough. One consequence is that rationing is in force. It's like the end of the Second World War, but with GPUs. This is a good thing, because it means that we can't just spin up as many resources as we like. It's a bad thing, for the exact same reason.
If you're using Azure OpenAI resources for your AI needs, you'll be aware that there are limits known as "quotas" in place. If you want to control how much of that quota you're using, you'll want to be able to control the capacity of your deployments. This is possible with Bicep.
This post grew out of a GitHub issue on the topic, where people were bumping up against the message "the capacity should be null for standard deployment" as they attempted to deploy. At the time that issue was raised, there was very little documentation on how to handle this. Since then, things have improved, but I thought it would be useful to have a post on the topic.
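To give a flavour of what controlling capacity with Bicep can look like, here's a minimal sketch. It assumes the `2023-05-01` API version of the `Microsoft.CognitiveServices` resource types, where capacity is set on the deployment's `sku`. The parameter names are hypothetical, and the model details are left for you to supply; treat it as an illustration rather than a definitive template.

```bicep
// Hypothetical parameter names; adjust to suit your own template.
param openAiAccountName string
param deploymentName string
param modelName string
param modelVersion string

// Assumption: for Standard deployments, capacity is expressed in
// units of 1,000 tokens per minute and counts against your quota.
param capacity int = 10

// Reference an Azure OpenAI account that already exists.
resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: openAiAccountName
}

// Deploy a model to that account, with capacity set on the sku.
resource deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: account
  name: deploymentName
  sku: {
    name: 'Standard'
    capacity: capacity
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: modelName
      version: modelVersion
    }
  }
}
```

Making `capacity` a parameter means each environment can claim a different slice of your quota without changes to the template itself.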