Get started with Azure OpenAI

Modern machine learning and AI research has moved rapidly from the lab to our IDEs, with tools like Azure Cognitive Services providing API-based access to pretrained models. There are many different approaches to delivering AI services; one of the most promising for working with language is a technique called generative pretraining, or GPT, which works with large amounts of text.

OpenAI and Microsoft

Research lab OpenAI pioneered this technique, publishing the first paper on the subject in 2018. The model it uses has gone through several iterations, starting with the unsupervised GPT-2, which used unlabeled data to mimic human text. Trained on 40 GB of public internet content, GPT-2 required significant training to deliver a model with 1.5 billion parameters. It was followed by GPT-3, a much larger model with 175 billion parameters. Exclusively licensed to Microsoft, GPT-3 is the basis for tools such as the Codex code-generation model used by GitHub Copilot and the DALL-E image generator.

With a model like GPT-3 requiring significant amounts of compute and memory, on the order of thousands of petaflop/s-days, it is an ideal candidate for cloud-based high-performance computing on specialized supercomputer hardware. Microsoft has built its own Nvidia-based supercomputing servers on Azure, with its cloud instances appearing on the TOP500 supercomputer list. Azure’s AI servers are built around Nvidia Ampere A100 Tensor Core GPUs, interconnected via a high-speed InfiniBand network.

Add OpenAI to Azure

OpenAI’s generative AI tools were built and trained on Azure servers. Under a long-standing agreement between OpenAI and Microsoft, OpenAI tools are made available as part of Azure, with Azure-specific APIs and integration with Azure billing services. After some time in private preview, the Azure OpenAI API suite is now generally available, with support for GPT-3 text generation and the Codex code model. Microsoft has announced that it will add DALL-E image generation in a future update.

This doesn’t mean that anyone can create an application that uses GPT-3; Microsoft still restricts access to ensure that projects comply with its policies for the ethical use of AI and are tightly scoped to specific use cases. You must also be a direct Microsoft customer to access Azure OpenAI. Microsoft uses a similar process for access to its limited-access Cognitive Services, where there is a possibility of impersonation or privacy violations.

These policies are likely to remain stringent, and some areas, such as healthcare, will likely require additional protection to meet regulatory requirements. Microsoft’s own experiences with AI language models have taught it a lesson it doesn’t want to repeat. As an additional safeguard, there are content filters on inputs and outputs, with alerts for both Microsoft and developers.

Explore Azure OpenAI Studio

Once your account has been approved to use Azure OpenAI, you can start building code that uses its API endpoints. Appropriate Azure resources can be created from the portal, the Azure CLI, or ARM templates. If you’re using the Azure portal, create a resource allocated to your account and to the resource group you intend to use for your application and all its associated Azure services and infrastructure. Next, name the resource and select the pricing tier. At the moment there’s only one pricing option, but that will likely change as Microsoft rolls out new service tiers.

With a resource in place, you can deploy a model using Azure OpenAI Studio. This is where you will do most of your work with OpenAI. Currently, you can choose between members of the GPT-3 model family, including the code-based Codex. Additional models work with embeddings, complex semantic representations of text that are optimized for search.
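To sketch how embeddings support search: each piece of text is mapped to a numeric vector, and results are ranked by vector similarity, typically cosine similarity. A minimal illustration with hand-made vectors (the vectors and document texts here are invented for the example; in practice the vectors come from an embeddings model):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings returned by a model.
query = [0.9, 0.1, 0.0]
documents = {
    "reset your password": [0.8, 0.2, 0.1],
    "quarterly sales report": [0.1, 0.1, 0.9],
}

# Rank documents by similarity to the query vector; the closest match comes first.
ranked = sorted(documents, key=lambda d: cosine_similarity(query, documents[d]), reverse=True)
```

The same ranking step works unchanged whatever model produced the vectors; only the vector dimensions grow.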

Within each family there is a set of different models, with names that indicate both cost and capability. If you’re using GPT-3, Ada is the cheapest and least capable and Davinci the most capable. Each model is a superset of the previous one, so as tasks get more complex you don’t need to change your code; you simply choose a different model. Interestingly, Microsoft recommends starting with the most capable model when designing an OpenAI-powered application, as this lets you tune the underlying model for price and performance when you move to production.

Work with model customization

Although GPT-3’s text completion features have gone viral, in practice your application will need to be much more focused on your specific use case. You don’t want GPT-3 to power a help desk that regularly gives irrelevant advice. You need to create a custom model using training examples of desired inputs and outputs, which Azure OpenAI calls “completions.” It is important to have a large training set, and Microsoft recommends using several hundred examples. You can include all your prompts and completions in a single JSON Lines file to simplify managing your training data.
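Assembling that training file can be as simple as serializing prompt/completion pairs, one JSON object per line. A minimal sketch (the help-desk example pairs and the filename are invented for illustration):

```python
import json

# Hypothetical help-desk training examples: each pair maps a prompt
# to the completion you want the tuned model to produce.
examples = [
    {"prompt": "My laptop won't turn on.",
     "completion": "Check that the power adapter is connected, then hold the power button for ten seconds."},
    {"prompt": "I forgot my password.",
     "completion": "Use the self-service reset portal, or contact the help desk to unlock your account."},
]

# JSON Lines format: one JSON object per line.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

With several hundred such pairs in one file, uploads and versioning of training data stay manageable.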

With a custom model in place, you can use Azure OpenAI Studio to test how GPT-3 works for your scenario. A basic playground lets you see how the model responds to specific prompts, acting as a simple console app where you type a prompt and get back an OpenAI completion. Microsoft describes creating a good prompt as “show, don’t tell,” suggesting that prompts should be as explicit as possible for best results. The playground also helps train your model, so if you’re building a classifier, you can supply a list of texts and expected outputs before supplying inputs and a trigger to get a response.
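For a classifier, “show, don’t tell” means the prompt itself carries labeled examples, followed by the new input and a trigger for the model’s answer. A sketch of building such a few-shot prompt (the sentiment labels and example texts are invented):

```python
# Labeled examples shown to the model before the input to classify.
examples = [
    ("I love this product!", "Positive"),
    ("This is the worst service ever.", "Negative"),
]

def build_classifier_prompt(text):
    # Render each example as "Text: ... / Sentiment: ...", then append the
    # new input with a bare "Sentiment:" line as the trigger for the response.
    lines = []
    for sample, label in examples:
        lines.append(f"Text: {sample}\nSentiment: {label}\n")
    lines.append(f"Text: {text}\nSentiment:")
    return "\n".join(lines)

prompt = build_classifier_prompt("The support team was very helpful.")
```

The same prompt string can be pasted into the playground or sent through the API; the model completes the final "Sentiment:" line.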

A useful feature of the playground is the ability to set an intent and expected behaviors up front. If you’re using OpenAI to power a help desk triage tool, you can set the expectation that the output will be polite and calm, ensuring it won’t mimic an angry user. The same tools can be used with the Codex model, so you can see how it works as a code completion tool or as a dynamic assistant.

Write code to work with Azure OpenAI

Once you’re ready to start coding, you can use your deployment’s REST endpoints, either directly or with the OpenAI Python libraries. The latter is probably your quickest route to live code. You will need the endpoint URL, an authentication key, and the name of your deployment. Once you have them, set the appropriate environment variables for your code. As always, in production, it’s best not to hard-code the keys and use a tool like Azure Key Vault to manage them.
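A sketch of pulling those three values from environment variables and assembling a completions endpoint URL (the environment variable names and API version string are illustrative; check the Azure OpenAI documentation for the current values, and note the fallbacks here exist only to keep the sketch runnable):

```python
import os

# Illustrative environment variable names; in production these would come
# from your deployment environment or a secrets manager like Key Vault.
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT", "https://my-resource.openai.azure.com")
api_key = os.environ.get("AZURE_OPENAI_KEY", "placeholder-key")
deployment = os.environ.get("AZURE_OPENAI_DEPLOYMENT", "my-deployment")
api_version = "2022-12-01"  # Assumed API version; use the one your resource supports.

# The Azure OpenAI REST completions endpoint for a given deployment.
url = f"{endpoint}/openai/deployments/{deployment}/completions?api-version={api_version}"
```

The OpenAI Python library reads equivalent settings, so configuring them once in the environment serves both the library and any raw REST calls.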

Calling an endpoint is quite simple: just call openai.Completion.create to get a response, setting the maximum number of tokens needed to hold your prompt and its response. The response object returned by the API contains the text generated by your model, which can be extracted, formatted, and then used by the rest of your code. Basic calls are simple, and your code can use additional parameters to handle the response. These control the creativity of the model and how it samples its results. You can use these settings to ensure that answers are simple and accurate.
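The generated text lives in the first element of the response’s choices array. A sketch of extracting it, using a canned payload that mirrors the shape of a Completions API response (the text and token counts are invented):

```python
# A canned response in the shape returned by the Completions API.
response = {
    "id": "cmpl-example",
    "object": "text_completion",
    "choices": [
        {"text": " Thanks for reaching out! Try restarting your router first.",
         "index": 0,
         "finish_reason": "stop"},
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 11, "total_tokens": 23},
}

def extract_completion(resp):
    # Pull the generated text from the first choice and trim whitespace.
    return resp["choices"][0]["text"].strip()

answer = extract_completion(response)
```

The usage object is worth keeping too, since total_tokens is what pricing is based on.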

If you use another language, use its REST and JSON parsing tools. You can find an API reference in Azure OpenAI documentation or leverage Swagger specs hosted on Azure’s GitHub to generate API calls and work with returned data. This approach works well with IDEs like Visual Studio.
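Whatever the language, a call boils down to one authenticated POST. A sketch with Python’s standard library that mirrors what REST tools in other languages would do (endpoint, deployment name, and key are placeholders; the request is built but deliberately not sent, since that needs live credentials):

```python
import json
import urllib.request

# Placeholder values; substitute your own endpoint, deployment name, and key.
endpoint = "https://my-resource.openai.azure.com"
deployment = "my-deployment"
api_key = "placeholder-key"

# Request body: the prompt plus a cap on tokens for prompt and response.
body = json.dumps({"prompt": "Say hello.", "max_tokens": 16}).encode("utf-8")

request = urllib.request.Request(
    url=f"{endpoint}/openai/deployments/{deployment}/completions?api-version=2022-12-01",
    data=body,
    headers={"Content-Type": "application/json", "api-key": api_key},
    method="POST",
)
# Sending it would be urllib.request.urlopen(request); the JSON that comes
# back has the same choices/usage shape as the Python library's response.
```

Generating a typed client from the Swagger specs gives you the same request with compile-time checking in languages like C# or Java.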

Azure OpenAI pricing

A key element of the OpenAI models is their token-based pricing. Tokens in Azure OpenAI are not the familiar authentication tokens; they are sections of tokenized strings, created using an internal statistical model. OpenAI provides a tool on its site that shows how strings are tokenized, to help you understand how your queries are billed. You can expect a token to cover about four characters of English text, though it can be fewer or more; as a rule of thumb, 75 words (about a paragraph of normal text) require about 100 tokens.
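Those rules of thumb (roughly four characters per token, 75 words per ~100 tokens) can be turned into a quick estimator for budgeting. A sketch, and only a heuristic; the tokenizer itself is the authority on what you’re billed:

```python
def estimate_tokens(text):
    # Rule of thumb: roughly one token per four characters of English text.
    return max(1, len(text) // 4)

paragraph = "word " * 75  # stands in for about 75 words of text
tokens = estimate_tokens(paragraph)  # lands in the ~100-token ballpark
```

Real token counts vary with the text: unusual words, code, and non-English text tokenize less efficiently than plain English prose.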

The more capable the model, the more expensive the tokens. The base Ada model costs around $0.0004 per 1,000 tokens, while the high-end Davinci is $0.02. If you apply your own fine-tuning, there is a storage cost, and if you use embeddings, costs can be an order of magnitude higher due to increased compute requirements. There are additional costs for fine-tuning models, starting at $20 per compute hour. The Azure website provides sample prices, but actual prices may vary depending on your organization’s account relationship with Microsoft.
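Token pricing translates to per-request cost straightforwardly. A sketch using the sample per-1,000-token rates above (sample prices only; your actual rates depend on your account):

```python
# Sample per-1,000-token rates quoted above; real prices vary by account.
PRICE_PER_1K = {"ada": 0.0004, "davinci": 0.02}

def request_cost(model, total_tokens):
    # Billing covers prompt plus completion tokens, priced per 1,000 tokens.
    return PRICE_PER_1K[model] * total_tokens / 1000

ada_cost = request_cost("ada", 1000)          # the base-rate example above
davinci_cost = request_cost("davinci", 2500)  # a longer Davinci exchange
```

Multiplied across thousands of daily requests, the fifty-fold gap between Ada and Davinci is exactly why Microsoft suggests prototyping on the best model, then stepping down for production.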

Perhaps the most surprising thing about Azure OpenAI is its simplicity. When you use pre-built models (with the option of fine-tuning), all you need to do is apply some basic pretraining, understand how prompts generate output, and bind the tools to your code, generating text content or code as needed.

Copyright © 2023 IDG Communications, Inc.
