Meta’s Llama models have become the go-to standard for open large language models (LLMs). With 10 times the downloads compared to last year, Llama models continue to lead the industry in openness, modifiability, and cost efficiency, enabling customers to develop their own amazing solutions. Oracle Cloud Infrastructure (OCI) Data Science already supports Llama 2, 3, and 3.1 models, even on CPUs.
Llama 3.2 on OCI
In a step forward for AI development and deployment, OCI Data Science now supports Llama 3.2 through AI Quick Actions and the Bring Your Own Container (BYOC) feature. Meta’s powerful new Llama 3.2 models include:
• Llama 3.2 1B
• Llama 3.2 3B
• Llama 3.2 11B Vision
• Llama 3.2 90B Vision
OCI Data Science now also supports the following new fine-tuned Llama 3.2 models:
• Llama 3.2 1B Instruct
• Llama 3.2 3B Instruct
• Llama 3.2 11B Vision Instruct
• Llama 3.2 90B Vision Instruct
The Llama 3.2 family spans a wide range of model sizes, making it adaptable to various AI requirements. The Llama 3.2 1B and 3B models are designed to be lightweight, while the Llama 3.2 11B and 90B Vision models offer advanced capabilities for text and image tasks. All the models are designed for efficiency, with reduced latency and improved performance.
Meet the cutting-edge Llama 3.2 models
The new Llama 3.2 models include multimodal models (11B and 90B) and lightweight, text-only models (1B and 3B), bringing a diverse range of functionalities tailored to the following AI tasks:
• Llama 3.2 1B, Llama 3.2 1B Instruct, Llama 3.2 3B, and Llama 3.2 3B Instruct (text only): Lightweight models that support query and prompt rewriting and are small enough to run on edge devices. Perfect for highly personalized applications, such as AI-powered personal assistants on mobile devices.
• Llama 3.2 11B Vision, Llama 3.2 11B Vision Instruct, Llama 3.2 90B Vision, and Llama 3.2 90B Vision Instruct (text and image input): Built to interpret images and perform visual reasoning, these models can handle tasks such as generating image captions, retrieving text from images, visual grounding, document-based visual question answering, and analytical reasoning.
Working with Llama 3.2 models in OCI Data Science AI Quick Actions
AI Quick Actions is an OCI Data Science feature that offers a no-code way for customers to seamlessly manage, deploy, fine-tune, and evaluate foundation models. With the integration between AI Quick Actions and Hugging Face, users can easily bring Llama 3.2 models from Hugging Face into AI Quick Actions without extensive setup. To get the latest release of AI Quick Actions, deactivate and reactivate the notebook session you use to access the feature. If you create a new notebook session, the latest version of AI Quick Actions is available automatically.
The current version of OCI Data Science AI Quick Actions supports the Llama 3.2 1B, Llama 3.2 3B, Llama 3.2 1B Instruct, Llama 3.2 3B Instruct, Llama 3.2 11B Vision, Llama 3.2 90B Vision, Llama 3.2 11B Vision Instruct, and Llama 3.2 90B Vision Instruct models. Behind the scenes, the Data Science service provides a vLLM container that’s compatible with these versions of the Llama 3.2 models. Customers can bring these models from Hugging Face into AI Quick Actions by registering them. Access to the Llama 3.2 models on Hugging Face requires accepting the user agreement. After you have accepted the agreement and been granted access, generate an access token from Hugging Face and use it to log in to the Hugging Face CLI in the terminal of the notebook session you’re using for AI Quick Actions, which validates your access. Use the following command to log in with your token:
huggingface-cli login --token <your-hugging-face-token>
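Alternatively, you can authenticate from a notebook cell with the huggingface_hub Python library. The following is a minimal sketch; it assumes your access token is exported in an environment variable named HF_TOKEN (a placeholder name you can change):

import os
from huggingface_hub import login, whoami

# Log in to Hugging Face; assumes the token is exported as HF_TOKEN (placeholder name).
login(token=os.environ["HF_TOKEN"])

# Confirm the token is valid and that gated access to Llama 3.2 has been granted.
print(whoami()["name"])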
You can begin the model registration process by navigating to the “Ready-to-Register” tab in the Model Explorer, as shown in Figure 1. Selecting the card for the model you want to use starts the registration process. Alternatively, you can select the “Import a new model” card under “My Models” on the Model Explorer page, as shown in Figure 2. Select “Register verified model” and choose the version of the Llama 3.2 model you want to use. After the model has been registered, it appears with its own model card under “My Models” in the Model Explorer, where you can deploy, fine-tune, and evaluate it. We have also added the capability to pass an image payload to the inferencing UI so that customers can test their deployed vision models in real time; you can also call the deployment endpoint programmatically, as sketched after the figures below.
Figure 1: Llama 3.2 1B, 3B, 11B and 90B models are ready to be imported from Hugging Face for use in OCI Data Science AI Quick Actions
Figure 2: You can also import Llama 3.2 models into AI Quick Actions by selecting the “Import new model” card
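If you prefer to test a deployed vision model programmatically instead of through the inferencing UI, you can call the model deployment’s HTTP endpoint directly. The sketch below is illustrative rather than the exact contract: it assumes the deployment uses the service-provided vLLM container with an OpenAI-style chat completions payload, that the endpoint URI and model name (“odsc-llm” here) are placeholders you replace with your deployment’s values, and that the image is passed as a base64 data URL.

import base64
import json
import oci
import requests

# Placeholder endpoint URI; copy the real one from your model deployment details.
ENDPOINT = "https://modeldeployment.<region>.oci.customer-oci.com/<model-deployment-ocid>/predict"

# Sign requests with your OCI API key (read from ~/.oci/config by default).
config = oci.config.from_file()
signer = oci.signer.Signer(
    tenancy=config["tenancy"],
    user=config["user"],
    fingerprint=config["fingerprint"],
    private_key_file_location=config["key_file"],
    pass_phrase=config.get("pass_phrase"),
)

# Encode a local image as a base64 data URL for the multimodal prompt.
with open("invoice.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Assumed OpenAI-style chat payload accepted by the vLLM container.
payload = {
    "model": "odsc-llm",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }
    ],
    "max_tokens": 256,
}

# Invoke the deployed model and print the signed response.
response = requests.post(ENDPOINT, json=payload, auth=signer)
print(json.dumps(response.json(), indent=2))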
Working with Llama 3.2 models through the Bring Your Own Container approach
OCI Data Science supports Bring Your Own Model for deployment through a Bring Your Own Container (BYOC) approach. You can use this approach to work with any of the Llama 3.2 models, including the Llama 3.2 11B Vision, Llama 3.2 90B Vision, Llama 3.2 11B Vision Instruct, and Llama 3.2 90B Vision Instruct models.
The BYOC approach requires downloading the model from its host repository, either from the Llama website or Hugging Face, and creating a Data Science model catalog entry. The next step is to pull an inference container compatible with the model and push it to the OCI Registry, because OCI Data Science Model Deployment supports container images residing in the OCI Registry. Finally, deploy the model and the inference container by creating a Data Science model deployment. When the model is deployed, you can invoke it through an HTTP endpoint. For a sample of this process, see the Deploy LLM Models using BYOC and Batch Inferencing guide.
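As an illustration of the download step, the following minimal sketch pulls a Llama 3.2 vision model from Hugging Face with the huggingface_hub library; the repository ID and local directory are examples, and gated access assumes you have already accepted Meta’s license and logged in with your token.

from huggingface_hub import snapshot_download

# Example repository and target directory; adjust for the model you plan to deploy.
# Gated access relies on the Hugging Face token you logged in with earlier.
local_dir = snapshot_download(
    repo_id="meta-llama/Llama-3.2-11B-Vision-Instruct",
    local_dir="models/llama-3.2-11b-vision-instruct",
)
print(f"Model files downloaded to: {local_dir}")

From there, you would create the model catalog entry from the downloaded artifacts, push the compatible inference container image to the OCI Registry, and create the model deployment as described in the linked guide.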