Monday, December 25, 2023

How to Get Started with the Gemini API

Gemini is Google's latest family of large language models. This site contains all the info you need to start building applications with Gemini.

Google AI Studio 

https://ai.google.dev/tutorials/ai-studio_quickstart

Google AI Studio is a browser-based IDE for prototyping with generative models. Google AI Studio lets you quickly try out models and experiment with different prompts. When you've built something you're happy with, you can export it to code in your preferred programming language, powered by the Gemini API.

Google AI Studio provides several interfaces for prompts that are designed for different use cases:

Freeform prompts - These prompts offer an open-ended prompting experience for generating content and responses to instructions. You can use both images and text data for your prompts.

Structured prompts - This prompting technique lets you guide model output by providing a set of example requests and replies. Use this approach when you need more control over the structure of model output. 

Chat prompts - Use chat prompts to build conversational experiences. This prompting technique allows for multiple input and response turns to generate output. 

Google AI Studio also lets you change the behavior of a model, using a technique called tuning:

Tuned model - Use this advanced technique to improve a model's responses for a specific task by providing more examples. Note that tuning is only available for legacy PaLM models. Turn on the Show legacy models option in Settings to enable this option.
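Of the prompt types above, chat prompts map most directly onto code. A minimal sketch of the multi-turn history structure, following the google-generativeai Python client (the commented calls require the package installed and an API key configured):

```python
# Chat history is a list of turns, each with a role ('user' or 'model')
# and a list of content parts. The example messages are illustrative.
history = [
    {"role": "user", "parts": ["Hello, can you help me plan a trip?"]},
    {"role": "model", "parts": ["Of course! Where would you like to go?"]},
]

# With a configured client, the conversation would continue like:
#   import google.generativeai as genai
#   model = genai.GenerativeModel('gemini-pro')
#   chat = model.start_chat(history=history)
#   response = chat.send_message('Somewhere warm in February.')
#   print(response.text)
```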

To try a freeform prompt with an image, navigate to Google AI Studio.

In the left panel, select Create new > Freeform prompt.

In the right column Model field, select a model that supports images, such as the Gemini Pro Vision model.

In the prompt text area, enter the following text:


look at the following picture and tell me who is the architect


From the Insert bar above the prompt area, select Image, and choose one of the sample images of a building.

At the bottom of the app window, select Run to generate a reply for this request.
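The Studio steps above can be sketched on the API side as well. The prompt is a list of parts (text plus an image); "building.jpg" is a hypothetical stand-in for one of the sample images, and the commented call needs the google-generativeai and Pillow packages plus an API key:

```python
# Build the multimodal prompt as a list of parts: text first, image second.
prompt_parts = [
    "look at the following picture and tell me who is the architect",
    # An image object would be appended here, e.g.:
    #   import PIL.Image
    #   PIL.Image.open("building.jpg"),  # hypothetical sample image
]

# With a configured client, the request itself would look like:
#   import google.generativeai as genai
#   model = genai.GenerativeModel('gemini-pro-vision')
#   response = model.generate_content(prompt_parts)
#   print(response.text)
```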


Sometimes, you want to be able to dynamically change parts of a prompt. For example, if you're building an interactive application, you may want to modify your prompt with different user inputs. For this, you can parameterize your prompts using variables.


Select the word or phrase you want to replace in your prompt. In this case, select the text: who is the architect.

From the Insert: header above the prompt, select {{ }} Test input.

In the Test your prompt table below the prompt, add values for your prompt by selecting Add test example and entering a new prompt value. Feel free to add several input values.

At the bottom of the app window, select Run to generate a reply for each of the varying requests.
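Outside Studio, the same parameterization can be sketched with plain Python string replacement. The {{question}} placeholder name is chosen here for illustration; it is not a fixed Gemini API feature:

```python
# Template with a {{name}}-style placeholder, mirroring Studio's test inputs.
template = "look at the following picture and tell me {{question}}"

def render(template, **values):
    # Substitute each {{name}} placeholder with its test value.
    out = template
    for name, value in values.items():
        out = out.replace("{{" + name + "}}", value)
    return out

prompts = [
    render(template, question="who is the architect"),
    render(template, question="what style the building is"),
]
# Each rendered prompt could then be sent via model.generate_content(prompt).
print(prompts[0])
```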

Experiment with model parameters

As you're prototyping your prompt, you can also play around with model run settings on the right side of the application. These are key settings to know about:


Model - Select what model you want to respond to your prompts. For more information about the available models and capabilities, see Models.

Temperature - Control how much randomness is allowed in the model's responses. Raising this value allows the model to produce more unexpected and creative responses.

Max outputs - Increase the number of responses the model returns for each request. This option can be helpful for quickly testing prompts by generating multiple responses for a single prompt.

Safety settings - Adjust safety settings for managing model responses. For more details about these controls, see Safety settings.
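These Studio settings map onto API request options. A sketch, assuming the google-generativeai client (which accepts a plain dict for generation_config); the safety category and threshold names below are examples and should be checked against the current Safety settings documentation:

```python
# Generation settings corresponding to the Studio controls described above.
generation_config = {
    "temperature": 0.9,        # higher values allow more varied responses
    "candidate_count": 1,      # number of responses per request
    "max_output_tokens": 256,  # cap on response length
}
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
]

# With an API key configured, these would be passed at generation time:
#   model = genai.GenerativeModel('gemini-pro')
#   response = model.generate_content(
#       prompt,
#       generation_config=generation_config,
#       safety_settings=safety_settings)
```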



Gemini API: Quickstart with Python


Prerequisites

Python 3.9+

An installation of jupyter to run the notebook.



Install the Python SDK with pip:

pip install -q -U google-generativeai


import pathlib
import textwrap

import google.generativeai as genai

# Used to securely store your API key
from google.colab import userdata

from IPython.display import display
from IPython.display import Markdown



def to_markdown(text):
  # Render a model response as an indented Markdown blockquote,
  # converting bullet characters to Markdown list markers first.
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))



# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)


List models

Now you're ready to call the Gemini API. Use list_models to see the available Gemini models:


gemini-pro: optimized for text-only prompts.

gemini-pro-vision: optimized for text-and-images prompts.



for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)


The genai package also supports the PaLM family of models, but only the Gemini models support the generic, multimodal capabilities of the generateContent method.



Select the gemini-pro model and create a GenerativeModel instance:

model = genai.GenerativeModel('gemini-pro')


Generate a response (the %%time cell magic simply reports how long the call takes):

%%time
response = model.generate_content("What is the meaning of life?")


to_markdown(response.text)


If the API fails to return a result, use GenerateContentResponse.prompt_feedback to see whether it was blocked due to safety concerns regarding the prompt.


response.prompt_feedback


Gemini can generate multiple possible responses for a single prompt. These possible responses are called candidates, and you can review them to select the most suitable one as the response.


View the response candidates with GenerateContentResponse.candidates:
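A sketch of walking over candidates. The SimpleNamespace objects below only mimic the response shape (candidates → content → parts → text) for illustration; a real GenerateContentResponse comes from model.generate_content(...):

```python
from types import SimpleNamespace

# Stand-in for a response with two candidates (illustrative text only).
response = SimpleNamespace(candidates=[
    SimpleNamespace(content=SimpleNamespace(
        parts=[SimpleNamespace(text="First possible answer")])),
    SimpleNamespace(content=SimpleNamespace(
        parts=[SimpleNamespace(text="Second possible answer")])),
])

# Review each candidate to pick the most suitable one.
for i, candidate in enumerate(response.candidates):
    print(i, candidate.content.parts[0].text)
```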


Similar quickstarts are available for other programming environments:


https://ai.google.dev/tutorials/android_quickstart => Android 

https://ai.google.dev/tutorials/web_quickstart => Web 

https://ai.google.dev/tutorials/swift_quickstart => iOS 

https://ai.google.dev/tutorials/go_quickstart => Go 

https://ai.google.dev/tutorials/node_quickstart => Node.js

https://ai.google.dev/tutorials/rest_quickstart => REST API 


There is also a guide for migrating from the PaLM API to the Gemini API:

https://ai.google.dev/docs/migration_guide


References:

https://ai.google.dev/docs
