Friday, June 30, 2023

What Are Embeddings in OpenAI

OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:


Search (where results are ranked by relevance to a query string)

Clustering (where text strings are grouped by similarity)

Recommendations (where items with related text strings are recommended)

Anomaly detection (where outliers with little relatedness are identified)

Diversity measurement (where similarity distributions are analyzed)

Classification (where text strings are classified by their most similar label)



An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
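The text above does not say which distance to use, so the following is only a minimal sketch that assumes cosine distance, one common choice for comparing embedding vectors. The helper names (cosine_similarity, cosine_distance, rank_by_relatedness) and the three-dimensional toy vectors are made up for illustration; real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Small values suggest high relatedness, large values low relatedness."""
    return 1.0 - cosine_similarity(a, b)


def rank_by_relatedness(query_vec, candidates):
    """Sort candidate labels from most to least related to the query.

    `candidates` maps a label to its vector.
    """
    return sorted(
        candidates,
        key=lambda label: cosine_distance(query_vec, candidates[label]),
    )


# Toy vectors, invented for this example (not real embeddings).
query = [1.0, 0.0, 0.0]
documents = {
    "close": [0.9, 0.1, 0.0],
    "orthogonal": [0.0, 1.0, 0.0],
    "opposite": [-1.0, 0.0, 0.0],
}

ranking = rank_by_relatedness(query, documents)
print(ranking)  # ['close', 'orthogonal', 'opposite']
```

The same ranking step is the core of the search use case listed earlier: embed the query, embed each candidate, and order the candidates by distance to the query.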


To get an embedding, send your text string to the embeddings API endpoint along with a choice of embedding model ID (e.g., text-embedding-ada-002). The response will contain an embedding, which you can extract, save, and use.


curl https://api.openai.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "input": "Your text string goes here",
    "model": "text-embedding-ada-002"
  }'


{
  "data": [
    {
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        ...
        -4.547132266452536e-05,
        -0.024047505110502243
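The same request can be sketched in Python using only the standard library. This is an illustrative sketch, not an official client: the function names (build_embedding_request, extract_embedding, fetch_embedding) are hypothetical, and the sample response body below is a shortened, made-up stand-in that assumes only the data[0].embedding shape visible in the response above.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/embeddings"


def build_embedding_request(text, api_key, model="text-embedding-ada-002"):
    """Build the same POST request as the curl command, without sending it."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_embedding(response_body):
    """Pull the embedding vector out of a JSON response body (str or bytes)."""
    parsed = json.loads(response_body)
    return parsed["data"][0]["embedding"]


def fetch_embedding(text, api_key):
    """Send the request and return the embedding (requires network access)."""
    request = build_embedding_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        return extract_embedding(response.read())


# Offline demonstration with a shortened, invented response body.
sample_body = '{"data": [{"embedding": [-0.0069, -0.0053, -0.024]}]}'
vector = extract_embedding(sample_body)
print(len(vector))  # 3
```

To call the live endpoint, read the key from the environment, for example fetch_embedding("Your text string goes here", os.environ["OPENAI_API_KEY"]) after importing os, and then save the returned list wherever you keep your vectors.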
