Pinecone

class pinecone.Pinecone(api_key: str | None = None, host: str | None = None, proxy_url: str | None = None, proxy_headers: dict[str, str] | None = None, ssl_ca_certs: str | None = None, ssl_verify: bool | None = None, additional_headers: dict[str, str] | None = {}, pool_threads: int | None = None, **kwargs)

A client for interacting with Pinecone APIs.

Pinecone.__init__(api_key: str | None = None, host: str | None = None, proxy_url: str | None = None, proxy_headers: dict[str, str] | None = None, ssl_ca_certs: str | None = None, ssl_verify: bool | None = None, additional_headers: dict[str, str] | None = {}, pool_threads: int | None = None, **kwargs) → None

The Pinecone class is the main entry point for interacting with Pinecone via this Python SDK. Instances of the Pinecone class are used to manage and interact with Pinecone resources such as indexes, backups, and collections. When using the SDK, calls are made on your behalf to the API documented at https://docs.pinecone.io.

The class also holds inference functionality (embed, rerank) under the inference namespace.

When you are ready to perform data operations on an index, you will need to instantiate an index client. Though the functionality of the index client is defined in a different class, it is instantiated through the Index() method in order for configurations to be shared between the two objects.

Parameters:
  • api_key (str, optional) – The API key to use for authentication. If not passed via kwarg, the API key will be read from the environment variable PINECONE_API_KEY.

  • host (str, optional) – The control plane host. If unspecified, the host api.pinecone.io will be used.

  • proxy_url (str, optional) – The URL of the proxy to use for the connection.

  • proxy_headers (dict[str, str], optional) – Additional headers to pass to the proxy. Use this if your proxy setup requires authentication.

  • ssl_ca_certs (str, optional) – The path to the SSL CA certificate bundle to use for the connection. This path should point to a file in PEM format. When not passed, the SDK will use the certificate bundle returned from certifi.where().

  • ssl_verify (bool, optional) – SSL verification is performed by default, but can be disabled using the boolean flag when testing with Pinecone Local or troubleshooting a proxy setup. You should never run with SSL verification disabled in production.

  • additional_headers (dict[str, str], optional) – Additional headers to pass to the API. This is mainly to support internal testing at Pinecone. End users should not need to use this unless following specific instructions to do so.

  • pool_threads (int, optional) – The number of threads to use for the ThreadPool when using methods that support the async_req keyword argument. The default number of threads is 5 * the number of CPUs in your execution environment.
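The documented default for pool_threads can be approximated as shown below. This is a minimal sketch rather than the SDK's own code: the helper name default_pool_threads is hypothetical, and it assumes the CPU count comes from os.cpu_count(), which may differ from what the SDK actually consults.

```python
import os


def default_pool_threads(cpu_count=None):
    """Approximate the documented default: 5 threads per CPU."""
    # os.cpu_count() can return None, so fall back to 1 in that case.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return 5 * cpus


print(default_pool_threads())
```

Passing an explicit pool_threads value to the client is the way to override this default.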

Configuration with environment variables

If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable PINECONE_API_KEY.

from pinecone import Pinecone

pc = Pinecone()
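
The lookup order described above (explicit keyword argument first, then the environment variable) can be sketched as follows. The helper resolve_api_key and the ValueError it raises are hypothetical and only illustrate the precedence; the SDK's own implementation and error type may differ.

```python
import os


def resolve_api_key(api_key=None, environ=None):
    """Return the explicit key if given, else fall back to PINECONE_API_KEY."""
    environ = os.environ if environ is None else environ
    key = api_key or environ.get("PINECONE_API_KEY")
    if not key:
        # Hypothetical error; the SDK raises its own exception type.
        raise ValueError("No API key: pass api_key or set PINECONE_API_KEY")
    return key
```

Checking for the key at startup like this makes a missing PINECONE_API_KEY fail early with a clear message.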

Configuration with keyword arguments

If you prefer being more explicit in your code, you can also pass the API key as a keyword argument. This is also where you will pass additional configuration options such as proxy settings if you wish to use those.

import os
from pinecone import Pinecone

pc = Pinecone(
    api_key=os.environ.get("PINECONE_API_KEY"),
    host="https://api-staging.pinecone.io"
)

Environment variables

The Pinecone client supports the following environment variables:

  • PINECONE_API_KEY: The API key to use for authentication. It is read when api_key is not passed as a keyword argument.

  • PINECONE_DEBUG_CURL: Enable additional debug logging that represents the HTTP requests as curl commands. The main use of this is to run calls outside of the SDK to help evaluate whether a problem you are experiencing is due to the API’s behavior or the behavior of the SDK itself.

  • PINECONE_ADDITIONAL_HEADERS: A json string of a dictionary of header values to attach to all requests. This is primarily used for internal testing at Pinecone.
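
Because PINECONE_ADDITIONAL_HEADERS must hold a JSON string, building it with json.dumps avoids quoting mistakes. The sketch below uses only the standard library. The header name and value are placeholders, and setting the variable before the client is instantiated is an assumption about when the SDK reads it.

```python
import json
import os

# Hypothetical header name and value, used only for illustration.
headers = {"X-Example-Header": "example-value"}

# The variable must hold a JSON object serialized as a string.
os.environ["PINECONE_ADDITIONAL_HEADERS"] = json.dumps(headers)

# Round-trip check: the string decodes back to the same dictionary.
decoded = json.loads(os.environ["PINECONE_ADDITIONAL_HEADERS"])
print(decoded)
```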

Warning

Be very careful with the PINECONE_DEBUG_CURL environment variable, as it will print out your API key which forms part of a required authentication header.

Proxy configuration

If your network setup requires you to interact with Pinecone via a proxy, you will need to pass additional configuration using optional keyword parameters. These optional parameters are forwarded to urllib3, which is the underlying library currently used by the Pinecone client to make HTTP requests. You may find it helpful to refer to the urllib3 documentation on working with proxies while troubleshooting these settings.

Here is a basic example:

from pinecone import Pinecone

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com'
)

pc.list_indexes()

If your proxy requires authentication, you can pass those values in a header dictionary using the proxy_headers parameter.

from pinecone import Pinecone
from urllib3.util import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password')
)

pc.list_indexes()
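
For reference, make_headers(proxy_basic_auth=...) produces a dictionary with a proxy-authorization entry containing the base64-encoded credentials. The standard-library sketch below builds an equivalent dictionary so you can see what is sent to the proxy. The helper name basic_proxy_auth_header is hypothetical.

```python
import base64


def basic_proxy_auth_header(username, password):
    """Build a Proxy-Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("latin-1")
    token = base64.b64encode(credentials).decode("ascii")
    return {"proxy-authorization": f"Basic {token}"}


headers = basic_proxy_auth_header("username", "password")
print(headers)
```

Base64 is an encoding, not encryption, so treat these header values as secrets.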

Using proxies with self-signed certificates

By default the Pinecone Python client will perform SSL certificate verification using the CA bundle maintained by Mozilla in the certifi package. If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate in PEM format using the ssl_ca_certs parameter.

from pinecone import Pinecone
from urllib3.util import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem'
)

pc.list_indexes()

Disabling SSL verification

If you would like to disable SSL verification, you can pass the ssl_verify parameter with a value of False. We do not recommend going to production with SSL verification disabled.

from pinecone import Pinecone
from urllib3.util import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem',
    ssl_verify=False
)

pc.list_indexes()
Pinecone.Index(name: str = '', host: str = '', **kwargs) → Index

Target an index for data operations.

Parameters:
  • name (str, optional) – The name of the index to target. If you specify the name of the index, the client will fetch the host url from the Pinecone control plane.

  • host (str, optional) – The host url of the index to target. If you specify the host url, the client will use the host url directly without making any additional calls to the control plane.

  • pool_threads (int, optional) – The number of threads to use when making parallel requests by calling index methods with optional kwarg async_req=True, or using methods that make use of thread-based parallelism automatically such as query_namespaces().

  • connection_pool_maxsize (int, optional) – The maximum number of connections to keep in the connection pool.

Returns:

An instance of the Index class.

Target an index by host url

In production situations, you want to upsert or query your data as quickly as possible. If you know in advance the host url of your index, you can eliminate a round trip to the Pinecone control plane by specifying the host of the index. If instead you pass the name of the index, the client will need to make an additional call to api.pinecone.io to get the host url before any data operations can take place.

import os
from pinecone import Pinecone

api_key = os.environ.get("PINECONE_API_KEY")
index_host = os.environ.get("PINECONE_INDEX_HOST")

pc = Pinecone(api_key=api_key)
index = pc.Index(host=index_host)

# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)

To find your host url, you can use the describe_index method to call api.pinecone.io. The host url is returned in the response. Alternatively, the host is displayed in the Pinecone web console.

import os
from pinecone import Pinecone

pc = Pinecone(
    api_key=os.environ.get("PINECONE_API_KEY")
)

host = pc.describe_index('index-name').host
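
One way to combine the two approaches is to prefer a host stored in the environment and fall back to describe_index only when it is missing. This is a sketch under assumptions: the helper resolve_index_host is hypothetical, and PINECONE_INDEX_HOST is simply the variable name used in the earlier example, not something the SDK reads on its own.

```python
import os


def resolve_index_host(pc, index_name, env_var="PINECONE_INDEX_HOST"):
    """Prefer a host stored in the environment; otherwise ask the control plane."""
    host = os.environ.get(env_var)
    if host:
        return host
    # Falls back to a single describe_index call, as in the example above.
    return pc.describe_index(index_name).host
```

With a real client, the result can be passed straight to pc.Index(host=resolve_index_host(pc, 'index-name')).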

Target an index by name (not recommended for production)

For more casual usage, such as when you are exploring Pinecone in a notebook, you can also target an index by name. If you use this approach, the client may need to make an extra call to the Pinecone control plane to look up the index host on your behalf.

The client caches the index host whenever it is seen, so you incur the overhead of only one call. However, this approach is not recommended for production usage because it introduces an unnecessary runtime dependency on api.pinecone.io.

import os
from pinecone import Pinecone, ServerlessSpec

api_key = os.environ.get("PINECONE_API_KEY")

pc = Pinecone(api_key=api_key)
pc.create_index(
    name='my-index',
    dimension=1536,
    metric='cosine',
    spec=ServerlessSpec(cloud='aws', region='us-west-2')
)
index = pc.Index('my-index')

# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)
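
The host caching mentioned above can be pictured as a simple memoized lookup. The class below is an illustrative sketch, not the SDK's internal cache: HostCache and fake_lookup are hypothetical names, and the fake lookup stands in for a call to the control plane.

```python
class HostCache:
    """Remember index hosts so each name is looked up at most once."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: index name -> host url
        self._hosts = {}

    def get(self, name):
        if name not in self._hosts:
            self._hosts[name] = self._lookup(name)
        return self._hosts[name]


calls = []


def fake_lookup(name):
    # Stand-in for a control plane request; records each invocation.
    calls.append(name)
    return f"{name}.example.invalid"


cache = HostCache(fake_lookup)
cache.get("my-index")
cache.get("my-index")  # served from the cache, no second lookup
print(len(calls))
```

Even with caching, the first lookup still depends on the control plane being reachable, which is why targeting by host is preferred in production.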
Pinecone.IndexAsyncio(host: str, **kwargs) → IndexAsyncio

Build an asyncio-compatible Index object.

Parameters:

host (str, required) – The host url of the index to target. You can find this url in the Pinecone web console or by calling the describe_index method of Pinecone or PineconeAsyncio.

Returns:

An instance of the IndexAsyncio class.

import asyncio
import os
from pinecone import Pinecone

async def main():
    pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
    async with pc.IndexAsyncio(host=os.environ.get("PINECONE_INDEX_HOST")) as index:
        await index.query(vector=[...], top_k=10)

asyncio.run(main())
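
A common reason to use the asyncio client is to issue several queries concurrently. The sketch below assumes only that index.query is awaitable and accepts vector and top_k keyword arguments, as in the example above. The helper name query_many is hypothetical.

```python
import asyncio


async def query_many(index, vectors, top_k=10):
    """Run one query per vector concurrently and return results in input order."""
    tasks = [index.query(vector=v, top_k=top_k) for v in vectors]
    # gather preserves the order of the awaitables it was given.
    return await asyncio.gather(*tasks)
```

Inside the async with block you would call results = await query_many(index, vectors).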

See the PineconeAsyncio documentation for more details.

DB Control Plane

Indexes

Pinecone.create_index(name: str, spec: Dict | 'ServerlessSpec' | 'PodSpec' | 'ByocSpec', dimension: int | None = None, metric: 'Metric' | str | None = 'cosine', timeout: int | None = None, deletion_protection: 'DeletionProtection' | str | None = 'disabled', vector_type: 'VectorType' | str | None = 'dense', tags: dict[str, str] | None = None) → IndexModel

Creates a Pinecone index.

Parameters:
  • name (str) – The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.

  • metric (str, optional) – Type of similarity metric used in the vector index when querying, one of {"cosine", "dotproduct", "euclidean"}.

  • spec (Dict) – A dictionary containing configurations describing how the index should be deployed. For serverless indexes, specify region and cloud. Optionally, you can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated) and schema to configure which metadata fields are filterable. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection. Alternatively, use the ServerlessSpec, PodSpec, or ByocSpec objects to specify these configurations.

  • dimension (int) – If you are creating an index with vector_type="dense" (which is the default), you need to specify dimension to indicate the size of your vectors. This should match the dimension of the embeddings you will be inserting. For example, if you are using OpenAI’s text-embedding-3-small model, you should use dimension=1536. Dimension is a required field when creating an index with vector_type="dense" and should not be passed when vector_type="sparse".

  • timeout (int, optional) – Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted.

  • vector_type (str, optional) – The type of vectors to be stored in the index. One of {"dense", "sparse"}.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

Returns:

An IndexModel instance containing a description of the index that was created.
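
The naming rules given for the name parameter can be checked locally before calling create_index. The sketch below encodes only the rules stated above (lowercase letters, numbers, and hyphens; no leading or trailing hyphen; at most 45 characters). The helper is_valid_index_name is hypothetical, and the service remains the final authority on what it accepts.

```python
import re

# Lowercase letters, digits, and hyphens; no leading or trailing hyphen; 1-45 chars.
_INDEX_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,43}[a-z0-9])?")


def is_valid_index_name(name):
    """Check a name against the documented naming rules (client-side only)."""
    return _INDEX_NAME_RE.fullmatch(name) is not None


for candidate in ["my-index", "my_index", "-leading", "trailing-", "a" * 46]:
    print(candidate[:12], is_valid_index_name(candidate))
```

Note that underscores are rejected by this check, since they are not among the allowed characters.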

Examples:

Creating a serverless index
import os
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    DeletionProtection,
    VectorType
)

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

pc.create_index(
    name="my-index",
    dimension=512,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2,
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {"shards": 2, "replicas": 2},
            },
        },
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True},
        },
    ),
    deletion_protection=DeletionProtection.DISABLED,
    vector_type=VectorType.DENSE,
    tags={
        "app": "movie-recommendations",
        "env": "production"
    }
)
Creating a pod index
import os
from pinecone import (
    Pinecone,
    PodSpec,
    PodIndexEnvironment,
    PodType,
    Metric,
    DeletionProtection,
    VectorType
)

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

pc.create_index(
    name="my-index",
    dimension=1536,
    metric=Metric.COSINE,
    spec=PodSpec(
        environment=PodIndexEnvironment.US_EAST4_GCP,
        pod_type=PodType.P1_X1
    ),
    deletion_protection=DeletionProtection.DISABLED,
    tags={
        "model": "clip",
        "app": "image-search",
        "env": "testing"
    }
)
Pinecone.create_index_for_model(name: str, cloud: 'CloudProvider' | str, region: 'AwsRegion' | 'GcpRegion' | 'AzureRegion' | str, embed: 'IndexEmbed' | 'CreateIndexForModelEmbedTypedDict', tags: dict[str, str] | None = None, deletion_protection: 'DeletionProtection' | str | None = 'disabled', read_capacity: 'ReadCapacityDict' | 'ReadCapacity' | 'ReadCapacityOnDemandSpec' | 'ReadCapacityDedicatedSpec' | None = None, schema: dict[str, 'MetadataSchemaFieldConfig'] | dict[str, dict[str, Any]] | 'BackupModelSchema' | None = None, timeout: int | None = None) → IndexModel

Create a Serverless index configured for use with Pinecone’s integrated inference models.

Parameters:
  • name (str) – The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.

  • cloud (str) – The cloud provider to use for the index. One of {"aws", "gcp", "azure"}.

  • region (str) – The region to use for the index. Enum objects AwsRegion, GcpRegion, and AzureRegion are also available to help you quickly set these parameters, but may not be up to date as new regions become available.

  • embed (Union[Dict, IndexEmbed]) – The embedding configuration for the index. This param accepts a dictionary or an instance of the IndexEmbed object.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted. This setting can be changed with configure_index.

  • read_capacity (Optional[Union[ReadCapacityDict, ReadCapacity, ReadCapacityOnDemandSpec, ReadCapacityDedicatedSpec]]) – Optional read capacity configuration. You can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated). See ServerlessSpec documentation for details on read capacity configuration.

  • schema (Optional[Union[dict[str, MetadataSchemaFieldConfig], dict[str, dict[str, Any]], BackupModelSchema]]) – Optional metadata schema configuration. You can specify schema to configure which metadata fields are filterable. The schema can be provided as a dictionary mapping field names to their configurations (e.g., {"genre": {"filterable": True}}) or as a dictionary with a fields key (e.g., {"fields": {"genre": {"filterable": True}}}).

  • timeout (Optional[int]) – Specify the number of seconds to wait until index is ready to receive data. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Returns:

A description of the index that was created.

Return type:

IndexModel

The resulting index can be described, listed, configured, and deleted like any other Pinecone index with the describe_index, list_indexes, configure_index, and delete_index methods.

After the index is created, you can upsert records into the index with the upsert_records method, and search your records with the search method.

from pinecone import (
    Pinecone,
    IndexEmbed,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    Metric,
)

pc = Pinecone()

if not pc.has_index("book-search"):
    desc = pc.create_index_for_model(
        name="book-search",
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        embed=IndexEmbed(
            model=EmbedModel.Multilingual_E5_Large,
            metric=Metric.COSINE,
            field_map={
                "text": "description",
            },
        )
    )
Creating an index for model with schema and dedicated read capacity
from pinecone import (
    Pinecone,
    IndexEmbed,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    Metric,
)

pc = Pinecone()

if not pc.has_index("book-search"):
    desc = pc.create_index_for_model(
        name="book-search",
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        embed=IndexEmbed(
            model=EmbedModel.Multilingual_E5_Large,
            metric=Metric.COSINE,
            field_map={
                "text": "description",
            },
        ),
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {"shards": 2, "replicas": 2},
            },
        },
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True},
        },
    )
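
The schema parameter accepts either a flat mapping of field names to configurations or the same mapping wrapped under a fields key. If your own code needs to handle both, a small normalizer such as the one below can help. This is an illustrative sketch: normalize_schema is a hypothetical helper, not part of the SDK, and it treats a dictionary as wrapped only when fields is the sole key and all of its values are dictionaries.

```python
def normalize_schema(schema):
    """Return the schema in the flat {field_name: config} form."""
    inner = schema.get("fields")
    is_wrapped = (
        set(schema) == {"fields"}
        and isinstance(inner, dict)
        and all(isinstance(value, dict) for value in inner.values())
    )
    # Copy so the caller's dictionary is never mutated.
    return dict(inner) if is_wrapped else dict(schema)


flat = {"genre": {"filterable": True}, "year": {"filterable": True}}
wrapped = {"fields": flat}
print(normalize_schema(wrapped) == normalize_schema(flat))
```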

See also

Official docs on available cloud regions

Model Gallery to learn about available models

Pinecone.create_index_from_backup(*, name: str, backup_id: str, deletion_protection: 'DeletionProtection' | str | None = 'disabled', tags: dict[str, str] | None = None, timeout: int | None = None) → IndexModel

Create an index from a backup.

Call list_backups to get a list of backups for your project.

Parameters:
  • name (str) – The name of the index to create.

  • backup_id (str) – The ID of the backup to restore.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted. This setting can be changed with configure_index.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

  • timeout (Optional[int]) – Specify the number of seconds to wait until index is ready to receive data. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Returns:

A description of the index that was created.

Return type:

IndexModel

from pinecone import Pinecone

pc = Pinecone()

# List available backups
backups = pc.list_backups()
if backups:
    backup_id = backups[0].id

    # Create index from backup
    index = pc.create_index_from_backup(
        name="restored-index",
        backup_id=backup_id,
        deletion_protection="disabled"
    )
Pinecone.list_indexes() → IndexList

Lists all indexes in your project.

Returns:

Returns an IndexList object, which is iterable and contains a list of IndexModel objects. The IndexList also has a convenience method names() which returns a list of index names for situations where you just want to iterate over all index names.

The results include a description of all indexes in your project, including the index name, dimension, metric, status, and spec.

If you simply want to check whether an index exists, see the has_index() convenience method.

You can use the list_indexes() method to iterate over descriptions of every index in your project.

from pinecone import Pinecone

pc = Pinecone()

for index in pc.list_indexes():
    print(index.name)
    print(index.dimension)
    print(index.metric)
    print(index.status)
    print(index.host)
    print(index.spec)
Pinecone.describe_index(name: str) → IndexModel

Describes a Pinecone index.

Parameters:

name (str) – The name of the index to describe.

Returns:

Returns an IndexModel object which gives access to properties such as the index name, dimension, metric, host url, status, and spec.

Getting your index host url

In a real production situation, you probably want to store the host url in an environment variable so you don’t have to call describe_index and re-fetch it every time you want to use the index. But this example shows how to get the value from the API using describe_index.

from pinecone import Pinecone

pc = Pinecone()

index_name = "my_index"
description = pc.describe_index(name=index_name)
print(description)
# {
#     "name": "my_index",
#     "metric": "cosine",
#     "host": "my_index-dojoi3u.svc.aped-4627-b74a.pinecone.io",
#     "spec": {
#         "serverless": {
#             "cloud": "aws",
#             "region": "us-east-1"
#         }
#     },
#     "status": {
#         "ready": true,
#         "state": "Ready"
#     },
#     "vector_type": "dense",
#     "dimension": 1024,
#     "deletion_protection": "enabled",
#     "tags": {
#         "environment": "production"
#     }
# }

print(f"Your index is hosted at {description.host}")

index = pc.Index(host=description.host)
index.upsert(vectors=[...])
Pinecone.configure_index(name: str, replicas: int | None = None, pod_type: 'PodType' | str | None = None, deletion_protection: 'DeletionProtection' | str | None = None, tags: dict[str, str] | None = None, embed: 'ConfigureIndexEmbed' | Dict | None = None, read_capacity: 'ReadCapacityDict' | 'ReadCapacity' | 'ReadCapacityOnDemandSpec' | 'ReadCapacityDedicatedSpec' | None = None) → None

Modify an index’s configuration.

Parameters:
  • name (str, required) – the name of the Index

  • replicas (int, optional) – the desired number of replicas, lowest value is 0.

  • pod_type (str or PodType, optional) – the new pod_type for the index. To learn more about the available pod types, please see Understanding Indexes. Note that pod type is only available for pod-based indexes.

  • deletion_protection (str or DeletionProtection, optional) – If set to 'enabled', the index cannot be deleted. If 'disabled', the index can be deleted.

  • tags (dict[str, str], optional) – A dictionary of tags to apply to the index. Tags are key-value pairs that can be used to organize and manage indexes. Tags passed to configure_index are merged with existing tags. To remove a tag, set its value to an empty string ("").

  • embed (Optional[Union[ConfigureIndexEmbed, Dict]], optional) – configures the integrated inference embedding settings for the index. You can convert an existing index to an integrated index by specifying the embedding model and field_map. The index vector type and dimension must match the model vector type and dimension, and the index similarity metric must be supported by the model. You can later change the embedding configuration to update the field_map, read_parameters, or write_parameters. Once set, the model cannot be changed.

  • read_capacity (Optional[Union[ReadCapacityDict, ReadCapacity, ReadCapacityOnDemandSpec, ReadCapacityDedicatedSpec]]) – Optional read capacity configuration for serverless indexes. You can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated). See ServerlessSpec documentation for details on read capacity configuration. Note that read capacity configuration is only available for serverless indexes.

This method is used to modify an index’s configuration. It can be used to:

  • Configure read capacity for serverless indexes using read_capacity

  • Scale a pod-based index horizontally using replicas

  • Scale a pod-based index vertically using pod_type

  • Enable or disable deletion protection using deletion_protection

  • Add, change, or remove tags using tags

Configuring read capacity for serverless indexes

To configure read capacity for serverless indexes, pass the read_capacity parameter to the configure_index method. You can configure either OnDemand or Dedicated read capacity mode.

from pinecone import Pinecone

pc = Pinecone()

# Configure to OnDemand read capacity (default)
pc.configure_index(
    name="my_index",
    read_capacity={"mode": "OnDemand"}
)

# Configure to Dedicated read capacity with manual scaling
pc.configure_index(
    name="my_index",
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {"shards": 1, "replicas": 1}
        }
    }
)

# Verify the configuration was applied
desc = pc.describe_index("my_index")
assert desc.spec.serverless.read_capacity.mode == "Dedicated"

Scaling pod-based indexes

To scale your pod-based index, you pass a replicas and/or pod_type param to the configure_index method. pod_type may be a string or a value from the PodType enum.

from pinecone import Pinecone, PodType

pc = Pinecone()
pc.configure_index(
    name="my_index",
    replicas=2,
    pod_type=PodType.P1_X2
)

After providing these new configurations, you must call describe_index to see the status of the index as the changes are applied.

Enabling or disabling deletion protection

To enable or disable deletion protection, pass the deletion_protection parameter to the configure_index method. When deletion protection is enabled, the index cannot be deleted with the delete_index method.

from pinecone import Pinecone, DeletionProtection

pc = Pinecone()

# Enable deletion protection
pc.configure_index(
    name="my_index",
    deletion_protection=DeletionProtection.ENABLED
)

# Call describe_index to see the change was applied.
assert pc.describe_index("my_index").deletion_protection == "enabled"

# Disable deletion protection
pc.configure_index(
    name="my_index",
    deletion_protection=DeletionProtection.DISABLED
)

Adding, changing, or removing tags

To add, change, or remove tags, pass the tags parameter to the configure_index method. When tags are passed using configure_index, they are merged with any existing tags already on the index. To remove a tag, set the value of the key to an empty string.

from pinecone import Pinecone

pc = Pinecone()

# Add a tag
pc.configure_index(name="my_index", tags={"environment": "staging"})

# Change a tag
pc.configure_index(name="my_index", tags={"environment": "production"})

# Remove a tag
pc.configure_index(name="my_index", tags={"environment": ""})

# Call describe_index to verify that the tags changed
print(pc.describe_index("my_index").tags)
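
The merge semantics described above can be modeled locally with a plain dictionary. This sketch only mirrors the documented behavior (merge with existing tags, with an empty string removing a key). The helper merge_tags is hypothetical and does not call the API.

```python
def merge_tags(existing, updates):
    """Merge updates into existing tags; an empty-string value removes the key."""
    merged = dict(existing)
    for key, value in updates.items():
        if value == "":
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


tags = {"team": "search"}
tags = merge_tags(tags, {"environment": "staging"})     # add
tags = merge_tags(tags, {"environment": "production"})  # change
tags = merge_tags(tags, {"environment": ""})            # remove
print(tags)
```

Tags that are not mentioned in an update, such as team here, are left untouched.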
Pinecone.delete_index(name: str, timeout: int | None = None) → None

Deletes a Pinecone index.

Parameters:
  • name (str) – the name of the index.

  • timeout (int, optional) – Number of seconds to poll for confirmation that the index has been deleted. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Deleting an index is an irreversible operation. All data in the index will be lost. When you use this command, a request is sent to the Pinecone control plane to delete the index, but the termination is not synchronous because resources take a few moments to be released.

By default the delete_index method will block until polling of the describe_index method shows that the delete operation has completed. If you prefer to return immediately and not wait for the index to be deleted, you can pass timeout=-1 to the method.

After the delete request is submitted, polling describe_index will show that the index transitions into a Terminating state before eventually resulting in a 404 after it has been removed.
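
The timeout convention used by this method (None, a non-negative number, or -1) can be sketched as a generic polling loop. This is an illustration under assumptions, not the SDK's implementation: wait_until is a hypothetical helper, and it returns False on timeout, whereas the SDK may signal a timeout differently.

```python
import time


def wait_until(condition, timeout=None, poll_interval=1.0):
    """Poll condition() following the documented timeout convention.

    timeout=None waits indefinitely, timeout>=0 gives up after that many
    seconds, and timeout=-1 returns immediately without polling.
    Returns True if the condition was met, False otherwise.
    """
    if timeout == -1:
        return False
    deadline = None if timeout is None else time.monotonic() + timeout
    while not condition():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

If you submit the delete with timeout=-1, you could then wait on your own schedule with wait_until(lambda: not pc.has_index(index_name), timeout=60).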

This operation can fail if the index is configured with deletion_protection="enabled". In this case, you will need to call configure_index to disable deletion protection before you can delete the index.

from pinecone import Pinecone

pc = Pinecone()

index_name = "my_index"
desc = pc.describe_index(name=index_name)

if desc.deletion_protection == "enabled":
    # If for some reason deletion protection is enabled, you will need to disable it first
    # before you can delete the index. But use caution as this operation is not reversible
    # and if somebody enabled deletion protection, they probably had a good reason.
    pc.configure_index(name=index_name, deletion_protection="disabled")

pc.delete_index(name=index_name)
Pinecone.has_index(name: str) → bool

Checks if a Pinecone index exists.

Parameters:

name (str) – The name of the index to check for existence.

Returns:

Returns True if the index exists, False otherwise.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone()

index_name = "my-index"
if not pc.has_index(index_name):
    print("Index does not exist, creating...")
    pc.create_index(
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2")
    )

Backups

Pinecone.create_backup(*, index_name: str, backup_name: str, description: str = '') → BackupModel

Create a backup of an index.

Parameters:
  • index_name (str) – The name of the index to back up.

  • backup_name (str) – The name to give the backup.

  • description (str, optional) – Optional description of the backup.

from pinecone import Pinecone

pc = Pinecone()

# Create a backup of an index
backup = pc.create_backup(
    index_name="my_index",
    backup_name="my_backup",
    description="Daily backup"
)

print(f"Backup created with ID: {backup.id}")
Pinecone.list_backups(*, index_name: str | None = None, limit: int | None = 10, pagination_token: str | None = None) → BackupList

List backups.

If index_name is provided, the backups will be filtered by index. If no index_name is provided, all backups in the project will be returned.

Parameters:
  • index_name (str, optional) – The name of the index to list backups for.

  • limit (int, optional) – The maximum number of backups to return.

  • pagination_token (str, optional) – The pagination token to use for pagination.

from pinecone import Pinecone

pc = Pinecone()

# List all backups
all_backups = pc.list_backups(limit=20)

# List backups for a specific index
index_backups = pc.list_backups(index_name="my_index", limit=10)

for backup in index_backups:
    print(f"Backup: {backup.name}, Status: {backup.status}")
Pinecone.describe_backup(*, backup_id: str) BackupModel[source]

Describe a backup.

Parameters:

backup_id (str) – The ID of the backup to describe.

from pinecone import Pinecone

pc = Pinecone()

backup = pc.describe_backup(backup_id="backup-123")
print(f"Backup: {backup.name}")
print(f"Status: {backup.status}")
print(f"Index: {backup.index_name}")
Pinecone.delete_backup(*, backup_id: str) None[source]

Delete a backup.

Parameters:

backup_id (str) – The ID of the backup to delete.

from pinecone import Pinecone

pc = Pinecone()

pc.delete_backup(backup_id="backup-123")

Collections

Pinecone.create_collection(name: str, source: str) None[source]

Create a collection from a pod-based index.

Parameters:
  • name (str, required) – Name of the collection

  • source (str, required) – Name of the source index

from pinecone import Pinecone

pc = Pinecone()

# Create a collection from an existing pod-based index
pc.create_collection(name="my_collection", source="my_index")
Pinecone.list_collections() CollectionList[source]

List all collections.

from pinecone import Pinecone

pc = Pinecone()

for collection in pc.list_collections():
    print(collection.name)
    print(collection.source)

# You can also iterate specifically over the collection
# names with the .names() helper.
for collection_name in pc.list_collections().names():
    print(collection_name)
Pinecone.describe_collection(name: str) dict[str, Any][source]

Describes a collection.

Parameters:

name (str) – The name of the collection

Returns:

Description of the collection

from pinecone import Pinecone

pc = Pinecone()

description = pc.describe_collection("my_collection")
print(description.name)
print(description.source)
print(description.status)
print(description.size)
Pinecone.delete_collection(name: str) None[source]

Deletes a collection.

Parameters:

name (str) – The name of the collection to delete.

Deleting a collection is an irreversible operation. All data in the collection will be lost.

This method tells Pinecone you would like to delete a collection, but it takes a few moments to complete the operation. Use the describe_collection() method to confirm that the collection has been deleted.
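Because deletion completes asynchronously, a caller may want to poll until the collection is gone. The sketch below is one hedged way to do that; it checks the names returned by list_collections().names() rather than calling describe_collection(), and the helper name wait_for_collection_deletion, its timeout, and its interval are assumptions for illustration, not part of the SDK.

```python
import time
from typing import Callable, Iterable


def wait_for_collection_deletion(
    list_names: Callable[[], Iterable[str]],
    name: str,
    timeout: float = 60.0,
    interval: float = 2.0,
) -> bool:
    """Poll until `name` no longer appears among the collection names.

    `list_names` is any zero-argument callable returning the current
    collection names, for example `lambda: pc.list_collections().names()`.
    Returns True once the name is gone, False if `timeout` elapses first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if name not in set(list_names()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Passing a callable instead of the client keeps the helper easy to exercise without network access; with a real client you would call it right after pc.delete_collection(name=...).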

from pinecone import Pinecone

pc = Pinecone()

pc.delete_collection(name="my_collection")

Restore Jobs

Pinecone.list_restore_jobs(*, limit: int | None = 10, pagination_token: str | None = None) RestoreJobList[source]

List restore jobs.

Parameters:
  • limit (int, optional) – The maximum number of restore jobs to return.

  • pagination_token (str, optional) – The token returned by a previous call, used to fetch the next page of results.

from pinecone import Pinecone

pc = Pinecone()

restore_jobs = pc.list_restore_jobs(limit=20)

for job in restore_jobs:
    print(f"Job ID: {job.id}, Status: {job.status}")
Pinecone.describe_restore_job(*, job_id: str) RestoreJobModel[source]

Describe a restore job.

Parameters:

job_id (str) – The ID of the restore job to describe.

from pinecone import Pinecone

pc = Pinecone()

job = pc.describe_restore_job(job_id="job-123")
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
print(f"Source backup: {job.backup_id}")

DB Data Plane

class pinecone.db_data.Index(api_key: str, host: str, pool_threads: int | None = None, additional_headers: dict[str, str] | None = {}, openapi_config=None, **kwargs)[source]

A client for interacting with a Pinecone index via REST API. For improved performance, use the Pinecone GRPC index client.

Index.__init__(api_key: str, host: str, pool_threads: int | None = None, additional_headers: dict[str, str] | None = {}, openapi_config=None, **kwargs)[source]

Initialize the PluginAware class.

Parameters:
  • *args – Variable length argument list.

  • **kwargs – Arbitrary keyword arguments.

Raises:

AttributeError – If required attributes are not set in the subclass.

Index.describe_index_stats(filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, **kwargs) IndexDescription[source]

Get statistics about the index’s contents.

The DescribeIndexStats operation returns statistics about the index’s contents, for example the vector count per namespace and the number of dimensions.

Parameters:
  • filter – If this parameter is present, the operation only returns statistics for vectors that satisfy the filter. See metadata filtering (https://www.pinecone.io/docs/metadata-filtering/). [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Object which contains stats about the index.

Return type:

DescribeIndexStatsResponse

Examples:

>>> pc = Pinecone()
>>> index = pc.Index(host="example-index-host")
>>> stats = index.describe_index_stats()
>>> print(f"Total vectors: {stats.total_vector_count}")
>>> print(f"Dimension: {stats.dimension}")
>>> print(f"Namespaces: {list(stats.namespaces.keys())}")

>>> # Get stats for vectors matching a filter
>>> filtered_stats = index.describe_index_stats(
...     filter={'genre': {'$eq': 'drama'}}
... )

Vectors

Index.upsert(vectors: list[Vector] | list[tuple[str, list[float]]] | list[tuple[str, list[float], dict[str, str | int | float | list[str] | list[int] | list[float]]]] | list[VectorTypedDict], namespace: str | None = None, batch_size: int | None = None, show_progress: bool = True, **kwargs) UpsertResponse | ApplyResult[source]

Upsert vectors into a namespace of your index.

The upsert operation writes vectors into a namespace of your index. If a new value is upserted for an existing vector id, it will overwrite the previous value.

Parameters:
  • vectors – A list of vectors to upsert. Can be a list of Vector objects, tuples, or dictionaries.

  • namespace – The namespace to write to. If not specified, the default namespace is used. [optional]

  • batch_size – The number of vectors to upsert in each batch. If not specified, all vectors will be upserted in a single batch. [optional]

  • show_progress – Whether to show a progress bar using tqdm. Applied only if batch_size is provided. Default is True.

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Includes the number of vectors upserted. If async_req=True, returns ApplyResult instead.

Return type:

UpsertResponse

Upserting dense vectors

When working with dense vectors, the dimension of each vector must match the dimension configured for the index. A vector can be represented in a variety of ways.

Upserting a dense vector using the Vector object
from pinecone import Pinecone, Vector

pc = Pinecone()
idx = pc.Index(host="example-index-host")

idx.upsert(
    namespace='my-namespace',
    vectors=[
        Vector(
            id='id1',
            values=[0.1, 0.2, 0.3, 0.4],
            metadata={'metadata_key': 'metadata_value'}
        ),
    ]
)
Upserting a dense vector as a two-element tuple (no metadata)
idx.upsert(
    namespace='my-namespace',
    vectors=[
        ('id1', [0.1, 0.2, 0.3, 0.4]),
    ]
)
Upserting a dense vector as a three-element tuple with metadata
idx.upsert(
    namespace='my-namespace',
    vectors=[
        (
            'id1',
            [0.1, 0.2, 0.3, 0.4],
            {'metadata_key': 'metadata_value'}
        ),
    ]
)
Upserting a dense vector using a vector dictionary
idx.upsert(
    namespace='my-namespace',
    vectors=[
        {
            "id": "id1",
            "values": [0.1, 0.2, 0.3, 0.4],
            "metadata": {"metadata_key": "metadata_value"}
        },
    ]
)
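
Since every dense vector must match the dimension configured for the index, it can help to check lengths locally before calling upsert. The following is a minimal sketch for the tuple form shown above; check_dimensions is a hypothetical helper, and the expected dimension is something you supply yourself (for instance from describe_index_stats()).

```python
from typing import Sequence


def check_dimensions(
    vectors: Sequence[tuple[str, Sequence[float]]],
    dimension: int,
) -> None:
    """Raise ValueError naming the first vector whose length differs from `dimension`."""
    for vector_id, values in vectors:
        if len(values) != dimension:
            raise ValueError(
                f"vector {vector_id!r} has {len(values)} values, expected {dimension}"
            )


# Passes silently: one 4-dimensional vector checked against dimension 4.
check_dimensions([("id1", [0.1, 0.2, 0.3, 0.4])], dimension=4)
```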

Upserting sparse vectors

Upserting a sparse vector
from pinecone import (
    Pinecone,
    Vector,
    SparseValues,
)

pc = Pinecone()
idx = pc.Index(host="example-index-host")

idx.upsert(
    namespace='my-namespace',
    vectors=[
        Vector(
            id='id1',
            sparse_values=SparseValues(
                indices=[1, 2],
                values=[0.2, 0.4]
            )
        ),
    ]
)
Upserting a sparse vector using a dictionary
idx.upsert(
    namespace='my-namespace',
    vectors=[
        {
            "id": "id1",
            "sparse_values": {
                "indices": [1, 2],
                "values": [0.2, 0.4]
            }
        },
    ]
)

Batch upsert

If you have a large number of vectors, you can upsert them in batches.

Upserting in batches
from pinecone import Pinecone, Vector
import random

pc = Pinecone()
idx = pc.Index(host="example-index-host")

num_vectors = 100000
vectors = [
    Vector(
        id=f'id{i}',
        values=[random.random() for _ in range(1536)])
    for i in range(num_vectors)
]

idx.upsert(
    namespace='my-namespace',
    vectors=vectors,
    batch_size=50
)
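
Conceptually, batch_size splits the input list into consecutive chunks that are sent one request at a time. The plain-Python sketch below illustrates that chunking only; it is not the SDK's internal implementation, and batched is a hypothetical helper name.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Seven items in batches of three: the last batch is shorter.
batches = list(batched(list(range(7)), 3))
```

With 100,000 vectors and batch_size=50, as in the example above, this kind of chunking yields 2,000 batches.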

Visual progress bar with tqdm

To see a progress bar when upserting in batches, you will need to separately install tqdm. If tqdm is present, the client will detect and use it to display progress when show_progress=True.

To upsert in parallel, follow this link.

Index.query(*args, top_k: int, vector: list[float] | None = None, id: str | None = None, namespace: str | None = None, filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, include_values: bool | None = None, include_metadata: bool | None = None, sparse_vector: SparseValues | SparseVectorTypedDict | None = None, **kwargs) QueryResponse | ApplyResult[source]

Query a namespace using a query vector.

The Query operation searches a namespace, using a query vector. It retrieves the ids of the most similar items in a namespace, along with their similarity scores.

Parameters:
  • top_k – The number of results to return for each query. Must be a positive integer (1 or greater).

  • vector – The query vector. This should be the same length as the dimension of the index being queried. Each query() request can contain only one of the parameters id or vector. [optional]

  • id – The unique ID of the vector to be used as a query vector. Each query() request can contain only one of the parameters vector or id. [optional]

  • namespace – The namespace to query vectors from. If not specified, the default namespace is used. [optional]

  • filter – The filter to apply. You can use vector metadata to limit your search. See metadata filtering (https://www.pinecone.io/docs/metadata-filtering/). [optional]

  • include_values – Indicates whether vector values are included in the response. If omitted, the server uses the default value of False. [optional]

  • include_metadata – Indicates whether metadata is included in the response as well as the ids. If omitted, the server uses the default value of False. [optional]

  • sparse_vector – Sparse values of the query vector. Expected to be either a SparseValues object or a dict of the form: {'indices': list[int], 'values': list[float]}, where the lists each have the same length. [optional]

  • **kwargs – Additional keyword arguments for the API call.
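
The dict form of sparse_vector described above requires two lists of equal length. A small local check, sketched here with a hypothetical validate_sparse_vector helper, can catch malformed input before a request is made; the SDK and server may apply their own, possibly stricter, validation.

```python
from typing import Any, Mapping


def validate_sparse_vector(sparse: Mapping[str, Any]) -> None:
    """Raise ValueError unless `sparse` has equal-length 'indices' and 'values' lists."""
    missing = {"indices", "values"} - set(sparse)
    if missing:
        raise ValueError(f"sparse vector is missing keys: {sorted(missing)}")
    indices, values = sparse["indices"], sparse["values"]
    if len(indices) != len(values):
        raise ValueError(
            f"'indices' has {len(indices)} entries but 'values' has {len(values)}"
        )
    # bool is a subclass of int in Python, so reject it explicitly.
    if any(isinstance(i, bool) or not isinstance(i, int) for i in indices):
        raise ValueError("'indices' must contain integers only")


# Passes silently: two indices paired with two values.
validate_sparse_vector({"indices": [1, 2], "values": [0.2, 0.4]})
```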

Returns:

Object which contains the list of the closest vectors as ScoredVector objects, and namespace name. If async_req=True, returns ApplyResult instead.

Return type:

QueryResponse

Examples:

>>> # Query with a vector
>>> response = index.query(vector=[1, 2, 3], top_k=10, namespace='my_namespace')
>>> for match in response.matches:
...     print(f"ID: {match.id}, Score: {match.score}")

>>> # Query using an existing vector ID
>>> response = index.query(id='id1', top_k=10, namespace='my_namespace')

>>> # Query with metadata filter
>>> response = index.query(
...     vector=[1, 2, 3],
...     top_k=10,
...     namespace='my_namespace',
...     filter={'key': 'value'}
... )

>>> # Query with include_values and include_metadata
>>> response = index.query(
...     id='id1',
...     top_k=10,
...     namespace='my_namespace',
...     include_metadata=True,
...     include_values=True
... )

>>> # Query with sparse vector (hybrid search)
>>> response = index.query(
...     vector=[1, 2, 3],
...     sparse_vector={'indices': [1, 2], 'values': [0.2, 0.4]},
...     top_k=10,
...     namespace='my_namespace'
... )

>>> # Query with sparse vector using SparseValues object
>>> from pinecone import SparseValues
>>> response = index.query(
...     vector=[1, 2, 3],
...     sparse_vector=SparseValues(indices=[1, 2], values=[0.2, 0.4]),
...     top_k=10,
...     namespace='my_namespace'
... )
Index.query_namespaces(vector: list[float] | None, namespaces: list[str], metric: Literal['cosine', 'euclidean', 'dotproduct'], top_k: int | None = None, filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, include_values: bool | None = None, include_metadata: bool | None = None, sparse_vector: SparseValues | SparseVectorTypedDict | None = None, **kwargs) QueryNamespacesResults[source]

Query multiple namespaces in parallel and combine the results.

The query_namespaces() method is used to make a query to multiple namespaces in parallel and combine the results into one result set.

Note

Since several asynchronous calls are made on your behalf when calling this method, you will need to tune the pool_threads and connection_pool_maxsize parameters of the Index constructor to suit your workload. If these values are too small in relation to your workload, you will experience performance issues as requests queue up while waiting for a request thread to become available.

Parameters:
  • vector – The query vector, must be the same length as the dimension of the index being queried.

  • namespaces – The list of namespaces to query.

  • metric – Must be one of ‘cosine’, ‘euclidean’, ‘dotproduct’. This is needed in order to merge results across namespaces, since the interpretation of score depends on the index metric type.

  • top_k – The number of results you would like to request from each namespace. Defaults to 10. [optional]

  • filter – Pass an optional filter to filter results based on metadata. Defaults to None. [optional]

  • include_values – Boolean field indicating whether vector values should be included with results. Defaults to None. [optional]

  • include_metadata – Boolean field indicating whether vector metadata should be included with results. Defaults to None. [optional]

  • sparse_vector – If you are working with a dotproduct index, you can pass a sparse vector as part of your hybrid search. Defaults to None. [optional]

  • **kwargs – Additional keyword arguments for the API call.
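
To see why metric is required, consider how per-namespace results could be merged. The sketch below is an illustration, not the SDK's implementation; it assumes that higher scores are better for cosine and dotproduct and that lower scores are better for euclidean. The merge_matches helper and its (id, score) tuple format are hypothetical.

```python
import heapq


def merge_matches(
    per_namespace: dict[str, list[tuple[str, float]]],
    metric: str,
    top_k: int,
) -> list[tuple[str, str, float]]:
    """Merge per-namespace (id, score) lists into the overall best `top_k`.

    Returns (id, namespace, score) tuples ordered from best to worst.
    """
    if metric not in ("cosine", "euclidean", "dotproduct"):
        raise ValueError(f"unsupported metric: {metric!r}")
    flat = [
        (vector_id, namespace, score)
        for namespace, matches in per_namespace.items()
        for vector_id, score in matches
    ]
    if metric == "euclidean":
        # Assumption: a smaller distance means a closer match.
        return heapq.nsmallest(top_k, flat, key=lambda m: m[2])
    # Assumption: a larger similarity means a closer match.
    return heapq.nlargest(top_k, flat, key=lambda m: m[2])


results = {"ns1": [("a", 0.9), ("b", 0.5)], "ns2": [("c", 0.8)]}
best_cosine = merge_matches(results, "cosine", top_k=2)
best_euclidean = merge_matches(results, "euclidean", top_k=2)
```

The same scores produce different winners depending on the metric, which is why the method cannot merge results without being told which one the index uses.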

Returns:

A QueryNamespacesResults object containing the combined results from all namespaces, as well as the combined usage cost in read units.

Return type:

QueryNamespacesResults

Examples:

from pinecone import Pinecone

pc = Pinecone()

index = pc.Index(
    host="example-index-host",
    pool_threads=32,
    connection_pool_maxsize=32
)

query_vec = [0.1, 0.2, 0.3]  # An embedding that matches the index dimension
combined_results = index.query_namespaces(
    vector=query_vec,
    namespaces=['ns1', 'ns2', 'ns3', 'ns4'],
    metric="cosine",
    top_k=10,
    filter={'genre': {"$eq": "drama"}},
    include_values=True,
    include_metadata=True
)

for vec in combined_results.matches:
    print(vec.id, vec.score)
print(combined_results.usage)
Index.delete(ids: list[str] | None = None, delete_all: bool | None = None, namespace: str | None = None, filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, **kwargs) dict[str, Any][source]

Delete vectors from the index, from a single namespace.

The Delete operation deletes vectors from the index, from a single namespace. No error is raised if the vector id does not exist.

Note: For any delete call, if namespace is not specified, the default namespace "" is used. Since the delete operation does not error when ids are not present, this means you may not receive an error if you delete from the wrong namespace.

Delete can occur in the following mutually exclusive ways:

  1. Delete by ids from a single namespace

  2. Delete all vectors from a single namespace by setting delete_all to True

  3. Delete the vectors in a single namespace that match a metadata filter (note that for this option delete_all must be set to False)
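
The three modes above are mutually exclusive, so a caller can check which one a set of arguments selects before sending a request. The following sketch uses a hypothetical delete_mode helper; it mirrors the rule as described here and is not the SDK's or the server's own validation.

```python
from typing import Any, Optional


def delete_mode(
    ids: Optional[list[str]] = None,
    delete_all: Optional[bool] = None,
    filter: Optional[dict[str, Any]] = None,
) -> str:
    """Return which delete mode the arguments select, or raise if it is ambiguous."""
    chosen = [
        name
        for name, given in (
            ("ids", bool(ids)),
            ("delete_all", bool(delete_all)),  # None and False both mean "not set"
            ("filter", filter is not None),
        )
        if given
    ]
    if len(chosen) != 1:
        raise ValueError(
            f"exactly one of ids, delete_all=True, or filter is required; got {chosen}"
        )
    return chosen[0]


mode = delete_mode(filter={"key": "value"}, delete_all=False)
```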

Parameters:
  • ids – Vector ids to delete. [optional]

  • delete_all – This indicates that all vectors in the index namespace should be deleted. Default is False. [optional]

  • namespace – The namespace to delete vectors from. If not specified, the default namespace is used. [optional]

  • filter – If specified, the metadata filter here will be used to select the vectors to delete. This is mutually exclusive with specifying ids to delete in the ids param or using delete_all=True. See metadata filtering (https://www.pinecone.io/docs/metadata-filtering/). [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

An empty dictionary if the delete operation was successful.

Return type:

dict[str, Any]

Examples:

>>> # Delete specific vectors by ID
>>> index.delete(ids=['id1', 'id2'], namespace='my_namespace')
{}

>>> # Delete all vectors from a namespace
>>> index.delete(delete_all=True, namespace='my_namespace')
{}

>>> # Delete vectors matching a metadata filter
>>> index.delete(filter={'key': 'value'}, namespace='my_namespace')
{}
Index.fetch(ids: list[str], namespace: str | None = None, **kwargs) FetchResponse[source]

Fetch vectors by ID from a single namespace.

The fetch operation looks up and returns vectors, by ID, from a single namespace. The returned vectors include the vector data and/or metadata.

Parameters:
  • ids – The vector IDs to fetch.

  • namespace – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Object which contains the list of Vector objects, and namespace name.

Return type:

FetchResponse

Examples:

>>> # Fetch vectors from a specific namespace
>>> response = index.fetch(ids=['id1', 'id2'], namespace='my_namespace')
>>> for vector_id, vector in response.vectors.items():
...     print(f"{vector_id}: {vector.values}")

>>> # Fetch vectors from the default namespace
>>> response = index.fetch(ids=['id1', 'id2'])
Index.list(**kwargs)[source]

List vector IDs based on an id prefix within a single namespace (generator).

The list operation accepts all of the same arguments as list_paginated, and returns a generator that yields a list of the matching vector ids in each page of results. It automatically handles pagination tokens on your behalf.

Parameters:
  • prefix – The id prefix to match. If unspecified, an empty string prefix will be used, with the effect of listing all ids in a namespace. [optional]

  • limit – The maximum number of ids to return. If unspecified, the server will use a default value. [optional]

  • pagination_token – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

  • namespace – The namespace to list vector ids from. If not specified, the default namespace is used. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Yields:

list[str] – A list of vector IDs for each page of results.

Examples:

>>> # Iterate over all vector IDs with a prefix
>>> for ids in index.list(prefix='99', limit=5, namespace='my_namespace'):
...     print(ids)
['99', '990', '991', '992', '993']
['994', '995', '996', '997', '998']
['999']

>>> # Convert generator to list (be cautious with large datasets)
>>> all_ids = []
>>> for ids in index.list(prefix='99', namespace='my_namespace'):
...     all_ids.extend(ids)
Index.list_paginated(prefix: str | None = None, limit: int | None = None, pagination_token: str | None = None, namespace: str | None = None, **kwargs) ListResponse[source]

List vector IDs based on an id prefix within a single namespace (paginated).

The list_paginated operation finds vectors based on an id prefix within a single namespace. It returns matching ids in a paginated form, with a pagination token to fetch the next page of results. This id list can then be passed to fetch or delete operations, depending on your use case.

Consider using the list method to avoid having to handle pagination tokens manually.

Parameters:
  • prefix – The id prefix to match. If unspecified, an empty string prefix will be used, with the effect of listing all ids in a namespace. [optional]

  • limit – The maximum number of ids to return. If unspecified, the server will use a default value. [optional]

  • pagination_token – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

  • namespace – The namespace to list vector ids from. If not specified, the default namespace is used. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Object which contains the list of ids, the namespace name, pagination information, and usage showing the number of read_units consumed.

Return type:

ListResponse

Examples:

>>> # List vectors with a prefix
>>> results = index.list_paginated(prefix='99', limit=5, namespace='my_namespace')
>>> [v.id for v in results.vectors]
['99', '990', '991', '992', '993']
>>> # Get next page
>>> if results.pagination and results.pagination.next:
...     next_results = index.list_paginated(
...         prefix='99',
...         limit=5,
...         namespace='my_namespace',
...         pagination_token=results.pagination.next
...     )
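
If you do handle pagination tokens yourself, the loop looks roughly like the sketch below. It relies only on the response attributes used in the example above (vectors, each with an id, and pagination.next). The iter_all_ids helper and the fake_list_paginated stand-in are hypothetical and exist only so the sketch runs without a live index.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator


def iter_all_ids(list_page: Callable[..., Any], **params: Any) -> Iterator[str]:
    """Yield every id by calling `list_page` until no next token is returned."""
    token = None
    while True:
        page = list_page(pagination_token=token, **params)
        for vector in page.vectors:
            yield vector.id
        pagination = getattr(page, "pagination", None)
        token = getattr(pagination, "next", None) if pagination else None
        if not token:
            return


def fake_list_paginated(prefix=None, limit=2, pagination_token=None, namespace=None):
    """Stand-in for index.list_paginated that serves a fixed id list in pages."""
    ids = [i for i in ["99", "990", "991", "992", "993"] if i.startswith(prefix or "")]
    start = int(pagination_token or 0)
    more = start + limit < len(ids)
    return SimpleNamespace(
        vectors=[SimpleNamespace(id=i) for i in ids[start:start + limit]],
        pagination=SimpleNamespace(next=str(start + limit)) if more else None,
    )


all_ids = list(iter_all_ids(fake_list_paginated, prefix="99", limit=2))
```

With a real client you would pass index.list_paginated in place of the stand-in.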
Index.fetch_by_metadata(filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]], namespace: str | None = None, limit: int | None = None, pagination_token: str | None = None, **kwargs) FetchByMetadataResponse[source]

Fetch vectors by metadata filter.

Look up and return vectors by metadata filter from a single namespace. The returned vectors include the vector data and/or metadata.

Parameters:
  • filter – Metadata filter expression to select vectors. See metadata filtering (https://www.pinecone.io/docs/metadata-filtering/).

  • namespace – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

  • limit – Max number of vectors to return. Defaults to 100. [optional]

  • pagination_token – Pagination token to continue a previous listing operation. [optional]

  • **kwargs – Additional keyword arguments for the API call.
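
To make the filter syntax more concrete, here is a small local evaluator for the operators that appear in the type signature ($eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $exists, $and, $or). It is an illustrative sketch only; the server's actual semantics may differ. It assumes that multiple top-level keys are combined with AND, that a bare value is shorthand for equality, and that a missing field fails every comparison except $exists. List-valued metadata is not modeled.

```python
from typing import Any, Mapping

_COMPARATORS = {
    "$eq": lambda field, operand: field == operand,
    "$ne": lambda field, operand: field != operand,
    "$gt": lambda field, operand: field > operand,
    "$gte": lambda field, operand: field >= operand,
    "$lt": lambda field, operand: field < operand,
    "$lte": lambda field, operand: field <= operand,
    "$in": lambda field, operand: field in operand,
    "$nin": lambda field, operand: field not in operand,
}


def _check(metadata: Mapping[str, Any], field: str, op: str, operand: Any) -> bool:
    """Evaluate a single operator against one metadata field."""
    if op == "$exists":
        return (field in metadata) == operand
    if field not in metadata:
        return False  # assumption of this sketch, not documented behavior
    return _COMPARATORS[op](metadata[field], operand)


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if `metadata` satisfies the filter expression `flt`."""
    for key, condition in flt.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in condition)
        elif isinstance(condition, Mapping):
            ok = all(_check(metadata, key, op, arg) for op, arg in condition.items())
        else:
            # A bare value is treated as shorthand for {"$eq": value}.
            ok = key in metadata and metadata[key] == condition
        if not ok:
            return False
    return True


flt = {"genre": {"$in": ["comedy", "drama"]}, "year": {"$eq": 2019}}
hit = matches({"genre": "drama", "year": 2019}, flt)
miss = matches({"genre": "drama", "year": 2020}, flt)
```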

Returns:

Object containing the fetched vectors, namespace, usage, and pagination token.

Return type:

FetchByMetadataResponse

Examples:

>>> # Fetch vectors matching a complex filter
>>> response = index.fetch_by_metadata(
...     filter={'genre': {'$in': ['comedy', 'drama']}, 'year': {'$eq': 2019}},
...     namespace='my_namespace',
...     limit=50
... )
>>> print(f"Found {len(response.vectors)} vectors")

>>> # Fetch vectors with pagination
>>> response = index.fetch_by_metadata(
...     filter={'status': 'active'},
...     pagination_token='token123'
... )
>>> if response.pagination:
...     print(f"Next page token: {response.pagination.next}")
Index.update(id: str | None = None, values: list[float] | None = None, set_metadata: dict[str, str | int | float | list[str] | list[int] | list[float]] | None = None, namespace: str | None = None, sparse_values: SparseValues | SparseVectorTypedDict | None = None, filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, dry_run: bool | None = None, **kwargs) UpdateResponse[source]

Update vectors in a namespace.

The Update operation updates vectors in a namespace.

This method supports two update modes:

  1. Single vector update by ID: Provide id to update a specific vector.

    • Updates the vector with the given ID.

    • If values is included, it will overwrite the previous vector values.

    • If set_metadata is included, the metadata will be merged with existing metadata on the vector. Fields specified in set_metadata will overwrite existing fields with the same key, while fields not in set_metadata will remain unchanged.

  2. Bulk update by metadata filter: Provide filter to update all vectors matching the filter criteria.

    • Updates all vectors in the namespace that match the filter expression.

    • Useful for updating metadata across multiple vectors at once.

    • If set_metadata is included, the metadata will be merged with existing metadata on each vector. Fields specified in set_metadata will overwrite existing fields with the same key, while fields not in set_metadata will remain unchanged.

    • The response includes matched_records indicating how many vectors were updated.

Either id or filter must be provided (but not both in the same call).
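
The merge rule for set_metadata and the "exactly one of id or filter" rule can be pictured with plain dictionaries. The following is a minimal local sketch, not SDK code: merge_metadata and check_update_args are hypothetical helpers that only imitate the behaviour described above.

```python
from typing import Any, Optional


def merge_metadata(existing: dict[str, Any], set_metadata: dict[str, Any]) -> dict[str, Any]:
    """Local illustration of the documented merge rule (hypothetical helper)."""
    merged = dict(existing)      # keys absent from set_metadata stay unchanged
    merged.update(set_metadata)  # keys present in set_metadata overwrite
    return merged


def check_update_args(id: Optional[str] = None, filter: Optional[dict] = None) -> str:
    """Return the update mode a call would use, or raise if the arguments are invalid."""
    if (id is None) == (filter is None):
        raise ValueError("Provide exactly one of id or filter")
    return "single" if id is not None else "bulk"


before = {"genre": "drama", "year": 2020}
after = merge_metadata(before, {"year": 2021, "status": "active"})
print(after)  # {'genre': 'drama', 'year': 2021, 'status': 'active'}
```

The actual merge happens server-side; the sketch is only meant to make the expected result of a set_metadata call easy to predict.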

Parameters:
  • id – Vector’s unique id. Required for single vector updates. Must not be provided when using filter. [optional]

  • values – Vector values to set. [optional]

  • set_metadata – Metadata to merge with existing metadata on the vector(s). Fields specified will overwrite existing fields with the same key, while fields not specified will remain unchanged. [optional]

  • namespace – Namespace name where to update the vector(s). [optional]

  • sparse_values – Sparse values to update for the vector. Expected to be either a SparseValues object or a dict of the form: {'indices': list[int], 'values': list[float]} where the lists each have the same length. [optional]

  • filter – A metadata filter expression. When provided, updates all vectors in the namespace that match the filter criteria. See metadata filtering (https://www.pinecone.io/docs/metadata-filtering/). Must not be provided when using id. Either id or filter must be provided. [optional]

  • dry_run – If True, return the number of records that match the filter without executing the update. Only meaningful when using filter (not with id). Useful for previewing the impact of a bulk update before applying changes. Defaults to False. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

An UpdateResponse object. When using filter-based updates, the response includes matched_records indicating the number of vectors that were updated (or would be updated if dry_run=True).

Return type:

UpdateResponse

Examples:

Single vector update by ID:

>>> # Update vector values
>>> index.update(id='id1', values=[1, 2, 3], namespace='my_namespace')

>>> # Update vector metadata
>>> index.update(id='id1', set_metadata={'key': 'value'}, namespace='my_namespace')

>>> # Update vector values and sparse values
>>> index.update(
...     id='id1',
...     values=[1, 2, 3],
...     sparse_values={'indices': [1, 2], 'values': [0.2, 0.4]},
...     namespace='my_namespace'
... )

>>> # Update with SparseValues object
>>> from pinecone import SparseValues
>>> index.update(
...     id='id1',
...     values=[1, 2, 3],
...     sparse_values=SparseValues(indices=[1, 2], values=[0.2, 0.4]),
...     namespace='my_namespace'
... )

Bulk update by metadata filter:

>>> # Update metadata for all vectors matching the filter
>>> response = index.update(
...     set_metadata={'status': 'active'},
...     filter={'genre': {'$eq': 'drama'}},
...     namespace='my_namespace'
... )
>>> print(f"Updated {response.matched_records} vectors")

>>> # Preview how many vectors would be updated (dry run)
>>> response = index.update(
...     set_metadata={'status': 'active'},
...     filter={'genre': {'$eq': 'drama'}},
...     namespace='my_namespace',
...     dry_run=True
... )
>>> print(f"Would update {response.matched_records} vectors")
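
One way to combine the two bulk-update calls above is a preview-then-apply guard. This is a sketch under assumptions: guarded_bulk_update and its max_records threshold are hypothetical, and fake_update stands in for index.update so the example runs without a Pinecone connection.

```python
from types import SimpleNamespace
from typing import Any, Callable


def guarded_bulk_update(
    update: Callable[..., Any],
    *,
    set_metadata: dict,
    filter: dict,
    namespace: str,
    max_records: int,
) -> int:
    """Preview with dry_run=True, then apply only if the match count is acceptable."""
    preview = update(
        set_metadata=set_metadata, filter=filter, namespace=namespace, dry_run=True
    )
    if preview.matched_records > max_records:
        raise RuntimeError(
            f"Refusing to update {preview.matched_records} records (limit {max_records})"
        )
    applied = update(set_metadata=set_metadata, filter=filter, namespace=namespace)
    return applied.matched_records


# Stand-in for index.update; it records each call and always reports 3 matches.
calls = []


def fake_update(**kwargs):
    calls.append(kwargs)
    return SimpleNamespace(matched_records=3)


updated = guarded_bulk_update(
    fake_update,
    set_metadata={"status": "active"},
    filter={"genre": {"$eq": "drama"}},
    namespace="my_namespace",
    max_records=10,
)
```

With a real client, index.update would be passed as the first argument in place of fake_update.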
Index.upsert_from_dataframe(df, namespace: str | None = None, batch_size: int = 500, show_progress: bool = True) UpsertResponse[source]

Upsert vectors from a pandas DataFrame into the index.

Parameters:
  • df – A pandas DataFrame with the following columns: id, values, sparse_values, and metadata.

  • namespace – The namespace to upsert into. If not specified, the default namespace is used. [optional]

  • batch_size – The number of rows to upsert in a single batch. Defaults to 500.

  • show_progress – Whether to show a progress bar. Defaults to True.

Returns:

Object containing the number of vectors upserted.

Return type:

UpsertResponse

Examples:

import pandas as pd
from pinecone import Pinecone

pc = Pinecone()
idx = pc.Index(host="example-index-host")

# Create a DataFrame with vector data
df = pd.DataFrame({
    'id': ['id1', 'id2', 'id3'],
    'values': [
        [0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6],
        [0.7, 0.8, 0.9]
    ],
    'metadata': [
        {'key1': 'value1'},
        {'key2': 'value2'},
        {'key3': 'value3'}
    ]
})

# Upsert from DataFrame
response = idx.upsert_from_dataframe(
    df=df,
    namespace='my-namespace',
    batch_size=100,
    show_progress=True
)
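
To get a feel for how batch_size affects the number of requests, the row ranges can be computed up front. This sketch assumes rows are sent in order as consecutive slices, which the reference does not state explicitly; batch_ranges is a hypothetical helper, not part of the SDK.

```python
def batch_ranges(n_rows: int, batch_size: int = 500) -> list[tuple[int, int]]:
    """Return (start, stop) row ranges covering n_rows in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        (start, min(start + batch_size, n_rows))
        for start in range(0, n_rows, batch_size)
    ]


# Each pair could be applied to a DataFrame as df.iloc[start:stop].
ranges = batch_ranges(1200, batch_size=500)
print(ranges)  # [(0, 500), (500, 1000), (1000, 1200)]
```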

Bulk Import

Index.start_import(uri: str, integration_id: str | None = None, error_mode: 'ImportErrorMode' | Literal['CONTINUE', 'ABORT'] | str | None = 'CONTINUE') StartImportResponse[source]
Parameters:
  • uri (str) – The URI of the data to import. The URI must start with the scheme of a supported storage provider.

  • integration_id (str | None, optional) – If your bucket requires authentication to access, you need to pass the id of your storage integration using this property. Defaults to None.

  • error_mode – Defaults to “CONTINUE”. If set to “CONTINUE”, the import operation will continue even if some records fail to import. Pass “ABORT” to stop the import operation if any records fail to import.

Returns:

Contains the id of the import operation.

Return type:

StartImportResponse

Import data from a storage provider into an index. The uri must start with the scheme of a supported storage provider. For buckets that are not publicly readable, you will also need to separately configure a storage integration and pass the integration id.

Examples

>>> from pinecone import Pinecone
>>> index = Pinecone().Index('my-index')
>>> index.start_import(uri="s3://bucket-name/path/to/data.parquet")
{ id: "1" }
Index.list_imports(**kwargs) Iterator['ImportModel'][source]
Parameters:
  • limit (int | None) – The maximum number of operations to fetch in each network call. If unspecified, the server will use a default value. [optional]

  • pagination_token (str | None) – When there are multiple pages of results, a pagination token is returned in the response. The token can be used to fetch the next page of results. [optional]

Returns:

Returns a generator that yields each import operation. It automatically handles pagination tokens on your behalf so you can easily iterate over all results. The list_imports method accepts all of the same arguments as list_imports_paginated.

for op in index.list_imports():
    print(op)

You can convert the generator into a list by wrapping the generator in a call to the built-in list function:

operations = list(index.list_imports())

You should be cautious with this approach because it will fetch all operations at once, which could be a large number of network calls and a lot of memory to hold the results.
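
The token handling that the generator performs can be sketched roughly as follows. This is not the SDK's implementation: iterate_pages is a hypothetical helper, and fake_list_imports_paginated is an in-memory stand-in that mimics the data and pagination.next fields shown for list_imports_paginated. Wrapping the generator in itertools.islice is one way to stop early instead of materializing every result.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Callable, Iterator, Optional


def iterate_pages(fetch_page: Callable[..., object], limit: Optional[int] = None) -> Iterator[object]:
    """Yield every item across pages, following pagination.next until it is absent."""
    token = None
    while True:
        page = fetch_page(limit=limit, pagination_token=token)
        yield from page.data
        pagination = getattr(page, "pagination", None)
        token = getattr(pagination, "next", None)
        if not token:
            return


# In-memory stand-in for index.list_imports_paginated (7 fake operations).
_ITEMS = [f"import-{i}" for i in range(7)]


def fake_list_imports_paginated(limit=None, pagination_token=None):
    start = int(pagination_token or 0)
    stop = start + (limit or 3)
    next_token = str(stop) if stop < len(_ITEMS) else None
    pagination = SimpleNamespace(next=next_token) if next_token else None
    return SimpleNamespace(data=_ITEMS[start:stop], pagination=pagination)


all_ops = list(iterate_pages(fake_list_imports_paginated, limit=3))
# Only the first page is fetched here, because the generator is lazy.
first_two = list(islice(iterate_pages(fake_list_imports_paginated, limit=3), 2))
```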

Index.list_imports_paginated(limit: int | None = None, pagination_token: str | None = None, **kwargs) ListImportsResponse[source]
Parameters:
  • limit (int | None) – The maximum number of ids to return. If unspecified, the server will use a default value. [optional]

  • pagination_token (str | None) – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

Returns:

ListImportsResponse object which contains the list of operations as ImportModel objects, pagination information, and usage showing the number of read_units consumed.

The list_imports_paginated() operation returns information about import operations. It returns operations in a paginated form, with a pagination token to fetch the next page of results.

Consider using the list_imports method to avoid having to handle pagination tokens manually.

Examples:

>>> results = index.list_imports_paginated(limit=5)
>>> results.pagination.next
eyJza2lwX3Bhc3QiOiI5OTMiLCJwcmVmaXgiOiI5OSJ9
>>> results.data[0]
{
    "id": "6",
    "uri": "s3://dev-bulk-import-datasets-pub/10-records-dim-10/",
    "status": "Completed",
    "percent_complete": 100.0,
    "records_imported": 10,
    "created_at": "2024-09-06T14:52:02.567776+00:00",
    "finished_at": "2024-09-06T14:52:28.130717+00:00"
}
>>> next_results = index.list_imports_paginated(limit=5, pagination_token=results.pagination.next)
Index.describe_import(id: str) ImportModel[source]
Parameters:

id (str) – The id of the import operation. This value is returned when starting an import, and can be looked up using list_imports.

Returns:

An object containing operation id, status, and other details.

Return type:

ImportModel

describe_import is used to get detailed information about a specific import operation.
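
A common use of describe_import is polling until an import finishes. The sketch below is illustrative only: wait_for_import is a hypothetical helper, the "Completed" status comes from the sample output in this reference, and the other status names ("Pending", "InProgress", "Failed", "Cancelled") are assumptions. A stand-in replaces index.describe_import so the code runs offline.

```python
import time
from types import SimpleNamespace
from typing import Callable

# "Completed" appears in this reference; the other terminal names are assumptions.
TERMINAL_STATUSES = {"Completed", "Failed", "Cancelled"}


def wait_for_import(
    describe: Callable[[str], object],
    import_id: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
):
    """Poll describe(import_id) until the status is terminal or max_polls is reached."""
    for _ in range(max_polls):
        op = describe(import_id)
        if op.status in TERMINAL_STATUSES:
            return op
        time.sleep(poll_seconds)
    raise TimeoutError(f"Import {import_id} did not finish after {max_polls} polls")


# Stand-in for index.describe_import that reports completion on the third call.
_states = iter(["Pending", "InProgress", "Completed"])


def fake_describe_import(id: str):
    return SimpleNamespace(id=id, status=next(_states))


final = wait_for_import(fake_describe_import, "1", poll_seconds=0)
```

With a real client, index.describe_import would be passed as the describe argument.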

Index.cancel_import(id: str)[source]

Cancel an import operation.

Parameters:

id – The id of the import operation to cancel.

Returns:

The response from the cancel operation.

Examples:

>>> # Cancel an import operation
>>> index.cancel_import(id="import-123")

Records

If you have created an index using integrated inference, you can use the following methods to upsert and search records.

Index.upsert_records(namespace: str, records: list[dict]) UpsertResponse[source]

Upsert records to a namespace.

A record is a dictionary that contains either an id or _id field along with other fields that will be stored as metadata. The id or _id field is used as the unique identifier for the record. At least one field in the record should correspond to a field mapping in the index’s embed configuration.

When records are upserted, Pinecone converts mapped fields into embeddings and upserts them into the specified namespace of the index.

Parameters:
  • namespace – The namespace of the index to upsert records to.

  • records – The records to upsert into the index. Each record should contain an id or _id field and fields that match the index’s embed configuration field mappings.

Returns:

Object which contains the number of records upserted.

Return type:

UpsertResponse

Examples:

Upserting records to be embedded with Pinecone’s integrated inference models:

from pinecone import (
    Pinecone,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    IndexEmbed
)

pc = Pinecone(api_key="<<PINECONE_API_KEY>>")

# Create an index configured for the multilingual-e5-large model
index_model = pc.create_index_for_model(
    name="my-model-index",
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_WEST_2,
    embed=IndexEmbed(
        model=EmbedModel.Multilingual_E5_Large,
        field_map={"text": "my_text_field"}
    )
)

# Instantiate the index client
idx = pc.Index(host=index_model.host)

# Upsert records
idx.upsert_records(
    namespace="my-namespace",
    records=[
        {
            "_id": "test1",
            "my_text_field": "Apple is a popular fruit known for its sweetness and crisp texture.",
        },
        {
            "_id": "test2",
            "my_text_field": "The tech company Apple is known for its innovative products like the iPhone.",
        },
        {
            "_id": "test3",
            "my_text_field": "Many people enjoy eating apples as a healthy snack.",
        },
    ],
)
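
Because each record needs an id or _id field plus a field matching the index's field mapping, a quick local check before calling upsert_records can catch malformed input early. This is a sketch: validate_records is a hypothetical helper, and it assumes a single mapped field such as my_text_field from the example above.

```python
def validate_records(records: list[dict], mapped_field: str) -> list[str]:
    """Return a list of problems found; an empty list means the records look well-formed."""
    problems = []
    for position, record in enumerate(records):
        if "_id" not in record and "id" not in record:
            problems.append(f"record {position}: missing 'id' or '_id'")
        if mapped_field not in record:
            problems.append(f"record {position}: missing mapped field '{mapped_field}'")
    return problems


records = [
    {"_id": "test1", "my_text_field": "Apple is a popular fruit."},
    {"my_text_field": "No identifier here."},
    {"_id": "test3", "title": "No mapped text field."},
]
problems = validate_records(records, mapped_field="my_text_field")
for problem in problems:
    print(problem)
```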
Index.search(namespace: str, query: SearchQueryTypedDict | SearchQuery, rerank: SearchRerankTypedDict | SearchRerank | None = None, fields: list[str] | None = ['*']) SearchRecordsResponse[source]

Search for records in a namespace.

This operation converts a query to a vector embedding and then searches a namespace. You can optionally provide a reranking operation as part of the search.

Parameters:
  • namespace – The namespace in the index to search.

  • query – The SearchQuery to use for the search. The query can include a match_terms field to specify which terms must be present in the text of each search hit. The match_terms should be a dict with strategy (str) and terms (list[str]) keys, e.g. {"strategy": "all", "terms": ["term1", "term2"]}. Currently only “all” strategy is supported, which means all specified terms must be present. Note: match_terms is only supported for sparse indexes with integrated embedding configured to use the pinecone-sparse-english-v0 model.

  • rerank – The SearchRerank to use with the search request. [optional]

  • fields – List of fields to return in the response. Defaults to [“*”] to return all fields. [optional]

Returns:

The records that match the search.

Return type:

SearchRecordsResponse

Examples:

from pinecone import (
    Pinecone,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    IndexEmbed,
    SearchQuery,
    SearchRerank,
    RerankModel
)

pc = Pinecone(api_key="<<PINECONE_API_KEY>>")

# Create an index for your embedding model
index_model = pc.create_index_for_model(
    name="my-model-index",
    cloud=CloudProvider.AWS,
    region=AwsRegion.US_WEST_2,
    embed=IndexEmbed(
        model=EmbedModel.Multilingual_E5_Large,
        field_map={"text": "my_text_field"}
    )
)

# Instantiate the index client
idx = pc.Index(host=index_model.host)

# Search for similar records
response = idx.search(
    namespace="my-namespace",
    query=SearchQuery(
        inputs={
            "text": "Apple corporation",
        },
        top_k=3,
    ),
    rerank=SearchRerank(
        model=RerankModel.Bge_Reranker_V2_M3,
        rank_fields=["my_text_field"],
        top_n=3,
    ),
)
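
Since query also accepts a plain dictionary, a query with match_terms can be assembled without the SearchQuery class. The sketch below assumes the dictionary keys mirror the SearchQuery arguments used above (inputs, top_k) together with the match_terms shape described in the parameter list; build_search_query is a hypothetical helper.

```python
from typing import Any, Optional


def build_search_query(
    text: str, top_k: int, match_terms: Optional[list[str]] = None
) -> dict[str, Any]:
    """Build a dict-style search query, optionally requiring all given terms."""
    query: dict[str, Any] = {"inputs": {"text": text}, "top_k": top_k}
    if match_terms:
        # "all" is the only strategy this reference describes as supported.
        query["match_terms"] = {"strategy": "all", "terms": list(match_terms)}
    return query


query = build_search_query("Apple corporation", top_k=3, match_terms=["apple", "iphone"])
print(query)
```

The resulting dictionary could then be passed as the query argument of search(), keeping in mind the model restriction on match_terms noted above.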
Index.search_records(namespace: str, query: SearchQueryTypedDict | SearchQuery, rerank: SearchRerankTypedDict | SearchRerank | None = None, fields: list[str] | None = ['*']) SearchRecordsResponse[source]

Alias of the search() method.

See search() for full documentation and examples.

Namespaces

Index.create_namespace(name: str, schema: dict[str, Any] | None = None, **kwargs) NamespaceDescription[source]

Create a namespace in a serverless index.

For guidance and examples, see Manage namespaces.

Note: This operation is not supported for pod-based indexes.

Parameters:
  • name – The name of the namespace to create.

  • schema – Optional schema configuration for the namespace as a dictionary. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Information about the created namespace including vector count.

Return type:

NamespaceDescription

Examples:

>>> # Create a namespace with just a name
>>> namespace = index.create_namespace(name="my-namespace")
>>> print(f"Created namespace: {namespace.name}, Vector count: {namespace.vector_count}")

>>> # Create a namespace with schema configuration
>>> from pinecone.core.openapi.db_data.model.create_namespace_request_schema import CreateNamespaceRequestSchema
>>> schema = CreateNamespaceRequestSchema(fields={...})
>>> namespace = index.create_namespace(name="my-namespace", schema=schema)
Index.describe_namespace(namespace: str, **kwargs) NamespaceDescription[source]

Describe a namespace within an index, showing the vector count within the namespace.

Parameters:
  • namespace – The namespace to describe.

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Information about the namespace including vector count.

Return type:

NamespaceDescription

Examples:

>>> namespace_info = index.describe_namespace(namespace="my-namespace")
>>> print(f"Namespace: {namespace_info.name}")
>>> print(f"Vector count: {namespace_info.vector_count}")
Index.delete_namespace(namespace: str, **kwargs) dict[str, Any][source]

Delete a namespace from an index.

Parameters:
  • namespace – The namespace to delete.

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Response from the delete operation.

Return type:

dict[str, Any]

Examples:

>>> result = index.delete_namespace(namespace="my-namespace")
>>> print("Namespace deleted successfully")
Index.list_namespaces(limit: int | None = None, **kwargs) Iterator[ListNamespacesResponse][source]

List all namespaces in an index.

This method automatically handles pagination to return all results.

Parameters:
  • limit – The maximum number of namespaces to return. If unspecified, the server will use a default value. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

An iterator that yields ListNamespacesResponse objects containing the list of namespaces.

Return type:

Iterator[ListNamespacesResponse]

Examples:

>>> # Iterate over all namespaces
>>> for namespace_response in index.list_namespaces(limit=5):
...     for namespace in namespace_response.namespaces:
...         print(f"Namespace: {namespace.name}, Vector count: {namespace.vector_count}")

>>> # Convert to list (be cautious with large datasets)
>>> results = list(index.list_namespaces(limit=5))
>>> for namespace_response in results:
...     for namespace in namespace_response.namespaces:
...         print(f"Namespace: {namespace.name}, Vector count: {namespace.vector_count}")
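
Because the iterator yields whole response pages rather than individual namespaces, a small flattening generator can simplify downstream code. This is a sketch with stand-in objects: iter_namespaces is a hypothetical helper, and fake_pages imitates the namespaces, name, and vector_count attributes used in the examples above (with integer counts assumed for illustration).

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_namespaces(pages: Iterable[object]) -> Iterator[object]:
    """Flatten an iterable of response pages into individual namespace entries."""
    for page in pages:
        yield from page.namespaces


# Stand-in for the pages yielded by index.list_namespaces().
fake_pages = [
    SimpleNamespace(namespaces=[
        SimpleNamespace(name="a", vector_count=10),
        SimpleNamespace(name="b", vector_count=0),
    ]),
    SimpleNamespace(namespaces=[SimpleNamespace(name="c", vector_count=5)]),
]

names = [ns.name for ns in iter_namespaces(fake_pages)]
total_vectors = sum(ns.vector_count for ns in iter_namespaces(fake_pages))
```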
Index.list_namespaces_paginated(limit: int | None = None, pagination_token: str | None = None, **kwargs) ListNamespacesResponse[source]

List all namespaces in an index with pagination support.

The response includes pagination information if there are more results available.

Consider using the list_namespaces method to avoid having to handle pagination tokens manually.

Parameters:
  • limit – The maximum number of namespaces to return. If unspecified, the server will use a default value. [optional]

  • pagination_token – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

  • **kwargs – Additional keyword arguments for the API call.

Returns:

Object containing the list of namespaces and pagination information.

Return type:

ListNamespacesResponse

Examples:

>>> # Get first page of namespaces
>>> results = index.list_namespaces_paginated(limit=5)
>>> for namespace in results.namespaces:
...     print(f"Namespace: {namespace.name}, Vector count: {namespace.vector_count}")

>>> # Get next page if available
>>> if results.pagination and results.pagination.next:
...     next_results = index.list_namespaces_paginated(
...         limit=5,
...         pagination_token=results.pagination.next
...     )

Inference

Inference.embed(model: EmbedModel | str, inputs: str | list[Dict] | list[str], parameters: dict[str, Any] | None = None) EmbeddingsList[source]

Generates embeddings for the provided inputs using the specified model and (optional) parameters.

Parameters:
  • model (str, required) – The model to use for generating embeddings.

  • inputs (str | list, required) – A single string or a list of items to generate embeddings for.

  • parameters (dict, optional) – A dictionary of parameters to use when generating embeddings.

Returns:

EmbeddingsList object with keys data, model, and usage. The data key contains a list of n embeddings, where n = len(inputs). Precision of returned embeddings is either float16 or float32, with float32 being the default. The model key is the model used to generate the embeddings. The usage key contains the total number of tokens used at request time.

Return type:

EmbeddingsList

from pinecone import Pinecone

pc = Pinecone()
inputs = ["Who created the first computer?"]
outputs = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=inputs,
    parameters={"input_type": "passage", "truncate": "END"}
)
print(outputs)
# EmbeddingsList(
#     model='multilingual-e5-large',
#     data=[
#         {'values': [0.1, ...., 0.2]},
#     ],
#     usage={'total_tokens': 6}
# )

You can also use a single string input:

from pinecone import Pinecone

pc = Pinecone()
output = pc.inference.embed(
    model="multilingual-e5-large",
    inputs="Hello, world!",
    parameters={"input_type": "query"}
)

Or use the EmbedModel enum:

from pinecone import Pinecone
from pinecone.inference import EmbedModel

pc = Pinecone()
outputs = pc.inference.embed(
    model=EmbedModel.Multilingual_E5_Large,
    inputs=["Document 1", "Document 2"],
    parameters={"input_type": "passage"}
)
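
Since data holds one embedding per input, the results can be zipped back onto the inputs to build vectors for a later upsert. The sketch assumes embeddings come back in the same order as the inputs and that each entry exposes values as in the sample output; to_vectors is a hypothetical helper and fake_data stands in for a real response.

```python
from typing import Any


def to_vectors(
    ids: list[str], texts: list[str], embeddings: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Pair each input text with its embedding, relying on matching order and length."""
    if not (len(ids) == len(texts) == len(embeddings)):
        raise ValueError("ids, texts, and embeddings must have the same length")
    return [
        {"id": id_, "values": emb["values"], "metadata": {"text": text}}
        for id_, text, emb in zip(ids, texts, embeddings)
    ]


# Stand-in for the data field of an EmbeddingsList (real vectors are much longer).
fake_data = [{"values": [0.1, 0.2]}, {"values": [0.3, 0.4]}]
vectors = to_vectors(["doc1", "doc2"], ["Document 1", "Document 2"], fake_data)
```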
Inference.rerank(model: RerankModel | str, query: str, documents: list[str] | list[dict[str, Any]], rank_fields: list[str] = ['text'], return_documents: bool = True, top_n: int | None = None, parameters: dict[str, Any] | None = None) RerankResult[source]

Rerank documents with associated relevance scores that represent the relevance of each document to the provided query using the specified model.

Parameters:
  • model (str, required) – The model to use for reranking.

  • query (str, required) – The query to compare with documents.

  • documents (list, required) – A list of documents or strings to rank.

  • rank_fields (list, optional) – A list of document fields to use for ranking. Defaults to [“text”].

  • return_documents (bool, optional) – Whether to include the documents in the response. Defaults to True.

  • top_n (int, optional) – How many documents to return. Defaults to len(documents).

  • parameters (dict, optional) – A dictionary of parameters to use when ranking documents.

Returns:

RerankResult object with keys data and usage. The data key contains a list of n Document objects, where n = top_n. The documents are sorted in order of relevance, with the first being the most relevant. The index field can be used to locate the document relative to the list of documents specified in the request. Each document contains a score key representing how closely the document relates to the query.

Return type:

RerankResult

from pinecone import Pinecone

pc = Pinecone()
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="Tell me about tech companies",
    documents=[
        "Apple is a popular fruit known for its sweetness and crisp texture.",
        "Software is still eating the world.",
        "Many people enjoy eating apples as a healthy snack.",
        "Acme Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
        "An apple a day keeps the doctor away, as the saying goes.",
    ],
    top_n=2,
    return_documents=True,
)
print(result)
# RerankResult(
#     model='bge-reranker-v2-m3',
#     data=[{
#         index=3,
#         score=0.020924192,
#         document={
#             text='Acme Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.'
#         }
#     },{
#         index=1,
#         score=0.00034464317,
#         document={
#             text='Software is still eating the world.'
#         }
#     }],
#     usage={'rerank_units': 1}
# )

You can also use document dictionaries with custom fields:

from pinecone import Pinecone

pc = Pinecone()
result = pc.inference.rerank(
    model="pinecone-rerank-v0",
    query="What is machine learning?",
    documents=[
        {"text": "Machine learning is a subset of AI.", "category": "tech"},
        {"text": "Cooking recipes for pasta.", "category": "food"},
    ],
    rank_fields=["text"],
    top_n=1
)

Or use the RerankModel enum:

from pinecone import Pinecone
from pinecone.inference import RerankModel

pc = Pinecone()
result = pc.inference.rerank(
    model=RerankModel.Pinecone_Rerank_V0,
    query="Your query here",
    documents=["doc1", "doc2", "doc3"]
)
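
The index field is most useful when return_documents=False, because it lets you look up each result in the list you originally sent. The sketch below reuses the indices and scores from the sample output earlier in this section, written as plain dicts; real result rows may use attribute access instead, and resolve_documents is a hypothetical helper.

```python
def resolve_documents(documents: list[str], rows: list[dict]) -> list[tuple[str, float]]:
    """Map each reranked row back to its source text via the index field."""
    return [(documents[row["index"]], row["score"]) for row in rows]


documents = [
    "Apple is a popular fruit known for its sweetness and crisp texture.",
    "Software is still eating the world.",
    "Many people enjoy eating apples as a healthy snack.",
    "Acme Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces.",
    "An apple a day keeps the doctor away, as the saying goes.",
]

# Rows copied from the sample output in this reference, most relevant first.
rows = [
    {"index": 3, "score": 0.020924192},
    {"index": 1, "score": 0.00034464317},
]

ranked = resolve_documents(documents, rows)
for text, score in ranked:
    print(f"{score:.6f}  {text}")
```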
Inference.list_models(*, type: str | None = None, vector_type: str | None = None) ModelInfoList[source]

List all available models.

Parameters:
  • type (str, optional) – The type of model to list. Either “embed” or “rerank”.

  • vector_type (str, optional) – The type of vector to list. Either “dense” or “sparse”.

Returns:

A list of models.

Return type:

ModelInfoList

from pinecone import Pinecone

pc = Pinecone()

# List all models
models = pc.inference.list_models()

# List models, with model type filtering
models = pc.inference.list_models(type="embed")
models = pc.inference.list_models(type="rerank")

# List models, with vector type filtering
models = pc.inference.list_models(vector_type="dense")
models = pc.inference.list_models(vector_type="sparse")

# List models, with both type and vector type filtering
models = pc.inference.list_models(type="rerank", vector_type="dense")
Inference.get_model(model_name: str) ModelInfo[source]

Get details on a specific model.

Parameters:

model_name (str, required) – The name of the model to get details on.

Returns:

A ModelInfo object.

Return type:

ModelInfo

from pinecone import Pinecone

pc = Pinecone()
model_info = pc.inference.get_model(model_name="pinecone-rerank-v0")
print(model_info)
# {
#     "model": "pinecone-rerank-v0",
#     "short_description": "A state of the art reranking model that out-performs competitors on widely accepted benchmarks. It can handle chunks up to 512 tokens (1-2 paragraphs)",
#     "type": "rerank",
#     "supported_parameters": [
#         {
#             "parameter": "truncate",
#             "type": "one_of",
#             "value_type": "string",
#             "required": false,
#             "default": "END",
#             "allowed_values": [
#                 "END",
#                 "NONE"
#             ]
#         }
#     ],
#     "modality": "text",
#     "max_sequence_length": 512,
#     "max_batch_size": 100,
#     "provider_name": "Pinecone",
#     "supported_metrics": []
# }