PineconeGRPC

class pinecone.grpc.PineconeGRPC(api_key: str | None = None, host: str | None = None, proxy_url: str | None = None, proxy_headers: dict[str, str] | None = None, ssl_ca_certs: str | None = None, ssl_verify: bool | None = None, additional_headers: dict[str, str] | None = {}, pool_threads: int | None = None, **kwargs)[source]

An alternative version of the Pinecone client that uses gRPC instead of HTTP for data operations.

Installing the gRPC client

You must install extra dependencies in order to use the gRPC client.

Installing with pip

# Install the latest version
pip3 install "pinecone[grpc]"

# Install a specific version
pip3 install "pinecone[grpc]"==7.0.2

Installing with poetry

# Install the latest version
poetry add pinecone --extras grpc

# Install a specific version
poetry add pinecone==7.0.2 --extras grpc

Using the gRPC client

import os
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key=os.environ.get("PINECONE_API_KEY"))

# From this point on, usage is identical to the HTTP client.
index = pc.Index("my-index", host=os.environ.get("PINECONE_INDEX_HOST"))
index.query(...)
PineconeGRPC.Index(name: str = '', host: str = '', **kwargs)[source]

Target an index for data operations.

Target an index by host url

In production situations, you want to upsert or query your data as quickly as possible. If you know the host url of your index in advance, you can eliminate a round trip to the Pinecone control plane by specifying the host when targeting the index.

To find your host url, you can use the Pinecone control plane to describe the index. The host url is returned in the response. Or, alternatively, the host is displayed in the Pinecone web console.
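As a sketch of host-based targeting, assuming the host url is stored in a PINECONE_INDEX_HOST environment variable (the fallback host value and the normalize_host helper below are illustrative, not part of the client):

```python
import os

def normalize_host(host: str) -> str:
    """Index hosts are passed without a scheme; strip one if present."""
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            return host[len(prefix):]
    return host

# Hypothetical host url; yours is returned by describe_index or shown in the web console.
host = normalize_host(os.environ.get(
    "PINECONE_INDEX_HOST",
    "my-index-dojoi3u.svc.aped-4627-b74a.pinecone.io",
))

# With the client installed and an API key available, targeting by host
# skips the control-plane lookup entirely:
# from pinecone.grpc import PineconeGRPC
# pc = PineconeGRPC(api_key=os.environ["PINECONE_API_KEY"])
# index = pc.Index(host=host)
```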

Target an index by name (not recommended for production)

For more casual usage, such as when you are exploring Pinecone in a notebook setting, you can also target an index by name. If you use this approach, the client may need to make an extra call to the Pinecone control plane to resolve the index host on your behalf.

The client will cache the index host for future use whenever it is seen, so you will only incur the overhead of one extra call. Still, this approach is not recommended for production usage.

import os
from pinecone import ServerlessSpec
from pinecone.grpc import PineconeGRPC

api_key = os.environ.get("PINECONE_API_KEY")

pc = PineconeGRPC(api_key=api_key)
pc.create_index(
    name='my-index',
    dimension=1536,
    metric='cosine',
    spec=ServerlessSpec(cloud='aws', region='us-west-2')
)
index = pc.Index('my-index')

# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)

DB Control Plane

Indexes

PineconeGRPC.create_index(name: str, spec: Dict | 'ServerlessSpec' | 'PodSpec' | 'ByocSpec', dimension: int | None = None, metric: 'Metric' | str | None = 'cosine', timeout: int | None = None, deletion_protection: 'DeletionProtection' | str | None = 'disabled', vector_type: 'VectorType' | str | None = 'dense', tags: dict[str, str] | None = None) IndexModel

Creates a Pinecone index.

Parameters:
  • name (str) – The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.

  • metric (str, optional) – Type of similarity metric used in the vector index when querying, one of {"cosine", "dotproduct", "euclidean"}.

  • spec (Dict) – A dictionary containing configurations describing how the index should be deployed. For serverless indexes, specify region and cloud. Optionally, you can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated) and schema to configure which metadata fields are filterable. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection. Alternatively, use the ServerlessSpec, PodSpec, or ByocSpec objects to specify these configurations.

  • dimension (int) – If you are creating an index with vector_type="dense" (which is the default), you need to specify dimension to indicate the size of your vectors. This should match the dimension of the embeddings you will be inserting. For example, if you are using OpenAI's text-embedding-ada-002 model, you should use dimension=1536. Dimension is a required field when creating an index with vector_type="dense" and should not be passed when vector_type="sparse".

  • timeout (int, optional) – Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted.

  • vector_type (str, optional) – The type of vectors to be stored in the index. One of {"dense", "sparse"}.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

Returns:

An IndexModel instance containing a description of the index that was created.

Examples:

Creating a serverless index
import os
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    DeletionProtection,
    VectorType
)

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

pc.create_index(
    name="my-index",
    dimension=512,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2,
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {"shards": 2, "replicas": 2},
            },
        },
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True},
        },
    ),
    deletion_protection=DeletionProtection.DISABLED,
    vector_type=VectorType.DENSE,
    tags={
        "app": "movie-recommendations",
        "env": "production"
    }
)
Creating a pod index
import os
from pinecone import (
    Pinecone,
    PodSpec,
    PodIndexEnvironment,
    PodType,
    Metric,
    DeletionProtection,
    VectorType
)

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

pc.create_index(
    name="my-index",
    dimension=1536,
    metric=Metric.COSINE,
    spec=PodSpec(
        environment=PodIndexEnvironment.US_EAST4_GCP,
        pod_type=PodType.P1_X1
    ),
    deletion_protection=DeletionProtection.DISABLED,
    tags={
        "model": "clip",
        "app": "image-search",
        "env": "testing"
    }
)
PineconeGRPC.create_index_for_model(name: str, cloud: 'CloudProvider' | str, region: 'AwsRegion' | 'GcpRegion' | 'AzureRegion' | str, embed: 'IndexEmbed' | 'CreateIndexForModelEmbedTypedDict', tags: dict[str, str] | None = None, deletion_protection: 'DeletionProtection' | str | None = 'disabled', read_capacity: 'ReadCapacityDict' | 'ReadCapacity' | 'ReadCapacityOnDemandSpec' | 'ReadCapacityDedicatedSpec' | None = None, schema: dict[str, 'MetadataSchemaFieldConfig'] | dict[str, dict[str, Any]] | 'BackupModelSchema' | None = None, timeout: int | None = None) IndexModel

Create a Serverless index configured for use with Pinecone’s integrated inference models.

Parameters:
  • name (str) – The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.

  • cloud (str) – The cloud provider to use for the index. One of {"aws", "gcp", "azure"}.

  • region (str) – The region to use for the index. Enum objects AwsRegion, GcpRegion, and AzureRegion are also available to help you quickly set these parameters, but may not be up to date as new regions become available.

  • embed (Union[Dict, IndexEmbed]) – The embedding configuration for the index. This param accepts a dictionary or an instance of the IndexEmbed object.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted. This setting can be changed with configure_index.

  • read_capacity (Optional[Union[ReadCapacityDict, ReadCapacity, ReadCapacityOnDemandSpec, ReadCapacityDedicatedSpec]]) – Optional read capacity configuration. You can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated). See ServerlessSpec documentation for details on read capacity configuration.

  • schema (Optional[Union[dict[str, MetadataSchemaFieldConfig], dict[str, dict[str, Any]], BackupModelSchema]]) – Optional metadata schema configuration. You can specify schema to configure which metadata fields are filterable. The schema can be provided as a dictionary mapping field names to their configurations (e.g., {"genre": {"filterable": True}}) or as a dictionary with a fields key (e.g., {"fields": {"genre": {"filterable": True}}}).

  • timeout (Optional[int]) – Specify the number of seconds to wait until index is ready to receive data. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Returns:

A description of the index that was created.

Return type:

IndexModel

The resulting index can be described, listed, configured, and deleted like any other Pinecone index with the describe_index, list_indexes, configure_index, and delete_index methods.

After the index is created, you can upsert records into it with the upsert_records method, and search your records with the search method.

from pinecone import (
    Pinecone,
    IndexEmbed,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    Metric,
)

pc = Pinecone()

if not pc.has_index("book-search"):
    desc = pc.create_index_for_model(
        name="book-search",
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        embed=IndexEmbed(
            model=EmbedModel.Multilingual_E5_Large,
            metric=Metric.COSINE,
            field_map={
                "text": "description",
            },
        )
    )
Creating an index for model with schema and dedicated read capacity
from pinecone import (
    Pinecone,
    IndexEmbed,
    CloudProvider,
    AwsRegion,
    EmbedModel,
    Metric,
)

pc = Pinecone()

if not pc.has_index("book-search"):
    desc = pc.create_index_for_model(
        name="book-search",
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        embed=IndexEmbed(
            model=EmbedModel.Multilingual_E5_Large,
            metric=Metric.COSINE,
            field_map={
                "text": "description",
            },
        ),
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {"shards": 2, "replicas": 2},
            },
        },
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True},
        },
    )

See also

Official docs on available cloud regions

Model Gallery to learn about available models
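As a sketch of the record-based workflow for an integrated index, assuming an index created with field_map={"text": "description"} as in the examples above (the record contents and namespace are hypothetical):

```python
# Records for an index created with field_map={"text": "description"}.
# Each record needs an "_id" plus the mapped text field; other keys become metadata.
records = [
    {"_id": "book-1", "description": "A mystery novel set in Victorian London.", "genre": "mystery"},
    {"_id": "book-2", "description": "A field guide to alpine wildflowers.", "genre": "nature"},
]

# With a live index (requires PINECONE_API_KEY), upsert and search might look like:
# from pinecone import Pinecone
# pc = Pinecone()
# index = pc.Index("book-search")
# index.upsert_records(namespace="books", records=records)
# results = index.search(
#     namespace="books",
#     query={"inputs": {"text": "detective story"}, "top_k": 2},
# )
```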

PineconeGRPC.create_index_from_backup(*, name: str, backup_id: str, deletion_protection: 'DeletionProtection' | str | None = 'disabled', tags: dict[str, str] | None = None, timeout: int | None = None) IndexModel

Create an index from a backup.

Call list_backups to get a list of backups for your project.

Parameters:
  • name (str) – The name of the index to create.

  • backup_id (str) – The ID of the backup to restore.

  • deletion_protection (Optional[Literal["enabled", "disabled"]]) – If enabled, the index cannot be deleted. If disabled, the index can be deleted. This setting can be changed with configure_index.

  • tags (Optional[dict[str, str]]) – Tags are key-value pairs you can attach to indexes to better understand, organize, and identify your resources. Some example use cases include tagging indexes with the name of the model that generated the embeddings, the date the index was created, or the purpose of the index.

  • timeout – Specify the number of seconds to wait until index is ready to receive data. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Returns:

A description of the index that was created.

Return type:

IndexModel

from pinecone import Pinecone

pc = Pinecone()

# List available backups
backups = pc.list_backups()
if backups:
    backup_id = backups[0].id

    # Create index from backup
    index = pc.create_index_from_backup(
        name="restored-index",
        backup_id=backup_id,
        deletion_protection="disabled"
    )
PineconeGRPC.list_indexes() IndexList

Lists all indexes in your project.

Returns:

Returns an IndexList object, which is iterable and contains a list of IndexModel objects. The IndexList also has a convenience method names() which returns a list of index names for situations where you just want to iterate over all index names.

The results include a description of all indexes in your project, including the index name, dimension, metric, status, and spec.

If you simply want to check whether an index exists, see the has_index() convenience method.

You can use the list_indexes() method to iterate over descriptions of every index in your project.

from pinecone import Pinecone

pc = Pinecone()

for index in pc.list_indexes():
    print(index.name)
    print(index.dimension)
    print(index.metric)
    print(index.status)
    print(index.host)
    print(index.spec)
PineconeGRPC.describe_index(name: str) IndexModel

Describes a Pinecone index.

Parameters:

name – the name of the index to describe.

Returns:

Returns an IndexModel object which gives access to properties such as the index name, dimension, metric, host url, status, and spec.

Getting your index host url

In a real production situation, you probably want to store the host url in an environment variable so you don’t have to call describe_index and re-fetch it every time you want to use the index. But this example shows how to get the value from the API using describe_index.

from pinecone import Pinecone, Index

pc = Pinecone()

index_name="my-index"
description = pc.describe_index(name=index_name)
print(description)
# {
#     "name": "my-index",
#     "metric": "cosine",
#     "host": "my-index-dojoi3u.svc.aped-4627-b74a.pinecone.io",
#     "spec": {
#         "serverless": {
#             "cloud": "aws",
#             "region": "us-east-1"
#         }
#     },
#     "status": {
#         "ready": true,
#         "state": "Ready"
#     },
#     "vector_type": "dense",
#     "dimension": 1024,
#     "deletion_protection": "enabled",
#     "tags": {
#         "environment": "production"
#     }
# }

print(f"Your index is hosted at {description.host}")

index = pc.Index(host=description.host)
index.upsert(vectors=[...])
PineconeGRPC.configure_index(name: str, replicas: int | None = None, pod_type: 'PodType' | str | None = None, deletion_protection: 'DeletionProtection' | str | None = None, tags: dict[str, str] | None = None, embed: 'ConfigureIndexEmbed' | Dict | None = None, read_capacity: 'ReadCapacityDict' | 'ReadCapacity' | 'ReadCapacityOnDemandSpec' | 'ReadCapacityDedicatedSpec' | None = None) None

Modify an index’s configuration.

Parameters:
  • name (str, required) – the name of the Index

  • replicas (int, optional) – the desired number of replicas, lowest value is 0.

  • pod_type (str or PodType, optional) – the new pod_type for the index. To learn more about the available pod types, please see Understanding Indexes. Note that pod type is only available for pod-based indexes.

  • deletion_protection (str or DeletionProtection, optional) – If set to 'enabled', the index cannot be deleted. If 'disabled', the index can be deleted.

  • tags (dict[str, str], optional) – A dictionary of tags to apply to the index. Tags are key-value pairs that can be used to organize and manage indexes. Tags passed to configure_index are merged with existing tags, and any tag whose value is an empty string ("") is removed.

  • embed (Optional[Union[ConfigureIndexEmbed, Dict]], optional) – configures the integrated inference embedding settings for the index. You can convert an existing index to an integrated index by specifying the embedding model and field_map. The index vector type and dimension must match the model vector type and dimension, and the index similarity metric must be supported by the model. You can later change the embedding configuration to update the field_map, read_parameters, or write_parameters. Once set, the model cannot be changed.

  • read_capacity (Optional[Union[ReadCapacityDict, ReadCapacity, ReadCapacityOnDemandSpec, ReadCapacityDedicatedSpec]]) – Optional read capacity configuration for serverless indexes. You can specify read_capacity to configure dedicated read capacity mode (OnDemand or Dedicated). See ServerlessSpec documentation for details on read capacity configuration. Note that read capacity configuration is only available for serverless indexes.

This method is used to modify an index’s configuration. It can be used to:

  • Configure read capacity for serverless indexes using read_capacity

  • Scale a pod-based index horizontally using replicas

  • Scale a pod-based index vertically using pod_type

  • Enable or disable deletion protection using deletion_protection

  • Add, change, or remove tags using tags

Configuring read capacity for serverless indexes

To configure read capacity for serverless indexes, pass the read_capacity parameter to the configure_index method. You can configure either OnDemand or Dedicated read capacity mode.

from pinecone import Pinecone

pc = Pinecone()

# Configure to OnDemand read capacity (default)
pc.configure_index(
    name="my-index",
    read_capacity={"mode": "OnDemand"}
)

# Configure to Dedicated read capacity with manual scaling
pc.configure_index(
    name="my-index",
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {"shards": 1, "replicas": 1}
        }
    }
)

# Verify the configuration was applied
desc = pc.describe_index("my-index")
assert desc.spec.serverless.read_capacity.mode == "Dedicated"

Scaling pod-based indexes

To scale your pod-based index, you pass a replicas and/or pod_type param to the configure_index method. pod_type may be a string or a value from the PodType enum.

from pinecone import Pinecone, PodType

pc = Pinecone()
pc.configure_index(
    name="my-index",
    replicas=2,
    pod_type=PodType.P1_X2
)

After providing these new configurations, you can call describe_index to monitor the status of the index as the changes are applied.

Enabling or disabling deletion protection

To enable or disable deletion protection, pass the deletion_protection parameter to the configure_index method. When deletion protection is enabled, the index cannot be deleted with the delete_index method.

from pinecone import Pinecone, DeletionProtection

pc = Pinecone()

# Enable deletion protection
pc.configure_index(
    name="my-index",
    deletion_protection=DeletionProtection.ENABLED
)

# Call describe_index to see the change was applied.
assert pc.describe_index("my-index").deletion_protection == "enabled"

# Disable deletion protection
pc.configure_index(
    name="my-index",
    deletion_protection=DeletionProtection.DISABLED
)

Adding, changing, or removing tags

To add, change, or remove tags, pass the tags parameter to the configure_index method. When tags are passed using configure_index, they are merged with any existing tags already on the index. To remove a tag, set the value of the key to an empty string.

from pinecone import Pinecone

pc = Pinecone()

# Add a tag
pc.configure_index(name="my-index", tags={"environment": "staging"})

# Change a tag
pc.configure_index(name="my-index", tags={"environment": "production"})

# Remove a tag
pc.configure_index(name="my-index", tags={"environment": ""})

# Call describe_index to verify the tags were changed
print(pc.describe_index("my-index").tags)
PineconeGRPC.delete_index(name: str, timeout: int | None = None) None

Deletes a Pinecone index.

Parameters:
  • name (str) – the name of the index.

  • timeout (int, optional) – Number of seconds to poll status checking whether the index has been deleted. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait.

Deleting an index is an irreversible operation. All data in the index will be lost. When you use this command, a request is sent to the Pinecone control plane to delete the index, but the termination is not synchronous because resources take a few moments to be released.

By default the delete_index method will block until polling of the describe_index method shows that the delete operation has completed. If you prefer to return immediately and not wait for the index to be deleted, you can pass timeout=-1 to the method.

After the delete request is submitted, polling describe_index will show that the index transitions into a Terminating state before eventually resulting in a 404 after it has been removed.

This operation can fail if the index is configured with deletion_protection="enabled". In this case, you will need to call configure_index to disable deletion protection before you can delete the index.

from pinecone import Pinecone

pc = Pinecone()

index_name = "my-index"
desc = pc.describe_index(name=index_name)

if desc.deletion_protection == "enabled":
    # If for some reason deletion protection is enabled, you will need to disable it first
    # before you can delete the index. But use caution as this operation is not reversible
    # and if somebody enabled deletion protection, they probably had a good reason.
    pc.configure_index(name=index_name, deletion_protection="disabled")

pc.delete_index(name=index_name)
PineconeGRPC.has_index(name: str) bool

Checks if a Pinecone index exists.

Parameters:

name – The name of the index to check for existence.

Returns:

Returns True if the index exists, False otherwise.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone()

index_name = "my-index"
if not pc.has_index(index_name):
    print("Index does not exist, creating...")
    pc.create_index(
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2")
    )

Backups

PineconeGRPC.create_backup(*, index_name: str, backup_name: str, description: str = '') BackupModel

Create a backup of an index.

Parameters:
  • index_name (str) – The name of the index to backup.

  • backup_name (str) – The name to give the backup.

  • description (str, optional) – Optional description of the backup.

from pinecone import Pinecone

pc = Pinecone()

# Create a backup of an index
backup = pc.create_backup(
    index_name="my-index",
    backup_name="my_backup",
    description="Daily backup"
)

print(f"Backup created with ID: {backup.id}")
PineconeGRPC.list_backups(*, index_name: str | None = None, limit: int | None = 10, pagination_token: str | None = None) BackupList

List backups.

If index_name is provided, the backups will be filtered by index. If no index_name is provided, all backups in the project will be returned.

Parameters:
  • index_name (str, optional) – The name of the index to list backups for.

  • limit (int, optional) – The maximum number of backups to return.

  • pagination_token (str, optional) – The pagination token to use for pagination.

from pinecone import Pinecone

pc = Pinecone()

# List all backups
all_backups = pc.list_backups(limit=20)

# List backups for a specific index
index_backups = pc.list_backups(index_name="my-index", limit=10)

for backup in index_backups:
    print(f"Backup: {backup.name}, Status: {backup.status}")
PineconeGRPC.describe_backup(*, backup_id: str) BackupModel

Describe a backup.

Parameters:

backup_id (str) – The ID of the backup to describe.

from pinecone import Pinecone

pc = Pinecone()

backup = pc.describe_backup(backup_id="backup-123")
print(f"Backup: {backup.name}")
print(f"Status: {backup.status}")
print(f"Index: {backup.index_name}")
PineconeGRPC.delete_backup(*, backup_id: str) None

Delete a backup.

Parameters:

backup_id (str) – The ID of the backup to delete.

from pinecone import Pinecone

pc = Pinecone()

pc.delete_backup(backup_id="backup-123")

Collections

PineconeGRPC.create_collection(name: str, source: str) None

Create a collection from a pod-based index.

Parameters:
  • name (str, required) – Name of the collection

  • source (str, required) – Name of the source index

from pinecone import Pinecone

pc = Pinecone()

# Create a collection from an existing pod-based index
pc.create_collection(name="my-collection", source="my-index")
PineconeGRPC.list_collections() CollectionList

List all collections.

from pinecone import Pinecone

pc = Pinecone()

for collection in pc.list_collections():
    print(collection.name)
    print(collection.source)

# You can also iterate specifically over the collection
# names with the .names() helper.
for collection_name in pc.list_collections().names():
    print(collection_name)
PineconeGRPC.describe_collection(name: str) dict[str, Any]

Describes a collection.

Parameters:

name (str) – The name of the collection

Returns:

Description of the collection

from pinecone import Pinecone

pc = Pinecone()

description = pc.describe_collection("my-collection")
print(description.name)
print(description.source)
print(description.status)
print(description.size)
PineconeGRPC.delete_collection(name: str) None

Deletes a collection.

Parameters:

name (str) – The name of the collection to delete.

Deleting a collection is an irreversible operation. All data in the collection will be lost.

This method tells Pinecone you would like to delete a collection, but it takes a few moments to complete the operation. Use the describe_collection() method to confirm that the collection has been deleted.

from pinecone import Pinecone

pc = Pinecone()

pc.delete_collection(name="my-collection")

Restore Jobs

PineconeGRPC.list_restore_jobs(*, limit: int | None = 10, pagination_token: str | None = None) RestoreJobList

List restore jobs.

Parameters:
  • limit (int) – The maximum number of restore jobs to return.

  • pagination_token (str) – The pagination token to use for pagination.

from pinecone import Pinecone

pc = Pinecone()

restore_jobs = pc.list_restore_jobs(limit=20)

for job in restore_jobs:
    print(f"Job ID: {job.id}, Status: {job.status}")
PineconeGRPC.describe_restore_job(*, job_id: str) RestoreJobModel

Describe a restore job.

Parameters:

job_id (str) – The ID of the restore job to describe.

from pinecone import Pinecone

pc = Pinecone()

job = pc.describe_restore_job(job_id="job-123")
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
print(f"Source backup: {job.backup_id}")

DB Data Plane

class pinecone.grpc.GRPCIndex(index_name: str, config: Config, channel: Channel | None = None, grpc_config: GRPCClientConfig | None = None, pool_threads: int | None = None, _endpoint_override: str | None = None)[source]

A client for interacting with a Pinecone index via GRPC API.

GRPCIndex.__init__(index_name: str, config: Config, channel: Channel | None = None, grpc_config: GRPCClientConfig | None = None, pool_threads: int | None = None, _endpoint_override: str | None = None)
GRPCIndex.describe_index_stats(filter: dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool] | dict[Literal['$and'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | dict[Literal['$or'], list[dict[str, str | int | float | bool] | dict[Literal['$eq'], str | int | float | bool] | dict[Literal['$ne'], str | int | float | bool] | dict[Literal['$gt'], int | float] | dict[Literal['$gte'], int | float] | dict[Literal['$lt'], int | float] | dict[Literal['$lte'], int | float] | dict[Literal['$in'], list[str | int | float | bool]] | dict[Literal['$nin'], list[str | int | float | bool]] | dict[Literal['$exists'], bool]]] | None = None, **kwargs) IndexDescription[source]

The DescribeIndexStats operation returns statistics about the index’s contents, such as the vector count per namespace and the number of dimensions.

Examples:

>>> index.describe_index_stats()
>>> index.describe_index_stats(filter={'key': 'value'})
Parameters:
  • filter (dict[str, Union[str, float, int, bool, List, dict]], optional) – If this parameter is present, the operation only returns statistics for vectors that satisfy the filter. See metadata filtering: https://www.pinecone.io/docs/metadata-filtering/

Returns: DescribeIndexStatsResponse object which contains stats about the index.
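Since filters are plain dictionaries, a compound filter for this call can be built directly; the metadata field names here (genre, year) are illustrative:

```python
# Only count vectors whose metadata matches both conditions ($and of $eq and $gte).
stats_filter = {
    "$and": [
        {"genre": {"$eq": "documentary"}},
        {"year": {"$gte": 2019}},
    ]
}

# With a live index:
# stats = index.describe_index_stats(filter=stats_filter)
# print(stats.total_vector_count, stats.namespaces)
```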

Vectors

GRPCIndex.upsert(vectors: list[Vector] | list[tuple[str, list[float]]] | list[VectorTypedDict], async_req: bool = False, namespace: str | None = None, batch_size: int | None = None, show_progress: bool = True, **kwargs) UpsertResponse | PineconeGrpcFuture[source]

The upsert operation writes vectors into a namespace. If a new value is upserted for an existing vector id, it will overwrite the previous value.

Examples:

>>> index.upsert([('id1', [1.0, 2.0, 3.0], {'key': 'value'}),
                  ('id2', [1.0, 2.0, 3.0])
                  ],
                  namespace='ns1', async_req=True)
>>> index.upsert([{'id': 'id1', 'values': [1.0, 2.0, 3.0], 'metadata': {'key': 'value'}},
                  {'id': 'id2',
                   'values': [1.0, 2.0, 3.0],
                   'sparse_values': {'indices': [1, 8], 'values': [0.2, 0.4]}},
                  ])
>>> index.upsert([GRPCVector(id='id1', values=[1.0, 2.0, 3.0], metadata={'key': 'value'}),
                  GRPCVector(id='id2', values=[1.0, 2.0, 3.0]),
                  GRPCVector(id='id3',
                             values=[1.0, 2.0, 3.0],
                             sparse_values=GRPCSparseValues(indices=[1, 2], values=[0.2, 0.4]))])
Parameters:
  • vectors (Union[list[GRPCVector], list[tuple], list[dict]]) –

    A list of vectors to upsert. Each vector can be represented by 1) a GRPCVector object, 2) a tuple, or 3) a dictionary.

    1. If a GRPCVector object is used, it must be of the form GRPCVector(id, values, metadata), where metadata is an optional argument of type dict[str, Union[str, float, int, bool, list[int], list[float], list[str]]].

      Examples: GRPCVector(id='id1', values=[1.0, 2.0, 3.0], metadata={'key': 'value'}), GRPCVector(id='id2', values=[1.0, 2.0, 3.0]), GRPCVector(id='id3', values=[1.0, 2.0, 3.0], sparse_values=GRPCSparseValues(indices=[1, 2], values=[0.2, 0.4]))

    2. If a tuple is used, it must be of the form (id, values, metadata) or (id, values), where id is a string, values is a list of floats, and metadata is a dict.

      Examples: ('id1', [1.0, 2.0, 3.0], {'key': 'value'}), ('id2', [1.0, 2.0, 3.0])

    3. If a dictionary is used, it must be of the form {'id': str, 'values': list[float], 'sparse_values': {'indices': list[int], 'values': list[float]}, 'metadata': dict}

    Note: the dimension of each vector must match the dimension of the index.

  • async_req (bool) – If True, the upsert operation will be performed asynchronously. Cannot be used with batch_size. Defaults to False. See: https://docs.pinecone.io/docs/performance-tuning [optional]

  • namespace (str) – The namespace to write to. If not specified, the default namespace is used. [optional]

  • batch_size (int) –

    The number of vectors to upsert in each batch.

    Cannot be used with async_req=True.

    If not specified, all vectors will be upserted in a single batch. [optional]

  • show_progress (bool) – Whether to show a progress bar using tqdm. Applied only if batch_size is provided. Default is True.

Returns: UpsertResponse, contains the number of vectors upserted
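The batch_size parameter splits a large upsert into several requests. A hedged sketch of the equivalent manual batching (chunked and upsert_in_batches are illustrative helpers, not part of the client):

```python
from itertools import islice

def chunked(vectors, batch_size):
    """Yield successive batches of at most batch_size items."""
    it = iter(vectors)
    while batch := list(islice(it, batch_size)):
        yield batch

# Roughly what passing batch_size to index.upsert does for you;
# spelling it out makes the batching behavior concrete.
def upsert_in_batches(index, vectors, namespace, batch_size=100):
    total = 0
    for batch in chunked(vectors, batch_size):
        response = index.upsert(vectors=batch, namespace=namespace)
        total += response.upserted_count
    return total
```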

GRPCIndex.query(vector: list[float] | None = None, id: str | None = None, namespace: str | None = None, top_k: int | None = None, filter: FilterTypedDict | None = None, include_values: bool | None = None, include_metadata: bool | None = None, sparse_vector: SparseValues | GRPCSparseValues | SparseVectorTypedDict | None = None, async_req: bool | None = False, **kwargs) 'QueryResponse' | PineconeGrpcFuture[source]

The Query operation searches a namespace, using a query vector. It retrieves the ids of the most similar items in a namespace, along with their similarity scores.

Examples:

>>> index.query(vector=[1, 2, 3], top_k=10, namespace='my_namespace')
>>> index.query(id='id1', top_k=10, namespace='my_namespace')
>>> index.query(vector=[1, 2, 3], top_k=10, namespace='my_namespace', filter={'key': 'value'})
>>> index.query(id='id1', top_k=10, namespace='my_namespace', include_metadata=True, include_values=True)
>>> index.query(vector=[1, 2, 3], sparse_vector={'indices': [1, 2], 'values': [0.2, 0.4]},
>>>             top_k=10, namespace='my_namespace')
>>> index.query(vector=[1, 2, 3], sparse_vector=GRPCSparseValues([1, 2], [0.2, 0.4]),
>>>             top_k=10, namespace='my_namespace')
Parameters:
  • vector (list[float]) – The query vector. This should be the same length as the dimension of the index being queried. Each query() request can contain only one of the parameters id or vector. [optional]

  • id (str) – The unique ID of the vector to be used as a query vector. Each query() request can contain only one of the parameters vector or id. [optional]

  • top_k (int) – The number of results to return for each query. Must be an integer greater than 1.

  • namespace (str) – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

  • filter (dict[str, Union[str, float, int, bool, List, dict]]) – The filter to apply. You can use vector metadata to limit your search. See metadata filtering <https://www.pinecone.io/docs/metadata-filtering/>_ [optional]

  • include_values (bool) – Indicates whether vector values are included in the response. If omitted the server will use the default value of False [optional]

  • include_metadata (bool) – Indicates whether metadata is included in the response as well as the ids. If omitted the server will use the default value of False [optional]

  • sparse_vector (Union[SparseValues, dict[str, Union[list[float], list[int]]]]) – Sparse values of the query vector. Expected to be either a SparseValues object or a dict of the form {'indices': list[int], 'values': list[float]}, where the lists each have the same length. [optional]

Returns: QueryResponse object which contains the list of the closest vectors as ScoredVector objects, and namespace name.
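A typical pattern is to read ids and scores out of the response's matches. A hedged sketch using a stand-in for ScoredVector (ScoredMatch and top_matches are illustrative, not part of the client):

```python
from dataclasses import dataclass

@dataclass
class ScoredMatch:
    """Stand-in for ScoredVector, for illustration only."""
    id: str
    score: float

def top_matches(matches, min_score=0.0):
    """Return (id, score) pairs at or above min_score, best first."""
    return sorted(
        ((m.id, m.score) for m in matches if m.score >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )

# With the real client, matches come from response.matches:
# response = index.query(vector=[...], top_k=10, include_metadata=True)
# results = top_matches(response.matches, min_score=0.8)
```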

GRPCIndex.query_namespaces(vector: list[float], namespaces: list[str], metric: Literal['cosine', 'euclidean', 'dotproduct'], top_k: int | None = None, filter: FilterTypedDict | None = None, include_values: bool | None = None, include_metadata: bool | None = None, sparse_vector: SparseValues | SparseVectorTypedDict | None = None, **kwargs) QueryNamespacesResults[source]
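query_namespaces runs the same query across several namespaces and merges the per-namespace hits into one ranked result set. The merging idea can be sketched in plain Python (a hedged illustration; merge_top_k and its input shape are not part of the client API, and it assumes higher scores are better, as with cosine or dotproduct):

```python
import heapq

def merge_top_k(results_by_namespace, top_k):
    """Merge per-namespace (id, score) lists into one global top_k.

    results_by_namespace maps a namespace name to a list of
    (vector_id, score) pairs; higher scores are assumed better.
    """
    merged = heapq.nlargest(
        top_k,
        (
            (score, ns, vec_id)
            for ns, results in results_by_namespace.items()
            for vec_id, score in results
        ),
    )
    # Reorder each entry as (vector_id, namespace, score).
    return [(vec_id, ns, score) for score, ns, vec_id in merged]
```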
GRPCIndex.delete(ids: list[str] | None = None, delete_all: bool | None = None, namespace: str | None = None, filter: FilterTypedDict | None = None, async_req: bool = False, **kwargs) dict[str, Any] | PineconeGrpcFuture[source]

The Delete operation deletes vectors from the index, from a single namespace. No error is raised if a vector id does not exist.

Parameters:
  • ids (list[str]) – Vector ids to delete [optional]

  • delete_all (bool) – This indicates that all vectors in the index namespace should be deleted. Default is False. [optional]

  • namespace (str) – The namespace to delete vectors from. If not specified, the default namespace is used. [optional]

  • filter (FilterTypedDict) –

    If specified, the metadata filter here will be used to select the vectors to delete. This is mutually exclusive with specifying ids to delete in the ids param or using delete_all=True.

    See metadata filtering <https://www.pinecone.io/docs/metadata-filtering/>_ [optional]

  • async_req (bool) – If True, the delete operation will be performed asynchronously. Defaults to False. [optional]

Returns: DeleteResponse (contains no data) or a PineconeGrpcFuture object if async_req is True.

Note

For any delete call, if namespace is not specified, the default namespace is used.

Delete can occur in the following mutually exclusive ways:

  1. Delete by ids from a single namespace

  2. Delete all vectors from a single namespace by setting delete_all to True

  3. Delete all vectors from a single namespace by specifying a metadata filter (note that for this option delete_all must be set to False)

Examples:

>>> index.delete(ids=['id1', 'id2'], namespace='my_namespace')
>>> index.delete(delete_all=True, namespace='my_namespace')
>>> index.delete(filter={'key': 'value'}, namespace='my_namespace', async_req=True)
GRPCIndex.fetch(ids: list[str] | None, namespace: str | None = None, async_req: bool | None = False, **kwargs) FetchResponse | PineconeGrpcFuture[source]

The fetch operation looks up and returns vectors, by ID, from a single namespace. The returned vectors include the vector data and/or metadata.

Examples:

>>> index.fetch(ids=['id1', 'id2'], namespace='my_namespace')
>>> index.fetch(ids=['id1', 'id2'])
Parameters:
  • ids (list[str]) – The vector IDs to fetch.

  • namespace (str) – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

Returns: FetchResponse object which contains the list of Vector objects, and namespace name.

GRPCIndex.list(**kwargs) Iterator[list[str]][source]

The list operation accepts all of the same arguments as list_paginated, and returns a generator that yields a list of the matching vector ids in each page of results. It automatically handles pagination tokens on your behalf.

Examples:

>>> for ids in index.list(prefix='99', limit=5, namespace='my_namespace'):
>>>     print(ids)
['99', '990', '991', '992', '993']
['994', '995', '996', '997', '998']
['999']
Parameters:
  • prefix (Optional[str]) – The id prefix to match. If unspecified, an empty string prefix will be used, with the effect of listing all ids in a namespace. [optional]

  • limit (Optional[int]) – The maximum number of ids to return. If unspecified, the server will use a default value. [optional]

  • pagination_token (Optional[str]) – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

  • namespace (Optional[str]) – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]
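The generator returned by list pairs naturally with delete for prefix-based cleanup. A hedged sketch (delete_by_prefix is an illustrative helper, not part of the client API):

```python
# Delete all vectors whose ids share a prefix by paging through
# index.list and deleting each page of ids.
def delete_by_prefix(index, prefix, namespace):
    deleted = 0
    for ids in index.list(prefix=prefix, namespace=namespace):
        index.delete(ids=ids, namespace=namespace)
        deleted += len(ids)
    return deleted
```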

GRPCIndex.list_paginated(prefix: str | None = None, limit: int | None = None, pagination_token: str | None = None, namespace: str | None = None, **kwargs) ListResponse[source]

The list_paginated operation finds vectors based on an id prefix within a single namespace. It returns matching ids in a paginated form, with a pagination token to fetch the next page of results. This id list can then be passed to fetch or delete operations, depending on your use case.

Consider using the list method to avoid having to handle pagination tokens manually.

Examples:

>>> results = index.list_paginated(prefix='99', limit=5, namespace='my_namespace')
>>> [v.id for v in results.vectors]
['99', '990', '991', '992', '993']
>>> results.pagination.next
eyJza2lwX3Bhc3QiOiI5OTMiLCJwcmVmaXgiOiI5OSJ9
>>> next_results = index.list_paginated(prefix='99', limit=5, namespace='my_namespace', pagination_token=results.pagination.next)
Parameters:
  • prefix (Optional[str]) – The id prefix to match. If unspecified, an empty string prefix will be used, with the effect of listing all ids in a namespace. [optional]

  • limit (Optional[int]) – The maximum number of ids to return. If unspecified, the server will use a default value. [optional]

  • pagination_token (Optional[str]) – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

  • namespace (Optional[str]) – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

Returns: ListResponse object which contains the list of ids, the namespace name, pagination information, and usage showing the number of read_units consumed.
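If you prefer to drive pagination yourself, the token loop can be sketched as follows (a hedged illustration; all_ids is not part of the client, and pages are modeled as plain dicts rather than ListResponse objects):

```python
# Keep calling with the previous page's token until no "next" token
# is returned. list_page stands in for index.list_paginated.
def all_ids(list_page, namespace, limit=100):
    ids, token = [], None
    while True:
        page = list_page(namespace=namespace, limit=limit,
                         pagination_token=token)
        ids.extend(v["id"] for v in page["vectors"])
        token = (page.get("pagination") or {}).get("next")
        if token is None:
            return ids
```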

GRPCIndex.fetch_by_metadata(filter: FilterTypedDict, namespace: str | None = None, limit: int | None = None, pagination_token: str | None = None, async_req: bool | None = False, **kwargs) FetchByMetadataResponse | PineconeGrpcFuture[source]

Fetch vectors by metadata filter.

Look up and return vectors by metadata filter from a single namespace. The returned vectors include the vector data and/or metadata.

Examples:

>>> index.fetch_by_metadata(
...     filter={'genre': {'$in': ['comedy', 'drama']}, 'year': {'$eq': 2019}},
...     namespace='my_namespace',
...     limit=50
... )
>>> index.fetch_by_metadata(
...     filter={'status': 'active'},
...     pagination_token='token123'
... )
Parameters:
  • filter (dict[str, Union[str, float, int, bool, List, dict]]) – Metadata filter expression to select vectors. See metadata filtering <https://www.pinecone.io/docs/metadata-filtering/>_

  • namespace (str) – The namespace to fetch vectors from. If not specified, the default namespace is used. [optional]

  • limit (int) – Max number of vectors to return. Defaults to 100. [optional]

  • pagination_token (str) – Pagination token to continue a previous listing operation. [optional]

  • async_req (bool) – If True, the fetch operation will be performed asynchronously. Defaults to False. [optional]

Returns:

Object containing the fetched vectors, namespace, usage, and pagination token.

Return type:

FetchByMetadataResponse

GRPCIndex.update(id: str | None = None, async_req: bool = False, values: list[float] | None = None, set_metadata: dict[str, str | int | float | list[str] | list[int] | list[float]] | None = None, namespace: str | None = None, sparse_values: SparseValues | SparseVectorTypedDict | None = None, filter: FilterTypedDict | None = None, dry_run: bool | None = None, **kwargs) UpdateResponse | PineconeGrpcFuture[source]

The Update operation updates vectors in a namespace.

This method supports two update modes:

  1. Single vector update by ID: Provide id to update a specific vector.

    • Updates the vector with the given ID.

    • If values is included, it will overwrite the previous vector values.

    • If set_metadata is included, the metadata will be merged with existing metadata on the vector. Fields specified in set_metadata will overwrite existing fields with the same key, while fields not in set_metadata will remain unchanged.

  2. Bulk update by metadata filter: Provide filter to update all vectors matching the filter criteria.

    • Updates all vectors in the namespace that match the filter expression.

    • Useful for updating metadata across multiple vectors at once.

    • If set_metadata is included, the metadata will be merged with existing metadata on each vector. Fields specified in set_metadata will overwrite existing fields with the same key, while fields not in set_metadata will remain unchanged.

    • The response includes matched_records indicating how many vectors were updated.

Either id or filter must be provided (but not both in the same call).

Examples:

Single vector update by ID:

>>> # Update vector values
>>> index.update(id='id1', values=[1, 2, 3], namespace='my_namespace')
>>> # Update vector metadata
>>> index.update(id='id1', set_metadata={'key': 'value'}, namespace='my_namespace', async_req=True)
>>> # Update vector values and sparse values
>>> index.update(id='id1', values=[1, 2, 3], sparse_values={'indices': [1, 2], 'values': [0.2, 0.4]},
>>>              namespace='my_namespace')
>>> index.update(id='id1', values=[1, 2, 3], sparse_values=GRPCSparseValues(indices=[1, 2], values=[0.2, 0.4]),
>>>              namespace='my_namespace')

Bulk update by metadata filter:

>>> # Update metadata for all vectors matching the filter
>>> response = index.update(set_metadata={'status': 'active'}, filter={'genre': {'$eq': 'drama'}},
>>>                        namespace='my_namespace')
>>> print(f"Updated {response.matched_records} vectors")
>>> # Preview how many vectors would be updated (dry run)
>>> response = index.update(set_metadata={'status': 'active'}, filter={'genre': {'$eq': 'drama'}},
>>>                        namespace='my_namespace', dry_run=True)
>>> print(f"Would update {response.matched_records} vectors")
Parameters:
  • id (str) – Vector’s unique id. Required for single vector updates. Must not be provided when using filter. [optional]

  • async_req (bool) – If True, the update operation will be performed asynchronously. Defaults to False. [optional]

  • values (list[float]) – Vector values to set. [optional]

  • set_metadata (dict[str, Union[str, float, int, bool, list[int], list[float], list[str]]]) – Metadata to merge with existing metadata on the vector(s). Fields specified will overwrite existing fields with the same key, while fields not specified will remain unchanged. [optional]

  • namespace (str) – Namespace name where to update the vector(s). [optional]

  • sparse_values (Union[GRPCSparseValues, dict[str, Union[list[float], list[int]]]]) – Sparse values to update for the vector. Expected to be either a GRPCSparseValues object or a dict of the form {'indices': list[int], 'values': list[float]}, where the lists each have the same length. [optional]

  • filter (dict[str, Union[str, float, int, bool, List, dict]]) – A metadata filter expression. When provided, updates all vectors in the namespace that match the filter criteria. See metadata filtering <https://www.pinecone.io/docs/metadata-filtering/>_. Must not be provided when using id. Either id or filter must be provided. [optional]

  • dry_run (bool) – If True, return the number of records that match the filter without executing the update. Only meaningful when using filter (not with id). Useful for previewing the impact of a bulk update before applying changes. Defaults to False. [optional]

Returns:

When using filter-based updates, the UpdateResponse includes matched_records indicating the number of vectors that were updated (or would be updated if dry_run=True). If async_req=True, returns a PineconeGrpcFuture object instead.

Return type:

UpdateResponse or PineconeGrpcFuture

GRPCIndex.upsert_from_dataframe(df: Any, namespace: str | None = None, batch_size: int = 500, use_async_requests: bool = True, show_progress: bool = True) UpsertResponse[source]

Upserts a dataframe into the index.

Parameters:
  • df – A pandas dataframe with the following columns: id, values, sparse_values, and metadata.

  • namespace – The namespace to upsert into.

  • batch_size – The number of rows to upsert in a single batch.

  • use_async_requests – Whether to issue multiple upsert requests concurrently using the asynchronous request mechanism. Set to False to send batches sequentially.

  • show_progress – Whether to show a progress bar.
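The expected dataframe shape can be illustrated with plain rows before handing them to pandas (a hedged sketch; the ids and values are made up, and pandas itself is assumed to be installed for the commented-out call):

```python
# Rows matching the schema upsert_from_dataframe expects: columns
# id, values, sparse_values, and metadata (the latter two may be None).
rows = [
    {"id": "id1", "values": [0.1, 0.2, 0.3], "sparse_values": None,
     "metadata": {"genre": "drama"}},
    {"id": "id2", "values": [0.4, 0.5, 0.6], "sparse_values": None,
     "metadata": None},
]

# With pandas installed and a live index:
# import pandas as pd
# df = pd.DataFrame(rows)
# index.upsert_from_dataframe(df, namespace="my_namespace", batch_size=500)
```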

Namespaces

GRPCIndex.create_namespace(name: str, schema: dict[str, Any] | None = None, async_req: bool = False, **kwargs) NamespaceDescription | PineconeGrpcFuture[source]

The create_namespace operation creates a namespace in a serverless index.

Examples:

>>> index.create_namespace(name='my_namespace')

>>> # Create namespace asynchronously
>>> future = index.create_namespace(name='my_namespace', async_req=True)
>>> namespace = future.result()
Parameters:
  • name (str) – The name of the namespace to create.

  • schema (Optional[dict[str, Any]]) – Optional schema configuration for the namespace as a dictionary. [optional]

  • async_req (bool) – If True, the create_namespace operation will be performed asynchronously. [optional]

Returns: NamespaceDescription object which contains information about the created namespace, or a PineconeGrpcFuture object if async_req is True.

GRPCIndex.describe_namespace(namespace: str, **kwargs) NamespaceDescription[source]

The describe_namespace operation returns information about a specific namespace, including the total number of vectors in the namespace.

Examples:

>>> index.describe_namespace(namespace='my_namespace')
Parameters:

namespace (str) – The namespace to describe.

Returns: NamespaceDescription object which contains information about the namespace.

GRPCIndex.delete_namespace(namespace: str, **kwargs) dict[str, Any][source]

The delete_namespace operation deletes a namespace from an index. This operation is irreversible and will permanently delete all data in the namespace.

Examples:

>>> index.delete_namespace(namespace='my_namespace')
Parameters:

namespace (str) – The namespace to delete.

Returns: Empty dictionary indicating successful deletion.

GRPCIndex.list_namespaces(limit: int | None = None, **kwargs)[source]

The list_namespaces operation accepts all of the same arguments as list_namespaces_paginated, and returns a generator that yields each namespace. It automatically handles pagination tokens on your behalf.

Parameters:

limit (Optional[int]) – The maximum number of namespaces to fetch in each network call. If unspecified, the server will use a default value. [optional]

Returns:

A generator that yields each namespace. Pagination tokens are handled automatically on your behalf so you can easily iterate over all results.

Examples:

>>> for namespace in index.list_namespaces():
>>>     print(namespace.name)
namespace1
namespace2
namespace3

You can convert the generator into a list by wrapping the generator in a call to the built-in list function:

namespaces = list(index.list_namespaces())

You should be cautious with this approach because it fetches all namespaces before returning, which could require many network calls and a lot of memory to hold the results.

GRPCIndex.list_namespaces_paginated(limit: int | None = None, pagination_token: str | None = None, **kwargs) ListNamespacesResponse[source]

The list_namespaces_paginated operation returns a list of all namespaces in a serverless index. It returns namespaces in a paginated form, with a pagination token to fetch the next page of results.

Examples:

>>> results = index.list_namespaces_paginated(limit=10)
>>> [ns.name for ns in results.namespaces]
['namespace1', 'namespace2', 'namespace3']
>>> results.pagination.next
eyJza2lwX3Bhc3QiOiI5OTMiLCJwcmVmaXgiOiI5OSJ9
>>> next_results = index.list_namespaces_paginated(limit=10, pagination_token=results.pagination.next)
Parameters:
  • limit (Optional[int]) – The maximum number of namespaces to return. If unspecified, the server will use a default value. [optional]

  • pagination_token (Optional[str]) – A token needed to fetch the next page of results. This token is returned in the response if additional results are available. [optional]

Returns: ListNamespacesResponse object which contains the list of namespaces and pagination information.