AsyncPinecone

AsyncPinecone is the asynchronous control-plane client — use it inside an async with block to manage indexes, collections, backups, and related resources. Sub-clients for each resource type are accessed as properties (e.g. pc.indexes, pc.collections) and are lazily initialized on first access.

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    desc = await pc.indexes.describe("my-index")
    index = await pc.index(host=desc.host)
    async with index:
        results = await index.query(
            vector=[0.012, -0.087, 0.153],
            top_k=10,
        )

Note

Unlike the sync Pinecone client, AsyncPinecone.index() is a coroutine and must be awaited. On a cache miss it resolves the index host by name with a non-blocking describe call, so no separate describe step is required:

idx = await pc.index(name="my-index")     # resolves host via describe on cache miss
# or resolve the host yourself and pass it directly:
desc = await pc.indexes.describe("my-index")
idx = await pc.index(host=desc.host)      # explicit host, skips the lookup
class pinecone.async_client.pinecone.AsyncPinecone(api_key=None, *, host=None, additional_headers=None, source_tag=None, proxy_url=None, proxy_headers=None, ssl_ca_certs=None, ssl_verify=True, timeout=30.0, connection_pool_maxsize=0, retry_config=None)[source]

Bases: object

Asynchronous Pinecone client for control-plane operations.

Parameters:
  • api_key (str | None) – Pinecone API key. Falls back to PINECONE_API_KEY env var.

  • host (str | None) – Control-plane API host. Falls back to PINECONE_CONTROLLER_HOST env var, then defaults to https://api.pinecone.io.

  • additional_headers (dict[str, str] | None) – Extra headers included in every request.

  • source_tag (str | None) – Tag appended to the User-Agent string for request attribution.

  • proxy_url (str | None) – HTTP proxy URL for outgoing requests.

  • proxy_headers (dict[str, str] | None) – Not yet supported. Raises NotImplementedError if provided.

  • ssl_ca_certs (str | None) – Path to a CA certificate bundle for SSL verification.

  • ssl_verify (bool) – Whether to verify SSL certificates. Defaults to True.

  • timeout (float) – Request timeout in seconds. Defaults to 30.0.

  • connection_pool_maxsize (int) – Maximum number of connections to keep in the pool. 0 (default) uses httpx defaults.

  • retry_config (RetryConfig | None) – Custom retry configuration. When None (default), uses built-in defaults (5 attempts, exponential backoff, retries on 500/502/503/504 for GET/HEAD).

Raises:

PineconeValueError – If no API key can be resolved from arguments or environment variables.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    index = await pc.index(name="my-index")
    async with index:
        results = await index.query(
            vector=[0.012, -0.087, 0.153, ...],  # 1536-dim embedding
            top_k=10,
        )

Note

Differences from sync Pinecone

  1. index() is a coroutine. Unlike the sync Pinecone client, AsyncPinecone.index() must be awaited: idx = await pc.index(name="my-index"). On cache miss it performs a non-blocking describe call to resolve the host — no manual two-step dance needed.

  2. upsert_from_dataframe() is not supported. AsyncIndex raises NotImplementedError for this method. Use batched upsert() calls instead.

  3. No grpc parameter on index(). Async gRPC transport is not yet available, so the grpc option accepted by the sync client is absent here.
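For difference 2, replacing upsert_from_dataframe() with batched upsert() calls amounts to plain batching. A hedged sketch — the record shape and batch size below are illustrative assumptions, not SDK requirements; each batch would be passed to await index.upsert(vectors=batch) inside the async context:

```python
# Split records into fixed-size batches; each batch is then upserted
# with a separate await index.upsert(vectors=batch) call.
def chunks(records, size=100):
    for i in range(0, len(records), size):
        yield records[i:i + size]

records = [(f"id-{i}", [0.0] * 8) for i in range(250)]
batch_sizes = [len(b) for b in chunks(records, size=100)]
print(batch_sizes)  # [100, 100, 50]
```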

IndexAsyncio(host, **kwargs)[source]

Backwards-compatibility shim for AsyncPinecone.index().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.index(host=...) (where pc is an AsyncPinecone instance) instead of pc.IndexAsyncio(...).

Parameters:
  • host (str)

  • kwargs (Any)

Return type:

AsyncIndex

async __aenter__()[source]
Return type:

AsyncPinecone

async __aexit__(*args)[source]
Parameters:

args (Any)

Return type:

None

__init__(api_key=None, *, host=None, additional_headers=None, source_tag=None, proxy_url=None, proxy_headers=None, ssl_ca_certs=None, ssl_verify=True, timeout=30.0, connection_pool_maxsize=0, retry_config=None)[source]
Parameters:
  • api_key (str | None)

  • host (str | None)

  • additional_headers (dict[str, str] | None)

  • source_tag (str | None)

  • proxy_url (str | None)

  • proxy_headers (dict[str, str] | None)

  • ssl_ca_certs (str | None)

  • ssl_verify (bool)

  • timeout (float)

  • connection_pool_maxsize (int)

  • retry_config (RetryConfig | None)

Return type:

None

property assistant: _AsyncAssistantNamespaceProxy

Convenience alias for AsyncPinecone.assistants.

Returns a proxy that supports both namespace-style access (pc.assistant.create_assistant(...)) and the convenience call form (await pc.assistant("my-name") — shortcut for await pc.assistants.describe(name="my-name")).

The canonical entry point is AsyncPinecone.assistants; this alias is provided for ergonomic singular-form access and is not deprecated.

property assistants: AsyncAssistants

Access the AsyncAssistants namespace for assistant operations.

Lazily imported and instantiated on first access.

Returns:

AsyncAssistants namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    assistants = await pc.assistants.list()
property backups: AsyncBackups

Access the AsyncBackups namespace for control-plane backup operations.

Lazily imported and instantiated on first access.

Returns:

AsyncBackups namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    for backup in await pc.backups.list():
        print(backup.backup_id)
async close()[source]

Close the underlying HTTP client.

Return type:

None

property collections: AsyncCollections

Access the AsyncCollections namespace for control-plane collection operations.

Lazily imported and instantiated on first access.

Returns:

AsyncCollections namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    for col in await pc.collections.list():
        print(col.name)
property config: PineconeConfig

The resolved configuration for this client.

async configure_index(name, replicas=None, pod_type=None, deletion_protection=None, tags=None, embed=None, read_capacity=None)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.configure().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.configure(...) instead of await pc.configure_index(...).

Parameters:
  • name (str)

  • replicas (int | None)

  • pod_type (str | None)

  • deletion_protection (DeletionProtection | str | None)

  • tags (dict[str, str] | None)

  • embed (dict[str, Any] | None)

  • read_capacity (dict[str, Any] | None)

Return type:

None

async create_backup(*, index_name, backup_name=None, description='')[source]

Backwards-compatibility shim for AsyncPinecone.backups.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.backups.create(...) instead of await pc.create_backup(...).

Parameters:
  • index_name (str)

  • backup_name (str | None)

  • description (str)

Return type:

BackupModel

async create_collection(name, source)[source]

Backwards-compatibility shim for AsyncPinecone.collections.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.collections.create(...) instead of await pc.create_collection(...).

Parameters:
  • name (str)

  • source (str)

Return type:

CollectionModel

async create_index(name, spec, dimension=None, metric='cosine', timeout=None, deletion_protection='disabled', vector_type='dense', tags=None)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.create(...) instead of await pc.create_index(...).

Parameters:
  • name (str)

  • spec (ServerlessSpec | PodSpec | ByocSpec | IntegratedSpec | dict[str, Any])

  • dimension (int | None)

  • metric (Metric | str)

  • timeout (int | None)

  • deletion_protection (DeletionProtection | str)

  • vector_type (VectorType | str)

  • tags (dict[str, str] | None)

Return type:

IndexModel

async create_index_for_model(name, cloud, region, embed, tags=None, deletion_protection='disabled', read_capacity=None, schema=None, timeout=None)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.create(...) with IntegratedSpec(cloud=..., region=..., embed=EmbedConfig(...)) instead of await pc.create_index_for_model(...).

Parameters:
  • name (str)

  • cloud (str)

  • region (str)

  • embed (EmbedConfig | dict[str, Any])

  • tags (dict[str, str] | None)

  • deletion_protection (DeletionProtection | str)

  • read_capacity (dict[str, Any] | None)

  • schema (dict[str, Any] | None)

  • timeout (int | None)

Return type:

IndexModel

async create_index_from_backup(*, name, backup_id, deletion_protection=None, tags=None, timeout=None)[source]

Create a new index by restoring from a backup.

Sends a POST to /backups/{backup_id}/create-index and then polls until the index is ready (unless timeout is -1).

Parameters:
  • name (str) – Name for the new index.

  • backup_id (str) – Identifier of the backup to restore from.

  • deletion_protection (DeletionProtection | str | None) – "enabled" or "disabled". Defaults to "disabled" server-side when omitted.

  • tags (dict[str, str] | None) – Optional key-value tags for the new index.

  • timeout (int | None) – Seconds to wait for readiness. None (default) blocks up to 300 s. -1 returns immediately without polling.

Returns:

An IndexModel describing the restored index.

Return type:

IndexModel

Examples

# Restore an index from a backup
from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    index = await pc.create_index_from_backup(
        name="product-search-restored",
        backup_id="bk-daily-20240115",
    )
# Restore with tags and deletion protection
async with AsyncPinecone(api_key="your-api-key") as pc:
    index = await pc.create_index_from_backup(
        name="product-search-restored",
        backup_id="bk-daily-20240115",
        deletion_protection="enabled",
        tags={"env": "production", "team": "search"},
    )
async delete_backup(*, backup_id)[source]

Backwards-compatibility shim for AsyncPinecone.backups.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.backups.delete(...) instead of await pc.delete_backup(...).

Parameters:

backup_id (str)

Return type:

None

async delete_collection(name)[source]

Backwards-compatibility shim for AsyncPinecone.collections.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.collections.delete(...) instead of await pc.delete_collection(...).

Parameters:

name (str)

Return type:

None

async delete_index(name, timeout=None)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.delete(...) instead of await pc.delete_index(...).

Parameters:
  • name (str)

  • timeout (int | None)

Return type:

None

async describe_backup(*, backup_id)[source]

Backwards-compatibility shim for AsyncPinecone.backups.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.backups.describe(...) instead of await pc.describe_backup(...).

Parameters:

backup_id (str)

Return type:

BackupModel

async describe_collection(name)[source]

Backwards-compatibility shim for AsyncPinecone.collections.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.collections.describe(...) instead of await pc.describe_collection(...).

Parameters:

name (str)

Return type:

CollectionModel

async describe_index(name)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.describe(...) instead of await pc.describe_index(...).

Parameters:

name (str)

Return type:

IndexModel

async describe_restore_job(*, job_id)[source]

Backwards-compatibility shim for AsyncPinecone.restore_jobs.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.restore_jobs.describe(...) instead of await pc.describe_restore_job(...).

Parameters:

job_id (str)

Return type:

RestoreJobModel

async has_index(name)[source]

Backwards-compatibility shim for AsyncPinecone.indexes.exists().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.exists(...) instead of await pc.has_index(...).

Parameters:

name (str)

Return type:

bool

async index(name='', *, host='')[source]

Create an async data-plane client targeting a specific index.

Can target by host URL directly (skips the describe call) or by index name (triggers an async describe-index lookup to resolve the host on cache miss).

See also

Use pc.indexes for control-plane operations (create, list, describe, delete, configure).

Parameters:
  • name (str) – Name of the index. Triggers an async describe call to resolve host on cache miss.

  • host (str) – Direct host URL of the index. Skips the describe call.

Returns:

An AsyncIndex data-plane client.

Raises:
  • ValidationError – If neither name nor host is provided.

  • NotFoundError – If name is given but no such index exists.

Return type:

AsyncIndex

Examples

async with AsyncPinecone(api_key="...") as pc:
    idx = await pc.index(host="my-index-abc123.svc.pinecone.io")
    # or
    idx = await pc.index(name="my-index")  # triggers describe on cache miss

Warning

The returned AsyncIndex manages its own HTTP client. Always use async with index: or call await index.close() when done — closing the parent AsyncPinecone does not close index clients.

property indexes: AsyncIndexes

Access the AsyncIndexes namespace for control-plane index operations.

Lazily imported and instantiated on first access.

Returns:

AsyncIndexes namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    for idx in await pc.indexes.list():
        print(idx.name)
property inference: AsyncInference

Access the AsyncInference namespace for inference operations.

Lazily imported and instantiated on first access.

Returns:

AsyncInference namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    embeddings = await pc.inference.embed(
        model="multilingual-e5-large",
        inputs=["Hello, world!"],
    )
async list_backups(*, index_name=None, limit=10, pagination_token=None)[source]

Backwards-compatibility shim for AsyncPinecone.backups.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.backups.list(...) instead of await pc.list_backups(...).

Parameters:
  • index_name (str | None)

  • limit (int | None)

  • pagination_token (str | None)

Return type:

BackupList

async list_collections()[source]

Backwards-compatibility shim for AsyncPinecone.collections.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.collections.list() instead of await pc.list_collections().

Return type:

CollectionList

async list_indexes()[source]

Backwards-compatibility shim for AsyncPinecone.indexes.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.indexes.list() instead of await pc.list_indexes().

Return type:

IndexList

async list_restore_jobs(*, limit=10, pagination_token=None)[source]

Backwards-compatibility shim for AsyncPinecone.restore_jobs.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use await pc.restore_jobs.list(...) instead of await pc.list_restore_jobs(...).

Parameters:
  • limit (int | None)

  • pagination_token (str | None)

Return type:

RestoreJobList

property preview: AsyncPreview

Access the Preview namespace for pre-release API features.

Lazily imported and instantiated on first access. Preview surface is not covered by SemVer — signatures and behavior may change in any minor SDK release.

Returns:

AsyncPreview namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    await pc.preview.indexes.create(...)  # when a preview area exists
property restore_jobs: AsyncRestoreJobs

Access the AsyncRestoreJobs namespace for restore job operations.

Lazily imported and instantiated on first access.

Returns:

AsyncRestoreJobs namespace instance.

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    for job in await pc.restore_jobs.list():
        print(job.restore_job_id)

AsyncIndexes

class pinecone.async_client.indexes.AsyncIndexes(http, host_cache=None)[source]

Bases: object

Async control-plane operations for Pinecone indexes.

Provides list, describe, exists, create, delete, and configure methods.

Parameters:
  • http (AsyncHTTPClient) – Async HTTP client for making API requests.

  • host_cache (dict[str, str] | None)

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    for idx in await pc.indexes.list():
        print(idx.name)
__init__(http, host_cache=None)[source]
Parameters:
  • http (AsyncHTTPClient)

  • host_cache (dict[str, str] | None)

Return type:

None

async configure(name, *, replicas=None, pod_type=None, deletion_protection=None, tags=None, embed=None, read_capacity=None)[source]

Configure an existing index.

Updates mutable properties of an index such as replicas, pod type, deletion protection, tags, and read capacity.

Parameters:
  • name (str) – The name of the index to configure.

  • replicas (int | None) – Number of replicas for pod-based indexes.

  • pod_type (str | None) – Pod type for pod-based indexes (e.g. "p1.x2").

  • deletion_protection (DeletionProtection | str | None) – "enabled" or "disabled".

  • tags (dict[str, str] | None) – Key-value tags to merge with existing tags. Set a value to "" to remove a tag.

  • embed (dict[str, Any] | None) – Integrated index embed configuration updates. Forwarded verbatim as the embed key in the PATCH body.

  • read_capacity (dict[str, Any] | None) – Read capacity configuration for BYOC indexes. Pass {"mode": "OnDemand"} or {"mode": "Dedicated", "dedicated": {"node_type": "t1", "scaling": "Manual", "manual": {"replicas": 2, "shards": 1}}}.

Return type:

None

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    await pc.indexes.configure("my-index", replicas=4)
    await pc.indexes.configure("my-index", tags={"env": "prod"})
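The tag-merge rule above (updates overwrite existing tags; an empty-string value removes a tag) can be sketched as plain dict logic. This is a hedged illustration of the documented server-side behavior, not SDK code:

```python
def merge_tags(existing, updates):
    # Documented semantics: updates overwrite existing keys, and a
    # value of "" removes the tag entirely.
    merged = {**existing, **updates}
    return {k: v for k, v in merged.items() if v != ""}

before = {"env": "dev", "team": "search"}
after = merge_tags(before, {"env": "prod", "team": ""})
print(after)  # {'env': 'prod'}
```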
async create(*, name, spec, dimension=None, metric='cosine', vector_type='dense', deletion_protection='disabled', tags=None, schema=None, timeout=None)[source]

Create a new Pinecone index.

Supports serverless, pod-based, BYOC (bring your own cloud), and integrated (model-backed) index creation. Integrated indexes use Pinecone’s built-in embedding models so dimension and metric are inferred from the model.

Parameters:
  • name (str) – Name for the new index.

  • spec (ServerlessSpec | PodSpec | ByocSpec | IntegratedSpec | dict[str, Any]) – Deployment spec — a ServerlessSpec, PodSpec, ByocSpec, IntegratedSpec, or raw dict.

  • dimension (int | None) – Vector dimension (required for dense non-integrated indexes).

  • metric (Metric | str) – Similarity metric (cosine, euclidean, dotproduct).

  • vector_type (VectorType | str) – Vector type (dense or sparse).

  • deletion_protection (DeletionProtection | str) – Whether deletion protection is enabled.

  • tags (dict[str, str] | None) – Optional key-value tags.

  • schema (dict[str, Any] | None) – Optional metadata schema defining field types for indexing. Accepts both flat format ({"field": {"type": "str"}}) and nested format ({"fields": {"field": {"type": "str"}}}).

  • timeout (int | None) – Seconds to wait for the index to become ready. Use None (default) to poll every 5 seconds indefinitely, with no upper time bound. Use a positive int to poll with a deadline, or -1 to return immediately without polling. Raises PineconeTimeoutError if the index is not ready before the deadline, or IndexInitFailedError if initialization fails.

Returns:

IndexModel describing the created index.

Return type:

IndexModel

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    await pc.indexes.create(
        name="my-index",
        dimension=1536,
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

    await pc.indexes.create(
        name="my-integrated-index",
        spec=IntegratedSpec(
            cloud="aws",
            region="us-east-1",
            embed=EmbedConfig(
                model="multilingual-e5-large",
                field_map={"text": "my_text_field"},
            ),
        ),
    )
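The timeout semantics above (None polls indefinitely, a positive value sets a deadline, -1 returns immediately) can be sketched as follows. The 5-second interval comes from the docs; ready() is a stub standing in for the real describe-based readiness check, and the test values are illustrative:

```python
import asyncio
import time

async def wait_until_ready(ready, timeout=None, interval=5.0):
    # None: poll forever; positive int: poll with a deadline; -1: no polling.
    if timeout == -1:
        return
    deadline = None if timeout is None else time.monotonic() + timeout
    while not await ready():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("index not ready before deadline")
        await asyncio.sleep(interval)

# Stub readiness check that succeeds on the third poll.
state = {"calls": 0}
async def ready():
    state["calls"] += 1
    return state["calls"] >= 3

asyncio.run(wait_until_ready(ready, timeout=10, interval=0.001))
print(state["calls"])  # 3
```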
async delete(name, *, timeout=None)[source]

Delete an index by name.

After sending the delete request, removes the cached host URL for the index. By default, polls every 5 seconds until the index disappears with no upper time bound.

Parameters:
  • name (str) – The name of the index to delete.

  • timeout (int | None) – Seconds to wait for the index to disappear. Use None (default) to poll indefinitely until the index is gone. Use a positive int to poll with a deadline. Use -1 to return immediately without polling.

Return type:

None

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    await pc.indexes.delete("my-index")

    # Wait up to 60 seconds for deletion to complete
    await pc.indexes.delete("my-index", timeout=60)
async describe(name)[source]

Get detailed information about a named index.

After a successful call the host URL is cached internally for later data-plane client construction.

Parameters:

name (str) – The name of the index to describe.

Returns:

IndexModel with name, dimension, metric, host, spec, status, deletion_protection, vector_type, and tags.

Return type:

IndexModel

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    desc = await pc.indexes.describe("my-index")
    print(desc.host)
async exists(name)[source]

Check whether a named index exists.

Uses describe internally; returns True on success and False when a 404 is returned.

Parameters:

name (str) – The name of the index to check.

Returns:

True if the index exists, False otherwise. Returns False immediately without a network call if name is empty or whitespace-only.

Raises:

ApiError – If the API returns an error other than 404.

Return type:

bool

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    if await pc.indexes.exists("my-index"):
        print("Index found")
async list()[source]

List all indexes in the project.

Returns all indexes in a single response without filtering, sorting, or pagination.

Returns:

IndexList supporting iteration, len(), index access, and a names() convenience method.

Raises:

ApiError – If the API returns an error response (e.g. authentication failure or server error).

Return type:

IndexList

Examples

async with AsyncPinecone(api_key="your-api-key") as pc:
    indexes = await pc.indexes.list()
    print(indexes.names())

AsyncCollections

class pinecone.async_client.collections.AsyncCollections(http)[source]

Bases: object

Async control-plane operations for Pinecone collections.

Provides methods to create, list, describe, and delete collections.

Parameters:

http (AsyncHTTPClient) – Async HTTP client for making API requests.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    for col in await pc.collections.list():
        print(col.name)
__init__(http)[source]
Parameters:

http (AsyncHTTPClient)

Return type:

None

async create(*, name, source)[source]

Create a collection from an existing index.

Returns immediately after the API call without polling for readiness.

Parameters:
  • name (str) – Name for the new collection.

  • source (str) – Name of the source index.

Returns:

A CollectionModel describing the created collection.

Raises:

ValidationError – If name or source is empty.

Return type:

CollectionModel

Examples

col = await pc.collections.create(name="my-collection", source="my-index")
print(col.status)
async delete(name)[source]

Delete a collection by name.

Parameters:

name (str) – The name of the collection to delete.

Raises:
  • ValidationError – If name is empty.

  • NotFoundError – If the collection does not exist.

Return type:

None

Examples

await pc.collections.delete("my-collection")
async describe(name)[source]

Get detailed information about a named collection.

Parameters:

name (str) – The name of the collection to describe.

Returns:

A CollectionModel with name, status, size, dimension, vector_count, and environment.

Raises:
  • ValidationError – If name is empty.

  • NotFoundError – If the collection does not exist.

Return type:

CollectionModel

Examples

desc = await pc.collections.describe("my-collection")
print(desc.size)
async list()[source]

List all collections in the project.

Returns all collections in a single response without filtering, sorting, or pagination.

Returns:

A CollectionList supporting iteration, len(), index access, and a names() convenience method.

Return type:

CollectionList

Examples

collections = await pc.collections.list()
print(collections.names())
for col in collections:
    print(col.name, col.status)

AsyncBackups

class pinecone.async_client.backups.AsyncBackups(http)[source]

Bases: object

Async control-plane operations for Pinecone backups.

Provides methods to create, list, describe, and delete backups.

Parameters:

http (AsyncHTTPClient) – Async HTTP client for making API requests.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    for backup in await pc.backups.list():
        print(backup.backup_id)
__init__(http)[source]
Parameters:

http (AsyncHTTPClient)

Return type:

None

async create(*, index_name, name=None, description='')[source]

Create a backup of an existing index.

Parameters:
  • index_name (str) – Name of the index to back up.

  • name (str | None) – Optional name for the backup.

  • description (str) – Description for the backup (defaults to empty string).

Returns:

A BackupModel describing the created backup.

Return type:

BackupModel

Examples

# Create a backup of an index
from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    backup = await pc.backups.create(
        index_name="product-search",
    )
    print(backup.backup_id)
# Create a backup with a name and description
async with AsyncPinecone(api_key="your-api-key") as pc:
    backup = await pc.backups.create(
        index_name="product-search",
        name="daily-20240115",
        description="Scheduled daily backup before reindexing",
    )
async delete(*, backup_id)[source]

Delete a backup.

Parameters:

backup_id (str) – The identifier of the backup to delete.

Return type:

None

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    await pc.backups.delete(backup_id="bk-daily-20240115")
async describe(*, backup_id)[source]

Get detailed information about a backup.

Parameters:

backup_id (str) – The identifier of the backup to describe.

Returns:

A BackupModel with full backup details.

Return type:

BackupModel

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    backup = await pc.backups.describe(
        backup_id="bk-daily-20240115",
    )
    print(backup.status)
async get(*, backup_id)[source]

Get detailed information about a backup (alias for describe()).

Parameters:

backup_id (str) – The identifier of the backup.

Returns:

A BackupModel with full backup details.

Return type:

BackupModel

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    backup = await pc.backups.get(
        backup_id="bk-daily-20240115",
    )
    print(backup.status)
async list(*, index_name=None, limit=10, pagination_token=None)[source]

List backups.

When index_name is provided, lists backups for that index only. Otherwise lists all backups in the project.

Parameters:
  • index_name (str | None) – Index name to filter by, or None for all.

  • limit (int) – Maximum number of results per page. Defaults to 10.

  • pagination_token (str | None) – Token for cursor-based pagination.

Returns:

A BackupList supporting iteration, len(), and index access.

Raises:

ApiError – If the API returns an error response.

Return type:

BackupList

Examples

# List all backups in the project
from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    for backup in await pc.backups.list():
        print(backup.backup_id, backup.name)
# List backups for a specific index
async with AsyncPinecone(api_key="your-api-key") as pc:
    for backup in await pc.backups.list(
        index_name="product-search",
    ):
        print(backup.name)
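The pagination_token cursor can be consumed in a loop until the service stops returning a token. The sketch below uses a stub in place of pc.backups.list() since the real call is network-bound; the .pagination_token attribute name on the returned page is an assumption for illustration:

```python
import asyncio

# Stub pages keyed by token; the real pc.backups.list() is awaited
# the same way.
class StubPage:
    def __init__(self, items, token):
        self._items = items
        self.pagination_token = token
    def __iter__(self):
        return iter(self._items)

PAGES = {None: StubPage(["bk-1", "bk-2"], "t1"), "t1": StubPage(["bk-3"], None)}

async def list_backups(*, limit=10, pagination_token=None):
    return PAGES[pagination_token]

async def all_backup_ids():
    ids, token = [], None
    while True:
        page = await list_backups(limit=2, pagination_token=token)
        ids.extend(page)
        token = page.pagination_token
        if token is None:
            return ids

print(asyncio.run(all_backup_ids()))  # ['bk-1', 'bk-2', 'bk-3']
```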

AsyncRestoreJobs

class pinecone.async_client.restore_jobs.AsyncRestoreJobs(http)[source]

Bases: object

Async control-plane operations for Pinecone restore jobs.

Provides methods to list and describe restore jobs.

Parameters:

http (AsyncHTTPClient) – Async HTTP client for making API requests.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    for job in await pc.restore_jobs.list():
        print(job.restore_job_id)
__init__(http)[source]
Parameters:

http (AsyncHTTPClient)

Return type:

None

async describe(*, job_id)[source]

Get detailed information about a restore job.

Parameters:

job_id (str) – The identifier of the restore job to describe.

Returns:

A RestoreJobModel with full restore job details.

Return type:

RestoreJobModel

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    job = await pc.restore_jobs.describe(
        job_id="rj-restore-20240115",
    )
    print(job.status)
async list(*, limit=10, pagination_token=None)[source]

List all restore jobs in the project.

Supports cursor-based pagination. Defaults to 10 results per page.

Parameters:
  • limit (int) – Maximum number of results per page. Defaults to 10.

  • pagination_token (str | None) – Token for cursor-based pagination.

Returns:

A RestoreJobList supporting iteration, len(), and index access.

Raises:

ApiError – If the API returns an error response.

Return type:

RestoreJobList

Examples

# List all restore jobs
from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    for job in await pc.restore_jobs.list():
        print(job.restore_job_id, job.status)
# List with a page size limit
async with AsyncPinecone(api_key="your-api-key") as pc:
    jobs = await pc.restore_jobs.list(limit=5)
    print(len(jobs))

AsyncInference

class pinecone.async_client.inference.AsyncInference(config)[source]

Bases: object

Asynchronous operations for Pinecone inference (embed & rerank).

Provides async methods to generate embeddings and rerank documents using Pinecone’s hosted models.

Parameters:

config (PineconeConfig) – SDK configuration used to construct an async HTTP client targeting the inference API version.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    embeddings = await pc.inference.embed(
        model="multilingual-e5-large",
        inputs=["Hello, world!"],
    )
class EmbedModel(value)

Bases: str, Enum

Known embedding models for integrated indexes.

Multilingual_E5_Large = 'multilingual-e5-large'
Pinecone_Sparse_English_V0 = 'pinecone-sparse-english-v0'
class RerankModel(value)

Bases: str, Enum

Known reranking models.

Bge_Reranker_V2_M3 = 'bge-reranker-v2-m3'
Cohere_Rerank_3_5 = 'cohere-rerank-3.5'
Pinecone_Rerank_V0 = 'pinecone-rerank-v0'
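Because these enums subclass str, members compare equal to their model-name strings and can be passed anywhere a plain model name is accepted. A minimal sketch mirroring the values documented above (a local re-declaration for illustration, not the SDK's class):

```python
from enum import Enum

# Local mirror of the documented str-valued enum; the SDK's EmbedModel
# and RerankModel behave the same way because they subclass str.
class EmbedModel(str, Enum):
    Multilingual_E5_Large = "multilingual-e5-large"
    Pinecone_Sparse_English_V0 = "pinecone-sparse-english-v0"

# A str-enum member can stand in for the raw string, e.g. in
# pc.inference.embed(model=EmbedModel.Multilingual_E5_Large, ...).
print(EmbedModel.Multilingual_E5_Large == "multilingual-e5-large")  # True
```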
__init__(config)[source]
Parameters:

config (PineconeConfig)

Return type:

None

async close()[source]

Close the underlying HTTP client.

Return type:

None

async embed(model, inputs, parameters=None)[source]

Generate embeddings for the provided inputs.

Parameters:
  • model (EmbedModel | str) – Embedding model name.

  • inputs (str | list[str] | list[dict[str, Any]]) – Text inputs. A single string is automatically wrapped.

  • parameters (dict[str, Any] | None) – Model-specific parameters (e.g., {"input_type": "passage", "truncate": "END"}).

Returns:

An EmbeddingsList with .data, .model, and .usage.

Raises:

ApiError – If the API returns an error response.

Return type:

EmbeddingsList

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    embeddings = await pc.inference.embed(
        model="multilingual-e5-large",
        inputs=["Hello, world!"],
        parameters={"input_type": "passage"},
    )
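The vectors in the returned EmbeddingsList are plain lists of floats, typically compared with cosine similarity. A minimal sketch on placeholder vectors (real multilingual-e5-large embeddings have 1024 dimensions; these short vectors are illustrative only):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors standing in for values from an EmbeddingsList.
query_vec = [0.1, 0.3, 0.5]
passage_vec = [0.2, 0.1, 0.4]
print(round(cosine_similarity(query_vec, passage_vec), 4))
```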
async get_model(*, model=None, **kwargs)[source]

Get detailed information about a specific model.

Parameters:
  • model (str) – The model identifier to look up.

  • kwargs (str)

Returns:

A ModelInfo with full model details.

Raises:

ApiError – If the API returns an error response.

Return type:

ModelInfo

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    model_info = await pc.inference.get_model(
        model="multilingual-e5-large",
    )
    model_info.type
async list_models(*, type=None, vector_type=None)[source]

List available inference models.

Parameters:
  • type (str | None) – Filter by model type ("embed" or "rerank").

  • vector_type (str | None) – Filter by vector type ("dense" or "sparse"). Only relevant when type="embed".

Returns:

A ModelInfoList supporting iteration, len(), and .names().

Raises:

ApiError – If the API returns an error response.

Return type:

ModelInfoList

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    models = await pc.inference.list_models()
    models.names()
property model: AsyncModelResource

Lazily-initialized resource for listing and getting model info.

Returns:

An AsyncModelResource that exposes .list() and .get() methods.

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    models = await pc.inference.model.list()
    info = await pc.inference.model.get("multilingual-e5-large")
async rerank(model, query, documents, rank_fields=None, return_documents=True, top_n=None, parameters=None)[source]

Rerank documents by relevance to a query.

Parameters:
  • model (RerankModel | str) – Reranking model name.

  • query (str) – Query text to rank against.

  • documents (list[str] | list[dict[str, Any]]) – Documents to rank. Strings are auto-wrapped as {"text": ...}.

  • rank_fields (list[str] | None) – Document fields to rank on. Defaults to ["text"].

  • return_documents (bool) – Include document text in response. Defaults to True.

  • top_n (int | None) – Number of top documents to return. None returns all.

  • parameters (dict[str, Any] | None) – Model-specific parameters.

Returns:

A RerankResult with .data and .usage.

Raises:

ApiError – If the API returns an error response.

Return type:

RerankResult

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    result = await pc.inference.rerank(
        model="bge-reranker-v2-m3",
        query="Tell me about tech companies",
        documents=["Apple is a fruit.", "Acme Inc. revolutionized tech."],
        top_n=1,
    )
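Conceptually, top_n keeps only the n documents with the highest relevance scores. A minimal sketch of that selection step, using illustrative scores rather than real model output:

```python
# Sketch of top-n selection: pair each document with its relevance
# score, sort descending, keep the best n. Scores are illustrative.
docs = ["Apple is a fruit.", "Acme Inc. revolutionized tech."]
scores = [0.12, 0.91]

def top_n(documents, scores, n):
    ranked = sorted(zip(scores, documents), reverse=True)
    return [(doc, score) for score, doc in ranked[:n]]

print(top_n(docs, scores, 1))
```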

AsyncAssistants

class pinecone.async_client.assistants.AsyncAssistants(config)[source]

Bases: AsyncAssistantsLegacyNamespaceMixin

Async control-plane operations for Pinecone assistants.

Parameters:

config (PineconeConfig) – SDK configuration used to construct an HTTP client targeting the assistant API version.

Examples

from pinecone import AsyncPinecone

async with AsyncPinecone(api_key="your-api-key") as pc:
    assistants = pc.assistants
__init__(config)[source]
Parameters:

config (PineconeConfig)

Return type:

None

async chat(*, assistant_name, messages, model='gpt-4o', stream=False, temperature=None, filter=None, json_response=False, include_highlights=False, context_options=None)[source]

Chat with an assistant and receive citations in Pinecone-native format.

Parameters:
  • assistant_name (str) – Name of the assistant to chat with.

  • messages (list[Message | dict[str, str]]) – Conversation messages. Dicts are converted to Message objects; role defaults to "user" when not present.

  • model (str) – Large language model to use. Defaults to "gpt-4o".

  • stream (bool) – If True, return an AsyncChatStream. Defaults to False.

  • temperature (float | None) – Controls randomness. Lower values produce more deterministic responses. Omitted from request when None.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents are used as context. Omitted from request when None.

  • json_response (bool) – If True, instruct the assistant to return a JSON response. Cannot be used with streaming.

  • include_highlights (bool) – If True, include highlight snippets from referenced documents in citations.

  • context_options (ContextOptions | dict[str, Any] | None) – Options controlling context retrieval. Omitted from request when None.

Returns:

ChatResponse for non-streaming requests, or an AsyncChatStream for streaming requests.

Raises:
  • PineconeValueError – If both stream=True and json_response=True are specified.

  • ApiError – If the API returns an error response.

Return type:

ChatResponse | AsyncChatStream

Examples

# Non-streaming chat
import asyncio
from pinecone import AsyncPinecone

pc = AsyncPinecone(api_key="your-api-key")

async def main() -> None:
    response = await pc.assistants.chat(
        assistant_name="my-assistant",
        messages=[{"content": "What is Pinecone?"}],
    )
    print(response.message.content)
asyncio.run(main())
# Streaming chat
async def stream_main() -> None:
    stream = await pc.assistants.chat(
        assistant_name="my-assistant",
        messages=[{"content": "What is Pinecone?"}],
        stream=True,
    )
    async for text in stream.text():
        print(text, end="", flush=True)
asyncio.run(stream_main())
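The streaming pattern above is ordinary async iteration: text() yields string deltas as they arrive, and the caller concatenates them. A self-contained sketch with a stubbed async generator standing in for AsyncChatStream.text() (no network involved):

```python
import asyncio

async def fake_text_stream():
    """Stand-in for AsyncChatStream.text(): yields text deltas."""
    for delta in ["Pinecone ", "is ", "a ", "vector ", "database."]:
        await asyncio.sleep(0)   # yield control, as a network read would
        yield delta

async def consume():
    chunks = []
    async for text in fake_text_stream():
        chunks.append(text)
    return "".join(chunks)

print(asyncio.run(consume()))  # Pinecone is a vector database.
```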
async chat_completions(*, assistant_name, messages, model='gpt-4o', stream=False, temperature=None, filter=None)[source]

Chat with an assistant using an OpenAI-compatible interface.

Returns responses in OpenAI chat completion format. Useful when you need inline citations or OpenAI-compatible responses. Has limited functionality compared to the standard chat() interface — does not support include_highlights, context_options, or json_response parameters.

Parameters:
  • assistant_name (str) – Name of the assistant to chat with.

  • messages (list[Message | dict[str, str]]) – Conversation messages. Dicts are converted to Message objects; role defaults to "user" when not present.

  • model (str) – Large language model to use. Defaults to "gpt-4o". Not validated client-side — any string is accepted.

  • stream (bool) – If True, return an async streaming iterator. Defaults to False.

  • temperature (float | None) – Controls randomness. Lower values produce more deterministic responses. Omitted from request when None.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents are used as context. Omitted from request when None.

Returns:

ChatCompletionResponse for non-streaming requests, or an AsyncIterator[ChatCompletionStreamChunk] for streaming.

Raises:

ApiError – If the API returns an error response.

Return type:

ChatCompletionResponse | AsyncChatCompletionStream

Examples

# Non-streaming chat completion
import asyncio
from pinecone import AsyncPinecone

pc = AsyncPinecone(api_key="your-api-key")

async def main() -> None:
    response = await pc.assistants.chat_completions(
        assistant_name="research-assistant",
        messages=[{"content": "Explain quantum entanglement briefly."}],
    )
    print(response.choices[0].message.content)
asyncio.run(main())
# Streaming chat completion
async def stream_main() -> None:
    stream = await pc.assistants.chat_completions(
        assistant_name="research-assistant",
        messages=[{"content": "Explain quantum entanglement briefly."}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk)
asyncio.run(stream_main())
async close()[source]

Close the underlying HTTP client and any cached data-plane clients.

Return type:

None

async context(*, assistant_name, query=None, messages=None, filter=None, top_k=None, snippet_size=None, multimodal=None, include_binary_content=None)[source]

Retrieve relevant context snippets from a Pinecone assistant.

Retrieves context snippets matching a text query or conversation history. Exactly one of query or messages must be provided and non-empty.

Parameters:
  • assistant_name (str) – Name of the assistant to retrieve context from.

  • query (str | None) – Text query to use for context retrieval. Mutually exclusive with messages. Empty string is treated as not provided.

  • messages (list[Message | dict[str, str]] | None) – Conversation messages to use for context retrieval. Mutually exclusive with query. Empty list is treated as not provided. Dicts are converted to Message objects.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents contribute context. Omitted from request when None.

  • top_k (int | None) – Maximum number of context snippets to return. Omitted from request when None.

  • snippet_size (int | None) – Maximum snippet size in tokens. Omitted from request when None.

  • multimodal (bool | None) – Whether to include image-related context snippets. Omitted from request when None.

  • include_binary_content (bool | None) – Whether image snippets include base64 image data. Only meaningful when multimodal is True. Omitted from request when None.

Returns:

ContextResponse containing the matching context snippets.

Raises:
  • PineconeValueError – If both or neither of query and messages are provided (or if they are empty).

  • ApiError – If the API returns an error response.

Return type:

ContextResponse

Examples

# Retrieve context using a text query
response = await pc.assistants.context(
    assistant_name="my-assistant",
    query="What is Pinecone?",
)
for snippet in response.snippets:
    print(snippet.content)
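The "exactly one of query or messages" rule treats an empty string or empty list as not provided. A minimal sketch of that validation, with FakeValueError standing in for PineconeValueError (a hypothetical helper, not the SDK's actual code):

```python
class FakeValueError(Exception):
    """Stand-in for PineconeValueError."""

def resolve_context_input(query=None, messages=None):
    """Sketch of the 'exactly one of query or messages' rule:
    empty string / empty list count as not provided."""
    has_query = bool(query)
    has_messages = bool(messages)
    if has_query == has_messages:   # both given, or neither given
        raise FakeValueError("provide exactly one of query or messages")
    return {"query": query} if has_query else {"messages": messages}

print(resolve_context_input(query="What is Pinecone?"))
```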
async create(*, name=None, instructions=None, metadata=None, region='us', timeout=None, **kwargs)[source]

Create a new Pinecone assistant.

Creates an assistant and optionally polls until it reaches "Ready" status. The assistant starts in "Initializing" status.

Parameters:
  • name (str) – Name for the new assistant. Must be 1-63 characters, start and end with an alphanumeric character, and consist only of lowercase alphanumeric characters or hyphens.

  • instructions (str | None) – Optional directive for the assistant. Maximum 16 KB.

  • metadata (dict[str, Any] | None) – Optional metadata dictionary. Defaults to an empty dict if not provided.

  • region (str) – Region to deploy the assistant in. Must be "us" or "eu" (case-sensitive). Defaults to "us".

  • timeout (float | None) – Seconds to wait for the assistant to become ready. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use 0 or a positive value to poll with a deadline.

  • kwargs (Any)

Returns:

AssistantModel describing the created assistant.

Raises:

ApiError – If the API returns an error response.

Return type:

AssistantModel

Examples

from pinecone import AsyncPinecone
async with AsyncPinecone(api_key="your-api-key") as pc:
    assistant = await pc.assistants.create(name="my-assistant")
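create(), delete(), delete_file(), and upload_file() share the same timeout convention: None polls until done, -1 skips polling, and a positive value polls with a deadline. A sketch of that logic against a stubbed describe callable (wait_until_ready and FakeTimeoutError are hypothetical stand-ins, not the SDK's internals):

```python
import time

class FakeTimeoutError(Exception):
    """Stand-in for PineconeTimeoutError."""

def wait_until_ready(describe, timeout=None, interval=0.0):
    """Polling sketch for the shared timeout convention:
    None -> poll until Ready, -1 -> single check, return immediately,
    positive -> poll until Ready or the deadline passes."""
    if timeout == -1:
        return describe()                 # no polling at all
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        status = describe()
        if status == "Ready":
            return status
        if deadline is not None and time.monotonic() >= deadline:
            raise FakeTimeoutError("not Ready before deadline")
        time.sleep(interval)

# Stub: the assistant becomes Ready on the third describe() call.
calls = iter(["Initializing", "Initializing", "Ready"])
print(wait_until_ready(lambda: next(calls)))  # Ready
```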
async delete(*, name=None, timeout=None, **kwargs)[source]

Delete a Pinecone assistant by name.

Sends a DELETE request, then polls every 5 seconds until the assistant is confirmed gone (404 from describe). Other errors during polling propagate immediately.

Parameters:
  • name (str) – The name of the assistant to delete.

  • timeout (float | None) – Seconds to wait for the assistant to disappear. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use a positive value to poll with a deadline. Raises PineconeTimeoutError if the assistant is not gone before the deadline.

  • **kwargs (Any) – Accepted for backwards compatibility only. Unknown kwargs raise PineconeValueError.

Returns:

None.

Raises:
  • ApiError – If the API returns an error response.

  • PineconeTimeoutError – If the assistant is not gone before the deadline.

Return type:

None

Examples

await pc.assistants.delete(name="my-assistant")

# Return immediately without waiting for deletion
await pc.assistants.delete(name="my-assistant", timeout=-1)
async delete_file(*, assistant_name, file_id, timeout=None)[source]

Delete a file from a Pinecone assistant.

Sends a DELETE request, then polls every 5 seconds until the file is confirmed gone (404 from describe_file). Other errors during polling propagate immediately.

Parameters:
  • assistant_name (str) – Name of the assistant that owns the file.

  • file_id (str) – Unique identifier of the file to delete.

  • timeout (float | None) – Seconds to wait for the file to be deleted. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use a positive value to poll with a deadline. Raises PineconeTimeoutError if the file is not gone before the deadline.

Raises:
  • ApiError – If the API returns an error response.

  • PineconeTimeoutError – If the file is not gone before the deadline.

Return type:

None

Examples

await pc.assistants.delete_file(
    assistant_name="my-assistant",
    file_id="file-abc123",
)
async describe(*, name=None, **kwargs)[source]

Get detailed information about a named assistant.

Parameters:
  • name (str) – The name of the assistant to describe.

  • kwargs (Any)

Returns:

AssistantModel with name, status, created_at, updated_at, metadata, instructions, and host.

Raises:

ApiError – If the API returns an error response (e.g. 404 when the assistant does not exist).

Return type:

AssistantModel

Examples

assistant = await pc.assistants.describe(name="my-assistant")
print(assistant.status)
async describe_file(*, assistant_name, file_id, include_url=False)[source]

Get the status and metadata of a file uploaded to an assistant.

Parameters:
  • assistant_name (str) – Name of the assistant that owns the file.

  • file_id (str) – Unique identifier of the file to retrieve.

  • include_url (bool) – If True, include a signed download URL in the response. Defaults to False.

Returns:

AssistantFileModel with file metadata and status.

Raises:

ApiError – If the API returns an error response.

Return type:

AssistantFileModel

Examples

file = await pc.assistants.describe_file(
    assistant_name="my-assistant",
    file_id="file-abc123",
)
print(file.status)
async evaluate_alignment(*, question, answer, ground_truth_answer)[source]

Evaluate answer alignment against a ground truth answer.

Measures the correctness and completeness of a generated answer with respect to a ground truth answer. Alignment is the harmonic mean of correctness (precision) and completeness (recall).

Parameters:
  • question (str) – The question for which the answer was generated.

  • answer (str) – The generated answer to evaluate.

  • ground_truth_answer (str) – The ground truth answer to compare against.

Returns:

AlignmentResult with aggregate scores, per-fact entailment results, and token usage statistics.

Raises:

ApiError – If the API returns an error response.

Return type:

AlignmentResult

Examples

result = await pc.assistants.evaluate_alignment(
    question="What is the capital of Spain?",
    answer="Barcelona.",
    ground_truth_answer="Madrid.",
)
print(result.scores.alignment)
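Since alignment is the harmonic mean of correctness (precision) and completeness (recall), the aggregate score can be reproduced with simple arithmetic. A sketch with illustrative values, not real evaluator output:

```python
def harmonic_mean(precision, recall):
    """Harmonic mean, as used for the alignment score."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative values only, not real evaluator output:
correctness, completeness = 0.8, 0.5
print(harmonic_mean(correctness, completeness))
```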
list(*, limit=None, pagination_token=None)[source]

List assistants in the project with transparent lazy pagination.

Parameters:
  • limit (int | None) – Maximum number of assistants to yield across all pages. None (default) yields all assistants.

  • pagination_token (str | None) – Token to resume pagination from a previous call.

Returns:

AsyncPaginator over AssistantModel objects. Supports async for loops, .to_list(), .pages(), and limit.

Raises:

ApiError – If the API returns an error response.

Return type:

AsyncPaginator[AssistantModel]

Examples

async for a in pc.assistants.list():
    print(a.name, a.status)

all_assistants = await pc.assistants.list().to_list()
list_files(*, assistant_name, filter=None, limit=None, pagination_token=None)[source]

List files for an assistant with lazy async pagination.

Parameters:
  • assistant_name (str) – Name of the assistant whose files to list.

  • filter (dict[str, Any] | None) – Optional metadata filter expression. Serialized to a JSON string before being sent to the API.

  • limit (int | None) – Maximum number of files to yield across all pages. None (default) yields all files.

  • pagination_token (str | None) – Token to resume pagination from a previous call.

Returns:

AsyncPaginator over AssistantFileModel objects. Supports async for loops, .to_list(), .pages(), and limit.

Raises:

ApiError – If the API returns an error response.

Return type:

AsyncPaginator[AssistantFileModel]

Examples

async for f in pc.assistants.list_files(assistant_name="my-assistant"):
    print(f.name, f.status)

files = await pc.assistants.list_files(assistant_name="my-assistant").to_list()
async list_files_page(*, assistant_name, page_size=None, pagination_token=None, filter=None, **kwargs)[source]

List one page of files for an assistant with explicit pagination control.

Parameters:
  • assistant_name (str) – Name of the assistant whose files to list.

  • page_size (int | None) – Maximum number of files per page.

  • pagination_token (str | None) – Token from a previous response to fetch the next page.

  • filter (dict[str, Any] | None) – Optional metadata filter expression. Serialized to a JSON string before being sent to the API.

  • kwargs (Any)

Returns:

ListFilesResponse with a files list and an optional next continuation token.

Raises:

ApiError – If the API returns an error response.

Return type:

ListFilesResponse

Examples

page = await pc.assistants.list_files_page(
    assistant_name="my-assistant",
)
for f in page.files:
    print(f.name)
if page.next:
    next_page = await pc.assistants.list_files_page(
        assistant_name="my-assistant",
        pagination_token=page.next,
    )
async list_page(*, page_size=None, pagination_token=None, **kwargs)[source]

List one page of assistants with explicit pagination control.

Parameters:
  • page_size (int | None) – Maximum number of assistants per page.

  • pagination_token (str | None) – Token from a previous response to fetch the next page.

  • kwargs (Any)

Returns:

ListAssistantsResponse with an assistants list and an optional next continuation token.

Raises:

ApiError – If the API returns an error response.

Return type:

ListAssistantsResponse

Examples

page = await pc.assistants.list_page(page_size=10)
for a in page.assistants:
    print(a.name)
if page.next:
    next_page = await pc.assistants.list_page(pagination_token=page.next)
async update(*, name=None, instructions=None, metadata=None, **kwargs)[source]

Update an existing Pinecone assistant.

Updates the specified assistant’s instructions and/or metadata. Metadata is fully replaced (not merged) when provided.

Parameters:
  • name (str) – The name of the assistant to update.

  • instructions (str | None) – New instructions for the assistant. Pass an empty string to clear existing instructions.

  • metadata (dict[str, Any] | None) – New metadata dictionary. Fully replaces any existing metadata rather than merging.

  • kwargs (Any)

Returns:

AssistantModel describing the updated assistant.

Raises:

ApiError – If the API returns an error response (e.g. 404 when the assistant does not exist).

Return type:

AssistantModel

Examples

# Update an assistant's instructions
assistant = await pc.assistants.update(
    name="my-assistant",
    instructions="You are a helpful research assistant.",
)
# Replace an assistant's metadata
assistant = await pc.assistants.update(
    name="my-assistant",
    metadata={"team": "ml", "version": "2"},
)
async upload_file(*, assistant_name, file_path=None, file_stream=None, file_name=None, metadata=None, multimodal=None, file_id=None, timeout=None)[source]

Upload a file to a Pinecone assistant.

Uploads a file from a local path or an in-memory byte stream, then polls until server-side processing completes.

Parameters:
  • assistant_name (str) – Name of the target assistant.

  • file_path (str | None) – Path to a local file to upload. Mutually exclusive with file_stream.

  • file_stream (IO[bytes] | None) – An open byte stream to upload. Mutually exclusive with file_path. Use file_name to set the filename.

  • file_name (str | None) – Filename to associate with file_stream. Ignored when file_path is provided.

  • metadata (dict[str, Any] | None) – Optional metadata dictionary. Sent as a JSON string.

  • multimodal (bool | None) – Whether to enable multimodal processing for PDFs.

  • file_id (str | None) – Optional caller-specified file identifier for upsert behavior.

  • timeout (float | None) – Seconds to wait for processing to complete. None (default) polls indefinitely. Use -1 to return immediately after upload with one describe call. Raises PineconeTimeoutError if processing is not done before the deadline.

Returns:

AssistantFileModel fetched fresh from the API after processing completes.

Raises:
  • ApiError – If the API returns an error response.

  • PineconeTimeoutError – If processing is not done before the deadline.

Return type:

AssistantFileModel

Examples

file = await async_pc.assistants.upload_file(
    assistant_name="research-assistant",
    file_path="/data/report.pdf",
)
print(file.status)

with open("report.pdf", "rb") as f:
    file = await async_pc.assistants.upload_file(
        assistant_name="research-assistant",
        file_stream=f,
        file_name="report.pdf",
        metadata={"source": "quarterly-review"},
    )
print(file.status)