Pinecone

Pinecone is the synchronous control-plane client — use it to manage indexes, collections, backups, and related resources. Sub-clients for each resource type are accessed as properties (e.g. pc.indexes, pc.collections) and are lazily initialized on first access.

class pinecone.Pinecone(api_key=None, *, host=None, additional_headers=None, source_tag=None, proxy_url=None, proxy_headers=None, ssl_ca_certs=None, ssl_verify=True, timeout=30.0, connection_pool_maxsize=0, retry_config=None, **kwargs)[source]

Bases: object

Synchronous Pinecone client for control-plane operations.

Parameters:
  • api_key (str | None) – Pinecone API key. Falls back to PINECONE_API_KEY env var.

  • host (str | None) – Control-plane API host. Falls back to PINECONE_CONTROLLER_HOST env var, then defaults to https://api.pinecone.io.

  • additional_headers (dict[str, str] | None) – Extra headers included in every request.

  • source_tag (str | None) – Tag appended to the User-Agent string for request attribution.

  • proxy_url (str | None) – HTTP proxy URL for outgoing requests.

  • proxy_headers (dict[str, str] | None) – Custom headers for proxy authentication.

  • ssl_ca_certs (str | None) – Path to a CA certificate bundle for SSL verification.

  • ssl_verify (bool) – Whether to verify SSL certificates. Defaults to True.

  • timeout (float) – Request timeout in seconds. Defaults to 30.0.

  • connection_pool_maxsize (int) – Maximum number of connections to keep in the pool. 0 (default) uses httpx defaults.

  • retry_config (RetryConfig | None) – Custom retry configuration. When None (default), uses built-in defaults (5 attempts, exponential backoff, retries on 500/502/503/504 for GET/HEAD).

  • pool_threads (int | None) – Opt-in for the legacy async_req=True execution model on data-plane methods. When set, indexes created via index() accept async_req=True on upsert, query, describe_index_stats, and list_paginated. For new code, prefer AsyncPinecone or concurrent.futures.ThreadPoolExecutor. This kwarg exists for backward compatibility with pre-rewrite callers.

  • kwargs (Any)

Raises:

PineconeValueError – If no API key can be resolved from arguments or environment variables.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")  # or set PINECONE_API_KEY env var

# Control plane: manage indexes
indexes = pc.indexes.list()

# Data plane: operate on vectors
index = pc.index("my-index")
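
The default retry behavior described for retry_config (5 attempts, exponential backoff, retries on 500/502/503/504 for GET/HEAD) can be pictured with the sketch below. The base delay, cap, and jitter strategy here are illustrative assumptions, not the SDK's internal values:

```python
# Illustrative sketch of retry eligibility plus exponential backoff with
# full jitter. base_delay and max_delay are assumed values; the SDK's
# actual internals may differ.
import random

RETRYABLE_STATUSES = {500, 502, 503, 504}
IDEMPOTENT_METHODS = {"GET", "HEAD"}

def should_retry(method: str, status: int, attempt: int, max_attempts: int = 5) -> bool:
    """Return True when a failed request is eligible for another attempt."""
    return (
        attempt < max_attempts
        and method in IDEMPOTENT_METHODS
        and status in RETRYABLE_STATUSES
    )

def backoff_delay(attempt: int, base_delay: float = 0.5, max_delay: float = 8.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt))."""
    return random.uniform(0, min(max_delay, base_delay * (2 ** attempt)))

# A POST is never retried; a GET returning 503 is, until attempts run out.
print(should_retry("POST", 503, attempt=1))  # False
print(should_retry("GET", 503, attempt=1))   # True
print(should_retry("GET", 503, attempt=5))   # False
```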
Index(name='', host='', **kwargs)[source]

Backwards-compatibility shim for Pinecone.index().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.index(name=..., host=...) instead of pc.Index(...). Accepts a legacy pool_threads= kwarg and forwards it to size the async_req=True thread pool; other unknown kwargs raise TypeError.

Parameters:
  • name (str)

  • host (str)

  • kwargs (Any)

Return type:

Index | GrpcIndex

IndexAsyncio(host, **kwargs)[source]

Backwards-compatibility shim that returns an AsyncIndex.

Preserved to ease migration from the legacy Pinecone Python SDK. New code should construct an AsyncPinecone and call .index(host=...) on it (or instantiate AsyncIndex directly) instead of Pinecone.IndexAsyncio(...).

Parameters:
  • host (str)

  • kwargs (Any)

Return type:

Any

__init__(api_key=None, *, host=None, additional_headers=None, source_tag=None, proxy_url=None, proxy_headers=None, ssl_ca_certs=None, ssl_verify=True, timeout=30.0, connection_pool_maxsize=0, retry_config=None, **kwargs)[source]
Parameters:
  • api_key (str | None)

  • host (str | None)

  • additional_headers (dict[str, str] | None)

  • source_tag (str | None)

  • proxy_url (str | None)

  • proxy_headers (dict[str, str] | None)

  • ssl_ca_certs (str | None)

  • ssl_verify (bool)

  • timeout (float)

  • connection_pool_maxsize (int)

  • retry_config (RetryConfig | None)

  • kwargs (Any)

Return type:

None

property assistant: _AssistantNamespaceProxy

Convenience alias for Pinecone.assistants.

Returns a proxy that supports both namespace-style access (pc.assistant.create_assistant(...)) and the convenience call form (pc.assistant("my-name") — shortcut for pc.assistants.describe(name="my-name")).

The canonical entry point is Pinecone.assistants; this alias is provided for ergonomic singular-form access and is not deprecated.

property assistants: Assistants

Access the Assistants namespace for assistant operations.

Lazily imported and instantiated on first access.

Returns:

Assistants namespace instance.

Examples

>>> names = [a.name for a in pc.assistants.list()]
property backups: Backups

Access the Backups namespace for backup operations.

Lazily imported and instantiated on first access.

Returns:

Backups namespace instance.

Examples

>>> ids = [b.backup_id for b in pc.backups.list()]
close()[source]

Close the underlying HTTP client.

Return type:

None

property collections: Collections

Access the Collections namespace for collection operations.

Lazily imported and instantiated on first access.

Returns:

Collections namespace instance.

Examples

>>> names = [col.name for col in pc.collections.list()]
property config: PineconeConfig

The resolved configuration for this client.

configure_index(name, replicas=None, pod_type=None, deletion_protection=None, tags=None, embed=None, read_capacity=None)[source]

Backwards-compatibility shim for Pinecone.indexes.configure().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.configure() instead of pc.configure_index().

Parameters:
  • name (str)

  • replicas (int | None)

  • pod_type (str | None)

  • deletion_protection (DeletionProtection | str | None)

  • tags (dict[str, str] | None)

  • embed (dict[str, Any] | None)

  • read_capacity (dict[str, Any] | None)

Return type:

None

create_backup(*, index_name, backup_name=None, description='')[source]

Backwards-compatibility shim for Pinecone.backups.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.backups.create() instead of pc.create_backup().

Parameters:
  • index_name (str)

  • backup_name (str | None)

  • description (str)

Return type:

BackupModel

create_collection(name, source)[source]

Backwards-compatibility shim for Pinecone.collections.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.collections.create() instead of pc.create_collection().

Parameters:
  • name (str)

  • source (str)

Return type:

CollectionModel

create_index(name, spec, dimension=None, metric='cosine', timeout=None, deletion_protection='disabled', vector_type='dense', tags=None)[source]

Backwards-compatibility shim for Pinecone.indexes.create().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.create() instead of pc.create_index().

Parameters:
  • name (str)

  • spec (ServerlessSpec | PodSpec | ByocSpec | IntegratedSpec | dict[str, Any])

  • dimension (int | None)

  • metric (Metric | str)

  • timeout (int | None)

  • deletion_protection (DeletionProtection | str)

  • vector_type (VectorType | str)

  • tags (dict[str, str] | None)

Return type:

IndexModel

create_index_for_model(name, cloud, region, embed, tags=None, deletion_protection='disabled', read_capacity=None, schema=None, timeout=None)[source]

Backwards-compatibility shim for creating an integrated (model-backed) index.

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.create() with an IntegratedSpec (IntegratedSpec(cloud=..., region=..., embed=EmbedConfig(...))) instead of pc.create_index_for_model().

Parameters:
Return type:

IndexModel

create_index_from_backup(*, name, backup_id, deletion_protection=None, tags=None, timeout=None)[source]

Create a new index by restoring from a backup.

Sends a POST to /backups/{backup_id}/create-index and then polls until the index is ready (unless timeout is -1).

Parameters:
  • name (str) – Name for the new index.

  • backup_id (str) – Identifier of the backup to restore from.

  • deletion_protection (DeletionProtection | str | None) – "enabled" or "disabled". Defaults to "disabled" server-side when omitted.

  • tags (dict[str, str] | None) – Optional key-value tags for the new index.

  • timeout (int | None) – Seconds to wait for readiness. None (default) blocks for up to 300 seconds. -1 returns immediately without polling.

Returns:

An IndexModel describing the restored index.

Raises:
Return type:

IndexModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> index = pc.create_index_from_backup(
...     name="product-search-restored",
...     backup_id="bk-daily-20240115",
... )
>>> index = pc.create_index_from_backup(
...     name="product-search-restored",
...     backup_id="bk-daily-20240115",
...     deletion_protection="enabled",
...     tags={"env": "production", "team": "search"},
... )
delete_backup(*, backup_id)[source]

Backwards-compatibility shim for Pinecone.backups.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.backups.delete() instead of pc.delete_backup().

Parameters:

backup_id (str)

Return type:

None

delete_collection(name)[source]

Backwards-compatibility shim for Pinecone.collections.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.collections.delete() instead of pc.delete_collection().

Parameters:

name (str)

Return type:

None

delete_index(name, timeout=None)[source]

Backwards-compatibility shim for Pinecone.indexes.delete().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.delete() instead of pc.delete_index().

Parameters:
  • name (str)

  • timeout (int | None)

Return type:

None

describe_backup(*, backup_id)[source]

Backwards-compatibility shim for Pinecone.backups.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.backups.describe() instead of pc.describe_backup().

Parameters:

backup_id (str)

Return type:

BackupModel

describe_collection(name)[source]

Backwards-compatibility shim for Pinecone.collections.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.collections.describe() instead of pc.describe_collection().

Parameters:

name (str)

Return type:

CollectionModel

describe_index(name)[source]

Backwards-compatibility shim for Pinecone.indexes.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.describe() instead of pc.describe_index().

Parameters:

name (str)

Return type:

IndexModel

describe_restore_job(*, job_id)[source]

Backwards-compatibility shim for Pinecone.restore_jobs.describe().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.restore_jobs.describe() instead of pc.describe_restore_job().

Parameters:

job_id (str)

Return type:

RestoreJobModel

has_index(name)[source]

Backwards-compatibility shim for Pinecone.indexes.exists().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.exists() instead of pc.has_index().

Parameters:

name (str)

Return type:

bool

index(name='', *, host='', grpc=False, pool_threads=None)[source]

Create a data plane client targeting a specific index.

You can target the index by host URL directly (skipping the describe call) or by name (which triggers a describe-index lookup to resolve the host).

See also

Use pc.indexes for control-plane operations (create, list, describe, delete, configure). To create an index from a backup, use Pinecone.create_index_from_backup().

Parameters:
  • name (str) – Name of the index. Triggers a describe call to resolve host.

  • host (str) – Direct host URL of the index. Skips the describe call.

  • grpc (bool) – If True, return a GrpcIndex that routes data-plane operations over gRPC instead of HTTP. Defaults to False.

  • pool_threads (int | None)

Returns:

A sync Index (HTTP) or GrpcIndex (gRPC) data plane client.

Raises:

PineconeValueError – If neither name nor host is provided.

Return type:

Index | GrpcIndex

Examples

pc = Pinecone(api_key="...")
idx = pc.index(host="my-index-abc123.svc.pinecone.io")
# or
idx = pc.index(name="my-index")
# gRPC transport
idx = pc.index(name="my-index", grpc=True)
property indexes: Indexes

Access the Indexes namespace for control-plane index operations.

Lazily imported and instantiated on first access.

Returns:

Indexes namespace instance.

Examples

>>> names = [idx.name for idx in pc.indexes.list()]
property inference: Inference

Access the Inference namespace for embed and rerank operations.

Lazily imported and instantiated on first access.

Returns:

Inference namespace instance.

Examples

>>> embeddings = pc.inference.embed(
...     model="multilingual-e5-large", inputs=["Hello, world!"]
... )
list_backups(*, index_name=None, limit=10, pagination_token=None)[source]

Backwards-compatibility shim for Pinecone.backups.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.backups.list() instead of pc.list_backups().

Parameters:
  • index_name (str | None)

  • limit (int | None)

  • pagination_token (str | None)

Return type:

BackupList

list_collections()[source]

Backwards-compatibility shim for Pinecone.collections.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.collections.list() instead of pc.list_collections().

Return type:

CollectionList

list_indexes()[source]

Backwards-compatibility shim for Pinecone.indexes.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.indexes.list() instead of pc.list_indexes().

Return type:

IndexList

list_restore_jobs(*, limit=10, pagination_token=None)[source]

Backwards-compatibility shim for Pinecone.restore_jobs.list().

Preserved to ease migration from the legacy Pinecone Python SDK. New code should use pc.restore_jobs.list() instead of pc.list_restore_jobs().

Parameters:
  • limit (int | None)

  • pagination_token (str | None)

Return type:

RestoreJobList

property preview: Preview

Access the Preview namespace for pre-release API features.

Lazily imported and instantiated on first access. The preview surface is not covered by SemVer; signatures and behavior may change in any minor SDK release.

Returns:

Preview namespace instance.

Examples

pc = Pinecone(api_key="your-api-key")
pc.preview.indexes.create(...)  # when a preview area exists
property restore_jobs: RestoreJobs

Access the RestoreJobs namespace for restore job operations.

Lazily imported and instantiated on first access.

Returns:

RestoreJobs namespace instance.

Examples

>>> ids = [job.restore_job_id for job in pc.restore_jobs.list()]

Indexes

class pinecone.client.indexes.Indexes(http, host_cache=None)[source]

Bases: object

Control-plane operations for Pinecone indexes.

Provides list, describe, exists, create, delete, and configure methods.

See also

Use Pinecone.index(name) to get a data-plane client for vector operations on a specific index.

Parameters:
  • http (HTTPClient) – HTTP client for making API requests.

  • host_cache (dict[str, str] | None)

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
names = [idx.name for idx in pc.indexes.list()]
__init__(http, host_cache=None)[source]
Parameters:
  • http (HTTPClient)

  • host_cache (dict[str, str] | None)

Return type:

None

configure(name, *, replicas=None, pod_type=None, deletion_protection=None, tags=None, embed=None, read_capacity=None)[source]

Configure an existing index.

Updates mutable properties of an index such as replicas, pod type, deletion protection, tags, and read capacity.

Parameters:
  • name (str) – The name of the index to configure.

  • replicas (int | None) – Number of replicas for pod-based indexes.

  • pod_type (str | None) – Pod type for pod-based indexes (e.g. "p1.x2").

  • deletion_protection (DeletionProtection | str | None) – "enabled" or "disabled".

  • tags (dict[str, str] | None) – Key-value tags to merge with existing tags. Set a value to "" to remove a tag.

  • read_capacity (dict[str, Any] | None) – Read capacity configuration for BYOC indexes. Pass {"mode": "OnDemand"} or {"mode": "Dedicated", "dedicated": {"node_type": "t1", "scaling": "Manual", "manual": {"replicas": 2, "shards": 1}}}.

  • embed (dict[str, Any] | None)

Raises:
Return type:

None

Examples

>>> pc.indexes.configure("my-index", replicas=4)
>>> pc.indexes.configure("my-index", tags={"env": "prod"})
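
The merge semantics documented for tags can be pictured with a small helper; this is an illustration of the behavior (merge, with "" removing a key), not code from the SDK:

```python
def merge_tags(existing: dict[str, str], update: dict[str, str]) -> dict[str, str]:
    """Merge an update into existing tags; an empty-string value removes the tag."""
    merged = dict(existing)
    for key, value in update.items():
        if value == "":
            merged.pop(key, None)  # "" deletes the tag if present
        else:
            merged[key] = value
    return merged

tags = {"env": "staging", "team": "search"}
print(merge_tags(tags, {"env": "prod", "team": ""}))  # {'env': 'prod'}
```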
create(*, name, spec, dimension=None, metric='cosine', vector_type='dense', deletion_protection='disabled', tags=None, schema=None, timeout=None)[source]

Create a new Pinecone index.

Supports serverless, pod-based, BYOC (bring your own cloud), and integrated (model-backed) index creation. Integrated indexes use Pinecone’s built-in embedding models, so dimension and metric are inferred from the model.

Parameters:
  • name (str) – Name for the new index.

  • spec (ServerlessSpec | PodSpec | ByocSpec | IntegratedSpec | dict[str, Any]) – Deployment spec — a ServerlessSpec, PodSpec, ByocSpec, IntegratedSpec, or raw dict.

  • dimension (int | None) – Vector dimension (required for dense non-integrated indexes).

  • metric (Metric | str) – Similarity metric (cosine, euclidean, dotproduct).

  • vector_type (VectorType | str) – Vector type (dense or sparse).

  • deletion_protection (DeletionProtection | str) – Whether deletion protection is enabled.

  • tags (dict[str, str] | None) – Optional key-value tags.

  • schema (dict[str, Any] | None) – Optional metadata schema defining field types for indexing. Accepts both flat format ({"field": {"type": "str"}}) and nested format ({"fields": {"field": {"type": "str"}}}).

  • timeout (int | None) – Seconds to wait for the index to become ready. Use None (default) to poll every 5 seconds indefinitely, with no upper time bound. Use a positive int to poll with a deadline. Use -1 to return immediately without polling. Raises PineconeTimeoutError if the index is not ready before the deadline, or IndexInitFailedError if initialization fails.

Returns:

IndexModel describing the created index.

Raises:
Return type:

IndexModel

Examples

>>> from pinecone import Pinecone, ServerlessSpec
>>> pc = Pinecone(api_key="your-api-key")
>>> index = pc.indexes.create(
...     name="movie-recommendations",
...     dimension=1536,
...     spec=ServerlessSpec(cloud="aws", region="us-east-1"),
... )
>>> from pinecone import Pinecone, IntegratedSpec, EmbedConfig
>>> pc = Pinecone(api_key="your-api-key")
>>> index = pc.indexes.create(
...     name="semantic-search",
...     spec=IntegratedSpec(
...         cloud="aws",
...         region="us-east-1",
...         embed=EmbedConfig(
...             model="multilingual-e5-large",
...             field_map={"text": "my_text_field"},
...         ),
...     ),
... )
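
Since schema accepts both a flat and a nested shape, a normalizer like the following (illustrative only, not part of the SDK) shows how the two forms relate; it assumes no field is itself named "fields":

```python
from typing import Any

def normalize_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Normalize a flat schema ({"field": {"type": ...}}) to the nested
    form ({"fields": {"field": {"type": ...}}}); nested input passes through."""
    if "fields" in schema:
        return schema
    return {"fields": schema}

print(normalize_schema({"genre": {"type": "str"}}))
# {'fields': {'genre': {'type': 'str'}}}
```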
delete(name, *, timeout=None)[source]

Delete an index by name.

After sending the delete request, the cached host URL for the index is removed. By default, polls every 5 seconds, with no upper time bound, until the index disappears.

Parameters:
  • name (str) – The name of the index to delete.

  • timeout (int | None) – Seconds to wait for the index to disappear. Use None (default) to poll indefinitely until the index is gone. Use a positive int to poll with a deadline. Use -1 to return immediately without polling.

Raises:
Return type:

None

Examples

pc.indexes.delete("my-index")

# Wait up to 60 seconds for deletion to complete
pc.indexes.delete("my-index", timeout=60)
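
The timeout semantics shared by create() and delete() (None polls indefinitely, a positive int sets a deadline, -1 skips polling) can be sketched with a generic poll loop. The 5-second interval is the documented one; everything else here is illustrative, and the SDK raises PineconeTimeoutError rather than the built-in TimeoutError used below:

```python
import time
from typing import Callable, Optional

def wait_until(done: Callable[[], bool], timeout: Optional[int], interval: float = 5.0) -> bool:
    """Poll `done` until it returns True.

    timeout=None polls forever, timeout=-1 returns immediately without
    polling, and a non-negative timeout raises TimeoutError at the deadline.
    """
    if timeout == -1:
        return done()
    deadline = None if timeout is None else time.monotonic() + timeout
    while not done():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("resource not ready before deadline")
        time.sleep(interval)
    return True

# A check that is already satisfied returns without sleeping.
print(wait_until(lambda: True, timeout=10, interval=0.01))  # True
```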
describe(name)[source]

Get detailed information about a named index.

After a successful call the host URL is cached internally for later data-plane client construction.

Parameters:

name (str) – The name of the index to describe.

Returns:

IndexModel with name, dimension, metric, host, spec, status, deletion_protection, vector_type, and tags.

Raises:
Return type:

IndexModel

Examples

>>> desc = pc.indexes.describe("my-index")
>>> desc.host
'https://my-index.svc.pinecone.io'
exists(name)[source]

Check whether a named index exists.

Uses describe internally; returns True on success and False when a 404 is returned.

Parameters:

name (str) – The name of the index to check.

Returns:

True if the index exists, False otherwise. Returns False immediately without a network call if name is empty or whitespace-only.

Raises:

ApiError – If the API returns an error other than 404.

Return type:

bool

Examples

>>> pc.indexes.exists("my-index")
True
list()[source]

List all indexes in the project.

Returns all indexes in a single response without filtering, sorting, or pagination.

Returns:

IndexList supporting iteration, len(), index access, and a names() convenience method.

Raises:

ApiError – If the API returns an error response (e.g. authentication failure or server error).

Return type:

IndexList

Examples

>>> indexes = pc.indexes.list()
>>> indexes.names()
['my-index']

Collections

class pinecone.client.collections.Collections(http)[source]

Bases: object

Control-plane operations for Pinecone collections.

Provides methods to create, list, describe, and delete collections.

Parameters:

http (HTTPClient) – HTTP client for making API requests.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
names = [col.name for col in pc.collections.list()]
__init__(http)[source]
Parameters:

http (HTTPClient)

Return type:

None

create(*, name, source)[source]

Create a collection from an existing index.

Returns immediately after the API call without polling for readiness.

Parameters:
  • name (str) – Name for the new collection.

  • source (str) – Name of the source index.

Returns:

A CollectionModel describing the created collection.

Raises:
  • ValidationError – If name or source is empty.

  • ApiError – If the API returns an error response (e.g. authentication failure or server error).

Return type:

CollectionModel

Examples

>>> col = pc.collections.create(name="my-collection", source="my-index")
>>> col.status
'Initializing'
delete(name)[source]

Delete a collection by name.

Parameters:

name (str) – The name of the collection to delete.

Raises:
  • ValidationError – If name is empty.

  • NotFoundError – If the collection does not exist.

  • ApiError – If the API returns another error response.

Return type:

None

Examples

>>> pc.collections.delete("my-collection")
describe(name)[source]

Get detailed information about a named collection.

Parameters:

name (str) – The name of the collection to describe.

Returns:

A CollectionModel with name, status, size, dimension, vector_count, and environment.

Raises:
  • ValidationError – If name is empty.

  • NotFoundError – If the collection does not exist.

  • ApiError – If the API returns another error response.

Return type:

CollectionModel

Examples

>>> desc = pc.collections.describe("my-collection")
>>> desc.size
1024
list()[source]

List all collections in the project.

Returns all collections in a single response without filtering, sorting, or pagination.

Returns:

A CollectionList supporting iteration, len(), index access, and a names() convenience method.

Raises:

ApiError – If the API returns an error response (e.g. authentication failure or server error).

Return type:

CollectionList

Examples

>>> collections = pc.collections.list()
>>> collections.names()
['my-collection']

Backups

class pinecone.client.backups.Backups(http)[source]

Bases: object

Control-plane operations for Pinecone backups.

Provides methods to create, list, describe, and delete backups.

Parameters:

http (HTTPClient) – HTTP client for making API requests.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
ids = [b.backup_id for b in pc.backups.list()]
__init__(http)[source]
Parameters:

http (HTTPClient)

Return type:

None

create(*, index_name, name=None, description='')[source]

Create a backup of an existing index.

Parameters:
  • index_name (str) – Name of the index to back up.

  • name (str | None) – Optional name for the backup.

  • description (str) – Description for the backup (defaults to empty string).

Returns:

A BackupModel describing the created backup.

Raises:
  • ValidationError – If index_name is empty.

  • ApiError – If the API returns an error response.

Return type:

BackupModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> backup = pc.backups.create(index_name="product-search")
>>> backup.backup_id
'bk-abc123'
>>> backup = pc.backups.create(
...     index_name="product-search",
...     name="daily-20240115",
...     description="Scheduled daily backup before reindexing",
... )
delete(*, backup_id)[source]

Delete a backup.

Parameters:

backup_id (str) – The identifier of the backup to delete.

Raises:
  • ValidationError – If backup_id is empty.

  • NotFoundError – If the backup does not exist.

  • ApiError – If the API returns another error response.

Return type:

None

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> pc.backups.delete(backup_id="bk-daily-20240115")
describe(*, backup_id)[source]

Get detailed information about a backup.

Parameters:

backup_id (str) – The identifier of the backup to describe.

Returns:

A BackupModel with full backup details.

Raises:
  • ValidationError – If backup_id is empty.

  • NotFoundError – If the backup does not exist.

  • ApiError – If the API returns another error response.

Return type:

BackupModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> backup = pc.backups.describe(backup_id="bk-daily-20240115")
>>> backup.status
'Ready'
get(*, backup_id)[source]

Get detailed information about a backup (alias for describe()).

Parameters:

backup_id (str) – The identifier of the backup.

Returns:

A BackupModel with full backup details.

Raises:
  • ValidationError – If backup_id is empty.

  • NotFoundError – If the backup does not exist.

  • ApiError – If the API returns another error response.

Return type:

BackupModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> backup = pc.backups.get(backup_id="bk-daily-20240115")
>>> backup.status
'Ready'
list(*, index_name=None, limit=10, pagination_token=None)[source]

List backups.

When index_name is provided, lists backups for that index only. Otherwise lists all backups in the project.

Parameters:
  • index_name (str | None) – Index name to filter by, or None for all.

  • limit (int) – Maximum number of results per page. Defaults to 10.

  • pagination_token (str | None) – Token for cursor-based pagination.

Returns:

A BackupList supporting iteration, len(), and index access.

Raises:

ApiError – If the API returns an error response.

Return type:

BackupList

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> for backup in pc.backups.list():
...     print(backup.backup_id, backup.name)
>>> for backup in pc.backups.list(index_name="product-search"):
...     print(backup.name)
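
The cursor-based pagination that list() exposes follows the usual token loop. The stub below stands in for the API (fetch_page and the canned PAGES data are hypothetical, not part of the SDK) purely to show the pattern of passing pagination_token until none is returned:

```python
from typing import Optional

# Canned three-page response standing in for the backups API.
PAGES = {
    None: (["bk-1", "bk-2"], "tok-a"),
    "tok-a": (["bk-3", "bk-4"], "tok-b"),
    "tok-b": (["bk-5"], None),
}

def fetch_page(pagination_token: Optional[str]):
    """Hypothetical stand-in for pc.backups.list(limit=..., pagination_token=...)."""
    return PAGES[pagination_token]

def all_backup_ids() -> list[str]:
    """Follow pagination tokens until the server stops returning one."""
    ids, token = [], None
    while True:
        page, token = fetch_page(token)
        ids.extend(page)
        if token is None:
            return ids

print(all_backup_ids())  # ['bk-1', 'bk-2', 'bk-3', 'bk-4', 'bk-5']
```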

RestoreJobs

class pinecone.client.restore_jobs.RestoreJobs(http)[source]

Bases: object

Control-plane operations for Pinecone restore jobs.

Provides methods to list and describe restore jobs.

Parameters:

http (HTTPClient) – HTTP client for making API requests.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
ids = [job.restore_job_id for job in pc.restore_jobs.list()]
__init__(http)[source]
Parameters:

http (HTTPClient)

Return type:

None

describe(*, job_id)[source]

Get detailed information about a restore job.

Parameters:

job_id (str) – The identifier of the restore job to describe.

Returns:

A RestoreJobModel with full restore job details.

Raises:
  • ValidationError – If job_id is empty.

  • NotFoundError – If the restore job does not exist.

  • ApiError – If the API returns another error response.

Return type:

RestoreJobModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> job = pc.restore_jobs.describe(job_id="rj-restore-20240115")
>>> job.status
'Completed'
list(*, limit=10, pagination_token=None)[source]

List all restore jobs in the project.

Supports cursor-based pagination. Defaults to 10 results per page.

Parameters:
  • limit (int) – Maximum number of results per page. Defaults to 10.

  • pagination_token (str | None) – Token for cursor-based pagination.

Returns:

A RestoreJobList supporting iteration, len(), and index access.

Raises:

ApiError – If the API returns an error response.

Return type:

RestoreJobList

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> for job in pc.restore_jobs.list():
...     print(job.restore_job_id, job.status)
>>> jobs = pc.restore_jobs.list(limit=5)
>>> len(jobs)
5

Inference

class pinecone.client.inference.Inference(config)[source]

Bases: object

Control-plane operations for Pinecone inference (embed & rerank).

Provides methods to generate embeddings and rerank documents using Pinecone’s hosted models.

Parameters:

config (PineconeConfig) – SDK configuration used to construct an HTTP client targeting the inference API version.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
embeddings = pc.inference.embed(
    model="multilingual-e5-large", inputs=["Hello, world!"]
)
class EmbedModel(value)

Bases: str, Enum

Known embedding models for integrated indexes.

Multilingual_E5_Large = 'multilingual-e5-large'
Pinecone_Sparse_English_V0 = 'pinecone-sparse-english-v0'
class RerankModel(value)

Bases: str, Enum

Known reranking models.

Bge_Reranker_V2_M3 = 'bge-reranker-v2-m3'
Cohere_Rerank_3_5 = 'cohere-rerank-3.5'
Pinecone_Rerank_V0 = 'pinecone-rerank-v0'
__init__(config)[source]
Parameters:

config (PineconeConfig)

Return type:

None

close()[source]

Close the underlying HTTP client.

Return type:

None

embed(model, inputs, parameters=None)[source]

Generate embeddings for the provided inputs.

Parameters:
  • model (EmbedModel | str) – Embedding model name.

  • inputs (str | list[str] | list[dict[str, Any]]) – Text inputs. A single string is automatically wrapped.

  • parameters (dict[str, Any] | None) –

    Model-specific parameters (e.g., {"input_type": "passage", "truncate": "END"}). To discover valid parameters for a model, call get_model():

    pc.inference.get_model(model="multilingual-e5-large").supported_parameters
    

Returns:

An EmbeddingsList with .data, .model, and .usage.

Raises:
Return type:

EmbeddingsList

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> embeddings = pc.inference.embed(
...     model="multilingual-e5-large",
...     inputs=["Hello, world!"],
...     parameters={"input_type": "passage"},
... )
>>> len(embeddings.data)
1

Note

To store embeddings in a Pinecone index, extract the raw vector values and pass them to upsert():

values = embeddings.data[0].values
index.upsert(vectors=[("doc-1", values)])

Alternatively, use an index with integrated inference (IntegratedSpec) and call upsert_records() to let Pinecone handle embedding server-side — no manual embed step required.

get_model(*, model=None, **kwargs)[source]

Get detailed information about a specific model.

Parameters:
  • model (str) – The model identifier to look up.

  • kwargs (str)

Returns:

A ModelInfo with full model details.

Raises:
Return type:

ModelInfo

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> model_info = pc.inference.get_model(model="multilingual-e5-large")
>>> model_info.type
'embed'
list_models(*, type=None, vector_type=None)[source]

List available inference models.

Parameters:
  • type (str | None) – Filter by model type ("embed" or "rerank").

  • vector_type (str | None) – Filter by vector type ("dense" or "sparse"). Only relevant when type="embed".

Returns:

A ModelInfoList supporting iteration, len(), and .names().

Raises:
Return type:

ModelInfoList

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> models = pc.inference.list_models()
>>> models.names()
['multilingual-e5-large', 'pinecone-sparse-english-v0']
>>> embed_models = pc.inference.list_models(type="embed")
property model: ModelResource

Lazily-initialized resource for listing and getting model info.

Returns:

A ModelResource that exposes .list() and .get() methods.

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> models = pc.inference.model.list()
>>> info = pc.inference.model.get("multilingual-e5-large")
rerank(model, query, documents, rank_fields=['text'], return_documents=True, top_n=None, parameters=None)[source]

Rerank documents by relevance to a query.

Parameters:
  • model (RerankModel | str) – Reranking model name.

  • query (str) – Query text to rank against.

  • documents (list[str] | list[dict[str, Any]]) – Documents to rank. Strings are auto-wrapped as {"text": ...}.

  • rank_fields (list[str]) – Document fields to rank on. Defaults to ["text"].

  • return_documents (bool) – Include document text in response. Defaults to True.

  • top_n (int | None) – Number of top documents to return. None returns all.

  • parameters (dict[str, Any] | None) –

    Model-specific parameters. To discover valid parameters for a model, call get_model():

    pc.inference.get_model(model="bge-reranker-v2-m3").supported_parameters
    

Returns:

A RerankResult with .data and .usage.

Return type:

RerankResult

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> result = pc.inference.rerank(
...     model="bge-reranker-v2-m3",
...     query="Tell me about tech companies",
...     documents=["Apple is a fruit.", "Acme Inc. revolutionized tech."],
...     top_n=1,
... )
>>> result.data[0].score
0.95
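
The auto-wrapping described for documents (bare strings become {"text": ...}) can be sketched like this; normalize_documents is an illustrative helper, not part of the SDK:

```python
def normalize_documents(documents):
    """Wrap bare strings as {"text": ...}; dicts pass through unchanged."""
    return [{"text": d} if isinstance(d, str) else d for d in documents]

docs = normalize_documents(
    ["Apple is a fruit.", {"text": "Acme Inc. revolutionized tech.", "id": "d2"}]
)
```

Mixing both forms is fine: string documents are ranked on their wrapped "text" field, while dict documents can carry extra fields and use rank_fields to pick which ones are ranked.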

Assistants

class pinecone.client.assistants.Assistants(config)[source]

Bases: AssistantsLegacyNamespaceMixin

Control-plane operations for Pinecone assistants.

Parameters:

config (PineconeConfig) – SDK configuration used to construct an HTTP client targeting the assistant API version.

Examples

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
assistants = pc.assistants
__init__(config)[source]
Parameters:

config (PineconeConfig)

Return type:

None

chat(*, assistant_name, messages, model='gpt-4o', stream=False, temperature=None, filter=None, json_response=False, include_highlights=False, context_options=None)[source]

Chat with an assistant and receive citations in Pinecone-native format.

Parameters:
  • assistant_name (str) – Name of the assistant to chat with.

  • messages (list[Message | dict[str, str]]) – Conversation messages. Dicts are converted to Message objects; role defaults to "user" when not present.

  • model (str) – Large language model to use. Defaults to "gpt-4o".

  • stream (bool) – If True, return a ChatStream. Defaults to False.

  • temperature (float | None) – Controls randomness. Lower values produce more deterministic responses. Omitted from request when None.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents are used as context. Omitted from request when None.

  • json_response (bool) – If True, instruct the assistant to return a JSON response. Cannot be used with streaming.

  • include_highlights (bool) – If True, include highlight snippets from referenced documents in citations.

  • context_options (ContextOptions | dict[str, Any] | None) – Options controlling context retrieval. Omitted from request when None.

Returns:

ChatResponse for non-streaming requests, or a ChatStream for streaming requests.

Raises:
  • PineconeValueError – If both stream=True and json_response=True are specified.

  • ApiError – If the API returns an error response.

Return type:

ChatResponse | ChatStream

Examples

from pinecone import Pinecone
pc = Pinecone(api_key="your-api-key")
response = pc.assistants.chat(
    assistant_name="my-assistant",
    messages=[{"content": "What is Pinecone?"}],
)
stream = pc.assistants.chat(
    assistant_name="my-assistant",
    messages=[{"content": "What is Pinecone?"}],
    stream=True,
)
for text in stream.text():
    print(text, end="", flush=True)
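
Multi-turn conversations require resending the accumulated history on each call. A sketch of that bookkeeping, with the actual pc.assistants.chat call replaced by a stub so the flow is visible in isolation:

```python
def run_turn(history, user_text, chat_fn):
    """Append the user message, get a reply, and record it in the history."""
    history.append({"role": "user", "content": user_text})
    reply = chat_fn(messages=history)
    history.append({"role": "assistant", "content": reply})
    return reply

# Stub standing in for the chat call's reply text.
def fake_chat(messages):
    return "echo: " + messages[-1]["content"]

history = []
run_turn(history, "What is Pinecone?", fake_chat)
run_turn(history, "How do indexes work?", fake_chat)
```

With the real client, chat_fn would wrap pc.assistants.chat(assistant_name=..., messages=history) and extract the reply text from the ChatResponse.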
chat_completions(*, assistant_name, messages, model='gpt-4o', stream=False, temperature=None, filter=None)[source]

Chat with an assistant using an OpenAI-compatible interface.

Returns responses in OpenAI chat completion format. Useful when you need inline citations or OpenAI-compatible responses. Has limited functionality compared to the standard chat() interface — does not support include_highlights, context_options, or json_response parameters.

The model parameter accepts any string value and is not validated client-side. Known models include "gpt-4o", "gpt-4.1", "o4-mini", "claude-3-5-sonnet", "claude-3-7-sonnet", and "gemini-2.5-pro".

Parameters:
  • assistant_name (str) – Name of the assistant to chat with.

  • messages (list[Message | dict[str, str]]) – Conversation messages. Dicts are converted to Message objects; role defaults to "user" when not present.

  • model (str) – Large language model to use. Defaults to "gpt-4o". Not validated client-side — any string is accepted.

  • stream (bool) – If True, return a ChatCompletionStream. Defaults to False.

  • temperature (float | None) – Controls randomness. Lower values produce more deterministic responses. Omitted from request when None.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents are used as context. Omitted from request when None.

Returns:

ChatCompletionResponse for non-streaming requests, or a ChatCompletionStream for streaming requests.

Raises:

ApiError – If the API returns an error response.

Return type:

ChatCompletionResponse | ChatCompletionStream

Examples

from pinecone import Pinecone
pc = Pinecone(api_key="your-api-key")
response = pc.assistants.chat_completions(
    assistant_name="research-assistant",
    messages=[{"content": "Explain quantum entanglement briefly."}],
)
response.choices[0].message.content
stream = pc.assistants.chat_completions(
    assistant_name="research-assistant",
    messages=[{"content": "Explain quantum entanglement briefly."}],
    stream=True,
)
for chunk in stream:
    print(chunk)
close()[source]

Close the underlying HTTP client and any cached data-plane clients.

Return type:

None

context(*, assistant_name, query=None, messages=None, filter=None, top_k=None, snippet_size=None, multimodal=None, include_binary_content=None)[source]

Retrieve relevant context snippets from a Pinecone assistant.

Retrieves context snippets matching a text query or conversation history. Exactly one of query or messages must be provided and non-empty.

Parameters:
  • assistant_name (str) – Name of the assistant to retrieve context from.

  • query (str | None) – Text query to use for context retrieval. Mutually exclusive with messages. Empty string is treated as not provided.

  • messages (list[Message | dict[str, str]] | None) – Conversation messages to use for context retrieval. Mutually exclusive with query. Empty list is treated as not provided. Dicts are converted to Message objects.

  • filter (dict[str, Any] | None) – Metadata filter restricting which documents contribute context. Omitted from request when None.

  • top_k (int | None) – Maximum number of context snippets to return. Omitted from request when None.

  • snippet_size (int | None) – Maximum snippet size in tokens. Omitted from request when None.

  • multimodal (bool | None) – Whether to include image-related context snippets. Omitted from request when None.

  • include_binary_content (bool | None) – Whether image snippets include base64 image data. Only meaningful when multimodal is True. Omitted from request when None.

Returns:

ContextResponse containing the matching context snippets.

Raises:
  • PineconeValueError – If both or neither of query and messages are provided (or if they are empty).

  • ApiError – If the API returns an error response.

Return type:

ContextResponse

Examples

response = pc.assistants.context(
    assistant_name="my-assistant",
    query="What is Pinecone?",
)
for snippet in response.snippets:
    print(snippet.content)
create(*, name=None, instructions=None, metadata=None, region='us', timeout=None, **kwargs)[source]

Create a new Pinecone assistant.

Creates an assistant and optionally polls until it reaches "Ready" status. The assistant starts in "Initializing" status.

Parameters:
  • name (str) – Name for the new assistant. Must be 1-63 characters, start and end with an alphanumeric character, and consist only of lowercase alphanumeric characters or hyphens.

  • instructions (str | None) – Optional directive for the assistant to apply to all responses. Maximum 16 KB.

  • metadata (dict[str, Any] | None) – Optional metadata dictionary. Defaults to an empty dict if not provided.

  • region (str) – Region to deploy the assistant in. Must be "us" or "eu" (case-sensitive). Defaults to "us".

  • timeout (float | None) – Seconds to wait for the assistant to become ready. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use 0 or a positive value to poll with a deadline. Raises PineconeTimeoutError if the assistant is not ready before the deadline.

  • kwargs (Any)

Returns:

AssistantModel describing the created assistant.

Return type:

AssistantModel

Examples

>>> from pinecone import Pinecone
>>> pc = Pinecone(api_key="your-api-key")
>>> assistant = pc.assistants.create(name="my-assistant")
>>> assistant = pc.assistants.create(
...     name="research-assistant",
...     instructions="You are a helpful research assistant.",
...     metadata={"team": "engineering", "version": "1"},
...     region="eu",
... )
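
The timeout contract above (None polls forever, -1 skips polling, a number sets a deadline) is shared by create(), delete(), delete_file(), and upload_file(). A generic sketch of that contract; this illustrates the documented behavior, not the SDK's internal implementation:

```python
import time

def wait_until(check, timeout):
    """Poll `check` under the timeout contract: None polls
    indefinitely, -1 skips polling entirely, a number is a deadline."""
    if timeout == -1:
        return None  # caller opted out of polling
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if check():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("resource not ready before deadline")
        time.sleep(0.01)

# A condition that becomes true on the third poll.
calls = {"n": 0}
def ready_on_third_poll():
    calls["n"] += 1
    return calls["n"] >= 3

result = wait_until(ready_on_third_poll, timeout=5)
```

In the SDK, check corresponds to a describe() call testing for "Ready" status (or a 404, for deletes), and the TimeoutError corresponds to PineconeTimeoutError.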
delete(*, name=None, timeout=None, **kwargs)[source]

Delete a Pinecone assistant by name.

Sends a DELETE request, then polls every 5 seconds until the assistant is confirmed gone (404 from describe). Other errors during polling propagate immediately.

Parameters:
  • name (str) – The name of the assistant to delete.

  • timeout (float | None) – Seconds to wait for the assistant to disappear. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use a positive value to poll with a deadline. Raises PineconeTimeoutError if the assistant is not gone before the deadline.

  • kwargs (Any)

Returns:

None

Return type:

None

Examples

pc.assistants.delete(name="my-assistant")

# Return immediately without waiting for deletion
pc.assistants.delete(name="my-assistant", timeout=-1)
delete_file(*, assistant_name, file_id, timeout=None)[source]

Delete a file from a Pinecone assistant.

Sends a DELETE request, then polls every 5 seconds until the file is confirmed gone (404 from describe_file). Other errors during polling propagate immediately.

Parameters:
  • assistant_name (str) – Name of the assistant that owns the file.

  • file_id (str) – Unique identifier of the file to delete.

  • timeout (float | None) – Seconds to wait for the file to be deleted. Use None (default) to poll indefinitely. Use -1 to return immediately without polling. Use a positive value to poll with a deadline. Raises PineconeTimeoutError if the file is not gone before the deadline.

Returns:

None

Return type:

None

Examples

>>> pc.assistants.delete_file(
...     assistant_name="my-assistant",
...     file_id="file-abc123",
... )
describe(*, name=None, **kwargs)[source]

Get detailed information about a named assistant.

Parameters:
  • name (str) – The name of the assistant to describe.

  • kwargs (Any)

Returns:

AssistantModel with name, status, created_at, updated_at, metadata, instructions, and host.

Raises:

ApiError – If the API returns an error response (e.g. 404 when the assistant does not exist).

Return type:

AssistantModel

Examples

>>> assistant = pc.assistants.describe(name="my-assistant")
>>> assistant.status
'Ready'
describe_file(*, assistant_name, file_id, include_url=False)[source]

Get the status and metadata of a file uploaded to an assistant.

Parameters:
  • assistant_name (str) – Name of the assistant that owns the file.

  • file_id (str) – Unique identifier of the file to retrieve.

  • include_url (bool) – If True, include a signed download URL in the response. Defaults to False.

Returns:

AssistantFileModel with file metadata and status.

Return type:

AssistantFileModel

Examples

>>> file = pc.assistants.describe_file(
...     assistant_name="my-assistant",
...     file_id="file-abc123",
... )
>>> file.status
'Available'
evaluate_alignment(*, question, answer, ground_truth_answer)[source]

Evaluate answer alignment against a ground truth answer.

Measures the correctness and completeness of a generated answer with respect to a ground truth answer. Alignment is the harmonic mean of correctness (precision) and completeness (recall).

Parameters:
  • question (str) – The question for which the answer was generated.

  • answer (str) – The generated answer to evaluate.

  • ground_truth_answer (str) – The ground truth answer to compare against.

Returns:

AlignmentResult with aggregate scores, per-fact entailment results, and token usage statistics.

Raises:

ApiError – If the API returns an error response.

Return type:

AlignmentResult

Examples

>>> result = pc.assistants.evaluate_alignment(
...     question="What is the capital of Spain?",
...     answer="Barcelona.",
...     ground_truth_answer="Madrid.",
... )
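
The alignment score described above is the harmonic mean of correctness (precision) and completeness (recall); a quick illustration of that formula with made-up scores:

```python
def harmonic_mean(precision, recall):
    """Harmonic mean, the combination rule stated in the docstring above."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical case: a fully correct but only half-complete answer.
alignment = harmonic_mean(1.0, 0.5)
```

Like an F1 score, the harmonic mean punishes imbalance: an answer that is perfectly correct but badly incomplete still scores low.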
list(*, limit=None, pagination_token=None)[source]

List assistants in the project with transparent lazy pagination.

Parameters:
  • limit (int | None) – Maximum number of assistants to yield across all pages. None (default) yields all assistants.

  • pagination_token (str | None) – Token to resume pagination from a previous call.

Returns:

Paginator over AssistantModel objects. Supports for loops, .to_list(), .pages(), and limit.

Raises:

ApiError – If the API returns an error response.

Return type:

Paginator[AssistantModel]

Examples

for a in pc.assistants.list():
    print(a.name, a.status)

all_assistants = pc.assistants.list().to_list()
list_files(*, assistant_name, filter=None, limit=None, pagination_token=None)[source]

List files for an assistant with lazy pagination.

Parameters:
  • assistant_name (str) – Name of the assistant whose files to list.

  • filter (dict[str, Any] | None) – Optional metadata filter expression. Serialized to a JSON string before being sent to the API.

  • limit (int | None) – Maximum number of files to yield across all pages. None (default) yields all files.

  • pagination_token (str | None) – Token to resume pagination from a previous call.

Returns:

Paginator over AssistantFileModel objects. Supports for loops, .to_list(), .pages(), and limit.

Raises:

ApiError – If the API returns an error response.

Return type:

Paginator[AssistantFileModel]

Examples

for f in pc.assistants.list_files(assistant_name="my-assistant"):
    print(f.name, f.status)

files = pc.assistants.list_files(assistant_name="my-assistant").to_list()
list_files_page(*, assistant_name, pagination_token=None, filter=None)[source]

List one page of files for an assistant with explicit pagination control.

Only the parameters that are explicitly provided are sent in the request. Omitted parameters are not included as query params.

Parameters:
  • assistant_name (str) – Name of the assistant whose files to list.

  • pagination_token (str | None) – Token from a previous response to fetch the next page.

  • filter (dict[str, Any] | None) – Optional metadata filter expression. Serialized to a JSON string before being sent to the API.

Returns:

ListFilesResponse with a files list and an optional next continuation token.

Raises:

ApiError – If the API returns an error response.

Return type:

ListFilesResponse

Examples

page = pc.assistants.list_files_page(assistant_name="my-assistant")
names = [f.name for f in page.files]
token = page.next  # use as pagination_token for the next call
list_page(*, page_size=None, pagination_token=None, **kwargs)[source]

List one page of assistants with explicit pagination control.

Only the parameters that are explicitly provided are sent in the request. Omitted parameters are not included as query params.

Parameters:
  • page_size (int | None) – Maximum number of assistants per page. Only sent when explicitly provided.

  • pagination_token (str | None) – Token from a previous response to fetch the next page.

  • kwargs (Any)

Returns:

ListAssistantsResponse with an assistants list and an optional next continuation token.

Raises:

ApiError – If the API returns an error response.

Return type:

ListAssistantsResponse

Examples

page = pc.assistants.list_page(page_size=10)
names = [a.name for a in page.assistants]
token = page.next  # use as pagination_token for the next call
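
The page/token pattern above generalizes to a simple loop. A sketch with pc.assistants.list_page replaced by a stand-in fetch function; the real response exposes .assistants and .next as attributes, modeled here as dict keys:

```python
def iter_all(fetch_page):
    """Follow `next` continuation tokens until the API stops returning one."""
    token = None
    while True:
        page = fetch_page(pagination_token=token)
        yield from page["assistants"]
        token = page.get("next")
        if token is None:
            break

# Stand-in for pc.assistants.list_page: two pages joined by one token.
_responses = {
    None: {"assistants": ["a1", "a2"], "next": "tok-1"},
    "tok-1": {"assistants": ["a3"]},
}

def fake_list_page(pagination_token=None):
    return _responses[pagination_token]

names = list(iter_all(fake_list_page))
```

This manual loop is what list() does for you; reach for list_page() only when you need to checkpoint or hand off the continuation token.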
update(*, name=None, instructions=None, metadata=None, **kwargs)[source]

Update an existing Pinecone assistant.

Updates the specified assistant’s instructions and/or metadata. Metadata is fully replaced (not merged) when provided.

Parameters:
  • name (str) – The name of the assistant to update.

  • instructions (str | None) – New instructions for the assistant. Pass an empty string to clear existing instructions.

  • metadata (dict[str, Any] | None) – New metadata dictionary. Fully replaces any existing metadata rather than merging.

  • kwargs (Any)

Returns:

AssistantModel describing the updated assistant.

Raises:

ApiError – If the API returns an error response (e.g. 404 when the assistant does not exist).

Return type:

AssistantModel

Examples

>>> assistant = pc.assistants.update(
...     name="my-assistant",
...     instructions="You are a helpful research assistant.",
... )
>>> assistant = pc.assistants.update(
...     name="my-assistant",
...     metadata={"team": "ml", "version": "2"},
... )
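
Because metadata is fully replaced, keeping existing keys requires a read-modify-write: describe first, merge locally, then update. The merge step in isolation, with the existing dict standing in for pc.assistants.describe(...).metadata:

```python
# Stand-in for the metadata returned by pc.assistants.describe(...).
existing = {"team": "ml", "version": "1"}

# Merge locally before calling update(), since update() replaces the whole dict.
merged = {**existing, "version": "2"}
```

Passing merged as the metadata argument to update() then bumps "version" while preserving "team"; passing only {"version": "2"} would silently drop it.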
upload_file(*, assistant_name, file_path=None, file_stream=None, file_name=None, metadata=None, multimodal=None, file_id=None, timeout=None)[source]

Upload a file to a Pinecone assistant.

Uploads a file from a local path or an in-memory byte stream, then polls until server-side processing completes.

Parameters:
  • assistant_name (str) – Name of the target assistant.

  • file_path (str | None) – Path to a local file to upload. Mutually exclusive with file_stream.

  • file_stream (IO[bytes] | None) – An open byte stream to upload. Mutually exclusive with file_path. Use file_name to set the filename.

  • file_name (str | None) – Filename to associate with file_stream. Ignored when file_path is provided.

  • metadata (dict[str, Any] | None) – Optional metadata dictionary. Sent as a JSON string.

  • multimodal (bool | None) – Whether to enable multimodal processing for PDFs.

  • file_id (str | None) – Optional caller-specified file identifier for upsert behavior.

  • timeout (float | None) – Seconds to wait for processing to complete. None (default) polls indefinitely. Use -1 to return immediately after upload with one describe call. Raises PineconeTimeoutError if processing is not done before the deadline.

Returns:

AssistantFileModel fetched fresh from the API after processing completes.

Return type:

AssistantFileModel

Examples

>>> file = pc.assistants.upload_file(
...     assistant_name="research-assistant",
...     file_path="/data/report.pdf",
... )
>>> file.status
'Available'
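
file_path and file_stream are mutually exclusive, and a stream needs file_name to carry a filename. A sketch of what that contract implies for callers; resolve_upload_source is a hypothetical helper (including its assumption that file_name is mandatory with a stream), not the SDK's code:

```python
import io

def resolve_upload_source(file_path=None, file_stream=None, file_name=None):
    """Return (name, stream), enforcing the documented exclusivity rules."""
    if (file_path is None) == (file_stream is None):
        raise ValueError("provide exactly one of file_path or file_stream")
    if file_path is not None:
        # file_name is ignored when file_path is provided.
        return file_path.rsplit("/", 1)[-1], open(file_path, "rb")
    if file_name is None:
        raise ValueError("file_name is required with file_stream")
    return file_name, file_stream

name, stream = resolve_upload_source(
    file_stream=io.BytesIO(b"report body"), file_name="report.pdf"
)
```

The stream form is useful for uploading content generated in memory, e.g. pc.assistants.upload_file(assistant_name=..., file_stream=io.BytesIO(data), file_name="report.pdf").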