GrpcIndex¶
Obtain a GrpcIndex instance via pinecone.Pinecone.index() with
grpc=True, or construct one directly.
from pinecone import Pinecone
pc = Pinecone(api_key="your-api-key")
# Resolve host automatically by index name
idx = pc.index("my-index", grpc=True)
# — or — construct directly with a host URL
from pinecone.grpc import GrpcIndex
idx = GrpcIndex(host="my-index-abc123.svc.pinecone.io", api_key="your-api-key")
GrpcIndex exposes the same data-plane operations as
Index but uses gRPC transport (backed by a Rust
extension) and returns PineconeFuture objects
from the *_async() methods.
Method groups:
Vectors — upsert(), upsert_from_dataframe(), upsert_records(), query(), fetch(), update(), delete(), list(), list_paginated()
Stats — describe_index_stats()
Integrated Inference — search(), search_records()
Async variants — upsert_async(), query_async(), fetch_async(), delete_async(), update_async()
Lifecycle — close()
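Because the futures returned by the *_async() methods are interchangeable with concurrent.futures.Future, the usual fan-out/collect pattern applies. A minimal sketch using only the standard library (fake_upsert and the ThreadPoolExecutor are illustrative stand-ins for the gRPC client, not part of the SDK):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Stand-in for idx.upsert_async(): any callable producing a Future works,
# since PineconeFuture is compatible with concurrent.futures utilities.
def fake_upsert(batch):
    return len(batch)  # pretend this is the response's upserted_count

batches = [["a", "b"], ["c"], ["d", "e", "f"]]
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fake_upsert, b) for b in batches]
    counts = [f.result() for f in as_completed(futures)]

print(sum(counts))  # 6
```

With real PineconeFuture objects, the as_completed() loop is identical; only the submit step changes to idx.upsert_async(...).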
- class pinecone.grpc.GrpcIndex(*, host, api_key=None, api_version='2025-10', source_tag=None, secure=True, timeout=20.0, connect_timeout=1.0)[source]¶
Bases: object
Synchronous gRPC data plane client targeting a specific Pinecone index.
Provides the same interface as Index but routes data-plane operations through a gRPC transport (via the Rust-backed GrpcChannel) instead of HTTP/REST.
- Parameters:
host (str) – The index-specific data plane host URL.
api_key (str | None) – Pinecone API key. Falls back to the PINECONE_API_KEY env var.
api_version (str) – API version string. Defaults to the current data plane version.
source_tag (str | None) – Tag appended to the User-Agent string for request attribution.
secure (bool) – Whether to use TLS encryption. Defaults to True.
timeout (float) – Request timeout in seconds. Defaults to 20.0.
connect_timeout (float) – Connection timeout in seconds. Defaults to 1.0.
- Raises:
ValidationError – If no API key can be resolved or the host is invalid.
Examples
from pinecone.grpc import GrpcIndex

idx = GrpcIndex(host="movie-recs-abc123.svc.pinecone.io", api_key="...")
- __init__(*, host, api_key=None, api_version='2025-10', source_tag=None, secure=True, timeout=20.0, connect_timeout=1.0)[source]¶
- upsert(*, vectors, namespace='', batch_size=None, max_concurrency=4, show_progress=True, timeout=None)[source]¶
Upsert a batch of vectors into a namespace.
If a vector with the same ID already exists in the namespace, it is overwritten.
- Parameters:
vectors (Sequence[Vector | tuple[str, list[float]] | tuple[str, list[float], dict[str, Any]] | dict[str, Any]]) – Sequence of vectors to upsert. Each element can be a Vector instance, a tuple of (id, values) or (id, values, metadata), or a dict with id, values, and optional sparse_values/metadata keys.
namespace (str) – Target namespace. Defaults to the default (empty-string) namespace.
batch_size (int | None) – If set, splits vectors into batches of this size and submits them in parallel via a ThreadPoolExecutor. None (default) sends all vectors in a single channel call. Must be a positive integer when set.
max_concurrency (int) – Number of parallel threads used when batch_size is set. Default 4, range [1, 64]. Ignored when batch_size is None.
show_progress (bool) – If True and tqdm is installed, display a progress bar while submitting batches. Ignored when batch_size is None. Defaults to True.
timeout (float | None) – Per-call timeout in seconds. Applied per batch when batching. None uses the client-level default.
- Returns:
UpsertResponse with the count of vectors upserted.
- Raises:
TypeError – If a vector element is not a recognized format.
ValueError – If a vector element is malformed.
PineconeValueError – If batch_size is not a positive integer or max_concurrency is outside [1, 64].
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
UpsertResponse
Notes
When batch_size is set, batches are submitted in parallel via a ThreadPoolExecutor of max_concurrency workers (default 4, range 1–64). Per-batch retries are handled by the gRPC channel’s own retry policy. Partial failures do not raise — the returned UpsertResponse carries upserted_count, failed_item_count, errors, and failed_items for inspection and retry. Pass response.failed_items back to upsert(...) to retry only the failures.
Examples
from pinecone.grpc import GrpcIndex
from pinecone.models.vectors.vector import Vector

idx = GrpcIndex(host="article-search-abc123.svc.pinecone.io", api_key="...")
response = idx.upsert(
    vectors=[
        Vector(
            id="article-101",
            values=[0.012, -0.087, 0.153, ...],  # 1536-dim
        ),
        ("article-102", [0.045, 0.021, -0.064, ...]),
        {"id": "article-103", "values": [0.091, -0.032, 0.178, ...]},
    ],
    namespace="articles-en",
)
print(response.upserted_count)
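The batch-splitting behavior described in the notes can be sketched with the standard library. chunked and stub_upsert_batch are illustrative stand-ins, not client internals:

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(seq, size):
    """Split a sequence into consecutive batches of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def stub_upsert_batch(batch):
    # Stand-in for one gRPC upsert call; returns that batch's count.
    return len(batch)

vectors = [(f"id-{i}", [0.1, 0.2]) for i in range(10)]
batches = chunked(vectors, 3)  # batch sizes: 3, 3, 3, 1

# Mirrors batch_size=3, max_concurrency=4 in spirit.
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(stub_upsert_batch, batches))

print(sum(counts))  # 10
```

The real client additionally aggregates per-batch errors and failed items into a single UpsertResponse, which this sketch omits.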
- query(*, top_k, vector=None, id=None, namespace='', filter=None, include_values=False, include_metadata=False, sparse_vector=None, scan_factor=None, max_candidates=None, timeout=None)[source]¶
Query a namespace for the nearest neighbors of a vector.
- Parameters:
top_k (int) – Number of results to return (must be >= 1).
vector (list[float] | None) – Dense query vector values.
id (str | None) – ID of a stored vector to use as the query.
namespace (str) – Namespace to query. Defaults to the default namespace.
filter (dict[str, Any] | None) – Metadata filter expression.
include_values (bool) – Whether to include vector values in results.
include_metadata (bool) – Whether to include metadata in results.
sparse_vector (SparseValues | dict[str, Any] | None) – Sparse query vector with indices and values.
scan_factor (float | None) – DRN optimization — adjusts how much of the index is scanned. Range 0.5–4.0. Only supported for dedicated read node indexes. None uses server default.
max_candidates (int | None) – DRN optimization — caps candidate vectors to rerank. Range 1–100000. Only supported for dedicated read node indexes. None uses server default.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
QueryResponse with matches, namespace, and usage info.
- Raises:
ValidationError – If top_k < 1, both vector and id are provided, or none of vector, id, or sparse_vector are provided.
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
QueryResponse
Examples
response = idx.query(
    top_k=10,
    vector=[0.012, -0.087, 0.153, ...],  # 1536-dim embedding
)
for match in response.matches:
    print(match.id, match.score)
- fetch(*, ids, namespace='', timeout=None)[source]¶
Fetch vectors by their IDs from a namespace.
- Parameters:
ids (list[str]) – IDs of the vectors to fetch (must be non-empty).
namespace (str) – Namespace to fetch from. Defaults to the default namespace.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
FetchResponse with a map of vector IDs to Vector objects, namespace, and usage info.
- Raises:
ValidationError – If ids is empty.
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
FetchResponse
Examples
response = idx.fetch(ids=["article-101", "article-102"])
for vid, vec in response.vectors.items():
    print(vid, vec.values)
- delete(*, ids=None, delete_all=False, filter=None, namespace='', timeout=None)[source]¶
Delete vectors from a namespace by ID, filter, or delete-all flag.
Exactly one of ids, delete_all, or filter must be specified.
- Parameters:
ids (list[str] | None) – IDs of the vectors to delete.
delete_all (bool) – If True, delete all vectors in the namespace.
filter (dict[str, Any] | None) – Metadata filter expression selecting vectors to delete.
namespace (str) – Namespace to delete from. Defaults to the default namespace.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
None
- Raises:
ValidationError – If zero or more than one deletion mode is specified.
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
None
Examples
# Delete by IDs
idx.delete(ids=["article-101", "article-102"])

# Delete all vectors in a namespace
idx.delete(delete_all=True, namespace="articles-deprecated")

# Delete by metadata filter
idx.delete(filter={"category": {"$eq": "obsolete"}})
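The exactly-one-mode constraint can be sketched as a small validation helper. This is illustrative logic, not the client’s actual code:

```python
def check_delete_mode(ids=None, delete_all=False, filter=None):
    """Return the selected deletion mode, enforcing that exactly one is set."""
    active = [name for name, on in (("ids", ids is not None),
                                    ("delete_all", delete_all),
                                    ("filter", filter is not None)) if on]
    if len(active) != 1:
        raise ValueError("exactly one of ids, delete_all, or filter is required")
    return active[0]

print(check_delete_mode(ids=["article-101"]))  # ids
```

Calling it with zero or with two modes raises, mirroring the ValidationError described above.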
- update(*, id=None, values=None, sparse_values=None, set_metadata=None, namespace='', filter=None, dry_run=False, timeout=None)[source]¶
Update vectors by ID or metadata filter.
- Parameters:
id (str | None) – ID of the vector to update.
values (list[float] | None) – New dense vector values.
sparse_values (SparseValues | dict[str, Any] | None) – New sparse vector.
set_metadata (dict[str, Any] | None) – Metadata fields to set or overwrite.
namespace (str) – Namespace to target. Defaults to the default namespace.
filter (dict[str, Any] | None) – Metadata filter expression selecting vectors to update.
dry_run (bool) – If True, return the count of records that would be affected without applying changes.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
UpdateResponse with matched_records count (when available).
- Raises:
ValidationError – If both or neither of id and filter are provided.
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
UpdateResponse
Examples
# Update by ID
idx.update(id="article-101", values=[0.012, -0.087, 0.153, ...])

# Bulk-update metadata by filter
idx.update(
    filter={"genre": {"$eq": "drama"}},
    set_metadata={"year": 2020},
)
- list_paginated(*, prefix=None, limit=None, pagination_token=None, namespace='', timeout=None)[source]¶
Fetch a single page of vector IDs from a namespace.
- Parameters:
prefix (str | None) – Return only IDs starting with this prefix.
limit (int | None) – Maximum number of IDs to return in this page.
pagination_token (str | None) – Token from a previous response to fetch the next page.
namespace (str) – Namespace to list from. Defaults to the default namespace.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
ListResponse with vector IDs, pagination info, namespace, and usage.
- Raises:
PineconeTimeoutError – If the call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
ListResponse
Examples
response = idx.list_paginated(prefix="doc1#", limit=50)
for item in response.vectors:
    print(item.id)
- list(*, prefix=None, limit=None, namespace='', timeout=None)[source]¶
List vector IDs in a namespace, automatically following pagination.
Yields one ListResponse per page.
- Parameters:
prefix (str | None) – Return only IDs starting with this prefix.
limit (int | None) – Maximum number of IDs to return per page.
namespace (str) – Namespace to list from. Defaults to the default namespace.
timeout (float | None) – Per-call timeout in seconds applied to each page request. None uses the client-level default.
- Yields:
ListResponse for each page of results.
- Raises:
PineconeTimeoutError – If any page call exceeds timeout or the server returns CANCELLED with a timeout cause.
- Return type:
Iterator[ListResponse]
Examples
for page in idx.list(prefix="doc1#"):
    for item in page.vectors:
        print(item.id)
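The follow-pagination behavior of list() can be sketched as a generator over a stub page fetcher. list_all and the pages dict are illustrative assumptions standing in for list_paginated() and the server:

```python
def list_all(fetch_page):
    """Yield pages of IDs, following pagination tokens until exhausted.

    fetch_page stands in for list_paginated(): it takes a token and
    returns (ids, next_token), where next_token is None on the last page.
    """
    token = None
    while True:
        ids, token = fetch_page(token)
        yield ids
        if token is None:
            break

# Stub server: three pages keyed by pagination token.
pages = {None: (["doc1#a", "doc1#b"], "t1"),
         "t1": (["doc1#c"], "t2"),
         "t2": (["doc1#d"], None)}

all_ids = [vid for page in list_all(pages.get) for vid in page]
print(all_ids)  # ['doc1#a', 'doc1#b', 'doc1#c', 'doc1#d']
```

The real list() method works the same way, passing each page’s pagination token into the next list_paginated() call.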
- describe_index_stats(*, filter=None, timeout=None)[source]¶
Return statistics for this index.
- Parameters:
filter (dict[str, Any] | None) – Metadata filter expression; when set, stats cover only matching vectors.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
DescribeIndexStatsResponse with namespace summaries, dimension, total vector count, and fullness metrics.
- Return type:
DescribeIndexStatsResponse
Examples
stats = idx.describe_index_stats()
print(stats.total_vector_count, stats.dimension)

# With filter — only count vectors matching the expression
stats = idx.describe_index_stats(
    filter={"genre": {"$eq": "drama"}}
)
- upsert_from_dataframe(df, namespace='', batch_size=500, show_progress=True)[source]¶
Upsert vectors from a pandas DataFrame using async batching.
Splits the DataFrame into batches of batch_size rows and submits each batch asynchronously via upsert_async(), then aggregates the results.
- Parameters:
df (pd.DataFrame) – A pandas.DataFrame with at least id and values columns. sparse_values and metadata columns are included when present and non-None.
namespace (str) – Target namespace. Defaults to the default namespace.
batch_size (int) – Number of rows per upsert batch. Defaults to 500.
show_progress (bool) – If True and tqdm is installed, display a progress bar. If tqdm is not installed, silently falls back to no progress bar.
- Returns:
UpsertResponse with the total count of vectors upserted across all batches.
- Raises:
RuntimeError – If pandas is not installed.
PineconeValueError – If df is not a pandas.DataFrame.
PineconeValueError – If batch_size is not a positive integer.
- Return type:
UpsertResponse
Examples
import pandas as pd
from pinecone.grpc import GrpcIndex

idx = GrpcIndex(
    host="article-search-abc123.svc.pinecone.io",
    api_key="your-api-key",
)
df = pd.DataFrame([
    {"id": "article-101", "values": [0.012, -0.087, 0.153]},
    {"id": "article-102", "values": [0.045, 0.021, -0.064]},
])
response = idx.upsert_from_dataframe(df)
response.upserted_count
df = pd.DataFrame([
    {
        "id": "article-101",
        "values": [0.012, -0.087, 0.153],
        "metadata": {"topic": "science", "year": 2024},
    },
    {
        "id": "article-102",
        "values": [0.045, 0.021, -0.064],
        "metadata": {"topic": "technology", "year": 2024},
    },
])
response = idx.upsert_from_dataframe(
    df,
    namespace="articles-en",
    batch_size=100,
)
- upsert_async(*, vectors, namespace='', timeout=None)[source]¶
Submit an upsert operation and return a PineconeFuture.
Same parameters as upsert(), including timeout (float | None), which sets a per-call timeout in seconds.
- Returns:
PineconeFuture[UpsertResponse] that resolves to the upsert result.
- Return type:
PineconeFuture[UpsertResponse]
Examples
future = idx.upsert_async(
    vectors=[("doc-42", [0.012, -0.087, 0.153])],
)
result = future.result()
result.upserted_count  # 1
- query_async(*, top_k, vector=None, id=None, namespace='', filter=None, include_values=False, include_metadata=False, sparse_vector=None, scan_factor=None, max_candidates=None, timeout=None)[source]¶
Submit a query operation and return a PineconeFuture.
Same parameters as query(), including timeout (float | None), which sets a per-call timeout in seconds.
- Returns:
PineconeFuture[QueryResponse] that resolves to the query result containing scored matches.
- Return type:
PineconeFuture[QueryResponse]
Examples
future = idx.query_async(
    vector=[0.012, -0.087, 0.153],
    top_k=5,
)
result = future.result()
result.matches[0].id     # 'doc-42'
result.matches[0].score  # 0.95
- fetch_async(*, ids, namespace='', timeout=None)[source]¶
Submit a fetch operation and return a PineconeFuture.
Same parameters as fetch(), including timeout (float | None), which sets a per-call timeout in seconds.
- Returns:
PineconeFuture[FetchResponse] that resolves to the fetched vectors keyed by ID.
- Return type:
PineconeFuture[FetchResponse]
Examples
future = idx.fetch_async(ids=["doc-42", "doc-43"])
result = future.result()
result.vectors["doc-42"].values  # [0.012, -0.087, 0.153]
- delete_async(*, ids=None, delete_all=False, filter=None, namespace='', timeout=None)[source]¶
Submit a delete operation and return a PineconeFuture.
Same parameters as delete(), including timeout (float | None), which sets a per-call timeout in seconds.
- Returns:
PineconeFuture[None] that resolves when the delete operation completes.
- Return type:
PineconeFuture[None]
Examples
future = idx.delete_async(ids=["doc-42", "doc-43"])
future.result()
future = idx.delete_async(delete_all=True, namespace="docs")
future.result()
- update_async(*, id=None, values=None, sparse_values=None, set_metadata=None, filter=None, namespace='', dry_run=False, timeout=None)[source]¶
Submit an update call without blocking; returns a PineconeFuture.
- upsert_records(*, records, namespace, timeout=None)[source]¶
Upsert records for indexes with integrated inference.
Records are sent as newline-delimited JSON (NDJSON) over REST. Embeddings are generated server-side. This method delegates to the REST endpoint because the Pinecone gRPC API does not expose a records upsert operation.
- Parameters:
records (list[dict[str, Any]]) – Records to upsert. Each record must include an identifier field.
namespace (str) – Target namespace (required; must be a non-empty string).
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
UpsertRecordsResponse with the count of records submitted.
- Raises:
PineconeValueError – If namespace is not a string or is empty/whitespace, records is empty, or a record is missing an identifier field.
ApiError – If the API returns an error response.
PineconeConnectionError – If a network-level connection fails (DNS, refused, transport error).
PineconeTimeoutError – If the request exceeds the configured timeout.
- Return type:
UpsertRecordsResponse
Examples
pc = Pinecone(api_key="YOUR_API_KEY")
idx = pc.index("my-index", grpc=True)

response = idx.upsert_records(
    namespace="articles-en",
    records=[
        {"_id": "article-101", "text": "Vector DBs enable similarity search."},
        {"_id": "article-102", "text": "RAG combines search with LLMs."},
    ],
)
print(response.record_count)
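The newline-delimited JSON encoding mentioned above can be sketched with the standard library. This shows the wire format only; the actual request construction is internal to the client:

```python
import json

records = [
    {"_id": "article-101", "text": "Vector DBs enable similarity search."},
    {"_id": "article-102", "text": "RAG combines search with LLMs."},
]

# NDJSON: one JSON object per line, no enclosing array.
ndjson_body = "\n".join(json.dumps(r) for r in records)
print(ndjson_body)
```

Each line is independently parseable, which is what lets the server stream-process large record batches.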
- search(*, namespace, top_k, inputs=None, vector=None, id=None, filter=None, fields=None, rerank=None, match_terms=None, timeout=None)[source]¶
Search records by text, vector, or ID with optional reranking.
Delegates to the REST endpoint because the Pinecone gRPC API does not expose a records search operation for integrated inference indexes.
Note
Use this method for indexes with integrated inference. For classic indexes where you provide your own vectors, use query().
- Parameters:
namespace (str) – Namespace to search in (required).
top_k (int) – Number of results to return (must be >= 1).
inputs (SearchInputs | dict[str, Any] | None) – Inputs for server-side embedding (e.g. {"text": "query text"}).
id (str | None) – ID of an existing record to use as the query.
filter (dict[str, Any] | None) – Metadata filter expression.
fields (list[str] | None) – Field names to include in results. When None, the server returns all available fields.
rerank (RerankConfig | dict[str, Any] | None) – Reranking configuration with model (required), rank_fields (required), and optional top_n, parameters, query keys. Use RerankConfig for IDE autocompletion.
match_terms (dict[str, Any] | None) – Term-matching constraint for sparse search. Requires keys "strategy" (currently only "all") and "terms" (list of strings). Only supported for sparse indexes using pinecone-sparse-english-v0. None disables term matching.
timeout (float | None) – Per-call timeout in seconds. None uses the client-level default.
- Returns:
SearchRecordsResponsewith hits and usage statistics.- Raises:
PineconeValueError – If
namespaceis not a string,top_k < 1, orrerankis missing required keys.ApiError – If the API returns an error response.
PineconeConnectionError – If a network-level connection fails (DNS, refused, transport error).
PineconeTimeoutError – If the request exceeds the configured timeout.
- Return type:
SearchRecordsResponse
Examples
response = idx.search(
    namespace="articles-en",
    top_k=10,
    inputs={"text": "benefits of vector databases for search"},
)
for hit in response.result.hits:
    print(hit.id, hit.score)
Search with reranking:
response = idx.search(
    namespace="articles-en",
    top_k=10,
    inputs={"text": "benefits of vector databases"},
    rerank={
        "model": "bge-reranker-v2-m3",
        "rank_fields": ["text"],
        "top_n": 5,
    },
)
for hit in response.result.hits:
    print(hit.id, hit.score)
Note
Use inline rerank when searching and reranking in a single call. Use pc.inference.rerank() when reranking results from a different source or when you need to rerank without searching.
- search_records(*, namespace, top_k, inputs=None, vector=None, id=None, filter=None, fields=None, rerank=None, match_terms=None, timeout=None)[source]¶
Alias for search().
Prefer calling search() directly — this alias exists for backwards compatibility.
- Parameters:
Same as search().
- Return type:
SearchRecordsResponse
PineconeFuture¶
*_async() methods on GrpcIndex return a
PineconeFuture which is fully compatible with
concurrent.futures.as_completed() and concurrent.futures.wait().
- class pinecone.grpc.future.PineconeFuture(underlying)[source]¶
Bases: Future[_T]
Future returned by GrpcIndex.*_async() methods.
Wraps a concurrent.futures.Future and is fully compatible with concurrent.futures.as_completed() and concurrent.futures.wait().
The default result() timeout is 5 seconds. When the timeout elapses, PineconeTimeoutError is raised with the message "deadline exceeded".
Examples
from pinecone.grpc import GrpcIndex

idx = GrpcIndex(host="article-search-abc123.svc.pinecone.io", api_key="your-api-key")
future = idx.upsert_async(vectors=[("article-101", [0.012, -0.087, 0.153, ...])])
result = future.result()  # blocks up to 5 seconds
result.upserted_count  # 1
from concurrent.futures import as_completed

futures = [
    idx.upsert_async(vectors=[("article-101", [0.012, -0.087, 0.153, ...])]),
    idx.upsert_async(vectors=[("article-102", [0.045, 0.021, -0.064, ...])]),
]
for future in as_completed(futures):
    print(future.result().upserted_count)
- Parameters:
underlying (Future[_T])
- __init__(underlying)[source]¶
Initializes the future. Should not be called by clients.
- Parameters:
underlying (Future[_T])
- Return type:
None
- add_done_callback(fn)[source]¶
Attach a callable to be called when the future finishes.
The callable will be called with the future as its only argument.
- cancel()[source]¶
Attempt to cancel the underlying call.
Returns True if the call was successfully cancelled, False if the call has already completed or is running.
- Return type:
bool
- exception(timeout=5.0)[source]¶
Return the exception raised by the call, or None.
- Parameters:
timeout (float | None) – Maximum seconds to wait. Defaults to 5.0.
- Raises:
PineconeTimeoutError – If timeout seconds elapse.
- Return type:
BaseException | None
- result(timeout=5.0)[source]¶
Return the result of the call that the future represents.
- Parameters:
timeout (float | None) – Maximum seconds to wait. Defaults to 5.0. Pass None to block indefinitely.
- Returns:
The result value set by the underlying future.
- Raises:
PineconeTimeoutError – If timeout seconds elapse before the result is available.
- Return type:
_T
Examples
future = idx.upsert_async(vectors=[("article-101", [0.012, -0.087, 0.153, ...])])
result = future.result()
result.upserted_count  # 1
future = idx.upsert_async(vectors=large_batch)
result = future.result(timeout=30.0)
result = future.result(timeout=None)
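The default-timeout delegation described above can be sketched as a thin wrapper over a plain concurrent.futures.Future. DefaultTimeoutFuture is an illustrative name, not the library's implementation:

```python
import concurrent.futures

_DEFAULT = object()  # sentinel so callers can still pass timeout=None explicitly

class DefaultTimeoutFuture:
    """Sketch of a wrapper that delegates to an underlying Future but
    applies a 5-second default result() timeout, similar in spirit to
    PineconeFuture."""

    def __init__(self, underlying, default_timeout=5.0):
        self._underlying = underlying
        self._default_timeout = default_timeout

    def result(self, timeout=_DEFAULT):
        if timeout is _DEFAULT:
            timeout = self._default_timeout
        # timeout=None here still means "block indefinitely".
        return self._underlying.result(timeout=timeout)

with concurrent.futures.ThreadPoolExecutor() as pool:
    fut = DefaultTimeoutFuture(pool.submit(lambda: 41 + 1))
    value = fut.result()  # waits up to 5 seconds by default

print(value)  # 42
```

Note the sentinel: it distinguishes "no timeout argument given" (use the 5-second default) from an explicit timeout=None (block forever), matching the documented result() behavior.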