Upserting and Querying Vectors

Use the Index client to insert vectors into a Pinecone index and retrieve them. Get an index client via index():

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.index("movie-recommendations")

Upsert vectors

upsert() inserts vectors or overwrites existing ones with the same ID.

Pass a list of tuples (id, values) or (id, values, metadata):

index.upsert(
    vectors=[
        ("movie-001", [0.012, -0.087, 0.153, ...]),
        ("movie-002", [0.045,  0.021, -0.064, ...]),
    ]
)

Using Vector objects

Vector objects support metadata and sparse values:

from pinecone import Vector

response = index.upsert(
    vectors=[
        Vector(id="movie-001", values=[0.012, -0.087, 0.153, ...]),
        Vector(
            id="movie-002",
            values=[0.045, 0.021, -0.064, ...],
            metadata={"genre": "comedy", "year": 2022},
        ),
    ]
)
print(response.upserted_count)  # 2
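
Sparse values attach in the same way. A hedged sketch; the SparseValues model name and its indices/values fields are assumptions, so check the reference for your SDK version:

from pinecone import SparseValues, Vector

index.upsert(
    vectors=[
        Vector(
            id="movie-003",
            values=[0.033, -0.011, 0.098, ...],
            # SparseValues is an assumed model: parallel lists of
            # dimension indices and their non-zero weights
            sparse_values=SparseValues(indices=[10, 45, 123], values=[0.8, 0.2, 0.5]),
        ),
    ]
)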

upsert() returns an UpsertResponse. For a single-request upsert, only upserted_count is meaningful; for a batched upsert (see “Large datasets” below), the response also carries per-batch counters and a failed_items list for retry.

Upsert into a namespace

Pass namespace to target a specific partition:

index.upsert(
    vectors=[("movie-001", [0.012, -0.087, 0.153, ...])],
    namespace="movies-en",
)

The default namespace is "".
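
Reads are scoped the same way. A hedged sketch, assuming query() accepts the same namespace parameter (see Working with Namespaces):

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=5,
    namespace="movies-en",  # assumption: read operations accept namespace like writes
)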

Large datasets

For datasets larger than a single payload, pass batch_size to split the upload into chunks. Up to max_concurrency batches are in flight at once, coordinated by a ThreadPoolExecutor (sync) or an asyncio.Semaphore (async). HTTP-level retries happen automatically per batch via the configured RetryConfig.

response = index.upsert(
    vectors=large_list,
    batch_size=200,        # vectors per request
    max_concurrency=8,     # parallel in-flight requests (1–64)
    show_progress=True,    # tqdm progress bar (auto-skipped if tqdm not installed)
)
print(response.upserted_count)         # successful items
print(response.total_item_count)       # total submitted
print(response.successful_batch_count) # batches that succeeded

Defaults: batch_size=None keeps the single-request behaviour (no batching). When batch_size is set, max_concurrency defaults to 4 and show_progress defaults to True.

For DataFrame input, upsert_from_dataframe() provides the same parallel batching with column extraction. For millions of vectors, consider start_import() to load from cloud storage.
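
A minimal upsert_from_dataframe() sketch; the expected column names (id, values, metadata here) are assumptions, so check the reference for your SDK version:

import pandas as pd

# illustrative three-dimensional vectors; use your index's real dimension
df = pd.DataFrame({
    "id": ["movie-001", "movie-002"],
    "values": [[0.012, -0.087, 0.153], [0.045, 0.021, -0.064]],
    "metadata": [{"genre": "action"}, {"genre": "comedy"}],
})
response = index.upsert_from_dataframe(df, batch_size=200)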

Handling partial failures

Unlike a single-request upsert (which raises on failure), a batched upsert never raises for per-batch errors. Instead, the returned UpsertResponse carries each failed batch’s exception and items, so you can retry only the failures.

response = index.upsert(vectors=huge_list, batch_size=200)

if response.has_errors:
    print(f"{response.failed_item_count} of {response.total_item_count} items failed")
    for err in response.errors:
        print(f"  batch {err.batch_index}: {err.error_message}")

    # Retry only the failures:
    retry = index.upsert(
        vectors=response.failed_items,
        batch_size=200,
    )

response.failed_items is a flat list[dict] of every item from every failed batch, in original order. Pass it directly back to upsert(...) for retry.
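
For example, a bounded retry loop (a sketch built only from the response fields described above):

pending = response.failed_items
for _ in range(3):  # cap the passes so persistent failures surface
    if not pending:
        break
    retry = index.upsert(vectors=pending, batch_size=200)
    pending = retry.failed_items if retry.has_errors else []

if pending:
    print(f"{len(pending)} items still failing after 3 passes; inspect the errors")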

Inspect errors before retrying

Before retrying failed_items, look at why batches failed:

if response.has_errors:
    first = response.errors[0]
    print(f"first failure: {first.error_message}")

If every error has the same HTTP status — especially a 4xx like 400 (Bad Request), 401 (Unauthorized), 403 (Forbidden), or 422 (Unprocessable Entity) — the failures are about your data or your credentials, not transient infrastructure. Retrying with the same input will fail the same way. Fix the data or the credentials and retry the corrected items, or stop.
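
A hedged sketch of that triage; err.status is an assumed attribute name for the HTTP status, so adjust it to your error model:

from collections import Counter

RETRYABLE = {408, 429, 500, 502, 503, 504}

# err.status is an assumption; your errors may expose the status differently
statuses = Counter(getattr(err, "status", None) for err in response.errors)
print(statuses)  # e.g. Counter({429: 3}) suggests throttling; Counter({400: 1}) bad data

if statuses and all(s in RETRYABLE for s in statuses):
    # transient-class failures that outlived the transport's retry budget;
    # a delayed application-level retry may still succeed
    retry = index.upsert(vectors=response.failed_items, batch_size=200)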

Why surfaced errors are usually persistent

The HTTP transport retries {408, 429, 500, 502, 503, 504} automatically up to three times with exponential backoff (see RetryConfig). That layer absorbs nearly all transient infrastructure issues. By the time an error reaches response.errors, it has either:

  • exhausted the retry budget (sustained 5xx, persistent 429), or

  • hit a non-retryable status in the first place (4xx: bad input, auth, validation).

Either way, naive retries usually re-create the same problem. Treat each entry in response.errors as a real signal worth reading.

Batches fail atomically

Any per-batch error fails the entire batch — even if only one of its 200 vectors was the actual problem. So response.failed_items may contain 199 items that would have succeeded on their own, plus the one bad row that triggered the rejection. The server doesn’t surface per-item rejection details on the upsert path.

To isolate the bad row, re-batch the failures with a smaller batch_size (down to batch_size=1 if needed) — successful single-item batches narrow the problem to the rejected ones:

if response.has_errors:
    narrow = index.upsert(vectors=response.failed_items, batch_size=1)
    # narrow.failed_items now contains only the actually-bad rows
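
batch_size=1 costs one request per failed item. When that is too many, a bisection pass isolates k bad rows in roughly k·log(n) requests. A hypothetical helper, not part of the SDK, assuming upsert() accepts slices of failed_items directly:

def isolate_bad_rows(index, items):
    """Recursively bisect items until only the server-rejected rows remain."""
    if not items:
        return []
    # one request for the whole slice; the batched path never raises
    resp = index.upsert(vectors=items, batch_size=len(items))
    if not resp.has_errors:
        return []  # re-upserting good rows is harmless: upsert overwrites by ID
    if len(items) == 1:
        return items  # a single rejected row: the culprit
    mid = len(items) // 2
    return isolate_bad_rows(index, items[:mid]) + isolate_bad_rows(index, items[mid:])

bad_rows = isolate_bad_rows(index, response.failed_items)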

Query for nearest neighbors

query() returns the top_k closest vectors to a query vector:

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=10,
)
for match in response.matches:
    print(match.id, match.score)

Each element of response.matches is a ScoredVector with id, score, values, metadata, and sparse_values fields. Results are ordered from most similar to least similar.

Include values or metadata in results

By default, values and metadata are omitted from matches to reduce payload size. Enable them explicitly:

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=10,
    include_values=True,
    include_metadata=True,
)
for match in response.matches:
    print(match.id, match.score, match.metadata)

Filter by metadata

Pass a filter expression to restrict results to vectors whose metadata satisfies the condition:

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=5,
    filter={"genre": {"$eq": "action"}, "year": {"$gte": 2020}},
    include_metadata=True,
)

Using the Field filter builder

Field provides a Python-native API for building filter expressions. The ==, !=, &, and | operators and .gt() / .gte() / .lt() / .lte() / .is_in() / .not_in() methods return a Condition object. Pass it to filter via .to_dict():

from pinecone import Field

condition = (Field("genre") == "action") & Field("year").gte(2020)

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=5,
    filter=condition.to_dict(),
    include_metadata=True,
)

FilterBuilder is an alias for Field and can be used interchangeably.
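
The remaining operators compose the same way. A sketch combining .is_in() and |; note that Python's | binds tighter than ==, so parenthesize each comparison before combining:

from pinecone import Field

# either a 2024 release, or any title in these two genres
condition = Field("genre").is_in(["action", "thriller"]) | (Field("year") == 2024)

response = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=5,
    filter=condition.to_dict(),
)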

Fetch vectors by ID

fetch() retrieves stored vectors by their IDs:

response = index.fetch(ids=["movie-001", "movie-002"])
for vid, vec in response.vectors.items():
    print(vid, vec.values[:3])

response.vectors is a dict[str, Vector]. IDs that do not exist are omitted rather than raising an error.
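
Since missing IDs are silently omitted, compare the result against the requested IDs when completeness matters:

requested = ["movie-001", "movie-002", "movie-999"]
response = index.fetch(ids=requested)

# any requested ID absent from response.vectors does not exist in the index
missing = [i for i in requested if i not in response.vectors]
if missing:
    print(f"not found: {missing}")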

Update a vector

update() replaces a vector’s dense values, sparse values, or metadata.

Update dense values by ID:

index.update(id="movie-001", values=[0.099, -0.045, 0.210, ...])

Update metadata without changing values:

index.update(id="movie-001", set_metadata={"rating": 4.5, "genre": "thriller"})

Bulk-update metadata for all vectors matching a filter:

index.update(
    filter={"genre": {"$eq": "drama"}},
    set_metadata={"category": "classic"},
)

Delete vectors

delete() removes vectors from a namespace. Specify exactly one of ids, delete_all, or filter.

Delete by ID:

index.delete(ids=["movie-001", "movie-002"])

Delete all vectors in a namespace:

index.delete(delete_all=True, namespace="movies-deprecated")

Delete by metadata filter:

index.delete(filter={"year": {"$lte": 2000}})

Inspect index stats

describe_index_stats() returns aggregate counts and per-namespace summaries:

stats = index.describe_index_stats()
print(stats.total_vector_count)
print(stats.dimension)
print(stats.index_fullness)     # fraction 0.0–1.0

for namespace, summary in stats.namespaces.items():
    print(namespace, summary.vector_count)

Pass a filter to count only matching vectors:

stats = index.describe_index_stats(filter={"genre": {"$eq": "action"}})
print(stats.total_vector_count)

See also

  • Working with Namespaces — partitioning an index with namespaces

  • Bulk Importing Vectors — bulk importing from cloud storage

  • Index — full data plane client reference

  • QueryResponse — query response model

  • ScoredVector — individual match in query results