Integrated Records (Server-Side Embedding)

Integrated indexes store text records and embed them server-side using a hosted model. You write text and Pinecone generates the embeddings, so no separate embedding step is required.

Create an integrated index

Pass an IntegratedSpec whose EmbedConfig names the hosted model and maps the model's input to a field in your records:

from pinecone import Pinecone
from pinecone.models.indexes.specs import EmbedConfig, IntegratedSpec

pc = Pinecone(api_key="your-api-key")

pc.indexes.create(
    name="articles",
    spec=IntegratedSpec(
        cloud="aws",
        region="us-east-1",
        embed=EmbedConfig(
            model="multilingual-e5-large",
            field_map={"text": "body"},   # embed the "body" field of each record
        ),
    ),
)

field_map maps the model’s input field name ("text" for most models) to the field in your records that holds the content to embed.
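
To make the mapping concrete, the following is a minimal sketch of a client-side check that every record carries the mapped field before it is sent. The helper name missing_embed_field is hypothetical and not part of the Pinecone SDK; it only inspects plain dicts shaped like the records used in the upsert section.

```python
def missing_embed_field(records, field_map):
    """Return the IDs of records that lack a non-empty source field.

    Hypothetical helper, not part of the Pinecone SDK.
    """
    # field_map looks like {"text": "body"}; its values are the record
    # fields whose content gets embedded.
    source_fields = list(field_map.values())
    bad_ids = []
    for record in records:
        record_id = record.get("_id", record.get("id"))
        for field in source_fields:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                bad_ids.append(record_id)
                break
    return bad_ids


records = [
    {"_id": "article-1", "body": "Vector databases accelerate AI search."},
    {"_id": "article-2", "title": "No body field here"},
]
print(missing_embed_field(records, {"text": "body"}))  # ['article-2']
```

Running a check like this before upserting surfaces records that have nothing to embed under the configured mapping.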

Get a handle to the index:

index = pc.index("articles")

Upsert records

Call upsert_records() with a list of record dicts. Each record must have an _id (or id) field and should include the field you mapped in field_map ("body" in this example):

response = index.upsert_records(
    namespace="en",
    records=[
        {"_id": "article-1", "body": "Vector databases accelerate AI search."},
        {"_id": "article-2", "body": "RAG pipelines combine retrieval with generation."},
        {"_id": "article-3", "body": "Pinecone scales to billions of vectors."},
    ],
)
print(response.record_count)   # number of records submitted

The SDK sends records as newline-delimited JSON (NDJSON). Pinecone generates the embeddings asynchronously, so newly upserted records may take a moment to become searchable.
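
For readers unfamiliar with NDJSON, the sketch below shows the format itself: one JSON object per line. It is an illustration only. The SDK does this serialization for you, and the details of its actual payload (key order, whitespace, trailing newline) are assumptions here. The function name to_ndjson is hypothetical.

```python
import json


def to_ndjson(records):
    """Serialize records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


records = [
    {"_id": "article-1", "body": "Vector databases accelerate AI search."},
    {"_id": "article-2", "body": "RAG pipelines combine retrieval with generation."},
]
payload = to_ndjson(records)
print(payload)
```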

Search records

Call search() with inputs containing the query text. Pinecone embeds the query server-side and returns the nearest records:

results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "AI and machine learning"},
)

for hit in results.result.hits:
    print(hit.id, hit.score, hit.fields)
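
Because embeddings are generated asynchronously, a search issued right after an upsert may not see the new records yet. One way to handle this in scripts and tests is a small retry loop. The sketch below assumes only the .result.hits layout used above; search_until_hits is a hypothetical helper, and the attempt count and delay are arbitrary defaults rather than recommended values.

```python
import time


def search_until_hits(run_search, attempts=5, delay_seconds=1.0):
    """Call run_search() until it returns at least one hit or attempts run out.

    Hypothetical helper. run_search is any zero-argument callable that
    returns an object exposing .result.hits, for example:
        lambda: index.search(namespace="en", top_k=5,
                             inputs={"text": "AI and machine learning"})
    """
    results = run_search()
    for _ in range(attempts - 1):
        if results.result.hits:
            break
        time.sleep(delay_seconds)
        results = run_search()
    return results
```

The helper returns the last response either way, so the caller still has to decide how to treat an empty hit list.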

Response: SearchRecordsResponse

search returns a SearchRecordsResponse:

  • .result.hits — list of Hit objects, ordered by descending score.

  • .usage.read_units — read units consumed.

  • .usage.embed_total_tokens — tokens used for embedding the query.

Each Hit exposes:

  • .id — the record identifier.

  • .score — similarity score (higher is more relevant).

  • .fields — dict of record fields returned in the result.
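
As a usage sketch, the attributes listed above can be flattened into plain dicts for logging or for loading into a table. The helper hits_to_rows is hypothetical. The stand-in response below is built with SimpleNamespace so the example runs without an index, and its scores are made-up values, not real output.

```python
from types import SimpleNamespace


def hits_to_rows(results):
    """Flatten a search response into a list of plain dicts."""
    rows = []
    for hit in results.result.hits:
        row = dict(hit.fields or {})
        # Set id and score last so record fields cannot shadow them.
        row["id"] = hit.id
        row["score"] = hit.score
        rows.append(row)
    return rows


# Stand-in with the attribute layout described above; with a real index,
# pass the value returned by index.search(...) instead.
fake_results = SimpleNamespace(
    result=SimpleNamespace(
        hits=[
            SimpleNamespace(id="article-1", score=0.91, fields={"body": "Vector databases accelerate AI search."}),
            SimpleNamespace(id="article-3", score=0.78, fields={"body": "Pinecone scales to billions of vectors."}),
        ]
    )
)

for row in hits_to_rows(fake_results):
    print(row["id"], row["score"], row["body"])
```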

Use SearchInputs for IDE support

Pass a SearchInputs object instead of a plain dict to get editor autocompletion and type checking on the query inputs:

from pinecone.models.vectors.search import SearchInputs

results = index.search(
    namespace="en",
    top_k=5,
    inputs=SearchInputs(text="AI and machine learning"),
)

Rerank in a single search call

Pass a rerank config to retrieve and rerank in one request:

from pinecone.models.vectors.search import RerankConfig

results = index.search(
    namespace="en",
    top_k=10,
    inputs={"text": "best practices for vector search"},
    rerank=RerankConfig(
        model="bge-reranker-v2-m3",
        rank_fields=["body"],
        top_n=3,
    ),
)

for hit in results.result.hits:
    print(hit.id, hit.score)

top_k controls how many candidates are retrieved; top_n controls how many survive reranking. The response contains at most top_n hits.
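
The interaction between the two parameters can be illustrated with a toy, pure-Python model of two-stage ranking. This is a conceptual sketch, not how Pinecone implements reranking; the function name and the scores are invented for the illustration.

```python
def retrieve_then_rerank(candidates, top_k, top_n):
    """Toy two-stage ranking.

    candidates is a list of (id, retrieval_score, rerank_score) tuples.
    """
    # Stage 1: keep the top_k candidates by retrieval score.
    retrieved = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    # Stage 2: reorder only those candidates by rerank score, keep top_n.
    reranked = sorted(retrieved, key=lambda c: c[2], reverse=True)[:top_n]
    return [c[0] for c in reranked]


candidates = [
    ("a", 0.90, 0.20),
    ("b", 0.80, 0.95),
    ("c", 0.70, 0.60),
    ("d", 0.10, 0.99),  # best rerank score, but never retrieved when top_k=3
]
print(retrieve_then_rerank(candidates, top_k=3, top_n=2))  # ['b', 'c']
```

Record "d" shows why top_k matters: the reranker only sees what the first stage retrieved, so a top_k that is too small can exclude the record the reranker would have ranked highest.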

You can also pass a plain dict for rerank:

results = index.search(
    namespace="en",
    top_k=10,
    inputs={"text": "best practices"},
    rerank={"model": "bge-reranker-v2-m3", "rank_fields": ["body"], "top_n": 3},
)

Filter by metadata

Pass a filter dict to restrict results to records matching metadata conditions:

results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "quantum computing"},
    filter={"category": {"$eq": "science"}},
)
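
Filters are plain dicts, so they can be composed in code before being passed to search(). The sketch below combines conditions with the $and operator from Pinecone's metadata filter language. The helper all_of is hypothetical, and the "year" field is an assumed example field that does not appear elsewhere in this guide.

```python
def all_of(*conditions):
    """Combine filter conditions with $and.

    A single condition is returned unchanged. Hypothetical helper.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": list(conditions)}


science_recent = all_of(
    {"category": {"$eq": "science"}},
    {"year": {"$gte": 2020}},  # "year" is an assumed metadata field
)
print(science_recent)

# With a real index, pass the result as the filter argument:
# index.search(namespace="en", top_k=5,
#              inputs={"text": "quantum computing"}, filter=science_recent)
```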

Select returned fields

By default the server returns all record fields. Use the fields parameter to return only the fields you list:

results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "AI research"},
    fields=["body", "author"],
)

See also