Integrated Records (Server-Side Embedding)¶
Integrated indexes store text records and embed them server-side using a hosted model.
You write the text; Pinecone handles the embedding. No separate embedding step is required.
Create an integrated index¶
Use IntegratedSpec and provide an EmbedConfig
that maps a document field to the embedding input:
from pinecone import Pinecone
from pinecone.models.indexes.specs import EmbedConfig, IntegratedSpec

pc = Pinecone(api_key="your-api-key")

pc.indexes.create(
    name="articles",
    spec=IntegratedSpec(
        cloud="aws",
        region="us-east-1",
        embed=EmbedConfig(
            model="multilingual-e5-large",
            field_map={"text": "body"},  # embed the "body" field of each record
        ),
    ),
)
field_map maps the model’s input field name ("text" for most models) to the
field in your records that holds the content to embed.
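If your records keep their content under a different key, point field_map at that key instead. A minimal sketch, using the hypothetical record field name "chunk_text":

from pinecone.models.indexes.specs import EmbedConfig

# Hypothetical: records carry their content in "chunk_text" rather than "body".
embed_config = EmbedConfig(
    model="multilingual-e5-large",
    field_map={"text": "chunk_text"},  # model input "text" <- record field "chunk_text"
)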
Get a handle to the index:
index = pc.index("articles")
Upsert records¶
Call upsert_records() with a list of record dicts. Each record
must have an _id (or id) field. Include any fields you configured in
field_map:
response = index.upsert_records(
    namespace="en",
    records=[
        {"_id": "article-1", "body": "Vector databases accelerate AI search."},
        {"_id": "article-2", "body": "RAG pipelines combine retrieval with generation."},
        {"_id": "article-3", "body": "Pinecone scales to billions of vectors."},
    ],
)

print(response.record_count)  # number of records submitted
Records are sent as newline-delimited JSON (NDJSON). Embeddings are generated asynchronously by Pinecone; allow a moment before searching.
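For a quick script, a short fixed pause between upserting and searching is often enough. A minimal sketch reusing the index handle from above; the five-second delay is an assumption, not a guarantee that embedding has completed:

import time

index.upsert_records(
    namespace="en",
    records=[{"_id": "article-4", "body": "Hybrid search combines dense and sparse retrieval."}],
)

time.sleep(5)  # assumed delay; embeddings are generated asynchronously by Pinecone

results = index.search(
    namespace="en",
    top_k=3,
    inputs={"text": "hybrid retrieval"},
)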
Search records¶
Call search() with inputs containing the query text. Pinecone
embeds the query server-side and returns the nearest records:
results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "AI and machine learning"},
)

for hit in results.result.hits:
    print(hit.id, hit.score, hit.fields)
Response: SearchRecordsResponse¶
search returns a SearchRecordsResponse:
.result.hits — list of Hit objects, ordered by descending score.
.usage.read_units — read units consumed.
.usage.embed_total_tokens — tokens used for embedding the query.
Each Hit exposes:
.id — the record identifier.
.score — similarity score (higher is more relevant).
.fields — dict of record fields returned in the result.
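For example, to inspect both the hits and the usage counters listed above, reusing the results object from the earlier search call:

for hit in results.result.hits:
    print(hit.id, hit.score, hit.fields)

print("read units:", results.usage.read_units)
print("embed tokens:", results.usage.embed_total_tokens)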
Use SearchInputs for IDE support¶
from pinecone.models.vectors.search import SearchInputs

results = index.search(
    namespace="en",
    top_k=5,
    inputs=SearchInputs(text="AI and machine learning"),
)
Rerank in a single search call¶
Pass a rerank config to retrieve and rerank in one request:
from pinecone.models.vectors.search import RerankConfig

results = index.search(
    namespace="en",
    top_k=10,
    inputs={"text": "best practices for vector search"},
    rerank=RerankConfig(
        model="bge-reranker-v2-m3",
        rank_fields=["body"],
        top_n=3,
    ),
)

for hit in results.result.hits:
    print(hit.id, hit.score)
top_k controls how many candidates are retrieved; top_n controls how many
survive reranking. The response contains at most top_n hits.
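For instance, with the call above (top_k=10, top_n=3) the hit list holds no more than three entries:

print(len(results.result.hits))  # at most 3, even though top_k was 10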
You can also pass a plain dict for rerank:
results = index.search(
    namespace="en",
    top_k=10,
    inputs={"text": "best practices"},
    rerank={"model": "bge-reranker-v2-m3", "rank_fields": ["body"], "top_n": 3},
)
Filter by metadata¶
Pass a filter dict to restrict results to records matching metadata conditions:
results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "quantum computing"},
    filter={"category": {"$eq": "science"}},
)
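The filter only matches records that actually carry the referenced field, so "category" must be present on the upserted records. A sketch assuming the standard Pinecone filter operators ($eq, $in) and hypothetical records that include a "category" field alongside "body":

index.upsert_records(
    namespace="en",
    records=[
        {"_id": "article-10", "body": "Qubits enable new kinds of computation.", "category": "science"},
        {"_id": "article-11", "body": "Quarterly earnings beat expectations.", "category": "finance"},
    ],
)

results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "quantum computing"},
    filter={"category": {"$in": ["science", "technology"]}},
)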
Select returned fields¶
By default the server returns all record fields. Use fields to restrict the response:
results = index.search(
    namespace="en",
    top_k=5,
    inputs={"text": "AI research"},
    fields=["body", "author"],
)
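Hits then carry only the requested fields. A short sketch reusing the results from the call above; whether "author" is populated depends on the records you upserted:

for hit in results.result.hits:
    # hit.fields is limited to the requested keys ("body" and "author" here)
    print(hit.id, hit.fields.get("body"), hit.fields.get("author"))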
See also¶
Generating Embeddings — generate embeddings manually for non-integrated indexes.
Reranking Results — rerank results from any source.