Reranking Results¶
Reranking reorders a set of candidate documents by relevance to a query. Use it after an initial retrieval step (vector search, keyword search, or a combined approach) to surface the most relevant results at the top.
Basic usage¶
from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")

documents = [
    {"text": "Apple is a technology company."},
    {"text": "The apple is a popular fruit."},
    {"text": "Pinecone is a vector database."},
]

result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="Tell me about tech companies",
    documents=documents,
)

for doc in result.data:
    print(doc.index, doc.score, doc.document)
Pass plain strings and they are automatically wrapped as {"text": ...}:
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="machine learning",
    documents=[
        "Neural networks are a type of ML model.",
        "Python is a programming language.",
    ],
)
Response: RerankResult¶
rerank returns a RerankResult containing:
.data — list of RankedDocument, ordered by descending score.
.model — the model name used.
.usage.rerank_units — rerank units consumed.
Each RankedDocument has:
.index — the original position of the document in the input list.
.score — relevance score (higher is more relevant).
.document — the original document dict (None when return_documents=False).
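To see how these fields fit together, here is a minimal local sketch (the result is simulated as plain (index, score) pairs rather than a real API call) that uses .index to recover the original documents in reranked order:

```python
documents = [
    {"text": "Apple is a technology company."},
    {"text": "The apple is a popular fruit."},
    {"text": "Pinecone is a vector database."},
]

# Stand-in for result.data: (index, score) pairs, sorted by descending score.
ranked = [(0, 0.92), (2, 0.41), (1, 0.05)]

# Map each reranked item back to its original document via its index.
reordered = [documents[i] for i, _score in ranked]
```

The same pattern is useful when return_documents=False and you only receive indices and scores back.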
top_n: return only the best results¶
Pass top_n to receive only the top N documents after reranking:
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="search query",
    documents=documents,
    top_n=3,
)
assert len(result.data) == 3
rank_fields: choose which field to score on¶
By default the reranker scores the "text" field. Override with rank_fields:
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="quarterly earnings",
    documents=[
        {"title": "Apple Q4 2024 results", "body": "Revenue grew 6% year over year."},
        {"title": "Banana prices rise", "body": "Fruit prices hit new highs."},
    ],
    rank_fields=["title", "body"],
)
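Every document should contain each field named in rank_fields; a quick client-side check can catch mismatches before the request is sent. A small hypothetical helper (not part of the SDK):

```python
def missing_rank_fields(documents, rank_fields):
    """Return (document_index, field_name) pairs for fields absent from a document."""
    return [
        (i, field)
        for i, doc in enumerate(documents)
        for field in rank_fields
        if field not in doc
    ]
```

For example, missing_rank_fields([{"title": "A"}], ["title", "body"]) returns [(0, "body")], flagging the document that lacks a body field.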
Using the RerankModel enum¶
Use the RerankModel enum for tab-completion and typo safety:
from pinecone import Pinecone, RerankModel

pc = Pinecone(api_key="your-api-key")

result = pc.inference.rerank(
    model=RerankModel.Bge_Reranker_V2_M3,
    query="machine learning",
    documents=documents,
)
Reranking in a pipeline¶
A typical two-stage retrieval pipeline: fetch candidates from Pinecone, then rerank.
from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.Index("product-search")

# Stage 1: vector retrieval
query_embedding = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["best noise-cancelling headphones"],
    parameters={"input_type": "query"},
).data[0].values

matches = index.query(vector=query_embedding, top_k=20, include_metadata=True)
candidates = [
    {"text": m.metadata.get("description", ""), "id": m.id}
    for m in matches.matches
]

# Stage 2: rerank
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="best noise-cancelling headphones",
    documents=candidates,
    rank_fields=["text"],
    top_n=5,
)

for doc in result.data:
    print(doc.score, doc.document)
For integrated indexes, pass rerank directly inside search() — see Integrated Records (Server-Side Embedding).
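As a rough sketch, an inline rerank spec for an integrated index might look like the dict below. The key names and the chunk_text field are assumptions for illustration; consult the Integrated Records guide for the exact schema.

```python
# Hypothetical rerank spec passed to index.search() on an integrated index.
# Keys and field names are assumptions, not a confirmed schema.
rerank_spec = {
    "model": "bge-reranker-v2-m3",
    "top_n": 5,
    "rank_fields": ["chunk_text"],
}
```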
List available reranking models¶
models = pc.inference.model.list(type="rerank")
print(models.names())