pinecone.control.pinecone
import time
import logging
from typing import Optional, Dict, Any, Union, List, Tuple, Literal

from .index_host_store import IndexHostStore

from pinecone.config import PineconeConfig, Config, ConfigBuilder

from pinecone.core.openapi.control.api.manage_indexes_api import ManageIndexesApi
from pinecone.core.openapi.shared.api_client import ApiClient


from pinecone.utils import normalize_host, setup_openapi_client, build_plugin_setup_client
from pinecone.core.openapi.control.models import (
    CreateCollectionRequest,
    CreateIndexRequest,
    ConfigureIndexRequest,
    ConfigureIndexRequestSpec,
    ConfigureIndexRequestSpecPod,
    DeletionProtection,
    IndexSpec,
    ServerlessSpec as ServerlessSpecModel,
    PodSpec as PodSpecModel,
    PodSpecMetadataConfig,
)
from pinecone.core.openapi.shared import API_VERSION
from pinecone.models import ServerlessSpec, PodSpec, IndexModel, IndexList, CollectionList
from .langchain_import_warnings import _build_langchain_attribute_error_message

from pinecone.data import Index

from pinecone_plugin_interface import load_and_install as install_plugins

logger = logging.getLogger(__name__)


class Pinecone:
    def __init__(
        self,
        api_key: Optional[str] = None,
        host: Optional[str] = None,
        proxy_url: Optional[str] = None,
        proxy_headers: Optional[Dict[str, str]] = None,
        ssl_ca_certs: Optional[str] = None,
        ssl_verify: Optional[bool] = None,
        config: Optional[Config] = None,
        additional_headers: Optional[Dict[str, str]] = {},
        pool_threads: Optional[int] = 1,
        index_api: Optional[ManageIndexesApi] = None,
        **kwargs,
    ):
        """
        The `Pinecone` class is the main entry point for interacting with Pinecone via this Python SDK.
        It is used to create, delete, and manage your indexes and collections.

        :param api_key: The API key to use for authentication. If not passed via kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`.
        :type api_key: str, optional
        :param host: The control plane host to connect to.
        :type host: str, optional
        :param proxy_url: The URL of the proxy to use for the connection. Default: `None`
        :type proxy_url: str, optional
        :param proxy_headers: Additional headers to pass to the proxy. Use this if your proxy setup requires authentication. Default: `None`
        :type proxy_headers: Dict[str, str], optional
        :param ssl_ca_certs: The path to the SSL CA certificate bundle to use for the connection. This path should point to a file in PEM format. Default: `None`
        :type ssl_ca_certs: str, optional
        :param ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True`
        :type ssl_verify: bool, optional
        :param config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored.
        :type config: pinecone.config.Config, optional
        :param additional_headers: Additional headers to pass to the API. Default: `{}`
        :type additional_headers: Dict[str, str], optional
        :param pool_threads: The number of threads to use for the connection pool. Default: `1`
        :type pool_threads: int, optional
        :param index_api: An instance of `pinecone.core.openapi.control.api.manage_indexes_api.ManageIndexesApi`. If passed, the `host` parameter will be ignored.
        :type index_api: pinecone.core.openapi.control.api.manage_indexes_api.ManageIndexesApi, optional


        ### Configuration with environment variables

        If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable `PINECONE_API_KEY`.

        ```python
        from pinecone import Pinecone

        pc = Pinecone()
        ```

        ### Configuration with keyword arguments

        If you prefer being more explicit in your code, you can also pass the API key as a keyword argument.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
        ```

        ### Environment variables

        The Pinecone client supports the following environment variables:

        - `PINECONE_API_KEY`: The API key to use for authentication. If not passed via
          kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`.

        - `PINECONE_DEBUG_CURL`: When troubleshooting it can be very useful to run curl
          commands against the control plane API to see exactly what data is being sent
          and received without all the abstractions and transformations applied by the Python
          SDK. If you set this environment variable to `true`, the Pinecone client will use
          request parameters to print out an equivalent curl command that you can run yourself
          or share with Pinecone support. **Be very careful with this option, as it will print out
          your API key** which forms part of a required authentication header. Default: `false`

        ### Proxy configuration

        If your network setup requires you to interact with Pinecone via a proxy, you will need
        to pass additional configuration using optional keyword parameters. These optional parameters
        are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to
        make HTTP requests. You may find it helpful to refer to the
        [urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies)
        while troubleshooting these settings.

        Here is a basic example:

        ```python
        from pinecone import Pinecone

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com'
        )

        pc.list_indexes()
        ```

        If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password')
        )

        pc.list_indexes()
        ```

        ### Using proxies with self-signed certificates

        By default the Pinecone Python client will perform SSL certificate verification
        using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package.
        If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate
        in PEM format using the `ssl_ca_certs` parameter.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password'),
            ssl_ca_certs='path/to/cert-bundle.pem'
        )

        pc.list_indexes()
        ```

        ### Disabling SSL verification

        If you would like to disable SSL verification, you can pass the `ssl_verify`
        parameter with a value of `False`. We do not recommend going to production with SSL verification disabled.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password'),
            ssl_ca_certs='path/to/cert-bundle.pem',
            ssl_verify=False
        )

        pc.list_indexes()

        ```
        """
        if config:
            if not isinstance(config, Config):
                raise TypeError("config must be of type pinecone.config.Config")
            else:
                self.config = config
        else:
            self.config = PineconeConfig.build(
                api_key=api_key,
                host=host,
                additional_headers=additional_headers,
                proxy_url=proxy_url,
                proxy_headers=proxy_headers,
                ssl_ca_certs=ssl_ca_certs,
                ssl_verify=ssl_verify,
                **kwargs,
            )

        if kwargs.get("openapi_config", None):
            raise Exception(
                "Passing openapi_config is no longer supported. Please pass settings such as proxy_url, proxy_headers, ssl_ca_certs, and ssl_verify directly to the Pinecone constructor as keyword arguments. See the README at https://github.com/pinecone-io/pinecone-python-client for examples."
            )

        self.openapi_config = ConfigBuilder.build_openapi_config(self.config, **kwargs)
        self.pool_threads = pool_threads

        if index_api:
            self.index_api = index_api
        else:
            self.index_api = setup_openapi_client(
                api_client_klass=ApiClient,
                api_klass=ManageIndexesApi,
                config=self.config,
                openapi_config=self.openapi_config,
                pool_threads=pool_threads,
                api_version=API_VERSION,
            )

        self.index_host_store = IndexHostStore()
        """ @private """

        self.load_plugins()

    def load_plugins(self):
        """@private"""
        try:
            # I don't expect this to ever throw, but wrapping this in a
            # try block just in case to make sure a bad plugin doesn't
            # halt client initialization.
            openapi_client_builder = build_plugin_setup_client(
                config=self.config,
                openapi_config=self.openapi_config,
                pool_threads=self.pool_threads,
            )
            install_plugins(self, openapi_client_builder)
        except Exception as e:
            logger.error(f"Error loading plugins: {e}")

    def create_index(
        self,
        name: str,
        dimension: int,
        spec: Union[Dict, ServerlessSpec, PodSpec],
        metric: Optional[str] = "cosine",
        timeout: Optional[int] = None,
        deletion_protection: Optional[Literal["enabled", "disabled"]] = "disabled",
    ):
        """Creates a Pinecone index.

        :param name: The name of the index to create. Must be unique within your project and
            cannot be changed once created. Allowed characters are lowercase letters, numbers,
            and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.
        :type name: str
        :param dimension: The dimension of vectors that will be inserted in the index. This should
            match the dimension of the embeddings you will be inserting. For example, if you are using
            OpenAI's `text-embedding-ada-002` model, you should use `dimension=1536`.
        :type dimension: int
        :param metric: Type of metric used in the vector index when querying, one of `{"cosine", "dotproduct", "euclidean"}`.
            Defaults to `"cosine"`.
        :type metric: str, optional
        :param spec: A dictionary, `ServerlessSpec`, or `PodSpec` describing how the index should be deployed. For serverless indexes,
            specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection.
        :type spec: Union[Dict, ServerlessSpec, PodSpec]
        :param timeout: Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :type timeout: int, optional
        :param deletion_protection: If enabled, the index cannot be deleted.
        If disabled, the index can be deleted. Default: "disabled"

        ### Creating a serverless index

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-west-2"),
            deletion_protection="enabled"
        )
        ```

        ### Creating a pod index

        ```python
        import os
        from pinecone import Pinecone, PodSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=PodSpec(
                environment="us-east1-gcp",
                pod_type="p1.x1"
            ),
            deletion_protection="enabled"
        )
        ```
        """

        api_instance = self.index_api

        def _parse_non_empty_args(args: List[Tuple[str, Any]]) -> Dict[str, Any]:
            return {arg_name: val for arg_name, val in args if val is not None}

        if deletion_protection in ["enabled", "disabled"]:
            dp = DeletionProtection(deletion_protection)
        else:
            raise ValueError("deletion_protection must be either 'enabled' or 'disabled'")

        if isinstance(spec, dict):
            if "serverless" in spec:
                index_spec = IndexSpec(serverless=ServerlessSpecModel(**spec["serverless"]))
            elif "pod" in spec:
                args_dict = _parse_non_empty_args(
                    [
                        ("environment", spec["pod"].get("environment")),
                        ("metadata_config", spec["pod"].get("metadata_config")),
                        ("replicas", spec["pod"].get("replicas")),
                        ("shards", spec["pod"].get("shards")),
                        ("pods", spec["pod"].get("pods")),
                        ("source_collection", spec["pod"].get("source_collection")),
                    ]
                )
                if args_dict.get("metadata_config"):
                    args_dict["metadata_config"] = PodSpecMetadataConfig(
                        indexed=args_dict["metadata_config"].get("indexed", None)
                    )
                index_spec = IndexSpec(pod=PodSpecModel(**args_dict))
            else:
                raise ValueError("spec must contain either 'serverless' or 'pod' key")
        elif isinstance(spec, ServerlessSpec):
            index_spec = IndexSpec(
                serverless=ServerlessSpecModel(cloud=spec.cloud, region=spec.region)
            )
        elif isinstance(spec, PodSpec):
            args_dict = _parse_non_empty_args(
                [
                    ("replicas", spec.replicas),
                    ("shards", spec.shards),
                    ("pods", spec.pods),
                    ("source_collection", spec.source_collection),
                ]
            )
            if spec.metadata_config:
                args_dict["metadata_config"] = PodSpecMetadataConfig(
                    indexed=spec.metadata_config.get("indexed", None)
                )

            index_spec = IndexSpec(
                pod=PodSpecModel(environment=spec.environment, pod_type=spec.pod_type, **args_dict)
            )
        else:
            raise TypeError("spec must be of type dict, ServerlessSpec, or PodSpec")

        api_instance.create_index(
            create_index_request=CreateIndexRequest(
                name=name,
                dimension=dimension,
                metric=metric,
                spec=index_spec,
                deletion_protection=dp,
            )
        )

        def is_ready():
            status = self._get_status(name)
            ready = status["ready"]
            return ready

        if timeout == -1:
            return
        if timeout is None:
            while not is_ready():
                time.sleep(5)
        else:
            while (not is_ready()) and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the describe_index API ({}) to confirm index status.".format(
                        "https://www.pinecone.io/docs/api/operation/describe_index/"
                    )
                )
            )

    def delete_index(self, name: str, timeout: Optional[int] = None):
        """Deletes a Pinecone index.

        Deleting an index is an irreversible operation. All data in the index will be lost.
        When you use this command, a request is sent to the Pinecone control plane to delete
        the index, but the termination is not synchronous because resources take a few moments to
        be released.

        You can check the status of the index by calling the `describe_index()` command.
        With repeated polling of the describe_index command, you will see the index transition to a
        `Terminating` state before eventually resulting in a 404 after it has been removed.

        :param name: the name of the index.
        :type name: str
        :param timeout: Number of seconds to poll while checking whether the index has been deleted. If None,
            wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :type timeout: int, optional
        """
        api_instance = self.index_api
        api_instance.delete_index(name)
        self.index_host_store.delete_host(self.config, name)

        def get_remaining():
            return name in self.list_indexes().names()

        if timeout == -1:
            return

        if timeout is None:
            while get_remaining():
                time.sleep(5)
        else:
            while get_remaining() and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the list_indexes API ({}) to confirm if index is deleted".format(
                        "https://www.pinecone.io/docs/api/operation/list_indexes/"
                    )
                )
            )

    def list_indexes(self) -> IndexList:
        """Lists all indexes.

        The results include a description of all indexes in your project, including the
        index name, dimension, metric, status, and spec.

        :return: Returns an `IndexList` object, which is iterable and contains a
            list of `IndexModel` objects. It also has a convenience method `names()`
            which returns a list of index names.

        ```python
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone()

        index_name = "my-index"
        if index_name not in client.list_indexes().names():
            print("Index does not exist, creating...")
            client.create_index(
                name=index_name,
                dimension=768,
                metric="cosine",
                spec=ServerlessSpec(cloud="aws", region="us-west-2")
            )
        ```

        You can also use the `list_indexes()` method to iterate over all indexes in your project
        and get other information besides just names.

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for index in client.list_indexes():
            print(index.name)
            print(index.dimension)
            print(index.metric)
            print(index.status)
            print(index.host)
            print(index.spec)
        ```

        """
        response = self.index_api.list_indexes()
        return IndexList(response)

    def describe_index(self, name: str):
        """Describes a Pinecone index.

        :param name: the name of the index to describe.
        :return: Returns an `IndexModel` object
          which gives access to properties such as the
          index name, dimension, metric, host url, status,
          and spec.

        ### Getting your index host url

        In a real production situation, you probably want to
        store the host url in an environment variable so you
        don't have to call describe_index and re-fetch it
        every time you want to use the index. But this example
        shows how to get the value from the API using describe_index.

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        description = client.describe_index("my-index")

        host = description.host
        print(f"Your index is hosted at {description.host}")

        index = client.Index(name="my-index", host=host)
        index.upsert(vectors=[...])
        ```
        """
        api_instance = self.index_api
        description = api_instance.describe_index(name)
        host = description.host
        self.index_host_store.set_host(self.config, name, host)

        return IndexModel(description)

    def has_index(self, name: str) -> bool:
        """Checks if a Pinecone index exists.

        :param name: The name of the index to check for existence.
        :return: Returns `True` if the index exists, `False` otherwise.

        ### Example Usage

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        pc = Pinecone(api_key=api_key)

        if pc.has_index("my-index-name"):
            print("The index exists")
        else:
            print("The index does not exist")
        ```
        """

        if name in self.list_indexes().names():
            return True
        else:
            return False

    def configure_index(
        self,
        name: str,
        replicas: Optional[int] = None,
        pod_type: Optional[str] = None,
        deletion_protection: Optional[Literal["enabled", "disabled"]] = None,
    ):
        """This method is used to scale configuration fields for your pod-based Pinecone index.

        :param name: the name of the Index
        :param replicas: the desired number of replicas, lowest value is 0.
        :param pod_type: the new pod_type for the index.
        To learn more about the available pod types, please see
        [Understanding Indexes](https://docs.pinecone.io/docs/indexes)
        :param deletion_protection: `"enabled"` or `"disabled"`. If omitted, the index keeps its current setting.


        ```python
        from pinecone import Pinecone

        client = Pinecone()

        # Make a configuration change
        client.configure_index(name="my-index", replicas=4)

        # Call describe_index to see the index status as the
        # change is applied.
        client.describe_index("my-index")
        ```

        """
        api_instance = self.index_api

        if deletion_protection is None:
            description = self.describe_index(name=name)
            dp = DeletionProtection(description.deletion_protection)
        elif deletion_protection in ["enabled", "disabled"]:
            dp = DeletionProtection(deletion_protection)
        else:
            raise ValueError("deletion_protection must be either 'enabled' or 'disabled'")

        pod_config_args: Dict[str, Any] = {}
        if pod_type:
            pod_config_args.update(pod_type=pod_type)
        if replicas:
            pod_config_args.update(replicas=replicas)

        if pod_config_args != {}:
            spec = ConfigureIndexRequestSpec(pod=ConfigureIndexRequestSpecPod(**pod_config_args))
            req = ConfigureIndexRequest(deletion_protection=dp, spec=spec)
        else:
            req = ConfigureIndexRequest(deletion_protection=dp)

        api_instance.configure_index(name, configure_index_request=req)

    def create_collection(self, name: str, source: str):
        """Create a collection from a pod-based index

        :param name: Name of the collection
        :param source: Name of the source index
        """
        api_instance = self.index_api
        api_instance.create_collection(
            create_collection_request=CreateCollectionRequest(name=name, source=source)
        )

    def list_collections(self) -> CollectionList:
        """List all collections

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for collection in client.list_collections():
            print(collection.name)
            print(collection.source)

        # You can also iterate specifically over the collection
        # names with the .names() helper.
        for collection_name in client.list_collections().names():
            print(collection_name)
        ```
        """
        api_instance = self.index_api
        response = api_instance.list_collections()
        return CollectionList(response)

    def delete_collection(self, name: str):
        """Deletes a collection.

        :param name: The name of the collection

        Deleting a collection is an irreversible operation. All data
        in the collection will be lost.

        This method tells Pinecone you would like to delete a collection,
        but it takes a few moments to complete the operation. Use the
        `describe_collection()` method to confirm that the collection
        has been deleted.
        """
        api_instance = self.index_api
        api_instance.delete_collection(name)

    def describe_collection(self, name: str):
        """Describes a collection.

        :param name: The name of the collection
        :return: Description of the collection, as a dictionary

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        # The description is returned as a plain dict (see `.to_dict()` below).
        description = client.describe_collection("my_collection")
        print(description["name"])
        print(description["source"])
        print(description["status"])
        print(description["size"])
        ```
        """
        api_instance = self.index_api
        return api_instance.describe_collection(name).to_dict()

    def _get_status(self, name: str):
        api_instance = self.index_api
        response = api_instance.describe_index(name)
        return response["status"]

    @staticmethod
    def from_texts(*args, **kwargs):
        raise AttributeError(_build_langchain_attribute_error_message("from_texts"))

    @staticmethod
    def from_documents(*args, **kwargs):
        raise AttributeError(_build_langchain_attribute_error_message("from_documents"))

    def Index(self, name: str = "", host: str = "", **kwargs):
        """
        Target an index for data operations.

        ### Target an index by host url

        In production situations, you want to upsert or query your data as quickly
        as possible. If you know in advance the host url of your index, you can
        eliminate a round trip to the Pinecone control plane by specifying the
        host of the index.

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        index_host = os.environ.get("PINECONE_INDEX_HOST")

        pc = Pinecone(api_key=api_key)
        index = pc.Index(host=index_host)

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        To find your host url, you can use the Pinecone control plane to describe
        the index. The host url is returned in the response. Alternatively, the
        host is displayed in the Pinecone web console.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(
            api_key=os.environ.get("PINECONE_API_KEY")
        )

        host = pc.describe_index('index-name').host
        ```

        ### Target an index by name (not recommended for production)

        For more casual usage, such as when you are exploring Pinecone
        in a notebook setting, you can also target an index by name. If you use this
        approach, the client may need to perform an extra call to the Pinecone control
        plane to get the index host on your behalf.

        The client will cache the index host for future use whenever it is seen, so you
        will incur the overhead of only one call. But this approach is not
        recommended for production usage.

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        api_key = os.environ.get("PINECONE_API_KEY")

        pc = Pinecone(api_key=api_key)
        pc.create_index(
            name='my-index',
            dimension=1536,
            metric='cosine',
            spec=ServerlessSpec(cloud='aws', region='us-west-2')
        )
        index = pc.Index('my-index')

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        Arguments:
            name: The name of the index to target. If you specify the name of the index, the client will
                fetch the host url from the Pinecone control plane.
            host: The host url of the index to target. If you specify the host url, the client will use
                the host url directly without making any additional calls to the control plane.
            pool_threads: The number of threads to use when making parallel requests by calling index methods with optional kwarg async_req=True, or using methods that make use of parallelism automatically such as query_namespaces(). Default: 1
            connection_pool_maxsize: The maximum number of connections to keep in the connection pool. Default: 5 * multiprocessing.cpu_count()
        """
        if name == "" and host == "":
            raise ValueError("Either name or host must be specified")

        pt = kwargs.pop("pool_threads", None) or self.pool_threads
        api_key = self.config.api_key
        openapi_config = self.openapi_config

        if host != "":
            # Use host url if it is provided
            index_host = normalize_host(host)
        else:
            # Otherwise, get host url from describe_index using the index name
            index_host = self.index_host_store.get_host(self.index_api, self.config, name)

        return Index(
            host=index_host,
            api_key=api_key,
            pool_threads=pt,
            openapi_config=openapi_config,
            source_tag=self.config.source_tag,
            **kwargs,
        )
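
The `timeout` convention documented for `create_index` and `delete_index` (`None` waits indefinitely, `>=0` sets a budget in seconds, `-1` returns immediately) can be illustrated in isolation. The block below is a minimal sketch, not the SDK's implementation: `wait_until`, `is_done`, `poll_interval`, and the injectable `sleep` are hypothetical names introduced here, and the exact moment at which the budget is considered exhausted is a simplifying assumption that may differ from the loops in the source above.

```python
import time
from typing import Callable, Optional


def wait_until(
    is_done: Callable[[], bool],
    timeout: Optional[int] = None,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_done` until it returns True (hypothetical helper, not SDK API).

    timeout=None -> poll indefinitely
    timeout>=0   -> raise TimeoutError once the budget is used up
    timeout=-1   -> do not wait at all; return False immediately
    """
    if timeout == -1:
        return False  # caller asked not to wait

    remaining = timeout
    while not is_done():
        if remaining is not None:
            if remaining <= 0:
                raise TimeoutError("condition not met within the timeout")
            remaining -= poll_interval
        sleep(poll_interval)
    return True


if __name__ == "__main__":
    # Simulate a resource that reports ready on the third status check.
    checks = {"count": 0}

    def ready_on_third_check() -> bool:
        checks["count"] += 1
        return checks["count"] >= 3

    # A short poll interval keeps the demo fast.
    print(wait_until(ready_on_third_check, timeout=30, poll_interval=0.01))
```

Passing `sleep` as a parameter is a design choice made only for this sketch: it lets the polling behaviour be exercised in tests without real delays.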
38class Pinecone: 39 def __init__( 40 self, 41 api_key: Optional[str] = None, 42 host: Optional[str] = None, 43 proxy_url: Optional[str] = None, 44 proxy_headers: Optional[Dict[str, str]] = None, 45 ssl_ca_certs: Optional[str] = None, 46 ssl_verify: Optional[bool] = None, 47 config: Optional[Config] = None, 48 additional_headers: Optional[Dict[str, str]] = {}, 49 pool_threads: Optional[int] = 1, 50 index_api: Optional[ManageIndexesApi] = None, 51 **kwargs, 52 ): 53 """ 54 The `Pinecone` class is the main entry point for interacting with Pinecone via this Python SDK. 55 It is used to create, delete, and manage your indexes and collections. 56 57 :param api_key: The API key to use for authentication. If not passed via kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`. 58 :type api_key: str, optional 59 :param host: The control plane host to connect to. 60 :type host: str, optional 61 :param proxy_url: The URL of the proxy to use for the connection. Default: `None` 62 :type proxy_url: str, optional 63 :param proxy_headers: Additional headers to pass to the proxy. Use this if your proxy setup requires authentication. Default: `{}` 64 :type proxy_headers: Dict[str, str], optional 65 :param ssl_ca_certs: The path to the SSL CA certificate bundle to use for the connection. This path should point to a file in PEM format. Default: `None` 66 :type ssl_ca_certs: str, optional 67 :param ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True` 68 :type ssl_verify: bool, optional 69 :param config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored. 70 :type config: pinecone.config.Config, optional 71 :param additional_headers: Additional headers to pass to the API. Default: `{}` 72 :type additional_headers: Dict[str, str], optional 73 :param pool_threads: The number of threads to use for the connection pool. 
Default: `1` 74 :type pool_threads: int, optional 75 :param index_api: An instance of `pinecone.core.client.api.manage_indexes_api.ManageIndexesApi`. If passed, the `host` parameter will be ignored. 76 :type index_api: pinecone.core.client.api.manage_indexes_api.ManageIndexesApi, optional 77 78 79 ### Configuration with environment variables 80 81 If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable `PINECONE_API_KEY`. 82 83 ```python 84 from pinecone import Pinecone 85 86 pc = Pinecone() 87 ``` 88 89 ### Configuration with keyword arguments 90 91 If you prefer being more explicit in your code, you can also pass the API as a keyword argument. 92 93 ```python 94 import os 95 from pinecone import Pinecone 96 97 pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY")) 98 ``` 99 100 ### Environment variables 101 102 The Pinecone client supports the following environment variables: 103 104 - `PINECONE_API_KEY`: The API key to use for authentication. If not passed via 105 kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`. 106 107 - `PINECONE_DEBUG_CURL`: When troubleshooting it can be very useful to run curl 108 commands against the control plane API to see exactly what data is being sent 109 and received without all the abstractions and transformations applied by the Python 110 SDK. If you set this environment variable to `true`, the Pinecone client will use 111 request parameters to print out an equivalent curl command that you can run yourself 112 or share with Pinecone support. **Be very careful with this option, as it will print out 113 your API key** which forms part of a required authentication header. Default: `false` 114 115 ### Proxy configuration 116 117 If your network setup requires you to interact with Pinecone via a proxy, you will need 118 to pass additional configuration using optional keyword parameters. 
These optional parameters 119 are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to 120 make HTTP requests. You may find it helpful to refer to the 121 [urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies) 122 while troubleshooting these settings. 123 124 Here is a basic example: 125 126 ```python 127 from pinecone import Pinecone 128 129 pc = Pinecone( 130 api_key='YOUR_API_KEY', 131 proxy_url='https://your-proxy.com' 132 ) 133 134 pc.list_indexes() 135 ``` 136 137 If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter. 138 139 ```python 140 from pinecone import Pinecone 141 import urllib3 import make_headers 142 143 pc = Pinecone( 144 api_key='YOUR_API_KEY', 145 proxy_url='https://your-proxy.com', 146 proxy_headers=make_headers(proxy_basic_auth='username:password') 147 ) 148 149 pc.list_indexes() 150 ``` 151 152 ### Using proxies with self-signed certificates 153 154 By default the Pinecone Python client will perform SSL certificate verification 155 using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package. 156 If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate 157 in PEM format using the `ssl_ca_certs` parameter. 158 159 ```python 160 from pinecone import Pinecone 161 import urllib3 import make_headers 162 163 pc = Pinecone( 164 api_key='YOUR_API_KEY', 165 proxy_url='https://your-proxy.com', 166 proxy_headers=make_headers(proxy_basic_auth='username:password'), 167 ssl_ca_certs='path/to/cert-bundle.pem' 168 ) 169 170 pc.list_indexes() 171 ``` 172 173 ### Disabling SSL verification 174 175 If you would like to disable SSL verification, you can pass the `ssl_verify` 176 parameter with a value of `False`. We do not recommend going to production with SSL verification disabled. 
        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password'),
            ssl_ca_certs='path/to/cert-bundle.pem',
            ssl_verify=False
        )

        pc.list_indexes()

        ```
        """
        if config:
            if not isinstance(config, Config):
                raise TypeError("config must be of type pinecone.config.Config")
            else:
                self.config = config
        else:
            self.config = PineconeConfig.build(
                api_key=api_key,
                host=host,
                additional_headers=additional_headers,
                proxy_url=proxy_url,
                proxy_headers=proxy_headers,
                ssl_ca_certs=ssl_ca_certs,
                ssl_verify=ssl_verify,
                **kwargs,
            )

        if kwargs.get("openapi_config", None):
            raise Exception(
                "Passing openapi_config is no longer supported. Please pass settings such as proxy_url, proxy_headers, ssl_ca_certs, and ssl_verify directly to the Pinecone constructor as keyword arguments. See the README at https://github.com/pinecone-io/pinecone-python-client for examples."
            )

        self.openapi_config = ConfigBuilder.build_openapi_config(self.config, **kwargs)
        self.pool_threads = pool_threads

        if index_api:
            self.index_api = index_api
        else:
            self.index_api = setup_openapi_client(
                api_client_klass=ApiClient,
                api_klass=ManageIndexesApi,
                config=self.config,
                openapi_config=self.openapi_config,
                pool_threads=pool_threads,
                api_version=API_VERSION,
            )

        self.index_host_store = IndexHostStore()
        """ @private """

        self.load_plugins()

    def load_plugins(self):
        """@private"""
        try:
            # I don't expect this to ever throw, but wrapping this in a
            # try block just in case to make sure a bad plugin doesn't
            # halt client initialization.
            openapi_client_builder = build_plugin_setup_client(
                config=self.config,
                openapi_config=self.openapi_config,
                pool_threads=self.pool_threads,
            )
            install_plugins(self, openapi_client_builder)
        except Exception as e:
            logger.error(f"Error loading plugins: {e}")

    def create_index(
        self,
        name: str,
        dimension: int,
        spec: Union[Dict, ServerlessSpec, PodSpec],
        metric: Optional[str] = "cosine",
        timeout: Optional[int] = None,
        deletion_protection: Optional[Literal["enabled", "disabled"]] = "disabled",
    ):
        """Creates a Pinecone index.

        :param name: The name of the index to create. Must be unique within your project and
            cannot be changed once created. Allowed characters are lowercase letters, numbers,
            and hyphens, and the name may not begin or end with hyphens. Maximum length is 45 characters.
        :type name: str
        :param dimension: The dimension of vectors that will be inserted in the index. This should
            match the dimension of the embeddings you will be inserting. For example, if you are using
            OpenAI's `text-embedding-ada-002` model, you should use `dimension=1536`.
        :type dimension: int
        :param metric: Type of metric used in the vector index when querying, one of `{"cosine", "dotproduct", "euclidean"}`. Defaults to `"cosine"`.
        :type metric: str, optional
        :param spec: A dictionary, `ServerlessSpec`, or `PodSpec` describing how the index should be deployed. For serverless indexes,
            specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection.
        :type spec: Union[Dict, ServerlessSpec, PodSpec]
        :type timeout: int, optional
        :param timeout: Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :param deletion_protection: If enabled, the index cannot be deleted.
            If disabled, the index can be deleted. Default: "disabled"

        ### Creating a serverless index

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-west-2"),
            deletion_protection="enabled"
        )
        ```

        ### Creating a pod index

        ```python
        import os
        from pinecone import Pinecone, PodSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=PodSpec(
                environment="us-east1-gcp",
                pod_type="p1.x1"
            ),
            deletion_protection="enabled"
        )
        ```
        """

        api_instance = self.index_api

        def _parse_non_empty_args(args: List[Tuple[str, Any]]) -> Dict[str, Any]:
            return {arg_name: val for arg_name, val in args if val is not None}

        if deletion_protection in ["enabled", "disabled"]:
            dp = DeletionProtection(deletion_protection)
        else:
            raise ValueError("deletion_protection must be either 'enabled' or 'disabled'")

        if isinstance(spec, dict):
            if "serverless" in spec:
                index_spec = IndexSpec(serverless=ServerlessSpecModel(**spec["serverless"]))
            elif "pod" in spec:
                args_dict = _parse_non_empty_args(
                    [
                        ("environment", spec["pod"].get("environment")),
                        ("metadata_config", spec["pod"].get("metadata_config")),
                        ("replicas", spec["pod"].get("replicas")),
                        ("shards", spec["pod"].get("shards")),
                        ("pods", spec["pod"].get("pods")),
                        ("source_collection", spec["pod"].get("source_collection")),
                    ]
                )
                if args_dict.get("metadata_config"):
                    args_dict["metadata_config"] = PodSpecMetadataConfig(
                        indexed=args_dict["metadata_config"].get("indexed", None)
                    )
                index_spec = IndexSpec(pod=PodSpecModel(**args_dict))
            else:
                raise ValueError("spec must contain either 'serverless' or 'pod' key")
        elif isinstance(spec, ServerlessSpec):
            index_spec = IndexSpec(
                serverless=ServerlessSpecModel(cloud=spec.cloud, region=spec.region)
            )
        elif isinstance(spec, PodSpec):
            args_dict = _parse_non_empty_args(
                [
                    ("replicas", spec.replicas),
                    ("shards", spec.shards),
                    ("pods", spec.pods),
                    ("source_collection", spec.source_collection),
                ]
            )
            if spec.metadata_config:
                args_dict["metadata_config"] = PodSpecMetadataConfig(
                    indexed=spec.metadata_config.get("indexed", None)
                )

            index_spec = IndexSpec(
                pod=PodSpecModel(environment=spec.environment, pod_type=spec.pod_type, **args_dict)
            )
        else:
            raise TypeError("spec must be of type dict, ServerlessSpec, or PodSpec")

        api_instance.create_index(
            create_index_request=CreateIndexRequest(
                name=name,
                dimension=dimension,
                metric=metric,
                spec=index_spec,
                deletion_protection=dp,
            )
        )

        def is_ready():
            status = self._get_status(name)
            ready = status["ready"]
            return ready

        if timeout == -1:
            return
        if timeout is None:
            while not is_ready():
                time.sleep(5)
        else:
            while (not is_ready()) and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the describe_index API ({}) to confirm index status.".format(
                        "https://www.pinecone.io/docs/api/operation/describe_index/"
                    )
                )
            )

    def delete_index(self, name: str, timeout: Optional[int] = None):
        """Deletes a Pinecone index.

        Deleting an index is an irreversible operation. All data in the index will be lost.
        When you use this command, a request is sent to the Pinecone control plane to delete
        the index, but the termination is not synchronous because resources take a few moments to
        be released.
        You can check the status of the index by calling the `describe_index()` command.
        With repeated polling of the describe_index command, you will see the index transition to a
        `Terminating` state before eventually resulting in a 404 after it has been removed.

        :param name: the name of the index.
        :type name: str
        :param timeout: Number of seconds to poll status checking whether the index has been deleted. If None,
            wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :type timeout: int, optional
        """
        api_instance = self.index_api
        api_instance.delete_index(name)
        self.index_host_store.delete_host(self.config, name)

        def get_remaining():
            return name in self.list_indexes().names()

        if timeout == -1:
            return

        if timeout is None:
            while get_remaining():
                time.sleep(5)
        else:
            while get_remaining() and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the list_indexes API ({}) to confirm if index is deleted".format(
                        "https://www.pinecone.io/docs/api/operation/list_indexes/"
                    )
                )
            )

    def list_indexes(self) -> IndexList:
        """Lists all indexes.

        The results include a description of all indexes in your project, including the
        index name, dimension, metric, status, and spec.

        :return: Returns an `IndexList` object, which is iterable and contains a
            list of `IndexModel` objects. It also has a convenience method `names()`
            which returns a list of index names.
        ```python
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone()

        index_name = "my-index"
        if index_name not in client.list_indexes().names():
            print("Index does not exist, creating...")
            client.create_index(
                name=index_name,
                dimension=768,
                metric="cosine",
                spec=ServerlessSpec(cloud="aws", region="us-west-2")
            )
        ```

        You can also use the `list_indexes()` method to iterate over all indexes in your project
        and get other information besides just names.

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for index in client.list_indexes():
            print(index.name)
            print(index.dimension)
            print(index.metric)
            print(index.status)
            print(index.host)
            print(index.spec)
        ```

        """
        response = self.index_api.list_indexes()
        return IndexList(response)

    def describe_index(self, name: str):
        """Describes a Pinecone index.

        :param name: the name of the index to describe.
        :return: Returns an `IndexModel` object
          which gives access to properties such as the
          index name, dimension, metric, host url, status,
          and spec.

        ### Getting your index host url

        In a real production situation, you probably want to
        store the host url in an environment variable so you
        don't have to call describe_index and re-fetch it
        every time you want to use the index. But this example
        shows how to get the value from the API using describe_index.
        ```python
        from pinecone import Pinecone, Index

        client = Pinecone()

        description = client.describe_index("my-index")

        host = description.host
        print(f"Your index is hosted at {description.host}")

        index = client.Index(name="my-index", host=host)
        index.upsert(vectors=[...])
        ```
        """
        api_instance = self.index_api
        description = api_instance.describe_index(name)
        host = description.host
        self.index_host_store.set_host(self.config, name, host)

        return IndexModel(description)

    def has_index(self, name: str) -> bool:
        """Checks if a Pinecone index exists.

        :param name: The name of the index to check for existence.
        :return: Returns `True` if the index exists, `False` otherwise.

        ### Example Usage

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        pc = Pinecone(api_key=api_key)

        if pc.has_index("my-index-name"):
            print("The index exists")
        else:
            print("The index does not exist")
        ```
        """

        if name in self.list_indexes().names():
            return True
        else:
            return False

    def configure_index(
        self,
        name: str,
        replicas: Optional[int] = None,
        pod_type: Optional[str] = None,
        deletion_protection: Optional[Literal["enabled", "disabled"]] = None,
    ):
        """This method is used to scale configuration fields for your pod-based Pinecone index.

        :param: name: the name of the Index
        :param: replicas: the desired number of replicas, lowest value is 0.
        :param: pod_type: the new pod_type for the index.
            To learn more about the
            available pod types, please see [Understanding Indexes](https://docs.pinecone.io/docs/indexes)


        ```python
        from pinecone import Pinecone

        client = Pinecone()

        # Make a configuration change
        client.configure_index(name="my-index", replicas=4)

        # Call describe_index to see the index status as the
        # change is applied.
        client.describe_index("my-index")
        ```

        """
        api_instance = self.index_api

        if deletion_protection is None:
            description = self.describe_index(name=name)
            dp = DeletionProtection(description.deletion_protection)
        elif deletion_protection in ["enabled", "disabled"]:
            dp = DeletionProtection(deletion_protection)
        else:
            raise ValueError("deletion_protection must be either 'enabled' or 'disabled'")

        pod_config_args: Dict[str, Any] = {}
        if pod_type:
            pod_config_args.update(pod_type=pod_type)
        if replicas:
            pod_config_args.update(replicas=replicas)

        if pod_config_args != {}:
            spec = ConfigureIndexRequestSpec(pod=ConfigureIndexRequestSpecPod(**pod_config_args))
            req = ConfigureIndexRequest(deletion_protection=dp, spec=spec)
        else:
            req = ConfigureIndexRequest(deletion_protection=dp)

        api_instance.configure_index(name, configure_index_request=req)

    def create_collection(self, name: str, source: str):
        """Create a collection from a pod-based index

        :param name: Name of the collection
        :param source: Name of the source index
        """
        api_instance = self.index_api
        api_instance.create_collection(
            create_collection_request=CreateCollectionRequest(name=name, source=source)
        )

    def list_collections(self) -> CollectionList:
        """List all collections

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for collection in client.list_collections():
            print(collection.name)
            print(collection.source)

        # You can also iterate
        # specifically over the collection
        # names with the .names() helper.
        collection_name="my_collection"
        for collection_name in client.list_collections().names():
            print(collection_name)
        ```
        """
        api_instance = self.index_api
        response = api_instance.list_collections()
        return CollectionList(response)

    def delete_collection(self, name: str):
        """Deletes a collection.

        :param: name: The name of the collection

        Deleting a collection is an irreversible operation. All data
        in the collection will be lost.

        This method tells Pinecone you would like to delete a collection,
        but it takes a few moments to complete the operation. Use the
        `describe_collection()` method to confirm that the collection
        has been deleted.
        """
        api_instance = self.index_api
        api_instance.delete_collection(name)

    def describe_collection(self, name: str):
        """Describes a collection.
        :param: name: The name of the collection
        :return: Description of the collection

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        description = client.describe_collection("my_collection")
        print(description.name)
        print(description.source)
        print(description.status)
        print(description.size)
        ```
        """
        api_instance = self.index_api
        return api_instance.describe_collection(name).to_dict()

    def _get_status(self, name: str):
        api_instance = self.index_api
        response = api_instance.describe_index(name)
        return response["status"]

    @staticmethod
    def from_texts(*args, **kwargs):
        raise AttributeError(_build_langchain_attribute_error_message("from_texts"))

    @staticmethod
    def from_documents(*args, **kwargs):
        raise AttributeError(_build_langchain_attribute_error_message("from_documents"))

    def Index(self, name: str = "", host: str = "", **kwargs):
        """
        Target an index for data operations.
        ### Target an index by host url

        In production situations, you want to upsert or query your data as quickly
        as possible. If you know in advance the host url of your index, you can
        eliminate a round trip to the Pinecone control plane by specifying the
        host of the index.

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        index_host = os.environ.get("PINECONE_INDEX_HOST")

        pc = Pinecone(api_key=api_key)
        index = pc.Index(host=index_host)

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        To find your host url, you can use the Pinecone control plane to describe
        the index. The host url is returned in the response. Alternatively, the
        host is displayed in the Pinecone web console.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(
            api_key=os.environ.get("PINECONE_API_KEY")
        )

        host = pc.describe_index('index-name').host
        ```

        ### Target an index by name (not recommended for production)

        For more casual usage, such as when you are exploring Pinecone
        in a notebook setting, you can also target an index by name. If you use this
        approach, the client may need to perform an extra call to the Pinecone control
        plane to fetch the index host url on your behalf.

        The client will cache the index host for future use whenever it is seen, so you
        will incur the overhead of only one call. But this approach is not
        recommended for production usage.
        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        api_key = os.environ.get("PINECONE_API_KEY")

        pc = Pinecone(api_key=api_key)
        pc.create_index(
            name='my-index',
            dimension=1536,
            metric='cosine',
            spec=ServerlessSpec(cloud='aws', region='us-west-2')
        )
        index = pc.Index('my-index')

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        Arguments:
            name: The name of the index to target. If you specify the name of the index, the client will
                fetch the host url from the Pinecone control plane.
            host: The host url of the index to target. If you specify the host url, the client will use
                the host url directly without making any additional calls to the control plane.
            pool_threads: The number of threads to use when making parallel requests by calling index methods with optional kwarg async_req=True, or using methods that make use of parallelism automatically such as query_namespaces(). Default: 1
            connection_pool_maxsize: The maximum number of connections to keep in the connection pool. Default: 5 * multiprocessing.cpu_count()
        """
        if name == "" and host == "":
            raise ValueError("Either name or host must be specified")

        pt = kwargs.pop("pool_threads", None) or self.pool_threads
        api_key = self.config.api_key
        openapi_config = self.openapi_config

        if host != "":
            # Use host url if it is provided
            index_host = normalize_host(host)
        else:
            # Otherwise, get host url from describe_index using the index name
            index_host = self.index_host_store.get_host(self.index_api, self.config, name)

        return Index(
            host=index_host,
            api_key=api_key,
            pool_threads=pt,
            openapi_config=openapi_config,
            source_tag=self.config.source_tag,
            **kwargs,
        )
The `Pinecone` class is the main entry point for interacting with Pinecone via this Python SDK.
It is used to create, delete, and manage your indexes and collections.
Parameters
- api_key: The API key to use for authentication. If not passed via kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`.
- host: The control plane host to connect to.
- proxy_url: The URL of the proxy to use for the connection. Default: `None`
- proxy_headers: Additional headers to pass to the proxy. Use this if your proxy setup requires authentication. Default: `{}`
- ssl_ca_certs: The path to the SSL CA certificate bundle to use for the connection. This path should point to a file in PEM format. Default: `None`
- ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True`
- config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored.
- additional_headers: Additional headers to pass to the API. Default: `{}`
- pool_threads: The number of threads to use for the connection pool. Default: `1`
- index_api: An instance of `pinecone.core.openapi.control.api.manage_indexes_api.ManageIndexesApi`. If passed, the `host` parameter will be ignored.
Configuration with environment variables
If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable `PINECONE_API_KEY`.

```python
from pinecone import Pinecone

pc = Pinecone()
```
Configuration with keyword arguments
If you prefer being more explicit in your code, you can also pass the API key as a keyword argument.

```python
import os
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
```
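The lookup order described above (explicit keyword argument first, then the environment) can be sketched with the standard library alone. This is a minimal illustration, not the SDK's implementation: `resolve_api_key` is a hypothetical helper, and the `ValueError` it raises is an assumption rather than the exception type the client actually uses.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Hypothetical helper: prefer the explicit argument, then PINECONE_API_KEY."""
    key = api_key or environ.get("PINECONE_API_KEY")
    if not key:
        # Assumption: the real client raises its own configuration error here.
        raise ValueError("Pass api_key or set the PINECONE_API_KEY environment variable.")
    return key


# An explicit keyword argument takes precedence over the environment.
print(resolve_api_key("explicit-key", environ={"PINECONE_API_KEY": "env-key"}))
# With no argument, the environment variable is used.
print(resolve_api_key(environ={"PINECONE_API_KEY": "env-key"}))
```

Failing early with a clear message like this can be easier to debug than an authentication error on the first request.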
Environment variables
The Pinecone client supports the following environment variables:
- `PINECONE_API_KEY`: The API key to use for authentication. It is read when `api_key` is not passed as a keyword argument.
- `PINECONE_DEBUG_CURL`: When troubleshooting, it can be very useful to run curl commands against the control plane API to see exactly what data is being sent and received without all the abstractions and transformations applied by the Python SDK. If you set this environment variable to `true`, the Pinecone client will use request parameters to print out an equivalent curl command that you can run yourself or share with Pinecone support. **Be very careful with this option, as it will print out your API key**, which forms part of a required authentication header. Default: `false`
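Because the printed curl command contains the API key, it may be worth masking the key before pasting the output into a ticket or chat. The sketch below is one way to do that; `redact_api_key` is a hypothetical helper, and the sample command string is made up for illustration and is not the client's actual debug output.

```python
def redact_api_key(text: str, api_key: str, visible: int = 4) -> str:
    """Hypothetical helper: mask every occurrence of api_key in text."""
    if not api_key:
        return text
    # Keep the first few characters so the key can still be identified.
    masked = api_key[:visible] + "*" * max(len(api_key) - visible, 0)
    return text.replace(api_key, masked)


# Made-up example line; the real debug output may be formatted differently.
sample = "curl -X GET 'https://api.pinecone.io/indexes' -H 'Api-Key: abcd1234efgh'"
print(redact_api_key(sample, "abcd1234efgh"))
```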
Proxy configuration
If your network setup requires you to interact with Pinecone via a proxy, you will need
to pass additional configuration using optional keyword parameters. These optional parameters
are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to
make HTTP requests. You may find it helpful to refer to the
[urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies)
while troubleshooting these settings.

Here is a basic example:

```python
from pinecone import Pinecone

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com'
)

pc.list_indexes()
```
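A malformed proxy URL tends to surface as a confusing connection error, so a quick sanity check before constructing the client can help. The following is a small sketch using only `urllib.parse`; `check_proxy_url` is a hypothetical helper, and restricting the scheme to `http` and `https` is an assumption about typical setups, not a rule stated by the SDK.

```python
from urllib.parse import urlsplit


def check_proxy_url(proxy_url: str) -> str:
    """Hypothetical helper: return proxy_url unchanged if it looks usable."""
    parts = urlsplit(proxy_url)
    # Assumption: only plain HTTP and HTTPS proxies are expected here.
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"proxy_url needs an http:// or https:// scheme: {proxy_url!r}")
    if not parts.hostname:
        raise ValueError(f"proxy_url is missing a host: {proxy_url!r}")
    return proxy_url


print(check_proxy_url("https://your-proxy.com"))
```

The returned value can then be passed as `proxy_url` to the `Pinecone` constructor.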
If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter.

```python
from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password')
)

pc.list_indexes()
```
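To see what ends up in `proxy_headers`, the basic-auth header can be built by hand with the standard library. This sketch mirrors what `make_headers(proxy_basic_auth=...)` is understood to produce (a `proxy-authorization` header carrying the Base64-encoded credentials); `basic_proxy_auth_header` is a hypothetical stand-in, not part of urllib3 or the Pinecone SDK.

```python
import base64
from typing import Dict


def basic_proxy_auth_header(credentials: str) -> Dict[str, str]:
    """Hypothetical stand-in: build a Basic proxy auth header from 'user:pass'."""
    token = base64.b64encode(credentials.encode("latin-1")).decode("ascii")
    return {"proxy-authorization": f"Basic {token}"}


print(basic_proxy_auth_header("username:password"))
# {'proxy-authorization': 'Basic dXNlcm5hbWU6cGFzc3dvcmQ='}
```

Base64 is an encoding, not encryption, so these credentials are only as private as the connection to the proxy.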
Using proxies with self-signed certificates
By default the Pinecone Python client will perform SSL certificate verification
using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package.
If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate
in PEM format using the `ssl_ca_certs` parameter.

```python
from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem'
)

pc.list_indexes()
```
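Since `ssl_ca_certs` must point at a PEM file, a rough check that the file exists and contains at least one certificate block can catch path mistakes early. The sketch below only counts PEM header lines and does not validate the certificates; `count_pem_certificates` is a hypothetical helper, and the demo writes placeholder text rather than real certificates.

```python
import tempfile
from pathlib import Path

PEM_MARKER = "-----BEGIN CERTIFICATE-----"


def count_pem_certificates(path: str) -> int:
    """Hypothetical helper: count PEM certificate blocks in a file."""
    text = Path(path).read_text(encoding="ascii", errors="replace")
    return text.count(PEM_MARKER)


# Demo with a throwaway file holding placeholder (not real) certificate bodies.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "cert-bundle.pem"
    block = f"{PEM_MARKER}\nplaceholder\n-----END CERTIFICATE-----\n"
    bundle.write_text(block * 2, encoding="ascii")
    print(count_pem_certificates(str(bundle)))
```

A result of `0` suggests the file is not in PEM format (for example, a DER-encoded certificate) and would need converting first.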
Disabling SSL verification
If you would like to disable SSL verification, you can pass the ssl_verify
parameter with a value of False
. We do not recommend going to production with SSL verification disabled.
from pinecone import Pinecone
import urllib3 import make_headers
pc = Pinecone(
api_key='YOUR_API_KEY',
proxy_url='https://your-proxy.com',
proxy_headers=make_headers(proxy_basic_auth='username:password'),
ssl_ca_certs='path/to/cert-bundle.pem',
ssl_verify=False
)
pc.list_indexes()
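The proxy options above are ordinary keyword arguments, so they can be assembled from deployment settings before the client is constructed. The following is a minimal sketch, not part of the SDK: the helper name and the environment variable names `PINECONE_PROXY_URL` and `PINECONE_CA_BUNDLE` are assumptions made for illustration, and the client does not read them on its own.

```python
import os
from typing import Any, Dict, Mapping, Optional


def proxy_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect optional proxy settings into keyword arguments for Pinecone(...).

    PINECONE_PROXY_URL and PINECONE_CA_BUNDLE are hypothetical variable
    names chosen for this sketch; the client does not read them itself.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}
    # Only include a setting when it is present and non-empty.
    if env.get("PINECONE_PROXY_URL"):
        kwargs["proxy_url"] = env["PINECONE_PROXY_URL"]
    if env.get("PINECONE_CA_BUNDLE"):
        kwargs["ssl_ca_certs"] = env["PINECONE_CA_BUNDLE"]
    return kwargs
```

The result can then be unpacked into the constructor, for example `Pinecone(api_key='YOUR_API_KEY', **proxy_kwargs_from_env())`. When neither variable is set the dictionary is empty and the client is built without proxy settings.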
251 def create_index( 252 self, 253 name: str, 254 dimension: int, 255 spec: Union[Dict, ServerlessSpec, PodSpec], 256 metric: Optional[str] = "cosine", 257 timeout: Optional[int] = None, 258 deletion_protection: Optional[Literal["enabled", "disabled"]] = "disabled", 259 ): 260 """Creates a Pinecone index. 261 262 :param name: The name of the index to create. Must be unique within your project and 263 cannot be changed once created. Allowed characters are lowercase letters, numbers, 264 and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters. 265 :type name: str 266 :param dimension: The dimension of vectors that will be inserted in the index. This should 267 match the dimension of the embeddings you will be inserting. For example, if you are using 268 OpenAI's CLIP model, you should use `dimension=1536`. 269 :type dimension: int 270 :param metric: Type of metric used in the vector index when querying, one of `{"cosine", "dotproduct", "euclidean"}`. Defaults to `"cosine"`. 271 Defaults to `"cosine"`. 272 :type metric: str, optional 273 :param spec: A dictionary containing configurations describing how the index should be deployed. For serverless indexes, 274 specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection. 275 :type spec: Dict 276 :type timeout: int, optional 277 :param timeout: Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds; 278 if -1, return immediately and do not wait. Default: None 279 :param deletion_protection: If enabled, the index cannot be deleted. If disabled, the index can be deleted. 
Default: "disabled" 280 281 ### Creating a serverless index 282 283 ```python 284 import os 285 from pinecone import Pinecone, ServerlessSpec 286 287 client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY")) 288 289 client.create_index( 290 name="my_index", 291 dimension=1536, 292 metric="cosine", 293 spec=ServerlessSpec(cloud="aws", region="us-west-2"), 294 deletion_protection="enabled" 295 ) 296 ``` 297 298 ### Creating a pod index 299 300 ```python 301 import os 302 from pinecone import Pinecone, PodSpec 303 304 client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY")) 305 306 client.create_index( 307 name="my_index", 308 dimension=1536, 309 metric="cosine", 310 spec=PodSpec( 311 environment="us-east1-gcp", 312 pod_type="p1.x1" 313 ), 314 deletion_protection="enabled" 315 ) 316 ``` 317 """ 318 319 api_instance = self.index_api 320 321 def _parse_non_empty_args(args: List[Tuple[str, Any]]) -> Dict[str, Any]: 322 return {arg_name: val for arg_name, val in args if val is not None} 323 324 if deletion_protection in ["enabled", "disabled"]: 325 dp = DeletionProtection(deletion_protection) 326 else: 327 raise ValueError("deletion_protection must be either 'enabled' or 'disabled'") 328 329 if isinstance(spec, dict): 330 if "serverless" in spec: 331 index_spec = IndexSpec(serverless=ServerlessSpecModel(**spec["serverless"])) 332 elif "pod" in spec: 333 args_dict = _parse_non_empty_args( 334 [ 335 ("environment", spec["pod"].get("environment")), 336 ("metadata_config", spec["pod"].get("metadata_config")), 337 ("replicas", spec["pod"].get("replicas")), 338 ("shards", spec["pod"].get("shards")), 339 ("pods", spec["pod"].get("pods")), 340 ("source_collection", spec["pod"].get("source_collection")), 341 ] 342 ) 343 if args_dict.get("metadata_config"): 344 args_dict["metadata_config"] = PodSpecMetadataConfig( 345 indexed=args_dict["metadata_config"].get("indexed", None) 346 ) 347 index_spec = IndexSpec(pod=PodSpecModel(**args_dict)) 348 else: 349 raise 
ValueError("spec must contain either 'serverless' or 'pod' key") 350 elif isinstance(spec, ServerlessSpec): 351 index_spec = IndexSpec( 352 serverless=ServerlessSpecModel(cloud=spec.cloud, region=spec.region) 353 ) 354 elif isinstance(spec, PodSpec): 355 args_dict = _parse_non_empty_args( 356 [ 357 ("replicas", spec.replicas), 358 ("shards", spec.shards), 359 ("pods", spec.pods), 360 ("source_collection", spec.source_collection), 361 ] 362 ) 363 if spec.metadata_config: 364 args_dict["metadata_config"] = PodSpecMetadataConfig( 365 indexed=spec.metadata_config.get("indexed", None) 366 ) 367 368 index_spec = IndexSpec( 369 pod=PodSpecModel(environment=spec.environment, pod_type=spec.pod_type, **args_dict) 370 ) 371 else: 372 raise TypeError("spec must be of type dict, ServerlessSpec, or PodSpec") 373 374 api_instance.create_index( 375 create_index_request=CreateIndexRequest( 376 name=name, 377 dimension=dimension, 378 metric=metric, 379 spec=index_spec, 380 deletion_protection=dp, 381 ) 382 ) 383 384 def is_ready(): 385 status = self._get_status(name) 386 ready = status["ready"] 387 return ready 388 389 if timeout == -1: 390 return 391 if timeout is None: 392 while not is_ready(): 393 time.sleep(5) 394 else: 395 while (not is_ready()) and timeout >= 0: 396 time.sleep(5) 397 timeout -= 5 398 if timeout and timeout < 0: 399 raise ( 400 TimeoutError( 401 "Please call the describe_index API ({}) to confirm index status.".format( 402 "https://www.pinecone.io/docs/api/operation/describe_index/" 403 ) 404 ) 405 )
Creates a Pinecone index.
Parameters
- name: The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.
- dimension: The dimension of vectors that will be inserted in the index. This should match the dimension of the embeddings you will be inserting. For example, if you are using OpenAI's text-embedding-ada-002 model, you should use dimension=1536.
- metric: Type of metric used in the vector index when querying, one of {"cosine", "dotproduct", "euclidean"}. Defaults to "cosine".
- spec: A dictionary, ServerlessSpec, or PodSpec describing how the index should be deployed. For serverless indexes, specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection.
- timeout: Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait. Default: None
- deletion_protection: If enabled, the index cannot be deleted. If disabled, the index can be deleted. Default: "disabled"
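The naming rule for the name parameter can be checked locally before a request is sent. This is a minimal sketch of such a check based only on the rule as stated above (lowercase letters, numbers, and hyphens, no leading or trailing hyphen, at most 45 characters); `is_valid_index_name` is a hypothetical helper, not an SDK function, and the Pinecone API remains the authority on what it accepts.

```python
import re

# Lowercase letters, digits, and hyphens; the first and last character
# must be a letter or digit (so no leading or trailing hyphen).
_INDEX_NAME_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")
MAX_INDEX_NAME_LENGTH = 45


def is_valid_index_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rule."""
    if len(name) > MAX_INDEX_NAME_LENGTH:
        return False
    return _INDEX_NAME_RE.fullmatch(name) is not None
```

Under this rule a name such as "my-index" passes, while a name containing an underscore or an uppercase letter does not.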
Creating a serverless index
import os
from pinecone import Pinecone, ServerlessSpec
client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
client.create_index(
name="my-index",
dimension=1536,
metric="cosine",
spec=ServerlessSpec(cloud="aws", region="us-west-2"),
deletion_protection="enabled"
)
Creating a pod index
import os
from pinecone import Pinecone, PodSpec
client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
client.create_index(
name="my-index",
dimension=1536,
metric="cosine",
spec=PodSpec(
environment="us-east1-gcp",
pod_type="p1.x1"
),
deletion_protection="enabled"
)
    def delete_index(self, name: str, timeout: Optional[int] = None):
        """Deletes a Pinecone index.

        Deleting an index is an irreversible operation. All data in the index will be lost.
        When you use this command, a request is sent to the Pinecone control plane to delete
        the index, but the termination is not synchronous because resources take a few moments to
        be released.

        You can check the status of the index by calling the `describe_index()` command.
        With repeated polling of the describe_index command, you will see the index transition to a
        `Terminating` state before eventually resulting in a 404 after it has been removed.

        :param name: the name of the index.
        :type name: str
        :param timeout: Number of seconds to poll status checking whether the index has been deleted. If None,
            wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :type timeout: int, optional
        """
        api_instance = self.index_api
        api_instance.delete_index(name)
        self.index_host_store.delete_host(self.config, name)

        def get_remaining():
            return name in self.list_indexes().names()

        if timeout == -1:
            return

        if timeout is None:
            while get_remaining():
                time.sleep(5)
        else:
            while get_remaining() and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the list_indexes API ({}) to confirm if index is deleted".format(
                        "https://www.pinecone.io/docs/api/operation/list_indexes/"
                    )
                )
            )
Deletes a Pinecone index.
Deleting an index is an irreversible operation. All data in the index will be lost. When you use this command, a request is sent to the Pinecone control plane to delete the index, but the termination is not synchronous because resources take a few moments to be released.
You can check the status of the index by calling the describe_index() command. With repeated polling of the describe_index command, you will see the index transition to a Terminating state before eventually resulting in a 404 after it has been removed.
Parameters
- name: the name of the index.
- timeout: Number of seconds to poll status checking whether the index has been deleted. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait. Default: None
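The three timeout modes described above (None, a non-negative number, and -1) can be illustrated with a small standalone polling helper. This is a simplified sketch of the convention, not the client's own implementation: `wait_until`, its `poll_interval` argument, and the injectable `sleep` function are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def wait_until(
    done: Callable[[], bool],
    timeout: Optional[int] = None,
    poll_interval: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll `done()` following the timeout convention described above."""
    if timeout == -1:
        return  # caller asked to return immediately without waiting
    if timeout is None:
        while not done():
            sleep(poll_interval)  # wait indefinitely
        return
    remaining = timeout
    while not done():
        if remaining < 0:
            # The time budget is used up and the condition still does not hold.
            raise TimeoutError("operation did not complete within the timeout")
        sleep(poll_interval)
        remaining -= poll_interval
```

Because the helper sleeps in fixed steps, the effective wait is only accurate to within one polling interval, which is also true of any loop that checks status every few seconds.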
452 def list_indexes(self) -> IndexList: 453 """Lists all indexes. 454 455 The results include a description of all indexes in your project, including the 456 index name, dimension, metric, status, and spec. 457 458 :return: Returns an `IndexList` object, which is iterable and contains a 459 list of `IndexModel` objects. It also has a convenience method `names()` 460 which returns a list of index names. 461 462 ```python 463 from pinecone import Pinecone 464 465 client = Pinecone() 466 467 index_name = "my_index" 468 if index_name not in client.list_indexes().names(): 469 print("Index does not exist, creating...") 470 client.create_index( 471 name=index_name, 472 dimension=768, 473 metric="cosine", 474 spec=ServerlessSpec(cloud="aws", region="us-west-2") 475 ) 476 ``` 477 478 You can also use the `list_indexes()` method to iterate over all indexes in your project 479 and get other information besides just names. 480 481 ```python 482 from pinecone import Pinecone 483 484 client = Pinecone() 485 486 for index in client.list_indexes(): 487 print(index.name) 488 print(index.dimension) 489 print(index.metric) 490 print(index.status) 491 print(index.host) 492 print(index.spec) 493 ``` 494 495 """ 496 response = self.index_api.list_indexes() 497 return IndexList(response)
Lists all indexes.
The results include a description of all indexes in your project, including the index name, dimension, metric, status, and spec.
Returns
Returns an IndexList object, which is iterable and contains a list of IndexModel objects. It also has a convenience method names() which returns a list of index names.
from pinecone import Pinecone, ServerlessSpec
client = Pinecone()
index_name = "my-index"
if index_name not in client.list_indexes().names():
print("Index does not exist, creating...")
client.create_index(
name=index_name,
dimension=768,
metric="cosine",
spec=ServerlessSpec(cloud="aws", region="us-west-2")
)
You can also use the list_indexes() method to iterate over all indexes in your project and get other information besides just names.
from pinecone import Pinecone
client = Pinecone()
for index in client.list_indexes():
print(index.name)
print(index.dimension)
print(index.metric)
print(index.status)
print(index.host)
print(index.spec)
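Since each item yielded by list_indexes() exposes name and host, the iteration above can also be used to build a lookup table. A minimal sketch, assuming only those two attributes; `hosts_by_name` is a hypothetical helper and accepts any iterable of such objects.

```python
from typing import Any, Dict, Iterable


def hosts_by_name(indexes: Iterable[Any]) -> Dict[str, str]:
    """Map each index name to its host, given objects with .name and .host."""
    return {index.name: index.host for index in indexes}
```

With a configured client this would be called as `hosts_by_name(client.list_indexes())`.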
499 def describe_index(self, name: str): 500 """Describes a Pinecone index. 501 502 :param name: the name of the index to describe. 503 :return: Returns an `IndexModel` object 504 which gives access to properties such as the 505 index name, dimension, metric, host url, status, 506 and spec. 507 508 ### Getting your index host url 509 510 In a real production situation, you probably want to 511 store the host url in an environment variable so you 512 don't have to call describe_index and re-fetch it 513 every time you want to use the index. But this example 514 shows how to get the value from the API using describe_index. 515 516 ```python 517 from pinecone import Pinecone, Index 518 519 client = Pinecone() 520 521 description = client.describe_index("my_index") 522 523 host = description.host 524 print(f"Your index is hosted at {description.host}") 525 526 index = client.Index(name="my_index", host=host) 527 index.upsert(vectors=[...]) 528 ``` 529 """ 530 api_instance = self.index_api 531 description = api_instance.describe_index(name) 532 host = description.host 533 self.index_host_store.set_host(self.config, name, host) 534 535 return IndexModel(description)
Describes a Pinecone index.
Parameters
- name: the name of the index to describe.
Returns
Returns an IndexModel object which gives access to properties such as the index name, dimension, metric, host url, status, and spec.
Getting your index host url
In a real production situation, you probably want to store the host url in an environment variable so you don't have to call describe_index and re-fetch it every time you want to use the index. But this example shows how to get the value from the API using describe_index.
from pinecone import Pinecone, Index
client = Pinecone()
description = client.describe_index("my-index")
host = description.host
print(f"Your index is hosted at {description.host}")
index = client.Index(name="my-index", host=host)
index.upsert(vectors=[...])
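The advice above about keeping the host url in an environment variable can be combined with describe_index as a fallback. This is a sketch under stated assumptions: `resolve_index_host` is a hypothetical helper, the variable name `PINECONE_INDEX_HOST` is only a convention chosen here, and `client` is any object with a describe_index(name) method whose result has a host attribute.

```python
import os
from typing import Any, Mapping, Optional


def resolve_index_host(
    client: Any,
    index_name: str,
    env_var: str = "PINECONE_INDEX_HOST",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer a host stored in the environment; otherwise ask the control plane."""
    env = os.environ if env is None else env
    host = env.get(env_var)
    if host:
        return host  # no describe_index call needed
    return client.describe_index(index_name).host
```

The describe_index call is only made when the variable is unset or empty, so a correctly configured deployment never pays for the extra round trip.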
537 def has_index(self, name: str) -> bool: 538 """Checks if a Pinecone index exists. 539 540 :param name: The name of the index to check for existence. 541 :return: Returns `True` if the index exists, `False` otherwise. 542 543 ### Example Usage 544 545 ```python 546 import os 547 from pinecone import Pinecone 548 549 api_key = os.environ.get("PINECONE_API_KEY") 550 pc = Pinecone(api_key=api_key) 551 552 if pc.has_index("my_index_name"): 553 print("The index exists") 554 else: 555 print("The index does not exist") 556 ``` 557 """ 558 559 if name in self.list_indexes().names(): 560 return True 561 else: 562 return False
Checks if a Pinecone index exists.
Parameters
- name: The name of the index to check for existence.
Returns
Returns True if the index exists, False otherwise.
Example Usage
import os
from pinecone import Pinecone
api_key = os.environ.get("PINECONE_API_KEY")
pc = Pinecone(api_key=api_key)
if pc.has_index("my_index_name"):
print("The index exists")
else:
print("The index does not exist")
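A common use of has_index is to create an index only when it is missing. The following is a minimal sketch of that pattern; `ensure_index` is a hypothetical helper, `client` is any object providing has_index and create_index, and the check and the create are separate calls, so two processes running this at the same time could still both attempt to create the index.

```python
from typing import Any


def ensure_index(client: Any, name: str, **create_kwargs: Any) -> bool:
    """Create the index only if it does not exist yet.

    Returns True if create_index was called, False if the index already existed.
    """
    if client.has_index(name):
        return False
    client.create_index(name=name, **create_kwargs)
    return True
```

The keyword arguments are passed through unchanged, so a call might look like `ensure_index(pc, "my-index", dimension=1536, spec=ServerlessSpec(cloud="aws", region="us-west-2"))`.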
564 def configure_index( 565 self, 566 name: str, 567 replicas: Optional[int] = None, 568 pod_type: Optional[str] = None, 569 deletion_protection: Optional[Literal["enabled", "disabled"]] = None, 570 ): 571 """This method is used to scale configuration fields for your pod-based Pinecone index. 572 573 :param: name: the name of the Index 574 :param: replicas: the desired number of replicas, lowest value is 0. 575 :param: pod_type: the new pod_type for the index. To learn more about the 576 available pod types, please see [Understanding Indexes](https://docs.pinecone.io/docs/indexes) 577 578 579 ```python 580 from pinecone import Pinecone 581 582 client = Pinecone() 583 584 # Make a configuration change 585 client.configure_index(name="my_index", replicas=4) 586 587 # Call describe_index to see the index status as the 588 # change is applied. 589 client.describe_index("my_index") 590 ``` 591 592 """ 593 api_instance = self.index_api 594 595 if deletion_protection is None: 596 description = self.describe_index(name=name) 597 dp = DeletionProtection(description.deletion_protection) 598 elif deletion_protection in ["enabled", "disabled"]: 599 dp = DeletionProtection(deletion_protection) 600 else: 601 raise ValueError("deletion_protection must be either 'enabled' or 'disabled'") 602 603 pod_config_args: Dict[str, Any] = {} 604 if pod_type: 605 pod_config_args.update(pod_type=pod_type) 606 if replicas: 607 pod_config_args.update(replicas=replicas) 608 609 if pod_config_args != {}: 610 spec = ConfigureIndexRequestSpec(pod=ConfigureIndexRequestSpecPod(**pod_config_args)) 611 req = ConfigureIndexRequest(deletion_protection=dp, spec=spec) 612 else: 613 req = ConfigureIndexRequest(deletion_protection=dp) 614 615 api_instance.configure_index(name, configure_index_request=req)
This method is used to scale configuration fields for your pod-based Pinecone index.
Parameters
- name: the name of the Index
- replicas: the desired number of replicas, lowest value is 0.
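- pod_type: the new pod_type for the index. To learn more about the available pod types, please see Understanding Indexes (https://docs.pinecone.io/docs/indexes).
- deletion_protection: either "enabled" or "disabled". If omitted, the index's current setting is kept.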
from pinecone import Pinecone
client = Pinecone()
# Make a configuration change
client.configure_index(name="my-index", replicas=4)
# Call describe_index to see the index status as the
# change is applied.
client.describe_index("my-index")
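The source listing for configure_index only forwards pod_type and replicas when they are truthy. The standalone sketch below mirrors that selection logic so its effect is easy to see; `build_pod_config_args` is a hypothetical name used for illustration and is not an SDK function.

```python
from typing import Any, Dict, Optional


def build_pod_config_args(
    pod_type: Optional[str] = None,
    replicas: Optional[int] = None,
) -> Dict[str, Any]:
    """Keep only the pod settings that were actually provided."""
    args: Dict[str, Any] = {}
    if pod_type:
        args["pod_type"] = pod_type
    if replicas:
        # A truthiness check: None and 0 are both treated as "not provided".
        args["replicas"] = replicas
    return args
```

One consequence of the truthiness check, as the listing reads, is that replicas=0 is treated the same as omitting the argument, so it would not be included in the request.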
    def create_collection(self, name: str, source: str):
        """Create a collection from a pod-based index

        :param name: Name of the collection
        :param source: Name of the source index
        """
        api_instance = self.index_api
        api_instance.create_collection(
            create_collection_request=CreateCollectionRequest(name=name, source=source)
        )
Create a collection from a pod-based index
Parameters
- name: Name of the collection
- source: Name of the source index
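As an illustration of how the two parameters fit together, the sketch below derives a collection name from the source index name and passes both to create_collection. The helper `snapshot_index` and its "index name plus suffix" naming scheme are assumptions made for this example; `client` is any object with a create_collection(name=..., source=...) method.

```python
from typing import Any


def snapshot_index(client: Any, index_name: str, suffix: str = "backup") -> str:
    """Create a collection from `index_name` and return the collection's name."""
    # Hypothetical naming scheme: "<index name>-<suffix>".
    collection_name = f"{index_name}-{suffix}"
    client.create_collection(name=collection_name, source=index_name)
    return collection_name
```

With a configured client, `snapshot_index(client, "my-index")` would request a collection named "my-index-backup" whose source is the pod-based index "my-index".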
628 def list_collections(self) -> CollectionList: 629 """List all collections 630 631 ```python 632 from pinecone import Pinecone 633 634 client = Pinecone() 635 636 for collection in client.list_collections(): 637 print(collection.name) 638 print(collection.source) 639 640 # You can also iterate specifically over the collection 641 # names with the .names() helper. 642 collection_name="my_collection" 643 for collection_name in client.list_collections().names(): 644 print(collection_name) 645 ``` 646 """ 647 api_instance = self.index_api 648 response = api_instance.list_collections() 649 return CollectionList(response)
List all collections
from pinecone import Pinecone
client = Pinecone()
for collection in client.list_collections():
print(collection.name)
print(collection.source)
# You can also iterate specifically over the collection
# names with the .names() helper.
for collection_name in client.list_collections().names():
print(collection_name)
    def delete_collection(self, name: str):
        """Deletes a collection.

        :param: name: The name of the collection

        Deleting a collection is an irreversible operation. All data
        in the collection will be lost.

        This method tells Pinecone you would like to delete a collection,
        but it takes a few moments to complete the operation. Use the
        `describe_collection()` method to confirm that the collection
        has been deleted.
        """
        api_instance = self.index_api
        api_instance.delete_collection(name)
Deletes a collection.
Parameters
- name: The name of the collection
Deleting a collection is an irreversible operation. All data in the collection will be lost.
This method tells Pinecone you would like to delete a collection, but it takes a few moments to complete the operation. Use the describe_collection() method to confirm that the collection has been deleted.
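Because the deletion completes asynchronously, a caller that needs to know when the collection is gone has to poll. The sketch below does this by checking list_collections().names() rather than describe_collection(), which is a swapped-in alternative chosen to avoid depending on how a missing collection is reported. `delete_collection_and_wait`, `poll_interval`, and `max_polls` are hypothetical names introduced for this example.

```python
import time
from typing import Any, Callable


def delete_collection_and_wait(
    client: Any,
    name: str,
    poll_interval: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Request deletion, then poll until the name no longer appears.

    Returns True once the collection is gone, or False if it is still
    listed after `max_polls` checks.
    """
    client.delete_collection(name)
    for _ in range(max_polls):
        if name not in client.list_collections().names():
            return True
        sleep(poll_interval)
    return False
```

Returning False instead of raising leaves it to the caller to decide whether a slow deletion is an error.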
667 def describe_collection(self, name: str): 668 """Describes a collection. 669 :param: The name of the collection 670 :return: Description of the collection 671 672 ```python 673 from pinecone import Pinecone 674 675 client = Pinecone() 676 677 description = client.describe_collection("my_collection") 678 print(description.name) 679 print(description.source) 680 print(description.status) 681 print(description.size) 682 ``` 683 """ 684 api_instance = self.index_api 685 return api_instance.describe_collection(name).to_dict()
Describes a collection.
Parameters
- name: The name of the collection
Returns
Description of the collection
from pinecone import Pinecone
client = Pinecone()
description = client.describe_collection("my_collection")
# describe_collection() returns a plain dict, so read fields by key
print(description.get("name"))
print(description.get("source"))
print(description.get("status"))
print(description.get("size"))
700 def Index(self, name: str = "", host: str = "", **kwargs): 701 """ 702 Target an index for data operations. 703 704 ### Target an index by host url 705 706 In production situations, you want to uspert or query your data as quickly 707 as possible. If you know in advance the host url of your index, you can 708 eliminate a round trip to the Pinecone control plane by specifying the 709 host of the index. 710 711 ```python 712 import os 713 from pinecone import Pinecone 714 715 api_key = os.environ.get("PINECONE_API_KEY") 716 index_host = os.environ.get("PINECONE_INDEX_HOST") 717 718 pc = Pinecone(api_key=api_key) 719 index = pc.Index(host=index_host) 720 721 # Now you're ready to perform data operations 722 index.query(vector=[...], top_k=10) 723 ``` 724 725 To find your host url, you can use the Pinecone control plane to describe 726 the index. The host url is returned in the response. Or, alternatively, the 727 host is displayed in the Pinecone web console. 728 729 ```python 730 import os 731 from pinecone import Pinecone 732 733 pc = Pinecone( 734 api_key=os.environ.get("PINECONE_API_KEY") 735 ) 736 737 host = pc.describe_index('index-name').host 738 ``` 739 740 ### Target an index by name (not recommended for production) 741 742 For more casual usage, such as when you are playing and exploring with Pinecone 743 in a notebook setting, you can also target an index by name. If you use this 744 approach, the client may need to perform an extra call to the Pinecone control 745 plane to get the host url on your behalf to get the index host. 746 747 The client will cache the index host for future use whenever it is seen, so you 748 will only incur the overhead of only one call. But this approach is not 749 recommended for production usage. 
750 751 ```python 752 import os 753 from pinecone import Pinecone, ServerlessSpec 754 755 api_key = os.environ.get("PINECONE_API_KEY") 756 757 pc = Pinecone(api_key=api_key) 758 pc.create_index( 759 name='my_index', 760 dimension=1536, 761 metric='cosine', 762 spec=ServerlessSpec(cloud='aws', region='us-west-2') 763 ) 764 index = pc.Index('my_index') 765 766 # Now you're ready to perform data operations 767 index.query(vector=[...], top_k=10) 768 ``` 769 770 Arguments: 771 name: The name of the index to target. If you specify the name of the index, the client will 772 fetch the host url from the Pinecone control plane. 773 host: The host url of the index to target. If you specify the host url, the client will use 774 the host url directly without making any additional calls to the control plane. 775 pool_threads: The number of threads to use when making parallel requests by calling index methods with optional kwarg async_req=True, or using methods that make use of parallelism automatically such as query_namespaces(). Default: 1 776 connection_pool_maxsize: The maximum number of connections to keep in the connection pool. Default: 5 * multiprocessing.cpu_count() 777 """ 778 if name == "" and host == "": 779 raise ValueError("Either name or host must be specified") 780 781 pt = kwargs.pop("pool_threads", None) or self.pool_threads 782 api_key = self.config.api_key 783 openapi_config = self.openapi_config 784 785 if host != "": 786 # Use host url if it is provided 787 index_host = normalize_host(host) 788 else: 789 # Otherwise, get host url from describe_index using the index name 790 index_host = self.index_host_store.get_host(self.index_api, self.config, name) 791 792 return Index( 793 host=index_host, 794 api_key=api_key, 795 pool_threads=pt, 796 openapi_config=openapi_config, 797 source_tag=self.config.source_tag, 798 **kwargs, 799 )
Target an index for data operations.
Target an index by host url
In production situations, you want to upsert or query your data as quickly as possible. If you know in advance the host url of your index, you can eliminate a round trip to the Pinecone control plane by specifying the host of the index.
import os
from pinecone import Pinecone
api_key = os.environ.get("PINECONE_API_KEY")
index_host = os.environ.get("PINECONE_INDEX_HOST")
pc = Pinecone(api_key=api_key)
index = pc.Index(host=index_host)
# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)
To find your host url, you can use the Pinecone control plane to describe the index. The host url is returned in the response. Or, alternatively, the host is displayed in the Pinecone web console.
import os
from pinecone import Pinecone
pc = Pinecone(
api_key=os.environ.get("PINECONE_API_KEY")
)
host = pc.describe_index('index-name').host
Target an index by name (not recommended for production)
For more casual usage, such as when you are playing and exploring with Pinecone in a notebook setting, you can also target an index by name. If you use this approach, the client may need to perform an extra call to the Pinecone control plane to get the host url on your behalf.
The client will cache the index host for future use whenever it is seen, so you will incur the overhead of only one call. But this approach is not recommended for production usage.
import os
from pinecone import Pinecone, ServerlessSpec
api_key = os.environ.get("PINECONE_API_KEY")
pc = Pinecone(api_key=api_key)
pc.create_index(
name='my-index',
dimension=1536,
metric='cosine',
spec=ServerlessSpec(cloud='aws', region='us-west-2')
)
index = pc.Index('my-index')
# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)
Arguments:
- name: The name of the index to target. If you specify the name of the index, the client will fetch the host url from the Pinecone control plane.
- host: The host url of the index to target. If you specify the host url, the client will use the host url directly without making any additional calls to the control plane.
- pool_threads: The number of threads to use when making parallel requests by calling index methods with optional kwarg async_req=True, or using methods that make use of parallelism automatically such as query_namespaces(). Default: 1
- connection_pool_maxsize: The maximum number of connections to keep in the connection pool. Default: 5 * multiprocessing.cpu_count()
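The "look up once, then reuse" behaviour described for targeting an index by name can be illustrated with a small cache. This is a sketch of the general idea only, not the client's internal host store: `HostCache` and its `lookup` callable are hypothetical and stand in for whatever performs the control plane call.

```python
from typing import Callable, Dict


class HostCache:
    """Remember the host for each index name after the first lookup."""

    def __init__(self, lookup: Callable[[str], str]) -> None:
        self._lookup = lookup  # e.g. a function that calls describe_index
        self._hosts: Dict[str, str] = {}

    def get(self, name: str) -> str:
        if name not in self._hosts:
            # Only the first request for a given name triggers a lookup.
            self._hosts[name] = self._lookup(name)
        return self._hosts[name]
```

Passing host directly skips even that first lookup, which is why it is the recommended approach for production.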