pinecone.control.pinecone
import time
import warnings
from typing import Optional, Dict, Any, Union, List, cast, NamedTuple

from .index_host_store import IndexHostStore

from pinecone.config import PineconeConfig, Config, ConfigBuilder

from pinecone.core.client.api.manage_indexes_api import ManageIndexesApi
from pinecone.utils import normalize_host, setup_openapi_client
from pinecone.core.client.models import (
    CreateCollectionRequest,
    CreateIndexRequest,
    ConfigureIndexRequest,
    ConfigureIndexRequestSpec,
    ConfigureIndexRequestSpecPod
)
from pinecone.models import ServerlessSpec, PodSpec, IndexList, CollectionList

from pinecone.data import Index

class Pinecone:

    def __init__(
        self,
        api_key: Optional[str] = None,
        host: Optional[str] = None,
        proxy_url: Optional[str] = None,
        proxy_headers: Optional[Dict[str, str]] = None,
        ssl_ca_certs: Optional[str] = None,
        ssl_verify: Optional[bool] = None,
        config: Optional[Config] = None,
        additional_headers: Optional[Dict[str, str]] = {},
        pool_threads: Optional[int] = 1,
        index_api: Optional[ManageIndexesApi] = None,
        **kwargs,
    ):
        """
        The `Pinecone` class is the main entry point for interacting with Pinecone via this Python SDK.
        It is used to create, delete, and manage your indexes and collections.

        :param api_key: The API key to use for authentication. If not passed via kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`.
        :type api_key: str, optional
        :param host: The control plane host to connect to.
        :type host: str, optional
        :param proxy_url: The URL of the proxy to use for the connection. Default: `None`
        :type proxy_url: str, optional
        :param proxy_headers: Additional headers to pass to the proxy. Use this if your proxy setup requires authentication. Default: `{}`
        :type proxy_headers: Dict[str, str], optional
        :param ssl_ca_certs: The path to the SSL CA certificate bundle to use for the connection.
            This path should point to a file in PEM format. Default: `None`
        :type ssl_ca_certs: str, optional
        :param ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True`
        :type ssl_verify: bool, optional
        :param config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored.
        :type config: pinecone.config.Config, optional
        :param additional_headers: Additional headers to pass to the API. Default: `{}`
        :type additional_headers: Dict[str, str], optional
        :param pool_threads: The number of threads to use for the connection pool. Default: `1`
        :type pool_threads: int, optional
        :param index_api: An instance of `pinecone.core.client.api.manage_indexes_api.ManageIndexesApi`. If passed, the `host` parameter will be ignored.
        :type index_api: pinecone.core.client.api.manage_indexes_api.ManageIndexesApi, optional


        ### Configuration with environment variables

        If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable `PINECONE_API_KEY`.

        ```python
        from pinecone import Pinecone

        pc = Pinecone()
        ```

        ### Configuration with keyword arguments

        If you prefer being more explicit in your code, you can also pass the API key as a keyword argument.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))
        ```

        ### Environment variables

        The Pinecone client supports the following environment variables:

        - `PINECONE_API_KEY`: The API key to use for authentication. If not passed via
          kwarg, the API key will be read from the environment variable `PINECONE_API_KEY`.

        - `PINECONE_DEBUG_CURL`: When troubleshooting it can be very useful to run curl
          commands against the control plane API to see exactly what data is being sent
          and received without all the abstractions and transformations applied by the Python
          SDK. If you set this environment variable to `true`, the Pinecone client will use
          request parameters to print out an equivalent curl command that you can run yourself
          or share with Pinecone support. **Be very careful with this option, as it will print out
          your API key** which forms part of a required authentication header. Default: `false`

        ### Proxy configuration

        If your network setup requires you to interact with Pinecone via a proxy, you will need
        to pass additional configuration using optional keyword parameters. These optional parameters
        are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to
        make HTTP requests. You may find it helpful to refer to the
        [urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies)
        while troubleshooting these settings.

        Here is a basic example:

        ```python
        from pinecone import Pinecone

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com'
        )

        pc.list_indexes()
        ```

        If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password')
        )

        pc.list_indexes()
        ```

        ### Using proxies with self-signed certificates

        By default the Pinecone Python client will perform SSL certificate verification
        using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package.
        If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate
        in PEM format using the `ssl_ca_certs` parameter.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password'),
            ssl_ca_certs='path/to/cert-bundle.pem'
        )

        pc.list_indexes()
        ```

        ### Disabling SSL verification

        If you would like to disable SSL verification, you can pass the `ssl_verify`
        parameter with a value of `False`. We do not recommend going to production with SSL verification disabled.

        ```python
        from pinecone import Pinecone
        from urllib3 import make_headers

        pc = Pinecone(
            api_key='YOUR_API_KEY',
            proxy_url='https://your-proxy.com',
            proxy_headers=make_headers(proxy_basic_auth='username:password'),
            ssl_ca_certs='path/to/cert-bundle.pem',
            ssl_verify=False
        )

        pc.list_indexes()

        ```
        """
        if config:
            if not isinstance(config, Config):
                raise TypeError("config must be of type pinecone.config.Config")
            else:
                self.config = config
        else:
            self.config = PineconeConfig.build(
                api_key=api_key,
                host=host,
                additional_headers=additional_headers,
                proxy_url=proxy_url,
                proxy_headers=proxy_headers,
                ssl_ca_certs=ssl_ca_certs,
                ssl_verify=ssl_verify,
                **kwargs
            )

        if kwargs.get("openapi_config", None):
            warnings.warn("Passing openapi_config is deprecated and will be removed in a future release. Please pass settings such as proxy_url, proxy_headers, ssl_ca_certs, and ssl_verify directly to the Pinecone constructor as keyword arguments. See the README at https://github.com/pinecone-io/pinecone-python-client for examples.", DeprecationWarning)

        self.openapi_config = ConfigBuilder.build_openapi_config(self.config, **kwargs)
        self.pool_threads = pool_threads

        if index_api:
            self.index_api = index_api
        else:
            self.index_api = setup_openapi_client(ManageIndexesApi, self.config, self.openapi_config, pool_threads)

        self.index_host_store = IndexHostStore()
        """ @private """

    def create_index(
        self,
        name: str,
        dimension: int,
        spec: Union[Dict, ServerlessSpec, PodSpec],
        metric: Optional[str] = "cosine",
        timeout: Optional[int] = None,
    ):
        """Creates a Pinecone index.

        :param name: The name of the index to create. Must be unique within your project and
            cannot be changed once created.
            Allowed characters are lowercase letters, numbers,
            and hyphens and the name may not begin or end with hyphens. Maximum length is 45 characters.
        :type name: str
        :param dimension: The dimension of vectors that will be inserted in the index. This should
            match the dimension of the embeddings you will be inserting. For example, if you are using
            OpenAI's `text-embedding-ada-002` model, you should use `dimension=1536`.
        :type dimension: int
        :param metric: Type of metric used in the vector index when querying, one of `{"cosine", "dotproduct", "euclidean"}`. Defaults to `"cosine"`.
        :type metric: str, optional
        :param spec: A dictionary, `ServerlessSpec`, or `PodSpec` containing configurations describing how the index should be deployed. For serverless indexes,
            specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection.
        :type spec: Union[Dict, ServerlessSpec, PodSpec]
        :type timeout: int, optional
        :param timeout: Specify the number of seconds to wait until index gets ready. If None, wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait.
            Default: None

        ### Creating a serverless index

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-west-2")
        )
        ```

        ### Creating a pod index

        ```python
        import os
        from pinecone import Pinecone, PodSpec

        client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

        client.create_index(
            name="my-index",
            dimension=1536,
            metric="cosine",
            spec=PodSpec(
                environment="us-east1-gcp",
                pod_type="p1.x1"
            )
        )
        ```
        """

        api_instance = self.index_api

        if isinstance(spec, dict):
            api_instance.create_index(create_index_request=CreateIndexRequest(name=name, dimension=dimension, metric=metric, spec=spec))
        elif isinstance(spec, ServerlessSpec):
            api_instance.create_index(create_index_request=CreateIndexRequest(name=name, dimension=dimension, metric=metric, spec=spec.asdict()))
        elif isinstance(spec, PodSpec):
            api_instance.create_index(create_index_request=CreateIndexRequest(name=name, dimension=dimension, metric=metric, spec=spec.asdict()))
        else:
            raise TypeError("spec must be of type dict, ServerlessSpec, or PodSpec")

        def is_ready():
            status = self._get_status(name)
            ready = status["ready"]
            return ready

        if timeout == -1:
            return
        if timeout is None:
            while not is_ready():
                time.sleep(5)
        else:
            while (not is_ready()) and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the describe_index API ({}) to confirm index status.".format(
                        "https://www.pinecone.io/docs/api/operation/describe_index/"
                    )
                )
            )

    def delete_index(self, name: str, timeout: Optional[int] = None):
        """Deletes a Pinecone index.

        Deleting an index is an irreversible operation. All data in the index will be lost.
        When you use this command, a request is sent to the Pinecone control plane to delete
        the index, but the termination is not synchronous because resources take a few moments to
        be released.

        You can check the status of the index by calling the `describe_index()` command.
        With repeated polling of the describe_index command, you will see the index transition to a
        `Terminating` state before eventually resulting in a 404 after it has been removed.

        :param name: the name of the index.
        :type name: str
        :param timeout: Number of seconds to poll status checking whether the index has been deleted. If None,
            wait indefinitely; if >=0, time out after this many seconds;
            if -1, return immediately and do not wait. Default: None
        :type timeout: int, optional
        """
        api_instance = self.index_api
        api_instance.delete_index(name)
        self.index_host_store.delete_host(self.config, name)

        def get_remaining():
            return name in self.list_indexes().names()

        if timeout == -1:
            return

        if timeout is None:
            while get_remaining():
                time.sleep(5)
        else:
            while get_remaining() and timeout >= 0:
                time.sleep(5)
                timeout -= 5
        if timeout and timeout < 0:
            raise (
                TimeoutError(
                    "Please call the list_indexes API ({}) to confirm if index is deleted".format(
                        "https://www.pinecone.io/docs/api/operation/list_indexes/"
                    )
                )
            )

    def list_indexes(self) -> IndexList:
        """Lists all indexes.

        The results include a description of all indexes in your project, including the
        index name, dimension, metric, status, and spec.

        :return: Returns an `IndexList` object, which is iterable and contains a
            list of `IndexDescription` objects. It also has a convenience method `names()`
            which returns a list of index names.

        ```python
        from pinecone import Pinecone, ServerlessSpec

        client = Pinecone()

        index_name = "my-index"
        if index_name not in client.list_indexes().names():
            print("Index does not exist, creating...")
            client.create_index(
                name=index_name,
                dimension=768,
                metric="cosine",
                spec=ServerlessSpec(cloud="aws", region="us-west-2")
            )
        ```

        You can also use the `list_indexes()` method to iterate over all indexes in your project
        and get other information besides just names.

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for index in client.list_indexes():
            print(index.name)
            print(index.dimension)
            print(index.metric)
            print(index.status)
            print(index.host)
            print(index.spec)
        ```

        """
        response = self.index_api.list_indexes()
        return IndexList(response)

    def describe_index(self, name: str):
        """Describes a Pinecone index.

        :param name: the name of the index to describe.
        :return: Returns an `IndexDescription` object
            which gives access to properties such as the
            index name, dimension, metric, host url, status,
            and spec.

        ### Getting your index host url

        In a real production situation, you probably want to
        store the host url in an environment variable so you
        don't have to call describe_index and re-fetch it
        every time you want to use the index. But this example
        shows how to get the value from the API using describe_index.

        ```python
        from pinecone import Pinecone, Index

        client = Pinecone()

        description = client.describe_index("my-index")

        host = description.host
        print(f"Your index is hosted at {description.host}")

        index = client.Index(name="my-index", host=host)
        index.upsert(vectors=[...])
        ```
        """
        api_instance = self.index_api
        description = api_instance.describe_index(name)
        host = description.host
        self.index_host_store.set_host(self.config, name, host)

        return description

    def configure_index(self, name: str, replicas: Optional[int] = None, pod_type: Optional[str] = None):
        """This method is used to scale configuration fields for your pod-based Pinecone index.

        :param name: the name of the Index
        :param replicas: the desired number of replicas, lowest value is 0.
        :param pod_type: the new pod_type for the index. To learn more about the
            available pod types, please see [Understanding Indexes](https://docs.pinecone.io/docs/indexes)


        ```python
        from pinecone import Pinecone

        client = Pinecone()

        # Make a configuration change
        client.configure_index(name="my-index", replicas=4)

        # Call describe_index to see the index status as the
        # change is applied.
        client.describe_index("my-index")
        ```

        """
        api_instance = self.index_api
        config_args: Dict[str, Any] = {}
        if pod_type:
            config_args.update(pod_type=pod_type)
        if replicas:
            config_args.update(replicas=replicas)
        configure_index_request = ConfigureIndexRequest(
            spec=ConfigureIndexRequestSpec(
                pod=ConfigureIndexRequestSpecPod(**config_args)
            )
        )
        api_instance.configure_index(name, configure_index_request=configure_index_request)

    def create_collection(self, name: str, source: str):
        """Create a collection from a pod-based index

        :param name: Name of the collection
        :param source: Name of the source index
        """
        api_instance = self.index_api
        api_instance.create_collection(create_collection_request=CreateCollectionRequest(name=name, source=source))

    def list_collections(self) -> CollectionList:
        """List all collections

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for collection in client.list_collections():
            print(collection.name)
            print(collection.source)

        # You can also iterate specifically over the collection
        # names with the .names() helper.
        for collection_name in client.list_collections().names():
            print(collection_name)
        ```
        """
        api_instance = self.index_api
        response = api_instance.list_collections()
        return CollectionList(response)

    def delete_collection(self, name: str):
        """Deletes a collection.

        :param name: The name of the collection

        Deleting a collection is an irreversible operation. All data
        in the collection will be lost.

        This method tells Pinecone you would like to delete a collection,
        but it takes a few moments to complete the operation. Use the
        `describe_collection()` method to confirm that the collection
        has been deleted.
        """
        api_instance = self.index_api
        api_instance.delete_collection(name)

    def describe_collection(self, name: str):
        """Describes a collection.
        :param name: The name of the collection
        :return: Description of the collection

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        description = client.describe_collection("my_collection")
        print(description.name)
        print(description.source)
        print(description.status)
        print(description.size)
        ```
        """
        api_instance = self.index_api
        return api_instance.describe_collection(name).to_dict()

    def _get_status(self, name: str):
        api_instance = self.index_api
        response = api_instance.describe_index(name)
        return response["status"]


    def Index(self, name: str = '', host: str = '', **kwargs):
        """
        Target an index for data operations.

        ### Target an index by host url

        In production situations, you want to upsert or query your data as quickly
        as possible. If you know in advance the host url of your index, you can
        eliminate a round trip to the Pinecone control plane by specifying the
        host of the index.

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        index_host = os.environ.get("PINECONE_INDEX_HOST")

        pc = Pinecone(api_key=api_key)
        index = pc.Index(host=index_host)

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        To find your host url, you can use the Pinecone control plane to describe
        the index. The host url is returned in the response. Or, alternatively, the
        host is displayed in the Pinecone web console.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(
            api_key=os.environ.get("PINECONE_API_KEY")
        )

        host = pc.describe_index('index-name').host
        ```

        ### Target an index by name (not recommended for production)

        For more casual usage, such as when you are playing and exploring with Pinecone
        in a notebook setting, you can also target an index by name. If you use this
        approach, the client may need to perform an extra call to the Pinecone control
        plane to get the host url on your behalf.

        The client will cache the index host for future use whenever it is seen, so you
        will incur the overhead of only one call. But this approach is not
        recommended for production usage.

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        api_key = os.environ.get("PINECONE_API_KEY")

        pc = Pinecone(api_key=api_key)
        pc.create_index(
            name='my-index',
            dimension=1536,
            metric='cosine',
            spec=ServerlessSpec(cloud='aws', region='us-west-2')
        )
        index = pc.Index('my-index')

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```
        """
        if name == '' and host == '':
            raise ValueError("Either name or host must be specified")

        pt = kwargs.pop('pool_threads', None) or self.pool_threads
        api_key = self.config.api_key
        openapi_config = self.openapi_config

        if host != '':
            # Use host url if it is provided
            index_host=normalize_host(host)
        else:
            # Otherwise, get host url from describe_index using the index name
            index_host = self.index_host_store.get_host(self.index_api, self.config, name)

        return Index(
            host=index_host,
            api_key=api_key,
            pool_threads=pt,
            openapi_config=openapi_config,
            source_tag=self.config.source_tag,
            **kwargs
        )
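
The `timeout` parameter documented for `create_index` and `delete_index` has three cases: `-1` returns immediately, `None` polls indefinitely, and a non-negative value polls until the time budget runs out and then raises `TimeoutError`. The standalone sketch below illustrates those cases outside the SDK. It is a simplification, not a line-for-line copy of the methods above: the helper name `wait_until` and its `check`, `interval`, and `sleep` parameters are hypothetical and are not part of the Pinecone client.

```python
import time
from typing import Callable, Optional


def wait_until(
    check: Callable[[], bool],
    timeout: Optional[int] = None,
    interval: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it returns True (hypothetical helper, not an SDK function).

    timeout == -1   -> do not wait; return False without calling `check`
    timeout is None -> poll indefinitely
    otherwise       -> poll until about `timeout` seconds have been spent sleeping,
                       then raise TimeoutError
    """
    if timeout == -1:
        return False

    remaining = timeout
    while not check():
        if remaining is not None:
            if remaining <= 0:
                # Budget exhausted and the condition is still not met.
                raise TimeoutError("condition not met within the timeout")
            remaining -= interval
        sleep(interval)
    return True
```

With a client in hand, `check` could be a condition such as `lambda: "my-index" not in pc.list_indexes().names()`, which is the same condition `delete_index` polls. Injecting `sleep` keeps the helper easy to exercise in tests without real delays.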
417 418 ```python 419 from pinecone import Pinecone, Index 420 421 client = Pinecone() 422 423 description = client.describe_index("my_index") 424 425 host = description.host 426 print(f"Your index is hosted at {description.host}") 427 428 index = client.Index(name="my_index", host=host) 429 index.upsert(vectors=[...]) 430 ``` 431 """ 432 api_instance = self.index_api 433 description = api_instance.describe_index(name) 434 host = description.host 435 self.index_host_store.set_host(self.config, name, host) 436 437 return description 438 439 def configure_index(self, name: str, replicas: Optional[int] = None, pod_type: Optional[str] = None): 440 """This method is used to scale configuration fields for your pod-based Pinecone index. 441 442 :param: name: the name of the Index 443 :param: replicas: the desired number of replicas, lowest value is 0. 444 :param: pod_type: the new pod_type for the index. To learn more about the 445 available pod types, please see [Understanding Indexes](https://docs.pinecone.io/docs/indexes) 446 447 448 ```python 449 from pinecone import Pinecone 450 451 client = Pinecone() 452 453 # Make a configuration change 454 client.configure_index(name="my_index", replicas=4) 455 456 # Call describe_index to see the index status as the 457 # change is applied. 
458 client.describe_index("my_index") 459 ``` 460 461 """ 462 api_instance = self.index_api 463 config_args: Dict[str, Any] = {} 464 if pod_type: 465 config_args.update(pod_type=pod_type) 466 if replicas: 467 config_args.update(replicas=replicas) 468 configure_index_request = ConfigureIndexRequest( 469 spec=ConfigureIndexRequestSpec( 470 pod=ConfigureIndexRequestSpecPod(**config_args) 471 ) 472 ) 473 api_instance.configure_index(name, configure_index_request=configure_index_request) 474 475 def create_collection(self, name: str, source: str): 476 """Create a collection from a pod-based index 477 478 :param name: Name of the collection 479 :param source: Name of the source index 480 """ 481 api_instance = self.index_api 482 api_instance.create_collection(create_collection_request=CreateCollectionRequest(name=name, source=source)) 483 484 def list_collections(self) -> CollectionList: 485 """List all collections 486 487 ```python 488 from pinecone import Pinecone 489 490 client = Pinecone() 491 492 for collection in client.list_collections(): 493 print(collection.name) 494 print(collection.source) 495 496 # You can also iterate specifically over the collection 497 # names with the .names() helper. 498 collection_name="my_collection" 499 for collection_name in client.list_collections().names(): 500 print(collection_name) 501 ``` 502 """ 503 api_instance = self.index_api 504 response = api_instance.list_collections() 505 return CollectionList(response) 506 507 def delete_collection(self, name: str): 508 """Deletes a collection. 509 510 :param: name: The name of the collection 511 512 Deleting a collection is an irreversible operation. All data 513 in the collection will be lost. 514 515 This method tells Pinecone you would like to delete a collection, 516 but it takes a few moments to complete the operation. Use the 517 `describe_collection()` method to confirm that the collection 518 has been deleted. 
519 """ 520 api_instance = self.index_api 521 api_instance.delete_collection(name) 522 523 def describe_collection(self, name: str): 524 """Describes a collection. 525 :param: The name of the collection 526 :return: Description of the collection 527 528 ```python 529 from pinecone import Pinecone 530 531 client = Pinecone() 532 533 description = client.describe_collection("my_collection") 534 print(description.name) 535 print(description.source) 536 print(description.status) 537 print(description.size) 538 print(description.) 539 ``` 540 """ 541 api_instance = self.index_api 542 return api_instance.describe_collection(name).to_dict() 543 544 def _get_status(self, name: str): 545 api_instance = self.index_api 546 response = api_instance.describe_index(name) 547 return response["status"] 548 549 550 def Index(self, name: str = '', host: str = '', **kwargs): 551 """ 552 Target an index for data operations. 553 554 ### Target an index by host url 555 556 In production situations, you want to uspert or query your data as quickly 557 as possible. If you know in advance the host url of your index, you can 558 eliminate a round trip to the Pinecone control plane by specifying the 559 host of the index. 560 561 ```python 562 import os 563 from pinecone import Pinecone 564 565 api_key = os.environ.get("PINECONE_API_KEY") 566 index_host = os.environ.get("PINECONE_INDEX_HOST") 567 568 pc = Pinecone(api_key=api_key) 569 index = pc.Index(host=index_host) 570 571 # Now you're ready to perform data operations 572 index.query(vector=[...], top_k=10) 573 ``` 574 575 To find your host url, you can use the Pinecone control plane to describe 576 the index. The host url is returned in the response. Or, alternatively, the 577 host is displayed in the Pinecone web console. 
578 579 ```python 580 import os 581 from pinecone import Pinecone 582 583 pc = Pinecone( 584 api_key=os.environ.get("PINECONE_API_KEY") 585 ) 586 587 host = pc.describe_index('index-name').host 588 ``` 589 590 ### Target an index by name (not recommended for production) 591 592 For more casual usage, such as when you are playing and exploring with Pinecone 593 in a notebook setting, you can also target an index by name. If you use this 594 approach, the client may need to perform an extra call to the Pinecone control 595 plane to get the host url on your behalf to get the index host. 596 597 The client will cache the index host for future use whenever it is seen, so you 598 will only incur the overhead of only one call. But this approach is not 599 recommended for production usage. 600 601 ```python 602 import os 603 from pinecone import Pinecone, ServerlessSpec 604 605 api_key = os.environ.get("PINECONE_API_KEY") 606 607 pc = Pinecone(api_key=api_key) 608 pc.create_index( 609 name='my-index', 610 dimension=1536, 611 metric='cosine', 612 spec=ServerlessSpec(cloud='aws', region='us-west-2') 613 ) 614 index = pc.Index('my-index') 615 616 # Now you're ready to perform data operations 617 index.query(vector=[...], top_k=10) 618 ``` 619 """ 620 if name == '' and host == '': 621 raise ValueError("Either name or host must be specified") 622 623 pt = kwargs.pop('pool_threads', None) or self.pool_threads 624 api_key = self.config.api_key 625 openapi_config = self.openapi_config 626 627 if host != '': 628 # Use host url if it is provided 629 index_host=normalize_host(host) 630 else: 631 # Otherwise, get host url from describe_index using the index name 632 index_host = self.index_host_store.get_host(self.index_api, self.config, name) 633 634 return Index( 635 host=index_host, 636 api_key=api_key, 637 pool_threads=pt, 638 openapi_config=openapi_config, 639 source_tag=self.config.source_tag, 640 **kwargs 641 )
def __init__(self, api_key: Optional[str] = None, host: Optional[str] = None, proxy_url: Optional[str] = None, proxy_headers: Optional[Dict[str, str]] = None, ssl_ca_certs: Optional[str] = None, ssl_verify: Optional[bool] = None, config: Optional[Config] = None, additional_headers: Optional[Dict[str, str]] = {}, pool_threads: Optional[int] = 1, index_api: Optional[ManageIndexesApi] = None, **kwargs):
The `Pinecone` class is the main entry point for interacting with Pinecone via this Python SDK.
It is used to create, delete, and manage your indexes and collections.
Parameters
- api_key: The API key to use for authentication. If not passed as a keyword argument, the API key is read from the environment variable `PINECONE_API_KEY`.
- host: The control plane host to connect to.
- proxy_url: The URL of the proxy to use for the connection. Default: `None`
- proxy_headers: Additional headers to pass to the proxy. Use this if your proxy setup requires authentication. Default: `{}`
- ssl_ca_certs: The path to the SSL CA certificate bundle to use for the connection. This path should point to a file in PEM format. Default: `None`
- ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True`
- config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored.
- additional_headers: Additional headers to pass to the API. Default: `{}`
- pool_threads: The number of threads to use for the connection pool. Default: `1`
- index_api: An instance of `pinecone.core.client.api.manage_indexes_api.ManageIndexesApi`. If passed, the `host` parameter will be ignored.
Configuration with environment variables
If you instantiate the Pinecone client with no arguments, it will attempt to read the API key from the environment variable `PINECONE_API_KEY`.

    from pinecone import Pinecone

    pc = Pinecone()

Configuration with keyword arguments
If you prefer to be more explicit in your code, you can also pass the API key as a keyword argument.

    import os
    from pinecone import Pinecone

    pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

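The two configuration styles above imply a precedence: an explicit `api_key` argument wins, and the environment variable is the fallback. The following is a minimal sketch of that lookup, useful for failing early with a clear message before constructing the client. The helper name `resolve_api_key` is hypothetical and not part of the SDK, and the precedence is assumed from the parameter description rather than taken from the SDK's internals.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "PINECONE_API_KEY"


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit key if given, otherwise fall back to the environment.

    Hypothetical helper: it only mirrors the documented behaviour and does
    not call into the Pinecone SDK.
    """
    key = explicit or env.get(ENV_VAR)
    if not key:
        raise ValueError(f"No API key found: pass api_key=... or set {ENV_VAR}")
    return key
```

The result can then be passed along as `Pinecone(api_key=resolve_api_key())`.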
Environment variables
The Pinecone client supports the following environment variables:
- `PINECONE_API_KEY`: The API key to use for authentication. If not passed as a keyword argument, the API key is read from this environment variable.
- `PINECONE_DEBUG_CURL`: When troubleshooting it can be very useful to run curl commands against the control plane API to see exactly what data is being sent and received without all the abstractions and transformations applied by the Python SDK. If you set this environment variable to `true`, the Pinecone client will use request parameters to print out an equivalent curl command that you can run yourself or share with Pinecone support. **Be very careful with this option, as it will print out your API key**, which forms part of a required authentication header. Default: `false`
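Because the printed curl command contains the API key, it may be worth scrubbing it before pasting the command into a ticket or chat. The sketch below is one way to do that with a regular expression. It assumes the key is sent in a header named `Api-Key`; the header name and the helper `redact_api_key` are assumptions for illustration, so check the actual output before relying on it.

```python
import re

# Assumption: the key appears as the value of an "Api-Key: <value>" header.
_API_KEY_HEADER = re.compile(r"(api-key:\s*)([^\s'\"]+)", re.IGNORECASE)


def redact_api_key(curl_command: str) -> str:
    """Replace the value of the Api-Key header with a placeholder."""
    return _API_KEY_HEADER.sub(r"\1<REDACTED>", curl_command)


example = "curl -X GET -H 'Api-Key: abc123' https://example.invalid/indexes"
safe_to_share = redact_api_key(example)
```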
Proxy configuration
If your network setup requires you to interact with Pinecone via a proxy, you will need to pass additional configuration using optional keyword parameters. These optional parameters are forwarded to `urllib3`, which is the underlying library currently used by the Pinecone client to make HTTP requests. You may find it helpful to refer to the [urllib3 documentation on working with proxies](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies) while troubleshooting these settings.
Here is a basic example:

    from pinecone import Pinecone

    pc = Pinecone(
        api_key='YOUR_API_KEY',
        proxy_url='https://your-proxy.com'
    )

    pc.list_indexes()

If your proxy requires authentication, you can pass those values in a header dictionary using the `proxy_headers` parameter.

    from pinecone import Pinecone
    from urllib3 import make_headers

    pc = Pinecone(
        api_key='YOUR_API_KEY',
        proxy_url='https://your-proxy.com',
        proxy_headers=make_headers(proxy_basic_auth='username:password')
    )

    pc.list_indexes()

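To make it clearer what ends up in `proxy_headers`, here is a standard-library sketch of the dictionary that `make_headers(proxy_basic_auth=...)` is expected to produce: a single `proxy-authorization` entry carrying the Base64-encoded credentials. The function `basic_proxy_auth_header` is a hypothetical stand-in written for illustration, not urllib3's implementation.

```python
import base64
from typing import Dict


def basic_proxy_auth_header(credentials: str) -> Dict[str, str]:
    """Build a Basic proxy-authorization header from 'username:password'."""
    token = base64.b64encode(credentials.encode("latin-1")).decode("ascii")
    return {"proxy-authorization": f"Basic {token}"}


headers = basic_proxy_auth_header("username:password")
```

Seeing the header spelled out can help when comparing against what a proxy's access log reports.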
Using proxies with self-signed certificates
By default the Pinecone Python client will perform SSL certificate verification using the CA bundle maintained by Mozilla in the [certifi](https://pypi.org/project/certifi/) package. If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate in PEM format using the `ssl_ca_certs` parameter.

    from pinecone import Pinecone
    from urllib3 import make_headers

    pc = Pinecone(
        api_key='YOUR_API_KEY',
        proxy_url='https://your-proxy.com',
        proxy_headers=make_headers(proxy_basic_auth='username:password'),
        ssl_ca_certs='path/to/cert-bundle.pem'
    )

    pc.list_indexes()

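A wrong path or a certificate in a non-PEM encoding tends to surface later as a hard-to-read SSL error. As a rough pre-flight check, the sketch below confirms that the file exists and contains a PEM certificate marker. The helper `looks_like_pem_bundle` is hypothetical, and the check is only a heuristic: it does not validate the certificate itself.

```python
from pathlib import Path

PEM_MARKER = "-----BEGIN CERTIFICATE-----"


def looks_like_pem_bundle(path: str) -> bool:
    """Heuristic: the file exists and contains at least one PEM certificate header."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    # errors="ignore" keeps binary (e.g. DER) files from raising on decode.
    return PEM_MARKER in candidate.read_text(errors="ignore")
```

Calling it on the value you intend to pass as `ssl_ca_certs` gives a quick yes/no before the client is constructed.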
Disabling SSL verification
If you would like to disable SSL verification, you can pass the `ssl_verify` parameter with a value of `False`. We do not recommend going to production with SSL verification disabled.

    from pinecone import Pinecone
    from urllib3 import make_headers

    pc = Pinecone(
        api_key='YOUR_API_KEY',
        proxy_url='https://your-proxy.com',
        proxy_headers=make_headers(proxy_basic_auth='username:password'),
        ssl_ca_certs='path/to/cert-bundle.pem',
        ssl_verify=False
    )

    pc.list_indexes()

def create_index(self, name: str, dimension: int, spec: Union[Dict, ServerlessSpec, PodSpec], metric: Optional[str] = "cosine", timeout: Optional[int] = None):
Creates a Pinecone index.
Parameters
- name: The name of the index to create. Must be unique within your project and cannot be changed once created. Allowed characters are lowercase letters, numbers, and hyphens; the name may not begin or end with a hyphen. Maximum length is 45 characters.
- dimension: The dimension of vectors that will be inserted in the index. This should match the dimension of the embeddings you will be inserting. For example, if you are using OpenAI's `text-embedding-ada-002` model, you should use `dimension=1536`.
- metric: Type of metric used in the vector index when querying, one of `{"cosine", "dotproduct", "euclidean"}`. Defaults to `"cosine"`.
- spec: A dictionary, `ServerlessSpec`, or `PodSpec` describing how the index should be deployed. For serverless indexes, specify region and cloud. For pod indexes, specify replicas, shards, pods, pod_type, metadata_config, and source_collection.
- timeout: The number of seconds to wait for the index to become ready. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait. Default: None
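The naming rules for `name` are easy to check locally before making a request. The sketch below encodes the rules exactly as stated above (lowercase letters, numbers, and hyphens; no leading or trailing hyphen; at most 45 characters). The helper `is_valid_index_name` is hypothetical and not part of the SDK, and the service may apply additional checks.

```python
import re

MAX_NAME_LENGTH = 45
# One or more of [a-z0-9-], where the first and last characters are not hyphens.
_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_index_name(name: str) -> bool:
    """Check an index name against the documented character and length rules."""
    return len(name) <= MAX_NAME_LENGTH and _NAME_RE.fullmatch(name) is not None
```

Under these rules a name such as `my-index` passes, while one containing an underscore or an uppercase letter does not.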
Creating a serverless index

    import os
    from pinecone import Pinecone, ServerlessSpec

    client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

    client.create_index(
        name="my-index",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2")
    )

Creating a pod index

    import os
    from pinecone import Pinecone, PodSpec

    client = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

    client.create_index(
        name="my-index",
        dimension=1536,
        metric="cosine",
        spec=PodSpec(
            environment="us-east1-gcp",
            pod_type="p1.x1"
        )
    )

def delete_index(self, name: str, timeout: Optional[int] = None):
Deletes a Pinecone index.
Deleting an index is an irreversible operation. All data in the index will be lost. When you use this command, a request is sent to the Pinecone control plane to delete the index, but the termination is not synchronous because resources take a few moments to be released.
You can check the status of the index by calling the `describe_index()` command. With repeated polling of the `describe_index` command, you will see the index transition to a `Terminating` state before eventually resulting in a 404 after it has been removed.
Parameters
- name: the name of the index.
- timeout: Number of seconds to poll status checking whether the index has been deleted. If None, wait indefinitely; if >=0, time out after this many seconds; if -1, return immediately and do not wait. Default: None
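The three `timeout` modes (`None`, a non-negative number, and `-1`) can be illustrated with a small generic polling helper. This is a sketch of the semantics described above, not the SDK's own loop: `wait_until` and its `done`, `interval`, `sleep`, and `clock` parameters are hypothetical names, and the injectable `sleep` and `clock` exist only so the helper can be exercised without real delays.

```python
import time
from typing import Callable, Optional


def wait_until(done: Callable[[], bool],
               timeout: Optional[float] = None,
               interval: float = 5.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> None:
    """Poll done() until it returns True, following the documented timeout modes."""
    if timeout == -1:
        return  # return immediately and do not wait
    deadline = None if timeout is None else clock() + timeout
    while not done():
        if deadline is not None and clock() >= deadline:
            raise TimeoutError("condition not met before the timeout elapsed")
        sleep(interval)
```

For deletion, `done` could be a callable along the lines of `lambda: name not in client.list_indexes().names()`, with `timeout=-1` passed to `delete_index` itself so that the waiting happens only in your own code.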
def list_indexes(self) -> IndexList:
Lists all indexes.
The results include a description of all indexes in your project, including the index name, dimension, metric, status, and spec.
Returns
Returns an `IndexList` object, which is iterable and contains a list of `IndexDescription` objects. It also has a convenience method `names()` which returns a list of index names.

    from pinecone import Pinecone, ServerlessSpec

    client = Pinecone()

    index_name = "my-index"
    if index_name not in client.list_indexes().names():
        print("Index does not exist, creating...")
        client.create_index(
            name=index_name,
            dimension=768,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-west-2")
        )

You can also use the list_indexes()
method to iterate over all indexes in your project
and get other information besides just names.
from pinecone import Pinecone
client = Pinecone()
for index in client.list_indexes():
    print(index.name)
    print(index.dimension)
    print(index.metric)
    print(index.status)
    print(index.host)
    print(index.spec)
    def describe_index(self, name: str):
        """Describes a Pinecone index.

        :param name: the name of the index to describe.
        :return: Returns an `IndexDescription` object
        which gives access to properties such as the
        index name, dimension, metric, host url, status,
        and spec.

        ### Getting your index host url

        In a real production situation, you probably want to
        store the host url in an environment variable so you
        don't have to call describe_index and re-fetch it
        every time you want to use the index. But this example
        shows how to get the value from the API using describe_index.

        ```python
        from pinecone import Pinecone, Index

        client = Pinecone()

        description = client.describe_index("my_index")

        host = description.host
        print(f"Your index is hosted at {description.host}")

        index = client.Index(name="my_index", host=host)
        index.upsert(vectors=[...])
        ```
        """
        api_instance = self.index_api
        description = api_instance.describe_index(name)
        host = description.host
        self.index_host_store.set_host(self.config, name, host)

        return description
Describes a Pinecone index.
Parameters
- name: the name of the index to describe.
Returns
Returns an IndexDescription object which gives access to properties such as the index name, dimension, metric, host url, status, and spec.
Getting your index host url
In a real production situation, you probably want to store the host url in an environment variable so you don't have to call describe_index and re-fetch it every time you want to use the index. But this example shows how to get the value from the API using describe_index.
from pinecone import Pinecone, Index
client = Pinecone()
description = client.describe_index("my_index")
host = description.host
print(f"Your index is hosted at {description.host}")
index = client.Index(name="my_index", host=host)
index.upsert(vectors=[...])
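The advice above about keeping the host url in an environment variable could be sketched roughly as follows. This is only an illustration: `resolve_index_host` is a hypothetical helper, not part of the SDK, and the variable name `PINECONE_INDEX_HOST` is an assumption borrowed from the `Index()` examples on this page.

```python
import os
from typing import Any, Callable


def resolve_index_host(
    describe: Callable[[str], Any],
    index_name: str,
    env_var: str = "PINECONE_INDEX_HOST",  # assumed variable name
) -> str:
    """Return the index host, preferring a pre-configured environment variable.

    `describe` is expected to behave like `client.describe_index`: it takes an
    index name and returns an object with a `.host` attribute.
    """
    configured = os.environ.get(env_var)
    if configured:
        # The host was stored ahead of time, so no control plane call is needed.
        return configured
    # Otherwise fall back to asking the control plane for the description.
    return describe(index_name).host
```

With a real client this might be called as `resolve_index_host(client.describe_index, "my_index")`, and the result passed to `client.Index(host=...)`.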
    def configure_index(self, name: str, replicas: Optional[int] = None, pod_type: Optional[str] = None):
        """This method is used to scale configuration fields for your pod-based Pinecone index.

        :param: name: the name of the Index
        :param: replicas: the desired number of replicas, lowest value is 0.
        :param: pod_type: the new pod_type for the index. To learn more about the
            available pod types, please see [Understanding Indexes](https://docs.pinecone.io/docs/indexes)


        ```python
        from pinecone import Pinecone

        client = Pinecone()

        # Make a configuration change
        client.configure_index(name="my_index", replicas=4)

        # Call describe_index to see the index status as the
        # change is applied.
        client.describe_index("my_index")
        ```

        """
        api_instance = self.index_api
        config_args: Dict[str, Any] = {}
        if pod_type:
            config_args.update(pod_type=pod_type)
        if replicas:
            config_args.update(replicas=replicas)
        configure_index_request = ConfigureIndexRequest(
            spec=ConfigureIndexRequestSpec(
                pod=ConfigureIndexRequestSpecPod(**config_args)
            )
        )
        api_instance.configure_index(name, configure_index_request=configure_index_request)
This method is used to scale configuration fields for your pod-based Pinecone index.
Parameters
- name: the name of the Index
- replicas: the desired number of replicas, lowest value is 0.
- pod_type: the new pod_type for the index. To learn more about the available pod types, please see Understanding Indexes (https://docs.pinecone.io/docs/indexes).
from pinecone import Pinecone
client = Pinecone()
# Make a configuration change
client.configure_index(name="my_index", replicas=4)
# Call describe_index to see the index status as the
# change is applied.
client.describe_index("my_index")
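The example above calls describe_index once after the change. If you want to block until the change has been applied, a polling loop along these lines may help. It is a sketch under assumptions: `wait_until_ready` is a hypothetical helper, and the default readiness check assumes the description's `status` can be read as `status["ready"]`. Pass your own `is_ready` callable if your SDK version exposes status differently.

```python
import time
from typing import Any, Callable


def wait_until_ready(
    describe: Callable[[str], Any],
    index_name: str,
    timeout: float = 300.0,
    interval: float = 5.0,
    # Assumption: the description's status exposes a boolean "ready" entry.
    is_ready: Callable[[Any], bool] = lambda d: bool(d.status["ready"]),
) -> Any:
    """Poll `describe(index_name)` until `is_ready` is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        description = describe(index_name)
        if is_ready(description):
            return description
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Index {index_name!r} not ready after {timeout} seconds")
        time.sleep(interval)
```

With a real client, `describe` would be `client.describe_index`, for example `wait_until_ready(client.describe_index, "my_index")`.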
    def create_collection(self, name: str, source: str):
        """Create a collection from a pod-based index

        :param name: Name of the collection
        :param source: Name of the source index
        """
        api_instance = self.index_api
        api_instance.create_collection(create_collection_request=CreateCollectionRequest(name=name, source=source))
Create a collection from a pod-based index
Parameters
- name: Name of the collection
- source: Name of the source index
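This method has no usage example of its own. One possible pattern, mirroring the "create if missing" check shown for list_indexes(), is sketched below. `ensure_collection` is a hypothetical helper, not an SDK function. It relies only on `list_collections().names()` and `create_collection(name=..., source=...)` as documented on this page.

```python
from typing import Any


def ensure_collection(client: Any, name: str, source: str) -> bool:
    """Create the collection only if it is not already listed.

    Returns True if a create request was sent, False if the collection
    was already present.
    """
    if name in client.list_collections().names():
        return False
    client.create_collection(name=name, source=source)
    return True
```

With a real client this might be called as `ensure_collection(Pinecone(), "my_collection", "my_index")`, where "my_index" is a pod-based index.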
    def list_collections(self) -> CollectionList:
        """List all collections

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        for collection in client.list_collections():
            print(collection.name)
            print(collection.source)

        # You can also iterate specifically over the collection
        # names with the .names() helper.
        for collection_name in client.list_collections().names():
            print(collection_name)
        ```
        """
        api_instance = self.index_api
        response = api_instance.list_collections()
        return CollectionList(response)
List all collections
from pinecone import Pinecone
client = Pinecone()
for collection in client.list_collections():
    print(collection.name)
    print(collection.source)
# You can also iterate specifically over the collection
# names with the .names() helper.
for collection_name in client.list_collections().names():
    print(collection_name)
    def delete_collection(self, name: str):
        """Deletes a collection.

        :param name: The name of the collection

        Deleting a collection is an irreversible operation. All data
        in the collection will be lost.

        This method tells Pinecone you would like to delete a collection,
        but it takes a few moments to complete the operation. Use the
        `describe_collection()` method to confirm that the collection
        has been deleted.
        """
        api_instance = self.index_api
        api_instance.delete_collection(name)
Deletes a collection.
Parameters
- name: The name of the collection
Deleting a collection is an irreversible operation. All data in the collection will be lost.
This method tells Pinecone you would like to delete a collection, but it takes a few moments to complete the operation. Use the describe_collection() method to confirm that the collection has been deleted.
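Because deletion completes asynchronously, a caller may want to wait until the collection is actually gone. The sketch below does this by polling `list_collections().names()` rather than `describe_collection()`. That is a deliberate substitution, since it avoids assuming which exception `describe_collection()` raises for a missing collection. `delete_collection_and_wait` is a hypothetical helper, not part of the SDK.

```python
import time
from typing import Any


def delete_collection_and_wait(
    client: Any,
    name: str,
    timeout: float = 120.0,
    interval: float = 2.0,
) -> None:
    """Request deletion, then poll until the collection is no longer listed."""
    client.delete_collection(name)
    deadline = time.monotonic() + timeout
    # Keep checking the listing until the name disappears or time runs out.
    while name in client.list_collections().names():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Collection {name!r} still listed after {timeout} seconds")
        time.sleep(interval)
```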
    def describe_collection(self, name: str):
        """Describes a collection.
        :param name: The name of the collection
        :return: Description of the collection

        ```python
        from pinecone import Pinecone

        client = Pinecone()

        description = client.describe_collection("my_collection")
        # The description is returned as a plain dict (see `.to_dict()` below),
        # so fields are read with key access.
        print(description["name"])
        print(description["source"])
        print(description["status"])
        print(description["size"])
        ```
        """
        api_instance = self.index_api
        return api_instance.describe_collection(name).to_dict()
Describes a collection.
Parameters
- name: The name of the collection
Returns
Description of the collection
from pinecone import Pinecone
client = Pinecone()
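description = client.describe_collection("my_collection")
# describe_collection returns a plain dict (the result of .to_dict()),
# so fields are read with key access rather than attribute access.
print(description["name"])
print(description["source"])
print(description["status"])
print(description["size"])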
    def Index(self, name: str = '', host: str = '', **kwargs):
        """
        Target an index for data operations.

        ### Target an index by host url

        In production situations, you want to upsert or query your data as quickly
        as possible. If you know in advance the host url of your index, you can
        eliminate a round trip to the Pinecone control plane by specifying the
        host of the index.

        ```python
        import os
        from pinecone import Pinecone

        api_key = os.environ.get("PINECONE_API_KEY")
        index_host = os.environ.get("PINECONE_INDEX_HOST")

        pc = Pinecone(api_key=api_key)
        index = pc.Index(host=index_host)

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```

        To find your host url, you can use the Pinecone control plane to describe
        the index. The host url is returned in the response. Alternatively, the
        host is displayed in the Pinecone web console.

        ```python
        import os
        from pinecone import Pinecone

        pc = Pinecone(
            api_key=os.environ.get("PINECONE_API_KEY")
        )

        host = pc.describe_index('index-name').host
        ```

        ### Target an index by name (not recommended for production)

        For more casual usage, such as when you are playing and exploring with Pinecone
        in a notebook setting, you can also target an index by name. If you use this
        approach, the client may need to perform an extra call to the Pinecone control
        plane to look up the index host on your behalf.

        The client will cache the index host for future use whenever it is seen, so you
        will incur the overhead of only one call. But this approach is not
        recommended for production usage.

        ```python
        import os
        from pinecone import Pinecone, ServerlessSpec

        api_key = os.environ.get("PINECONE_API_KEY")

        pc = Pinecone(api_key=api_key)
        pc.create_index(
            name='my-index',
            dimension=1536,
            metric='cosine',
            spec=ServerlessSpec(cloud='aws', region='us-west-2')
        )
        index = pc.Index('my-index')

        # Now you're ready to perform data operations
        index.query(vector=[...], top_k=10)
        ```
        """
        if name == '' and host == '':
            raise ValueError("Either name or host must be specified")

        pt = kwargs.pop('pool_threads', None) or self.pool_threads
        api_key = self.config.api_key
        openapi_config = self.openapi_config

        if host != '':
            # Use host url if it is provided
            index_host=normalize_host(host)
        else:
            # Otherwise, get host url from describe_index using the index name
            index_host = self.index_host_store.get_host(self.index_api, self.config, name)

        return Index(
            host=index_host,
            api_key=api_key,
            pool_threads=pt,
            openapi_config=openapi_config,
            source_tag=self.config.source_tag,
            **kwargs
        )
Target an index for data operations.
Target an index by host url
In production situations, you want to upsert or query your data as quickly as possible. If you know in advance the host url of your index, you can eliminate a round trip to the Pinecone control plane by specifying the host of the index.
import os
from pinecone import Pinecone
api_key = os.environ.get("PINECONE_API_KEY")
index_host = os.environ.get("PINECONE_INDEX_HOST")
pc = Pinecone(api_key=api_key)
index = pc.Index(host=index_host)
# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)
To find your host url, you can use the Pinecone control plane to describe the index. The host url is returned in the response. Alternatively, the host is displayed in the Pinecone web console.
import os
from pinecone import Pinecone
pc = Pinecone(
api_key=os.environ.get("PINECONE_API_KEY")
)
host = pc.describe_index('index-name').host
Target an index by name (not recommended for production)
For more casual usage, such as when you are playing and exploring with Pinecone in a notebook setting, you can also target an index by name. If you use this approach, the client may need to perform an extra call to the Pinecone control plane to look up the index host on your behalf.
The client will cache the index host for future use whenever it is seen, so you will incur the overhead of only one call. But this approach is not recommended for production usage.
import os
from pinecone import Pinecone, ServerlessSpec
api_key = os.environ.get("PINECONE_API_KEY")
pc = Pinecone(api_key=api_key)
pc.create_index(
name='my-index',
dimension=1536,
metric='cosine',
spec=ServerlessSpec(cloud='aws', region='us-west-2')
)
index = pc.Index('my-index')
# Now you're ready to perform data operations
index.query(vector=[...], top_k=10)
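The caching behavior described for name-based targeting can be pictured with a small in-memory cache like the one below. This is an illustrative sketch only: `HostCache` is a hypothetical class and is not the SDK's `IndexHostStore`, whose internals are not shown on this page. Keying on the API key together with the index name is an assumption made for the example.

```python
from typing import Any, Callable, Dict, Tuple


class HostCache:
    """Remember index hosts so each index is described at most once."""

    def __init__(self) -> None:
        self._hosts: Dict[Tuple[str, str], str] = {}

    def set_host(self, api_key: str, index_name: str, host: str) -> None:
        # Record a host whenever one is seen, e.g. after a describe call.
        self._hosts[(api_key, index_name)] = host

    def get_host(
        self, api_key: str, index_name: str, describe: Callable[[str], Any]
    ) -> str:
        key = (api_key, index_name)
        if key not in self._hosts:
            # Cache miss: one control plane round trip, then remember the result.
            self._hosts[key] = describe(index_name).host
        return self._hosts[key]
```

After the first lookup, repeated calls for the same index return the stored host without calling `describe` again, which is why only one extra control plane call is incurred.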