Storage Backends

The storage backend is where the actual package files are kept.

Files

This will store your packages in a directory on disk. It’s much simpler and faster to set up if you don’t need the reliability and scalability of S3.

Set pypi.storage = file OR pypi.storage = pypicloud.storage.FileStorage OR leave it out completely, since this is the default.

storage.dir

Argument: string

The directory where the package files should be stored.
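
For example, a minimal file-backend configuration (the directory path here is just an illustration) might look like:

pypi.storage = file
storage.dir = /var/lib/pypicloud/packages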

S3

This option will store your packages in S3.

Note

Be sure you have set the correct S3 Policy.

Set pypi.storage = s3 OR pypi.storage = pypicloud.storage.S3Storage

A few key required options are mentioned below, but pypicloud attempts to support all options that can be passed to boto3.resource or to the Config object. In general, you can simply prefix the option with storage. and pypicloud will pass it on. For example, to set the signature version on the Config object:

storage.signature_version = s3v4

Note that there is an s3 option dict as well. Those options should also just be prefixed with storage.. For example:

storage.use_accelerate_endpoint = true

This will pass the Config object the option Config(s3={'use_accelerate_endpoint': True}).

Note

If you plan to run pypicloud in multiple regions, read more about syncing pypicloud caches using S3 notifications

storage.bucket

Argument: string

The name of the S3 bucket to store packages in.

storage.region_name

Argument: string, semi-optional

The AWS region your bucket is in. If your bucket does not yet exist, it will be created in this region on startup. If blank, the classic US region will be used.

Warning

If your bucket name has a . character in it, or if it is in a newer region (such as eu-central-1), you must specify the storage.region_name!

storage.aws_access_key_id, storage.aws_secret_access_key

Argument: string, optional

Your AWS access key id and secret access key. If they are not specified then pypicloud will attempt to get the values from the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY or any other credentials source.

storage.prefix

Argument: string, optional

If present, all packages will be prefixed with this value when stored in S3. Use this to store your packages in a subdirectory, such as “packages/”.
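
For example, to keep everything under a packages/ subdirectory:

storage.prefix = packages/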

storage.prepend_hash

Argument: bool, optional

Prepend a 4-letter hash to all S3 keys (default True). This helps S3 load balance when traffic scales. See the AWS documentation on the subject.

storage.expire_after

Argument: int, optional

How long (in seconds) the generated S3 urls are valid for (default 86400, i.e. one day). In practice, there is no real reason why these generated urls need to expire at all. S3 does it for security, but expiring links isn’t part of the python package security model, so in theory you can bump this number up.
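
For example, to extend the link lifetime to one week:

storage.expire_after = 604800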

storage.redirect_urls

Argument: bool, optional

Leave this alone unless you’re having problems using easy_install (default True).

The long story: Why you should set redirect_urls = True

storage.server_side_encryption

Argument: str, optional

Enables AES-256 transparent server-side encryption. See the AWS documentation. Default is None.

storage.object_acl

Argument: string, optional

Sets the uploaded object’s “canned” ACL. See the AWS documentation. Default is “private”, i.e. only the account owner will get full access. May be useful if the bucket and pypicloud are hosted in different AWS accounts.

storage.public_url

Argument: bool, optional

If true, use public urls (in the form https://us-east-1.s3.amazonaws.com/<bucket>/<path>) instead of signed urls. If you configured your bucket to be public and are okay with anyone being able to read your packages, this will give you a speed boost (no expensive hashing operations) and should provide better HTTP caching behavior for the packages. Default is false.
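
Putting the required settings together, a minimal S3 configuration (the bucket name is a placeholder) might look like:

pypi.storage = s3
storage.bucket = mybucket
storage.region_name = us-east-1

Credentials will be picked up from the environment variables or any other standard credentials source, as described above.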

CloudFront

This option will store your packages in S3 but use CloudFront to deliver them. This is an extension of the S3 storage backend and requires the same settings as above, plus the settings listed below.

Set pypi.storage = cloudfront OR pypi.storage = pypicloud.storage.CloudFrontS3Storage

storage.cloud_front_domain

Argument: string

The CloudFront domain you have set up. This CloudFront distribution must be set up to use your S3 bucket as the origin.

Example: https://dabcdefgh12345.cloudfront.net

storage.cloud_front_key_id

Argument: string, optional

If you want to protect your packages from public access you need to set up the CloudFront distribution to use signed URLs. This setting specifies the key id of the CloudFront key pair that is currently active on your AWS account.

storage.cloud_front_key_file

Argument: string, optional

Only needed when setting up CloudFront with signed URLs. This setting should be set to the full path of the CloudFront private key file.

storage.cloud_front_key_string

Argument: string, optional

The same as cloud_front_key_file, but contains the raw private key instead of a path to a file.
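
Since this backend extends the S3 backend, the S3 settings above still apply. A sketch of a signed-URL setup, with placeholder values for the bucket, domain, key id, and key file:

pypi.storage = cloudfront
storage.bucket = mybucket
storage.cloud_front_domain = https://dabcdefgh12345.cloudfront.net
storage.cloud_front_key_id = APKAEXAMPLEKEYID
storage.cloud_front_key_file = /etc/pypicloud/cloudfront.pem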

Google Cloud Storage

This option will store your packages in GCS.

Set pypi.storage = gcs OR pypi.storage = pypicloud.storage.GoogleCloudStorage

Note

The gcs client libraries are not installed by default. To use this backend, you should install pypicloud with pip install pypicloud[gcs].

This backend supports most of the same configuration settings as the S3 backend, and is configured in the same manner as that backend (via config settings of the form storage.<key> = <value>).

Settings supported by the S3 backend that are not currently supported by the GCS backend are server_side_encryption and public_url.

Pypicloud authenticates with GCS using the usual Application Default Credentials strategy; see the documentation for more details. For example, you can set the GOOGLE_APPLICATION_CREDENTIALS environment variable:

GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/keyfile.json pserve pypicloud.ini

Pypicloud also exposes a config setting, storage.gcp_service_account_json_filename, documented below.

For more information on setting up a service account, see the GCS documentation.

If using the service account provided automatically when running in GCE, GKE, etc., then due to a restriction in the gcloud library, the IAM signing service must be used:

storage.gcp_use_iam_signer=true

In addition, when using the IAM signing service, the service account used needs to have iam.serviceAccounts.signBlob on the storage bucket. This is available as part of roles/iam.serviceAccountTokenCreator.
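
For example, a minimal configuration for a deployment running on GCE with IAM signing (the bucket name is a placeholder) might look like:

pypi.storage = gcs
storage.bucket = mybucket
storage.gcp_use_iam_signer = true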

storage.bucket

Argument: string

The name of the GCS bucket to store packages in.

storage.region_name

Argument: string, semi-optional

The GCS region your bucket is in. If your bucket does not yet exist, it will be created in this region on startup. If blank, a default US multi-regional bucket will be created.

storage.gcp_api_endpoint

Argument: string, optional

The storage API URL to which to point the GCS client. If not provided, the client’s default endpoint will be used.

storage.gcp_service_account_json_filename

Argument: string, semi-optional

Path to a local file containing a GCP service account JSON key. This argument is required unless the path is provided via the GOOGLE_APPLICATION_CREDENTIALS environment variable.

storage.gcp_use_iam_signer

Argument: bool, optional

If true, will use IAM-backed signing, rather than the GCP application credentials, to sign the generated package links (default false). See the note above about the required iam.serviceAccounts.signBlob permission.

storage.iam_signer_service_account_email

Argument: string, optional

The email address to use for signing GCS links when gcp_use_iam_signer = true. If not provided, will fall back to the email in gcp_service_account_json_filename.

See issue 261 for more details.

storage.gcp_project_id

Argument: string, optional

ID of the GCP project that contains your storage bucket. This is only used when creating the bucket, and if you would like the bucket to be created in a project other than the project to which your GCP service account belongs.

storage.prefix

Argument: string, optional

If present, all packages will be prefixed with this value when stored in GCS. Use this to store your packages in a subdirectory, such as “packages/”.

storage.prepend_hash

Argument: bool, optional

Prepend a 4-letter hash to all GCS keys (default True). This may help GCS load balance when traffic scales, although this is not as well-documented for GCS as for S3.

storage.expire_after

Argument: int, optional

How long (in seconds) the generated GCS urls are valid for (default 86400, i.e. one day). In practice, there is no real reason why these generated urls need to expire at all. GCS does it for security, but expiring links isn’t part of the python package security model, so in theory you can bump this number up.

storage.redirect_urls

Argument: bool, optional

Leave this alone unless you’re having problems using easy_install (default True).

The long story: Why you should set redirect_urls = True

storage.object_acl

Argument: string, optional

Sets the uploaded object’s “predefined” ACL. See the GCS documentation. Default is “private”, i.e. only the account owner will get full access. May be useful if the bucket and pypicloud are hosted in different GCS accounts.

storage.storage_class

Argument: string, optional

Sets uploaded object’s storage class. See the GCS documentation. Defaults to the default storage class of the bucket, if the bucket is preexisting, or “regional” otherwise.

Azure Blob Storage

This option will store your packages in a container in Azure Blob Storage.

Set pypi.storage = azure-blob OR pypi.storage = pypicloud.storage.AzureBlobStorage

A few key, required options are mentioned below.

storage.storage_account_name

Argument: string

The name of the Azure Storage Account. If not present, will look for the AZURE_STORAGE_ACCOUNT environment variable.

storage.storage_account_key

Argument: string

A valid access key, either key1 or key2. If not present, will look for the AZURE_STORAGE_KEY environment variable.

storage.storage_container_name

Argument: string

Name of the container you wish to store packages in.

storage.storage_account_url

Argument: string, optional

Storage data service endpoint. If not present, will look for the AZURE_STORAGE_SERVICE_ENDPOINT environment variable.
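
A minimal Azure Blob Storage configuration, using placeholder values, might look like:

pypi.storage = azure-blob
storage.storage_account_name = myaccount
storage.storage_container_name = pypi-packages

The account key can then be supplied via the AZURE_STORAGE_KEY environment variable instead of storage.storage_account_key.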