Storage Backends¶
The storage backend is where the actual package files are kept.
Files¶
This will store your packages in a directory on disk. It’s much simpler and faster to set up if you don’t need the reliability and scalability of S3.
Set pypi.storage = file
OR pypi.storage = pypicloud.storage.FileStorage
OR leave it out completely since this is the default.
storage.dir¶
Argument: string
The directory where the package files should be stored.
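A minimal file-backend configuration might look like the following (the directory path is just an example):
pypi.storage = file
storage.dir = /var/lib/pypicloud/packages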
S3¶
This option will store your packages in S3.
Note
Be sure you have set the correct S3 Policy.
Set pypi.storage = s3
OR pypi.storage = pypicloud.storage.S3Storage
A few key, required options are mentioned below, but pypicloud attempts to support all options that can be passed to resource or to the Config object. In general, you can simply prefix the option with storage. and pypicloud will pass it on. For example, to set the signature version on the Config object:
storage.signature_version = s3v4
Note that there is an s3 option dict as well. Those options should also be prefixed with storage. in the same way. For example:
storage.use_accelerate_endpoint = true
This will pass the Config object the option Config(s3={'use_accelerate_endpoint': True}).
Note
If you plan to run pypicloud in multiple regions, read more about syncing pypicloud caches using S3 notifications
storage.bucket¶
Argument: string
The name of the S3 bucket to store packages in.
storage.region_name¶
Argument: string, semi-optional
The AWS region your bucket is in. If your bucket does not yet exist, it will be created in this region on startup. If blank, the classic US region will be used.
Warning
If your bucket name has a . character in it, or if it is in a newer region (such as eu-central-1), you must specify storage.region_name!
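For example, for a bucket in the eu-central-1 region:
storage.region_name = eu-central-1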
storage.aws_access_key_id, storage.aws_secret_access_key¶
Argument: string, optional
Your AWS access key id and secret access key. If they are not specified, pypicloud will attempt to get the values from the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY or any other credentials source.
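Putting these together, a minimal S3 configuration might look like the following (the bucket name and credentials are placeholders):
pypi.storage = s3
storage.bucket = my-pypi-bucket
storage.region_name = us-east-1
storage.aws_access_key_id = <your access key id>
storage.aws_secret_access_key = <your secret access key>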
storage.prefix¶
Argument: string, optional
If present, all packages will be prefixed with this value when stored in S3. Use this to store your packages in a subdirectory, such as “packages/”.
storage.prepend_hash¶
Argument: bool, optional
Prepend a 4-letter hash to all S3 keys (default True). This helps S3 load balance when traffic scales. See the AWS documentation on the subject.
storage.expire_after¶
Argument: int, optional
How long (in seconds) the generated S3 urls are valid for (default 86400 (1 day)). In practice, there is no real reason why these generated urls need to expire at all. S3 does it for security, but expiring links isn’t part of the python package security model. So in theory you can bump this number up.
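For example, to make the generated urls valid for one week (7 × 86400 seconds):
storage.expire_after = 604800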
storage.redirect_urls¶
Argument: bool, optional
Leave this alone unless you’re having problems using easy_install. It defaults to True and should not be changed unless you encounter issues.
The long story: Why you should set redirect_urls = True
storage.server_side_encryption¶
Argument: str, optional
Enables AES-256 transparent server side encryption. See the AWS documentation. Default is None.
storage.object_acl¶
Argument: string, optional
Sets the uploaded object’s “canned” ACL. See the AWS documentation. Default is “private”, i.e. only the account owner will get full access. May be useful if the bucket and pypicloud are hosted in different AWS accounts.
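For example, when uploading to a bucket owned by a different AWS account, you could grant the bucket owner full access using the standard S3 canned ACL for that purpose:
storage.object_acl = bucket-owner-full-control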
storage.public_url¶
Argument: bool, optional
If true, use public urls (in the form https://us-east-1.s3.amazonaws.com/<bucket>/<path>) instead of signed urls. If you configured your bucket to be public and are okay with anyone being able to read your packages, this will give you a speed boost (no expensive hashing operations) and should provide better HTTP caching behavior for the packages. Default is false.
CloudFront¶
This option will store your packages in S3 but use CloudFront to deliver the packages. This is an extension of the S3 storage backend and requires the same settings as above, plus the settings listed below.
Set pypi.storage = cloudfront
OR pypi.storage = pypicloud.storage.CloudFrontS3Storage
storage.cloud_front_domain¶
Argument: string
The CloudFront domain you have set up. This CloudFront distribution must be set up to use your S3 bucket as the origin.
Example: https://dabcdefgh12345.cloudfront.net
storage.cloud_front_key_id¶
Argument: string, optional
If you want to protect your packages from public access you need to set up the CloudFront distribution to use signed URLs. This setting specifies the key id of the CloudFront key pair that is currently active on your AWS account.
storage.cloud_front_key_file¶
Argument: string, optional
Only needed when setting up CloudFront with signed URLs. This setting should be set to the full path of the CloudFront private key file.
storage.cloud_front_key_string¶
Argument: string, optional
The same as cloud_front_key_file, but contains the raw private key instead of a path to a file.
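Putting it together, a CloudFront configuration with signed urls might look like the following (the domain, key id, and key file path are placeholders):
pypi.storage = cloudfront
storage.bucket = my-pypi-bucket
storage.cloud_front_domain = https://dabcdefgh12345.cloudfront.net
storage.cloud_front_key_id = APKAIEXAMPLEKEYID
storage.cloud_front_key_file = /etc/pypicloud/cloudfront.pem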
Google Cloud Storage¶
This option will store your packages in GCS.
Set pypi.storage = gcs
OR pypi.storage = pypicloud.storage.GoogleCloudStorage
Note
The gcs client libraries are not installed by default. To use this backend, you should install pypicloud with pip install pypicloud[gcs].
This backend supports most of the same configuration settings as the S3 backend, and is configured in the same manner as that backend (via config settings of the form storage.<key> = <value>).
Settings supported by the S3 backend that are not currently supported by the GCS backend are server_side_encryption and public_url.
Pypicloud authenticates with GCS using the usual Application Default Credentials strategy; see the documentation for more details. For example, you can set the GOOGLE_APPLICATION_CREDENTIALS environment variable:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/keyfile.json pserve pypicloud.ini
Pypicloud also exposes a config setting, storage.gcp_service_account_json_filename, documented below.
For more information on setting up a service account, see the GCS documentation.
If using the service account provided automatically when running in GCE, GKE, etc., then due to a restriction with the gcloud library, the IAM signing service must be used:
storage.gcp_use_iam_signer = true
In addition, when using the IAM signing service, the service account used needs to have iam.serviceAccounts.signBlob on the storage bucket. This is available as part of roles/iam.serviceAccountTokenCreator.
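Putting it together, a minimal GCS configuration might look like the following (the bucket name and key file path are placeholders):
pypi.storage = gcs
storage.bucket = my-pypi-bucket
storage.gcp_service_account_json_filename = /etc/pypicloud/gcs-key.json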
storage.bucket¶
Argument: string
The name of the GCS bucket to store packages in.
storage.region_name¶
Argument: string, semi-optional
The GCS region your bucket is in. If your bucket does not yet exist, it will be created in this region on startup. If blank, a default US multi-regional bucket will be created.
storage.gcp_api_endpoint¶
Argument: string, optional
The storage API URL to which to point the GCS client. If not provided, the client’s default endpoint will be used.
storage.gcp_service_account_json_filename¶
Argument: string, semi-optional
Path to a local file containing a GCP service account JSON key. This argument is required unless the path is provided via the GOOGLE_APPLICATION_CREDENTIALS environment variable.
storage.gcp_use_iam_signer¶
Argument: bool, optional
If true, will sign the generated package links with IAM-backed signing rather than the GCP application credentials (default false). The service account used needs to have iam.serviceAccounts.signBlob on the storage bucket, which is available as part of roles/iam.serviceAccountTokenCreator.
storage.iam_signer_service_account_email¶
Argument: string, optional
The email address to use for signing GCS links when gcp_use_iam_signer = true. If not provided, will fall back to the email in gcp_service_account_json_filename.
See issue 261 for more details.
storage.gcp_project_id¶
Argument: string, optional
ID of the GCP project that contains your storage bucket. This is only used when creating the bucket, and if you would like the bucket to be created in a project other than the project to which your GCP service account belongs.
storage.prefix¶
Argument: string, optional
If present, all packages will be prefixed with this value when stored in GCS. Use this to store your packages in a subdirectory, such as “packages/”.
storage.prepend_hash¶
Argument: bool, optional
Prepend a 4-letter hash to all GCS keys (default True). This may help GCS load balance when traffic scales, although this is not as well-documented for GCS as for S3.
storage.expire_after¶
Argument: int, optional
How long (in seconds) the generated GCS urls are valid for (default 86400 (1 day)). In practice, there is no real reason why these generated urls need to expire at all. GCS does it for security, but expiring links isn’t part of the python package security model. So in theory you can bump this number up.
storage.redirect_urls¶
Argument: bool, optional
Leave this alone unless you’re having problems using easy_install. It defaults to True and should not be changed unless you encounter issues.
The long story: Why you should set redirect_urls = True
storage.object_acl¶
Argument: string, optional
Sets the uploaded object’s “predefined” ACL. See the GCS documentation. Default is “private”, i.e. only the account owner will get full access. May be useful if the bucket and pypicloud are hosted in different GCS accounts.
storage.storage_class¶
Argument: string, optional
Sets uploaded object’s storage class. See the GCS documentation. Defaults to the default storage class of the bucket, if the bucket is preexisting, or “regional” otherwise.
Azure Blob Storage¶
This option will store your packages in a container in Azure Blob Storage.
Set pypi.storage = azure-blob
OR pypi.storage = pypicloud.storage.AzureBlobStorage
A few key, required options are mentioned below.
storage.storage_account_name¶
Argument: string
The name of the Azure Storage Account. If not present, will look for the AZURE_STORAGE_ACCOUNT environment variable.
storage.storage_account_key¶
Argument: string
A valid access key, either key1 or key2. If not present, will look for the AZURE_STORAGE_KEY environment variable.
storage.storage_container_name¶
Argument: string
Name of the container you wish to store packages in.
storage.storage_account_url¶
Argument: string, optional
Storage data service endpoint. If not present, will look for the AZURE_STORAGE_SERVICE_ENDPOINT environment variable.
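Putting these together, a minimal Azure Blob Storage configuration might look like the following (the account name, key, and container name are placeholders):
pypi.storage = azure-blob
storage.storage_account_name = mypypiaccount
storage.storage_account_key = <your access key>
storage.storage_container_name = pypi-packages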