How to Upload Files to Azure Storage Blobs Using Python
Azure Blob Storage offers a cost-effective and fast storage solution for unstructured data. It is a good fit if you want to serve content such as images. You can also speed up content delivery by putting the Azure CDN service in front of it.
Before running the following programs, make sure the prerequisites below are in place. The sample Python programs use the Python SDK v12 for Azure Blob Storage.
Install Python 3.6 or above. On a Mac, you can use Homebrew to install Python 3 (for example, brew install python3).
Install the Azure Blob Storage client library for Python (for example, pip3 install azure-storage-blob).
Using the Azure portal, create an Azure storage account (general-purpose v2) and a container before running the following programs. You will also need to copy the connection string for your storage account from the Azure portal. If you want public access to the uploaded images, set the container's public access level to "Blob (anonymous read access for blobs only)".
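To make the connection string and the public access setting more concrete, here is a minimal sketch that splits a connection string into its parts and builds the URL at which a blob in a public container would be reachable. The helper names, the account name and the key are made up for illustration, and the sketch assumes the default public-cloud endpoint suffix (core.windows.net) when the connection string does not specify one.

```python
from urllib.parse import quote

# Made-up example value; a real connection string comes from the Azure portal.
EXAMPLE_CONNECTION_STRING = (
    "DefaultEndpointsProtocol=https;AccountName=mystorageaccount;"
    "AccountKey=ZmFrZWtleQ==;EndpointSuffix=core.windows.net"
)


def parse_connection_string(connection_string):
    """Split 'Key=Value;Key=Value' pairs into a dict (hypothetical helper)."""
    parts = {}
    for segment in connection_string.split(";"):
        if not segment.strip():
            continue
        # Split on the first '=' only: account keys are base64 and may end in '='.
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()
    return parts


def public_blob_url(connection_string, container, blob_name):
    """Build the anonymous-read URL for a blob in a public container."""
    parts = parse_connection_string(connection_string)
    account = parts["AccountName"]
    protocol = parts.get("DefaultEndpointsProtocol", "https")
    # Assumption: fall back to the public Azure cloud suffix if none is given.
    suffix = parts.get("EndpointSuffix", "core.windows.net")
    return f"{protocol}://{account}.blob.{suffix}/{container}/{quote(blob_name)}"


print(public_blob_url(EXAMPLE_CONNECTION_STRING, "myimages", "cat.jpg"))
```

A KeyError on AccountName is a quick hint that the value pasted from the portal is not a full connection string.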
How to Upload Files to Azure Storage Blobs Using Python
The following program demonstrates a typical use case: bulk uploading a set of jpg images from a local folder to an Azure Blob Storage container. Note that this program uploads the images one at a time, so it may be slow for a large number of files. Replace MY_CONNECTION_STRING, MY_IMAGE_CONTAINER and LOCAL_IMAGE_PATH before running the program.
# upload_blob_images.py
# Python program to bulk upload jpg image files as blobs to Azure storage
# Uses the Python SDK v12 for Azure blob storage
# Requires Python 3.6 or above
import os
from azure.storage.blob import BlobServiceClient, ContentSettings

# IMPORTANT: Replace connection string with your storage account connection string
# Usually starts with DefaultEndpointsProtocol=https;...
MY_CONNECTION_STRING = "REPLACE_THIS"

# Replace with blob container. This should be already created in Azure storage.
MY_IMAGE_CONTAINER = "myimages"

# Replace with the local folder which contains the image files for upload
LOCAL_IMAGE_PATH = "REPLACE_THIS"


class AzureBlobFileUploader:
    def __init__(self):
        print("Initializing AzureBlobFileUploader")

        # Initialize the connection to Azure storage account
        self.blob_service_client = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)

    def upload_all_images_in_folder(self):
        # Get all files with a .jpg extension (case-insensitive) and exclude directories
        all_file_names = [f for f in os.listdir(LOCAL_IMAGE_PATH)
                          if os.path.isfile(os.path.join(LOCAL_IMAGE_PATH, f))
                          and f.lower().endswith(".jpg")]

        # Upload each file
        for file_name in all_file_names:
            self.upload_image(file_name)

    def upload_image(self, file_name):
        # Create blob with same name as local file name
        blob_client = self.blob_service_client.get_blob_client(container=MY_IMAGE_CONTAINER,
                                                               blob=file_name)

        # Get full path to the file
        upload_file_path = os.path.join(LOCAL_IMAGE_PATH, file_name)

        # Create blob on storage
        # Overwrite if it already exists!
        image_content_setting = ContentSettings(content_type='image/jpeg')
        print(f"uploading file - {file_name}")
        with open(upload_file_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True, content_settings=image_content_setting)


# Initialize class and upload files
azure_blob_file_uploader = AzureBlobFileUploader()
azure_blob_file_uploader.upload_all_images_in_folder()
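Instead of editing the three constants in the source file, the settings could also be read from environment variables, which keeps the connection string out of the code. The following is only a sketch under assumptions: load_settings is a hypothetical helper, AZURE_STORAGE_CONNECTION_STRING is simply a commonly used variable name, and AZURE_IMAGE_CONTAINER and LOCAL_IMAGE_PATH are names invented for this example.

```python
import os

# Assumed environment variable names; choose whatever suits your setup.
CONNECTION_STRING_VAR = "AZURE_STORAGE_CONNECTION_STRING"
CONTAINER_VAR = "AZURE_IMAGE_CONTAINER"
LOCAL_PATH_VAR = "LOCAL_IMAGE_PATH"


def load_settings(environ=os.environ):
    """Read upload settings from the environment, failing early if the
    connection string is missing (hypothetical helper)."""
    connection_string = environ.get(CONNECTION_STRING_VAR)
    if not connection_string:
        raise RuntimeError(f"Set {CONNECTION_STRING_VAR} before running the upload")
    return {
        "connection_string": connection_string,
        # Fall back to the container name used in the program above.
        "container": environ.get(CONTAINER_VAR, "myimages"),
        # Fall back to the current directory if no folder is configured.
        "local_path": environ.get(LOCAL_PATH_VAR, "."),
    }
```

The returned values could then be assigned to MY_CONNECTION_STRING, MY_IMAGE_CONTAINER and LOCAL_IMAGE_PATH at the top of the program.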
Parallel Bulk Upload of Files to Azure Storage Blobs Using Python
The following Python program is an improved version of the program above. It uses a thread pool to upload several images in parallel. The pool size is set to 10 threads, but you can increase it for faster uploads if you have sufficient network bandwidth. Note that if you don't specify a content type when uploading, the blob's content type defaults to application/octet-stream.
# upload_blob_images_parallel.py
# Python program to bulk upload jpg image files as blobs to Azure storage
# Uses ThreadPool for faster parallel uploads!
# Uses the Python SDK v12 for Azure blob storage
# Requires Python 3.6 or above
import os
from multiprocessing.pool import ThreadPool
from azure.storage.blob import BlobServiceClient, ContentSettings

# IMPORTANT: Replace connection string with your storage account connection string
# Usually starts with DefaultEndpointsProtocol=https;...
MY_CONNECTION_STRING = "REPLACE_THIS"

# Replace with blob container
MY_IMAGE_CONTAINER = "myimages"

# Replace with the local folder which contains the image files for upload
LOCAL_IMAGE_PATH = "REPLACE_THIS"


class AzureBlobFileUploader:
    def __init__(self):
        print("Initializing AzureBlobFileUploader")

        # Initialize the connection to Azure storage account
        self.blob_service_client = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)

    def upload_all_images_in_folder(self):
        # Get all files with a .jpg extension (case-insensitive) and exclude directories
        all_file_names = [f for f in os.listdir(LOCAL_IMAGE_PATH)
                          if os.path.isfile(os.path.join(LOCAL_IMAGE_PATH, f))
                          and f.lower().endswith(".jpg")]

        result = self.run(all_file_names)
        print(result)

    def run(self, all_file_names):
        # Upload 10 files at a time!
        with ThreadPool(processes=10) as pool:
            return pool.map(self.upload_image, all_file_names)

    def upload_image(self, file_name):
        # Create blob with same name as local file name
        blob_client = self.blob_service_client.get_blob_client(container=MY_IMAGE_CONTAINER,
                                                               blob=file_name)

        # Get full path to the file
        upload_file_path = os.path.join(LOCAL_IMAGE_PATH, file_name)

        # Create blob on storage
        # Overwrite if it already exists!
        image_content_setting = ContentSettings(content_type='image/jpeg')
        print(f"uploading file - {file_name}")
        with open(upload_file_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True, content_settings=image_content_setting)
        return file_name


# Initialize class and upload files
azure_blob_file_uploader = AzureBlobFileUploader()
azure_blob_file_uploader.upload_all_images_in_folder()
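One thing to be aware of with pool.map is that an exception raised by a single upload propagates to the caller, so the list of files that did upload is lost. The sketch below shows one possible alternative, not the approach used in the program above: it uses concurrent.futures from the standard library and records successes and failures separately. The function upload_many and the stand-in fake_upload are hypothetical; in practice you would pass something like AzureBlobFileUploader().upload_image as the upload_one argument.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_many(file_names, upload_one, max_workers=10):
    """Run upload_one(file_name) on a thread pool and collect the outcomes.

    upload_one is any callable that uploads a single file and raises on failure.
    Returns (list of names that succeeded, dict of name -> exception).
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its file name so failures can be reported.
        futures = {executor.submit(upload_one, name): name for name in file_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                succeeded.append(name)
            except Exception as error:
                # Keep going so that one bad file does not stop the batch.
                failed[name] = error
    return succeeded, failed


def fake_upload(name):
    """Stand-in for a real upload function, used only to demonstrate the flow."""
    if name == "broken.jpg":
        raise IOError("simulated network failure")
    return name


ok, bad = upload_many(["a.jpg", "broken.jpg", "b.jpg"], fake_upload, max_workers=2)
print("uploaded:", sorted(ok))
print("failed:", list(bad))
```

Files listed in the failure dict could then be retried in a second pass without re-uploading the ones that already succeeded.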
Using the ContentSettings object, you can set the content type, content encoding, content MD5 or cache control for the blobs. See the ContentSettings reference in the Azure SDK for Python documentation for the full set of content_settings attributes.
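As an illustration, the content type can be derived from the file name instead of being hard-coded to image/jpeg, and a cache control header can be set at the same time. This is a sketch under assumptions: content_settings_kwargs and make_content_settings are hypothetical helpers, and the one-hour max-age is an arbitrary example value.

```python
import mimetypes


def content_settings_kwargs(file_name, max_age_seconds=3600):
    """Build keyword arguments for ContentSettings from a file name."""
    content_type, content_encoding = mimetypes.guess_type(file_name)
    kwargs = {
        # Fall back to the generic binary type when the extension is unknown.
        "content_type": content_type or "application/octet-stream",
        # Example policy: let browsers and CDNs cache the blob for max_age_seconds.
        "cache_control": f"public, max-age={max_age_seconds}",
    }
    if content_encoding:
        kwargs["content_encoding"] = content_encoding
    return kwargs


def make_content_settings(file_name):
    """Create a ContentSettings object for the given file name."""
    # Imported here so the helper above can be tried without the SDK installed.
    from azure.storage.blob import ContentSettings
    return ContentSettings(**content_settings_kwargs(file_name))


print(content_settings_kwargs("cat.jpg"))
```

The object returned by make_content_settings can be passed as the content_settings argument of upload_blob, in place of image_content_setting in the programs above.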