How to Download Files From S3 Using Boto3 [Python]?

Introduction

Boto3 is the AWS SDK for Python. It lets you create and manage AWS services such as EC2 and S3, and it offers both an object-oriented resource API and a low-level client API for those services.

In this tutorial, you’ll

  • Create a session in Boto3 [Python]
  • Download files from S3 using Boto3 [Python]
  • Download all files from an S3 bucket using Boto3 [Python]

Prerequisites

Before you start, you’ll need the following.

  • Install Boto3 using the command sudo pip3 install boto3
  • If the AWS CLI is installed and configured, Boto3 can use the same credentials to create a session. You can install and configure it by following How to install and Configure AWS Cli on ubuntu.

Create S3 Session in Boto3

In this section, you’ll create an S3 session in Boto3. First you’ll learn how to specify credentials for connecting to S3 using Boto3. Then you’ll create a generic AWS session, access S3 as a resource through it, and also create an S3 client directly.

You can create a session with the boto3.Session() API by passing the access key ID and the secret access key. If you do not pass them, Boto3 searches several locations in turn, including environment variables and the shared credentials file written by aws configure, until it finds credentials. In the code below, settings is a placeholder for your own configuration module that holds the two keys, such as settings.AWS_SERVER_PUBLIC_KEY.

Create a generic session to your AWS service using the below code.

import boto3
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY
)
  • boto3.Session() – API method to create a session
  • aws_access_key_id – Parameter for the access key ID. settings.AWS_SERVER_PUBLIC_KEY stands for wherever your application stores this value. If you configured the AWS CLI as described in the prerequisites, you can omit this parameter and Boto3 will find the key in the shared credentials file
  • aws_secret_access_key – Parameter for the secret access key. settings.AWS_SERVER_SECRET_KEY stands for wherever your application stores this value. As with the access key ID, you can omit it if the AWS CLI is already configured
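
As a rough sketch of the environment-variable route, the helper below collects the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables into keyword arguments for boto3.Session(). The names credentials_from_env and make_session are made up for this illustration. Boto3 also reads these two variables on its own when boto3.Session() is called without arguments, so the helper only makes that lookup explicit.

```python
import os


def credentials_from_env(environ=os.environ):
    """Return boto3.Session keyword arguments found in the environment.

    Hypothetical helper for illustration. It returns an empty dict unless
    both variables are set, so Boto3 can fall back to its other lookup
    locations (for example the file written by `aws configure`).
    """
    access_key = environ.get("AWS_ACCESS_KEY_ID")
    secret_key = environ.get("AWS_SECRET_ACCESS_KEY")
    if access_key and secret_key:
        return {
            "aws_access_key_id": access_key,
            "aws_secret_access_key": secret_key,
        }
    return {}


def make_session(environ=os.environ):
    """Build a session from whatever credentials the environment holds."""
    import boto3  # imported here so credentials_from_env works without boto3

    return boto3.Session(**credentials_from_env(environ))
```

Calling make_session().resource('s3') would then give the same kind of S3 resource as the session built from explicit keys.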

You’ve created a generic session to AWS. Now you’ll access S3 as a resource.

Use the below command to access S3 as a resource using the session.

s3 = session.resource('s3')

If you do not want to create a session and access the resource, you can create an s3 client directly by using the following command.

s3_client = boto3.client('s3', 
                      aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, 
                      aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY, 
                      region_name=REGION_NAME
                      )
  • boto3.client – API method to create a client directly
  • aws_access_key_id – Parameter for the access key ID. settings.AWS_SERVER_PUBLIC_KEY stands for wherever your application stores this value. You can omit it if the AWS CLI is configured as described in the prerequisites
  • aws_secret_access_key – Parameter for the secret access key. settings.AWS_SERVER_SECRET_KEY stands for wherever your application stores this value. You can omit it if the AWS CLI is configured as described in the prerequisites
  • region_name – Region where your S3 bucket resides, for example 'ap-south-1'. An AWS Region is a separate geographic area. You can learn more about AWS regions and the list of regions available in this AWS guide

You’ve created a client directly to access the S3 objects.

Now, you’ll learn how to access and download objects from the S3 bucket using Boto3 resource and Client.

Download a Single File From S3 Using Boto3

In this section, you’ll download a single file from AWS S3 using Boto3. You’ll use the download_file API, which is available on both the S3 resource and the S3 client.

Use the below script to download a single file from S3 using Boto3 Resource.

import boto3

session = boto3.Session(
    aws_access_key_id='<Access Key ID>',
    aws_secret_access_key='<Secret Access Key>',
)

s3 = session.resource('s3')

s3.Bucket('BUCKET_NAME').download_file('OBJECT_NAME', 'FILE_NAME')

print('success')
  • session – The session with your AWS account, as explained in the previous section
  • s3 – The resource created from the session
  • s3.Bucket().download_file() – API method to download a file from your S3 bucket
    • BUCKET_NAME – Name of your S3 bucket
    • OBJECT_NAME – Key of the object in S3, that is, its full path inside the bucket including any sub folders, e.g. folder1/folder2/filename.txt
    • FILE_NAME – Local path to save the file to. The name can differ from the object name, e.g. an object stored as a.txt can be downloaded as b.txt

Use the below script to download a single file from S3 using Boto3 Client.

import boto3
 
s3_client = boto3.client('s3', 
                      aws_access_key_id='<Access Key ID>',
                      aws_secret_access_key='<Secret Access Key>',
                      region_name='ap-south-1'
                      )

s3_client.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')
print('success')
  • s3_client – The client created for S3 using Boto3
  • s3_client.download_file() – API method to download a file from your S3 bucket
    • BUCKET_NAME – Name of your S3 bucket
    • OBJECT_NAME – Key of the object in S3, that is, its full path inside the bucket including any sub folders, e.g. folder1/folder2/filename.txt
    • FILE_NAME – Local path to save the file to. The name can differ from the object name, e.g. an object stored as a.txt can be downloaded as b.txt
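
To make the argument order concrete, here is a minimal sketch of a wrapper around download_file. The function name download_one and the dest_dir parameter are hypothetical. It assumes s3_client is a client created with boto3.client('s3') and that you want to keep only the last part of the key as the local file name.

```python
import os


def download_one(s3_client, bucket_name, object_key, dest_dir="."):
    """Download one object and return the local path it was saved to.

    Hypothetical helper: s3_client is expected to be a client created
    with boto3.client('s3').
    """
    # Keep only the last part of the key as the local file name
    local_path = os.path.join(dest_dir, os.path.basename(object_key))
    # Make sure the destination directory exists before downloading
    os.makedirs(dest_dir, exist_ok=True)
    # Argument order: bucket name, key of the object in S3, local file to write
    s3_client.download_file(bucket_name, object_key, local_path)
    return local_path


# Example call (bucket name and key are placeholders):
# download_one(s3_client, 'BUCKET_NAME', 'folder1/folder2/filename.txt', 'downloads')
```

Returning the local path makes it easy to open or log the file right after the download.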

You’ve downloaded a single file from AWS S3 using Python Boto3.

Next, you’ll download all files from S3.

Download All Files From S3 Using Boto3

In this section, you’ll download all files from S3 using Boto3.

You’ll create an S3 resource and loop over every object in the bucket using the objects.all() API. For each object, you’ll create the matching local sub directories, so that files with the same name under different S3 folders do not overwrite each other, and then download the file.

import os
import boto3

#Create Session
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
)

#Initiate S3 Resource
s3 = session.resource('s3')

# Select Your S3 Bucket
your_bucket = s3.Bucket('your_bucket_name')

# Iterate All Objects in Your S3 Bucket Over the for Loop
for s3_object in your_bucket.objects.all():

    # Keys that end with "/" are folder placeholders, not files, so skip them
    if s3_object.key.endswith('/'):
        continue

    # Split the object key into its parent directories and the file name.
    # Parent directories are stored in path, the file name in filename.
    path, filename = os.path.split(s3_object.key)

    # Create the sub directories if they do not exist yet.
    # path is empty when the file sits directly in the bucket.
    if path:
        os.makedirs(path, exist_ok=True)

    # Download the file into the matching local directory
    your_bucket.download_file(s3_object.key, os.path.join(path, filename))
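
The loop above relies on how os.path.split divides an object key. The short sketch below shows that behaviour on a few made-up keys. They are assumptions for illustration, not keys from a real bucket.

```python
import os

# Hypothetical example keys, not taken from a real bucket
example_keys = [
    "filename.txt",                  # file directly in the bucket
    "folder1/folder2/filename.txt",  # file inside nested "folders"
    "folder1/",                      # empty folder placeholder
]

# Map each key to the (path, filename) pair that os.path.split returns
split_keys = {key: os.path.split(key) for key in example_keys}

for key, (path, filename) in split_keys.items():
    print(f"{key!r}: path={path!r}, filename={filename!r}")
```

An empty path means there is no directory to create, and an empty filename marks a folder placeholder that has nothing to download.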

You’ve learnt how to download all files from an S3 bucket using Boto3.

Download Folder From S3 Using Boto3

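Boto3 has no single call that downloads a folder from S3. S3 has no real folders, and what looks like a folder is only a prefix shared by several object keys. Instead, you can download all the files under that prefix one by one, using the loop from the previous section. That is the cleanest implementation.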
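
As a sketch of that idea, the helper below restricts the loop to one key prefix with objects.filter(Prefix=...). The function name download_prefix and the dest_dir parameter are hypothetical. It assumes bucket is a Bucket obtained from session.resource('s3').

```python
import os


def download_prefix(bucket, prefix, dest_dir="."):
    """Download every object whose key starts with prefix; return local paths.

    Hypothetical helper: bucket is expected to be a Bucket obtained from
    session.resource('s3').Bucket(...).
    """
    downloaded = []
    for s3_object in bucket.objects.filter(Prefix=prefix):
        # Keys ending in "/" are empty folder placeholders, not files
        if s3_object.key.endswith("/"):
            continue
        # Rebuild the key's folder structure below dest_dir
        local_path = os.path.join(dest_dir, *s3_object.key.split("/"))
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
        bucket.download_file(s3_object.key, local_path)
        downloaded.append(local_path)
    return downloaded


# Example call (bucket name and prefix are placeholders):
# download_prefix(s3.Bucket('your_bucket_name'), 'folder1/', 'downloads')
```

Ending the prefix with "/" keeps keys such as folder10/file.txt from matching when only folder1/ is wanted.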

You can also check How to download all files and folders from S3 using AWS Cli.

Running Python File in Terminal

After you’ve written the script in Python 3, you may need to run it from the terminal. Refer to the tutorial How to Run Python File in terminal to learn how.

Conclusion

In this tutorial, you’ve learnt

  • How to specify credentials when connecting to AWS using Boto3 Python
  • How to download a file from S3 using Boto3 Python
  • How to download all files from an AWS S3 bucket using Boto3 Python
  • How to download a folder from S3 using Boto3 Python
