How to copy files to S3 using boto3

Posted on Sun 01 March 2020

python boto3

Boto3 is the AWS SDK for Python. It's a library that allows you to interact with the different AWS services. You can use it either on a computer/server to run all sorts of automation, or to write Lambda functions in your AWS account.

In this article I will go over a simple scenario: copying a file from the local machine to an S3 bucket. This can easily be adapted into creating automatic backups in the cloud.

AWS offers a free tier, and if you are interested in their 12-month free tier, you can create an account.

Creating the bucket

You can create the bucket either from the console or using the AWS CLI. I used the console for this one, so I logged into my AWS account. Under Services, choose S3, then 'Create Bucket'.

Buckets in S3 must have unique names across all of AWS, so you will only be able to give it a name that does not already exist; the interface will tell you if it is taken. I named mine razbackupbucket, so choose a name, then click Create.
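If you prefer to script this step instead of clicking through the console, boto3 can create the bucket too. Here is a minimal sketch, assuming your credentials are already configured and assuming the eu-west-1 region; outside us-east-1 you must pass a LocationConstraint, while in us-east-1 you omit it entirely.

import boto3

s3_client = boto3.client('s3')

# bucket names are global across AWS, so use your own unique name here
s3_client.create_bucket(
    Bucket='razbackupbucket',
    CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'}  # assumed region
)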

Provisioning credentials

In order to use the AWS services we will need to set up access credentials. To do that, under Services again, go to IAM.

Here, under Users, click on Add User. For the username I chose 'svc-s3' (the name is more for you than anything else). I also selected Programmatic access only, as the user will not need access to the console. Click Next: Permissions.

For permissions I selected 'Attach existing policies directly'; if you type s3 in the filter, you can select AmazonS3FullAccess. Then Next: Tags, Next: Review, and Create user.

It is very important on this screen to either download the .csv file or copy the Access key ID and Secret access key, because once you press Close you will never be able to see the Secret access key again.

Setting up the Python environment

On your computer now, in your work folder, create a new directory where everything will live. I like to have a virtual environment for every project to keep things separated, so we will create one. After that we will install boto3, as well as python-dotenv to store our credentials properly as environment variables.

mkdir s3-project
cd s3-project
python3 -m venv venv
source ./venv/bin/activate
pip install boto3
pip install python-dotenv

We will now create two files: one called 'app.py', which will hold our code, and one called '.env', where we will store the credentials.

touch app.py
touch .env

I also copied a file called s3.png into that folder, which I will use to copy to my bucket.

The .env file looks like this. Make sure you replace the values with the ones you got from the previous step.

AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key

And finally here is the code in app.py that will take the image file and upload it to the S3 bucket.

import os

import boto3
from dotenv import load_dotenv

load_dotenv()  # this loads the .env file with our credentials

file_name = 's3.png'  # name of the file to upload
bucket_name = 'razbackupbucket'  # name of the bucket

s3_client = boto3.client(
    's3',
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY')
)

# upload_file(local path, bucket, key); it returns None and raises on failure
s3_client.upload_file(file_name, bucket_name, file_name)

Notice that in the last line, the file name is referenced twice. The first argument is the path to the actual file on disk, and the second is the key, i.e. the name the object will have once it is uploaded to S3.
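For instance, if you wanted the object to end up under a different name, or inside a folder-like prefix, you would only change the third argument. The 'backups/' prefix below is just a hypothetical illustration:

# store the same local file under a hypothetical 'backups/' prefix in the bucket
s3_client.upload_file(file_name, bucket_name, 'backups/s3.png')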

You can now run your program with the virtual environment still activated.

python app.py

If you go back to your S3 bucket and refresh, you will see the new file in there. If not, this is a good time to retrace your steps and see what did not go according to plan.
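You can also verify the upload from code rather than the console. Here is a small sketch, reusing the client and bucket name from app.py; ClientError is botocore's standard exception for failed API calls, and botocore is installed alongside boto3.

from botocore.exceptions import ClientError

try:
    # list the bucket contents to confirm the upload arrived
    response = s3_client.list_objects_v2(Bucket=bucket_name)
    for obj in response.get('Contents', []):
        print(obj['Key'], obj['Size'])
except ClientError as e:
    print(f'Could not list the bucket: {e}')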

More information on boto3 and S3 can be found in the official boto3 documentation.

Thank you for reading this and please let me know if you have any questions.