Docker: backing up and safeguarding our data with Docker and Google Drive


Losing your production data is not a fun situation. Configuring a periodic backup is a good idea, but it is not enough if you keep your backup files on the same computer as the original data: you are putting all your eggs in the same basket. Let me share an idea that allows you to run a periodic backup and upload all the generated backup files to a cloud storage service like Google Drive.

Introduction

All of our production and staging environments rely on Docker Compose, a tool that allows you to configure a multi-container environment with just a .yaml file. With a simple set of commands you can stop, start, restart and update a container (a service, in Docker Compose terminology) without affecting the others.

We also use MySQL and MariaDB to store our data.

We will build two Docker images for this job: one image will run the backups periodically, and the other one will upload the backup files to a Google Drive folder.

One desirable property of the backup image is the ability to back up more than one database; we use environment variables to tell it which databases to process. The backup script looks like this:

#!/bin/sh

BACKUP_FOLDER=/opt/mysql/backup
NOW=$(date '+%Y-%m-%d_%H:%M:%S')

GZIP=$(which gzip)
MYSQLDUMP=$(which mysqldump)

### MySQL Server login info, taken from the container environment ###
MHOST=$MYSQL_CONTAINER_NAME
MPASS=$MYSQL_ROOT_PASSWORD
MUSER=root

# Create the backup folder if it does not exist (-p also works on BusyBox/Alpine)
[ ! -d "$BACKUP_FOLDER" ] && mkdir -p "$BACKUP_FOLDER"


# MYSQL_DATABASES is a comma-separated list of database names
for MDB in $(echo $MYSQL_DATABASES | sed "s/,/ /g")
do
    FILE=${BACKUP_FOLDER}/${MDB}-${NOW}.sql.gz
    $MYSQLDUMP -h $MHOST -u $MUSER -p${MPASS} --databases $MDB --single-transaction --hex-blob --default-character-set=utf8mb4 -N | $GZIP -9 > $FILE
done
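
As a rough sketch of how those environment variables could be wired up (the service name, image name and values below are placeholders, not our real ones), the backup service in a compose file could look like this:

  backup:
    image: db-backup                      # hypothetical name for the backup image described below
    container_name: db-backup
    restart: always
    environment:
      - MYSQL_CONTAINER_NAME=mysql        # hostname of the database service in the same compose file
      - MYSQL_ROOT_PASSWORD=change-me     # root password used by mysqldump
      - MYSQL_DATABASES=app_db,blog_db    # comma-separated list of databases to dump
    volumes:
      - ./folder:/opt/mysql/backup        # the dumps land in ./folder on the host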

The Backup Docker Image

Of course, we did not reinvent the wheel; we took the main idea from the internet, particularly from this article:

https://ricardolsmendes.medium.com/mysql-mariadb-with-scheduled-backup-jobs-running-in-docker-1956e9892e78

The main idea there is to build a Docker image from a clean and small Linux distribution like Alpine, install the MySQL/MariaDB client there, and finally copy your backup scripts into one of the special folders that cron reads on a regular basis.

Those special folders are located under:

/etc/periodic

some examples are:
/etc/periodic/hourly/
/etc/periodic/daily/
/etc/periodic/weekly/
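
As an illustration (this is a sketch, not our exact Dockerfile), an Alpine-based image that installs the MariaDB client and registers the backup script with cron could look roughly like this:

FROM alpine:3.14

# mariadb-client provides the mysqldump binary on Alpine
RUN apk add --no-cache mariadb-client gzip

# copy the backup script into one of cron's periodic folders
# (named without a .sh extension, since run-parts may skip filenames containing a dot)
COPY backup.sh /etc/periodic/daily/backup
RUN chmod +x /etc/periodic/daily/backup

# run BusyBox crond in the foreground so the container stays alive
CMD ["crond", "-f", "-l", "8"]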

The RClone Docker Image

The idea here is to take the folder with all of our backups and upload its contents to a folder in a cloud storage service like Google Drive.

RClone allows us to sync a server folder with a Google Drive folder. In order to do that, we need to allow rclone to access the Google Drive API.
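
Under the hood the whole job boils down to a single rclone command; assuming a remote named gdrive_bkp (as in the config file shown below), and paths that mirror the examples later in this article, it would look roughly like this:

rclone sync /opt/mysql/backup gdrive_bkp:/test --config /config/rclone.conf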

The original rclone Docker image is:

https://github.com/bcardiff/docker-rclone

How does rclone access Google Drive?

You need to create a Google Developer Console project (https://console.developers.google.com/) and enable the Google Drive API; you will also need a Google Service Account (GSA) configured.

Even though you can use OAuth to connect rclone to your Google account, the integration with a Google Service Account provides a better, non-interactive way to link rclone with Google.
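
If you prefer the command line over the web console, the same setup can be sketched with the gcloud CLI (the project ID and account name below are placeholders):

# enable the Google Drive API in the project
gcloud services enable drive.googleapis.com --project your-project-id

# create the service account (GSA)
gcloud iam service-accounts create backup-rclone \
    --display-name "rclone backups" --project your-project-id

# generate the JSON credentials file that will be embedded in rclone's config
gcloud iam service-accounts keys create credentials.json \
    --iam-account backup-rclone@your-project-id.iam.gserviceaccount.com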

Once the GSA is created, we generate a credentials file (in JSON format). The content of this JSON file is embedded in rclone's configuration file, in the variable called service_account_credentials.

The rclone configuration file is a simple text file like this:

[gdrive_bkp]
type = drive
scope = drive
root_folder_id = 15wEtNwW6i6d4wWpKAu_QvCWpts_q_3-U
service_account_credentials = {  "type": "service_account",  "project_id": "your-project-id",  "private_key_id": "your private key id",  "private_key": " your private key",  "client_email": "your-service-account@backupserviceaccount-303115.iam.gserviceaccount.com",  "client_id": "your_client_id",  "auth_uri": "https://accounts.google.com/o/oauth2/auth",  "token_uri": "https://oauth2.googleapis.com/token",  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/your-service-account%40backupserviceaccount-303115.iam.gserviceaccount.com"}

You will need to map a folder containing this file as a volume into the rclone container, as we will show later.

Where are our files uploaded?

RClone will upload files to the Google Drive space that belongs to the GSA, and this is how you indicate the GSA in the rclone config file:

"client_email": "your-service-account@backupserviceaccount-303115.iam.gserviceaccount.com"

This has two issues:
1) it is not easy to access the GSA's Google Drive with a browser
2) the GSA's Google Drive space is limited

In order to solve that, you may need a second Google account with more Google Drive space. The idea here is to configure rclone to sync the data into a folder shared by the bigger account, while accessing it through the GSA.

Why not use the account with more space directly with rclone and avoid the GSA?

Because the GSA allows us to access its resources (like Google Drive) without entering a password or code manually, and the GSA is a secure method to access our Google Cloud services.

OK, once you have created a folder, the next step is to share that folder with the GSA (you should share the folder with the GSA email address: your_gsa@backupserviceaccount-303115.iam.gserviceaccount.com).

Remember: the GSA should have Editor permission over the folder.

In order to use that shared folder in rclone, you will need the shared folder's ID. If you open the shared folder in your browser, you will see the folder ID as the last component of the URL.
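
For example, with the folder ID used in the config file above, the URL in the browser would look like this, and the last path component is the value you need:

https://drive.google.com/drive/folders/15wEtNwW6i6d4wWpKAu_QvCWpts_q_3-U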

This folder ID is set in rclone's config file, in the root_folder_id variable, so rclone will use that folder as its root folder. Your rclone config file should have a line like this one:

root_folder_id = your folder id

Executing RClone in Docker

docker run --rm -it \
    -v $(pwd)/config:/config \
    -v $(pwd)/folder:/source \
    -e SYNC_SRC="/source" \
    -e SYNC_DEST="gdrive_bkp:/test" \
    -e TZ="America/Argentina/Buenos_Aires" \
    -e CRON="0 0 * * *" \
    -e CRON_ABORT="0 6 * * *" \
    -e FORCE_SYNC=1 \
    bcardiff/rclone

The docker compose file looks like this:

version: "3.4"

services:
  rclone:
    image: bcardiff/rclone
    container_name: ccloner
    restart: always
    volumes:
      - ./config:/config
      - ./folder:/source
    environment:      
      - SYNC_SRC=/source
      - SYNC_DEST=gdrive_bkp:/test
      - TZ=America/Argentina/Buenos_Aires
      - CRON=0 4 24 * *
      - CRON_ABORT=0 6 * * *
      - FORCE_SYNC=0

where ./config is the folder that contains the rclone config file, and ./folder is the folder to sync; this is the same folder the backup container uses to save the generated backups.

Notice that we also configure cron here; make sure that rclone runs after the backups are generated.