backup.sh ~ Descartes Underwriting technical test for DevOps

The script takes a repository URL, a branch, and a destination directory as input, and optionally a maximum number of commits to back up, a verbose mode, and a debug mode.

It clones a git repository and lists all commits on the given branch, from oldest to most recent. For each commit SHA, it checks whether the commit has already been backed up (by looking for the commit's directory under ${DATA_DIR}/). If not, it checks out every file mentioned in the commit and dumps its state at that commit into the per-commit directory.
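The loop described above can be sketched roughly as follows. This is a minimal illustration and not the actual contents of scripts/backup.sh: the function name backup_commits, its positional arguments, and the choice of git rev-list, git diff-tree and git show are all assumptions made for the example. It also assumes the repository is already cloned and that the commit limit counts newly backed-up commits.

```shell
# Hypothetical helper (not taken from scripts/backup.sh):
# back up the files touched by each commit of a branch into <data_dir>/<sha>/.
backup_commits() {
  local repo_dir="$1" branch="$2" data_dir="$3" limit="${4:-0}"
  local count=0 sha

  # rev-list --reverse lists commits from oldest to most recent.
  # SHAs contain no whitespace, so word splitting is safe here.
  for sha in $(git -C "${repo_dir}" rev-list --reverse "${branch}"); do
    # An existing per-commit directory means "already backed up".
    if [ -d "${data_dir}/${sha}" ]; then continue; fi
    mkdir -p "${data_dir}/${sha}"

    # Files added or modified by the commit; --diff-filter=d drops deletions.
    # Paths with unusual characters (quoted by git) are not handled here.
    git -C "${repo_dir}" diff-tree --root --no-commit-id --name-only -r \
        --diff-filter=d "${sha}" |
    while IFS= read -r file; do
      mkdir -p "${data_dir}/${sha}/$(dirname "${file}")"
      # Dump the file content as it was at this commit.
      git -C "${repo_dir}" show "${sha}:${file}" > "${data_dir}/${sha}/${file}"
    done

    echo "new commit: ${sha}"
    count=$((count + 1))
    # Assumption: a limit of 0 means "no limit".
    if [ "${limit}" -gt 0 ] && [ "${count}" -ge "${limit}" ]; then break; fi
  done

  echo "done: ${count}"
}
```

Calling backup_commits ./devops-technical-test-data 01-01-2022-test "$(pwd)/data" 5 a second time would skip the five directories already present and continue with the next commits, which is the behaviour the directory check is meant to provide.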

The included workflow, which performs the actual backup, can be run manually through the GitHub UI.

Usage

$ /bin/bash scripts/backup.sh -r <repository> -b <branch> -d </path/to/data> [-n <commit limit>] [-v] [-x]

Example:

$ /bin/bash scripts/backup.sh -r https://github.com/descartes-underwriting/devops-technical-test-data.git -b 01-01-2022-test -d $(pwd)/data -n 5
Cloning into 'devops-technical-test-data'...
remote: Enumerating objects: 21265, done.
...
new commit: 282180fe7e5d9cbf297f2f0ef813cffe60ce2328
new commit: 46fe26c9dcf2354a0ed3f304ed6818de9606f7b5
new commit: 21e5331d1c0256701bb90cf017e519d54a88f618
new commit: 47998b5317e66b3bd456cfb07268c93e223704f2
new commit: 7c5aebc1feeef4eaf19083019547457b8cf3fc3d
done: 5

$ ls -l data/
total 0
drwxr-xr-x 1 patrick chicac  8 Jan  1 09:28 21e5331d1c0256701bb90cf017e519d54a88f618
drwxr-xr-x 1 patrick chicac 18 Jan  1 09:28 282180fe7e5d9cbf297f2f0ef813cffe60ce2328
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 46fe26c9dcf2354a0ed3f304ed6818de9606f7b5
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 47998b5317e66b3bd456cfb07268c93e223704f2
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 7c5aebc1feeef4eaf19083019547457b8cf3fc3d

$

See .github/workflows/backup.yaml for a usage sample.

Tests

A sample test script verifies the basic use case of backing up a limited number of commits. scripts/test.sh runs a couple of backups, then runs diff -r against a manually verified backup included in the repository. A workflow runs the tests on push.
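The comparison step can be pictured with a small helper like the one below. It is only a sketch of the diff -r idea, not the content of scripts/test.sh; the function name assert_same_tree and the PASS/FAIL messages are made up for the example.

```shell
# Hypothetical helper (not taken from scripts/test.sh):
# succeed only if two directory trees have identical content.
assert_same_tree() {
  local actual="$1" expected="$2"
  # diff -r exits 0 when the trees are identical, non-zero otherwise.
  if diff -r "${actual}" "${expected}"; then
    echo "PASS: ${actual} matches ${expected}"
  else
    echo "FAIL: ${actual} differs from ${expected}" >&2
    return 1
  fi
}
```

A test would run a backup into a scratch directory and then call assert_same_tree on that directory and the verified backup stored in the repository; a non-zero return code fails the workflow step.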

Not covered / Improvement ideas

  • Deleted files: backup.sh does not handle files deleted by a commit.
  • Iterative backups: keep track of the latest backed-up commit. For now, the script checks every commit on each run, which is inefficient. However, if the latest backed-up commit is tracked and that commit is later overwritten by a force push, the whole backup will have to be redone.
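Both ideas could be approached along the lines of the sketch below. Nothing here exists in the repository: the helper names list_deleted_files and next_range, and the state file holding the last backed-up SHA, are assumptions for illustration. The first helper uses git diff-tree with --diff-filter=D to list the files a commit removed. The second uses git merge-base --is-ancestor to decide whether the recorded commit is still part of the branch; when it is not (for example after a force push), it falls back to the full history.

```shell
# Hypothetical helper: list the files a commit deleted, one path per line.
list_deleted_files() {
  local repo_dir="$1" sha="$2"
  git -C "${repo_dir}" diff-tree --root --no-commit-id --name-only -r \
      --diff-filter=D "${sha}"
}

# Hypothetical helper: print the revision range that still needs a backup.
# state_file is assumed to contain the SHA of the last backed-up commit.
next_range() {
  local repo_dir="$1" branch="$2" state_file="$3" last=""
  if [ -f "${state_file}" ]; then last=$(cat "${state_file}"); fi

  # --is-ancestor exits 0 only if ${last} is reachable from the branch.
  # A rewritten or unknown commit gives a non-zero status, so the whole
  # branch is selected again.
  if [ -n "${last}" ] &&
     git -C "${repo_dir}" merge-base --is-ancestor "${last}" "${branch}" 2>/dev/null; then
    echo "${last}..${branch}"
  else
    echo "${branch}"
  fi
}
```

The printed range can be passed to git rev-list --reverse, and the state file would be updated with the output of git rev-parse for the branch once a run completes. How deleted files should be recorded in the backup (for instance in a per-commit manifest) is left open here.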