backup.sh ~ descartes underwriting technical test for devops

The script takes as input a repository URL, a branch, a destination directory and, optionally, a maximum number of commits to back up, a verbose mode and a debug mode.

It clones a git repository and lists all commits on the given branch (from oldest to most recent). For each commit SHA, it checks whether the commit has already been backed up (by looking for its directory under ${DATA_DIR}/). If not, it checks out every file mentioned in the commit and dumps its state at that commit into the per-commit directory.
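The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual scripts/backup.sh: the function name `backup_commits`, its argument order and the way the commit limit is counted (new commits only) are assumptions, and the repository is assumed to be cloned already.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the backup loop; not the actual scripts/backup.sh.
set -euo pipefail

# backup_commits <clone dir> <ref> <data dir> [<commit limit>]
# Assumptions: the repository is already cloned into <clone dir>, and <ref>
# resolves in that clone (for example origin/<branch>).
backup_commits() {
  local repo=$1 ref=$2 data_dir=$3 limit=${4:-0}
  local sha path dest done_count=0

  # rev-list prints newest first by default; --reverse gives oldest first.
  while IFS= read -r sha; do
    # An existing per-commit directory marks the commit as already backed up.
    if [ -d "${data_dir}/${sha}" ]; then
      continue
    fi
    # Stop once the optional limit of new commits is reached (0 = no limit).
    if [ "$limit" -gt 0 ] && [ "$done_count" -ge "$limit" ]; then
      break
    fi
    mkdir -p "${data_dir}/${sha}"

    # List the files touched by this commit. --root also covers the very
    # first commit, and --diff-filter=d leaves out deleted files.
    while IFS= read -r -d '' path; do
      dest="${data_dir}/${sha}/${path}"
      mkdir -p "$(dirname "$dest")"
      # Dump the file content as it was at this commit.
      git -C "$repo" show "${sha}:${path}" > "$dest"
    done < <(git -C "$repo" diff-tree -z --root --no-commit-id --name-only \
               --diff-filter=d -r "$sha")

    echo "new commit: ${sha}"
    done_count=$((done_count + 1))
  done < <(git -C "$repo" rev-list --reverse "$ref")

  echo "done: ${done_count}"
}
```

With these assumptions, a call such as `backup_commits ./devops-technical-test-data origin/01-01-2022-test "$(pwd)/data" 5` would back up at most five new commits, and a second run over the same data directory would skip them.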

The included workflow, which actually performs the backup, can be run manually through the GitHub UI.

Usage

$ /bin/bash scripts/backup.sh -r <repository> -b <branch> -d </path/to/data> [-n <commit limit>] [-v] [-x]

Ex:

$ /bin/bash scripts/backup.sh -r https://github.com/descartes-underwriting/devops-technical-test-data.git -b 01-01-2022-test -d $(pwd)/data -n 5
Cloning into 'devops-technical-test-data'...
remote: Enumerating objects: 21265, done.
...
new commit: 282180fe7e5d9cbf297f2f0ef813cffe60ce2328
new commit: 46fe26c9dcf2354a0ed3f304ed6818de9606f7b5
new commit: 21e5331d1c0256701bb90cf017e519d54a88f618
new commit: 47998b5317e66b3bd456cfb07268c93e223704f2
new commit: 7c5aebc1feeef4eaf19083019547457b8cf3fc3d
done: 5

$ ls -l data/
total 0
drwxr-xr-x 1 patrick chicac  8 Jan  1 09:28 21e5331d1c0256701bb90cf017e519d54a88f618
drwxr-xr-x 1 patrick chicac 18 Jan  1 09:28 282180fe7e5d9cbf297f2f0ef813cffe60ce2328
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 46fe26c9dcf2354a0ed3f304ed6818de9606f7b5
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 47998b5317e66b3bd456cfb07268c93e223704f2
drwxr-xr-x 1 patrick chicac 14 Jan  1 09:28 7c5aebc1feeef4eaf19083019547457b8cf3fc3d

$

Note that deleted objects are ignored (no file is created for them). However, to keep track of commit IDs, empty commit directories can still be created and archived by placing a .gitkeep file in them.
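Git does not track empty directories, which is why a placeholder file is needed when the backup itself is committed. A minimal sketch of that step, assuming a hypothetical helper `keep_if_empty` that is called on each per-commit directory once its files have been dumped:

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name and calling convention are assumptions.
set -euo pipefail

# keep_if_empty <commit dir>
# Creates the directory if needed and drops a .gitkeep into it when it has
# no content, so that the commit ID still appears in the archived backup.
keep_if_empty() {
  local dir=$1
  mkdir -p "$dir"
  # ls -A lists every entry except "." and ".."; no output means empty.
  if [ -z "$(ls -A "$dir")" ]; then
    touch "${dir}/.gitkeep"
  fi
}
```

A commit that only deletes files would then end up as a directory containing nothing but .gitkeep.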

Check .github/workflows/backup.yaml for usage sample.

Tests

A sample test script verifies the basic use case of backing up a limited number of commits. scripts/test.sh runs a couple of backups and runs diff -r against a manually verified backup included in the repository. A workflow runs the tests on push.
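The comparison step can be illustrated as follows. This is only a sketch of the idea, not the content of scripts/test.sh; the function name `check_backup` and its arguments are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the comparison step; not the actual scripts/test.sh.
set -euo pipefail

# check_backup <actual dir> <expected dir>
# diff -r compares both trees recursively and exits non-zero on any
# difference, including files present on one side only.
check_backup() {
  local actual=$1 expected=$2
  if diff -r "$expected" "$actual"; then
    echo "OK"
  else
    echo "FAIL: backup differs from the reference" >&2
    return 1
  fi
}
```

Because the function returns a non-zero status on any difference, a workflow step that calls it fails as soon as the backup drifts from the reference.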

Not covered / Improvement ideas

  • Keep track of the latest backed-up commit to allow incremental backups. For now, the script checks all commits on every run, which is not very efficient. However, if the latest backed-up commit is tracked and that commit is later overwritten by a force push, the whole backup will eventually have to be redone.
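One possible shape for this improvement is sketched below. It is not implemented in the repository: the state file, the function name `commits_to_backup` and the fallback policy are all assumptions. The idea is to record the last backed-up SHA, list only the commits after it, and fall back to a full scan when that SHA is no longer an ancestor of the branch (as happens after a force push).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of incremental listing; not part of scripts/backup.sh.
set -euo pipefail

# commits_to_backup <clone dir> <ref> <state file>
# Prints the commits to process, oldest first. <state file> is assumed to
# hold the SHA of the last commit that was backed up.
commits_to_backup() {
  local repo=$1 ref=$2 state_file=$3 last=""
  if [ -f "$state_file" ]; then
    last=$(cat "$state_file")
  fi
  # merge-base --is-ancestor exits 0 only when $last is an ancestor of $ref.
  # It fails when history was rewritten or the commit is unknown to the clone.
  if [ -n "$last" ] &&
     git -C "$repo" merge-base --is-ancestor "$last" "$ref" 2>/dev/null; then
    # Only commits reachable from the ref but not from the recorded commit.
    git -C "$repo" rev-list --reverse "${last}..${ref}"
  else
    # No state, or the recorded commit is gone from the branch: full scan.
    git -C "$repo" rev-list --reverse "$ref"
  fi
}
```

The full scan in the fallback branch stays cheap to reason about because commits that already have a directory can still be skipped, as in the current script.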