# backup.sh ~ Descartes Underwriting technical test for DevOps
The script takes a repository URL, a branch, and a destination directory as input, plus optional settings: the number of commits to back up, verbose mode, and debug mode.
It clones a git repository and lists all commits in the given branch (from oldest to most recent). For each commit SHA, it checks whether the commit was already backed up (by looking for a `${DATA_DIR}/<commitsha>` directory). If not, it checks out all files mentioned in the commit and dumps their state at that commit into the per-commit directory.
The included workflow, which performs the actual backup, can be run manually through the GitHub UI.
Note that deleted objects are ignored (no file is created). To still keep track of the commit ID in that case, an empty directory can be created and archived using a `.gitkeep` file.
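The per-commit loop described above can be sketched roughly as follows. This is an illustration and not the actual `scripts/backup.sh`: the function name `backup_commits` and its three positional arguments (an existing local clone, a branch, a destination directory) are assumptions, and paths with unusual characters are not handled.

```sh
#!/usr/bin/env bash
# Sketch of the per-commit backup loop (hypothetical helper, not backup.sh itself).
# Arguments: <local clone> <branch or ref> <destination directory>

backup_commits() {
  local repo_dir=$1 branch=$2 data_dir=$3
  local sha path

  # List commits from oldest to most recent.
  git -C "$repo_dir" rev-list --reverse "$branch" | while read -r sha; do
    # Skip commits that already have a backup directory.
    [ -d "$data_dir/$sha" ] && continue
    mkdir -p "$data_dir/$sha"

    # Files added or modified by the commit; deletions are filtered out.
    git -C "$repo_dir" diff-tree --root --no-commit-id --name-only -r \
        --diff-filter=d "$sha" |
    while IFS= read -r path; do
      mkdir -p "$data_dir/$sha/$(dirname "$path")"
      # Dump the file content as it was at this commit.
      git -C "$repo_dir" show "$sha:$path" > "$data_dir/$sha/$path"
    done

    # Nothing was written (e.g. deletion-only commit): keep the directory trackable.
    if [ -z "$(ls -A "$data_dir/$sha")" ]; then
      touch "$data_dir/$sha/.gitkeep"
    fi
  done
}
```

Because a commit is skipped as soon as its directory exists, running the function twice on the same destination only processes commits that appeared in between.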
The full backup is not included here: I intentionally limited the workflow to back up 10 commits automatically. It is still possible to trigger the backup manually through the GitHub UI, or by running the script on your local workstation:
```sh
$ /bin/bash scripts/backup.sh -r https://github.com/descartes-underwriting/devops-technical-test-data.git -b 01-01-2022-test -d all
```
A sample test script exists to verify the basic use case of backing up a limited number of commits. `scripts/test.sh` runs a couple of backups and runs a `diff -r` against a manually verified backup included in the repository. A workflow runs the tests on push.
The script tests a few backups (1, 5, and 10 commits) against the Descartes Underwriting repository. I then created another repository whose commits touch multiple files, to test slightly more complex commits. I have the feeling this part would need more work.
Check `.github/workflows/tests.yaml` for the test workflow.
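The comparison step of such a test can be sketched as below. This is a minimal illustration, not the content of `scripts/test.sh`: the function name `verify_backup` and its two directory arguments are assumptions.

```sh
#!/usr/bin/env bash
# Sketch: compare a fresh backup against a manually verified reference copy.
# verify_backup is a hypothetical helper name.

verify_backup() {
  local expected_dir=$1 actual_dir=$2

  # diff -r compares both trees recursively and exits non-zero on any difference.
  if diff -r "$expected_dir" "$actual_dir"; then
    echo "OK: $actual_dir matches $expected_dir"
  else
    echo "FAIL: $actual_dir differs from $expected_dir" >&2
    return 1
  fi
}
```

Returning a non-zero status on mismatch is what lets a CI job fail when the backup output drifts from the reference.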
- Keep track of the latest backed-up commit to allow incremental backups. For now, the script checks all commits, which is not really efficient. However, if the latest backed-up commit is tracked and that commit is later overwritten by a force push, the whole backup will eventually have to be redone.
- Move the script/workflow out of the backup repository to make the workflow reusable. Having the script in the same repository is also annoying: every push triggers a backup, which creates conflicts and requires pulling the code again.
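The first improvement could look roughly like the sketch below. It is not implemented in the script: the function name `commits_to_backup` and the state file holding the last backed-up SHA are assumptions made for illustration. The idea is to list only newer commits when the recorded commit is still an ancestor of the branch, and to fall back to the full list when it is not (for example after a force push).

```sh
#!/usr/bin/env bash
# Sketch: list the commits that still need a backup, given the last backed-up SHA.
# commits_to_backup and the state file are hypothetical.
# Arguments: <local clone> <branch or ref> <state file>

commits_to_backup() {
  local repo_dir=$1 branch=$2 state_file=$3
  local last=""

  # Read the last backed-up SHA, if any was recorded.
  [ -f "$state_file" ] && last=$(cat "$state_file")

  if [ -n "$last" ] &&
     git -C "$repo_dir" merge-base --is-ancestor "$last" "$branch" 2>/dev/null; then
    # The recorded commit is still in the branch history: only newer commits.
    git -C "$repo_dir" rev-list --reverse "$last..$branch"
  else
    # No state yet, or history was rewritten: list every commit again.
    git -C "$repo_dir" rev-list --reverse "$branch"
  fi
}
```

After a successful run, the caller would write the last SHA it processed back into the state file.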