feat: keep track of empty commits; handle deleted files
parent 08424d1bf1
commit 7e2bdb0696
@ -37,6 +37,8 @@ drwxr-xr-x 1 patrick chicac 14 Jan 1 09:28 7c5aebc1feeef4eaf19083019547457b8cf3
$
```

Note that deleted files are skipped (no file is created for them). To keep track of a commit id whose directory would otherwise be empty, the empty directory is still created and archived with a `.gitkeep` file.
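
The `.gitkeep` convention described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `mark_if_empty` is hypothetical, and it assumes the commit directory has already been created.

```bash
#!/usr/bin/env bash
# Hypothetical helper: git does not track empty directories, so drop a
# .gitkeep placeholder into a directory that has no entries at all.
mark_if_empty() {
  local dir="$1"
  # `ls -A` also lists dotfiles but omits "." and "..", so empty output
  # means the directory really is empty.
  if [ -d "${dir}" ] && [ -z "$(ls -A "${dir}")" ]; then
    touch "${dir}/.gitkeep"
  fi
}
```

A call such as `mark_if_empty "data/<commit sha>"` leaves directories that already contain files untouched.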

Check `.github/workflows/backup.yaml` for a usage sample.

## Tests
@ -45,5 +47,5 @@ A sample test script exists to verify basic use case of backup'ing a limited num

## Not covered / Improvement ideas

- Deleted files in commits: the `backup.sh` script does not cover deleted files.
- Keep track of the latest backed-up commit to allow incremental backups. For now, the script checks all commits, which is inefficient. However, if the latest backed-up commit is tracked and that commit is later overwritten by a force push, the whole backup will eventually have to be redone.
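
One possible shape for the incremental idea is sketched below. It is not part of `backup.sh`: the state file name `.last-backup-sha` and both function names are assumptions made for illustration. It relies on `git merge-base --is-ancestor`, which exits with 0 only when the first commit is an ancestor of the second, to detect a recorded commit that a force push has rewritten.

```bash
#!/usr/bin/env bash
# Sketch only: STATE_FILE and both function names are hypothetical.
STATE_FILE="${STATE_FILE:-.last-backup-sha}"

# Record the most recent commit that was backed up.
save_last_backup() {
  printf '%s\n' "$1" > "${STATE_FILE}"
}

# Print the revision argument to hand to `git rev-list`:
#   "<last>..<branch>" when the recorded commit is still an ancestor of
#   the branch, or "<branch>" (full history) when nothing is recorded or
#   the recorded commit is no longer reachable from the branch.
backup_range() {
  local branch="$1" last=""
  [ -f "${STATE_FILE}" ] && last="$(cat "${STATE_FILE}")"
  if [ -n "${last}" ] && git merge-base --is-ancestor "${last}" "${branch}" 2>/dev/null; then
    printf '%s..%s\n' "${last}" "${branch}"
  else
    printf '%s\n' "${branch}"
  fi
}
```

With this approach a force push simply falls back to a full backup, which matches the limitation noted in the list above.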
@ -70,13 +70,12 @@ do
# There are malformed files names that are creating complex filenames to parse.
# Those malformed filenames are double quotes, so to remove quotes, -c core.quotepath=false
# and -z are used. sed 's/\x0//g' is removing the null byte
FILES=$($GIT ${GIT_OPTS[@]} show --pretty= --name-only -z ${COMMIT_SHA} | sed 's/\x0//g')
FILES=$($GIT ${GIT_OPTS[@]} show --pretty= --name-only -z ${COMMIT_SHA} | sed 's/\x0//g')

if test -z ${FILES}
then
# merge commit, etc. There is no file here.
echo "No file was found in commit ${COMMIT_SHA}; skipping"
continue
echo "No file was found in commit ${COMMIT_SHA}"
fi

TARGET_BACKUP_SHA=${DATA_DIR}/${COMMIT_SHA}
@ -92,10 +91,25 @@ do
echo "Writing ${TARGET_BACKUP_SHA}/${FILE}"
fi

# ${FILE} contains path/to/file
$GIT ${GIT_OPTS[@]} show ${COMMIT_SHA}:${FILE} > ${TARGET_BACKUP_SHA}/${FILE}
# Retrieve file state (Added, Modified, Deleted)
STATE=$(${GIT} ${GIT_OPTS[@]} show --name-status --pretty= ${COMMIT_SHA} | grep -E '^..${FILE}$' | cut -f1)

if test "${STATE}" != "D"
then
# ${FILE} contains path/to/file
$GIT ${GIT_OPTS[@]} show ${COMMIT_SHA}:${FILE} > ${TARGET_BACKUP_SHA}/${FILE}
else
echo "Skipping ${FILE} as file was deleted in this commit"
fi
done

# if ${TARGET_BACKUP_SHA} is empty, keep track it was "backuped"
if test -z "$(ls -A ${TARGET_BACKUP_SHA})"
then
echo "Folder ${TARGET_BACKUP_SHA} is empty. Marking it to keep as it in the backup"
touch ${TARGET_BACKUP_SHA}/.gitkeep
fi

NUM_ADDED=$((NUM_ADDED + 1))

if test ${NUM_ADDED} -eq ${MAX_NUM}
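
A note on the `STATE=` lookup in the hunk above: the `grep -E` pattern is single-quoted, so the shell passes `${FILE}` to grep literally instead of expanding it, and a file name containing regex metacharacters would be misread even if the pattern were double-quoted. A possible alternative, sketched here under the assumption that the input is the tab-separated output of `git show --name-status --pretty=`, compares the path as a plain string with awk. The helper name `file_state` and the `FILE_PATH` environment variable are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git show --name-status --pretty=` output on
# stdin and print the status letter (A, M, D, ...) of the path given as $1.
file_state() {
  # The path travels through the environment so that awk does not
  # interpret backslash escapes in it. $NF is the last tab-separated
  # field, which is the new path on rename lines (R100<TAB>old<TAB>new).
  FILE_PATH="$1" awk -F '\t' \
    '$NF == ENVIRON["FILE_PATH"] { print substr($1, 1, 1); exit }'
}
```

It could be used as `STATE=$("${GIT}" "${GIT_OPTS[@]}" show --name-status --pretty= "${COMMIT_SHA}" | file_state "${FILE}")`. An empty result means the path was not listed for that commit.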
@ -1,100 +0,0 @@
# Descartes Underwriting

## Context

We wish to create a backup tool that will save only the last modified files of a storage unit.

In our example, the storage unit is **not a bucket**.

The storage unit is the `DD-MM-YYYY-test` branch of the current `descartes-underwriting/devops-technical-test-data` git repository.

## Property

The `descartes-underwriting/devops-technical-test-data` repository is not frozen and will have new commits.

Commits will be added to the `DD-MM-YYYY-test` branch multiple times every day.

The `DD-MM-YYYY-test` branch name will be adapted using the standard datetime convention, e.g. `01-01-2022-test` for the 1st of January 2022.

## Task

Develop a backup tool to save the modified files at each commit.

### Submission

If something is not clear, you can ask questions to the recruiter.

When submitting your project, your version should **not be draft** but complete and following best practices.

The solution should be saved on a **private** `descartes-devops` repository on your github account.

The solution should include:

- source code
- test code

When the final version is ready:

1. Send an email to the recruiter indicating that you finished the project and sharing the url of the project
2. Grant access to:

   - <https://github.com/alexandreCameron>
   - <https://github.com/Mareak>
   - <https://github.com/jrdescartes>

### Script

Create a script to automate the backup process using open source software.

The script should track the changes of the branch `DD-MM-YYYY-test` of the `descartes-underwriting/devops-technical-test-data` repository.

The execution of the script should be carried out with a github-action / gitlab-pipeline or any other tool automating git workflow on your git project.

It is highly recommended to use a scheduling tool to execute the backup process.

### Data

The backup should store files in separate folders.

The backup file structure should be based on the sha1 of the `descartes-underwriting/devops-technical-test-data`.

Starting from the initial commit [282180fe7e5d9cbf297f2f0ef813cffe60ce2328](https://github.com/descartes-underwriting/devops-technical-test-data/commit/282180fe7e5d9cbf297f2f0ef813cffe60ce2328), all the history should be backed up.

## File structure example

For the following commits on the `descartes-underwriting/devops-technical-test-data`:

| SHA | OPERATION |
|-----|-----------|
| Commit_N | create readme.md |
| Commit_N+1 | create doc.txt |
| Commit_N+2 | create data/test/test.txt |
| Commit_N+3 | append text to ./doc.txt |
| Commit_N+4 | create test/project/project1.txt |

The `candidate/descartes-backup-project` repository should have

```bash
$ tree .
.
├── .gitworkflow
│   └── workflows
│       └── my-lovely-workflow.yml
├── data
│   ├── N
│   │   └── readme.md
│   ├── N+1
│   │   └── doc.txt
│   ├── N+2
│   │   └── data
│   │       └── test
│   │           └── test.txt
│   ├── N+3
│   │   └── doc.txt
│   └── N+4
│       └── test
│           └── project
│               └── project1.txt
└── script
    └── my-beautiful-script.best-language
```
@ -1,92 +0,0 @@
# Descartes Underwriting

## Context

We wish to create a backup tool that will save only the last modified files of a storage unit.

In our example, the storage unit is **not a bucket**.

The storage unit is the `DD-MM-YYYY-test` branch of the current `descartes-underwriting/devops-technical-test-data` git repository.

## Property

The `descartes-underwriting/devops-technical-test-data` repository is not frozen and will have new commits.

Commits will be added to the `DD-MM-YYYY-test` branch multiple times every day.

The `DD-MM-YYYY-test` branch name will be adapted using the standard datetime convention, e.g. `01-01-2022-test` for the 1st of January 2022.

## Task

Develop a backup tool to save the modified files at each commit.

### Submission

Script and data should be saved on a private `candidate/descartes-backup-project` repository on your github account.

Access should be granted to all members of the `descartes-underwriting` group:

<https://github.com/orgs/descartes-underwriting/people>

Especially:

* <https://github.com/alexandreCameron>
* <https://github.com/Mareak>
* <https://github.com/jrdescartes>

### Script

Create a script to automate the backup process using open source software.

The script should track the changes of the branch `DD-MM-YYYY-test` of the `descartes-underwriting/devops-technical-test-data` repository.

The execution of the script should be carried out with a github-action / gitlab-pipeline or any other tool automating git workflow on your git project.

It is highly recommended to use a scheduling tool to execute the backup process.

### Data

The backup should store files in separate folders.

The backup file structure should be based on the sha1 of the `descartes-underwriting/devops-technical-test-data`.

Starting from the initial commit [282180fe7e5d9cbf297f2f0ef813cffe60ce2328](https://github.com/descartes-underwriting/devops-technical-test-data/commit/282180fe7e5d9cbf297f2f0ef813cffe60ce2328), all the history should be backed up.

## File structure example

For the following commits on the `descartes-underwriting/devops-technical-test-data`:

| SHA | OPERATION |
|-----|-----------|
| Commit_N | create readme.md |
| Commit_N+1 | create doc.txt |
| Commit_N+2 | create data/test/test.txt |
| Commit_N+3 | append text to ./doc.txt |
| Commit_N+4 | create test/project/project1.txt |

The `candidate/descartes-backup-project` repository should have

```bash
$ tree .
.
├── .gitworkflow
│   └── workflows
│       └── my-lovely-workflow.yml
├── data
│   ├── N
│   │   └── readme.md
│   ├── N+1
│   │   └── doc.txt
│   ├── N+2
│   │   └── data
│   │       └── test
│   │           └── test.txt
│   ├── N+3
│   │   └── doc.txt
│   └── N+4
│       └── test
│           └── project
│               └── project1.txt
└── script
    └── my-beautiful-script.best-language
```