Git clone: a problem in big repositories

This is the part of nginx-setup.yml where we clone the repository.
- name: Clone custom HTML page from GitHub
  git:
    repo: 'https://github.com/prem14choudhary/simple-app'
    dest: /var/www/html
    clone: yes
    force: yes
    update: yes
If you have been exposed to industry practices, you will notice an issue here. What is it?
The issue: with a small Git repository this works fine, but in an organization the repository is often huge, and cloning it again every time the code is updated is slow and can be a disaster for the site.
For this we can use git pull instead of git clone.
Seems easy, right?
But here's the catch: whenever your site gets high traffic, the Auto Scaling group automatically increases the number of instances. The new instances don't have the existing repository, so how will the git pull command run? git pull needs an existing local repository to apply the changes to.
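The missing-repository case can be made visible from the playbook itself. Here is a minimal sketch, assuming the site lives in /var/www/html as in the task above. The variable name `site_git_dir` is made up for illustration:

```yaml
# Look for the .git directory that an existing checkout would contain.
- name: Check whether a checkout already exists
  ansible.builtin.stat:
    path: /var/www/html/.git
  register: site_git_dir   # hypothetical variable name

# Freshly launched instances have no checkout yet, so this message appears for them.
- name: Report instances that have no repository yet
  ansible.builtin.debug:
    msg: "No repository in /var/www/html, so git pull would fail on this instance"
  when: not site_git_dir.stat.exists
```

On an instance the Auto Scaling group has just launched, the second task prints the message. On older instances it is skipped.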
Solution: add git clone to the EC2 user data, and update the playbook to use git pull instead of git clone. This way, whenever a new EC2 instance is launched, it automatically clones your repository from GitHub, and the playbook only has to pull later changes.
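As a rough sketch, the user data could be written in cloud-config form. This assumes the AMI runs cloud-init (the stock Amazon Linux and Ubuntu AMIs do) and that /var/www/html is empty or missing at first boot. The repository URL is a placeholder and must be replaced with your own:

```yaml
#cloud-config
# Sketch only: assumes cloud-init is available on the AMI.
packages:
  - git   # make sure git is installed before the clone runs
runcmd:
  # List form is executed directly, without a shell.
  # Replace the placeholder URL with the real repository.
  - [git, clone, "https://github.com/<Github-username>/<repo-name>", /var/www/html]
```

One thing to watch: git clone refuses to write into a non-empty directory, so if something has already put files in /var/www/html, the clone has to run before that or into a cleaned directory.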
Steps to do this:
1. While launching the EC2 instance, go to Advanced details.

2. Scroll down to user data.
3. Add the git clone command.
4. Update the Ansible playbook: remove the cloning task and add a git pull task.
- name: Git pull to update the repository
  git:
    repo: 'https://github.com/<Github-username>/<repo-name>'
    dest: /var/www/html
    version: "main"
    update: yes
    force: yes
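To confirm that a run actually brought in new commits, the result of the task can be registered and printed. This is a minimal sketch that relies on the `before` and `after` return values of Ansible's git module. The variable name `site_repo` is made up for illustration:

```yaml
- name: Git pull to update the repository
  ansible.builtin.git:
    repo: 'https://github.com/<Github-username>/<repo-name>'
    dest: /var/www/html
    version: main
    update: true
    force: true
  register: site_repo   # hypothetical variable name

# Runs only when the checkout moved to a different commit.
- name: Show which commit the instance moved from and to
  ansible.builtin.debug:
    msg: "Updated from {{ site_repo.before }} to {{ site_repo.after }}"
  when: site_repo is changed
```

If the repository was already up to date, the second task is skipped, which is a quick way to see that no code changed.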
Here's one more issue.
We will tackle that in the next blog: Smart solution for executing git commands without being registered in the GitHub IP allow list. Till then,
Hari OM Tat Sat🕉❤️


