Skip to content

1.2.2 Git Core Operations

Git daily minimal loop diagram

This section starts putting Git into practice. The focus is on mastering the add, commit, status, log, diff, and undo operations you will use every day, so you can build the basic development habit of “write a bit of code, check the status, and commit a record.”

  • Use git add, git commit, git status, and git log fluently
  • Learn to use git diff to inspect changes
  • Be able to write a .gitignore file
  • Master several commonly used undo operations

We will use a simulated AI project to practice all operations. First, create the project:

Terminal window
mkdir ai-image-classifier
cd ai-image-classifier
git init
# Create the basic project structure
mkdir data models src
touch src/train.py src/model.py src/utils.py
touch README.md requirements.txt

git status is one of the commands you will use most often. It tells you the current state of the repository: which files have been modified? Which files are in the staging area? Which files are not yet tracked by Git?

Terminal window
git status

Output:

On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
README.md
requirements.txt
src/
nothing added to commit but untracked files present

Untracked files: Git can see these files, but it is not managing them yet. You need to use git add to tell Git, “please start tracking these files.”


Terminal window
# Add a single file
git add README.md
# Add multiple files
git add src/train.py src/model.py
# Add an entire folder
git add src/
# Add all files (most commonly used)
git add .
# Add all modified tracked files (excluding new files)
git add -u

First, write some content into the files:

Terminal window
echo "# AI Image Classifier" > README.md
echo "torch>=2.0" > requirements.txt
cat > src/model.py << 'EOF'
import torch.nn as nn
class SimpleCNN(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
self.pool = nn.MaxPool2d(2, 2)
self.fc1 = nn.Linear(16 * 16 * 16, 10)
def forward(self, x):
x = self.pool(torch.relu(self.conv1(x)))
x = x.view(-1, 16 * 16 * 16)
x = self.fc1(x)
return x
EOF

Now add all files and check the status:

Terminal window
git add .
git status

Output:

On branch main
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: README.md
new file: requirements.txt
new file: src/model.py
new file: src/train.py
new file: src/utils.py

The files have turned green as “Changes to be committed”, which means they are now in the staging area and ready to be committed.


Terminal window
git commit -m "Initialize project: add model definition and project structure"

Output:

[main (root-commit) a1b2c3d] Initialize project: add model definition and project structure
5 files changed, 18 insertions(+)
create mode 100644 README.md
create mode 100644 requirements.txt
create mode 100644 src/model.py
create mode 100644 src/train.py
create mode 100644 src/utils.py

Commit messages should be concise and clear, so people can immediately tell what changed.

Good commit messages:

Terminal window
git commit -m "Add CNN model definition"
git commit -m "Fix bug where learning rate was not updated in the training loop"
git commit -m "Add data augmentation: random flip and color jitter"
git commit -m "Update README: add installation instructions"

Bad commit messages:

Terminal window
git commit -m "update" # What was updated?
git commit -m "fix" # What was fixed?
git commit -m "aaa" # ???
git commit -m "Changed some stuff" # Basically says nothing

git diff tells you “what changed since the last commit.”

Terminal window
# Add a new layer to model.py
cat > src/model.py << 'EOF'
import torch
import torch.nn as nn
class SimpleCNN(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
self.conv2 = nn.Conv2d(16, 32, 3, padding=1) # Newly added convolution layer
self.pool = nn.MaxPool2d(2, 2)
self.fc1 = nn.Linear(32 * 8 * 8, 10)
def forward(self, x):
x = self.pool(torch.relu(self.conv1(x)))
x = self.pool(torch.relu(self.conv2(x))) # Newly added
x = x.view(-1, 32 * 8 * 8)
x = self.fc1(x)
return x
EOF

Now check the changes:

Terminal window
git diff

The output will highlight differences in red and green:

--- a/src/model.py
+++ b/src/model.py
@@ -1,4 +1,5 @@
+import torch
import torch.nn as nn
class SimpleCNN(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
+ self.conv2 = nn.Conv2d(16, 32, 3, padding=1) # Newly added convolution layer
self.pool = nn.MaxPool2d(2, 2)
- self.fc1 = nn.Linear(16 * 16 * 16, 10)
+ self.fc1 = nn.Linear(32 * 8 * 8, 10)
  • Red (- at the start): deleted lines
  • Green (+ at the start): newly added lines

Now commit this change:

Terminal window
git add src/model.py
git commit -m "Add a second convolution layer to improve model capacity"
Terminal window
git diff # View unstaged changes in the working directory
git diff --staged # View staged changes (added but not yet committed)
git diff HEAD~1 # View what the most recent commit changed
git diff abc1234 def5678 # Compare differences between two commits

Terminal window
git log

Output:

commit def5678... (HEAD -> main)
Author: Zhang San <[email protected]>
Date: Mon Feb 9 10:30:00 2026
Add a second convolution layer to improve model capacity
commit a1b2c3d...
Author: Zhang San <[email protected]>
Date: Mon Feb 9 10:00:00 2026
Initialize project: add model definition and project structure
Terminal window
# One record per line (most commonly used)
git log --oneline
# Output:
# def5678 Add a second convolution layer to improve model capacity
# a1b2c3d Initialize project: add model definition and project structure
# With file change statistics
git log --oneline --stat
# Show branches visually (very useful when branches exist)
git log --oneline --graph --all

Some files should not be managed by Git:

  • Data files (training data that is several GB)
  • Model weight files (hundreds of MB)
  • Virtual environment folders
  • Temporary files generated by the system
  • Sensitive information such as API keys and passwords

Create a .gitignore file to tell Git to ignore them:

Terminal window
cat > .gitignore << 'EOF'
# Python cache
__pycache__/
*.pyc
*.pyo
# Virtual environment
venv/
.venv/
env/
# Jupyter Notebook checkpoints
.ipynb_checkpoints/
# Data files (too large to put in Git)
data/*.csv
data/*.json
data/*.zip
*.h5
*.hdf5
# Model weight files
models/*.pt
models/*.pth
models/*.onnx
*.bin
# Environment variable files (contain API keys and other sensitive information)
.env
.env.local
# IDE settings
.vscode/
.idea/
# Operating system files
.DS_Store
Thumbs.db
# Logs
logs/
*.log
# Distribution / packaging
dist/
build/
*.egg-info/
EOF
Terminal window
git add .gitignore
git commit -m "Add .gitignore: ignore cache, data, model weights, and sensitive files"
Terminal window
# Create some files that should be ignored
mkdir -p __pycache__
touch __pycache__/model.cpython-311.pyc
touch .env
echo "OPENAI_API_KEY=your_api_key_here" > .env
# Check the status — these files will not appear
git status
# Output: nothing to commit, working tree clean

.env and __pycache__/ are both ignored and will not be committed to Git. Your API key is safe.


Git provides several different “safety nets,” and you choose one based on how far you want to go back.

For beginners, use this order of thinking:

flowchart TD
A["Something looks wrong"] --> B["Run git status"]
B --> C["Run git diff"]
C --> D{"Have you committed it?"}
D -->|"No, not staged"| E["git restore file<br/>discard local edits"]
D -->|"Staged but not committed"| F["git restore --staged file<br/>keep edits, unstage"]
D -->|"Committed"| G["Prefer a new fixing commit<br/>or use reset only in practice"]

Scenario 1: You changed a file but have not run add yet, and want to restore it

Section titled “Scenario 1: You changed a file but have not run add yet, and want to restore it”
Terminal window
# You changed src/utils.py, but you are not happy with the changes and want to restore it to the last committed state
git restore src/utils.py
# Restore all files
git restore .

Scenario 2: You already ran add and want to remove it from the staging area, but keep the changes

Section titled “Scenario 2: You already ran add and want to remove it from the staging area, but keep the changes”
Terminal window
# You added model.py, but you do not want to commit it yet
git restore --staged src/model.py
# The file will move from the staging area back to the working directory, and your changes will remain

Scenario 3: You already committed and want to change the commit message

Section titled “Scenario 3: You already committed and want to change the commit message”
Terminal window
# You just committed and realized the message was wrong
git commit --amend -m "Corrected commit message"

Scenario 4: You already committed and want to undo the entire commit

Section titled “Scenario 4: You already committed and want to undo the entire commit”
Terminal window
# Undo the most recent commit, but keep the file changes (go back to before add)
git reset HEAD~1
# Undo the most recent commit and discard all changes (full rollback, use with caution)
git reset --hard HEAD~1

For the first few weeks, treat git reset --hard as an emergency command, not a daily command. In team projects, rewriting history may also affect other people, so the safer habit is often to make a new commit that fixes the mistake.

Terminal window
# Suppose you accidentally committed an API key
echo "API_KEY=your_api_key_here" > config.py
git add .
git commit -m "Add configuration file"
# Oops! The key should not have been committed. Undo this commit
git reset HEAD~1
# The file is still there, but the commit has been undone. Now add it to .gitignore
echo "config.py" >> .gitignore
git add .gitignore
git commit -m "Update .gitignore: ignore config.py"
What I want to undoCommandAre the file changes still there?
Changes in the working directory (not yet added)git restore file-name❌ Discarded
Files in the staging area (added but not committed)git restore --staged file-name✅ Kept
The last commit message was wronggit commit --amend -m "new message"✅ Kept
The most recent commit (keep changes)git reset HEAD~1✅ Kept
The most recent commit (discard changes)git reset --hard HEAD~1❌ Discarded

Hands-on Practice: Simulate a Full Development Workflow

Section titled “Hands-on Practice: Simulate a Full Development Workflow”
Terminal window
# 1. Write the training script
cat > src/train.py << 'EOF'
import torch
from model import SimpleCNN
model = SimpleCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
print("Number of model parameters:", sum(p.numel() for p in model.parameters()))
EOF
git add src/train.py
git commit -m "Add basic training script"
# 2. Write a utility function
cat > src/utils.py << 'EOF'
def accuracy(predictions, labels):
"""Calculate accuracy"""
correct = (predictions.argmax(dim=1) == labels).sum().item()
return correct / len(labels)
EOF
git add src/utils.py
git commit -m "Add accuracy calculation utility function"
# 3. Update the README
cat > README.md << 'EOF'
# AI Image Classifier
A simple image classification project using CNN.
## Installation
```bash
pip install -r requirements.txt
```
## Usage
```bash
python src/train.py
```
EOF
git add README.md
git commit -m "Update README: add installation and usage instructions"
# 4. View the full history
git log --oneline

You should see something like these 5 commit records:

f6g7h8i Update README: add installation and usage instructions
d4e5f6g Add accuracy calculation utility function
b2c3d4e Add basic training script
9a0b1c2 Add .gitignore: ignore cache, data, model weights, and sensitive files
a1b2c3d Initialize project: add model definition and project structure

Each one is an archive point you can roll back to.

Project reference and review notes
  1. The final git log --oneline should show the three exercise commits plus the earlier setup commits.
  2. Each commit should save one small idea: training script, utility function, or README update.
  3. git status after the final commit should be clean, or it should show only intentional untracked files that you can explain.
  4. If src/train.py fails because torch or model is missing, that is separate from the Git exercise. Record the run failure, but keep the commit workflow evidence.
  5. Good commit messages use action words and explain the change, such as Add basic training script, not vague messages like update stuff.

Keep this page’s proof of learning as a small evidence card:

Repo State
git status before and after the operation
Operation
init, add, commit, branch, merge, remote, pull, or push command used
History
git log or branch graph showing what changed
Failure Check
untracked files, wrong branch, merge conflict, or remote/auth issue
Expected Output
a clean Git trace that another learner can replay safely
CommandPurposeFrequency
git statusCheck the current status⭐⭐⭐⭐⭐
git add .Stage all changes⭐⭐⭐⭐⭐
git commit -m "message"Commit⭐⭐⭐⭐⭐
git log --onelineView history⭐⭐⭐⭐
git diffInspect changes⭐⭐⭐⭐
git restoreUndo working-directory changes⭐⭐⭐
git resetUndo commits⭐⭐