Session 4: Git Fundamentals

AEM 7010 · Doing Applied Economics Research: Practical Skills

Prof. Ariel Ortiz-Bobea

2026-04-20

Why Version Control?

§ Tutorial: Why Version Control?

Version control matters in three situations every applied economist now faces.

  • Yourself over time. analysis_v1.R, analysis_v2.R, analysis_FINAL_v2.R, analysis_REALLY_FINAL.R. Which version produced the coefficient in Table 2?
  • You and AI coding tools. Claude Code, Copilot, Cursor, Codex all produce dozens of lines in seconds. Some are wrong in subtle ways.
  • You and co-authors. Multiple people editing the same project without silently overwriting each other’s work.

Git is the connective infrastructure that makes all three workable.

Git ≠ GitHub

§ Tutorial: What Is Git?

  • Git is a tool that runs on your laptop. It manages snapshots of your project locally.
  • GitHub is a cloud service that hosts Git repositories for backup, collaboration, and sharing.
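
A minimal sketch of the "Git is local" point, run in a throwaway directory (the temporary folder is an assumption for illustration, not part of the tutorial). A new repository lives entirely in a hidden .git folder and has no connection to GitHub until one is added:

```shell
set -e
# Throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"

git init -q                  # create a repository on this machine only
ls -a                        # the hidden .git folder holds every snapshot
git remote -v                # prints nothing: no GitHub (or any other remote) is attached
remotes=$(git remote)
```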

Today: Git (local). Session 5 on Wednesday: GitHub (remote).

Situations Where Git Helps

§ Tutorial: Situations Where Git Helps

Six scenarios, each introducing vocabulary you will meet today and Wednesday.

  • Working solo, months later: commit, log, diff
  • Refactoring clunky code: branch, merge
  • Working with a co-author: clone, push, pull, remote, merge conflict
  • Returning to a paper revision: tag, checkout
  • Recovering from a mistake: restore, reset
  • Working with AI coding tools: diff, stage, revert
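
As a rough illustration of the first scenario, the sketch below builds a two-commit history and then asks Git what changed. The file contents and the identity values are hypothetical, and everything runs in a throwaway directory:

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# Repo-local identity (hypothetical values) so commits work on a fresh machine.
git config user.name "Demo Student"
git config user.email "student@example.com"

printf 'fit <- lm(y ~ x, data = df)\n' > analysis.R
git add analysis.R
git commit -q -m "Add baseline regression"

printf 'fit <- lm(y ~ x + factor(state), data = df)\n' > analysis.R
git add analysis.R
git commit -q -m "Add control for state fixed effects"

git log --oneline -- analysis.R      # every commit that touched the script
git diff HEAD~1 HEAD -- analysis.R   # what changed between the last two versions
changed=$(git diff --name-only HEAD~1 HEAD)
```

With a history like this, "which version produced the coefficient?" becomes a lookup rather than a guess.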

The Three Areas of Git

§ Tutorial: The Three Areas of Git

Every Git project has three areas. This is the mental model for the entire course.

Working Directory → git add → Staging Area → git commit → Repository

The workflow: edit → stage → commit.
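
The edit → stage → commit cycle can be sketched in a few commands. This is an illustrative run in a throwaway directory; the file name and identity values are assumptions, not the tutorial's own:

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# Repo-local identity (hypothetical values) so the commit succeeds on a fresh machine.
git config user.name "Demo Student"
git config user.email "student@example.com"

# 1. Edit: the file exists only in the working directory.
echo 'x <- 1' > analysis.R
git status --short          # "??" marks an untracked file

# 2. Stage: copy the current version into the staging area.
git add analysis.R
git status --short          # "A " marks a file staged for commit

# 3. Commit: record the staged snapshot in the repository.
git commit -q -m "Add first analysis script"
git log --oneline           # one line per commit: short hash and message
```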

A Mental Model: The Shopping Cart

§ Tutorial: The Three Areas of Git

Git                 Shopping cart
------------------  ------------------------------------------------
Working directory   Browsing the store, dropping items in the cart
Staging area        Reviewing the cart at checkout
Repository          Your order history
git add             Moving an item to the checkout page
git commit          Clicking Place Order. Confirmation number issued.
Commit hash         The order confirmation number
git log             Your order history page

Staging exists so you can see exactly what you are about to commit before it becomes part of the project history.
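
One way to do that review is git diff --staged, which shows only what is "in the cart". The sketch below is illustrative: it stages one of two edits and checks what the next commit would contain. File names and identity values are hypothetical.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
printf 'y ~ x\n' > model.R
git add model.R
git commit -q -m "Add baseline model"

# Make two edits, but stage only one of them.
printf 'y ~ x + state_fe\n' > model.R
printf 'scratch\n' > notes.txt
git add model.R

git diff --staged            # exactly what the next commit will contain
git status --short           # "M " = staged change, "??" = untracked and left out
staged=$(git diff --staged --name-only)
```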

⟶ Now switch to the tutorial: Setup, Your First Repository, Staging & Committing (~20 min hands-on)

Writing Good Commits

§ Tutorial: Writing good commits

Be specific. “Fix bug” is useless. “Fix off-by-one error in sample selection” is useful.

One logical change per commit. If the message needs the word “and”, it is two commits.

Research changelog style:

Add control for state fixed effects
Switch to winsorized outcome at 1%
Fix sample filter for pre-2000 observations

Avoid: “Fixed stuff”, “WIP”, “Lots of changes”, “.”
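
A small sketch of "one logical change per commit": two unrelated edits sit in the working directory, and each is staged and committed separately. The scripts, their contents, and the identity values are hypothetical.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
printf 'df <- subset(df, year > 2000)\n' > sample.R
printf 'fit <- lm(y ~ x, data = df)\n' > model.R
git add sample.R model.R
git commit -q -m "Add baseline estimation scripts"

# Two unrelated edits are now sitting in the working directory.
printf 'df <- subset(df, year >= 2000)\n' > sample.R
printf 'fit <- lm(y ~ x + factor(state), data = df)\n' > model.R

# Stage and commit them one at a time, each with its own message.
git add sample.R
git commit -q -m "Fix sample filter to include year 2000"
git add model.R
git commit -q -m "Add control for state fixed effects"

git log --oneline            # reads as a changelog, newest first
```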

⟶ Now switch to the tutorial: Exercise 1 (~5 min hands-on)

Safety Rails: .gitignore + Undoing Mistakes

§ Tutorial: .gitignore and Undoing Mistakes

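Keep files out of Git with a .gitignore. The patterns are grouped here to fit the slide; in the actual file, each pattern goes on its own line and each comment on its own line starting with #: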

*.csv, *.dta, data/   # data (too large or sensitive)
.Rhistory, .RData     # R artifacts
.DS_Store, Thumbs.db  # system files
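
Written out as a real file, the same patterns could look like the sketch below, which also checks them against a few hypothetical files. In an actual .gitignore, commas are not separators and a # only starts a comment at the beginning of a line.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q

# One pattern per line; comments go on their own lines.
cat > .gitignore <<'EOF'
# data (too large or sensitive)
*.csv
*.dta
data/
# R artifacts
.Rhistory
.RData
# system files
.DS_Store
Thumbs.db
EOF

# Hypothetical files to test the patterns against.
mkdir data
touch data/raw.txt survey.csv .RData analysis.R

git status --short               # only .gitignore and analysis.R are listed
git check-ignore -v survey.csv   # prints the file and line of the matching pattern
visible=$(git status --porcelain | wc -l | tr -d ' ')
```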

Three undo operations:

  • Unstage a file staged by accident: git restore --staged <file>
  • Discard a bad edit (destructive): git restore <file>
  • Undo last commit, keep the work: git reset --soft HEAD~1

Git is forgiving, but only for states it knows about. Commit often.
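
The three undo operations can be tried safely in a throwaway repository. This is an illustrative sketch with hypothetical file contents and identity values; git restore assumes Git 2.23 or newer.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Student"          # hypothetical identity
git config user.email "student@example.com"
printf 'v1\n' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# 1. Unstage: the edit stays in the working directory, only the staging is undone.
printf 'v2\n' > analysis.R
git add analysis.R
git restore --staged analysis.R
git status --short           # " M" = modified but not staged

# 2. Discard: the working copy goes back to the last commit. The edit is gone for good.
git restore analysis.R
cat analysis.R               # back to the committed content

# 3. Undo the last commit but keep its changes staged.
printf 'v3\n' > analysis.R
git add analysis.R
git commit -q -m "Switch to winsorized outcome at 1%"
git reset --soft HEAD~1
git status --short           # "M " = change is still staged, ready to recommit
```

Only step 2 loses work: the discarded edit was never staged or committed, so Git has no copy of it.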

⟶ Now switch to the tutorial: Exercise 2 (~10 min hands-on)

What’s Next

Today: Git on your own laptop. Commits, staging, undoing mistakes, ignoring files.

Wednesday (Session 5): GitHub.

  • Push my-research to a GitHub repository
  • Pull changes from a co-author
  • Create branches for risky experiments
  • Open pull requests for review
  • Tag versions for paper submissions and replication packages

Before Wednesday: create a free GitHub account at github.com if you have not already. Questions now?

The full walkthrough with copy-paste commands, screenshots, and RStudio / VS Code equivalents is on the companion site: arielortizbobea.github.io/aem7010