Session 5: GitHub & Collaboration

AEM 7010 · Doing Applied Economics Research: Practical Skills

Prof. Ariel Ortiz-Bobea

2026-04-22

Quick Recap: Where We Left Off

§ Tutorial: Quick Recap

Session 4 gave you local Git on your laptop.

  • Snapshots: git commit, git log, git diff.
  • Staging: git add picks what goes into the next commit.
  • Safety rails: .gitignore, git restore, git reset --soft.

Today we take Git online: connect to GitHub, push and pull, branch, open pull requests, and tag versions for reproducibility.

It is fine if your my-research folder from session 4 is messy or gone. Today starts fresh from a template.

Why GitHub? Four Motivations

§ Tutorial: Why Put Your Code on GitHub?

GitHub adds a cloud copy of your Git history. Four reasons that is worth doing for every research project.

  • Backup and recovery. Hard drives fail, laptops get stolen. git clone restores a full project with history onto a new machine in one command.
  • Working across your own machines. Laptop, office desktop, compute cluster. Each one pulls and pushes; history stays the same everywhere.
  • Co-authors. Each person commits their own work. Git surfaces conflicts at the line level instead of silently overwriting edits.
  • Sharing and replication. Top economics journals increasingly require replication packages. A tagged GitHub repo is the cleanest way to comply.

Connect to GitHub: SSH Keys

§ Tutorial: Connect to GitHub: SSH Keys

SSH is how your laptop talks to GitHub for push and pull. Public-key cryptography: one keypair does it all.

  • You generate a private and public key on your laptop.
  • Private key stays home. Public key goes to GitHub.
  • On each connection your laptop signs a challenge with the private key, and GitHub verifies it with the public key. No password crosses the network.

Already connected to GitHub with SSH? Just run ssh -T git@github.com to verify. If you see “Hi yourname! You’ve successfully authenticated, but GitHub does not provide shell access.”, you are set (the “no shell access” part is expected); skip ahead.
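As a rough sketch of the keypair idea, the commands below generate a throwaway Ed25519 key in a scratch folder. The folder, the file name id_ed25519_demo, and the email label are placeholders chosen for this illustration; the tutorial's own steps are the reference for a real setup, which would normally live in ~/.ssh and use a passphrase.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo only: write the keypair to a scratch folder, not ~/.ssh,
# so running this sketch cannot overwrite a key you already use.
demo_dir="$(mktemp -d)"
key="$demo_dir/id_ed25519_demo"   # hypothetical file name

# Generate an Ed25519 keypair.
# -C is only a label (placeholder email here).
# -N "" means no passphrase, acceptable only for a throwaway demo key.
ssh-keygen -q -t ed25519 -C "you@example.edu" -N "" -f "$key"

# Two files: the private key (stays on the laptop) and the .pub file
# (the only part that would be pasted into GitHub).
ls "$demo_dir"
cat "$key.pub"

# With a real key registered on GitHub, this verifies the connection.
# It needs network access, so it is left commented out here:
# ssh -T git@github.com
```

The public key is a single line starting with ssh-ed25519; that line is what GitHub stores.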

⟶ Now switch to the tutorial: Connect to GitHub: SSH Keys, Steps 1–6 (~8 min). HTTPS fallback is at the bottom of the section if SSH is blocked on your network.

Where a Repo Comes From — and How It Syncs

§ Tutorial: Where a Repository Comes From · Cloning · Push and Pull

Three entry points for putting a project onto your laptop:

  • From scratch. New empty repo on GitHub; clone it (or git remote add an existing folder).
  • Clone an existing repo. A co-author invited you, or a tutorial points at a public repo.
  • Use this template. One click on GitHub makes your own copy with clean history. (What Exercise 3 uses.)
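The "from scratch" route with an existing folder can be sketched as follows. To keep the example runnable offline, a local bare repository stands in for the empty GitHub repo; that stand-in, the folder names, and the demo identity are assumptions made for this illustration. On GitHub the remote URL would instead look like git@github.com:<user>/<repo>.git.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"

# Stand-in for a new, empty GitHub repo (assumption: a local bare repo).
# The -b flag needs Git 2.28 or newer.
git init -q --bare -b main "$work/remote.git"

# An existing project folder that is not yet connected to any remote.
mkdir "$work/my-research"
cd "$work/my-research"
git init -q -b main
echo "# my-research" > README.md
git add README.md
git commit -q -m "Initial commit"

# Connect the folder to the remote and publish main.
git remote add origin "$work/remote.git"
git push -q -u origin main

# List the configured remote.
git remote -v
```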

Once a remote is set up, two commands keep everything in sync:

%%{init: {'theme': 'neutral'}}%%
flowchart LR
  laptop["Your Laptop"] -- git push --> github["GitHub (origin)"]
  github -- git pull --> laptop
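A minimal sketch of the push and pull loop in the diagram, assuming a local bare repository as an offline stand-in for GitHub and two clones playing the roles of a laptop and an office desktop. All folder names and the demo identity are made up for the illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"

# Stand-in for "your copy of the template on GitHub":
# a bare repo that already holds one commit on main.
git init -q -b main "$work/template"
cd "$work/template"
echo "# my-research" > README.md
git add README.md
git commit -q -m "Initial commit from template"
git clone -q --bare "$work/template" "$work/origin.git"

# Two clones of the same remote, one per machine.
git clone -q "$work/origin.git" "$work/laptop"
git clone -q "$work/origin.git" "$work/desktop"

# Laptop: change a file, commit, push.
cd "$work/laptop"
echo "Note added on the laptop." >> README.md
git commit -q -am "Add a note to the README"
git push -q origin main

# Desktop: pull brings the new commit down.
# --ff-only refuses to create a surprise merge.
cd "$work/desktop"
git pull -q --ff-only origin main
git log --oneline
```

After the pull, both clones point at the same commit, which is the "history stays the same everywhere" property.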

⟶ Now switch to the tutorial: Exercise 3 — use the course template to create your own my-research, clone it, make a change, push, pull (~15 min solo).

Branches: Parallel Lines of Work

§ Tutorial: Branches

A branch is a named pointer to a commit — free to create, free to delete, lets you work in parallel.

%%{init: {
  'theme': 'base',
  'themeVariables': {
    'git0': '#8EC6E8',
    'git1': '#8FC88E',
    'gitBranchLabel0': '#1B1B1B',
    'gitBranchLabel1': '#1B1B1B',
    'tagLabelBackground': '#F7F4E9',
    'tagLabelColor': '#1B1B1B',
    'tagLabelBorder': '#BBBBBB'
  },
  'gitGraph': { 'parallelCommits': true, 'showCommitLabel': false }
}}%%
gitGraph
   commit tag: "A"
   commit tag: "B"
   branch add-iv-analysis
   checkout add-iv-analysis
   commit tag: "D"
   commit tag: "E"
   checkout main
   commit tag: "C"
   merge add-iv-analysis tag: "F"

Branch when the work is: exploratory (might abandon), multi-step (many commits), risky (could break main), parallel with someone, or reviewed before merging. Otherwise, just commit on main.

Mental model: main is the canvas you will sign. A branch is a sketchpad. Copy over what works (merge); discard what does not (delete).
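The commit graph above (A through F) could be reproduced roughly like this. File names and commit messages are invented for the illustration; only the branch name add-iv-analysis comes from the diagram.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"
git init -q -b main "$work/my-research"   # -b needs Git 2.28 or newer
cd "$work/my-research"

# A and B on main.
echo "# my-research" > README.md
git add README.md
git commit -q -m "A: start project"
echo 'ols <- lm(log(wage) ~ educ, data = wages)' > ols.R
git add ols.R
git commit -q -m "B: baseline OLS"

# Sketchpad: create the branch and commit D and E there.
git switch -q -c add-iv-analysis
echo '# first stage' > iv.R
git add iv.R
git commit -q -m "D: first stage"
echo '# second stage' >> iv.R
git commit -q -am "E: second stage"

# Meanwhile, C lands on main.
git switch -q main
echo "Run ols.R first." >> README.md
git commit -q -am "C: document run order"

# Copy over what works: the merge creates commit F.
git merge -q -m "F: merge IV analysis" add-iv-analysis

# The branch pointer is no longer needed once merged.
git branch -d add-iv-analysis

git log --oneline --graph
```

Deleting the branch removes only the pointer; commits D and E remain reachable from main through the merge.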

Pull Requests: The Review Gateway

§ Tutorial: Pull Requests

A pull request is GitHub’s code-review workflow. Not the same as git pull.

Six-step workflow:

  1. Branch, commit work, git push -u origin <branch>.
  2. Open a PR on github.com (Compare & pull request).
  3. Write a description (what changed, why).
  4. Co-author reviews the diff line by line.
  5. Approve / request changes.
  6. Click Merge pull request.
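Only step 1 happens on the command line. A hedged sketch of that step, again with a local bare repository standing in for GitHub (an assumption so the example runs offline; the branch name, file, and identity are illustrative):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"

# Offline stand-in for the shared GitHub repo, seeded with one commit.
git init -q -b main "$work/seed"
cd "$work/seed"
echo "# my-research" > README.md
git add README.md
git commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/origin.git"

git clone -q "$work/origin.git" "$work/laptop"
cd "$work/laptop"

# Step 1: branch, commit work, push the branch.
git switch -q -c add-female-control
echo 'lm(log(wage) ~ educ + female, data = wages)' > model.R
git add model.R
git commit -q -m "Add female control"

# -u records origin/add-female-control as the upstream,
# so later plain "git push" and "git pull" work on this branch.
git push -q -u origin add-female-control

# Show which remote branch the local branch now tracks.
git rev-parse --abbrev-ref '@{u}'

# Steps 2 to 6 (open the PR, describe, review, approve, merge)
# happen in the browser on github.com.
```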

All of this happens on github.com. Neither RStudio nor VS Code has full PR support in its default install. You will practice the full PR loop in the take-home pair exercise.

Merge Conflicts

§ Tutorial: Handling Merge Conflicts

A conflict happens when two branches change the same lines of the same file in different ways. Git pauses the merge and asks you to choose.

<<<<<<< HEAD
lm(log(wage) ~ educ + exper + tenure + nonwhite, data = wages)
=======
lm(log(wage) ~ educ + exper + tenure + female, data = wages)
>>>>>>> add-female-control

Two resolution patterns:

  • Combinable. Edit to keep both: ... + nonwhite + female. Stage, commit.
  • Incompatible. Pick one, or combine the intents explicitly. Stage, commit.

Conflicts are normal, not dangerous. Git surfaces the disagreement; you decide what the line should say.
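The conflict shown above can be reproduced and resolved in a scratch repository along these lines. This is a sketch of the "combinable" pattern only; the baseline line, file name model.R, and commit messages are assumptions made for the illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"
git init -q -b main "$work/my-research"   # -b needs Git 2.28 or newer
cd "$work/my-research"

# Common starting point.
echo 'lm(log(wage) ~ educ + exper + tenure, data = wages)' > model.R
git add model.R
git commit -q -m "Baseline model"

# The branch edits the line one way...
git switch -q -c add-female-control
echo 'lm(log(wage) ~ educ + exper + tenure + female, data = wages)' > model.R
git commit -q -am "Add female control"

# ...and main edits the same line another way.
git switch -q main
echo 'lm(log(wage) ~ educ + exper + tenure + nonwhite, data = wages)' > model.R
git commit -q -am "Add nonwhite control"

# The merge stops with a conflict and a non-zero exit code;
# "|| true" only keeps this script running.
git merge add-female-control || true
cat model.R   # shows the <<<<<<< / ======= / >>>>>>> markers

# Combinable case: write the line you actually want, then stage and commit.
echo 'lm(log(wage) ~ educ + exper + tenure + nonwhite + female, data = wages)' > model.R
git add model.R
git commit -q -m "Merge add-female-control: keep both controls"

git log --oneline --graph
```

Staging the edited file is what tells Git the conflict is resolved; the commit then completes the paused merge.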

⟶ Now switch to the tutorial: Exercise 4: deliberately create and resolve a real merge conflict, solo (~10 min hands-on).

Tags for Reproducibility

§ Tutorial: Tags for Reproducibility

A tag is a named pointer to a specific commit. Mark paper-submission milestones so you can return in one command.

v0.1-first-submission       code used in the initial JPE submission
v0.2-first-RR               code for the first round of R&R
v1.0-accepted               final version in the published paper
v1.0-replication-package    archive posted on openICPSR / Zenodo

Create: git tag -a v0.1 -m "..." · Push: git push --tags
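A small sketch of the tag round trip: mark a milestone, keep working, push the tag, then return to the tagged state. The local bare repository is an offline stand-in for GitHub, and the tag message, file contents, and identity are placeholders for the illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity so commits and tags work in a clean environment (demo only).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.edu"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.edu"

work="$(mktemp -d)"
git init -q --bare -b main "$work/origin.git"   # offline stand-in for GitHub
git init -q -b main "$work/my-research"
cd "$work/my-research"
git remote add origin "$work/origin.git"

# The code as it stood at submission time.
echo 'lm(log(wage) ~ educ + exper, data = wages)' > model.R
git add model.R
git commit -q -m "Model as submitted"

# Annotated tag (-a) with a message (-m) marking the milestone.
git tag -a v0.1-first-submission -m "Code used in the initial submission"

# Work continues after submission.
echo 'lm(log(wage) ~ educ + exper + tenure, data = wages)' > model.R
git commit -q -am "Add tenure after referee comments"

# A plain push does not send tags; --tags sends them explicitly.
git push -q -u origin main
git push -q --tags

# Return to the submitted code in one command (detached HEAD, read-only visit).
git switch -q --detach v0.1-first-submission
cat model.R
```

Running git switch main afterwards returns to the latest version of the work.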

Data too large for GitHub? Keep code on GitHub, data on openICPSR (AEA journals) or Zenodo (issues a DOI, free, up to 50 GB per record, can archive each GitHub Release automatically once the integration is enabled). Link them from the README. Git LFS is not a replication archive.

What’s Next

Today: GitHub as remote. Cloning, pushing, pulling, branches, merge conflicts, tagged replication.

Next Monday (Session 6): AI tools for research.

  • Coding assistants: Claude Code, Copilot, Cursor
  • Using AI safely with Git (review every diff; version-control every AI-assisted change)
  • Research transparency in the AI era

Before next class:

  • Finish the take-home pair PR exercise with a classmate (~20 min, on the tutorial).
  • Make sure your GitHub account is set up and SSH works; we will use it extensively in sessions 6–7.

Companion site with copy-paste commands and the full walkthrough: arielortizbobea.github.io/aem7010