DPS909 & OSD600 Winter 2017 - Git Walkthrough 2

From CDOT Wiki

Git Walkthrough: Working Remote

In the previous walk-through, we looked at the basics of using git for revision control. This time we'll learn how to use git in a distributed workflow, and how to interact with remote repositories, such as those on Github.

Step 1: Using Git with Github

For this walkthrough, we'll need an account on Github. Think of a Github account like a driver's license: it's what allows you to participate in many open source projects. If you haven't done so before, take a few minutes to do the following:

  1. create an account
  2. verify your email address
  3. consider enabling two-factor authentication
  4. add info to your account bio
  5. consider changing your profile picture

We'll also need to do a bit more setup with git. Specifically, we need to set some global config options, SSH keys, etc:

  1. set your username locally
  2. set your email address locally
  3. set your line endings
  4. consider setting other global defaults while you're at it (e.g., your default editor)
  5. set up SSH keys
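
The setup items above can be sketched as a handful of git config commands. This is a minimal sketch, not the only way to do it: the name, email address, and editor are placeholders, and the scratch HOME directory is an assumption added so the example can be tried without overwriting real settings.

```shell
# Assumption: a scratch HOME keeps this sketch away from your real settings.
# Remove the next two lines to apply the options to your actual account.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Identity recorded in every commit you make (placeholder values).
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Line endings: "input" suits macOS/Linux; Windows users typically use "true".
git config --global core.autocrlf input

# Default editor for commit messages (any editor command works here).
git config --global core.editor "nano"

# Review what was written.
git config --global --list

# SSH keys are generated separately and interactively, for example:
#   ssh-keygen -t ed25519 -C "you@example.com"
```

The public half of the SSH key then needs to be added to your Github account through its settings page.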

Step 2: Clones and Forking

Git is a distributed revision control system, which means that unlike client/server systems, git has no concept of a central server that we work against. Instead, everyone who works on a repository first creates his or her own copy. If I want to work with a repository that you made, my first task is to make my own personal copy, then work on that directly.

Git calls this copy of a repository a clone. One of the great advantages of having your own local clone of a repository is that you don't need a network connection in order to work with it (i.e., everything is local), and you have complete control over everything you do (i.e., you don't need permission to commit changes). Cloning a repository copies every commit and version of every file in the original project's history. As a result, git clones can take a lot of disk space.

Every git developer works directly with his or her own local clone of a repository, then uses a combination of git's push and pull commands to sync their work with remote repositories--we'll do this below. Because it's so common to want to share our local repositories with one another, and to clone each other's repositories locally, we use Github as a public host.

It's possible with git to clone repositories from a USB key, or copy entire repos across a shared drive. But it's not realistic to do this with people on the other side of the world. Instead, it's more convenient to use the Internet and a shared hosting platform.
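
To see that a clone really is a full copy, and that no server is required, here is a small sketch that clones one local directory into another, much as you would from a USB key or shared drive. The directory names, file name, and commit identity are made up for illustration.

```shell
# Hypothetical identity so the commits below work on a fresh machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Build a tiny repository with two commits in a scratch directory.
demo="$(mktemp -d)"
git init -q "$demo/original"
cd "$demo/original"
echo "v1" > notes.txt && git add notes.txt && git commit -q -m "First commit"
echo "v2" > notes.txt && git commit -q -am "Second commit"

# Clone by filesystem path: no network and no hosting service involved.
git clone -q "$demo/original" "$demo/copy"

# Both repositories now hold the same history.
original_count="$(git -C "$demo/original" rev-list --count HEAD)"
copy_count="$(git -C "$demo/copy" rev-list --count HEAD)"
echo "original: $original_count commits, copy: $copy_count commits"
```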

Every Github user can have an unlimited number of open source repositories (at the time of writing, private/closed source repositories cost money). Because the default workflow with git is to clone a repository before we use it, Github includes this as a central feature, which it calls forking.

What Github calls a fork is really just another name for a clone. You'll see both fork and clone used on Github, and you can think of the difference this way:

  • fork: copy an existing Github repository to your Github account. This forked copy will live on Github's servers, but you will have full ownership over it. It's typical to fork any repository to which you want to contribute.
  • clone: create a local copy of a repository on Github on your computer. It's typical to clone repositories that you have previously forked.

Let's fork a repository. Follow the instructions in Fork a Repo to fork the Spoon-Knife repository.

While we're at it, let's also fork the Bootstrap repository, so we can carry on with things we were trying in our earlier walkthrough.

Step 3: Cloning a Forked Repository

Now that we have a forked copy of the Spoon-Knife repository, let's clone it to our local computer. In order to clone a Github repository, we need an appropriate URL. NOTE: this is not the URL we use to access the repository in our web browser. Rather, we need a git URL.

Github provides git URLs using various protocols. The one we'll use most often is the SSH URL, which relies on the SSH keys we set up in Step 1. To push to a repository over its SSH URL, you need to have the appropriate rights (i.e., both read and write permissions). Every repository that you fork will automatically have these rights.

In my case, the URL I need is git@github.com:humphd/Spoon-Knife.git, and for you it will be git@github.com:{your-github-username}/Spoon-Knife.git. Using this SSH URL we can clone the repo locally:

$ git clone git@github.com:humphd/Spoon-Knife.git
Cloning into 'Spoon-Knife'...
remote: Counting objects: 16, done.
remote: Total 16 (delta 0), reused 0 (delta 0), pack-reused 16
Receiving objects: 100% (16/16), done.
Resolving deltas: 100% (3/3), done.

This did a few things:

  1. created a directory ./Spoon-Knife/
  2. created a .git/ database folder in ./Spoon-Knife/.git
  3. downloaded all commits in the forked repo from Github, saving them to our ./Spoon-Knife/.git database
  4. created remote tracking for the origin repository (i.e., our forked repo on Github, from which we cloned)
  5. checked-out the latest version of the code to our ./Spoon-Knife/ working directory

Let's take a look at what we have in our new cloned repo:

$ cd Spoon-Knife
$ ls
README.md  index.html styles.css
$ git show
commit d0dd1f61b33d64e29d8bc1372a94ef6a2fee76a9
Author: The Octocat <octocat@nowhere.com>
Date:   Wed Feb 12 15:20:44 2014 -0800

    Pointing to the guide for forking

diff --git a/README.md b/README.md
index 0350da3..f479026 100644
--- a/README.md
+++ b/README.md
@@ -6,4 +6,4 @@ Creating a *fork* is producing a personal copy of someone else's project. Forks

 After forking this repository, you can make some changes to the project, and submit [a Pull Request](https://github.com/octocat/Spoon-Knife/pulls) as practice.

-For some more information on how to fork a repository, [check out our guide, "Fork a Repo"](https://help.github.com/articles/fork-a-repo). Thanks! :sparkling_heart:
+For some more information on how to fork a repository, [check out our guide, "Forking Projects""](http://guides.github.com/overviews/forking/). Thanks! :sparkling_heart:

Step 4: Pushing Changes to Remote Repositories

Let's make a change to our cloned Spoon-Knife repo, then sync that change with our remote Github forked repository. We start by making a change locally.

In your editor, open the Spoon-Knife/index.html file and make a change to the text, for example:

Before:

<!-- Feel free to change this text here -->
<p>
  Fork me? Fork you, @octocat!
</p>

After:

<p>
  Look, Mom, I'm on Github!
</p>

Now we can add and commit our change:

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   index.html

no changes added to commit (use "git add" and/or "git commit -a")
$ git add index.html
$ git commit -m "Update text in index.html"
[master 57f7209] Update text in index.html
 1 file changed, 1 insertion(+), 2 deletions(-)
$ git show
commit 57f7209f758c4ba32be537434196b9168b2d53dc
Author: David Humphrey (:humph) david.humphrey@senecacollege.ca <david.humphrey@senecacollege.ca>
Date:   Thu Jan 12 11:48:56 2017 -0500

    Update text in index.html

diff --git a/index.html b/index.html
index a83618b..e4cff78 100644
--- a/index.html
+++ b/index.html
@@ -11,9 +11,8 @@

 <img src="forkit.gif" id="octocat" alt="" />

-<!-- Feel free to change this text here -->
 <p>
-  Fork me? Fork you, @octocat!
+  Look, Mom, I'm on Github!
 </p>

 </body>

Before our commit, there were 3 commits in the repo's log. Now there are 4:

$ git log
commit 57f7209f758c4ba32be537434196b9168b2d53dc
Author: David Humphrey (:humph) david.humphrey@senecacollege.ca <david.humphrey@senecacollege.ca>
Date:   Thu Jan 12 11:48:56 2017 -0500

    Update text in index.html

commit d0dd1f61b33d64e29d8bc1372a94ef6a2fee76a9
Author: The Octocat <octocat@nowhere.com>
Date:   Wed Feb 12 15:20:44 2014 -0800

    Pointing to the guide for forking

commit bb4cc8d3b2e14b3af5df699876dd4ff3acd00b7f
Author: The Octocat <octocat@nowhere.com>
Date:   Tue Feb 4 14:38:36 2014 -0800

    Create styles.css and updated README

commit a30c19e3f13765a3b48829788bc1cb8b4e95cee4
Author: The Octocat <octocat@nowhere.com>
Date:   Tue Feb 4 14:38:24 2014 -0800

    Created index page for future collaborative edits

Our local repo and our forked repo on Github are now out of sync: our local repo is ahead of the remote fork by 1 commit. By default, repos don't stay in sync. You can do whatever you want locally, and only share commits between repos when you want/need to do so.
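
If you want git to tell you how far apart two repos are, you can count the commits on each side. The sketch below is one way to do it, using a throwaway local bare repository as a stand-in for the fork on Github; the directory names and identity are assumptions for illustration. '@{u}' is git's shorthand for the remote branch that the current branch tracks.

```shell
# Hypothetical identity so the commits below work on a fresh machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# A local bare repo plays the role of the fork on Github.
demo="$(mktemp -d)"
git init -q "$demo/seed"
(cd "$demo/seed" && echo "hello" > index.html \
  && git add index.html && git commit -q -m "Initial commit")
git clone -q --bare "$demo/seed" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/local"

# Commit locally without pushing.
cd "$demo/local"
echo "changed" > index.html
git commit -q -am "Update text in index.html"

# Commits we have that the remote lacks, and the reverse.
ahead="$(git rev-list --count '@{u}..HEAD')"
behind="$(git rev-list --count 'HEAD..@{u}')"
echo "ahead by $ahead, behind by $behind"

# The short status line carries the same information.
git status -sb
```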

Let's send our local changes (i.e., the new commit) to the remote forked repo. We do this using git's push command. Before we use push, we need to mention another git command, remote.

We said above that git repos aren't automatically kept in sync with one another, which is true. However, a git repo can be made aware of remote repos (i.e., other copies of the same repo that you or someone else owns) that you care about. For example, you might be working on a team of 3 developers, and want to tell git about each of their forks on Github, as well as your own. This lets you share commits between various remote repos.

When you clone a repo, git automatically sets up a remote for you called origin. You can see it by doing this:

$ git remote -v
origin	git@github.com:humphd/Spoon-Knife.git (fetch)
origin	git@github.com:humphd/Spoon-Knife.git (push)

Here git shows two remote URLs, one for fetch and one for push. By default they are the same, although it's possible to have different URLs for the same remote. We can also add more remotes by giving a remote name and a git URL, like so:

$ git remote add upstream https://github.com/octocat/Spoon-Knife.git
$ git remote -v
origin	git@github.com:humphd/Spoon-Knife.git (fetch)
origin	git@github.com:humphd/Spoon-Knife.git (push)
upstream	https://github.com/octocat/Spoon-Knife.git (fetch)
upstream	https://github.com/octocat/Spoon-Knife.git (push)

Now we have 2 remotes:

  • origin - our personal forked copy on Github
  • upstream - the so-called upstream version of the repo, the one we originally forked

This is a very common way to set up your remotes. You can use any names you like instead of origin and upstream (you can also use git remote rename if you want to change the names later).
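
Renaming a remote, or giving one remote different fetch and push URLs, only changes local configuration, so it can be tried safely in an empty repo. A minimal sketch follows; your-username and the name octocat are placeholders chosen for this example.

```shell
# An empty scratch repo is enough: remotes are just local configuration.
demo="$(mktemp -d)"
git init -q "$demo/repo"
cd "$demo/repo"

# Placeholder URLs; substitute your own Github username.
git remote add origin git@github.com:your-username/Spoon-Knife.git
git remote add upstream https://github.com/octocat/Spoon-Knife.git

# Change a remote's name after the fact.
git remote rename upstream octocat

# Give origin a push URL that differs from its fetch URL.
git remote set-url --push origin https://github.com/your-username/Spoon-Knife.git

# The (fetch) and (push) lines for origin now differ.
git remote -v
```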

With our remotes set up, we're ready to push our local change (commit) to our forked repo on Github:

$ git push origin
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 423 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local objects.
To github.com:humphd/Spoon-Knife.git
   d0dd1f6..57f7209  master -> master

If everything goes according to plan, you should be able to see your commit on Github. For me, it's visible via the following URL:

https://github.com/humphd/Spoon-Knife/commits/master

Change it to use your username, and make sure you see your changes:

https://github.com/{your-username}/Spoon-Knife/commits/master

My full commit is visible at https://github.com/humphd/Spoon-Knife/commit/57f7209f758c4ba32be537434196b9168b2d53dc
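
You can also confirm a push from the command line by comparing commit hashes on both sides. Since we can't reach into Github's servers, this sketch assumes a local bare repository standing in for the fork; the paths, file contents, and identity are illustrative only.

```shell
# Hypothetical identity so the commits below work on a fresh machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# A local bare repo plays the role of the fork on Github.
demo="$(mktemp -d)"
git init -q "$demo/seed"
(cd "$demo/seed" && echo "<p>Fork me?</p>" > index.html \
  && git add index.html && git commit -q -m "Initial commit")
git clone -q --bare "$demo/seed" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/local"

# Commit locally, then push the current branch to origin.
cd "$demo/local"
echo "<p>Look, Mom!</p>" > index.html
git commit -q -am "Update text in index.html"
git push -q origin HEAD

# After the push, both sides point at the same commit.
local_head="$(git rev-parse HEAD)"
remote_head="$(git -C "$demo/fork.git" rev-parse HEAD)"
unpushed="$(git rev-list --count '@{u}..HEAD')"
echo "local:  $local_head"
echo "remote: $remote_head"
echo "unpushed commits: $unpushed"
```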

Step 5: Pulling and Fetching Changes from Remote Repositories

Just as we can send changes (i.e., commits) to a remote repository that we own (or to which we have been granted write permission), we can pull changes from any public, cloned repo, whether we own it or not: we don't need explicit permission.

Earlier we forked the Bootstrap repo. Let's clone it locally now, and set up our origin and upstream remotes:

$ git clone git@github.com:humphd/bootstrap.git
Cloning into 'bootstrap'...
remote: Counting objects: 103419, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 103419 (delta 0), reused 0 (delta 0), pack-reused 103417
Receiving objects: 100% (103419/103419), 92.52 MiB | 1.40 MiB/s, done.
Resolving deltas: 100% (69071/69071), done.
$ cd bootstrap
$ git remote add upstream https://github.com/twbs/bootstrap.git
$ git remote -v
origin	git@github.com:humphd/bootstrap.git (fetch)
origin	git@github.com:humphd/bootstrap.git (push)
upstream	https://github.com/twbs/bootstrap.git (fetch)
upstream	https://github.com/twbs/bootstrap.git (push)

If we wanted to update our local clone to be in sync with the upstream Bootstrap repo, we'd do this:

$ git pull upstream v4-dev
From https://github.com/twbs/bootstrap
 * branch            v4-dev     -> FETCH_HEAD
Already up-to-date.

Here we've asked git to pull all commits on the v4-dev branch. We'll be discussing branches in detail in the next walk-through, but for now a branch is a line of commits with a friendly name: it's easier to remember v4-dev than b47c252ee13c536205105dcb16029021118f989c! Git responds by telling us that we're Already up-to-date, which means there's nothing new. If you wait for a while and try that again, you'll see that new changes are available, and will get added to our local repo.

We should say something about git's pull command vs. fetch. When git downloads all new commits from a remote repo, we call that a fetch. You can do it yourself:

$ git fetch upstream

In this case nothing got downloaded. A fetch is always safe to do, because it just downloads any new commits, but doesn't update your current work yet--you do that using git's merge. We'll discuss merge at length in the next walk-through, but for now it's useful to understand that a merge connects two (or more) existing commits. When you do a git pull, it combines a git fetch and git merge in a single operation.
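
To watch the two halves of a pull separately, here is a sketch with two clones of a throwaway local repository: one clone publishes a new commit, and the other fetches it and then merges it. All paths, file names, and the identity are assumptions for illustration. The -o option to git clone simply names the remote upstream instead of origin, to match the commands above.

```shell
# Hypothetical identity so the commits below work on a fresh machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# A local bare repo plays the role of the upstream project.
demo="$(mktemp -d)"
git init -q "$demo/seed"
(cd "$demo/seed" && echo "one" > file.txt \
  && git add file.txt && git commit -q -m "First commit")
git clone -q --bare "$demo/seed" "$demo/upstream.git"

# Two clones: ours (remote named "upstream") and someone else's.
git clone -q -o upstream "$demo/upstream.git" "$demo/mine"
git clone -q "$demo/upstream.git" "$demo/theirs"

# Someone else publishes a new commit.
(cd "$demo/theirs" && echo "two" >> file.txt \
  && git commit -q -am "Second commit" && git push -q origin HEAD)

cd "$demo/mine"
before="$(git rev-parse HEAD)"

# Fetch downloads the new commit but leaves our branch where it was.
git fetch -q upstream
after_fetch="$(git rev-parse HEAD)"
fetched="$(git rev-parse '@{u}')"

# Merge brings our branch up to the fetched commit.
git merge -q '@{u}'
after_merge="$(git rev-parse HEAD)"

echo "before fetch: $before"
echo "after fetch:  $after_fetch (remote branch is at $fetched)"
echo "after merge:  $after_merge"
```

Running git pull in place of the fetch and merge steps would do both in one go.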

Step 6: Creating Pull Requests (PRs)

So far we've learned how to pull and fetch changes from remote repositories, and also how to push changes to remote repositories we own. This leaves the question of how to share changes with repositories that you don't own. How does one contribute a bug fix to a project for which they don't have write permissions?

Github's answer to this question is the Pull Request, also known as a PR. A pull request is a commit (or set of commits) you've made that you want to have pulled into an upstream repository. A pull request provides code changes (commits), but also a forum for discussion, code review, and evolution of the changes.

We've been discussing Bootstrap, so let's look at a real pull request (you can see all current pull requests as well). This pull request was opened in October 2015, and provides a fix for this issue (note the presence of fixes #17811). Within this PR we can see a few things:

  • Conversation - a running discussion between developers about this change, including code review, new commits, and a history of different status and label changes
  • Commits - a list of all the commits in this PR. NOTE: you can always add more commits to a PR by pushing new commits to your remote, which we'll discuss in the next walk-through
  • Files changed - a complete diff of all the file changes, as well as specific code review comments on particular lines

Github PRs make it easy for anyone to contribute to any public repository, and for project owners to easily discuss and manage changes from the community.

Let's send a PR to the Spoon-Knife repo with our change from above. The steps to follow are outlined nicely in this Github help page. Follow the steps to create a PR for your change, and open a PR with the upstream repo.

Here's my change as a PR: https://github.com/octocat/Spoon-Knife/pull/12168

In the next walk-through, we'll look at branching, merging, and common workflows for contributing to any open source project.