Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.
The code programmers write changes often. Bugs need to be fixed, features need to be added, and content needs to be changed. Most code is stored as plain old text files, and the code is changed by editing these files. Every time a change is saved, the old version of the file is overwritten with a new one. Unfortunately, no programmer is perfect, and sometimes, mistakes are made. If you make a change to a file, save it, compile it, and find out that something went wrong, it's often helpful to be able to go back to the old version or to get a report of what was actually changed, in order to focus on what may have gone wrong.
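As a rough illustration of "going back" and "getting a report of what changed", here is a minimal sketch that assumes Git is the version control tool. The file name, its contents, and the demo identity are hypothetical, and everything runs in a throwaway directory.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

# Record a known-good version of a file.
printf 'int answer = 42;\n' > answer.c
git add answer.c
git commit -q -m "Add answer.c"

# Overwrite it with a mistaken edit, as a normal save would.
printf 'int answer = 41;\n' > answer.c

# Report exactly what changed since the recorded version.
git --no-pager diff -- answer.c

# Go back to the recorded version.
git checkout -- answer.c
cat answer.c
```

The diff shows the one changed line, and the checkout discards the unrecorded edit, so use it only when the edit really is unwanted.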
Types of Version Control
There are many version control systems out there. Often they are divided into two groups: “centralized” and “distributed”.
Centralized version control systems are based on the idea that there is a single “central” copy of your project somewhere (probably on a server), and programmers will “commit” their changes to this central copy.
“Committing” a change simply means recording the change in the central system. Other programmers can then see this change. They can also pull down the change, and the version control tool will automatically update the contents of any files that were changed.
Most modern version control systems deal with “changesets,” which are simply groups of changes (possibly to many files) that should be treated as a cohesive whole. For example, a change to a C header file and the corresponding .c file should always be kept together.
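The header-plus-source example can be sketched with Git, where a single commit plays the role of the changeset. This is only an illustration: the file names `add.h` and `add.c` and the demo identity are made up.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

# A declaration and its definition belong together.
printf 'int add(int a, int b);\n' > add.h
printf '#include "add.h"\nint add(int a, int b) { return a + b; }\n' > add.c

# Stage both files and record them as ONE change.
git add add.h add.c
git commit -q -m "Add add() declaration and definition"

# List the files that make up that single change.
git --no-pager show --name-only --format='%h %s' HEAD
```

Because both files travel in the same commit, anyone who retrieves that commit gets a header and a source file that agree with each other.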
Programmers no longer have to keep many copies of files on their hard drives manually, because the version control tool can talk to the central copy and retrieve any version they need on the fly. Some of the most common centralized version control systems you may have heard of or used are CVS, Subversion (or SVN), and Perforce.
Distributed Version Control
In the past five years or so a new breed of tools has appeared: so-called “distributed” version control systems (DVCS for short). The three most popular of these are Mercurial, Git, and Bazaar.
These systems do not necessarily rely on a central server to store all the versions of a project’s files. Instead, every developer “clones” a copy of a repository and has the full history of the project on their own hard drive. This copy (or “clone”) has all of the metadata of the original.
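A small sketch of what "full history on your own hard drive" means, assuming Git. The repository names `original` and `copy` are hypothetical, and the "remote" here is just another directory on the same disk.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"

# Build a repository with two recorded versions of a file.
git init -q original
cd original
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"
printf 'v1\n' > notes.txt
git add notes.txt
git commit -q -m "First version"
printf 'v2\n' > notes.txt
git commit -q -am "Second version"

# Clone it: the copy receives every commit, not just the latest files.
cd "$work"
git clone -q original copy
cd copy
git --no-pager log --oneline
```

The log in `copy` lists both commits, and it can be inspected without any further contact with `original`.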
This method may sound wasteful, but in practice, it’s not a problem. Most programming projects consist mostly of plain text files (and maybe a few images), and disk space is so cheap that storing many copies of a file doesn’t create a noticeable dent in a hard drive’s free space. Modern systems also compress the files to use even less space.
The act of getting new changes from a repository is usually called “pulling,” and the act of moving your own changes to a repository is called “pushing”. In both cases, you move changesets (changes to files grouped as coherent wholes), not single-file diffs.
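Pushing and pulling can be tried on one machine, assuming Git and using a bare repository as a stand-in for a shared location. The names `alice`, `bob`, and `shared.git` are hypothetical.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"

# Alice starts a project and records a first commit.
git init -q alice
cd alice
git config user.name "Alice Example"      # hypothetical identity
git config user.email "alice@example.com"
printf 'first line\n' > notes.txt
git add notes.txt
git commit -q -m "Start notes"

# A bare repository stands in for the shared copy.
cd "$work"
git clone -q --bare alice shared.git
git -C alice remote add origin "$work/shared.git"

# Bob clones the shared copy.
git clone -q shared.git bob

# Alice records another commit and pushes it.
cd "$work/alice"
printf 'second line\n' >> notes.txt
git commit -q -am "Extend notes"
git push -q origin HEAD

# Bob pulls, which moves the new commit into his clone.
cd "$work/bob"
git pull -q --ff-only
cat notes.txt
```

After the pull, Bob's `notes.txt` contains both lines. The `--ff-only` flag is a cautious choice here: it refuses to create a merge if Bob had diverging work of his own.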
One common misconception about distributed version control systems is that there cannot be a central project repository. This is simply not true – there is nothing stopping you from saying “this copy of the project is the authoritative one.” This means that instead of a central repository being required by the tools you use, it is now optional and purely a social issue.
Git Is a Version Control Application
Repository
A repository is the storage location for a project and the most basic element of GitHub. It contains all of the files associated with the project (including documentation) and stores each file’s revision history. Repositories can have multiple collaborators and can be either public or private.
It’s easiest to imagine a repository as a project folder. However, unlike an ordinary folder on your laptop, a GitHub repository also offers simple and powerful tools for collaborating with others.
Commit
A commit, or "revision", is an individual change to a file (or set of files).
It's like saving a file, except that with Git, every save creates a unique ID (a.k.a. the "SHA" or "hash") that lets you keep a record of what changes were made, when, and by whom. Commits usually contain a commit message, which is a brief description of the changes that were made.
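To see the ID, author, date, and message that a commit records, a minimal sketch along these lines can help. The file name, message, and identity are hypothetical.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

printf 'hello\n' > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Show what was recorded: the ID, who made the change, when, and why.
git --no-pager log -1 --format='id:      %H%nauthor:  %an <%ae>%ndate:    %ad%nmessage: %s'

# The ID alone, for example to refer to this commit later.
commit_id=$(git rev-parse HEAD)
echo "$commit_id"
```

With Git's default settings the ID is a 40-character hexadecimal SHA-1 value. In everyday use a unique prefix of it is usually enough to name a commit.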
Fork
A fork is a personal copy of another user's repository that lives on your account.
Forks allow you to freely make changes to a project without affecting the original. A fork remains attached to the original, so you can submit a pull request asking the original author to incorporate your changes.
Clone
A clone is a copy of a repository that lives on your computer instead of on a website's server somewhere, or the act of making that copy.
With your clone, you can edit the files in your preferred editor and use Git to keep track of your changes without having to be online.
Push
Pushing refers to sending your committed changes to a remote repository such as Bitbucket. For instance, if you change something locally, you'd want to then push those changes so that others may access them.
Pull Request
Pull requests let you tell others about changes you've pushed to a Bitbucket repository. Once a pull request is sent, interested parties can review the set of changes, discuss potential modifications, and even push follow-up commits if necessary.
Fetch
Fetching refers to getting the latest changes from an online repository (like Bitbucket) without merging them in. Once these changes are fetched, you can compare them to your local branches (the code residing on your local machine).
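Fetching and then comparing can be sketched as follows, assuming Git and using a second local directory named `upstream` (hypothetical) in place of an online repository.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"

# "upstream" stands in for the online repository.
git init -q upstream
cd upstream
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"
printf 'one\n' > log.txt
git add log.txt
git commit -q -m "First entry"

# Make a local clone, then add a newer commit upstream.
cd "$work"
git clone -q upstream local
cd upstream
printf 'two\n' >> log.txt
git commit -q -am "Second entry"

# Fetch: download the new commit without touching the local branch.
cd "$work/local"
git fetch -q origin

# Compare: commits the upstream branch has that the local branch lacks.
git --no-pager log --oneline 'HEAD..@{u}'

# The working file is unchanged until the fetched commit is merged.
cat log.txt
```

Here `@{u}` is Git's shorthand for the upstream branch that the current branch tracks, which the clone sets up automatically.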
Working on GitHub Versus Working Locally
You can make changes to your project directly on GitHub, but most people prefer to work on their local machine so they can make changes in their favorite IDE or text editor. Let’s review a couple of important terms:
The remote repository is the copy of the repository hosted on GitHub. All collaborators synchronize their changes with this copy, making it the source of truth for the group.
Local repositories are Git repositories stored on a user's computer. If the local repository is linked to a remote repository, then it is a full copy of everything on the remote repository, including all of its files, branches, and history.
The local and remote repositories only interact when you run one of the four network commands in Git: git clone, git fetch, git pull, and git push.
Using Git / Basic commands
Git uses a small number of commands to perform its basic operations.
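As a minimal sketch of that everyday cycle (create or edit a file, check the status, stage, commit, review the history), assuming a throwaway directory and a hypothetical file and identity:

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"
git init -q demo                          # create a new, empty repository
cd demo
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

printf 'hello\n' > hello.txt

git status --short                        # "?? hello.txt": not yet tracked
git add hello.txt                         # stage the file for the next commit
git status --short                        # "A  hello.txt": staged as a new file
git commit -q -m "Add hello.txt"          # record the staged change
git status --short                        # prints nothing: nothing left to commit
git --no-pager log --oneline              # one line per recorded commit
```

None of these commands contact a remote repository. They operate entirely on the local copy.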
Overview of the GitHub Workflow
The main steps are:
Create a branch off of the master branch
Make commits
Open a pull request
Collaborate
Make more commits
Discuss and review code with team members
Deploy for final testing
Merge your branch into the master branch
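The local half of these steps can be sketched with plain Git. The pull request, review, and deployment steps happen on GitHub and are only marked with a comment here. The branch name `add-feature`, the file, and the identity are hypothetical, and the starting branch is read from the repository because its name ("master" or "main") depends on local configuration.

```shell
set -eu
work=$(mktemp -d)                         # throwaway sandbox directory
cd "$work"
git init -q project
cd project
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"
printf 'v1\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)   # name of the starting branch

# Create a branch off of the starting branch.
git checkout -q -b add-feature

# Make commits on that branch.
printf 'feature\n' >> app.txt
git commit -q -am "Add feature"

# Opening the pull request, reviewing, and deploying happen on GitHub
# and are not reproduced in this local sketch.

# Merge the branch back into the starting branch.
git checkout -q "$base"
git merge -q add-feature
git --no-pager log --oneline
```

On GitHub the final merge is usually performed from the pull request page rather than from the command line, but the effect on the history is comparable.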