Deconstructing the Git in GitHub

Microsoft's acquisition of GitHub has hit the headlines, and many are puzzled as to what GitHub is, and what its underpinning Git consists of.
12 June 2018

Tom Preston-Werner, founder, GitHub. Source: Flickr / JD Lasica

The tech media world was full of reports and ensuing opinion pieces about Microsoft’s acquisition early last week of GitHub, the developer-centric version control service beloved of the open source community.

And while the internet was metaphorically alight with arguments concerning Microsoft’s move into the sacrosanct world of open source, many readers in the wider tech community were left scratching their heads.

Not only did they wonder what all the fuss was about, but also what the hell GitHub is, what Git is, and why any of it matters.

Here then is Tech HQ’s guide to all things Git – hopefully explaining what the technology is all about, and even touching on why the Microsoft/GitHub situation ruffled so many feathers.

Git was originally conceived as a new type of version control system (VCS) for the open source development of the Linux kernel.

As any would-be developer will know, version control is a way by which development teams can collaborate on the same project, share ideas, and track changes as a build moves on.

Git is a type of version control that holds snapshots of a development project in its entirety locally, that is, on the developer’s computer or local server.

Why is that important? With the way people often work these days, having a centralized traditional version control system just isn’t practical.

With a centralized system, you always need access to a version control server. On larger, more complex projects, refreshing and updating to the latest versions of the nascent application can also be time-consuming.

Git holds snapshots of all of a project’s current and historical files in a local database, making it well suited to use online or offline.

And, because every version is represented, the individual can roll back or refer back to previous versions at will, seamlessly, and without having to wait for a server connection.
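
For the curious, the idea of a local history of snapshots can be sketched in a few lines of Python. This is a toy illustration only, not how Git itself is implemented: the class name ToySnapshotStore and its methods are invented for this example, and real Git stores its data far more efficiently on disk. It does borrow one genuine idea from Git, namely labeling each snapshot with a hash of its contents.

```python
import hashlib
import json


class ToySnapshotStore:
    """A toy, in-memory stand-in for a local version history (not real Git)."""

    def __init__(self):
        self.snapshots = {}  # snapshot id -> {"files": ..., "parent": ...}
        self.head = None     # id of the most recent snapshot

    def commit(self, files):
        """Record a full snapshot of the project's files and return its id."""
        record = {"files": dict(files), "parent": self.head}
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        snapshot_id = hashlib.sha1(payload).hexdigest()
        self.snapshots[snapshot_id] = record
        self.head = snapshot_id
        return snapshot_id

    def checkout(self, snapshot_id):
        """Return the files exactly as they were in an earlier snapshot."""
        return dict(self.snapshots[snapshot_id]["files"])

    def history(self):
        """List snapshot ids from the newest back to the oldest."""
        ids = []
        current = self.head
        while current is not None:
            ids.append(current)
            current = self.snapshots[current]["parent"]
        return ids


store = ToySnapshotStore()
first = store.commit({"recipe.txt": "1 chilli"})
second = store.commit({"recipe.txt": "3 chillies"})

# Roll back to the first version: everything needed is held locally,
# so no server connection is involved.
print(store.checkout(first)["recipe.txt"])
print(len(store.history()))
```

Because every snapshot lives in the same local store, looking up an old version is just a dictionary lookup here; nothing has to be fetched from elsewhere.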

There are several advantages to using Git for development teams, but this type of repository is also beginning to be used in other areas.

In short, any activity which involves the accurate tracking of the development of a project over time by multiple users can make excellent use of Git.

And due to the open source nature of the technology, anyone can use it, deploy it, or build their own version.

GitHub came about as a web interface to a Git service, and provided a ubiquitous platform for working on projects, without users having to wrestle with tiresome, locally-installed Git clients.

GitHub, as a business, developed a paid-for strand by which large projects from companies such as Apple, Google, and, significantly, Microsoft, could be worked on internally, in private.

Git basic topology. Source: SlideShare/LinkedIn

But the majority of GitHub users publish their work publicly for others to comment on, improve upon, and share freely.

This is the basis of the FOSS or open-source model, often cast as a direct competitor to purely commercial, proprietary interests such as Microsoft’s.

Any Git project has three main areas. There’s the Git directory or repository which holds all the files. That means the current version of the project, any forks or offshoots from that project, and every change ever made to the project throughout its history.

On top of that, users spend most of their time in a Working Directory having “checked out” or “cloned” a project or part of a project to work on.

When any changes have been made and given proper consideration, they can be added to a staging area.

From the staging area, the Git directory (or repository) can be updated according to what’s been staged, a step known as a commit. Who does that? Either the developer or, more commonly in larger projects, the project leaders or overseers.
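
The relationship between the working directory, the staging area, and the repository can be sketched as a small Python model. Again, this is a simplified illustration under stated assumptions rather than Git's real machinery: ToyProject and its edit, stage, and commit methods are hypothetical names chosen for this example. The point it shows is that only what has been staged ends up in the next recorded snapshot, a step Git calls a commit.

```python
class ToyProject:
    """Toy model of three areas: working directory, staging area, repository."""

    def __init__(self):
        self.working_directory = {}  # files as the user is currently editing them
        self.staging_area = {}       # changes selected for the next snapshot
        self.repository = []         # recorded snapshots, oldest first

    def edit(self, path, content):
        """Change a file in the working directory only."""
        self.working_directory[path] = content

    def stage(self, path):
        """Copy one file's current content into the staging area."""
        self.staging_area[path] = self.working_directory[path]

    def commit(self, message):
        """Record the previous snapshot plus whatever has been staged."""
        if not self.staging_area:
            raise ValueError("nothing staged")
        previous = self.repository[-1]["files"] if self.repository else {}
        snapshot = {**previous, **self.staging_area}
        self.repository.append({"message": message, "files": snapshot})
        self.staging_area = {}


project = ToyProject()
project.edit("notes.txt", "draft")
project.edit("todo.txt", "unfinished")
project.stage("notes.txt")   # only notes.txt is considered ready
project.commit("Add notes")

# The snapshot holds notes.txt; todo.txt remains a local, unrecorded edit.
print(project.repository[-1]["files"])
```

Keeping the staging step separate is what lets a contributor record one finished change while leaving half-done work out of the project's history.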

Enterprise users will recognize this type of activity in some of their own working methods.

Even simple use of technology like Google Sheets or Google Docs, for instance, can involve suggesting changes or edits to a document on which many are collaborating.

Suggestions can be accepted and made real in the final document, and by this means, a finalized version is created.

Git repositories work in much the same way, although with a much higher degree of granularity available.

With ease, projects can be branched, or forked, and those offshoots later merged back into the main project, or become discrete projects of their own if required.
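
How two offshoots are brought back together can also be sketched in Python. The function below is a simplified, file-level version of a three-way merge, which compares each offshoot against the common starting point they both came from; the name merge and the sample file names are made up for this illustration. Git's own merging works at a finer level, comparing lines within files, so treat this as a sketch of the principle only.

```python
def merge(base, ours, theirs):
    """Three-way merge of {path: content} dicts; raises on conflicting edits."""
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o   # both sides agree
        elif o == b:
            result = t   # only "theirs" changed this file
        elif t == b:
            result = o   # only "ours" changed this file
        else:
            raise ValueError(f"conflict in {path}")
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged


# A shared starting point, then two offshoots that each change one file.
base = {"curry.txt": "1 chilli", "rice.txt": "plain"}
main = {**base, "rice.txt": "pilau"}
spicy = {**base, "curry.txt": "3 chillies"}

print(merge(base, main, spicy))
```

When both offshoots change the same file in different ways, the sketch raises an error instead of guessing, which mirrors the way a merge conflict is handed back to a person to resolve.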

Other uses are beginning to percolate through to a broader market, with instances cropping up in law, blogging and writing, font editing, collaborative creativity, and even libraries of recipes, where the forks contain variations (spicier, milder) of the main version.

Microsoft’s claim to fame, or claim to infamy if you listen to certain people, is that it has always been a superb imitator and propagator, rather than an innovator.

The graphical user interface in the desktop environment which made the company into the globe-straddling giant it is today was actually an invention of Xerox’s.

Its Johnny-come-lately Azure platform trails in the wake of AWS, and the Bing search engine is only a niche player, operating behind the scenes on specific services (until recently powering searches for Apple’s Siri, for instance).

But with the company’s acquisition of GitHub, you can expect to see Git’s version control system more widely distributed and used in a greater number of settings.

Microsoft’s genius has always been one of commercialization of the existing, so prepare to come across new services and technologies based on Git in the following months and years. Whether or not they’ll be free remains to be seen.