\input texinfo   @c -*-texinfo-*-

@c %**start of header
@setfilename using-git.info
@settitle using git
@afourpaper
@c @afivepaper
@documentencoding UTF-8
@documentlanguage en
@set TEXI_5
@c @set HARD_COPY_EDITION
@c @smallbook
@ifset HARD_COPY_EDITION
@ifset TEXI_5
@cropmarks
@end ifset
@setchapternewpage odd
@end ifset
@finalout
@c %**end of header

@dircategory Version Control
@direntry
* using git: (using git).     Using the Git source control management
@end direntry

@macro prelude
@center @titlefont{Prelude}
@iftex
@sp 1
@end iftex
@ifset TEXI_5
@ifnottex
@quotation
@end ifnottex
@end ifset
@noindent
This Git manual is intended to be educational rather than reference documentation. It is written as a tutorial, similar to an educational mathematics book, except that it does not include problems to solve. Consequently, it is designed to be read chapter by chapter with occasional look-backs. Because the content is spread out in this way, Git's online documentation, accessible with @command{man git} and @command{man git VERB}, is much better for quick lookup of information, such as how to use a command.

This manual will teach you the basics of using Git, with an early focus on contributing to existing projects, and proceeds to increasingly advanced features that make life with Git even easier. It will also teach you some of the ideas behind Git's design. You will also learn about side issues, such as how best to make your own project installable and navigable by others without excessive documentation and without having to answer too many questions.
@c @ifclear HARD_COPY_EDITION
@c If you prefer a hard copy edition, you can order one from
@c ...
@c @end ifclear
@ifset TEXI_5
@ifnottex
@end quotation
@end ifnottex
@end ifset
@end macro

@copying
Copyright @copyright{} 2013 Mattias Andrée

@quotation
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.
@end quotation
@end copying

@ifnottex
@node Top
@top Using Git
@insertcopying
@prelude
@end ifnottex

@titlepage
@title Using Git
@subtitle Educational manual for Git, the version control system.
@c @vskip 0pt plus 1filll
@c @c this way, it is centered exactly in pdf and approximately in dvi and ps
@c @c @center does not work for @image in dvi and ps
@c @multitable @columnfractions 0.15 0.7
@c @item @tab @center @image{obj/logo,200px}
@c @end multitable
@c @vskip 0pt plus 1filll
@author by Mattias Andrée (maandree)

@page
@center git: A silly, incompetent, stupid, annoying, or childish person.
@vskip 0pt plus 1filll
@insertcopying
@page
@prelude
@end titlepage

@contents

@menu
* Getting started::                     Getting started with Git
* Introduction::                        What is Git and why is it the best?
* Branching out::                       The flexibility of non-linearity
* Collaborating::                       Shared goals, shared development
* Basic commands::                      So happy hacking!
* I just don't know what went wrong::   Identifying when something broke and how to recover
* Version control::                     Time to release a new version?
* Interface::                           Git's interface design
* Features::                            Git's design and features
* Beyond Git::                          Just using Git is not enough
* GNU Free Documentation License::      Sharing is good, it does not make you a pirate
* Glossary::                            Lost in all the big words?
@end menu

@c TODO Masterful flow


@node Getting started
@chapter Getting started

@menu
* Identify yourself::     Configure Git to identify you when you make commits
* Create a repository::   Create your first repository
* Create an origin::      Create a backup repository
* Gratis hosting::        Get a hosting service for your superawesomazing projects
* Generate your key::     Create an identification key
@end menu


@node Identify yourself
@section Identify yourself

The first thing you want to do, right off the bat, is to identify yourself: add your name and e-mail address to your Git configuration. This is done with two simple commands:

@example
git config --global user.name 'YOUR NAME'
git config --global user.email 'YOUR_EMAIL_ADDRESS'
@end example

It is possible to sign your work with GPG. If you plan to do this with a GPG key other than your default key, you can configure Git to use that key by default:

@example
git config --global user.signingkey YOUR_GPG_KEY_ID
@end example


@node Create a repository
@section Create a repository

A repository is a directory under source control, normally the project you are working on. Create an empty directory and @command{cd} into it:

@example
mkdir MY_PROJECT
cd MY_PROJECT
@end example

When you are inside the directory for the repository, issue the Git command to initialise the repository:

@example
git init
@end example

This command creates a directory named @file{.git} inside the directory, with all data Git requires to operate on the repository.

The next thing you want to do is to create a @file{.gitignore} file. It is used to keep track of which files should not be included in the repository, unless overruled with a forced staging. A good base @file{.gitignore} content you probably always want to use is:

@example
_/
# It is a good idea to allow the directory _ to
# contain temporary files you do not want to stage.
.*
# Generally you probably do not want to include
# hidden files.
!.git*
# But you do generally want to include files
# starting with .git, such as .gitignore.
\#*\#
*~
*.bak
# And you do not want to include backup files.
@end example

Git parses @file{.gitignore} with wildcards, @code{#} for comments and @code{!} for inclusion rather than exclusion. Later entries override earlier entries.

When you have created your @file{.gitignore} you are ready to stage it and make your first commit:

@example
git add .gitignore
git commit -m 'first commit'
@end example


@node Create an origin
@section Create an origin

It is a good idea to create a backup repository, so you do not lose your work to a disc failure, filesystem corruption or accidental removal. You can also use such a repository for collaboration, as a common repository that all collaborators can submit commits to and fetch commits from. This repository is customarily called `origin'. It is a bare repository, meaning that it only holds the data otherwise found in the @file{.git} directory and cannot be used as a working directory; it is missing what Git calls the `index'@footnote{Or `cache', an obsolete term.} and the `working tree'.

@example
mkdir -p /srv/git/MY_REPOSITORY.git
cd /srv/git/MY_REPOSITORY.git
git init --bare
cd -  # Go back to your project repository
git remote add origin file:///srv/git/MY_REPOSITORY.git
git push -u origin master  # master is the branch you are working in
@end example

It is standard to append @file{.git} to the end of the repository name when it is bare.

To submit your changes to `origin' you can now use the command @command{git push}. To fetch updates others have made, use the command @command{git pull}.


@node Gratis hosting
@section Gratis hosting

As seen, you do not need a hosting service: if you have a network filesystem, anyone with access to it can also access your bare repository. A hosting service is, however, a great way of making your projects available to the world. Here is a list of gratis Git hosting services that allow hosting of Free Software projects.
@table @asis
@item @bullet{} @url{https://savannah.nongnu.org/, Savannah}
Hosts Free Software only, and projects are audited for licensing issues upon registration. It can therefore take a short time before a project is accepted, but your project will not use non-Free Software and no license information will be missing. Savannah runs on only Free Software.

@item @bullet{} @url{https://gna.org, Gna!}
For Free Software projects only. Gna! runs on only Free Software.

@item @bullet{} @url{https://bitbucket.org, BitBucket}
Gratis for 5 users, with an unlimited number of private repositories for 5 collaborators.

@item @bullet{} @url{https://github.com/, GitHub}
5 private repositories for students, for two years and reactivatable when expired. Teachers and student organisations can get as many private repositories as required for an organisation.

@item @bullet{} @url{https://www.assembla.com/catalog/51-free-private-git-repository-package?type=private&ad=git-wiki, Assembla}
Hosting limited to 2 GB with one free private repository for three users.

@item @bullet{} @url{https://www.cloudforge.com/pricing, CloudForge}
Hosting limited to 2 GB.
@end table

You should note that there are other Git hosting services that do not allow Free Software. Some of them allow Open Source, and some allow Free Software, but not gratis.


@node Generate your key
@section Generate your key

Many Git servers authenticate using public SSH keys. If you do not already have SSH installed, install it; it is probably named @code{openssh} in your GNU/Linux@footnote{Or whatever you are using.} distribution's package repository. If not, you can download it at @url{http://www.openssh.com}.

Before creating an SSH key, check whether you already have one. You are looking for a pair of files, @file{id_rsa.pub} and @file{id_rsa}, in @file{~/.ssh}. @file{id_rsa} is your private key and should not be shared or made public. @file{id_rsa.pub} is your public key and is the file you want to upload to your Git hosting server.
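
The check described above can be scripted. The following is a minimal sketch that assumes the default file names @file{id_rsa} and @file{id_rsa.pub}; the function name @code{has_ssh_keypair} is made up for this example and is not part of SSH or Git.

```shell
# has_ssh_keypair [DIR]
# Succeeds if DIR (default: ~/.ssh) contains both halves of an RSA key
# pair under the default file names. The function name is hypothetical.
has_ssh_keypair() {
    dir="${1:-$HOME/.ssh}"
    [ -f "$dir/id_rsa" ] && [ -f "$dir/id_rsa.pub" ]
}

if has_ssh_keypair; then
    echo 'Found an existing key pair, no need to generate one.'
else
    echo 'No key pair found.'
fi
```

If your keys are stored somewhere else, pass that directory as the first argument.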
If you do not already have a key, you can create it with:

@example
ssh-keygen -t rsa -C 'YOUR_EMAIL_ADDRESS'
@end example



@node Introduction
@chapter Introduction

@menu
* What is Git?::            So exactly what is Git?
* It is distributed::       The power of non-centralisation
* Integrity::               How you know that no one is messing with your project
* Online documentation::    Git comes with online documentation
@end menu


@node What is Git?
@section What is Git?

Git is a version control system known for its lightning speed@footnote{Especially under POSIX compatible systems.} and for being distributed. A version control system is a system for storing changes in a history tree, and it allows multiple people to work on the same project without the risk of the code being too new to accept a submitted patch. When you are working, it is important to keep track of changes so that you can find which edit broke the system. Version control also lets you create branches: different versions of the same project that are developed concurrently. This lets your team implement features in parallel and merge them into the mainline when they are stable or implemented to an acceptable degree. Another important feature of version control is that it can be used to tag releases of the code. If you have released a program and are sent a bug report, you may want to test it on both the current version and the version the user used.


@node It is distributed
@section It is distributed

Traditionally, version control systems were centralised. Every project had one repository that all contributors pushed to and pulled from. Git is distributed; this means that contributors clone the repository and work on that clone instead of ``checking out'' the current tip of the source code. It also means that there are multiple backups of the repository, so recovering from a crash or corruption will be a breeze.
It is a popular misconception that distributed systems are not suited for projects that require an official central repository. This is far from true; projects have a central blessed repository, possibly with mirrors. A blessed repository, referred to as the upstream, is the project's official repository. It is maintained by a select few with input from submitted updates. But the upstream can also be a shared repository that everyone pulls from and pushes to; this is the classical Subversion-style workflow. Git does not allow you to push before you have pulled the latest commit, so this workflow works fine.

Small projects will usually have one maintainer; contributors clone her blessed repository and send submissions to her. Larger projects may have multiple maintainers who help with accepting submissions. A common model like this, which you often see on GitHub, is the integration manager workflow, where the maintainer is an integration manager who accepts pull requests from developers that have public repositories, often called forks (which should not be confused with a project fork, where the forker is taking the project in another direction and does not intend to request pulls). Even larger projects will usually work with a dictator and lieutenants workflow, where developers clone the blessed repository and submit patches to the lieutenants, who in turn submit to the dictator, who finally pushes the changes to the blessed repository.


@node Integrity
@section Integrity

Git cryptographically hashes all data associated with a commit, including the prior commit. This makes it infeasible to modify a commit without changing the commit ID; changing the commit ID breaks the commit history and would therefore get noticed, as the developers cannot work against a broken commit history. Additionally, commits can be signed with GPG, so you can be sure that the committer is who she says she is.
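
To make the hashing concrete: the ID of a file's content (a `blob' in Git terminology) is the SHA-1 sum of a short header followed by the content. The sketch below recomputes such an ID without Git. It assumes that @command{sha1sum} from GNU coreutils is installed and that the repository uses SHA-1 object IDs; @code{git_blob_id} is a name invented for this example.

```shell
# git_blob_id FILE
# Print the ID Git would assign to the content of FILE: the SHA-1 sum
# of the header "blob SIZE", a NUL byte, and then the content itself.
# The function name is hypothetical, it is not a Git command.
git_blob_id() {
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

tmp=$(mktemp)
echo 'hello' > "$tmp"
git_blob_id "$tmp"    # compare with: git hash-object "$tmp"
rm -f "$tmp"
```

Commit IDs are computed in the same manner over the text of the commit, which names the file tree and the parent commit. That is why a change anywhere in the history changes every later commit ID.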


@node Online documentation
@section Online documentation

As with most software packages, Git includes online documentation, accessible in several ways. The simplest way, which works on all systems, is to add @option{--help} to the command @command{git} or @command{git VERB}.

Git is a project started by Linus Torvalds, the creator of the Linux kernel, so naturally it has manpages, but no official @command{info} manual. Since @command{info} supports manpages, you can use @command{info} in place of @command{man}.

Git includes several manpages: one manpage for every Git command, @command{man git VERB} or alternatively @command{man git-VERB}, and manpages on special topics:

@itemize
@item @command{man git}
@item @command{man gitattributes}
@item @command{man gitcli}
@item @command{man gitcore-tutorial}
@item @command{man gitcredentials}
@item @command{man gitdiffcore}
@item @command{man gitglossary}
@item @command{man githooks}
@item @command{man gitignore}
@item @command{man gitmodules}
@item @command{man gitnamespaces}
@item @command{man gitrepository-layout}
@item @command{man gitrevisions}
@item @command{man gittutorial}
@item @command{man gittutorial-2}
@item @command{man gitworkflows}
@end itemize



@node Branching out
@chapter Branching out

@menu
* Workflow::            Why should I branch?
* Creating branches::   How do I branch?
* Merging branches::    How do I merge my branches?
@end menu


@node Workflow
@section Workflow

Git encourages you to create multiple local branches of your repository. A branch is a fork of your commit history; it allows you to implement features in parallel. The most important part of this is that you can fix bugs while you are working on big new features.

Your main branch is by default called `master'. It is recommended to have a branch called `develop' that branches off from it. The develop branch is the branch you work on, and when it is stable, you merge it into the master branch.
From the develop branch you can branch out and create topic branches and disposable experiments.


@node Creating branches
@section Creating branches

The quickest way to create a new branch and start working on it is to issue a checkout command that creates a new branch:

@example
git checkout -b BRANCH_NAME
@end example

After issuing this command you are located in the new branch. To create it in the origin, make a push:

@example
git push -u origin BRANCH_NAME
@end example

From this point on you can push without parameters:

@example
git push
@end example

The @option{-u origin BRANCH_NAME} is just to initially tell which remote repository pushes should go to.@footnote{It actually also tells Git not to push anything else.}

To switch branch, use the checkout command:

@example
git checkout BRANCH_NAME
@end example


@node Merging branches
@section Merging branches

To merge a branch into another, switch to one of them and pull the other:

@example
git checkout MERGER
git pull . MERGEE
@end example

In the default mode, @command{git pull . MERGEE} is shorthand for a fetch followed by a merge:

@example
git fetch . MERGEE && git merge FETCH_HEAD
@end example

If the two branches cannot be automatically merged, you will get a merge conflict. One case where you will get merge conflicts is when one of the branches has modified lines where the other has changed the indentation, so keep to a coding style from the start; another is when both have edited the same lines. If you get a merge conflict, Git will tell you so, tell you in which files there are conflicts, and exit with the return code 1 to indicate that the merge was not successful and human intervention is required.

If the merger branch has a file with the line @code{Hello world} and the mergee branch has the line @code{hello world!}, the file will contain:

@example
<<<<<<< HEAD
Hello world
=======
hello world!
>>>>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
@end example

Where @code{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx} is the lower case hexadecimal representation of the commit ID at the tip of the mergee branch, which is a SHA-1 hash sum of the commit.

After a merge conflict you will need to edit the files to resolve the conflicts, stage the files and make a new commit.



@node Collaborating
@chapter Collaborating

@menu
* Cloning a repository::    Start your collaboration
* Submitting patches::      Submit your work upstream
* Accepting patches::       Accepting received commits
* Making pull requests::    Request integration
@end menu


@node Cloning a repository
@section Cloning a repository

The first thing you need to do in order to begin collaborating is to clone the repository:

@example
git clone REPOSITORY -o upstream
@end example

By including @option{-o upstream}, you tell Git to set up the repository you cloned from as a remote repository named `upstream'. If you want to access a branch in the upstream repository, use @code{upstream/BRANCH} as the branch name.


@node Submitting patches
@section Submitting patches

The best way to create a patch is with Git's @command{format-patch} command. Assuming you began from @code{upstream/master}:

@example
git format-patch upstream/master
@end example

This command will create one patch file per commit, and the file names will be printed by @command{git format-patch}. Creating patches this way keeps the commit messages and the individual commits. Another advantage is that the patches can easily be submitted to a mailing list, which is the common way for large projects to accept patches. A created patch file is formatted as an e-mail, with `[PATCH]' at the beginning of the subject line. If you update the patch, it is customary to use `[PATCH v2]' instead, and `[PATCH v3]' on the second update. If the patch, however, is not ready for inclusion but is rather meant for discussion, use `PATCH/RFC'@footnote{RFC is an abbreviation for `Request for comments'.} instead of `PATCH'.
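
Git can write these subject prefixes for you. The following is a minimal sketch in a throw-away repository, using the options @option{-v} and @option{--subject-prefix} described in @command{man git-format-patch}. The branch name @code{base}, which stands in for @code{upstream/master}, and the file name are made up for this example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example'
git config user.email 'example@example.org'

git commit -q --allow-empty -m 'base'
git branch base                  # stands in for upstream/master
echo 'hello' > greeting
git add greeting
git commit -q -m 'add greeting'

# First update of an already submitted patch: subject gets [PATCH v2]
git format-patch -v2 base
# Patch meant for discussion only: subject gets [PATCH/RFC]
git format-patch --subject-prefix='PATCH/RFC' base
```

With @option{-v2} the file name is also prefixed with @file{v2-}, so the two versions of a patch do not overwrite each other.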
To send the patch, use @command{git send-email}:

@example
git send-email --to=EMAIL_ADDRESS_TO_SEND_TO PATCH_FILE
@end example

If you have registered to the mailing list with, or for some other reason want to send under, a different e-mail address than the one you made the commits with, you need to specify an envelope sender by adding an option:

@example
--envelope-sender=SENDER_EMAIL_ADDRESS
@end example

You will also need to specify which SMTP server to use, along with authorisation and configuration:

@example
--smtp-server=DOMAIN       # it usually is prefixed with smtp.
--smtp-server-port=PORT
--smtp-encryption=ssl      # or tls
--smtp-user=ACCOUNT        # usually just the username without domain
--smtp-pass=PASSPHRASE
@end example

If you are using a forwarding e-mail address, such as @@member.fsf.org, you send using your normal e-mail account, but use the forwarding e-mail address as the envelope sender; most e-mail servers should accept this.

If you are replying to a message on the mailing list, perhaps with an updated patch, you should specify the message ID of the message to reply to. This is done by adding the option:

@example
--in-reply-to=MESSAGE_ID
@end example

To get the message ID, open the message in your e-mail client and choose to see all headers (if that is not possible, download the message as an mbox file and open it in a text editor), and look for:

@example
Message-ID: <MESSAGE_ID>
@end example

As indicated here, it is surrounded by less-than and greater-than signs. You should, if you have subscribed to the mailing list, have had the message sent to your e-mail.
If you do not have it, go to the mailing list and click that you want to reply; it will open your e-mail client in compose mode and the in-reply-to field will have been set to the proper message ID.@footnote{Assuming your e-mail client supports field specification and is associated with @code{mailto:}. Usually this is not a problem; if it is, claws-mail will do the job.}


@node Accepting patches
@section Accepting patches

To apply a patch, use the @command{git am}@footnote{`am' stands for `apply mailbox', but it works on regular patch files.} command:

@example
git am PATCH_FILE
@end example

It is good practice to sign off commits to help establish a chain to trace submissions, and some projects require it. To sign off with @command{git am}, just add @option{--signoff}.


@node Making pull requests
@section Making pull requests

A less feature-rich alternative to patches is pull requests. They are easier to use, because you do not need to know anything to make a pull request, and to accept them you just need to know how to pull from other repositories. Git does, however, provide a command that produces a clean standard message that can be posted on a mailing list. To do this just type:

@example
git request-pull FORKING_POINT_COMMIT YOUR_URL
@end example

Additionally you can add the commit that the pull request stops at, if you have another commit than @code{HEAD} (the commit you are currently working at) in mind. You can also add @option{-p} if you want to see the changes.



@node Basic commands
@chapter Basic commands

@menu
* The trees of Git::    How history in Git is structured
* File operations::     Working with files in Git
* Go back in time::     Virtual time travel
@end menu


@node The trees of Git
@section The trees of Git

Git has four trees you should know about to better understand how Git works.

The first tree you encounter is the working directory, called the working tree in Git.
The tree begins in the parent of the so-called Git directory, that is, the directory you executed @command{git init} in, which contains the directory @file{.git}.

When you use @command{git add} to stage files you encounter the next tree. This tree is called the index@footnote{Previously called the cache.}, and it is separate from the working tree: when you stage a file, you stage an edit, and if you edit the file further, those changes do not make it into the index until you restage the file.

When you have done some work (just a small logical step is recommended) and want to save your changes, you commit them with @command{git commit}. This is when you encounter the third tree, the @code{HEAD}. @code{HEAD} is the file tree of the last commit, and it is updated when you make a commit.

The fourth tree is not a file tree; it is the commit tree. The important thing with Git is that this tree is not linear. It is a directed acyclic graph, so it is not really a tree, but you can think of it as one because you are normally only interested in the leaves, your branches.


@node File operations
@section File operations

There are four basic operations you can do on files: add, update, remove and rename. Adding and updating are done with the same command:

@example
git add FILE
@end example

To remove or rename a file, just do as you normally would without Git, but prepend @code{git}:

@example
git rm FILE             # Remove FILE
git mv FILE NEW_NAME    # Rename FILE to NEW_NAME
@end example

If a directory becomes empty in the working directory, it is automatically removed from the working directory. Directories are never tracked by Git, so you cannot have an empty directory in a commit.

You can also use @command{git add -u}@footnote{@option{-u} is the short option for @option{--update}.} to stage an edit to an already tracked file, or to stage its removal if it has been removed from the working directory.
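
The effect of @command{git add -u} can be observed with the short status output. The following is a minimal sketch in a throw-away repository; the file names are made up for this example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example'
git config user.email 'example@example.org'

echo a > edited
echo b > removed
git add edited removed
git commit -q -m 'two tracked files'

echo changed >> edited    # edit a tracked file
rm removed                # delete a tracked file
echo c > untracked        # create a file Git does not know about

git add -u                # stages the edit and the removal only
git status --short        # 'untracked' is still listed with ??
```

New files are left alone by @option{-u}; they still have to be staged by name with @command{git add FILE}.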
Git tracks file renames implicitly, so @command{git mv} is the same thing as:

@example
cp FILE NEW_NAME
git rm FILE
git add NEW_NAME
@end example

This approach is flexible, but it has some caveats. GNU Arch keeps track of files by giving them unique identifiers. This solves the problem where you in Git can get an evil merge if the pulled branch does not have any common commits@footnote{Identified by commit ID, not by snapshot, which reflects more than the file content.}, for example because the pulled patch was not made from a cloned repository or did not contain commit history. Other systems track renames explicitly when a rename command is issued. That is worse, because it means that you need to use the rename command, and evil merges are even more probable. A problem with merging when there is a rename is that the changes are automerged instead of creating a conflict, so you can get evil merges where the content of the resulting file refers to the file's old name. Naturally this is still a problem if another file depending on the renamed file is edited in parallel.

If you want to know the file staging difference between the index and working directory, type @command{git status}.


@node Go back in time
@section Go back in time

Because Git keeps track of what has changed, it has a log you can access, which contains the commit messages, so you know when something happened or what has happened lately. To read the log type:

@example
git log
@end example

If you want to know which files have changed, you can use @command{git whatchanged} instead.

If you want to take a closer look at a commit and see the state of the project at that commit, type:

@example
git stash               # Only if you have uncommitted changes, this
                        # saves your changes outside the tree in a stack.
git checkout COMMIT_ID
# Take a look around!
git checkout -          # Checking out - means that you check out the
                        # commit you were on before the last checkout.
                        # Kind of like `cd -'.
git stash pop           # Only if you have uncommitted changes, this
                        # reapplies the changes you saved with `git stash'
                        # and removes them from that stack.
@end example

If you instead want to see all changes since that point in time, type:

@example
git diff COMMIT_ID
@end example

Or for a specific file:

@example
git diff COMMIT_ID FILE
@end example

If you decide that you want to go back permanently to this state, type:

@example
git revert COMMIT_ID..HEAD
@end example

The range @code{COMMIT_ID..HEAD} covers every commit made after @code{COMMIT_ID}, up to and including @code{HEAD}; @code{COMMIT_ID} itself is not reverted.

If you have not pushed the commits you want to revert, you can do a reset instead; this way they are irreversibly removed instead of a new commit being made:

@example
git reset --hard COMMIT_ID
@end example

But you should think of that as running as root:

@cartouche
@noindent
Chris: root is the number zero user, it is the main user in your system, it can do everything, it can literally delete the filesystem while your operating system is running.

@noindent
Bryan: Yeah, it is a great user in that regard.

@noindent
Chris: Yeah. Yeah.

@noindent
Bryan: Here is the thing, so people always say `do not run as root.' @emph{I always} run as root.

@noindent
Chris: Do you really always run as root?

@noindent
Bryan: Hell yes. Do you know why I always run as root?

@noindent
Chris: Why?

@noindent
Bryan: Awesome.

@noindent
Chris: Here, I…

@noindent
Bryan: I live on the edge. I'm like Mad Max.

@noindent
Chris: Do you really always run as root for real?

@noindent
Bryan: No, I do not run as root. Are you kidding me, that is asinine! I would love to think that I am so hardcore that I just always ran as root. I just, caution to the wind, screw it, let's just thunderdome this bitch and… you know, see what happens. But no, never run as root, never ever do that. The only time you run as root is when you run as root temporarily, you sudo.

@noindent
Chris: You need to do something.
@end cartouche

Only use @command{reset} if you are absolutely sure and know exactly what you are doing.
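
A cheap precaution before a hard reset is to give the current commit a branch name first, so that the commits you are about to drop stay reachable under that name. The following is a minimal sketch in a throw-away repository; the branch name @code{before-reset}, the file name and the commit messages are made up for this example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example'
git config user.email 'example@example.org'

echo one > file
git add file
git commit -q -m 'first'
echo two > file
git commit -q -a -m 'second'

git branch before-reset          # keep a name for the commit about to be dropped
git reset -q --hard HEAD~1       # the current branch is back at 'first'

cat file                         # prints: one
git log --oneline before-reset   # 'second' is still reachable from here
```

When you are sure the reset was what you wanted, the extra branch can be deleted with @command{git branch -D before-reset}.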



@node I just don't know what went wrong
@chapter I just don't know what went wrong

@menu
* Naïve reset::         Last resort recovery
* Using the stash::     Stash changes
* Commit amendment::    Correcting an unpushed commit
* Bisection::           Identifying when something broke
@end menu


@node Naïve reset
@section Naïve reset

If something went horribly, horribly, horribly, horribly wrong and you do not know how to get back to a clean state, you can always do this naïvely by cloning the repository:

@example
git clone REPOSITORY REPOSITORY.new
cp REPOSITORY/.git/config REPOSITORY.new/.git/config
rm -rf REPOSITORY
mv REPOSITORY.new REPOSITORY
@end example


@node Using the stash
@section Using the stash

The stash is a great utility for storing changes. If you have made changes in the working directory or the index, you can store them in the stash, and both the working directory and the index will be restored to the @code{HEAD}. Keep in mind that the naïve reset will discard the stash, because the stash is local.

Changes stored in the stash can be applied to any branch and any later state of the @code{HEAD}; that is what the stash is made for.

The basic stash operations include:

@table @command
@item git stash
Store the changes made to the index and working directory.

@item git stash drop
Discard the object at the top of the stash stack.

@item git stash apply
Apply changes stored as the object at the top of the stash stack.

@item git stash pop
Synonym for @command{git stash apply && git stash drop}.

@item git stash clear
Discard all stored stash objects.
@end table


@node Commit amendment
@section Commit amendment

If you have not yet pushed your latest commit, you can amend it. If you have pushed it, you should not amend it: the amendment changes the commit ID, because the ID is a SHA-1 hash sum of all information in the commit, and the amended commit no longer matches the pushed one. To amend your commit, run @command{git commit --amend}. It will launch your text editor so you can edit the commit message; additionally, all staged changes are included in the amendment.
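
The following is a minimal sketch of an amendment in a throw-away repository, showing that the commit ID changes while the number of commits does not. The file names and the commit message are made up for this example; @option{--no-edit}, which keeps the existing commit message instead of opening the editor, is described in @command{man git-commit}.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example'
git config user.email 'example@example.org'

echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m 'add main.c'
before=$(git rev-parse HEAD)

echo 'all: main' > Makefile     # a file that should have been in the commit
git add Makefile
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)

if [ "$before" != "$after" ]; then
    echo 'The commit ID changed.'
fi
git rev-list --count HEAD       # still a single commit
```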


@node Bisection
@section Bisection

Bisection is the process of identifying when a bug was introduced. To start a bisection you first need to tell Git to start bisecting and specify the commit range. If the current commit is bad, you type:

@example
git bisect start
git bisect bad
git bisect good LAST_KNOWN_GOOD_COMMIT
@end example

After this you type one of the following, depending on whether the commit Git checks out is good or bad:

@example
git bisect good
git bisect bad
@end example

Git will tell you when it has found the first bad, or possibly first bad, commit. To then check out the commit that was checked out before the bisection started, type:

@example
git bisect reset
@end example

If you, in the process of the bisection, land on a commit you need to skip because it has some other problem, you can use

@example
git reset --hard HEAD~N
@end example

Where @code{N} is the number of revisions before the checked out one that you want to jump to. You can also use @command{git bisect skip} to tell Git which revisions cannot be tested, so that they are excluded from the bisection process.

Instead of manually telling Git whether a commit is good or bad, you can use:

@example
git bisect run TEST_SCRIPT [ARGUMENTS...]
@end example

The test script should exit with 0 if and only if the commit is good, 125 to skip the commit, and anything else between 1 and 127, inclusive, if the commit is bad. Other values (128–255) will abort the bisection.



@node Version control
@chapter Version control

@menu
* Tagging versions::    Releasing your new version
* Cherry picking::      Backporting and selective proposed update merge
* Examine the log::     Utilising Git's commit log
@end menu


@node Tagging versions
@section Tagging versions

Generally, programs have different release versions. When you release a new version, you tag the last commit that makes it into that version.
To do this, create an annotated tag object and push
it to your origin:

@example
git tag -a 'RELEASE_VERSION'
git push origin 'RELEASE_VERSION'
@end example

The created tag can be referred to just like any
commit or branch. If you want to remove a tag, you
just tell Git to delete it and push the deletion of
its reference to your origin:

@example
git tag -d 'RELEASE_VERSION'
git push origin :refs/tags/'RELEASE_VERSION'
@end example

@file{refs/tags/RELEASE_VERSION} is a file in the
@file{.git} directory. Using a @code{:} tells Git
that you want to push the local file before the
@code{:} to the remote file after the @code{:}.
If the local file is not specified, in other words,
the argument begins with @code{:}, you are telling
Git to remove the remote file. This only works with
references, that is, files inside @file{.git/refs}.



@node Cherry picking
@section Cherry picking

Cherry picking is the action of applying changes
made in another commit. It is a great tool both for
applying changes made to another branch and for
backporting features.

Cherry picking works just like merging branches,
except that instead of choosing a branch to pull
and applying all its changes, you choose individual
commits and commit ranges:

@example
git cherry-pick COMMIT_ID
@end example



@node Examine the log
@section Examine the log

So how are you going to cherry-pick commits if you
do not know their commit IDs? Simple, you use Git's
log tool to find their IDs. The log will only show
the currently checked out commit and the commits
earlier than it in the branch.

Typing @command{git log} will show you each commit's
ID, author, date, and commit message. You can limit
the log to commits where specific files have been
changed by appending those files to the command.



@node Interface
@chapter Interface

@menu
* First things first::              Order matters!
                                    (Especially with find)
* Wildcards::                       Beware of wildcards
@end menu

@comment TODO man gitrevisions



@node First things first
@section First things first

Many Git commands take both revisions and paths as
their arguments. First come the revisions, then the
files. If you have files that can be mistaken for a
revision, place a @option{--} between the revisions
and the files; anything after a @option{--} is
interpreted as a file. It is good practice to do
this in scripts that take arbitrary user input.



@node Wildcards
@section Wildcards

Many Git commands allow wildcards in paths. These
commands will expand wildcards in the arguments just
as the shell does. To avoid problems, never use
characters in paths that are used in the shell for
wildcards and expansions… yeah, I know, it is annoying.



@node Features
@chapter Features

@menu
* Git and permissions::             File permission tracking in Git
* Git and timestamps::              File timestamp tracking in Git
* Git and custom merge tools::      Merge tool customisability in Git
* Git and shared build caches::     Shared build caches do not belong in source control
* Git and keyword expansion::       Keyword expansion is evil and does not belong in source control
* Git and links::                   Symlink and hardlink tracking in Git
* Git and filenames::               Filename tracking in Git
* Git and merge tracking::          Merge commit tracking in Git
* Git and empty directories::       Nontracking of empty directories in Git
* Git and file renames::            File rename tracking in Git
* Git and encoding convertions::    Encoding conversions in Git
* Git and atomic commits::          Atomic commits and source control
@end menu



@node Git and permissions
@section Git and permissions

File systems let you set permissions for users on
files: permissions such as whether the owner, a
specific user group, or others can read, write or
execute the file, as well as whether the permissions
of the owner of the file are granted to the executing
instance of the program, for further permissions
that are mission critical.
Tracking permissions in a version control system
makes no sense, since source code should always be
readable and writable by the owner, and it is local
configuration whether you want to extend those
permissions to other users on your computer.
However, there is source code that is executable:
scripts, such as Bash, Perl and Python scripts.
This is why Git does, however, track the execution
bit. The set-user-ID bit and the set-group-ID bit
are however not tracked, nor should they be; those
should only be set when the program is being
installed. Git does not track directories, so
permissions on directories are also ignored.

Be aware that some systems set the execution bit
willy-nilly when a file is created. This is a
shortcoming of Git, but only in comparison to darcs.
Since files are only executable if they are compiled
or have a shebang, it would make more sense to set
it on, and only on, the files with a shebang or
otherwise tagged to have a specific permission.

Additionally, Git does not track file ownership
(that would not only be stupid, that would be
outright dumb, even moronic), access control lists,
extended attributes, or forks.



@node Git and timestamps
@section Git and timestamps

One of Git's features is that it does not have the
feature@footnote{Yes, that it does not have it. Do
not be confused, not having a feature is often the
greater feature.} of preserving timestamps.
Preserving the modification time of files is
considered harmful.

Consider what would happen if Git were to preserve
the modification time. If you have compiled the
program, and then check out an earlier commit,
perhaps on another branch, and build again, your
compiled files from the other branch now have a
newer timestamp than the source in the current
branch, and the files will not be rebuilt unless you
force them to be, which you more than likely will
forget. So you may observe a behaviour of your
program that is not defined by the checked out
source code.
This is why Git sets the timestamp to the current
time on every file it modifies, but only on those.
Files that have not changed will not change
timestamp, so the build system will not rebuild them
unless there is some other reason it needs to.



@node Git and custom merge tools
@section Git and custom merge tools

Git lets you use custom automatic merge tools. This
can be used, for example, to merge binary file
formats. It is possible to use a merge tool that
requires human interaction; however, this is
discouraged because then the commit history will
have no indication that there was a merge conflict.
You can use a merge tool that tries to merge and
marks all conflicts, and then manually use a tool
that lets you see the conflicts and resolve them.



@node Git and shared build caches
@section Git and shared build caches

A few source control systems have a shared build
cache of derived objects. Git does not. A build
cache, private or public, is none of a source
control system's concern and should be handled with
tools made specifically for just that.



@node Git and keyword expansion
@section Git and keyword expansion

About half of all source control systems support
keyword expansion. As a feature, Git does not.
Keyword expansion causes strange problems and is not
really useful. Nor is it truly a task for source
control systems to implement; it can just as
successfully be implemented with external tools,
where it belongs.



@node Git and links
@section Git and links

A Git repository will not grow larger if you hard
link a huge file over and over again, not that you
would ever want to… However, Git does not actually
track hard links.

Git supports symbolic links and stores the exact
reference stored in the symlink; it is not resolved
or otherwise rewritten, so make sure you use
relative references in your symlinks.
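
To see that the link target is stored verbatim, the
following sketch stages a relative symlink in a throwaway
repository and inspects what Git has recorded. The
directory and file names are invented for this example,
and it assumes a file system that supports symbolic links.

@example
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir src
echo 'hello' > src/hello.txt

# A relative target stays valid wherever the repository is cloned.
ln -s src/hello.txt link
git add src/hello.txt link

# Mode 120000 marks the index entry as a symbolic link.
mode=$(git ls-files -s link | cut -d' ' -f1)
# The stored blob is the target path itself, not the file's content.
target=$(git cat-file -p :link)
echo "$mode $target"
@end example

An absolute target would be stored just as literally, and
would then point nowhere useful in anybody else's clone.
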



@node Git and filenames
@section Git and filenames

Git treats filenames as byte sequences, meaning that
it supports all characters supported by your file
system. But if one developer's operating system uses
UTF-8 and another's uses UTF-16, or only ASCII,
there will be problems. While it is true that
systems really should use UTF-8, be case sensitive
and support the entire Unicode, even characters that
are not yet defined, with use of all 31
bits@footnote{Yes, Unicode supports only up to 31
bits, so negative ordinals can be considered process
private use.}
@c It is a shortcoming of English that this
@c footnote cannot be expressed more clearly
@c without significant length increase.
except the NUL control character, you may consider
just using lower case ASCII and, to be friendly to
other systems, only a very restricted set of
punctuation and no control characters.



@node Git and merge tracking
@section Git and merge tracking

Git takes history seriously; therefore, when a merge
is made, even if it can be fully merged
automatically, a commit is created. This is very
useful for keeping track of changes and for
identifying and resolving evil merges. If you are
inclined to shoot yourself in the foot you can
always rebase your repository so that merges are
not created.



@node Git and empty directories
@section Git and empty directories

Git does not track empty directories; it even
removes them for you when it removes all files in a
directory. Basically there is no reason to track
directories, ever, but if you are so inclined you
can add a @file{.gitignore} file inside the
directory.



@node Git and file renames
@section Git and file renames

The authors of Git believe that tracking file
renames implicitly provides a more flexible way to
track how your tree is changing, and less painful
merging of patches.



@node Git and encoding convertions
@section Git and encoding conversions

Git can keep track of encoding for commit metadata.
Git does however not convert file content; if you
are indeed inclined to shoot yourself in the foot,
a filtering mechanism can be used.



@node Git and atomic commits
@section Git and atomic commits

This might seem obvious, but CVS, ClearCase and
Visual SourceSafe do not implement this: atomic
commits. Atomic commits refer to a guarantee that an
action cannot be interrupted and leave the
repository in a partially complete state that is
marked as complete; either all changes are made or
none at all.

Basically this means that you first back up the
reference to the current commit, mark in a file that
a commit is being saved, store the new commit,
update the reference to the current commit to point
at the new commit, and then mark that no commit is
being saved. If the mark is later detected to say
anything other than that no commit is being saved,
the state is restored in the same manner.



@node Beyond Git
@chapter Beyond Git

@menu
* Additional tools::                Programs that you can use together with Git
* The binary problem::              Binary files are evil against source control
* Writing commit messages::         How to write good commit messages
* Standard files::                  People have expectations, and they should have
* Keeping the repository clean::    Good housekeeping is important
* A friendly build system::         Build systems make it easier for you and others
@end menu



@node Additional tools
@section Additional tools

Git is used for source control; for a complete,
possibly collaborative, development environment you
need additional tools.

Everything Git can do, you can do on the command
line, but some repetitive tasks get cumbersome on
the command line because you will need to run the
same command in unpredictable variations. For this,
the package and command @command{tig} may be just
what you need if you live in the terminal.

@command{bugseverywhere} is a great tool for keeping
track of issues in Git repositories.
Issues are committed to the branch you are currently
working on, meaning that you can have separate
issues in separate branches. So if you have separate
branches for separate features that are being
implemented, you can create separate issues inside
those branches. And when a branch gets merged into
your develop branch, the unresolved issues are
merged into the develop branch.

If you are working on a large project with multiple
collaborators and contributors, and you have a
dedicated hosting server, you can install Internet
services that can assist collaboration, and
especially contributors.

@itemize
@item
Flyspray is a web-based project management and issue
tracking system used by many projects.
@item
GNU Mailman is a project for managing electronic
mailing lists, which can be very useful for
accepting patches.
@item
newsd is a standalone NNTP server for centralised
newsgroup forum serving on a single server, which
can be very useful for discussions.
@end itemize

All of these programs, as well as Git, are released
under the GNU General Public License, except
Flyspray, which is released under the GNU Lesser
General Public License@footnote{Yeah, it is a weird
license for anything that is not a library, but that
is what it is released under.}.



@node The binary problem
@section The binary problem

Source control does not work well with binary files.
Consider two people editing the same file, which
cannot be interpreted by a human using a text
editor. If there is a conflict, Git may not realise
it, depending on the binary format, or may not be
able to merge the changes. If Git cannot merge the
file and you cannot open it in a text editor, you
will not have a file you can open and see the
conflicts in, so you must open both versions and
manually inspect and merge them.

Perhaps you plan to work alone and never ever get
another developer on your project, and are not
convinced that merge conflicts are reason enough to
work with text based formats.
Then consider that you need to know the exact
changes of commits; the Git log will not be able to
help you if you are using binary formats.

Luckily, most types of file formats have a text
based alternative.

@table @asis
@item Raster images
Portable pixmap (.ppm, .pgm, .pbm, .pnm) is a text
based@footnote{Optionally partially binary.} image
format that is supported by the GNU Image
Manipulation Program (GIMP).

@item Vector images
Scalable Vector@footnote{Yes, it is redundant, but
that is really what it is called.} Graphics (SVG) is
the most popular vector image format, and in fact it
is text based; more precisely, it is XML based.
While you may not be able to view an SVG file with
merge conflicts, you are still able to open it in a
text editor and fix the conflicts. It may not be the
simplest thing to resolve in a text editor, but it
helps you to identify where in the graphics the
conflicts are located.

@item Documentation
Texinfo and TeX are two very popular alternatives to
word processors such as LibreOffice. Texinfo is
designed for manuals and books, and can be compiled
to virtually any publishing format, including
@command{info} manuals, Hypertext Markup Language,
Portable Document Format and PostScript. TeX is more
general purpose@footnote{Texinfo is actually a macro
set for TeX.}, has extensible and redefinable
syntax, and is written to guarantee that one source
will always and everywhere produce the exact same
binaries, in terms of how they look when viewed. Its
macro set LaTeX is the preferred office system in
academia.
@end table



@node Writing commit messages
@section Writing commit messages

Commits are accompanied by messages. These help both
yourself and other developers to identify a specific
commit of interest, and give information about what
is happening in the development process, so every
developer can keep up to speed.
Commit messages can also be used to create change
logs, where you create a change log from the commit
history and filter out unimportant changes. Change
logs and commit messages are generally written in
the same style: a short message written in the
imperfect tense, optionally with additional larger
paragraphs that go more in depth.

Even if your project is not in English, it is
preferable to keep your commit messages, as well as
as much as possible of your project (that is not
visible to the user), in English. Assume your
project is in Swedish; everyone that understands
Swedish will be able to translate it to any other
language. So far it would be okay to write
everything in Swedish, but once it has been
translated, for example to English, additional
translators that do not understand Swedish can
contribute additional translations. Now English is
preferable: while the translators can understand
everything that they should translate, they do not
understand anything else. When translating a program
it is useful to be able to understand what the
program does and not just what it prints, so you can
make more accurate translations, and sometimes just
one translation is not enough to get an unambiguous
understanding.

It is preferable to commit as often as possible;
however, this interrupts your flow of thought, so
committing should take as short a time as
possible@footnote{Git is very friendly in this
respect, as creating a commit is lightning fast, and
photon fast in comparison to old school source
control systems.}. Because of this you may need to
compromise on your messages, which is not too bad
because they are accompanied by code, but never do
it when submitting patches or pull requests. One
thing you can do is to not describe a fixed bug if
it should be obvious from the
changeset@footnote{Can only be obvious from small
changes.}; after all, if you do not understand a
change you can always ask someone. Another thing you
can do is to use some standard shorthands.

@table @asis
@item m
Minor change.
@item doc
Add or change documentation.
@item typo
Fix a typo (a typing error).
@item spello
Fix a spello (a spelling error).
@item stylo
Fix a stylo (a literary style error).
@item grammaro
Fix a grammaro (a grammatical error).
@item style
Fix a coding style error or mistake.
@item ref
Abbreviation for `reference'.
@item dir
Abbreviation for `directory'.@footnote{`Catalogue'
and, for Windows folk, `folder' are synonyms for
`directory'.}
@item conf
Abbreviation for `configuration' or `configure'.
@item misc
Abbreviation for `miscellaneous'.
@item +
As well as, another logical change in the same commit.
@end table

If you want to shorten the time it takes to create a
commit, I personally recommend having a short shell
function that opens your text editor and stages the
opened files when the editor exits, and a shell
function that runs @command{git commit -m "$*"}.



@node Standard files
@section Standard files

All projects should have a set of files:

@table @asis
@item @file{README} @i{(optional)}
You should have a readme file at the root of your
project. It should describe the project and how the
program is used.

@item @file{DEPENDENCIES} @i{(optional)}
If your program has other dependencies than a
compiler, linker, interpreter, @command{libc},
@command{coreutils} and similar standard packages,
and make tools such as @command{make} and
@command{automake}, you should have a file that
lists all dependencies: runtime dependencies,
optional runtime dependencies, build dependencies,
opt-out build dependencies and opt-in build
dependencies. Try to specify the version range and
what the package is used for, especially for
optional dependencies.

@item @file{INSTALLING} @i{(optional)}
For more advanced build systems you should have a
file that specifies how to configure the building
process.

@item @file{CONTRIBUTING} @i{(optional)}
If you have rules on how to submit patches, code
style guidelines, or other information for
contributors, you should have files with all such
information at the top of your repository.

@item @file{HACKING} @i{(optional)}
For complex projects you can have a file named
@file{HACKING} with information about how to modify
the code.

@item @file{COPYING}
When you make your project available it is no longer
private software and you need to give it a license
compatible with its dependencies. If you do not have
a license it defaults to being proprietary. The
copying file includes the license summary, or the
complete license text if it is short, and at the
top, the project name, a short description, the
years of active development, and the copyright
holder's name and e-mail address. It is the same
text as the copyright information you put at the top
of source code files.

@item @file{LICENSE}
If the project's license is large, you put the full
license text, in plain text, in a file named
@file{LICENSE} at the top of your repository.
@end table



@node Keeping the repository clean
@section Keeping the repository clean

Keeping repositories clean is instrumental in making
them easy to maintain and simple for new
contributors to get started with. Do not commit
binaries to the repository; it should only contain
source files. This means that you do not commit the
precompiled program, libraries the project is using,
or integrated development environment (IDE) files.
You can however make exceptions for precompiled
non-programs that are compiled by your build system,
if you think it is useful enough for users to have
them available precompiled. For example, you can
have a precompiled manual.

Your project can include directories such as:

@table @file
@item bin
This directory should not be committed; rather, it
should be ignored and be created when compiling the
program. It should contain linked files such as
commands, .so files and .jar files.

@item obj
This directory should not be committed; rather, it
should be ignored and be created when compiling the
program. It should contain compiled but not linked
files such as .o files and .class files.

@item src
Put your source code in this directory.

@item dev
Auxiliary files and scripts used by developers, such
as code self tests, bisection commands, and resource
file inspection scripts.

@item contrib
Personally, I do not like this, but you can use it
for additional source that is not required for the
core of the package.

@item dist
If you maintain the packaging of your project for an
operating system distribution, you can have
directories named @file{dist/DISTRIBUTION} for each
distribution. You may want to do it this way because
then other users can look at it and start
maintaining the packaging of your project for the
operating system distribution they are using.

@item share
If you have any of the following directories, you
can put all of them in this directory instead of in
the root.

@item completion
All commands should have shell tab-completion. If
you are writing completions individually for each
shell you can place them in this directory.

@item manuals
If you write manuals in multiple formats you can
place them in this directory.

@item info
Every project should be well documented. If you are
doing this with Texinfo, you can put your Texinfo
files in this directory.

@item man
If your project has manpages, you can place them in
this directory.

@item po
Programs that use @command{gettext} for translations
can place the translations in this directory.

@item *
Resource files can be placed in a directory named
after their category.
@end table



@node A friendly build system
@section A friendly build system

`So I should not include project metafiles used by
my integrated development environment?' No. When you
do that you are binding everyone to your
environment, and you do not provide everyone with a
way to build your package.
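
The ignore rules implied here and in the previous section
can be sketched as follows, in a throwaway repository. The
patterns @file{.idea/} and @file{*.iml} are only examples of
IDE metafiles, and the file names are invented; substitute
whatever your own environment produces.

@example
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore build output and (example) IDE metafiles.
printf '%s\n' '/bin/' '/obj/' '.idea/' '*.iml' > .gitignore

mkdir bin obj src
touch bin/prog obj/prog.o src/prog.c project.iml

# List the untracked files that Git would still offer to add.
git ls-files --others --exclude-standard
@end example

Only the sources and the @file{.gitignore} file itself remain
candidates for committing, so a careless @command{git add .}
cannot drag build output or editor state into the repository.
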
Every package should be buildable with a small set
of commands that do not require human interaction
beyond typing them in a simple, predictable manner
on the command line; this is very important for
package distribution.

To make it possible for everyone to build your
program you can use GNU Autotools or just a simple
handwritten makefile. Additionally, such a build
system lets you provide means to configure and
customise the build process, as well as to install
and uninstall the program (without using package
management). Further, it allows you to compile
individual files and clean the directory of all
compiled files.



@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include fdl.texinfo

@node Glossary
@appendix Glossary

@table @asis
@item alternate object database
A repository can inherit part of its object database
from another object database, which is called
``alternate''.

@item bare repository
A bare repository is a repository without an index
or working tree. It just contains the commits.
Because it just contains the commits it does not
have a @file{.git} directory, but instead directly
contains the content you would find in a
@file{.git} directory. It is normally named with
@file{.git} as a suffix.

@item blob object
Untyped object, for example, the contents of a file.

@item branch
Alternative parallel development line, normally
intended to be merged with the branch it forked out
of.

@item cache
Obsolete, replaced by `index'.

@item chain
List of objects where each object has a reference to
the next object, its successor.

@item checkout
The action of changing branch.

@item cherry-picking
Creating a new commit from a subset of commits.

@item clean
A working tree is clean if it has no changes
relative to the current @code{HEAD}.

@item commit (noun)
A single point in the development history stored in
Git. The entire history of a project is represented
as a set of interrelated commits.
Alternatives to Git may use the terms `revision' or
`version' instead of `commit'. `Commit' is also used
as a shorthand for `commit object'.

@item commit (verb)
The action of storing a new snapshot of the
project's state. The state of the index is stored
and @code{HEAD} is advanced to the new commit.

@item commit object
A Git internal object which contains the information
about a particular revision. It contains information
such as author, committer, date, files and parents.

@item core Git
The fundamental tools of source code management.

@item dangling object
An object that is not reachable, even from other
unreachable objects; there are no references to it.

@item detached @code{HEAD}
@code{HEAD} that does not store the name of a
branch. Git allows you to check out arbitrary
commits; when you check out a commit that is not the
tip of any branch, the @code{HEAD} is ``detached''.
To store changes made in a detached @code{HEAD} you
must first create a new branch from it.

@item directory
You may also know it as `catalogue' or even
`folder'. It can contain files and other
directories, which you can list with the command
@command{ls}, and you can change directory with the
command @command{cd}.

@item dirty
The working tree is dirty if it contains uncommitted
modifications.

@item evil merge
A merge that introduces changes that do not appear
in any parent. Variable name conflicts can be a
cause of evil merges. This is why you do not rebase
your commits.

@item fast forward
The action of doing a fast-forward merge, a pull for
updates when your Git branch is just behind, not
diverged.

@item fast-forward
A special type of merge that will often be the case
when you pull updates from a remote repository. A
fast-forward merge is a merge where you have made no
changes but there are changes on the remote-tracking
branch that can be pulled without any merge logic.

@item fetch
When you fetch a branch's head ref from a remote
repository, you download, to a local database,
objects that have not yet been downloaded.

@item gitfile
A plain file named @file{.git} located in the root
of the working tree, that points to the real
repository. It is a great idea to use this if you do
not want a backup repository, as it prevents you
from accidentally removing anything but the working
tree.

@item grafts
Two different development lines can be joined
together by recording fake ancestry information.
This is configured via the @file{.git/info/grafts}
file.

@item hash
Especially if cryptographic: almost unique,
infeasibly reversible, fixed size scrambling of
content. In Git's context, synonym for object name.

@item head
A named reference to the commit at the tip of a
branch.

@item @code{HEAD}
The head of the currently checked out branch, or the
currently checked out commit, in which case the
@code{HEAD} is ``detached''.

@item head ref
@itemx head reference
Synonym for `head'.

@item hook
Several Git commands make callouts to user definable
scripts. This allows developers to add functionality
or checks, so that commands can be verified and
aborted. @file{.git/hooks} is filled with sample
scripts that can be enabled by removing their
@file{.sample} suffix.

@item index
Snapshot of the working tree intended to be promoted
to a commit.

@item index entry
The information regarding a particular file that is
stored in the index.

@item master
Unless set otherwise, the default branch in a
repository. It is created with the first commit in
the project.

@item merge (verb)
The action of integrating the commits from another
development branch.

@item merge (noun)
A commit that is created when merging a branch into
another.

@item object
The unit of storage in Git. It is identified by the
SHA-1 hash of its content.

@item object database
Stores a set of Git objects.

@item object identifier
Unique 40 character hexadecimal identifier of an
object, derived from the object's content by hashing
with SHA-1.

@item object name
Synonym for object identifier.

@item object type
`Commit', `tree', `tag' or `blob'.

@item octopus
The action of merging more than two branches.

@item origin
The default repository to push to, the one you
cloned.

@item pack
A set of objects that have been compressed.

@item pack index
A list of metadata for objects in a pack, to speed
up access time of individual objects.

@c TODO @item pathspec
@c @item path specification

@item parent
Logical predecessor. The commit from which a new
commit is made is called the new commit's parent. A
merge has two or more parents, and the first commit,
or the first commit in an orphaned branch, has no
parents.

@c TODO @item pickaxe

@item plumbing
The fundamental tools of source code management.
Low-level commands.

@item porcelain
High-level commands.

@item pull
The action of integrating new commits from another
branch, often a remote-tracking branch.

@item push
The action of sending updates to a remote
repository.

@item reachable
All ancestors of a given commit are reachable from
that commit. This can be generalised to chained
objects.

@item rebase
The action of rewriting history by pretending that
you pulled updates before committing.

@item ref
@itemx reference
SHA-1 hash or name of a particular object.

@item reflog
@itemx reference log
Local history of a reference.

@item refspec
@itemx reference specification
Description of the mapping between remote ref and
local ref. Used by fetch and push.

@item remote-tracking branch
A reference to a branch that is used to follow
changes in another repository.

@item repo.
@itemx repository
Database of development history.

@item resolve
The action of manually merging the parts the tool
could not automatically merge.

@item revision
A commit.

@item rewind
The action of throwing away part of the development.

@item SHA-1
Secure Hash Algorithm 1, a cryptographic hash
function used by Git for object names. It is a fast
hash algorithm, but it is also quite old.

@item shallow repository
A repository with an incomplete commit history.

@item symref
@itemx symbolic reference
A reference that does not point to a SHA-1 ID, but
rather to another reference.

@c TODO @item tag
@c TODO @item tag object
@c TODO @item topic branch
@c TODO @item tree
@c TODO @item tree object
@c TODO @item tree-ish
@c TODO @item unmerged index
@c TODO @item unreachable object
@c TODO @item upstream branch
@c TODO @item working tree

@end table

@bye


TODO:
  .git/config
  man gitrepository-layout
  man gitignore
  man gitmodules

TODO: (late in the manual)
  mirrors, but they are not safe!
  man gitattributes
  man githooks
  man gitdiffcore
  man gitcredentials
  man gitnamespaces
  man gitcore-tutorial