    Blog has moved to, see April, 2008 on new site

    Blog entries for April, 2008

    This post moved to

    I previously mentioned Easy GIT, which greatly improves git, in large part by hiding man pages and command line options packed with unimportant implementation detail, while adding examples and options that relate to workflow.

    Since then I've been using Easy GIT with other people and a central repository, moving up the learning curve a bit, and I've started to find some stuff that still doesn't work for me.

    (I guess I'll say eg and git interchangeably, since in many cases enhancements could go in either project.)

    Should be a way to globally see what is outstanding

    For my workflow, with a central repository and a small team (which is all my projects ever, whether D-Bus or Metacity or LiTL), any local-only changes or local-only branches are temporary.

    In fact my standard procedure on a branch that will last a few days is to push to the server pretty much every hour or so, so I have a backup. Easy GIT (or git) makes this easier in some ways, since it's easy to create my own branch on the server.

    No way I'm keeping a few days or weeks of work only on my local drive. I've watched a few too many other people do that and regret it.

    Here's what happens, though. As the day goes on I end up with a half-dozen branches, and some commits on master too, in various stages of patch review, some approved for merge to master and some not. For most of these branches I probably intended to push them to the server, but for some really small quick-fixes, perhaps not.

    Now say I want to power down, or go to bed for the night, or switch from my home computer to my work computer; what I want to do is say "sync to server" - just back it all up! I don't want stuff only on my local drive. If I go from home to work or vice versa, I want everything available on the server so I have it.

    Two ideas:

    • eg sync origin should be possible. "Just put everything on the server unless I've explicitly marked it local-only."
    • eg outstanding origin should be a command that describes all differences between local and remote repositories, so if something is not synced, I can quickly find it.
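
    A rough sketch of what "eg outstanding" might report, assuming a git recent enough to support the upstream and upstream:track fields of git for-each-ref. The names parse_outstanding and outstanding are made up for illustration and are not part of eg or git. Ahead/behind counts are computed against the local remote tracking branches, so a fetch beforehand is assumed.

```python
import subprocess

# Real git plumbing fields; a literal tab separates them because
# ref names cannot contain control characters.
FORMAT = "%(refname:short)\t%(upstream:short)\t%(upstream:track)"


def parse_outstanding(text):
    """Return (branch, reason) pairs for branches that are not fully synced."""
    report = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Pad so that lines with missing trailing fields still unpack.
        branch, upstream, track = (line.split("\t") + ["", ""])[:3]
        if not upstream:
            report.append((branch, "local only, no remote branch configured"))
        elif track:
            # track looks like "[ahead 2]", "[behind 1]" or "[gone]"
            report.append((branch, f"{track.strip('[]')} relative to {upstream}"))
    return report


def outstanding(repo="."):
    """Hypothetical 'eg outstanding': ask git, then summarise unsynced branches."""
    out = subprocess.run(
        ["git", "for-each-ref", f"--format={FORMAT}", "refs/heads"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return parse_outstanding(out)
```

    Branches that match their remote branch exactly produce no line at all, so an empty report would mean it is safe to power down.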

    This morning I set out to push all my patches that had not been pushed. Problem one: I couldn't figure out what these patches were.

    Remote tracking branches: implementation detail

    Remote tracking branches are confusing, and I think they could simply be an implementation detail. I care about remote repositories ("remotes"); I care about branches that are on remotes; I care about having an offline cache of branches that are on remotes; but I do not care that the offline cache happens to be implemented as a branch. And I do not ever, ever, ever want to write to the remote tracking branch.

    How does one write to a remote tracking branch? I'm not sure, to be honest. But today, for a second time, I discovered I had a remote tracking branch that was somehow not the same as the branch on the server it was supposed to be tracking. My only guess is that this results from typing "push origin/master" instead of "push origin", or the like. But I have no idea, really, how this could happen, or why I would want it to happen. Worse, I haven't been able to figure out how to fix it, short of a fresh clone.

    This is only a small symptom of the problem, though. I think the big picture is that for purposes of command line syntax, "origin/master" should mean "branch master on the origin remote." If an operation should be done offline (as everything except writes and fetches should be), then behind the scenes it would use the remote tracking branch. If an operation is a write, then it should go to the remote branch instead of the tracking branch.

    I don't need to know that "origin/master" and "branch master on origin" are different. I think it's clear in all contexts which one I mean, because git already separates network operations from local-only operations, and because it is never correct to modify the remote tracking branch (except to pull in new stuff from the remote branch, of course).

    On every pull, the system should verify that the remote tracking branch (aka the offline cache) is exactly the same as the remote branch, and make it be the same if it isn't. And "push --branch master origin" simply should not be different from "push origin/master" - that's crazy.
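
    The "verify the offline cache" step could be sketched like this. It assumes the text output of git ls-remote --heads (one "commit id, tab, refs/heads/name" line per branch) and a dictionary of what the local remote tracking branches currently point at. The function names are hypothetical.

```python
def parse_ls_remote(text):
    """Map branch name -> commit id from `git ls-remote --heads <remote>` output."""
    heads = {}
    for line in text.splitlines():
        sha, _, ref = line.partition("\t")
        if ref.startswith("refs/heads/"):
            heads[ref[len("refs/heads/"):]] = sha
    return heads


def stale_tracking(remote_heads, tracking):
    """Names whose cached commit differs from the server, or is gone there."""
    return sorted(
        name for name, sha in tracking.items()
        if remote_heads.get(name) != sha
    )
```

    Anything this returns is a cache entry that no longer matches the server, which is exactly the state that should never be reachable by accident.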

    Whether and where to push/pull: property of the branch, not of the push/pull operation

    Getting back to the idea of "eg sync": at any given time, I'm planning to either never push a branch, or always push a branch, or not push it for a while and then only push it when I explicitly decide to; but whatever the plan, it's not something that changes every hour. I want to say "keep this branch in sync with server"; or "don't send this to the server ever"; or "don't sync this for now, I'll re-enable sync later."

    If branches were tagged with whether to push them or not, and to which server branch, I could globally "eg sync" the entire repository.
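
    To make the idea concrete, here is a minimal sketch of per-branch sync tags driving a global sync. Everything in it (the Sync enum, plan_sync, the shape of the branches mapping) is an illustrative assumption; neither git nor eg stores branch policy this way. The only real git piece is the "local:remote" refspec passed to git push.

```python
from enum import Enum


class Sync(Enum):
    ALWAYS = "always"   # keep this branch in sync with the server
    NEVER = "never"     # local-only, never send to the server
    PAUSED = "paused"   # on hold for now, re-enable later


def plan_sync(branches, remote="origin"):
    """Build the push commands a hypothetical 'eg sync' would run.

    branches maps local branch name -> (policy, remote branch name or None).
    """
    commands = []
    for local, (policy, target) in sorted(branches.items()):
        if policy is not Sync.ALWAYS:
            continue
        # Default to the same name on the server when no target is given.
        commands.append(["git", "push", remote, f"{local}:{target or local}"])
    return commands
```

    With the policy attached to the branch, the push itself needs no wildcard option: paused and local-only branches simply never appear in the plan.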

    I guess git lets you push a branch to multiple different remote branches. Seems like an obscure feature that I can't imagine using. For me it would be fine if, for each remote branch I want changes on, I had to create a local branch, attach it to that remote branch, make the changes on this dedicated local branch, and push. In the normal case, I would have a local branch for all remote branches already anyway.

    But, if there are people who love pushing to lots of remote branches from one local branch, they can just set all branches as "never sync" and then they can push individual branches by hand. The rest of us should be able to sync all shared branches at once, while still having local-only branches if we want.

    Easy GIT has --all-branches and --matching-branches, but these are IMO wrong workarounds. --all-branches forces you to push stuff that may be a throwaway local branch or "on hold" temporarily. --matching-branches doesn't push new branches and may also push a branch you wanted to keep on hold. What's needed is that branches know where they go; I shouldn't have to push with a special "wildcard" option to do the normal thing, which is to sync all branches marked shared, and not sync any branches I intend to be local-only.

    Feedback: tell me what's going on!

    Now that Easy GIT fixed the docs, I think the number-one UI deficiency in git is that it has no feedback; it does not explain what it's doing when it's doing it. Sometimes it's totally silent; sometimes it has a bunch of babble about "objects" and "packs" that means nothing to me; neither of those is good.

    This steepens the learning curve, since you can't watch what commands do.

    Maybe worse, it makes the source control system "feel bad." For me, the purpose of a source control system is to make it so I can never lose any history or data; when every command feels like it did something mysterious I'm not sure I understand, I don't have a sense of security.

    Commands should output things like: "downloading changes from remote server 'origin' on remote branch 'master'"; "merging branch origin/master onto branch master"; "3 new changes applied to master". For each command, I should get feedback on any network transfers; all branches that were involved; and all commits that were created or merged.

    "eg branch" should show more than only branch names to help orient me. I would like to know if the branch is synced and if so to which remote branch, for example.

    ChangeLog workflow is wrong

    For a detailed ChangeLog, I want to write the ChangeLog entry as I develop the code, using 'C-x 4 a' in Emacs, ideally.

    The problem is that when I go to commit, that's not when I want to write the log. I prefer to write it either as I go, or just before commit as part of self-reviewing the patch - I read the patch while doing 'C-x 4 a' to document each part. That's the value of having a ChangeLog file that exists always, and isn't just an open editor at commit time.

    However, if you have a ChangeLog file git barfs all over every merge. git should be smarter. Merging ChangeLog conflicts is not exactly a computationally intractable problem. But there's an even better solution maybe.

    Every time I switch to a branch, git could create an empty file called ChangeLog; then when I commit, it could pre-fill the editor with the contents of that file and reset the file to empty. Magic!
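
    Part of this can probably be approximated today with git's prepare-commit-msg hook, which receives the path of the commit message file as its first argument. The sketch below assumes the scratch ChangeLog is an untracked file at the top of the working tree and that the hook runs from there. The prefill function is a made-up name, and creating the empty file on branch switch is left out.

```python
import os
import sys


def prefill(message_path, changelog_path="ChangeLog"):
    """Move pending ChangeLog text to the top of the commit message file."""
    if not os.path.exists(changelog_path):
        return False
    with open(changelog_path, encoding="utf-8") as f:
        entry = f.read().strip()
    if not entry:
        return False
    with open(message_path, encoding="utf-8") as f:
        template = f.read()
    with open(message_path, "w", encoding="utf-8") as f:
        # Entry first, then whatever git had already put in the file.
        f.write(entry + "\n\n" + template)
    # Reset the scratch file so the next commit starts empty.
    open(changelog_path, "w").close()
    return True


def main(argv):
    """Entry point for use as a hook: argv[1] is the message file path."""
    if len(argv) > 1:
        prefill(argv[1])
```

    To try it, the file would be saved as an executable .git/hooks/prepare-commit-msg with a Python shebang line and a final call to main(sys.argv).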

    The problem is not that ChangeLog disrupts git merges. The problem is that git does not support the nice format and workflow of ChangeLog.

    Use EMAIL and GECOS

    A minor thing, but if you just start using git, it puts garbage in the Author field. Every other program uses the EMAIL environment variable and your UNIX account information. That is a good default. If people want to override it via config option, then let them, but don't require configuration to get started.
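
    The default being asked for is small. A sketch, assuming a Unix system where the account's full name lives in the first comma-separated field of GECOS; default_identity is a hypothetical helper, not something git exposes.

```python
import os
import pwd


def default_identity(environ=None, gecos=None, login=None):
    """Guess (name, email) from EMAIL and the account's GECOS field."""
    environ = os.environ if environ is None else environ
    if gecos is None or login is None:
        # Fall back to the password database for the current user.
        entry = pwd.getpwuid(os.getuid())
        gecos = entry.pw_gecos if gecos is None else gecos
        login = entry.pw_name if login is None else login
    # GECOS is traditionally "Full Name,room,phone,..."; keep the first field.
    name = gecos.split(",")[0].strip() or login
    email = environ.get("EMAIL")  # None means there is no sensible default
    return name, email
```

    An explicit config option would still win over this; the point is only that the first commit should not need one.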

    Easier way to see what a branch does

    If you want to review a branch to see if it should be merged, the syntax is the magic triple dots: git diff master...mybranch

    This is weird, arcane, hard to discover... and something I need to do all the time.

    I'm not sure what the right solution is. Maybe just docs, or maybe it should be an option to diff instead of the funky triple-dots.
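
    For what it's worth, git diff master...mybranch compares mybranch against the merge base of the two branches, that is, the point where mybranch forked off, rather than against the current tip of master. A toy model of that on an invented five-commit history (HISTORY and both functions are illustrative, not git code):

```python
# Toy history: A <- B <- C is master, B <- X <- Y is mybranch.
HISTORY = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"], "Y": ["X"]}


def ancestors(parents, commit):
    """Return commit plus everything reachable from it through parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def merge_base(parents, a, b):
    """Pick a common ancestor that is not an ancestor of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    for c in sorted(common):
        if not any(c in ancestors(parents, o) for o in common - {c}):
            return c
    return None
```

    Here merge_base(HISTORY, "C", "Y") is "B", so the triple-dot diff shows only what X and Y changed, which is the "what does this branch do" question a reviewer is actually asking.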

    Deleting a remote branch

    I think to delete a remote branch you have to do eg push :branchname, another strange and surprising syntax.

    eg branch -d remotename/branchname should work, IMO. (Again, writes to a remote branch should modify the server-side branch, not the remote tracking branch.)
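
    A front end could do that translation itself. The sketch below turns "remotename/branchname" into a push command, assuming the remote name contains no slash. Newer git releases also accept git push with --delete, which reads better than the empty-source refspec; delete_remote_branch_cmd is a hypothetical helper.

```python
def delete_remote_branch_cmd(spec, modern=True):
    """Translate 'remote/branch' into the git push command that deletes it."""
    remote, sep, branch = spec.partition("/")
    if not sep or not remote or not branch:
        raise ValueError(f"expected 'remote/branch', got {spec!r}")
    if modern:
        # --delete is available in newer git releases.
        return ["git", "push", remote, "--delete", branch]
    # The refspec ":branch" has an empty source, so the remote branch is removed.
    return ["git", "push", remote, f":{branch}"]
```

    Either form writes to the server-side branch, which is the behaviour argued for above.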

    Can the central repository be "messed up"?

    With Subversion, I think it's basically impossible for someone with access to the central repo to accidentally make a change that can't be reverted. Sure they can log on to the server and delete stuff from the shell, but with Subversion commands, I can't do anything that won't show up in the history.

    I can't tell whether this is true with git. Throughout the docs there are options like "--force" and "--hard" and warnings about how using the command can screw you. I don't know how many of these warnings apply to central, remote repositories, but I worry about it. Remember, I don't understand the git docs, and hope I never have to try.

    An example from man git-push:

    Usually, the command refuses to update a remote ref that is not an ancestor of the local ref used to overwrite it. This flag disables the check. This can cause the remote repository to lose commits; use it with care.

    Wait - can cause the remote repository to lose commits???!!! This is not what I'm looking for in a source control system. It's the main thing a source control system is supposed to be preventing!
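
    The check that the flag disables can be modelled in a few lines. This is a toy illustration on an invented history, not git's implementation: a push is accepted only when the remote tip is an ancestor of the commit being pushed, and forcing it leaves the old commits reachable from no branch.

```python
# Toy history: the server's master is at C; locally C was replaced by D.
HISTORY = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}


def reachable(parents, tip):
    """Return tip plus everything reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def push_allowed(parents, remote_tip, local_tip, force=False):
    """Accept only fast-forwards unless the check is forced off."""
    return force or remote_tip in reachable(parents, local_tip)


def orphaned_by_push(parents, remote_tip, local_tip):
    """Commits the remote branch no longer reaches after being overwritten."""
    return reachable(parents, remote_tip) - reachable(parents, local_tip)
```

    In this model, pushing D over C is refused by default, and forcing it orphans C. As far as I know git also has a server-side setting, receive.denyNonFastForwards, meant to refuse such pushes even when forced; that would be worth checking for a shared central repository.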

    Accidents worry me a lot more than malicious people or mysterious cosmic rays. Especially when something as absurdly hard to use as git is involved!

    It also bugs me that I can accidentally do things that while theoretically recoverable, are still very hard to recover from. For example, somehow having changes on remote tracking branches that are not on the server. (To beat that dead horse a bit more.)

    Overview of branch relationships

    If you want to understand the branch structure of a project, your best bet is gitk, and gitk is not a good bet. I do not understand the gitk display at all.

    There's probably some simple info the command line could report that would be very helpful, such as which branches have changes that are not on master, which branches were ever merged into a given branch, or which branch a branch was originally branched from. Perhaps some of this should be in the "git branch" output by default.


    So much work to do.

    This post moved to

    The LiTL UI team has arrived! Welcome to Johan, Lucas, Tommi, and Xan.

    Still looking for more people, of course.

    This post moved to

    Jason, I think you're missing a really basic point here. The point of a GUI is (usually) not to expose or wrap a command line app. It's to provide a nice way for some specific audience to do some specific thing.

    The whole reason PackageKit is new and different is: it's not a "frontend for RPM" or a "frontend for dpkg." There are already a ton of those, for the kinds of audience who want those.

    To design something, first-order, what should the UI be like for the specific audience and the specific things they want to do? Second-order, how can that UI be implemented?

    If you and Richard don't agree on the audience and purpose of PackageKit then of course you won't agree on the UI. That's why multiple UIs and multiple programs very frequently should and do exist.

    Whatever the merits of PackageKit, there's a great general lesson to be found here. Thinking of any software as a "frontend for XYZ" is flat-out wrong... unless your intended users are used to XYZ and what they want is a frontend for XYZ!

    But, there is no rule that Richard has to write a package manager frontend for whoever it is that wants that. In fact, I hope he doesn't. I hope he writes a UI for keeping a completely terminal-free desktop up-to-date with security patches, and finding and installing new apps for said desktop. PackageKit seems to be pretty good for that.

    Something like 10 years ago now, I wrote a god-awful frontend to Apt called gnome-apt; and it opened a terminal (Zvt, back then, not VTE). It was hilariously, embarrassingly bad UI even 10 years ago to open a terminal, and people made fun of me.

    My memory from back then: there was a future plan for all Debian packages to have default answers to all questions, specifically to avoid the UI wart of opening a terminal. Of course, who knows what's happened since. But I don't think opening a terminal became a better decision, that's for sure.

    It would be wonderful discipline for any software dev team serious about Linux "on the desktop" (whatever that means) to ban their own use of terminals. Of course, none of us have ever done this, and that explains a lot about the resulting products.

    btw, on this "Red Hat bias" topic: if I were writing PackageKit and wanted to screw over Debian, I would put in the show-a-terminal feature, because it's so comically bad I could then mock it. Just saying. I think Richard is doing a solid for Debian and Ubuntu to insist on having PackageKit be just as good on those distributions.