Lecture 5, Groupware

Part of the notes for CS:4980:1
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Subversion

A compiler development project is a large effort, typically more than one person would want to take on alone. As a result, we are naturally interested in tools to help a group of cooperating programmers work together on a large project. Over the years, two dominant approaches to group development have emerged, both of which can be broadly classified as version control systems. Any version control system will keep a record of previous versions of a program and allow you to roll back if you want to see earlier versions, and all of them maintain some kind of change log.

One class of version control system allows individual team members to check out files, work on them, and then check in their changes. If one team member has a file checked out, every other team member must wait until it is checked in before checking it out. In effect, this approach enforces mutual exclusion locks on each file -- locks that are held by programmers.

The second approach eliminates mutual exclusion. If two team members happen to check out the same file, they are both allowed to make changes. When a file is checked in, the version control system checks to see if there have been other changes made to that file since it was checked out. If the changes do not lead to conflicts -- for example, if one team member edited lines 10 through 15 while the other team member made changes after line 20 -- the version control system simply merges their changes. If a conflict is found -- for example, if both programmers made changes to line 7 -- then the version control system informs one of the programmers (the one who checked in later) about the conflict, and allows them to resolve it.
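The merge rule just described can be tried by hand. The following is only a sketch: it uses the standard diff3 utility as a stand-in for a version control system's merge step (Subversion has its own merge machinery), and the file names and the programmers Alice and Bob are invented for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
cd "$(mktemp -d)"

# The common ancestor that both programmers checked out: 25 numbered lines.
seq 1 25 | sed 's/^/line /' > base.txt

# Hypothetical edits: Alice changes line 12, Bob changes line 22.
sed '12s/$/ (edited by Alice)/' base.txt > alice.txt
sed '22s/$/ (edited by Bob)/'   base.txt > bob.txt

# Three-way merge; the edits do not overlap, so diff3 exits with status 0.
diff3 -m alice.txt base.txt bob.txt > merged.txt && echo "merged cleanly"

# Now both programmers change line 7; diff3 exits with a nonzero status.
sed '7s/$/ (Alice)/' base.txt > alice7.txt
sed '7s/$/ (Bob)/'   base.txt > bob7.txt
diff3 -m alice7.txt base.txt bob7.txt > conflict.txt || echo "conflict on line 7"
```

After this runs, merged.txt holds both edits, while conflict.txt holds both competing versions of line 7 bracketed by conflict markers, waiting for a human to choose between them.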

Subversion, the tool we will be using, works this way. Subversion is widely used, both from the command line, and from various GUI tools. Subversion is installed on the departmental Linux machines.

To install the basic command-line Subversion tools on a Raspberry Pi, type this shell command:

apt-get install subversion

The apt-get command is used to download applications from the web. It can be used to install, uninstall or update applications from the list of applications it is aware of. It keeps a local database of what it has installed, and it takes care of such things as automatically installing other applications on which the new application depends. You can type apt-get with no arguments and it will list all the basic things it does.

Depending on how your system is installed and how you are logged on, you may not be able to run apt-get directly. It needs special privileges. You may have to type this variant command:

sudo apt-get install subversion

The sudo (super-user do) command runs the remainder of the line as a command with all privileges granted. If your system requires authentication for this, sudo will prompt for a password (by default your own password, not a separate super-user password) -- and you'd better know it. Even in unprotected systems, requiring users to run dangerous operations under the sudo command prevents accidental errors.
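Whichever form of the install command you used, it is worth confirming that the svn command is now on your search path. This is a minimal sketch; the variable name SVN_FOUND is invented for illustration.

```shell
# Check whether the svn client is installed and, if so, print its version.
if command -v svn >/dev/null 2>&1; then
    SVN_FOUND=yes
    svn --version --quiet
else
    SVN_FOUND=no
    echo "svn not found; try: sudo apt-get install subversion"
fi
```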

Subversion is completely documented online. The book Version Control with Subversion is available as a single HTML document, as a hyperlinked web site, and as a PDF download. It is not a comfortable starting point for the beginner, but it is an excellent reference.

From the Unix or Linux command line, Subversion is accessible through the svn command. The svn help command lists all of the subsidiary Subversion commands. One of them, used to add a file to the Subversion project, is svn add. To get more information about that command, type svn help add. This works for all of the subsidiary Subversion commands, but again, it is not a good starting point.

Connect to the Archive

A Subversion repository has been created for this class at

https://svn.divms.uiowa.edu/repos/cs4980

Yes, it looks like the URL of a password protected web site, but that is only because the Subversion system uses the web's HTTP protocol to upload and download files, and it uses the Apache web server's security mechanisms. If you attempt to view this site using a web browser, you will not see much of interest, even if you type in the correct user name and password (your own, since all students in this class have accounts on the departmental servers).

To connect Subversion to this repository, issue the following shell command while in your home directory, with your own user name substituted for the string HAWKID:

svn checkout --username HAWKID https://svn.divms.uiowa.edu/repos/cs4980

This will work without the --username option if you are working on a machine where you have already signed in under your HawkID, for example, if you are connecting to Subversion from a departmental server. If you are on a Raspberry under the default user name (relying entirely on physical security for access control), or under any user name different from your HawkID, you need to type in your HawkID in place of the HAWKID given in the above command.

When you type this command, Subversion will prompt you for your password. Use your HawkID password unless your password on the CLAS Linux cluster differs from that. Once authenticated, Subversion will create a new directory in your home directory (assuming that was the current directory) called cs4980. That directory will appear to be empty except for a file called README, and, if everyone follows the rules, one directory per project, the root directory for that project.

In fact, there is much more there. If you do the ls -a command while in the new directory, you will see a hidden directory called .svn -- this directory holds all of Subversion's information about your project. Among other things, it holds the URL above, so you never need to type in that long command ever again. Subversion can also cache your password (in a per-user area under ~/.subversion rather than in .svn), so you usually do not need to type that again either.

One problem you may have is an objection from your local Subversion client that it doesn't trust the archive. In this case, this message will come before the prompt for your HawkID password:

Error validating server certificate for 'https://svn.divms.uiowa.edu:443':
 - The certificate is not issued by a trusted authority. Use the
   fingerprint to validate the certificate manually!
Certificate information:
 - Hostname: svn.divms.uiowa.edu
 - Valid: from Oct 24 00:00:00 2014 GMT until Aug 31 23:59:59 2017 GMT
 - Issuer: InCommon RSA Server CA, InCommon, Internet2, Ann Arbor, MI, US
 - Fingerprint: 19:0C:F0:3C:4F:AA:DB:98:97:CE:74:C9:FA:3E:67:21:42:D1:78:10
(R)eject, accept (t)emporarily or accept (p)ermanently?

Type p in response. You have actually connected to our Subversion server, but your machine isn't linked into the public-key infrastructure well enough to trust the identity of our server. A single unified public-key infrastructure would be nice, but we aren't there yet. Many home computers are out of touch. The password dialogue will come next.

Some Subversion Commands

Here is a quick list of the Subversion commands you will be using regularly. These are more fully documented under the svn help command, and in Version Control with Subversion.

svn add FILENAME
After you have created a new file or directory in your local copy of the Subversion repository, you can put it under Subversion control with the add command. Everything in the repository got there by being added at some point. Please do not add object files or executables. Try to keep the repository so it only contains the source code of your project, including documentation files and makefiles.

Note: Only add the source files for your project to the Subversion repository. Object files and executable files should stay private. This way, if one team member is compiling their code on a Raspberry for the ARM processor and another is compiling and debugging on an Intel x86 machine, there will be no conflicts (so long as the code does not use machine dependent features).

svn update
The update command, with no argument, will get you the current version of every file in the current directory -- that directory must, of course, be part of a Subversion repository.

As a rule, do svn update in your project's subdirectory before you try to use make to test your code.

As a rule, don't svn update the root directory for our class, cs4980. It will take lots of time and disk space because you will get copies of every project in the class, and that will tempt you to read code by people outside your project. Note that all Subversion actions are logged, and after all the groups are stable, we will change the access control lists to limit who can update the global directory!

svn update FILENAME
The update command, with a file or directory name as an argument, will get you the current version of that file or directory from the repository. Before editing any file in the archive, do an svn update on that file so you will be editing the current version. If you've just done a parameterless update of the directory holding the file, you have the current version of everything in that directory, but if you know lots of things are changing and you only want to look at one file, just update that file.

svn commit
The commit command, with no arguments, saves those files in the current directory that you have changed since you last updated them. When you commit, Subversion will open an editing session allowing you to give an explanation of the change you made in the log entry for the commit. Conventionally, change log messages should be short, things like "fixed bug in numeric conversion" or "added support for negative numbers". Your user ID and a time stamp are automatically included in the log message.

Note: The only files committed are those that have been registered with Subversion using svn add. As a result, your object files and any notes you happen to be keeping to yourself will not be stored in the archive or shared with others on the project.

Note: The editing session for the log message will be opened using the editor specified by the SVN_EDITOR shell variable. This shell variable is traditionally set in the shell's .rc file (on the departmental Linux system, this will be .tcshrc; on a Raspberry, .bashrc). If you don't set this variable, Subversion may use an editor you don't know.
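As a sketch of what that setting looks like, here is the bash form with the tcsh form shown in a comment. The choice of nano is only an assumption for illustration; name whatever editor you are comfortable with.

```shell
# bash (for example, on a Raspberry): put this line in ~/.bashrc.
# "nano" is just an example editor, not a recommendation.
export SVN_EDITOR=nano

# tcsh (the departmental Linux systems): the ~/.tcshrc equivalent is
#   setenv SVN_EDITOR nano

# For a one-line log message you can skip the editor entirely:
#   svn commit -m "fixed bug in numeric conversion"
```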

Note: If two users try to commit changes to the same source file, Subversion is fairly smart. It will usually succeed in merging the changes seamlessly so long as the two users only make changes in non-overlapping parts of the file. However, if the edits made by the two users involve the same part of the file, the first user to commit will win, and the second will receive an error message that shows the nature of the conflict and gives essentially two choices: a) Cancel the commit for the file in question; in this case, the best option is to do an update and then start moving forward from the other user's changes. b) Resolve the differences by continuing to edit from the version annotated with the conflict in order to fix the conflict, then commit.

svn commit FILENAME
Just commit the changes to the indicated file. The only reason to use this form of commit is when you want to commit only some of the changes you've made and keep the others private until they look nice.

svn mkdir
svn mv
svn cp
svn rm
The normal Unix/Linux mkdir, mv, cp and rm commands still work in directories under Subversion, but they do not inform Subversion of the changes they made. Therefore, if the files you are working with are files you want to be in the repository, you must use the Subversion versions of these commands so that the repository is changed to mirror your local changes. Note that the actual changes to the repository may be deferred until you do svn commit in the directory where you made them.

In summary, the normal minimum Subversion session will go something like this, with PROJECT changed to be the name of your project:

cd cs4980/PROJECT
svn update
-- edit file FFF
svn commit
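
If you would like to rehearse this session without touching the class archive, you can do so against a throwaway repository on your own machine. The following is a minimal sketch under stated assumptions: it assumes the svnadmin tool is installed alongside svn, it uses a local file:// repository in place of the course server, and the names SANDBOX, PROJECT and hello.c are invented for illustration.

```shell
# Rehearse update / edit / commit in a scratch repository.
if ! command -v svnadmin >/dev/null 2>&1; then
    echo "Subversion is not installed; nothing to demonstrate"
else
    SANDBOX=$(mktemp -d)

    # A private repository standing in for the course server.
    svnadmin create "$SANDBOX/repo"

    # Check it out; this creates the local directory cs4980.
    svn checkout -q "file://$SANDBOX/repo" "$SANDBOX/cs4980"
    cd "$SANDBOX/cs4980"

    # Create a project directory under Subversion control and commit it.
    svn mkdir -q PROJECT
    svn commit -q -m "created project directory"

    # The session itself: update, edit, add the new file, commit.
    cd PROJECT
    svn update -q
    printf 'int main(void) { return 0; }\n' > hello.c
    svn add -q hello.c
    svn commit -q -m "added skeleton main program"

    # No output from status means nothing is left uncommitted.
    svn status
fi
```

The -m option supplies the log message on the command line, so no editor session is opened here. When you are done experimenting, the whole sandbox directory can simply be deleted.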

Good Subversion Manners

Commit your changes! The most common cause of trouble in a project comes when participants don't commit their changes frequently enough. If two people are working on related parts of a project and one of them does not commit changes, the other one will never benefit from those changes. The longer you wait between commits, the more likely your changes will be to conflict with the changes made by someone else.

Do not check out a file and edit it for days before committing your changes. Someone else will very likely make other changes and commit them. The longer you hold a file before committing, the more likely it is that there will be changes that conflict with what you are doing. It is good manners to work in small increments, using update to check out the current version and committing your changes frequently.

Keep an open side-channel! E-mail works, Twitter works, project chat rooms work. Everyone in the project should agree on a channel, and everyone should be connected to the channel while working. If there's a commit conflict, a quick note could be a useful way to figure out whose changes make more sense. To avoid conflicts, if you need to change something that someone else might be working on, announce it to the group first. A message like "I want to fix the bugs in lex_advance for scanning integers" is a polite way to avoid conflicts if anyone else in the group is likely to be working on that.

Test before you commit! If you're changing source code, at the very least, get it syntactically correct. Once a particular source file reaches the point where it can be compiled, don't commit broken versions of that code. In projects with makefiles, run make before you commit in order to verify that you have not broken something.

If you must commit a file with known errors (syntactic or not), include appropriate bug notices to attract attention to them and take responsibility in your log update, saying that you're committing code known to contain errors.

Acknowledge bugs! Within any development group, develop a standard form of comment used to acknowledge bugs. For example, use the text =BUG= in a comment, followed by an explanation of the error (as far as it is understood) or a note about the unfinished work that ought to replace the comment. This way someone looking for work can use the grep =BUG= *.c command to see all the comments that invite their attention. When you find something that is broken and needs fixing, if you can't fix it on the spot and there is no =BUG= notice, add one. When you fix a bug, remove the bug notice.

Be polite! Ask before you make major changes to code someone else wrote. Discuss alternatives. E-mail is handy for this, but for small things treat the code itself as a communication channel (bug notices are a prime example of this). On the flip side, be forgiving. If someone finds and fixes a bug in your code, they're not interfering with your code, they're helping.

Take responsibility! Be accurate but concise in your log messages. The project is a group responsibility. If you find something that you can fix, fix it, don't delay to get permission. Time spent passing blame around, getting permission, and creating organization structure is time wasted.

Keep the Subversion repository clean! Don't let Subversion know about object files and executables. Note that if one user is developing on the Raspberry while another is developing on a departmental server, you are using two different compilers and they are targeting entirely different and incompatible computer architectures. Object files for one machine will not work on the other, so of course, the object files must not be included in the Subversion archive.

This is a good reason to make sure that your makefile includes support for make clean.

Grep? What was that?

One of the most interesting Unix shell commands is grep. Type

grep expression FILENAMES

(substituting the names of real files for FILENAMES) and you will see a listing, on the terminal window, of all lines of text in the indicated files containing the expression. The expression can be as simple as a word, or it can be a regular expression (in the sense defined by Unix, which is not quite the same as the sense defined by automata theory). For example:

grep =BUG= *.c

searches all of the files in the current directory with names ending in .c to find lines containing the text =BUG=.
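
To see this in action without a real project, here is a small sketch that builds two scratch source files and searches them. The file names and the bug notices in them are invented for illustration; the -n option asks grep to include the line number of each match.

```shell
# Build two small scratch C files, each containing one bug notice.
cd "$(mktemp -d)"

cat > lexical.c <<'EOF'
/* lexical.c -- hypothetical scanner fragment */
int lex_advance(void) {
    /* =BUG= integers with leading zeros are scanned wrongly */
    return 0;
}
EOF

cat > main.c <<'EOF'
/* main.c -- hypothetical main program */
int main(void) {
    /* =BUG= no command line argument parsing yet */
    return 0;
}
EOF

# List every bug notice, prefixed by file name and line number.
grep -n '=BUG=' *.c
```

The output is one line per bug notice, each prefixed with the file name and line number where it was found, which makes a handy to-do list for the group.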

One thing to note: The utility of grep drops dramatically if you get in the habit of writing very long lines. Just because your monitor lets you maximize a window until it shows lines 500 characters long does not mean you should use all that width. Keep your lines short, and the output of grep is far more useful.

The C style guidelines I recommend ask you to keep all source lines under 80 characters long -- not because old punched cards were 80 characters wide, but because people have a hard time scanning long lines of text. Newspapers don't print text the full width of a sheet of newspaper, they divide it into columns (usually much shorter than 80 characters). Typing paper is the width it is because it holds about 60-80 characters per line in the most common range of font sizes. Books are published on pages wide enough to hold about 60-70 characters in the font size used for the body of the text. Human readers seem to work better with lines about that long, and programs should be written for human readers, regardless of the fact that compilers are not troubled by these constraints.
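
A quick way to check your own files against the 80 character guideline is sketched below. It uses awk, a standard Unix tool not otherwise covered in these notes, and the scratch file sample.c is invented for illustration.

```shell
# Build a scratch file with one short line and one 100-character line.
cd "$(mktemp -d)"
printf 'int short_line;\n' > sample.c
awk 'BEGIN { s = "/* "; for (i = 0; i < 94; i++) s = s "x"; print s " */" }' >> sample.c

# Report every line longer than 80 characters, with its file and line number.
awk 'length($0) > 80 { print FILENAME ":" FNR ": " length($0) " characters" }' sample.c
```

On a real project, replace sample.c in the last command with *.c *.h to check all of your source files at once.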

Assignment

Attach to the course Subversion repository and edit the README file so that your entry includes more than just your HawkID. Specifically, you must include the information others in your group need in order to contact you.