2006
Revision History
| Revision | Date | Description |
|---|---|---|
| 2.0 | 13 June 2008 | Added new section, "Renaming and Moving", and a flowchart for renaming. |
| 1.5 | 07 June 2008 | Refined flowchart for merging and added a figure to illustrate complex branching of a simple project. |
| 1.4 | 15 March 2008 | Converted all flowcharts to OmniGraffle on Mac OS X. |
| 1.3 | 11 December 2007 | 98% done. Complete with branching and merging and flowcharts. |
| 1.2 | 07 November 2007 | Converted to DocBook. |
| 1.1 | 08 January 2007 | Additional modifications. |
| 1.0 | 22 November 2006 | Added flowcharts. |
| 0.5 | 28 May 2006 | Expanded sections into subsections. |
| 0.1 | 20 April 2006 | First LyX version. |
When we drink, we get drunk. When we get drunk, we fall asleep. When we fall asleep, we commit no sin. When we commit no sin, we go to heaven. Sooooo, let's all get drunk and go to heaven!
SCM (Software Configuration Management or Source Code Management) has become an essential tool for many software development projects. Both commercial and open source SCM software are available; CVS and SVN (Subversion) are among the most popular. SVN is a version control system initiated in 2000 by CollabNet Inc. It has grown in popularity in recent years, and as a result many applications and utility programs have been developed to enhance it. The shift from CVS to SVN is driven mainly by SVN features such as atomic commit. This document is based on the use of SVN.
Source Code Management (also known as revision control, version control, or source control) manages multiple versions of the same piece of information as it changes. It is most commonly used in software development and engineering to manage the ongoing development of digital documents such as computer program source code, engineering drawings, blueprints, schematics, and other projects that may be worked on by a team of people. A solo developer can also use SCM to manage and track the ongoing development of projects. The use of SCM has also extended to business management, where it is used to manage digital documents.
In software engineering, SCM (Software Configuration Management) takes a similar or identical approach but with wider coverage and purpose. The goals of SCM in software engineering are generally:
Configuration identification - Most software projects depend on other software components, and these dependencies usually require different compile and build parameters. In addition, the project itself consists of many source files. The question is: "What code are we working with?"
Configuration control - Software evolves, and it evolves most rapidly during development. Controlling the release of a product and its changes is essential to ensure the high quality of the product.
Status accounting - Record and report the status of development and components.
Review - To ensure completeness and consistency of components.
Build management - A software project may depend on other components that require the same or different build tools, and usually a different build process. Manage the processes and tools used for builds.
Process management - Ensuring that developers adhere to the development process.
Environment management - Managing the hardware and software that host the system.
Teamwork - Facilitate team interactions related to the process.
Defect tracking - To make sure defects can be traced back to their source.
With the emergence of open source software development, SCM tools, and SVN in particular, have been designed to support collaborative software development across continents. Atomic commit is an SVN feature that makes a commit safe over any Internet or other network connection: a collection of modifications either goes into the repository completely or not at all, so an interrupted commit cannot leave the repository holding only part of a change.
One of the main goals of SCM is to distribute the various versions of code in a controlled and manageable environment. It also allows developers to roll source code back to a previous revision when serious errors or bugs are found in a release. Developers can review changes, undo changes, retrieve the project files as of any given date, and much more.
SVN is a great tool that relieves software developers of the arduous work of managing and tracking various revisions, using a single repository. SVN tells us what was changed, when the changes were made, who changed it and, provided the commit log is written well, why it was changed. However, careless commits can be disastrous, to the point where tracking and maintenance become so complicated that the development process grinds to a halt. Defined guidelines are necessary so that the entire development team understands what and when to commit, with greater awareness and responsibility. These guidelines provide some general ideas for establishing an SOP (Standard Operating Procedure) for developers using SCM, particularly SVN. Although these guidelines are written based on the use of SVN, they can be adapted to any similar SCM environment.
While the following guidelines define when and how to commit changes, do exercise some common sense when following them. When in doubt, please consult the team leader or the person who manages the integrity of the repository.
Well-defined code ownership is essential and fundamental. Ownership of code does not define proprietorship, but the responsibilities of creating, maintaining, and reviewing the code as its rightful owner. People other than the rightful owner can also maintain the code, but only with prior permission from the owner, and their changes must be reviewed by the owner before commit. In some cases, the review can instead follow the commit. This is not a control but a discipline, so that the maintainer and the owner are both aware of the changes and agree on them. With proper assignment of ownership, as indicated in Table 1, the responsibilities for each source file become clear.
Ownership plays an important role in design conformance. Consider
the following code fragment in sensor-node.ads,
where procedure Connect takes three parameters:
Port, Node and Timing.
Timing has been declared as Positive, as specified in the
system design. Suppose a developer who is not the
owner of this source file changes Timing from Positive to
Integer in his implementation so that negative values can indicate errors. While
the developer may think this is a great way to indicate errors, it will
cause confusion, compilation errors, and non-conformance to the
system design in other source code that uses this package. With clear
ownership of code, the developer knows whom to contact about the
change, and can discuss it with the original
author (or designer) before committing.
    package Sensor.Node is
       procedure Connect (Port   : in     DAQ.Data_Port;
                          Node   : in     DAQ.Node_ID;
                          Timing : in out Positive);
       ...
    end Sensor.Node;
Table 1. An example of ownership assignment
| Source files/modules | Owner(s) |
|---|---|
| sensor-node.adb, sensor-node.ads, sensor_node_manager.adb, sensor_node_manager.ads, sensor_node_manager.gpr | Katy (P), Kim (S1), Adrian (S2) |
| sensor-activities.adb, sensor-activities.ads, sensor-calibrate.adb, sensor-calibrate.ads, sensor-realtime.adb, sensor-realtime.ads, sensors_array.ads | LiXiang |
| filter-fourier.adb, filter-fourier.ads, filter.adb, filter.ads | Patrick |
In some cases, more than one owner may be assigned to a
directory or set of files. Refer to Table 1 for an example: three owners have been assigned to
sensor-node and sensor_node_manager. Katy
has been named as primary owner, and Kim and Adrian as first
and second secondary owners respectively. Katy is the team leader for
these files. If Katy is not available, Kim and/or Adrian can be contacted
and can review work in node_manager. Both Kim and Adrian
can decide whether the work is ready to commit. When someone is
unavailable or cannot be contacted, ownership and decision authority
pass down the hierarchy to the next owner.
If only one owner has been assigned and he/she cannot be contacted, the chief designer, team leader or project manager takes over the owner's role for this purpose.
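The ownership assignment in Table 1 could also be kept in a form that a script can query, so that a developer can quickly find out whom to contact before committing. The following is a minimal sketch only: the `pattern:primary:secondaries` format and the function name `owners_of` are assumptions made for illustration and are not part of SVN or of the original guidelines.

```shell
#!/bin/sh
# Hypothetical ownership table mirroring Table 1.
# Assumed format: <file-name pattern>:<primary owner>:<secondary owners, in order>
OWNERS='sensor-node.*:Katy:Kim Adrian
sensor_node_manager.*:Katy:Kim Adrian
sensor-activities.*:LiXiang:
sensor-calibrate.*:LiXiang:
sensor-realtime.*:LiXiang:
sensors_array.ads:LiXiang:
filter-fourier.*:Patrick:
filter.*:Patrick:'

# owners_of FILE
# Prints the owners of FILE in contact order (primary owner first).
# Returns 1 when no entry matches; in that case fall back to the chief
# designer, team leader or project manager.
owners_of() {
    while IFS=: read -r pattern primary secondaries; do
        case "$1" in
            $pattern)
                echo "$primary${secondaries:+ $secondaries}"
                return 0
                ;;
        esac
    done <<EOF
$OWNERS
EOF
    return 1
}

owners_of sensor-node.ads       # prints: Katy Kim Adrian
owners_of filter-fourier.adb    # prints: Patrick
```

Listing the owners in contact order reflects the rule that authority passes to the next owner when the primary owner is unavailable.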
SVN can tell us what was changed, when the changes were made, and who changed it, but not why. It is useful to provide a detailed and meaningful explanation of why the changes were made. Table 2 shows some examples of bad and good logs.
Good commit log messages help the project team leader or manager compile a useful index of every commit (revision) from the post-commit logs to make maintaining, searching, and tracking revisions more convenient for all team members. This revision index is also useful documentation for tracking the history of development later on. Table 3 shows an example of an index of every revision with files involved and clear log messages.
Table 2. Bad and good log messages.
| Bad log messages | Good log messages |
|---|---|
| Updated to conform to new requirements | Updated to conform to requirements in Requirements-Nodes-20060510. |
| Added generally useful package to contain to_*_string functions for Enumeration types | Added generally useful package to contain To_String function to convert enumeration types to String. |
| Added comment not to use (Un)available procedures. | Added comment deprecating procedure Unavailable and procedure Available. |
Table 3. An index of revisions compiled from post-commit log.
| Rev. | By | Files | Log Messages |
|---|---|---|---|
| 234 | Katy | | Removed unused components from Sensor_Node_Rec. |
| 235 | Kim | | Added Ada.Text_IO.Put_Line to display (testing for Adrian) the exception name and message raised from Sensor.Initialize. |
| 236 | Patrick | | Initial version with implementation comments in body. |
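Part of the log message discipline can be checked mechanically. The sketch below rejects empty, one-word and very short messages. The function name `check_log_message`, the word threshold and the list of vague messages are all assumptions chosen for illustration. A script cannot judge whether a message really explains why a change was made, so a message such as "Updated to conform to new requirements" would still pass and needs human review.

```shell
#!/bin/sh
# check_log_message MESSAGE
# Returns 0 if MESSAGE passes some basic checks, 1 otherwise (reason on stderr).
# The thresholds and the list of vague messages are illustrative assumptions.
check_log_message() {
    msg=$1

    # An all-whitespace message counts as empty.
    trimmed=$(printf '%s' "$msg" | tr -d '[:space:]')
    if [ -z "$trimmed" ]; then
        echo "Log message is empty." >&2
        return 1
    fi

    # Reject messages that say nothing about the change.
    case "$(printf '%s' "$msg" | tr '[:upper:]' '[:lower:]')" in
        fix|fixed|update|updated|changes|"minor changes"|wip)
            echo "Log message is too vague: $msg" >&2
            return 1
            ;;
    esac

    # Require at least three words (an arbitrary threshold).
    words=$(( $(printf '%s\n' "$msg" | wc -w) ))
    if [ "$words" -lt 3 ]; then
        echo "Log message is too short: $msg" >&2
        return 1
    fi
    return 0
}

check_log_message "Added comment deprecating procedure Unavailable and procedure Available." && echo "accepted"
check_log_message "Fixed" || echo "rejected"

# In a Subversion pre-commit hook (invoked with arguments REPOS and TXN),
# the function could be applied to the pending log message like this:
#   check_log_message "$(svnlook log -t "$2" "$1")" || exit 1
```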
When fixing a bug, add a regression test for it. Run the regression test and have both the code and the test result reviewed by others or by the owner of the code before commit. If possible, include the regression test and its result in the change log.
When fixing bugs, do not follow documentation blindly; it may well be wrong. Test every bug fix and its behavior before commit.
Do not commit unrelated changes together. Commit all related changes in one commit and unrelated changes in separate commits. Separate commits are easier to maintain, track and review, and, if necessary, to undo.
When to commit is a difficult question that calls for good judgment. Is it urgent to commit a minor correction to a typo or grammatical mistake in an English sentence? Or a missing semicolon that prevents the code from compiling?
Commit as often as possible. If losing current work is a concern, commit soon after each piece of work has been completed. Do not put off committing; commit on the same day the work is finished. At the same time, it is a good habit to commit only useful, working code. When adding new packages, define milestones at which useful functionality is complete, and commit when a milestone has been achieved.
Example 1. When to Commit
    package Nodes is
       procedure Initialization;
       procedure Finalization;
       procedure Add;
       procedure Delete;
       procedure Get;
       procedure Iterate;
    end Nodes;
First milestone: procedure Initialization and Finalization.
Second milestone: procedure Add and Delete.
Third milestone: procedure Get and Iterate.
In this case, for package Nodes, a commit can
be made after each milestone or after all milestones have been completed. Do
not commit only Initialization and
Add without the supporting
functionality.
Avoid conflicts by updating often. The longer the time between updates, the harder it will be to merge your changes with those made by other developers. Check the post-commit messages to see whether any changes relate to your current work, or whether your work affects them. When another developer has committed a related update, commit your own changes promptly; otherwise, inform the other developers about your work in progress so that they are aware of your changes in relation to theirs.
Before committing, check for updates and run svn diff on the files to be committed to check the changes in the local working copy. Think twice before every commit. Review changes with other developers, the team leader or the project manager if any doubt arises.
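The check before a commit can be partly scripted. The sketch below reads the output of `svn status` and reports how many items are modified, in conflict or unversioned, based on the status letter in the first column. The function name `summarize_status` and the sample file `notes.txt` are assumptions made for illustration. Only first-column status letters are examined, so other conditions that `svn status` can report are not covered.

```shell
#!/bin/sh
# summarize_status
# Reads `svn status` output on standard input and prints a one-line summary.
# Returns 1 if any item is in conflict (status letter C in the first column).
summarize_status() {
    awk '
        { c = substr($0, 1, 1) }
        c == "M" { modified++ }
        c == "C" { conflicts++ }
        c == "?" { unversioned++ }
        END {
            printf "modified=%d conflicted=%d unversioned=%d\n", modified, conflicts, unversioned
            exit (conflicts > 0)
        }
    '
}

# Sample input for illustration (notes.txt is a made-up file name).
printf '%s\n' \
    'M       sensor-realtime-nodes.adb' \
    'C       sensor-realtime-queues.adb' \
    '?       notes.txt' |
    summarize_status || echo "Resolve conflicts before committing."

# Inside a real working copy (requires svn) it could be used as:
#   svn update && svn status | summarize_status && svn diff
```

The summary is only a first filter; the changes shown by svn diff still need to be read before the commit.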
There are two types of review: pre-commit and post-commit. Both have pros and cons. In either case, every commit must be tested, and the test results must be recorded and reviewed. With pre-commit review, changes are not available to others until everything has been tested, reviewed and committed; with post-commit review, changes are available immediately and are reviewed afterwards. With post-commit review, non-working functionality, follow-up bug fixes, non-conformance to standards and reverted commits are more likely, and they will affect every developer who has already updated and started using the newly committed changes.
Every commit must be reviewed by the committer, the owner (if the committer is not the owner of code), system designer and a moderator. The review must be documented and approved before commit.
Branching is good. While new code is being reviewed and tested, other development often must move on and new ideas have to be explored. It is good to branch off from the trunk (main development) so as not to interfere with the work of other developers.
Create as many branches as needed. However, branches must be merged at some point so that their changes are applied to another branch or to the trunk. This process sounds easy, but in practice it is extremely tricky.
The first decision is when to branch. This can sometimes be an easy decision, such as for a "stable release" or a "major bug fix". The most difficult decisions are likely to involve support for operating system dependencies. Consult fellow developers, team leader(s) and/or the project manager.
Sometimes it is difficult to decide where to branch, so give
it careful consideration. For example, decide which revision
is to be selected for the alpha test. Figure 1 illustrates a branch for which r111 has been selected:
do a svn copy from r111 to, say,
branch/alpha_test. Figure 2
illustrates a simple project in which more complex branches are created and
merged later.
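The branch at r111 could be created with a single svn copy between repository URLs. The sketch below only prints the command rather than running it. It assumes that the project lives at the sensor_fusion repository URL used in the merging example of this document and that the trunk is at `trunk` under that URL; the function name `branch_command` and the log message wording are likewise assumptions.

```shell
#!/bin/sh
# Repository root, assumed to be the one used in the merging example.
REPO=svn://adastarinformatics.com/svn/projects/sensor_fusion

# branch_command REVISION BRANCH_PATH
# Prints (does not run) the svn copy command that branches trunk at REVISION.
branch_command() {
    printf 'svn copy -r %s %s/trunk %s/%s -m "Branched trunk at r%s to %s."\n' \
        "$1" "$REPO" "$REPO" "$2" "$1" "$2"
}

branch_command 111 branch/alpha_test
```

Printing the command first gives the branch owner a chance to confirm the revision and the destination before anything is written to the repository.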
The main trunk and each branch should have an assigned owner who is responsible for the following.
Keeping a list of configurable items for the branch or trunk. The owner maintains the contents list for the branch or trunk. The contents list consists of each item's name and a brief description of the item. This list is essential since items will be added or removed on an ongoing basis. The list helps track changes, additions and deletions in the repository for the respective branch.
Establishing a working policy for the branch or trunk. This policy defines when code can be checked in, for example after coding or after review. The policy also defines who is responsible for merging changes to the same file and for resolving conflicts.
Identifying and documenting policy deviations. Established policies tend to acquire exceptions along the way. The owner is the key person responsible for identifying the workaround and for tracking and/or documenting it for future use.
Merging with the trunk. The branch owner is responsible for ensuring that the changes in the branch can be successfully merged with the main trunk at a reasonable point in time.
Branching produces a split in a code stream: different developers can be working in alternate universes of the same set of code. Changes are made independently to each stream. Merging brings the changes back together to combine the streams. There are a number of reasons why you might want to use branches, and different reasons produce different kinds of branches. At some point, when the work behind these reasons comes together and a milestone is achieved, the branches are merged.
Merging converges the branches back into the main trunk. It is the most complicated task in SCM and easily becomes confusing without care.
The following simple best practices will walk you through the merging process in Subversion.
Find out at what version you created the branch
    $ svn log --verbose --stop-on-copy \
          svn://adastarinformatics.com/svn/projects/sensor_fusion/branches/my-realtime-branch
The command above produces output like the following. Take note of the revision number, r287, at which the branch was created.
Example 2.
    ---------------------------------------------------------------------------
    r287 | user | 2007-06-01 15:26:33 -0600 | 2 lines
    Changed paths:
       A /sensor_fusion/branches/my-realtime-branch (from /sensor_fusion/trunk:286)
Update working copy of trunk
    $ cd sensor_fusion/trunk
    $ svn update
    At revision 335.
Merge the changes made on the branch since r287 (when the branch was created) into the trunk working copy, which is currently at r335.
    $ svn merge -r 287:HEAD svn://adastarinformatics.com/svn/projects/sensor_fusion/branches/my-realtime-branch
    U    sensor-realtime-nodes.adb
    U    sensor-realtime-nodes.ads
    U    sensor-realtime-queues.adb
    U    sensor-realtime-queues.ads
Check to see if there are any conflicts and check the changes that have been merged.
    $ svn status
    M       sensor-realtime-nodes.adb
    M       sensor-realtime-nodes.ads
    M       sensor-realtime-queues.adb
    M       sensor-realtime-queues.ads
Commit the merged changes back into the trunk and log it.
    $ svn commit -m "Merged my-realtime-branch changes r287:335 into the trunk."
    Sending        sensor-realtime-nodes.adb
    Sending        sensor-realtime-nodes.ads
    Sending        sensor-realtime-queues.adb
    Sending        sensor-realtime-queues.ads
    Transmitting file data ...
    Committed revision 336.
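Reading the branch revision off the log by eye is error-prone when the branch has many commits. As a rough sketch, the revision can be picked out of the `svn log --stop-on-copy` output with a small filter. Because the log lists the newest revision first and stops at the copy, the last entry is the one that created the branch. The function name `branch_start_revision` and the variable `BRANCH_URL` are assumptions made for illustration.

```shell
#!/bin/sh
# branch_start_revision
# Reads `svn log --stop-on-copy` output on standard input and prints the
# revision number of the last (oldest) entry, which is the revision that
# created the branch. Returns 1 if no revision header line is found.
branch_start_revision() {
    awk '/^r[0-9]+ [|]/ { rev = substr($1, 2) }
         END { if (rev != "") print rev; else exit 1 }'
}

# Sample input: the r287 entry is the one shown in Example 2. The r301 entry
# is made up, added only to show that the oldest entry is the one reported.
branch_start_revision <<'EOF'
------------------------------------------------------------------------
r301 | user | 2007-06-05 09:10:11 -0600 | 1 line

Hypothetical later change on the branch.
------------------------------------------------------------------------
r287 | user | 2007-06-01 15:26:33 -0600 | 2 lines
Changed paths:
   A /sensor_fusion/branches/my-realtime-branch (from /sensor_fusion/trunk:286)
EOF

# With a real repository (requires svn) the pieces could fit together as:
#   start=$(svn log --stop-on-copy "$BRANCH_URL" | branch_start_revision)
#   svn merge -r "$start":HEAD "$BRANCH_URL"
```

The filter assumes that no log message line begins with a revision header pattern such as `r123 |`.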
During ongoing development, there are times when renaming a
file or directory becomes necessary. One practice is to create a
new branch and merge later, but this method may cause confusion and
may clutter the repository unnecessarily. The better practice is to
identify the components that need renaming and, once the rename has
been approved, inform the entire team about it. Ask the entire team
to commit their working copies, do a
svn update, and lock the affected files for renaming.
In large-scale software development, the source code of a project is not located in one single folder. It may be spread over multiple folders under the project directory, usually organized by function or family of functions. When requirements or the software specification change, the functions or families of functions change with the software design. It will then be necessary to move the affected source files to a new location for better organization and categorization.
For example, suppose filter-fourier.adb and
filter-fourier.ads are located in the project directory
sensor/. The source files filter-fourier.ad*
and related source files will be better organized after moving to a
new sub-folder sensor/math/.
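The move in this example could be carried out with svn mkdir and svn move in an up-to-date working copy. The sketch below only prints the commands so that they can be reviewed first. The function name `move_commands` is an assumption, and the commit is left out of the sketch because its log message should explain why the files were moved.

```shell
#!/bin/sh
# move_commands DEST_DIR FILE...
# Prints (does not run) the svn commands that move each FILE into DEST_DIR.
move_commands() {
    dest=$1
    shift
    echo "svn update"
    echo "svn mkdir $dest"
    for f in "$@"; do
        echo "svn move $f $dest/$(basename "$f")"
    done
}

move_commands sensor/math sensor/filter-fourier.adb sensor/filter-fourier.ads
```

Using svn move instead of deleting and re-adding the files keeps their revision history attached to the new location.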