Tuesday, July 28, 2009

WebDAV HTTP, or SVN RA?


The answer to this question isn’t as simple as deciding which protocol is faster. Even WebDAV advocates will usually admit that SVN RA is faster for most operations. In fact, support for a new HTTPv2.0 protocol that removes much of the overhead of the current WebDAV HTTP protocol is planned for Subversion 1.7, due out early next year. However, the vast majority of development organizations don’t find the performance difference between the current WebDAV HTTP protocol and SVN RA noticeable enough over a LAN to outweigh the significant advantages WebDAV HTTP offers over SVN RA. The user experience over a WAN may be a different story, but I’ll come to that later.

The key advantages provided by WebDAV HTTP over SVN RA include:

• Web browser access to Subversion repositories through Apache.


• The ability to leverage the authentication and authorization options offered by Apache including its native LDAP integration.


• Standardized network encryption (SSL) and certificate handling.


• Complete logging capabilities that don’t exist using SVN RA with svnserve.


• Use of a standard port (80) that makes it easy to go through corporate firewalls.
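To make the firewall point concrete, here is a minimal sketch that maps a repository URL to the TCP port a firewall would need to allow. The host name `svn.example.com` and the helper `repo_port` are made up for illustration. The port numbers are the usual defaults (80 for http, 443 for https, 3690 for svnserve, 22 for SSH), and a given installation may well use others.

```python
from urllib.parse import urlsplit

# Usual default ports for each Subversion URL scheme.
# svn:// talks to svnserve (3690); svn+ssh:// tunnels svnserve over SSH (22).
DEFAULT_PORTS = {"http": 80, "https": 443, "svn": 3690, "svn+ssh": 22}


def repo_port(url: str) -> int:
    """Return the TCP port a client would connect to for this repository URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit port in the URL overrides the scheme default.
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    # Hypothetical repository URLs, one per access method.
    for url in (
        "http://svn.example.com/repos/project",
        "https://svn.example.com/repos/project",
        "svn://svn.example.com/project",
        "svn+ssh://svn.example.com/project",
    ):
        print(url, "->", repo_port(url))
```

Only the http and https URLs land on ports that most corporate firewalls already pass; the svn:// URL needs an extra rule for 3690.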

Subversion started out using WebDAV HTTP with Apache. SVN RA with SSH came along later as a way to make it easier for CVS shops that were using CVS’ pServer protocol with SSH to migrate to Subversion without having to redo their security infrastructure. If you aren’t moving from CVS to Subversion, and have to set up SSH user accounts on the server from scratch, there can be significant effort involved. In addition, SSH introduces performance overhead of its own in comparison to SSL.

In a globally distributed development organization with remote users accessing a central Subversion repository over a WAN connection, the performance differences between WebDAV HTTP and SVN RA are more noticeable to users, but either way WAN latency will have a big impact on remote developer productivity. HTTPv2.0 is designed to close this gap by removing as many network round trips as possible, over a LAN or a WAN, between clients and the Apache Web server front-ending Subversion. The round trips that the current WebDAV HTTP protocol generates through the CHECKOUT request that precedes each PUT, and through unnecessary PROPFIND requests, will go away. The ability to pipeline requests and process them in parallel whenever possible is another major change coming. In addition, much of the handshaking that the current WebDAV HTTP protocol performs to establish connections and authenticate users before they can even access the repository will be gone.
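A back-of-the-envelope model shows why the CHECKOUT-before-PUT pattern matters over a WAN. This is only a sketch: the function names, the 200-file commit and the 150 ms round-trip time are assumptions chosen for illustration. It also ignores PROPFIND traffic, pipelining, authentication handshakes and the time to transfer file contents.

```python
def commit_round_trips(files: int, checkout_before_put: bool) -> int:
    """Count the per-file request/response round trips in one commit.

    With checkout_before_put=True each file costs a CHECKOUT plus a PUT;
    otherwise each file costs a single PUT.
    """
    per_file = 2 if checkout_before_put else 1
    return files * per_file


def latency_cost_seconds(round_trips: int, rtt_ms: float) -> float:
    """Time spent purely waiting on the network, assuming no pipelining."""
    return round_trips * rtt_ms / 1000.0


if __name__ == "__main__":
    files = 200     # assumed commit size
    rtt_ms = 150.0  # assumed WAN round-trip time

    today = commit_round_trips(files, checkout_before_put=True)
    trimmed = commit_round_trips(files, checkout_before_put=False)

    print(latency_cost_seconds(today, rtt_ms))    # 60.0
    print(latency_cost_seconds(trimmed, rtt_ms))  # 30.0
```

Under these assumptions, dropping the CHECKOUT halves the time spent waiting on latency. On a LAN with a round-trip time near 1 ms the same commit would spend well under a second on it either way, which fits the observation that the difference is rarely noticed locally.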

Yet without a distributed solution for Subversion, the same problems of WAN latency, degraded central server performance as the number of remote users grows, excessive network bandwidth usage due to unnecessary read operations, and availability will still be significant issues that will negatively impact developer productivity. I’ve covered these points in detail in two recent posts:


  1. Can Globally Distributed Development Really be Supported with a Central Subversion Server?

  2. The Centralized vs. Distributed Debate Continues.

Subversion MultiSite, which will support the new HTTPv2.0 protocol when it becomes available with Subversion 1.7, already addresses the WAN issues that HTTPv2.0 is meant to solve. It also overcomes the downsides of a central Subversion server implementation that HTTPv2.0 won’t address.

Subversion MultiSite minimizes WAN round trips and network bandwidth usage in three ways:

1. Subversion MultiSite maintains persistent physical connections between repository replicas at each site. This means that the overhead of opening and closing a connection with each request over the WAN (i.e., the 3-way TCP handshake) is already eliminated. Subversion MultiSite users don’t have to wait for HTTPv2.0 to address this over the WAN.

2. Subversion MultiSite sends an entire commit, which could include hundreds of HTTP PUTs (one for each file in the commit), over a single connection, rather than opening and closing a connection for each PUT, as WebDAV HTTP would do on its own today.

3. Subversion MultiSite doesn’t generate any WAN traffic unless it is absolutely necessary. Developers check out from their local Subversion repository replica, so there are no remote reads. Only commits and other writes generate WAN traffic.
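The three points above can be restated as simple arithmetic. The sketch below is not a description of how Subversion MultiSite works internally. It takes the claims in the list at face value and counts WAN round trips for a hypothetical workload. The `Workload` class, the request counts and the "one extra round trip per new connection" cost are all assumptions, and the model ignores the traffic needed to propagate a write to more than one remote replica.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    read_requests: int   # checkouts, updates, log and diff requests
    write_requests: int  # PUTs sent as part of commits


def wan_round_trips(w: Workload, local_replica: bool, persistent: bool) -> int:
    """Estimate WAN round trips for a remote site's workload.

    local_replica: reads are served on-site, so only writes cross the WAN.
    persistent:    the connection is already open, so no per-request
                   TCP handshake (modeled as one extra round trip).
    """
    crossing_wan = w.write_requests
    if not local_replica:
        crossing_wan += w.read_requests
    handshake = 0 if persistent else 1
    return crossing_wan * (1 + handshake)


if __name__ == "__main__":
    day = Workload(read_requests=900, write_requests=100)  # assumed mix

    central = wan_round_trips(day, local_replica=False, persistent=False)
    replicated = wan_round_trips(day, local_replica=True, persistent=True)

    print(central)     # 2000
    print(replicated)  # 100
```

With a read-heavy mix like this one, serving reads locally accounts for most of the reduction; connection reuse then halves what remains.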

The bottom line is that if WAN latency is driving you toward SVN RA, in spite of the major benefits offered by WebDAV, Subversion MultiSite makes giving up those benefits unnecessary and allows an IT organization to overcome the other significant challenges presented by globally distributed development.
