Advice on design, authentication -- and more -- from an industry-leading networking guru.
Joel Snyder runs the world's largest and most complete VPN interoperability and testing facility. He recently spoke with SearchWin2000.com contributor Tom Lancaster about VPN performance in Windows.
SearchWin2000.com: In terms of design decisions, such as firewall and authentication server placement, clustering versus load-balancing and tunnel endpoint terminations, which Windows VPN configurations have you found to be effective or not effective?
Snyder: First, let's start by saying that you should never terminate a VPN tunnel on a server that's doing anything else but VPN. One of the wonderful things that we've learned the past few years is that servers are cheap, and partitioning your application and functional load across multiple servers is an excellent way to build both fault tolerance and high availability into your applications.
With all that being said, the best VPN configuration is one that meets the needs of your network. That includes both the topology issues -- things like Network Address Translation (NAT), which is a major nightmare to the VPN designer -- and the political issue of who is going to manage, run and control the VPN. These are combined with the more obvious questions: How will the VPN interact with the firewall? What are the firewall requirements for VPN users? There are a lot of things like that.
There is no single obvious answer, because this really depends on the needs of your application. For example, if you're pulling out a frame-relay network and replacing it with a VPN, and if you never firewalled off your VPN users before, then you are probably pretty happy dumping the VPN server inside the firewall. That's fast, efficient and solves a lot of problems with authentication and performance through the firewall. However, if you're building an extranet kind of application, the firewall function is almost always required inside of the VPN function -- you don't want these foreign users running amok on your network without adequate protection. Those are two extremes, but you can see that a lot of requirements determine what works best. In high availability, for example, you almost always want to straddle your firewall. But there are no hard-and-fast rules, except this one: anyone who says that one single topology is 'best' is wrong.
SearchWin2000.com: Do you think split tunneling is a viable means of increasing performance without dramatically increasing risk?
Snyder: I'm a firm believer in split tunneling. The whole idea of NOT doing split tunneling is a conspiracy put together by (a) the L2TP people (who really can't avoid a single tunnel back to corporate HQ without major nightmares of routing) and (b) uneducated network managers. If we ignore the L2TP problem (and, of course, the PPTP problem, which is the same), what we see is people avoiding split tunneling as a way of curing the symptom of poor security auditing and control on remote access user platforms. Turn off split tunneling, the thinking seems to go, and you're 'safe.' Well, of course, that's total bull. The aggravation that goes on around split tunneling is like deciding what brand of pain reliever to take because you have a steel spike embedded in your head. The solution is to pull the spike out. If you have end-user systems that are connecting to your network, then you MUST be confident in the security of those systems. You cannot treat the symptom; you have to treat the cause.
SearchWin2000.com: What authentication mechanisms have you found to be effective from a performance standpoint? For instance, would using pre-shared secrets be significantly faster than certificates or another method that involves an authentication server?
Snyder: Let's start by saying that this only matters in the remote access case. For site-to-site, your tunnels are so long-lived that you can do anything you want and you'll be happy with the performance.
Pre-shared secrets without any additional authentication (e.g., XAUTH or CRACK) are obviously going to be the fastest; certificates are going to be much slower because of the additional public key operations involved. In neither case should you have to go out to an external authentication server. Despite the slowness of certificates, I still strongly prefer them as a solution. Why? Because they scale. Pre-shared secrets don't. You lose any hope for two-factor authentication with pre-shared secrets (PSS), plus you have an ugly and long number you have to pass out to each user. It's bad design.
Most VPN vendors solve this by using a lame pre-shared secret (such as null or the same one for all users, the so-called 'Group pre-shared secret' or 'wildcard PSS') and then throwing in a secondary authentication using something like XAUTH or CRACK; in Microsoft's case, they'll push you to use L2TP and then you do a PPP-based authentication. When you start down that path, you're working on person-time, not machine-time. Who cares if it takes 100 ms or 1000 ms to set up the SA (PSS versus cert, for example) if in the middle the user has to type in a password? Or unlock their smart card? Or read a number off the token? In those cases, performance is a non-issue because you've injected user-think-time into the equation, and that so dominates the picture that you don't care about how long the authentication takes.
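To put rough numbers on that argument, here is a minimal Python sketch. The 100 ms and 1000 ms setup times are the ones mentioned in the answer above; the 5-second user think time and the function name setup_share are assumptions made purely for illustration, not measurements.

```python
def setup_share(sa_setup_ms: float, user_think_ms: float) -> float:
    """Return the fraction of total connect time spent on SA setup."""
    total_ms = sa_setup_ms + user_think_ms
    if total_ms <= 0:
        raise ValueError("total connect time must be positive")
    return sa_setup_ms / total_ms


# Assumed time for a user to type a password, unlock a smart card,
# or read a token code. This value is hypothetical.
USER_THINK_MS = 5000.0

if __name__ == "__main__":
    for label, setup_ms in (("pre-shared secret", 100.0), ("certificate", 1000.0)):
        total_s = (setup_ms + USER_THINK_MS) / 1000.0
        share = setup_share(setup_ms, USER_THINK_MS)
        print(f"{label}: total {total_s:.1f} s, SA setup is {share:.0%} of it")
```

Under these assumed numbers, a tenfold difference in setup time changes the total wait by less than a second, which is the point being made: once a person is in the loop, the machine-time difference is hard to notice.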
My favorite for people who like two-factor authentication and aren't afraid of technology is certificates locked up on smart cards. From a security model point of view that beats the heck out of any other approach, and gives you snappy performance compared to an L2TP or XAUTH/CRACK/Hybrid authentication system.
SearchWin2000.com: What would you say is the most common configuration mistake that leads to degraded VPN performance in Windows?
Snyder: Fragmentation. And it's not a configuration mistake -- it's a built-in problem in any TCP-based application. VPNs cause fragmentation, and firewalls and gateways often handle fragmentation very poorly. The major, major problem I see in real VPNs is that folks have not dealt with the fragmentation issue properly. Of course, you don't want to run around shortening your MTU just because you MIGHT have a problem; that will impact LAN communications as well. What you want to do is be aware of the potential problems with MTU and fragmentation, and then be prepared to both debug the problem (which implies that you understand it -- not hard to do, but you have to think about it) and then twiddle whatever knobs are necessary, whether these are registry entries or router MTU values, to solve it.
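The arithmetic behind the fragmentation problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 60-byte tunnel overhead is a placeholder (real overhead depends on the protocol, cipher and options in use), the outer packet is assumed to be IPv4 with a 20-byte header and no options, and the function names are invented for this example.

```python
import math

IPV4_HEADER = 20  # assumed: outer IPv4 header without options


def fragments_needed(inner_packet: int, tunnel_overhead: int, path_mtu: int) -> int:
    """Number of IP fragments the encapsulated packet needs on the path."""
    outer_packet = inner_packet + tunnel_overhead
    if outer_packet <= path_mtu:
        return 1
    # Every fragment carries its own IP header, and fragment payloads
    # (except the last) must be a multiple of 8 bytes.
    payload = outer_packet - IPV4_HEADER
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


def largest_unfragmented_inner(path_mtu: int, tunnel_overhead: int) -> int:
    """Largest inner packet that still fits in one outer packet."""
    return path_mtu - tunnel_overhead


if __name__ == "__main__":
    OVERHEAD = 60    # hypothetical tunnel overhead in bytes
    PATH_MTU = 1500  # typical Ethernet MTU
    for size in (1400, 1500):
        n = fragments_needed(size, OVERHEAD, PATH_MTU)
        print(f"inner packet {size} bytes -> {n} outer fragment(s)")
    print("largest inner packet without fragmentation:",
          largest_unfragmented_inner(PATH_MTU, OVERHEAD))
```

With these placeholder values, a full-size 1500-byte packet no longer fits once the tunnel headers are added and is split in two, while anything up to 1440 bytes goes through whole. That is the kind of calculation to have in hand before touching a registry entry or router MTU value.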
Second major problem: using L2TP when not required. The overhead is gross, and if you don't need it, then you shouldn't use it.
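A rough way to see why extra encapsulation hurts is to compare payload efficiency with and without it. The Python sketch below uses placeholder byte counts: 60 bytes for the base tunnel and 20 additional bytes for an L2TP layer are assumptions for illustration only, since the real figures vary with configuration. The function name payload_efficiency is likewise hypothetical.

```python
def payload_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of bytes on the wire that are actual user payload."""
    if payload_bytes <= 0 or overhead_bytes < 0:
        raise ValueError("payload must be positive and overhead non-negative")
    return payload_bytes / (payload_bytes + overhead_bytes)


# Hypothetical per-packet overheads in bytes; not measured values.
BASE_TUNNEL_OVERHEAD = 60
EXTRA_L2TP_OVERHEAD = 20

if __name__ == "__main__":
    for payload in (64, 512, 1400):
        without = payload_efficiency(payload, BASE_TUNNEL_OVERHEAD)
        with_l2tp = payload_efficiency(
            payload, BASE_TUNNEL_OVERHEAD + EXTRA_L2TP_OVERHEAD
        )
        print(f"{payload:>5}-byte payload: {without:.0%} without the extra layer, "
              f"{with_l2tp:.0%} with it")
```

Whatever the exact byte counts turn out to be in a given deployment, the shape of the result is the same: the penalty is proportionally worst for small packets, and the extra headers also make the fragmentation threshold easier to hit.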
Joel Snyder, Ph.D., has been a senior partner at consulting firm Opus One, in Tucson, Ariz., for over two decades, where he specializes in helping companies build bigger, faster, stronger and safer networks. For the past five years, he's run the world's largest and most complete VPN interoperability and testing facility. His favorite crayon color is Burnt Sienna.
Tom Lancaster, CCIE# 8829, CNX# 1105, is an IT Architect for IBM Global Services, the author of several networking books, articles and newsletters, and a contributor to SearchWin2000.com.